new XPA and Actian gateway - final verdict


magic9@aquari.com
 

And here is the final verdict from Bill...
"You MUST use the new gateway to get access to over-sized files. I doubt they will release the new gateway for the older Magic versions, though. This is a question for Magic. Bill"

So there it is, XPA 4.9? with the new gateway.
Thanks to Steven, Bill and the others for their help.
Andy


Steven Blank
 

Andy,

I'm happy you're happy.

I wish you had included a dump from a BUTIL -STAT, as I had previously requested, so that I could also be privy to the data file's intimate details, such as its fixed record length, whether or not records have a variable-length portion, file version, etc., etc., etc.

Lacking that, I remain confident that I could have made it work for you, but we'll never know.

You're welcome.

Steven G. Blank
Ξ SGBlank Consulting


On 3/20/2023 2:36 PM, magic9@... wrote:


magic9@aquari.com
 

Steven,
Attached is the image of the butil -stat command.
I am curious to know what you read from it.
Thanks
Andy
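For anyone following along: the statistics report Steven asked for can be captured as plain text rather than a screenshot. A minimal sketch, where the file name is just a placeholder:

```shell
# BUTIL ships with Actian Zen; -STAT prints the file's version,
# page size, record length, and key definitions.
# "BIGTABLE.MKD" is a placeholder for the actual data file.
butil -stat BIGTABLE.MKD > bigtable_stat.txt
```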


Keith Canniff
 

Andy

At a quick look:
1. It's a 2520-byte record with a 4096-byte page size, so you're getting one record per page and almost half of each page is empty, making for a very large file with lots of wasted space.
2. There are no unique indexes, which makes exact data retrieval more difficult, and the indexes also take up extra space.
3. The total bytes for all of your indexes on the table exceed 255, which used to be the maximum. With the newer versions of Actian, I don't know if that's still true.
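Keith's first point checks out with simple arithmetic. The exact per-page header size varies by file format, so the 40-byte overhead below is an illustrative assumption, but the conclusion holds for any small header:

```shell
PAGE=4096
REC=2520
HDR=40   # assumed per-page header; the actual size varies by file format
echo $(( (PAGE - HDR) / REC ))   # records that fit per 4 KB page: 1
echo $(( PAGE - HDR - REC ))     # bytes wasted per page: 1536
PAGE=16384
echo $(( (PAGE - HDR) / REC ))   # records per 16 KB page: 6
```

So a 16 KB page holds six 2520-byte records instead of one, cutting the page count by roughly a factor of six.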

Sorry, my computer doesn't have reasonable internet right now, so I'm doing this response from memory. I'm sure Steve can provide more.

Keith. 


From: main@magicu-l.groups.io <main@magicu-l.groups.io> on behalf of magic9@... <magic9@...>
Sent: Tuesday, March 21, 2023 5:48:36 AM
To: main@magicu-l.groups.io <main@magicu-l.groups.io>
Subject: Re: [magicu-l] new XPA and Actian gateway - final verdict
 


Steven Blank
 

Andy,

My observations, as I see them:
  • File Version = 9.50 — Because BUTIL reports itself to be v14.20.005.000, it's likely that this empty data file was created in v9.50 file-format simply because your v14 MKDE is configured to do so, that is, to create new data files in v9.50 format. Although it certainly could, Magic's Btrieve gateway does not specify a file-format version when it calls Create(14) — it never has. Magic's Btrieve gateway has always left that decision to the MKDE. SOLUTION: either reconfigure your MKDE to create files in v13.0 format, and then recreate this empty data file, OR use Actian's Rebuild utility to simply rebuild this empty file to File Format = 13.0.
  • Page Size = 4096, Record Length = 2520 — As Keith stated, this page size + record length combination wastes space up the wazoo. The direct result is that you're currently running up against the maximum number of pages (FCR, PAT, Data, & Index) supported by the file format (v9.50), not the maximum number of records. Increase the page size and you'll increase the number of records able to be stored. Unlike File Version, Page Size is indeed specified by Magic's Btrieve Gateway. That it only knows of 4096-byte (or smaller) page sizes is proof of its age, and this is where my friend Bill is spot-on. However, this limitation is a one-and-done, that is, once a data file has been created, Magic does not care, nor does it need to care, about page size. SOLUTION: again, use Actian's Rebuild utility to simply rebuild this empty data file to Page Size = 16384, the maximum page size available for a file format 13 file.
If you rebuild the data file as described above, I'll go out on a limb and assert that you will be able to load many, many more records than you are currently able to load. Will you be able to load as many as you want? Unknown.
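A sketch of the rebuild step described above. The command-line rebuild tool in current Zen releases is rbldcli, but the option spelling here is from memory; run it with no arguments or check the Zen documentation to confirm on your release:

```shell
# Hedged sketch -- verify option names against your Zen release.
# Rebuilds the (still empty) file with a 16 KB page size; the target
# file format follows the rebuild options and/or the engine's
# "Create File Version" setting. "BIGTABLE.MKD" is a placeholder.
rbldcli -p16384 BIGTABLE.MKD
```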



If you're able, I would strongly recommend you make two other changes to the file:
  • Key Type = Float —  With all due respect, floating point values make lousy indexes. Seriously, even if Magic seems to handle Float segments OK, just don't do it. Not for indexed data.
Generally speaking, if your application requires an index to include a large Numeric column, then you'll always be better off manually changing the column's Stored As property from Float to Packed Decimal.
HTH.

Steven G. Blank
Ξ SGBlank Consulting


On 3/21/2023 3:48 AM, magic9@... wrote:


magic9@aquari.com
 

Steven,
So I went ahead and made the changes that you suggested and have it running now to see how far it will go and if it will crash.
It should take about 3 days to run, as it is processing 1 million records every 2 minutes.
Fingers crossed. 
I will report back when I have the result.
Thanks
Andy


magic9@aquari.com
 

Hello,
It has been a week now that the program is running using the version 13 file format and Actian 15 Server, with the maximum page size of 16,384 and page compression, and with changing the index fields that were floats to decimals.
It is only at about 500 million records (of about 2 billion) and it has slowed down from 2 minutes per million records at the beginning, to nearly 1 hour per million records now.
The file size is about 135 gig.
Is this speed degradation due to the indexing?
Would I have been better off with no page compression and letting the file grow much larger?
Or should I have allowed Actian to split the file into 2 gig pieces?
I ask these questions to know what is the fastest and best way to build this 1 huge file.
This will also have an impact in the future as I will need to add millions of records monthly.
Thanks
Andy
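For perspective, the slowdown Andy reports changes the projected finish time dramatically. A back-of-the-envelope estimate using the figures from his message:

```shell
TOTAL=2000   # target size, in millions of records (~2 billion)
DONE=500     # loaded so far, in millions
REMAIN=$(( TOTAL - DONE ))
# At the original rate (2 minutes per million records):
echo $(( REMAIN * 2 / 60 / 24 ))    # roughly 2 days remaining
# At the current rate (~60 minutes per million records):
echo $(( REMAIN * 60 / 60 / 24 ))   # roughly 62 days remaining
```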


Steven Blank
 

Andy,

See replies interspersed:

On 3/28/2023 9:58 AM, magic9@... wrote:
Hello,
It has been a week now that the program is running using the version 13 file format and Actian 15 Server, with the maximum page size of 16,384 and page compression, and with changing the index fields that were floats to decimals.
It is only at about 500 million records (of about 2 billion) and it has slowed down from 2 minutes per million records at the beginning, to nearly 1 hour per million records now.
The file size is about 135 gig.
Is this speed degradation due to the indexing?

Yes. You will be much better off dropping all the indexes before loading the data and then (re)building the indexes after the fact. If you're using a Magic batch program to load the data, set the file's properties (Ctrl+D) to Open:Reindex, Share:None and, in Database Properties make sure that all Magic Locking is set to None. Be patient — after all the data has been loaded, the index building process will consume many more hours. Resist the urge to three-finger it. DO NOT WRAP THIS PROCESS IN A TRANSACTION OF ANY KIND OR DURATION.

Would I have been better off with no page compression and letting the file grow much larger?

Yes. A possible alternative might be to use Record Compression instead of Page Compression. I believe this would be somewhat faster because each record only gets compressed once, when it's written, instead of an entire page being compressed over and over and over each time a record is added. With that said, please note that ANY compression WILL slow it down. Full stop.

Or should I have allowed Actian to split the file into 2 gig pieces?

No. Splitting a data file into 2GB segments slows things down terribly. Leave the file alone. Unless you're actually attempting to run the process over a network (the pure definition of insanity) as opposed to 100% locally on the database server, disk throughput is likely your most limiting factor. This is a complex topic in and of itself and typically begins with a discussion of RAID and the various levels of caching.

I ask these questions to know what is the fastest and best way to build this 1 huge file.
This will also have an impact in the future as I will need to add millions of records monthly.

Obviously, I know nothing of your situation or the impetus behind this endeavor, and it's certainly not my intention to judge. But, with all due respect, I can't help feeling that a design that is based on a single table with a 2500-byte record (as opposed to a relational group of tables having smaller, normalized records) is ultimately doomed to fail.

Best of luck, my friend.

Steven G. Blank
Ξ SGBlank Consulting

Thanks
Andy


Craig Martin
 

Is this a corporate database or more a data warehouse type activity?

I've been seduced by Planetscale for a project, maybe you just need to be on another platform behind your Magic app?


From: main@magicu-l.groups.io <main@magicu-l.groups.io> on behalf of Steven Blank <sgblank@...>
Sent: Tuesday, March 28, 2023 11:49 AM
To: main@magicu-l.groups.io <main@magicu-l.groups.io>
Subject: Re: [magicu-l] new XPA and Actian gateway - final verdict
 


magic9@aquari.com
 

Steven,
I tried using the open mode Reindex as John suggested, but it kept stopping along the way to index and eventually jammed up the machine. 
And if I were to create the table with no indexes, it may be much faster creating the table, but would it not then take just as long later to build the 5 indexes once the 2 billion records are in?
Thanks
Andy


magic9@aquari.com
 

Craig,
This data is transactional data that must be kept permanently in order to allow for analysis and auditing.
So the database grows larger by the millions monthly.
What's more, I have already heavily invested in MongoDB for deployment, but all of the management is being done in Magic and I want to keep it like that.
So unless Planetscale is some "magic" solution that can replace Actian and allow me to have all the data locally and accessible to Magic, then I don't see the need to change.
Thanks
Andy


Steven Blank
 

Andy,

Again, please see replies interspersed:

On 3/28/2023 1:38 PM, magic9@... wrote:

Steven,

I tried using the open mode Reindex as John suggested, but it kept stopping along the way to index ...


This observation is prima facie evidence that your program tree opens and closes the data file, repeatedly, during the process.

When Open=Reindex in a task, Magic drops all indexes once, immediately after opening the file but before executing the task's Task Prefix. Magic subsequently recreates those indexes, also only once, immediately after the task's Task Suffix has been executed but before closing the file. (You can observe this activity in Magic's Monitor using a simple test program with a dummy file.)

This means that, if you observe evidence of indexes being rebuilt repeatedly during the data load process, it can only be because the specific task in which the table is being opened in Reindex Mode has completed processing its entire Data View and its Task Suffix — also repeatedly.

In this case, you need to open the affected data file in Reindex Mode at least one level farther up the task tree.

On the other hand, if what you observe are simply lulls in processing, when everything seems to pause, it's more likely that the server has had to pause client communications while it flushes its dirty cache buffers to disk (which suggests a larger cache might help). I suggest you use the Zen Monitor to observe activity quantitatively.


And if I were to create the table with no indexes, it may be much faster creating the table, but would it not then take just as long later to build the 5 indexes once the 2 billion records are in?


No. When loading a new Btrieve data file, it's always significantly faster to load the data and then add indexes, separately, rather than having to maintain all existing indexes (lots o' juggling goin' on) in real time. If Magic won't/can't do it for you using Reindex Mode, then yes, by all means, create the empty file without any indexes (or use BUTIL to drop them from an empty file first) and then, after the data has been loaded, add the indexes one at a time using the Btrieve Maintenance Utility.
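A hedged sketch of the drop-indexes-first approach on the empty file, using BUTIL's -DROP command. The file name and key numbers are placeholders; confirm the syntax with your release's BUTIL help. Keys are dropped highest-number first because dropping a key can renumber the remaining ones:

```shell
# Placeholder file with five keys, numbered 0..4.
FILE=BIGTABLE.MKD
# Drop highest-numbered keys first so remaining key numbers stay stable.
for KEY in 4 3 2 1 0; do
    butil -drop "$FILE" $KEY
done
# ... load the data ...
# Then re-add the indexes one at a time, e.g. via the Btrieve
# Maintenance utility GUI or BUTIL -SINDEX (check its exact syntax).
```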


... and eventually jammed up the machine.



Last, regardless of whether you get Reindex Mode to work or use BUTIL to drop and add indexes manually, it just takes a long time to add indexes on a gazillion records — especially that 254-byte whopper — perhaps even as long or longer than it took to add the data in the first place.

Again, during reindexing, you can use the Zen Monitor to observe the MicroKernel Sessions' Read Records counts and other activity, if only to reassure yourself that the reindexing process is not hung, even if Windows reports "Not Responding" somewhere.


Thanks

Andy

HTH.

Steven G. Blank
Ξ SGBlank Consulting





magic9@aquari.com
 

Steven,
Thank you for this last explanation, it clears up a lot of questions.
Thanks
Andy