new XPA and Actian gateway - final verdict
magic9@aquari.com
And here is the final verdict from Bill...
"You MUST use the new gateway to get access to over-sized files. I doubt they will release the new gateway for the older Magic versions, though. This is a question for Magic. Bill" So there it is, XPA 4.9? with the new gateway. Thanks to Steven, Bill and the others for their help. Andy |
|
Andy,
I'm happy you're happy. I wish you had included a dump from a BUTIL -STAT, as I had previously requested, so that I could also be privy to the data file's intimate details, such as its fixed record length, whether or not records have a variable-length portion, file version, etc., etc., etc. Lacking that, I remain confident that I could have made it work for you, but we'll never know. You're welcome.

Steven G. Blank
Ξ SGBlank Consulting

On 3/20/2023 2:36 PM, magic9@... wrote:
And here is the final verdict from Bill...
magic9@aquari.com
Steven,
Attached is the image of the butil -stat command. I am curious to know what you read from it.

Thanks
Andy
Andy,
At a quick view:
1. It's a 2520-byte record with a 4096-byte page size, so you're getting one record per page and almost half of each page is empty, making for a very large file with lots of wasted space.
2. There are no unique indexes, making exact data retrieval more difficult, and the indexes are also taking up extra space.
3. The total bytes for all of your indexes for the table exceeds 255, which used to be the max. With the newer versions of Actian, I don't know if that's still true.
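Keith's first point can be checked with a couple of lines of arithmetic (a sketch in Python using only the figures quoted above; real Btrieve pages also carry some header overhead, which is ignored here):

```python
# Space math for a 2520-byte fixed record in a 4096-byte page.
# (Simplified: real pages also reserve some bytes for overhead.)
PAGE_SIZE = 4096
RECORD_LEN = 2520

records_per_page = PAGE_SIZE // RECORD_LEN                    # only 1 record fits
wasted_per_page = PAGE_SIZE - records_per_page * RECORD_LEN   # 1576 bytes idle
wasted_fraction = wasted_per_page / PAGE_SIZE

print(records_per_page, wasted_per_page, round(wasted_fraction, 2))
# -> 1 1576 0.38  (roughly 38% of every data page is unused)
```

This is also why the later move to a 16,384-byte page helps: six 2520-byte records fit per page, cutting the per-page waste sharply.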
Sorry, my computer doesn't have reasonable internet right now, so I'm doing my response from memory. I'm sure Steve can provide more.
Keith.
From: main@magicu-l.groups.io <main@magicu-l.groups.io> on behalf of magic9@... <magic9@...>
Sent: Tuesday, March 21, 2023 5:48:36 AM
To: main@magicu-l.groups.io <main@magicu-l.groups.io>
Subject: Re: [magicu-l] new XPA and Actian gateway - final verdict
Andy,
My observations, as I see them:
If you're able, I would strongly recommend you make two other changes to the file:
HTH.
Steven G. Blank
Ξ SGBlank Consulting

On 3/21/2023 3:48 AM, magic9@... wrote:
Steven,
magic9@aquari.com
Steven,
So I went ahead and made the changes that you suggested, and I have it running now to see how far it will go and whether it will crash. It should take about 3 days to run, as it is processing 1 million records every 2 minutes. Fingers crossed. I will report back when I have the result.

Thanks
Andy
magic9@aquari.com
Hello,
The program has now been running for a week, using the version 13 file format and Actian 15 Server, with the maximum page size of 16,384, page compression, and the index fields that were floats changed to decimals. It is only at about 500 million records (of about 2 billion), and it has slowed from 2 minutes per million records at the beginning to nearly 1 hour per million records now.
The file size is about 135 gig.
Is this speed degradation due to the indexing?
Would I have been better off with no page compression and letting the file grow much larger?
Or should I have allowed Actian to split the file into 2 gig pieces?
I ask these questions to learn the fastest and best way to build this one huge file.
This will also have an impact in the future as I will need to add millions of records monthly.
Thanks
Andy
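For scale, a quick back-of-envelope on the figures Andy reports (a sketch in Python; "135 gig" is read as GiB here, and the degraded rate of roughly 1 hour per million records is assumed to hold, both of which are assumptions):

```python
# Back-of-envelope from the reported numbers: 500M of ~2B records
# loaded, ~135 GiB on disk, rate degraded to ~1 hour per million.
records_loaded = 500_000_000
total_records = 2_000_000_000
file_bytes = 135 * 1024**3

# Effective on-disk size per record, versus the 2520-byte fixed record:
bytes_per_record = file_bytes / records_loaded
print(round(bytes_per_record))  # -> 290 (page compression is earning its keep)

# Time remaining if the degraded rate holds:
hours_left = (total_records - records_loaded) / 1_000_000 * 1.0
print(hours_left)  # -> 1500.0 hours, i.e. roughly two months at this rate
```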
Andy,
See replies interspersed:

On 3/28/2023 9:58 AM, magic9@... wrote:
"Is this speed degradation due to the indexing?"

Yes. You will be much better off dropping all the indexes before loading the data and then (re)building the indexes after the fact. If you're using a Magic batch program to load the data, set the file's properties (Ctrl+D) to Open:Reindex, Share:None, and in Database Properties make sure that all Magic locking is set to None. Be patient — after all the data has been loaded, the index-building process will consume many more hours. Resist the urge to three-finger it. DO NOT WRAP THIS PROCESS IN A TRANSACTION OF ANY KIND OR DURATION.
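The trade-off behind dropping the indexes first can be sketched generically (plain Python, nothing Btrieve-specific; the data is made up):

```python
import bisect
import random

records = [random.randrange(10**9) for _ in range(10_000)]

# Maintain a sorted "index" on every insert, as a live index must:
# each insort shifts list elements, so the whole load is O(n^2)-ish.
live_index = []
for key in records:
    bisect.insort(live_index, key)

# Load first, then build the index once at the end: O(n log n).
bulk_index = sorted(records)

assert live_index == bulk_index  # identical index, very different cost
```

At 2 billion records and five indexes, that difference is the gap between hours and days, which is the point of Open:Reindex (or of dropping and re-adding the indexes with BUTIL).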
"Would I have been better off with no page compression and letting the file grow much larger?"

Yes. A possible alternative might be to use Record Compression instead of Page Compression. I believe this would be somewhat faster, because each record only gets compressed once, when it's written, instead of an entire page being compressed over and over each time a record is added. With that said, please note that ANY compression WILL slow it down. Full stop.
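A toy model of the record-vs-page compression difference (plain Python with zlib; this illustrates the relative amount of work, not how Zen's compression is actually implemented):

```python
import zlib

RECORD_LEN = 2520
records = [bytes([i]) * RECORD_LEN for i in range(16)]

# Record compression: each record is compressed once, when written.
record_bytes_compressed = 0
for rec in records:
    zlib.compress(rec)
    record_bytes_compressed += len(rec)

# Page compression (naive model): the page is recompressed on every
# insert, so earlier records pass through the compressor repeatedly.
page_bytes_compressed = 0
page = b""
for rec in records:
    page += rec
    zlib.compress(page)
    page_bytes_compressed += len(page)

print(record_bytes_compressed, page_bytes_compressed)
# -> 40320 342720  (over 8x the bytes run through the compressor)
```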
"Or should I have allowed Actian to split the file into 2 gig pieces?"

No. Splitting a data file into 2GB segments slows things down terribly. Leave the file alone. Unless you're actually attempting to run the process over a network (the pure definition of insanity) as opposed to 100% locally on the database server, disk throughput is likely your most limiting factor. This is a complex topic in and of itself and typically begins with a discussion of RAID and the various levels of caching.
Obviously, I know nothing of your situation or the impetus behind this endeavor, and it's certainly not my intention to judge. But, with all due respect, I can't help feeling that a design based on a single table with a 2500-byte record (as opposed to a relational group of tables having smaller, normalized records) is ultimately doomed to fail. Best of luck, my friend.

Steven G. Blank
Ξ SGBlank Consulting
Craig Martin
Is this a corporate database or more a data warehouse type activity?
I've been seduced by Planetscale for a project, maybe you just need to be on another platform behind your Magic app?
From: main@magicu-l.groups.io <main@magicu-l.groups.io> on behalf of Steven Blank <sgblank@...>
Sent: Tuesday, March 28, 2023 11:49 AM
To: main@magicu-l.groups.io <main@magicu-l.groups.io>
Subject: Re: [magicu-l] new XPA and Actian gateway - final verdict
magic9@aquari.com
Steven,
magic9@aquari.com
Craig,
This is transactional data that must be kept permanently to allow for analysis and auditing, so the database grows by millions of records monthly. What's more, I have already invested heavily in MongoDB for deployment, but all of the management is done in Magic and I want to keep it that way. So unless Planetscale is some "magic" solution that can replace Actian and allow me to keep all the data local and accessible to Magic, I don't see the need to change.

Thanks
Andy
Andy,
Again, please see replies interspersed:

On 3/28/2023 1:38 PM, magic9@... wrote:
This observation is prima facie evidence that your program tree opens and closes the data file, repeatedly, during the process. When Open=Reindex in a task, Magic drops all indexes once, immediately after opening the file but before executing the task's Task Prefix. Magic subsequently recreates those indexes, also only once, immediately after the task's Task Suffix has been executed but before closing the file. (You can observe this activity in Magic's Monitor using a simple test program with a dummy file.)

This means that, if you observe evidence of indexes being rebuilt repeatedly during the data-load process, it can only be because the specific task in which the table is being opened in Reindex Mode has completed processing its entire Data View and its Task Suffix — also repeatedly. In this case, you need to open the affected data file in Reindex Mode at least one level farther up the task tree. On the other hand, if what you observe are simply lulls in processing, when everything seems to pause, it's more likely that the server has had to pause client comms while it flushes its dirty cache buffers to disk (which suggests a larger cache?). I suggest you use the Zen Monitor to observe activity quantitatively.

"And if I were to create the table with no indexes, it may be much faster creating the table, but would it not then take just as long later to build the 5 indexes once the 2 billion records are in?"

No. When loading a new Btrieve data file, it's always significantly faster to load the data and then add the indexes separately, rather than having to maintain all existing indexes (lots o' juggling goin' on) in real time. If Magic won't/can't do it for you using Reindex Mode, then yes, by all means, create the empty file without any indexes (or use BUTIL to drop them from an empty file first) and then, after the data has been loaded, add the indexes one at a time using the Btrieve Maintenance Utility.
Last, regardless of whether you get Reindex Mode to work or use BUTIL to drop and add indexes manually, it just takes a long time to add indexes on a gazillion records — especially that 254-byte whopper — perhaps even as long as, or longer than, it took to add the data in the first place. Again, during reindexing, you can use the Zen Monitor to observe the MicroKernel sessions' Read Records counts and other activity, if only to reassure yourself that the reindexing process is not hung, even if Windows reports "Not Responding" somewhere.
HTH.

Steven G. Blank
Ξ SGBlank Consulting
magic9@aquari.com
Steven,
Thank you for this last explanation; it clears up a lot of questions.

Thanks
Andy