Optimizing Your Database
The MicroKernel Engine provides several features that allow you to conserve disk space and improve system performance. These features include the following:
Duplicatable Keys
If you define a key to be duplicatable, the MicroKernel Engine allows more than one record to have the same value for that key. Otherwise, each record must have a unique value for the key. If one segment of a segmented key is duplicatable, all the segments must be duplicatable.
Linked-Duplicatable Keys
By default in a v7.0 or later file, the MicroKernel Engine stores duplicatable keys as linked-duplicatable keys. When the first record with a duplicate key value is inserted into a file, the MicroKernel Engine stores the key value on an index page. The MicroKernel Engine also initializes two pointers to identify the first and last record with this key value. Additionally, the MicroKernel Engine stores a pair of pointers at the end of the record on the data page. These pointers identify the previous and next record with the same key value. When you create a file, you can reserve pointers for use in creating linked-duplicatable keys in the future.
If you anticipate adding duplicatable keys after you create a data file and you want the keys to use the linked-duplicatable method, you can preallocate space for pointers in the file.
Repeating-Duplicatable Keys
If no room is available to create a linked-duplicatable key (that is, if no duplicate pointers are available), the MicroKernel Engine creates a repeating-duplicatable key. The MicroKernel Engine stores every key value of a repeating-duplicatable key both on a data page and on an index page. In other words, the key’s value resides in the record on a data page and is repeated in the key entry on an index page.
*Note: For users of pre-6.0 Btrieve, the term linked-duplicatable corresponds to permanent, and the term repeating-duplicatable corresponds to supplemental.
You can define a key as a repeating-duplicatable key by setting bit 7 (0x80) of the key flags in the key’s specification block on a Create (14) or Create Index (31) operation. Prior to 6.10, you could not define a key to be a repeating-duplicatable key, and bit 7 of the key flags was not user-definable. In 6.0 and later, the Stat operation (15) sets bit 7 if no room is available to create a linked-duplicatable key (and therefore, if the MicroKernel Engine has to create the key as a repeating-duplicatable key).
Files that use the 5.x format use this same key flag (called the supplemental key attribute in 5.x) to designate that a key was created with the 5.x Create Supplemental Index operation (31).
*Note: In pre-6.0 files, you can drop supplemental indexes only. Permanent indexes, as their name implies, cannot be dropped. In 6.0 and later files, you can drop any index.
Linked vs. Repeating
Each method has performance advantages:
There are trade-offs in performance between linked-duplicate keys and repeating-duplicate keys. Generally, if the average number of duplicates of a key is two or more, a linked-duplicate key will take up less space on the disk and searches will be generally faster because there are fewer index pages. However, the opposite is true if your file stores very few records with duplicate keys and the key length is very short. This is because an entry in the linked-duplicatable tree requires 8 bytes for pointers, whereas a repeating-duplicatable key entry requires 4.
If only a small percentage of keys have any duplicates, it is advantageous to use repeating-duplicate keys to save the extra 8 bytes in each data record. There is no noticeable performance advantage to either choice when the average number of duplicates of a key is less than two.
If you expect several concurrent transactions to be active on the same file at the same time, repeating-duplicate keys will provide a greater probability that these transactions do not try to access the same pages. All pages involved in writes during a concurrent transaction have implicit locks on them. When a page is required in order to make a change during a concurrent transaction and the page is involved in another concurrent transaction, the MicroKernel Engine might wait until the other transaction is complete. If these kind of implicit waits happen very often, the performance of the application goes down.
Neither key storage method is recommended as a shortcut for tracking chronological order. For linked-duplicatable keys, the MicroKernel Engine does maintain the chronological order of records that are inserted after the key is created, but if you rebuild the key’s index, the chronological order is lost. For repeating-duplicatable keys, the MicroKernel Engine maintains the chronological order of the records only if there was no record deletion between the time the key is built and a new record is inserted. To track chronological order, use an AUTOINCREMENT data type on the key.
Page Preallocation
Preallocation guarantees that disk space is available when the MicroKernel Engine needs it. The MicroKernel Engine allows you to preallocate up to 65,535 pages to a file when you create a data file. Table 26 shows the maximum number of bytes the MicroKernel Engine allocates for a file of each possible page size (assuming you allocate 65,535 full pages of disk space):
1Value is page size multiplied by 65,535
If not enough space exists on the disk to preallocate the number of pages you specify, the MicroKernel Engine returns Status Code 18 (Disk Full) and does not create the file.
The speed of file operations can be enhanced if a data file occupies a contiguous area on the disk. The increase in speed is most noticeable on very large files. To preallocate contiguous disk space for a file, the device on which you are creating the file must have the required number of bytes of contiguous free space available. The MicroKernel Engine preallocates the number of pages you specify, whether or not the space on the disk is contiguous.
Use the formulas described earlier in this chapter to determine the number of data and index pages the file requires. Round any remainder from this part of the calculation to the next highest whole number.
When you preallocate pages for a file, that file actually occupies that area of the disk. No other data file can use the preallocated area of disk until you delete or replace that file.
As you insert records, the MicroKernel Engine uses the preallocated space for data and indexes. When all the preallocated space for the file is in use, the MicroKernel Engine expands the file as new records are inserted.
When you issue an API Stat operation (15), the MicroKernel Engine returns the difference between the number of pages you allocated at the time you created the file and the number of pages that the MicroKernel Engine currently has in use. This number is always less than the number of pages you specify for preallocation because the MicroKernel Engine considers a certain number of pages to be in use when a file is created, even if you have not inserted any records.
Once a file page is in use, it remains in use even if you delete all the records stored on that page. The number of unused pages that the Stat operation returns never increases. When you delete a record, the MicroKernel Engine maintains a list of free space in the file and reuses the available space when you insert new records.
Even if the number of unused pages that the Stat operation returns is zero, the file might still have free space available. The number of unused pages can be zero if one of the following is true:
Blank Truncation
If you choose to truncate blanks, the MicroKernel Engine does not store any trailing blanks (ASCII Space code 32 or hexadecimal 0x20) in the variable-length portion of the record when it writes the record to a file. Blank truncation has no effect on the fixed-length portion of a record. The MicroKernel Engine does not remove blanks that are embedded in the data.
When you read a record that contains truncated trailing blanks, the MicroKernel Engine expands the record to its original length. The value the MicroKernel Engine returns in the Data Buffer Length parameter includes any expanded blanks. Blank truncation adds either 2 bytes or 4 bytes of overhead to the physical size of the record (stored with the fixed-record portion): 2 if the file does not use VATs, 4 if it does.
Record Compression
When you create a file, you can specify whether you want the MicroKernel Engine to compress the data records when it stores them in the file. Record compression can result in a significant reduction of the space needed to store records that contain many repeating characters. The MicroKernel Engine compresses five or more of the same contiguous characters into 5 bytes.
Consider using record compression in the following circumstances:
*Note: The database engine automatically uses record compression on files that use system data and have a record length that exceeds the maximum length allowed. See Table 12.
When you perform record I/O on compressed files, the MicroKernel Engine uses a compression buffer to provide a block of memory for the record compression and expansion process. To ensure sufficient memory to compress or expand a record, the MicroKernel Engine requires enough buffer space to store twice the length of the longest record your task inserts into the compressed file. This requirement can have an impact on the amount of free memory remaining in the computer after the MicroKernel Engine loads. For example, if the longest record your task writes or retrieves is 64 KB long, the MicroKernel Engine requires 128 KB of memory to compress and expand that record.
*Note: If your file uses VATs, the MicroKernel Engine’s requirement for buffer space is the product of 16 times the file’s page size. For example, on a 4 KB record, 64 KB of memory are required to compress and expand the record.
Because the final length of a compressed record cannot be determined until the record is written to the file, the MicroKernel Engine always creates a compressed file as a variable-length record file. In the data page, the MicroKernel Engine stores either 7 bytes (if the file does not use VATs) or 11 bytes (if it does), plus an additional 8 bytes for each duplicate key pointer. The MicroKernel Engine then stores the record on the variable page. Because the compressed images of the records are stored as variable-length records, individual records may become fragmented across several file pages if your task performs frequent insertions, updates, and deletions. The fragmentation can result in slower access times because the MicroKernel Engine may need to read multiple file pages to retrieve a single record.
The record compression option is most effective when each record has the potential for containing a large number of repeating characters. For example, a record may contain several fields, all of which may be initialized to blanks by your task when it inserts the record into the file. Compression is more efficient if these fields are grouped together in the record, rather than being separated by fields containing other values.
To use record compression, the file must have been created with the compression flag set. Key-only files do not allow compression.
Index Balancing
The MicroKernel Engine allows you to further conserve disk space by employing index balancing. By default, the MicroKernel Engine does not use index balancing, so that each time a current index page is filled, the MicroKernel Engine must create a new index page. When index balancing is active, the MicroKernel Engine can frequently avoid creating a new index page each time a current index page is filled. Index balancing forces the MicroKernel Engine to look for available space on adjoining index pages. If space is available on one of those pages, the MicroKernel Engine moves keys from the full index page to the page with free space.
The balancing process not only results in fewer index pages but also produces more densely populated indexes, better overall disk usage, and faster response on most read operations. If you add keys to the file in sorted order, index page usage increases from 50 percent to nearly 100 percent when you use index balancing. If you add keys randomly, the minimum index page usage increases from 50 percent to 66 percent.
On insert and update operations, the balancing logic requires the MicroKernel Engine to examine more pages in the file and might possibly require more disk I/O. The extra disk I/O slows down file updates. Although the exact effects of balancing indexes vary in different situations, using index balancing typically degrades performance on write operations by about 5 to 10 percent.
The MicroKernel Engine allows you to fine-tune your MicroKernel Engine environment by offering two ways to turn on index balancing: at the engine level or at the file level. If you specify the Index Balancing configuration option during setup, the MicroKernel Engine applies index balancing to every file. For a description of how to specify the Index Balancing configuration option, refer to the PSQL User's Guide.
You can also designate that only specific files are to be index balanced. To do so, set bit 5 (0x20) of the file’s file flags at creation time. If the index Balancing configuration option is off when the MicroKernel Engine is started, the MicroKernel Engine applies index balancing only to indexes on files that have bit 5 of the file flags set.
The MicroKernel Engine ignores bit 5 in all files’ file flags if the index balancing configuration option was on when the MicroKernel Engine was started. In this situation, the MicroKernel Engine applies index balancing to every file.
Files remain compatible regardless of whether index balancing is active. Also, you do not have to specify index balancing to access files that contain balanced index pages. If you turn on the MicroKernel Engine’s index balancing option, index pages in existing files are not affected until they become full. The MicroKernel Engine does not re-balance indexes on existing files as a result of enabling this option. Similarly, turning off the index balancing option does not affect existing indexes. Whether this option is on or off determines how the MicroKernel Engine handles full index pages.
Variable-tail Allocation Tables
Variable-tail allocation tables give the MicroKernel Engine faster access to data residing at large offsets in very large records and significantly reduce the buffer sizes the MicroKernel Engine needs when processing records in files that use data compression. Using a record’s VAT, the MicroKernel Engine can divide the variable-length portion of a record into smaller portions and then store those portions in any number of variable tails. The MicroKernel Engine stores an equal amount of the record’s data in each of the record’s variable tails (except the final variable tail). The MicroKernel Engine calculates the amount to store in each variable tail by multiplying the file’s page size by 8. The last variable tail contains any data that remains after the MicroKernel Engine divides the data among the other variable tails.
*Note: The formula for finding the length of a variable tail (eight times the page size) is an implementation detail that may change in future versions of the MicroKernel Engine.
In a file that employs VATs and has a 4,096-byte page size, the first variable tail stores the bytes from offset 0 through 32,767 of the record’s variable portion, the second tail stores offsets 32,768 through 65,535, and so on. The MicroKernel Engine can use the VAT to accelerate a seek to a large offset in a record because the VAT allows it to skip the variable tails containing the record’s lower-offset bytes.
An application specifies whether a file will use VATs when it creates the file. If your application uses chunk operations on huge records (larger than eight times the physical page size) and accesses chunks in a random, nonsequential fashion, VATs may improve your application’s performance. If your application uses whole record operations, VATs do not improve performance because the MicroKernel Engine reads or writes the record sequentially; the MicroKernel Engine skips no bytes in the record.
If your application uses chunk operations but accesses records sequentially (for example, it reads the first 32 KB of a record, then reads the next 32 KB of the record, and so on until the end of the record), VATs do not improve performance, because the MicroKernel Engine saves your position in a record between operations, thereby eliminating the need to seek at the beginning of a chunk operation.
VATs also provide another advantage. When the MicroKernel Engine reads or writes a compressed record, it uses a buffer that must be up to twice as large as the record’s uncompressed size. If the file has VATs, that buffer needs to be only as large as two variable tails (16 times the physical page size).
Key-Only Files
In a key-only file, the entire record is stored with the key, so no data pages are required. Key-only files are useful when your records contain a single key, and that key takes up most of the record. Another common use for a key-only file is as an external index for a standard file.
The following restrictions apply to key-only files:
Key-only files contain only File Control Record pages followed by a number of PAT pages and index pages. If a key-only file has an ACS, it may also have an ACS page. If you use ODBC to define referential integrity constraints on the file, the file may contain one or more variable pages, as well.