Multi-Record Operations
Pervasive PSQL provides a high-performance mechanism for filtering and returning multiple records, or portions of multiple records, in a single call. The mechanism is called "extended operations" and is supported by four specific operation codes: Get Next Extended, Get Previous Extended, Step Next Extended, and Step Previous Extended.
For detailed information on how to code these operations, see the Btrieve API Guide. This section explains how to optimize your use of these operations for best performance.
Terminology
The following words can have other meanings in other contexts. For the purposes of this section, definitions are provided below.
Descriptor
Also called the Extended Expression. The whole contents of the data buffer which describes how the Btrieve extended operation should be accomplished.
Filter
A portion of the Extended Expression that describes the selection criteria to apply to the records being examined.
Condition
A portion of a filter that uses a single logical operator.
Connector
A Boolean operator that connects a condition to what follows it: either AND, OR, or NONE.
Extractor
A portion of the Extended Expression which defines what data to return.
Key
A whole index definition which may have multiple segments. Get operations require the MicroKernel to move through the data file along a single key path.
Key Segment
Compound indices, also called multi-segmented keys, can have multiple segment definitions. Each segment defines an offset, length, data type, and so on.
Background
The filter evaluation mechanism for extended operations is designed to be very fast. It evaluates expressions in a straightforward manner, without doing any extra processing. Because of this approach, you should be aware of some of the idiosyncrasies of the interface.
Consider a search such as ((Age = 32 AND Sex = "M") OR (Age = 30 AND Sex = "F")). To be most efficient, this kind of search must jump around the file somewhat, but extended operations do not jump around. They move along a single logical key or record path, one record at a time. A compound logical expression like the one above would require an optimizer like the one in the SQL Relational Database Engine (SRDE) that is part of Pervasive PSQL. Instead, the MicroKernel evaluates expressions strictly left to right to make the filter evaluation process as fast as possible.
To perform the search above, the caller must make two Get Extended calls, each preceded by a GetEqual operation that positions the cursor on the first candidate record. If you instead passed the four conditions from the example above in a single extended operation, the filter would be evaluated as (Age = 32 AND (Sex = "M" OR (Age = 30 AND Sex = "F"))). In other words, for each record the MicroKernel evaluates the first condition, then looks at its Boolean connector to determine whether it must evaluate the next condition. If the first condition is false, the AND connector makes the whole expression false.
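To make the left-to-right rule concrete, here is a minimal Python sketch of the evaluation order. The function and constant names are illustrative only; they are not part of the Btrieve API.

```python
# Minimal sketch of left-to-right filter evaluation as described above.
# Each connector binds its condition to everything that follows it.
AND, OR, NONE = "AND", "OR", "NONE"

def evaluate_filter(record, conditions):
    """conditions: list of (predicate, connector); the last connector is NONE."""
    for predicate, connector in conditions:
        value = predicate(record)
        if connector == NONE:
            return value                  # last condition decides
        if connector == AND and not value:
            return False                  # False AND (...) is False
        if connector == OR and value:
            return True                   # True OR (...) is True
        # otherwise the result depends on the remaining conditions

# The four conditions from the example, chained as the MicroKernel
# sees them: (Age = 32 AND (Sex = "M" OR (Age = 30 AND Sex = "F")))
conditions = [
    (lambda r: r["Age"] == 32, AND),
    (lambda r: r["Sex"] == "M", OR),
    (lambda r: r["Age"] == 30, AND),
    (lambda r: r["Sex"] == "F", NONE),
]
```

Note that a 30-year-old female matches the intended search ((Age = 32 AND Sex = "M") OR (Age = 30 AND Sex = "F")) but fails this filter: Age = 32 is false, and its AND connector short-circuits the whole expression. That is precisely why the caller must split the search into two Get Extended calls.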
Validation
There are many ways an extended operation can return status 62 (Invalid Descriptor). Some of the most common are listed below:
Optimization
Optimizing an extended operation means that the MicroKernel can stop looking at records along the current key path because there is no possibility that records remaining on the key path can satisfy the filter. Starting with Pervasive PSQL 2000i SP3, the MicroKernel can optimize on multiple conditions as long as they occur sequentially and match the segments of the current key.
As it evaluates each condition, the MicroKernel determines if the given segment can be optimized. In order to do that, all the following must be true:
The actual direction depends both on the direction indicated by the operation and on whether the current key is ascending or descending.
Any one of the following can cause optimization to be turned off for all future segments.
*Note: There is a defect in Pervasive PSQL 2000i SP3 that requires the optimizable conditions to occur first in the filter. The ability to place non-optimizable conditions before the optimizable ones is therefore available only in releases after SP3. (Engines before SP3 also allowed this ordering, but could optimize only one condition.)
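The rules above can be sketched as a small Python routine that counts how many leading filter conditions the MicroKernel can optimize against the key segments. This is a simplified model of my own construction: it assumes the conditions are listed in key-segment order and ignores the substring and out-of-order cases discussed in the examples. The authoritative rules are in the Btrieve API Guide.

```python
# Simplified sketch of the optimization rules described above. A condition
# is a dict; key segments are (offset, length, type) tuples.
def optimizable_prefix(conditions, key_segments):
    """Count how many leading conditions can be optimized."""
    count = 0
    for c, segment in zip(conditions, key_segments):
        if c["field"] != segment:        # must match the key segment exactly
            break
        if not c["constant"]:            # must compare against a constant
            break
        if c["connector"] == "OR":       # an OR makes even this one unusable
            break
        if c["operator"] == "EQ":
            count += 1
            if c["connector"] == "NONE": # end of the filter
                break
        else:
            # In the examples below, a range operator (e.g. <=) bounds the
            # scan only when no EQ condition precedes it, and nothing after
            # it can be optimized.
            if count == 0:
                count = 1
            break
    return count

# Illustrative three-segment key and a helper to build conditions.
SEGS = [(0, 3, "STRING"), (3, 3, "STRING"), (6, 3, "STRING")]

def cond(seg, op, conn):
    return {"field": SEGS[seg], "operator": op,
            "connector": conn, "constant": True}
```

Run against the patterns in the Examples section, this yields 3 for (EQ AND (EQ AND (EQ))), 0 when the first connector is OR, 1 for a leading range condition, and 2 for (EQ AND (EQ AND (&lt;))).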
Examples
The examples below refer to the following sample data:
Consider the table above, which has a compound key on fields 1, 2, and 3, in that order. Assume that the application performs a GetFirst operation on this file using this three-segment key, followed by a GetNextExtended operation. The examples show parentheses in the positions where the MicroKernel implicitly places them; this is the only nesting possible when the filter is evaluated left to right.
Remember that in order to optimize against a key segment, the offset, length, type, case flag, and ACS identified in the filter condition must all match the key definition.
(Field1 = AAA AND (Field2 = AAA AND (Field3 = AAA)))
The MKDE retrieves record 1 and stops searching with status 64 (Optimization limit exceeded). The last record examined that satisfies the optimization criteria is record 1 which becomes the current record. Engines before Pervasive.SQL 2000 SP3 can optimize only one condition and thus leave the current record at record 3.
(Field1 = AAA OR (Field2 = AAA OR (Field3 = AAA)))
The MKDE retrieves records 1, 2, 3, 4, 5, 6, 7, 10, 13 & 14 and returns with status 9 (End of file reached). No condition can be optimized since the first condition contains an OR connector. The current record becomes record 14.
(Field1 = BBB AND (Field2 = BBB OR (Field3 = BBB)))
The MKDE retrieves records 5, 7, 8, 9 & 11 and returns with status 64. The first condition was optimized, but the second condition was not since it contained an OR connector. The last record examined that satisfies the optimization criteria is record 12, which becomes the current record.
(Field4 = OOO AND (Field2 = BBB AND (Field3 = BBB)))
The MKDE retrieves records 2 & 8 and returns with status 9. No condition can be optimized to the first key segment, so the following segments cannot be optimized. The current record becomes record 14.
(Field1 = BBB AND (first byte of Field2 = B AND (Field3 = BBB)))
The MKDE retrieves record 8 and returns with status 64. The first two conditions can be optimized, but since the second condition is a substring, the third condition cannot be optimized. The last record examined that satisfies the optimization criteria is record 9. Engines prior to Pervasive.SQL 2000 SP3 can optimize only one condition and thus return with record 12 as the current record.
(Field1 = BBB AND (Field2 = Field3))
This is done by using the +64 bias on the comparison code, which indicates that the second operand is another field of the record, rather than a constant. The MKDE retrieves records 4, 8 & 12 and returns with status 64. The first condition can be optimized, but since the second condition does not compare to a constant, it cannot be optimized. The last record examined that satisfies the optimization criteria is record 12.
(Field1 <= BBB AND (Field2 <= BBB AND (Field3 <= BBB)))
The MKDE retrieves records 1, 2, 4, 5, 7, & 8 and returns with status 64. The first condition can be optimized, but since it does not have a logical operator of EQ, the following conditions cannot be optimized. The last record examined that satisfies the optimization criteria is record 12.
(Field1 = BBB AND (Field2 < BBB AND (Field3 < BBB)))
The MKDE retrieves record 4 and returns with status 64. The first condition can be optimized, but since the second condition does not have a logical operator of EQ, it cannot be optimized. The last record examined that satisfies the optimization criteria would be record 12.
(Field1 = BBB AND (Field2 = BBB AND (Field3 < BBB)))
The MKDE retrieves record 7 and returns with status 64. The first two conditions can be optimized because they use EQ, but the third condition cannot. The last record examined that satisfies the optimization criteria is record 9. Engines prior to Pervasive.SQL 2000 SP3 can optimize on only one condition and so the current record would be 12.
(Field2 >= AAA AND (Field2 <= BBB AND (Field1 >= AAA AND (Field1 <= BBB))))
The MKDE retrieves records 1, 2, 4, 5, 6, 7, 8 & 9 and returns with status 64. The first three conditions cannot be optimized to the first key segment, but since they are all ANDed together, the fourth condition can be used to optimize the search. The second condition would be optimizable if it occurred immediately after the fourth condition. But since it is out of position relative to the key segments, it cannot be optimized. Since only one key segment is optimized, the last record examined that satisfies the optimization criteria would be record 12. Note that there is a defect in Pervasive.SQL 2000 SP3 that prevents optimization unless the optimizable condition occurs first. So the SP3 engine would retrieve the same records, but would return status 9.
Performance Tips
This section provides some information on how to speed up your operations.
Connectors
Because extended operations evaluate logical expressions strictly left to right, this feature is by no means a complete expression evaluator that can extract whatever data you want in the most efficient manner. Extended operations are designed to be used in conjunction with a Get or Step operation that initially positions the cursor at the correct place in the file. So the first suggestion is:
In other words, it may be appropriate to add some OR'ed conditions on fields not in the index after an optimizable condition on an indexed field. For example, let's say you are searching a nationwide phone book for everyone in Texas with the first name "William", "Bill", "Billy", or "Billybob". Using a key on the State field, you would use GetEqual to position the current record on the first person in Texas, then call GetNextExtended with a filter like (State = "Texas" AND (FirstName = "William" OR (FirstName = "Bill" OR (FirstName = "Billy" OR (FirstName = "Billybob"))))). If you specified a reject count of 10,000 and a return count of 100 records, the GetNextExtended call would return after examining about 10,000 records. With 14 million people in Texas, you would need to keep issuing the same GetNextExtended operation over and over until you finally reach Utah and get status 64 (Optimization Limit exceeded). Even so, this process is much faster than having each record transferred to your application one at a time.
But what if a compound index existed on State and FirstName? The GetNextExtended above would still work, but it would be much faster to do a GetEqual and GetNextExtended on each of the four State & FirstName combinations, optimizing on both fields.
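The reissue-until-done pattern described above can be sketched with a tiny in-memory stand-in for the engine. The class, method names, and exact status handling here are illustrative, not the Btrieve API; in particular, this stand-in returns status 0 when the reject count is exhausted, so the caller simply reissues the call.

```python
# In-memory stand-in for the GetEqual + repeated GetNextExtended pattern.
STATUS_OK, STATUS_EOF, STATUS_LIMIT = 0, 9, 64

class FakeEngine:
    def __init__(self, records):
        self.records = records            # assumed sorted by State
        self.pos = 0

    def get_equal(self, state):
        """Position on the first record for the given state."""
        for i, rec in enumerate(self.records):
            if rec["State"] == state:
                self.pos = i
                return STATUS_OK
        return STATUS_EOF

    def get_next_extended(self, state, names, reject_count, return_count):
        """Filter along the key path, in the engine, from the current position."""
        matches, rejected = [], 0
        while self.pos < len(self.records):
            rec = self.records[self.pos]
            if rec["State"] != state:     # passed the optimization limit
                return STATUS_LIMIT, matches
            self.pos += 1
            if rec["FirstName"] in names:
                matches.append(rec)
                if len(matches) == return_count:
                    return STATUS_OK, matches
            else:
                rejected += 1
                if rejected == reject_count:
                    return STATUS_OK, matches
        return STATUS_EOF, matches

def find_all(engine, state, names, reject_count=10000, return_count=100):
    """GetEqual to position, then reissue GetNextExtended until done."""
    if engine.get_equal(state) != STATUS_OK:
        return []
    results = []
    while True:
        status, batch = engine.get_next_extended(
            state, names, reject_count, return_count)
        results.extend(batch)
        if status != STATUS_OK:           # status 64 or 9: scan is finished
            return results
```

The point of the sketch is the shape of the loop: the application never sees the rejected records, and the scan ends cleanly when the engine reports it has moved past the optimized key range or reached end of file.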
So you can see that filters with OR connectors are useful only when no suitable index is available; priority should be placed on AND connectors for fields that match the key.
Reject Count
Another issue you should understand is how to set the reject count. If your application is the only one being serviced by the MicroKernel engine, using the maximum reject count is most efficient, since it keeps network traffic or inter-process communication to a minimum. However, in a highly concurrent environment with many applications running, a reject count that is too large can have serious consequences.
The MicroKernel can handle many Btrieve requests concurrently, even while performing Btrieve operations atomically on a file. It allows any number of reader threads, but only one writer thread, to access the same file at once. Most Btrieve operations are read operations, and they do not take much time to complete. When a write operation comes in, it waits until all readers are finished, then locks the file for the instant it takes to insert, update, or delete a record. This coordination works well until a read operation arrives that takes a long time to finish, which is exactly what an extended operation with a high reject count does when it finds no matching records: it just keeps reading and reading. Other read operations can still complete without a problem, but write operations start to back up. After a write operation has tried to get write access to the file 100 times, it reaches what is called a frustration count and blocks all new reader threads. At that point, every Btrieve operation on the file is stalled until the extended operation finishes.
Even with a reject count of only 100 to 1,000, it is better to have the MicroKernel read and reject records in the engine than to have them returned to your application only to be rejected there.
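As a rough back-of-the-envelope check on this advice, the sketch below (illustrative only, with numbers chosen for the example) bounds the number of client-server round trips when rejection happens in the engine, versus one round trip per record when the application filters.

```python
import math

# Rough upper bound on GetNextExtended round trips: each call ends either
# by filling the return count with matches or by exhausting the reject
# count on non-matching records.
def max_extended_calls(examined, matched, reject_count, return_count):
    rejected = examined - matched
    return (math.ceil(rejected / reject_count)
            + math.ceil(matched / return_count))

# Scanning 1,000,000 records for 500 matches with a reject count of 1,000
# and a return count of 100 needs at most about 1,005 round trips, where
# fetching every record into the application to filter would take 1,000,000.
```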