Multi-Record Operations
Pervasive PSQL provides a high-performance mechanism for filtering and returning multiple records, or portions of multiple records, in a single call. The mechanism is called "extended operations" and is supported by four specific operation codes: Get Next Extended, Get Previous Extended, Step Next Extended, and Step Previous Extended.
For detailed information on how to code these operations, see the Btrieve API Guide. This section explains how to optimize your use of these operations for best performance.
Terminology
The following words can have other meanings in other contexts. For the purposes of this section, definitions are provided below.
Descriptor
Also called the Extended Expression. The whole contents of the data buffer which describes how the Btrieve extended operation should be accomplished.
Filter
A portion of the Extended Expression that describes the selection criteria to apply to the records being examined.
Condition
A portion of a filter that uses a single logical operator.
Connector
A Boolean operator that connects a condition to what follows it: either AND, OR, or NONE.
Extractor
A portion of the Extended Expression which defines what data to return.
Key
A whole index definition which may have multiple segments. Get operations require the MicroKernel to move through the data file along a single key path.
Key Segment
Compound indices, also called multi-segmented keys, can have multiple segment definitions. Each segment defines an offset, length, data type, and so on.
Background
The filter evaluation mechanism for extended operations is designed to be very fast. It evaluates expressions in a straightforward manner, without doing any extra processing. Because of this approach, you should be aware of some of the idiosyncrasies of the interface.
Consider a search such as ((Age = 32 AND Sex = "M") OR (Age = 30 AND Sex = "F")). To be most efficient, this kind of search must jump around the file somewhat, but extended operations do not jump around. They move along a single logical key or record path, one record at a time. A compound logical expression like the one above would require an optimizer like the one in the SQL Relational Database Engine (SRDE) that is part of Pervasive PSQL. Instead, the MicroKernel evaluates expressions strictly left to right to make the filter evaluation process as fast as possible.
To perform the search above, the caller must make two Get Extended calls, each preceded by a GetEqual operation that positions the cursor on the first candidate record. If you instead passed the four conditions from the example above in a single extended operation, the filter would be evaluated as (Age = 32 AND (Sex = "M" OR (Age = 30 AND Sex = "F"))). In other words, for each record the MicroKernel evaluates the first condition, then looks at its Boolean connector to determine whether it must evaluate the next condition. If the first condition is false, the AND connector makes the whole expression false.
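To make the left-to-right rule concrete, here is a minimal Python sketch of the evaluation order. The function and constant names are illustrative only; they are not part of the Btrieve API.

```python
# Minimal sketch of left-to-right filter evaluation as described above.
# Each connector binds its condition to everything that follows it.
AND, OR, NONE = "AND", "OR", "NONE"

def evaluate_filter(record, conditions):
    """conditions: list of (predicate, connector); the last connector is NONE."""
    for predicate, connector in conditions:
        value = predicate(record)
        if connector == NONE:
            return value                  # last condition decides
        if connector == AND and not value:
            return False                  # False AND (...) is False
        if connector == OR and value:
            return True                   # True OR (...) is True
        # otherwise the result depends on the remaining conditions

# The four conditions from the example, chained as the MicroKernel
# sees them: (Age = 32 AND (Sex = "M" OR (Age = 30 AND Sex = "F")))
conditions = [
    (lambda r: r["Age"] == 32, AND),
    (lambda r: r["Sex"] == "M", OR),
    (lambda r: r["Age"] == 30, AND),
    (lambda r: r["Sex"] == "F", NONE),
]
```

Note that a 30-year-old female matches the intended search ((Age = 32 AND Sex = "M") OR (Age = 30 AND Sex = "F")) but fails this filter: Age = 32 is false, and its AND connector short-circuits the whole expression. That is precisely why the caller must split the search into two Get Extended calls.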
Validation
There are many ways an extended operation can return status 62 (Invalid Descriptor). Some of the most common are listed below:
Optimization
Optimizing an extended operation means that the MicroKernel can stop looking at records along the current key path because there is no possibility that records remaining on the key path can satisfy the filter. Starting with Pervasive PSQL 2000i SP3, the MicroKernel can optimize on multiple conditions as long as they occur sequentially and match the segments of the current key.
As it evaluates each condition, the MicroKernel determines if the given segment can be optimized. In order to do that, all the following must be true:
The actual direction depends both on the direction indicated by the operation and on whether the current key is ascending or descending.
Any one of the following can cause optimization to be turned off for all future segments.
*Note: There is a defect in Pervasive PSQL 2000i SP3 that requires the optimizable conditions to occur first in the filter. The ability to place non-optimizable conditions before the optimizable ones is therefore available only in releases after SP3. (Engines before SP3 also allowed this ordering, but could optimize only one condition.)
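The rules above can be sketched as a small Python routine that counts how many leading filter conditions the MicroKernel can optimize against the key segments. This is a simplified model of my own construction: it assumes the conditions are listed in key-segment order and ignores the substring and out-of-order cases discussed in the examples. The authoritative rules are in the Btrieve API Guide.

```python
# Simplified sketch of the optimization rules described above. A condition
# is a dict; key segments are (offset, length, type) tuples.
def optimizable_prefix(conditions, key_segments):
    """Count how many leading conditions can be optimized."""
    count = 0
    for c, segment in zip(conditions, key_segments):
        if c["field"] != segment:        # must match the key segment exactly
            break
        if not c["constant"]:            # must compare against a constant
            break
        if c["connector"] == "OR":       # an OR makes even this one unusable
            break
        if c["operator"] == "EQ":
            count += 1
            if c["connector"] == "NONE": # end of the filter
                break
        else:
            # In the examples below, a range operator (e.g. <=) bounds the
            # scan only when no EQ condition precedes it, and nothing after
            # it can be optimized.
            if count == 0:
                count = 1
            break
    return count

# Illustrative three-segment key and a helper to build conditions.
SEGS = [(0, 3, "STRING"), (3, 3, "STRING"), (6, 3, "STRING")]

def cond(seg, op, conn):
    return {"field": SEGS[seg], "operator": op,
            "connector": conn, "constant": True}
```

Run against the patterns in the Examples section, this yields 3 for (EQ AND (EQ AND (EQ))), 0 when the first connector is OR, 1 for a leading range condition, and 2 for (EQ AND (EQ AND (&lt;))).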
Examples
The examples below refer to the following sample data:
Consider the table above, which has a compound key on fields 1, 2, and 3, in that order. Assume that the application performs a GetFirst operation on this file using this three-segment key, followed by a GetNextExtended operation. The examples show parentheses in the positions where the MicroKernel implicitly places them; this is the only nesting possible when the filter is evaluated left to right.
Remember that in order to optimize against a key segment, the offset, length, type, case flag, and ACS identified in the filter condition must all match the key definition.
(Field1 = AAA AND (Field2 = AAA AND (Field3 = AAA)))
The MKDE retrieves record 1 and stops searching with status 64 (Optimization limit exceeded). The last record examined that satisfies the optimization criteria is record 1 which becomes the current record. Engines before Pervasive.SQL 2000 SP3 can optimize only one condition and thus leave the current record at record 3.
(Field1 = AAA OR (Field2 = AAA OR (Field3 = AAA)))
The MKDE retrieves records 1, 2, 3, 4, 5, 6, 7, 10, 13 & 14 and returns with status 9 (End of file reached). No condition can be optimized since the first condition contains an OR connector. The current record becomes record 14.
(Field1 = BBB AND (Field2 = BBB OR (Field3 = BBB)))
The MKDE retrieves records 5, 7, 8, 9 & 11 and returns with status 64. The first condition was optimized, but the second condition was not since it contained an OR connector. The last record examined that satisfies the optimization criteria is record 12, which becomes the current record.
(Field4 = OOO AND (Field2 = BBB AND (Field3 = BBB)))
The MKDE retrieves records 2 & 8 and returns with status 9. No condition can be optimized to the first key segment, so the following segments cannot be optimized. The current record becomes record 14.
(Field1 = BBB AND (first byte of Field2 = B AND (Field3 = BBB)))
The MKDE retrieves record 8 and returns with status 64. The first two conditions can be optimized, but since the second condition is a substring, the third condition cannot be optimized. The last record examined that satisfies the optimization criteria is record 9. Engines prior to Pervasive.SQL 2000 SP3 can optimize only one condition and thus return with record 12 as the current record.
(Field1 = BBB AND (Field2 = Field3))
This is done by using the +64 bias on the comparison code, which indicates that the second operand is another field of the record, rather than a constant. The MKDE retrieves records 4, 8 & 12 and returns with status 64. The first condition can be optimized, but since the second condition does not compare to a constant, it cannot be optimized. The last record examined that satisfies the optimization criteria is record 12.
(Field1 <= BBB AND (Field2 <= BBB AND (Field3 <= BBB)))
The MKDE retrieves records 1, 2, 4, 5, 7, & 8 and returns with status 64. The first condition can be optimized, but since it does not have a logical operator of EQ, the following conditions cannot be optimized. The last record examined that satisfies the optimization criteria is record 12.
(Field1 = BBB AND (Field2 < BBB AND (Field3 < BBB)))
The MKDE retrieves record 4 and returns with status 64. The first condition can be optimized, but since the second condition does not have a logical operator of EQ, it cannot be optimized. The last record examined that satisfies the optimization criteria would be record 12.
(Field1 = BBB AND (Field2 = BBB AND (Field3 < BBB)))
The MKDE retrieves record 7 and returns with status 64. The first two conditions can be optimized because they use EQ, but the third condition cannot. The last record examined that satisfies the optimization criteria is record 9. Engines prior to Pervasive.SQL 2000 SP3 can optimize on only one condition and so the current record would be 12.
(Field2 >= AAA AND (Field2 <= BBB AND (Field1 >= AAA AND (Field1 <= BBB))))
The MKDE retrieves records 1, 2, 4, 5, 6, 7, 8 & 9 and returns with status 64. The first three conditions cannot be optimized to the first key segment, but since they are all ANDed together, the fourth condition can be used to optimize the search. The second condition would be optimizable if it occurred immediately after the fourth condition. But since it is out of position relative to the key segments, it cannot be optimized. Since only one key segment is optimized, the last record examined that satisfies the optimization criteria would be record 12. Note that there is a defect in Pervasive.SQL 2000 SP3 that prevents optimization unless the optimizable condition occurs first. So the SP3 engine would retrieve the same records, but would return status 9.
Performance Tips
This section provides some information on how to speed up your operations.
Connectors
Because extended operations evaluate logical expressions strictly left to right, this feature is by no means a complete expression evaluator that can extract whatever data you want in the most efficient manner. Extended operations are designed to be used in conjunction with a Get or Step operation that initially positions the cursor at the correct place in the file. So the first suggestion is:
In other words, it may be appropriate to add some OR'ed conditions on fields not in the index after an optimizable condition on an indexed field. For example, let's say you are searching a nationwide phone book for everyone in Texas with the first name "William", "Bill", "Billy", or "Billybob". Using a key on the State field, you would use GetEqual to position the current record on the first person in Texas, then call GetNextExtended with a filter like (State = "Texas" AND (FirstName = "William" OR (FirstName = "Bill" OR (FirstName = "Billy" OR (FirstName = "Billybob"))))). If you specified a reject count of 10,000 and a return count of 100 records, the GetNextExtended call would return after examining about 10,000 records. With 14 million people in Texas, you would need to keep issuing the same GetNextExtended operation over and over until you finally reach Utah and get status 64 (Optimization Limit exceeded). Even so, this process is much faster than having each record transferred to your application one at a time.
But what if a compound index existed on State and FirstName? The GetNextExtended above would still work, but it would be much faster to do a GetEqual and GetNextExtended on each of the four State & FirstName combinations, optimizing on both fields.
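The reissue-until-done pattern described above can be sketched with a tiny in-memory stand-in for the engine. The class, method names, and exact status handling here are illustrative, not the Btrieve API; in particular, this stand-in returns status 0 when the reject count is exhausted, so the caller simply reissues the call.

```python
# In-memory stand-in for the GetEqual + repeated GetNextExtended pattern.
STATUS_OK, STATUS_EOF, STATUS_LIMIT = 0, 9, 64

class FakeEngine:
    def __init__(self, records):
        self.records = records            # assumed sorted by State
        self.pos = 0

    def get_equal(self, state):
        """Position on the first record for the given state."""
        for i, rec in enumerate(self.records):
            if rec["State"] == state:
                self.pos = i
                return STATUS_OK
        return STATUS_EOF

    def get_next_extended(self, state, names, reject_count, return_count):
        """Filter along the key path, in the engine, from the current position."""
        matches, rejected = [], 0
        while self.pos < len(self.records):
            rec = self.records[self.pos]
            if rec["State"] != state:     # passed the optimization limit
                return STATUS_LIMIT, matches
            self.pos += 1
            if rec["FirstName"] in names:
                matches.append(rec)
                if len(matches) == return_count:
                    return STATUS_OK, matches
            else:
                rejected += 1
                if rejected == reject_count:
                    return STATUS_OK, matches
        return STATUS_EOF, matches

def find_all(engine, state, names, reject_count=10000, return_count=100):
    """GetEqual to position, then reissue GetNextExtended until done."""
    if engine.get_equal(state) != STATUS_OK:
        return []
    results = []
    while True:
        status, batch = engine.get_next_extended(
            state, names, reject_count, return_count)
        results.extend(batch)
        if status != STATUS_OK:           # status 64 or 9: scan is finished
            return results
```

The point of the sketch is the shape of the loop: the application never sees the rejected records, and the scan ends cleanly when the engine reports it has moved past the optimized key range or reached end of file.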
So you can see that filters with OR connectors are useful only when no suitable index is available; priority should be placed on AND connectors for fields that match the key.
Reject Count
Another issue you should understand is how to set the reject count. If your application is the only one being serviced by the MicroKernel engine, using the maximum reject count is most efficient, since it keeps network traffic or inter-process communication to a minimum. However, in a highly concurrent environment with many applications running, a reject count that is too large can have serious consequences.
The MicroKernel can handle many Btrieve requests concurrently, even while performing Btrieve operations atomically on a file. It allows any number of reader threads, but only one writer thread, to access the same file at once. Most Btrieve operations are read operations, and they do not take much time to complete. When a write operation comes in, it waits until all readers are finished, then locks the file for the instant it takes to insert, update, or delete a record. This coordination works well until a read operation arrives that takes a long time to finish, which is exactly what an extended operation with a high reject count does when it finds no matching records: it just keeps reading and reading. Other read operations can still complete without a problem, but write operations start to back up. After a write operation has tried to get write access to the file 100 times, it reaches what is called a frustration count and blocks all new reader threads. At that point, every Btrieve operation on the file is stalled until the extended operation finishes.
Even with a reject count of only 100 to 1,000, it is better to have the MicroKernel read and reject records in the engine than to have them returned to your application only to be rejected there.
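As a rough back-of-the-envelope check on this advice, the sketch below (illustrative only, with numbers chosen for the example) bounds the number of client-server round trips when rejection happens in the engine, versus one round trip per record when the application filters.

```python
import math

# Rough upper bound on GetNextExtended round trips: each call ends either
# by filling the return count with matches or by exhausting the reject
# count on non-matching records.
def max_extended_calls(examined, matched, reject_count, return_count):
    rejected = examined - matched
    return (math.ceil(rejected / reject_count)
            + math.ceil(matched / return_count))

# Scanning 1,000,000 records for 500 matches with a reject count of 1,000
# and a return count of 100 needs at most about 1,005 round trips, where
# fetching every record into the application to filter would take 1,000,000.
```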