Retrieving Data
To retrieve data efficiently, return only the data that you need, and choose the most efficient method of doing so. The guidelines in this section will help you to optimize system performance when retrieving data with .NET applications.
Understanding the Architecture
ADO.NET accesses data through ADO.NET data providers; all runtime data access in a .NET application goes through these providers.
ADO.NET development wizards use OLE DB to populate their design-time grids. Although OLE DB is accessed for these wizards at design time, the runtime components that the wizards generate use ADO.NET data providers to access data. Optimizing data access via OLE DB in a .NET application is therefore not a useful exercise. Instead, optimize the ADO.NET data provider, the component that does all of the runtime work.
Retrieving Long Data
Unless it is necessary, applications should not request long data, such as large character or binary objects, because retrieving long data across a network is slow and resource-intensive.
Most users don't want to see long data. If the user wants to see these result items, then the application can query the database again, specifying only the long columns in the select list. This method allows the average user to retrieve the result set without having to pay a high performance penalty for network traffic.
Although the best method is to exclude long data from the select list, some applications do not formulate the select list before sending the query to the ADO.NET data provider (that is, they use syntax such as select * from <table name> ...). If the select list contains long data, some data providers must retrieve that data at fetch time even if the application does not bind the long data in the result set. When possible, implement a method that does not retrieve all columns of the table.
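As an illustration, the following sketch retrieves only the columns a list view needs and fetches the long column separately, on demand. It assumes the Microsoft SqlClient provider and a hypothetical Employees table with a large Photo column; the connection string and all object names are placeholders to adapt to your environment.

using System;
using Microsoft.Data.SqlClient;

// Hypothetical connection string; replace with your own.
const string connectionString = "Data Source=myServer;Initial Catalog=myDatabase;Integrated Security=true";

using (var connection = new SqlConnection(connectionString))
{
    connection.Open();

    // List only the columns the application needs; avoid SELECT *.
    using (var command = new SqlCommand(
        "SELECT EmployeeId, LastName FROM Employees", connection))
    using (var reader = command.ExecuteReader())
    {
        while (reader.Read())
        {
            Console.WriteLine($"{reader.GetInt32(0)}: {reader.GetString(1)}");
        }
    }

    // Fetch the long column only when a user asks for it, and only for that row.
    using (var photoCommand = new SqlCommand(
        "SELECT Photo FROM Employees WHERE EmployeeId = @id", connection))
    {
        photoCommand.Parameters.AddWithValue("@id", 42);
        byte[] photo = (byte[])photoCommand.ExecuteScalar();
        Console.WriteLine($"Photo: {photo.Length} bytes");
    }
}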
Sometimes long data must be retrieved. When this is the case, remember that most users do not want to see 100 KB, or more, of text on the screen.
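When long data must be displayed, the application can stream the value in manageable chunks rather than buffering it all at once. The sketch below uses CommandBehavior.SequentialAccess and GetBytes against the same hypothetical Employees.Photo column; again, the provider and object names are assumptions, not part of the original example.

using System.Data;
using System.IO;
using Microsoft.Data.SqlClient;

const string connectionString = "Data Source=myServer;Initial Catalog=myDatabase;Integrated Security=true";

using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand(
    "SELECT Photo FROM Employees WHERE EmployeeId = @id", connection))
{
    connection.Open();
    command.Parameters.AddWithValue("@id", 42);

    // SequentialAccess streams the column instead of buffering the entire value.
    using (var reader = command.ExecuteReader(CommandBehavior.SequentialAccess))
    using (var output = File.Create("photo.jpg"))
    {
        if (reader.Read())
        {
            var buffer = new byte[8192];
            long offset = 0;
            long bytesRead;
            while ((bytesRead = reader.GetBytes(0, offset, buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, (int)bytesRead);
                offset += bytesRead;
            }
        }
    }
}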
Reducing the Size of Data Retrieved
To reduce network traffic and improve performance, you can reduce the size of any data being retrieved to some manageable limit by using a Set Max Rows or Set Max Field Size setting, or another database-specific command that limits the number of rows or the size of each field returned. Another way to reduce the size of the data being retrieved is to decrease the column size. If the data provider allows you to define the packet size, use the smallest packet size that meets your needs.
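The sketch below illustrates these ideas with SQL Server syntax: a smaller packet size set through the connection string, SET TEXTSIZE to cap the bytes returned for each long column, and TOP to cap the number of rows. Other databases expose equivalent but differently named commands (for example, LIMIT instead of TOP); the server, database, and table names are illustrative.

using Microsoft.Data.SqlClient;

// A smaller network packet can reduce overhead for small result sets;
// Packet Size is a SqlClient connection-string keyword.
var builder = new SqlConnectionStringBuilder
{
    DataSource = "myServer",
    InitialCatalog = "myDatabase",
    IntegratedSecurity = true,
    PacketSize = 4096
};

using (var connection = new SqlConnection(builder.ConnectionString))
{
    connection.Open();

    // SET TEXTSIZE caps how much of each long column the server returns.
    using (var limit = new SqlCommand("SET TEXTSIZE 32768", connection))
    {
        limit.ExecuteNonQuery();
    }

    // TOP caps the number of rows returned.
    using (var command = new SqlCommand(
        "SELECT TOP (100) EmployeeId, LastName FROM Employees", connection))
    using (var reader = command.ExecuteReader())
    {
        while (reader.Read())
        {
            // process rows
        }
    }
}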
In addition, be careful to return only the columns you need. If you return five columns when you only need two, performance is decreased, especially if the unnecessary columns include long data.
Using CommandBuilder Objects
It is tempting to use CommandBuilder objects because they generate SQL statements for you. However, this shortcut can have a negative effect on performance. Because of concurrency restrictions, the CommandBuilder does not generate efficient SQL statements. For example, the CommandBuilder created the following Update statement:
CommandText: UPDATE TEST01.EMP SET EMPNO = ?, ENAME = ?, JOB = ?, MGR = ?, HIREDATE = ?, SAL = ?, COMM = ?, DEPT = ?
WHERE ( (EMPNO = ?)
AND ((ENAME IS NULL AND ? IS NULL) OR (ENAME = ?))
AND ((JOB IS NULL AND ? IS NULL) OR (JOB = ?))
AND ((MGR IS NULL AND ? IS NULL) OR (MGR = ?))
AND ((HIREDATE IS NULL AND ? IS NULL) OR (HIREDATE = ?))
AND ((SAL IS NULL AND ? IS NULL) OR (SAL = ?))
AND ((COMM IS NULL AND ? IS NULL) OR (COMM = ?))
AND ((DEPT IS NULL AND ? IS NULL) OR (DEPT = ?)) )
You can often write more efficient Update and Delete statements than those that the CommandBuilder generates.
Another drawback is implicit in the design of the CommandBuilder object. A CommandBuilder object is always associated with a DataAdapter object and registers itself as a listener for the RowUpdating and RowUpdated events that the DataAdapter generates. This means that two events must be processed for every row that is updated.
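If your concurrency requirements allow it, consider supplying the update logic yourself. The following sketch, assuming the EMP table from the generated statement above and the SqlClient provider, gives the DataAdapter a hand-written UpdateCommand that sets only the changed columns and checks only the primary key; whether that concurrency model is acceptable depends on your application.

using System.Data;
using Microsoft.Data.SqlClient;

const string connectionString = "Data Source=myServer;Initial Catalog=myDatabase;Integrated Security=true";

using (var connection = new SqlConnection(connectionString))
{
    var adapter = new SqlDataAdapter(
        "SELECT EMPNO, ENAME, SAL FROM TEST01.EMP", connection);

    // Hand-written update: set only the changed columns and check only the key.
    var update = new SqlCommand(
        "UPDATE TEST01.EMP SET ENAME = @ename, SAL = @sal WHERE EMPNO = @empno",
        connection);
    update.Parameters.Add("@ename", SqlDbType.VarChar, 30, "ENAME");
    update.Parameters.Add("@sal", SqlDbType.Decimal, 0, "SAL");
    update.Parameters.Add("@empno", SqlDbType.Int, 0, "EMPNO");
    adapter.UpdateCommand = update;

    var employees = new DataTable();
    adapter.Fill(employees);

    // ... modify rows in the DataTable ...

    adapter.Update(employees);
}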
Choosing the Right Data Type
Retrieving and sending certain data types can be expensive. When you design a schema, select the data type that can be processed most efficiently. For example, integer data is processed faster than floating-point data. Floating-point data is stored in internal, database-specific formats, usually compressed, and must be decompressed and converted into a different format before the wire protocol can process it.
Processing time is shortest for character strings, followed by integers, which usually require some conversion or byte ordering. Processing floating-point data and timestamps is at least twice as slow as processing integers.
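As a schema-level illustration only (the table and column names are hypothetical), the following sketch creates a table that favors integer and date columns where the data allows it, for example storing a price as whole cents rather than as a floating-point amount.

using Microsoft.Data.SqlClient;

const string connectionString = "Data Source=myServer;Initial Catalog=myDatabase;Integrated Security=true";

// Hypothetical table: integer and date columns chosen over floating-point
// and timestamp columns where the data allows it.
const string ddl = @"
    CREATE TABLE OrderLines (
        OrderId    INT  NOT NULL,  -- integer key: inexpensive to transfer and convert
        Quantity   INT  NOT NULL,  -- integer rather than FLOAT
        UnitCents  INT  NOT NULL,  -- price in whole cents avoids floating-point conversion
        OrderDate  DATE NOT NULL   -- date only, when time of day is not needed
    )";

using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand(ddl, connection))
{
    connection.Open();
    command.ExecuteNonQuery();
}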