Hash Join Node

12. Using the Query Optimizer : Types of Nodes in a QEP : Join Nodes in a QEP : Hash Join Node

Share this page

Hash Join Node

The hash is an optimized join that replaces the FSM join when one or both of the subtrees must be sorted. It functions by loading the rows of one subtree into a memory resident hash table, keyed on the values of the join columns. The rows of the other subtree are hashed on their join key values into the hash table, allowing very efficient matching of joined rows. By avoiding the sort(s) of the FSM join, the hash join can be much more efficient.

A typical abbreviated example of a QEP diagram involving a hash join is shown below:

Join

/ \

proj-rest proj-rest

/ /

table table

This node is displayed with the following information on a QEP diagram:

• A label identifying it as a hash join node, along with the columns on which join processing is done

• If an outer join has been requested, one of the following labels indicating the type of join:

[LEFT JOIN]
[FULL JOIN]
[RIGHT JOIN]

• Storage structure (which is usually heap) or a list of sort columns if the data is being sorted

• Total number of pages and tuples

• Query cost information

• Optionally, sort information (whether the data produced by this node is already sorted or requires sorting)

Like the FSM join, the hash join is most common when a bulk join takes place with no row restrictions on either table involved in the join, as with a SELECT statement of the following form:

select from r1,r2 where r1.joincol= r2.joincol;

This example shows a QEP diagram with hash join node resulting from such a bulk join:

select a.col2, b.col2 from arel a, brel b

where a.col1 = b.col2;

QUERY PLAN 1,1, no timeout, of main query

HASH Join(col1)

Heap

Pages 1 Tups 156

D9 C40

/ \

Proj-rest Proj-rest

Heap Heap

Pages 1 Tups 156 Pages 1 Tups 156

D8 C1 D1 C1

/ /

arel(a) brel (b)

Hashed (NU) Isam (NU)

Pages 70 Tups 156 Pages 3 Tups 156