|
A logical join is a method of retrieving data from a primary database using criteria stored in a set of secondary databases. A logical join requires that your data be organized as a primary database which contains the primary key and primary data field, and a set of secondary databases. Each of the secondary databases is indexed by a different secondary key, and, for each key in a secondary database, there is a set of duplicate data items that match the primary keys in the primary database.
For example, consider a database that lists types of fruit as the key item, and the store where you can buy them as the data item:
Key: Data: apple Convenience Store peach Shopway pear Farmer's Market raspberry Shopway strawberry Farmer's Market blueberry Farmer's Market
A secondary index might have a key color, and, as the data items, the list of fruits of different colors.
Key: Data: red apple, raspberry, strawberry yellow peach, pear blue blueberry
This secondary index would allow an application to look up a color, and then use the data items to look up the stores where the colored fruit could be purchased. For example, by first looking up blue, the data item blueberry could be used as the lookup key in the primary database, returning Farmer's Market.
If there were a another secondary index that had as its key the cost of the fruit, a similar lookup could be done on stores where inexpensive fruit could be purchased:
Key: Data: expensive blueberry, peach, pear, strawberry inexpensive apple, pear, raspberry
The DB->join function provides logical join functionality. While not strictly cursor functionality, in that it is not a method off a cursor handle, it is more related to the cursor operations than to the standard DB operations.
Your data must be organized in the following manner in order to use the DB->join function:
These duplicate entries should be sorted for performance reasons, although it is not required. For more information see the DB_DUPSORT flag to the DB->set_flags function.
What the DB->join function does is review a list of secondary keys, and, when it finds a data item that appears as a data item for all of the secondary keys, it uses that data items as a lookup into the primary database, and returns the associated data item.
The DB->join function, makes it possible to do lookups based on multiple criteria in a single operation, e.g., it would be possible to look up fruits that were both red and expensive in a single operation. If the same fruit appeared as a data item in both the color and expense indices, then that fruit name would be used as the key for retrieval from the primary index, and would then return the store where expensive, red fruit could be purchased.
Consider the following three databases:
Consider the following query:
Return the personnel records of all smiths who are managers.
This query finds are all the records in the primary database (personnel) for whom the criteria lastname=smith and job title=manager is true.
Assume that all databases have been properly opened and have the handles: pers_db, name_db, job_db. We also assume that we have an active transaction referenced by the handle txn.
DBC *name_curs, *job_curs, *join_curs; DBC *carray[3]; DBT key, data; int ret, tret;name_curs = NULL; job_curs = NULL; memset(&key, 0, sizeof(key)); memset(&data, 0, sizeof(data));
if ((ret = name_db->cursor(name_db, txn, &name_curs)) != 0) goto err; key.data = "smith"; key.size = sizeof("smith"); if ((ret = name_curs->c_get(name_curs, &key, &data, DB_SET)) != 0) goto err;
if ((ret = job_db->cursor(job_db, txn, &job_curs)) != 0) goto err; key.data = "manager"; key.size = sizeof("manager"); if ((ret = job_curs->c_get(job_curs, &key, &data, DB_SET)) != 0) goto err;
carray[0] = name_curs; carray[1] = job_curs; carray[2] = NULL;
if ((ret = pers_db->join(pers_db, carray, 0, &join_curs)) != 0) goto err; while ((ret = join_curs->c_get(join_curs, &key, &data, 0)) == 0) { /* Process record returned in key/data. */ }
/* * If we exited the loop because we ran out of records, * then it has completed successfully. */ if (ret == DB_NOTFOUND) ret = 0;
err: if (join_curs != NULL && (tret = join_curs->c_close(join_curs)) != 0 && ret == 0) ret = tret; if (name_curs != NULL && (tret = name_curs->c_close(name_curs)) != 0 && ret == 0) ret = tret; if (job_curs != NULL && (tret = job_curs->c_close(job_curs)) != 0 && ret == 0) ret = tret;
return (ret);
The name cursor is placed at the beginning of the duplicate list for smith and the job cursor is placed at the beginning of the duplicate list for manager. The join cursor is returned from the logical join call. This code then loops over the join cursor getting the personnel records of each one until there are no more.