dtSearch Text Retrieval Engine Programmer's Reference
dtsSearchFilter Class

Used to efficiently limit the results of a search to a specified subset of one or more indexes.

File: dtsearch.h

Syntax
C++
class dtsSearchFilter;
Overview

The dtsSearchFilter class provides a way to designate which documents can be returned by a search. It is useful in situations where a text search must be combined with a search of a database. The database search is done first, and then the results of the database search are used to limit the dtSearch search.

To use a search filter to limit a search:

  1. Create the search filter, or read a previously created search filter from a disk file.
  2. Attach the search filter to a dtsSearchJob by assigning the filter's handle to the job's searchFilterHandle member.
Document Ids

Search filters do not use names to identify documents because a filter may specify thousands, or hundreds of thousands, of documents, and a table of filenames would take too much memory and would take too long to check. Instead, each document is identified by (a) the index it belongs to, and (b) the document's document id, or docId, a unique integer that is assigned to each document in an index. The docId for a document can be obtained by searching for the document by name, and then examining the document's properties in search results. In the C++ API, the docId is returned as dtsSearchResultsItem.docId. See: Document Ids 

The first document added to an index has the docId 1, and subsequent documents have sequentially numbered docIds 2, 3, 4, and so forth. When a document is reindexed, its docId is "cancelled" and a new docId is assigned. Compressing an index renumbers all of the docIds, so a document's docId may change after its index has been compressed.
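
The numbering rules above can be illustrated with a small model. This is a minimal sketch, not dtSearch code: ToyIndex, addOrReindex, compress, and docIdOf are hypothetical names, and the assumption that compression keeps documents in their existing docId order is made only for the illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical toy model of docId assignment; not the dtSearch implementation.
class ToyIndex {
public:
    // Adds a new document or reindexes an existing one. Either way the
    // document receives the next unused docId. A previous docId for the
    // same name is dropped ("cancelled") and is not reused until compress().
    int addOrReindex(const std::string& name) {
        docIds_[name] = nextId_;
        return nextId_++;
    }

    // Renumbers the surviving documents 1..N.
    // Assumption for this sketch: the existing docId order is preserved.
    void compress() {
        std::map<int, std::string> byId;
        for (const auto& entry : docIds_) byId[entry.second] = entry.first;
        nextId_ = 1;
        for (const auto& entry : byId) docIds_[entry.second] = nextId_++;
    }

    // Returns the current docId, or 0 if the name is not in the index.
    int docIdOf(const std::string& name) const {
        auto it = docIds_.find(name);
        return it == docIds_.end() ? 0 : it->second;
    }

private:
    std::map<std::string, int> docIds_;
    int nextId_ = 1;
};
```

In this model, adding a.txt, b.txt, and c.txt and then reindexing a.txt leaves a.txt with docId 4, and docId 1 no longer refers to any document. After compress(), b.txt, c.txt, and a.txt hold docIds 1, 2, and 3. A docId recorded before either operation would therefore point at nothing or at a different document.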

A docId that is selected may be returned in search results. A document that is not selected will not be returned in search results, even if it otherwise satisfies the search request. 

If the criteria for the search filter can be expressed as one or more search requests, you can use selectItemsBySearch to select documents in the search filter.

Indexes and Index identifiers

A search filter can cover any number of indexes. To add an index to a search filter, call addIndex() with the full path to the index. The path must be expressed exactly as it will be expressed in the search job. The addIndex() function returns an integer that is used to identify that index when selecting and de-selecting documents for the filter. (This makes the selection and de-selection functions, which may be called thousands of times, more efficient.)
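
The benefit of an integer identifier can be sketched with a toy registry. This is an illustration under stated assumptions, not the dtSearch implementation: ToyIndexRegistry and pathOf are hypothetical names, and the choices to number identifiers from 0 and to return the existing identifier for a repeated path are assumptions of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy registry; not the dtSearch implementation.
// It shows why a caller registers an index path once and then refers
// to the index by a small integer in every later call.
class ToyIndexRegistry {
public:
    // Returns the identifier for `path`, registering it on first use.
    // Paths are compared exactly, character for character, so two
    // different spellings of the same location get different identifiers.
    int addIndex(const std::string& path) {
        for (std::size_t i = 0; i < paths_.size(); ++i) {
            if (paths_[i] == path) return static_cast<int>(i);
        }
        paths_.push_back(path);
        return static_cast<int>(paths_.size() - 1);
    }

    // Looks up a path by identifier: a constant-time vector access,
    // with no string comparison.
    const std::string& pathOf(int id) const {
        assert(id >= 0 && static_cast<std::size_t>(id) < paths_.size());
        return paths_[static_cast<std::size_t>(id)];
    }

    std::size_t size() const { return paths_.size(); }

private:
    std::vector<std::string> paths_;
};
```

The string comparison happens only in addIndex(). Every later call that takes the integer skips it, which matters when selection functions run thousands of times.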

Implementation

A search filter is implemented in the dtSearch Engine using a table of bit vectors, one for each index in the filter. Each bit vector has one bit for each document in its index. For example, a search filter for a single index with 1,000,000 documents would have 1,000,000 bits, or 125 kilobytes of data. The write function compresses the bit vector where possible, so a saved search filter may be smaller than the bit vectors that it contains. 
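
The one-bit-per-document layout described above can be sketched as follows. This is a minimal illustration for a single index, not the engine's actual code: ToyBitFilter, isSelected, and byteSize are hypothetical names, and mapping docId n to bit n - 1 is an assumption of the sketch. The selectItems signature is modeled loosely on the call shown in the example in this topic, without the index identifier.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical one-bit-per-document filter for a single index;
// not the dtSearch implementation.
class ToyBitFilter {
public:
    // Allocates one bit per document, rounded up to whole bytes.
    // All documents start out de-selected.
    explicit ToyBitFilter(std::size_t docCount)
        : docCount_(docCount), bits_((docCount + 7) / 8, 0) {}

    // Selects or de-selects every docId in [first, last], inclusive.
    // docIds are 1-based; docId n is stored in bit n - 1 (sketch assumption).
    void selectItems(std::size_t first, std::size_t last, bool selected) {
        assert(first >= 1 && first <= last && last <= docCount_);
        for (std::size_t id = first; id <= last; ++id) {
            const std::size_t bit = id - 1;
            const std::uint8_t mask =
                static_cast<std::uint8_t>(1u << (bit % 8));
            if (selected) {
                bits_[bit / 8] = static_cast<std::uint8_t>(bits_[bit / 8] | mask);
            } else {
                bits_[bit / 8] = static_cast<std::uint8_t>(bits_[bit / 8] & ~mask);
            }
        }
    }

    // Returns true if the docId is in range and its bit is set.
    bool isSelected(std::size_t docId) const {
        if (docId < 1 || docId > docCount_) return false;
        const std::size_t bit = docId - 1;
        return ((bits_[bit / 8] >> (bit % 8)) & 1u) != 0;
    }

    // Size of the uncompressed bit vector in bytes.
    std::size_t byteSize() const { return bits_.size(); }

private:
    std::size_t docCount_;
    std::vector<std::uint8_t> bits_;
};
```

For an index of 1,000,000 documents, byteSize() returns 125,000, which matches the figure above. Checking whether a document may be returned is a single bit test, regardless of how many documents are selected.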

When a search filter is created, the dtSearch Engine returns a handle to it. The dtsSearchFilter class wraps this handle.

Example

The following code sequence demonstrates the use of dtsSearchFilter to limit a search:

    dtsSearchJob searchJob;
    // ... set up search job ...

    // Set up a filter that will only allow documents 1 through 10 in the index
    // to be returned. Backslashes in the path are escaped for the C++ string literal.
    dtsSearchFilter filter;
    int iIndex = filter.addIndex("c:\\data\\index");
    filter.selectItems(iIndex, 1, 10, true);

    // Attach the search filter to the search job
    searchJob.searchFilterHandle = filter.getHandle();

    // Execute the search
    short errFlag;
    dtssDoSearchJob(searchJob, errFlag);