Faceted Search

Article: dts0222

 

Applies to: dtSearch Engine 7.60 and later

The dtSearch Engine's WordListBuilder object can be used to quickly enumerate document properties for documents retrieved in a search.  This feature can be used to implement a faceted search interface, in which search results summarize values and document counts for one or more categories of document metadata after a search.  

An ASP.NET Core demo showing faceted search and multicolor hit-highlighting is available here: ASP.NET Core demo.  Please select "SEC Filings" to see the faceted search portion of the demo.  An article describing the demo's sample code is available here: Source code article.

Indexing

When documents are indexed, set IndexJob.EnumerableFields to a comma-separated list of field names.  EnumerableFields are fields whose values will be stored in the index in a way that permits fast enumeration. All EnumerableFields are also StoredFields.  The EnumerableFields setting has no effect on document retrieval.

If a field has multiple delimited values, you can set the flag dtsIndexTokenizeEnumerableFields in IndexJob.IndexingFlags to automatically tokenize multi-value enumerable fields using Options.StoredFieldDelimiterChar. For example, if a field value is "First!Second!Third" and StoredFieldDelimiterChar is "!", then instead of one enumerable field value containing "First!Second!Third", the document will be indexed with three separate enumerable field values: "First", "Second", and "Third".
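The effect of tokenization can be modeled in a few lines of Python. This is only an illustration of the splitting behavior in the example above, not dtSearch Engine code: the function name tokenize_enumerable_value is hypothetical, and the sketch makes no claim about how the engine treats empty segments or surrounding whitespace.

```python
def tokenize_enumerable_value(value, delimiter="!"):
    """Split one stored field value into separate enumerable values.

    Hypothetical helper that models the documented example only;
    it is not part of the dtSearch Engine API.
    """
    return value.split(delimiter)


# One delimited value becomes three separate enumerable values.
values = tokenize_enumerable_value("First!Second!Third")
print(values)
```

Without tokenization, the same document would contribute the single value "First!Second!Third" to the enumeration, which is rarely useful as a facet.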

Searching

(1) In SearchJob, set WantResultsAsFilter=true so a SearchFilter will be returned along with the SearchResults. The SearchFilter will be a bit vector identifying all of the documents that match the search request.  (If you are executing a search specifically to create a SearchFilter and do not need SearchResults from the search, set the flag dtsSearchFastSearchFilterOnly in SearchJob to improve search speed.)

(2) To enumerate the values of a field for all documents retrieved in the search, create a WordListBuilder object and set it up as follows:

- Set IndexPath to the index that was searched.

- Call WordListBuilder.SetFilter, passing the SearchFilter from the search, to limit the values returned to the documents that were returned from the search.

- Call WordListBuilder.ListFieldValues with the name of the field to enumerate.
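The two steps above can be pictured with a small self-contained model. The sketch below does not call the dtSearch Engine; it assumes a hypothetical in-memory "index" (DOCS), hypothetical field names (FormType, Year), and hypothetical helpers (make_filter, list_field_values) that mimic the idea of a bit-vector filter restricting which documents contribute to the facet counts.

```python
from collections import Counter

# Hypothetical stand-in for an index: doc id -> enumerable field values.
DOCS = {
    0: {"FormType": ["10-K"], "Year": ["2019"]},
    1: {"FormType": ["10-Q"], "Year": ["2019"]},
    2: {"FormType": ["10-K"], "Year": ["2020"]},
    3: {"FormType": ["8-K"], "Year": ["2020"]},
}


def make_filter(matching_doc_ids):
    """Build a bit vector (a Python int) with one bit set per matching doc id."""
    bits = 0
    for doc_id in matching_doc_ids:
        bits |= 1 << doc_id
    return bits


def list_field_values(docs, field, filter_bits):
    """Count, for each value of `field`, the documents selected by the filter."""
    counts = Counter()
    for doc_id, fields in docs.items():
        if (filter_bits >> doc_id) & 1:  # document is in the search results
            # set() so a document is counted once per distinct value
            counts.update(set(fields.get(field, [])))
    return counts


# Pretend the search matched documents 0, 2, and 3.
search_filter = make_filter([0, 2, 3])
facets = list_field_values(DOCS, "FormType", search_filter)
for value, doc_count in sorted(facets.items()):
    print(f"{value}: {doc_count}")
```

With this sample data the loop prints "10-K: 2" and "8-K: 1"; document 1 is excluded because its bit is not set in the filter. A faceted interface would display these value and count pairs next to the search results.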

A sample application demonstrating EnumerableFields is installed here:

C:\Program Files (x86)\dtSearch Developer\examples\cs4\ListFields

To use the sample:

(1) Build an index of some faceted data with one or more fields listed in EnumerableFields (enter * under EnumerableFields to make all fields enumerable).

(2) Use the Search and Browse Fields dialogs to search and to browse field values within the results of a search.

For API documentation, see WordListBuilder.ListFieldValues and IndexJob.EnumerableFields in C:\Program Files\dtSearch Developer\help\dtSearchNetApi2.chm

See www.kaleidosearch.com for information on a faceted search dtSearch Engine add-on developed by Contegra Systems.

For a developer tutorial on implementing a faceted search user interface, please see these articles from CodeProject.com:

- Faceted search (1)

- Faceted search (2)

- Faceted search (3)