Faceted Search

Article: dts0222

Applies to: dtSearch Engine 7.60 and later

Overview

In a faceted search interface, search results are accompanied by a summary of the values, with document counts, found in one or more categories of document fields. The user can then progressively narrow the search by picking values within each category.

To implement this type of interface using the dtSearch Engine, (1) identify the fields to be used for faceted search when you build the indexes, and (2) use the indexes at search time to generate efficient subsets of search results based on the user's choices among the field values.

For the indexing step, use IndexJob.EnumerableFields to designate the names of the fields to be enabled for faceted search. The indexer will collect every field value in each of the fields specified and index it in a way that permits fast enumeration at search time.

For the searching step, use SearchFilter objects, which you can generate from any search in dtSearch, in combination with the WordListBuilder object. Together they let you generate lists of field values based on search results and progressively limit search results as the user chooses values from the options offered.

Demo and sample code

An ASP.NET Core demo showing faceted search and multicolor hit-highlighting is available here: ASP.NET Core demo. Select "SEC Filings" to see the faceted search portion of the demo. This demo is based on sample source code included with the dtSearch Engine SDK in the dtSearch examples\NetStd\WebDemo folder. An article describing the demo's sample code is available here: Source code article.

API documentation

Enumerable Fields (overview)

Retrieving Fields in Search Results (overview)

IndexJob.EnumerableFields (.NET API)

Limiting searches with SearchFilters (overview)

SearchFilter Class (.NET API)

Indexing to support faceted search

When documents are indexed, set IndexJob.EnumerableFields to a comma-separated list of field names. EnumerableFields are fields whose values will be stored in the index in a way that permits fast enumeration. All EnumerableFields are also StoredFields. The EnumerableFields setting has no effect on document retrieval. Instead, it tells the indexer to store field values in a way that WordListBuilder can use to quickly generate the data needed for the faceted search interface.

If a field has multiple delimited values, you can use the flag dtsIndexTokenizeEnumerableFields in IndexJob.IndexingFlags to automatically tokenize multi-value enumerable fields using Options.StoredFieldDelimiterChar. For example, if a field value is "First!Second!Third" and StoredFieldDelimiterChar is "!", then instead of one enumerable field value containing "First!Second!Third", the document will be indexed with three enumerable field values, "First", "Second", and "Third".
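The effect of tokenizing can be illustrated with a short Python sketch. This is only a model of the behavior described above, not dtSearch code: the function name tokenize_enumerable_value is hypothetical, and details such as trimming whitespace and dropping empty segments are assumptions rather than documented engine behavior.

```python
def tokenize_enumerable_value(raw_value, delimiter="!"):
    """Split one stored field value into separate enumerable values.

    Illustrative model only; not part of the dtSearch API.
    """
    # Assumption: surrounding whitespace is trimmed and empty segments are dropped.
    parts = (part.strip() for part in raw_value.split(delimiter))
    return [part for part in parts if part]


print(tokenize_enumerable_value("First!Second!Third"))
# prints ['First', 'Second', 'Third']
```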

Implementing faceted search

(1) In SearchJob, set WantResultsAsFilter=true so a SearchFilter will be returned along with the SearchResults. The SearchFilter will be a bit vector identifying all of the documents that match the search request. (If you are executing a search specifically to create a SearchFilter and do not need SearchResults from the search, set the flag dtsSearchFastSearchFilterOnly in SearchJob to improve search speed.)

(2) To enumerate the values of a field for all documents retrieved in the search, create a WordListBuilder object and set it up as follows:

- Set IndexPath to the index that was searched.

- Call WordListBuilder.SetFilter to limit the values returned to the documents that were returned from the search.

- Call WordListBuilder.ListFieldValues with the field name to enumerate.
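The steps above can be pictured with a small, self-contained Python model. It is not dtSearch code and does not call the dtSearch API: the document table, the field names FormType and Year, the helper names make_filter and list_field_values, and the use of a Python integer as the bit vector are all assumptions made for illustration. The sketch shows a filter restricting which documents contribute field values, and two filters being intersected to narrow the results after the user picks a value.

```python
from collections import Counter

# Hypothetical indexed documents: position in the list is the document id.
DOCS = [
    {"text": "annual report", "FormType": "10-K", "Year": "2020"},
    {"text": "quarterly report", "FormType": "10-Q", "Year": "2020"},
    {"text": "annual report", "FormType": "10-K", "Year": "2021"},
    {"text": "press release", "FormType": "8-K", "Year": "2021"},
]


def make_filter(predicate):
    """Return a bit vector (as an int) with bit i set when DOCS[i] matches."""
    bits = 0
    for doc_id, doc in enumerate(DOCS):
        if predicate(doc):
            bits |= 1 << doc_id
    return bits


def list_field_values(field, doc_filter):
    """Count each value of `field` among the documents selected by the filter."""
    counts = Counter()
    for doc_id, doc in enumerate(DOCS):
        if doc_filter & (1 << doc_id):
            counts[doc[field]] += 1
    return dict(counts)


# Step 1: run the search and keep the matching documents as a filter.
search_filter = make_filter(lambda d: "report" in d["text"])

# Step 2: enumerate facet values only for documents in the filter.
print(list_field_values("FormType", search_filter))
# prints {'10-K': 2, '10-Q': 1}

# The user picks Year = 2021: intersect the two filters to narrow the results.
year_filter = make_filter(lambda d: d["Year"] == "2021")
narrowed = search_filter & year_filter
print(list_field_values("FormType", narrowed))
# prints {'10-K': 1}
```

In this model, each choice the user makes produces another filter that is intersected with the current one, and the facet counts are then recomputed against the narrowed filter.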

ListFields sample

A sample application demonstrating EnumerableFields is installed here:

C:\Program Files (x86)\dtSearch Developer\examples\cs4\ListFields

To use the sample:

(1) Build an index of some faceted data with one or more fields listed in EnumerableFields (enter * under EnumerableFields to make all fields enumerable).

(2) Use the Search and Browse Fields dialogs to search and browse field values within the results of a search.

Additional Information

See www.kaleidosearch.com for information on a faceted search dtSearch Engine add-on developed by Contegra Systems.

For developer tutorials on implementing a faceted search user interface, please see these articles from CodeProject.com:

Faceted search (1)

Faceted search (2)

Faceted search (3)