WordListBuilder provides a way to list words, field names, or field values in an index.
File: WordListBuilder.java
Package: com.dtsearch.engine
| Method | Description |
|---|---|
| | Close the index that was opened by openIndex. |
| | Number of words in the list. |
| | Returns zero if the last operation succeeded, or an ErrorCodes value if there was an error. |
| | Get the text of a word in the word list. iWord must be in the range from 0 to Count-1. |
| | Get the number of times this word occurs in the index. |
| | Get the number of documents in which this word occurs. |
| | List all field values in the index that match an expression. |
| | List all fields in the index. |
| | List words that match an expression. |
| | Build a scrolling word list based on user input, with a fixed number of words before and after the word in the index. |
| | Open the index at indexPath and return 0 if the index was opened successfully, or non-zero if the index could not be opened. |
| | Set option settings for the WordListBuilder using WordListBuilderFlags. |
| | Sort the word list by document count or term. |
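As a rough illustration of how the operations in the table fit together (open the index, build a list, read entries 0 to Count-1, close the index), here is a minimal Java sketch. It does not use the dtSearch library: `WordLister`, `FakeWordLister`, and `WordListDemo` are hypothetical stand-ins, every method name other than `openIndex` is an assumption made for this example, and the fake matches by simple prefix rather than by a real search expression.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical interface mirroring the operations in the table; names are assumptions. */
interface WordLister {
    int openIndex(String indexPath);       // 0 on success, non-zero on failure
    void listMatchingWords(String prefix); // assumed name; the fake matches by prefix only
    int getCount();
    String getNthWord(int iWord);          // valid for 0 .. getCount() - 1
    int getNthWordCount(int iWord);        // occurrences of the word in the index
    int getNthWordDocCount(int iWord);     // documents containing the word
    void closeIndex();
}

/** Tiny in-memory fake so the sketch runs without the dtSearch library. */
class FakeWordLister implements WordLister {
    private static final String[] WORDS = {"apple", "apply", "banana"};
    private static final int[] HITS = {7, 3, 5};
    private static final int[] DOCS = {4, 2, 5};
    private final List<Integer> matches = new ArrayList<>();
    private boolean open;

    public int openIndex(String indexPath) {
        open = indexPath != null && !indexPath.isEmpty();
        return open ? 0 : 1;
    }

    public void listMatchingWords(String prefix) {
        matches.clear();
        if (!open) return;
        for (int i = 0; i < WORDS.length; i++) {
            if (WORDS[i].startsWith(prefix)) matches.add(i);
        }
    }

    public int getCount() { return matches.size(); }
    public String getNthWord(int iWord) { return WORDS[matches.get(iWord)]; }
    public int getNthWordCount(int iWord) { return HITS[matches.get(iWord)]; }
    public int getNthWordDocCount(int iWord) { return DOCS[matches.get(iWord)]; }
    public void closeIndex() { open = false; matches.clear(); }
}

class WordListDemo {
    /** Opens the index, lists matches, reads rows 0..count-1, and always closes. */
    static List<String> dump(WordLister lister, String indexPath, String prefix) {
        List<String> rows = new ArrayList<>();
        if (lister.openIndex(indexPath) != 0) {
            return rows; // the index could not be opened
        }
        try {
            lister.listMatchingWords(prefix);
            for (int i = 0; i < lister.getCount(); i++) {
                rows.add(lister.getNthWord(i)
                        + " hits=" + lister.getNthWordCount(i)
                        + " docs=" + lister.getNthWordDocCount(i));
            }
        } finally {
            lister.closeIndex();
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String row : dump(new FakeWordLister(), "/path/to/index", "app")) {
            System.out.println(row);
        }
    }
}
```

Checking the return value of the open call before listing, and closing in a `finally` block, keeps the open/close pairing intact even if reading an entry fails.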
WordListBuilder is intended for quick enumeration of words, field names, or field values in an index.
Two ways of listing words are provided: one lists the words before and after a given word in the index, and the other lists the words in the index that match a search term.
The scrolling list of indexed words that updates as a user enters a search request in dtSearch Desktop is implemented using WordListBuilder's ListWords method. The "Browse Words" dialog box in dtSearch Desktop that lists words matching an expression is implemented using the ListMatchingWords method.
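The scrolling behavior described above (a fixed number of words before and after the position of the user's input) can be modeled with a sorted list and a binary search. The sketch below is only an illustration of that idea, not dtSearch's implementation; the class `ScrollingWindow`, the method `around`, and the parameter `range` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ScrollingWindow {
    /**
     * Returns up to `range` words before and after the position where `typed`
     * is, or would be, located in an alphabetically sorted word list.
     * Assumes `range` is non-negative.
     */
    static List<String> around(List<String> sortedWords, String typed, int range) {
        int pos = Collections.binarySearch(sortedWords, typed);
        if (pos < 0) {
            pos = -pos - 1; // not found: convert to the insertion point
        }
        int from = Math.max(0, pos - range);
        int to = Math.min(sortedWords.size(), pos + range + 1);
        return new ArrayList<>(sortedWords.subList(from, to));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("alpha", "beta", "delta", "gamma", "omega");
        // A partial entry such as "d" lands at the position of "delta".
        System.out.println(around(words, "d", 1));
    }
}
```

Because the window is anchored at an insertion point rather than an exact match, it keeps producing a sensible list while the user is still typing a partial word.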
Field values are listed only for fields that were designated as EnumerableFields when the documents were indexed.
For speed, WordListBuilder does not actually enumerate the references for each word and instead relies on counts incrementally stored in the index. As a result, the reported counts can include artifacts of the indexing process, such as references from reindexed or removed documents, and may be higher than the actual count of references in the index. Compressing an index will remove these extra references.
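To see why incrementally stored counts can drift above the true counts, consider the toy model below. It is not a description of dtSearch's index format: `ToyCountingIndex` and all of its methods are hypothetical, and the model simply assumes that counters are incremented on every indexing pass, never decremented, and rebuilt only by a compress step.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class ToyCountingIndex {
    private final Map<String, Set<String>> liveDocs = new HashMap<>();    // docId -> words
    private final Map<String, Integer> storedDocCounts = new HashMap<>(); // word -> counter

    /** Indexes or reindexes a document; counters are only ever incremented. */
    void index(String docId, String... words) {
        Set<String> unique = new HashSet<>(Arrays.asList(words));
        liveDocs.put(docId, unique); // replaces any older version of the document
        for (String word : unique) {
            storedDocCounts.merge(word, 1, Integer::sum);
        }
    }

    /** Removes a document without touching the stored counters. */
    void remove(String docId) {
        liveDocs.remove(docId);
    }

    /** Fast path: read the stored counter, which may be stale. */
    int reportedDocCount(String word) {
        return storedDocCounts.getOrDefault(word, 0);
    }

    /** Slow path: scan the live documents. */
    int actualDocCount(String word) {
        int n = 0;
        for (Set<String> words : liveDocs.values()) {
            if (words.contains(word)) n++;
        }
        return n;
    }

    /** Rebuilds the counters from live documents, discarding stale references. */
    void compress() {
        storedDocCounts.clear();
        for (Set<String> words : liveDocs.values()) {
            for (String word : words) {
                storedDocCounts.merge(word, 1, Integer::sum);
            }
        }
    }
}
```

In this model, reindexing a document or removing one leaves the reported count higher than the actual count until `compress()` runs, which mirrors the behavior described in the paragraph above.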
To improve performance in cases where the same field values have to be enumerated repeatedly with different SearchFilters, you can set the flag dtsWordListEnableFieldValuesCache using setFlags(). This will make listFieldValues calls from the same WordListBuilder faster at the cost of substantial memory use, because the field values and occurrences in the index will be stored in memory until the WordListBuilder is deleted. The amount of memory required is proportional to the number of documents in the index times the number of values in the enumerable field that is cached.
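The memory trade-off described above can be pictured with one bit set per field value, each sized to the number of documents, so that storage grows with documents times values. The sketch below is an assumption-laden model, not the dtSearch cache: `FieldValueCacheSketch` is a hypothetical class, each document is assumed to carry exactly one value for the field, and a plain `BitSet` of document numbers stands in for a SearchFilter.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

class FieldValueCacheSketch {
    private final Map<String, BitSet> docsByValue = new LinkedHashMap<>();
    private final int docCount;

    /** Builds the cache once; valueOfDoc[i] is the field value of document i. */
    FieldValueCacheSketch(String[] valueOfDoc) {
        final int n = valueOfDoc.length;
        this.docCount = n;
        for (int doc = 0; doc < n; doc++) {
            docsByValue.computeIfAbsent(valueOfDoc[doc], v -> new BitSet(n)).set(doc);
        }
    }

    /** Counts documents per value under a filter without rescanning the documents. */
    Map<String, Integer> countsFor(BitSet filter) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, BitSet> entry : docsByValue.entrySet()) {
            BitSet hits = (BitSet) entry.getValue().clone();
            hits.and(filter); // keep only documents the filter allows
            counts.put(entry.getKey(), hits.cardinality());
        }
        return counts;
    }

    /** Rough size of the cached bit sets: documents times distinct values. */
    long approxBits() {
        return (long) docCount * docsByValue.size();
    }
}
```

Each additional filter costs only a pass over the cached bit sets, while the cache itself stays in memory for as long as the object lives, which is the trade-off the flag is described as making.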