dtSearch Text Retrieval Engine Programmer's Reference
dtsFileConvertJob2::dtsListIndexJob Structure

Lists the words, fields, or filenames in an index, writing the list to a file or to a memory buffer.

File: dtsearch.h

struct dtsListIndexJob : public dtsJobBase {
    long outputStringMaxSize;
    long listFlags;
    long searchFlags;
    long fuzziness;
    const char * toMatch;
    const char * indexPath;
    const char * outputFile;
    dtsStringHandle outputString;
    long fOutputStringWasTruncated;
};
long outputStringMaxSize;
Maximum size of the output string to generate.
long listFlags;
ListIndexFlags specifying the type of list to generate
long searchFlags;
SearchFlags specifying search features to be used in matching the toMatch expression, such as fuzziness, stemming, etc.
long fuzziness;
If the dtsSearchFuzzy flag is set in searchFlags, set the fuzziness value from 1 to 10 to specify the level of fuzzy searching to apply.
const char * toMatch;
toMatch is an optional search expression specifying the text to match against items being listed. For example, to list all field names starting with "A", you would set listFlags to dtsListIndexFields, and set toMatch to "A*".
const char * indexPath;
Location of the index to list.
const char * outputFile;
Name of the file to create (if the dtsListIndexReturnString flag is not set).
dtsStringHandle outputString;
If the dtsListIndexReturnString flag is set, the list will be returned through a dtsStringHandle. The dtsStringHandle must be released by the caller.
long fOutputStringWasTruncated;
If true, the output was halted due to outputStringMaxSize before all items were listed.
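
The member descriptions above can be sketched as a small setup routine for the "field names starting with A" case. This is a minimal illustration, not the library's own code: the `sketch` namespace, the `ListIndexJobSketch` struct, both helper functions, and the numeric flag values are all assumptions made so the example compiles without dtsearch.h. In real code, include dtsearch.h and use its dtsListIndexJob and flag definitions instead.

```cpp
#include <cassert>
#include <string>

// Stand-in declarations so this sketch compiles without dtsearch.h.
// The numeric values are placeholders, NOT the real constants.
namespace sketch {

enum : long {
    dtsListIndexFields       = 1L << 0,  // placeholder value
    dtsListIndexReturnString = 1L << 1,  // placeholder value
    dtsSearchFuzzy           = 1L << 2   // placeholder value
};

// Mirrors only the plain input members of the declaration above.
// dtsJobBase, outputString, and fOutputStringWasTruncated are omitted.
struct ListIndexJobSketch {
    long outputStringMaxSize = 0;
    long listFlags = 0;
    long searchFlags = 0;
    long fuzziness = 0;
    const char* toMatch = nullptr;
    const char* indexPath = nullptr;
    const char* outputFile = nullptr;
};

// Hypothetical helper: request field names starting with "A",
// returned through a string rather than written to a file.
inline ListIndexJobSketch makeFieldListJob(const char* indexPath, long maxSize) {
    assert(indexPath != nullptr);
    ListIndexJobSketch job;
    job.indexPath = indexPath;
    job.listFlags = dtsListIndexFields | dtsListIndexReturnString;
    job.toMatch = "A*";
    job.outputStringMaxSize = maxSize;
    return job;
}

// Hypothetical helper: checks the constraints stated in the member descriptions.
inline bool looksConsistent(const ListIndexJobSketch& job) {
    if (job.indexPath == nullptr) return false;
    // With dtsSearchFuzzy set, fuzziness is documented as 1 to 10.
    const bool fuzzy = (job.searchFlags & dtsSearchFuzzy) != 0;
    if (fuzzy && (job.fuzziness < 1 || job.fuzziness > 10)) return false;
    // Without dtsListIndexReturnString, the list goes to outputFile.
    const bool toString = (job.listFlags & dtsListIndexReturnString) != 0;
    if (!toString && job.outputFile == nullptr) return false;
    return true;
}

}  // namespace sketch
```

A caller would build the job with `sketch::makeFieldListJob("example_index", 4096)`, where the index path is a made-up name, and run `sketch::looksConsistent` on it before handing the equivalent real structure to the engine.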

You can use ListIndexFlags to specify the type of information included in the output. 

When listing words, if dtsListIndexIncludeField is not set, then multiple instances of a word in different fields will be aggregated into a single entry. For example, if "smith" occurs once in the "author" field and once in the "subject" field, the aggregated entry will report a document count of 2 and a hit count of 2. The document count is therefore an upper bound: it does not reflect the possibility that the two instances occur in the same document. To prevent this type of inaccuracy in the output, set the dtsListIndexIncludeField flag, which lists instances in different fields separately.
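
The arithmetic behind the "smith" example can be modeled in a few lines. The sketch below is only an illustration of the counting, not a description of how the engine stores or computes its counts; the `Occurrence` and `Counts` types and all three functions are hypothetical names introduced here.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// One occurrence of a word in a field of a document (illustrative model).
struct Occurrence {
    std::string word;
    std::string field;
    int docId;
};

struct Counts {
    long docCount = 0;
    long hitCount = 0;
};

using FieldKey = std::pair<std::string, std::string>;  // (word, field)

// Counts kept separately for each (word, field) pair.
inline std::map<FieldKey, Counts> countPerField(const std::vector<Occurrence>& occ) {
    std::map<FieldKey, Counts> out;
    std::set<std::tuple<std::string, std::string, int>> seenDocs;
    for (const auto& o : occ) {
        Counts& c = out[std::make_pair(o.word, o.field)];
        ++c.hitCount;
        // Count each document once per (word, field).
        if (seenDocs.insert(std::make_tuple(o.word, o.field, o.docId)).second) {
            ++c.docCount;
        }
    }
    return out;
}

// Aggregation across fields: per-field counts are simply summed,
// so a document containing the word in two fields is counted twice.
inline std::map<std::string, Counts> aggregate(const std::map<FieldKey, Counts>& perField) {
    std::map<std::string, Counts> out;
    for (const auto& kv : perField) {
        Counts& c = out[kv.first.first];
        c.docCount += kv.second.docCount;
        c.hitCount += kv.second.hitCount;
    }
    return out;
}

// Number of distinct documents containing the word in any field.
inline long distinctDocs(const std::vector<Occurrence>& occ, const std::string& word) {
    std::set<int> docs;
    for (const auto& o : occ) {
        if (o.word == word) docs.insert(o.docId);
    }
    return static_cast<long>(docs.size());
}
```

With "smith" appearing in both the "author" and "subject" fields of one document, the summed entry gives a document count of 2 while only one distinct document exists; the per-field entries each report 1.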

For speed, ListIndexJob does not actually enumerate the references for each word and instead relies on counts incrementally stored in the index. Therefore, the reported counts may include artifacts of the indexing process such as reindexed or removed documents, so the counts may be higher than the actual count of references in the index. Compressing an index will remove these extra references.
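
A toy model may help show why stored counts can run ahead of the actual number of references. It assumes, purely for illustration, that a counter is incremented whenever a document is indexed and is only recomputed on compression; the `ToyIndex` class and its methods are hypothetical and say nothing about the real index format.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy model only: NOT dtSearch's index format or algorithm.
class ToyIndex {
public:
    // Indexing (or reindexing) a document bumps the stored counter.
    void addDocument(int docId, const std::string& word) {
        live_[word].insert(docId);
        ++stored_[word];
    }

    // Removing a document drops its live references but leaves
    // the stored counters untouched.
    void removeDocument(int docId) {
        for (auto& entry : live_) entry.second.erase(docId);
    }

    // What a fast listing would report: the stored counter.
    long reportedCount(const std::string& word) const {
        auto it = stored_.find(word);
        return it == stored_.end() ? 0 : it->second;
    }

    // What enumerating the live references would give.
    long actualCount(const std::string& word) const {
        auto it = live_.find(word);
        return it == live_.end() ? 0 : static_cast<long>(it->second.size());
    }

    // Compression rebuilds the counters from the live references.
    void compress() {
        stored_.clear();
        for (const auto& entry : live_) {
            if (!entry.second.empty()) {
                stored_[entry.first] = static_cast<long>(entry.second.size());
            }
        }
    }

private:
    std::map<std::string, std::set<int>> live_;
    std::map<std::string, long> stored_;
};
```

In this model, indexing two documents, removing one, and reindexing the other leaves a reported count of 3 against an actual count of 1; after `compress()` the two agree.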

Copyright (c) 1995-2021 dtSearch Corp. All rights reserved.