dtsOptions Structure

Sets dtSearch Engine indexing and searching option settings.

File

File: dtsearch.h

Syntax

C++

struct dtsOptions {
  long binaryFiles;
  char binaryFilterTextChars[256];
  long hyphens;
  char alphabetFile[FileNameLen];
  long indexNumbers;
  char noiseWordFile[FileNameLen];
  char stemmingRulesFile[FileNameLen];
  long maxWordsToRetrieve;
  long maxStoredFieldSize;
  long titleSize;
  char xmlIgnoreTags[512];
  long maxWordLength;
  char segmentationRulesFile[FileNameLen];
  char textFieldsFile[FileNameLen];
  char userThesaurusFile[FileNameLen];
  long updateFiles;
  long lzwEnableCode;
  char homeDir[FileNameLen];
  char privateDir[FileNameLen];
  char booleanConnectors[512];
  char fileTypeTableFile[FileNameLen];
  long textFlags;
  long maxFieldNesting;
  long autoFilterSizeMB;
  char macroChar;
  char fuzzyChar;
  char phonicChar;
  char stemmingChar;
  char synonymChar;
  char weightChar;
  char matchDigitChar;
  char storedFieldDelimiterChar;
  long fieldFlags;
  char unicodeFilterRanges[256];
  long unicodeFilterBlockSize;
  long unicodeFilterFlags;
  long unicodeFilterMinTextSize;
  dtsLanguageAnalyzerInterface * pAnalyzer;
  long unicodeFilterWordOverlapAmount;
  char tempFileDir[200];
};

Data Members

Data Member	Description
alphabetFile	Name of dtSearch alphabet file to use when parsing text into words.
autoFilterSizeMB	File size threshold, in megabytes, above which files are always processed using the Unicode filtering algorithm. Files larger than autoFilterSizeMB are assumed to be non-document files such as forensically recovered disk images or slack space, and are indexed using the Unicode filtering algorithm. If zero, the Unicode filtering algorithm is never applied based on file size for files smaller than 2 GB.
binaryFiles	BinaryFilesSettings value specifying the treatment of binary files.
binaryFilterTextChars	Defines the characters considered to be text if binaryFiles is set to dtsoFilterBinary.
booleanConnectors	Use to replace the default connectors used in search requests.
fieldFlags	FieldFlags values that control indexing of metadata.
fileTypeTableFile	Name of the file containing a table of filename patterns for file formats that dtSearch cannot detect automatically, such as older versions of WordStar. The FileTypeTableFile is an XML file. To create the file, start dtSearch Desktop, click Options > Preferences > File Types and use the dialog box to set up the file type definitions. The XML file will be saved as filetype.xml in your dtSearch UserData folder. See File Types in the dtSearch Desktop help for information on this setting.
fuzzyChar	Character that enables fuzzy searching for a search term (default "%"); see Redefining Search Operators.
homeDir	Directory where the dtSearch Engine and support files are located.
hyphens	HyphenSettings value specifying the treatment of hyphens in text
indexNumbers	If false, any word that begins with a digit will not be indexed.
lzwEnableCode	Obsolete. This was used in older versions to enable LZW decompression in file parsers, which is now always enabled.
macroChar	Character that indicates that a search term is a macro (default "@"); see Redefining Search Operators.
matchDigitChar	Wildcard character that matches a single digit (default "=")
maxFieldNesting	Maximum depth of nested fields (value must be between 1 and 32)
maxStoredFieldSize	Maximum size of a single stored field. Stored fields are field data collected during indexing that is returned in search results.
maxWordLength	Words longer than the maxWordLength will be truncated when indexing. The default maxWordLength is 32. The maximum value is 128.
maxWordsToRetrieve	Maximum number of words that can be matched in a search. This can be any value from 16 to 512k (32-bit) or 4,096k (64-bit). The default is 64k. If a search matches more unique words than the maxWordsToRetrieve limit, the error code dtsErMaxWords (137) will be returned.
noiseWordFile	List of noise words to skip during indexing (default: "noise.dat"). A noise word is a word such as "the" or "if" that is so common that it is not useful in searches. To save time, noise words are not indexed and are ignored in index searches. When an index is created, dtSearch copies the list of words from noise.dat into the index directory and also builds the word list into other index files. After an index is created, subsequent changes to the noise word list will not affect indexing for that index.
pAnalyzer	Pointer to language analyzer to use for word breaking
phonicChar	Character that enables phonic searching for a search term (default "#"); see Redefining Search Operators. The regular expression mark ("##") is a doubling of the phonicChar.
privateDir	A directory that the dtSearch Engine can use to store per-user settings and temporary files.
segmentationRulesFile	File segmentation rules, used to split up long text files into logical subdocuments during indexing. The SegmentationRulesFile is an XML file. To create the file, start dtSearch Desktop, click Options > Preferences > File Segmentation Rules and use the dialog box to set up the rules. The XML file will be saved as fileseg.xml in your dtSearch UserData folder. See File Segmentation Rules in the dtSearch Desktop help for information on this setting.
stemmingChar	Character that enables stemming for a search term; see Redefining Search Operators. The expression used for range searching (default "~~") is a doubling of the stemmingChar.
stemmingRulesFile	Stemming rules for stemming searches (default: "stemming.dat"). The stemming.dat file uses a plain text format and includes comments that describe the file format.
storedFieldDelimiterChar	Character to insert between multiple instances of a stored field in a single document
synonymChar	Character that enables synonym searching for a search term (default "&"); see Redefining Search Operators.
tempFileDir	Directory to use for temporary files.
textFieldsFile	Name of the file containing rules for extraction of field data from text files based on markers in the text. The TextFieldsFile is an XML file. To create the file, start dtSearch Desktop, click Options > Preferences > Text Fields and use the dialog box to set up the text field definitions. The XML file will be saved as fields.xml in your dtSearch UserData folder. See Define Text Fields in the dtSearch Desktop help for information on this setting.
textFlags	Flags that control text-processing options. See the TextFlags enum for values.
titleSize	Use this option to change the number of characters stored as the "title" property of each document, up to a maximum of 512. By default, the dtSearch Engine collects the first 80 characters of text from a file for the title associated with each document.
unicodeFilterBlockSize	Specifies how each input file is divided into blocks before being filtered.
unicodeFilterFlags	UnicodeFilterFlags values controlling the behavior of the Unicode filtering algorithm.
unicodeFilterMinTextSize	Minimum length of a run of text when applying the Unicode Filtering algorithm.
unicodeFilterRanges	Indicates Unicode ranges that are of interest when filtering.
unicodeFilterWordOverlapAmount	Amount of overlap when automatically breaking words when applying the Unicode Filtering algorithm.
updateFiles	Set to true to force all configuration files to be re-read. If the contents of a configuration file such as TextFieldsFile changes, but the filename is not changed, set UpdateFiles=true to indicate that dtSearch should discard any internally-cached copies of configuration files and re-read them from disk.
userThesaurusFile	User-defined synonym sets. The UserThesaurusFile is an XML file. To create the file, start dtSearch Desktop, click Options > Preferences > User Thesaurus and use the dialog box to set up the synonym definitions. The XML file will be saved as thesaur.xml in your dtSearch UserData folder. See User Thesaurus in the dtSearch Desktop help for information on this setting.
weightChar	Character used to indicate term weighting (example: apple:5); see Redefining Search Operators. The prefix used to add a field name in front of a word in an xfilter expression is a doubling of the weightChar (default "::"). For example, if you change the WeightChar to !, then an xfilter expression with a field would look like this: ...
xmlIgnoreTags	Comma-separated list of tags to ignore when indexing XML
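
Several entries in the table describe two-character search operators that are formed by doubling a single option character: the regular expression mark doubles phonicChar, the range operator doubles stemmingChar, and the xfilter field prefix doubles weightChar. The following is a minimal sketch of that relationship only. OperatorChars and the helper functions are hypothetical stand-ins, not part of dtsearch.h, and the defaults shown for stemmingChar and weightChar are inferred from the documented "~~" and "::" defaults.

```cpp
#include <string>

// Hypothetical stand-in for the operator-character members of dtsOptions.
// The real structure is declared in dtsearch.h; this sketch does not use it.
struct OperatorChars {
    char phonicChar = '#';    // documented default
    char stemmingChar = '~';  // assumption: inferred from the "~~" range default
    char weightChar = ':';    // assumption: inferred from the "::" prefix default
};

// Build the two-character token obtained by doubling one operator character.
inline std::string doubled(char c) { return std::string(2, c); }

// Regular expression mark: a doubling of phonicChar.
inline std::string regexMark(const OperatorChars& o) { return doubled(o.phonicChar); }

// Range-search operator: a doubling of stemmingChar.
inline std::string rangeOperator(const OperatorChars& o) { return doubled(o.stemmingChar); }

// xfilter field-name prefix: a doubling of weightChar.
inline std::string fieldPrefix(const OperatorChars& o) { return doubled(o.weightChar); }
```

With these stand-ins, changing weightChar to '!' makes fieldPrefix return "!!", which mirrors the WeightChar example in the table: redefining one character also redefines the doubled operator derived from it.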

Group

Classes

Members

Methods

Method	Description
copy	Copy another dtsOptions
equals	Compare options for equality
validate	Force all data in dtsOptions to valid values. See dtsearch.cpp for the source code implementing this.
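
To illustrate what forcing values into a valid range can look like, here is a small sketch that applies three limits stated in the Data Members table (maxFieldNesting between 1 and 32, maxWordLength at most 128, titleSize at most 512). It is not the code of validate() in dtsearch.cpp. OptionLimitsSketch and clampToDocumentedLimits are hypothetical names, and the initial maxFieldNesting value is an arbitrary placeholder.

```cpp
#include <algorithm>

// Hypothetical subset of dtsOptions; the real struct is declared in dtsearch.h.
struct OptionLimitsSketch {
    long maxFieldNesting = 16;  // arbitrary placeholder, not a documented default
    long maxWordLength = 32;    // documented default
    long titleSize = 80;        // documented default
};

// Force each value into its documented range. Illustration only: this is
// not the logic of the real validate() method.
inline void clampToDocumentedLimits(OptionLimitsSketch& o) {
    // maxFieldNesting must be between 1 and 32.
    o.maxFieldNesting = std::max(1L, std::min(o.maxFieldNesting, 32L));
    // maxWordLength has a documented maximum of 128.
    o.maxWordLength = std::min(o.maxWordLength, 128L);
    // titleSize has a documented maximum of 512.
    o.titleSize = std::min(o.titleSize, 512L);
}
```

Only upper bounds are applied to maxWordLength and titleSize because the table does not state a minimum for either member.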

Remarks

Deprecated -- use dtsOptions2 instead of dtsOptions.

To change option settings:

1. Call dtssGetOptions to get the option settings currently in effect.
2. Change the values in dtsOptions as needed.
3. Call dtssSetOptions to apply the changes.
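
The get, modify, set sequence can be sketched as follows. So that the example compiles without the dtSearch SDK, it uses hypothetical stand-ins: OptionsSketch mimics two members of dtsOptions, and getOptionsStub and setOptionsStub play the roles of dtssGetOptions and dtssSetOptions. The real declarations and signatures are in dtsearch.h and may differ from these stubs.

```cpp
// Hypothetical stand-ins so the sketch compiles without the dtSearch SDK.
struct OptionsSketch {
    long maxWordLength = 32;  // documented default
    long titleSize = 80;      // documented default
};

// Pretend engine state that the stubs read and write.
static OptionsSketch g_current;

// Stand-in for dtssGetOptions: copy the settings currently in effect.
inline void getOptionsStub(OptionsSketch& out) { out = g_current; }

// Stand-in for dtssSetOptions: apply the supplied settings.
inline void setOptionsStub(const OptionsSketch& in) { g_current = in; }

// Get, modify, set: start from the settings in effect so that members
// that are not touched keep their current values.
inline void changeTitleSize(long newSize) {
    OptionsSketch opts;
    getOptionsStub(opts);      // 1. read the current settings
    opts.titleSize = newSize;  // 2. change only what is needed
    setOptionsStub(opts);      // 3. apply the changes
}
```

Starting from the current settings rather than a freshly constructed structure avoids unintentionally resetting members that the caller did not mean to change.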

Option settings are not persisted anywhere, so changes must be made each time a new program instance starts.

Option settings apply to the current thread and any threads created after the current thread. Therefore, each thread can have its own settings.