Alphabet Customization

Menu option: Options > Preferences > Letters and words

The Edit Alphabet dialog box displays a list of all of the characters and how dtSearch classifies each one.  dtSearch classifies characters into four categories: letter, space, hyphen, and ignore. 


Letter
A searchable character. All of the characters in the alphabet (a-z and A-Z) and all of the digits (0-9) should be classified as letters.


Space
A character that causes a word break. For example, if you classify the period (".") as a space character, then dtSearch would process U.S.A. as three separate words: U, S, and A.


Ignore
A character that is disregarded in processing text. For example, if you classify the period as ignore instead of space, then dtSearch would process U.S.A. as one word: USA.


Hyphen
Hyphen characters can receive special processing in dtSearch. By default, only the '-' character is defined as a hyphen. To specify the rules for processing hyphens, click Options > Preferences > Indexing Options.
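
The effect of the letter, space, and ignore categories on word breaking can be illustrated with a small sketch. This is a toy model, not dtSearch's implementation: the names split_words, default_class, and overrides are hypothetical, the default classification is an assumption, and hyphens are simply treated as word breaks here because the actual hyphen rules are set separately under Indexing Options.

```python
# Toy illustration of character classes; not dtSearch's actual code.
LETTER, SPACE, IGNORE, HYPHEN = "letter", "space", "ignore", "hyphen"


def default_class(ch):
    """Assumed defaults: letters and digits are letters, '-' is a hyphen,
    and everything else breaks words."""
    if ch.isalnum():
        return LETTER
    if ch == "-":
        return HYPHEN
    return SPACE


def split_words(text, overrides=None):
    """Split text into words using per-character class overrides."""
    overrides = overrides or {}
    words, current = [], []
    for ch in text:
        kind = overrides.get(ch, default_class(ch))
        if kind == LETTER:
            current.append(ch)      # searchable character: part of a word
        elif kind == IGNORE:
            continue                # disregarded: neither kept nor a break
        else:
            # SPACE ends the current word. HYPHEN is treated the same way
            # in this sketch only (an assumption, see the lead-in).
            if current:
                words.append("".join(current))
                current = []
    if current:
        words.append("".join(current))
    return words


print(split_words("U.S.A.", {".": SPACE}))   # ['U', 'S', 'A']
print(split_words("U.S.A.", {".": IGNORE}))  # ['USA']
```

The two calls mirror the U.S.A. example above: with the period classified as space the text yields three words, and with the period classified as ignore it yields one.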

For characters that are letters, you can specify whether the character is a lower case or upper case letter.

Only characters in the Unicode range 33-127 can be modified using Alphabet Customization.  Other character properties are determined by the Unicode specification.  See for more information about Unicode.
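
As a rough sketch of the range restriction, the check below tests whether a character's code point falls within 33-127. The function name is_customizable is hypothetical, and treating both ends of the range as inclusive is an assumption.

```python
def is_customizable(ch):
    """Return True if the character's code point is in 33-127
    (bounds assumed inclusive)."""
    return 33 <= ord(ch) <= 127


for ch in [".", "-", "A", " ", "é"]:
    print(repr(ch), ord(ch), is_customizable(ch))
```

Under this assumption, punctuation such as the period (46) and the hyphen (45) can be reclassified, while the blank space (32) and accented letters such as é (233) cannot.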


Copyright © 1991-2021 dtSearch Corp. All Rights Reserved.