Regular expression searching provides a way to search for advanced combinations of characters.
A regular expression included in a search request must be quoted and must begin with ##. Examples:
Apple and "##199[0-9]"
Apple and "##19[0-9]+"
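As an illustration, a search request with a regular expression term could be assembled in code as sketched below. The helper name regex_term is hypothetical and not part of any dtSearch API. Rejecting patterns that contain a double quote is an assumption made to keep the quoting unambiguous.

```python
def regex_term(pattern: str) -> str:
    """Wrap a regular expression as a quoted search term with the ## prefix.

    Hypothetical helper for illustration; not part of the dtSearch API.
    """
    # Assumption: a double quote inside the pattern would end the quoted
    # term early, so this sketch refuses such patterns.
    if '"' in pattern:
        raise ValueError("pattern must not contain a double quote")
    return f'"##{pattern}"'


# Combine a plain word with a regular expression term.
request = "Apple and " + regex_term("199[0-9]")
print(request)  # Apple and "##199[0-9]"
```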
dtSearch versions 7.62 and later use TR1 regular expressions. For more information on TR1 regular expressions, see:
http://msdn.microsoft.com/en-us/library/bb982727.aspx
Limitations:
(1) A regular expression must match a single whole word. For example, a search for "##app.*ie" would not find "apple pie".
(2) Only letters are searchable. Characters that are not indexed as letters are not searchable even using regular expressions, because the index does not contain any information about them.
(3) Because the dtSearch index does not store information about line breaks, searches that include beginning-of-line or end-of-line regular expression criteria (^ and $) will not work.
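The whole-word rule in (1) can be approximated outside dtSearch with a short sketch. It uses Python's re module, whose syntax differs from TR1 in places but agrees on a simple pattern like this one. The function name matching_words and the word-splitting rule are assumptions for illustration only; they do not reproduce how dtSearch indexes text.

```python
import re


def matching_words(pattern, text):
    """Return the words in text that the pattern matches in full."""
    # Assumption: runs of letters and digits stand in for indexed words.
    words = re.findall(r"[A-Za-z0-9]+", text)
    # fullmatch requires the pattern to cover one entire word.
    return [w for w in words if re.fullmatch(pattern, w, re.IGNORECASE)]


# Neither "apple" nor "pie" matches on its own, so nothing is found.
print(matching_words(r"app.*ie", "apple pie"))           # []
# Single words that fit the whole pattern are found.
print(matching_words(r"app.*ie", "appie applie apple"))  # ['appie', 'applie']
```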
A regular expression is like the * wildcard character in its effect on search speed: the closer to the front of a word the expression is, the more it will slow searching. "Appl.*" will be nearly as fast as "Apple", while ".*pple" will be much slower.
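One way to picture this effect is a sorted word list in which a literal prefix narrows the range of words that have to be tested. The sketch below assumes such a list purely for illustration. It is not a description of dtSearch internals, and the function name candidates is hypothetical.

```python
import bisect
import re


def candidates(sorted_words, literal_prefix):
    """Return the words in sorted_words that start with literal_prefix."""
    if not literal_prefix:
        return sorted_words  # no literal prefix: every word must be tested
    lo = bisect.bisect_left(sorted_words, literal_prefix)
    hi = bisect.bisect_left(sorted_words, literal_prefix + "\uffff")
    return sorted_words[lo:hi]


words = sorted(["apple", "apply", "banana", "cherry", "grapple", "ripple"])

# "appl.*" starts with the literal "appl", so only two words are tested.
narrow = candidates(words, "appl")
# ".*pple" has no literal start, so all six words are tested.
wide = candidates(words, "")

print(len(narrow), [w for w in narrow if re.fullmatch(r"appl.*", w)])
print(len(wide), [w for w in wide if re.fullmatch(r".*pple", w)])
```

In this toy list the first search tests 2 words and the second tests all 6, which mirrors the difference in speed described above.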
Using Options.MatchDigitChar, you can assign a character to be a wildcard that will match a single digit. In dtSearch Desktop, the = character is the MatchDigitChar.
MatchDigitChar is faster than regular expressions for matching patterns of numbers. For example, to search for a social security number, you could use "=== == ====" instead of the equivalent regular expression.
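To show what such a digit pattern corresponds to, the sketch below translates each word of the pattern into a regular expression and tests it against sample text. The function names are hypothetical, the = character is taken from the dtSearch Desktop setting mentioned above, and splitting on spaces is a simplifying assumption about how words are separated.

```python
import re


def digit_pattern_to_regexes(pattern, digit_char="="):
    """Translate each word of a digit-wildcard pattern into a regex."""
    regexes = []
    for word in pattern.split():
        # Each digit wildcard becomes [0-9]; other characters are literal.
        parts = ["[0-9]" if ch == digit_char else re.escape(ch) for ch in word]
        regexes.append("".join(parts))
    return regexes


def matches(pattern, text):
    """Check whether the words of text fit the digit pattern word by word."""
    regexes = digit_pattern_to_regexes(pattern)
    words = text.split()
    return len(words) == len(regexes) and all(
        re.fullmatch(r, w) for r, w in zip(regexes, words)
    )


print(digit_pattern_to_regexes("=== == ===="))
# ['[0-9][0-9][0-9]', '[0-9][0-9]', '[0-9][0-9][0-9][0-9]']
print(matches("=== == ====", "123 45 6789"))  # True
print(matches("=== == ====", "123 45 678"))   # False
```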
Copyright (c) 1995-2012 dtSearch Corp. All rights reserved.