dtSearch Text Retrieval Engine Programmer's Reference
Regular Expression Searching

Regular expression searching provides a way to search for advanced combinations of characters.

A regular expression included in a search request must be quoted and must begin with ##. Examples:

Apple and "##199[0-9]"
Apple and "##19[0-9]+"
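
The quoting rule above can be sketched as a small helper that builds the request string. This is only an illustration: the function names regex_term and and_request are hypothetical, not part of the dtSearch API, and rejecting patterns that contain a double quote is an assumption made to keep the quoted term well-formed.

```python
def regex_term(pattern: str) -> str:
    """Wrap a regular expression as a quoted search term that begins with ##."""
    # Assumption: a double quote inside the pattern would end the quoted term early.
    if '"' in pattern:
        raise ValueError("pattern must not contain a double quote")
    return f'"##{pattern}"'


def and_request(word: str, pattern: str) -> str:
    """Combine a plain word and a regular expression term with 'and'."""
    return f"{word} and {regex_term(pattern)}"


if __name__ == "__main__":
    print(and_request("Apple", "199[0-9]"))
    print(and_request("Apple", "19[0-9]+"))
```

The resulting string would then be passed to the search engine as the search request.
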
Syntax

dtSearch versions 7.62 and later use TR1 regular expressions. For more information on TR1 regular expressions, see: 

http://msdn.microsoft.com/en-us/library/bb982727.aspx 

Limitations

(1) A regular expression must match a single whole word. For example, a search for "##app.*ie" would not find "apple pie". 

(2) Only letters are searchable. Characters that are not indexed as letters are not searchable even using regular expressions, because the index does not contain any information about them. 

(3) Because the dtSearch index does not store information about line breaks, searches that include beginning-of-line or end-of-line regular expression criteria (^ and $) will not work. 

(4) No case or other conversion is done on regular expressions, so a regular expression must match the case of the information stored in the index. If an index is case-insensitive, all letters in the regular expression must be lower-case. If a character is not searchable in the index, then it cannot be included as a searchable character in the regular expression. Non-searchable characters in a regular expression are not ignored as they are in other search expressions.
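
Limitations (1) and (4) can be illustrated with a rough model of a word index. This is a sketch under stated assumptions, not how dtSearch is implemented: Python's re module stands in for the TR1 regular expression engine, words are assumed to be runs of ASCII letters and digits, and a case-insensitive index is assumed to store every word in lower case. The names index_words and regex_hits are hypothetical.

```python
import re


def index_words(text, case_sensitive=False):
    """Split text into words, as a stand-in for what an index might store."""
    # Assumption: only ASCII letters and digits are indexed as searchable characters.
    words = re.findall(r"[A-Za-z0-9]+", text)
    return words if case_sensitive else [w.lower() for w in words]


def regex_hits(pattern, words):
    """Return the words that the pattern matches as a single whole word."""
    rx = re.compile(pattern)
    return [w for w in words if rx.fullmatch(w)]


if __name__ == "__main__":
    words = index_words("Apple pie was served in 1995 and 1987.")
    print(regex_hits("app.*ie", words))   # no single word spans "apple pie"
    print(regex_hits("app.*", words))     # matches the one word "apple"
    print(regex_hits("Apple", words))     # upper-case letter, lower-case index
    print(regex_hits("199[0-9]", words))  # matches the word "1995"
```

In this model the pattern is tested against each word separately, which is why an expression can never match across the space in "apple pie", and why an upper-case letter in the expression finds nothing in a case-insensitive index.
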

Performance

A regular expression affects search speed in the same way as the * wildcard character: the closer the variable part of the expression is to the front of a word, the more it slows searching. For example, "appl.*" will be nearly as fast as "apple", while ".*pple" will be much slower.
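
One way to see why the position of the expression matters is to assume that the index keeps its words in sorted order, so that a fixed leading prefix narrows the range of words that must be tested. That assumption about the index, the simplified prefix extraction, and the names literal_prefix and candidates are all illustrative only and do not describe dtSearch internals.

```python
import bisect
import re


def literal_prefix(pattern):
    """Return the leading run of plain letters and digits in a pattern."""
    if "|" in pattern:
        return ""  # alternation: no single prefix is guaranteed
    prefix = re.match(r"[A-Za-z0-9]*", pattern).group(0)
    rest = pattern[len(prefix):]
    # A quantifier applies to the last literal character, so drop that character.
    if prefix and rest[:1] in ("*", "?", "+", "{"):
        prefix = prefix[:-1]
    return prefix


def candidates(pattern, sorted_words):
    """Return the slice of a sorted word list that shares the literal prefix."""
    prefix = literal_prefix(pattern)
    lo = bisect.bisect_left(sorted_words, prefix)
    hi = bisect.bisect_left(sorted_words, prefix + "\uffff")
    return sorted_words[lo:hi]


if __name__ == "__main__":
    words = sorted(["apple", "apples", "applet", "apply",
                    "banana", "grapple", "nipple", "ripple"])
    print(len(candidates("appl.*", words)), "words to test for appl.*")
    print(len(candidates(".*pple", words)), "words to test for .*pple")
```

Under this model, "appl.*" only has to be tested against the words that start with "appl", while ".*pple" has no fixed prefix and must be tested against every word in the list.
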

Searching for numbers

Using Options.MatchDigitChar, you can assign a character to be a wildcard that will match a single digit. In dtSearch Desktop, the = character is the MatchDigitChar. 

MatchDigitChar is faster than regular expressions for matching patterns of numbers. For example, to search for a social security number, you could use "=== == ====" instead of the equivalent regular expression.
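
To make the comparison concrete, the sketch below translates a digit pattern such as "=== == ====" into one regular expression per word, since a regular expression must match a single whole word. It assumes = is the MatchDigitChar, as in dtSearch Desktop, and uses Python's re module only to check the translation. The names digit_pattern_to_regexes and matches are hypothetical and are not part of the dtSearch API.

```python
import re

MATCH_DIGIT_CHAR = "="  # assumed setting, as in dtSearch Desktop


def digit_pattern_to_regexes(pattern, digit_char=MATCH_DIGIT_CHAR):
    """Translate each space-separated word of a digit pattern into a regex."""
    regexes = []
    for word in pattern.split():
        parts = ["[0-9]" if ch == digit_char else re.escape(ch) for ch in word]
        regexes.append("".join(parts))
    return regexes


def matches(pattern, text):
    """Check whether the words of text match the digit pattern word by word."""
    regexes = digit_pattern_to_regexes(pattern)
    words = text.split()
    return len(words) == len(regexes) and all(
        re.fullmatch(rx, w) for rx, w in zip(regexes, words)
    )


if __name__ == "__main__":
    print(digit_pattern_to_regexes("=== == ===="))
    print(matches("=== == ====", "123 45 6789"))
    print(matches("=== == ====", "123 456 789"))
```

The digit pattern is both shorter to write and, as noted above, faster to search than the per-word regular expressions it corresponds to.
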

Copyright (c) 1995-2021 dtSearch Corp. All rights reserved.