dtSearch Text Retrieval Engine Programmer's Reference 7.70
UnicodeFilterFlags Enumeration

Values for Options.UnicodeFilterFlags (.NET), dtsOptions.unicodeFilterFlags (C++), or Options.setUnicodeFilterFlags (Java).

enum UnicodeFilterFlags {
  dtsoUfExtractAsHtml = 0x0001,
  dtsoUfOverlapBlocks = 0x0002,
  dtsoUfAutoWordBreakByLength = 0x0004,
  dtsoUfAutoWordBreakByCase = 0x0008,
  dtsoUfAutoWordBreakOnDigit = 0x0010,
  dtsoUfAutoWordBreakOverlapWords = 0x0020,
  dtsoUfFilterFailedDocs = 0x0040,
  dtsoUfFilterAllDocs = 0x0080
};
File

dtsearch.h

Members
Members 
Description 
dtsoUfExtractAsHtml = 0x0001 
Extracting blocks as HTML has no effect on the text that is extracted, but it adds additional information in HTML comments to each extracted block. 
dtsoUfOverlapBlocks = 0x0002 
Overlapping blocks prevents text that crosses a block boundary from being missed in the filtering process. With overlapping enabled, each block extends 256 characters past the start of the previous block. 
dtsoUfAutoWordBreakByLength = 0x0004 
Automatically insert a word break in long sequences of letters. A word break will be inserted when the word length reaches Options.MaxWordLength. 
dtsoUfAutoWordBreakByCase = 0x0008 
Automatically insert a word break when a capital letter appears following lower-case letters. 
dtsoUfAutoWordBreakOnDigit = 0x0010 
Automatically insert a word break when a digit follows letters. 
dtsoUfAutoWordBreakOverlapWords = 0x0020 
When a word break is automatically inserted due to dtsoUfAutoWordBreakByLength, overlap the two words generated by the word break. 
dtsoUfFilterFailedDocs = 0x0040 
When a document cannot be indexed due to file corruption or encryption, apply the filtering algorithm to extract text from the file. 
dtsoUfFilterAllDocs = 0x0080 
Ignore file format information and apply Unicode Filtering to all documents. 
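The effect of the two case-based and digit-based word break flags can be pictured with a small standalone sketch. This is not dtSearch code and makes no claim about how the engine implements these rules: the function name sketchAutoWordBreak is hypothetical, it handles ASCII only, and it marks each break with a space purely for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Illustrative only: mimics the documented rules for
// dtsoUfAutoWordBreakByCase and dtsoUfAutoWordBreakOnDigit on ASCII text.
// A space is inserted wherever the flag descriptions say a word break
// would be added. The name and signature are hypothetical.
std::string sketchAutoWordBreak(const std::string& text,
                                bool breakByCase,
                                bool breakOnDigit) {
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        unsigned char cur = static_cast<unsigned char>(text[i]);
        if (i > 0) {
            unsigned char prev = static_cast<unsigned char>(text[i - 1]);
            // Capital letter following a lower-case letter.
            bool caseBreak = breakByCase && std::islower(prev) && std::isupper(cur);
            // Digit following a letter.
            bool digitBreak = breakOnDigit && std::isalpha(prev) && std::isdigit(cur);
            if (caseBreak || digitBreak) {
                out.push_back(' ');
            }
        }
        out.push_back(static_cast<char>(cur));
    }
    return out;
}
```

Under these assumptions, calling sketchAutoWordBreak("invoiceTotal2012", true, true) yields "invoice Total 2012", while enabling only the case rule leaves "Total2012" as a single word.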
Remarks

The UnicodeFilterFlags values control how the Unicode Filtering algorithm behaves when it is used to extract text from binary data. See Filtering Options.
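Because each member occupies a distinct bit, the values can be combined with bitwise OR and tested with bitwise AND. The following is a minimal sketch of that pattern only. It redeclares the enum locally so that it compiles without dtsearch.h, and SketchOptions, exampleFlags, and hasFlag are hypothetical stand-ins rather than part of the dtSearch API. In real code the combined value would be assigned to dtsOptions.unicodeFilterFlags, and dtsearch.h would be included instead of the local copy.

```cpp
// Local copy of the values from the enum listing above, so this sketch
// compiles on its own. Real code should include dtsearch.h instead.
enum UnicodeFilterFlags {
    dtsoUfExtractAsHtml             = 0x0001,
    dtsoUfOverlapBlocks             = 0x0002,
    dtsoUfAutoWordBreakByLength     = 0x0004,
    dtsoUfAutoWordBreakByCase       = 0x0008,
    dtsoUfAutoWordBreakOnDigit      = 0x0010,
    dtsoUfAutoWordBreakOverlapWords = 0x0020,
    dtsoUfFilterFailedDocs          = 0x0040,
    dtsoUfFilterAllDocs             = 0x0080
};

// Hypothetical stand-in for the options structure; only the field
// named in this reference page is modeled, and its type is assumed.
struct SketchOptions {
    long unicodeFilterFlags = 0;
};

// Combine several flags with bitwise OR. The particular selection is
// arbitrary and chosen only to demonstrate the pattern.
inline long exampleFlags() {
    return dtsoUfOverlapBlocks
         | dtsoUfAutoWordBreakByLength
         | dtsoUfFilterFailedDocs;
}

// Test whether a given flag is present in a combined value.
inline bool hasFlag(long flags, UnicodeFilterFlags flag) {
    return (flags & flag) != 0;
}
```

With this selection the combined value is 0x0046. Note that dtsoUfAutoWordBreakOverlapWords is described in terms of dtsoUfAutoWordBreakByLength, so it is presumably only meaningful when that flag is also set.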

Copyright (c) 1995-2012 dtSearch Corp. All rights reserved.