dtSearch Text Retrieval Engine Programmer's Reference
FieldFlags Enumeration

Control indexing of meta-data associated with documents

File: dtsearch.h

Syntax
C++
enum FieldFlags {
    dtsoFfSkipFilenameField = 0x0001,
    dtsoFfSkipDocumentProperties = 0x0002,
    dtsoFfHtmlShowLinks = 0x0004,
    dtsoFfHtmlShowImgSrc = 0x0008,
    dtsoFfHtmlShowComments = 0x0010,
    dtsoFfHtmlShowScripts = 0x0020,
    dtsoFfHtmlShowStylesheets = 0x0040,
    dtsoFfHtmlShowMetatags = 0x0080,
    dtsoFfHtmlShowNoframesContent = 0x0100,
    dtsoFfHtmlShowHiddenContent = 0x01fc,
    dtsoFfHtmlNoHeaderFields = 0x0200,
    dtsoFfOfficeSkipHiddenContent = 0x0400,
    dtsoFfXmlHideFieldNames = 0x0800,
    dtsoFfShowNtfsProperties = 0x1000,
    dtsoFfXmlSkipAttributes = 0x2000,
    dtsoFfSkipFilenameFieldPath = 0x4000,
    dtsoFfPdfSkipAttachments = 0x8000,
    dtsoFfHtmlSkipInputValues = 0x10000,
    dtsoFfHtmlSkipImageAlt = 0x20000,
    dtsoFfIncludeFileTypeField = 0x40000,
    dtsoFfIncludeFileTypeIdField = 0x80000,
    dtsoFfSkipDataSourceFields = 0x100000,
    dtsoFfSkipEmailHeaders = 0x200000,
    dtsoFfIndexArchiveFileLists = 0x400000,
    dtsoFfIncludeDocumentPropertiesCaption = 0x800000,
    dtsoFfShowImageProperties = 0x1000000,
    dtsoFfSkipEmailProperties = 0x2000000,
    dtsoFfPdfShowLinks = 0x4000000,
    dtsoFfGenerateMd5Hash = 0x8000000,
    dtsoFfGenerateSha256Hash = 0x10000000,
    dtsoFfHtmlIndexHeadersAsFields = 0x20000000,
    dtsoFfNormalizeEmailAddresses = 0x40000000
};
Members
Description
dtsoFfSkipFilenameField = 0x0001
Do not generate a field named Filename containing the name of the file.
dtsoFfSkipDocumentProperties = 0x0002
Do not index or search document summary fields
dtsoFfHtmlShowLinks = 0x0004
Make HTML links searchable
dtsoFfHtmlShowImgSrc = 0x0008
Make HTML IMG src= attribute searchable
dtsoFfHtmlShowComments = 0x0010
Make HTML Comments searchable
dtsoFfHtmlShowScripts = 0x0020
Make HTML Scripts searchable
dtsoFfHtmlShowStylesheets = 0x0040
Make HTML style sheets searchable
dtsoFfHtmlShowMetatags = 0x0080
Make HTML meta tags searchable and visible, appended to the body of the HTML file
dtsoFfHtmlShowNoframesContent = 0x0100
Make content inside NOFRAMES tags searchable and visible, appended to the body of the HTML file
dtsoFfHtmlShowHiddenContent = 0x01fc
All of the dtsoFfHtmlShow* flags
dtsoFfHtmlNoHeaderFields = 0x0200
Obsolete; see dtsoFfHtmlIndexHeadersAsFields. Suppresses generation of the HtmlTitle, HtmlH1, etc. fields. These fields are now disabled by default; set dtsoFfHtmlIndexHeadersAsFields to enable them.
dtsoFfOfficeSkipHiddenContent = 0x0400
Skip non-text streams in Office documents
dtsoFfXmlHideFieldNames = 0x0800
In XML, make field names not searchable
dtsoFfShowNtfsProperties = 0x1000
Make NTFS file properties searchable
dtsoFfXmlSkipAttributes = 0x2000
Do not index attributes in XML files
dtsoFfSkipFilenameFieldPath = 0x4000
Include only the filename (not the path) in the Filename field generated at the end of each document.
dtsoFfPdfSkipAttachments = 0x8000
Skip attachments in PDF files. If a PDF file has attachments, those attachments can be in any file format, so Adobe Reader cannot be used to highlight hits because it can only highlight hits in PDF data. Therefore, a PDF file with attachments must be hit-highlighted through file conversion like other document formats. Skipping PDF attachments enables PDF files with attachments to be hit-highlighted using Adobe Reader.
dtsoFfHtmlSkipInputValues = 0x10000
Skip HTML INPUT tag "value" attributes
dtsoFfHtmlSkipImageAlt = 0x20000
Skip HTML IMG tag "alt" attributes
dtsoFfIncludeFileTypeField = 0x40000
Add a file type field indicating the file format of the document (e.g., "Microsoft Word")
dtsoFfIncludeFileTypeIdField = 0x80000
Add numeric type id field with the type id indicating the file format of the document
dtsoFfSkipDataSourceFields = 0x100000
Suppress fields passed in through the DataSource API, whether via DataSource.DocFields or FileConverter.InputFields.
dtsoFfSkipEmailHeaders = 0x200000
Suppress display of headers in emails.
dtsoFfIndexArchiveFileLists = 0x400000
Index the names of files in ZIP and RAR archives
dtsoFfIncludeDocumentPropertiesCaption = 0x800000
Include a caption "Document Properties" on the table of document properties in Word, Excel, and PowerPoint 2003 documents
dtsoFfShowImageProperties = 0x1000000
Display properties of image files embedded in documents
dtsoFfSkipEmailProperties = 0x2000000
Suppress display of email properties (subject, sender, recipient, etc.)
dtsoFfPdfShowLinks = 0x4000000
Make links in PDF files searchable
dtsoFfGenerateMd5Hash = 0x8000000
Generate an MD5 hash for each document indexed and append it as a field named MD5Hash. Generation of MD5 hashes is time-consuming so this will make indexing slower.
dtsoFfGenerateSha256Hash = 0x10000000
Generate a SHA-256 hash for each document indexed and append it as a field named Sha256Hash. Generation of hashes is time-consuming so this will make indexing slower.
dtsoFfHtmlIndexHeadersAsFields = 0x20000000
Automatically generate HtmlH1, HtmlH2, etc. fields for content inside H1, H2, etc. tags, and an HtmlTitle field for content inside HTML Title tags. This replaces dtsoFfHtmlNoHeaderFields. Beginning with version 7.88, these rarely-used fields are disabled by default, so dtsoFfHtmlIndexHeadersAsFields must be set to enable them.
dtsoFfNormalizeEmailAddresses = 0x40000000
Normalize email addresses in email header fields, removing extra spacing and quotation marks and moving comments to the end of the name or address portion.
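
The member list describes dtsoFfHtmlShowHiddenContent as "all of the dtsoFfHtmlShow* flags". The sketch below checks that claim at compile time by OR-ing the seven individual values from the Syntax section. The enum name HtmlShowFlags and the constant kAllHtmlShowFlags are illustrative only; real code would include dtsearch.h instead of redeclaring the values.

```cpp
// Values copied from the Syntax section of this page. HtmlShowFlags is a
// stand-in name for illustration; production code would include dtsearch.h.
enum HtmlShowFlags {
    dtsoFfHtmlShowLinks           = 0x0004,
    dtsoFfHtmlShowImgSrc          = 0x0008,
    dtsoFfHtmlShowComments        = 0x0010,
    dtsoFfHtmlShowScripts         = 0x0020,
    dtsoFfHtmlShowStylesheets     = 0x0040,
    dtsoFfHtmlShowMetatags        = 0x0080,
    dtsoFfHtmlShowNoframesContent = 0x0100,
    dtsoFfHtmlShowHiddenContent   = 0x01fc
};

// Bitwise OR of every individual dtsoFfHtmlShow* flag.
constexpr int kAllHtmlShowFlags =
    dtsoFfHtmlShowLinks | dtsoFfHtmlShowImgSrc | dtsoFfHtmlShowComments |
    dtsoFfHtmlShowScripts | dtsoFfHtmlShowStylesheets |
    dtsoFfHtmlShowMetatags | dtsoFfHtmlShowNoframesContent;

// Fails to compile if the composite value ever drifts from its parts.
static_assert(kAllHtmlShowFlags == dtsoFfHtmlShowHiddenContent,
              "dtsoFfHtmlShowHiddenContent should equal the OR of all Show flags");
```

Because each flag occupies a distinct bit, any combination of members can be built the same way with the | operator.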

FieldFlags provide options to control the indexing of meta-data associated with documents. When highlighting hits, it is important to make sure that FieldFlags has the same options that were used when a document was indexed. Otherwise, hit highlighting may be incorrect due to differences in the words found in each document. 
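
One way to guard against the mismatch described above is to record the flags in effect when an index was built and compare them with the flags in effect before highlighting. The following is a minimal sketch under that assumption; how an application stores the index-time value is not specified by this page, and the helper names differingFlags and safeToHighlight are hypothetical.

```cpp
// Subset of values copied from the Syntax section; FieldFlagsSubset is a
// stand-in name. Production code would include dtsearch.h instead.
enum FieldFlagsSubset {
    dtsoFfSkipFilenameField = 0x0001,
    dtsoFfHtmlShowComments  = 0x0010
};

// Hypothetical helper: returns the bits that differ between the flags used at
// indexing time and the flags currently in effect. Zero means they agree.
constexpr long differingFlags(long indexTimeFlags, long highlightTimeFlags) {
    return indexTimeFlags ^ highlightTimeFlags;
}

// Hypothetical helper: true only when both settings are identical.
constexpr bool safeToHighlight(long indexTimeFlags, long highlightTimeFlags) {
    return differingFlags(indexTimeFlags, highlightTimeFlags) == 0;
}
```

If safeToHighlight returns false, the differing bits identify which options to restore before highlighting.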

By default, dtSearch will index fields in documents such as the Summary Information fields in Word files and META tags in HTML files. FieldFlags can be used to suppress some or all of this metadata. 

dtSearch will also add a "Filename" field to the end of each document, with the full path and filename of the document, so words in the document name will be searchable like other text. To suppress this completely, use dtsoFfSkipFilenameField. To include only the name of the document (not the path), use dtsoFfSkipFilenameFieldPath.
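
The three Filename behaviours described above (full path, name only, no field) can be mapped to flag values as sketched below. FilenameFieldMode and filenameFieldFlags are hypothetical application-level names, not part of the dtSearch API; only the two flag values come from this page.

```cpp
#include <cassert>

// Values copied from the Syntax section; FilenameFlags is a stand-in name.
// Production code would include dtsearch.h instead.
enum FilenameFlags {
    dtsoFfSkipFilenameField     = 0x0001,
    dtsoFfSkipFilenameFieldPath = 0x4000
};

// Hypothetical application setting describing what the Filename field holds.
enum class FilenameFieldMode { FullPath, NameOnly, None };

// Returns the FieldFlags bits that produce the requested behaviour.
inline long filenameFieldFlags(FilenameFieldMode mode) {
    switch (mode) {
        case FilenameFieldMode::FullPath: return 0;  // default: path and name
        case FilenameFieldMode::NameOnly: return dtsoFfSkipFilenameFieldPath;
        case FilenameFieldMode::None:     return dtsoFfSkipFilenameField;
    }
    assert(false && "unhandled FilenameFieldMode");
    return 0;
}
```

The returned value would be OR-ed into whatever other FieldFlags the application already uses.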

The dtsoFfHtmlShow* flags can be used to make normally hidden HTML elements, such as styles or links, visible and searchable. For each category of element that is enabled, a section will be added to the end of the HTML file listing the items in that category. For example, if dtsoFfHtmlShowComments is set, then each HTML file will have a list of the embedded comments after the body of the HTML.
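
Since each category has its own bit, an application can enable every hidden HTML category except one, or test whether a given category is enabled. The sketch below illustrates both using values from the Syntax section; the helper names withoutCategory and showsCategory are hypothetical.

```cpp
// Subset of values copied from the Syntax section; HtmlShowSubset is a
// stand-in name. Production code would include dtsearch.h instead.
enum HtmlShowSubset {
    dtsoFfHtmlShowLinks         = 0x0004,
    dtsoFfHtmlShowComments      = 0x0010,
    dtsoFfHtmlShowScripts       = 0x0020,
    dtsoFfHtmlShowHiddenContent = 0x01fc
};

// Hypothetical helper: clears one category's bit from a flag mask.
constexpr long withoutCategory(long flags, long category) {
    return flags & ~category;
}

// Hypothetical helper: true when every bit of the category is set in flags.
constexpr bool showsCategory(long flags, long category) {
    return (flags & category) == category;
}

// Every hidden HTML category except scripts.
constexpr long kShowAllButScripts =
    withoutCategory(dtsoFfHtmlShowHiddenContent, dtsoFfHtmlShowScripts);
```

With kShowAllButScripts set, comments and links would be listed after the HTML body while script content stays hidden.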

API

C++: dtsOptions.fieldFlags 

Java: Options.setFieldFlags() 

.NET: Options.FieldFlags 

COM: Options.FieldFlags
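
When changing flags on an options structure, a read-modify-write pattern preserves any bits that are already set. The sketch below uses a stand-in struct that models only the fieldFlags member named in the C++ entry above; the member's type (long) and the helper functions are assumptions for illustration, and the calls that load and save the real dtsOptions are not shown.

```cpp
// Stand-in for dtsOptions, modelling only the fieldFlags member. The type
// `long` is an assumption. Production code would include dtsearch.h instead.
struct OptionsStandIn {
    long fieldFlags = 0;
};

// Subset of values copied from the Syntax section; stand-in enum name.
enum FieldFlagsSample {
    dtsoFfSkipDocumentProperties = 0x0002,
    dtsoFfGenerateSha256Hash     = 0x10000000
};

// Hypothetical helpers: set or clear bits without disturbing the others.
inline void enableFlags(OptionsStandIn& opts, long flags) {
    opts.fieldFlags |= flags;
}

inline void disableFlags(OptionsStandIn& opts, long flags) {
    opts.fieldFlags &= ~flags;
}
```

Assigning a single flag directly to fieldFlags would instead discard every option that was previously enabled.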

Copyright (c) 1995-2023 dtSearch Corp. All rights reserved.