dtSearch Text Retrieval Engine Programmer's Reference
Automatically Detected Fields

Fields (metadata) that dtSearch automatically detects and indexes in documents.

The dtSearch Engine automatically detects fields in the following file formats:

File Format
    Fields
Email files (Outlook Express, Eudora, MBOX, EML)
    To, CC, BCC, From, Sent Via, Sender, Recipient, Subject, Date, Attachments
Outlook items and .MSG files
    To, CC, BCC, From, Sent Via, Sender, Recipient, Subject, Date, Sent Date, Delivered Date, Attachments, contact fields (StreetAddress, CompanyName, etc.)
Microsoft Word, Excel, PowerPoint
    Document summary information fields
OpenOffice/Open Document Format
    Document properties fields
HTML
    META tags; <TITLE> is indexed as the HtmlTitle field; <H1>, <H2>, <H3>, etc. are indexed as HtmlH1, HtmlH2, HtmlH3, etc.
XML
    All fields
DBF
    All fields
CSV
    All fields (CSV, or comma-separated values, files must have a .csv extension, list the field names in the first line, and use tab, comma, or semicolon delimiters)
PDF files
    Document Properties
WordPerfect
    Document summary information fields
MP3
    All metadata fields
ASF, WMA, WMV
    All metadata fields
JPG, TIF
    XMP (Vista), EXIF and IPTC properties
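
The CSV requirements listed above (a .csv extension, field names in the first line, and a tab, comma, or semicolon delimiter) can be checked before a file is handed to the indexer. The following Python sketch is illustrative only and is not part of the dtSearch API: the function name header_fields is hypothetical, and both the case-insensitive extension check and the "most frequent delimiter in the header" heuristic are assumptions made for this example.

```python
import csv
from pathlib import Path

# Delimiters named in the CSV requirements above.
ALLOWED_DELIMITERS = ("\t", ",", ";")


def header_fields(filename: str, text: str) -> list[str]:
    """Return the field names from the first line, or raise ValueError.

    Hypothetical helper: it mirrors the documented requirements and
    does not call the dtSearch Engine.
    """
    # Assumption: the extension is compared case-insensitively.
    if Path(filename).suffix.lower() != ".csv":
        raise ValueError("file must have a .csv extension")

    lines = text.splitlines()
    if not lines or not lines[0].strip():
        raise ValueError("first line must list the field names")

    header = lines[0]
    # Assumption: take the allowed delimiter that occurs most often
    # in the header line.
    delimiter = max(ALLOWED_DELIMITERS, key=header.count)
    fields = next(csv.reader([header], delimiter=delimiter))
    return [field.strip() for field in fields]


names = header_fields("people.csv", "Name;City;Email\nAda;London;ada@example.com\n")
```

A file that fails these checks would not satisfy the requirements stated in the table, so a pre-check like this may help explain why expected fields are missing from an index.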

To prevent the dtSearch Engine from indexing these fields, set the dtsoFfSkipDocumentProperties flag in Options.FieldFlags. This setting does not affect CSV, XML, or DBF files.

The NTFS file system supports file properties for other formats. Set the dtsoFfShowNtfsProperties flag in Options.FieldFlags to have the dtSearch Engine check for and index these properties, where present.
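
Because both settings are described as flags in Options.FieldFlags, they are presumably combined as bit flags. The Python sketch below only models that pattern and does not call the dtSearch Engine: the FieldFlags class here is a stand-in, its bit values are placeholders generated by auto() rather than the real constants, and build_field_flags is a hypothetical helper. In a real program the constants and the Options object would come from the dtSearch API for your language.

```python
from enum import IntFlag, auto


class FieldFlags(IntFlag):
    """Stand-in for the dtSearch FieldFlags constants.

    The bit values are placeholders assigned by auto(); use the
    constants shipped with the dtSearch API in real code.
    """
    dtsoFfSkipDocumentProperties = auto()
    dtsoFfShowNtfsProperties = auto()


def build_field_flags(skip_document_properties: bool,
                      show_ntfs_properties: bool) -> FieldFlags:
    """Combine the requested flags with bitwise OR (hypothetical helper)."""
    flags = FieldFlags(0)
    if skip_document_properties:
        flags |= FieldFlags.dtsoFfSkipDocumentProperties
    if show_ntfs_properties:
        flags |= FieldFlags.dtsoFfShowNtfsProperties
    return flags


# Request NTFS properties only; the result would then be assigned to
# Options.FieldFlags through the real API.
flags = build_field_flags(skip_document_properties=False,
                          show_ntfs_properties=True)
```

How the two flags interact when both are set is not stated in this topic, so test the combination against your own documents before relying on it.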

Copyright (c) 1995-2023 dtSearch Corp. All rights reserved.