Note: For updates to the list of supported file types, see: http://support.dtsearch.com/faq/dts0103.htm
dtSearch can automatically recognize, index, search, and display documents in the following current formats, with graphic marking of hits and multiple hit and file navigation options. HTML and PDF documents appear with all formatting, embedded images, and links intact, exactly as in the original document. The dtSearch developer product can display XML files with XSL formatting. dtSearch converts other file types to HTML for display with highlighted hits. dtSearch uses its own built-in file viewers for document parsing and display, unless otherwise noted. All file formats are supported through the current release versions, unless otherwise noted.
While extensions are provided for some file formats below, dtSearch generally does not rely on extensions to detect file formats. For example, a Word document named "sample.mp3" would still be identified as a Word document.
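As a rough illustration of detecting a format from content rather than from the file name, the sketch below checks the leading bytes of a file against a few widely documented signatures. It is not dtSearch's detection logic, which is not described here; the function name `sniff_format`, the signature table, and the labels are assumptions made for this example.

```python
# Hypothetical sketch: identify a file type from its leading bytes.
# This is NOT dtSearch's algorithm; it only illustrates the general idea
# of content-based detection that ignores the file extension.

# Well-known file signatures ("magic numbers") and illustrative labels.
MAGIC_SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (legacy Word, Excel, PowerPoint)"),
    (b"PK\x03\x04", "ZIP container (ZIP, DOCX, XLSX, PPTX)"),
    (b"\x1f\x8b", "GZIP"),
    (b"{\\rtf", "Rich Text Format"),
    (b"\xff\xd8\xff", "JPEG"),
]


def sniff_format(data: bytes) -> str:
    """Return a label for the first signature that matches the start of data."""
    for signature, label in MAGIC_SIGNATURES:
        if data.startswith(signature):
            return label
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(16))
```

Under these assumptions, a legacy Word document renamed to "sample.mp3" would still be reported as an OLE2 compound file, because only the leading bytes are inspected and the name is never consulted.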
Adobe Acrobat (*.pdf)
Ami Pro (*.sam)
Ansi Text (*.txt)
ASCII Text (See note 3)
ASF media files (metadata only) (*.asf)
CSV (Comma-separated values) (*.csv)
DBF (*.dbf)
EBCDIC
EML files (emails saved by Outlook Express) (*.eml)
Enhanced Metafile Format (*.emf)
Eudora MBX message files (*.mbx)
GZIP (*.gz)
HTML (*.htm, *.html)
JPG (*.jpg)
MBOX email archives (including Thunderbird) (*.mbx)
MHT archives (HTML archives saved by Internet Explorer) (*.mht)
MIME messages
MSG files (emails saved by Outlook) (*.msg)
Microsoft Access MDB files (see note 1) (*.mdb)
Microsoft Document Imaging (*.mdi)
Microsoft Excel (*.xls)
Microsoft Excel 2003 XML (*.xml)
Microsoft Excel 2007 (*.xlsx)
Microsoft Outlook/Exchange (See note 2)
Microsoft Outlook Express 5 and 6 (*.dbx) message stores
Microsoft PowerPoint (*.ppt)
Microsoft PowerPoint 2007 (*.pptx)
Microsoft Rich Text Format (*.rtf)
Microsoft Searchable Tiff (*.tiff)
Microsoft Word for DOS (*.doc)
Microsoft Word for Windows (*.doc)
Microsoft Word 2003 XML (*.xml)
Microsoft Word 2007 (*.docx)
Microsoft Works (*.wks)
MP3 (metadata only) (*.mp3)
Multimate Advantage II (*.dox)
Multimate version 4 (*.doc)
OpenOffice 2.x and 1.x documents, spreadsheets, and presentations (*.sxc, *.sxd, *.sxi, *.sxw, *.sxg, *.stc, *.sti, *.stw, *.stm, *.odt, *.ott, *.odg, *.otg, *.odp, *.otp, *.ods, *.ots, *.odf) (includes OASIS Open Document Format for Office Applications)
TAR (*.tar)
TIF (*.tif)
TNEF (winmail.dat)
Treepad HJT files (*.hjt)
Unicode (UCS16, Mac or Windows byte order, or UTF-8)
Windows Metafile Format (*.wmf)
WMA media files (metadata only) (*.wma)
WMV video files (metadata only) (*.wmv)
WordPerfect 4.2 (See note 3) (*.wpd, *.wpf)
WordPerfect (5.0 and later) (*.wpd, *.wpf)
WordStar version 1, 2, 3 (See note 3) (*.ws)
WordStar versions 4, 5, 6 (*.ws)
WordStar 2000
Write (*.wri)
XBase (including FoxPro, dBase, and other XBase-compatible formats) (*.dbf)
XML (*.xml)
XML Paper Specification (*.xps)
XSL
XyWrite (See note 3)
ZIP (*.zip)
[1] Databases. Using ODBC, dtSearch can also index and display records in Access databases. Each record is treated as a separate document. XBase databases are indexed without using ODBC. For information on indexing SQL databases, click here.
[2] Outlook and Exchange. dtSearch Desktop can index Outlook and Exchange message stores using MAPI. For more information, click here.
[3] Older Word Processor Formats. dtSearch can index and display, but cannot automatically recognize, documents in the following formats:
WordPerfect 4.2
WordStar versions before 4
XyWrite
ASCII Text
In dtSearch Desktop, click Options > Preferences > File Types to tell dtSearch how to identify these types of files.
[4] Web Sites. dtSearch Desktop/Network includes a spider that can index and search dynamically-generated content or static content on web sites. For more information, click here.
The dtSearch Engine automatically detects fields in the following file formats:
| File format | Fields |
|---|---|
| Email files (Outlook Express, Eudora, MBOX, EML) | Sender, Recipient, Subject |
| Outlook items and .MSG files | Sender, Recipient, Subject, contact fields (StreetAddress, CompanyName, etc.) |
| Microsoft Word, Excel, PowerPoint | Document summary information fields |
| OpenOffice/Open Document Format | Document properties fields |
| HTML | META tags; <TITLE> is indexed as HtmlTitle field; <H1>, <H2>, <H3> are indexed as HtmlH1, HtmlH2, HtmlH3, etc. |
| XML | All fields |
| DBF | All fields |
| CSV | All fields (CSV, or comma-separated values, files must have a .csv extension, a list of field names in the first line, and must use tab, comma, or semicolon delimiters) |
| PDF files | Document Properties |
| WordPerfect | Document summary information fields |
| MP3 | All metadata fields |
| JPEG, TIFF | XMP (Vista), EXIF and IPTC metadata |
| ASF, WMA, WMV | All metadata fields |
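The CSV requirements in the table (a .csv extension, field names in the first line, and a tab, comma, or semicolon delimiter) can be checked before indexing. The following is a minimal sketch of such a pre-check using Python's standard `csv` module. The function name `csv_field_names` and the rule of picking the most frequent allowed delimiter in the header are assumptions for this example, not a description of how dtSearch parses CSV files.

```python
import csv
import io
from pathlib import PurePath

# Delimiters the table above lists as acceptable: comma, semicolon, tab.
ALLOWED_DELIMITERS = ",;\t"


def csv_field_names(filename: str, text: str) -> list[str]:
    """Return the field names from the first line, or raise ValueError.

    Hypothetical pre-check; the delimiter is guessed as whichever allowed
    character occurs most often in the header line.
    """
    if PurePath(filename).suffix.lower() != ".csv":
        raise ValueError("file must have a .csv extension")

    lines = text.splitlines()
    if not lines or not lines[0].strip():
        raise ValueError("first line must list the field names")

    header = lines[0]
    delimiter = max(ALLOWED_DELIMITERS, key=header.count)
    reader = csv.reader(io.StringIO(header), delimiter=delimiter)
    return [name.strip() for name in next(reader)]
```

For example, calling `csv_field_names("people.csv", "name;email;city\nAnn;a@x.org;Oslo\n")` yields the three header names, while the same content under the name "people.txt" is rejected.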
dtSearch will still index, search, and display other file formats, but they will be treated as binary file types. In other words, binary codes and similar non-text data will be displayed along with the text. dtSearch can also use a proprietary binary file filtering algorithm to clean up these file formats. For more information, see Indexing Options in the dtSearch help file.
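dtSearch's binary filtering algorithm is proprietary and is not described here. As a generic stand-in that only conveys the idea of separating text from binary codes, the sketch below keeps runs of printable ASCII characters, much like the Unix `strings` utility. The function name `extract_text_runs` and the minimum run length of four characters are assumptions chosen for illustration.

```python
import re

# Runs of at least four printable ASCII bytes (space through tilde).
# The threshold of four is an arbitrary choice for this sketch.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def extract_text_runs(data: bytes) -> list[str]:
    """Return the printable ASCII runs found in binary data, in order."""
    return [match.group().decode("ascii") for match in PRINTABLE_RUN.finditer(data)]
```

Applied to `b"\x00\x01Hello world\xff\xfeab\x00Report 2024\x00"`, this keeps "Hello world" and "Report 2024" and drops the two-character fragment "ab" along with the surrounding binary bytes.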
For legacy file types in which multiple messages or log entries are stored in one very large text file, use the dtSearch File Segmentation Rules feature to tell dtSearch how to break up the file into multiple logical subdocuments. For more information, see File Segmentation Rules in the dtSearch help file.
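The idea of breaking one large text file into logical subdocuments can be sketched as follows. This is not the syntax or behavior of dtSearch File Segmentation Rules, which is documented in the help file; it is a hypothetical illustration in which a regular expression marks the first line of each subdocument. The function name `segment_text` and the pattern used in the example are assumptions.

```python
import re


def segment_text(text: str, start_pattern: str) -> list[str]:
    """Split text into segments, starting a new one at each matching line.

    start_pattern is matched against the beginning of every line. Any text
    before the first matching line becomes its own leading segment.
    """
    boundary = re.compile(start_pattern)
    segments: list[str] = []
    current: list[str] = []
    for line in text.splitlines(keepends=True):
        # A boundary line closes the segment collected so far.
        if boundary.match(line) and current:
            segments.append("".join(current))
            current = []
        current.append(line)
    if current:
        segments.append("".join(current))
    return segments
```

For a file in which each message begins with a line starting with "From ", `segment_text(text, r"From ")` returns one string per message, and each string could then be handled as a separate document.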
dtSearch Desktop/Network can display images in the following formats:
BMP
EPSF
GIF
IMG
JPEG
PCX
PNG
TIFF
Targa
WMF
WPG (WPG version 1.0 only)