File type identifiers
File
File: TypeId.java
Package: com.dtsearch.engine
Syntax
Fields
Field | Description |
---|---|
7-zip archive (not supported) | |
ASF file | |
AVI file | |
Ami Pro | |
ANSI text file | |
ASCII (DOS) text file | |
BMP image file | |
Binary file (unrecognized format) | |
CAB archive | |
Comma-separated values file | |
CALS metadata format described in MIL-STD-1840C | |
OLE Compound Document (or "DocFile") | |
Output format for FileConverter that organizes document content, metadata, and attachments into a standard XML format | |
CSV file parsed as a single file listing all records | |
CSV file parsed as report (like a spreadsheet) instead of a database | |
XBase database file | |
DWF CAD file | |
DWG CAD file | |
DXF CAD file | |
Record in a database file (such as XBase or Access) | |
Database record (rendered as HTML) | |
Compound document (new parser) | |
Windows Metafile Format (Win32) | |
MIME stream handled as a single document | |
ELF format executable | |
Obsolete | |
Message in a Eudora message store | |
Excel 2007 | |
Excel 2007 XLSB format | |
Excel version 2 | |
Microsoft Excel 2003 XML format | |
Excel version 3 | |
Excel version 4 | |
Excel versions 5 and 7 | |
Excel 97, 2000, XP, or 2003 | |
Filtered binary file | |
Binary file filtered using Unicode Filtering | |
Binary file filtered using Unicode Filtering, not split into segments | |
Flash SWF | |
GIF image file | |
Archive compressed with gzip | |
HTML | |
HTML Help CHM file | |
Obsolete | |
iCalendar (*.ics) file | |
File type processed using installed IFilter | |
Ichitaro word processor file (versions 8 through 2011) | |
Ichitaro versions 5, 6, 7 | |
JPEG file | |
Windows Media Photo/HDPhoto/*.wdp | |
Lotus 1-2-3 spreadsheet | |
M4A file | |
Email archive conforming to the MBOX standard (dtSearch versions 7.50 and earlier) | |
Email archive conforming to the MBOX standard (dtSearch versions 7.51 and later) | |
MDI image file | |
MIDI file | |
MP3 file | |
MP4 file | |
MPEG file | |
Obsolete | |
Microsoft Works word processor | |
Music or video file | |
Microsoft Access database | |
Microsoft Access (parsed directly, not via ODBC or the Jet Engine) | |
Access database parsed as a single file listing all records | |
Microsoft Office .thmx file with theme data | |
Microsoft Publisher file | |
Microsoft Word 95 - 2003 (dtSearch versions 6.5 and later) | |
FrameMaker MIF file | |
MIME-encoded message, processed as a container | |
dtSearch 6.40 and earlier file parser for .eml files | |
Microsoft Works WPS versions 4 and 5 | |
Microsoft Works WPS versions 6, 7, 8, and 9 | |
Multimate (any version) | |
File indexed with all content ignored (see dtsoIndexBinaryNoContent) | |
Data file with no text to index | |
oledata.mso file | |
Not supported | |
OneNote 2007 | |
OneNote 2010, 2013, and 2016 | |
OneNote variant generated by Microsoft online services | |
OpenOffice versions 1, 2, and 3 documents, spreadsheets, and presentations (*.sxc, *.sxd, *.sxi, *.sxw, *.sxg, *.stc, *.sti, *.stw, *.stm, *.odt, *.ott, *.odg, *.otg, *.odp, *.otp, *.ods, *.ots, *.odf) (includes OASIS Open Document Format for Office Applications) | |
Message in an Outlook Express message store | |
Outlook Express dbx archive (versions 7.67 and earlier) | |
Outlook Express dbx archive | |
Outlook .MSG file processed as a container | |
Microsoft Outlook .MSG file | |
Outlook PST message store | |
PDF | |
PNG image file | |
PDF file with attachments | |
PFS Professional Write file | |
Photoshop Image (*.psd) | |
PowerPoint 97-2003 | |
PowerPoint 2007 | |
PowerPoint 3 | |
PowerPoint 4 | |
PowerPoint 95 | |
PropertySet stream in a Compound Document | |
Quattro Pro 9 and newer | |
Quattro Pro 8 and older | |
QuickTime file | |
RAR archive | |
Microsoft Rich Text Format | |
SASF call center audio file | |
Text segmented using File Segmentation Rules | |
Single-byte text, encoding automatically detected | |
SolidWorks file | |
TAR archive | |
TIFF file | |
Transport-neutral encapsulation format | |
TreePad file (HJT format in TreePad 6 and earlier) | |
TrueType TTF file | |
Output format only, for generating a synopsis that is HTML-encoded but that does not include formatting such as font settings, paragraph breaks, etc. | |
UCS-16 text | |
Unigraphics file (docfile format) | |
Unigraphics file (#UGC format) | |
UTF-8 text | |
Visio file | |
Visio 2013 document | |
Visio XML file | |
WAV sound file | |
Windows Metafile Format (Win16) | |
WordStar 2000 | |
WordStar version 5 or 6 | |
Windows Write | |
Windows .exe or .dll | |
Word 2007 | |
Microsoft Word 2003 XML format | |
Word for DOS (same as Windows Write, it_WinWrite) | |
Obsolete | |
Microsoft Word 6.0 | |
Word for Windows 97, 2000, XP, or 2003 | |
Word for Windows 1 | |
Word for Windows 2 | |
List of words in UTF-8 format, with the word ordinal in front of each word | |
WordPerfect 4.2 | |
WordPerfect 5 | |
WordPerfect 6 | |
WordPerfect document embedded in another file | |
WordStar through version 4 | |
XBase database | |
XBase file parsed as a single file listing all records | |
XML | |
XML Paper Specification (Metro) | |
XFA form | |
XyWrite | |
ZIP archive | |
ZIP file parsed using zlib | |
dtSearch index file | |
iWork 2009 | |
iWork 2009 Keynote presentation | |
iWork 2009 Numbers spreadsheet | |
iWork 2009 Pages document | |
Remarks
Because some older file parsers are still supported for backward compatibility, in a few cases there may be more than one TypeId for a file format. Not all file formats listed are supported for content extraction or indexing. For a current list of supported file types, see: http://support.dtsearch.com/faq/dts0103.htm
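Because one format can map to several identifiers, calling code that cares about the format rather than the parser may want to treat those identifiers as a single family. The following is a minimal, self-contained sketch of that idea. The constant names and integer values are hypothetical stand-ins invented for illustration, not the real TypeId fields; real code would substitute the constants defined in com.dtsearch.engine.TypeId.

```java
import java.util.Map;

final class TypeIdFamilies {
    // Hypothetical stand-ins for TypeId constants (names and values are assumptions).
    static final int MBOX_OLD = 1; // e.g. the older MBOX parser
    static final int MBOX_NEW = 2; // e.g. the newer MBOX parser
    static final int DBX_OLD = 3;  // e.g. the older Outlook Express dbx parser
    static final int DBX_NEW = 4;  // e.g. the newer Outlook Express dbx parser

    // Several identifiers may describe the same underlying file format.
    private static final Map<Integer, String> FAMILY = Map.of(
            MBOX_OLD, "mbox",
            MBOX_NEW, "mbox",
            DBX_OLD, "dbx",
            DBX_NEW, "dbx");

    // Returns the format family for an identifier, or "other" if it is not mapped.
    static String familyOf(int typeId) {
        return FAMILY.getOrDefault(typeId, "other");
    }

    // True when two identifiers belong to the same mapped format family.
    static boolean sameFormat(int a, int b) {
        return FAMILY.containsKey(a) && familyOf(a).equals(familyOf(b));
    }

    public static void main(String[] args) {
        System.out.println(familyOf(MBOX_OLD));
        System.out.println(sameFormat(MBOX_OLD, MBOX_NEW));
        System.out.println(sameFormat(MBOX_NEW, DBX_NEW));
    }
}
```

Comparing families instead of raw identifiers keeps format checks stable when an index contains documents processed by both an older and a newer parser.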
Class Hierarchy
com.dtsearch.engine.TypeId