ExtractionOptions (attached to a FileConvertJob) specifies how embedded images and attachments should be handled.
dtSearch.Engine.ExtractionOptions
C#
public class ExtractionOptions;
Remarks
During conversion, embedded images and attachments can be extracted to a folder, with links to the extracted images and attachments inserted in the output. Example:
dtSearch.Engine.FileConverter conv = new FileConverter();
conv.OutputToString = true;
conv.InputFile = "c:\\docs\\sample.doc";
conv.OutputFormat = OutputFormats.itHTML;
dtSearch.Engine.ExtractionOptions extractionOptions = new dtSearch.Engine.ExtractionOptions();
// This prevents attachments from being created with executable extensions
extractionOptions.AllowedExtensions = "jpg jpeg pdf doc xls ppt zip";
// Attachments with disallowed extensions will have ".data" appended to the output filename
extractionOptions.DefaultExtension = "data";
extractionOptions.OutputLocation = "c:\\output";
extractionOptions.Flags = dtSearch.Engine.ExtractionOptionsFlags.dtsExoExtractAttachments |
dtSearch.Engine.ExtractionOptionsFlags.dtsExoExtractImages |
dtSearch.Engine.ExtractionOptionsFlags.dtsExoLimitExtensions;
extractionOptions.FilenamePrefix = "tmp_";
extractionOptions.OutputReference = extractionOptions.OutputLocation;
extractionOptions.UnnamedAttachmentLinkText = "[Attachment]";
conv.ExtractionOptions = extractionOptions;
conv.Execute();
When highlighting hits, the extraction flags must match the behavior of the indexer; otherwise hit highlighting will be inconsistent. To stay consistent with indexing behavior:
dtsExoExtractAttachments should not be used when highlighting hits because, in some cases, text is added to the conversion output to provide a location for the link to the extracted attachment.
dtsExoDoNotConvertAttachments should not be used when highlighting hits because the indexer does convert attachments to text.
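The two restrictions above can be enforced in one place by masking the flags before a highlighting conversion. The following is a minimal sketch, not part of the dtSearch API: the class name HighlightSafeExtraction and the method Sanitize are hypothetical, and the code assumes dtsExoDoNotConvertAttachments is a member of ExtractionOptionsFlags alongside the flags used in the example above.

```csharp
using dtSearch.Engine;

// Hypothetical helper, not part of the dtSearch API.
public static class HighlightSafeExtraction
{
    // The flags that the remarks say should not be used when highlighting hits.
    // Assumption: dtsExoDoNotConvertAttachments belongs to ExtractionOptionsFlags.
    private const ExtractionOptionsFlags UnsafeForHighlighting =
        ExtractionOptionsFlags.dtsExoExtractAttachments |
        ExtractionOptionsFlags.dtsExoDoNotConvertAttachments;

    // Returns the requested flags with the highlighting-unsafe flags cleared.
    // All other flags (for example dtsExoExtractImages) pass through unchanged.
    public static ExtractionOptionsFlags Sanitize(ExtractionOptionsFlags requested)
    {
        return requested & ~UnsafeForHighlighting;
    }
}
```

With a helper like this, a highlighting conversion could apply extractionOptions.Flags = HighlightSafeExtraction.Sanitize(extractionOptions.Flags); before assigning the options to the FileConverter, while a plain conversion without highlighting would keep the original flags.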