dtSearch Engine API for .NET Framework 2.x-4.x 2023.02
ExtractionOptions Class

ExtractionOptions (attached to a FileConvertJob) specifies how embedded images and attachments should be handled.

dtSearch::Engine::ExtractionOptions
public class ExtractionOptions;

During conversion, embedded images and attachments can be extracted to a folder, with links to the extracted images and attachments inserted in the output. Example:

dtSearch.Engine.FileConverter conv = new FileConverter();
conv.OutputToString = true;
conv.InputFile = "c:\\docs\\sample.doc";
conv.OutputFormat = OutputFormats.itHTML;

dtSearch.Engine.ExtractionOptions extractionOptions = new dtSearch.Engine.ExtractionOptions();

// This prevents attachments from being created with executable extensions
extractionOptions.AllowedExtensions = "jpg jpeg pdf doc xls ppt zip";

// Attachments with disallowed extensions will have ".data" appended to the output filename
extractionOptions.DefaultExtension = "data";

extractionOptions.OutputLocation = "c:\\output";
extractionOptions.Flags =
    dtSearch.Engine.ExtractionOptionsFlags.dtsExoExtractAttachments |
    dtSearch.Engine.ExtractionOptionsFlags.dtsExoExtractImages |
    dtSearch.Engine.ExtractionOptionsFlags.dtsExoLimitExtensions;
extractionOptions.FilenamePrefix = "tmp_";
extractionOptions.OutputReference = extractionOptions.OutputLocation;
extractionOptions.UnnamedAttachmentLinkText = "[Attachment]";

conv.ExtractionOptions = extractionOptions;
conv.Execute();

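When a conversion is used to highlight hits, the extraction flags should be consistent with the behavior of the indexer. Otherwise the converted text can differ from the text that was indexed, and hit highlighting will be inconsistent.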

dtsExoExtractAttachments should not be used when highlighting hits because in some cases text is added to the conversion output to provide a location for the link to the extracted attachment. 

dtsExoDoNotConvertAttachments should not be used when highlighting hits because the indexer does convert attachments to text. 

dtsExoExtractImages can be used when highlighting hits.
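
The guidance above can be sketched as a small helper that enables only image extraction. This is a minimal illustration, not part of the dtSearch API: the class name HighlightConversion and the method CreateOptions are hypothetical, and the sketch assumes a using directive for dtSearch.Engine and a reference to the dtSearch Engine assembly. Only the properties and flags shown in the example earlier on this page are used.

```csharp
using dtSearch.Engine;

public static class HighlightConversion
{
    // Hypothetical helper (not part of the dtSearch API): builds extraction
    // options that follow the hit-highlighting guidance above.
    public static ExtractionOptions CreateOptions(string outputFolder)
    {
        ExtractionOptions options = new ExtractionOptions();

        // Only image extraction is enabled. dtsExoExtractAttachments and
        // dtsExoDoNotConvertAttachments are deliberately left out because
        // both are described as unsuitable when highlighting hits.
        options.Flags = ExtractionOptionsFlags.dtsExoExtractImages;

        // Extracted images are written to, and linked from, the same folder.
        options.OutputLocation = outputFolder;
        options.OutputReference = outputFolder;
        options.FilenamePrefix = "tmp_";
        return options;
    }
}
```

A FileConverter that has been set up for hit highlighting could then be given these options with conv.ExtractionOptions = HighlightConversion.CreateOptions("c:\\output"); before conv.Execute() is called. The hit-related setup of the converter itself is outside the scope of this sketch.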