dtSearch .NET Standard API 2023.02
ExtractionOptions Class

ExtractionOptions (attached to a FileConvertJob) specifies how embedded images and attachments should be handled.

dtSearch.Engine.ExtractionOptions
public class ExtractionOptions;

During conversion, embedded images and attachments can be extracted to a folder, with links to the extracted images and attachments inserted in the output. Example:

dtSearch.Engine.FileConverter conv = new FileConverter();
conv.OutputToString = true;
conv.InputFile = "c:\\docs\\sample.doc";
conv.OutputFormat = OutputFormats.itHTML;

dtSearch.Engine.ExtractionOptions extractionOptions = new dtSearch.Engine.ExtractionOptions();

// This prevents attachments from being created with executable extensions
extractionOptions.AllowedExtensions = "jpg jpeg pdf doc xls ppt zip";

// Attachments with disallowed extensions will have ".data" appended to the output filename
extractionOptions.DefaultExtension = "data";

extractionOptions.OutputLocation = "c:\\output";
extractionOptions.Flags =
    dtSearch.Engine.ExtractionOptionsFlags.dtsExoExtractAttachments |
    dtSearch.Engine.ExtractionOptionsFlags.dtsExoExtractImages |
    dtSearch.Engine.ExtractionOptionsFlags.dtsExoLimitExtensions;
extractionOptions.FilenamePrefix = "tmp_";
extractionOptions.OutputReference = extractionOptions.OutputLocation;
extractionOptions.UnnamedAttachmentLinkText = "[Attachment]";

conv.ExtractionOptions = extractionOptions;
conv.Execute();

When a conversion is used to highlight hits, the flags must be consistent with the behavior of the indexer; otherwise hit highlighting will be inconsistent. To be consistent with indexing behavior:

  • dtsExoExtractAttachments should not be used when highlighting hits because in some cases text is added to the conversion output to provide a location for the link to the extracted attachment.
  • dtsExoDoNotConvertAttachments should not be used when highlighting hits because the indexer does convert attachments to text.
  • dtsExoExtractImages can be used when highlighting hits.
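
The three rules above can be expressed as a small helper that clears the two conflicting flags before a conversion that highlights hits. This is a minimal sketch, assuming the dtSearch.Engine assembly is referenced. HighlightSafeFlags and ForHighlighting are hypothetical names and are not part of the dtSearch API. The sketch uses only the flag names that appear on this page.

```csharp
using dtSearch.Engine;

// Hypothetical helper for illustration; not part of the dtSearch API.
public static class HighlightSafeFlags
{
    // The two flags this page says should not be used when highlighting hits.
    private const ExtractionOptionsFlags Conflicting =
        ExtractionOptionsFlags.dtsExoExtractAttachments |
        ExtractionOptionsFlags.dtsExoDoNotConvertAttachments;

    // Returns the requested flags with the conflicting ones cleared.
    // Other flags, such as dtsExoExtractImages, pass through unchanged.
    public static ExtractionOptionsFlags ForHighlighting(ExtractionOptionsFlags requested)
    {
        return requested & ~Conflicting;
    }
}
```

The result of HighlightSafeFlags.ForHighlighting(...) could then be assigned to extractionOptions.Flags. Clearing bits, rather than hard-coding a single value, leaves any other requested flags in place.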