dtSearch Engine API for Java
ExtractionOptions Class

ExtractionOptions specifies how embedded images and attachments are handled during file conversion. It is attached to a conversion job (a FileConverter in the Java API) with setExtractionOptions.

File: ExtractionOptions.java 

Package: com.dtsearch.engine 

Syntax
Java
public class ExtractionOptions;

During conversion, embedded images and attachments can be extracted to a folder, with links to the extracted images and attachments inserted in the output. 

 

Example:

com.dtsearch.engine.FileConverter cj = new com.dtsearch.engine.FileConverter();
cj.setInputFile(inputFilename);
cj.setOutputToString(true);
cj.setOutputFormat(com.dtsearch.engine.OutputFormats.itHTML);

com.dtsearch.engine.ExtractionOptions exo = new com.dtsearch.engine.ExtractionOptions();
exo.setOutputLocation("c:\\output");
exo.setOutputReference("c:\\output");

// This prevents attachments from being written with executable extensions like .exe or .bat
exo.setAllowedExtensions("doc docx xls xlsx ppt pptx jpg jpeg png txt zip xml pdf wpd");

// Attachments with disallowed extensions will have ".data" appended to the filename
exo.setDefaultExtension("data");

exo.setFlags(ExtractionOptionsFlags.dtsExoExtractImages |
    ExtractionOptionsFlags.dtsExoExtractAttachments |
    ExtractionOptionsFlags.dtsExoLimitExtensions);

cj.setExtractionOptions(exo);
cj.execute();
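
The effect of combining setAllowedExtensions, setDefaultExtension and dtsExoLimitExtensions can be pictured with a small stand-alone sketch. It does not call the dtSearch Engine. The class ExtensionLimitSketch and its limitExtension method are hypothetical names introduced only for illustration, and the parsing rules (space-separated list, case-insensitive comparison, files without an extension treated as disallowed) are assumptions rather than documented engine behavior.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the dtSearch API. It only mimics the
// documented effect of limiting extensions: a disallowed attachment name
// gets "." + defaultExtension appended.
class ExtensionLimitSketch {

    static String limitExtension(String filename, String allowedExtensions,
                                 String defaultExtension) {
        // Assumption: the allowed list is space-separated and case-insensitive.
        Set<String> allowed = new HashSet<>(Arrays.asList(
                allowedExtensions.trim().toLowerCase(Locale.ROOT).split("\\s+")));

        // Take the text after the last dot as the extension ("" if there is no dot).
        int dot = filename.lastIndexOf('.');
        String ext = (dot >= 0)
                ? filename.substring(dot + 1).toLowerCase(Locale.ROOT)
                : "";

        // Assumption: a name without an extension is treated as disallowed.
        if (!ext.isEmpty() && allowed.contains(ext)) {
            return filename;
        }
        return filename + "." + defaultExtension;
    }

    public static void main(String[] args) {
        String allowed = "doc docx xls xlsx ppt pptx jpg jpeg png txt zip xml pdf wpd";
        System.out.println(limitExtension("report.pdf", allowed, "data")); // prints report.pdf
        System.out.println(limitExtension("setup.exe", allowed, "data"));  // prints setup.exe.data
    }
}
```

Appending the default extension, rather than replacing the original one, keeps the original name recoverable while ensuring the saved file cannot be launched as an executable by name alone.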

 

When the converted output is used to highlight hits, the flags should match the way the indexer processed the document; otherwise hits may be highlighted in the wrong place.

dtsExoExtractAttachments should not be used when highlighting hits because in some cases text is added to the conversion output to provide a location for the link to the extracted attachment. 

dtsExoDoNotConvertAttachments should not be used when highlighting hits because the indexer does convert attachments to text. 

dtsExoExtractImages can be used when highlighting hits. 
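
The three rules above can be captured in a small guard that an application runs before using a flag combination for hit highlighting. This is a minimal sketch, not engine code: the class HighlightFlagCheck and its methods are hypothetical, and the numeric constants are placeholders standing in for the real values in com.dtsearch.engine.ExtractionOptionsFlags, which are not given in this topic.

```java
// Hypothetical sketch, not part of the dtSearch API. The constant values are
// placeholders for illustration only; real code should use the constants
// defined in com.dtsearch.engine.ExtractionOptionsFlags.
class HighlightFlagCheck {
    static final int EXTRACT_IMAGES = 1;             // stands in for dtsExoExtractImages
    static final int EXTRACT_ATTACHMENTS = 2;        // stands in for dtsExoExtractAttachments
    static final int DO_NOT_CONVERT_ATTACHMENTS = 4; // stands in for dtsExoDoNotConvertAttachments

    // Flags that change the converted text relative to what the indexer saw.
    static final int UNSAFE_FOR_HIGHLIGHTING =
            EXTRACT_ATTACHMENTS | DO_NOT_CONVERT_ATTACHMENTS;

    // True when none of the flags that disturb hit highlighting are set.
    static boolean safeForHighlighting(int flags) {
        return (flags & UNSAFE_FOR_HIGHLIGHTING) == 0;
    }

    // Clears the unsafe flags and leaves all other bits unchanged.
    static int stripUnsafe(int flags) {
        return flags & ~UNSAFE_FOR_HIGHLIGHTING;
    }

    public static void main(String[] args) {
        int flags = EXTRACT_IMAGES | EXTRACT_ATTACHMENTS;
        System.out.println(safeForHighlighting(flags));              // prints false
        System.out.println(safeForHighlighting(stripUnsafe(flags))); // prints true
    }
}
```

A check like this is most useful when the same code path serves both plain conversion and highlighted display, so that attachment extraction is switched off only in the highlighting case.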

 

com.dtsearch.engine.ExtractionOptions