How to display retrieved HTML documents with hits highlighted
When an HTML file is retrieved in a search, dtSearch can highlight hits in the HTML file by inserting hit highlight markings in the HTML without any conversion. As a result, all aspects of the original HTML file's formatting and structure are preserved.
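As a rough illustration of what "inserting markings without any conversion" means, the following Python sketch wraps hit spans in marker strings and leaves every other character of the HTML untouched. It is not the dtSearch Engine's implementation: the function name `highlight_in_place`, the use of character offsets to describe hits, and the `<b>`/`</b>` markers are all assumptions made for this example.

```python
def highlight_in_place(html, spans, before_hit="<b>", after_hit="</b>"):
    """Insert markers around each (start, end) character span.

    Hypothetical helper: everything outside the inserted markers is copied
    through unchanged, so the original tags and attributes survive.
    Spans are assumed not to overlap.
    """
    out = []
    pos = 0
    for start, end in sorted(spans):
        out.append(html[pos:start])                          # untouched HTML before the hit
        out.append(before_hit + html[start:end] + after_hit)  # the marked hit
        pos = end
    out.append(html[pos:])                                   # untouched remainder
    return "".join(out)


html = '<p class="note">Search <i>engines</i> index text.</p>'
start = html.index("engines")
marked = highlight_in_place(html, [(start, start + len("engines"))])
```

Removing the markers from `marked` gives back the original string exactly, which is the property the paragraph above describes.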
To ensure that dtSearch adds hit highlight markings to an HTML file without performing any conversion:
1. Set the dtsConvertInputIsHtml flag in FileConverter or DFileConvertJob, so the dtSearch Engine will know that the input data is HTML.
2. Set the output format to it_HTML.
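The two steps above, together with the dtsConvertInputIsNotHtml flag from the table of ConvertFlags values, can be summarized as a small decision model. This Python sketch does not call the dtSearch Engine: the bit values, the `IT_HTML` stand-in, the `highlights_in_place` function, and the `detected_as_html` parameter are hypothetical, and rejecting a job that sets both input flags is a choice made for this sketch rather than documented engine behavior.

```python
from enum import IntFlag


class ConvertFlags(IntFlag):
    # Placeholder bit values for this sketch only; the real constants
    # are defined by the dtSearch Engine API.
    dtsConvertInputIsHtml = 0x1
    dtsConvertInputIsNotHtml = 0x2
    dtsConvertRemoveScripts = 0x4
    dtsConvertSkipHiddenHits = 0x8


IT_HTML = "it_HTML"  # stand-in for the it_HTML output format constant


def highlights_in_place(flags, output_format, detected_as_html=False):
    """Return True when this model predicts HTML-to-HTML highlighting."""
    is_html = bool(flags & ConvertFlags.dtsConvertInputIsHtml)
    is_not_html = bool(flags & ConvertFlags.dtsConvertInputIsNotHtml)
    if is_html and is_not_html:
        # Assumption of this sketch: contradictory flags are treated as an error.
        raise ValueError("input cannot be flagged as both HTML and not HTML")
    if output_format != IT_HTML:
        return False          # step 2 not satisfied
    if is_not_html:
        return False          # file is converted to simple text first
    if is_html:
        return True           # step 1 and step 2 both satisfied
    # Neither flag set: the text above does not say what happens,
    # so the sketch takes the outcome as an input.
    return detected_as_html


# Flags are combined with bitwise OR.
flags = ConvertFlags.dtsConvertInputIsHtml | ConvertFlags.dtsConvertRemoveScripts
in_place = highlights_in_place(flags, IT_HTML)
```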
ConvertFlags values that can be used to control highlighting of HTML files are:
| Value | Meaning |
| --- | --- |
| dtsConvertInputIsHtml | Tells the dtSearch Engine to assume that the input file is HTML. |
| dtsConvertInputIsNotHtml | Tells the dtSearch Engine to assume that the input file is not HTML for purposes of deciding whether to do HTML-to-HTML conversion. If the output format is HTML, this forces the dtSearch Engine to convert the file to simple text before adding hit highlight markings. |
| dtsConvertRemoveScripts | JavaScript in HTML files can cause errors if displayed outside of the expected context of the script. This flag tells the dtSearch Engine to remove JavaScript from HTML files when adding hit highlight markings. |
| dtsConvertSkipHiddenHits | If a hit cannot be displayed in HTML because the text is not visible, this flag tells the dtSearch Engine not to insert any tags for the hit. If this flag is not set, a pair of beforeHit/afterHit tags will be added before the next visible text in the file. |

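The dtsConvertSkipHiddenHits behavior can be pictured with a toy model. In this Python sketch the document is a list of `(text, visible, is_hit)` pieces; the function `mark_segments`, that representation, and the `<b>`/`</b>` markers are hypothetical. Emitting one empty marker pair per hidden hit is one reading of the description above, and the sketch simply drops a hidden hit that has no visible text after it.

```python
def mark_segments(segments, skip_hidden_hits, before_hit="<b>", after_hit="</b>"):
    """Toy model of hit marking with and without dtsConvertSkipHiddenHits.

    segments: list of (text, visible, is_hit) tuples. Markup and script
    content are modeled as non-visible pieces that pass through unchanged.
    """
    out = []
    pending = 0  # hidden hits waiting for the next visible text
    for text, visible, is_hit in segments:
        if visible:
            # Flush an empty marker pair for each hidden hit seen so far.
            out.append((before_hit + after_hit) * pending)
            pending = 0
            out.append(before_hit + text + after_hit if is_hit else text)
        else:
            out.append(text)  # never place markers inside non-visible text
            if is_hit and not skip_hidden_hits:
                pending += 1
    return "".join(out)


segments = [
    ("<p>", False, False),
    ("apple", True, True),                        # visible hit
    ("</p><script>var fruit = '", False, False),
    ("apple", False, True),                       # hit inside a script, not visible
    ("';</script><p>", False, False),
    ("pie", True, False),
    ("</p>", False, False),
]
with_skip = mark_segments(segments, skip_hidden_hits=True)
without_skip = mark_segments(segments, skip_hidden_hits=False)
```

With the flag set, only the visible hit is marked. Without it, the model places an empty marker pair just before "pie", the next visible text after the hidden hit.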
Copyright (c) 1995-2008 dtSearch Corp. All rights reserved.