dtSearch .NET Standard API 2021.02
FileConverter converts files to HTML, RTF, XML, or text, optionally marking hits with caller-supplied tags.
Most commonly, FileConverter is used after a search to highlight hits in a retrieved document. To highlight hits in a document, FileConverter needs:
The first five items all come from the SearchResults object returned by the search, so you can set them in a single step by calling FileConverter.SetInputItem() with the SearchResults object and the ordinal of the document to select.
SetInputItem will set InputFile, InputTypeId, InputDocId, Hits, AlphabetLocation, and IndexRetrievedFrom. If the index was built with caching of documents, SetInputItem will also set up FileConverter to retrieve the cached version of the document from the index.
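As a minimal sketch of the SetInputItem workflow described above, the following helper converts one search result to HTML with each hit in bold. It assumes the dtSearch.Engine namespace and the members OutputFormat, OutputFormats.itHTML, OutputToString, OutputString, and Execute(), none of which are described in this section, so check them against the API reference. The class and method names are hypothetical.

```csharp
using dtSearch.Engine;

public static class HighlightSketch
{
    // Hypothetical helper: returns the document at 'ordinal' in 'results'
    // as HTML, with each hit wrapped in <b>...</b>.
    public static string HighlightToHtml(SearchResults results, int ordinal)
    {
        // The using block disposes the converter when the conversion is done.
        using (var converter = new FileConverter())
        {
            // Sets InputFile, InputTypeId, InputDocId, Hits, AlphabetLocation,
            // and IndexRetrievedFrom from the selected search result.
            converter.SetInputItem(results, ordinal);

            // Assumed members (not described in this section).
            converter.OutputFormat = OutputFormats.itHTML;
            converter.OutputToString = true;

            converter.BeforeHit = "<b>";
            converter.AfterHit = "</b>";

            converter.Execute();
            return converter.OutputString;
        }
    }
}
```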
The document data to convert can consist of one binary document file, such as a Word document, and any number of field-value pairs in InputFields. InputText can be used to provide additional text to include in the converted output.
You can pass the binary document to FileConverter in several ways: as a disk file named in InputFile, or as in-memory data supplied in InputBytes or InputStream.
InputText and InputFields may only contain plain text. If HTML, RTF, or other text-like document data is passed in InputText, the HTML or RTF tags will be interpreted as text and included in the conversion output.
InputFile must be an accessible disk file. UNC paths will work, provided that the network resource can be accessed, but HTTP paths will not. To convert data accessed by HTTP, download the data to a memory buffer and supply it in InputBytes or InputStream.
Even when InputBytes or InputStream is used, a filename should be provided in InputFile if possible to tell dtSearch the original filename extension, which can provide useful information about the document format.
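The HTTP case above can be sketched with plain .NET code and no dtSearch calls: download the document into a byte array and derive a filename from the URL so that the extension is preserved. The class and method names are hypothetical, and real code would need error handling for failed requests.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class HttpDocumentLoader
{
    private static readonly HttpClient Client = new HttpClient();

    // Returns the last path segment of the URL (for example "report.docx"),
    // ignoring any query string, so the original extension is available.
    public static string FileNameHintFromUrl(string url)
    {
        var uri = new Uri(url);
        return Path.GetFileName(uri.LocalPath);
    }

    // Downloads the document into memory and pairs it with a filename hint.
    public static async Task<(byte[] Bytes, string FileNameHint)> LoadAsync(string url)
    {
        byte[] bytes = await Client.GetByteArrayAsync(url);
        return (bytes, FileNameHintFromUrl(url));
    }
}
```

The returned Bytes value would then be assigned to InputBytes and FileNameHint to InputFile before the conversion is run.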
When you build an index, you can request that the documents be cached in the index, in which case dtSearch will zip-compress each document and store it in the index folder. This can be done with any type of indexed data, including dynamically-generated data returned through the DataSource API. To have FileConverter use the cached document as input, use SetInputItem to set up FileConverter as described above, and set the flag dtsConvertGetFromCache in FileConverter.Flags.
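A minimal sketch of requesting the cached copy might look like the following. It assumes the flag is a member of an enum named ConvertFlags in the dtSearch.Engine namespace and that Flags is of that enum type; the helper name is hypothetical.

```csharp
using dtSearch.Engine;

public static class CachedInputSketch
{
    // Hypothetical helper: points 'converter' at one search result and asks
    // it to read the document from the index cache rather than the original.
    public static void UseCachedDocument(
        FileConverter converter, SearchResults results, int ordinal)
    {
        converter.SetInputItem(results, ordinal);

        // Assumed enum name; the flag name itself comes from the text above.
        // OR the flag in so that any flags already set are preserved.
        converter.Flags |= ConvertFlags.dtsConvertGetFromCache;
    }
}
```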
If the original data was indexed using the DataSource indexing API, then to highlight hits set InputBytes, InputFields, and InputText to the same values that were returned from the data source as DocBytes, DocFields, and DocText when the document was indexed. Alternatively, you can build the index with caching of documents enabled, and then use the cached document to highlight hits (see above).
The BeforeHit and AfterHit markers are inserted before and after each hit word, and they can contain hypertext links or other HTML tags. To facilitate creation of hit navigation markers, the strings "%%ThisHit%%", "%%NextHit%%", and "%%PrevHit%%" in a marker are replaced with ordinals representing the current hit, the next hit, and the previous hit in the document.
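To illustrate how navigation markers built from these placeholders expand, the sketch below defines a pair of marker strings and a small stand-in for the substitution. The Expand method is not part of dtSearch; it only mimics the replacement for one hit, and it assumes that the next and previous ordinals are the current ordinal plus and minus one. How dtSearch numbers hits, and what it substitutes at the first and last hit, is not stated here.

```csharp
using System;
using System.Globalization;

public static class HitMarkerDemo
{
    // Marker strings a caller might assign to BeforeHit and AfterHit:
    // a named anchor for this hit, a link back to the previous hit,
    // bold text for the hit itself, and a link forward to the next hit.
    public const string BeforeHit =
        "<a name=\"hit%%ThisHit%%\"></a><a href=\"#hit%%PrevHit%%\">&lt;</a><b>";
    public const string AfterHit =
        "</b><a href=\"#hit%%NextHit%%\">&gt;</a>";

    // Illustration only: replaces the three placeholders for one hit.
    public static string Expand(string marker, int thisHit)
    {
        string Ordinal(int n) => n.ToString(CultureInfo.InvariantCulture);
        return marker
            .Replace("%%ThisHit%%", Ordinal(thisHit))
            .Replace("%%NextHit%%", Ordinal(thisHit + 1))
            .Replace("%%PrevHit%%", Ordinal(thisHit - 1));
    }

    // Wraps one hit word in the expanded markers.
    public static string WrapHit(string word, int thisHit)
    {
        return Expand(BeforeHit, thisHit) + word + Expand(AfterHit, thisHit);
    }
}
```

With these markers, each hit in the output carries links that jump to the neighbouring hits' anchors.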
For more information on conversion output options, see:
Set dtsConvertAutoUpdateSearch to have dtSearch automatically correct out-of-date hit highlighting information.
Set dtsConvertUseStyles to have CSS styles included in output, and add a style sheet based on the dtSearch DocStyles.css file to specify the appearance of each style.
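FileConverter requires the IDisposable pattern: dispose of each FileConverter instance when you are finished with it, for example by creating it in a using statement.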