Generates a report showing each hit in one or more documents, with a specified amount of context.
To generate a search report:
(1) Start with a SearchResults object representing the results of a search.
(2) Call SearchJob.NewSearchReportJob to make a SearchReportJob.
(3) Select the items to include in the search report using the Select*() methods in SearchReportJob.
(4) Specify the amount of context to include using WordsOfContextExact, ParagraphsOfContext, or WordsOfContext.
(5) Set the output format for the report using ContextFooter, ContextHeader, etc.
(6) Call Execute() to generate the report.
Format
A search report lists the hits found in one or more documents, with each hit surrounded by a specified amount of context. Each block of context starts with a ContextHeader and ends with the ContextFooter. Contiguous or overlapping blocks of context will be combined. The amount of context included in the report can be specified by words or by paragraphs.
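The combining of contiguous or overlapping blocks can be pictured as merging word ranges. The following is a minimal sketch in Python, not dtSearch code: the function name `context_blocks` and its parameters are hypothetical, and it assumes context is counted as a fixed number of words on each side of a hit.

```python
def context_blocks(hit_offsets, words_of_context, total_words):
    """Return merged (start, end) word ranges, with end exclusive.

    Hypothetical helper for illustration only; not part of the dtSearch API.
    """
    blocks = []
    for hit in sorted(hit_offsets):
        # Context window around this hit, clamped to the document bounds.
        start = max(0, hit - words_of_context)
        end = min(total_words, hit + words_of_context + 1)
        if blocks and start <= blocks[-1][1]:
            # Overlaps or touches the previous block: extend that block.
            blocks[-1] = (blocks[-1][0], max(blocks[-1][1], end))
        else:
            blocks.append((start, end))
    return blocks


# Hits at words 10 and 14 share context, so they form one block.
print(context_blocks([10, 14, 40], words_of_context=3, total_words=100))
# prints [(7, 18), (37, 44)]
```

In this sketch two blocks that merely touch (one ends where the next begins) are also merged, which matches the "contiguous" case described above.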
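Each block of context is constructed as follows: the ContextHeader, then the context text with each hit marked by the BeforeHit and AfterHit strings, then the ContextFooter.
The report as a whole is constructed as follows: the Header, then for each document the FileHeader, that document's blocks of context (with the separator text between consecutive blocks), and the FileFooter, and finally the Footer.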
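As a rough illustration of this layout, the sketch below assembles a report from the pieces named in the property descriptions on this page (text at the start of and after each block of context, text between blocks, text at the start of and after each document, and an overall header and footer). It is a Python sketch under those assumptions, not the dtSearch implementation; `build_report` and its parameter names are hypothetical.

```python
def build_report(files, header="", footer="",
                 file_header="", file_footer="",
                 context_header="", context_footer="",
                 context_separator=""):
    """Assemble a report string.

    `files` is a list of documents, each given as a list of context strings.
    Hypothetical helper for illustration only; not part of the dtSearch API.
    """
    parts = [header]
    for blocks in files:
        parts.append(file_header)
        # Wrap each block of context in its header and footer.
        rendered = [context_header + block + context_footer for block in blocks]
        # Separator text goes between blocks, not before the first or after the last.
        parts.append(context_separator.join(rendered))
        parts.append(file_footer)
    parts.append(footer)
    return "".join(parts)


report = build_report(
    [["a", "b"], ["c"]],
    header="<H>", footer="</H>",
    file_header="<F>", file_footer="</F>",
    context_header="[", context_footer="]",
    context_separator="|",
)
print(report)  # prints <H><F>[a]|[b]</F><F>[c]</F></H>
```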
Use the following symbols to insert file information into the FileHeader and FileFooter:
Symbol | Meaning |
Filename | The name of the file (without path information). For PDF and HTML files, this will be the Title. |
Location | The location of the file |
Fullname | The path and filename of the file. |
Size | File size in bytes |
SizeK | File size in kilobytes |
Date | Modification date of the file when indexed |
Hits | Number of hits in the file |
Title | The first 80 characters of the file |
 | The docId of the file |
Type | The file type (Microsoft Word, PDF, HTML, etc.) |
Ordinal | The 1-based ordinal of this item in the SearchResults from which it was generated |
IndexRetrievedFrom | The index where the file was found |
Use %% around each symbol, like this: %%FullName%%
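To make the substitution concrete, here is a minimal Python sketch of expanding %%Symbol%% placeholders in a header string. It is not dtSearch code: `expand_symbols` is a hypothetical helper, the sample values are made up, and matching symbol names without regard to case is an assumption (this page writes both Fullname and FullName).

```python
import re


def expand_symbols(template, values):
    """Replace %%Name%% placeholders in `template` with entries from `values`.

    Hypothetical helper for illustration only; not part of the dtSearch API.
    """
    # Assumption: symbol names are matched case-insensitively.
    lookup = {name.lower(): str(value) for name, value in values.items()}

    def replace(match):
        # Unknown symbols are left in place unchanged.
        return lookup.get(match.group(1).lower(), match.group(0))

    return re.sub(r"%%(\w+)%%", replace, template)


file_info = {"Filename": "report.pdf", "Hits": 3, "SizeK": 12}
print(expand_symbols("<h3>%%Filename%% (%%Hits%% hits, %%SizeK%%K)</h3>", file_info))
# prints <h3>report.pdf (3 hits, 12K)</h3>
```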
Use the following symbols to insert context information in the ContextHeader, which appears in front of each block of context:
Symbol | Meaning |
Page | Page number where the hit occurs |
Paragraph | Paragraph number where the hit occurs (relative to the start of the page) |
Word | Word offset of the block of context from the beginning of the file. |
FirstHit | Word offset of the first hit in the block of context. |
You can use SearchReportJob to add a brief snippet of text to each SearchResults item showing a few hits with a limited amount of context around each hit. This "synopsis" can then be included in the displayed search results to make it easier for end-users to see why each document was found.
To add a synopsis to SearchResults,
1. Use MaxContextBlocks to limit the number of blocks of context included in the report. For example, if MaxContextBlocks = 1, then only the first hit will be included.
2. Use WordsOfContextExact to specify the number of words of context to include.
3. Set the OutputFormat to itUnformattedHTML, so output characters will be correctly HTML-encoded and formatting from the original document will not appear in the search results list. (If you use itHTML as the output format, the output could contain paragraph breaks, color changes, etc., that would not look right in a search results table.)
4. Set the dtsReportStoreInResults flag in SearchReportJob, which causes the synopsis to be stored in each search results item, making it easier to access the individual synopsis items.
5. Set the BeforeHit and AfterHit marks to HTML tags like <b> and </b> to mark the hits.
6. Select a range of items to include in the search report that corresponds to the range of items to be displayed. For example, if you are displaying the first ten items, select items 0 through 9. Generating a synopsis can be time-consuming, so it is important to generate it only when needed for display.
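The range in step 6 follows directly from the page being displayed. A small Python sketch of that arithmetic, assuming zero-based item numbers as in the "0 through 9" example; `display_range` is a hypothetical helper, not a dtSearch call.

```python
def display_range(page_index, page_size, total_items):
    """Return (first, last) zero-based item numbers for a page, or None if the page is empty.

    Hypothetical helper for illustration only; not part of the dtSearch API.
    """
    first = page_index * page_size
    if first >= total_items:
        return None
    # The last page may be shorter than page_size.
    last = min(first + page_size, total_items) - 1
    return first, last


print(display_range(0, 10, 37))  # prints (0, 9)
print(display_range(3, 10, 37))  # prints (30, 36)
```

The resulting pair is what would be passed to the method that selects a range of items, so that a synopsis is generated only for the items actually shown.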
Generation of a synopsis is much faster if you index the documents with caching of text enabled, because the context can be extracted from the index without the need to access the original files, and because the cached text includes tables designed to make context extraction more efficient.
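To show what the synopsis settings amount to, the following Python sketch builds a short snippet from a list of words and hit offsets: it limits the number of context blocks, takes a fixed number of words around each hit, HTML-encodes the text, and wraps hits in marker strings. It is an illustration only and does not call dtSearch; the function `synopsis`, its parameter names, and the " ... " separator are assumptions chosen to mirror the settings described above.

```python
from html import escape


def synopsis(words, hit_offsets, words_of_context=5, max_context_blocks=1,
             before_hit="<b>", after_hit="</b>", separator=" ... "):
    """Build an HTML-encoded snippet around the first few hits.

    Hypothetical helper for illustration only; not part of the dtSearch API.
    """
    hits = set(hit_offsets)
    blocks = []
    for hit in sorted(hits):
        start = max(0, hit - words_of_context)
        end = min(len(words), hit + words_of_context + 1)
        if blocks and start <= blocks[-1][1]:
            # Overlapping or contiguous context is combined into one block.
            blocks[-1] = (blocks[-1][0], max(blocks[-1][1], end))
        elif len(blocks) < max_context_blocks:
            blocks.append((start, end))
        else:
            break  # Block limit reached; later hits are ignored.

    rendered = []
    for start, end in blocks:
        pieces = []
        for i in range(start, end):
            text = escape(words[i])  # HTML-encode text from the document.
            pieces.append(before_hit + text + after_hit if i in hits else text)
        rendered.append(" ".join(pieces))
    return separator.join(rendered)


words = "Tom & Jerry chase the mouse around the <old> house".split()
print(synopsis(words, [2], words_of_context=2))
# prints Tom &amp; <b>Jerry</b> chase the
```

Encoding the document text before adding the hit markers is what keeps characters such as & and < from being interpreted as markup in a results table, while the <b> tags themselves remain live.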
SearchReportJob requires the IDisposable Pattern.
Topic | Description |
 | The following tables list the members exposed by JobBase. |
 | The properties of the JobBase class are listed here. |
Topic | Description |
 | The following tables list the members exposed by OutputBase. |
 | The properties of the OutputBase class are listed here. |
Topic | Description |
 | The following tables list the members exposed by SearchReportJob. |
 | The methods of the SearchReportJob class are listed here. |
 | The properties of the SearchReportJob class are listed here. |
SearchReportJob Class | Description |
 | Select no items in the SearchResults. |
 | Generate the report. |
 | Select all items in the SearchResults. |
 | Select a range of items in the SearchResults. |
 | The search results list that this SearchReportJob will use. |
OutputBase Class | Description |
 | If an array of hit offsets has been provided in Hits, then the BeforeHit and AfterHit strings will be used to mark each hit in the document in the converted output. (Inherited from OutputBase) |
 | For HTML output, an HREF for a BASE tag to be inserted in the header. (Inherited from OutputBase) |
 | If an array of hit offsets has been provided in Hits, then the BeforeHit and AfterHit strings will be used to mark each hit in the document in the converted output. (Inherited from OutputBase) |
 | For HTML output, a DocType tag such as <!DOCTYPE html> to go before the first tag in the output. (Inherited from OutputBase) |
 | The Footer will be appended to the conversion output and can use tags in the output format, such as HTML tags in a document converted to HTML. (Inherited from OutputBase) |
 | The Header will appear at the top of the conversion output and can use tags in the output format, such as HTML tags in a document converted to HTML. (Inherited from OutputBase) |
 | Use HtmlHead to supply HTML data to appear inside the HEAD section of the output. (Inherited from OutputBase) |
 | Name of the converted file to create. (Inherited from OutputBase) |
 | By default, a FileConverter converts the input file to HTML. Other supported options are: itRTF, itUTF8 (Unicode text), itAnsi, and itXML (for XML input data only). (Inherited from OutputBase) |
 | If OutputToString is true, output will be stored in OutputString rather than in a disk file. (Inherited from OutputBase) |
 | When output is directed to an in-memory string, you may wish to limit the maximum amount of memory used. To do this, set OutputStringMaxSize to the maximum size you want to allow. (Inherited from OutputBase) |
 | If true, output will be stored in an in-memory string variable rather than a disk file. (OutputFile will be ignored.) After the Execute method is done, the output will be in the OutputString property. (Inherited from OutputBase) |
SearchReportJob Class | Description |
 | Text to appear after each block of context in the report. |
 | Text to appear at the start of each block of context in the report. |
 | Text to appear between blocks of context in the report (after one ContextFooter, before the next ContextHeader). |
 | Text to appear after each document in the report. |
 | Text to appear at the start of each document in the report. |
 | Flags controlling generation of the report. |
 | Number of blocks of context to include in the report for each document. |
 | Number of words to scan in each document looking for blocks of context to include in the report. |
 | Number of paragraphs of context to include around each hit. |
 | Approximate number of words of context to include around each hit. |
 | Number of words of context to include around each hit. |
Copyright (c) 1998-2023 dtSearch Corp. All rights reserved.