Abstract interface to any source of document data.
File: dtsviewr.h
Data Member | Description |
---|---|
Creation date of the document | |
A user-friendly alternative filename for the document. | |
Null-delimited string set containing pairs of field names and values to be associated with this document in the index. | |
Name of the document. The name does not have to correspond to a disk file, but it must have the form of a valid Win32 file name. | |
Modification date of the document. | |
If the item was extracted from a container, the type id of the container. | |
Pointer to the function used to read data from the input. Read returns the number of bytes read, or 0 if no data could be read. To indicate an I/O error that should force processing of the input file to halt, return a value of less than -10,000. | |
Pointer to a function that will destroy the dtsInputStream. This function pointer is provided for use in the dtsDataSource API. External file parsers must not call release() on a dtsInputStream passed to them. | |
Pointer to the function used to change the seek pointer for the input. | |
Pointer to the function used to change the seek pointer for the input. | |
Size of the document (obsolete -- use setSize() to set the size and getSize() to obtain the size). | |
Size of the document | |
If this stream provides access to a temporary file on disk, the filename of the temporary file. This is used in situations where a file parser requires a disk file. dtSearch internal file parsers do not require disk files. | |
The type id of the file parser that should be used with this input stream. If typeId is 0, the file format will be detected automatically. typeId should only be set to a non-zero value in situations in which it is necessary to override the standard file type detection. |
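
The field data described in the table is a null-delimited string set of name/value pairs. The following is a minimal sketch of how such a buffer might be assembled. `buildFieldSet` is a hypothetical helper, not part of the dtSearch API, and the layout it produces (each string terminated by a null character, with one extra null closing the set) is an assumption that should be checked against the dtSearch documentation.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the dtSearch API): packs field name/value
// pairs into one buffer. Assumed layout: every string is followed by '\0',
// and one extra '\0' marks the end of the whole set.
std::string buildFieldSet(
    const std::vector<std::pair<std::string, std::string> >& fields) {
    std::string out;
    for (const auto& f : fields) {
        // An embedded '\0' would split a string in two and break the pairing.
        assert(f.first.find('\0') == std::string::npos);
        assert(f.second.find('\0') == std::string::npos);
        out += f.first;
        out.push_back('\0');
        out += f.second;
        out.push_back('\0');
    }
    out.push_back('\0');  // extra terminator closing the set
    return out;
}
```

The result is kept in a `std::string` because it can hold embedded null characters. The buffer has to stay alive for as long as the stream that points to it is in use.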
The dtSearch Engine API uses dtsInputStream in two places: (1) the external file parser API, which lets you add a file parser to the dtSearch indexing engine, and (2) the dtsDataSource API, which lets you index text data from any source.
Your file parser receives a dtsInputStream as the source of its input and is responsible for converting the data in the file into blocks of indexable text. The input could come from a file handle, a memory buffer, or any other object. The seek() and read() function pointers in the dtsInputStream provide a generic way to read data from the file your parser is processing. Input may come from a document extracted from a container rather than from a disk file, so parsers should not assume that the filename element refers to an existing file on disk.
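
As an illustration of the parser side, the sketch below drains a stream through its read and seek function pointers and applies the return-value rules from the member table. `ParserInput`, `MemSource`, `memRead`, `memSeek`, and `readAll` are hypothetical names invented for this example. The real dtsInputStream is declared in dtsviewr.h, and its layout and function signatures may differ from this stand-in.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the stream handed to a parser. The signatures
// are assumptions made for this sketch, not the declarations in dtsviewr.h.
struct ParserInput {
    void* pData;                                        // opaque source state
    long (*read)(void* pData, void* dest, long bytes);  // bytes read, 0 if none
    long (*seek)(void* pData, long where);              // move the seek pointer
};

// A memory buffer acting as the input source; a parser never sees this type.
struct MemSource {
    const char* data;
    long size;
    long pos;
};

long memRead(void* pData, void* dest, long bytes) {
    assert(bytes >= 0);
    MemSource* m = static_cast<MemSource*>(pData);
    long n = std::min(bytes, m->size - m->pos);
    if (n <= 0) return 0;  // nothing left to read
    std::memcpy(dest, m->data + m->pos, static_cast<std::size_t>(n));
    m->pos += n;
    return n;
}

long memSeek(void* pData, long where) {
    MemSource* m = static_cast<MemSource*>(pData);
    m->pos = std::max(0L, std::min(where, m->size));
    return m->pos;
}

// From the member table: a read() result below -10,000 reports an I/O error
// that should halt processing of the input.
const long kFatalReadThreshold = -10000;

// Rewinds the stream and copies all of its data into `out`.
// Returns false if read() reported a fatal I/O error.
bool readAll(ParserInput& in, std::string& out) {
    char buf[4096];
    out.clear();
    in.seek(in.pData, 0);
    for (;;) {
        long n = in.read(in.pData, buf, static_cast<long>(sizeof buf));
        if (n < kFatalReadThreshold) return false;  // fatal error: stop
        if (n <= 0) break;  // assumption: any other non-positive value ends input
        out.append(buf, static_cast<std::size_t>(n));
    }
    return true;
}
```

Because `readAll` touches only the function pointers, the same loop works whether the data comes from a disk file, a memory buffer, or a document extracted from a container.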
dtsInputStream may be extended in the future to provide additional flexibility. To avoid source-level dependencies on dtsInputStream, file parsers should use a dtsInputStreamReader to access a dtsInputStream.
In the dtsDataSource API, your data source creates dtsInputStream objects and returns them to the dtSearch engine to be indexed. Thus, instead of using the read() and seek() pointers, you would implement them. After dtSearch is done indexing a dtsInputStream, dtSearch will call the release() function to request that the calling application dispose of the object. Your implementation can delete the underlying object in response to that call.
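
The data source side can be sketched in the same spirit: the application supplies read, seek, and release functions over an object it owns, and deletes that object when release is called. `SourceStream`, `MemDocument`, `makeStream`, and the `doc*` functions are hypothetical names for this example only. The real dtsInputStream and dtsDataSource declarations may differ in layout and signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the stream a data source returns to the engine;
// not the actual declaration from dtsviewr.h.
struct SourceStream {
    void* pData;
    const char* filename;
    long (*read)(void* pData, void* dest, long bytes);
    long (*seek)(void* pData, long where);
    void (*release)(void* pData);
};

// Object owned by the data source: one in-memory document.
struct MemDocument {
    std::string name;
    std::string text;
    long pos;
};

int g_liveDocuments = 0;  // demo-only counter showing that release() frees

long docRead(void* pData, void* dest, long bytes) {
    assert(bytes >= 0);
    MemDocument* d = static_cast<MemDocument*>(pData);
    long remaining = static_cast<long>(d->text.size()) - d->pos;
    long n = bytes < remaining ? bytes : remaining;
    if (n <= 0) return 0;  // no data could be read
    std::memcpy(dest, d->text.data() + d->pos, static_cast<std::size_t>(n));
    d->pos += n;
    return n;
}

long docSeek(void* pData, long where) {
    MemDocument* d = static_cast<MemDocument*>(pData);
    long size = static_cast<long>(d->text.size());
    if (where < 0) where = 0;
    if (where > size) where = size;
    d->pos = where;
    return d->pos;
}

// Called once the consumer is finished: dispose of the underlying object.
void docRelease(void* pData) {
    delete static_cast<MemDocument*>(pData);
    --g_liveDocuments;
}

// Creates a stream over a heap-allocated document. The name need not refer
// to a disk file, but it should have the form of a valid file name.
SourceStream makeStream(const std::string& name, const std::string& text) {
    MemDocument* d = new MemDocument{name, text, 0};
    ++g_liveDocuments;
    SourceStream s;
    s.pData = d;
    s.filename = d->name.c_str();  // valid until docRelease deletes d
    s.read = docRead;
    s.seek = docSeek;
    s.release = docRelease;
    return s;
}
```

Keeping all per-document state behind `pData` lets `docRelease` free everything with a single `delete`. After that call, neither `pData` nor `filename` may be used again.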