dtsInputStream Structure

Abstract interface to any source of document data.

File

File: dtsviewr.h

Syntax

C++

struct dtsInputStream {
    const char * filename;
    const char * tempname;
    long size;
    dtsFileDate modified;
    dtsFileDate created;
    void * pData;
    void (* seek)(void *pData, long where);
    long (* read)(void *pData, void *dest, long len);
    void (* release)(void *pData);
    const char * displayName;
    const char * fields;
    long typeId;
    __int64 size64;
    void (* seek64)(void *pData, __int64 where);
    int parentTypeId;
};

Data Members

Data Member	Description
created	Creation date of the document.
displayName	A user-friendly alternative filename for the document.
fields	Null-delimited string set containing pairs of field names and values to be associated with this document in the index.
filename	Name of the document. The name does not have to correspond to a disk file, but it must have the form of a valid Win32 file name.
modified	Modification date of the document.
pData	Pointer passed as the first argument to the seek, seek64, read, and release functions.
parentTypeId	If the item was extracted from a container, the type id of the container.
read	Pointer to the function used to read data from the input. read() returns the number of bytes read, or 0 if no data could be read. To indicate an I/O error that should force processing of the input file to halt, return a value less than -10,000.
release	Pointer to a function that will destroy the dtsInputStream. This function pointer is provided for use in the dtsDataSource API. External file parsers must not call release() on a dtsInputStream passed to them.
seek	Pointer to the function used to change the seek pointer for the input.
seek64	Pointer to the function used to change the seek pointer for the input, taking a 64-bit (__int64) offset.
size	Size of the document (obsolete -- use setSize() to set the size and getSize() to obtain the size).
size64	Size of the document as a 64-bit (__int64) value.
tempname	If this stream provides access to a temporary file on disk, the filename of the temporary file. This is used in situations where a file parser requires a disk file. dtSearch internal file parsers do not require disk files.
typeId	The type id of the file parser that should be used with this input stream. If typeId is 0, the file format will be detected automatically. typeId should only be set to a non-zero value in situations in which it is necessary to override the standard file type detection.

Group

Classes

Members

Methods

Method	Description
clear	Clear all data in dtsInputStream
copy	Copy a dtsInputStream (shallow copy)

Remarks

The dtSearch Engine API uses dtsInputStream in two places: (1) the external file parser API, which lets you add a file parser to the dtSearch indexing engine, and (2) the dtsDataSource API, which lets you index text data from any source.

dtsInputStream in the external file parser API

Your file parser receives a dtsInputStream as the source of its input and is responsible for converting the data in the file into blocks of indexable text. The input could come from a file handle, a memory buffer, or any other object. The seek() and read() function pointers in the dtsInputStream provide a generic way to read data from the file your parser is processing. Input may come from a document extracted from a container rather than from a disk file, so parsers should not assume that the filename element refers to an existing file on disk.
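As an illustration of the seek() and read() calling pattern described above, the following is a minimal sketch rather than dtSearch code. DemoInputStream is a hypothetical stand-in that mirrors only three members of dtsInputStream so the example compiles without dtsviewr.h, and readAll is an illustrative helper name. The sketch assumes that any negative return value from read() should be treated as a failure, which covers the documented error range of values less than -10,000.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in that mirrors three members of dtsInputStream.
// Real code would include dtsviewr.h and use dtsInputStream itself.
struct DemoInputStream {
    void *pData;
    void (*seek)(void *pData, long where);
    long (*read)(void *pData, void *dest, long len);
};

// Reads the whole input, starting at offset 0, in fixed-size chunks.
// Returns false if read() reports an error (assumed here: any negative value).
bool readAll(DemoInputStream &in, std::string &out) {
    assert(in.seek != nullptr && in.read != nullptr);
    char buf[4096];
    in.seek(in.pData, 0);  // rewind; the stream may have been read before
    for (;;) {
        long n = in.read(in.pData, buf, static_cast<long>(sizeof buf));
        if (n < 0) {
            return false;  // I/O error signalled by the stream
        }
        if (n == 0) {
            break;         // no more data could be read
        }
        out.append(buf, static_cast<std::size_t>(n));
    }
    return true;
}
```

The helper never touches the filename member, which matches the point that the input may not correspond to a file on disk. In production parsers, the dtsInputStreamReader wrapper recommended in this section is the preferred way to access the stream.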

dtsInputStream may be extended in the future to provide additional flexibility. To avoid source-level dependencies on dtsInputStream, file parsers should use a dtsInputStreamReader to access a dtsInputStream.

dtsInputStream in the dtsDataSource API

In the dtsDataSource API, your data source creates dtsInputStream objects and returns them to the dtSearch engine to be indexed. Thus, instead of using the read() and seek() pointers, you would implement them. After dtSearch is done indexing a dtsInputStream, dtSearch will call the release() function to request that the calling application dispose of the object. Your implementation can delete the underlying object in response to that call.
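The implementing side can be sketched in the same spirit. The code below is a hedged example, not the dtSearch implementation: DemoInputStream is a hypothetical stand-in mirroring a subset of dtsInputStream, and MemoryDocument, makeMemoryStream, and g_liveDocuments are illustrative names introduced only for this example. It assumes the document text lives in memory and that release() is the point at which the backing object is deleted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in mirroring a subset of dtsInputStream.
// Real code would include dtsviewr.h and fill in a dtsInputStream.
struct DemoInputStream {
    const char *filename;
    long size;
    void *pData;
    void (*seek)(void *pData, long where);
    long (*read)(void *pData, void *dest, long len);
    void (*release)(void *pData);
};

// Backing object: owns the document name, its bytes, and the read position.
struct MemoryDocument {
    std::string name;
    std::string bytes;
    long pos;
};

int g_liveDocuments = 0;  // demo-only counter used to observe release()

// Moves the read position, clamped to the range [0, document length].
void memSeek(void *pData, long where) {
    MemoryDocument *doc = static_cast<MemoryDocument *>(pData);
    long len = static_cast<long>(doc->bytes.size());
    doc->pos = std::max(0L, std::min(where, len));
}

// Copies up to len bytes into dest; returns 0 once the data is exhausted.
long memRead(void *pData, void *dest, long len) {
    MemoryDocument *doc = static_cast<MemoryDocument *>(pData);
    long remaining = static_cast<long>(doc->bytes.size()) - doc->pos;
    long n = std::max(0L, std::min(len, remaining));
    if (n > 0) {
        std::memcpy(dest, doc->bytes.data() + doc->pos, static_cast<std::size_t>(n));
        doc->pos += n;
    }
    return n;
}

// Called when indexing is finished: disposes of the backing object.
void memRelease(void *pData) {
    delete static_cast<MemoryDocument *>(pData);
    --g_liveDocuments;
}

// Builds a stream whose callbacks operate on a heap-allocated MemoryDocument.
DemoInputStream makeMemoryStream(const std::string &name, const std::string &text) {
    MemoryDocument *doc = new MemoryDocument{name, text, 0};
    ++g_liveDocuments;
    DemoInputStream s;
    s.filename = doc->name.c_str();  // stays valid until release() runs
    s.size = static_cast<long>(doc->bytes.size());
    s.pData = doc;
    s.seek = memSeek;
    s.read = memRead;
    s.release = memRelease;
    return s;
}
```

Keeping all per-document state behind pData lets the three callbacks stay plain functions, and it means the filename pointer handed to the engine remains valid for exactly as long as the stream is in use.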