dtSearch Text Retrieval Engine -- Java API
DataSource2 Class

DataSource2 provides a way for data source indexing applications to obtain information about each document that is indexed, and provides better support for indexing binary data than DataSource.

Class Hierarchy
public abstract class DataSource2 implements com.dtsearch.engine.DataSource;

File

DataSource2.java

Remarks

If IndexJob.dataSourceToIndex is based on DataSource2 instead of DataSource, then on each call to getNextDoc() the application can call getDocWordCount(), getDocId(), and getDocTypeId() to obtain information about the previously indexed document.

Additionally, the data source can return binary data (such as document files) using setDocBytes() without the need to create a temporary file.
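
The binary-data path described above can be sketched roughly as follows. The dtSearch classes are not reproduced here, so SketchDataSource2 is a hypothetical stand-in that only mimics the setDocBytes()/haveDocBytes() contract described on this page. Its signatures, the docBytes field, the InMemoryBinarySource class, and the drain() loop that plays the role of the engine are all assumptions for illustration, not the library's actual API.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class BinarySourceSketch {

    /** Hypothetical stand-in for DataSource2; NOT the real dtSearch class. */
    static abstract class SketchDataSource2 {
        protected byte[] docBytes; // assumed storage for the current document

        /** Assumed signature: store the binary contents of the current document. */
        public void setDocBytes(byte[] bytes) { this.docBytes = bytes; }

        /** Must return true for the (simulated) engine to look for bytes. */
        public abstract boolean haveDocBytes();

        /** Advance to the next document; returns false when none are left. */
        public abstract boolean getNextDoc();
    }

    /** Serves documents from memory, so no temporary file is created. */
    static class InMemoryBinarySource extends SketchDataSource2 {
        private final Iterator<byte[]> docs;

        InMemoryBinarySource(List<byte[]> docs) { this.docs = docs.iterator(); }

        @Override public boolean haveDocBytes() { return true; }

        @Override public boolean getNextDoc() {
            if (!docs.hasNext()) return false;
            setDocBytes(docs.next()); // provide the bytes before returning
            return true;
        }
    }

    /** Simulated engine loop: collects the size of each document it received. */
    static List<Integer> drain(SketchDataSource2 source) {
        List<Integer> sizes = new ArrayList<>();
        while (source.getNextDoc()) {
            if (source.haveDocBytes()) sizes.add(source.docBytes.length);
        }
        return sizes;
    }

    public static void main(String[] args) {
        List<byte[]> docs = new ArrayList<>();
        // Placeholder payloads standing in for real PDF or Word file contents.
        docs.add("%PDF-1.4 placeholder".getBytes(StandardCharsets.US_ASCII));
        docs.add(new byte[] {0x50, 0x4B, 0x03, 0x04});
        System.out.println(drain(new InMemoryBinarySource(docs)));
    }
}
```

In a real data source the subclass would extend com.dtsearch.engine.DataSource2 and the engine itself would consume the bytes; the point of the sketch is only the ordering, with setDocBytes() called inside getNextDoc() before it returns.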

Group
Methods
Method 
Description 
Each time getNextDoc() is called, wasDocError() will return true if there was an error processing the previous document, such as a file parsing error, and getDocError() will return the error message.
Each time getNextDoc() is called, getDocId() will return the doc id of the last document indexed. A doc id can be used to identify a document in a SearchFilter.
Each time getNextDoc() is called, getDocTypeId() will return an integer identifying the file type of the last document indexed.
Each time getNextDoc() is called, getDocWordCount() will return the number of words in the last document indexed.
 
Use setDocBytes() to provide an array of bytes for dtSearch to use as the binary contents of this document. To tell dtSearch to check for an array of bytes, the data source must return true from haveDocBytes().
The calling program should call setDocBytes() to provide the binary contents of a file to be indexed before returning from getNextDoc().
While getDocText() can only return a stream of plain text, setDocBytes() can supply any type of binary data, such as the contents of a Word document or a PDF file.
Each time getNextDoc() is called, wasDocError() will return true if there was an error processing the previous document, such as a file parsing error, and getDocError() will return the error message. 
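
The per-document feedback described in the method list can be sketched as below. This is a minimal illustration under stated assumptions: SketchDataSource2 is a hypothetical stand-in, not the real dtSearch class; the return types, the report() hook, the LoggingSource class, and the run() loop that plays the role of the engine are invented for the example, and the type id 7 and word count 120 are arbitrary placeholder values.

```java
import java.util.ArrayList;
import java.util.List;

class DocFeedbackSketch {

    /** Hypothetical stand-in for DataSource2; NOT the real dtSearch class. */
    static abstract class SketchDataSource2 {
        private int docId, docTypeId, docWordCount;
        private String docError = "";

        // Accessor names follow the method list; return types are assumptions.
        public int getDocId() { return docId; }
        public int getDocTypeId() { return docTypeId; }
        public int getDocWordCount() { return docWordCount; }
        public boolean wasDocError() { return !docError.isEmpty(); }
        public String getDocError() { return docError; }

        /** Hypothetical hook: the simulated engine reports on the document it just processed. */
        void report(int id, int typeId, int words, String error) {
            docId = id; docTypeId = typeId; docWordCount = words; docError = error;
        }

        public abstract boolean getNextDoc();
    }

    /** On every getNextDoc() call, logs what happened to the previous document. */
    static class LoggingSource extends SketchDataSource2 {
        final List<String> log = new ArrayList<>();
        private int remaining;
        private boolean first = true; // nothing has been indexed before the first call

        LoggingSource(int count) { remaining = count; }

        @Override public boolean getNextDoc() {
            if (!first) {
                log.add(wasDocError()
                        ? "doc " + getDocId() + " failed: " + getDocError()
                        : "doc " + getDocId() + " type " + getDocTypeId()
                                + " words " + getDocWordCount());
            }
            first = false;
            return remaining-- > 0;
        }
    }

    /** Simulated engine: the first document succeeds, the second fails to parse. */
    static List<String> run() {
        LoggingSource source = new LoggingSource(2);
        int id = 0;
        while (source.getNextDoc()) {
            id++;
            if (id == 2) source.report(id, 0, 0, "file parsing error");
            else source.report(id, 7, 120, "");
        }
        return source.log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Checking wasDocError() before reading the other values keeps a failed document from being recorded with a misleading word count or type id. The first call to getNextDoc() is skipped in the log because no document has been indexed yet at that point.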
Copyright (c) 1998-2008 dtSearch Corp. All rights reserved.