dtSearch Text Retrieval Engine Programmer's Reference
Removing documents from an index

How to remove documents from an index.

There are two ways to remove documents from an index: 

(1) You can set the "Remove Deleted" flag in an index job. The indexer then checks each file in the index and removes from the index any file whose corresponding disk file no longer exists.

(2) You can pass a list of documents to be removed from the index in an index job. The list is a plain text file containing filenames or DocIds, one per line. Filenames must exactly match the path and name associated with a document in the index. DocIds must be preceded by > to indicate that they are not filenames. Example:

c:\docs\filename.txt
>45

This list specifies that the document "c:\docs\filename.txt" and the document with DocId 45 should be removed from the index.
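
The list format described above can be generated programmatically. The following is a minimal Python sketch, not part of the dtSearch API: the helper names format_remove_list and write_remove_list are hypothetical, and the UTF-8 default encoding is an assumption that should be checked against what the engine expects. It only builds the text file. Passing the file to an index job is done through the API for your language.

```python
from pathlib import Path
from typing import Iterable


def format_remove_list(filenames: Iterable[str] = (),
                       doc_ids: Iterable[int] = ()) -> str:
    """Return removal-list text: one item per line, DocIds prefixed with '>'."""
    lines = []
    for name in filenames:
        # Each item must occupy exactly one line, so reject embedded newlines.
        if "\n" in name or "\r" in name:
            raise ValueError(f"filename spans multiple lines: {name!r}")
        # Filenames are written unchanged, because they must exactly match
        # the path and name stored in the index.
        lines.append(name)
    for doc_id in doc_ids:
        # The leading '>' marks the entry as a DocId rather than a filename.
        lines.append(f">{int(doc_id)}")
    return "".join(line + "\n" for line in lines)


def write_remove_list(path, filenames=(), doc_ids=(), encoding="utf-8"):
    """Write the removal list to `path` (the encoding default is an assumption)."""
    Path(path).write_text(format_remove_list(filenames, doc_ids),
                          encoding=encoding)


# Reproduces the two-line example list from the text.
print(format_remove_list([r"c:\docs\filename.txt"], [45]), end="")
```

Keeping the formatting step separate from the file write makes it easy to inspect the list before an index job consumes it.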

When removing documents by filename, documents inside a container cannot be removed without removing the entire container. For example, if a ZIP file in the index contains 7 documents, only the ZIP file itself (and all of its contents) can be removed by filename. The individual items inside the ZIP file cannot be removed by filename. This limitation does not apply when removing documents by DocId.
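
One way to respect this limitation when building a removal list is to emit a DocId entry for any document that lives inside a container and a filename entry otherwise. The Python sketch below illustrates that rule only. The IndexedDoc record, its in_container field, and the sample names are hypothetical and do not correspond to any dtSearch data structure. How you learn a document's DocId and whether it is inside a container depends on your application.

```python
from typing import Iterable, List, NamedTuple


class IndexedDoc(NamedTuple):
    """Hypothetical record describing one document in the index."""
    doc_id: int
    filename: str
    in_container: bool  # True if the document is an item inside, e.g., a ZIP file


def removal_entries(docs: Iterable[IndexedDoc]) -> List[str]:
    """Return one removal-list line per document."""
    entries = []
    for doc in docs:
        if doc.in_container:
            # Items inside a container cannot be removed by filename,
            # so refer to them by DocId (prefixed with '>').
            entries.append(f">{doc.doc_id}")
        else:
            # Standalone files can be listed by their exact indexed name.
            entries.append(doc.filename)
    return entries


docs = [
    IndexedDoc(12, r"c:\docs\report.txt", in_container=False),
    IndexedDoc(45, "placeholder name of an item inside a ZIP file", in_container=True),
]
for line in removal_entries(docs):
    print(line)
```

To remove an entire container and all of its contents, list the container's own filename instead.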

Language                  API
C/C++                     DIndexJob or dtsIndexJob, set action.removeListed or action.removeDeleted = true
.NET (C#, VB.NET)         dtSearch::Engine::IndexJob, set ActionRemoveListed or ActionRemoveDeleted = true
Java                      com.dtsearch.engine.IndexJob, setActionRemoveListed(true) or setActionRemoveDeleted(true)
COM (Visual Basic, ASP)   IIndexJob (IndexJob) object, set ActionRemoveListed or ActionRemoveDeleted = true