How to get field data in search results

Article: dts0109

Some document formats contain fields (or "properties") in addition to the document text. For example, Word documents contain a Document Summary area with information such as the Author and Subject of a document. dtSearch automatically recognizes these fields and indexes them so you can search for text across the whole document or just in a particular field. For a list of the types of metadata dtSearch extracts from supported file formats, see "What file formats does dtSearch support?"

In some cases, it is useful not only to search for a particular field but to see the field as part of the search results list. dtSearch can do this if a field is identified as one that should be stored in the index along with other document information (such as the filename, date, etc.) and returned in search results. These fields are called "stored fields." When a field is set up as a stored field, the value of the field will appear in search results along with other document properties, and the field can also be used to sort search results.

Sources of field data

Field data can come from any of the following sources: document properties, HTML META tags, fields in databases and XML files, and text fields defined using the dtSearch "Define Text Fields" feature.

Storing Fields

To specify a list of fields to be stored when you create an index, click Index > Create Index (Advanced) in dtSearch Desktop, and list the fields to store under Fields to display in search results.

To specify fields that should be stored in all indexes:

1. In dtSearch Desktop, click Options > Preferences > Text Fields.

2. Create a field with the same name as the property that you want to store. For example, to store a "Subject" property, create a text field item named "Subject". You do not need to put anything in the spaces for Beginning of field, End of field, etc.

3. Check the box labelled Display field in search results.

Fields relating to a document are stored when the document is indexed, so you will need to rebuild any indexes containing the field data for this change to take effect. When you rebuild your indexes, check the Clear index before adding documents option in the Update Index dialog box so that all of your documents will be reindexed.

dtSearch Engine

To specify the fields to be stored, pass a list of names in IndexJob.StoredFields (.NET), dtsIndexJob.storedFields (C++) or IndexJob.setStoredFields (Java).

The field names in the list can contain wildcards (* and ?). A set containing a single entry "*" would match all fields, causing the text of every field to be stored in the index.
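As an illustration of how such a list of wildcard patterns selects field names, here is a minimal Python sketch. It is not dtSearch code: the function name select_stored_fields is hypothetical, the case-insensitive comparison is an assumption, and Python's fnmatch also gives special meaning to square brackets, which the text above does not mention.

```python
from fnmatch import fnmatchcase


def select_stored_fields(field_names, patterns):
    """Return the field names matched by at least one wildcard pattern."""
    selected = []
    for name in field_names:
        # Assumption: names are compared case-insensitively; the engine's
        # actual matching rules may differ.
        if any(fnmatchcase(name.lower(), p.lower()) for p in patterns):
            selected.append(name)
    return selected


fields = ["Subject", "Author", "Title", "Keywords"]

# "*" matches any run of characters, "?" matches exactly one character.
print(select_stored_fields(fields, ["Sub*", "T?tle"]))  # ['Subject', 'Title']

# A single "*" entry matches every field.
print(select_stored_fields(fields, ["*"]))
```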

Limits on Stored Fields

By default, the maximum size of a stored field is 128 characters. This can be changed with the maxStoredFieldSize option setting, up to a maximum of 8192 characters. The total size of all stored fields associated with a single document, including field names, is limited to 27k.

To change the limit on the size of stored fields:

dtSearch Desktop: A registry entry can be used to change this limit. The key is:
HKEY_CURRENT_USER\Software\dtSearch Corp.\dtSearch\Settings\MaxStoredFieldSize

dtSearch Engine (.NET): Use the Options.MaxStoredFieldSize property.

dtSearch Engine (C/C++): Use the maxStoredFieldSize member of dtsOptions.

These limits do not affect field searching. All fields in a document are searchable, regardless of how long they are or how many fields are in the document. For example, in a document with 100 fields, each 64k in length, all of the data in all of the fields would be searchable. The limits described here apply only to fields that are copied into the index and returned in search results.

The limit on stored field size applies when a document is indexed, so after changing the value you will need to rebuild your indexes for the change to take effect.
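Before rebuilding, it can help to estimate whether a document's field values fit within the limits described in this section. The following Python sketch is only a rough check under stated assumptions: the function check_stored_field_limits is hypothetical, "27k" is read as 27 * 1024 characters, and the total is computed from the full, unshortened values. It does not model what dtSearch does with a value that exceeds the limit.

```python
DEFAULT_MAX_STORED_FIELD_SIZE = 128    # default per-field limit from the text
HIGHEST_MAX_STORED_FIELD_SIZE = 8192   # largest value allowed for maxStoredFieldSize
TOTAL_STORED_LIMIT = 27 * 1024         # assumption: "27k" read as 27 * 1024 characters


def check_stored_field_limits(fields, max_field_size=DEFAULT_MAX_STORED_FIELD_SIZE):
    """Return (names of fields over the per-field limit, whether the total is over the limit)."""
    if not 1 <= max_field_size <= HIGHEST_MAX_STORED_FIELD_SIZE:
        raise ValueError("max_field_size must be between 1 and 8192")
    too_long = [name for name, value in fields.items() if len(value) > max_field_size]
    # The total counts field names as well as values, as the text describes.
    total = sum(len(name) + len(value) for name, value in fields.items())
    return too_long, total > TOTAL_STORED_LIMIT


doc_fields = {"Subject": "Quarterly report", "Abstract": "x" * 500}

# With the default limit, the 500-character Abstract is too long.
print(check_stored_field_limits(doc_fields))                        # (['Abstract'], False)

# With the limit raised to 1024, both fields fit.
print(check_stored_field_limits(doc_fields, max_field_size=1024))   # ([], False)
```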

Obtaining Stored Fields

dtSearch: After a search, dtSearch displays any stored field data returned as additional columns in search results. For example, if you create a "Subject" stored field, the search results list will contain a "Subject" column after the other columns.

dtSearch Web: In dtSearch Web, the Form Builder dialog box has a Search Results tab that lets you specify the items to include in search results. To add a custom field, click the Add button and add an item with the name of the field you want. Under Content, use %% around the name of a field to get the value of that field. For example, to add the "Subject" field to search results, you would create a new item in Search Results named "Subject", and under Content you would enter:

%%Subject%%
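To make the %% notation concrete, the following Python sketch shows the general idea of replacing %%Name%% placeholders with stored field values. It is not how dtSearch Web is implemented: the function expand_template is hypothetical, and expanding a field with no stored value to an empty string is an assumption.

```python
import re


def expand_template(template, stored_fields):
    """Replace each %%Name%% placeholder with the value of the named stored field."""
    def replace(match):
        # Assumption: a field with no stored value expands to an empty string.
        return stored_fields.get(match.group(1), "")
    return re.sub(r"%%(.+?)%%", replace, template)


stored = {"Subject": "Quarterly report"}
print(expand_template("Subject: %%Subject%%", stored))  # Subject: Quarterly report
```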

See also: "How to add custom fields to dtSearch Web search results"

dtSearch Engine API

C#, VB.NET: Use SearchResultsItem.UserFields

C/C++: Use dtsSearchResultsItem.userFields

Java: Use SearchResults.getDocDetailItemWeb

ASP, Visual Basic: Use SearchResults.DocDetails or SearchResults.DocDetailItem

Troubleshooting Stored Fields

1. Make sure that the file that contains the field data was indexed after the Text Fields definition was added. Field data is extracted from documents when the documents are indexed.

2. Check that you are spelling the stored field name exactly as dtSearch is spelling it in the index. To see the fields in an index, click the "fields" button in the dtSearch Desktop Search dialog box, or click Index > List Index Contents, and list the fields in the index. To see all of the stored fields for a document, search for the document in dtSearch and press Ctrl-Y in search results after opening the document.

Additional Information

For more information about stored fields, see the following topics:

"How to add custom fields to dtSearch Web search results"

"Define text fields" in the dtSearch manual

"Field searching" and "Retrieving fields in search results" in the dtSearch Text Retrieval Engine Programmer's Reference

"How to add fields to documents"

"How to index databases with the dtSearch Engine"