Languages

When IDOL Server processes a document, it treats the text as a series of tokens (words), each of which is a unit of meaning. At a low level, this method is language-independent. However, you can improve your query results by applying some language-dependent processing.

Language-dependent configuration allows you to:

make sure that all your content is treated consistently, allowing cross-lingual search.
filter your searches to content in a specific language.

This section describes the most important language concepts, and explains why you might use them.

Language Types. The language and encoding of a document.
Tokenization. The methods IDOL Server uses to split text into searchable tokens.
Stemming. Processing that reduces groups of related words to a common stem.
Stop Lists. Lists of words that do not convey meaning in documents.
Cross-Lingual Search. Search across documents in multiple languages.
Order of Language Processes. The order in which the language processing steps occur during indexing and querying.
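
To make the concepts above more concrete, the following Python sketch shows tokenization, stop list filtering, and stemming applied to one sentence. It is a toy illustration only and is not how IDOL Server implements these steps: the stop list, the suffix rules, the function names, and the order in which the steps run are all assumptions chosen for the example.

```python
import re

# Hypothetical, tiny stop list. Real stop lists are language-specific and much longer.
STOP_WORDS = {"the", "a", "of", "and", "in"}

# Naive suffix rules for illustration only; this is not a real stemming algorithm.
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop list."""
    return [t for t in tokens if t not in STOP_WORDS]


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def process(text):
    """Tokenize, filter stop words, then stem (an assumed order for this sketch)."""
    return [stem(t) for t in remove_stop_words(tokenize(text))]


print(process("The indexing of searched documents"))
# prints ['index', 'search', 'document']
```

Because related words such as "searched" and "searching" reduce to the same stem in this sketch, a query for one form can match documents that contain the other.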
