AutoDetectLanguagesAtIndex
Set AutoDetectLanguagesAtIndex to True to automatically detect document languages and encodings during indexing.
For accurate language detection, documents must include several sentences. You can change the amount of text that IDOL Content Component analyzes to detect languages by changing the MaxLanguageDetectTerms configuration parameter.
Use DiscardUnconfiguredLanguagesAtIndex and DiscardUnknownLanguagesAtIndex to configure how IDOL Content Component handles documents whose language type it cannot assign, either because the detected language type is not defined in the configuration file or because the language is not recognized.
By default, if IDOL Content Component detects a language type that is not configured, it indexes the document to the equivalent General language type for the encoding, if it exists. It also logs a warning message in the index log so that you can add an appropriate language type to the configuration file.
Unknown languages are also indexed to the General language type for the encoding, if it exists. If the encoding is unknown, the document is indexed to the default language type.
NOTE: You can use AutoDetectLanguagesAtIndex only if it is included in your IDOL Content Component license.
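The following fragment is a minimal sketch of how the parameters described above might be combined in the configuration file. It assumes that the related parameters also belong in the [Server] section and that the two Discard parameters take Boolean values. The MaxLanguageDetectTerms value is purely illustrative, not a recommended or default setting.

```ini
[Server]
// Detect document languages and encodings during indexing.
AutoDetectLanguagesAtIndex=True
// Illustrative value only (assumption): the amount of text analyzed per document.
MaxLanguageDetectTerms=200
// Assumed Boolean values: keep documents whose language is unconfigured or unknown,
// so that they are indexed to the General or default language type as described above.
DiscardUnconfiguredLanguagesAtIndex=False
DiscardUnknownLanguagesAtIndex=False
```

With settings like these, documents in unconfigured languages would still be indexed, and the warnings in the index log would show which language types to add to the configuration file.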
Type: | Boolean |
Default: | False |
Required: | No |
Configuration Section: | Server |
Example: | AutoDetectLanguagesAtIndex=True |
See Also: | DiscardUnconfiguredLanguagesAtIndex, DiscardUnknownLanguagesAtIndex, LangDetectType, LangDetectUTF8, MaxLanguageDetectTerms |