Custom Word Databases

The speech-to-text models that are available for Media Server are trained on billions of words, but the performance of any model depends on how closely its training data matches the data being processed. When you run speech-to-text, you might find that some words are not recognized correctly, for example product names that were not included in the training data.

  • The language models for legacy speech-to-text have a finite vocabulary, and a word that is not in the vocabulary is never detected. The best way to expand the vocabulary for a legacy speech-to-text analysis task is to create a custom language model (see Custom Language Models). If you do not have enough text to train a custom language model, or you want to add some extra words alongside a custom language model, you can use a custom word database instead.
  • The new speech-to-text models (micro, small, medium, and large) do not have a finite vocabulary. However, uncommon words such as product names might not be recognized, because the model considers them less likely than other word choices. You can create a custom word database to increase the probability that a custom word is recognized correctly.

    NOTE: The new speech-to-text models currently support custom word databases only for non-CJK languages.

A custom word database contains a list of words. To add a custom word, you need to supply only the word itself. If you use legacy speech-to-text models, however, OpenText recommends that you also specify a base word and a weight. The new models ignore the base word and weight.

A base word is a word that exists in the standard language model and could appear in the same context as the custom word. For example, if you want to add a company name "AcmeSoft" to the custom word database, you could specify "Microsoft" as the base word. This instructs Media Server that "AcmeSoft" is expected to appear in the same context or similar contexts as "Microsoft".

The weight is a multiplier that specifies how likely the word is to appear, relative to the base word. For example, if the word you are adding is only slightly less likely to appear than the base word, you might set a weight of 0.8. If the word is much less likely to appear, you might set a weight of 0.1.
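As an illustration, a custom word entry can be modeled as a small record in which only the word itself is required. The following Python sketch is not part of Media Server: the CustomWord class and its to_params method are hypothetical names, and the parameter names word, baseword, and weight follow the TrainCustomSpeechWord example in this section. The check that a weight is positive is an assumption about sensible values, not a documented limit.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CustomWord:
    """Hypothetical record for one entry in a custom word database."""
    word: str
    base_word: Optional[str] = None   # only useful with legacy models
    weight: Optional[float] = None    # multiplier relative to the base word

    def to_params(self) -> Dict[str, str]:
        """Return the action parameters for this entry, omitting unset fields."""
        if not self.word:
            raise ValueError("a custom word is required")
        params = {"word": self.word}
        if self.base_word is not None:
            params["baseword"] = self.base_word
        if self.weight is not None:
            # Assumption: a multiplier should be greater than zero.
            if self.weight <= 0:
                raise ValueError("weight is assumed to be a positive multiplier")
            params["weight"] = str(self.weight)
        return params


# Legacy models: supply a base word and weight as well as the word.
legacy_entry = CustomWord("AcmeSoft", base_word="Microsoft", weight=0.8)
# New models: the word alone is enough, because the other fields are ignored.
new_model_entry = CustomWord("AcmeSoft")

print(legacy_entry.to_params())
print(new_model_entry.to_params())
```

Keeping the base word and weight optional means the same record can describe entries for both legacy and new models.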

To create and use a custom word database

  1. Create the database with the action CreateCustomSpeechWordDatabase. For example:

    /action=CreateCustomSpeechWordDatabase&database=words

  2. Add each custom word with the action TrainCustomSpeechWord. For example:

    /action=TrainCustomSpeechWord&database=words
                                 &word=AcmeSoft
                                 &baseword=Microsoft
                                 &weight=0.8

  3. When you configure your speech-to-text analysis task, use the configuration parameter CustomWordDatabase to specify the name of the custom word database that you created. For more information about configuring a speech-to-text analysis task, see Transcribe Speech.
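If you have several words to add, it might be convenient to generate the action URLs with a script. The following Python sketch only builds the URLs for steps 1 and 2 and does not send them. The host and port in BASE_URL, the helper names action_url and build_workflow, and the second word "AcmeCloud" are assumptions made for illustration; the action and parameter names are those shown in the steps above.

```python
from urllib.parse import urlencode

# Assumption: replace with the host and ACI port of your own Media Server.
BASE_URL = "http://localhost:14000"


def action_url(action, **params):
    """Build an action URL in the /action=Name&param=value form used above."""
    url = f"{BASE_URL}/action={action}"
    if params:
        # urlencode escapes characters that are not safe in a URL.
        url += "&" + urlencode(params)
    return url


def build_workflow(database, words):
    """Return the URLs that create a database and then add each word to it."""
    urls = [action_url("CreateCustomSpeechWordDatabase", database=database)]
    for entry in words:
        urls.append(action_url("TrainCustomSpeechWord", database=database, **entry))
    return urls


words = [
    # Entry with a base word and weight, as recommended for legacy models.
    {"word": "AcmeSoft", "baseword": "Microsoft", "weight": "0.8"},
    # Hypothetical second entry that supplies only the word.
    {"word": "AcmeCloud"},
]

urls = build_workflow("words", words)
for url in urls:
    print(url)
```

You could then send each URL to Media Server with the HTTP client of your choice, checking each response before you send the next action.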

For more information about the actions that you can use to manage custom word databases, see the Speech to Text training actions.