Configure Tangible Characters

The TangibleCharacters configuration parameter specifies a list of characters to treat as part of a word, rather than as word boundaries. You can set this value when using the Eduction SDK, Eduction Server, or the Eduction command-line utility (edktool).
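The following Python sketch shows what it means to treat a character as part of a word. It is a toy model for illustration only, not Eduction's matching algorithm: the function names are hypothetical, and it assumes that by default only letters and digits count as word characters.

```python
def is_word_char(ch, tangible=""):
    """Assumption: letters and digits are word characters by default;
    any character listed in `tangible` is also treated as one."""
    return ch.isalnum() or ch in tangible


def whole_word_matches(text, target, tangible=""):
    """Return the start offsets where `target` occurs as a whole word,
    that is, with no word character immediately before or after it."""
    hits = []
    start = text.find(target)
    while start != -1:
        end = start + len(target)
        before_ok = start == 0 or not is_word_char(text[start - 1], tangible)
        after_ok = end == len(text) or not is_word_char(text[end], tangible)
        if before_ok and after_ok:
            hits.append(start)
        start = text.find(target, start + 1)
    return hits


text = "Ref #4521 and 4521"
print(whole_word_matches(text, "4521"))       # [5, 14]
print(whole_word_matches(text, "4521", "#"))  # [14]
```

With no tangible characters, the # acts as a boundary, so both occurrences of 4521 count as whole words. When # is tangible, #4521 becomes a single word and the first occurrence no longer matches on its own.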

Some entities in the IDOL Eduction Grammars Package grammar files work correctly only when you set the tangible characters that they require. For details, see the descriptions of the entities in the appropriate grammar reference: PII Eduction Grammar Reference, PHI Eduction Grammar Reference, PCI Eduction Grammar Reference, or Government Eduction Grammar Reference.

When you use Eduction to search for matches, TangibleCharacters applies across all of your chosen entities. If you use multiple entities that have different recommended tangible character sets, you might need to take some extra steps. For example:

  • In the Eduction SDK, create a separate configuration file for each distinct set of tangible characters and associated entities, and create an EDK engine for each configuration file.
  • In Eduction Server, send a separate action (EduceFromText or EduceFromFile) for each distinct set of tangible characters. In each action, set the TangibleCharacters and Entities action parameters to specify which set of tangible characters and which entities to use.
  • In the edktool command-line utility, create a separate configuration file for each distinct set of tangible characters and associated entities, and process your input text once with each configuration file.
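Whichever interface you use, the first step is to work out which entities can share a pass. The following Python sketch shows one way to plan this by grouping entities according to their recommended tangible characters. The entity names and character sets are hypothetical placeholders, not values from the grammar references, and the sketch assumes that the order of the characters and any duplicates are not significant.

```python
from collections import defaultdict


def group_by_tangible(recommended):
    """Group entity names by their recommended tangible character set.

    `recommended` maps an entity name to a string of tangible characters.
    Assumption: two strings with the same characters in a different order
    describe the same set, so each string is normalized before grouping.
    """
    groups = defaultdict(list)
    for entity, chars in recommended.items():
        key = "".join(sorted(set(chars)))
        groups[key].append(entity)
    return {key: sorted(entities) for key, entities in groups.items()}


# Hypothetical entities and recommendations, for illustration only.
recommended = {
    "example/entity_a": "-/",
    "example/entity_b": "/-",
    "example/entity_c": "",
    "example/entity_d": "@",
}

plan = group_by_tangible(recommended)
for chars, entities in plan.items():
    print(f"TangibleCharacters={chars!r}: {entities}")
```

Each group in the resulting plan corresponds to one configuration file (SDK or edktool) or one action (Eduction Server). In this example, entity_a and entity_b share a pass, so four entities need three passes rather than four.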

For more information about the TangibleCharacters configuration parameter, refer to the Eduction User and Programming Guide.