Customize Stop Lists

For some grammars, the post-processing script uses a stop list to discard matches that are likely to be false positives. You can add entries to the stop lists, or remove entries, by modifying the following files.

  • scripts/address_stoplist.lua contains a list of common words that are likely to indicate a false positive when returned as the STREET or CITY component of an address match.

  • scripts/names_stoplist.lua contains two stop lists to discard names. In the stopnames list, each component is plausible but the entire match is likely to be a false positive, for example "Christian Church" or "Norman Conquest". The stopwords list contains common words that are likely to indicate a false positive when returned as either the FORENAME or SURNAME component of a name match.

    You can customize the stop lists in this file so that a name is treated as a false positive in one country but not in another.
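
The country-specific behavior described above can be pictured with a minimal Lua sketch. The layout here is an assumption for illustration only and is not the structure of the shipped scripts/names_stoplist.lua. The table names (stopnames_common, stopnames_by_country), the function is_stop_name, the country codes, and the "Grand Canyon" and "Victoria Station" entries are all hypothetical. Open the file itself to see how its stopnames and stopwords lists are organized before you edit it.

```lua
-- Hypothetical layout: the shipped names_stoplist.lua may be organized differently.

-- Full-name matches to discard regardless of country (examples from the text above).
local stopnames_common = {
  ["christian church"] = true,
  ["norman conquest"] = true,
}

-- Additional full-name matches to discard only for a given country.
-- The country codes and entries are illustrative assumptions.
local stopnames_by_country = {
  GB = { ["victoria station"] = true },
  US = { ["grand canyon"] = true },
}

-- Returns true if the full name should be discarded for the given country.
local function is_stop_name(name, country)
  local key = string.lower(name)
  if stopnames_common[key] then
    return true
  end
  local country_list = stopnames_by_country[country]
  return country_list ~= nil and country_list[key] == true
end

print(is_stop_name("Norman Conquest", "US"))  -- true: in the common list
print(is_stop_name("Grand Canyon", "US"))     -- true: in the US-only list
print(is_stop_name("Grand Canyon", "GB"))     -- false: not listed for GB
```

Keying each table by the lowercased name keeps the lookup case-insensitive and makes adding or removing an entry a one-line change.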

NOTE: If you use more than one grammar set, you must modify the stop list for each one. For example, scripts/address_stoplist.lua is available for both PII and PHI, and scripts/names_stoplist.lua is available for PII, PHI, and PCI.

The Government grammar set does not have any stop lists by default.

For more information about post-processing, see Configure Post Processing.