6.3.3 Setting up and Using Reflection PAN Detection

When Reflection PAN Detection is selected, InfoConnect uses the following process to detect credit card PAN data:

  1. Read a host screen’s data as input.

  2. Mask out “exclusion patterns” of digit data that are defined as “not credit card data.”

  3. Remove all non-digit data from the input host screen data (leaving just a continuous ordered string of digits).

  4. Apply recognition methods to the remaining digit-only data, working from left to right. These methods use stock credit card patterns (provided with InfoConnect) and any custom credit card patterns you specify, and match sequences of 13 to 16 digits.

  5. Redact recognized PANs within this data.

    NOTE: Only matches that pass a checksum calculation according to the Luhn Algorithm are redacted. The Luhn Algorithm, also known as the "modulus 10" or "mod 10" Algorithm, is a checksum formula used to validate identification numbers (see http://en.wikipedia.org/wiki/Luhn_algorithm).

  6. Merge any redactions back into the original host screen data.

  7. Restore the data that was masked by exclusion patterns.
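The Luhn check mentioned in step 5 can be sketched as follows. This is a minimal Python illustration of the published mod 10 algorithm, not InfoConnect's own code; the function name luhn_valid is an assumption chosen for this example.

```python
def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn (mod 10) check."""
    # Reject empty input and anything that is not plain ASCII digits.
    if not (digits.isascii() and digits.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9  # same as adding the two digits of the product
        total += value
    return total % 10 == 0


print(luhn_valid("4111111111111111"))  # True (a widely used test number)
print(luhn_valid("4111111111111112"))  # False (last digit changed)
```

A candidate that matches a card pattern but fails this check would be left unredacted, which is why the checksum reduces (but does not eliminate) false positives.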

You can specify custom credit card patterns that you want InfoConnect to recognize. To avoid “false positive” redaction, you will probably also need to define additional exclusion patterns (or literal strings containing digits, such as application IDs, screen IDs, or copyright notices).

When to use Reflection PAN Detection

Use Reflection PAN Detection in any of the following situations:

  • You need to detect non-standard credit card patterns/issuers (for instance, oil company or department store cards).

  • Your host application has specialized screens where credit cards can be entered or are displayed in non-standard ways (e.g., non-contiguous sets of digits such as multiple input fields of data arranged in a vertical table or contiguous sets of digits using nonstandard digit group separators).

  • You want PAN detection to be especially “aggressive” or “greedy”: any digit grouping on any screen should be considered for redaction, and you need to redact without regard to what other text or digit separators appear between single digits or groups of digits in the PAN.

Advantages of Reflection PAN Detection

Reflection PAN Detection allows the greatest degree of flexibility and customization for unique detection needs.

  • This method can be configured to detect non-standard credit card issuer patterns of 13-16 digits.

  • If you have host application screens with other numeric data such as part or SKU numbers that look very similar to credit cards, you can exclude those custom patterns from redaction.

  • This method can detect credit cards that use separator text or characters (other than whitespace or hyphens) that are mixed in with the full account number (e.g. “1111 / 3333 / 4444 / 5555”, “first: 1111 second: 2222 third: 3333 fourth: 4444”).

  • This method is suitable for applications that have host screens where credit cards are entered in multiple fields, especially if the screens are laid out in a vertical “table” format.

  • This method also allows detection and redaction of PANs that have nonstandard digit group separation or other arbitrary string data between the digits.

    For example, the following input data can be detected as a PAN with this method, assuming 1111222233334444 is a potentially valid credit card number:

    First: 1111 Second: 2222 Third: 3333 Fourth: 4444

    Or

    1111#2222#3333#4444

    The other methods could not be used to detect a credit card number that appears in such a way.
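To see why both inputs above are treated alike, consider what remains once non-digit data is removed. The following is a small Python sketch of that single step; the helper name digits_only is hypothetical and is not part of InfoConnect.

```python
import re


def digits_only(screen_text: str) -> str:
    """Drop every non-digit character, keeping the digits in their original order."""
    return re.sub(r"\D", "", screen_text)


labelled = "First: 1111 Second: 2222 Third: 3333 Fourth: 4444"
hashed = "1111#2222#3333#4444"

print(digits_only(labelled))  # 1111222233334444
print(digits_only(hashed))    # 1111222233334444
```

Because both inputs reduce to the same 16-digit string, the separators no longer matter when the card patterns are applied.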

Considerations for Reflection PAN Detection

Consider the following before using Reflection PAN Detection:

  • This is the most complex method to set up. In order to configure exclusion patterns, you will need to be familiar with regular expressions and their syntax.

  • This method is the most computationally intensive. It can degrade performance and increase response time on PCs with limited processing power or memory, especially when the “Redact data while typing” option is selected.

  • The likelihood of “false positive” redaction is much greater with this method than the other two, especially if your host screens are very digit-laden.

  • It is likely that you will need to go through a review process with all of your host applications to eliminate false positives, by identifying and defining exclusion patterns that are not supported “out of the box”. Some examples of these are custom part or SKU numbers for inventory applications, screen or application numeric identifiers, copyrights, international phone number formats, and the like.

How to Set up Reflection PAN Detection

  1. From the InfoConnect File menu or the InfoConnect button (if using the Office 2007 Look and Feel), select InfoConnect Workspace Settings.

  2. Under Trust Center, click Set Up Information Privacy.

  3. On the Information Privacy dialog box, select Enable Redaction and then select InfoConnect PAN Redaction.

  4. Under Primary Account Number (PAN) Detection Rules, configure Custom Detection Rules and Custom Exception Expressions as shown in “How to Set up Detection Rules” and “How to Set up Custom Exclusions to avoid false positives.”

  5. To “lock down” these settings, see “Lock Down” InfoConnect To Restrict Access to Controls.

  6. To package this file for deployment, see Package Sessions and Custom Settings Files.

    NOTE: Privacy filter settings are saved in the PrivacyFilters.xml file. All other Information Privacy settings are saved in the PCIDSS.settings file. You can deploy these files to one of the following locations:

    Location for a single user: [AppDataFolder]\Micro Focus\InfoConnect\Desktop\v17.0, where [AppDataFolder] is the full path of the Roaming folder for the current user. The default is C:\Users\username\AppData\Roaming\.

    Location for all users: [CommonAppDataFolder]\Micro Focus\InfoConnect\Desktop\v17.0, where [CommonAppDataFolder] is the full path to application data for all users. The default is C:\ProgramData.

How to Set up Detection Rules

To set up detection rules, you specify the custom credit card sequences (patterns) that you need InfoConnect to recognize.

To enter a custom credit card number sequence, add a regular expression that specifies the pattern the sequence must follow to the Custom Detection Rules table. Because the Reflection PAN Detection method detects digit-only data, do not enter digit grouping separator characters such as hyphens or whitespace in these custom expressions.

NOTE: These patterns are applied only to the “remaining digit” data, that is, the string of all digits on the screen that have not been excluded by stock or custom exclusion patterns. Therefore, a custom PAN pattern should not include non-numeric data such as separators, nor can it specify preceding or trailing characters, whitespace, or word boundaries.

To match fixed prefixes, you can combine literal text with a regular expression. Card issuers typically use a fixed prefix of one or more digits, followed by remaining digits that can vary. If the prefix is always the same, express it as literal text in the regular expression. Do not specify something like \d{16}, which would match any 16-digit number. An expression like this is very likely to result in false positives.

Examples of regular expressions that detect credit card PANs

Example 1: Using a regular expression

To detect cards issued by a fictitious Acme Corporation, which are always 15 digits starting with “7200”, you might add the following regular expression:

7200\d{11}

This means “match every sequence that starts with 7200 and is followed by any 11 additional digits.”

Example 2: Using literal text and a regular expression

Let’s say we need to detect cards issued by a fictitious “National Bank” that are 16 digits starting with “88” with the next digit ranging from 0 to 5 (e.g. 880, 881, 882, 883, 884, or 885 are the valid prefixes). One could add the following regular expression for this case:

88[0-5]\d{13}

This is read as “match the literal text 88, followed by a digit in the range 0-5 inclusive, followed by 13 additional digits.”
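The two example expressions can be tried against digit-only strings before they are entered in the Custom Detection Rules table. The sketch below uses Python's re module purely for illustration; InfoConnect's regular expression engine may differ in details, the sample card numbers are made up, and the helper name matches is hypothetical. It uses a whole-string match for simplicity, whereas the product scans a longer run of digits.

```python
import re

# Patterns copied from Examples 1 and 2, plus the overly broad one to avoid.
ACME = re.compile(r"7200\d{11}")
NATIONAL_BANK = re.compile(r"88[0-5]\d{13}")
TOO_BROAD = re.compile(r"\d{16}")


def matches(pattern: re.Pattern, digits: str) -> bool:
    """Return True if the whole digit string fits the pattern."""
    return pattern.fullmatch(digits) is not None


print(matches(ACME, "720012345678901"))            # True: 7200 plus 11 digits
print(matches(ACME, "730012345678901"))            # False: wrong prefix
print(matches(NATIONAL_BANK, "8831234567890123"))  # True: third digit is 3
print(matches(NATIONAL_BANK, "8861234567890123"))  # False: third digit is 6
print(matches(TOO_BROAD, "2024010112345678"))      # True: any 16 digits match
```

The last line shows why \d{16} is discouraged: it accepts any 16 digits, including ones that merely happen to sit next to each other on the screen.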

How to Set up Custom Exclusions to avoid false positives

Because Reflection PAN Detection disregards the context of non-digit text when detecting PANs, all digits appearing on the screen could potentially be aggregated together to form a potential PAN. It is likely that your host screen data contains digit data that is not to be considered for PAN redaction.

To exclude this data from PAN redaction, InfoConnect uses a set of regular expressions. These exception expressions are listed in the Custom Exception Expressions table, in the Information Privacy dialog box. InfoConnect provides exception expressions that exclude some common digit patterns, such as North American phone numbers, currencies, short date/time formats, US social security numbers, and others.

However, you can also exclude digit formats that are proprietary to your applications, such as custom screen identifiers or inventory part or SKU numbers. To exclude proprietary formats from the redaction process, you will need to add one or more regular expressions to the Custom Exception Expressions table. Literal strings (such as screen ids, copyright notices and dates, etc.) can also be specified here.

NOTE: Unlike the expressions that detect credit card PANs, the expressions for exclusions are applied to the input screen data before removal of non-digit data. These expressions should be specified “as they would appear” on the original host screen.
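The ordering described in this note (exclusions first, digit extraction second) can be sketched in Python as follows. This is an illustration under stated assumptions, not InfoConnect's implementation: the function name mask_exclusions, the "x" filler character, the sample screen text, and the part-number format are all hypothetical.

```python
import re


def mask_exclusions(screen_text: str, exclusion_patterns: list[str]) -> str:
    """Replace each excluded match with same-length filler so its digits are ignored."""
    masked = screen_text
    for pattern in exclusion_patterns:
        masked = re.sub(pattern, lambda m: "x" * len(m.group(0)), masked)
    return masked


screen = "Part 4471-220 Card 1111 2222 3333 4444"
exclusions = [r"\b\d{4}-\d{3}\b"]  # hypothetical part-number format, written as it appears on screen

masked = mask_exclusions(screen, exclusions)
remaining = re.sub(r"\D", "", masked)      # digits left after masking
unmasked = re.sub(r"\D", "", screen)       # digits if no exclusion were applied

print(masked)     # Part xxxxxxxx Card 1111 2222 3333 4444
print(remaining)  # 1111222233334444
print(unmasked)   # 44712201111222233334444
```

Without the exclusion, the part number's digits run into the card number and shift where any 13 to 16 digit pattern could match. Keeping the filler the same length as the match is a design choice that makes it straightforward to restore the masked text at its original position afterward.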

Examples of Custom Exclusions

Example 1: Using a regular expression

In the USA, postal (or ZIP) codes can follow two formats: 5 consecutive digits, or 5 consecutive digits followed by a hyphen and 4 additional digits. Typically, on a host application information screen, these codes are preceded and followed by at least one space character. For a screen like this, we could add the following expression to the Custom Exception Expressions table to eliminate one potential source of “false positives” by excluding ZIP codes from redaction:

\s\d{5}([\-]\d{4})?\s

When we read this expression from left to right, it says “match a leading whitespace character, followed by 5 consecutive digits, and then match zero or one instance of a character group consisting of one hyphen followed by 4 additional digits, followed by a whitespace character.”

Note: Regular expression syntax sometimes requires the “escaping” of certain reserved characters. In this case, the hyphen is escaped because it appears within a character class (the square brackets inside the parentheses above), where an unescaped hyphen can denote a range of characters.

This expression would match strings like “ 88888 ” and “ 88888-7777 ”. Because leading and trailing whitespace is required, strings like “88888” and “88888-7777” would NOT be matched.

Omitting the \s in the expression above would result in matches where the digit patterns are embedded inside other longer strings.

For instance, suppose the expression were modified to:

\d{5}([\-]\d{4})?

Unlike the expression that includes \s, this version would find a match in each of the following text strings:

Tst88888Str

Embedded88888-7777Text

Prefix88888

88888-7777Suffix
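The difference between the two ZIP code expressions can be checked with a short script. The sketch below uses Python's re module for illustration only; InfoConnect's regular expression engine may behave differently in edge cases, and the variable names are assumptions.

```python
import re

# The two expressions from this example, with and without the \s requirement.
WITH_WHITESPACE = re.compile(r"\s\d{5}([\-]\d{4})?\s")
WITHOUT_WHITESPACE = re.compile(r"\d{5}([\-]\d{4})?")

samples = [
    " 88888 ",
    " 88888-7777 ",
    "Tst88888Str",
    "Embedded88888-7777Text",
    "Prefix88888",
    "88888-7777Suffix",
]

# Record whether each expression finds a match anywhere in each sample.
strict_hits = [s for s in samples if WITH_WHITESPACE.search(s)]
loose_hits = [s for s in samples if WITHOUT_WHITESPACE.search(s)]

print(strict_hits)  # [' 88888 ', ' 88888-7777 ']
print(len(loose_hits) == len(samples))  # True: every sample matches
```

Only the two space-delimited samples satisfy the stricter expression, while the version without \s matches all six, including digits embedded in longer strings.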

Example 2: Using an exact literal string

Suppose some of the host screens in an application have a copyright date string such as “Copyright © Acme Corporation 1990, 1992, 2004” that could cause a false positive. In this example, we will exclude the digits in that string from PAN redaction processing.

To do this, enter the exact literal string that you want to exclude. In this case, enter this string in the Custom Exception Expressions table:

Copyright © Acme Corporation 1990, 1992, 2004

IMPORTANT: When developing custom regular expressions, we highly recommend using a regular expression development and test tool to ensure that each expression behaves as you intend. In other words, confirm that it matches ONLY what you intend to match and does not match text that you don’t want. One such freeware tool is the Expresso regular expression development tool, which can currently be found at http://www.ultrapico.com/Expresso.htm. This tool requires registration for use beyond an introductory trial period, but is currently free to use once registration is complete.

Also, many common patterns have regular expression implementations “published” in the public domain. An internet search for “regular expression library” will turn up several sites that can be searched for pre-constructed regular expressions. One popular site is http://www.regexlib.com. You can use these as a starting point for your own expressions. Make sure that you thoroughly test expressions before you deploy them in InfoConnect.