5.9 Understanding Collectors

Identity Governance provides templates to simplify the collection of data. Collection templates or collectors are the default mappings of identity, account, or permission data from identity and application sources to the core Identity Governance schema. Your systems might use different terms for the same type of objects. Collectors enable you to map your system-specific objects from various sources to the Identity Governance objects in order to collect and publish them to the Identity Governance catalog.

Each collector has one or more views that describe the characteristics of the data source. The views allow you to specify which data you will collect from your identity or application source and how that data will be linked together in the catalog. The views are different for identity and application sources. For example, the JDBC Identity (Oracle) collector template can collect data for users, groups, group-to-group associations, and group-to-user associations. Collectors for application sources gather either account or permission data.

For each collector, you can collect data from on-premises data centers by enabling a Cloud Bridge connection.

5.9.1 Understanding Collector Configuration

Identity Governance provides a large set of collector templates that contain default data and configuration settings for many common enterprise and cloud data sources. Each template can be customized to connect to associated data sources.

NOTE:Customization of templates might require additional knowledge of connected systems, and all modifications are the responsibility of the customer. For further guidance, contact support or professional services.

Every collector has the following common elements:

Collector template

Collector templates include predefined attribute mappings and value transformation policies for specific data source types. Select a template that best suits the data source. For example, select AD Identity to collect identities from Active Directory. The templates support the following types of data sources:

  • Active Directory

  • Azure Active Directory

  • CSV file

    The CSV collector also supports TSV files. To collect from a TSV file, enter the word tab in the Column Delimiter field, in uppercase, lowercase, or any combination thereof. To collect from a CSV file, you must specify the full path to the file.

  • eDirectory

  • Google Apps

  • Identity Manager AE

  • IDM Entitlements

  • JDBC, such as Oracle or PostgreSQL

  • Resource Access Control Facility (RACF)

  • Salesforce

  • SAP HR

  • SAP User Management

  • SCIM

  • ServiceNow

  • SharePoint

  • Workday
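
As a rough illustration of the Column Delimiter setting described under CSV file above, the following sketch shows how a delimiter value might be interpreted. The resolveDelimiter and splitLine functions are hypothetical names used only for this example, and the sketch assumes unquoted fields. It does not show how the CSV collector is actually implemented.

```javascript
// Hypothetical helper: interprets a Column Delimiter field value.
// Assumption: the word "tab" in any letter case means the tab character,
// and any other value is used literally.
function resolveDelimiter(fieldValue) {
    var text = String(fieldValue);
    return text.toLowerCase() === 'tab' ? '\t' : text;
}

// Hypothetical helper: splits one line of an unquoted delimited file into columns.
function splitLine(line, fieldValue) {
    return line.split(resolveDelimiter(fieldValue));
}

// A TSV line split with the delimiter entered as "TAB".
var columns = splitLine('jsmith\tJohn\tSmith', 'TAB');
```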

NOTE:You can have one or more templates for each data source. Template names indicate whether the template is a permissions or accounts collector. Templates whose names end in “with changes” can be enabled for processing incremental change events.

NOTE:Micro Focus does not currently support SAP collectors in the SaaS environment.

To see all the data source types, select Collector Template when you create the data source.

To collect data from a JDBC or SAP source, Identity Governance needs the appropriate third-party connector libraries to be installed on the Identity Governance server. If you plan to collect data from a JDBC source using Cloud Bridge, you must rename the JAR files and save them to the lib folder relative to the Cloud Bridge agent.

For more information, see Identity Governance Server System Requirements in the Identity Governance 3.7 Installation and Configuration Guide.

You can also customize an existing template or create your own. For more information, see Section 3.5, Customizing the Collector Templates for Data Sources.

Service Parameters

These are the configurable parameters that allow the collector to connect and, if required, authenticate to the target data source. These typically include file locations, server host and port specifications, or service URLs.

This view also includes Cloud Bridge related parameters, authentication related parameters, and a Test connection button to verify the settings.

NOTE:Micro Focus supports Cloud Bridge only in Identity Governance as a Service deployments.

Cloud Bridge Connector

The Use Cloud Bridge connector? option enables you to collect data from on-premises data centers when using Identity Governance as a Service. After enabling a Cloud Bridge connection, you must select the data source pertaining to your data center for credentials to be passed through automatically based on the data source unique ID.

NOTE:After you enable a Cloud Bridge connection, you typically do not need to specify a user name and password for the data host server because credentials are passed through automatically based on your data source unique ID. However, for collectors such as the SCIM and Identity Manager AE Permission collectors, you might need to specify ordinals for additional authentication methods. For more information about Cloud Bridge procedures for unique collectors, see the Identity Governance as a Service Quick Start. Always test the connection to verify that you configured the service parameters correctly.

Collect Views

Each collector consists of one or more collector “views” that can be customized to match the characteristics of the data source being collected. These views enable you to map attributes and add transformation scripts. When collecting identities, they also enable you to select the match rules used when publishing and merging.

For information about identity collect views, see Section 6.1, Understanding Collector Templates for Identity Sources. For information about application (account and permission) collect views, see Section 7.2, Understanding Collectors for Application Data Sources.

Transformation Scripts View

This view in the collector template allows you to view transformation script usage information. For information about using transformation scripts, see Section 5.9.2, Transforming Data During Collection.

Test Collection and Troubleshooting

This option allows you to preview data before running a full collection, preserve the configuration for a data source, or create an emulation package for a data source. You can use the generated files to validate and troubleshoot collections, send results to support engineers, and import data source configurations to a different environment.

For more information about test collections and troubleshooting, see Section 5.9.3, Testing Collections and Section 5.9.4, Creating Emulation Packages.

For more information about configuring data source collector templates, see the sections that follow.

5.9.2 Transforming Data During Collection

Because each application might have its own format for the data that you plan to collect, you might need to transform the data during the data collection process. For example, the application might store dates as a string (20151202) that needs to be converted to the Identity Governance date format, which is the Java Date format in milliseconds. Also, an application might use field lengths that do not match the field length in Identity Governance. These variations in collected data affect your ability to use the data or merge it with data collected from other sources.

You can add a transformation script to any mapped data field in any data collector by clicking the ‘{}’ icon next to the field mapping. The dialog expands so that you can either upload a transformation file or paste in transformation text. If required, you can also delete a transformation script after removing all references to the script from the attribute mappings that use it.

The transformations are written in Nashorn-compatible JavaScript. Within the script, you access the collected value by reading the variable named inputValue. After manipulating the collected value, you return the result to Identity Governance by assigning it to the variable named outputValue.

The following example translates the values true and false from the connected system to active and inactive in the Identity Governance catalog.

if (inputValue == 'true') {
    outputValue = 'active';
}
else {
    outputValue = 'inactive';
}
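
The date conversion mentioned at the start of this section could be sketched in a similar way. The following example assumes that the source stores dates as yyyyMMdd strings and that each date should be interpreted as midnight UTC. The toEpochMillis function is a hypothetical helper, and returning null for values that cannot be parsed is an assumption rather than documented product behavior.

```javascript
// Hypothetical helper: converts a yyyyMMdd string to epoch milliseconds.
// Assumption: the date is interpreted as midnight UTC.
function toEpochMillis(value) {
    var text = String(value).trim();
    if (!/^\d{8}$/.test(text)) {
        return null; // assumption: values that do not match yyyyMMdd yield null
    }
    var year = parseInt(text.substring(0, 4), 10);
    var month = parseInt(text.substring(4, 6), 10) - 1; // Date.UTC months are zero-based
    var day = parseInt(text.substring(6, 8), 10);
    var millis = Date.UTC(year, month, day);
    // Reject impossible dates such as 20151340, which Date.UTC would roll over.
    var check = new Date(millis);
    if (check.getUTCFullYear() !== year ||
            check.getUTCMonth() !== month ||
            check.getUTCDate() !== day) {
        return null;
    }
    return millis;
}

// inputValue is supplied by the collector. The typeof guard only lets this
// sketch load outside Identity Governance, where inputValue does not exist.
if (typeof inputValue !== 'undefined') {
    outputValue = toEpochMillis(inputValue);
}
```

With these assumptions, the string 20151202 converts to 1449014400000, which is 2 December 2015 at 00:00:00 UTC.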

To add or delete a transformation script:

  1. Log in as a Customer, Global, or Data Administrator.

  2. Select a configured data source, and then expand a collector view to view related attributes.

  3. Click the ‘{}’ icon next to the field mapping to add a script.

    or

  4. Delete a script.

    NOTE:You must remove all references to the script from the attribute mappings to delete a script.

    1. Expand the Transformation scripts view of the data collector to see its usage.

    2. Expand the collector view(s) mentioned in the usage information.

    3. Click the ‘{...}’ icon next to the field mapping and choose Select a script... to clear the script usage from the attribute mapping.

    4. Repeat the previous step for each attribute mapping that uses the script.

    5. Expand the Transformation scripts view and select the delete icon to delete the script.

For more information about transformations, see the Collected Data Transformations reference.
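
The field-length mismatch described at the start of this section can be handled with a similar script. The following is a minimal sketch. The limit of 255 characters is a placeholder and not a documented Identity Governance value, and the fitToLength function is a hypothetical helper. Check the actual length of the target attribute before using a script like this.

```javascript
// Assumption: 255 is a placeholder limit, not a documented product value.
var MAX_LENGTH = 255;

// Hypothetical helper: trims surrounding whitespace and truncates values
// that are longer than maxLength. Null and undefined pass through unchanged.
function fitToLength(value, maxLength) {
    if (value === null || typeof value === 'undefined') {
        return value;
    }
    var text = String(value).trim();
    return text.length > maxLength ? text.substring(0, maxLength) : text;
}

// inputValue is supplied by the collector. The typeof guard only lets this
// sketch load outside Identity Governance, where inputValue does not exist.
if (typeof inputValue !== 'undefined') {
    outputValue = fitToLength(inputValue, MAX_LENGTH);
}
```

Truncation discards data, so it suits descriptive fields better than identifiers that are used to match or merge records.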

5.9.3 Testing Collections

When creating, updating, or troubleshooting data collectors, you can test all or part of the collections without publishing the results to the catalog. Testing a collection lets you confirm that the collector is correctly configured or, if it is not, change the collector configuration and quickly test again to check the results.

You can view the collected data as soon as the test collection completes, or you can download the results to view later. Results of test collections remain available in the Identity Governance database until you delete them or they expire.

When you run a test collection, you have some options for the test data:

  • All records

  • Some records

    When you select a subset of records to collect, you cannot control which records to collect. You could use this option if you want to quickly spot check a collector configuration rather than waiting for all the data to be collected.

  • Raw data

    Raw data contains attribute names from the native application. These attributes have not yet been transformed based on the mappings in the collector. Testing the raw data collection lets you verify that you are collecting the data you intend to collect before Identity Governance transforms it.

  • Transformed data

    Transformed data contains attribute names that you have mapped from the native application to the attribute names you are using within Identity Governance. Testing the transformed data collection lets you verify that your mappings within the data collector meet your expectations.
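
The difference between raw and transformed data can be pictured with a small sketch. All attribute names, the mapping structure, and the applyMappings function below are hypothetical and chosen only for illustration. They do not reflect the internal format that Identity Governance uses.

```javascript
// Hypothetical raw record, keyed by attribute names from the native application.
var rawRecord = {
    sAMAccountName: 'jsmith',
    givenName: 'John',
    sn: 'Smith',
    enabled: 'true'
};

// Hypothetical mappings: target attribute name -> source attribute name,
// with an optional transformation function.
var mappings = {
    userId: { source: 'sAMAccountName' },
    firstName: { source: 'givenName' },
    lastName: { source: 'sn' },
    status: {
        source: 'enabled',
        transform: function (value) {
            return value == 'true' ? 'active' : 'inactive';
        }
    }
};

// Builds the transformed record by renaming each attribute and applying
// its transformation, if one is defined.
function applyMappings(record, mapping) {
    var result = {};
    for (var target in mapping) {
        if (Object.prototype.hasOwnProperty.call(mapping, target)) {
            var rule = mapping[target];
            var value = record[rule.source];
            result[target] = rule.transform ? rule.transform(value) : value;
        }
    }
    return result;
}

var transformedRecord = applyMappings(rawRecord, mappings);
```

In this sketch, rawRecord corresponds to what a raw data test shows, and transformedRecord corresponds to what a transformed data test shows.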

To test a sample collection from a data source:

  1. Log in as a Customer, Global, or Data Administrator.

  2. Select a data source.

    NOTE:Test connection is not supported when the CSV collector is accessed via an HTTP or HTTPS connection.

  3. Click Test Collection and Troubleshooting.

  4. On the Test Collection tab, select the collectors, then:

    1. Click Run Test Collection.

    2. Select the specific entities to collect.

    3. (Conditional) To collect a subset of records, type the number of records to collect.

    4. (Conditional) To collect all records, make no changes to the default All value.

    5. Start raw data or transformed data collection.

  5. To view the test collection results, select Actions > View.

  6. To download the test collection results to your local computer:

    1. Click Actions > Test collection results.

    2. Enter a meaningful description.

    3. Click Download.

    4. Click the download icon on the Identity Governance title bar to download test collection results to your local computer.

    5. (Optional) Delete the test collection results from the download area in Identity Governance.

      If you do not manually delete the test collection results from the download area, Identity Governance will automatically delete the data from the database based on your default download retention day settings. For information about customizing download settings, see Section 3.9, Customizing Download Settings.

  7. (Optional) On the Test Collection tab, click Actions > Delete to delete the test collection.

    Identity Governance will automatically delete the test collection based on your default download retention day settings.

5.9.4 Creating Emulation Packages

You can more easily troubleshoot collection configuration outside your production environment by creating emulation packages for data source collectors. An emulation package contains CSV files with the raw collected data from the data source and a CSV file containing data source configuration details. Emulation packages remain available in the Identity Governance database until you delete them or they expire.

To create an emulation package:

  1. Select a data source.

  2. Click Test Collection and Troubleshooting.

  3. On the Download and Emulation tab, click Create emulation package.

  4. To view the emulation records, select Actions > View.

  5. To download the emulation package to your local computer:

    1. Click Actions > Download emulation package (data source and raw collected data).

    2. Enter a meaningful description.

    3. Click Download.

    4. Click the download icon on the Identity Governance title bar to download the emulation package to your local computer.

    5. (Optional) Delete the emulation package from the download area in Identity Governance.

      If you do not manually delete the emulation package, Identity Governance will automatically delete the data from the database based on your default download retention day settings. For information about customizing download settings, see Section 3.9, Customizing Download Settings.

  6. (Optional) On the Download and Emulation tab, click Actions > Delete to delete the emulation.

    Identity Governance will automatically delete the emulation based on your default download retention day settings.

5.9.5 Downloading and Importing Collectors

The ability to download and import collectors helps you manage your environment in several ways.

  • Back up a working collector

  • Replicate an environment

  • Update collector details in a text editor

  • Troubleshoot collections

Configuring collectors can take time, and you might go through several iterations of trial and error. When you have configured a collector that achieves the results you want, you should download it and save it with your other backup files. You can also use downloaded collectors to replicate an environment, either in a test environment or to use in another office location.

You could decide that you need to change the predefined attribute mappings and value transformation policies of a template to meet your specific environment. If you find that you need to customize a collector template, rather than only editing the values in a collector, you can download and import collector templates under Configuration in Identity Governance. For more information, see Customizing the Collector Templates for Data Sources.

NOTE:To correctly import data, you must download data sources from the current version of Identity Governance.

When you download a data source, the zipped file has the name of the data source, for example, AD_Identities.zip. The files within the zipped file are generically named in English and can include the following files:

  • Identity_Source.json or Application_Source.json file (depending on the type of data source), which contains the configuration of the data source and all of its collectors.

  • Attribute files containing the schema elements used by the collectors within the data source. For example, USER_Attributes.json, PERMISSION_Attributes.json, and APPLICATION_attributes.json.

  • Template files containing the collector template name and version used to create the collectors in the data source. For example, Template_AD-Account_3.6.0.json.

  • Categories.json file when categories are applied to the source.
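
When you keep several downloaded data sources, it can help to read the template name and version from a template file name. The following sketch assumes that template files follow the pattern Template_<name>_<version>.json. That pattern is inferred only from the example Template_AD-Account_3.6.0.json and is not a documented naming rule. The parseTemplateFileName function is a hypothetical helper.

```javascript
// Hypothetical helper: extracts the template name and version from a file name.
// Assumption: files are named Template_<name>_<version>.json, where the
// version is a dotted sequence of numbers.
function parseTemplateFileName(fileName) {
    var match = /^Template_(.+)_(\d+(?:\.\d+)*)\.json$/.exec(fileName);
    if (!match) {
        return null; // not a template file under the assumed pattern
    }
    return { template: match[1], version: match[2] };
}

var info = parseTemplateFileName('Template_AD-Account_3.6.0.json');
```

Here, info.template is AD-Account and info.version is 3.6.0.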

To download data source and associated files:

  1. Select a data source, then select Test Collection and Troubleshooting.

  2. Select Download and Emulation.

  3. Click Download Data Source Configuration.

    1. Type a meaningful description such as the collector name.

    2. (Optional) Download included templates, assigned categories, and associated attribute definitions.

    3. Select the download icon on the top title bar to access the saved file and download the file.

      HINT:We recommend creating a folder for each data source zipped file and extracting the contents into that folder. This ensures that similarly named files from different sources are not mixed together and do not overwrite one another.

To import associated files and data source:

  1. (Conditional) If your data source has custom schema or categories associated with it, import the previously downloaded schema files or category files before importing the data source. To import attribute definitions, navigate to the respective attribute page under Data Administration and import the respective attribute file. To import categories and templates, select the respective options under Configuration.

  2. Under Data Sources, select Identities or Applications.

  3. Select Import an identity source or Import an application source.

  4. Based on the type of data source, select the Identity_Source.json or the Application_Source.json file.
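
The import order described in the preceding steps can be sketched as a small planning script that you run against the list of files extracted from a downloaded data source. The planImport function and the file name patterns are assumptions based on the example file names in this section. The relative order of attribute, category, and template files in the result is arbitrary. The only requirement taken from the procedure is that they precede the data source file.

```javascript
// Hypothetical helper: groups extracted file names and lists them in an order
// that imports prerequisites before the data source itself.
function planImport(fileNames) {
    var plan = { attributes: [], categories: [], templates: [], sources: [], unrecognized: [] };
    for (var i = 0; i < fileNames.length; i++) {
        var name = fileNames[i];
        if (/_attributes\.json$/i.test(name)) {
            plan.attributes.push(name);      // schema elements used by the collectors
        } else if (name === 'Categories.json') {
            plan.categories.push(name);      // categories applied to the source
        } else if (/^Template_.+\.json$/.test(name)) {
            plan.templates.push(name);       // collector templates
        } else if (name === 'Identity_Source.json' || name === 'Application_Source.json') {
            plan.sources.push(name);         // data source configuration
        } else {
            plan.unrecognized.push(name);
        }
    }
    // Prerequisites first, the data source configuration last.
    plan.order = plan.attributes.concat(plan.categories, plan.templates, plan.sources);
    return plan;
}

var plan = planImport([
    'Application_Source.json',
    'PERMISSION_Attributes.json',
    'Categories.json',
    'Template_AD-Account_3.6.0.json'
]);
```

In this sketch, plan.order lists Application_Source.json last, after the attribute, category, and template files.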