action.skip

Data Mapping

This chapter shows you how the Databridge Client maps DMSII data structures to relational database structures.


DMSII and Relational Database Terms

The following table shows the equivalent terms for DMSII structures and relational database structures:

DMSII Relational
Data set Table
DMS item (data item) Column
Record Row (record)
Set Index

NOTE: A relational database index is a set of column names that is used to efficiently access a row (of a table).
Key Key


DMSII and Relational Database Data Types

The Databridge Engine retrieves the requested DMSII data, and DBServer passes the data to the Databridge Client, where it is assigned to standard relational database data types.

The following table lists equivalent data types for DMSII, Microsoft SQL Server, and Oracle.

DMSII SQL Server Oracle
ALPHA (<= char_limit bytes) CHAR * CHAR +
ALPHA (>char_limit bytes) TEXT VARCHAR2
ALPHA (>varchar2_limit_bytes) TEXT CLOB
BOOLEAN BIT NUMBER(1)
FIELD:

FIELD(n) where n is < 16
FIELD(n) where n is < 32
FIELD(n) where n >= 32


SMALLINT
INT
BIGINT ‡


NUMBER(5)
NUMBER(10)
NUMBER(10-15)
NUMERIC:

NUMBER(n) where n is a DMSII declared length <= 2
NUMBER(n) where n is a DMSII declared length <= 4
NUMBER(n) where n is a DMSII declared length <= 9
NUMBER(n) where n is a DMSII declared length <= 15
NUMBER(n) where n is a DMSII declared length >15
NUMBER(n,m) where n is a DMSII declared length and m is the number of places after the decimal point.


TINYINT §
SMALLINT
INT
BIGINT ‡
DEC(n)
DEC(n,m)


NUMBER(n)
NUMBER(n)
NUMBER(n)
NUMBER(n)
NUMBER(n)
NUMBER(n,m)
REAL:

REAL(n) where n is a DMSII declared length <= 2
REAL(n) where n is a DMSII declared length <= 4
REAL(n) where n is a DMSII declared length <= 9
REAL(n) where n is a DMSII declared length > 9
REAL(n,m) where n is the DMSII declared length and m is the number of places after the decimal point.
REAL, with no precision or scale


TINYINT §
SMALLINT
INT
DEC
DEC(n,m)
FLOAT


NUMBER(n)
NUMBER(n)
NUMBER(n)
NUMBER(n)
NUMBER(n,m)
FLOAT

* VARCHAR if the configuration parameter use_varchar is set to True.

† VARCHAR2 if the configuration parameter use_varchar is set to True.

‡ If the configuration parameter use_bigint is set to False, DEC(n) will be used instead.

§ Note that if the number is signed, SMALLINT is used instead. TINYINT is an unsigned quantity in SQL Server.


Databridge Data Types

IMAGE is a Databridge type that allows you to store an ALPHA item as binary by using the REDEFINE clause of the ALTER command in GenFormat on the host.

When the Databridge Client encounters an item of type IMAGE, it automatically sets the ds_options DIOPT_Clone_as_Binary bit in the DMS_ITEMS Client control table.


Supported DMSII Structures

This section lists DMSII structures that are supported by the Databridge Client. If you are the relational database administrator and have no experience with DMSII databases, this section will be more useful to you if you are working with a DMSII database administrator.

In addition to fixed-format data sets and variable-format data sets, the Databridge Client supports the following DMSII structures:

  • Embedded data sets (see exceptions in the following section)

  • Remaps

  • Logical database

  • GROUP

  • FIELD items for GROUPs of BOOLEANS

  • OCCURS

  • GROUP OCCURS

  • Data sets with more items than the maximum number of columns supported by the relational database

  • Data sets that generate relational tables whose record sizes exceed the Microsoft SQL Server maximum record size.

  • DMSII links in DIRECT, UNORDERED, and STANDARD data sets. (Links in variable-format data sets are cloned but not tracked.)

Some of these structures may not be supported by your relational database. The DMSII structures that are not supported by relational databases are mapped into a form that the relational database can use. Each of these structures and the way they are mapped in the relational database are explained in the remainder of this chapter.


Unsupported DMSII Structures

When the Databridge host replication software does not support a particular DMSII structure, the Databridge Client may or may not issue a warning message, depending on the DMSII structure. For example, a message is generated when the data set has no keys.

The Databridge Client does not support the following DMSII structures:

  • Embedded data sets within an ORDERED or COMPACT data set

  • Embedded data sets if the INDEPENDENTTRANS option is reset

  • POPULATION items

  • COUNT data items

  • FILLER data items

  • AGGREGATE data items


Embedded Data Sets

An embedded data set is a DMSII representation of a hierarchical relationship or tree structure. When a DMSII data set contains another data set as an item, that data set is called an embedded data set. The data set in which it is declared is called the parent of the embedded structure. You can think of the embedded data set as the "child" of the "parent" data set.

To represent this parent-child relationship in a relational database, the Databridge Client uses a foreign key that points to the parent data set. This foreign key is represented by the value in the parent_aa column in the table that corresponds to the embedded data set. The parent_aa column holds the parent record's key.

DMSII DASDL Showing an Embedded Data Set

The following is an excerpt from a DMSII DASDL that shows how an embedded data set is defined.

 GENEALOGY DATA SET
  (
    PARENT-FAT-NAME          ALPHA (30);
    PARENT-MOT-NAME          ALPHA (30);
    PARENT-MOT-MAIDEN        ALPHA (30);
    PARENT-FAT-BDATE         NUMBER (06);
    PARENT-MOT-BDATE         NUMBER (06);
    FILLER                   SIZE (06);
 %
    CHILD                    DATA SET;
       (
       CHILD-NAME            ALPHA (30);
       CHILD-STATUS          ALPHA (11);
       CHILD-BDATE           NUMBER (06);
       CHILD-GENDER          FIELD
          (
          CHILD-MALE         BOOLEAN;
          CHILD-FEMALE       BOOLEAN;
          );
       CHILD-FILLER          FIELD (01);
 %
       );
  );

Resulting Tables

The following examples are for Microsoft SQL Server.

Ignoring any set definition, the resulting relational database tables are as follows:

  • Genealogy (the parent data set is cloned to its own primary table)

  • Child (the embedded data set is cloned to its own secondary table with a pointer to its parent table)

Parent Table: genealogy (table name). Where the my_aa column is a unique key for the record derived from the DMSII AA Value of this record.

    (
    my_aa             char(12),
    parent_fat_name   char(30),
    parent_mot_name   char(30),
    parent_mot_maiden char(30),
    parent_fat_bdate  int,
    parent_mot_bdate  int
  )

Child Table: child (table name)

(
      my_aa             char(12), - child table's key
      parent_aa         char(12), - foreign key of parent table
      child_name        char(30),
      child_status      char(11),
      child_bdate       int,
      child_male        bit,
      child_female      bit,
      child_filler      smallint
  )


Selecting Embedded Data Sets for Cloning

When you run a clone command, by default, the Databridge Client selects embedded data sets along with the parent structures. If you have altered the values for the active column in the Databridge Client control tables, however, check the values for the embedded data set and its parent data set.

Caution

If you want to clone an embedded data set, you must also clone the parent structure. Failure to do this results in the following error message from the Databridge Engine on the host:

0043 Parent of embeddeddataset must be selected.


Record Serial Numbers

A record serial number (RSN) is a unique number (48-bits long) that is associated with a record in a data set. The RSN is guaranteed to be unique, and it stays with a record for the life of the record. Updates do not affect the RSN; therefore, RSNs are ideal keys for Databridge tables. However, RSNs are available only when using DMSII XE.

DMSII XE adds the RSN to every record of a data set that has the EXTENDED attribute set. As long as the EXTENDED attribute is set, Databridge can access the RSN, unlike application programs that can access the RSN only if it has been explicitly declared in the DASDL source. The Databridge Engine is designed to use RSNs instead of AA Values whenever possible. Regardless of whether RSNs are present, AA Values are used for data sets that contain embedded data sets or DMSII links.

Since RSNs and AA Values are the same length, they are interchangeable, unless the data sets are embedded data sets or contain DMSII links. If the Databridge Engine is not handling either of these types of data sets and an RSN is present, Databridge Engine uses the RSN rather than the AA Value. In this case, the resulting column of the DATAITEMS Client control table is named my_rsn instead of my_aa to differentiate it from an AA Value. In both cases, the set_name column of the DATASETS Client control table will have a value of "aa_set".


AA Values

AA is a DMSII term that stands for absolute address. An absolute address value is an A Series WORD (48-bits in length). In the Databridge Client, AA is the hexadecimal representation (12 character strings containing the characters 0--9 and A--F) of the AA Value on the host. Databridge Client uses the AA Values to implement unique keys for the parent structures of embedded data set records. It also uses AA Values to reference the records of data sets that do not have DMSII SETS with the NO DUPLICATES ALLOWED attribute.

AA Values are not constant. Any DMSII reorganization (record conversion, file format, or garbage collection) changes these values.

Note

Databridge Client supports numeric AA Values that are stored as NUMBER(15) in Oracle and BIGINT in SQL Server. It also supports binary AA Values that are stored as RAW(6) in Oracle and BINARY(6) in SQL Server.

The Databridge Client implements DMSII link items, such as MEMO items in LINC databases, using an AA Value. You can use the AA Value as a foreign key when you want data from the tables associated with the link item. To enable support for DMSII links, you must do the following:

  • Enable DMSII link support in the Databridge Engine control file.

  • Set the Client configuration file parameter enable_dms_links to True.

Variable-Format Data Sets

DMSII variable-format data sets consist of a collection of dissimilar records containing a fixed part that is common to all records, and a variable part that depends on the record type. The individual records are identified by the record type, which is a special data item that has a value between 0 and 254.

  • A record type of 0 indicates that the record does not contain a variable part.

  • A record type of 1--254 indicates that the record contains the corresponding variable part in addition to the fixed part that is always present.

The concept of variable-format tables does not exist in relational databases. Therefore, the Databridge host software handles the various types of variable-format records as different structures. Databridge references these structures by a data set name and a record type (all other data set types have a record type value of 0). The Databridge Client uses the notation datasetname/rectype when referring to variable-format data sets in all messages. The Databridge Client handles these structures as if they were logical data sets; thus, each individual record type of a variable-format data set is mapped to a different relational database table. Variable-format data sets are tracked and updated like fixed-format data sets. Links in variable-format data sets, however, are not tracked, but retain their values from the initial clone. The link values for any records created after the clone will be null. (See track_vfds_nolinks.)

Fixed Part Only Records Even though type 0 records are not explicitly declared in the DMSII DASDL, applications can create such records by simply setting the record type to 0. The Databridge software always defines a data set with a record type of 0 (rectype column in the DATASETS Client control table) for type 0 records of variable-format data sets. This data set is mapped to a table whose name is derived from the data set name (that is, name converted to lowercase and all dashes replaced by underscores).

Note that unless the DMSII applications explicitly store type 0 records in the data set, this table may be empty. If you know this is the case, you may want to disable the cloning of the type 0 records for the data set by setting the active column of the corresponding entry in the DATASETS Client control table to 0.
Variable Format Records All other record types are treated as if they were contained in a separate structure. The primary tables for these structures are named by appending the suffix "_type#* to the name mapped from the data set name, where # is the decimal value of the record type (that is, a value between 1 and 254). Note that the fixed part of the variable-format records and the record type are also stored in the resulting relational database table.

DMSII DASDL Showing Fixed- and Variable-Length Records

      MAIN              DATA SET
      (
          VAR-REC       RECORD TYPE (3);
          CUST-NO       NUMBER (08);
          CPU           ALPHA (06);
          FILLER        SIZE (05);
      ),

      %

      1:
      (
          SMSA          ALPHA (04);
          SALES         ALPHA (06);
      )

      %

      2:
      (
          STATUS        ALPHA (01);
          RECEIVED      ALPHA (06);
          ORDER-DATE    NUMBER (08);
      )

      %

      3:
      (
          SITE          ALPHA (07);
          SOURCE        ALPHA (07);
          CLASS         ALPHA (02);
      );

      %

      MAINSET           SET OF MAIN
                        KEY (CUST-NO)
                        NO DUPLICATES;


Resulting Tables

The examples in this section assume that the Microsoft SQL Server database is being used and that the Client uses the SET MAINSET as the source for index for the various tables.

The following tables are derived from the variable-format data set MAIN:

  • main (type 0 records)

  • main_type1 (type 1 records)

  • main_type2 (type 2 records)

  • main_type3 (type 3 records)

    Note

    All four tables contain the fixed part of the data set. The var_rec column is the record type; all records in the individual tables will have the same value in this field.

Record Type 0 Table

The table named main represents all type 0 records that do not have a variable part. The var_rec column of all records in this table will have a value of 0. Note that this table may be empty if your application does not use type 0 records. The SQL statement to create this table is shown as follows:

  create table main
  (
    cust_no int,
    var_rec smallint,
    cpu char(6)
  )

Record Type 1 Table

The table named main_type1 represents all type 1 records. The var_rec column of all records in this table will have a value of 1. The SQL statement to create this table is shown as follows:

create table main_type1
(
    cust_no    int,
    var_rec    smallint,
    cpu        char(6),
    smsa       char(4),
    sales      char(6)
)

Record Type 2 Table

The table named main_type2 represents all type 2 records. The var_rec column of all records in this table will have a value of 2. The SQL statement to create this table is shown as follows:

create table main_type2
(
    cust_no    int,
    var_rec    smallint,
    cpu        char(6),
    status     char(1),
    received   char(6),
    order_date int
)

Record Type 3 Table

The table named main_type3 represents all type 3 records. The var_rec column of all records in this table will have a value of 3. The SQL statement to create this table is shown as follows:

create table main_type3
(
   cust_no     int,
   var_rec     smallint,
   cpu         char(6),
   site        char(7),
   source      char(7),
   class       char(2)
)


Split Variable Format Data Sets Option

When the ds_options bit DSOPT_Split_Varfmt_ds (bit value 65536) is set, variable format data sets are treated slightly differently. The record type 0 tables contains the fixed part of all records regardless of their record types. However, the table has exactly the same layout as above. The tables for all the other records only contain the variable part of the records and the keys from the fixed part.

The table named main_type1 in the above example will now contain the key cust_no and the variable part. The SQL statement to create this table is shown as follows:

create table main_type1 ( cust_no int, smsa char(4), sales char(6) )


Changing the Default Data Type

In most cases, the default data types are sufficient. If you want to change the data type, however, use a relational database query tool to edit the sql_type column in the DATAITEMS Client control table, or put the SQL statements in user scripts as explained in Customizing with User Scripts.

Caution

When changing the default data type, make sure that you choose a correct data type or the data may not be correctly stored in the relational database.

Most of these relational database data types can be changed using data table customization user scripts or the Administrative Console's Customize command is the Settings menu for the data source.

Value for sql_type Generic Data Type Microsoft SQL Server Data Type Oracle Data Type
0 bit bit number(1)
1 char char char
2 varchar varchar varchar2
3 byte tinyint number(3)
4 short int smallint number(5)
5 long int int number(10)
6 float float float
7 text text clob
8 binary binary raw
9 varbinary varbinary raw
10 datetime datetime date
11 packed BCD dec number
12 smalldatetime smalldatetime date
13 numeric date int number(10)
14 unsigned long binary(4) raw(4)
15 timestamp timestamp N/A
16 serial {int - bigint - dec(n) - identity} N/A
17 numeric_time int number(6)
ticks int number(6) or number(10) N/A
18 int64 bigint NA
19 date date NA
20 datetime2 datetime2 timestamp
21 time time N/A
22 uniqueidentifier uniqueidentifier N/A


Handling DMSII GROUPs

A GROUP is a DMSII construct that allows the data items that belong to the group to be referenced at one time (for example, as one item). The concept of GROUP does not exist in a relational database. Therefore, if the DMSII database you replicate has one or more GROUPs, the Databridge Client ignores the GROUP name and instead treats each item within the GROUP as a regular data item. All items in a DMSII GROUP share the same parent item number, which is the item number of the GROUP item.

Following is an example of the DMSII GROUP item in the data set called ADDRESS. This GROUP item consists of the data item CITY and the data item STATE.

DMSII DASDL Showing GROUP

The following is an excerpt from a DMSII DASDL that shows how a GROUP item is defined. With the GROUP item, you can access both city and state with one reference.

      ADDRESS           DATA SET
      (
          STREET        ALPHA (20);
          APARTMENT     ALPHA (5);
          CITY-STATE    GROUP
          (
              CITY      ALPHA (20);
              STATE     ALPHA (2);
          );
          COUNTRY       ALPHA (20);
          ZIPCODE       NUMBER (5);
          POSTFIX       NUMBER (4);
      );

The next example shows how the same DMSII GROUP item is mapped to a relational database.

Relational Database Table

The following example is for Microsoft SQL Server.

The table name is the lowercase form of the DMSII data set name. The GROUP item CITY-STATE is ignored. The data items in that group are included in the relational database table as if they were ordinary DMSII data items.

address (table name)
street       apartment    city       state       country   zipcode  postfix

May St.      3            Paris      OH          USA       15010    2146
Elm Ln.                   River      SD          USA       24906    3381

If there are duplicate names among members of various groups within a data set, the Databridge Client resolves the conflict by appending a digit to the column name to make it unique.


Handling DMSII OCCURS

An OCCURS clause is a DMSII construct that describes the number of times an item is present or repeats within a data set. Because relational databases do not support the OCCURS construct, these clauses generate additional tables, which can degrade the performance of update processing.

You can control how items with an OCCURS clause are mapped on an item by item basis by flattening OCCURS. See Flattening OCCURS Clauses.

Default OCCURS Handling

If you don't flatten OCCURS, Databridge Client creates a new table for each data item that contains an OCCURS clause. The keys from the data item's parent data set are used as keys in the new table. In addition, a new key (named index1) is created to establish a unique composite key for each recurring data item.

For example, a DMSII data set has a data item with an OCCURS clause will result in two relational database tables:

  • The first table (called the primary table) is named using the lowercase form of the DMSII data set name with all hyphens changed to underscores. It contains the key items as well as all data items that do not have OCCURS clauses.

  • The second table (called the secondary table) is named by appending an underscore and the data item name to the primary table name. This table contains all of the OCCURS items; however, each table has a unique key created by index1. (Names that exceed the character limit are truncated. If the truncation results in a duplicate item names, the last characters of the name are changed to digits).

Handling OCCURS items this way can significantly degrade the performance of update processing if the number of occurrences is large. The storage required to hold the keys of the secondary table items can also be substantial. For example, an OCCURS 100 TIMES clause can turn a single DMSII update into 101 relational database updates. See DMSII DASDL with OCCURS for an example of a DMSII data set that has a data item with an OCCURS clause.


DMSII DASDL with OCCURS

The following excerpt from a DMSII DASDL shows how an OCCURS clause is defined.

 ORDERS                   DATA SET
 (
     ORDER-ID             ALPHA (4);
     ORDER-DATE           ALPHA (5);
     ORDER-ITEM           OCCURS 10 TIMES NUMBER (8);
 );

 BY-ORDER-ID SET OF ORDERS
 KEY IS
 (
     ORDER-ID
 )
 NO DUPLICATES,
 INDEX SEQUENTIAL;

The OCCURS clause allows access by subscripting (indexing) within an application program. Because relational databases do not allow subscripting (indexing), the Databridge Client maps the subscript into an additional key. The OCCURS items, then, are available by row.

When this ORDERS data set is cloned into the relational database, it is mapped into the following two tables. These tables show how the DMSII OCCURS clause appears in a relational database.

Table 1

This table is named the same as the ORDERS DMSII data set, and it contains the key item plus all non-OCCURS items. Assuming the ORDERS DMSII data set has 50 records, this table has 50 rows.

orders (table name)
order_id       order_date
--------       ----------
1201           jan12
 .              .
 .              .
 .              .
1250           feb12

Table 2

This table name combines the DMSII data set name and the name of the data item which has an OCCURS clause. It contains all the occurrences of the OCCURS data item ORDER-NUM.

Continuing with the example from Table 1 with 50 records (rows), this table has 500 total rows. For every order_id key (50 total), there are ten OCCURS items (as declared in the DASDL on the previous page).

orders_order_item (table name)
order_id      index1    order_item
--------      ------    ----------
1201          1         00007390
1201          2         00001293
1201          3         00007748
1201          4         00009856
1201          5         00003736
1201          6         00002278
1201          7         00004327
1201          8         00009463
1201          9         00008638
1201          10        00008954
1202          1         00001754
1202          .         00005309
1202          .         00004537
1202          10        00005940
1203          1         00005430
1203          .         00005309
1203          .         00004537
1203          10        00006587
  .           .             .
  .           .             .
  .           .             .


Flattening OCCURS Clauses

The flatten_all_occurs parameter makes the defineand redefinecommands set the value of the bit DIOPT_Flatten_Occurs (1) in the di_options column in the DMS_ITEMS table for all items that have OCCURS clauses. You can set this parameter from the Client Configurator or by editing the configuration file to specify whether to globally flatten OCCURS clauses for a data source. By using user scripts, you can control this option for individual items.

The Databridge Client provides two options for handling OCCURS clauses.

Flatten OCCURS to the primary table Each occurrence of the item is mapped into a separate column in the primary table. Use this method if the number of occurrences is not too large and applications access the occurring items by column name (versus numeric index).

This is the default method for flattening OCCURS clauses and only requires that the above mentioned bit be set in the di_options column in the DMS_ITEM entry for the item with the OCCURS clause.
Flatten OCCURS to a new secondary table In the secondary table, all of the occurring items are mapped to a single row that contains the keys and all of the occurrences of the item. Use this method to flatten OCCURS clauses that have a large number of occurrences.

To make this happen you need to set the bit DIOPT_FlatSecondary(4096) in the di_options column in the DMS_ITEMS table for any items with an OCCURS clause that you want flattened in this manner. If both this bit and the DIOPT_Flatten_Occurs bit are set, this bit takes precedence.

When using the Administrative Console you can set the parameter flatten_all_occurs by using the Configure command in the Settings menu for the data source. To customize the handling of individual items with OCCURS clause use the Customize command in the Settings menu for the data source. Click on the data set involved and click on the wrench (properties) of the item in question in the DMS Items view and pick the entry you want from the drop down list for the "Flatten Occurs" property. The choices are "Don't Flatten", "Flatten Within Table", "Flatten into Secondary Table" and if applicable "Flatten to String".

Flattening OCCURS Clauses to a String

Single items of type NUMBER(n) or ALPHA(n) with an OCCURS clause can be flattened to a character string represented by a CHAR or VARCHAR data type. You can have fixed format strings or CSV format strings, where the delimiter character can be selected via the dms_subtype column in DMS_ITEMS. This feature is controlled by the DIOPT_Flatten2String bit in the di_options and the dms_subtype column. If the dms_subtype is 0, fixed format is used and if the dms_subtype is non-zero it specifies the delimiter character used in the CSV format. NULL data is represented by blanks in fixed format and empty fields in CSV format (i.e. two consecutive delimiters or a delimiter at the end of the data). For example a NUMBER(1) OCCURS 20 TIMES can be flattened to a column that is a CHAR(20) when using fixed format.

Flattening OCCURS Clause for Three-Bit Numeric Flags

MISER systems store certain flags as arrays of single-digit numbers, where each number is used to hold three Boolean values. The Databridge Client can be directed to map these items as a series of Booleans data items (bit in SQL Server). To do this, set the DIOPT_Flatten_Occurs bit (1) and the DIOPT_Clone_as_Tribit bit (16) in the di_options column of the corresponding DMS_ITEMS record.

An example for the item L-LOCK-FLAG in the data set LOAN follows:

Filename: script.user_layout.loan

  update DMS_ITEMS set di_options=17
  where dataset_name = 'LOAN' and rectype=0 and dms_item_name = 'L-LOCK-FLAG'
        and data_source = 'MISDB'

In the above example, if the L-LOCK_FLAG has an OCCURS 20 TIMES clause, 60 items of type bit named l_lock_flag_01 to l_lock_flag_60 are created.

These items can also be flattened to a secondary table by setting the bit DIOPT_FlatSecondary(4096) in the di_options column for the corresponding entry in the DMS_ITEMS table.


Flattening OCCURS Clause for Items Cloned as Dates

The following script directs the define and redefine commands to map an item with an OCCURS clause as a series of columns, whose data type is a relational database date type, in the corresponding table. Furthermore, it specifies that the DMSII item, which is of type NUMBER(8), contains a date in the mm/dd/yyyy format.

Filename: script.user_layout.billing

  update DMS_ITEMS set di_options=3, dms_subtype=23
  where dms_item_name = 'BILLING-DATES' and dataset_name = 'BILLING'


DMSII GROUP OCCURS

The following is an excerpt from a DMSII DASDL that shows a GROUP item that has an OCCURS clause.

SALES                           DATA SET
(
    PRODUCT-CODE                ALPHA (10);
    PRODUCT-NAME                ALPHA (20);
    SALES-HISTORY GROUP         OCCURS 5 TIMES  %FIVE YEAR HISTORY
    (
        TOTAL-UNITS-SOLD        NUMBER (10);    %FOR THE YEAR
        YEARLY-SALES-AMOUNT     NUMBER (S12,2); %BY MONTH
    );

);
SH-PRODUCT-CODE-SET SET OF SALES-HISTORY
   KEY IS
   (
        PRODUCT-CODE
   )
NO DUPLICATES,
INDEX SEQUENTIAL;

When this SALES data set is cloned into the relational database, it is mapped into the following tables:

Table 1 (primary table)

This table is named the same as the SALES DMSII data set, and it contains the key item and the data items that do not have OCCURS clauses. Because the GROUP item has an OCCURS clause, none of the GROUP items are included in this table. Assuming there are five records in the DMSII data set, there are also five rows in this relational database table.

sales  (table name)
product_code     product_name
------------     ------------
BC99992121       Widget
TR55553440       Mixer
HM44447322       Gadget
PP77778299       Twirler
DG22221163       SuperMix

Table 2 (secondary table)

This table is named: datasetname + GROUP_OCCURS_name

Assuming there are five records in the DMSII data set, there are 25 records in this relational database table. The main difference here is the addition of an index to denote the occurrence number of the item.

sales_sales_history (table name)

product_code   index1    total_units_sold   yearly_sales_amount
------------   ------    ----------------   -------------------
BC99992121          1            55543665          123456789.01
BC99992121          2            83746994          234567890.12
BC99992121          3            33847295          345678901.23
BC99992121          4            57483037          456789123.45
BC99992121          5            10947377          567891234.56
TR55553440          1            56722221          678912345.67
TR55553440          2            74838976          789123456.78
TR55553440          3            54793873          891234567.89
TR55553440          4            99048900          912345678.90
TR55553440          5            22308459          123456789.01
HM44447322          1            75032948          234567890.12
HM44447322          2            30750344          345678901.23
HM44447322          3            90570340          456789123.45
HM44447322          4            57948755          567891234.56
HM44447322          5            44874733          678912345.67
     .              .                   .                   .   
     .              .                   .                   .


DMSII Nested OCCURS

The following is an excerpt from a DMSII DASDL showing a GROUP with an OCCURS clause that contains an item with an OCCURS clause.

This example helps to reinforce the previous examples of how DMSII GROUP and OCCURS are mapped to a relational database.

SALES                    DATA SET
 PRODUCT-CODE            ALPHA (10);
 PRODUCT-NAME            ALPHA (20);
 SALES-HISTORY GROUP     OCCURS 5 TIMES %FIVE YEAR HISTORY
   (
   TOTAL-UNITS-SOLD      NUMBER (10); %FOR THE YEAR
   MONTHLY-SALES-AMOUNT  NUMBER (S12,2) OCCURS 12 TIMES;
   );
SH-PRODUCT-CODE-SET      SET OF SALES-HISTORY
   KEY IS
   (
   PRODUCT-CODE
   )
NO DUPLICATES,
INDEX SEQUENTIAL;

When this SALES data set is cloned into the relational database, it is mapped into the following three tables:

  • sales

    (primary table, table name derived from datasetname)

  • sales_sales_history

    (secondary table, table name derived from datasetname + GROUPOCCURSname)

  • sales_monthly_sales_amount

    (secondary table, table name derived from datasetname + OCCURSitemname)

Table 1

This table is named the same as the SALES DMSII data set.

It contains the key item and all non-OCCURS data items. Because the GROUP has an OCCURS clause, none of the GROUP items are included in this table. Assuming there are five records in the DMSII data set, there are five rows in the resulting relational database table.

sales  (table name)
product_code      product_name
------------      ------------
BC99992121        Widget
TR55553440        Mixer
HM44447322        Gadget
PP77778299        Twirler
DG22221163        SuperMix

Table 2

This table is named: datasetname + GROUP_OCCURS_name

Assuming there are five records in the DMSII data set, there are 25 rows in this table. Note the addition of the index1 column to denote the occurrence number of the group.

sales_sales_history (table name)

product_code   index1     total_units_sold
------------   ------     ----------------
BC99992121     1          55543665
BC99992121     2          98075300
BC99992121     3          77476478
BC99992121     4          76593939
BC99992121     5          33728282
TR55553440     1          87548974
TR55553440     2          56722221
TR55553440     3          11910078
TR55553440     4          47589474
TR55553440     5          57987999
HM44447322     1          75533785
HM44447322     2          33673391
HM44447322     3          74904532
HM44447322     4          98724498
HM44447322     5          39875992
   .           .             .    
   .           .             .

Table 3

This table is named: datasetname + OCCURSitemname

Assuming there are five records in the DMSII data set, there are 300 rows in this table (12 occurrences of monthly_sales_amount for each of 5 occurrences of sales_history for each product code). In the table below, index1 is the subscript of the GROUP OCCURS (1--5) and index2 is the subscript of the monthly sales amount, with subscripts (1--12).

In this example, the OCCURS level of the items MONTHLY-SALES-AMOUNT is 2, while the OCCURS level of the item SALES-HISTORY is 1.

sales_monthly_sales_amount (table name)
product_code   index1   index2   monthly_sales_amount
------------   ------   ------   --------------------
BCS9992121     1        1        1075.36
BCS9992121     1        2        49397.90
BCS9992121     1        3        49375.93
BCS9992121     1        4        22840.97
BCS9992121     1        5        38984.02
BCS9992121     1        6        40039.84
BCS9992121     1        7        33875.93
BCS9992121     1        8        35000.22
BCS9992121     1        9        65876.52
BCS9992121     1        10       20402.55
BCS9992121     1        11       17575.00
BCS9992121     1        12       41938.74
BCS9992121     2        1         . 
BCS9992121     2        2          . 
BCS9992121     2        3          . 
BCS9992121     2        4          . 
BCS9992121     2        5          . 
BCS9992121     2        6          . 
BCS9992121     2        7          . 
BCS9992121     2        8          . 
BCS9992121     2        9          . 
BCS9992121     2        10         . 
BCS9992121     2        11         . 
BCS9992121     2        12         . 
BCS9992121     3        1          . 
     .         .        .          . 
     .         .        .          . 
     .         .        .          . 
     .         .        .          .


OCCURS DEPENDING ON

DMSII uses the DEPENDING ON clause (usually with COMPACT data sets) to conserve disk space. For COMPACT data sets, the DMSII work area always contains a fully expanded version of the record; however, the record is compacted when it is stored on disk. The exact syntax for OCCURS DEPENDING ON clause is as follows:

    item_name OCCURS n TIMES DEPENDING ON depends_item_name;

The value n defines the maximum number of occurrences of the data item item_name, while the value of the depends item depends_item_name controls the number of occurrences of the item that are stored. This last number cannot exceed n. Information on an OCCURS DEPENDING ON clause is relayed to the Databridge Client, enabling the Databridge Client to suppress extraneous columns that do not actually exist. If the DEPENDS data item has a value of 3, and the OCCURS clause is OCCURS 10 TIMES, the last 7 columns are not included.


Handling Unflattened OCCURS DEPENDING ON Clauses

To handle a changing depends item, the Databridge Client uses before-image/after-image (BI/AI) pairs for data sets that have items with OCCURS DEPENDING ON clauses that are not flattened.

First, the Databridge Client checks the old and new values of the DEPENDS data item to determine how to execute the modify. The modify is handled in one of the following ways:

  • If the value of the DEPENDS data item is unchanged, the Databridge Client updates the corresponding rows in the secondary tables as usual. (Redundant updates are suppressed if the ds_options bit DSOPT_Use_bi_ai is set.)

  • If the value of the DEPENDS data item increases from m to n, the first m items are updated normally. The newly added items (m+1 through n) are inserted into the secondary table.

  • If the value of the DEPENDS data item decreases from m to n, the first n items are updated normally. Items that are no longer present (n+1 through m) are deleted from the secondary table.


Relational Database Split Tables

A split table occurs when a DMSII data set record requires more than one table in the relational database to hold the data. Split tables occur in the following circumstances:

  • When a table mapped from a DMSII data set has more than the maximum number of columns allowed by the relational database. The maximum_columns parameter in the configuration file allows you to reduce this value.

  • When a relational database table's record size exceeds the Microsoft SQL Server maximum record size (approximately 8K -- the actual value depends on the number of columns in the table).

When the define (or redefine) command reaches the point where one of the above conditions is satisfied, it stops adding columns to the table (named the same as the DMSII data set). It then starts a new table that contains the same keys as in the original record of the primary table, followed by the remaining items in the data set at the point the split occurred. Note that there is always the possibility of having multiple splits for data sets that have a large number of columns. The flattening of OCCURS items can easily lead to split tables.

A split can occur in the middle of flattening an OCCURS clause, which can be rather awkward. In order to better control where a split occurs we added the di_options2 bit DIOPT_Split_Here (4) that forces the table split to occur after the item in question is processed.

Note

When a DMSII data set is split into more than one relational database table, a WARNING message appears during a define or redefine command. In addition, each split table duplicates the keys in the original table.


Split Table Names

The new table is named using the original (parent) table name with a number (usually 1) appended to it to make it unique. All subsequent tables created from the same data set have the original table name with a numeric suffix that is incremented by 1 each time a new split table is created.


Keys for Split Tables

For a data set with keys, the keys of the original data set are duplicated in the split tables because you must access each of these tables individually. The process of splitting the data set into tables continues until there are no more data items left in the data set.

The following examples show the mapping of a data set that has 600 items (5 of which are keys) to a relational database that limits the number of columns in a table to 250. The result is tables that contain a total of 610 columns, where the 5 keys are duplicated across all 3 tables. If the original table is named savings, the remaining two tables are named savings1 and savings2, unless these names are already in use.

tablename tablename1 tablename2
250 columns (first 5 are keys) 5 keys and 245 columns 5 keys and 105 columns

The five keys are duplicated in each table. To search these split tables, you must explicitly open each table. The tables are not automatically linked.


Relational Database Table and Column Names

When you clone a DMSII database, the Databridge Client names the relational database tables and columns the same as their equivalent DMSII data sets and data items. However, some differences exist. In this section, the differences between the names are explained.


Uppercase and Lowercase

All DMSII data set, data item, and set names are uppercase. These names are also stored in uppercase in the DATASETS and DMS_ITEMS Client control tables. Their equivalent relational database table, column, and index names are stored in lowercase in the DATATABLES and DATAITEMS Client control tables.

  • All DMSII data set names are stored in the DATASETS Client control table in uppercase, just as they appear in the DMSII database. The equivalent relational database table name is converted to lowercase and is stored in the DATATABLES Client control table. Thus, a data set named CREDIT in the DMSII database is named credit in the relational database.

  • All DMSII data item names are stored in the DMS_ITEMS Client control table in uppercase, just as they appear in the DMSII database. The equivalent relational database data item name is converted to lowercase and is stored in the DATAITEMS Client control table. Thus, a data item named LIMIT in the DMSII database is named limit in the relational database.

    Note

    You must type these names in the correct case. If you are using the relational database table name as a character string value in a SQL statement (for example, 'tablename'), you must use lowercase.


Hyphens and Underscores

The hyphen (-) in the DMSII name becomes an underscore (_) in the relational database name. The only exception is a data source name that is allowed to contain hyphens.


Name Length

The limit for a DMSII data set name is 17 characters, and DMSII item name is limited to 128 characters. Relational databases typically limit table names to 30 characters; however, the Databridge Client reserves two characters for the prefix of the stored procedure names for updating the table (i_ tablename, d_ tablename, u_ tablename). Thus, the table names are actually limited to 28 characters. Similarly, the Databridge Client adds a one or two character prefix to the item names to create a unique name for the parameters of the stored procedures. The Databridge Client for Microsoft SQL Server uses a prefix of @ while the Databridge Client for Oracle uses a prefix of p_. To avoid using names that are too long for the relational database, items names are limited to 29 characters for SQL Server or 28 characters for Oracle.

With this limit of 28 characters for a table name, typically all the DMSII names fit into the relational database table name or column name. In cases where data set, data item, or other structure names are concatenated and therefore become too long for a relational database, the Databridge Client truncates the name.


Duplicate Names

If two data sets have the same name in two different DMSII databases (or data sources, from the Client perspective), the Databridge Client appends the number 1 to the duplicate table name the first time it is encountered. If a table already exists with the duplicate name with "1" appended to it, the Databridge Client appends the number "2" instead and so on until a unique table name is created.

For example, if DMSII database A has a data set named PRODUCTS and DMSII database B also has a data set named PRODUCTS, the resulting Databridge table names would be products and products1.

If you combine this duplicate data set name convention with the convention for naming split tables (when one data set results in more than one table), you can have multiple suffixes for short names.

For example, if you have two data sources with a data set named CUSTOMER, which also generates split tables, the tables are renamed as follows:

  • customers and customers1in the first data source

  • customers11and customers12in the second data source (as the primary table was renamed customers1)

Duplicate item names may result in the following cases:

  • When you use the same name for items in two different GROUPs. DMSII allows this, but the Databridge Client ignores GROUPs.

  • When you truncate two long DMSII item names that are almost identical

The Databridge Client handles duplicate item names the same way that it handles duplicate table names.


Reserved Keywords

You cannot use reserved keywords for relational database object (table, column, index, etc.) names. For example, "order" is an SQL keyword; therefore, you cannot rename a relational database table or column as "order".

If an existing DMSII data set is named ORDER, the Databridge Client stores ORDER in the DATASETS Client control table and an equivalent relational database table called "order_x" in the DATATABLES Client control table. This same convention of adding "_x" to rename a table whose name is a reserved word applies to DMSII data items. For information on reserved words in your relational database, see the related database documentation.

The SQL Server Client allows you to use reserved words as object names, as long you enclose them in square brackets in SQL statement. The configuration file parameter bracket_tabnames allow you use reserved words like "order" as table names. Setting this parameter to True makes the Client use brackets around table names that are reserved words.