Data Mapping
This chapter shows you how the Databridge Client maps DMSII data structures to relational database structures.
DMSII and Relational Database Terms
The following table shows the equivalent terms for DMSII structures and relational database structures:
DMSII | Relational |
---|---|
Data set | Table |
DMS item (data item) | Column |
Record | Row (record) |
Set | Index NOTE: A relational database index is a set of column names that is used to efficiently access a row (of a table). |
Key | Key |
DMSII and Relational Database Data Types
The Databridge Engine retrieves the requested DMSII data, and DBServer passes the data to the Databridge Client, where it is assigned to standard relational database data types.
The following table lists equivalent data types for DMSII, Microsoft SQL Server, and Oracle.
DMSII | SQL Server | Oracle |
---|---|---|
ALPHA (<= char_limit bytes) | CHAR * | CHAR + |
ALPHA (>char_limit bytes) | TEXT | VARCHAR2 |
ALPHA (>varchar2_limit_bytes) | TEXT | CLOB |
BOOLEAN | BIT | NUMBER(1) |
FIELD: FIELD(n) where n is < 16 FIELD(n) where n is < 32 FIELD(n) where n >= 32 |
SMALLINT INT BIGINT ‡ |
NUMBER(5) NUMBER(10) NUMBER(10-15) |
NUMERIC: NUMBER(n) where n is a DMSII declared length <= 2 NUMBER(n) where n is a DMSII declared length <= 4 NUMBER(n) where n is a DMSII declared length <= 9 NUMBER(n) where n is a DMSII declared length <= 15 NUMBER(n) where n is a DMSII declared length >15 NUMBER(n,m) where n is a DMSII declared length and m is the number of places after the decimal point. |
TINYINT § SMALLINT INT BIGINT ‡ DEC(n) DEC(n,m) |
NUMBER(n) NUMBER(n) NUMBER(n) NUMBER(n) NUMBER(n) NUMBER(n,m) |
REAL: REAL(n) where n is a DMSII declared length <= 2 REAL(n) where n is a DMSII declared length <= 4 REAL(n) where n is a DMSII declared length <= 9 REAL(n) where n is a DMSII declared length > 9 REAL(n,m) where n is the DMSII declared length and m is the number of places after the decimal point. REAL, with no precision or scale |
TINYINT § SMALLINT INT DEC DEC(n,m) FLOAT |
NUMBER(n) NUMBER(n) NUMBER(n) NUMBER(n) NUMBER(n,m) FLOAT |
*
VARCHAR if the configuration parameter use_varchar
is set to True.
† VARCHAR2 if the configuration parameter use_varchar
is set to True.
‡ If the configuration parameter use_bigint
is set to False, DEC(n) will
be used instead.
§ Note that if the number is signed, SMALLINT is used instead. TINYINT is an unsigned quantity in SQL Server.
Databridge Data Types
IMAGE is a Databridge type that allows you to store an ALPHA item as binary by using the REDEFINE clause of the ALTER command in GenFormat on the host.
When the Databridge Client encounters an item of type IMAGE, it
automatically sets the ds_options
DIOPT_Clone_as_Binary bit in the
DMS_ITEMS Client control table.
Supported DMSII Structures
This section lists DMSII structures that are supported by the Databridge Client. If you are the relational database administrator and have no experience with DMSII databases, this section will be more useful to you if you are working with a DMSII database administrator.
In addition to fixed-format data sets and variable-format data sets, the Databridge Client supports the following DMSII structures:
-
Embedded data sets (see exceptions in the following section)
-
Remaps
-
Logical database
-
GROUP
-
FIELD items for GROUPs of BOOLEANS
-
OCCURS
-
GROUP OCCURS
-
Data sets with more items than the maximum number of columns supported by the relational database
-
Data sets that generate relational tables whose record sizes exceed the Microsoft SQL Server maximum record size.
-
DMSII links in DIRECT, UNORDERED, and STANDARD data sets. (Links in variable-format data sets are cloned but not tracked.)
Some of these structures may not be supported by your relational database. The DMSII structures that are not supported by relational databases are mapped into a form that the relational database can use. Each of these structures and the way they are mapped in the relational database are explained in the remainder of this chapter.
Unsupported DMSII Structures
When the Databridge host replication software does not support a particular DMSII structure, the Databridge Client may or may not issue a warning message, depending on the DMSII structure. For example, a message is generated when the data set has no keys.
The Databridge Client does not support the following DMSII structures:
-
Embedded data sets within an ORDERED or COMPACT data set
-
Embedded data sets if the INDEPENDENTTRANS option is reset
-
POPULATION items
-
COUNT data items
-
FILLER data items
-
AGGREGATE data items
Embedded Data Sets
An embedded data set is a DMSII representation of a hierarchical relationship or tree structure. When a DMSII data set contains another data set as an item, that data set is called an embedded data set. The data set in which it is declared is called the parent of the embedded structure. You can think of the embedded data set as the "child" of the "parent" data set.
To represent this parent-child relationship in a relational database,
the Databridge Client uses a foreign key that points to the parent data
set. This foreign key is represented by the value in the parent_aa
column in the table that corresponds to the embedded data set. The
parent_aa
column holds the parent record's key.
DMSII DASDL Showing an Embedded Data Set
The following is an excerpt from a DMSII DASDL that shows how an embedded data set is defined.
GENEALOGY DATA SET
(
PARENT-FAT-NAME ALPHA (30);
PARENT-MOT-NAME ALPHA (30);
PARENT-MOT-MAIDEN ALPHA (30);
PARENT-FAT-BDATE NUMBER (06);
PARENT-MOT-BDATE NUMBER (06);
FILLER SIZE (06);
%
CHILD DATA SET;
(
CHILD-NAME ALPHA (30);
CHILD-STATUS ALPHA (11);
CHILD-BDATE NUMBER (06);
CHILD-GENDER FIELD
(
CHILD-MALE BOOLEAN;
CHILD-FEMALE BOOLEAN;
);
CHILD-FILLER FIELD (01);
%
);
);
Resulting Tables
The following examples are for Microsoft SQL Server.
Ignoring any set definition, the resulting relational database tables are as follows:
-
Genealogy (the parent data set is cloned to its own primary table)
-
Child (the embedded data set is cloned to its own secondary table with a pointer to its parent table)
Parent Table: genealogy (table name). Where the my_aa
column is a unique key for the record derived from the DMSII AA Value of this record.
(
my_aa char(12),
parent_fat_name char(30),
parent_mot_name char(30),
parent_mot_maiden char(30),
parent_fat_bdate int,
parent_mot_bdate int
)
Child Table: child (table name)
(
my_aa char(12), - child table's key
parent_aa char(12), - foreign key of parent table
child_name char(30),
child_status char(11),
child_bdate int,
child_male bit,
child_female bit,
child_filler smallint
)
Selecting Embedded Data Sets for Cloning
When you run a clone
command, by default, the Databridge Client
selects embedded data sets along with the parent structures. If you have
altered the values for the active column in the Databridge Client
control tables, however, check the values for the embedded data set and
its parent data set.
Caution
If you want to clone an embedded data set, you must also clone the parent structure. Failure to do this results in the following error message from the Databridge Engine on the host:0043 Parent of embeddeddataset must be selected.
Record Serial Numbers
A record serial number (RSN) is a unique number (48-bits long) that is associated with a record in a data set. The RSN is guaranteed to be unique, and it stays with a record for the life of the record. Updates do not affect the RSN; therefore, RSNs are ideal keys for Databridge tables. However, RSNs are available only when using DMSII XE.
DMSII XE adds the RSN to every record of a data set that has the EXTENDED attribute set. As long as the EXTENDED attribute is set, Databridge can access the RSN, unlike application programs that can access the RSN only if it has been explicitly declared in the DASDL source. The Databridge Engine is designed to use RSNs instead of AA Values whenever possible. Regardless of whether RSNs are present, AA Values are used for data sets that contain embedded data sets or DMSII links.
Since RSNs and AA Values are the same length, they are interchangeable,
unless the data sets are embedded data sets or contain DMSII links. If
the Databridge Engine is not handling either of these types of data sets
and an RSN is present, Databridge Engine uses the RSN rather than the
AA Value. In this case, the resulting column of the DATAITEMS Client
control table is named my_rsn
instead of my_aa
to differentiate it from
an AA Value. In both cases, the set_name
column of the DATASETS Client
control table will have a value of "aa_set".
AA Values
AA is a DMSII term that stands for absolute address. An absolute address value is an A Series WORD (48-bits in length). In the Databridge Client, AA is the hexadecimal representation (12 character strings containing the characters 0--9 and A--F) of the AA Value on the host. Databridge Client uses the AA Values to implement unique keys for the parent structures of embedded data set records. It also uses AA Values to reference the records of data sets that do not have DMSII SETS with the NO DUPLICATES ALLOWED attribute.
AA Values are not constant. Any DMSII reorganization (record conversion, file format, or garbage collection) changes these values.
Note
Databridge Client supports numeric AA Values that are stored as NUMBER(15) in Oracle and BIGINT in SQL Server. It also supports binary AA Values that are stored as RAW(6) in Oracle and BINARY(6) in SQL Server.
DMSII Links
The Databridge Client implements DMSII link items, such as MEMO items in LINC databases, using an AA Value. You can use the AA Value as a foreign key when you want data from the tables associated with the link item. To enable support for DMSII links, you must do the following:
-
Enable DMSII link support in the Databridge Engine control file.
-
Set the Client configuration file parameter
enable_dms_links
to True.
Variable-Format Data Sets
DMSII variable-format data sets consist of a collection of dissimilar records containing a fixed part that is common to all records, and a variable part that depends on the record type. The individual records are identified by the record type, which is a special data item that has a value between 0 and 254.
-
A record type of 0 indicates that the record does not contain a variable part.
-
A record type of 1--254 indicates that the record contains the corresponding variable part in addition to the fixed part that is always present.
The concept of variable-format tables does not exist in relational databases. Therefore, the Databridge host software handles the various types of variable-format records as different structures. Databridge references these structures by a data set name and a record type (all other data set types have a record type value of 0). The Databridge Client uses the notation datasetname/rectype when referring to variable-format data sets in all messages. The Databridge Client handles these structures as if they were logical data sets; thus, each individual record type of a variable-format data set is mapped to a different relational database table. Variable-format data sets are tracked and updated like fixed-format data sets. Links in variable-format data sets, however, are not tracked, but retain their values from the initial clone. The link values for any records created after the clone will be null. (See track_vfds_nolinks.)
Fixed Part Only Records | Even though type 0 records are not explicitly declared in the DMSII DASDL, applications can create such records by simply setting the record type to 0. The Databridge software always defines a data set with a record type of 0 (rectype column in the DATASETS Client control table) for type 0 records of variable-format data sets. This data set is mapped to a table whose name is derived from the data set name (that is, name converted to lowercase and all dashes replaced by underscores).Note that unless the DMSII applications explicitly store type 0 records in the data set, this table may be empty. If you know this is the case, you may want to disable the cloning of the type 0 records for the data set by setting the active column of the corresponding entry in the DATASETS Client control table to 0. |
Variable Format Records | All other record types are treated as if they were contained in a separate structure. The primary tables for these structures are named by appending the suffix "_type#* to the name mapped from the data set name, where # is the decimal value of the record type (that is, a value between 1 and 254). Note that the fixed part of the variable-format records and the record type are also stored in the resulting relational database table. |
DMSII DASDL Showing Fixed- and Variable-Length Records
MAIN DATA SET
(
VAR-REC RECORD TYPE (3);
CUST-NO NUMBER (08);
CPU ALPHA (06);
FILLER SIZE (05);
),
%
1:
(
SMSA ALPHA (04);
SALES ALPHA (06);
)
%
2:
(
STATUS ALPHA (01);
RECEIVED ALPHA (06);
ORDER-DATE NUMBER (08);
)
%
3:
(
SITE ALPHA (07);
SOURCE ALPHA (07);
CLASS ALPHA (02);
);
%
MAINSET SET OF MAIN
KEY (CUST-NO)
NO DUPLICATES;
Resulting Tables
The examples in this section assume that the Microsoft SQL Server database is being used and that the Client uses the SET MAINSET as the source for index for the various tables.
The following tables are derived from the variable-format data set MAIN:
-
main (type 0 records)
-
main_type1 (type 1 records)
-
main_type2 (type 2 records)
-
main_type3 (type 3 records)
Note
All four tables contain the fixed part of the data set. The var_rec column is the record type; all records in the individual tables will have the same value in this field.
Record Type 0 Table
The table named main represents all type 0 records that do not have a variable part. The var_rec column of all records in this table will have a value of 0. Note that this table may be empty if your application does not use type 0 records. The SQL statement to create this table is shown as follows:
create table main
(
cust_no int,
var_rec smallint,
cpu char(6)
)
Record Type 1 Table
The table named main_type1 represents all type 1 records. The var_rec column of all records in this table will have a value of 1. The SQL statement to create this table is shown as follows:
create table main_type1
(
cust_no int,
var_rec smallint,
cpu char(6),
smsa char(4),
sales char(6)
)
Record Type 2 Table
The table named main_type2 represents all type 2 records. The var_rec column of all records in this table will have a value of 2. The SQL statement to create this table is shown as follows:
create table main_type2
(
cust_no int,
var_rec smallint,
cpu char(6),
status char(1),
received char(6),
order_date int
)
Record Type 3 Table
The table named main_type3 represents all type 3 records. The var_rec column of all records in this table will have a value of 3. The SQL statement to create this table is shown as follows:
create table main_type3
(
cust_no int,
var_rec smallint,
cpu char(6),
site char(7),
source char(7),
class char(2)
)
Split Variable Format Data Sets Option
When the ds_options
bit DSOPT_Split_Varfmt_ds (bit value 65536) is set,
variable format data sets are treated slightly differently. The record
type 0 tables contains the fixed part of all records regardless of their
record types. However, the table has exactly the same layout as above.
The tables for all the other records only contain the variable part of
the records and the keys from the fixed part.
The table named main_type1 in the above example will now contain the key cust_no and the variable part. The SQL statement to create this table is shown as follows:
create table main_type1 ( cust_no int, smsa char(4), sales char(6) )
Changing the Default Data Type
In most cases, the default data types are sufficient. If you want to
change the data type, however, use a relational database query tool to
edit the sql_type
column in the DATAITEMS Client control table, or put
the SQL statements in user scripts as explained in Customizing with
User Scripts.
Caution
When changing the default data type, make sure that you choose a correct data type or the data may not be correctly stored in the relational database.
Most of these relational database data types can be changed using data table customization user scripts or the Administrative Console's Customize command is the Settings menu for the data source.
Value for sql_type | Generic Data Type | Microsoft SQL Server Data Type | Oracle Data Type |
---|---|---|---|
0 | bit | bit | number(1) |
1 | char | char | char |
2 | varchar | varchar | varchar2 |
3 | byte | tinyint | number(3) |
4 | short int | smallint | number(5) |
5 | long int | int | number(10) |
6 | float | float | float |
7 | text | text | clob |
8 | binary | binary | raw |
9 | varbinary | varbinary | raw |
10 | datetime | datetime | date |
11 | packed BCD | dec | number |
12 | smalldatetime | smalldatetime | date |
13 | numeric date | int | number(10) |
14 | unsigned long | binary(4) | raw(4) |
15 | timestamp | timestamp | N/A |
16 | serial | {int - bigint - dec(n) - identity} | N/A |
17 | numeric_time | int | number(6) |
ticks | int | number(6) or number(10) | N/A |
18 | int64 | bigint | NA |
19 | date | date | NA |
20 | datetime2 | datetime2 | timestamp |
21 | time | time | N/A |
22 | uniqueidentifier | uniqueidentifier | N/A |
Handling DMSII GROUPs
A GROUP is a DMSII construct that allows the data items that belong to the group to be referenced at one time (for example, as one item). The concept of GROUP does not exist in a relational database. Therefore, if the DMSII database you replicate has one or more GROUPs, the Databridge Client ignores the GROUP name and instead treats each item within the GROUP as a regular data item. All items in a DMSII GROUP share the same parent item number, which is the item number of the GROUP item.
Following is an example of the DMSII GROUP item in the data set called ADDRESS. This GROUP item consists of the data item CITY and the data item STATE.
DMSII DASDL Showing GROUP
The following is an excerpt from a DMSII DASDL that shows how a GROUP item is defined. With the GROUP item, you can access both city and state with one reference.
ADDRESS DATA SET
(
STREET ALPHA (20);
APARTMENT ALPHA (5);
CITY-STATE GROUP
(
CITY ALPHA (20);
STATE ALPHA (2);
);
COUNTRY ALPHA (20);
ZIPCODE NUMBER (5);
POSTFIX NUMBER (4);
);
The next example shows how the same DMSII GROUP item is mapped to a relational database.
Relational Database Table
The following example is for Microsoft SQL Server.
The table name is the lowercase form of the DMSII data set name. The GROUP item CITY-STATE is ignored. The data items in that group are included in the relational database table as if they were ordinary DMSII data items.
address (table name)
street apartment city state country zipcode postfix
May St. 3 Paris OH USA 15010 2146
Elm Ln. River SD USA 24906 3381
If there are duplicate names among members of various groups within a data set, the Databridge Client resolves the conflict by appending a digit to the column name to make it unique.
Handling DMSII OCCURS
An OCCURS clause is a DMSII construct that describes the number of times an item is present or repeats within a data set. Because relational databases do not support the OCCURS construct, these clauses generate additional tables, which can degrade the performance of update processing.
You can control how items with an OCCURS clause are mapped on an item by item basis by flattening OCCURS. See Flattening OCCURS Clauses.
Default OCCURS Handling
If you don't flatten OCCURS, Databridge Client creates a new table for
each data item that contains an OCCURS clause. The keys from the data
item's parent data set are used as keys in the new table. In addition, a
new key (named index1
) is created to establish a unique composite key
for each recurring data item.
For example, a DMSII data set has a data item with an OCCURS clause will result in two relational database tables:
-
The first table (called the primary table) is named using the lowercase form of the DMSII data set name with all hyphens changed to underscores. It contains the key items as well as all data items that do not have OCCURS clauses.
-
The second table (called the secondary table) is named by appending an underscore and the data item name to the primary table name. This table contains all of the OCCURS items; however, each table has a unique key created by
index1
. (Names that exceed the character limit are truncated. If the truncation results in a duplicate item names, the last characters of the name are changed to digits).
Handling OCCURS items this way can significantly degrade the performance of update processing if the number of occurrences is large. The storage required to hold the keys of the secondary table items can also be substantial. For example, an OCCURS 100 TIMES clause can turn a single DMSII update into 101 relational database updates. See DMSII DASDL with OCCURS for an example of a DMSII data set that has a data item with an OCCURS clause.
DMSII DASDL with OCCURS
The following excerpt from a DMSII DASDL shows how an OCCURS clause is defined.
ORDERS DATA SET
(
ORDER-ID ALPHA (4);
ORDER-DATE ALPHA (5);
ORDER-ITEM OCCURS 10 TIMES NUMBER (8);
);
BY-ORDER-ID SET OF ORDERS
KEY IS
(
ORDER-ID
)
NO DUPLICATES,
INDEX SEQUENTIAL;
The OCCURS clause allows access by subscripting (indexing) within an application program. Because relational databases do not allow subscripting (indexing), the Databridge Client maps the subscript into an additional key. The OCCURS items, then, are available by row.
When this ORDERS data set is cloned into the relational database, it is mapped into the following two tables. These tables show how the DMSII OCCURS clause appears in a relational database.
Table 1
This table is named the same as the ORDERS DMSII data set, and it contains the key item plus all non-OCCURS items. Assuming the ORDERS DMSII data set has 50 records, this table has 50 rows.
orders (table name)
order_id order_date
-------- ----------
1201 jan12
. .
. .
. .
1250 feb12
Table 2
This table name combines the DMSII data set name and the name of the data item which has an OCCURS clause. It contains all the occurrences of the OCCURS data item ORDER-NUM.
Continuing with the example from Table 1 with 50 records (rows), this table has 500 total rows. For every order_id key (50 total), there are ten OCCURS items (as declared in the DASDL on the previous page).
orders_order_item (table name)
order_id index1 order_item
-------- ------ ----------
1201 1 00007390
1201 2 00001293
1201 3 00007748
1201 4 00009856
1201 5 00003736
1201 6 00002278
1201 7 00004327
1201 8 00009463
1201 9 00008638
1201 10 00008954
1202 1 00001754
1202 . 00005309
1202 . 00004537
1202 10 00005940
1203 1 00005430
1203 . 00005309
1203 . 00004537
1203 10 00006587
. . .
. . .
. . .
Flattening OCCURS Clauses
The flatten_all_occurs
parameter makes the define
and
redefine
commands set the value of the bit DIOPT_Flatten_Occurs (1) in
the di_options
column in the DMS_ITEMS table for all items that have
OCCURS clauses. You can set this parameter from the Client Configurator
or by editing the configuration file to specify whether to globally
flatten OCCURS clauses for a data source. By using user scripts, you can
control this option for individual items.
The Databridge Client provides two options for handling OCCURS clauses.
Flatten OCCURS to the primary table | Each occurrence of the item is mapped into a separate column in the primary table. Use this method if the number of occurrences is not too large and applications access the occurring items by column name (versus numeric index). This is the default method for flattening OCCURS clauses and only requires that the above mentioned bit be set in the di_options column in the DMS_ITEM entry for the item with the OCCURS clause. |
Flatten OCCURS to a new secondary table | In the secondary table, all of the occurring items are mapped to a single row that contains the keys and all of the occurrences of the item. Use this method to flatten OCCURS clauses that have a large number of occurrences. To make this happen you need to set the bit DIOPT_FlatSecondary(4096) in the di_options column in the DMS_ITEMS table for any items with an OCCURS clause that you want flattened in this manner. If both this bit and the DIOPT_Flatten_Occurs bit are set, this bit takes precedence. |
When using the Administrative Console you can set the parameter flatten_all_occurs
by using the Configure command in the Settings menu for the data source.
To customize the handling of individual items with OCCURS clause use the Customize command in the Settings menu for the data source. Click on the data set
involved and click on the wrench (properties) of the item in question in the DMS Items view and pick the entry you want from the drop down list for the "Flatten Occurs" property.
The choices are "Don't Flatten", "Flatten Within Table", "Flatten into Secondary Table" and if applicable "Flatten to String".
Flattening OCCURS Clauses to a String
Single items of type NUMBER(n) or ALPHA(n) with an OCCURS clause can be
flattened to a character string represented by a CHAR or VARCHAR data
type. You can have fixed format strings or CSV format strings, where the
delimiter character can be selected via the dms_subtype
column in
DMS_ITEMS. This feature is controlled by the DIOPT_Flatten2String bit in
the di_options
and the dms_subtype
column. If the dms_subtype
is 0,
fixed format is used and if the dms_subtype
is non-zero it specifies the
delimiter character used in the CSV format. NULL data is represented by
blanks in fixed format and empty fields in CSV format (i.e. two
consecutive delimiters or a delimiter at the end of the data). For
example a NUMBER(1) OCCURS 20 TIMES can be flattened to a column that is
a CHAR(20) when using fixed format.
Flattening OCCURS Clause for Three-Bit Numeric Flags
MISER systems store certain flags as arrays of single-digit numbers,
where each number is used to hold three Boolean values. The Databridge
Client can be directed to map these items as a series of Booleans data
items (bit in SQL Server). To do this, set the DIOPT_Flatten_Occurs bit
(1) and the DIOPT_Clone_as_Tribit bit (16) in the di_options
column of
the corresponding DMS_ITEMS record.
An example for the item L-LOCK-FLAG in the data set LOAN follows:
Filename: script.user_layout.loan
update DMS_ITEMS set di_options=17
where dataset_name = 'LOAN' and rectype=0 and dms_item_name = 'L-LOCK-FLAG'
and data_source = 'MISDB'
In the above example, if the L-LOCK_FLAG has an OCCURS 20 TIMES clause, 60 items of type bit named l_lock_flag_01 to l_lock_flag_60 are created.
These items can also be flattened to a secondary table by setting the
bit DIOPT_FlatSecondary(4096) in the di_options
column for the
corresponding entry in the DMS_ITEMS table.
Flattening OCCURS Clause for Items Cloned as Dates
The following script directs the define
and redefine
commands to map
an item with an OCCURS clause as a series of columns, whose data type is
a relational database date type, in the corresponding table.
Furthermore, it specifies that the DMSII item, which is of type
NUMBER(8), contains a date in the mm/dd/yyyy format.
Filename: script.user_layout.billing
update DMS_ITEMS set di_options=3, dms_subtype=23
where dms_item_name = 'BILLING-DATES' and dataset_name = 'BILLING'
DMSII GROUP OCCURS
The following is an excerpt from a DMSII DASDL that shows a GROUP item that has an OCCURS clause.
SALES DATA SET
(
PRODUCT-CODE ALPHA (10);
PRODUCT-NAME ALPHA (20);
SALES-HISTORY GROUP OCCURS 5 TIMES %FIVE YEAR HISTORY
(
TOTAL-UNITS-SOLD NUMBER (10); %FOR THE YEAR
YEARLY-SALES-AMOUNT NUMBER (S12,2); %BY MONTH
);
);
SH-PRODUCT-CODE-SET SET OF SALES-HISTORY
KEY IS
(
PRODUCT-CODE
)
NO DUPLICATES,
INDEX SEQUENTIAL;
When this SALES data set is cloned into the relational database, it is mapped into the following tables:
Table 1 (primary table)
This table is named the same as the SALES DMSII data set, and it contains the key item and the data items that do not have OCCURS clauses. Because the GROUP item has an OCCURS clause, none of the GROUP items are included in this table. Assuming there are five records in the DMSII data set, there are also five rows in this relational database table.
sales (table name)
product_code product_name
------------ ------------
BC99992121 Widget
TR55553440 Mixer
HM44447322 Gadget
PP77778299 Twirler
DG22221163 SuperMix
Table 2 (secondary table)
This table is named: datasetname + GROUP_OCCURS_name
Assuming there are five records in the DMSII data set, there are 25 records in this relational database table. The main difference here is the addition of an index to denote the occurrence number of the item.
sales_sales_history (table name)
product_code index1 total_units_sold yearly_sales_amount
------------ ------ ---------------- -------------------
BC99992121 1 55543665 123456789.01
BC99992121 2 83746994 234567890.12
BC99992121 3 33847295 345678901.23
BC99992121 4 57483037 456789123.45
BC99992121 5 10947377 567891234.56
TR55553440 1 56722221 678912345.67
TR55553440 2 74838976 789123456.78
TR55553440 3 54793873 891234567.89
TR55553440 4 99048900 912345678.90
TR55553440 5 22308459 123456789.01
HM44447322 1 75032948 234567890.12
HM44447322 2 30750344 345678901.23
HM44447322 3 90570340 456789123.45
HM44447322 4 57948755 567891234.56
HM44447322 5 44874733 678912345.67
. . . .
. . . .
DMSII Nested OCCURS
The following is an excerpt from a DMSII DASDL showing a GROUP with an OCCURS clause that contains an item with an OCCURS clause.
This example helps to reinforce the previous examples of how DMSII GROUP and OCCURS are mapped to a relational database.
SALES DATA SET
PRODUCT-CODE ALPHA (10);
PRODUCT-NAME ALPHA (20);
SALES-HISTORY GROUP OCCURS 5 TIMES %FIVE YEAR HISTORY
(
TOTAL-UNITS-SOLD NUMBER (10); %FOR THE YEAR
MONTHLY-SALES-AMOUNT NUMBER (S12,2) OCCURS 12 TIMES;
);
SH-PRODUCT-CODE-SET SET OF SALES-HISTORY
KEY IS
(
PRODUCT-CODE
)
NO DUPLICATES,
INDEX SEQUENTIAL;
When this SALES data set is cloned into the relational database, it is mapped into the following three tables:
-
sales
(primary table, table name derived from datasetname)
-
sales_sales_history
(secondary table, table name derived from datasetname + GROUPOCCURSname)
-
sales_monthly_sales_amount
(secondary table, table name derived from datasetname + OCCURSitemname)
Table 1
This table is named the same as the SALES DMSII data set.
It contains the key item and all non-OCCURS data items. Because the GROUP has an OCCURS clause, none of the GROUP items are included in this table. Assuming there are five records in the DMSII data set, there are five rows in the resulting relational database table.
sales (table name)
product_code product_name
------------ ------------
BC99992121 Widget
TR55553440 Mixer
HM44447322 Gadget
PP77778299 Twirler
DG22221163 SuperMix
Table 2
This table is named: datasetname + GROUP_OCCURS_name
Assuming there are five records in the DMSII data set, there are 25 rows
in this table. Note the addition of the index1
column to denote the occurrence
number of the group.
sales_sales_history (table name)
product_code index1 total_units_sold
------------ ------ ----------------
BC99992121 1 55543665
BC99992121 2 98075300
BC99992121 3 77476478
BC99992121 4 76593939
BC99992121 5 33728282
TR55553440 1 87548974
TR55553440 2 56722221
TR55553440 3 11910078
TR55553440 4 47589474
TR55553440 5 57987999
HM44447322 1 75533785
HM44447322 2 33673391
HM44447322 3 74904532
HM44447322 4 98724498
HM44447322 5 39875992
. . .
. . .
Table 3
This table is named: datasetname + OCCURSitemname
Assuming there are five records in the DMSII data set, there are 300
rows in this table (12 occurrences of monthly_sales_amount for each of 5
occurrences of sales_history for each product code). In the table below,
index1
is the subscript of the GROUP OCCURS (1--5) and index2
is the
subscript of the monthly sales amount, with subscripts (1--12).
In this example, the OCCURS level of the items MONTHLY-SALES-AMOUNT is 2, while the OCCURS level of the item SALES-HISTORY is 1.
sales_monthly_sales_amount (table name)
product_code index1 index2 monthly_sales_amount
------------ ------ ------ --------------------
BCS9992121 1 1 1075.36
BCS9992121 1 2 49397.90
BCS9992121 1 3 49375.93
BCS9992121 1 4 22840.97
BCS9992121 1 5 38984.02
BCS9992121 1 6 40039.84
BCS9992121 1 7 33875.93
BCS9992121 1 8 35000.22
BCS9992121 1 9 65876.52
BCS9992121 1 10 20402.55
BCS9992121 1 11 17575.00
BCS9992121 1 12 41938.74
BCS9992121 2 1 .
BCS9992121 2 2 .
BCS9992121 2 3 .
BCS9992121 2 4 .
BCS9992121 2 5 .
BCS9992121 2 6 .
BCS9992121 2 7 .
BCS9992121 2 8 .
BCS9992121 2 9 .
BCS9992121 2 10 .
BCS9992121 2 11 .
BCS9992121 2 12 .
BCS9992121 3 1 .
. . . .
. . . .
. . . .
. . . .
OCCURS DEPENDING ON
DMSII uses the DEPENDING ON clause (usually with COMPACT data sets) to conserve disk space. For COMPACT data sets, the DMSII work area always contains a fully expanded version of the record; however, the record is compacted when it is stored on disk. The exact syntax for OCCURS DEPENDING ON clause is as follows:
item_name OCCURS n TIMES DEPENDING ON depends_item_name;
The value n defines the maximum number of occurrences of the data item item_name, while the value of the depends item depends_item_name controls the number of occurrences of the item that are stored. This last number cannot exceed n. Information on an OCCURS DEPENDING ON clause is relayed to the Databridge Client, enabling the Databridge Client to suppress extraneous columns that do not actually exist. If the DEPENDS data item has a value of 3, and the OCCURS clause is OCCURS 10 TIMES, the last 7 columns are not included.
Handling Unflattened OCCURS DEPENDING ON Clauses
To handle a changing depends item, the Databridge Client uses before-image/after-image (BI/AI) pairs for data sets that have items with OCCURS DEPENDING ON clauses that are not flattened.
First, the Databridge Client checks the old and new values of the DEPENDS data item to determine how to execute the modify. The modify is handled in one of the following ways:
-
If the value of the DEPENDS data item is unchanged, the Databridge Client updates the corresponding rows in the secondary tables as usual. (Redundant updates are suppressed if the
ds_options
bit DSOPT_Use_bi_ai is set.) -
If the value of the DEPENDS data item increases from m to n, the first m items are updated normally. The newly added items (m+1 through n) are inserted into the secondary table.
-
If the value of the DEPENDS data item decreases from m to n, the first n items are updated normally. Items that are no longer present (n+1 through m) are deleted from the secondary table.
Relational Database Split Tables
A split table occurs when a DMSII data set record requires more than one table in the relational database to hold the data. Split tables occur in the following circumstances:
-
When a table mapped from a DMSII data set has more than the maximum number of columns allowed by the relational database. The
maximum_columns
parameter in the configuration file allows you to reduce this value. -
When a relational database table's record size exceeds the Microsoft SQL Server maximum record size (approximately 8K -- the actual value depends on the number of columns in the table).
When the define
(or redefine
) command reaches the point where one of the
above conditions is satisfied, it stops adding columns to the table
(named the same as the DMSII data set). It then starts a new table that contains
the same keys as in the original record of the primary table,
followed by the remaining items in the data set at the point the split occurred.
Note that there is always the possibility of having multiple splits for data sets
that have a large number of columns. The flattening of OCCURS items can easily
lead to split tables.
A split can occur in the middle of flattening an OCCURS clause, which can be rather awkward.
In order to better control where a split occurs we added the di_options2
bit DIOPT_Split_Here (4)
that forces the table split to occur after the item in question is processed.
Note
When a DMSII data set is split into more than one relational database table, a WARNING message appears during a define
or redefine
command. In addition, each split table duplicates the keys in the original table.
Split Table Names
The new table is named using the original (parent) table name with a number (usually 1) appended to it to make it unique. All subsequent tables created from the same data set have the original table name with a numeric suffix that is incremented by 1 each time a new split table is created.
Keys for Split Tables
For a data set with keys, the keys of the original data set are duplicated in the split tables because you must access each of these tables individually. The process of splitting the data set into tables continues until there are no more data items left in the data set.
The following examples show the mapping of a data set that has 600 items (5 of which are keys) to a relational database that limits the number of columns in a table to 250. The result is tables that contain a total of 610 columns, where the 5 keys are duplicated across all 3 tables. If the original table is named savings, the remaining two tables are named savings1 and savings2, unless these names are already in use.
tablename | tablename1 | tablename2 |
250 columns (first 5 are keys) | 5 keys and 245 columns | 5 keys and 105 columns |
The five keys are duplicated in each table. To search these split tables, you must explicitly open each table. The tables are not automatically linked.
Relational Database Table and Column Names
When you clone a DMSII database, the Databridge Client names the relational database tables and columns the same as their equivalent DMSII data sets and data items. However, some differences exist. In this section, the differences between the names are explained.
Uppercase and Lowercase
All DMSII data set, data item, and set names are uppercase. These names are also stored in uppercase in the DATASETS and DMS_ITEMS Client control tables. Their equivalent relational database table, column, and index names are stored in lowercase in the DATATABLES and DATAITEMS Client control tables.
-
All DMSII data set names are stored in the DATASETS Client control table in uppercase, just as they appear in the DMSII database. The equivalent relational database table name is converted to lowercase and is stored in the DATATABLES Client control table. Thus, a data set named CREDIT in the DMSII database is named credit in the relational database.
-
All DMSII data item names are stored in the DMS_ITEMS Client control table in uppercase, just as they appear in the DMSII database. The equivalent relational database data item name is converted to lowercase and is stored in the DATAITEMS Client control table. Thus, a data item named LIMIT in the DMSII database is named limit in the relational database.
Note
You must type these names in the correct case. If you are using the relational database table name as a character string value in a SQL statement (for example,
'tablename'
), you must use lowercase.
Hyphens and Underscores
The hyphen (-) in the DMSII name becomes an underscore (_) in the relational database name. The only exception is a data source name that is allowed to contain hyphens.
Name Length
The limit for a DMSII data set name is 17 characters, and DMSII item name is limited to 128 characters. Relational databases typically limit table names to 30 characters; however, the Databridge Client reserves two characters for the prefix of the stored procedure names for updating the table (i_ tablename, d_ tablename, u_ tablename). Thus, the table names are actually limited to 28 characters. Similarly, the Databridge Client adds a one or two character prefix to the item names to create a unique name for the parameters of the stored procedures. The Databridge Client for Microsoft SQL Server uses a prefix of @ while the Databridge Client for Oracle uses a prefix of p_. To avoid using names that are too long for the relational database, items names are limited to 29 characters for SQL Server or 28 characters for Oracle.
With this limit of 28 characters for a table name, typically all the DMSII names fit into the relational database table name or column name. In cases where data set, data item, or other structure names are concatenated and therefore become too long for a relational database, the Databridge Client truncates the name.
Duplicate Names
If two data sets have the same name in two different DMSII databases (or data sources, from the Client perspective), the Databridge Client appends the number 1 to the duplicate table name the first time it is encountered. If a table already exists with the duplicate name with "1" appended to it, the Databridge Client appends the number "2" instead and so on until a unique table name is created.
For example, if DMSII database A has a data set named PRODUCTS and DMSII database B also has a data set named PRODUCTS, the resulting Databridge table names would be products and products1.
If you combine this duplicate data set name convention with the convention for naming split tables (when one data set results in more than one table), you can have multiple suffixes for short names.
For example, if you have two data sources with a data set named CUSTOMER, which also generates split tables, the tables are renamed as follows:
-
customers
andcustomers1
in the first data source -
customers11
andcustomers12
in the second data source (as the primary table was renamed customers1)
Duplicate item names may result in the following cases:
-
When you use the same name for items in two different GROUPs. DMSII allows this, but the Databridge Client ignores GROUPs.
-
When you truncate two long DMSII item names that are almost identical
The Databridge Client handles duplicate item names the same way that it handles duplicate table names.
Reserved Keywords
You cannot use reserved keywords for relational database object (table, column, index, etc.) names. For example, "order" is an SQL keyword; therefore, you cannot rename a relational database table or column as "order".
If an existing DMSII data set is named ORDER, the Databridge Client stores ORDER in the DATASETS Client control table and an equivalent relational database table called "order_x" in the DATATABLES Client control table. This same convention of adding "_x" to rename a table whose name is a reserved word applies to DMSII data items. For information on reserved words in your relational database, see the related database documentation.
The SQL Server Client allows you to use reserved words as object names, as long
you enclose them in square brackets in SQL statement. The configuration
file parameter bracket_tabnames
allow you use reserved words like "order" as
table names. Setting this parameter to True makes the Client use brackets
around table names that are reserved words.