PLISAXB Subroutine

Purpose

Invokes the XML parser for processing an XML document residing in a file.

Syntax

CALL PLISAXB(e, p, x, c)

Parameters

e is an event structure.

p is a pointer value or token that the parser passes to the event functions.

x is a character string expression for the input file.

c is a numeric expression that specifies the code page of the XML document for processing.

Description

The PLISAXB built-in subroutine provides internal SAX parsing based on the libxml2 XML parser. It provides 24 distinct events and operates on a file. The input file can be in ASCII or EBCDIC.

Examples

This example shows the use of the PLISAXB built-in subroutine. It shows only the main routine with a call to PLISAXB; it does not show the event structure or the type declarations.

dcl token        char(8);
dcl xmlDocument  char(4000) var;
xmlDocument =
    '<?xml version="1.0" standalone="yes"?>'
||  '<!--This document is just an example-->'
||  '<sandwich>'
||  '<bread type="baker''s best"/>'
||  '<?spread please use real mayonnaise ?>'
||  '<meat>Ham &amp; turkey</meat>'
||  '<filling>Cheese, lettuce, tomato, etc.</filling>'
||  '<![CDATA[We should add a <relish> element in future!]]>'
||  '</sandwich>'
||  ' ';
call plisaxb( eventHandler,
              addr(token),
              addrdata(xmlDocument),
              length(xmlDocument) );
end;

Restrictions

If you compile your program with the -ebcdic option, the callback event logic must account for the fact that the data passed to the callbacks is in ASCII, even if the input was in EBCDIC. The data must therefore be translated before it is used within the callbacks. In addition, all reference values passed as FIXED BIN(31) are ASCII character encodings.

On UNIX, if your XML input is EBCDIC and contains open and close square brackets ([ and ]) encoded as X'BA' and X'BB', you must create your own custom codeset module to translate them, and set the environment variable MFCODESET to point to your custom codeset in the environment where you run your programs. This does not apply to ASCII input.

It is assumed that your program is built using the -bigendian compiler option if it runs on an Intel chip. If you compile your program without the -bigendian compiler option, you must convert the parameter types from big-endian to little-endian before use.
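
The conversion described above can be pictured as a byte reversal. The following is a minimal sketch only, assuming that the value in question is a FIXED BIN(31) that arrived in big-endian byte order; the helper name swap4 is hypothetical and is not part of the product.

```pli
 /* Hypothetical helper: reverses the four bytes of a FIXED BIN(31). */
 /* Assumes the argument holds a value in big-endian byte order.     */
 swap4: proc( beValue ) returns( fixed bin(31) );
   dcl beValue  fixed bin(31);
   dcl leValue  fixed bin(31);
   dcl bits     bit(32);

   /* View the storage of the argument as a 32-bit string.           */
   bits = unspec(beValue);

   /* Reassemble the four 8-bit groups in the opposite order.        */
   unspec(leValue) = substr(bits, 25, 8) || substr(bits, 17, 8)
                  || substr(bits,  9, 8) || substr(bits,  1, 8);

   return( leValue );
 end swap4;
```

A callback might then apply it to each FIXED BIN(31) parameter before use, for example nativeLength = swap4(rawLength); where both variable names are again illustrative.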

The Event Handler structure must be populated with an appropriate callback for each element of the structure before PLISAXA or PLISAXB is invoked. If the callback values are supplied by initialization, the structure must be in AUTOMATIC storage. If they are assigned manually using assignment statements, the structure can be AUTOMATIC or STATIC.

Consider the following when migrating an existing z/OS-based application:

  • A callback event must be provided for every member of the Event Handler structure. Change ENTRY LIMITED alias definitions for Event Handler items to ENTRY to prevent the compiler from generating an E-level diagnostic. Also, remove all references to LINKAGE(OPTLINK) from ENTRY and PROCEDURE declarations.
  • The event unknown_attribute_reference (E20), which is part of the PLISAXA and PLISAXB event structure (but not part of PLISAXC), is generated with the same content as by IBM's SAX processor but occurs earlier in the parsing.
  • The reference for the event unknown_content_reference (E21) is expanded to be its actual value.
  • The events start_of_prefix_mapping (E22) and end_of_prefix_mapping (E23) are currently not supported to maintain compatibility with applications written using IBM's PLISAXA and PLISAXB functionality.
  • When using the SAX Parser, the exception event (E24) is driven only one time indicating an error, as opposed to having the exception event driven one time per character. See the topic XML Parsing Exception Event error codes for a list of Micro Focus error codes for the exception event (E24).
  • PLISAXB accepts filenames as the third parameter in either of two forms: file://dd:ddname under batch, where ddname is the name of the DD statement that specifies the file; or file://filename under UNIX, where filename is the name of a UNIX file. The character string specifying the input file should have no leading or trailing blanks. If the parameter is in neither form, its value is assumed to be the actual filename. Only files of RECFM=F, FB, FS, FBS and LSEQ are acceptable input.
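
The two filename forms can be sketched as follows. This is an illustration only: eventHandler and token are assumed to be declared as in the Examples section, the names INFILE and sandwich.xml are hypothetical, and the code page value 819 is an assumed placeholder rather than a documented setting.

```pli
 /* Sketch only. eventHandler and token are assumed to be declared   */
 /* as in the Examples section. INFILE, sandwich.xml and the code    */
 /* page value 819 are hypothetical placeholders.                    */
 dcl xmlFile   char(256) var;
 dcl codePage  fixed bin(31);

 codePage = 819;

 /* Under batch: name the DD statement that specifies the file.      */
 xmlFile = 'file://dd:INFILE';
 call plisaxb( eventHandler, addr(token), xmlFile, codePage );

 /* Under UNIX: name the file itself.                                */
 xmlFile = 'file://sandwich.xml';
 call plisaxb( eventHandler, addr(token), xmlFile, codePage );
```

Declaring the filename as CHAR VARYING means the string holds only the characters assigned to it, so it carries no trailing blanks; a fixed-length CHAR variable would be padded with blanks.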