PL/I provides a SAX-like event-based interface for parsing XML documents. The parser invokes an application-supplied handler for parser events, passing references to the corresponding document fragments.
In an event-based API, the parser reports events to the application through callbacks. Events include the start of the document, the beginning of an element, the end of the document, and other document-related activities. The application must provide a handler for each event that the parser reports. This simple SAX-like API follows the SAX interfaces provided for other languages such as C, for which there is no standard.
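The callback model described above can be illustrated outside PL/I. The following is a minimal sketch that uses Python's standard xml.sax module rather than the PL/I parser; the class name RecordingHandler, the function parse_events, and the event labels are hypothetical names chosen for this illustration, not part of the PL/I interface.

```python
import xml.sax


class RecordingHandler(xml.sax.ContentHandler):
    """Records each parser event in the order the parser reports it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append(("start_of_document", None))

    def startElement(self, name, attrs):
        self.events.append(("start_of_element", name))

    def characters(self, content):
        # The parser may deliver text in several chunks; skip pure whitespace.
        if content.strip():
            self.events.append(("content_characters", content))

    def endElement(self, name):
        self.events.append(("end_of_element", name))

    def endDocument(self):
        self.events.append(("end_of_document", None))


def parse_events(xml_bytes):
    """Parse a document and return the list of recorded events."""
    handler = RecordingHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.events


EVENTS = parse_events(b"<order><item>pen</item></order>")
```

The application never walks the document itself. It only reacts to the events the parser pushes to it, which is the same division of labor the PL/I handlers follow.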
The PL/I SAX parsers include the PLISAXA, PLISAXB, and PLISAXC built-in subroutines. The PLISAXD built-in subroutine is not yet supported.
XML documents have two levels of conformance: well-formedness and validity, both of which are defined in the XML standard. An XML document is well-formed when it complies with the basic XML grammar and with a few specific rules, such as the requirement that start and end element tag names match. A well-formed XML document is also valid if it has an associated document type declaration (DTD) and complies with the constraints expressed in the DTD.
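As an illustration of the well-formedness rule about matching tag names, here is a minimal sketch that again uses Python's standard xml.sax module rather than the PL/I parser. The helper name is_well_formed is hypothetical. The sketch checks well-formedness only; it does not validate a document against a DTD.

```python
import xml.sax


def is_well_formed(xml_bytes: bytes) -> bool:
    """Return True if the parser accepts the document without a fatal error."""
    try:
        # A handler with no overrides ignores every event; only errors matter here.
        xml.sax.parseString(xml_bytes, xml.sax.ContentHandler())
    except xml.sax.SAXParseException:
        return False
    return True


MATCHED = is_well_formed(b"<a><b></b></a>")
MISMATCHED = is_well_formed(b"<a><b></a>")  # <b> is never closed
```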
For each parser event, the PL/I function that handles it must accept the appropriate parameters and return the appropriate value. The return value must be returned BYVALUE.
If you compile your program with the -ebcdic option, the callback event logic must account for the fact that the data passed to the callbacks is in ASCII, even if the input was in EBCDIC. The data must therefore be translated before it is used within the callbacks. In addition, all reference values passed as FIXED BIN(31) are for the ASCII character encoding.
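The effect of this rule can be shown with a small sketch in Python, not PL/I. It assumes EBCDIC code page 037 (Python codec cp037); the code page your program actually uses may differ. The helper name ascii_to_ebcdic is hypothetical.

```python
def ascii_to_ebcdic(fragment: bytes, codepage: str = "cp037") -> bytes:
    """Translate an ASCII fragment to EBCDIC (code page 037 is an assumption)."""
    return fragment.decode("ascii").encode(codepage)


# The bytes a callback would receive for the element name "item" (ASCII).
ASCII_NAME = b"item"

# The same name as an EBCDIC program would hold it in a literal.
EBCDIC_NAME = ascii_to_ebcdic(ASCII_NAME)

# Comparing the untranslated fragment with an EBCDIC literal fails.
UNTRANSLATED_MATCHES = ASCII_NAME == EBCDIC_NAME
```

A byte-for-byte comparison between the untranslated fragment and an EBCDIC literal does not match, which is why the translation has to happen before the callback inspects the data.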
On UNIX, if your XML input is EBCDIC and contains open and close square brackets ([ and ]) encoded as X'BA' and X'BB', you need to create your own custom codeset module to translate these characters. In the environment where you run your program, set the environment variable MFCODESET to point to your custom codeset. This does not apply to ASCII input.
If you are running on an Intel chip, it is assumed that your program is built with the -bigendian compiler option. If you compile your program without the -bigendian compiler option, you must convert the parameters from big-endian to little-endian before using them.
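To show why the conversion matters, the following minimal sketch in Python (not PL/I) treats a FIXED BIN(31) value as a 4-byte signed integer, which is an assumption of this illustration, and reads the same bytes with both byte orders. The helper name fixed_bin31_from_big_endian is hypothetical.

```python
import struct


def fixed_bin31_from_big_endian(raw: bytes) -> int:
    """Interpret 4 bytes as a signed big-endian integer."""
    (value,) = struct.unpack(">i", raw)
    return value


# The value 5 as a big-endian 4-byte integer.
RAW = bytes([0x00, 0x00, 0x00, 0x05])

# Correct interpretation: the bytes are big-endian.
AS_BIG = fixed_bin31_from_big_endian(RAW)

# What a little-endian program sees if it skips the conversion.
(AS_LITTLE,) = struct.unpack("<i", RAW)
```

Read as big-endian, the bytes give 5. Read as little-endian without conversion, the same bytes give 83886080, so a length or offset used unconverted would be wildly wrong.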