

SAX event structure descriptions for PLISAXA and PLISAXB

The event structure has 24 ENTRY variables. These variables point to functions invoked by the parser for various events.

The description of each event in this topic refers to the following example XML document.
xmlDocument =
   '<?xml version="1.0" standalone="yes"?>'
|| '<!--This document is just an example-->'
|| '<sandwich>'
|| '<bread type="baker&quot;s best"/>'
|| '<?spread please use real mayonnaise ?>'
|| '<meat>Ham &amp; turkey</meat>'
|| '<filling>Cheese, lettuce, tomato, etc.</filling>'
|| '<![CDATA[We should add a <relish> element in future!]]>'
|| '</sandwich>'
|| 'junk';

In the descriptions, the term XML text refers to the string defined by the pointer and length passed to the event. The events that the parser may recognize are listed below in the order in which they appear in the structure.
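To make the idea of a structure of entry variables concrete, the following Python sketch models it as a table that maps each event name to a callable. This is only an illustration and not the PLISAX interface: the names make_event_table and ignore are hypothetical, and the event names are simply those described in this topic, in the order described.

```python
# Event names in the order in which this topic describes them.
EVENT_NAMES = [
    "start_of_document", "version_information", "encoding_declaration",
    "standalone_declaration", "document_type_declaration", "end_of_document",
    "start_of_element", "attribute_name", "attribute_characters",
    "attribute_predefined_reference", "attribute_character_reference",
    "end_of_element", "start_of_CDATA_section", "end_of_CDATA_section",
    "content_characters", "content_predefined_reference",
    "content_character_reference", "processing_instruction", "comment",
    "unknown_attribute_reference", "unknown_content_reference",
    "start_of_prefix_mapping", "end_of_prefix_mapping", "exception",
]


def make_event_table(overrides=None):
    """Return a dict mapping every event name to a callable (hypothetical helper)."""
    def ignore(*args):
        # Default handler: do nothing for events the application ignores.
        return None

    table = {name: ignore for name in EVENT_NAMES}
    for name, handler in (overrides or {}).items():
        if name not in table:
            raise KeyError(f"unknown event: {name}")
        table[name] = handler
    return table


seen_comments = []
table = make_event_table({"comment": seen_comments.append})
table["comment"]("This document is just an example")
print(len(table), seen_comments)
```

An application would typically supply handlers only for the events it cares about and leave the rest as no-ops.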

start_of_document
Occurs once, at the beginning of parsing the document. The parser passes the address and length of the entire document, including any line-control characters, such as LF (Line Feed) or NL (New Line). For the above example, the document is 305 characters in length.
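As a quick check of the length quoted above, the example document can be rebuilt outside PL/I and measured. The following Python sketch is only an illustration and does not use the PLISAX interface. It assumes the pieces are joined with no line-control characters between them.

```python
# Rebuild the example document from the same pieces used in the PL/I
# assignment. Assumption: the pieces are joined with nothing in between,
# exactly as the || operator would join them.
pieces = [
    '<?xml version="1.0" standalone="yes"?>',
    '<!--This document is just an example-->',
    '<sandwich>',
    '<bread type="baker&quot;s best"/>',
    '<?spread please use real mayonnaise ?>',
    '<meat>Ham &amp; turkey</meat>',
    '<filling>Cheese, lettuce, tomato, etc.</filling>',
    '<![CDATA[We should add a <relish> element in future!]]>',
    '</sandwich>',
    'junk',
]
xml_document = ''.join(pieces)
document_length = len(xml_document)
print(document_length)  # 305
```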
version_information
Occurs within the optional XML declaration for the version information. The parser passes the address and length of the text containing the version value ("1.0" in the example above).
encoding_declaration
Occurs within the XML declaration for the optional encoding declaration. The parser passes the address and length of the text containing the encoding value.
standalone_declaration
Occurs within the XML declaration for the optional standalone declaration. The parser passes the address and length of the text containing the standalone value ("yes" in the example).
document_type_declaration
Occurs when the parser finds the document type declaration. A document type declaration begins with the character sequence <!DOCTYPE and ends with a > character. Fairly complicated grammar rules describe the content in between.

The parser passes the address and length of the text containing the entire declaration. This includes the opening and closing character sequences. This is the only event where XML text includes the delimiters. The example above does not have a document type declaration.

end_of_document
Occurs once, when document parsing has completed.
start_of_element
Occurs once for each element start tag or empty element tag. The parser passes the address and length of the text containing the element name. The first start_of_element event during parsing in the example contains the string sandwich.
attribute_name
Occurs for each attribute in an element start tag or empty element tag, after recognizing a valid name. The parser passes the address and length of the text containing the attribute name. The only attribute name in the example is type.
attribute_characters

Occurs for each fragment of an attribute value. The parser passes the address and length of the text containing the fragment. An attribute value normally consists of a single string only, even if it is split across lines:

<element attribute="This attribute value is
split across two lines"/>

However, the attribute value might consist of multiple pieces. For instance, the value of the type attribute in the sandwich example at the beginning of this topic has three fragments: the string baker, the single character " (from the reference &quot;), and the string s best. The parser passes these fragments as three separate events. It passes each multiple-character string as an attribute_characters event, and the single character as an attribute_predefined_reference event.
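The fragmentation can be sketched in Python for the example value as written, baker&quot;s best. This is not the parser's implementation: attribute_events is a hypothetical helper that only mimics how fragments are reported, and it handles nothing but the five pre-defined references.

```python
import re

# The five pre-defined entity references and the characters they stand for.
PREDEFINED = {"amp": "&", "apos": "'", "gt": ">", "lt": "<", "quot": '"'}


def attribute_events(raw_value):
    """Split a raw attribute value into (event_name, payload) pairs."""
    events = []
    pos = 0
    for match in re.finditer(r"&(amp|apos|gt|lt|quot);", raw_value):
        if match.start() > pos:
            # Plain text before the reference is one fragment.
            events.append(("attribute_characters", raw_value[pos:match.start()]))
        # The reference itself is reported as a single character.
        events.append(("attribute_predefined_reference", PREDEFINED[match.group(1)]))
        pos = match.end()
    if pos < len(raw_value):
        # Plain text after the last reference is the final fragment.
        events.append(("attribute_characters", raw_value[pos:]))
    return events


for event in attribute_events("baker&quot;s best"):
    print(event)
```

A value without any references yields a single attribute_characters pair.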

attribute_predefined_reference
Occurs in attribute values for the five pre-defined entity references &amp;, &apos;, &gt;, &lt;, and &quot;. The parser passes a CHAR(1) or WIDECHAR(1) value that contains one of &, ', >, <, or ", respectively.
attribute_character_reference

Occurs in attribute values for numeric character references of the form

&#dd;

or

&#xhh;

where d and h represent decimal and hexadecimal digits, respectively. The parser passes a FIXED BIN(31) value that contains the corresponding integer value.
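The conversion from a numeric character reference to its integer value can be sketched as follows. This Python snippet is an illustration only. The name character_reference_value is hypothetical, and the snippet does not check whether the resulting value is a legal XML character.

```python
import re


def character_reference_value(reference):
    """Return the integer value of a numeric character reference.

    Accepts the decimal form &#dd; and the hexadecimal form &#xhh;.
    Raises ValueError for anything else.
    """
    match = re.fullmatch(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));", reference)
    if match is None:
        raise ValueError(f"not a numeric character reference: {reference!r}")
    hex_digits, dec_digits = match.groups()
    return int(hex_digits, 16) if hex_digits else int(dec_digits, 10)


print(character_reference_value("&#65;"))   # 65
print(character_reference_value("&#x41;"))  # 65
```

Both forms of the reference for the letter A therefore yield the same integer.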

end_of_element
Occurs once for each element end tag or empty element tag, when the parser recognizes the closing angle bracket of the tag. The parser passes the address and length of the text containing the element name.
start_of_CDATA_section
Occurs at the start of a CDATA section. CDATA sections begin with the string <![CDATA[ and end with the string ]]>, and are used to escape blocks of text containing characters that would otherwise be recognized as XML markup. The parser passes the address and length of the text containing the opening characters <![CDATA[. The parser also passes the content of a CDATA section between these delimiters as a single content_characters event. In the example, that content_characters event is passed the text We should add a <relish> element in future!.
end_of_CDATA_section
Occurs when the parser recognizes the end of a CDATA section. The parser passes the address and length of the text containing the closing character sequence, ]]>.
content_characters

This event represents the main body of an XML document. This is the character data between element start and end tags. The parser passes the address and length of the text containing this data, which usually consists of a single string only, even if it is split across lines:

<element1>This character content is
split across two lines</element1>

If the content of an element includes any references or other elements, the complete content may comprise several segments.

For example, the content of the meat element in the example comprises the string "Ham " (with a trailing space), the character &, and the string " turkey" (with a leading space). The parser passes these content fragments as separate events. It passes the two string fragments as content_characters events, and the single & character as a content_predefined_reference event. The parser also uses the content_characters event to pass the text of CDATA sections to the application.
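Because content can arrive in several fragments, an application that wants the complete text of an element has to join them itself. The Python sketch below shows one way to do that. It is an illustration, not PLISAX code: the (event name, payload) pairs and the helper collect_content are hypothetical stand-ins for the real handler calls.

```python
def collect_content(events):
    """Join content fragments, in order, into the element's full text."""
    return "".join(
        payload
        for name, payload in events
        if name in ("content_characters", "content_predefined_reference")
    )


# Fragments reported for <meat>Ham &amp; turkey</meat>, as described above.
meat_events = [
    ("content_characters", "Ham "),
    ("content_predefined_reference", "&"),
    ("content_characters", " turkey"),
]
meat_text = collect_content(meat_events)
print(meat_text)  # Ham & turkey
```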

content_predefined_reference
Occurs in element content for the pre-defined entity references &amp;, &apos;, &gt;, &lt;, and &quot;. The parser passes a CHAR(1) or WIDECHAR(1) value that contains one of &, ', >, <, or ", respectively.
content_character_reference

Occurs in element content for numeric character references of the form:

&#dd;

or

&#xhh;

where d and h represent decimal and hexadecimal digits, respectively. The parser passes a FIXED BIN(31) value that contains the corresponding integer value.

processing_instruction
Processing instructions (PIs) allow XML documents to contain special instructions for applications. This event occurs when the parser recognizes the name (the PI target) following the PI opening character sequence, <?. The event also covers the data following the PI target, up to but not including the PI closing character sequence, ?>. Trailing white space characters in the data are included, but leading ones are not. The parser passes the address and length of the text containing the target (spread in the example), and the address and length of the text containing the data (please use real mayonnaise in the example).
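The split between target and data, including the white space rule, can be sketched in Python. This is an illustration under stated assumptions and not the parser's own logic: split_processing_instruction is a hypothetical helper, and it does not validate the target name.

```python
import re


def split_processing_instruction(pi_text):
    """Split '<?target data?>' into (target, data).

    Leading white space after the target is dropped; trailing white
    space before '?>' is kept, as the description above states.
    """
    match = re.fullmatch(r"<\?(\S+)(?:\s+(.*?))?\?>", pi_text, re.DOTALL)
    if match is None:
        raise ValueError("not a processing instruction")
    target = match.group(1)
    data = match.group(2) or ""
    return target, data


target, data = split_processing_instruction("<?spread please use real mayonnaise ?>")
print(repr(target), repr(data))
```

For the example document, the data string keeps the single space that precedes the closing ?>.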
comment
Occurs for any comments in the XML document. The parser passes the address and length of the text between the opening and closing comment delimiters, <!-- and -->, respectively. The only comment text in the example is This document is just an example.
unknown_attribute_reference
Occurs within attribute values for entity references other than the pre-defined entity references listed for the attribute_predefined_reference event. The parser passes the address and length of the text containing the entity name.
unknown_content_reference
Occurs within element content for entity references other than the pre-defined entity references listed for the content_predefined_reference event. The parser passes the address and length of the text containing the entity name.
start_of_prefix_mapping
This event is currently not generated.
end_of_prefix_mapping
This event is currently not generated.
exception
The parser generates this event when it detects an error processing the XML document.