The event structure has 19 ENTRY variables. These variables point to functions invoked by the parser for various events.
The event descriptions in this topic refer to the following example XML document:
xmlDocument =
'<?xml version="1.0" standalone="yes"?>'
|| '<!--This document is just an example-->'
|| '<sandwich>'
|| '<bread type="baker&quot;s best"/>'
|| '<?spread please use real mayonnaise ?>'
|| '<meat>Ham &amp; turkey</meat>'
|| '<filling>Cheese, lettuce, tomato, etc.</filling>'
|| '<![CDATA[We should add a <relish> element in future!]]>'
|| '</sandwich>'
|| 'junk';
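As a quick sanity check of the example, the sketch below rebuilds the document in Python and runs it through the standard-library expat parser, with and without the trailing junk. expat is not the PL/I parser described in this topic; it is used only as a convenient stand-in, and the helper name first_error is hypothetical. The sketch also assumes that the quotation mark in the attribute value and the ampersand in the meat element are written as the entity references &quot; and &amp;.

```python
import xml.parsers.expat

# The example document without the trailing 'junk'.
# Assumption: the two special characters are written as entity references.
xml_document = (
    '<?xml version="1.0" standalone="yes"?>'
    '<!--This document is just an example-->'
    '<sandwich>'
    '<bread type="baker&quot;s best"/>'
    '<?spread please use real mayonnaise ?>'
    '<meat>Ham &amp; turkey</meat>'
    '<filling>Cheese, lettuce, tomato, etc.</filling>'
    '<![CDATA[We should add a <relish> element in future!]]>'
    '</sandwich>'
)


def first_error(text):
    """Return expat's error message for text, or None if it parses cleanly."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(text, True)  # True: this is the final (and only) buffer
    except xml.parsers.expat.ExpatError as error:
        return str(error)
    return None


error_without_junk = first_error(xml_document)
error_with_junk = first_error(xml_document + 'junk')

print('without junk:', error_without_junk)
print('with junk:   ', error_with_junk)
```

Without the trailing junk the document is well-formed. With it, expat reports an error, which suggests that the junk is there so that the example can also exercise the exception event.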
In the descriptions, the term XML text refers to the string defined by the pointer and length passed to the event. The parser may recognize the following events, listed in their order of appearance in the structure:
- start_of_document
- Occurs once, at the beginning of parsing the document. The parser passes no parameters to this event (except the user token).
- version_information
- Occurs within the optional XML declaration for the version information. The parser passes the address and length of the text containing the version value ("1.0" in the example above).
- encoding_declaration
- Occurs within the XML declaration for the optional encoding declaration. The parser passes the address and length of the text containing the encoding value.
- standalone_declaration
- Occurs within the XML declaration for the optional standalone declaration. The parser passes the address and length of the text containing the standalone value ("yes" in the example).
- document_type_declaration
- Occurs when the parser finds the document type declaration. A document type declaration begins with the character sequence <!DOCTYPE and ends with a > character. Fairly complicated grammar rules describe the content in between.
The parser passes the address and length of the text containing the entire declaration. This includes the opening and closing character sequences. This is the only event where XML text includes the delimiters. The example above does not have a document type declaration.
- end_of_document
- Occurs once, when document parsing has completed. The parser passes no parameters to this event (except the user token).
- start_of_element
- Occurs once for each element start tag or empty element tag. The parser passes the address and length of the text containing the element name as well as any applicable namespace information. The first start_of_element event during parsing in the example contains the string sandwich.
- attribute_name
- Occurs for each attribute in an element start tag or empty element tag, after recognizing a valid name. The parser passes the address and length of the text containing the attribute name as well as any applicable namespace information. The only attribute name in the example is type.
- attribute_characters
- Occurs for each fragment of an attribute value. The parser passes the address and length of the text containing the fragment. An attribute value normally consists of a single string only, even if it is split across lines:
<element attribute="This attribute value is
split across two lines"/>
- end_of_element
- Occurs once for each element end tag or empty element tag, when the parser recognizes the closing angle bracket of the tag. The parser passes the address and length of the text containing the element name as well as any applicable namespace information.
- start_of_CDATA_section
- Occurs at the start of a CDATA section. CDATA sections begin with the string <![CDATA[ and end with the string ]]>. They are used to escape blocks of text containing characters that would otherwise be recognized as XML markup. The parser passes no parameters to this event (except the user token).
After this event, the parser passes the content of the CDATA section between these delimiters as one or more content_characters events. In the preceding example, the content_characters event is passed the text:
We should add a <relish> element in future!
- end_of_CDATA_section
- This event occurs when the parser recognizes the end of a CDATA section. The parser passes no parameters to this event (except the user token).
- content_characters
- This event represents the main body of an XML document. This is the character data between element start and end tags. The parser passes the address and length of the text containing this data, which usually consists of a single string only, even if it is split across lines:
<element1>This character content is
split across two lines</element1>
The parser also passes a flag byte that indicates whether the next event provides additional characters that form part of the content. This can be true when there is a large amount of data between the start and end tags.
The parser also uses the content_characters event to pass the text of CDATA sections to the application.
- processing_instruction
- Processing instructions (PIs) allow XML documents to contain special instructions for applications. This event occurs when the parser recognizes the name following the PI opening character sequence, <?. The event also covers the data following the PI target, up to but not including the PI closing character sequence, ?>. Trailing, but not leading, white space characters in the data are included. The parser passes the address and length of the text containing the target (spread in the example) and the address and length of the text containing the data (please use real mayonnaise in the example).
- comment
- Occurs for any comments in the XML document. The parser passes the address and length of the text between the opening and closing comment delimiters, <!-- and -->, respectively. The only comment text in the example is This document is just an example.
- namespace_declare
- Occurs for any namespace declarations in the XML document. The parser passes the address and length of the namespace prefix (if any) as well as the address and length of the namespace URI. If there is no namespace prefix, the passed length is zero and the value of the address should not be used. There is no corresponding event in the PLISAXA and PLISAXB built-ins.
- end_of_input
- This event occurs whenever the parser reaches the end of the current input buffer. Along with the BYVALUE user token, the parser passes two BYADDR parameters: the address and length of the next buffer for it to process. This event and the content_characters event are the only events that have any BYADDR parameters, but this is the only event whose parameters the called event should change. There is no corresponding event in the PLISAXA and PLISAXB built-ins, and it is this event that allows PLISAXC to parse an XML document of arbitrary size.
- unresolved_reference
- This event occurs for any unresolved references in the XML document. The parser passes the address and length of the unresolved reference.
- unknown_attribute_reference
- Occurs within attribute values for entity references other than the pre-defined entity references, listed for the event attribute_predefined_reference. The parser passes the address and length of the text containing the entity name.
- exception
- The parser generates this event when it detects an error processing the XML document.
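To make the event flow more concrete, the following sketch replays the example document through Python's standard-library expat parser and records a trace labelled with the event names from the list above. This is only an analogy, not the PL/I interface: expat's handlers receive strings rather than an address and length, attribute values arrive already resolved rather than as fragments, and the functions make_end_of_input and trace_events are hypothetical names introduced for illustration. The end_of_input stand-in hands the parser one small buffer at a time, loosely imitating how a handler could supply the next buffer. The document is written with the entity references &quot; and &amp;, which is an assumption about the intended example.

```python
import xml.parsers.expat

XML_DOCUMENT = (
    '<?xml version="1.0" standalone="yes"?>'
    '<!--This document is just an example-->'
    '<sandwich>'
    '<bread type="baker&quot;s best"/>'
    '<?spread please use real mayonnaise ?>'
    '<meat>Ham &amp; turkey</meat>'
    '<filling>Cheese, lettuce, tomato, etc.</filling>'
    '<![CDATA[We should add a <relish> element in future!]]>'
    '</sandwich>'
    'junk'
)


def make_end_of_input(text, size):
    """Hypothetical stand-in for an end_of_input handler.

    Each call returns the next buffer of at most `size` characters,
    or an empty string once the input is exhausted.
    """
    buffers = iter([text[i:i + size] for i in range(0, len(text), size)])

    def end_of_input():
        return next(buffers, '')

    return end_of_input


def trace_events(text, buffer_size=32):
    """Parse text buffer by buffer and return a list of event tuples."""
    events = []

    def on_xml_decl(version, encoding, standalone):
        events.append(('version_information', version))
        if encoding is not None:
            events.append(('encoding_declaration', encoding))
        if standalone != -1:  # expat uses -1 when the declaration is absent
            events.append(('standalone_declaration',
                           'yes' if standalone else 'no'))

    def on_start_element(name, attributes):
        events.append(('start_of_element', name))
        for attr_name, attr_value in attributes.items():
            events.append(('attribute_name', attr_name))
            # expat delivers the whole, already resolved value at once.
            events.append(('attribute_characters', attr_value))

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    parser.StartElementHandler = on_start_element
    parser.EndElementHandler = lambda name: events.append(
        ('end_of_element', name))
    parser.CharacterDataHandler = lambda data: events.append(
        ('content_characters', data))
    parser.ProcessingInstructionHandler = lambda target, data: events.append(
        ('processing_instruction', target, data))
    parser.CommentHandler = lambda data: events.append(('comment', data))
    parser.StartCdataSectionHandler = lambda: events.append(
        ('start_of_CDATA_section',))
    parser.EndCdataSectionHandler = lambda: events.append(
        ('end_of_CDATA_section',))

    end_of_input = make_end_of_input(text, buffer_size)
    events.append(('start_of_document',))
    try:
        while True:
            buffer = end_of_input()
            is_final = buffer == ''
            parser.Parse(buffer, is_final)
            if is_final:
                break
        events.append(('end_of_document',))
    except xml.parsers.expat.ExpatError as error:
        events.append(('exception', str(error)))
    return events


events = trace_events(XML_DOCUMENT)
for event in events:
    print(event)
```

In the trace, character content may arrive as several content_characters entries, for instance around the ampersand or where a buffer boundary falls inside the text, which mirrors the idea that content can be delivered in more than one event. The trailing junk ends the trace with an exception entry instead of end_of_document; whether the PL/I parser also raises end_of_document after an exception is not something this sketch tries to model.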