13.3. Expat Handlers

Expat is an event-based parser that recognizes parts of the document (such as the start or end tag for an XML element) and calls any handlers registered for that type of event. All handlers receive an instance of XML::Parser::Expat as their first argument.
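
As a minimal sketch of how handlers are registered, the following passes a Handlers hash to XML::Parser->new and records each event. The sample document and the @events array are made up for illustration.

```perl
use strict;
use warnings;
use XML::Parser;

my @events;

# Every handler receives the XML::Parser::Expat object first,
# followed by the data for that event.
my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $element) = @_;
            push @events, "start:$element via " . ref($expat);
        },
        End => sub {
            my ($expat, $element) = @_;
            push @events, "end:$element";
        },
    },
);

# The empty tag <item/> produces both a Start and an End event.
$parser->parse('<doc><item/></doc>');
print "$_\n" for @events;
```

Event types with no registered handler (character data, comments, and so on) are simply not reported to this program.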

Init (Expat)
Called before parsing starts.

Final (Expat)
Called after parsing has finished, but only if no errors occurred.
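
A small sketch of Init and Final working together, assuming the behavior described in the XML::Parser documentation that parse returns whatever the Final handler returns. The counter variables are illustrative only.

```perl
use strict;
use warnings;
use XML::Parser;

my ($count, $final_calls) = (0, 0);

my $parser = XML::Parser->new(
    Handlers => {
        Init  => sub { $count = 0 },    # reset before each parse
        Start => sub { $count++ },      # one call per start tag
        Final => sub { $final_calls++; return $count },
    },
);

# parse() hands back the return value of the Final handler.
my $total = $parser->parse('<a><b/><c/></a>');
print "elements: $total\n";

# A mismatched tag makes parse() die, so Final is skipped this time.
my $ok = eval { $parser->parse('<a><b></a>'); 1 };
print "second parse failed; Final ran $final_calls time(s)\n" unless $ok;
```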

Start (Expat, Element [, Attr, Val [,...]])
Generated when an XML start tag is encountered. Element is the name of the XML element type. Attr and Val pairs are generated for each attribute in the start tag of the element.

End (Expat, Element)
Generated when an XML end tag is encountered. An empty tag (<foo/>) generates both a Start and an End event.

Char (Expat, String)
Generated when non-markup text is recognized. The characters are passed to the handler in String, encoded as UTF-8. A single run of text may be delivered in more than one call to this handler.
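
The following sketch combines Start, Char, and End to collect the text of an element keyed by one of its attributes. The element name book, the id attribute, and the variable names are assumptions made for the example.

```perl
use strict;
use warnings;
use XML::Parser;

my (%title_for, $current_id, $text);

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            # The Attr/Val pairs fold naturally into a hash.
            my ($expat, $element, %attr) = @_;
            return unless $element eq 'book';
            $current_id = $attr{id};
            $text       = '';
        },
        Char => sub {
            my ($expat, $string) = @_;
            # Append, since text may arrive across several Char calls.
            $text .= $string if defined $current_id;
        },
        End => sub {
            my ($expat, $element) = @_;
            return unless $element eq 'book';
            $title_for{$current_id} = $text;
            undef $current_id;
        },
    },
);

$parser->parse('<shelf><book id="42">Perl &amp; XML</book></shelf>');
print "$_: $title_for{$_}\n" for sort keys %title_for;
```

Entity references such as &amp; arrive already expanded, so the stored title contains a literal ampersand.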

Proc (Expat, Target, Data)
Generated when a processing instruction is recognized. Target is the name that follows <? and Data is the rest of the instruction.

Comment (Expat, Data)
Generated when a comment is recognized. Data is the text of the comment.
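
A brief sketch of the Proc and Comment handlers. The render processing instruction and the comment text are invented for the example.

```perl
use strict;
use warnings;
use XML::Parser;

my (@instructions, @comments);

my $parser = XML::Parser->new(
    Handlers => {
        Proc => sub {
            my ($expat, $target, $data) = @_;
            push @instructions, [ $target, $data ];
        },
        Comment => sub {
            my ($expat, $data) = @_;
            push @comments, $data;    # text between <!-- and -->
        },
    },
);

$parser->parse('<?render mode="fast"?><!-- draft --><doc/>');

print "PI target=$_->[0] data=$_->[1]\n" for @instructions;
print "comment=[$_]\n" for @comments;
```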

CdataStart (Expat)
Called at the start of a CDATA section.

CdataEnd (Expat)
Called at the end of a CDATA section.
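
One possible use of CdataStart and CdataEnd is to flag which character data came from a CDATA section, as in this sketch. The flag and buffer names are illustrative.

```perl
use strict;
use warnings;
use XML::Parser;

my ($in_cdata, $cdata_text, $all_text) = (0, '', '');

my $parser = XML::Parser->new(
    Handlers => {
        CdataStart => sub { $in_cdata = 1 },
        CdataEnd   => sub { $in_cdata = 0 },
        Char       => sub {
            my ($expat, $string) = @_;
            $all_text   .= $string;
            $cdata_text .= $string if $in_cdata;    # CDATA content only
        },
    },
);

$parser->parse('<doc>before <![CDATA[<b> & raw]]> after</doc>');

print "all text:   $all_text\n";
print "CDATA text: $cdata_text\n";
```

The content of the CDATA section still arrives through the Char handler; the two CDATA handlers only mark its boundaries.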

Default (Expat, String)
Called for any characters that aren't tied to a registered handler, including markup declarations. Whatever the encoding in the original document, the string is returned to the handler in UTF-8.
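
In the sketch below, a Start handler is registered but a Comment handler is not, so the comment should be passed to Default instead. The sample document is made up, and the example checks only for the comment rather than assuming what else reaches Default.

```perl
use strict;
use warnings;
use XML::Parser;

my @elements;
my $leftover = '';

my $parser = XML::Parser->new(
    Handlers => {
        Start   => sub { push @elements, $_[1] },
        Default => sub {
            my ($expat, $string) = @_;
            $leftover .= $string;    # anything no other handler claimed
        },
    },
);

$parser->parse('<doc><!-- note --><item/></doc>');

print "elements: @elements\n";
print "comment reached Default\n" if index($leftover, '<!-- note -->') >= 0;
```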

Unparsed (Expat, Entity, Base, Sysid, Pubid, Notation)
Called for a declaration of an unparsed entity. Entity is the name of the entity. Base is the base to be used for resolving a relative URI. Sysid is the system ID. Pubid is the public ID. Notation is the notation name. Base and Pubid can be undefined.

Notation (Expat, Notation, Base, Sysid, Pubid)
Called for a notation declaration. Notation is the notation name. Base is the base to be used for resolving a relative URI. Sysid is the system ID. Pubid is the public ID. Base, Sysid, and Pubid can all be undefined.
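
A sketch of the Unparsed and Notation handlers applied to a small internal DTD subset. The gif notation and logo entity are invented for illustration.

```perl
use strict;
use warnings;
use XML::Parser;

my (%unparsed, %notations);

my $parser = XML::Parser->new(
    Handlers => {
        Unparsed => sub {
            my ($expat, $entity, $base, $sysid, $pubid, $notation) = @_;
            $unparsed{$entity} = { sysid => $sysid, notation => $notation };
        },
        Notation => sub {
            my ($expat, $name, $base, $sysid, $pubid) = @_;
            $notations{$name} = { sysid => $sysid, pubid => $pubid };
        },
    },
);

$parser->parse(<<'XML');
<!DOCTYPE doc [
  <!NOTATION gif SYSTEM "image/gif">
  <!ENTITY logo SYSTEM "logo.gif" NDATA gif>
]>
<doc/>
XML

for my $name (sort keys %unparsed) {
    print "$name: $unparsed{$name}{sysid} ($unparsed{$name}{notation})\n";
}
```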

ExternEnt (Expat, Base, Sysid, Pubid)
Called when an external entity is referenced. Base is the base to be used for resolving a relative URI. Sysid is the system ID. Pubid is the public ID. Base and Pubid may be undefined. This handler should return either a string or an open filehandle that represents the contents of the external entity. A return value of undef indicates that the external entity couldn't be found.

If an open filehandle is returned, it must be returned either as a glob or as a reference to a glob (e.g., an instance of IO::Handle). The default handler installed for this event is XML::Parser::lwp_ext_ent_handler unless the NoLWP option is true, in which case XML::Parser::file_ext_ent_handler is used instead.
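
As a sketch of a custom ExternEnt handler that returns a string, the following resolves system IDs from an in-memory hash rather than from the network or filesystem. The %fake_files hash and the chapter1.xml name are hypothetical.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical in-memory stand-in for external files.
my %fake_files = ('chapter1.xml' => '<para>Hello</para>');

my (@requested, @elements);
my $text = '';

my $parser = XML::Parser->new(
    Handlers => {
        ExternEnt => sub {
            my ($expat, $base, $sysid, $pubid) = @_;
            push @requested, $sysid;
            # Returning undef here would signal "entity not found".
            return $fake_files{$sysid};
        },
        Start => sub { push @elements, $_[1] },
        Char  => sub { $text .= $_[1] },
    },
);

$parser->parse(<<'XML');
<!DOCTYPE doc [
  <!ENTITY chapter SYSTEM "chapter1.xml">
]>
<doc>&chapter;</doc>
XML

print "requested: @requested\n";
print "elements:  @elements\n";
print "text:      $text\n";
```

The returned string is parsed as the entity's content, so its elements and text are reported through the same Start and Char handlers as the main document.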

ExternEntFin (Expat)
Called after an external entity has been parsed. It is called only if an ExternEnt handler is also set.

Entity (Expat, Name, Val, Sysid, Pubid, Ndata, IsParam)
Called when an entity is declared. For internal entities, Val contains the value, and Sysid, Pubid, and Ndata are undefined. For external entities, Val is undefined, Sysid contains the system ID, Pubid contains the public ID if one was given (otherwise it is undefined), and Ndata contains the notation name for unparsed entities. IsParam is true if this is a parameter entity declaration.

If both this handler and the Unparsed handler are set, then this handler will not be called for unparsed entities.
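
The sketch below registers only an Entity handler, so it is called for both an internal entity and an unparsed external one. The entity names and values are invented for the example.

```perl
use strict;
use warnings;
use XML::Parser;

my %entity;

my $parser = XML::Parser->new(
    Handlers => {
        Entity => sub {
            my ($expat, $name, $val, $sysid, $pubid, $ndata, $is_param) = @_;
            $entity{$name} = {
                value => $val,      # set for internal entities only
                sysid => $sysid,    # set for external entities
                ndata => $ndata,    # set for unparsed entities
            };
        },
    },
);

$parser->parse(<<'XML');
<!DOCTYPE doc [
  <!ENTITY notice "Copyright 2002">
  <!ENTITY logo SYSTEM "logo.gif" NDATA gif>
]>
<doc/>
XML

for my $name (sort keys %entity) {
    my $e = $entity{$name};
    my $kind = defined $e->{value} ? "internal: $e->{value}"
                                   : "external: $e->{sysid}";
    print "$name is $kind\n";
}
```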

Element (Expat, Name, Model)
Called when an element declaration is found. Name is the element name, and Model is the content model as an XML::Parser::ContentModel object.

Attlist (Expat, Elname, Attname, Type, Default, Fixed)
Called for each attribute in an ATTLIST declaration. An ATTLIST with multiple attributes will generate multiple calls to Attlist. Elname is the name of the element with which the attribute is being associated. Attname is the name of the attribute. Type is the attribute type, given as a string. Default is the default value, which will be #REQUIRED, #IMPLIED, or a quoted string (i.e., the returned string will begin and end with a quote character). If Fixed is true, then this is a fixed attribute.
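
A sketch of the Element and Attlist handlers reading declarations from an internal subset. The doc element and its attributes are made up; the ismixed call is a method of the content model object as documented for XML::Parser::Expat.

```perl
use strict;
use warnings;
use XML::Parser;

my (%model, %attr);

my $parser = XML::Parser->new(
    Handlers => {
        Element => sub {
            my ($expat, $name, $model) = @_;
            $model{$name} = $model;    # content model object
        },
        Attlist => sub {
            my ($expat, $elname, $attname, $type, $default, $fixed) = @_;
            $attr{$attname} = {
                element => $elname,
                type    => $type,
                default => $default,
                fixed   => $fixed ? 1 : 0,
            };
        },
    },
);

$parser->parse(<<'XML');
<!DOCTYPE doc [
  <!ELEMENT doc (#PCDATA)>
  <!ATTLIST doc
      id   ID    #REQUIRED
      lang CDATA "en"
      ver  CDATA #FIXED "1.0">
]>
<doc id="d1">text</doc>
XML

# One ATTLIST with three attributes yields three Attlist calls.
for my $name (sort keys %attr) {
    my $a = $attr{$name};
    print "$a->{element}/\@$name: $a->{type} $a->{default}",
          ($a->{fixed} ? ' (fixed)' : ''), "\n";
}
print "doc allows mixed content\n" if $model{doc}->ismixed;
```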

Doctype (Expat, Name, Sysid, Pubid, Internal)
Called for DOCTYPE declarations. Name is the document type name. Sysid is the system ID of the document type, if it was provided; otherwise, it's undefined. Pubid is the public ID of the document type, which will be undefined if no public ID was given. Internal is the internal subset, given as a string.

DoctypeFin (Expat)
Called after parsing of the DOCTYPE declaration has finished, including any internal or external DTD declarations.
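
The following sketch records the order in which Doctype, DoctypeFin, and the first Start event arrive. The form of the Internal argument may differ between XML::Parser releases, so the example only tests it for truth. The doc.dtd name is illustrative and is never fetched here.

```perl
use strict;
use warnings;
use XML::Parser;

my @events;

my $parser = XML::Parser->new(
    Handlers => {
        Doctype => sub {
            my ($expat, $name, $sysid, $pubid, $internal) = @_;
            push @events, sprintf 'doctype %s sysid=%s internal=%s',
                $name,
                defined $sysid ? $sysid : 'undef',
                $internal ? 'yes' : 'no';
        },
        DoctypeFin => sub { push @events, 'doctype finished' },
        Start      => sub { push @events, "start $_[1]" },
    },
);

$parser->parse(<<'XML');
<!DOCTYPE doc SYSTEM "doc.dtd" [
  <!ENTITY greeting "hello">
]>
<doc/>
XML

print "$_\n" for @events;
```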

XMLDecl (Expat, Version, Encoding, Standalone)
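Called for XML declarations. Version is a string containing the version. Encoding is either undefined or contains an encoding string. Standalone is true or false when the declaration says standalone="yes" or standalone="no", and undefined when the declaration gives no standalone value.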
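
A short sketch of the XMLDecl handler capturing the three values from a declaration. The %decl hash is just a convenient place to keep them.

```perl
use strict;
use warnings;
use XML::Parser;

my %decl;

my $parser = XML::Parser->new(
    Handlers => {
        XMLDecl => sub {
            my ($expat, $version, $encoding, $standalone) = @_;
            %decl = (
                version    => $version,
                encoding   => $encoding,
                standalone => $standalone,
            );
        },
    },
);

$parser->parse('<?xml version="1.0" encoding="UTF-8" standalone="yes"?><doc/>');

print "version:    $decl{version}\n";
print "encoding:   $decl{encoding}\n";
print "standalone: ", ($decl{standalone} ? 'yes' : 'no'), "\n";
```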



Copyright © 2002 O'Reilly & Associates. All rights reserved.