The Simple API for XML (SAX) interface was originally developed for Java, although
implementations now exist for most languages. Python supports SAX version 2, and
the interface is extensive.
Python provides the basic interface to the SAX parser, an exception-handling system,
a set of base classes for creating SAX handlers, and a low-level interface to
the SAX system for building your own SAX-based parsers.
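As a rough illustration of how the basic interface, the handler base classes, and the exception system fit together, here is a minimal sketch. The TagCounter class and the count_tags() function are hypothetical names made up for this example; parseString(), ContentHandler, and SAXParseException come from the standard xml.sax package.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TagCounter(ContentHandler):
    """Hypothetical handler that counts the opening tags it sees."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


def count_tags(xml_bytes):
    """Return the number of elements, or None if the input is not well formed."""
    handler = TagCounter()
    try:
        # parseString() builds a parser, attaches the handler, and feeds the data
        xml.sax.parseString(xml_bytes, handler)
    except xml.sax.SAXParseException as err:
        # The exception records where in the document the problem was found
        print("Parse error at line %d: %s" % (err.getLineNumber(), err.getMessage()))
        return None
    return handler.count


print(count_tags(b"<a><b/><b/></a>"))
print(count_tags(b"<a><b></a>"))
```

The second call passes a document with a mismatched tag, so the parser raises SAXParseException and the function reports the error instead of returning a count.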
SAX works by accepting an instance of a content handler class that you have previously
created to handle the different elements. The method is similar in principle to Expat,
except that the class you create is entirely devoted to supporting the handler methods
for the different elements. SAX handles all of the data reading and the feeding of
the information to the parser.
Keeping with the basic theme for the moment, the following listing shows a script
that uses SAX to output the start and end tags from a sample file.
A Simple SAX Parser

    import sys
    from xml.sax import make_parser
    from xml.sax.handler import ContentHandler

    # Define a new content handler class; the defined methods will be
    # triggered when the individual elements are found in the XML document
    class FindStartEnd(ContentHandler):
        def __init__(self):
            ContentHandler.__init__(self)

        def startElement(self, name, attrs):
            print("Start: %s %s" % (name, attrs))

        def endElement(self, name):
            print("End: %s" % name)

    # Make a new parser
    parser = make_parser()

    # Create a new handler instance based on our class
    sehandler = FindStartEnd()

    # Set up the content handler for using our handler
    parser.setContentHandler(sehandler)

    try:
        xmlfile = open(sys.argv[1])
    except (IndexError, IOError):
        print("You must supply the name of the file to parse")
        sys.exit(1)

    # We pass the open file off to the parsing engine
    parser.parse(xmlfile)

Aside from not printing out our data sections, the output from this script is
identical to that of the previous examples. Also note that we no longer have to
supply the data in discrete segments to the parser: we hand the SAX interface an
open file (parse() also accepts a filename) and it handles all of the reading
internally.
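If we did want the data sections as well, one way to get them is to add a characters() method to the handler. The sketch below is only an illustration: the FindStartEndData class and the sample document are made up for this example, and the handler records events in a list rather than printing them directly so that the result is easy to inspect.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class FindStartEndData(ContentHandler):
    """Hypothetical handler that records start tags, end tags, and text."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # The parser also reports whitespace between tags, and may deliver
        # one run of text in several pieces; skip whitespace-only chunks
        text = content.strip()
        if text:
            self.events.append(("data", text))


handler = FindStartEndData()
xml.sax.parseString(b"<note><to>Tove</to><body>Hi</body></note>", handler)

for kind, value in handler.events:
    print("%s: %s" % (kind.capitalize(), value))
```

Because text can arrive in more than one call to characters(), a handler that needs the complete contents of an element should accumulate the pieces until the matching endElement() call.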
Because of the way SAX works, it's ideally suited to situations
where we want to pick out specific elements while processing a document. For example,
we can install triggers to identify specific tags and/or data sections in a simpler
way than that offered by the DOM techniques. SAX can also be a great way of serializing
documents into another format because we can act on each element as it's extracted
from the original XML source.
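To make the idea of triggers concrete, here is a minimal sketch that picks out one kind of element and writes its contents in a different format (plain text, one entry per line). The TitleExtractor class, the title tag name, and the sample document are all assumptions made up for this example.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TitleExtractor(ContentHandler):
    """Hypothetical handler that collects the text of every <title> element."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.in_title = False
        self.parts = []
        self.titles = []

    def startElement(self, name, attrs):
        # Trigger: start collecting text when the tag we care about opens
        if name == "title":
            self.in_title = True
            self.parts = []

    def characters(self, content):
        # Text outside a <title> element is ignored
        if self.in_title:
            self.parts.append(content)

    def endElement(self, name):
        # Trigger: the element is complete, so store the joined text
        if name == "title":
            self.titles.append("".join(self.parts))
            self.in_title = False


SAMPLE = (
    b"<library>"
    b"<book><title>Dune</title><year>1965</year></book>"
    b"<book><title>Emma</title><year>1815</year></book>"
    b"</library>"
)

handler = TitleExtractor()
xml.sax.parseString(SAMPLE, handler)

# Serialize the selected elements as plain text, one title per line
print("\n".join(handler.titles))
```

The handler never holds more than the current title in memory while parsing, which is the main practical difference from building a full DOM tree and searching it afterward.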