How to use Parsing with SAX and DOM

The Simple API for XML (SAX) interface was originally developed under Java, although interfaces now exist under most languages. Python 2 supports SAX version 2 (SAX2), and the interface is extensive.
_______________________________________________
Python provides the basic interface to the SAX parser, an exception handling system, a set of base classes for creating SAX handlers, and a low-level interface to the SAX system for building your own SAX-based parsers.

SAX works by accepting a content handler class that you have previously created to handle the different elements. The method is similar in principle to Expat, except that the class you create is entirely devoted to supporting the handler methods for the different elements. SAX handles all of the data reading and the feeding of the information to the parser.

Keeping
with the basic theme for the moment, the following listing is a script that uses SAX to output the start and end tags from a sample file.

A Simple SAX Parser

    import sys
    from xml.sax import make_parser
    from xml.sax.handler import ContentHandler

    # Define a new content handler class; the defined methods will be
    # triggered when the individual elements are found in the XML document
    class FindStartEnd(ContentHandler):
        def __init__(self):
            ContentHandler.__init__(self)

        def startElement(self, name, attrs):
            print("Start: %s %s" % (name, attrs.items()))

        def endElement(self, name):
            print("End: %s" % name)

    # Make a new parser
    parser = make_parser()
    # Create a new handler instance based on our class
    sehandler = FindStartEnd()
    # Set up the parser to use our content handler
    parser.setContentHandler(sehandler)

    try:
        xmlfile = open(sys.argv[1])
    except (IndexError, IOError):
        print("You must supply the name of the file to parse")
        sys.exit(1)

    # We pass the open file off to the parsing engine
    parser.parse(xmlfile)

Aside from not printing out our data sections, the output from this script is identical to the previous examples. Also note that we no longer have to supply the data in discrete segments to the parser: parser.parse() accepts a file name or an open file object and handles all of the reading internally.
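
The exception handling system mentioned earlier can be illustrated with a short sketch. It parses XML held in memory instead of in a file and reports a malformed document rather than crashing. The handler name TagLister, the helper list_tags, and the sample strings are hypothetical and chosen only for illustration; xml.sax.parseString and xml.sax.SAXParseException come from the standard xml.sax package. The sketch assumes Python 3.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TagLister(ContentHandler):
    """Hypothetical handler that records start and end tags in order."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))


def list_tags(xml_bytes):
    """Return (events, error); error is None for a well-formed document."""
    handler = TagLister()
    try:
        xml.sax.parseString(xml_bytes, handler)
    except xml.sax.SAXParseException as err:
        # The exception carries the position and a description of the problem
        message = "line %d: %s" % (err.getLineNumber(), err.getMessage())
        return handler.events, message
    return handler.events, None


if __name__ == "__main__":
    for document in (b"<a><b/></a>", b"<a><b></a>"):
        events, error = list_tags(document)
        print(events, error)
```

Because SAX reports elements as it reads them, any events delivered before the error are still available to the caller, which is why the helper returns them alongside the message.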

Because of the way SAX works, it's ideally suited to situations where we want to pick out specific elements while processing a document. For example, we can install triggers to identify specific tags and/or data sections in a simpler way than that offered by the DOM techniques. SAX can also be a great way of serializing documents into another format, because we can act on each element as it's extracted from the original XML source.
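
As a rough sketch of picking out specific elements, the handler below collects only the text found inside elements with a chosen tag name and writes it out as plain lines, a very simple form of serializing to another format. The class name ElementTextCollector, the helper extract_text, and the sample document are hypothetical. The sketch assumes Python 3 and that the target element is not nested inside itself.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class ElementTextCollector(ContentHandler):
    """Hypothetical handler: collects the text of every `target` element."""

    def __init__(self, target):
        ContentHandler.__init__(self)
        self.target = target
        self.results = []
        self._buffer = None  # None means we are outside a target element

    def startElement(self, name, attrs):
        if name == self.target:
            self._buffer = []

    def characters(self, content):
        # characters() may be called several times for one run of text,
        # so the pieces are accumulated and joined at the end tag
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == self.target and self._buffer is not None:
            self.results.append("".join(self._buffer).strip())
            self._buffer = None


def extract_text(xml_bytes, target):
    """Parse xml_bytes and return the text of each `target` element."""
    handler = ElementTextCollector(target)
    xml.sax.parseString(xml_bytes, handler)
    return handler.results


sample = (b"<books>"
          b"<book><title>One</title><price>5</price></book>"
          b"<book><title>Two</title></book>"
          b"</books>")

if __name__ == "__main__":
    # Serialize the selected elements as one line of text each
    print("\n".join(extract_text(sample, "title")))
```

The handler never builds a tree of the whole document; it keeps only the text it was asked for, which is the main practical difference from a DOM-based approach.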