ePortfolios

March 22, 2005

LIP ePortfolios XML parser

I've been working for the past few days on a parser for the ePortfolios LIP XML standard for the ePET project.

The work is based upon my LOM XML parser that forms part of my LOM Harvester:

- http://www.ltsn-01.ac.uk/interoperability/harvesters

The script uses the Python libxml2 bindings to run a set of XPath queries over the LIP XML and extract the values. For the testbed, the XPath map is stored in the ltsn01_harvester.epet_xpath_map table.
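As a rough sketch of the idea (not the actual parse_lip_xml.py code), the loop below runs a map of field names to XPath queries over a document and collects the matching text values. To keep it self-contained it uses the standard library's xml.etree.ElementTree rather than libxml2, so only a limited XPath subset is available. The sample XML, the field names and the function name extract_values are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified LIP-like sample; not a complete LIP record.
SAMPLE_XML = """<learnerinformation>
  <identification>
    <formname><text>Jane Example</text></formname>
  </identification>
  <qcl><title>BEng</title></qcl>
</learnerinformation>"""

# Hypothetical stand-in for the contents of the XPath map table:
# field name -> XPath query, relative to the root element.
XPATH_MAP = {
    "formname": "./identification/formname/text",
    "qcl_title": "./qcl/title",
}


def extract_values(xml_text, xpath_map):
    """Run every query in xpath_map and return {field: [text values]}."""
    root = ET.fromstring(xml_text)
    values = {}
    for field, xpath in xpath_map.items():
        # findall returns every matching element, so repeated nodes
        # simply produce a longer list.
        values[field] = [(node.text or "").strip() for node in root.findall(xpath)]
    return values


if __name__ == "__main__":
    print(extract_values(SAMPLE_XML, XPATH_MAP))
```

Returning a list per field is a choice made for this sketch so that repeatable elements are not silently dropped.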

The parser script is called parse_lip_xml.py and is written as a Zope external method. It needs to be passed the XML plus the contents of the XPath table, either as a dictionary or as a tuple.
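Accepting the map in either form could look something like the sketch below. It assumes that the tuple form is a sequence of (field name, XPath) pairs, which is a guess at the row layout and not taken from the real table. The function name normalise_xpath_map is hypothetical.

```python
def normalise_xpath_map(xpath_map):
    """Return the XPath map as a plain dict of field name -> XPath.

    Assumption: when a tuple (or list) is passed, each row is a
    (field_name, xpath) pair. The real table may have more columns.
    """
    if isinstance(xpath_map, dict):
        # Copy so the caller's dictionary is never modified.
        return dict(xpath_map)
    return {field: xpath for field, xpath in xpath_map}


if __name__ == "__main__":
    rows = (("qcl_title", "./qcl/title"), ("qcl_level", "./qcl/level"))
    print(normalise_xpath_map(rows))
```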

The following is a DTML web form which passes some test XML (engresume.xml) to a handler script. The handler marshals the XPath map from an SQL method and hands both the map and the XML off to the parse script external method.

- http://cadmedfac19.ncl.ac.uk/epet_test/parse_test

Because the LIP XML is structured in quite heavily nested blocks, it really requires iterative parsing, à la the classification section of the LOM. This is done with a two-stage parse run. The first stage grabs repeatable sections, such as all the <qcl/> nodes, and serializes them into a Python list of XML fragments. Each fragment can then be treated as a separate document in a second-stage parse run, using special XPath queries written just for the contents of that node. This has been written into the parser for the name, qcl and affiliation sections.
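The two-stage run can be sketched roughly as follows. This is an illustration and not the real parser: it uses xml.etree.ElementTree from the standard library in place of libxml2, and the sample XML, the QCL_XPATHS map and the function names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified sample with a repeated <qcl> section.
SAMPLE_XML = """<learnerinformation>
  <qcl><title>BEng Mechanical Engineering</title><level>Degree</level></qcl>
  <qcl><title>MSc Robotics</title><level>Masters</level></qcl>
</learnerinformation>"""

# Hypothetical second-stage queries, relative to a single <qcl> fragment.
QCL_XPATHS = {"title": "./title", "level": "./level"}


def split_sections(xml_text, section_xpath):
    """Stage one: serialize each matching node into its own XML string."""
    root = ET.fromstring(xml_text)
    return [ET.tostring(node, encoding="unicode") for node in root.findall(section_xpath)]


def parse_fragment(fragment, xpath_map):
    """Stage two: treat one fragment as a separate document and query it."""
    root = ET.fromstring(fragment)
    record = {}
    for field, xpath in xpath_map.items():
        node = root.find(xpath)
        record[field] = (node.text or "").strip() if node is not None else None
    return record


def parse_repeating(xml_text, section_xpath, xpath_map):
    """Run both stages and return one dict per repeated section."""
    fragments = split_sections(xml_text, section_xpath)
    return [parse_fragment(fragment, xpath_map) for fragment in fragments]


if __name__ == "__main__":
    for record in parse_repeating(SAMPLE_XML, "./qcl", QCL_XPATHS):
        print(record)
```

Serializing to strings and re-parsing is more work than querying the nodes in place, but it keeps the second-stage queries short and independent of where the section sits in the full document.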

Those sections which do not require iterative parsing (identified by having XPath queries that start at the learnerinformation root node) are parsed in a final single-level parse run at the end.
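A minimal sketch of that selection step, assuming root-level queries are written as absolute paths beginning with /learnerinformation/ (the exact form stored in the table is not shown here). It again uses xml.etree.ElementTree, which does not accept absolute paths on an element, so the prefix is rewritten to a relative path first. The sample XML, field names and function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed prefix marking a query that starts at the document root.
ROOT_PREFIX = "/learnerinformation/"

# Hypothetical, simplified sample document.
SAMPLE_XML = """<learnerinformation>
  <contentype><referential><sourcedid><id>learner-001</id></sourcedid></referential></contentype>
  <qcl><title>BEng</title></qcl>
</learnerinformation>"""

# Hypothetical map mixing a root-level query with a fragment-level one.
XPATH_MAP = {
    "learner_id": "/learnerinformation/contentype/referential/sourcedid/id",
    "qcl_title": "title",
}


def split_map(xpath_map):
    """Separate root-anchored queries from fragment-relative ones."""
    root_level = {f: x for f, x in xpath_map.items() if x.startswith(ROOT_PREFIX)}
    nested = {f: x for f, x in xpath_map.items() if not x.startswith(ROOT_PREFIX)}
    return root_level, nested


def parse_root_level(xml_text, xpath_map):
    """Single-level pass: run only the root-anchored queries."""
    root = ET.fromstring(xml_text)
    root_level, _ = split_map(xpath_map)
    result = {}
    for field, xpath in root_level.items():
        # Rewrite the absolute path as one relative to the root element.
        relative = "./" + xpath[len(ROOT_PREFIX):]
        node = root.find(relative)
        result[field] = (node.text or "").strip() if node is not None else None
    return result


if __name__ == "__main__":
    print(parse_root_level(SAMPLE_XML, XPATH_MAP))
```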

The result is a nested Python dictionary representing the values serialized in the XML which can be passed to handler methods to do the database inserts / updates.

Newsflash: RM just sent me this URL, which is his version of my stuff rolled into ePET with the REST PUT handler on the front:

- http://epet.ncl.ac.uk/ePET/new/epet_test/parse_test_rm

Posted by pj at 02:58 PM