
Iterparse Fails To Parse A Field, While Other Similar Ones Are Fine

I use Python's iterparse to parse the XML result of a Nessus scan (.nessus file). The parsing fails on unexpected records, while similar ones have been parsed correctly. The general

Solution 1:

From the iterparse() docs:

Note: iterparse() only guarantees that it has seen the “>” character of a starting tag when it emits a “start” event, so the attributes are defined, but the contents of the text and tail attributes are undefined at that point. The same applies to the element children; they may or may not be present. If you need a fully populated element, look for “end” events instead.

Drop the inReport* variables and process each ReportHost only on "end" events, when it has been fully parsed. Use the ElementTree API to get the necessary info, such as cvss_base_score, from the current ReportHost element.
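As a rough illustration of that advice, here is a minimal sketch that listens for "end" events only and reads everything from the completed ReportHost element. The SAMPLE document is a made-up, heavily trimmed stand-in for a .nessus file, and the function name scores_by_host is hypothetical; the element and attribute names (ReportHost, name, ReportItem, cvss_base_score) are assumed to match your file.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for a real .nessus file.
SAMPLE = b"""<NessusClientData_v2>
  <Report name="demo">
    <ReportHost name="10.0.0.1">
      <ReportItem pluginID="1"><cvss_base_score>5.0</cvss_base_score></ReportItem>
      <ReportItem pluginID="2"><cvss_base_score>7.5</cvss_base_score></ReportItem>
    </ReportHost>
    <ReportHost name="10.0.0.2">
      <ReportItem pluginID="3"><cvss_base_score>10.0</cvss_base_score></ReportItem>
    </ReportHost>
  </Report>
</NessusClientData_v2>"""


def scores_by_host(source):
    """Map each ReportHost name to the list of its cvss_base_score texts."""
    result = {}
    # Only "end" events: by then the element, its text and its children
    # are guaranteed to be fully populated.
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "ReportHost":
            result[elem.get("name")] = [
                score.text for score in elem.iter("cvss_base_score")
            ]
    return result


print(scores_by_host(io.BytesIO(SAMPLE)))
# {'10.0.0.1': ['5.0', '7.5'], '10.0.0.2': ['10.0']}
```

This version keeps the whole tree in memory as it parses, which is fine for small reports but not for large scans.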

To preserve memory, do:

import xml.etree.ElementTree as etree  # cElementTree was removed in Python 3.9

def getelements(filename_or_file, tag):
    # Request 'start' events too, so the first event hands us the root element.
    context = iter(etree.iterparse(filename_or_file, events=('start', 'end')))
    _, root = next(context)  # get root element
    for event, elem in context:
        if event == 'end' and elem.tag == tag:
            yield elem
            root.clear()  # preserve memory

for host in getelements("test2.nessus", "ReportHost"):
    for cvss_el in host.iter("cvss_base_score"):
        print(cvss_el.text)
