
Reading A Spreadsheet Like .xml With Elementtree

I am reading an XML file using ElementTree, but there is a Cell whose data I cannot read. I adapted my file to make a reproducible example that I present next: from xml.etre

Solution 1:

Question: Reading a spreadsheet like .xml with ElementTree

Documentation: The lxml.etree Tutorial - Namespaces


  1. Define the namespaces used

    ns = {'ss':"urn:schemas-microsoft-com:office:spreadsheet",
          'html':"http://www.w3.org/TR/REC-html40"
         }
    
  2. Use the namespaces with find(...) / findall(...)

    import io
    from xml.etree import ElementTree

    # xmlf holds the XML text from the question
    tree = ElementTree.parse(io.StringIO(xmlf))
    root = tree.getroot()

    for ws in root.findall('ss:Worksheet', ns):
        for table in ws.findall('ss:Row', ns):
            for c in table.findall('ss:Cell', ns):
                data = c.find('ss:Data', ns)
                if data.text is None:
                    # No direct text: it lives in the nested html:Font elements
                    text = []
                    data = data.findall('html:Font', ns)
                    for element in data:
                        text.append(element.text)

                    data_text = ''.join(text)
                    print(data_text)
                else:
                    print(data.text)
    

Output:

A
B
C
CAN'T READ THIS
D

Tested with Python: 3.5
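
A more compact variant of the same idea is sketched below. Instead of branching on whether `data.text` is `None`, it joins everything returned by `itertext()`, which yields the text of an element and all of its descendants. The `SAMPLE_XML` document, the `NS` name and the `cell_texts` helper are assumptions made for illustration, since the original file from the question is not reproduced here.

```python
import io
from xml.etree import ElementTree

# Hypothetical sample document (an assumption, not the file from the question).
SAMPLE_XML = """<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:html="http://www.w3.org/TR/REC-html40">
  <Worksheet ss:Name="Sheet1">
    <Table>
      <Row>
        <Cell><Data ss:Type="String">A</Data></Cell>
        <Cell><Data ss:Type="String"><html:Font>CAN'T READ </html:Font><html:Font>THIS</html:Font></Data></Cell>
      </Row>
    </Table>
  </Worksheet>
</Workbook>
"""

# Same prefix-to-URI mapping as in step 1.
NS = {
    "ss": "urn:schemas-microsoft-com:office:spreadsheet",
    "html": "http://www.w3.org/TR/REC-html40",
}


def cell_texts(xml_text):
    """Return the text of every ss:Data element, including nested markup."""
    root = ElementTree.parse(io.StringIO(xml_text)).getroot()
    texts = []
    # './/ss:Data' matches Data elements at any depth below the root.
    for data in root.iterfind(".//ss:Data", NS):
        # itertext() walks the element and its descendants in document order.
        texts.append("".join(data.itertext()).strip())
    return texts


print(cell_texts(SAMPLE_XML))
```

Searching with `.//ss:Data` avoids depending on the exact nesting of Worksheet, Table and Row, which may differ in the real file.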

Solution 2:

The text content of the fourth cell belongs to the two Font subelements, which are bound to another namespace. Demo:

for e in root.iter():
    text = e.text.strip() if e.text else None
    if text:
        print(e, text)

Output:

<Element {urn:schemas-microsoft-com:office:spreadsheet}Data at 0x7f8013d01dc8> A
<Element {urn:schemas-microsoft-com:office:spreadsheet}Data at 0x7f8013d01dc8> B
<Element {urn:schemas-microsoft-com:office:spreadsheet}Data at 0x7f8013d01dc8> C
<Element {http://www.w3.org/TR/REC-html40}Font at 0x7f8013d01e08> CAN'T READ
<Element {http://www.w3.org/TR/REC-html40}Font at 0x7f8013d01e48> THIS
<Element {urn:schemas-microsoft-com:office:spreadsheet}Data at 0x7f8013d01e48> D
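
The `{uri}local` form in the output above is how ElementTree stores namespaced tag names. The following is a minimal sketch of checking which namespace owns each piece of text. The `SAMPLE` fragment and the `split_tag` helper are hypothetical, written for illustration only and not taken from the question's file.

```python
from xml.etree import ElementTree

# Hypothetical one-cell fragment (an assumption, not the original data).
SAMPLE = (
    '<ss:Data xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" '
    'xmlns:html="http://www.w3.org/TR/REC-html40">'
    "<html:Font>CAN'T READ </html:Font><html:Font>THIS</html:Font>"
    "</ss:Data>"
)


def split_tag(tag):
    """Split '{uri}local' into (uri, local); uri is '' for un-namespaced tags."""
    if tag.startswith("{"):
        uri, _, local = tag[1:].partition("}")
        return uri, local
    return "", tag


root = ElementTree.fromstring(SAMPLE)

# Keep only the elements that directly carry non-blank text.
owners = [
    (split_tag(e.tag), e.text)
    for e in root.iter()
    if e.text and e.text.strip()
]

for (uri, local), text in owners:
    print(local, uri, repr(text))
```

In this fragment the Data element itself has no direct text (`root.text` is `None`), so only the two Font elements are listed, each bound to the html namespace URI.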
