[Tut] Parsing XML Files in Python – 4 Simple Ways


<div>
<h2 class="wp-embed-aspect-16-9 wp-has-aspect-ratio">Problem Formulation and Solution Overview</h2>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio">This article shows four ways to read and parse an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file in Python.</p>
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/2139.png" alt="ℹ" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> stands for E<strong>x</strong>tensible <strong>M</strong>arkup <strong>L</strong>anguage. The format resembles HTML, but <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> has no pre-defined tags. Instead, a coder defines their own tags to meet specific requirements. <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> is a common way to store and share data, either locally or via the internet.
A file can be parsed by any standard <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> parser as long as it is well-formed, that is, every tag is properly closed and nested.</p>
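<p>To illustrate what "structured correctly" means in practice, here is a minimal sketch that checks whether a string is well-formed XML. It uses the standard library's <code>xml.etree.ElementTree</code>; the helper name <code>is_well_formed</code> and both sample strings are assumptions made up for this illustration.</p>

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML, else False."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Custom tags are fine as long as every tag is properly closed and nested.
good = "<bookstore><book><title>Surrender</title></book></bookstore>"
# Here </title> and </book> are swapped, so the nesting is broken.
bad = "<bookstore><book><title>Surrender</book></title></bookstore>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

<p>A parser rejects the second string because the closing tags do not match the order in which the tags were opened.</p>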
<p>To make it more interesting, we have the following running scenario:</p>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio">Jan, a Bookstore Owner, wants to know the top three (3) selling Books in her store. This data is currently saved in an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> format. </p>
<hr class="wp-block-separator has-alpha-channel-opacity wp-embed-aspect-16-9 wp-has-aspect-ratio"/>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4ac.png" alt="💬" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Question</strong>: How would we write code to read in and extract data from an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> file into a Python script<em>?</em></p>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio">We can accomplish this with any one of the following methods:</p>
<ul>
<li><strong>Method 1</strong>: Use <a rel="noreferrer noopener" href="https://pypi.org/project/xmltodict/" data-type="URL" data-id="https://pypi.org/project/xmltodict/" target="_blank"><code>xmltodict.parse()</code></a></li>
<li><strong>Method 2</strong>: Use <a rel="noreferrer noopener" href="https://docs.python.org/3/library/xml.dom.minidom.html" target="_blank"><code>minidom.parse()</code></a></li>
<li><strong>Method 3</strong>: Use <a rel="noreferrer noopener" href="https://docs.python.org/3/library/xml.etree.elementtree.html" data-type="URL" data-id="https://docs.python.org/3/library/xml.etree.elementtree.html" target="_blank"><code>etree</code></a></li>
<li><strong>Method 4:</strong> Use <a rel="noreferrer noopener" href="https://pypi.org/project/untangle/" data-type="URL" data-id="https://pypi.org/project/untangle/" target="_blank"><code>untangle.parse()</code></a></li>
</ul>
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<h2>Method 1: Use xmltodict.parse()</h2>
<p class="has-global-color-8-background-color has-background">This method uses the <a rel="noreferrer noopener" href="https://pypi.org/project/xmltodict/" data-type="URL" data-id="https://pypi.org/project/xmltodict/" target="_blank"><code>xmltodict.parse()</code></a> function to read an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file, convert it to a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-dictionary/" data-type="URL" data-id="https://blog.finxter.com/python-dictionary/" target="_blank"><code>Dictionary</code></a> and extract the data.</p>
<p>In the current working directory, create an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file called <code>books.xml</code>. Copy and paste the code snippet below into this file and save it.</p>
<pre class="wp-block-preformatted"><code>&lt;bookstore&gt; &lt;book&gt; &lt;title&gt;Surrender&lt;/title&gt; &lt;author&gt;Bono&lt;/author&gt; &lt;sales&gt;21987&lt;/sales&gt; &lt;/book&gt; &lt;book&gt; &lt;title&gt;Going Rogue&lt;/title&gt; &lt;author&gt;Janet Evanovich&lt;/author&gt; &lt;sales&gt;15986&lt;/sales&gt; &lt;/book&gt; &lt;book&gt; &lt;title&gt;Triple Cross&lt;/title&gt; &lt;author&gt;James Patterson&lt;/author&gt; &lt;sales&gt;11311&lt;/sales&gt; &lt;/book&gt;
&lt;/bookstore&gt;</code></pre>
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<p>In the current working directory, create a Python file called <code>books.py</code>. Copy and paste the code snippet below into this file and save it. This code reads in and parses the above <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file. If necessary, <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-install-xmltodict-in-python/" data-type="URL" data-id="https://blog.finxter.com/how-to-install-xmltodict-in-python/" target="_blank">install</a> the <a rel="noreferrer noopener" href="https://pypi.org/project/xmltodict/" data-type="URL" data-id="https://pypi.org/project/xmltodict/" target="_blank"><code>xmltodict</code></a> library.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="3-5, 7-10" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import xmltodict

with open('books.xml', 'r') as fp:
    books_dict = xmltodict.parse(fp.read())
    # No fp.close() needed: the with block closes the file automatically.

for i in books_dict:                    # 'bookstore'
    for j in books_dict[i]:             # 'book'
        for k in books_dict[i][j]:      # one dictionary per book
            print(f'Title: {k["title"]} \t Sales: {k["sales"]}')</pre>
<p>The first line in the above code snippet imports the <a rel="noreferrer noopener" href="https://pypi.org/project/xmltodict/" data-type="URL" data-id="https://pypi.org/project/xmltodict/" target="_blank"><code>xmltodict</code></a> library. This library is needed to access and parse the <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file.</p>
<p>The first highlighted section opens <code>books.xml</code> in read mode (<code>r</code>) and saves it as a File Object, <code>fp</code>. If <code>fp</code> were output to the terminal, an object similar to the one below would display (the encoding shown depends on the platform).</p>
<pre class="wp-block-preformatted"><code>&lt;_io.TextIOWrapper name='books.xml' mode='r' encoding='cp1252'&gt;</code></pre>
<p>Next, the <a href="https://pypi.org/project/xmltodict/" data-type="URL" data-id="https://pypi.org/project/xmltodict/"><code>xmltodict.parse()</code></a> function is called and passed one (1) argument, <a rel="noreferrer noopener" href="https://blog.finxter.com/5-ways-to-read-a-text-file-from-a-url/" data-type="URL" data-id="https://blog.finxter.com/5-ways-to-read-a-text-file-from-a-url/" target="_blank"><code>fp.read()</code></a>, which returns the contents of the <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file as a string to be parsed. The result is saved to <code>books_dict</code> as a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-dictionary/" data-type="URL" data-id="https://blog.finxter.com/python-dictionary/" target="_blank"><code>Dictionary</code></a>, and the file is closed. The contents of <code>books_dict</code> are shown below.</p>
<pre class="wp-block-preformatted"><code>{'bookstore': {'book': [{'title': 'Surrender', 'author': 'Bono', 'sales': '21987'}, {'title': 'Going Rogue', 'author': 'Janet Evanovich', 'sales': '15986'}, {'title': 'Triple Cross', 'author': 'James Patterson', 'sales': '11311'}]}}</code></pre>
<p>The final highlighted section loops through the above <a rel="noreferrer noopener" href="https://blog.finxter.com/python-dictionary/" data-type="URL" data-id="https://blog.finxter.com/python-dictionary/" target="_blank"><code>Dictionary</code></a> and extracts each book’s <code>Title</code> and <code>Sales</code>.</p>
<pre class="wp-block-preformatted"><code>Title: Surrender Sales: 21987
Title: Going Rogue Sales: 15986
Title: Triple Cross Sales: 11311</code></pre>
<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Note</strong>: The <code>\t</code> escape sequence inserts a tab character, the same character the &lt;Tab&gt; key produces, between the title and the sales figure.</p>
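<p>Because the parsed result is an ordinary dictionary, the three nested loops can also be replaced by direct key access. The sketch below works on a literal copy of the dictionary shown above, so it runs without <code>xmltodict</code> installed; the names <code>books</code> and <code>top_three</code> are assumptions introduced for this illustration. It also sorts by sales, which is what Jan's question about the top sellers needs.</p>

```python
# Literal copy of the dictionary that xmltodict.parse() produced above.
books_dict = {
    'bookstore': {
        'book': [
            {'title': 'Surrender', 'author': 'Bono', 'sales': '21987'},
            {'title': 'Going Rogue', 'author': 'Janet Evanovich', 'sales': '15986'},
            {'title': 'Triple Cross', 'author': 'James Patterson', 'sales': '11311'},
        ]
    }
}

# Direct key access replaces the three nested loops.
books = books_dict['bookstore']['book']

# The sales values are strings in this dictionary, so convert them to int
# before sorting; otherwise they would be compared character by character.
top_three = sorted(books, key=lambda b: int(b['sales']), reverse=True)[:3]

for book in top_three:
    print(f"Title: {book['title']} \t Sales: {book['sales']}")
```

<p>One caveat to keep in mind: if the file held only a single <code>&lt;book&gt;</code> element, <code>xmltodict</code> would typically return one dictionary instead of a list there, so code that indexes or sorts the result may need to handle that case.</p>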
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<h2>Method 2: Use minidom.parse()</h2>
<p class="has-global-color-8-background-color has-background">This method uses the <a rel="noreferrer noopener" href="https://docs.python.org/3/library/xml.dom.minidom.html" data-type="URL" data-id="https://docs.python.org/3/library/xml.dom.minidom.html" target="_blank"><code>minidom.parse()</code></a> function to read and parse an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> file. This example extracts the ID, Title and Sales for each book.</p>
<p>This example differs from Method 1 in three ways: the <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file starts with an XML declaration (<code>&lt;?xml version="1.0"?&gt;</code>), it contains a <code>&lt;storename&gt;</code> tag, and each <code>&lt;book&gt;</code> tag now has an <code>id</code> attribute assigned to it.</p>
<p>In the current working directory, create an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file called <code>books2.xml</code>. Copy and paste the code snippet below into this file and save it.</p>
<pre class="wp-block-preformatted"><code>&lt;?xml version="1.0"?&gt;
&lt;bookstore&gt; &lt;storename&gt;Jan's Best Sellers List&lt;/storename&gt; &lt;book id="21237"&gt; &lt;title&gt;Surrender&lt;/title&gt; &lt;author&gt;Bono&lt;/author&gt; &lt;sales&gt;21987&lt;/sales&gt; &lt;/book&gt; &lt;book id="21946"&gt; &lt;title&gt;Going Rogue&lt;/title&gt; &lt;author&gt;Janet Evanovich&lt;/author&gt; &lt;sales&gt;15986&lt;/sales&gt; &lt;/book&gt; &lt;book id="18241"&gt; &lt;title&gt;Triple Cross&lt;/title&gt; &lt;author&gt;James Patterson&lt;/author&gt; &lt;sales&gt;11311&lt;/sales&gt; &lt;/book&gt;
&lt;/bookstore&gt;</code></pre>
<p>In the current working directory, create a Python file called <code>books2.py</code>. Copy and paste the code snippet below into this file and save it. </p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="3-5, 7-13" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from xml.dom import minidom

doc = minidom.parse('books2.xml')
name = doc.getElementsByTagName('storename')[0]
books = doc.getElementsByTagName('book')

for b in books:
    bid = b.getAttribute('id')
    title = b.getElementsByTagName('title')[0]
    sales = b.getElementsByTagName('sales')[0]
    print(f'{bid} {title.firstChild.data} {sales.firstChild.data}')</pre>
<p>The first line in the above code snippet imports the <a rel="noreferrer noopener" href="https://docs.python.org/3/library/xml.dom.minidom.html" data-type="URL" data-id="https://docs.python.org/3/library/xml.dom.minidom.html" target="_blank"><code>minidom</code></a> library. This allows access to various functions to parse the <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file and retrieve tags and attributes.</p>
<p>The first section of highlighted lines performs the following:</p>
<ul>
<li>Reads and parses the <code>books2.xml</code> file and saves the result to <code>doc</code>. This action creates the Object shown as (1) below.</li>
<li>Retrieves the <code>&lt;storename&gt;</code> tag and saves the result to <code>name</code>. This action creates an Object shown as (2) below.</li>
<li>Retrieves every <code>&lt;book&gt;</code> tag and saves the results to <code>books</code>. This action creates a List of three (3) Objects, one for each book, shown as (3) below.</li>
</ul>
<pre class="wp-block-preformatted"><code>(1) &lt;xml.dom.minidom.Document object at 0x0000022D764AFEE0&gt;
(2) &lt;DOM Element: storename at 0x22d764f0ee0&gt;
(3) [&lt;DOM Element: book at 0x22d764f3a30&gt;, &lt;DOM Element: book at 0x22d764f3c70&gt;, &lt;DOM Element: book at 0x22d764f3eb0&gt;]</code></pre>
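<p>The last section of highlighted lines loops through the <code>books</code> Object, reads each book's <code>id</code> attribute and the text of its <code>&lt;title&gt;</code> and <code>&lt;sales&gt;</code> tags, and outputs the results to the terminal.</p>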
<pre class="wp-block-preformatted"><code>21237 Surrender 21987
21946 Going Rogue 15986
18241 Triple Cross 11311</code></pre>
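<p>The script above stores the <code>&lt;storename&gt;</code> element in <code>name</code> but never prints it. The following sketch shows one way its text could be read with the same <code>firstChild.data</code> pattern. To stay self-contained it parses an inline string with <code>minidom.parseString()</code> instead of reading <code>books2.xml</code>; the shortened XML and the helper <code>text_of</code> are assumptions for this illustration.</p>

```python
from xml.dom import minidom

# Inline stand-in for books2.xml (shortened to two books).
xml_text = """<?xml version="1.0"?>
<bookstore>
  <storename>Jan's Best Sellers List</storename>
  <book id="21237"><title>Surrender</title><sales>21987</sales></book>
  <book id="21946"><title>Going Rogue</title><sales>15986</sales></book>
</bookstore>"""

doc = minidom.parseString(xml_text)


def text_of(parent, tag):
    """Return the text of the first <tag> element found under parent."""
    return parent.getElementsByTagName(tag)[0].firstChild.data


store_name = text_of(doc, 'storename')
print(store_name)

# Collect (id, title, sales) for every book, then print one row per book.
rows = [
    (b.getAttribute('id'), text_of(b, 'title'), text_of(b, 'sales'))
    for b in doc.getElementsByTagName('book')
]
for row in rows:
    print(*row)
```

<p>Wrapping the lookup in a small helper avoids repeating <code>getElementsByTagName(...)[0].firstChild.data</code> for every tag.</p>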
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<h2>Method 3: Use etree</h2>
<p class="has-global-color-8-background-color has-background">This method uses <a rel="noreferrer noopener" href="https://docs.python.org/3/library/xml.etree.elementtree.html" data-type="URL" data-id="https://docs.python.org/3/library/xml.etree.elementtree.html" target="_blank"><code>etree</code></a> to read in and parse an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> file. This example extracts the Title, Author and Sales data for each book.</p>
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/2139.png" alt="ℹ" class="wp-smiley" style="height: 1em; max-height: 1em;" /> The <code>etree</code> module treats the <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> file as a tree structure. Each element represents a node of that tree, and data is accessed one element at a time.</p>
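<p>To make the tree idea concrete, the sketch below walks a small document and records how deep each element sits. It assumes a one-book inline string rather than <code>books2.xml</code>, and the function name <code>walk</code> is made up for this illustration.</p>

```python
import xml.etree.ElementTree as ET

# Small inline document used only for this illustration.
root = ET.fromstring(
    "<bookstore>"
    "<book id='21237'><title>Surrender</title><sales>21987</sales></book>"
    "</bookstore>"
)


def walk(element, depth=0):
    """Return (depth, tag) pairs for element and all of its descendants."""
    nodes = [(depth, element.tag)]
    for child in element:  # iterating an element yields its direct children
        nodes.extend(walk(child, depth + 1))
    return nodes


tree_nodes = walk(root)
for depth, tag in tree_nodes:
    print('  ' * depth + tag)
```

<p>The indentation in the printed result mirrors the nesting of the tags: <code>bookstore</code> is the root, <code>book</code> is its child, and <code>title</code> and <code>sales</code> are children of <code>book</code>.</p>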
<p>This example reads in and parses the <code>books2.xml</code> file created earlier.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="3,4, 6-10" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import xml.etree.ElementTree as ET

xml_data = ET.parse('books2.xml')
root = xml_data.getroot()

for books in root.findall('book'):
    title = books.find('title').text
    author = books.find('author').text
    sales = books.find('sales').text
    print(title, author, sales)</pre>
<p>The first line in the above code snippet imports the <a rel="noreferrer noopener" href="https://docs.python.org/3/library/xml.etree.elementtree.html" data-type="URL" data-id="https://docs.python.org/3/library/xml.etree.elementtree.html" target="_blank"><code>etree</code></a> library. This allows access to all nodes of the <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> <code>&lt;tag&gt;</code> structure.</p>
<p>The following line reads in and parses <code>books2.xml</code>. The results save as an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> Object to <code>xml_data</code>. The next line retrieves the root element of that Object and saves it to <code>root</code>. If <code>root</code> is output to the terminal, an Object similar to the one below displays.</p>
<pre class="wp-block-preformatted"><code>&lt;Element 'bookstore' at 0x000001E45E9442C0&gt;</code></pre>
<p>The following highlighted section uses a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-loops/" data-type="URL" data-id="https://blog.finxter.com/python-loops/" target="_blank"><code>for</code></a> loop to iterate through each <code>&lt;book&gt;</code> tag, extracting the <code>&lt;title&gt;</code>, <code>&lt;author&gt;</code> and <code>&lt;sales&gt;</code> tags for each book and outputting them to the terminal.</p>
<pre class="wp-block-preformatted"><code>Surrender Bono 21987
Going Rogue Janet Evanovich 15986
Triple Cross James Patterson 11311</code></pre>
<p>To retrieve the attribute of the <code>&lt;book&gt;</code> tag, run the following code.</p>
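<p>A minimal sketch of such a loop, assuming it iterates over the <code>&lt;book&gt;</code> elements and prints each element's <code>attrib</code> dictionary. The XML is inlined here so the snippet runs on its own; in the article's script, <code>root</code> would come from <code>books2.xml</code>.</p>

```python
import xml.etree.ElementTree as ET

# Inline stand-in for books2.xml so the snippet runs on its own.
root = ET.fromstring(
    "<bookstore>"
    "<book id='21237'><title>Surrender</title></book>"
    "<book id='21946'><title>Going Rogue</title></book>"
    "<book id='18241'><title>Triple Cross</title></book>"
    "</bookstore>"
)

# Element.attrib is a dictionary holding the attributes of a tag.
attributes = [book.attrib for book in root.iter('book')]
for attrib in attributes:
    print(attrib)
```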
<p>This code extracts the <code>id</code> attribute from each <code>&lt;book&gt;</code> tag and outputs it to the terminal.</p>
<pre class="wp-block-preformatted"><code>{'id': '21237'}
{'id': '21946'}
{'id': '18241'}</code></pre>
<p>To extract the values, run the following code.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">for book in root.iter('book'):
    # attrib is a dictionary; values() yields the attribute values.
    for v in book.attrib.values():
        print(v)</pre>
<pre class="wp-block-preformatted"><code>21237
21946
18241</code></pre>
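<p>When only one attribute is needed, <code>Element.get()</code> reads it directly without looping over <code>attrib.values()</code>. The sketch below combines that with a sort on the sales figures to answer Jan's question about her top sellers. The inline XML lists the books in a deliberately shuffled order, and the variable names are assumptions for this illustration.</p>

```python
import xml.etree.ElementTree as ET

# Same three books as books2.xml, deliberately listed out of order.
root = ET.fromstring("""<bookstore>
  <book id="18241"><title>Triple Cross</title><sales>11311</sales></book>
  <book id="21237"><title>Surrender</title><sales>21987</sales></book>
  <book id="21946"><title>Going Rogue</title><sales>15986</sales></book>
</bookstore>""")

# Element.get() returns one attribute value (or None if it is absent).
books = [
    (book.get('id'), book.find('title').text, int(book.find('sales').text))
    for book in root.findall('book')
]

# Sort by the numeric sales figure, highest first, and keep three entries.
top_three = sorted(books, key=lambda item: item[2], reverse=True)[:3]
for book_id, title, sales in top_three:
    print(book_id, title, sales)
```

<p>Converting <code>sales</code> to <code>int</code> matters here because element text is always a string.</p>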
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<h2>Method 4: Use untangle.parse()</h2>
<p class="has-global-color-8-background-color has-background">This method uses <a rel="noreferrer noopener" href="https://pypi.org/project/untangle/" data-type="URL" data-id="https://pypi.org/project/untangle/" target="_blank"><code>untangle.parse()</code></a> to parse an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> file.</p>
<p>This example reads in and parses the <code>books3.xml</code> file shown below. If necessary, install the <a rel="noreferrer noopener" href="https://pypi.org/project/untangle/" data-type="URL" data-id="https://pypi.org/project/untangle/" target="_blank"><code>untangle</code></a> library.</p>
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/2139.png" alt="ℹ" class="wp-smiley" style="height: 1em; max-height: 1em;" /> The <a rel="noreferrer noopener" href="https://pypi.org/project/untangle/" data-type="URL" data-id="https://pypi.org/project/untangle/" target="_blank"><code>untangle</code></a> library converts an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> file to a Python object. This is a good option when you have a group of items, such as book names.</p>
<p>In the current working directory, create an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file called <code>books3.xml</code>. Copy and paste the code snippet below into this file and save it.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">&lt;?xml version="1.0"?>
&lt;root>
  &lt;book name="Surrender"/>
  &lt;book name="Going Rogue"/>
  &lt;book name="Triple Cross"/>
&lt;/root></pre>
<p>In the current working directory, create a Python file called <code>books3.py</code>. Copy and paste the code snippet below into this file and save it. </p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="3-4,6-7" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import untangle

book_obj = untangle.parse('books3.xml')
books = ','.join([book['name'] for book in book_obj.root.book])

for b in books.split(','):
    print(b)</pre>
<p>The first line in the above code snippet imports the <a rel="noreferrer noopener" href="https://pypi.org/project/untangle/" data-type="URL" data-id="https://pypi.org/project/untangle/" target="_blank"><code>untangle</code></a> library allowing access to the <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file structure.</p>
<p>The following line reads in and parses the <code>books3.xml</code> file. The results save to <code>book_obj</code>. </p>
<p>The next line calls the <a rel="noreferrer noopener" href="https://blog.finxter.com/python-string-join/" data-type="URL" data-id="https://blog.finxter.com/python-string-join/" target="_blank"><code>join()</code></a> method and passes it one (1) argument: a <a rel="noreferrer noopener" href="https://blog.finxter.com/list-comprehension/" data-type="URL" data-id="https://blog.finxter.com/list-comprehension/" target="_blank">List Comprehension</a>. This code iterates through each <code>&lt;book&gt;</code> element, retrieves its <code>name</code> attribute, and saves the comma-separated results to <code>books</code>. If output to the terminal, the following displays:</p>
<pre class="wp-block-preformatted"><code>Surrender,Going Rogue,Triple Cross</code></pre>
<p>The final section uses a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-loops/" data-type="URL" data-id="https://blog.finxter.com/python-loops/" target="_blank"><code>for</code></a> loop to split <code>books</code> on the commas, iterate through each book name, and send it to the terminal.</p>
<pre class="wp-block-preformatted"><code>Surrender
Going Rogue
Triple Cross</code></pre>
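<p>Joining the names into one string and then splitting it again is a round trip. The sketch below uses a plain list as a stand-in for the names collected from <code>books3.xml</code>, so it runs without <code>untangle</code> installed, and shows where that round trip can go wrong. The second list, including its title, is a hypothetical example and is not part of the article's data.</p>

```python
# Stand-in for the names that the list comprehension collects from books3.xml.
names = ['Surrender', 'Going Rogue', 'Triple Cross']

# Joining and then splitting on the same separator returns the original list...
round_trip = ','.join(names).split(',')

# ...unless a name itself contains the separator (hypothetical title below).
tricky = ['Surrender', 'Eat, Pray, Love']
broken = ','.join(tricky).split(',')

print(round_trip)
print(broken)
```

<p>If the comma-separated string is not needed elsewhere, looping over the list of names directly avoids this problem.</p>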
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<h2>Summary</h2>
<p>This article has shown four (4) ways to work with <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> files so that you can select the best fit for your coding requirements.</p>
<p>Good Luck &amp; Happy Coding!</p>
</div>


https://www.sickgaming.net/blog/2022/11/...mple-ways/