[Tut] Python Convert Parquet to CSV


<div>
<h2>Problem</h2>
<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4ac.png" alt="💬" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Challenge</strong>: How do you convert a Parquet file <code>'my_file.parquet'</code> to a CSV file <code>'my_file.csv'</code> in Python?</p>
<p>In case you don’t know what a Parquet file is, here’s the definition:</p>
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Info</strong>: <a rel="noreferrer noopener" href="https://parquet.apache.org/" target="_blank">Apache Parquet</a> is an open-source, column-oriented data file format designed for efficient data storage and retrieval. It uses data compression and encoding schemes to handle complex data in bulk. Parquet libraries are available for multiple languages including Java, C++, and Python.</p>
<p>Here’s the layout of the Parquet file format:</p>
<div class="wp-block-image">
<figure class="aligncenter"><img loading="lazy" width="952" height="486" src="https://blog.finxter.com/wp-content/uploads/2022/06/image-135.png" alt="" class="wp-image-430271" srcset="https://blog.finxter.com/wp-content/uploads/2022/06/image-135.png 952w, https://blog.finxter.com/wp-content/uplo...00x153.png 300w, https://blog.finxter.com/wp-content/uplo...68x392.png 768w" sizes="(max-width: 952px) 100vw, 952px" /><figcaption><a href="https://parquet.apache.org/docs/file-format/" target="_blank" rel="noreferrer noopener">source</a></figcaption></figure>
</div>
<h2>Solution</h2>
<p class="has-global-color-8-background-color has-background">The simplest way to convert a Parquet file to a CSV file in Python is to import the Pandas library, call the <code>pandas.read_parquet()</code> function with the <code>'my_file.parquet'</code> filename to load the file content into a DataFrame, and then write the DataFrame to a CSV file with its <code><a rel="noreferrer noopener" href="https://blog.finxter.com/pandas-dataframe-to_csv-method/" data-type="post" data-id="344277" target="_blank">to_csv()</a></code> method.</p>
<ul>
<li><code><strong>import pandas as pd</strong></code></li>
<li><code><strong>df = pd.read_parquet('my_file.parquet')</strong></code></li>
<li><code><strong>df.to_csv('my_file.csv')</strong></code></li>
</ul>
<p>Here’s a minimal example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import pandas as pd
df = pd.read_parquet('my_file.parquet')
df.to_csv('my_file.csv')</pre>
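<p>One detail worth knowing: by default, <code>to_csv()</code> also writes the DataFrame's row index as an extra, unnamed first column. If you only want the data columns in the CSV, you can pass <code>index=False</code>. Here's a small sketch of the difference. The sample DataFrame is made up and only stands in for the one you'd load from <code>'my_file.parquet'</code>:</p>

```python
import pandas as pd

# Hypothetical sample data standing in for the DataFrame loaded from 'my_file.parquet'.
df = pd.DataFrame({"name": ["Alice", "Bob"], "age": [30, 25]})

# Without a filename, to_csv() returns the CSV content as a string.
# Default behavior: the row index becomes an unnamed first column.
with_index = df.to_csv()

# index=False: only the actual data columns are written.
without_index = df.to_csv(index=False)

print(with_index)
print(without_index)
```

<p>In the conversion itself, that would be <code>df.to_csv('my_file.csv', index=False)</code>.</p>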
<p>For this to work, you may have to <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-install-pandas-in-python/" data-type="post" data-id="35926" target="_blank">install pandas</a> and <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-install-pyarrow-in-python/" data-type="post" data-id="35940" target="_blank">pyarrow</a>. Pandas doesn’t parse Parquet on its own: <code>pandas.read_parquet()</code> relies on a Parquet engine, either PyArrow or fastparquet. But if I were you, I’d just try the code first because chances are you’ve already installed them. If no engine is found, pandas raises an <code>ImportError</code> that tells you what’s missing.</p>
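<p>If you'd rather check up front than wait for an error, here's a minimal sketch that looks for the two Parquet engines pandas can use (<code>pyarrow</code> and <code>fastparquet</code>). The helper name <code>available_parquet_engines()</code> is my own and not part of pandas, and the check only tells you whether a package is installed, not whether its version is recent enough for your pandas release:</p>

```python
import importlib.util


def available_parquet_engines():
    """Return the names of installed packages that pandas can use as a Parquet engine."""
    candidates = ["pyarrow", "fastparquet"]
    # find_spec() returns None when a top-level package is not installed.
    return [name for name in candidates if importlib.util.find_spec(name) is not None]


engines = available_parquet_engines()
if engines:
    print("Parquet engine(s) found:", ", ".join(engines))
else:
    print("No Parquet engine found; try: pip install pyarrow")
```

<p>If both are installed, you can pick one explicitly with <code>pd.read_parquet('my_file.parquet', engine='pyarrow')</code>.</p>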
<h2>Related</h2>
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f30d.png" alt="🌍" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Related Tutorial</strong>: <a href="https://blog.finxter.com/python-convert-csv-to-parquet/" data-type="post" data-id="430254">Python Convert CSV to Parquet</a></p>
<p>I also found this video from a great YouTube channel that covers this exact problem of converting a Parquet file to a CSV file:</p>
<figure class="wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube"><a href="https://blog.finxter.com/python-convert-parquet-to-csv/"><img src="https://blog.finxter.com/wp-content/plugins/wp-youtube-lyte/lyteCache.php?origThumbUrl=https%3A%2F%2Fi.ytimg.com%2Fvi%2FkYghFTfDXnU%2Fhqdefault.jpg" alt="YouTube Video"></a><figcaption></figcaption></figure>
</div>


https://www.sickgaming.net/blog/2022/08/...et-to-csv/