Sick Gaming
[Tut] How to Convert Tab-Delimited File to CSV in Python? - Printable Version

+- Sick Gaming (https://www.sickgaming.net)
+-- Forum: Programming (https://www.sickgaming.net/forum-76.html)
+--- Forum: Python (https://www.sickgaming.net/forum-83.html)
+--- Thread: [Tut] How to Convert Tab-Delimited File to CSV in Python? (/thread-99803.html)



[Tut] How to Convert Tab-Delimited File to CSV in Python? - xSicKxBot - 08-14-2022

How to Convert Tab-Delimited File to CSV in Python?

<div>
<div class="kk-star-ratings kksr-auto kksr-align-left kksr-valign-top" data-payload="{&quot;align&quot;:&quot;left&quot;,&quot;id&quot;:&quot;563635&quot;,&quot;slug&quot;:&quot;default&quot;,&quot;valign&quot;:&quot;top&quot;,&quot;ignore&quot;:&quot;&quot;,&quot;reference&quot;:&quot;auto&quot;,&quot;class&quot;:&quot;&quot;,&quot;count&quot;:&quot;1&quot;,&quot;readonly&quot;:&quot;&quot;,&quot;score&quot;:&quot;5&quot;,&quot;best&quot;:&quot;5&quot;,&quot;gap&quot;:&quot;5&quot;,&quot;greet&quot;:&quot;Rate this post&quot;,&quot;legend&quot;:&quot;5\/5 - (1 vote)&quot;,&quot;size&quot;:&quot;24&quot;,&quot;width&quot;:&quot;142.5&quot;,&quot;_legend&quot;:&quot;{score}\/{best} - ({count} {votes})&quot;,&quot;font_factor&quot;:&quot;1.25&quot;}">
<div class="kksr-stars">
<div class="kksr-stars-inactive">
<div class="kksr-star" data-star="1" style="padding-right: 5px">
<div class="kksr-icon" style="width: 24px; height: 24px;"></div>
</p></div>
<div class="kksr-star" data-star="2" style="padding-right: 5px">
<div class="kksr-icon" style="width: 24px; height: 24px;"></div>
</p></div>
<div class="kksr-star" data-star="3" style="padding-right: 5px">
<div class="kksr-icon" style="width: 24px; height: 24px;"></div>
</p></div>
<div class="kksr-star" data-star="4" style="padding-right: 5px">
<div class="kksr-icon" style="width: 24px; height: 24px;"></div>
</p></div>
<div class="kksr-star" data-star="5" style="padding-right: 5px">
<div class="kksr-icon" style="width: 24px; height: 24px;"></div>
</p></div>
</p></div>
<div class="kksr-stars-active" style="width: 142.5px;">
<div class="kksr-star" style="padding-right: 5px">
<div class="kksr-icon" style="width: 24px; height: 24px;"></div>
</p></div>
<div class="kksr-star" style="padding-right: 5px">
<div class="kksr-icon" style="width: 24px; height: 24px;"></div>
</p></div>
<div class="kksr-star" style="padding-right: 5px">
<div class="kksr-icon" style="width: 24px; height: 24px;"></div>
</p></div>
<div class="kksr-star" style="padding-right: 5px">
<div class="kksr-icon" style="width: 24px; height: 24px;"></div>
</p></div>
<div class="kksr-star" style="padding-right: 5px">
<div class="kksr-icon" style="width: 24px; height: 24px;"></div>
</p></div>
</p></div>
</div>
<div class="kksr-legend" style="font-size: 19.2px;"> 5/5 – (1 vote) </div>
</div>
<p><strong>The easiest way to convert a tab-delimited values (TSV) file to a comma-separated values (CSV) file is to use the following three lines of code:</strong></p>
<ol class="has-global-color-8-background-color has-background">
<li><strong><code>import pandas as pd</code></strong></li>
<li><strong><code>df = pd.read_csv('my_file.txt', sep='\t', header=None)</code></strong></li>
<li><strong><code>df.to_csv('my_file.csv', header=None)</code></strong></li>
</ol>
<p>We’ll explain this and other approaches in more detail next—scroll down to <strong>Method 3</strong> for this exact method.</p>
<h2>Problem Formulation</h2>
<p>Given a tab-delimited file with one tab character <code>'\t'</code> between two values in a given column.</p>
<p><strong>Input:</strong> <code>'my_file.tsv'</code></p>
<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" width="572" height="435" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-12.png" alt="" class="wp-image-563647" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-12.png 572w, https://blog.finxter.com/wp-content/uploads/2022/08/image-12-300x228.png 300w" sizes="(max-width: 572px) 100vw, 572px" /><figcaption><strong>Figure</strong>: File <code>'my_file.tsv'</code> with tab <code>'\t'</code> separated values.</figcaption></figure>
</div>
<pre class="wp-block-preformatted">Alice DataScience $100000
Bob Programmer $90000
Carl Manager $122000
Dave Freelancer $144000</pre>
<p>How to convert the tab-delimited values (TSV) to a comma-separated values (CSV) file?</p>
<p><strong>Output:</strong> <code>'my_file.csv'</code></p>
<pre class="wp-block-preformatted"><code>0,Alice,DataScience,$100000
1,Bob,Programmer,$90000
2,Carl,Manager,$122000
3,Dave,Freelancer,$144000</code></pre>
<p>We’ll also look at slight variations of this problem. Let’s go!</p>
<h2>Method 1: String Replace Single Tab</h2>
<p class="has-global-color-8-background-color has-background">The most straightforward way to convert a tab-delimited (TSV) to a comma-separated (CSV) file in Python is to replace each tabular character <code>'\t'</code> with a comma <code>','</code> character using the <code><a rel="noreferrer noopener" href="https://blog.finxter.com/python-string-replace-2/" data-type="post" data-id="26083" target="_blank">string.replace()</a></code> method. This works if two values are separated by exactly one tabular character.</p>
<p>Here’s an example input file <code>'my_file.tsv'</code>:</p>
<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" width="580" height="438" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-13.png" alt="" class="wp-image-563655" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-13.png 580w, https://blog.finxter.com/wp-content/uploads/2022/08/image-13-300x227.png 300w" sizes="(max-width: 580px) 100vw, 580px" /></figure>
</div>
<p>Here’s an example of some code to convert the tab-delimited file to the CSV file:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="4" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">with open('my_file.tsv') as f: # Read space-delimited file and replace all empty spaces by commas data = f.read().replace('\t', ',') # Write the CSV data in the output file print(data, file=open('my_file.csv', 'w'))</pre>
<p>Output file <code>'my_file.csv'</code>:</p>
<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" width="573" height="428" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-14.png" alt="" class="wp-image-563660" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-14.png 573w, https://blog.finxter.com/wp-content/uploads/2022/08/image-14-300x224.png 300w" sizes="(max-width: 573px) 100vw, 573px" /></figure>
</div>
<p>If you have any doubts, feel free to dive into our related tutorials:</p>
<ul class="has-base-3-background-color has-background">
<li><a href="https://blog.finxter.com/python-string-replace/" data-type="URL" data-id="https://blog.finxter.com/python-string-replace/" target="_blank" rel="noreferrer noopener">Python String Replace Method</a></li>
<li><a href="https://blog.finxter.com/python-open-function/" data-type="post" data-id="24793" target="_blank" rel="noreferrer noopener">Python <code>open()</code> Function</a></li>
<li><a href="https://blog.finxter.com/python-one-liner-write-string-to-file/" data-type="post" data-id="10913" target="_blank" rel="noreferrer noopener">Python <code>print()</code> to File</a></li>
<li><a href="https://blog.finxter.com/python-print/" data-type="post" data-id="20731" target="_blank" rel="noreferrer noopener">Python <code>print()</code> Function</a></li>
<li><a href="https://blog.finxter.com/python-__enter__-magic-method/" data-type="post" data-id="140907">Python Context Mana</a><a href="https://blog.finxter.com/python-__enter__-magic-method/" data-type="post" data-id="140907" target="_blank" rel="noreferrer noopener">g</a><a href="https://blog.finxter.com/python-__enter__-magic-method/" data-type="post" data-id="140907">ers</a></li>
</ul>
<h2>Method 2: Regex Replace Arbitrary Tabs</h2>
<p class="has-global-color-8-background-color has-background">To replace one <code>'\t'</code> or more tabs <code>'\t\t\t'</code> between two column values with a comma <code>','</code> and obtain a CSV, use the regular expressions operation <code>re.sub('[\t]+', ',', data)</code> on the space-separated <code>data</code>.</p>
<p>If you have any doubts, feel free to dive into our related tutorials:</p>
<ul>
<li><a href="https://blog.finxter.com/python-regex-sub/" data-type="post" data-id="5861" target="_blank" rel="noreferrer noopener">Python Regex <code>re.sub()</code></a></li>
<li><a href="https://blog.finxter.com/python-character-set-regex-tutorial/" data-type="post" data-id="6208" target="_blank" rel="noreferrer noopener">Python Regex <code>[]</code> Character Class</a></li>
<li><a href="https://blog.finxter.com/python-regex-quantifiers-question-mark-vs-plus-vs-asterisk-differences/" data-type="post" data-id="6915" target="_blank" rel="noreferrer noopener">Python Regex <code>+</code> Quantifier</a></li>
<li><a href="https://blog.finxter.com/python-regex/" data-type="post" data-id="6210" target="_blank" rel="noreferrer noopener">Python Regex Superpower</a></li>
</ul>
<p>Here’s an example input file <code>'my_file.tsv'</code>, notice the additional tabular characters that may separate two column values:</p>
<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" width="567" height="435" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-16.png" alt="" class="wp-image-563673" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-16.png 567w, https://blog.finxter.com/wp-content/uploads/2022/08/image-16-300x230.png 300w" sizes="(max-width: 567px) 100vw, 567px" /></figure>
</div>
<p>Here’s an example of some code to convert the TSV to the CSV file:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="1,6" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import re with open('my_file.txt') as infile: # Read space-delimited file and replace all empty spaces by commas data = re.sub('[ ]+', ',', infile.read()) # Write the CSV data in the output file print(data, file=open('my_file.csv', 'w'))</pre>
<p>Output file <code>'my_file.csv'</code>:</p>
<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" width="578" height="436" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-17.png" alt="" class="wp-image-563676" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-17.png 578w, https://blog.finxter.com/wp-content/uploads/2022/08/image-17-300x226.png 300w" sizes="(max-width: 578px) 100vw, 578px" /></figure>
</div>
<h2>Method 3: Pandas read_csv() and to_csv()</h2>
<p class="has-global-color-8-background-color has-background">To convert a tab-delimited file to a CSV, first read the file into a Pandas DataFrame using <code>pd.read_csv(filename, sep='\t+', header=None)</code> and then write the DataFrame to a file using <code>df.to_csv(outfilename, header=None)</code>.</p>
<p>Here’s an example input file <code>'my_file.tsv'</code>:</p>
<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" width="567" height="435" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-16.png" alt="" class="wp-image-563673" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-16.png 567w, https://blog.finxter.com/wp-content/uploads/2022/08/image-16-300x230.png 300w" sizes="(max-width: 567px) 100vw, 567px" /></figure>
</div>
<p>Here’s an example of some code to convert the tab-delimited file to the CSV file:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="4,7" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import pandas as pd # Read space-delimited file
df = pd.read_csv('my_file.tsv', sep='\t+', header=None) # Write DataFrame to file
df.to_csv('my_file.csv', header=None)
</pre>
<p>Output file <code>'my_file.csv'</code>:</p>
<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" width="573" height="438" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-18.png" alt="" class="wp-image-563680" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-18.png 573w, https://blog.finxter.com/wp-content/uploads/2022/08/image-18-300x229.png 300w" sizes="(max-width: 573px) 100vw, 573px" /></figure>
</div>
<p>You can also use the simpler <code>sep='\t'</code> if you are sure that only a single tabular character separates two column values.</p>
<p>If you have any doubts, feel free to dive into our related tutorials:</p>
<ul>
<li><a href="https://blog.finxter.com/read-a-csv-file-to-a-pandas-dataframe/" data-type="post" data-id="440655" target="_blank" rel="noreferrer noopener">Pandas <code>read_csv()</code></a></li>
<li><a href="https://blog.finxter.com/pandas-to_csv/" data-type="post" data-id="8027" target="_blank" rel="noreferrer noopener">Pandas <code>to_csv()</code></a></li>
</ul>
<h2>Summary</h2>
<p>We examined three great ways to convert a space-delimited to a comma-separated CSV file:</p>
<ul>
<li><a href="https://blog.finxter.com/how-to-convert-tab-delimited-file-to-csv-in-python/#Method_1_String_Replace_Single_Tab">Method 1: String Replace Single Tab</a></li>
<li><a href="https://blog.finxter.com/how-to-convert-tab-delimited-file-to-csv-in-python/#Method_2_Regex_Replace_Arbitrary_Tabs">Method 2: Regex Replace Arbitrary Tabs</a></li>
<li><a href="https://blog.finxter.com/how-to-convert-tab-delimited-file-to-csv-in-python/#Method_3_Pandas_read_csv_and_to_csv">Method 3: Pandas <code>read_csv()</code> and <code>to_csv()</code></a></li>
</ul>
<p>Thanks for taking the time to read this article, my friend! <img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f40d.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f49b.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<h2>Regex Humor</h2>
<div class="wp-block-image">
<figure class="aligncenter size-full is-resized"><img loading="lazy" src="https://blog.finxter.com/wp-content/uploads/2022/06/image-133.png" alt="" class="wp-image-428862" width="700" height="629" srcset="https://blog.finxter.com/wp-content/uploads/2022/06/image-133.png 785w, https://blog.finxter.com/wp-content/uploads/2022/06/image-133-300x270.png 300w, https://blog.finxter.com/wp-content/uploads/2022/06/image-133-768x691.png 768w" sizes="(max-width: 700px) 100vw, 700px" /><figcaption><em>Wait, forgot to escape a space. Wheeeeee[taptaptap]eeeeee.</em> (<a href="https://imgs.xkcd.com/comics/regular_expressions.png" data-type="URL" data-id="https://imgs.xkcd.com/comics/regular_expressions.png" target="_blank" rel="noreferrer noopener">source</a>)</figcaption></figure>
</div>
</div>


https://www.sickgaming.net/blog/2022/08/10/how-to-convert-tab-delimited-file-to-csv-in-python/