[Tut] Python | Split String into List of Substrings


<div>
<p class="has-background" style="background-color:#c8fefe"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f34e.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /><strong>Summary: </strong>Use Python’s built-in <code>split</code> method to split a given string into a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" target="_blank">list</a> of substrings. Other methods include using the <code>re</code> module and the <code>map</code> function. </p>
<h3><strong>Minimal Example</strong></h3>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">text = "Python Java Golang"
# Method 1: str.split
print(text.split())
# Method 2: re.split
import re
print(re.split(r'\s+', text))
# Method 3: re.findall
print(re.findall(r'\S+', text))
# Method 4: map
li = list(map(str.strip, text.split()))
res = []
for i in li:
    for j in i.split():
        res.append(j)
print(res)
# OUTPUTS: ['Python', 'Java', 'Golang']</pre>
<h2>Problem Formulation</h2>
<p class="has-background" style="background-color:#fbd8fc"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4dc.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /><strong>Problem: </strong>Given a string containing several substrings separated by whitespace, how do you split it into a list of those substrings?</p>
<p>Let’s understand the problem with the help of an example. </p>
<h3><strong>Example</strong></h3>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Input
text = "word1 word2 word3 word4 word5"
# Output
['word1', 'word2', 'word3', 'word4', 'word5']</pre>
<h2><strong>Method 1:</strong> Using split </h2>
<p><strong>Approach:</strong> Call <code>split(sep)</code> on the string, where <code>sep</code> is the separator. In our case the separator is whitespace, so you do not need to pass an argument: when <code>sep</code> is omitted, <code>split</code> treats any run of consecutive whitespace as a single separator and ignores leading and trailing whitespace. The string is split at each separator and the resulting substrings are returned in a list. </p>
<p><strong>Code:</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">text = "word1 word2 word3 word4 word5"
print(text.split()) # ['word1', 'word2', 'word3', 'word4', 'word5']</pre>
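<p>To make the difference between the default separator and an explicit one concrete, here is a small sketch. The input string <code>messy</code> and the variable names are made up for illustration and are not part of the original example.</p>

```python
# Hypothetical input with irregular spacing (not from the article).
messy = "  word1   word2\tword3\n"

# No argument: runs of whitespace count as one separator, edges are ignored.
default_parts = messy.split()

# Explicit single space: every space is a separator, so empty strings appear
# and the tab and newline are left untouched.
space_parts = messy.split(" ")

# maxsplit limits how many splits are performed.
first_two = messy.split(maxsplit=1)

print(default_parts)  # ['word1', 'word2', 'word3']
print(space_parts)    # ['', '', 'word1', '', '', 'word2\tword3\n']
print(first_two)      # ['word1', 'word2\tword3\n']
```

<p>If the input may contain tabs, newlines, or repeated spaces, calling <code>split()</code> without an argument is usually the safer choice.</p>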
<h2><strong>Method 2:</strong> Using re.split </h2>
<p>The&nbsp;<code>re.split(pattern, string)</code>&nbsp;method matches all occurrences of the&nbsp;<code>pattern</code>&nbsp;in the&nbsp;<code>string</code>&nbsp;and divides the string along the matches resulting in a list of strings&nbsp;<em>between&nbsp;</em>the matches. For example,&nbsp;<code>re.split('a', 'bbabbbab')</code>&nbsp;results in the list of strings&nbsp;<code>['bb', 'bbb', 'b']</code>.</p>
<p><strong>Approach:</strong> Use the <code>re.split(r'\s+', text)</code> method, where <code>text</code> is the given string and <code>\s+</code> matches one or more consecutive whitespace characters. Therefore, the string is split at every run of whitespace. </p>
<p><strong>Code:</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import re
text = "word1 word2 word3 word4 word5"
print(re.split(r'\s+', text))
# ['word1', 'word2', 'word3', 'word4', 'word5']</pre>
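<p>One thing to watch for with <code>re.split</code>: if the string starts or ends with whitespace, the result contains empty strings at the edges. The following sketch uses a made-up input, <code>padded</code>, to show this and one possible way around it.</p>

```python
import re

# Hypothetical input with leading and trailing spaces (not from the article).
padded = "  word1 word2  word3 "

# The pattern matches at both edges, so empty strings are returned there.
raw_parts = re.split(r"\s+", padded)

# Stripping the string first avoids the empty edge items.
clean_parts = re.split(r"\s+", padded.strip())

print(raw_parts)    # ['', 'word1', 'word2', 'word3', '']
print(clean_parts)  # ['word1', 'word2', 'word3']
```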
<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f680.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /><strong>Related Read: <a href="https://blog.finxter.com/python-regex-split/" target="_blank" rel="noreferrer noopener">Python Regex Split</a></strong></p>
<h2><strong>Method 3:</strong> Using re.findall</h2>
<p>The&nbsp;<code>re.findall(pattern, string)</code>&nbsp;method scans&nbsp;<code>string</code>&nbsp;from&nbsp;<strong>left to right</strong>, searching for all&nbsp;<strong>non-overlapping matches</strong>&nbsp;of the&nbsp;<code>pattern</code>. It returns a&nbsp;<strong>list of strings</strong>&nbsp;in the matching order when scanning the string from left to right.</p>
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f680.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /><strong>Related Read: <a href="https://blog.finxter.com/python-re-findall/" target="_blank" rel="noreferrer noopener">Python re.findall() – Everything You Need to Know</a></strong></p>
<p><strong>Approach:</strong> Use the <code>re.findall(r'\S+', text)</code> method, where <code>text</code> is the given string and <code>\S+</code> matches one or more consecutive non-whitespace characters. Each match therefore extends until the next whitespace character, and the search for the next run of non-whitespace characters resumes after it. Instead of describing the separators, this pattern describes the substrings themselves.</p>
<p><strong>Code:</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import re
text = "word1 word2 word3 word4 word5"
print(re.findall(r'\S+', text))
# ['word1', 'word2', 'word3', 'word4', 'word5']</pre>
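<p>Because <code>re.findall</code> collects the matches rather than the text between separators, surrounding whitespace never produces empty strings. The sketch below uses a hypothetical padded input and also shows how a capturing group changes what is returned; the second pattern is an illustration only.</p>

```python
import re

# Hypothetical input with leading and trailing spaces (not from the article).
padded = "  word1 word2  word3 "

# Only runs of non-whitespace are returned, so there are no empty strings.
tokens = re.findall(r"\S+", padded)

# With exactly one capturing group, findall returns the group's text
# instead of the whole match.
numbers = re.findall(r"word(\d+)", padded)

print(tokens)   # ['word1', 'word2', 'word3']
print(numbers)  # ['1', '2', '3']
```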
<p><strong><em>Do you want to master the regex superpower?</em></strong> Check out my new book <em><strong><a href="https://blog.finxter.com/ebook-the-smartest-way-to-learn-python-regex/" target="_blank" rel="noreferrer noopener" title="[eBook] The Smartest Way to Learn Python Regex">The Smartest Way to Learn Regular Expressions in Python</a></strong></em> with the innovative 3-step approach for active learning: (1) study a book chapter, (2) solve a code puzzle, and (3) watch an educational chapter video. </p>
<h2><strong>Method 4:</strong> Using map</h2>
<p><strong>Prerequisite: </strong>The&nbsp;<code>map()</code>&nbsp;function transforms one or more iterables into a new one by applying a “transformator function” to the i-th elements of each iterable. The arguments are the<em>&nbsp;transformator function object</em>&nbsp;and&nbsp;<em>one or more iterables</em>. If you pass&nbsp;<strong><em>n</em>&nbsp;iterables</strong>&nbsp;as arguments, the transformator function must be an&nbsp;<strong><em>n</em>-ary function</strong>&nbsp;taking&nbsp;<strong><em>n</em></strong>&nbsp;input arguments. The return value is an iterable map object of transformed, and possibly aggregated, elements.</p>
<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f680.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /><strong>Related Read: <a href="https://blog.finxter.com/python-map/" target="_blank" rel="noreferrer noopener">Python map() — Finally Mastering the Python Map Function [+Video]</a></strong></p>
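<p>As a quick illustration of the prerequisite, here is a minimal sketch of <code>map</code> with one and with two iterables. The lists <code>words</code> and <code>suffixes</code> are invented for this example.</p>

```python
# Hypothetical sample data (not from the article).
words = ["word1", "word2", "word3"]
suffixes = ["a", "b", "c"]

# One iterable: the function takes one argument.
lengths = list(map(len, words))

# Two iterables: the function takes two arguments, one from each iterable.
joined = list(map(lambda w, s: w + "-" + s, words, suffixes))

print(lengths)  # [5, 5, 5]
print(joined)   # ['word1-a', 'word2-b', 'word3-c']
```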
<p><strong>Approach:</strong> Use the <code>map</code> function with <code>str.strip</code> as its first argument and the split list of substrings as its second argument, the iterable. Each item of this list is passed to <code>strip</code>, which removes any leading and trailing whitespace, and <code>map</code> returns a map object containing the cleaned substrings. You can convert this map object to a list using the <a href="https://blog.finxter.com/python-list/" target="_blank" rel="noreferrer noopener">list</a> constructor. The nested loop in the code below then splits each item once more and collects the pieces in <code>res</code>; for this input every item is already a single word, so the list is unchanged.</p>
<p><strong>Code:</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">text = "word1 word2 word3 word4 word5"
li = list(map(str.strip, text.split()))
res = []
for i in li:
    for j in i.split():
        res.append(j)
print(res) # ['word1', 'word2', 'word3', 'word4', 'word5']</pre>
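<p>Since <code>split()</code> without an argument already discards surrounding whitespace, <code>str.strip</code> has nothing left to remove in the example above. The combination becomes more useful with an explicit separator, as in this sketch. The input <code>csv_line</code> is a made-up example.</p>

```python
# Hypothetical comma-separated input with stray spaces (not from the article).
csv_line = " word1 ,word2 ,  word3"

# Splitting on "," keeps the spaces around each item ...
rough = csv_line.split(",")

# ... and mapping str.strip over the pieces removes them.
pieces = list(map(str.strip, rough))

# With whitespace splitting, strip is a no-op: the list stays the same.
simple = "word1 word2 word3".split()
unchanged = list(map(str.strip, simple))

print(rough)      # [' word1 ', 'word2 ', '  word3']
print(pieces)     # ['word1', 'word2', 'word3']
print(unchanged)  # ['word1', 'word2', 'word3']
```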
<h2>Exercise</h2>
<p><strong>Problem: </strong>Given a string containing several substrings separated by commas and spaces, how do you extract the substrings and store them in a list? Note that you have to eliminate the whitespace as well as the commas. </p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Input
text = "One, Two, Three"
# Output
['One', 'Two', 'Three']</pre>
<p class="has-base-2-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f50e.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /><strong>Hint: <a rel="noreferrer noopener" href="https://blog.finxter.com/python-split-string-by-comma-and-whitespace/" target="_blank">Python | Split String by Comma and Whitespace</a></strong></p>
<p><strong>Solution:</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">text = "One, Two, Three"
print([x.strip() for x in text.split(',')])
# ['One', 'Two', 'Three']</pre>
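<p>An alternative worth considering is a single <code>re.split</code> call with a character class that covers both commas and whitespace. This is only a sketch; the string <code>trailing</code> is a hypothetical variation that is not part of the exercise.</p>

```python
import re

text = "One, Two, Three"

# Split on any run of commas and/or whitespace characters.
parts = re.split(r"[,\s]+", text)

# Hypothetical variation: a trailing comma produces an empty last item,
# which can be filtered out.
trailing = "One, Two, Three,"
filtered = [p for p in re.split(r"[,\s]+", trailing) if p]

print(parts)     # ['One', 'Two', 'Three']
print(filtered)  # ['One', 'Two', 'Three']
```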
<h2>Conclusion</h2>
<p>With that, we come to the end of this tutorial. I hope the methods discussed in this article have helped you and answered your queries. Please stay tuned and subscribe for more solutions and discussions in the future.</p>
<p>Happy learning!<img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f40d.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>
<hr class="wp-block-separator has-alpha-channel-opacity" />
<p><strong>But before we move on, I’m excited to present to you my new Python book <a rel="noreferrer noopener" href="https://amzn.to/2WAYeJE" target="_blank" title="https://amzn.to/2WAYeJE">Python One-Liners</a></strong> (Amazon Link).</p>
<p>If you like one-liners, you’ll LOVE the book. It’ll teach you everything there is to know about a <strong>single line of Python code.</strong> But it’s also an <strong>introduction to computer science</strong>, data science, machine learning, and algorithms. <strong><em>The universe in a single line of Python!</em></strong></p>
<div class="wp-block-image">
<figure class="aligncenter"><a href="https://amzn.to/2WAYeJE" target="_blank" rel="noopener noreferrer"><img loading="lazy" decoding="async" width="215" height="283" src="https://blog.finxter.com/wp-content/uploads/2020/02/image-1.png" alt="" class="wp-image-5969"/></a></figure>
</div>
<p>The book was released in 2020 with the world-class programming book publisher No Starch Press (San Francisco). </p>
<p>Link: <a href="https://nostarch.com/pythononeliners" target="_blank" rel="noreferrer noopener">https://nostarch.com/pythononeliners</a></p>
</div>


https://www.sickgaming.net/blog/2022/12/...ubstrings/