[Tut] Python | Split String Multiple Whitespaces

Summary: The simplest way to split a string at runs of whitespace is to call the built-in split method with no arguments, as in given_string.split(). An alternative is to use functions from Python's re (regular expression) module to split the string at one or more whitespace characters.

Minimal Example:


import re
text = "mouse\nsnake\teagle human"
# Method 1: str.split()
print(text.split())
# Method 2: re.split()
res = re.split(r"\s+", text)
print(res)
# Method 3: re.sub() followed by str.split()
res = re.sub(r'\s+', ',', text).split(',')
print(res)
# Method 4: re.findall()
print(re.findall(r'\S+', text))
# Each call prints ['mouse', 'snake', 'eagle', 'human']

Problem Formulation


Problem: Given a string, how do you split it at every run of one or more whitespace characters (spaces, tabs, newlines, carriage returns)?

Example


# Input
text = "abc\nlmn\tpqr xyz\rmno"
# Output
['abc', 'lmn', 'pqr', 'xyz', 'mno']

There are numerous ways of solving the given problem. So, without further ado, let us dive into the solutions.

Method 1: Using Regex


A flexible way to deal with multiple delimiters is Python's regular expression module, re. It provides several functions that you can use to split the given string. Let's go through them one by one.

1.1 Using re.split


The re.split(pattern, string) method matches all occurrences of the pattern in the string and divides the string along the matches resulting in a list of strings between the matches. For example, re.split('a', 'bbabbbab') results in the list of strings ['bb', 'bbb', 'b'].

Recommended Read: Python Regex Split.

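Approach: To split the string at multiple whitespace characters, use re.split(r"\s+", text). Here \s is a special sequence that matches any single whitespace character, and the + quantifier extends the match to one or more of them, so each run of whitespace acts as a single delimiter. Writing the pattern as a raw string (the r prefix) keeps Python from interpreting the backslash itself.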

Code:

import re
text = "abc\nlmn\tpqr xyz\rmno"
res = re.split(r"\s+", text)
print(res) # ['abc', 'lmn', 'pqr', 'xyz', 'mno']
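
One behavior worth knowing: when the input starts or ends with whitespace, re.split returns empty strings at the edges of the list. The following is a minimal sketch of that case, using a hypothetical padded string that is not part of the original example, along with one possible workaround (stripping the string first).

```python
import re

# Hypothetical input with leading and trailing whitespace (not from the article).
padded = "  abc\nlmn\t "

# re.split keeps the empty strings before the first and after the last match.
with_edges = re.split(r"\s+", padded)

# Stripping the string first avoids the empty edge entries.
stripped = re.split(r"\s+", padded.strip())

print(with_edges)  # ['', 'abc', 'lmn', '']
print(stripped)    # ['abc', 'lmn']
```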

1.2 Using re.findall


The re.findall(pattern, string) method scans the string from left to right, searching for all non-overlapping matches of the pattern. It returns a list of strings in the matching order when scanning the string from left to right.

Recommended Read: Python re.findall() – Everything You Need to Know

Code:

import re
text = "abc\nlmn\tpqr xyz\rmno"
print(re.findall(r'\S+', text)) # ['abc', 'lmn', 'pqr', 'xyz', 'mno']

Explanation: In the expression re.findall(r'\S+', text), every run of non-whitespace characters is found and stored in a list. \S matches any single character that is not whitespace (letters, digits, punctuation, and so on), and \S+ matches one or more of them in a row. Instead of describing the delimiters, this pattern describes the pieces you want to keep.
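
Because re.findall collects the words rather than cutting at the gaps, it never produces empty strings. The sketch below contrasts it with re.split on a whitespace-only string; the input is a hypothetical edge case chosen for illustration.

```python
import re

# Hypothetical edge case: a string containing only whitespace.
blank = " \t\n"

# findall finds no run of non-whitespace characters, so the list is empty.
found = re.findall(r"\S+", blank)

# split cuts once at the single whitespace run and returns both empty sides.
pieces = re.split(r"\s+", blank)

print(found)   # []
print(pieces)  # ['', '']
```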

1.3 Using re.sub


The regex function re.sub(P, R, S) replaces all occurrences of the pattern P with the replacement R in string S. It returns a new string. For example, if you call re.sub('a', 'b', 'aabb'), the result will be the new string 'bbbb' with all characters 'a' replaced by 'b'.

Approach: Use the re.sub method to replace every run of whitespace characters in the given string with a comma. The string then contains commas instead of whitespace, and you can split it with the normal string split method by passing the comma as the delimiter. Note that this only works reliably when the original text contains no commas of its own.

Code:

import re
text = "abc\nlmn\tpqr xyz\rmno"
res = re.sub(r'\s+', ',', text).split(',')
print(res) # ['abc', 'lmn', 'pqr', 'xyz', 'mno']
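
The replace-then-split approach depends on the replacement character not appearing in the data. A small sketch of what happens otherwise, assuming a hypothetical input that already contains a comma:

```python
import re

# Hypothetical input that already contains a comma (not from the article).
text_with_comma = "red,green\tblue  cyan"

# The comma that was part of the data is treated as a delimiter too.
via_sub = re.sub(r"\s+", ",", text_with_comma).split(",")

# Splitting on whitespace directly leaves the original comma alone.
via_split = re.split(r"\s+", text_with_comma)

print(via_sub)    # ['red', 'green', 'blue', 'cyan']
print(via_split)  # ['red,green', 'blue', 'cyan']
```

If the data may contain commas, splitting on the whitespace pattern directly is the safer choice.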


Method 2: Using split()


When called without a separator argument, the split method treats any run of consecutive whitespace characters as a single delimiter and ignores leading and trailing whitespace, so the result contains no empty strings. You can rely on this default behavior to split the given string at multiple whitespaces simply by calling split().

Code:

text = "abc\nlmn\tpqr xyz\rmno"
print(text.split())
# ['abc', 'lmn', 'pqr', 'xyz', 'mno']
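
The default behavior differs from passing an explicit separator. As a brief illustration (the input string here is a hypothetical example), split(" ") cuts at every single space and does not treat tabs as delimiters:

```python
# Hypothetical input: two spaces between the first words, then a tab.
text = "abc  lmn\tpqr"

# No argument: any run of whitespace is one delimiter.
default_split = text.split()

# Explicit " ": each space is its own delimiter and the tab is left in place.
space_split = text.split(" ")

print(default_split)  # ['abc', 'lmn', 'pqr']
print(space_split)    # ['abc', '', 'lmn\tpqr']
```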

Recommended Digest: Python String split()

Conclusion


We have successfully solved the given problem using different approaches. Simply using split could do the job for you. However, feel free to explore and try out the other options mentioned above. I hope this article helped you in your Python coding journey. Please subscribe and stay tuned for more interesting articles.

Happy Pythoning!


https://www.sickgaming.net/blog/2022/12/...itespaces/