[Tut] Python Regex Multiple Repeat Error


<div><p>Just like me an hour ago, you’re probably sitting in front of your regular expression code, puzzled by a strange error message:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">re.error: multiple repeat at position x</pre>
<p>How does it arise? Where does it come from? And, most importantly, how can you get rid of it?</p>
<p>This article gives you answers to all of those questions. Alternatively, you can also watch my short explainer video that shows you real quick how to resolve this error:</p>
<figure class="wp-block-embed-youtube wp-block-embed is-type-rich is-provider-embed-handler wp-embed-aspect-16-9 wp-has-aspect-ratio">
<div class="wp-block-embed__wrapper">
<div class="ast-oembed-container"><iframe title="Python Regex Multiple Repeat Error" width="1100" height="619" src="https://www.youtube.com/embed/BtogzCIT4zA?feature=oembed" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></div>
</div>
</figure>
<h2>How Does the Multiple Repeat Error Arise in Python Re?</h2>
<p><strong>Python’s <a rel="noreferrer noopener" aria-label=" (opens in a new tab)" href="https://blog.finxter.com/python-regex/" target="_blank">regex library re</a> raises the multiple repeat error when you stack two regex quantifiers directly on top of each other. For example, the regex <code>'a++'</code> causes the multiple repeat error on Python versions before 3.11 (from 3.11 on, <code>++</code> is parsed as a possessive quantifier, while patterns such as <code>'a**'</code> still fail). You can get rid of this error by not stacking quantifiers on top of each other. </strong></p>
<p>Here’s an example:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall('a++', 'aaaa')
Traceback (most recent call last):
  File "&lt;pyshell#29>", line 1, in &lt;module>
    re.findall('a++', 'aaaa')
  File "C:\Users\xcent\AppData\Local\Programs\Python\Python37\lib\re.py", line 223, in findall
    ...
re.error: multiple repeat at position 2</pre>
<p>I have shortened the error message to focus on the relevant parts. In the code, you first import the regex library re. You then use the <code>re.findall(pattern, string)</code> function (<a rel="noreferrer noopener" aria-label="see this blog tutorial (opens in a new tab)" href="https://blog.finxter.com/python-re-findall/" target="_blank">see this blog tutorial</a>) to find the pattern <code>'a++'</code> in the string <code>'aaaa'</code>.</p>
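<p>However, the pattern doesn’t make a lot of sense to the regex engine used here (Python 3.7): the first <code>+</code> already means “one or more <code>a</code>”, so what would the second <code>+</code> in <code>a++</code> repeat?</p>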
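<p>As a minimal sketch of how you might inspect the error yourself: the exception raised by <code>re</code> carries the message and the offending position as attributes. The helper name <code>describe</code> is made up for this illustration, and the pattern <code>'a**'</code> is used as a stand-in because it should be rejected regardless of the Python version, whereas <code>'a++'</code> is accepted as a possessive quantifier from Python 3.11 on.</p>

```python
import re
import sys


def describe(pattern):
    """Return 'ok' if the pattern compiles, else the error message and position."""
    try:
        re.compile(pattern)
    except re.error as err:
        # err.msg is the bare message, err.pos the index inside the pattern
        return f"{err.msg} at position {err.pos}"
    return "ok"


print(describe("a**"))  # stacked quantifiers: multiple repeat at position 2
print(describe("a++"))  # an error before Python 3.11, 'ok' from 3.11 on
print(sys.version_info >= (3, 11))  # tells you which of the two cases applies
```

<p>The position counts from zero, so position 2 points at the second quantifier symbol, which is the one you need to remove or escape.</p>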
<h2>[Tips] What’s the Source of the Multiple Repeat Error and How to Avoid It?</h2>
<p>The error happens if you use the Python <a rel="noreferrer noopener" aria-label="regex (opens in a new tab)" href="https://blog.finxter.com/python-regex/" target="_blank">regex</a> package <code>re</code>. It can show up in several forms, but all of them have the same source: you stack quantifiers directly on top of each other. </p>
<p>If you don’t know what a quantifier is, scroll down to the section on Python regex quantifiers, where I show you exactly what it is.</p>
<p>Here’s a list of reasons for the error message. Maybe your reason is among them?</p>
<ul>
<li>You use the regex pattern <code>'X++'</code> for any regex expression <code>X</code> on a Python version before 3.11 (newer versions read <code>++</code> as a possessive quantifier). To avoid this error, get rid of one quantifier.</li>
<li>You use the regex pattern <code>'X+*'</code> for any regex expression <code>X</code>. To avoid this error, get rid of one quantifier.</li>
<li>You use the regex pattern <code>'X**'</code> for any regex expression <code>X</code>. To avoid this error, get rid of one quantifier.</li>
<li>You use the regex pattern <code>'X{m,n}*'</code> for any regex expression <code>X</code> and number of repetitions <code>m</code> and <code>n</code>. To avoid this error, get rid of one quantifier.</li>
<li>You want to match a literal <code>'+'</code> character and put a quantifier after it, as in <code>'C+*'</code>. The unescaped <code>+</code> is itself a quantifier, so you should escape it: <code>'C\+*'</code>. Note that <code>'+?'</code> after an expression is not an error but the non-greedy version of <code>+</code>.</li>
<li>You want to match a literal <code>'*'</code> character and put a quantifier after it, as in <code>'2**'</code>. Avoid this error by escaping the first quantifier symbol: <code>'2\**'</code>. </li>
</ul>
<p>Oftentimes, the error appears if you don’t properly escape the special quantifier metacharacters in your regex pattern. </p>
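<p>The cases from the list can be checked with a few lines of code. This is only a sketch: the helper <code>compiles</code> is a name I made up for the illustration, and the patterns are chosen so that they behave the same before and after Python 3.11.</p>

```python
import re


def compiles(pattern):
    """Return True if `re` accepts the pattern, False if it raises re.error."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Stacked quantifiers are rejected ...
print(compiles(r"a**"))      # False
print(compiles(r"a+*"))      # False
print(compiles(r"a{1,2}*"))  # False

# ... escaping the first symbol turns it into a literal character ...
print(compiles(r"a\+*"))     # True: 'a' followed by zero or more '+'
print(compiles(r"a\**"))     # True: 'a' followed by zero or more '*'

# ... and a group makes a nested repetition explicit.
print(compiles(r"(a+)*"))    # True

print(re.findall(r"C\+*", "C C+ C++"))  # ['C', 'C+', 'C++']
```

<p>If you really want to repeat an already quantified expression, wrapping it in a group as in <code>(a+)*</code> is accepted, although it is rarely what you need.</p>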
<p>Here’s a <a rel="noreferrer noopener" aria-label="StackOverflow (opens in a new tab)" href="https://stackoverflow.com/questions/19942314/python-multiple-repeat-error" target="_blank">StackOverflow</a> post that shows some code where this happened:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">...
term = 'lg incite" OR author:"http++www.dealitem.com" OR "for sale'
p = re.compile(term, re.IGNORECASE)
...</pre>
<p>I edited the given code snippet to show the important part. The code fails because of a <code>multiple repeat error</code>. Can you see why?</p>
<p>The reason is that the regex <code>'lg incite" OR author:"http++www.dealitem.com" OR "for sale'</code> contains two plus quantifiers stacked on top of each other in the substring <code>'http++'</code>. Remove one of them, or escape them if the plus signs are meant as literal characters, and the code will run again!</p>
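<p>If the search term is meant as plain text rather than as a regex, one possible fix is to pass it through <code>re.escape()</code> before compiling. The following is a sketch under that assumption; the <code>text</code> string is invented for the example and is not part of the StackOverflow post.</p>

```python
import re

term = 'lg incite" OR author:"http++www.dealitem.com" OR "for sale'

# re.escape puts a backslash before every regex metacharacter in the term,
# so '+' and '.' are matched as plain characters.
pattern = re.compile(re.escape(term), re.IGNORECASE)

# Hypothetical input text, made up for this illustration.
text = 'Query: LG Incite" or author:"HTTP++www.dealitem.com" or "For Sale'
match = pattern.search(text)
print(match is not None)  # True
```

<p>Whether escaping or removing the duplicate quantifier is the right fix depends on what the pattern is supposed to match.</p>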
<h2>Python Regex Quantifiers</h2>
<p>The word “<a href="https://www.merriam-webster.com/dictionary/quantity" target="_blank" rel="noreferrer noopener" aria-label=" (opens in a new tab)">quantifier</a>” originates from Latin: its meaning is <strong>quantus = how much / how often</strong>.</p>
<p><strong>This is precisely what a regular expression quantifier means: you tell the regex engine how often you want to match a given pattern. </strong></p>
<p>If you think you don’t define any quantifier, you do it implicitly: no quantifier means to match the regular expression exactly once.</p>
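<p>A quick way to convince yourself of this, as a small sketch: a pattern without a quantifier and the same pattern with an explicit <code>{1}</code> should give identical results.</p>

```python
import re

# No quantifier and an explicit {1} both mean "exactly once".
implicit = re.findall('a', 'aaaa')
explicit = re.findall('a{1}', 'aaaa')

print(implicit)              # ['a', 'a', 'a', 'a']
print(implicit == explicit)  # True
```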
<p>So what are the regex quantifiers in Python?</p>
<figure class="wp-block-table is-style-stripes">
<table>
<tbody>
<tr>
<td>Quantifier</td>
<td>Meaning</td>
</tr>
<tr>
<td><code>A?</code></td>
<td>Match regular expression <code>A</code> zero or one times</td>
</tr>
<tr>
<td><code>A*</code></td>
<td>Match regular expression <code>A</code> zero or more times</td>
</tr>
<tr>
<td><code>A+</code></td>
<td>Match regular expression <code>A</code> one or more times</td>
</tr>
<tr>
<td><code>A{m}</code></td>
<td>Match regular expression <code>A</code> exactly <code>m</code> times</td>
</tr>
<tr>
<td><code>A{m,n}</code></td>
<td>Match regular expression <code>A</code> between <code>m</code> and <code>n</code> times (inclusive)</td>
</tr>
</tbody>
</table>
</figure>
<p>Note that in this tutorial, I assume you have at least a remote idea of what regular expressions actually are. If you haven’t, no problem, check out my <a rel="noreferrer noopener" aria-label="detailed regex tutorial on this blog (opens in a new tab)" href="https://blog.finxter.com/python-regex/" target="_blank">detailed regex tutorial on this blog</a>.</p>
<p>You see in the table that the quantifiers <code>?</code>, <code>*</code>, <code>+</code>, <code>{m}</code>, and <code>{m,n}</code> define how often you repeat the matching of regex <code>A</code>. </p>
<p>Let’s have a look at some examples—one for each quantifier:</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> import re
>>> re.findall('a?', 'aaaa')
['a', 'a', 'a', 'a', '']
>>> re.findall('a*', 'aaaa')
['aaaa', '']
>>> re.findall('a+', 'aaaa')
['aaaa']
>>> re.findall('a{3}', 'aaaa')
['aaa']
>>> re.findall('a{1,2}', 'aaaa')
['aa', 'aa']</pre>
<p>In each line, you try a different quantifier on the same text <code>'aaaa'</code>. And, interestingly, each line leads to a different output:</p>
<ul>
<li>The <a rel="noreferrer noopener" aria-label="zero-or-one (opens in a new tab)" href="https://blog.finxter.com/python-re-question-mark/" target="_blank">zero-or-one</a> regex <code>'a?'</code> matches a single <code>'a'</code> four times. It doesn’t match zero characters if it can avoid doing so, which is why the empty match appears only at the end of the string.</li>
<li>The <a rel="noreferrer noopener" href="https://blog.finxter.com/python-re-question-mark/" target="_blank">zero-or-more</a> regex <code>'a*'</code> matches all four <code>'a'</code>s at once and consumes them. At the end of the string, it can still match the empty string.</li>
<li>The <a rel="noreferrer noopener" href="https://blog.finxter.com/python-re-question-mark/" target="_blank">one-or-more</a> regex <code>'a+'</code> matches all four <code>'a'</code>s at once. In contrast to the previous quantifier, it cannot match an empty string.</li>
<li>The repeating regex <code>'a{3}'</code> matches exactly three <code>'a'</code>s in a single run. It can do so only once because only one <code>'a'</code> is left afterwards.</li>
<li>The repeating regex <code>'a{1,2}'</code> matches one or two <code>'a'</code>s. It tries to match as many as possible, so it finds <code>'aa'</code> twice.</li>
</ul>
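<p>To see where each match starts and ends, you can look at the match positions instead of the matched strings. Here is a minimal sketch that uses <code>re.finditer()</code>; the helper name <code>spans</code> is made up for this illustration.</p>

```python
import re


def spans(pattern, text):
    """Return (start, end) index pairs for every match of pattern in text."""
    return [m.span() for m in re.finditer(pattern, text)]


print(spans('a?', 'aaaa'))      # [(0, 1), (1, 2), (2, 3), (3, 4), (4, 4)]
print(spans('a*', 'aaaa'))      # [(0, 4), (4, 4)]
print(spans('a+', 'aaaa'))      # [(0, 4)]
print(spans('a{3}', 'aaaa'))    # [(0, 3)]
print(spans('a{1,2}', 'aaaa'))  # [(0, 2), (2, 4)]
```

<p>A pair such as <code>(4, 4)</code> is an empty match: it starts and ends at the end of the string without consuming a character.</p>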
<p>You’ve learned the basic quantifiers of Python regular expressions. </p>
<h2>Where to Go From Here?</h2>
<p>To summarize, you’ve learned that the multiple repeat error appears whenever you try to stack multiple quantifiers on top of each other. Avoid this and the error message will disappear. </p>
<p>If you want to boost your Python regex skills to the next level, check out my free <a href="https://blog.finxter.com/python-regex/" target="_blank" rel="noreferrer noopener" aria-label="in-depth regex superpower tutorial (20,000+) words (opens in a new tab)">in-depth regex superpower tutorial (20,000+ words)</a>. Or just bookmark the article to read later.</p>
</div>


https://www.sickgaming.net/blog/2020/02/...eat-error/