
Python Programming Help


Python Program to Remove Consecutive Identical Words from a String

Hi Everyone,

Visit the Codersarts forum section to find highly rated Python programs every day.


Let's start...


Suppose our string is:


Mystring = "my friend's new new new new and old old cats are running running in the street"

The output string should look like this:


my friend's new and old cats are running in the street

Solution:


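import re

# Input sentence containing runs of repeated words
Mystring = "my friend's new new new new and old old cats are running running in the street"
# Replace each run of a repeated word with a single copy (group 1)
res = re.sub(r'\b(\w+\s*)\1{1,}', r'\1', Mystring)
print(res)  # my friend's new and old cats are running in the street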
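
To see what the pattern actually matches before anything is replaced, it can help to list the matches. The sketch below uses re.finditer with the same pattern; the variable names text, pattern and matches are only illustrative.

```python
import re

text = "my friend's new new new new and old old cats are running running in the street"
pattern = re.compile(r'\b(\w+\s*)\1{1,}')

# For every run of repeated words, keep the whole match and group 1
# (group 1 is the single copy that the replacement keeps).
matches = [(m.group(0), m.group(1)) for m in pattern.finditer(text)]

for whole, first in matches:
    print(repr(whole), '->', repr(first))
```

Each whole match is a run such as 'new new new new ' and group 1 is the single copy 'new ', trailing space included.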

Explanation:


import re imports Python's regular expression module. To read more about regular expressions, see the documentation of the re module.


regex pattern details:


  • \b - word boundary

  • (\w+\s*) - one or more word characters (\w+) followed by any amount of whitespace (\s*), enclosed in a capturing group (...)

  • \1{1,} - a backreference to the text matched by the 1st capturing group, repeated one or more times ({1,})
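
The pattern works for the sample sentence, but two edge cases are worth knowing. Because \w+ may match only part of a word, a doubled letter at the start of a word is collapsed ("eel" becomes "el"). Because the whitespace is inside the group, a repeated word at the very end of the string is missed when nothing follows it. Below is a minimal sketch of a stricter variant, assuming whole-word matching is what you want. The function name remove_consecutive_duplicates is hypothetical and not part of the original solution.

```python
import re

ORIGINAL_PATTERN = r'\b(\w+\s*)\1{1,}'

def remove_consecutive_duplicates(text):
    # (\w+) captures one whole word; (?:\s+\1\b)+ then requires whitespace
    # followed by the same word, ending on a word boundary, one or more times.
    return re.sub(r'\b(\w+)(?:\s+\1\b)+', r'\1', text)

# Edge case 1: a doubled letter at the start of a word.
eel_original = re.sub(ORIGINAL_PATTERN, r'\1', "an eel in the pool")
eel_strict = remove_consecutive_duplicates("an eel in the pool")

# Edge case 2: a repeated word at the very end of the string.
end_original = re.sub(ORIGINAL_PATTERN, r'\1', "the cats are old old")
end_strict = remove_consecutive_duplicates("the cats are old old")

print(eel_original, '|', eel_strict)
print(end_original, '|', end_strict)
```

Both patterns are case-sensitive, so "The the" is left alone by either one.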


Read on for more about regex search and replace:


re.sub(regex, replacement, subject) performs a search-and-replace across subject, replacing all matches of regex in subject with replacement. The result is returned by the sub() function. The subject string you pass is not modified.
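
A short sketch of that behaviour: re.sub() returns a new string and leaves the subject untouched. The related function re.subn() additionally reports how many replacements were made. The sample text and variable names here are only illustrative.

```python
import re

subject = "new new and old old cats"
pattern = r'\b(\w+\s*)\1{1,}'

# re.sub returns the modified copy; the subject string is not changed.
result = re.sub(pattern, r'\1', subject)

# re.subn returns a tuple: (new string, number of replacements made).
result_n, count = re.subn(pattern, r'\1', subject)

print(result)
print(subject)
print(count)
```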


If the regex has capturing groups, you can use the text matched by the part of the regex inside the capturing group. To substitute the text from the third group, insert \3 into the replacement string. If you want to use the text of the third group followed by a literal three as the replacement, use \g<3>3. \33 is interpreted as the 33rd group. It is an error if there are fewer than 33 groups. If you used named capturing groups, you can use them in the replacement text with \g<name>.
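
The group references described above can be tried out with a small sketch. The date string and the group names year, month and day are assumptions chosen for illustration.

```python
import re

date = "2024-03-15"
numbered = r'(\d{4})-(\d{2})-(\d{2})'
named = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'

# \3 inserts the text of the third group.
third = re.sub(numbered, r'\3', date)

# \g<3>3 inserts the third group followed by a literal 3.
third_plus_three = re.sub(numbered, r'\g<3>3', date)

# \g<name> inserts the text of a named group.
reordered = re.sub(named, r'\g<day>/\g<month>/\g<year>', date)

# \33 is read as group 33, which does not exist here, so re.sub raises re.error.
try:
    re.sub(numbered, r'\33', date)
    group_33_failed = False
except re.error:
    group_33_failed = True

print(third, third_plus_three, reordered, group_33_failed)
```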


The re.sub() function applies the same backslash logic to the replacement text as is applied to the regular expression. Therefore, you should use raw strings for the replacement text. Note that re.sub() still interprets \n and \t in the replacement, even when it is written as a raw string. If you want c:\temp as a replacement, use either r"c:\\temp" or "c:\\\\temp". The 3rd backreference is r"\3" or "\\3".
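
The backslash rules are easiest to see side by side. In this sketch the placeholder word PATH and the sample strings are hypothetical. The last call shows one alternative: when the replacement is a function, its return value is inserted literally, without backslash processing.

```python
import re

template = "save to PATH"

# Escaped correctly: both spellings produce a single literal backslash.
raw_form = re.sub(r'PATH', r"c:\\temp", template)
plain_form = re.sub(r'PATH', "c:\\\\temp", template)

# Not escaped: re.sub reads \t in the replacement as a tab character.
tab_form = re.sub(r'PATH', r"c:\temp", template)

# \n in a raw replacement string becomes a real newline.
newline_form = re.sub(r', ', r'\n', "one, two")

# A function's return value is used as-is, so no extra escaping is needed.
func_form = re.sub(r'PATH', lambda m: r"c:\temp", template)

print(raw_form)
print(repr(tab_form))
print(repr(newline_form))
```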





For other top-rated Python programs, visit the Codersarts forum.

