Python Forum
Put the new line after regex pattern
#1
Hi,
I have this code:

Quote:text = """MyCo Please have a look at this building’s premium. It looks to be a very high rate. <img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra

Description automatically generated" style="width:9.1in; height:1.6833in"> The client has a few policies with MyCo as supporting business. 000000123 BonaNou Family T. Please revert back asap. Thanks Sta Sel: 000 00000 pos: [email protected] Lid van:Quanta Primary Cooperative Ltd.



SANTAM Please have a look at this building’s premium. It looks to be a very high rate. <img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra

Description automatically generated" style="width:9.1in; height:1.6833in"> The client has a few policies with MyCo as supporting business. 000000123 BonaNou Family T. Please revert back asap. Thanks Sta Sel: 000 00000 pos: [email protected] Lid van:Quanta Primary Cooperative Ltd.





<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra

Description automatically generated" style="width:9.1in; height:1.6833in"> Dear Mr Smith King thank you and your team for all the assistance throughout the years. Unfortunately, I have decided to depart from Origin and MyCo for personal reasons. I've attached the cancellation letter for my policy to be implemented immediately.Kind Regards """

pattern = re.compile(r'<(img.+.+)>', re.DOTALL)
My code matches all of these patterns:
Quote:<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra

Description automatically generated" style="width:9.1in; height:1.6833in">

But I want to update my code so that it inserts a newline after that pattern. Please help. I want to have:
Quote:<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra

Description automatically generated" style="width:9.1in; height:1.6833in">
Dear Mr Smith King thank you and your team for all the assistance throughout the years. Unfortunately, I have decided to depart from Origin and MyCo for personal reasons. I've attached the cancellation letter for my policy to be implemented immediately.Kind Regards
#2
Please do not create new threads on the same or a closely related subject.
Instead, continue posting in the original thread.
Read carefully: https://python-forum.io/thread-40298-pos...#pid170817
#3
(Jul-06-2023, 08:35 PM)Larz60+ Wrote: Please do not create new threads on same or very related subject.
Instead, continue posts on original thread.
read carefully https://python-forum.io/thread-40298-pos...#pid170817

Hi,

They are not related. In this thread I want to put a newline after my regex match.
#4
Maybe try re.sub()? That is for replacing bits of a string with something else.

import re

mystring = """
<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra Description automatically generated" style="width:9.1in; height:1.6833in">
Blablablabla
<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra">
Description automatically generated" style="width:9.1in; height:1.6833in"
Blablablabla
<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]">
Description automatically generated" style="width:9.1in; height:1.6833in
Blablablabla
"""

# Original pattern from post #1, kept for reference:
# pattern = re.compile(r'<(img.+.+)>', re.DOTALL)
mypat = re.compile(r'">')  # the quote + bracket that closes each tag
repl = '">\n'              # the same two characters followed by a newline
# Signature: re.sub(pattern, repl, string, count=0, flags=0)
result = re.sub(mypat, repl, mystring)
for d in result.split('\n'):
    print(d)
The same replacement can be done in your text editor with Ctrl+H (find and replace).
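
If you would rather match the whole tag than just the closing `">`, here is a minimal sketch of the same re.sub() idea. The `sample` string is made up for illustration (it is not the e-mail text from post #1), and it assumes no attribute value contains a literal `>`.

```python
import re

# Hypothetical sample; the alt attribute spans two lines like the original.
sample = 'Intro <img id="a" alt="two\nlines"> Dear Mr Smith <img id="b"> Bye'

# Non-greedy (.*?) so each tag is matched on its own; re.DOTALL lets "."
# cross the line break inside the alt attribute.
img_tag = re.compile(r"<img.*?>", re.DOTALL)

# \g<0> puts the whole match back, and \n adds a newline right after it.
result = img_tag.sub(r"\g<0>\n", sample)
print(result)
```

Note that the greedy `.+` in the pattern from post #1, combined with re.DOTALL, would run from the first `<img` to the last `>` in the text, which is why the sketch uses `.*?` instead.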
#5
(Jul-08-2023, 10:44 PM)Pedroski55 Wrote: Maybe try re.sub()? That is for replacing bits of a string with something else.


Hi, thanks for this,

I ran your code as it is, and it worked fine, then I changed it, and it didn't work as expected.

I put Blablablabla on the same line as the pattern we are looking for:

Quote:Blablablabla <img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra">
Description automatically generated" style="width:9.1in; height:1.6833in"

Here's my desired output. "Blablablabla" should be separated from the pattern, and the pattern should go on a new line:
Quote:Blablablabla
<img width="874" height="162" id="Picture_x0020_2" src="cid:[email protected]" alt="A picture containing text, screenshot, font, algebra">
Description automatically generated" style="width:9.1in; height:1.6833in"
#6
You asked for a newline after the pattern, not before. Should be an easy fix.
import re

text = """<img id="Picture_1">Blablablabla <img id="Picture_2">
Blablablabla<img id="Picture_1">Blablablabla
"""

matches = set(re.findall(r"<img.*?>", text, re.DOTALL))
for match in matches:
    text = text.replace(match, f"\n{match}\n")
print(text)
Of course you get extra blank lines if a line starts with or ends with the pattern.
Output:

<img id="Picture_1">
Blablablabla
<img id="Picture_2">

Blablablabla
<img id="Picture_1">
Blablablabla
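
If the extra blank lines are a problem, one possible variation (a sketch, not the only way) is to let the pattern swallow any whitespace around each tag and then collapse repeated newlines. It assumes you do not need to keep blank lines elsewhere in the text, because the clean-up step removes those too.

```python
import re

text = """<img id="Picture_1">Blablablabla <img id="Picture_2">
Blablablabla<img id="Picture_1">Blablablabla
"""

# \s* on both sides consumes spaces and newlines already next to a tag,
# so the inserted newlines replace them instead of adding to them.
spaced = re.sub(r"\s*(<img.*?>)\s*", r"\n\1\n", text, flags=re.DOTALL)

# Collapse runs of newlines (e.g. between two adjacent tags) and trim the ends.
cleaned = re.sub(r"\n{2,}", "\n", spaced).strip()
print(cleaned)
```

Each tag and each piece of text then ends up on its own line with no empty lines in between.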