Home>

I want to replace some href attributes in the order listed.

I was able to get the href attribute that I wanted to replace with BeautifulSoup and a regular expression
I don't know how to browse the list and replace in order.
(I know how to replace one href attribute using replace and sub, but
Could not replace in order)

* href attribute to be replaced
・ The class name is sample
-If possible, replace only https://www.change.com/◯◯/test

I would appreciate it if you could tell me. Thank you.

Applicable source code
<!-list link_after = ['after_ABC', 'after2_AB', 'after_045']->
<body>
<!-→ Do not change->

    <!-→ Change to https://www.change.com/list [0]/test-->


    <!-→ Change to https://www.change.com/list [1]/test-->


    <!-→ Change to https://www.change.com/list [2]/test-->

<!-→ Do not change->
</body>
import re
from bs4 import BeautifulSoup
html = 'text.html'
pattern = r'https: //www.change.com/ (. +) \/test '
link_after = ['after_ABC', 'after2_AB', 'after_045']
soup = BeautifulSoup (open (html, encoding = 'utf-8'), 'html.parser')
for x in soup.select ('. sample a'):
    link_before = x.get ('href')
    mo = re.findall (pattern, link_before)
  • Answer # 1

    The list of replacement strings should be looped with zip () along with the list of tags.

    import re
    from bs4 import BeautifulSoup
    html = "sample.html"
    pattern = r "(https://www.change.com/).+(/test)"
    link_after = ["after_ABC", "after2_AB", "after_045"]
    soup = BeautifulSoup (open (html, encoding = "utf-8"), "html.parser")
    for a_tag, after in zip (soup.select (". sample a"), link_after):
        a_tag ["href"] = re.sub (pattern, fr "\ 1 {after} \ 2", a_tag ["href"])
    print (soup)
    #
    #
    # 
    # 
    # 
    # 
    # 
    # 
    # 
    # 
    # 
    # 
    # 
    #