I'm using Selenium and Beautiful Soup.

The HTML contains:
Search string
aaaaaaaaaa
bbbbbbbbbb
Search string
ccccccccc
ddddd "

As described above, I want to retrieve URL4, so I did the following.

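    from selenium import webdriver
    from bs4 import BeautifulSoup

    driver = webdriver.Chrome(r"\chromedriver.exe")
    html = driver.page_source.encode('utf-8')
    soup = BeautifulSoup(html, "html.parser")
    # Collect every a tag whose text matches the search string
    Murl = []
    for f in soup.find_all('a'):
        if f.getText() == "Search string":
            Murl.append(f)
    print(Murl)
    # The second match is the link I want
    NextURL = str(Murl[1].get('href'))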


Murl ends up containing every a tag whose text matches.
I ran it once, checked the printed output, and picked the tag I wanted from the list (in this case it was the second one, so I chose Murl[1]) to extract the URL.

This works for now, but is there a simpler way to get the contents of the href attribute, without writing a for statement and an if statement first? I wondered if it could be made more concise.

I would be grateful for any suggestions on an easier way to do this.

  • Answer #1

    Murl = soup.find_all('a', text='Search string')
    NextURL = str(Murl[1].get('href'))

    Written like this, you don't need a for loop or an if statement.
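
For anyone who wants to try this without a browser, here is a minimal sketch of the same idea. The markup in SAMPLE_HTML is an assumption made up for illustration (the real page was not shown in full), and URL1 to URL4 are placeholders. It uses string=, which recent Beautiful Soup releases prefer over the older text= argument.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; URL1..URL4 are placeholders.
SAMPLE_HTML = """
<a href="URL1">Search string</a>
<a href="URL2">bbbbbbbbbb</a>
<a href="URL4">Search string</a>
<a href="URL3">ccccccccc</a>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# string= keeps only the a tags whose text is exactly the given value.
matches = soup.find_all("a", string="Search string")

# Pull out the href values so the index is applied to plain strings.
hrefs = [a.get("href") for a in matches]
next_url = hrefs[1]  # second match, as in the question
```

Note that string= only matches when the text is identical, including case and surrounding whitespace.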

  • Answer #2

    This may not be quite what you are looking for,
    but for now, here it is as a list comprehension.

    Murl = [f for f in soup.find_all('a') if f.getText() == "Search string"]
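
The comprehension can also return the href values directly instead of the tags. The following is only a sketch: hrefs_with_text is a hypothetical helper name, and stripping whitespace around the link text is an added assumption, not something the original code does.

```python
from bs4 import BeautifulSoup


def hrefs_with_text(html, wanted):
    """Return the href of every a tag whose stripped text equals `wanted`."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        a["href"]
        # href=True skips a tags that have no href attribute at all.
        for a in soup.find_all("a", href=True)
        # strip=True ignores leading and trailing whitespace in the link text.
        if a.get_text(strip=True) == wanted
    ]
```

With this, the second matching URL would be hrefs_with_text(html, "Search string")[1], where html is the page source.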