When scraping URLs that follow a regular pattern, I use a for loop like the
one shown below.

from bs4 import BeautifulSoup
import requests

No = ['01', '02', '03', '04']
for n in No:
    url = 'html://www.aaa-' + str(n)
    result = requests.get(url)
    c = result.content
    soup = BeautifulSoup(c, 'lxml')
    summary = soup.find('div')
    table = summary.find_all('table')
    rows = table[1].find_all('tr')  # Get table contents


When I scrape with the code above, table[1] does not exist for
html://www.aaa-03, so the following error is raised:
IndexError: list index out of range
I want to ignore this error (that is, skip the rest of that iteration when it
occurs) and move on to the next URL, html://www.aaa-04.
How should I add exception handling to do this?

  • Answer # 1

    I think this case can be handled with just an if statement.

    table = summary.find_all('table')
    if len(table) < 2:
        continue
    rows = table[1].find_all('tr')

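    Exception handling can make control flow jump in unexpected ways, so it should be used with caution.
    A bare try: ... except: ... is especially risky because it gives no indication of what it catches.
    In the session below, the misspelled plint raises a NameError, but the bare except swallows it silently.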

    >>> lst = [42]
    >>> try:
    ...     plint(lst[0])
    ... except:
    ...     pass
    ...
    >>>
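
To illustrate the point made in Answer # 1, here is a minimal sketch of the same typo with a narrow except clause. The helper name first_item_with_typo is hypothetical and exists only for this example. Because only IndexError is caught, the NameError caused by the typo is reported instead of being hidden.

```python
lst = [42]

def first_item_with_typo(values):
    # Hypothetical helper: "plint" is a deliberate typo for "print".
    try:
        plint(values[0])
    except IndexError:  # catch only the error we actually expect
        return None

caught = None
try:
    first_item_with_typo(lst)
except NameError as exc:
    caught = exc  # the typo surfaces here instead of being swallowed

print(type(caught).__name__)  # prints "NameError"
```

With a bare except, the same call would return without any sign that something was wrong.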

  • Answer # 2

    try:
        ...  # contents of the for statement
    except:
        pass
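
As one possible variation on the try/except approach, the sketch below catches only IndexError and uses continue to move on to the next page. To stay runnable without network access it uses stand-in data: tables_by_page is an assumption for illustration, and each of its values plays the role of the list returned by summary.find_all('table') for one page.

```python
# Stand-in data (an assumption): one list of "tables" per page number.
tables_by_page = {
    '01': [['header'], ['row 1', 'row 2']],
    '02': [['header'], ['row 1']],
    '03': [['header']],  # no second table, like the www.aaa-03 page
    '04': [['header'], ['row 1', 'row 2', 'row 3']],
}

rows_by_page = {}
skipped = []
for n in ['01', '02', '03', '04']:
    table = tables_by_page[n]
    try:
        rows = table[1]  # stands in for table[1].find_all('tr')
    except IndexError:
        skipped.append(n)  # remember which page was skipped
        continue           # go on to the next page
    rows_by_page[n] = rows

print(skipped)
```

Keeping only the indexing line inside the try block means that unrelated errors elsewhere in the loop body are not silenced.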