Python scraping (retrieving property information from SUUMO) does not work

I want to scrape property information from SUUMO, but partway through the run I get a "Page cannot be displayed" response.
I think the code itself is basically correct, because it successfully retrieves the information for the first few pages.
I am a complete beginner who put this together by imitating examples, so I cannot work out the cause and am stuck.

AttributeError                            Traceback (most recent call last)
<ipython-input-49-ccef615dab50> in <module>
     32 soup = BeautifulSoup(c, "html.parser")
     33 summary = soup.find("div", id='js-bukkenList')
---> 34 houses = summary.find_all('li', class_='cassette js-bukkenCassette')
     35
     36 for house in houses:

AttributeError: 'NoneType' object has no attribute 'find_all'
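
From the traceback, summary is None: soup.find("div", id='js-bukkenList') found no listing container in the fetched page, which is consistent with an error page being returned instead of a results page. A minimal way to confirm what the server actually sent back could look like this (only a diagnostic sketch; it reuses the url variable and the libraries from the code below):

import requests
from bs4 import BeautifulSoup

result = requests.get(url)  # url: one of the listing-page URLs built below
print(result.status_code)   # a non-200 status means the request itself was rejected
soup = BeautifulSoup(result.content, "html.parser")
summary = soup.find("div", id='js-bukkenList')
if summary is None:
    # No listing container: almost certainly an error page was served.
    # The page title usually says what came back ("Page cannot be displayed", etc.).
    print(soup.title.string if soup.title else result.text[:300])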
Corresponding source code
!pip install beautifulsoup4
!pip install lxml
from bs4 import BeautifulSoup
import re
import requests
import time
import pandas as pd
print('done')
url = 'https://suumo.jp/jj/common/ichiran/JJ901FC004/?initFlg=1&seniFlg=1&ar=030&ta=13&scTmp=13103&scTmp=13109&scTmp=13110&scTmp=13111&scTmp=13112&scTmp=13113&ct=9999999&cb=9999999&xb=60&md=7%2C8%2C9&md=10%2C11%2C12&md=13&et=7&cn=25&newflg=0&km=1&sc=13103&sc=13109&sc=13110&sc=13111&sc=13112&sc=13113&bs=10&bs=01&bs=020&bs=0'
result = requests.get(url)
c = result.content
soup = BeautifulSoup(c, "html.parser")
summary = soup.find("div", {'id': 'js-bukkenList'})
body = soup.find("body")
pages = body.find_all("div", {'class': 'pagination pagination_set-nav'})
pages_text = str(pages)
pages_split = pages_text.split('</li></ol>')
num_pages = int(pages_split[0].split('>')[-1])
print("number of pages =", num_pages)

urls = []
urls.append(url)
for i in range(num_pages - 1):
    page_num = str(i + 2)
    url_page = url + '&pn=' + page_num
    urls.append(url_page)
data = []
jenre = ''           # property type
name = ''            # property name
price = ''           # purchase price
address = ''         # address
station = ''         # nearest station
walk = ''            # bus / walking time
area = ''            # land area
building = ''        # building area
floor_plan = ''      # floor plan
monthly = ''         # monthly payment
age = ''             # building age
link = ''            # link
kanrihi = ''         # management fee
shuzenhi = ''        # repair reserve fund
reform = ''          # renovation
total_units = ''     # total number of units
right_form = ''      # form of rights
right_district = ''  # area of use (zoning)
parking = ''         # parking lot
for url in urls:
    result = requests.get(url)
    c = result.content
    soup = BeautifulSoup(c, "html.parser")
    summary = soup.find("div", id='js-bukkenList')
    houses = summary.find_all('li', class_='cassette js-bukkenCassette')

    for house in houses:
        jenre = house.find_all('span', class_='ui-pct ui-pct--util1 cassettebox-hpct cassettebox-hpctcat')[0].string
        name = house.find_all("a", class_='js-cassetLinkHref')[0].string
        price = house.find_all('dd', class_="infodatabox-details-txt")[0].string
        address = house.find_all('div', class_='infodatabox-box-txt')[0].string
        station = house.find_all('div', class_='infodatabox-box-txt')[1].string
        walk = house.find_all('div', class_='infodatabox-box-txt')[2].text
        area = house.find_all('dd', class_="infodatabox-details-txt")[2].text
        detail = house.find_all('div', class_='infodatabox-box-txt')[4]
        cols = detail.find_all('dd', class_='infodatabox-details-txt')
        for i in range(len(cols)):
            if len(cols) == 2:
                building = detail.find_all('dd', class_='infodatabox-details-txt')[0].string
                floor_plan = detail.find_all('dd', class_='infodatabox-details-txt')[1].string
            elif len(cols) == 3:
                building = detail.find_all('dd', class_='infodatabox-details-txt')[1].text
                floor_plan = detail.find_all('dd', class_='infodatabox-details-txt')[2].string
        monthly = house.find_all('dd', class_="infodatabox-details-txt")[1].string
        age = house.find_all('div', class_='infodatabox-box-txt')[5].string
        linkbox = house.find('div', class_='cassettebox-action')
        linked = linkbox.find('a')
        link = linked.get('href')

        link_child = link + 'bukkengaiyo/'
        result_child = requests.get(link_child)
        c_child = result_child.content
        soup_child = BeautifulSoup(c_child, 'html.parser')
        summary_child = soup_child.find_all('tbody', {'class': 'vat tal'})
        try:
            kanrihi = summary_child[0].find_all('td')[5].string.strip('\r\n\t')
            shuzenhi = summary_child[0].find_all('td')[6].string.strip('\r\n\t')
            reform = summary_child[0].find_all('td')[16].contents
            total_units = summary_child[1].find_all('td')[2].string.strip('\r\n\t')
            right_form = summary_child[1].find_all('td')[5].string.strip('\r\n\t')
            right_district = summary_child[1].find_all('td')[6].string.strip('\r\n\t')
            parking = summary_child[1].find_all('td')[7].string.strip('\r\n\t')
        except:
            pass
        string = [jenre, name, price, address, station, walk, area, building, floor_plan, monthly, age, link, kanrihi, shuzenhi, reform, total_units, right_form, right_district, parking]
        data.append(string)
    time.sleep(1)

df = pd.DataFrame(data, columns=['property type', 'property name', 'purchase price', 'address', 'nearest station', 'bus/walk', 'land area', 'building area', 'floor plan', 'monthly payment', 'building age', 'URL', 'management fee', 'repair reserve fund', 'renovation', 'total units', 'form of rights', 'area of use', 'parking lot'])
df.to_csv('suumo_scrape.csv', sep=',', encoding='utf-8', header=True, index=False)
What I tried

Before I added the part that opens each property's detail page and extracts further information (the link_child = link + ... block and everything after it), the script completed without problems, extracting the information and saving it in CSV format. After adding that code, scraping only gets partway through before stopping, and I am at a loss.
I tried spacing out the requests with time.sleep(), but it still stops partway.
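Concretely, the interval amounted to a helper that sleeps before every request, roughly like this (a sketch from memory; the one-second wait is just a value I experimented with, and fetch_html is a name used here only for illustration):

import time
import requests
from bs4 import BeautifulSoup

def fetch_html(url, wait=1.0):
    # Sleep before every request so both the listing pages and the
    # per-property detail pages are spaced out, then parse the response.
    time.sleep(wait)
    result = requests.get(url)
    return BeautifulSoup(result.content, "html.parser")

Even with this in place of the direct requests.get calls, the run still stops partway through.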

Supplementary information (FW/tool version, etc.)

I am using Jupyter Notebook.