
I want to write code that scrapes the rank, title, price, and URL from the Rakuten Books ranking. I can get the information for first place, but not for second place and below.

Applicable source code
import sys
import time
from selenium import webdriver


def main():
    driver = webdriver.Chrome('PATH')
    driver.set_window_size(800, 600)  # Width 800, height 600
    navigate(driver)
    posts = scrape_posts(driver)  # Get a list of scraped entries
    for post in posts:
        print(post)
    driver.quit()  # Quit the browser.


def navigate(driver):
    '''
    Open the target page
    '''
    print('Navigating ...', file=sys.stderr)
    assert 'Rakuten Books' in driver.title


def scrape_posts(driver):
    posts = []
    for a in driver.find_elements_by_css_selector('ol'):
        posts.append({
            'rank': a.find_element_by_css_selector('b').text,
            'title': a.find_element_by_css_selector('dt>a').text,
            'value': a.find_element_by_css_selector('p.price').text,
            'url': a.find_elements_by_css_selector('a')[0].get_attribute('href'),
        })
    return posts


if __name__ == '__main__':
    main()
# 1st place
# Rank
#extra>div:nth-child(5)>ol>li:nth-child(1)>b
# URL, title
#extra>div:nth-child(5)>ol>li:nth-child(1)>dl>dt>a
# Price
#extra>div:nth-child(5)>ol>li:nth-child(1)>dl>dt>p.price
# 2nd place
# Rank
#extra>div:nth-child(5)>ol>li:nth-child(2)>b
# URL, title
#extra>div:nth-child(5)>ol>li:nth-child(2)>dl>dt>a
# Price
#extra>div:nth-child(5)>ol>li:nth-child(2)>dl>dt>p.price
Execution results
DevTools listening on ws://127.0.0.1:61750/devtools/browser/217d46f1-b992-42ae-9e1a-35f93f8674ed
Navigating ...
{'rank': '1', 'title': "Children's Six Law", 'value': '1,320 yen (tax included)', 'url': 'https://books.rakuten.co.jp/rb/15873916/?l-id=r-rank1-1'}
driver.execute_script('scroll(0, document.body.scrollHeight)')


I tried scrolling the page with the code above, but the result did not change.
I would appreciate it if you could tell me how to get the information for second place and below.

  • Answer # 1

      

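    How about changing

    for a in driver.find_elements_by_css_selector('ol'):

    to

    for a in driver.find_elements_by_css_selector('ol>li'):

    ? Judging from the output, the loop over 'ol' runs only once for the whole list, and find_element_by_css_selector returns only the first match inside it, so only first place is printed. Looping over 'ol>li' should give one entry per ranked item.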
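
    To illustrate why the choice of container matters, here is a small sketch that can be run without a browser. It uses the standard library's xml.etree.ElementTree instead of Selenium, and the markup, URLs, titles and the scrape() helper are all made up for illustration (modeled only on the selectors listed in the question), so treat it as an analogy rather than the actual Rakuten page structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, modeled on the selectors from the question
# (ol > li > b, ol > li > dl > dt > a, ol > li > dl > dt > p.price).
SAMPLE = """
<div id="extra">
  <ol>
    <li><b>1</b><dl><dt><a href="https://example.com/1">Book A</a><p class="price">1,320 yen</p></dt></dl></li>
    <li><b>2</b><dl><dt><a href="https://example.com/2">Book B</a><p class="price">990 yen</p></dt></dl></li>
  </ol>
</div>
"""


def scrape(root, container_path):
    """Build one record per element matched by container_path."""
    posts = []
    for item in root.findall(container_path):
        # find() returns only the first matching descendant, much like
        # find_element_by_css_selector does in Selenium.
        link = item.find('.//dt/a')
        posts.append({
            'rank': item.find('.//b').text,
            'title': link.text,
            'value': item.find(".//p[@class='price']").text,
            'url': link.get('href'),
        })
    return posts


root = ET.fromstring(SAMPLE)
per_list = scrape(root, './/ol')     # iterates over the single <ol>
per_item = scrape(root, './/ol/li')  # iterates over each <li>
print(len(per_list), len(per_item))
```

    Iterating over the list container yields a single record holding the first match of each field, while iterating over the list items yields one record per item. The same reasoning should apply to 'ol' versus 'ol>li' in the Selenium code, assuming the real page is structured like the selectors in the question suggest.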
