I want to write code that scrapes the rank, title, price, and URL of each book from the Rakuten Books ranking. I can get the information for first place, but not for second place and below.
Applicable source code
import sys
import time
from selenium import webdriver

def main():
    driver = webdriver.Chrome('PATH')
    driver.set_window_size(800, 600)  # Width 800, height 600
    navigate(driver)
    posts = scrape_posts(driver)  # Get a list of text content
    for post in posts:
        print(post)
    driver.quit()  # Quit the browser.

def navigate(driver):
    '''
    Open the target page
    '''
    print('Navigating ...', file=sys.stderr)
    assert 'Rakuten Books' in driver.title

def scrape_posts(driver):
    posts = []
    for a in driver.find_elements_by_css_selector('ol'):
        posts.append({
            'rank': a.find_element_by_css_selector('b').text,
            'title': a.find_element_by_css_selector('dt>a').text,
            'value': a.find_element_by_css_selector('p.price').text,
            'url': a.find_elements_by_css_selector('a')[0].get_attribute('href'),
        })
    return posts

if __name__ == '__main__':
    main()

# First place
# Ranking
#extra>div:nth-child(5)>ol>li:nth-child(1)>b
# URL, title
#extra>div:nth-child(5)>ol>li:nth-child(1)>dl>dt>a
# Price
#extra>div:nth-child(5)>ol>li:nth-child(1)>dl>dt>p.price
# Second place
# Ranking
#extra>div:nth-child(5)>ol>li:nth-child(2)>b
# URL, title
#extra>div:nth-child(5)>ol>li:nth-child(2)>dl>dt>a
# Price
#extra>div:nth-child(5)>ol>li:nth-child(2)>dl>dt>p.price
Execution results
DevTools listening on ws://127.0.0.1:61750/devtools/browser/217d46f1-b992-42ae-9e1a-35f93f8674ed
Navigating ...
{'rank': '1', 'title': "Children's Six Law", 'value': '1,320 yen (tax included)', 'url': 'https://books.rakuten.co.jp/rb/15873916/?l-id=r-rank1-1'}
driver.execute_script('scroll(0, document.body.scrollHeight)')
I tried scrolling to the bottom of the page with the code above before scraping, but the result did not change.
How can I get the information for second place and below?
Answer #1
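How about changing the loop to
    for a in driver.find_elements_by_css_selector('ol>li'):
?
The 'ol' selector matches the list as a whole, so the loop body runs only once per list, and find_element_by_css_selector returns just the first match inside it, which is the first-place entry. With 'ol>li' the loop runs once per ranking entry.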
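
For reference, here is a minimal sketch of scrape_posts with the loop changed to iterate over 'ol>li'. It is not tested against the live Rakuten Books page. It assumes every li contains a b, a dt>a, and a p.price element, as the selectors listed in the question suggest, and it takes the URL from the same dt>a link as the title instead of from the first a in the item. It also calls find_elements(by, value) rather than find_elements_by_css_selector, because the find_element(s)_by_* helpers were removed in Selenium 4.3. The CSS constant is an illustrative name, not part of Selenium.

```python
# Same string as selenium.webdriver.common.by.By.CSS_SELECTOR, spelled out so
# this sketch needs no import; in real code, use By.CSS_SELECTOR instead.
CSS = "css selector"


def scrape_posts(driver):
    """Return one dict per ranking entry (one per <li> directly inside an <ol>)."""
    posts = []
    # 'ol>li' yields every list item; 'ol' alone yields only the list itself.
    for li in driver.find_elements(CSS, "ol>li"):
        # Assumption: the title link inside <dt> also carries the book URL.
        link = li.find_element(CSS, "dt>a")
        posts.append({
            "rank": li.find_element(CSS, "b").text,
            "title": link.text,
            "value": li.find_element(CSS, "p.price").text,
            "url": link.get_attribute("href"),
        })
    return posts
```

If some list items on the page lack one of these elements, find_element raises NoSuchElementException, so a narrower selector such as the '#extra>div:nth-child(5)>ol>li' path from the question may be a safer starting point.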