
Help with a parser. I have a URL with a fixed address. How do I make it change dynamically in my code, so that https://www.eldorado.ru/c/televizory/?sort=-date_create becomes a URL with a category placeholder, {category}/?sort=-date_create?

import csv

import requests
from bs4 import BeautifulSoup

CSV = 'cars.csv'
HOST = 'https://www.eldorado.ru'
URL = 'https://www.eldorado.ru/c/televizory/?sort=-date_create'
HEADERS = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36',
}


def get_html(url, params=None):
    # Fetch a page; extra query parameters (e.g. page) are appended to the URL.
    r = requests.get(url, headers=HEADERS, params=params)
    return r


def get_content(html):
    soup = BeautifulSoup(html, 'html.parser')
    # Class names are case-sensitive; check them against the live page markup.
    items = soup.find_all('li', class_='ListingProductCardList_ProductCardListingWrapper__3-O9i')
    cars = []
    for item in items:
        try:
            link = item.find('a', class_='ListingProductcardList_ProductCardListingLink__1Jimi')
            cars.append({
                'title': link.get_text(strip=True),
                'link': HOST + link.get('href'),
                'cash': item.find('span', class_='Priceblock_buyboxprice__3qgyj priceblock_buyboxpricestyled__29j_g').get_text(strip=True),
                'img': item.find('a', class_='listingProductcardlist_producturecardlistingimagecontaincardlistingimagecontainer__fqe33').find('img').get('src'),
            })
        except (AttributeError, TypeError):
            # Skip cards that lack one of the expected elements.
            continue
    return cars


def save_doc(items, path):
    with open(path, 'w', newline='') as file:
        writer = csv.writer(file, delimiter=';')
        writer.writerow(['Product Name', 'Link', 'Price', 'Picture'])
        for item in items:
            writer.writerow([item['title'], item['link'], item['cash'], item['img']])


def parser():
    pagination = input('Specify the number of pages to parse: ')
    pagination = int(pagination.strip())
    html = get_html(URL)
    if html.status_code == 200:
        cars = []
        for page in range(1, pagination + 1):
            print(f'Parsing page: {page}')
            html = get_html(URL, params={'page': page})
            cars.extend(get_content(html.text))
            # Rewrite the file after every page so partial results are kept.
            save_doc(cars, CSV)
    else:
        print('Error')


if __name__ == '__main__':
    parser()

Handle the menu separately, build a list of categories, and iterate over it, obviously? ...
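A minimal sketch of that idea, assuming the category is the path segment right after /c/ (as in the televizory URL from the question). Only televizory comes from the question; the other slugs and the names CATEGORIES, category_url and category_urls are hypothetical and only there for illustration.

```python
HOST = 'https://www.eldorado.ru'

# Only 'televizory' appears in the question; the other slugs are
# hypothetical placeholders and must be replaced with real ones.
CATEGORIES = ['televizory', 'category-two', 'category-three']


def category_url(slug, sort='-date_create'):
    """Build the listing URL for one category slug."""
    return f'{HOST}/c/{slug}/?sort={sort}'


def category_urls(slugs):
    """Return (slug, url) pairs, one per category."""
    return [(slug, category_url(slug)) for slug in slugs]


if __name__ == '__main__':
    for slug, url in category_urls(CATEGORIES):
        print(slug, url)
```

Inside parser() you could then loop over these pairs and pass each url to get_html instead of the fixed URL, for example saving each category to its own file named after the slug.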

Jack_oS 2021-04-07 20:11:59

I can't work out how to apply that, or rather how to write it in code.

Курский 2021-04-07 20:22:32