About a Python web scripting program

https://yakkun.com/sm/zukan/n245
I want to extract and display each value (HP: 100, Kougeki: 75, ...) found on lines 140 to 145 of the page source of this site.

<tr><td class="c1" style="width:125px;">HP</td><td class="left" colspan="5"><img src="//img.yakkun.com/bar.gif" style="width:60px;height:10px" />&nbsp;100</td></tr>
<tr><td class="c1">Kougeki</td><td class="left" colspan="5"><img src="//img.yakkun.com/bar.gif" style="width:45px;height:10px" />&nbsp;75</td></tr>
<tr><td class="c1">Bougyo</td><td class="left" colspan="5"><img src="//img.yakkun.com/bar.gif" style="width:69px;height:10px" />&nbsp;115</td></tr>
<tr><td class="c1">Tokukou</td><td class="left" colspan="5"><img src="//img.yakkun.com/bar.gif" style="width:54px;height:10px" />&nbsp;90</td></tr>
<tr><td class="c1">Tokubou</td><td class="left" colspan="5"><img src="//img.yakkun.com/bar.gif" style="width:69px;height:10px" />&nbsp;115</td></tr>
<tr><td class="c1">Subayasa</td><td class="left" colspan="5"><img src="//img.yakkun.com/bar.gif" style="width:51px;height:10px" />&nbsp;85</td></tr>

I am currently putting together the program below, but it displays nothing at all. What should I change?

import requests, bs4

suikun = ""

res = requests.get('https://yakkun.com/sm/zukan/n245')
res.raise_for_status()
soup = bs4.BeautifulSoup(res.text, "html.parser")
elems = soup.select('td')

for tag in elems:
    try:
        # Pop "style" from the "span" element.
        string_ = tag.get("style").pop(0)
        # Check whether nbsp is set in the extracted style string.
        if string_ in "&nbsp;":
            # Get the string.
            suikun = tag.string
            # Interrupt the loop.
            break
    except:
        # If "style" cannot be popped from the "span" element, do nothing.
        pass

print(suikun)

The environment is Jupyter (Anaconda) on Windows 7.

  • Answer # 1

    I wrote my own version.

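    import requests
    from bs4 import BeautifulSoup

    response = requests.get('https://yakkun.com/sm/zukan/n245')
    # Let requests guess the encoding from the body so the Japanese text decodes correctly.
    response.encoding = response.apparent_encoding
    soup = BeautifulSoup(response.text, 'html.parser')

    result = {}
    # Find the stats table by its summary attribute (value as given in this translated post).
    ret = soup.find('table', summary='detailed data')
    # Rows 1 to 6 of that table hold the six stats; each row has a name cell and a value cell.
    for tr in ret.find_all('tr')[1:7]:
        name, value = map(lambda tag: tag.text.strip(), tr.find_all('td'))
        result[name] = int(value)
    print(result)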

    Execution results

    {'HP': 100, 'Kougeki': 75, 'Bougyo': 115, 'Tokukou': 90, 'Tokubou': 115, 'Subayasa': 85}

    Execution environment

    Windows 10

    Python 3.6.2

    BeautifulSoup 4.6.0


    I have not read the original code closely, but it seems a bit too mechanical.
    It is more natural to write the code the way a person actually reads the page, starting from finding the table.
    That said, my own code also takes a shortcut by relying on a slice ...
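One way to avoid the slice, as a minimal sketch: keep only the rows that consist of a label cell and a numeric value cell. The sample markup below is modeled on the rows quoted in the question; the header row, the `SAMPLE_HTML` name and the `parse_stats` helper are assumptions made for illustration, and the live page may contain other two-cell numeric rows that would need extra filtering.

```python
from bs4 import BeautifulSoup

# Sample markup modeled on the rows quoted in the question (romanized labels).
# The header row is invented for illustration; the real page has many more rows.
SAMPLE_HTML = """
<table summary="detailed data">
<tr><td class="c1" colspan="6">Stats</td></tr>
<tr><td class="c1" style="width:125px;">HP</td><td class="left" colspan="5"><img src="//img.yakkun.com/bar.gif" style="width:60px;height:10px" />&nbsp;100</td></tr>
<tr><td class="c1">Kougeki</td><td class="left" colspan="5"><img src="//img.yakkun.com/bar.gif" style="width:45px;height:10px" />&nbsp;75</td></tr>
<tr><td class="c1">Bougyo</td><td class="left" colspan="5"><img src="//img.yakkun.com/bar.gif" style="width:69px;height:10px" />&nbsp;115</td></tr>
</table>
"""


def parse_stats(html):
    """Hypothetical helper: collect rows made of a label cell and a numeric cell."""
    soup = BeautifulSoup(html, "html.parser")
    stats = {}
    for tr in soup.find_all("tr"):
        cells = tr.find_all("td")
        if len(cells) != 2:
            continue  # skip header rows and anything that is not label + value
        name = cells[0].get_text(strip=True)
        value = cells[1].get_text(strip=True)  # strip=True also drops the &nbsp;
        if value.isdigit():
            stats[name] = int(value)
    return stats


print(parse_stats(SAMPLE_HTML))
```

Filtering on the shape of each row means the code does not depend on the stats sitting at fixed positions in the table, at the cost of a slightly longer loop.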

    Look at the HTML for hints, check the reference, and choose the right access method.
    It is also a good idea to examine the page structure with the browser's developer tools.
    Reference: Beautiful Soup 4.2.0 documentation.
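To make "choose the right access method" concrete, here is a small sketch that compares a few Beautiful Soup accessors on one row modeled on the HTML quoted in the question. The `ROW` string and the variable names are assumptions for illustration. It also suggests why the question's loop likely prints nothing: `tag.get("style")` gives either a `str` or `None`, neither of which has `.pop()`, so the call raises `AttributeError` and the bare `except` swallows it on every iteration.

```python
from bs4 import BeautifulSoup

# One row modeled on the HTML quoted in the question.
ROW = ('<tr><td class="c1">HP</td><td class="left" colspan="5">'
       '<img src="//img.yakkun.com/bar.gif" style="width:60px;height:10px" />'
       '&nbsp;100</td></tr>')

soup = BeautifulSoup(ROW, "html.parser")
label_td, value_td = soup.find_all("td")
img = value_td.find("img")

# Attribute access: "style" comes back as a plain str, which has no .pop(),
# so calling .pop(0) on it raises AttributeError.
style = img.get("style")
print(type(style).__name__, hasattr(style, "pop"))

# A missing attribute gives None rather than raising.
print(label_td.get("style"))

# .string is None when a tag has more than one child (here: <img> plus text).
print(value_td.string)

# get_text() joins all text inside the tag; &nbsp; is decoded to U+00A0,
# which strip() treats as whitespace.
print(int(value_td.get_text().strip()))
```

Catching a specific exception instead of using a bare `except` would have surfaced the `AttributeError` right away.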

    Also, the term is web scraping, not web scripting.

    How to write code

    StackOverflow has a feature that makes code easy to read, as shown above.
    Open the question edit screen, select the code, and press the <code> button.

    Especially in Python, if the indentation breaks, the meaning of the code changes.
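As a small illustration of that point (the function names here are made up for the example), the two functions below contain the same statements and differ only in how far `break` is indented:

```python
def first_even_inside(numbers):
    """`break` is indented under the `if`: stop at the first even number."""
    found = None
    for n in numbers:
        if n % 2 == 0:
            found = n
            break
    return found


def first_even_outside(numbers):
    """`break` is indented under the `for`: the loop always stops after one item."""
    found = None
    for n in numbers:
        if n % 2 == 0:
            found = n
        break
    return found


print(first_even_inside([1, 3, 4, 6]))   # 4
print(first_even_outside([1, 3, 4, 6]))  # None
```

When code is pasted without its indentation, a reader cannot tell which of these two the author meant.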