I want to scrape graph data (the actual numerical values) from a web page, but the graph shown on the page is an image (displayed with an HTML img tag), so I cannot get at the numerical data.

Presumably MRTG generates the graph with JavaScript behind the scenes and renders it in HTML, where it appears only as an image.
Even when I look at the JavaScript source, there is no text such as "Maximum In: 19.62M", and I have no idea which HTML or CSS elements to target or how to extract the values.

How can I get the numbers shown in graphs like this?

Graph (example) ↓

  • Answer # 1

    The answer to your direct question is "there is no choice but to analyze the image".

    First, I think you have slightly misunderstood how MRTG works. It is not dynamic web content written in modern JavaScript. It works as follows:

    MRTG (written in Perl) runs periodically on the backend, collects network statistics via SNMP, and renders them as a graph in an image file.

    indexmaker, a command that ships with MRTG, creates a web page that embeds those image files.

    The web server only serves the finished web page and is not involved in running MRTG.
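
    Given that structure, the only thing a scraper can pull out of the page is the image file itself. The following is a minimal sketch of that step using only the Python standard library. It assumes the page is plain static HTML with img tags; the sample HTML fragment, the file name router1-day.png, and the example.com URL are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths against the URL of the page.
            self.image_urls.append(urljoin(self.base_url, src))


def find_graph_images(html, base_url):
    """Return the absolute URLs of all images referenced by the page."""
    parser = ImgSrcCollector(base_url)
    parser.feed(html)
    return parser.image_urls


# Hypothetical page fragment and URL, for illustration only.
sample_html = '<html><body><img src="router1-day.png" alt="day"></body></html>'
print(find_graph_images(sample_html, "http://example.com/mrtg/index.html"))
```

    The returned URLs can then be downloaded like any other file. The numbers themselves are still locked inside the pixels of those images.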

    Also, because it was designed with the thinking of the first decade of the 2000s, its purpose is to produce a web page that is easy for humans to read (plus, at most, threshold monitoring and email notification). It therefore gives no consideration to reusing the collected data or to exposing it as data to outside programs.

    So with MRTG, there is no way other than analyzing the finished image.
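
    As a rough idea of what "analyzing the image" could look like, here is a minimal sketch that reads one value per pixel column from a filled or line graph. Everything specific in it is an assumption: the series colour, the pixel rows used to calibrate the y-axis (row_of_zero and row_of_max), and max_value all have to be read off the actual graph by hand. load_pixels relies on the third-party Pillow library and is only needed for real files. Text drawn inside the image, such as "Maximum In: 19.62M", would need OCR, which this sketch does not attempt.

```python
def color_close(pixel, target, tolerance=40):
    """True if every RGB channel is within `tolerance` of the target colour."""
    return all(abs(p - t) <= tolerance for p, t in zip(pixel, target))


def extract_series(pixels, series_color, row_of_zero, row_of_max, max_value):
    """Estimate one value per pixel column.

    pixels[y][x] is an (r, g, b) tuple. row_of_zero and row_of_max are the
    pixel rows where the y-axis shows 0 and max_value respectively.
    A column with no pixel of the series colour yields None.
    """
    values = []
    for x in range(len(pixels[0])):
        # The topmost pixel of the series colour marks the height of the curve.
        top = next(
            (y for y in range(len(pixels)) if color_close(pixels[y][x], series_color)),
            None,
        )
        if top is None:
            values.append(None)
        else:
            # Linear interpolation between the two calibrated rows.
            fraction = (row_of_zero - top) / (row_of_zero - row_of_max)
            values.append(fraction * max_value)
    return values


def load_pixels(path):
    """Load an image file into the pixels[y][x] layout (requires Pillow)."""
    from PIL import Image

    img = Image.open(path).convert("RGB")
    width, height = img.size
    return [[img.getpixel((x, y)) for x in range(width)] for y in range(height)]


# Tiny synthetic "graph": 3 columns wide, 5 rows tall, assumed green series.
WHITE, GREEN = (255, 255, 255), (0, 204, 0)
grid = [[WHITE] * 3 for _ in range(5)]
for y in (2, 3, 4):
    grid[y][0] = GREEN  # column 0 is filled up to row 2
grid[4][1] = GREEN      # column 1 only touches the baseline
print(extract_series(grid, GREEN, row_of_zero=4, row_of_max=0, max_value=20.0))
# [10.0, 0.0, None]
```

    The result is only as precise as the pixel resolution and the hand calibration, which is why this approach is a last resort.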

    If you are in a position to work on your company's network management system, I recommend replacing MRTG with a modern monitoring system (there are many good open-source ones). You can then retrieve the data as actual data, and it becomes easy to process it in Python however you need.