Home>

I want to be able to execute a program written in Python on the Web.
The content of the created program is an extension of the existing site search tool and added functions. I intend to create it so that the scraped site is not overloaded.
I can distribute this program as it is, but I want to be able to run it on my website.
How can I run a Python program on the web?
Also, does the scraping site seem to be "accessed from my website" if it can be executed on the Web, or "scraping with the information of the person who accessed the website"? Does it look like you're visiting the previous site? "

If it seems difficult, it's a simple program, so I'm thinking of rewriting it in something other than Python. If there are any points that you do not understand due to lack of words, I will add them as appropriate. Thank you.

  • Answer # 1

    Excuse me, I still lacked knowledge and I couldn't consider the copyright because it was a scraping program.
    I will keep it for personal use. Thank you for your answer.

  • Answer # 2

    One way is to make it a Web API.

    Just as a normal website returns the HTML of the website when it receives a request from the browser, the API server starts processing (scraping) on ​​the server when it receives the request and returns the processing result in a response. From the scraped site, it looks like it is being accessed from the API server.

    Once you have an API server, you can set up a button on your website so that when you press it, a request will be sent to the API server.

    API servers can be created using Flask.

    Postscript:

    Scraping someone else's work and publishing it on your web page can violate copyright law.

  • Answer # 3

    I intend to create it so that the scraped site is not overloaded.

    If you access the target site triggered by the user's access to the web page you create, simply create it, and if the user accesses your site 100 times per second, the target site will be accessed for 1 second. I will go 100 times, but is that enough consideration? In that case, you are responsible.

    Also, isn't the terms of use of the target site prohibiting programmatic access?

    Also, does the scraping site seem to be "accessed from my website" if it can be executed on the Web, or "scraping with the information of the person who accessed the website"? Does it look like you're visiting the previous site? "

    The former. If you don't know this skill level, the road is far.

    As a step, assuming that the program is completed
    1. 1. Practice setting up and running a simple Python program on a web server
    2. Required librariespipInstalled on the web server etc.
    3. 3. Place the desired program on the web server. Authenticate so that no one else can use it
    Four. Final test in real environment
    Five. Deauthorize and publish
    Is not it.

    It's fine if the program is all completed with the Python library, but if you use a browser such as Chrome + Selenium for scraping, it's a long way off.

    First of all, I think it is better to think after acquiring various basic skills.
    At the very least, before you publish a program that you want others to use on the web, you need to make sure that there are no security flaws. Could you?