
Hello.
This is my first question. I'm sorry to ask you a general question, but I would appreciate it if you could answer it.

As part of learning Python, I am thinking of creating a web app.
The web app displays charts, and I am thinking of running a program in the background that fetches information from an API and writes it to the database.
Conceptual diagram: Client <-(Web)-> Web/AP server <-(DB)-> DB server <-(API)-> API (assuming a separate program is run for the API part)

The official Flask tutorial gives a picture of how to implement the part from the client to the DB server in the conceptual diagram above, but separately from that, I want to run a program that fetches information from the API. How is a program that repeatedly fetches information from an API usually deployed?

I have imagined two approaches. One is to rent something like a Sakura VPS and build two virtual environments on it (assuming something like venv), one for the web application and one for the program that fetches information from the API, and run both there. The other is to run the web application on the Sakura VPS and the program that fetches information from the API on AWS, with both connecting to the DB server.

The above is only my own picture, so please tell me how it is generally done. I would also appreciate it if you could point me to some reference sites.

The chart is assumed to show real-time data such as stock prices and exchange rates, and I plan to make the program as close to real time as possible without running into the API's access limits.

Sorry for the verbose and general question, thank you.

  • Answer # 1

    The answer is that either approach works, so let's look at the advantages and disadvantages of running the DB and the acquisition program on different servers.


    Disadvantages first

    If acquisition runs on a different server from the DB, there are two more communication steps to manage: once acquired, the information has to be sent (from the acquisition server) and then received (by the DB server).
    If a large amount of data is exchanged, the cost (both money and time lag) increases accordingly.
    If your budget is unlimited, you don't have to think about it.

    In addition, more communication paths mean more places where faults can occur.
    For example, if the acquisition server dies, the data stops being updated, so you need to decide whether the front end should be told about it.
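
    One way to let the front end know that updates have stopped is to send the time of the last successful update together with the chart data. The following is only a minimal sketch: the names `is_stale` and `chart_payload` and the five-minute threshold are assumptions made for illustration, not part of Flask or any library.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical threshold: how old the newest data may be before we warn the user.
MAX_AGE = timedelta(minutes=5)


def is_stale(last_updated: Optional[datetime], now: Optional[datetime] = None,
             max_age: timedelta = MAX_AGE) -> bool:
    """Return True when no data exists yet or the newest data is older than max_age."""
    if last_updated is None:
        return True  # nothing has ever been fetched
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_updated > max_age


def chart_payload(rows: list, last_updated: Optional[datetime],
                  now: Optional[datetime] = None) -> dict:
    """Bundle chart rows with a flag the front end can use to show a warning."""
    return {
        "rows": rows,
        "last_updated": last_updated.isoformat() if last_updated else None,
        "stale": is_stale(last_updated, now),
    }
```

    The web application would return this payload from its chart endpoint, and the page could show a "data may be out of date" notice whenever `stale` is true.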


    About the advantages

    Because the servers are separate, the load on the server hosting the DB is reduced.

    Also, you can control how much data is sent to the DB according to the DB load.
    For example, the acquisition server can synchronize less often during times when access to the DB increases.
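
    The idea of synchronizing less often under load can be sketched as a function that stretches the wait between syncs as the DB gets busier. This is an illustration under assumptions: `db_load` is a hypothetical value between 0.0 and 1.0 (how it is measured is up to the application), and the default intervals are arbitrary.

```python
def next_sync_interval(db_load: float, base_seconds: float = 5.0,
                       max_seconds: float = 60.0) -> float:
    """Return the number of seconds to wait before the next sync.

    db_load is a hypothetical 0.0-1.0 measure of how busy the DB is.
    The wait grows linearly from base_seconds (idle) to max_seconds (fully loaded).
    """
    load = min(max(db_load, 0.0), 1.0)  # clamp out-of-range readings to [0, 1]
    return base_seconds + (max_seconds - base_seconds) * load
```

    The acquisition program would call this before each sleep, so the sync rate drops automatically while the DB is busy serving chart requests.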

    Scaling out the acquisition server also becomes simpler to design.


    I don't think it will be fatal if you don't separate the servers in advance, so personally I would build it as a single-server configuration for the time being.

    Make sure logging is done properly, and re-examine the design according to the load you observe.
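
    For the single-server starting point, the acquisition program can be a small loop that runs as a separate process next to the web app and logs every run. The sketch below makes several assumptions: SQLite (from the standard library) stands in for the real DB, the `prices` table and the `fetch` callable are hypothetical, and the actual API call is left to the caller because it depends on the API being used.

```python
import logging
import sqlite3
import time
from typing import Callable, Iterable, Tuple

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("fetcher")

Row = Tuple[str, float, float]  # (symbol, price, unix timestamp)


def init_db(conn: sqlite3.Connection) -> None:
    """Create the hypothetical prices table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prices (symbol TEXT, price REAL, ts REAL)"
    )


def run_once(conn: sqlite3.Connection, fetch: Callable[[], Iterable[Row]]) -> int:
    """Fetch one batch and store it; log the error and return 0 on failure."""
    start = time.monotonic()
    try:
        rows = list(fetch())
        with conn:  # commits on success, rolls back if the insert raises
            conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", rows)
    except Exception:
        log.exception("fetch or insert failed")
        return 0
    log.info("stored %d rows in %.3f s", len(rows), time.monotonic() - start)
    return len(rows)


def run_forever(conn: sqlite3.Connection, fetch: Callable[[], Iterable[Row]],
                interval_seconds: float) -> None:
    """Poll at a fixed interval, chosen by the caller to stay within the API's limits."""
    while True:
        run_once(conn, fetch)
        time.sleep(interval_seconds)
```

    The logged row counts and timings give the load figures needed to decide later whether the acquisition program should move to its own server.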

    If you make it portable with virtualization technology, you may be able to reduce the cost of moving it to another server later.


    By the way, if the DB load becomes heavy, you should consider a cache server.
    Since you need to think of the data as a flow all the way until it enters the DB, I think it is worth paying attention to the flow rate (network, I/O, and so on).
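
    To show what a cache in front of the DB does, here is a small in-process sketch. It is only a stand-in for a real cache server: the `TTLCache` class and its `get_or_load` method are hypothetical names, and the time-to-live value would have to be chosen to suit how fresh the chart needs to be.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Keep each loaded value for ttl_seconds, then reload it on the next request."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the behavior can be tested
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        """Return the cached value if still fresh, otherwise call loader (e.g. a DB query)."""
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # fresh enough: the DB is not touched
        value = loader()
        self._store[key] = (now, value)
        return value
```

    With a time-to-live of a few seconds, many chart requests arriving close together result in a single DB query instead of one query per request.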