I want to develop an end-to-end machine learning application where the data lives in GPU memory and the computations run on the GPU. A stateless RESTful service backed by a database is not desirable, since the traffic between GPU memory and the database would defeat the purpose of it being fast.
The way I see it, I need a way to "serve" the class (let's call it the experiment class) that holds the data and the methods, and then call those methods through a REST API.
Right now I am using FastAPI and initializing the experiment class inside it, which I believe is not optimal. My class (as well as the data) lives in the FastAPI runtime. Kinda like:
```python
import experiment_class
from fastapi import FastAPI

app = FastAPI()
my_experiment = experiment_class.experiment_class()

@app.get("/load_csv")
def load_csv():
    my_experiment.load_csv("some_file_path")
    # do some more on the data ...
```
There are two problems I am having a hard time with.
One of them is the terminology:
- Is this really a stateful application?
- Is there a word to describe what I am doing? Is it a "Model-View-Controller" design, a simple "client-server" setup, or something completely different?
- Do I need a “Web-server”, a “Web-framework” or a “Web-service” for this?
The other one is which technology I can use for this:
- Is it okay to use FastAPI like this?
- Do I set up an RPC (Remote Procedure Call) server and call it through a REST API?
- Is a WSGI or an ASGI server suitable for this task?
- Are web frameworks like Django, Flask, or Tornado only used for stateless apps? Nearly all of the examples I can find are stateless.
- Do I stick to bare-bones Python with threads or BaseManager servers?
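For the BaseManager option, this is roughly what I imagine serving a stateful object would look like. A minimal sketch, assuming a dummy `Experiment` class standing in for the real GPU-backed one (names and methods here are placeholders, not my actual code):

```python
from multiprocessing.managers import BaseManager

class Experiment:
    """Placeholder for the GPU-resident experiment class:
    one long-lived instance holds the state between calls."""
    def __init__(self):
        self.data = None

    def load(self, values):
        self.data = list(values)
        return len(self.data)

    def total(self):
        return sum(self.data)

class ExperimentManager(BaseManager):
    pass

# Expose Experiment through the manager; clients receive a proxy
# whose method calls are forwarded to the single server-side instance.
ExperimentManager.register("Experiment", Experiment)

if __name__ == "__main__":
    manager = ExperimentManager(address=("127.0.0.1", 0), authkey=b"secret")
    manager.start()             # spawn the manager server process
    exp = manager.Experiment()  # proxy to ONE instance living in the server
    exp.load([1, 2, 3])
    print(exp.total())          # prints 6 -- state survived across calls
    manager.shutdown()
```

A real deployment would use `manager.get_server().serve_forever()` on the server side and `manager.connect()` from clients, but the idea is the same: the state lives in one process and everything else talks to it through proxies.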
P.S. What I mean by end-to-end machine learning is that I should be able to load data, process it, and feed it to the model for training, all without the data ever leaving GPU memory. Think of a Jupyter notebook, but where the cells are called through a REST API.