Best approach for developing a stateful, computation-heavy application with a REST API interface using Python?

I want to develop an end-to-end machine learning application where the data lives in GPU memory and the computations run on the GPU. A stateless RESTful service backed by a database is not desirable, since the traffic between GPU memory and the database would defeat the purpose of keeping everything fast.

The way I see it, I need a way to “serve” a class (let’s call it the experiment class) that holds the data and the methods, and then call those methods through REST APIs.

Right now I am using FastAPI and initializing the experiment class inside it, which I believe is not optimal. My class (as well as the data) lives in the FastAPI runtime, roughly like this:

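from fastapi import FastAPI

# Assumed: the experiment_class module exposes a class of the same name
from experiment_class import experiment_class

app = FastAPI()
# Created once at import time, so the data lives as long as the server process
my_experiment = experiment_class()

@app.get("/load_csv")
def load_csv():
    my_experiment.load_csv("some_file_path")
    return {"status": "loaded"}

# do some more on the data
...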

There are two problems I am having a hard time with.

One of them is the terminology:

  • Is this really a stateful application?
  • Is there a word to describe what I am doing? Is this a “Model, View, Controller” design, a simple “Client-Server” setup, or something completely different?
  • Do I need a “Web-server”, a “Web-framework” or a “Web-service” for this?

The other is which technology I can use for this:

  • Is it okay to use FastAPI like this?
  • Do I set up an RPC (Remote Procedure Call) server and call it through a REST API?
  • Is a WSGI or an ASGI server suitable for this task?
  • Are web frameworks like Django, Flask, and Tornado only used for stateless apps? Nearly all of the examples I can find are stateless.
  • Do I stick to bare-bones Python and use threads or BaseManager servers?
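
To make the last option concrete, this is roughly what I imagine a BaseManager server would look like. It is only a minimal sketch: `Experiment`, `serve_experiment`, `connect`, and the authkey are hypothetical names I made up for illustration, plain Python lists stand in for GPU data, and the server runs in a background thread of the same process just to keep the example short.

```python
import threading
from multiprocessing.managers import BaseManager


class Experiment:
    """Hypothetical stand-in for the experiment class; keeps its data in memory."""

    def __init__(self):
        self.rows = []

    def load_rows(self, rows):
        self.rows = list(rows)
        return len(self.rows)

    def total(self):
        return sum(self.rows)


class ExperimentManager(BaseManager):
    """Manager that hands out proxies to one shared Experiment."""


def serve_experiment(experiment, authkey=b"change-me"):
    """Serve `experiment` from a background thread and return the server address."""
    # Every client that asks for "get_experiment" receives a proxy to the same object.
    ExperimentManager.register("get_experiment", callable=lambda: experiment)
    manager = ExperimentManager(address=("127.0.0.1", 0), authkey=authkey)
    server = manager.get_server()  # port 0 lets the OS pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server.address


def connect(address, authkey=b"change-me"):
    """Connect as a client and return a proxy to the shared Experiment."""
    client = ExperimentManager(address=address, authkey=authkey)
    client.connect()
    return client.get_experiment()


if __name__ == "__main__":
    address = serve_experiment(Experiment())
    proxy = connect(address)
    proxy.load_rows([1, 2, 3])
    print(proxy.total())
```

As far as I understand, arguments and return values of proxy calls are pickled, so the large data would have to stay on the server side and only small results should cross the boundary.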

P.S. What I mean by end-to-end machine learning is that I should be able to load data, process it, and pass it to the model for training, all without the data ever leaving GPU memory. You can think of it as a Jupyter notebook where the cells are called through a REST API.
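
To illustrate the notebook analogy, here is a minimal sketch using only the standard library, not what I actually run. `Session`, its three "cells", the URL paths, and the `call_cell` helper are all hypothetical, and plain Python lists stand in for GPU data. The point is that one long-lived object keeps the state between HTTP calls, and a lock runs the cells one at a time, like a notebook kernel.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Session:
    """Hypothetical notebook-like session: state persists between calls."""

    def __init__(self):
        self.data = []
        self.lock = threading.Lock()  # run one cell at a time

    def load(self, values):
        self.data = list(values)
        return {"rows": len(self.data)}

    def scale(self, factor):
        self.data = [x * factor for x in self.data]
        return {"rows": len(self.data)}

    def summary(self):
        return {"sum": sum(self.data)}


def make_handler(session):
    # Map URL paths to "cells" (methods of the shared session).
    cells = {"/load": session.load, "/scale": session.scale, "/summary": session.summary}

    class CellHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            cell = cells.get(self.path)
            if cell is None:
                self.send_error(404, "unknown cell")
                return
            length = int(self.headers.get("Content-Length", 0))
            kwargs = json.loads(self.rfile.read(length) or b"{}")
            with session.lock:
                result = cell(**kwargs)
            body = json.dumps(result).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the sketch quiet

    return CellHandler


def start_server(session, port=0):
    """Start the server in a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), make_handler(session))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_cell(server, path, payload):
    """Client helper: POST a JSON payload to a cell and decode the JSON reply."""
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    conn.request("POST", path, body=json.dumps(payload))
    reply = json.loads(conn.getresponse().read())
    conn.close()
    return reply
```

Calling `call_cell(server, "/load", {"values": [1, 2, 3]})` followed by `call_cell(server, "/scale", {"factor": 2})` changes the same in-memory data, which is the stateful behaviour I am after.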