Packaging Python code as services
Here we assume that you already have some functional Python code. The first step in getting it ready to be integrated into ASKCOS is turning the code into a microservice, like all the other prediction modules. The service can then accept prediction requests over HTTP and return the results to the caller.
Option 1 (basic): using Flask/FastAPI
In this example we will use the site_selectivity module as a reference and assume you have some existing code that can run a prediction task, for example, SiteSelectivityModel().predict().
The first step is wrapping this function as an API endpoint with the framework of your choice (we recommend FastAPI). See the relevant code section in site_selectivity_server.py as an example. This turns your prediction code into a service that can be run with
$ python your_server.py
and can be queried once started. The site selectivity module does not depend on other modules and can hence be served standalone by following the instructions in the README. Feel free to try out standalone serving and become familiar with the expected input/output formats, which are typically JSON.
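For orientation, here is a minimal sketch of what such a wrapper can look like. This is not the actual site_selectivity_server.py; it assumes a SiteSelectivityModel class whose predict() takes a list of SMILES strings and a hypothetical import path, so adapt the request/response schema to your own code.

```python
# your_server.py -- minimal FastAPI wrapper (illustrative sketch only)
from fastapi import FastAPI
from pydantic import BaseModel

from your_package import SiteSelectivityModel  # hypothetical import path

app = FastAPI()
model = SiteSelectivityModel()  # load the model once at startup, not per request


class PredictionRequest(BaseModel):
    smiles: list[str]  # adjust the fields to your model's expected input


@app.post("/predictions/site_selectivity")
def predict(request: PredictionRequest):
    # Delegate to the existing prediction code; the return value is sent back as JSON
    results = model.predict(request.smiles)
    return {"results": results}


if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=9601)  # the port number is arbitrary here
```

A running server of this kind can be queried with a plain HTTP POST, for example:

```python
import requests

resp = requests.post(
    "http://localhost:9601/predictions/site_selectivity",
    json={"smiles": ["c1ccccc1"]},
)
print(resp.json())
```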
Option 2 (advanced): using TorchServe
Serving with TorchServe requires a few more steps, but makes it trivially easy to add new models later on. In this example we will use the augmented_transformer forward predictor module as a reference and assume you have some existing code that can run a prediction task, for example, ServerModel().run().
The first step is wrapping this function in your_handler.py under YourHandler.inference(). See the relevant code section in at_handler.py as an example.
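The sketch below shows the general shape of such a handler; it is illustrative only and assumes a ServerModel class with a run() method and a hypothetical import path, so the real at_handler.py will differ in its details. TorchServe handlers typically subclass BaseHandler and override initialize(), preprocess(), inference(), and postprocess().

```python
# your_handler.py -- illustrative TorchServe handler sketch
from ts.torch_handler.base_handler import BaseHandler

from your_package import ServerModel  # hypothetical import path


class YourHandler(BaseHandler):
    def initialize(self, context):
        # Called once when the model is loaded; the archive contents are
        # unpacked into context.system_properties["model_dir"]
        self.model_dir = context.system_properties.get("model_dir")
        self.model = ServerModel()  # load checkpoints/data files from self.model_dir
        self.initialized = True

    def preprocess(self, data):
        # TorchServe passes a list of requests; pull out the JSON body of each
        return [row.get("body") or row.get("data") for row in data]

    def inference(self, data, *args, **kwargs):
        # Delegate to the existing prediction code
        return [self.model.run(item) for item in data]

    def postprocess(self, data):
        # Must return a list with one JSON-serializable entry per request
        return data
```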
Once the handler has been implemented, you need to package your code, including the handler, together with other required files (e.g., model checkpoints, data files) into a servable model archive using torch-model-archiver, similar to how it was done for the augmented Transformer. Note that while we typically perform this archiving step in Docker, this is not required and you can also do it in a conda environment. If successful, you should obtain a model archive ending in .mar.
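The archiver invocation looks roughly like the following; the handler and file names here are placeholders, so check the augmented Transformer build scripts for the exact arguments used in ASKCOS:
$ torch-model-archiver \
    --model-name ${MODEL_NAME} \
    --version 1.0 \
    --handler your_handler.py \
    --extra-files "model.ckpt,vocab.txt" \
    --export-path ${FOLDER_CONTAINING_MARS}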
The resulting archive can then be served with
$ torchserve \
    --start \
    --foreground \
    --ncs \
    --model-store=${FOLDER_CONTAINING_MARS} \
    --models ${MODEL_NAME}=${MODEL_NAME}.mar
and can be queried once started. As with the site selectivity module, the augmented Transformer forward predictor does not depend on other modules and can hence be served standalone by following the instructions in the README.
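As a quick sanity check of a running TorchServe instance, you can send a request to its inference API (port 8080 by default) from Python; note that the payload below is only an example and must match whatever your handler's preprocess() expects:

```python
import requests

model_name = "your_model"  # must match the name used when creating the .mar archive
resp = requests.post(
    f"http://localhost:8080/predictions/{model_name}",
    json={"smiles": ["CCO.CC(=O)Cl"]},
)
print(resp.json())
```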