Async Batcher with Keras model and FastAPI

This example shows how to use the async_batcher library with FastAPI to process data in batches with a Keras (TensorFlow) model.

In this example, we serve a TensorFlow model using FastAPI and create two endpoints:

  • /predict to make predictions using the TensorFlow model directly
  • /optimized_predict to make predictions using the TensorFlow model with the async_batcher library
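The idea behind the second endpoint can be sketched with the standard library alone. The snippet below is not the async_batcher API. It is a minimal illustration of request batching in asyncio, and the names MicroBatcher, fake_model, max_batch_size and max_wait_s are hypothetical. A plain Python function stands in for the Keras model so that the sketch runs without TensorFlow.

```python
import asyncio
from typing import Callable, List, Tuple


class MicroBatcher:
    """Illustrative batcher: groups single requests into one batched call."""

    def __init__(self, process_batch: Callable[[List[float]], List[float]],
                 max_batch_size: int = 8, max_wait_s: float = 0.01) -> None:
        self._process_batch = process_batch
        self._max_batch_size = max_batch_size
        self._max_wait_s = max_wait_s
        self._queue: asyncio.Queue = asyncio.Queue()

    async def submit(self, item: float) -> float:
        # Each caller gets a future that resolves once its batch is processed.
        future = asyncio.get_running_loop().create_future()
        await self._queue.put((item, future))
        return await future

    async def run(self) -> None:
        loop = asyncio.get_running_loop()
        while True:
            # Block until at least one request arrives.
            batch = [await self._queue.get()]
            deadline = loop.time() + self._max_wait_s
            # Collect more requests until the batch is full or time runs out.
            while len(batch) < self._max_batch_size:
                timeout = deadline - loop.time()
                if timeout <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self._queue.get(), timeout))
                except asyncio.TimeoutError:
                    break
            # One call for the whole batch (stand-in for a batched model call).
            results = self._process_batch([item for item, _ in batch])
            for (_, future), result in zip(batch, results):
                if not future.done():
                    future.set_result(result)


async def demo() -> Tuple[List[float], List[int]]:
    batch_sizes: List[int] = []

    def fake_model(inputs: List[float]) -> List[float]:
        # Hypothetical stand-in for the model: doubles every input.
        batch_sizes.append(len(inputs))
        return [x * 2 for x in inputs]

    # Create the batcher inside the running event loop.
    batcher = MicroBatcher(fake_model, max_batch_size=4, max_wait_s=0.05)
    worker = asyncio.create_task(batcher.run())
    # Eight concurrent "requests", each submitting a single value.
    results = await asyncio.gather(*(batcher.submit(float(i)) for i in range(8)))
    worker.cancel()
    return list(results), batch_sizes


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Each caller still awaits a single result, but the model function is invoked once per group of requests rather than once per request. That is the effect /optimized_predict aims for. The batch size and wait time trade latency against throughput, and the values used here are arbitrary.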

How to use

# Install the packages
pip install -r requirements.txt

# Run the FastAPI server with Uvicorn
PYTHONPATH=$PYTHONPATH:$(git rev-parse --show-toplevel) uvicorn main:app --reload

Load testing

To compare the performance of the /predict and /optimized_predict endpoints, we can use Locust to simulate multiple users sending requests to the server.

To run the load test, pass one of the provided Locust files to the following command:

locust -f

Then, open the web interface at http://localhost:8089 and start the load test.