Image source: GitHub - allegroai/clearml-serving
This setup orchestrates a scalable ML serving infrastructure using ClearML, integrating Kafka for message streaming, ZooKeeper for Kafka coordination, Prometheus for monitoring, Alertmanager for alerts, Grafana for visualization, and Triton for optimized model serving.
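In docker-compose terms, the stack looks roughly like this. The service and image names below are an assumption based on the clearml-serving repository layout; check docker-compose-triton.yml in the repo for the authoritative version:

```yaml
# Rough shape of docker-compose-triton.yml (illustrative sketch, not the real file)
services:
  zookeeper:          # coordinates Kafka
    image: bitnami/zookeeper
  kafka:              # streams inference/statistics messages
    image: bitnami/kafka
    depends_on: [zookeeper]
  prometheus:         # scrapes serving metrics
    image: prom/prometheus
  alertmanager:       # routes Prometheus alerts
    image: prom/alertmanager
  grafana:            # dashboards on top of Prometheus
    image: grafana/grafana
  clearml-serving-inference:   # pre/post processing + request routing
    image: allegroai/clearml-serving-inference
  clearml-serving-triton:      # optimized model execution
    image: nvcr.io/nvidia/tritonserver
```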
pip install clearml-serving  # you probably did this already
clearml-serving create --name "aiorhu demo"
New Serving Service created: id=ooooahah12345
Let's look at this in ClearML. Next, edit CLEARML_EXTRA_PYTHON_PACKAGES and add the packages your model needs; we'll add ours here.
CLEARML_EXTRA_PYTHON_PACKAGES: ${CLEARML_EXTRA_PYTHON_PACKAGES:-textstat empath torch transformers nltk openai datasets diffusers benepar spacy sentence_transformers optuna interpret markdown bs4}
Edit the environment file (docker/example.env) with your clearml-server credentials and Serving Service UID. For example, you should have something like:
CLEARML_WEB_HOST="https://app.clear.ml"
CLEARML_API_HOST="https://api.clear.ml"
CLEARML_FILES_HOST="https://files.clear.ml"
CLEARML_API_ACCESS_KEY="<access_key_here>"
CLEARML_API_SECRET_KEY="<secret_key_here>"
CLEARML_SERVING_TASK_ID="<serving_service_id_here>"
cd docker && docker-compose --env-file example.env -f docker-compose-triton.yml up
Notice: any model registered with the Triton engine runs its pre/post-processing code on the Inference service container, while the model inference itself is executed on the Triton Engine container.
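To make that split concrete, here is a minimal sketch of the pre/post-processing code that lives on the Inference service container. The `Preprocess` class name and method signatures follow the clearml-serving examples, but treat the exact interface as an assumption and check the repo's examples for the current contract:

```python
# preprocess.py -- minimal sketch of the code that runs on the Inference
# service container; Triton itself only ever sees the prepared tensors.

from typing import Any


class Preprocess:
    """Plain class loaded by the clearml-serving inference container."""

    def __init__(self):
        # Called once when the endpoint is loaded; put tokenizer/encoder
        # setup here rather than in preprocess() itself.
        pass

    def preprocess(self, body: dict, state: dict,
                   collect_custom_statistics_fn=None) -> Any:
        # Convert the incoming request JSON into the tensor(s) the model
        # expects. "values" is a hypothetical request field for this sketch.
        return [[float(x) for x in body.get("values", [])]]

    def postprocess(self, data: Any, state: dict,
                    collect_custom_statistics_fn=None) -> dict:
        # Convert the raw model output back into a JSON-serializable response.
        return {"predictions": data}
```

The important design point is that this class is plain Python with no serving imports, so you can unit-test your feature engineering locally before registering the model.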
Let’s review what we did.