Kubernetes (Helm)

LoRAX includes Helm charts that make it easy to start using LoRAX in production with high availability and load balancing on Kubernetes.

To spin up a LoRAX deployment with Helm, you only need to be connected to a Kubernetes cluster through `kubectl``. We provide a default values.yaml file that can be used to deploy a Mistral 7B base model to your Kubernetes cluster:

helm install mistral-7b-release charts/lorax

The default values.yaml configuration deploys a single replica of the Mistral 7B model. You can tailor configuration parameters to deploy any Llama or Mistral model by creating a new values file from the template and updating variables. Once a new values file is created, you can run the following command to deploy your LLM with LoRAX:

helm install -f your-values-file.yaml your-model-release charts/lorax

To delete the resources:

helm uninstall your-model-release