
Available models for Inference with Dedicated Deployments

The models listed below are available to deploy and use with Dedicated Deployments.

Text Models

| Name | GPU | GPU count | String in API | Available Quantizations | Maximum Context Length | License |
| --- | --- | --- | --- | --- | --- | --- |
| Llama 3.1 70B Instruct | NVIDIA H100 | 2 | llama-3.1-70b-instruct | bf16, fp8 | 131072 | Llama 3.1 Community License Agreement |
| - | NVIDIA H100 | 1 | llama-3.1-70b-instruct | fp8 | 131072 | Llama 3.1 Community License Agreement |
| Llama 3.3 70B Instruct | NVIDIA H100 | 2 | llama-3.3-70b-instruct | bf16, fp8 | 131072 | Llama 3.3 Community License Agreement |
| - | NVIDIA H100 | 1 | llama-3.3-70b-instruct | fp8 | 131072 | Llama 3.3 Community License Agreement |
| Qwen 2.5 Coder 32B Instruct | NVIDIA H100 | 1 | qwen2.5-coder-32b-instruct | bf16 | 32768 | Apache License 2.0 |
| - | NVIDIA H100 | 2 | qwen2.5-coder-32b-instruct | bf16 | 32768 | Apache License 2.0 |
| Qwen 2.5 32B Instruct | NVIDIA H100 | 1 | qwen2.5-32b-instruct | bf16 | 32768 | Apache License 2.0 |
| - | NVIDIA H100 | 2 | qwen2.5-32b-instruct | bf16 | 32768 | Apache License 2.0 |
| Gemma 2 27B Instruct | NVIDIA H100 | 1 | gemma-2-27b-it | bf16 | 4096 | Gemma |
| - | NVIDIA H100 | 2 | gemma-2-27b-it | bf16 | 4096 | Gemma |
| legml-v0.1 | NVIDIA L40S | 1 | legml-v0.1 | bf16 | 32768 | Apache License 2.0 |
| - | NVIDIA H100 | 1 | legml-v0.1 | bf16 | 32768 | Apache License 2.0 |
| - | NVIDIA H100 | 2 | legml-v0.1 | bf16 | 32768 | Apache License 2.0 |
| Qwen 2.5 14B Instruct | NVIDIA L40S | 1 | qwen2.5-14b-instruct | bf16 | 32768 | Apache License 2.0 |
| - | NVIDIA H100 | 1 | qwen2.5-14b-instruct | bf16 | 32768 | Apache License 2.0 |
| - | NVIDIA H100 | 2 | qwen2.5-14b-instruct | bf16 | 32768 | Apache License 2.0 |
| Qwen 2.5 Coder 14B Instruct | NVIDIA L40S | 1 | qwen2.5-coder-14b-instruct | bf16 | 32768 | Apache License 2.0 |
| - | NVIDIA H100 | 1 | qwen2.5-coder-14b-instruct | bf16 | 32768 | Apache License 2.0 |
| - | NVIDIA H100 | 2 | qwen2.5-coder-14b-instruct | bf16 | 32768 | Apache License 2.0 |
| Pixtral 12b 2409 | NVIDIA H100 | 2 | pixtral-12b-2409 | bf16 | 128000 | Apache License 2.0 |
| - | NVIDIA L40S | 1 | pixtral-12b-2409 | bf16 | 128000 | Apache License 2.0 |
| - | NVIDIA H100 | 1 | pixtral-12b-2409 | bf16 | 128000 | Apache License 2.0 |
| Mistral Nemo Instruct 2407 | NVIDIA L40S | 1 | mistral-nemo-instruct-2407 | bf16 | 128000 | Apache License 2.0 |
| - | NVIDIA H100 | 1 | mistral-nemo-instruct-2407 | bf16 | 128000 | Apache License 2.0 |
| - | NVIDIA H100 | 2 | mistral-nemo-instruct-2407 | bf16 | 128000 | Apache License 2.0 |
| Gemma 2 9B Instruct | NVIDIA L4 | 1 | gemma-2-9b-it | bf16 | 4096 | Gemma |
| - | NVIDIA L40S | 1 | gemma-2-9b-it | bf16 | 4096 | Gemma |
| - | NVIDIA H100 | 1 | gemma-2-9b-it | bf16 | 4096 | Gemma |
| - | NVIDIA H100 | 2 | gemma-2-9b-it | bf16 | 4096 | Gemma |
| Llama 3.1 8B Instruct | NVIDIA L40S | 1 | llama-3.1-8b-instruct | bf16, fp8 | 131072 | Llama 3.1 Community License Agreement |
| - | NVIDIA H100 | 1 | llama-3.1-8b-instruct | bf16, fp8 | 131072 | Llama 3.1 Community License Agreement |
| - | NVIDIA H100 | 2 | llama-3.1-8b-instruct | bf16, fp8 | 131072 | Llama 3.1 Community License Agreement |
| - | NVIDIA L4 | 1 | llama-3.1-8b-instruct | bf16, fp8 | 131072 | Llama 3.1 Community License Agreement |
| Qwen 2.5 Coder 7B Instruct | NVIDIA L4 | 1 | qwen2.5-coder-7b-instruct | bf16 | 32768 | Apache License 2.0 |
| - | NVIDIA L40S | 1 | qwen2.5-coder-7b-instruct | bf16 | 32768 | Apache License 2.0 |
| - | NVIDIA H100 | 1 | qwen2.5-coder-7b-instruct | bf16 | 32768 | Apache License 2.0 |
| - | NVIDIA H100 | 2 | qwen2.5-coder-7b-instruct | bf16 | 32768 | Apache License 2.0 |
| Qwen 2.5 7B Instruct | NVIDIA L4 | 1 | qwen2.5-7b-instruct | bf16 | 32768 | Apache License 2.0 |
| - | NVIDIA L40S | 1 | qwen2.5-7b-instruct | bf16 | 32768 | Apache License 2.0 |
| - | NVIDIA H100 | 1 | qwen2.5-7b-instruct | bf16 | 32768 | Apache License 2.0 |
| - | NVIDIA H100 | 2 | qwen2.5-7b-instruct | bf16 | 32768 | Apache License 2.0 |
| Phi 3.5 Mini Instruct | NVIDIA L4 | 1 | phi-3.5-mini-instruct | bf16 | 131072 | MIT License |
| - | NVIDIA L40S | 1 | phi-3.5-mini-instruct | bf16 | 131072 | MIT License |
| - | NVIDIA H100 | 1 | phi-3.5-mini-instruct | bf16 | 131072 | MIT License |
| - | NVIDIA H100 | 2 | phi-3.5-mini-instruct | bf16 | 131072 | MIT License |
| Gemma 2 2B Instruct | NVIDIA L40S | 1 | gemma-2-2b-it | bf16 | 4096 | Gemma |
| - | NVIDIA H100 | 1 | gemma-2-2b-it | bf16 | 4096 | Gemma |
| - | NVIDIA H100 | 2 | gemma-2-2b-it | bf16 | 4096 | Gemma |
| - | NVIDIA L4 | 1 | gemma-2-2b-it | bf16 | 4096 | Gemma |
| Qwen 2.5 Coder 1.5B Instruct | NVIDIA L40S | 1 | qwen2.5-coder-1.5b-instruct | bf16 | 32768 | Apache License 2.0 |
| - | NVIDIA H100 | 1 | qwen2.5-coder-1.5b-instruct | bf16 | 32768 | Apache License 2.0 |
| - | NVIDIA H100 | 2 | qwen2.5-coder-1.5b-instruct | bf16 | 32768 | Apache License 2.0 |
| - | NVIDIA L4 | 1 | qwen2.5-coder-1.5b-instruct | bf16 | 32768 | Apache License 2.0 |
| Qwen 2.5 1.5B Instruct | NVIDIA L4 | 1 | qwen2.5-1.5b-instruct | bf16 | 32768 | Apache License 2.0 |
| - | NVIDIA L40S | 1 | qwen2.5-1.5b-instruct | bf16 | 32768 | Apache License 2.0 |
| - | NVIDIA H100 | 1 | qwen2.5-1.5b-instruct | bf16 | 32768 | Apache License 2.0 |
| - | NVIDIA H100 | 2 | qwen2.5-1.5b-instruct | bf16 | 32768 | Apache License 2.0 |
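
When choosing a configuration, it can help to treat the rows of the Text Models table as a small catalog and filter it by model string, quantization, and required context length. The following is a minimal sketch of that idea. The names `Deployment`, `CATALOG`, and `find_deployments` are hypothetical and are not part of any Ektos AI SDK or API. The catalog holds only a handful of rows copied from the table, not the full list.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Deployment:
    """One table row (hypothetical structure, for illustration only)."""
    model: str                       # "String in API" column
    gpu: str                         # "GPU" column
    gpu_count: int                   # "GPU count" column
    quantizations: Tuple[str, ...]   # "Available Quantizations" column
    max_context: int                 # "Maximum Context Length" column


# A few rows copied from the Text Models table (not the full catalog).
CATALOG = [
    Deployment("llama-3.1-70b-instruct", "NVIDIA H100", 2, ("bf16", "fp8"), 131072),
    Deployment("llama-3.1-70b-instruct", "NVIDIA H100", 1, ("fp8",), 131072),
    Deployment("qwen2.5-14b-instruct", "NVIDIA L40S", 1, ("bf16",), 32768),
    Deployment("qwen2.5-14b-instruct", "NVIDIA H100", 1, ("bf16",), 32768),
    Deployment("qwen2.5-14b-instruct", "NVIDIA H100", 2, ("bf16",), 32768),
    Deployment("gemma-2-9b-it", "NVIDIA L4", 1, ("bf16",), 4096),
]


def find_deployments(
    model: str,
    quantization: Optional[str] = None,
    min_context: int = 0,
) -> List[Deployment]:
    """Return catalog rows for `model` that meet the constraints, fewest GPUs first."""
    matches = [
        d for d in CATALOG
        if d.model == model
        and (quantization is None or quantization in d.quantizations)
        and d.max_context >= min_context
    ]
    return sorted(matches, key=lambda d: d.gpu_count)


if __name__ == "__main__":
    # bf16 for Llama 3.1 70B is only listed on the 2x H100 row.
    for d in find_deployments("llama-3.1-70b-instruct", quantization="bf16"):
        print(d.gpu, d.gpu_count, d.quantizations)
```

In this sketch, asking for `llama-3.1-70b-instruct` in `bf16` leaves only the two-GPU H100 row, because the single-H100 row lists `fp8` only. Asking for `gemma-2-9b-it` with `min_context=8192` returns an empty list, since the table gives a maximum context length of 4096 for that model.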

Audio Models

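| Name | GPU | GPU count | String in API | Available Quantizations | Maximum Context Length | License |
| --- | --- | --- | --- | --- | --- | --- |
| Whisper Large v3 Turbo | NVIDIA L40S | 1 | whisper-large-v3-turbo | fp16 | - | MIT License |
| - | NVIDIA H100 | 1 | whisper-large-v3-turbo | fp16 | - | MIT License |
| - | NVIDIA H100 | 2 | whisper-large-v3-turbo | fp16 | - | MIT License |
| - | NVIDIA L4 | 1 | whisper-large-v3-turbo | fp16 | - | MIT License |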

Text Embedding Models

| Name | GPU | GPU count | String in API | Available Quantizations | Maximum Context Length | License |
| --- | --- | --- | --- | --- | --- | --- |
| GTE Large EN v1.5 | NVIDIA H100 | 2 | gte-large-en-v1.5 | f32 | 8k | Apache License 2.0 |
| - | NVIDIA L4 | 1 | gte-large-en-v1.5 | f32 | 8k | Apache License 2.0 |
| - | NVIDIA L40S | 1 | gte-large-en-v1.5 | f32 | 8k | Apache License 2.0 |
| - | NVIDIA H100 | 1 | gte-large-en-v1.5 | f32 | 8k | Apache License 2.0 |
| GTE Multilingual base | NVIDIA L4 | 1 | gte-multilingual-base | fp16 | 8k | Apache License 2.0 |
| - | NVIDIA L40S | 1 | gte-multilingual-base | fp16 | 8k | Apache License 2.0 |
| - | NVIDIA H100 | 1 | gte-multilingual-base | fp16 | 8k | Apache License 2.0 |
| - | NVIDIA H100 | 2 | gte-multilingual-base | fp16 | 8k | Apache License 2.0 |

Ektos AI offers the most popular and trending open-source models. We add new models to our platform immediately after they are released.

If you would like to use a model that is not currently supported, please let us know on Discord!

Next steps