Inference -- Dedicated Deployments
Deploy any model on a dedicated GPU with its own API endpoint in seconds. Customize your deployment by selecting the quantization, context length, and GPU accelerator that fit your needs. Dedicated deployments deliver maximum performance with no rate limits.
GPU Type | $/minute | $/hour |
---|---|---|
NVIDIA L4 24GB | $0.0183 | $1.10 |
NVIDIA L40S 48GB | $0.0325 | $1.95 |
NVIDIA H100 80GB | $0.0658 | $3.95 |
NVIDIA H100 80GB x 2 | $0.1317 | $7.90 |
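To get a feel for what a dedicated deployment costs over a given runtime, the hourly rates in the table above can be prorated by the minute. The following is a minimal sketch, not an official calculator: the function name `estimate_deployment_cost` is hypothetical, and it assumes billing is linear in runtime and rounded to the nearest cent, which the pricing table does not state explicitly.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hourly rates in USD, copied from the pricing table above.
HOURLY_RATE_USD = {
    "NVIDIA L4 24GB": Decimal("1.10"),
    "NVIDIA L40S 48GB": Decimal("1.95"),
    "NVIDIA H100 80GB": Decimal("3.95"),
    "NVIDIA H100 80GB x 2": Decimal("7.90"),
}


def estimate_deployment_cost(gpu_type: str, minutes: int) -> Decimal:
    """Estimate the cost in USD of running one deployment for `minutes` minutes.

    Assumption: cost scales linearly with runtime and is rounded to the
    nearest cent. Actual billing granularity may differ.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    hourly_rate = HOURLY_RATE_USD[gpu_type]
    cost = hourly_rate * minutes / 60
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    # Two hours on a single H100, and a 30-day month on an L4.
    print(estimate_deployment_cost("NVIDIA H100 80GB", 120))
    print(estimate_deployment_cost("NVIDIA L4 24GB", 30 * 24 * 60))
```

The sketch uses the hourly column rather than the per-minute column because the per-minute figures appear to be rounded to four decimal places.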
Inference -- Serverless Models
Access the most popular models instantly, with no cold starts. Pay only for what you use (by tokens, minutes, or steps), ensuring cost efficiency and seamless performance.
Text and Embedding models | Type | $/1M tokens |
---|---|---|
Text models (0-4B params) | LLM | $0.08 |
Text models (4-8B params) | LLM | $0.15 |
Text models (8-21B params) | LLM | $0.25 |
Text models (21-41B params) | LLM | $0.70 |
Text models (41-80B params) | LLM | $0.90 |
Embeddings models (0-250M params) | TEBD | $0.008 |
Embeddings models (250-500M params) | TEBD | $0.016 |
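Per-token pricing translates into a request or batch cost by scaling the $/1M tokens rate by the number of tokens processed. The sketch below illustrates the arithmetic only. The bracket keys (such as `"text-8-21b"`) and the function name are hypothetical labels, not platform identifiers, and the sketch assumes every token is billed at the same rate, since the table does not distinguish input from output tokens.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the table above.
# The keys are illustrative labels, not official model or tier names.
PRICE_PER_MILLION_TOKENS_USD = {
    "text-0-4b": Decimal("0.08"),
    "text-4-8b": Decimal("0.15"),
    "text-8-21b": Decimal("0.25"),
    "text-21-41b": Decimal("0.70"),
    "text-41-80b": Decimal("0.90"),
    "embeddings-0-250m": Decimal("0.008"),
    "embeddings-250-500m": Decimal("0.016"),
}


def estimate_token_cost(size_bracket: str, tokens: int) -> Decimal:
    """Return the estimated cost in USD of processing `tokens` tokens.

    Assumption: all tokens in the request are billed at the bracket's
    flat rate. The result is left unrounded.
    """
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    rate = PRICE_PER_MILLION_TOKENS_USD[size_bracket]
    return rate * tokens / 1_000_000


if __name__ == "__main__":
    # 2.5M tokens through a model in the 8-21B bracket.
    print(estimate_token_cost("text-8-21b", 2_500_000))
    # 10M tokens through a small embeddings model.
    print(estimate_token_cost("embeddings-0-250m", 10_000_000))
```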
Spending Limits
Spending limits restrict how much you can spend on the Ektos AI platform per calendar month.
- The spending limit is determined by your total historical Ektos AI spend.
- You can purchase prepaid credits to immediately increase your historical spend.
Note: Credits are counted against your spending limit, so it is possible to hit the spending limit before all of your current credits are depleted.
Tier | Spending Limit ($/month) | Criteria |
---|---|---|
Tier 1 | $50 | Default with valid payment method added |
Tier 2 | $500 | Total historical spend of $100+ |
Tier 3 | $5,000 | Total historical spend of $1,000+ |
Tier 4 | $50,000 | Total historical spend of $10,000+ |
Custom | Custom | Contact sales@ektos.ai |
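The tier rules above amount to a threshold lookup on total historical spend. The following sketch shows one way to express that lookup and the remaining monthly headroom. The function names are hypothetical, it assumes a valid payment method is already on file (the Tier 1 criterion), it ignores Custom tiers, and it treats credit-funded usage as counting toward the month's spend, as the note above describes.

```python
# (minimum historical spend in USD, tier name, monthly limit in USD),
# ordered from the highest threshold down, taken from the table above.
TIERS = [
    (10_000, "Tier 4", 50_000),
    (1_000, "Tier 3", 5_000),
    (100, "Tier 2", 500),
    (0, "Tier 1", 50),
]


def spending_tier(historical_spend_usd: float) -> tuple[str, int]:
    """Return (tier name, monthly limit) for a total historical spend.

    Assumption: a valid payment method has been added, so Tier 1 is
    the floor. Custom tiers are not modeled.
    """
    if historical_spend_usd < 0:
        raise ValueError("historical spend must be non-negative")
    for threshold, name, limit in TIERS:
        if historical_spend_usd >= threshold:
            return name, limit
    raise AssertionError("unreachable: Tier 1 threshold is 0")


def remaining_monthly_budget(historical_spend_usd: float,
                             spent_this_month_usd: float) -> float:
    """Return how much more can be spent this calendar month.

    Usage paid for with prepaid credits still counts toward
    `spent_this_month_usd`.
    """
    _, limit = spending_tier(historical_spend_usd)
    return max(limit - spent_this_month_usd, 0)


if __name__ == "__main__":
    print(spending_tier(150))
    print(remaining_monthly_budget(150, 420))
```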