Ektos AI is now in Early Access!Join our Discord 

Introduction #

Welcome to the Ektos AI documentation!

Ektos AI is a cloud platform for generative AI where you can use the best open-source models without having to think about the underlying compute infrastructure.

We handle the heavy lifting: model deployment, GPU configuration, performance optimization, scaling, and continuous monitoring.

Services #

  • Inference -- Dedicated Deployments: Deploy any model with a dedicated GPU and API endpoint in seconds. Customize your deployment with options to select quantization, context length, and GPU accelerator tailored to your exact needs. Maximum performance with no rate limits.
  • Inference -- Serverless: Access the most popular models instantly, with no cold starts. Pay only for what you use (by tokens, minutes, steps) ensuring cost efficiency and seamless performance.

(Additional services will be announced after Early Access)

Real-time usage dashboards, statistics and logs are directly available from our web platform to get an accurate overview and manage costs effectively.

All our API inference endpoints are compatible with the OpenAI API. You can seamlessly use any OpenAI API client library for an immediate migration to our platform, leading to significant cost savings.

Next steps #