Alaya New Cloud Container Instance (CCI)
Product Overview
Alaya New Cloud Container Instance (CCI) is a Kubernetes-based serverless container service. By abstracting away underlying cluster and server management, CCI allows users to focus entirely on container images and business logic to rapidly deploy containerized applications. Featuring sub-second startup times, elastic scalability, and a pay-as-you-go billing model, CCI empowers users with industry-leading GPU compute and mainstream AI framework support at a fraction of the traditional cost.
Product Architecture
The core design philosophy behind CCI is to deliver accessible, cost-effective, and premium GPU compute. Many small and medium-sized workloads only require single-node compute (up to 8 GPUs). For these users, managing a full Kubernetes cluster introduces steep learning curves and high operational overhead. CCI bridges this gap by offering a ready-to-use serverless execution environment, eliminating infrastructure management so you can focus strictly on model development and inference workloads.
CCI doesn't operate in a vacuum. It natively integrates with Serverless Job (batch processing workloads) and Inference capabilities, creating a comprehensive, end-to-end AI compute solution.
Under the hood, CCI leverages KVM-based secure sandboxing for robust tenant-level resource isolation, alongside network overlay technologies (like VXLAN) for secure multi-tenant networking. For storage, it utilizes Block Storage for system disks and integrates with NAS shared file systems via a high-performance Overlay Network, guaranteeing high-throughput access to your data and models.
Ultimately, CCI is built to feel as seamless as your local dev machine—just submit your workloads. The platform handles all underlying resource orchestration, provisioning, and maintenance, billing you exclusively for the compute you actually consume.
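To make the "just submit your workloads" model concrete, here is a minimal sketch of what a single-node workload description might look like, written as a plain Python dict. All field names, the registry URL, and the mount path are hypothetical illustrations, not the actual CCI API:

```python
# A hypothetical CCI workload description. Field names are illustrative only;
# consult the CCI documentation for the real submission format.
workload = {
    "image": "registry.example.com/team/llm-train:latest",  # hypothetical registry
    "resources": {"gpu": 8, "gpu_type": "H100"},            # single-node, up to 8 GPUs
    "mounts": [{"type": "nas", "path": "/mnt/data"}],       # shared NAS file system
    "env": {"EPOCHS": "3"},                                 # injected at startup
    "command": ["python", "train.py"],
}
```

The point of the shape is that everything below this spec (node provisioning, sandboxing, overlay networking, NAS attachment) is handled by the platform.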
Pricing & Billing
CCI features a transparent, pay-as-you-go pricing model based purely on actual resource consumption, including:
A. Container Compute
- Billing Model: Real-time metering based on actual DCU consumption.
- Billing Cycle: Billing begins the moment the container spins up and stops when it terminates, pro-rated down to the exact second.
B. Storage
Storage is divided into two tiers: Disk Storage and NAS Storage.
- Billing Model: Billed based on the maximum provisioned storage capacity upon creation.
- Billing Cycle:
- Charges are settled hourly at the top of the hour, starting from the time of provisioning or expansion.
- Upon resource deletion, billing stops immediately. Any partial hour of usage is pro-rated to the second.
C. Image Registry
- Billing Model: Billed based on the maximum provisioned capacity of your image registry.
- Billing Cycle:
- Settled hourly at the top of the hour.
- Upon deletion, billing stops immediately. Any partial hour of usage is pro-rated to the second.
❗ Important: To prevent accidental data loss, storage volumes and image registries are not automatically deleted when a task terminates. These resources will continue to occupy your quota and incur persistent charges until manually released. For details, see the Billing Overview.
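The pro-rated rules above can be sketched in a few lines of Python. The rate values below are made-up placeholders, and the function names are illustrative; only the metering logic (compute billed to the second, storage settled hourly with the final partial hour pro-rated to the second) comes from the rules above:

```python
from datetime import datetime

def container_cost(start: datetime, stop: datetime, dcu_rate_per_hour: float) -> float:
    """Compute charge: billed from startup to termination, pro-rated to the second."""
    seconds = (stop - start).total_seconds()
    return seconds / 3600.0 * dcu_rate_per_hour

def storage_cost(provisioned_gib: float, full_hours: int, partial_seconds: float,
                 rate_per_gib_hour: float) -> float:
    """Storage charge: billed on provisioned capacity, settled hourly;
    any final partial hour is pro-rated to the second."""
    billable_hours = full_hours + partial_seconds / 3600.0
    return provisioned_gib * billable_hours * rate_per_gib_hour

# A container running 90 minutes at a hypothetical rate of 2.40 per DCU-hour:
run_cost = container_cost(datetime(2025, 1, 1, 9, 0), datetime(2025, 1, 1, 10, 30), 2.40)

# 100 GiB of NAS held for 5 full hours plus 12 minutes, hypothetical rate 0.001/GiB-hour:
nas_cost = storage_cost(100, 5, 12 * 60, 0.001)
```

Here `run_cost` works out to 3.60 and `nas_cost` to 0.52 under the placeholder rates.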
Core Features
✦ Flexible Compute Provisioning: Choose from a wide range of CPU and GPU instance types (including H100, L40S, and P4) to perfectly match your workload requirements and eliminate resource overprovisioning.
✦ Tiered Storage Solutions: Offers multiple NAS storage tiers, ranging from high-performance to capacity-optimized options, ensuring the ideal cost-to-performance ratio for any I/O profile.
✦ Streamlined Image & Environment Management: Natively supports multi-source images for effortless deployment. Simplifies operations with dynamic environment variable injection to tweak application configurations on the fly.
✦ End-to-End Lifecycle Management: Manage your containers seamlessly from provisioning to teardown. Supports diverse access methods, including WebShell and built-in Jupyter Notebook integrations.
✦ Real-Time Observability: Features comprehensive metrics monitoring, event logging, and intelligent scheduling for highly concurrent instances, all surfaced in a transparent, visual dashboard for full-lifecycle container management.
✦ Precise Cost Control: Billed purely on a pay-as-you-go basis (per DCU/hour). You are only charged from the exact moment an instance starts until it stops. Zero idle charges.
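The dynamic environment variable injection mentioned above is what lets one container image serve many configurations. A minimal sketch of the application side, where the variable names (MODEL_PATH, BATCH_SIZE, LOG_LEVEL) are hypothetical examples:

```python
import os

# Configuration injected at container start; variable names are illustrative.
MODEL_PATH = os.environ.get("MODEL_PATH", "/mnt/nas/models/default")
BATCH_SIZE = int(os.environ.get("BATCH_SIZE", "8"))
LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO")

def describe_config() -> dict:
    """Return the effective configuration. The same image can be redeployed
    with different settings by changing the injected variables, no rebuild needed."""
    return {"model_path": MODEL_PATH, "batch_size": BATCH_SIZE, "log_level": LOG_LEVEL}
```

Tweaking `BATCH_SIZE` at deploy time then changes behavior without touching the image.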
Key Benefits
- Focus on the Business: Offload tedious infrastructure provisioning and complex O&M. Developers can concentrate purely on business logic, accelerating innovation and iteration cycles.
- Open Ecosystem: Natively supports popular machine learning frameworks and lightweight container runtimes. Flexibly integrates with the rest of the Alaya New Cloud product suite.
- Secure & Reliable: Delivers VM-grade security isolation while retaining the lightweight agility of containers, striking the right balance between security and efficiency.
- AI-Native: Purpose-built and optimized for the entire AI pipeline. Enables low-cost model development and accelerates training via high-performance compute clusters.
Use Cases
Large AI Model Development
CCI provides an instant, cloud-native dev environment. Quickly spin up projects using mainstream frameworks, validate prototypes, and test business logic. It supports seamless team collaboration and parallel development, drastically reducing time-to-market and boosting innovation.
Model Fine-Tuning
To address the varying VRAM and system memory demands of different workloads, CCI offers a diverse portfolio of GPU specs. It natively supports fine-tuning techniques like LoRA and full-parameter tuning, ensuring efficient, cost-effective model customization with minimal framework migration friction.
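The LoRA technique mentioned above explains why fine-tuning is VRAM-frugal: instead of updating a full weight matrix W (d x k), it trains two small matrices B (d x r) and A (r x k) with rank r much smaller than d and k. A pure-Python sketch of the core update, with toy 2x2 numbers (real fine-tuning would of course use a framework):

```python
def matmul(X, Y):
    """Naive matrix multiply over nested lists, for illustration only."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_update(W, B, A, alpha, r):
    """LoRA merge: W' = W + (alpha / r) * (B @ A).
    W stays frozen during training; only B and A are learned."""
    delta = matmul(B, A)
    return [[w + (alpha / r) * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# Toy example: 2x2 frozen weights with rank-1 adapters.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]   # d x r
A = [[0.5, 0.5]]     # r x k
W_new = lora_update(W, B, A, alpha=1.0, r=1)
```

Training B and A here means learning 4 numbers instead of 4 full-matrix entries; at realistic sizes (d, k in the thousands, r around 8-64) the savings dominate.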
Model Inference Serving
Seamlessly transition fine-tuned models into production. Supports API-driven serving, batch inference, and real-time online inference to deliver high-concurrency, low-latency AI services.
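The batch inference mode mentioned above boils down to grouping individual requests so the model runs once per batch rather than once per request. A minimal, framework-free sketch, where `batch_infer` and the stand-in model are hypothetical illustrations:

```python
from typing import Callable, Iterable, List

def batch_infer(inputs: Iterable, model: Callable[[List], List],
                batch_size: int = 4) -> List:
    """Group individual requests into fixed-size batches and invoke the
    model once per batch, the core trick behind high-throughput serving."""
    items = list(inputs)
    outputs: List = []
    for i in range(0, len(items), batch_size):
        outputs.extend(model(items[i:i + batch_size]))
    return outputs

# Stand-in "model" that squares each input, batched 4 at a time.
results = batch_infer(range(10), lambda batch: [x * x for x in batch], batch_size=4)
```

Real-time online inference takes the opposite trade: smaller (often dynamic) batches to keep per-request latency low at high concurrency.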
Core Concepts
- Serverless: A cloud computing execution model where the cloud provider dynamically manages the allocation and provisioning of servers. You don't manage underlying infrastructure; you simply submit workloads, specify compute requirements, and pay only for the exact resources and time consumed.
- Container: A standalone, executable package of software that includes everything needed to run an application in an isolated environment. Multiple containers can run on a single host node.
- CCI (Cloud Container Instance): A fully managed, serverless container service. It eliminates the need to build and maintain Kubernetes clusters, allowing you to spin up containers out-of-the-box.
- NAS (Network Attached Storage): A network-level shared file system. Capacity NAS is ideal for low-I/O data like backups and logs, while High-Performance NAS is optimized for I/O-intensive workloads like AI training and databases.
- Container Image: An immutable, lightweight, standalone, executable package that includes everything needed to run a piece of software (code, runtime, system tools, libraries, and settings).
- Kubernetes (K8s): An open-source container orchestration system for automating software deployment, scaling, and lifecycle management.