Alaya NeW Cloud

Create a training job

Pick a framework, configure resources, mount storage and an image, then submit a training job from the HyperTrain console.

HyperTrain is a Kubernetes-native distributed training service with built-in support for the PyTorch, DeepSpeed, MPI, and TensorFlow frameworks. The platform abstracts infrastructure, scheduling, and runtime dependencies behind a unified service interface, so users can launch training jobs without managing the underlying operations.

Prerequisites

  • Compute account DCU balance > 0
  • Cash account balance > 0
  • The enterprise has provisioned NAS bulk or NAS performance storage in the current data center, and the current account has permission to use it.
  • For private images, the enterprise must have an image registry in the same data center.
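
The prerequisites above can be read as a pre-flight checklist. The sketch below is only an illustration of that checklist in Python: `AccountState` and `unmet_prerequisites` are hypothetical names, not part of any HyperTrain API, and the values would have to be read from the console by hand.

```python
from dataclasses import dataclass


@dataclass
class AccountState:
    """Hypothetical snapshot of the values the prerequisites refer to."""
    dcu_balance: float            # compute account DCU balance
    cash_balance: float           # cash account balance
    has_nas_storage: bool         # NAS bulk or NAS performance storage in this data center
    has_storage_permission: bool  # current account has permission on that storage
    has_image_registry: bool      # enterprise image registry in the same data center


def unmet_prerequisites(state: AccountState, uses_private_image: bool) -> list[str]:
    """Return a readable list of prerequisites that are not satisfied."""
    problems = []
    if state.dcu_balance <= 0:
        problems.append("compute account DCU balance must be greater than 0")
    if state.cash_balance <= 0:
        problems.append("cash account balance must be greater than 0")
    if not state.has_nas_storage:
        problems.append("no NAS bulk or NAS performance storage in this data center")
    elif not state.has_storage_permission:
        problems.append("current account lacks permission on the NAS storage")
    # The registry requirement only applies when a private image is used.
    if uses_private_image and not state.has_image_registry:
        problems.append("private images need an image registry in the same data center")
    return problems
```

An empty list means every prerequisite on this page is met. The image registry check is skipped unless the job uses a private image.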

Steps

Sign in and go to Product Center → Compute → HyperTrain. Click Activate or Create Job to enter the job creation page.

1. Basic information

2. Resource configuration

3. Storage and image

4. Other settings

Submit the job. Once it is created successfully, the job appears on the job list page.
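
To see what the four form sections collect before you submit, it can help to picture the job as a single record. The sketch below is a hypothetical local model in Python, not a HyperTrain API or file format. The class name, attribute names, and placeholder values are invented for illustration. The framework and image choices come from this page, while the numeric checks are common-sense assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional

# Choices listed on this page.
FRAMEWORKS = ("PyTorch", "DeepSpeed", "MPI", "TensorFlow")
IMAGE_KINDS = ("base", "app", "private")


@dataclass
class TrainingJobDraft:
    """Hypothetical model of the job creation form (not a platform API)."""
    region: str                      # data center where the job runs
    framework: str                   # one of FRAMEWORKS
    compute_spec: str                # compute spec, free-form in this sketch
    node_count: int
    storage_type: str
    mount_path: str
    image_kind: str                  # one of IMAGE_KINDS
    image: str
    start_command: str
    name: Optional[str] = None       # None stands for the auto-generated name
    template: Optional[str] = None   # None means no template is used
    description: str = ""
    env: dict = field(default_factory=dict)
    max_retries: int = 0             # auto-retry count N
    timeout_seconds: Optional[int] = None
    priority: int = 0                # lower number = higher priority

    def problems(self) -> list:
        """Return a list of issues; the numeric rules are assumptions."""
        found = []
        if self.framework not in FRAMEWORKS:
            found.append(f"unknown framework: {self.framework}")
        if self.image_kind not in IMAGE_KINDS:
            found.append(f"unknown image kind: {self.image_kind}")
        if self.node_count < 1:
            found.append("node count must be at least 1")
        if self.max_retries < 0:
            found.append("auto-retry count cannot be negative")
        if self.timeout_seconds is not None and self.timeout_seconds <= 0:
            found.append("timeout must be positive")
        return found


# Placeholder values only; real regions, specs, and images come from the console.
draft = TrainingJobDraft(
    region="example-dc", framework="PyTorch", compute_spec="example-spec",
    node_count=2, storage_type="NAS performance", mount_path="/data",
    image_kind="base", image="example-image", start_command="python train.py",
)
print(draft.problems())  # prints []
```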

Field reference

| Field | Description |
| --- | --- |
| Job name | Auto-generated by default; customizable. |
| Template | Create from an existing template, or skip. |
| Description | Free-form summary of the job. |
| Region | Data center where the job runs. |
| Framework | PyTorch, DeepSpeed, MPI, TensorFlow. |
| Resources | Compute spec and node count. |
| Storage | Storage type and mount path. |
| Image | Choose from base image, app image, or private image. Private = your custom image stored in the enterprise registry. |
| Env vars (optional) | Custom env. The platform also injects system variables automatically. |
| Auto-retry | Retry the job up to N times on failure. |
| Timeout | Hard cap on wall-clock runtime; the job auto-cancels on timeout. |
| Start command | Default working dir for platform images is /root; for custom images the dir set in the image is used. |
| Priority | Applies only to queued jobs. Lower number = higher priority. |
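
The Priority rule (lower number = higher priority) is easy to misread, so here is a minimal Python sketch of that ordering using a min-heap. It only illustrates the rule and does not describe HyperTrain's actual scheduler. The `JobQueue` class is hypothetical, and serving equal-priority jobs in submission order is an assumption this page does not state.

```python
import heapq
import itertools


class JobQueue:
    """Toy queue: the job with the lowest priority number is dequeued first."""

    def __init__(self):
        self._heap = []
        # Submission counter, used only to break ties (assumed FIFO).
        self._order = itertools.count()

    def submit(self, name: str, priority: int) -> None:
        heapq.heappush(self._heap, (priority, next(self._order), name))

    def next_job(self) -> str:
        _priority, _order, name = heapq.heappop(self._heap)
        return name


queue = JobQueue()
queue.submit("nightly-eval", priority=5)
queue.submit("urgent-finetune", priority=1)
queue.submit("second-urgent", priority=1)

dequeued = [queue.next_job() for _ in range(3)]
print(dequeued)  # ['urgent-finetune', 'second-urgent', 'nightly-eval']
```

The job submitted first with priority 5 is dequeued last, because both jobs with priority 1 outrank it.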
