Alaya NeW Cloud

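Distributed Training Task Details

Retrieves the full configuration and runtime information of a single distributed training task, including its resource specification, storage mounts, environment variables, start command, and execution records.

GET https://api.alayanew.com/v1/training/instance/{id}/detail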

Authorizations

Authorization String Required

Authenticate with an Open API Key that you have already obtained. Example: Bearer [YOUR_API_KEY]
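
As a minimal sketch of the authorization rule above, the helper below builds the request headers from a key. The function name `build_headers` and the environment variable `ALAYA_API_KEY` are illustrative assumptions, not part of the API.

```python
import os


def build_headers(api_key: str) -> dict[str, str]:
    """Return request headers that carry the key as a Bearer token."""
    if not api_key:
        raise ValueError("API key is empty")
    return {
        "accept": "application/json",
        "Authorization": f"Bearer {api_key}",
    }


# ALAYA_API_KEY is a hypothetical variable name chosen for this sketch;
# the fallback is only a placeholder and will not authenticate.
headers = build_headers(os.environ.get("ALAYA_API_KEY", "YOUR_API_KEY"))
```

Reading the key from the environment keeps it out of source control; any other secret store would work equally well.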

Path Parameters

id String Required

Training task ID (the id value returned by the task list). Example: 1234567890
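
The path parameter can be substituted into the endpoint as sketched below. The helper name `detail_url` is an assumption for illustration; percent-encoding the ID is a defensive choice rather than a documented requirement.

```python
from urllib.parse import quote

BASE_URL = "https://api.alayanew.com/v1/training/instance"


def detail_url(task_id: str) -> str:
    """Build the task detail URL for a given training task ID."""
    task_id = str(task_id).strip()
    if not task_id:
        raise ValueError("task id is required")
    # Encode the ID so that unexpected characters cannot alter the path.
    return f"{BASE_URL}/{quote(task_id, safe='')}/detail"


print(detail_url("1234567890"))
# prints https://api.alayanew.com/v1/training/instance/1234567890/detail
```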

Response

status Integer

Business status code; 200 indicates success.

message String

Response message. Example: "OK"

data Object

Training task details.

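Properties of data:

id String

Training task ID. Example: "1234567890"

name String

Training task name. Example: "llama3-8b-sft"

desc String

Training task description.

trainingType String

Training type: PRE_TRAINING (pre-training) or HPC.

trainingFramwork String

Training framework. Example: "PyTorch"

priority Integer

Task priority from 1 to 3; a smaller value means a higher priority. Example: 3

image String

Image address.

aidcId Integer

Intelligent computing center (AIDC) ID. Example: 1001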

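storageConfigs Array<Object>

List of mounted storage.

Items of storageConfigs:

id String

Storage instance ID. Example: "00964ce3-b99c-43da-8826-ab9c5574eef2"

name String

Storage directory name. Example: "nas0001"

description String

Storage description. Example: "训练数据"

orderInstanceId String

Subscription instance ID. Example: "00964ce3-b99c-43da-8826-ab9c5574eef2"

status String

Status. Example: "Creating"

storageType String

Storage type. Example: "capacity"

quota String

Quota in GB. Example: "100"

used String

Used capacity in GB. Example: "10"

mountPath String

Default mount path. Example: "/mnt/nas/new_path"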

resource Object

Resource.

Properties of resource:

quota String

System disk capacity in GB. Example: "50"

productName String

Product name (resource type). Example: "NA型实-大容量存储"

productCode String

Product code. Example: "PRD-NAS-CAPACITY-1"

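Items of storageConfigs (continued):

createdTime String

Creation time. Example: "2023-12-10 00:00:00"

lastUpdateTime String

Last modification time. Example: "2023-12-10 23:59:59"

releaseTime String

Release time. Example: "2023-12-10 23:59:59"

tenantId String

Tenant ID. Example: "4d7119bd-8e71-49a9-a319-f84d165f79d9"

price String

Price. Example: "0.01B"

priceUnit String

Price unit, for example 元/GB/月 (CNY per GB per month). Example: "8"

currencyUnitOfMeasure Integer

Currency unit of measure. Example: 9

currencyUnitPriceFormula String

Cash price formula, for example 元/GB/月 (CNY per GB per month). Example: "0.01B"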

sharePolicy Object

Share policy.

Properties of sharePolicy:

policy String

Policy. Example: "R"

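Items of storageConfigs (continued):

createdBy String

Creator ID. Example: "4d7119bd-8e71-49a9-a319-f84d165f79d9"

createdByName String

Creator name.

lastUpdateBy String

ID of the user who last updated the storage.

lastUpdateByName String

Name of the user who last updated the storage.

createScene String

Type of the instance that created this storage, such as ALab or CCI; an empty value means the storage was created manually. Example: "ALab"

createSceneInstanceId String

ID of the instance that created this storage. Example: "00964ce3-b99c-43da-8826-ab9c5574eef2"

isDefaultStorage Boolean

Whether this is the default storage of the current instance (the storage bound at creation). Example: true

aidcId Integer

Intelligent computing center (AIDC) ID. Example: 1001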
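
Because `quota` and `used` are returned as strings, a client has to convert them before doing arithmetic. The sketch below assumes both strings hold plain decimal numbers in GB, as in the examples above; the helper name `storage_usage` is hypothetical.

```python
def storage_usage(storage: dict) -> float:
    """Return used/quota as a fraction; 0.0 when the quota is missing or zero."""
    # Assumption: both fields are decimal strings such as "100" and "10".
    quota_gb = float(storage.get("quota") or 0)
    used_gb = float(storage.get("used") or 0)
    return used_gb / quota_gb if quota_gb > 0 else 0.0


item = {"name": "nas0001", "quota": "100", "used": "10", "mountPath": "/mnt/nas/new_path"}
print(f"{item['name']}: {storage_usage(item):.0%} used")
# prints nas0001: 10% used
```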

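Properties of data (continued):

env Object

Environment variables (raw JSON). Example: {"ENV_MODE": "production"}

enableAutoRetry Boolean

Whether automatic retry is enabled. Example: true

maxRetryCount Integer

Maximum number of retries. Example: 3

enableTimeoutCancel Boolean

Whether the task is cancelled when it times out. Example: true

timeoutHours Integer

Maximum task run time in hours. Example: 24

startCommand String

Container start command. Example: "python train.py --data /root/nas/123/data"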

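resource Object

Resource.

Properties of resource:

cpuCores String

Number of CPU cores. Example: "18"

memoryGB String

Memory size in GB. Example: "200"

gpuName String

GPU name. Example: "NVIDIA-H800A-NV-80G"

gpuCount String

Number of GPUs. Example: "8"

productName String

Product name. Example: "分布式训练1"

productCode String

Product code. Example: "PRD-TRAIN-1"

workerCount Integer

Number of nodes. Example: 1

productPrice String

Product unit price. Example: "20CB/小时"

type String

Resource type (worker/ps/Chief/lancher). Example: "worker"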
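
Several numeric fields of `resource` (`cpuCores`, `memoryGB`, `gpuCount`) are typed as String, while `workerCount` is an Integer. A minimal sketch of normalizing them on the client side follows; the `ResourceSpec` class and `parse_resource` function are illustrative names, and the code assumes the strings contain whole numbers as in the examples.

```python
from dataclasses import dataclass


@dataclass
class ResourceSpec:
    cpu_cores: int
    memory_gb: int
    gpu_name: str
    gpu_count: int
    worker_count: int


def parse_resource(resource: dict) -> ResourceSpec:
    """Convert the string-typed numeric fields of `resource` to integers."""
    return ResourceSpec(
        cpu_cores=int(resource["cpuCores"]),
        memory_gb=int(resource["memoryGB"]),
        gpu_name=resource["gpuName"],
        gpu_count=int(resource["gpuCount"]),
        worker_count=int(resource["workerCount"]),
    )


spec = parse_resource({
    "cpuCores": "18",
    "memoryGB": "200",
    "gpuName": "NVIDIA-H800A-NV-80G",
    "gpuCount": "8",
    "workerCount": 1,
})
print(spec)
```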

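Properties of data (continued):

createdBy String

Creator ID. Example: "user1"

creatorName String

Creator name. Example: "张三"

createdTime String

Creation time. Example: "2025-05-30 10:00:00"

startTime String

Time the task started running. Example: "2025-05-30 10:02:00"

lastUpdateTime String

Time of the most recent update.

status String

Task status. Example: "RUNNING"

source String

Task source. Example: "alab"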

execInfos Array<Object>

List of execution records for the training task.

Items of execInfos:

trainingTaskId String

Distributed training task ID.

status String

Status.

info String

Run information.

createTime String

Creation time.
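
Since the body carries its own business `status` alongside `message` and `data`, a client may want to check it explicitly instead of relying on the HTTP status alone. The following is a sketch under that assumption; `ApiError` and `unwrap` are hypothetical names, not part of any SDK.

```python
class ApiError(Exception):
    """Raised when the business status code in the body is not 200."""

    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def unwrap(payload: dict) -> dict:
    """Return the `data` object, or raise ApiError for a non-200 status."""
    status = payload.get("status")
    if status != 200:
        raise ApiError(status, payload.get("message", ""))
    return payload.get("data") or {}


ok = unwrap({"status": 200, "message": "OK", "data": {"id": "1234567890"}})
print(ok["id"])  # prints 1234567890
```

A typical call site would pass `response.json()` to `unwrap` and catch `ApiError` to log the `message` field.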

Request Examples

cURL

curl -X 'GET' \
  'https://api.alayanew.com/v1/training/instance/1234567890/detail' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer [YOUR_API_KEY]'
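
Python

import requests

task_id = "1234567890"
url = f"https://api.alayanew.com/v1/training/instance/{task_id}/detail"
headers = {
    "accept": "application/json",
    "Authorization": "Bearer [YOUR_API_KEY]"  # replace with your Open API Key
}

# A timeout keeps the call from hanging indefinitely on network problems.
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # raise on HTTP 4xx/5xx responses
print(response.json())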

JavaScript

const taskId = '1234567890';

fetch(`https://api.alayanew.com/v1/training/instance/${taskId}/detail`, {
  method: 'GET',
  headers: {
    'accept': 'application/json',
    'Authorization': 'Bearer [YOUR_API_KEY]'
  }
})
  .then(res => {
    if (!res.ok) {
      throw new Error(`HTTP error! status: ${res.status}`);
    }
    return res.json();
  })
  .then(console.log)
  .catch(console.error);

Response Examples

200

{
  "status": 200,
  "message": "OK",
  "data": {
    "id": "1234567890",
    "name": "my training task",
    "desc": "my training task",
    "trainingType": "PRE_TRAINING",
    "trainingFramwork": "TensorFlow",
    "priority": 3,
    "image": "harbor.zetyun.cn/anc-public/general/tensorflow:2.16.1-gpu-jupyter",
    "aidcId": 1001,
    "storageConfigs": [
      {
        "id": "00964ce3-b99c-43da-8826-ab9c5574eef2",
        "name": "nas0001",
        "description": "我的实例",
        "orderInstanceId": "00964ce3-b99c-43da-8826-ab9c5574eef2",
        "status": "Creating",
        "storageType": "capacity",
        "quota": "100",
        "used": "10",
        "mountPath": "/mnt/nas/new_path",
        "resource": {
          "quota": "50",
          "productName": "NA型实-大容量存储",
          "productCode": "PRD-NAS-CAPACITY-1"
        },
        "createdTime": "2023-12-10 00:00:00",
        "lastUpdateTime": "2023-12-10 23:59:59",
        "releaseTime": "2023-12-10 23:59:59",
        "tenantId": "4d7119bd-8e71-49a9-a319-f84d165f79d9",
        "price": "0.01B",
        "priceUnit": "8",
        "currencyUnitOfMeasure": 9,
        "currencyUnitPriceFormula": "0.01B",
        "sharePolicy": {
          "policy": "R"
        },
        "createdBy": "4d7119bd-8e71-49a9-a319-f84d165f79d9",
        "createdByName": "string",
        "lastUpdateBy": "string",
        "lastUpdateByName": "string",
        "createScene": "ALab",
        "createSceneInstanceId": "00964ce3-b99c-43da-8826-ab9c5574eef2",
        "isDefaultStorage": true,
        "aidcId": 9007199254740991
      }
    ],
    "env": {
      "ENV_MODE": "production",
      "MAX_CONN": "200"
    },
    "enableAutoRetry": true,
    "maxRetryCount": 3,
    "enableTimeoutCancel": true,
    "timeoutHours": 1,
    "startCommand": "python train.py --data /root/nas/123/data",
    "resource": {
      "cpuCores": "18",
      "memoryGB": "200",
      "gpuName": "NVIDIA-H800A-NV-80G",
      "gpuCount": "8",
      "productName": "分布式训练1",
      "productCode": "PRD-TRAIN-1",
      "workerCount": 1,
      "productPrice": "2DCU/小时",
      "type": "worker"
    },
    "createdBy": "user1",
    "creatorName": "张三",
    "createdTime": "2026-06-01T09:07:50.456Z",
    "startTime": "2026-06-01T09:07:50.456Z",
    "lastUpdateTime": "2026-06-01T09:07:50.456Z",
    "status": "RUNNING",
    "source": "alab",
    "execInfos": [
      {
        "trainingTaskId": "string",
        "status": "string",
        "info": "string",
        "createTime": "2026-06-01T09:07:50.456Z"
      }
    ]
  }
}
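
The field reference shows timestamps such as "2025-05-30 10:00:00", while the sample response above uses ISO 8601 values such as "2026-06-01T09:07:50.456Z". A tolerant parser for both styles might look like the sketch below. The deadline calculation assumes `timeoutHours` is counted from `startTime`, which the documentation does not state; both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_timestamp(value: str) -> datetime:
    """Parse either timestamp style that appears on this page."""
    if value.endswith("Z"):
        # ISO 8601 with fractional seconds and a UTC designator.
        parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
        return parsed.replace(tzinfo=timezone.utc)
    # "YYYY-MM-DD HH:MM:SS": the time zone is not documented,
    # so the result is left as a naive datetime.
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S")


def timeout_deadline(data: dict) -> Optional[datetime]:
    """Assumption: the timeout window starts at startTime."""
    if not data.get("enableTimeoutCancel") or not data.get("startTime"):
        return None
    return parse_timestamp(data["startTime"]) + timedelta(hours=data["timeoutHours"])


sample = {
    "enableTimeoutCancel": True,
    "timeoutHours": 1,
    "startTime": "2026-06-01T09:07:50.456Z",
}
print(timeout_deadline(sample))
```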

403

{
  "status": 403,
  "message": "Forbidden",
  "data": {}
}

500

{
  "status": 500,
  "message": "Internal Server Error",
  "data": {}
}
