DeployService (Expert)

Last updated: 2025-07-03 18:32:25

Interface for deploying a base model.

POST

https://api.alayanew.com/api/serverless-infer/v1/deployment/expert

Authorizations

Authorization: String, Header, Required

Authenticate with the Open API Key you have obtained, for example: plain Credential=[YOUR_AK],Signature=[YOUR_SK].
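
As a minimal sketch, the header value in the format above could be assembled in Python as follows. The helper name `build_auth_header` is hypothetical and not part of the API; the access key and secret key arguments stand in for your own credentials.

```python
def build_auth_header(access_key: str, secret_key: str) -> dict:
    """Return the Authorization header in the
    'plain Credential=<AK>,Signature=<SK>' form shown above."""
    if not access_key or not secret_key:
        raise ValueError("both access_key and secret_key are required")
    return {
        "Authorization": f"plain Credential={access_key},Signature={secret_key}"
    }
```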

Body

application/json

vksId: String, Required

Vital Kubernetes Engine (VKS) cluster ID.

namespace: String, Required

Vital Kubernetes Engine (VKS) namespace.

name: String, Required

Service name.

servedName: List<String>, Required

Internal model identifier.

modelId: String, Required

Model ID.

headConfig: Object, Required

headConfig.image: String, Required

headConfig.imagePullSecret: Object

headConfig.labels: Object

headConfig.env: Object

headConfig.cmd: Array[String], Required

headConfig.args: Array[String]

headConfig.resource: Object, Required

headConfig.pvcMounts: Array

workerConfig: Object

workerConfig.image: String, Required

workerConfig.imagePullSecret: Object

workerConfig.labels: Object

workerConfig.env: Object

workerConfig.cmd: Array[String], Required

workerConfig.args: Array[String]

workerConfig.resource: Object, Required

workerConfig.workers: Int, Required

scale: Object, Required

scale.max: Int, Required

scale.min: Int, Required

scale.rpsValue: Int

scale.inFlightValue: Int

scale.idleTime: Int

extensions: Object, Required

Extension fields.
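
The Required markers in the body parameters above can be checked on the client before sending a request. The following is a sketch only: the function name `missing_fields` and the two lookup tables are hypothetical, and it assumes that the nested fields of `workerConfig` are required only when `workerConfig` itself is present, since that object is not marked Required.

```python
# Field names are taken from the parameter list; the helper itself is
# illustrative and not part of the API.
REQUIRED_TOP = ("vksId", "namespace", "name", "servedName", "modelId",
                "headConfig", "scale", "extensions")
REQUIRED_NESTED = {
    "headConfig": ("image", "cmd", "resource"),
    "workerConfig": ("image", "cmd", "resource", "workers"),
    "scale": ("max", "min"),
}


def missing_fields(body: dict) -> list:
    """Return dotted names of fields marked Required that are absent."""
    missing = [key for key in REQUIRED_TOP if key not in body]
    for parent, children in REQUIRED_NESTED.items():
        section = body.get(parent)
        if not isinstance(section, dict):
            # Absent parent: either reported above or optional (workerConfig).
            continue
        missing += [f"{parent}.{child}" for child in children
                    if child not in section]
    return missing
```

An empty result means every field marked Required is present; it does not check value types or any server-side constraints.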

Response

Status code:

200

application/json

code: Int

Common return value indicating the execution result of the operation. 0 indicates that the operation completed successfully; -1 indicates failure.

data: Object

data.serviceId: String

msg: String, Required

Returns exception information when the code value is -1.
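
A caller might interpret these response fields as in the sketch below. `DeployError` and `extract_service_id` are hypothetical names introduced for illustration; the code accepts `code` as either a number or a numeric string, which is an assumption rather than documented behavior.

```python
import json


class DeployError(Exception):
    """Raised when the response reports a non-zero code."""


def extract_service_id(response_text: str) -> str:
    """Return data.serviceId on success, or raise DeployError with msg."""
    payload = json.loads(response_text)
    # int() tolerates both 0 and "0".
    if int(payload["code"]) != 0:
        raise DeployError(payload.get("msg", "deployment failed"))
    return payload["data"]["serviceId"]
```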

cURL

curl --location --request POST 'https://api.alayanew.com/api/serverless-infer/v1/deployment/expert' \
     --header 'Authorization: plain Credential=YOUR_AK,Signature=YOUR_SK' \
     --header 'Content-Type: application/json' \
     --data '
        {
            "name": "test-expert",
            "namespace": "default",
            "vksId": "vcacb50arkk4",
            "servedName": ["testvllm"],
            "modelId": "c486cdee-c316-4fc1-9f75-0d1741940f27",
            "scale": {
                "max": 3,
                "min": 1,
                "rpsValue": 10,
                "idleTime": 60
            },
            "headConfig": {
                "image": "registry.cn-hangzhou.aliyuncs.com/ls-2018/test:vllm-0.8.1p",
                "cmd": ["sh", "-c", "test.sh"],
                "labels": {},
                "env": {
                    "VLLM_ALLOW_RUNTIME_LORA_UPDATING": "true"
                },
                "args": ["-Xmx", "52m"],
                "resource": {
                    "mem": 32,
                    "cpu": 4,
                    "gpu": {
                        "gpuType": "nvidia.com/gpu-l40s",
                        "count": 1
                    }
                },
                "pvcMounts": [
                    {
                        "containerPath": "/scripts",
                        "pvcName": "test-name"
                    }
                ]
            },
            "workerConfig": {
                "workers": 1,
                "image": "registry.cn-hangzhou.aliyuncs.com/ls-2018/test:vllm-0.8.1p",
                "cmd": ["sh", "-c", "test.sh"],
                "labels": {},
                "env": {
                    "VLLM_ALLOW_RUNTIME_LORA_UPDATING": "true"
                },
                "args": [""],
                "resource": {
                    "workers": 3,
                    "mem": 8,
                    "cpu": 4,
                    "gpu": {
                        "gpuType": "nvidia.com/gpu-l40s",
                        "count": 1
                    }
                },
                "pvcMounts": [
                    {
                        "containerPath": "/scripts",
                        "pvcName": "test-name"
                    }
                ]
            },
            "extensions": {
                "usage": "test"
            }
        }'
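
For comparison, a rough Python equivalent of the cURL call, using only the standard library, might look like the sketch below. The function name `build_deploy_request` is hypothetical; it only constructs the request object, so nothing is sent until it is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request

DEPLOY_URL = "https://api.alayanew.com/api/serverless-infer/v1/deployment/expert"


def build_deploy_request(body: dict, access_key: str,
                         secret_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the deploy endpoint."""
    return urllib.request.Request(
        DEPLOY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"plain Credential={access_key},Signature={secret_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending the request (requires valid credentials and network access):
# with urllib.request.urlopen(build_deploy_request(body, ak, sk)) as resp:
#     print(resp.read().decode("utf-8"))
```

Here `body` would be the same JSON object passed to `--data` in the cURL example, expressed as a Python dict.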

Status codes: 200, 400, 401, 403, 404, 500

{
  "code": 0,
  "data": {
    "serviceId": ""
  },
  "msg": "optional,string,"
}
