Deploy Model (Expert)

API for deploying a base model.
POST
https://api.alayanew.com/api/serverless-infer/v1/deployment/expert
Authorizations
Authorizations String Header Required

Authenticate with the Serverless API Key you have already obtained, for example: plain Credential=[YOUR_AK],Signature=[YOUR_SK].
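As a minimal sketch, the header value in the example above could be assembled like this. The helper name `build_auth_header` is hypothetical, and the sketch assumes the AK and SK are simply substituted into the `plain Credential=...,Signature=...` form with no further signing step.

```python
def build_auth_header(access_key: str, secret_key: str) -> dict:
    """Return the Authorization header in the 'plain Credential=...,Signature=...' form.

    Hypothetical helper: it only formats the two key parts as shown in the
    documented example and performs no signing.
    """
    if not access_key or not secret_key:
        raise ValueError("both access_key and secret_key are required")
    return {"Authorization": f"plain Credential={access_key},Signature={secret_key}"}


headers = build_auth_header("YOUR_AK", "YOUR_SK")
```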

Body
application/json
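vksId String Required

ID of the elastic container cluster (VKS).

namespace String Required

Namespace of the elastic container cluster (VKS).

name String Required

Service name.

servedName List<String> Required

Internal identifier of the model.

modelId String Required

Model ID.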

headConfig Object Required

workerConfig Object

scale Object Required

extension Object Required

Extension fields.
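Before sending a request, it can help to check that the required top-level fields are present. The following is a sketch only: `REQUIRED_FIELDS` and `missing_fields` are hypothetical names, the sample values are taken from the request example in this page, and the extension field is deliberately left out of the check because its exact key spelling is not confirmed here.

```python
import json

# Top-level fields marked Required in the field list above.
# workerConfig is optional; the extension field is omitted from this
# check (assumption: its key spelling should be confirmed first).
REQUIRED_FIELDS = ("vksId", "namespace", "name", "servedName",
                   "modelId", "headConfig", "scale")


def missing_fields(body: dict) -> list:
    """Return the required top-level keys that are absent from body."""
    return [key for key in REQUIRED_FIELDS if key not in body]


body = {
    "vksId": "vcacb50arkk4",
    "namespace": "default",
    "name": "test-expert",
    "servedName": ["testvllm"],
    "modelId": "c486cdee-c316-4fc1-9f75-0d1741940f27",
    "headConfig": {},  # placeholder; real contents omitted in this sketch
    "scale": {"min": 1, "max": 3},
}

problems = missing_fields(body)
payload = json.dumps(body)  # JSON string for the application/json body
```

The check covers only the presence of keys; nested objects such as headConfig and scale are not validated.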

Response
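Status code: application/json
code Int

Result code indicating the outcome of the operation.

0 indicates that the operation completed successfully.
data Object

msg String

When code is -1, msg contains the error message.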
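The code/msg convention described above could be handled on the client side roughly as follows. This is a sketch under assumptions: `DeploymentError` and `parse_response` are hypothetical names, and the `serviceId` value `svc-123` is a made-up placeholder. Because the response example on this page shows code as the string "0" while the field list declares Int, the sketch converts it with `int()` before comparing.

```python
import json


class DeploymentError(Exception):
    """Raised when the response code is not 0 (hypothetical helper type)."""


def parse_response(raw: str) -> dict:
    """Return the data object on success, or raise with msg on failure."""
    reply = json.loads(raw)
    # Accept both 0 and "0"; a missing code is treated as a failure.
    if int(reply.get("code", -1)) != 0:
        raise DeploymentError(reply.get("msg") or "unknown error")
    return reply.get("data") or {}


# Placeholder response body for illustration only.
data = parse_response('{"code": 0, "data": {"serviceId": "svc-123"}, "msg": ""}')
```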

curl --location --request POST 'https://api.alayanew.com/api/serverless-infer/v1/deployment/expert' \
     --header 'Authorization:plain Credential=YOUR_AK,Signature=YOUR_SK' \
     --header 'Content-Type: application/json' \
     --data '
{
    "name": "test-expert",
    "namespace": "default",
    "vksId": "vcacb50arkk4",
    "servedName": ["testvllm"],
    "modelId": "c486cdee-c316-4fc1-9f75-0d1741940f27",
    "scale": {
        "max": 3,
        "min": 1,
        "rpsValue": 10,
        "idleTime": 60
    },
    "headConfig": {
        "image": "registry.cn-hangzhou.aliyuncs.com/ls-2018/test:vllm-0.8.1p",
        "cmd": ["sh", "-c", "test.sh"],
        "labels": {
            "usage": "test"
        },
        "env": {
            "VLLM_ALLOW_RUNTIME_LORA_UPDATING": "true"
        },
        "args": ["-Xmx", "52m"],
        "resource": {
            "workers": 3,
            "mem": 32,
            "cpu": 4,
            "gpu": {
                "gpuType": "nvidia.com/gpu-l40s",
                "count": 1
            }
        },
        "pvcMounts": [
            {
                "containerPath": "/scripts",
                "pvcName": "test-name"
            }
        ]
    },
    "workerConfig": {
        "workers": 3,
        "image": "registry.cn-hangzhou.aliyuncs.com/ls-2018/test:vllm-0.8.1p",
        "cmd": ["sh", "-c", "test.sh"],
        "labels": {
            "usage": "test"
        },
        "env": {
            "VLLM_ALLOW_RUNTIME_LORA_UPDATING": "true"
        },
        "args": [""],
        "resource": {
            "workers": 3,
            "mem": 8,
            "cpu": 4,
            "gpu": {
                "gpuType": "nvidia.com/gpu-l40s",
                "count": 1
            }
        },
        "pvcMounts": [
            {
                "containerPath": "/scripts",
                "pvcName": "test-name"
            }
        ]
    },
    "extensions": {
        "usage": "test"
    }
}'
{
  "code": "0",
  "data": {
    "serviceId":""
  },
  "msg": "optional,string,"
}
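For callers who prefer Python over curl, the same POST request could be prepared with the standard library along these lines. This is a sketch, not an official client: `build_request` is a hypothetical helper, the body passed in the usage line is truncated for brevity, and the request is only constructed here, not sent.

```python
import json
import urllib.request

URL = "https://api.alayanew.com/api/serverless-infer/v1/deployment/expert"


def build_request(body: dict, access_key: str, secret_key: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST request for the deployment endpoint."""
    return urllib.request.Request(
        URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"plain Credential={access_key},Signature={secret_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Truncated body for illustration; a real call needs all required fields.
request = build_request({"name": "test-expert"}, "YOUR_AK", "YOUR_SK")
# Sending would be: urllib.request.urlopen(request)
```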