Deploy Model (Expert)
API for deploying a base model.
POST
https://api.alayanew.com/api/serverless-infer/v1/deployment/expert
Authorizations
Authorization: String, Header, Required
Authenticate with the Serverless API Key you have obtained, for example: plain Credential=[YOUR_AK],Signature=[YOUR_SK].
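As a minimal sketch, the header value described above could be assembled in Python as follows. The helper name `build_auth_header` is hypothetical, and the sketch assumes the square-bracket placeholders are replaced by the keys themselves, as in the cURL sample on this page.

```python
def build_auth_header(access_key: str, secret_key: str) -> dict:
    """Return the Authorization header in the `plain Credential=...,Signature=...` form."""
    if not access_key or not secret_key:
        raise ValueError("both access_key and secret_key are required")
    # Format copied from the documentation example; no signing step is assumed.
    return {"Authorization": f"plain Credential={access_key},Signature={secret_key}"}


if __name__ == "__main__":
    print(build_auth_header("YOUR_AK", "YOUR_SK"))
```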
Body
application/json
vksId: String, Required
Elastic container cluster (VKS) ID.
namespace: String, Required
Elastic container cluster (VKS) namespace.
name: String, Required
Service name.
servedName: List<String>, Required
Internal model identifier.
modelId: String, Required
Model ID.
headConfig: Object, Required
workerConfig: Object
scale: Object, Required
extension: Object, Required
Extension fields.
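A client might check a request body against the field list above before sending it. The following is a rough sketch only: `validate_body` is a hypothetical helper, and the Python types are an assumed mapping of the documented types (String to `str`, List<String> to `list`, Object to `dict`). It follows the field list, which names `extension`; the cURL sample on this page sends `extensions`, so adjust the key if the service expects that spelling.

```python
# Required and optional top-level fields, taken from the field list above.
REQUIRED_FIELDS = {
    "vksId": str,
    "namespace": str,
    "name": str,
    "servedName": list,
    "modelId": str,
    "headConfig": dict,
    "scale": dict,
    "extension": dict,
}
OPTIONAL_FIELDS = {"workerConfig": dict}


def validate_body(body: dict) -> list:
    """Return a list of human-readable problems; an empty list means no problem was found."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in body:
            problems.append(f"missing required field: {field}")
        elif not isinstance(body[field], expected):
            problems.append(f"{field} should be of type {expected.__name__}")
    for field, expected in OPTIONAL_FIELDS.items():
        if field in body and not isinstance(body[field], expected):
            problems.append(f"{field} should be of type {expected.__name__}")
    served = body.get("servedName")
    if isinstance(served, list) and not all(isinstance(s, str) for s in served):
        problems.append("servedName should contain only strings")
    return problems
```

This only checks presence and top-level types; the nested structure of `headConfig`, `workerConfig`, and `scale` is not specified in the field list and is left unchecked.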
Response
Status code: 200 (application/json)
code: Int
Result code of the operation.
Possible values: 0, -1
0 indicates that the operation completed successfully.
data: Object
msg: String
Error message, returned when code is -1.
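The response fields above suggest a simple handling pattern: treat `code` 0 as success and surface `msg` otherwise. Below is a minimal sketch; `parse_response` and `DeploymentError` are hypothetical names. Because the sample response on this page shows `code` as the string "0" while the field is documented as Int, the sketch accepts both forms.

```python
import json


class DeploymentError(Exception):
    """Raised when the API reports a non-zero result code."""


def parse_response(raw: str) -> dict:
    """Return the `data` object on success, or raise DeploymentError carrying `msg`."""
    payload = json.loads(raw)
    try:
        code = int(payload.get("code"))  # accepts 0 as well as "0"
    except (TypeError, ValueError):
        code = -1  # treat a missing or malformed code as a failure
    if code != 0:
        raise DeploymentError(payload.get("msg") or "unknown error")
    return payload.get("data") or {}
```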
cURL
curl --location --request POST 'https://api.alayanew.com/api/serverless-infer/v1/deployment/expert' \
--header 'Authorization: plain Credential=YOUR_AK,Signature=YOUR_SK' \
--header 'Content-Type: application/json' \
--data '
{
  "name": "test-expert",
  "namespace": "default",
  "vksId": "vcacb50arkk4",
  "servedName": ["testvllm"],
  "modelId": "c486cdee-c316-4fc1-9f75-0d1741940f27",
  "scale": {
    "max": 3,
    "min": 1,
    "rpsValue": 10,
    "idleTime": 60
  },
  "headConfig": {
    "image": "registry.cn-hangzhou.aliyuncs.com/ls-2018/test:vllm-0.8.1p",
    "cmd": ["sh", "-c", "test.sh"],
    "labels": {
      "usage": "test"
    },
    "env": {
      "VLLM_ALLOW_RUNTIME_LORA_UPDATING": "true"
    },
    "args": ["-Xmx", "52m"],
    "resource": {
      "workers": 3,
      "mem": 32,
      "cpu": 4,
      "gpu": {
        "gpuType": "nvidia.com/gpu-l40s",
        "count": 1
      }
    },
    "pvcMounts": [
      {
        "containerPath": "/scripts",
        "pvcName": "test-name"
      }
    ]
  },
  "workerConfig": {
    "workers": 3,
    "image": "registry.cn-hangzhou.aliyuncs.com/ls-2018/test:vllm-0.8.1p",
    "cmd": ["sh", "-c", "test.sh"],
    "labels": {
      "usage": "test"
    },
    "env": {
      "VLLM_ALLOW_RUNTIME_LORA_UPDATING": "true"
    },
    "args": [""],
    "resource": {
      "workers": 3,
      "mem": 8,
      "cpu": 4,
      "gpu": {
        "gpuType": "nvidia.com/gpu-l40s",
        "count": 1
      }
    },
    "pvcMounts": [
      {
        "containerPath": "/scripts",
        "pvcName": "test-name"
      }
    ]
  },
  "extensions": {
    "usage": "test"
  }
}'
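For readers who prefer Python, the same request can be sketched with the standard library alone. This is an illustration rather than an official client: `build_deploy_request` and `deploy_expert` are hypothetical names, and the body is expected to be the same JSON object as in the cURL sample.

```python
import json
import urllib.request

API_URL = "https://api.alayanew.com/api/serverless-infer/v1/deployment/expert"


def build_deploy_request(body: dict, access_key: str, secret_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request with the documented headers."""
    headers = {
        "Authorization": f"plain Credential={access_key},Signature={secret_key}",
        "Content-Type": "application/json",
    }
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(API_URL, data=data, headers=headers, method="POST")


def deploy_expert(body: dict, access_key: str, secret_key: str, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_deploy_request(body, access_key, secret_key)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx status codes.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `deploy_expert(body, "YOUR_AK", "YOUR_SK")` with real keys would issue the request. Splitting out `build_deploy_request` makes it possible to inspect the headers and payload without touching the network.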
Response status codes: 200, 400, 401, 403, 404, 500
Example response (200):
{
  "code": "0",
  "data": {
    "serviceId": ""
  },
  "msg": "optional,string,"
}