
Deployment Details (Expert)

Last updated: 2025-07-22 17:32:25
Retrieves the details of a base model service.
GET
https://api.alayanew.com/api/serverless-infer/v1/deployment/{serviceId}
Authorizations
Authorizations String Header Required

Authenticate with the Open API Key you have obtained, for example: plain Credential=[YOUR_AK],Signature=[YOUR_SK].
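
As an illustration, the header value can be assembled in Python as follows. This is a minimal sketch: the helper name `build_auth_header` is hypothetical, and it assumes the access key and secret key are inserted verbatim, exactly as in the example above, with no further signing step.

```python
def build_auth_header(access_key, secret_key):
    """Return the Authorization header in the 'plain Credential=...,Signature=...' form.

    Assumption: both keys are passed through unchanged, as in the documented example.
    """
    if not access_key or not secret_key:
        raise ValueError("both access_key and secret_key are required")
    return {"Authorization": f"plain Credential={access_key},Signature={secret_key}"}
```

The returned dictionary can be passed as the headers of any HTTP client request.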

Path Parameters
serviceId String Required

Service ID.

Response
Content type: application/json
serviceUrl String

Service URL.

apiKey String

API Key.

status String

Status.

instance List<Map<String, Object>>

Instance information.

instance.instanceId String

Instance ID.

instance.loraModels List<String>

List of associated LoRA models.

instance.baseModelReady boolean

Whether the base model is ready.

serviceId String

Service ID.

vksId String

Vital Kubernetes Service (VKS) Cluster ID.

namespace String

Vital Kubernetes Service (VKS) Namespace.

name String

Service name.

servedName List<String>

Internal model identifier.

modelId String

Model ID.

mode String

Startup mode, e.g., quickStart/expert.

quickStart Object

curl --location --request GET 'https://api.alayanew.com/api/serverless-infer/v1/deployment/38fbfc3d-6a88-4c35-b8b6-9efc83949d47' \
--header 'Authorization:plain Credential=YOUR_AK,Signature=YOUR_SK' \
--header 'Content-Type: application/json'
{
    "code": 0,
    "data": {
        "serviceUrl":"string",
        "apikey":"String",
        "status":"String, model deployment status: starting, running, stopping, stopped, failed",
        "instance": [{
            "instanceId": "0",
            "loraModels": ["lora1","lora2"],
            "baseModelReady": true
        }],
        "vksId":"",
        "namespace":"",
        "name":"string, service display name customized by user",
        "servedName": ["string, internal model identifier"],
        "modelId": "String, model ID",
        "mode":"quickStart",
        "quickStart":{
            "backend":"vllm/sglang",
            "backendVersion":"0.8.4...",
            "backenArgs":[],
            "resource": {
                "workers": "optional, int, number of workers",
                "cpu": "required, int",
                "mem": "required, int",
                "gpu": {
                    "gpuType": "required, string, gpu type name",
                    "count": "required, int, number of gpu to be used"
                }
            },
            "expert":{...}
        }
    }
}
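
The request and response above can also be handled from Python. The following is a sketch using only the standard library. The function names are hypothetical, and treating `"code": 0` as the success indicator is an assumption drawn from the sample response, not a documented rule.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.alayanew.com/api/serverless-infer/v1/deployment"


def build_request(service_id, access_key, secret_key):
    """Build the GET request for one deployment; nothing is sent yet."""
    url = f"{BASE_URL}/{quote(service_id, safe='')}"
    headers = {
        "Authorization": f"plain Credential={access_key},Signature={secret_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def extract_data(body):
    """Return the 'data' object of a response envelope.

    Assumption: code == 0 means success, as in the sample response.
    """
    if body.get("code") != 0:
        raise RuntimeError(f"unexpected response code: {body.get('code')!r}")
    return body["data"]


def get_deployment(service_id, access_key, secret_key, timeout=10.0):
    """Send the request and return the parsed deployment details."""
    req = build_request(service_id, access_key, secret_key)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return extract_data(json.load(resp))
```

For example, `get_deployment(service_id, ak, sk)["status"]` would give the current deployment status, provided the call succeeds.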

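Service Status

The transitions of the service status (status) are shown in the figure below.

[Figure: service status transition diagram]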
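
A client that needs to wait for a deployment to come up could poll the status field. The sketch below is one possible approach, not a documented procedure. It relies only on the status values listed in the sample response (starting, running, stopping, stopped, failed), assumes that `running` is the state in which the service accepts calls, and treats `failed` as terminal. The name `wait_until_running` and the `fetch` callable (any function returning the `data` object of the details response) are hypothetical.

```python
import time


def wait_until_running(fetch, interval=5.0, max_attempts=60, sleep=time.sleep):
    """Poll fetch() until the deployment status is 'running'.

    fetch: hypothetical callable returning the 'data' object of the details response.
    Raises RuntimeError on 'failed' and TimeoutError when attempts run out.
    """
    for attempt in range(max_attempts):
        data = fetch()
        status = data.get("status")
        if status == "running":
            return data
        if status == "failed":
            raise RuntimeError("deployment entered the 'failed' state")
        # Sleep between attempts, but not after the last one.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"status was not 'running' after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays.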

Service Invocation

After deploying a model service, you can invoke it by specifying the model parameters. A sample call is shown below.

curl --location --request POST '[serviceUrl]/v1/chat/completions' \
--header 'apiKey: [apiKey]' \
--header 'Content-Type: application/json' \
--data-raw '{ "stream":false,
"messages": [{"role":"user", "content":"Who are you, and what can you do?"}],
"model":"[servedName]"}'
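
The same call can be made from Python. This is a minimal sketch with the standard library. The function names are hypothetical, the `Content-Type` header is an assumption that the service expects a JSON body, and the response is returned as parsed JSON without assuming its shape.

```python
import json
import urllib.request


def build_chat_request(service_url, api_key, served_name, prompt):
    """Build the POST request for [serviceUrl]/v1/chat/completions; nothing is sent yet."""
    payload = {
        "stream": False,
        "messages": [{"role": "user", "content": prompt}],
        "model": served_name,
    }
    return urllib.request.Request(
        f"{service_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        # urllib normalizes header names to 'Apikey' / 'Content-type';
        # HTTP header names are case-insensitive.
        headers={"apiKey": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def chat(service_url, api_key, served_name, prompt, timeout=60.0):
    """Send the request and return the parsed JSON response."""
    req = build_chat_request(service_url, api_key, served_name, prompt)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

The values for `service_url`, `api_key`, and `served_name` correspond to the `serviceUrl`, `apiKey`, and `servedName` fields returned by the deployment details request.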