vllm.entrypoints.pooling.base.protocol ¶
CompletionRequestMixin ¶
Bases: OpenAIBaseModel
Source code in vllm/entrypoints/pooling/base/protocol.py
PoolingBasicRequestMixin ¶
Bases: OpenAIBaseModel
Source code in vllm/entrypoints/pooling/base/protocol.py
priority class-attribute instance-attribute ¶
priority: int = Field(
default=0,
description="The priority of the request (lower means earlier handling; default: 0). Any priority other than 0 will raise an error if the served model does not use priority scheduling.",
)
request_id class-attribute instance-attribute ¶
request_id: str = Field(
default_factory=random_uuid,
description="The request ID associated with this request. If the caller does not set it, a random UUID is generated. This ID is used throughout the inference process and returned in the response.",
)
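The two fields above can be mirrored in a minimal, self-contained sketch. This is an assumption-laden illustration, not the vLLM implementation: it uses a plain `dataclass` instead of Pydantic's `OpenAIBaseModel`, and `random_uuid` is a stand-in for vLLM's helper (assumed here to return a hex UUID string).

```python
import uuid
from dataclasses import dataclass, field

def random_uuid() -> str:
    # Stand-in for vllm's random_uuid helper (assumption: hex UUID string).
    return uuid.uuid4().hex

@dataclass
class PoolingBasicRequestSketch:
    # Lower value means earlier handling; any non-zero value is only valid
    # when the served model uses priority scheduling (per the field docs above).
    priority: int = 0
    # Auto-generated when the caller does not supply one; carried through
    # the inference process and echoed back in the response.
    request_id: str = field(default_factory=random_uuid)

req = PoolingBasicRequestSketch()
print(req.priority)  # 0 by default
```

Because `request_id` uses a `default_factory`, each request object gets its own fresh ID unless the caller passes one explicitly.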