HTTP/REST and GRPC Protocol

This directory contains documents related to the HTTP/REST and GRPC protocols used by Triton. Triton uses the KServe community standard inference protocols plus several extensions, each of which is defined in its own extension document in this directory.

Some extensions add new fields to the inference protocols, while others define new protocols that Triton follows. Refer to the individual extension documents for details.

For the GRPC protocol, the protobuf specification is also available, as is the protobuf specification for the GRPC health checking protocol.

Restricted Protocols

You can configure the Triton endpoints that implement these protocols to restrict access to some protocols and to control network settings. Refer to the protocol customization guide for details.

IPv6

Assuming your host or Docker configuration supports IPv6 connections, tritonserver can be configured to use IPv6 HTTP endpoints as follows:

$ tritonserver ... --http-address ipv6:[::1]&
...
I0215 21:04:11.572305 571 grpc_server.cc:4868] Started GRPCInferenceService at 0.0.0.0:8001
I0215 21:04:11.572528 571 http_server.cc:3477] Started HTTPService at ipv6:[::1]:8000
I0215 21:04:11.614167 571 http_server.cc:184] Started Metrics Service at ipv6:[::1]:8002

This can be confirmed via netstat, for example:

$ netstat -tulpn | grep tritonserver
tcp6      0      0 :::8000      :::*      LISTEN      571/tritonserver
tcp6      0      0 :::8001      :::*      LISTEN      571/tritonserver
tcp6      0      0 :::8002      :::*      LISTEN      571/tritonserver
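
The same check can be scripted. The following is a minimal sketch, not part of Triton, that tries to open a TCP connection over IPv6 to a given port. The helper name `accepts_ipv6` is hypothetical, and the port numbers in the usage note are the ones shown in the log output above.

```python
import socket


def accepts_ipv6(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection over IPv6 to host:port succeeds."""
    try:
        # AF_INET6 forces an IPv6 socket, so an IPv4-only listener will not match.
        with socket.socket(socket.AF_INET6, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect((host, port))
        return True
    except OSError:
        # Covers refused connections, timeouts, and hosts without IPv6 support.
        return False
```

For example, `accepts_ipv6("::1", 8000)` should return `True` once the HTTP service from the log above is listening on the IPv6 loopback address.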

The endpoint can also be tested via curl, for example:

$ curl -6 --verbose "http://[::1]:8000/v2/health/ready"
*   Trying ::1:8000...
* TCP_NODELAY set
* Connected to ::1 (::1) port 8000 (#0)
> GET /v2/health/ready HTTP/1.1
> Host: [::1]:8000
> User-Agent: curl/7.68.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Length: 0
< Content-Type: text/plain
<
* Connection #0 to host ::1 left intact
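
The readiness check performed with curl above can also be sketched in Python using only the standard library. The function names `ready_url` and `server_is_ready` are hypothetical, and the default host and port are taken from the curl example. The part worth noting is that an IPv6 literal has to be wrapped in square brackets inside a URL.

```python
import urllib.request


def ready_url(host: str, port: int = 8000) -> str:
    """Build the readiness URL, bracketing IPv6 literals as URLs require."""
    if ":" in host and not host.startswith("["):
        host = f"[{host}]"
    return f"http://{host}:{port}/v2/health/ready"


def server_is_ready(host: str = "::1", port: int = 8000, timeout: float = 2.0) -> bool:
    """Return True if GET /v2/health/ready answers with HTTP 200."""
    try:
        with urllib.request.urlopen(ready_url(host, port), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses, so this also covers
        # refused connections, timeouts, and non-2xx responses.
        return False
```

Calling `server_is_ready()` against the server started above should return `True`, matching the `200 OK` response in the curl output.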

Mapping Triton Server Error Codes to HTTP Status Codes

This table maps various Triton Server error codes to their corresponding HTTP status codes. It can be used as a reference guide for understanding how Triton Server errors are handled in HTTP responses.

| Triton Server Error Code | HTTP Status Code | Description |
| --- | --- | --- |
| TRITONSERVER_ERROR_INTERNAL | 500 | Internal Server Error |
| TRITONSERVER_ERROR_NOT_FOUND | 404 | Not Found |
| TRITONSERVER_ERROR_UNAVAILABLE | 503 | Service Unavailable |
| TRITONSERVER_ERROR_UNSUPPORTED | 501 | Not Implemented |
| TRITONSERVER_ERROR_UNKNOWN, TRITONSERVER_ERROR_INVALID_ARG, TRITONSERVER_ERROR_ALREADY_EXISTS, TRITONSERVER_ERROR_CANCELLED | 400 | Bad Request (default for other errors) |
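
Client code that needs to reason about these statuses could encode the table as a lookup with a default. The sketch below is illustrative only and is not Triton's implementation. The names `_STATUS_BY_ERROR` and `http_status_for` are hypothetical, and the error codes are represented as plain strings for simplicity.

```python
from http import HTTPStatus

# Explicit mappings from the table; every other error code falls back to 400.
_STATUS_BY_ERROR = {
    "TRITONSERVER_ERROR_INTERNAL": HTTPStatus.INTERNAL_SERVER_ERROR,
    "TRITONSERVER_ERROR_NOT_FOUND": HTTPStatus.NOT_FOUND,
    "TRITONSERVER_ERROR_UNAVAILABLE": HTTPStatus.SERVICE_UNAVAILABLE,
    "TRITONSERVER_ERROR_UNSUPPORTED": HTTPStatus.NOT_IMPLEMENTED,
}


def http_status_for(error_code: str) -> HTTPStatus:
    """Return the HTTP status for a Triton error code, defaulting to 400."""
    return _STATUS_BY_ERROR.get(error_code, HTTPStatus.BAD_REQUEST)
```

Using a dictionary with a default keeps the "Bad Request for other errors" rule in one place, so codes such as `TRITONSERVER_ERROR_INVALID_ARG` need no explicit entry.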