Hello, maintainers! I'm learning to use Triton server to deploy models, and thank you all for such a great project! While reading the documentation and building some small demos, I found parts of it hard to understand, so I came here for help.
From the documentation (using output reshape as an example):
For an output, reshape can be used to reshape the output tensor produced by the framework or backend to a different shape that is returned by the inference API. A common use-case is where a model that supports batching expects a batched output to have shape [ batch-size ], which means that the batch dimension fully describes the shape. For the inference API the equivalent shape [ batch-size, 1 ] must be specified since each output must specify a non-empty dims. For this case the output should be specified as:
I find this passage unclear, so here is my naive understanding of it:
The document mentions two entities: the Triton inference API and the underlying model. They can use different tensor shapes, and the reshape configuration bridges the gap between them.
In the documented example, the model produces tensors with a batch_size dimension (as the first dimension), while the API needs to strip it off.
Therefore, we use the reshape configuration to remove the batch_size dimension while retaining the others (API output shape = model output shape APPLY reshape).
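To make the shapes in the quoted passage concrete, here is a minimal Python sketch of the bookkeeping as I read it. It is not Triton code: `full_shapes` and its parameters are hypothetical names, and it assumes that with batching enabled the batch dimension is prepended to both the configured dims (the API-facing shape) and the reshape shape (the model-facing shape), and that both must hold the same number of elements.

```python
from math import prod


def full_shapes(batch_size, dims, reshape_shape, batching=True):
    """Return (model_shape, api_shape) for one output tensor.

    Hypothetical helper, not part of Triton. Assumptions:
    - `dims` is the per-request shape seen through the inference API.
    - `reshape_shape` is the per-request shape the model produces.
    - `batching` stands in for max_batch_size > 0; when True, the
      batch dimension is prepended to both shapes.
    """
    # A reshape cannot change the number of elements (prod([]) == 1).
    if prod(dims) != prod(reshape_shape):
        raise ValueError("dims and reshape shape must have equal element counts")
    lead = [batch_size] if batching else []
    return lead + list(reshape_shape), lead + list(dims)


# The case from the quoted passage: the model emits [batch-size],
# while the API reports [batch-size, 1].
model_shape, api_shape = full_shapes(4, dims=[1], reshape_shape=[])
print(model_shape, api_shape)  # [4] [4, 1]
```

Under these assumptions the batch dimension appears in both shapes, and the reshape only changes the non-batch part, from an empty shape on the model side to [ 1 ] on the API side.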
If my understanding is right, shouldn't the whole output configuration look like this (with max_batch_size = 0):