Update post-23.07 release #6103

Merged · 7 commits · Jul 28, 2023
2 changes: 1 addition & 1 deletion Dockerfile.sdk
@@ -29,7 +29,7 @@
#

# Base image on the minimum Triton container
-ARG BASE_IMAGE=nvcr.io/nvidia/tritonserver:23.06-py3-min
+ARG BASE_IMAGE=nvcr.io/nvidia/tritonserver:23.07-py3-min

ARG TRITON_CLIENT_REPO_SUBDIR=clientrepo
ARG TRITON_COMMON_REPO_TAG=main
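The bumped base can also be supplied at build time rather than by editing the file; a minimal sketch, assuming the repo root as build context (the output tag is illustrative):

```bash
# Build the client SDK image on top of the 23.07 minimal base
docker build -f Dockerfile.sdk \
    --build-arg BASE_IMAGE=nvcr.io/nvidia/tritonserver:23.07-py3-min \
    -t tritonserver_sdk:23.07 .
```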
2 changes: 1 addition & 1 deletion Dockerfile.win10.min
@@ -153,7 +153,7 @@ LABEL TENSORRT_VERSION="${TENSORRT_VERSION}"
#
# Installing CUDNN
#
-ARG CUDNN_VERSION=8.9.1.23
+ARG CUDNN_VERSION=8.9.3.28
ARG CUDNN_ZIP=cudnn-windows-x86_64-${CUDNN_VERSION}_cuda12-archive.zip
ARG CUDNN_SOURCE=${CUDNN_ZIP}

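Since CUDNN_ZIP is derived from CUDNN_VERSION, the bump also changes which archive the Windows image fetches; a quick sketch of the expansion (shell used only to illustrate the substitution):

```bash
# The ARG substitution above resolves the archive name as follows
CUDNN_VERSION=8.9.3.28
echo "cudnn-windows-x86_64-${CUDNN_VERSION}_cuda12-archive.zip"
# -> cudnn-windows-x86_64-8.9.3.28_cuda12-archive.zip
```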
10 changes: 5 additions & 5 deletions README.md
@@ -32,8 +32,8 @@

**LATEST RELEASE: You are currently on the main branch which tracks
under-development progress towards the next release. The current release is
-version [2.35.0](https://github.com/triton-inference-server/server/tree/r23.06)
-and corresponds to the 23.06 container release on
+version [2.36.0](https://github.com/triton-inference-server/server/tree/r23.07)
+and corresponds to the 23.07 container release on
[NVIDIA GPU Cloud (NGC)](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tritonserver).**

----
@@ -88,16 +88,16 @@ Inference Server with the

```bash
# Step 1: Create the example model repository
-git clone -b r23.06 https://github.com/triton-inference-server/server.git
+git clone -b r23.07 https://github.com/triton-inference-server/server.git
cd server/docs/examples
./fetch_models.sh

# Step 2: Launch triton from the NGC Triton container
-docker run --gpus=1 --rm --net=host -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:23.06-py3 tritonserver --model-repository=/models
+docker run --gpus=1 --rm --net=host -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:23.07-py3 tritonserver --model-repository=/models

# Step 3: Sending an Inference Request
# In a separate console, launch the image_client example from the NGC Triton SDK container
-docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:23.06-py3-sdk
+docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:23.07-py3-sdk
/workspace/install/bin/image_client -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg

# Inference should return the following
6 changes: 3 additions & 3 deletions build.py
@@ -69,12 +69,12 @@
TRITON_VERSION_MAP = {
    "2.37.0dev": (
        "23.08dev",  # triton container
-        "23.06",  # upstream container
-        "1.15.0",  # ORT
+        "23.07",  # upstream container
+        "1.15.1",  # ORT
        "2023.0.0",  # ORT OpenVINO
        "2023.0.0",  # Standalone OpenVINO
        "2.4.7",  # DCGM version
-        "py310_23.1.0-1",  # Conda version
+        "py310_23.1.0-1",  # Conda version.
    )
}

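To sanity-check the new pins after pulling this change, the map entry can be inspected in place; for example:

```bash
# Print the 2.37.0dev entry, including the 23.07 upstream and ORT 1.15.1 pins
grep -A 9 '"2.37.0dev"' build.py
```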
2 changes: 1 addition & 1 deletion deploy/aws/values.yaml
@@ -27,7 +27,7 @@
replicaCount: 1

image:
-  imageName: nvcr.io/nvidia/tritonserver:23.06-py3
+  imageName: nvcr.io/nvidia/tritonserver:23.07-py3
pullPolicy: IfNotPresent
modelRepositoryPath: s3://triton-inference-server-repository/model_repository
numGpus: 1
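The same image.imageName key is what a Helm override targets, so an existing deployment can be rolled forward without editing the chart; a sketch, assuming the chart is installed from this repo's deploy/aws directory (the release name is illustrative):

```bash
# Install (or upgrade) against the 23.07 release image
helm install triton-server deploy/aws \
    --set image.imageName=nvcr.io/nvidia/tritonserver:23.07-py3
```

The fleetcommand and gcp charts below expose the same key, so the identical override applies there.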
2 changes: 1 addition & 1 deletion deploy/fleetcommand/Chart.yaml
@@ -26,7 +26,7 @@

apiVersion: v1
# appVersion is the Triton version; update when changing release
appVersion: "2.35.0"
appVersion: "2.36.0"
description: Triton Inference Server (Fleet Command)
name: triton-inference-server
# version is the Chart version; update when changing anything in the chart
6 changes: 3 additions & 3 deletions deploy/fleetcommand/values.yaml
@@ -27,7 +27,7 @@
replicaCount: 1

image:
-  imageName: nvcr.io/nvidia/tritonserver:23.06-py3
+  imageName: nvcr.io/nvidia/tritonserver:23.07-py3
pullPolicy: IfNotPresent
numGpus: 1
serverCommand: tritonserver
@@ -46,13 +46,13 @@ image:
# Model Control Mode (Optional, default: none)
#
# To set model control mode, uncomment and configure below
-# See https://github.com/triton-inference-server/server/blob/r23.06/docs/model_management.md
+# See https://github.com/triton-inference-server/server/blob/r23.07/docs/model_management.md
# for more details
#- --model-control-mode=explicit|poll|none
#
# Additional server args
#
-# see https://github.com/triton-inference-server/server/blob/r23.06/README.md
+# see https://github.com/triton-inference-server/server/blob/r23.07/README.md
# for more details

service:
2 changes: 1 addition & 1 deletion deploy/gcp/values.yaml
@@ -27,7 +27,7 @@
replicaCount: 1

image:
-  imageName: nvcr.io/nvidia/tritonserver:23.06-py3
+  imageName: nvcr.io/nvidia/tritonserver:23.07-py3
pullPolicy: IfNotPresent
modelRepositoryPath: gs://triton-inference-server-repository/model_repository
numGpus: 1
@@ -33,7 +33,7 @@ metadata:
namespace: default
spec:
containers:
-  - image: nvcr.io/nvidia/tritonserver:23.06-py3-sdk
+  - image: nvcr.io/nvidia/tritonserver:23.07-py3-sdk
imagePullPolicy: Always
name: nv-triton-client
securityContext:
4 changes: 2 additions & 2 deletions deploy/gke-marketplace-app/server-deployer/build_and_push.sh
@@ -28,8 +28,8 @@
export REGISTRY=gcr.io/$(gcloud config get-value project | tr ':' '/')
export APP_NAME=tritonserver
export MAJOR_VERSION=2.33
-export MINOR_VERSION=2.35.0
-export NGC_VERSION=23.06-py3
+export MINOR_VERSION=2.36.0
+export NGC_VERSION=23.07-py3

docker pull nvcr.io/nvidia/$APP_NAME:$NGC_VERSION

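With the bumped exports, republishing the deployer is a matter of re-running the script against your GCP project; a sketch (the project ID is hypothetical, and gcloud/registry auth is assumed to be configured):

```bash
gcloud config set project my-gcp-project   # hypothetical project ID
bash deploy/gke-marketplace-app/server-deployer/build_and_push.sh
```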
@@ -28,4 +28,4 @@ apiVersion: v1
appVersion: "2.33"
description: Triton Inference Server
name: triton-inference-server
-version: 2.35.0
+version: 2.36.0
@@ -32,13 +32,13 @@ tritonProtocol: HTTP
# HPA GPU utilization autoscaling target
HPATargetAverageValue: 85
modelRepositoryPath: gs://triton_sample_models/23_04
-publishedVersion: '2.35.0'
+publishedVersion: '2.36.0'
gcpMarketplace: true

image:
registry: gcr.io
repository: nvidia-ngc-public/tritonserver
-  tag: 23.06-py3
+  tag: 23.07-py3
pullPolicy: IfNotPresent
# modify the model repository here to match your GCP storage bucket
numGpus: 1
@@ -27,7 +27,7 @@
x-google-marketplace:
schemaVersion: v2
applicationApiVersion: v1beta1
-publishedVersion: '2.35.0'
+publishedVersion: '2.36.0'
publishedVersionMetadata:
releaseNote: >-
Initial release.
2 changes: 1 addition & 1 deletion deploy/gke-marketplace-app/server-deployer/schema.yaml
@@ -27,7 +27,7 @@
x-google-marketplace:
schemaVersion: v2
applicationApiVersion: v1beta1
-publishedVersion: '2.35.0'
+publishedVersion: '2.36.0'
publishedVersionMetadata:
releaseNote: >-
Initial release.