tensorrt-llm: 0.9 -> 0.10, triton: 2.42.0 -> 2.44.0 #50

yorickvP · 2024-07-04T13:14:28Z

Open questions:

Should we keep detokenizing in predict.py instead of postprocessing?
It looks like the required config changed. Notably, enable_trt_overlap disappeared and a whole bunch of cache options got added.
Yorick: do some testing.
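For the first question, here is a minimal sketch of what detokenizing on the predict.py side could look like. It assumes the server streams token IDs and that a `decode` callable (for example a tokenizer's decode method) is available; `stream_text` and its parameters are hypothetical names, not taken from this repository.

```python
from typing import Callable, Iterable, Iterator, List


def stream_text(
    token_ids: Iterable[int],
    decode: Callable[[List[int]], str],
) -> Iterator[str]:
    """Yield newly decoded text as token IDs arrive (hypothetical helper)."""
    seen: List[int] = []
    emitted = ""
    for token_id in token_ids:
        seen.append(token_id)
        # Re-decode the whole prefix so multi-token characters come out right.
        text = decode(seen)
        # A trailing U+FFFD usually means an incomplete byte sequence; wait for more tokens.
        if text.endswith("\ufffd"):
            continue
        delta = text[len(emitted):]
        if delta:
            emitted = text
            yield delta
```

Re-decoding the full prefix on every step is quadratic in the output length, so this is only meant to illustrate the idea, not to be used as-is.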

Remove the tensorrt_llm python script, since it confuses `maybe_download_tarball_with_pget`

technillogue · 2024-07-19T01:54:08Z

enable_trt_overlap is set to false in a lot of places; we will probably need to change that

we should review the new configuration options as well
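One way to start that review is to diff the option names of the old and new configs. The sketch below assumes both configs have already been loaded into plain dictionaries; `diff_config_keys` is a hypothetical helper, and apart from `enable_trt_overlap` the key names in the usage example are placeholders, not real options.

```python
from typing import Any, Dict, Set, Tuple


def diff_config_keys(
    old: Dict[str, Any],
    new: Dict[str, Any],
) -> Tuple[Set[str], Set[str]]:
    """Return (removed, added) option names between two config dicts."""
    removed = set(old) - set(new)
    added = set(new) - set(old)
    return removed, added


# Usage with placeholder values; only enable_trt_overlap comes from the PR text.
old_config = {"enable_trt_overlap": "false", "placeholder_shared_option": "1"}
new_config = {"placeholder_shared_option": "1", "placeholder_cache_option": "1"}
removed, added = diff_config_keys(old_config, new_config)
```

Anything in `removed` that is still set somewhere in the templates is a candidate for cleanup, and everything in `added` needs a decision on its default.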

technillogue · 2024-07-19T01:58:22Z

predict.py

+                if token == []:
+                    continue
+


technillogue: was this discovered by testing? what is this?

yorickvP: Good question. It was discovered by testing: EOS seems to have been replaced by []. Potentially related to the tokenizer config?
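The check in the diff can be read as a filter over the streamed tokens. The following is a minimal sketch of that behavior in isolation, assuming each streamed item is either a string or a list; `iter_nonempty_tokens` is a hypothetical name and not the actual code in predict.py.

```python
from typing import Iterable, Iterator, List, Union

Token = Union[str, List[str]]


def iter_nonempty_tokens(tokens: Iterable[Token]) -> Iterator[Token]:
    """Skip empty-list items, which testing suggests now appear where EOS used to."""
    for token in tokens:
        if token == []:
            continue
        yield token
```

Only the literal empty list is skipped, matching the `token == []` comparison in the diff; an empty string would still pass through.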

yorickvP requested a review from joehoover July 4, 2024 13:14

yorickvP changed the title ~~tensorrt-llm: 0.9 -> 0.19, triton: 2.42.0 -> 2.44.0~~ tensorrt-llm: 0.9 -> 0.10, triton: 2.42.0 -> 2.44.0 Jul 5, 2024

yorickvP added 8 commits July 10, 2024 17:07

tensorrt-llm: 0.9.0 -> 0.10.0, triton: 2.42.0 -> 2.44.0

421f2c5

update cognix

d279a41

update triton_templates to trtllm-0.10

d71edcd

don't tokenize in postprocessing (see #27)

37959da

instantiate triton_model_repo with the default config

f34f604

Adjust decoding_mode after testing, omit missing optional params

1345d36

Remove the tensorrt_llm python script, since it confuses `maybe_download_tarball_with_pget`

ignore empty SSE's, remove decoding_mode, just omit topk instead

8e024af

bump pget==0.8.2, cog==0.10.0-alpha16

193ce88

yorickvP force-pushed the trtllm-0.10 branch from f3f93f4 to 193ce88 July 10, 2024 15:07

yorickvP marked this pull request as ready for review July 10, 2024 15:08

yorickvP added 2 commits July 11, 2024 13:16

tensorrt-llm: decrease closure size by cleaning up kernels

583964b

bump cognix to exclude train: null

9439d32

technillogue reviewed Jul 19, 2024
