@yorickvP yorickvP commented Aug 8, 2024

Breaking changes:

  • enable_trt_overlap is now useless
  • max_queue_size is now required; 0 is a good default
  • max_seq_len is required until next week; after that it will default to the model's max input size
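
As a rough illustration of what these breaking changes mean for an existing configuration, here is a minimal sketch. It assumes the settings are held in a plain Python dict keyed by the option names above. The `migrate_config` helper is hypothetical and not part of this PR; the real config format may differ.

```python
def migrate_config(old: dict) -> dict:
    """Return a copy of `old` adjusted for the breaking changes listed above.

    Hypothetical helper: assumes the config is a flat dict of option names.
    """
    new = dict(old)
    # enable_trt_overlap is now useless, so drop it if present.
    new.pop("enable_trt_overlap", None)
    # max_queue_size is now required; 0 is the suggested default.
    new.setdefault("max_queue_size", 0)
    # max_seq_len is required for now, and there is no safe value to guess.
    if "max_seq_len" not in new:
        raise ValueError("max_seq_len must be set explicitly")
    return new


if __name__ == "__main__":
    print(migrate_config({"enable_trt_overlap": True, "max_seq_len": 4096}))
```

The sketch fails loudly on a missing max_seq_len rather than inventing a value, since the planned default (the model's max input size) is not available yet.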

Other changes:

  • removed bls

yorickvP and others added 30 commits July 19, 2024 13:54
Remove the tensorrt_llm python script, since it confuses `maybe_download_tarball_with_pget`
hopefully fixes `ValueError: Invalid pattern: '**' can only be an entire path component`
adds TRTLLM_BUILDER_VARIANT=h100 to work around "It looks like a copy of this model version already exists on Replicate"
@yorickvP yorickvP requested a review from joehoover August 8, 2024 14:04