Activity

Update README.md

byshiue pushed 1 commit to main • afdf9a9…df4a753 • on Oct 19, 2023

fix memory leak

byshiue pushed 1 commit to main • f8e42aa…afdf9a9 • on Sep 8, 2023

remove mpi_cxx from multi-gpu build for now (#705)

Pull request merge

byshiue pushed 1 commit to main • 7777ff1…f8e42aa • on Jul 6, 2023

Support size_per_head=112 (#660)

Pull request merge

byshiue pushed 1 commit to main • eb9b81b…7777ff1 • on Jun 29, 2023

fix: swap tensor bug (#683)

Pull request merge

byshiue pushed 1 commit to main • 1cf9b51…eb9b81b • on Jun 26, 2023

[bugfix] Fix 2-shot All Reduce correctness issue (indexing bug). (#672)

Pull request merge

byshiue pushed 1 commit to main • c6e8f60…1cf9b51 • on Jun 21, 2023

Deleted branch

byshiue deleted fix/gpt_early_stop • on May 1, 2023

Fix/gpt early stop (#584)

Pull request merge

byshiue pushed 1 commit to main • 19b2956…c6e8f60 • on May 1, 2023

fix: remove useless codes

byshiue pushed 1 commit to fix/gpt_early_stop • b5d6765…5393302 • on May 1, 2023

fix codes

byshiue created fix/gpt_early_stop • b5d6765 • on May 1, 2023

fix: fix bug of mask of gptj/gptneox

byshiue pushed 1 commit to fix/softprompt_mask • a3ca909…08b45b4 • on Apr 25, 2023

fix: fix bug of preparing mask for soft prompt

byshiue created fix/softprompt_mask • a3ca909 • on Apr 24, 2023

perf(bloom): improve performance of huggingface_bloom_convert.py, dec…

Pull request merge

byshiue pushed 1 commit to main • 3460e20…19b2956 • on Apr 24, 2023

[Enhancement]create huggingface_gptneox_convert.py (#569)

Pull request merge

byshiue pushed 1 commit to main • d7ccf83…3460e20 • on Apr 24, 2023

Update unfused_attention_kernels.cu

byshiue pushed 1 commit to main • adb21c3…d7ccf83 • on Apr 20, 2023

fix overflow in softmax_kernel when process long seqlen and big batch…

Pull request merge

byshiue pushed 1 commit to main • c6ba315…adb21c3 • on Apr 19, 2023

Update cublasMMWrapper.cc

byshiue pushed 1 commit to main • a6ef7af…c6ba315 • on Apr 18, 2023

Update cublasMMWrapper.cc

byshiue pushed 1 commit to main • 169b8df…a6ef7af • on Apr 18, 2023

[Enhancement]add pytorch backend support for gptneox (#550)

Pull request merge

byshiue pushed 1 commit to main • 0c12805…169b8df • on Apr 18, 2023

Update T5DecodingWeight.cc

byshiue pushed 1 commit to main • 4402759…0c12805 • on Apr 17, 2023

fix: fix bug of gpt buffer and gpt gemm overflow

byshiue pushed 1 commit to main • bc4139e…4402759 • on Apr 6, 2023

Update ParallelGpt.cc

byshiue pushed 1 commit to tmp/fix_gpt_earlystop • e2dd164…e045811 • on Apr 6, 2023

Update gpt_guide.md (#529)

Pull request merge

byshiue pushed 1 commit to main • e838426…bc4139e • on Mar 29, 2023

fix: fix bug of gpt early stop. But this fixing would lead to hang on…

byshiue created tmp/fix_gpt_earlystop • e2dd164 • on Mar 24, 2023

feat: support shared context in gptj

byshiue created dev/gptj_shared_context • 7c0ebe8 • on Mar 22, 2023

fix: gpt tensor shapes inconsistency (#505)

Pull request merge

byshiue pushed 1 commit to main • bb94e2d…e838426 • on Mar 17, 2023

fix: change int of some kernels to int64_t to prevent overflow

byshiue pushed 1 commit to main • 72d3dce…bb94e2d • on Mar 14, 2023

Update beam_search_topk_kernels.cu

byshiue pushed 1 commit to main • 303e052…72d3dce • on Mar 10, 2023
