Error running benchmarks/benchmark_generation.py #3

Open
BlinkDL opened this issue Jan 24, 2023 · 4 comments · Fixed by #4

Comments

BlinkDL commented Jan 24, 2023

Hi there. It's great to see another LM trained on the Pile.

When I run benchmarks/benchmark_generation.py:

[KeOps] Compiling cuda jit compiler engine ... OK
[pyKeOps] Compiling nvrtc binder for python ... OK
Number of parameters: 1326096384
[KeOps] Generating code for formula Sum_Reduction(ComplexMult(Var(0,2,1),ComplexExp(ComplexMult(Var(1,2,1),Var(2,2,0)))),0) ... OK
Segmentation fault

and it exits after "Segmentation fault".
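(A common first step for KeOps JIT segfaults is clearing the compiled-formula cache, since stale cached binaries can crash after a CUDA or driver change. A minimal sketch, assuming pykeops 2.x and its bundled helpers:)

# Clear pykeops' cached JIT binaries, then re-run its bundled smoke test.
# clean_pykeops() and test_torch_bindings() ship with pykeops 2.x.
import pykeops

pykeops.clean_pykeops()        # wipe the stale build cache
pykeops.test_torch_bindings()  # recompile and run a tiny formula on the GPU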

So I uninstalled pykeops, and the new error is:

Traceback (most recent call last):
  File "/fsx/BlinkDL/CODE/_PUBLIC_/H3/benchmarks/benchmark_generation_h3.py", line 68, in <module>
    fn()
  File "/fsx/BlinkDL/CODE/_PUBLIC_/H3/benchmarks/benchmark_generation_h3.py", line 65, in <lambda>
    fn = lambda: model.generate(input_ids=input_ids, max_length=max_length,
  File "/fsx/BlinkDL/conda/lib/python3.9/site-packages/flash_attn-0.2.8-py3.9-linux-x86_64.egg/flash_attn/utils/generation.py", line 150, in generate
    output = decode(input_ids, self, max_length, top_k=top_k, top_p=top_p,
  File "/fsx/BlinkDL/conda/lib/python3.9/site-packages/flash_attn-0.2.8-py3.9-linux-x86_64.egg/flash_attn/utils/generation.py", line 107, in decode
    logits = model(input_ids, inference_params=inference_params).logits[:, -1]
  File "/fsx/BlinkDL/conda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/fsx/BlinkDL/CODE/_PUBLIC_/H3/src/models/ssm_seq.py", line 186, in forward
    hidden_states = self.backbone(input_ids, position_ids=position_ids,
  File "/fsx/BlinkDL/conda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/fsx/BlinkDL/CODE/_PUBLIC_/H3/src/models/ssm_seq.py", line 141, in forward
    hidden_states, residual = layer(hidden_states, residual, mixer_kwargs=mixer_kwargs)
  File "/fsx/BlinkDL/conda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/fsx/BlinkDL/conda/lib/python3.9/site-packages/flash_attn-0.2.8-py3.9-linux-x86_64.egg/flash_attn/modules/block.py", line 126, in forward
    hidden_states = self.mixer(hidden_states, **mixer_kwargs)
  File "/fsx/BlinkDL/conda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/fsx/BlinkDL/conda/lib/python3.9/site-packages/flash_attn-0.2.8-py3.9-linux-x86_64.egg/flash_attn/modules/mha.py", line 481, in forward
    kv = self._update_kv_cache(qkv[:, :, 1:], inference_params)
  File "/fsx/BlinkDL/conda/lib/python3.9/site-packages/flash_attn-0.2.8-py3.9-linux-x86_64.egg/flash_attn/modules/mha.py", line 419, in _update_kv_cache
    assert self.layer_idx is not None, 'Generation requires layer_idx in the constructor'
AssertionError: Generation requires layer_idx in the constructor
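(The assertion comes from flash_attn's generation path: the KV cache lives on a shared inference_params object keyed by layer index, so every attention module must be constructed with layer_idx set in order to find its slot. A rough paraphrase of the pattern, not flash_attn's exact code:)

# Rough paraphrase of per-layer KV caching during generation. The
# shared inference_params holds one cache slot per layer, so a module
# built without layer_idx has no way to locate its own slot.
import torch

class InferenceParams:
    def __init__(self):
        self.key_value_memory_dict = {}  # layer_idx -> cached kv tensor

def update_kv_cache(kv, inference_params, layer_idx):
    assert layer_idx is not None, 'Generation requires layer_idx in the constructor'
    cache = inference_params.key_value_memory_dict
    if layer_idx not in cache:
        cache[layer_idx] = kv  # prompt step: seed this layer's cache
    else:
        # decode step: append the new tokens' kv along the sequence dim
        cache[layer_idx] = torch.cat([cache[layer_idx], kv], dim=1)
    return cache[layer_idx]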
kashif added a commit to kashif/H3 that referenced this issue on Jan 24, 2023:
    fixes HazyResearch#3 when pykeops is not installed, at least
kashif mentioned this issue on Jan 24, 2023
DanFu09 reopened this on Jan 24, 2023
DanFu09 (Contributor) commented Jan 24, 2023

@BlinkDL try with this fix now!

For the KeOps issue - can you share details about your environment? PyTorch, CUDA, and KeOps versions would all be helpful.
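A minimal way to collect those, using only standard version attributes plus pykeops' bundled smoke test, would be:

# Dump the versions relevant to the KeOps segfault, then run
# pykeops' own binding test, which JIT-compiles a tiny formula.
import torch
import pykeops

print("torch  :", torch.__version__)
print("cuda   :", torch.version.cuda)
print("pykeops:", pykeops.__version__)
pykeops.test_torch_bindings()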

kashif (Contributor) commented Jan 24, 2023

I believe the issue happens when the GPU runs out of memory for the larger benchmarks. For example, on my setup with pykeops 2.1.1, driver version 525.60.13, CUDA 12.0, and torch 2.0.0a0+git81b5eff on a 24 GB card, it crashes:

Number of parameters: 1326096384

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007ffe4e0c556b in range_preprocess_from_device(int&, int, int, int, int**, int, int*&, int*&, int*&, int*&, int, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, int*) () from .cache/keops2.1.1/build/pykeops_nvrtc.cpython-310-x86_64-linux-gnu.so

but it works if I run a smaller test:

Number of parameters: 12102144
[KeOps] Generating code for formula Sum_Reduction(ComplexMult(ComplexMult(Var(1,2,0),Var(0,2,1)),ComplexExp(ComplexMult(Var(2,2,0),Var(3,2,1)))),0) ... OK

Hope that helps!
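(To check the headroom before launching the larger benchmark: the 1.3B model already needs roughly 2.7 GB for fp16 weights alone, before activations and the generation cache. A quick sketch using torch's built-in query:)

# Report free vs. total memory on the current CUDA device.
import torch

free, total = torch.cuda.mem_get_info()  # both in bytes
print(f"free: {free / 2**30:.1f} GiB of {total / 2**30:.1f} GiB")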

bryanhpchiang commented

I'm getting this same error, also on a 24 GB card. I see the --ckpt option, but is there a way to toggle the model architecture between the different model sizes (e.g. 1.3B vs. 2.7B)?

Thanks!

DanFu09 (Contributor) commented Mar 6, 2023

There are examples showing how to switch between the different models for text generation: https://github.com/HazyResearch/H3/tree/main/examples
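(For reference, the checkpoint sizes follow the usual GPT-style shape settings. The values below are the standard ones for those parameter counts, given as a sketch; the exact flags and argument names are in the linked examples.)

# Assumed GPT-style shapes for the two checkpoint sizes; treat this
# dict as illustrative, not the repo's actual API.
H3_SHAPES = {
    "1.3B": dict(d_model=2048, n_layer=24),
    "2.7B": dict(d_model=2560, n_layer=32),
}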
