Question: Is the model available instruction-tuned? #3

@CHesketh76

Hello,

Just wondering whether the model you provided on Hugging Face was instruction-tuned to perform the needle-in-a-haystack test.

Also (hypothetically speaking), would some of the techniques used to reduce GPU requirements also apply to SSM models? For example, Unsloth reduces GPU demand enough that consumer GPUs can train Llama 2 7B and Mistral 7B models. My 8 GB GPU was able to fine-tune Mistral for a small use case of mine. It would be absolutely amazing to see a Mamba-7B model train with half the resources that Unsloth Mistral 7B needs.
