Educational scoring prompt? #13

zidsi · 2024-03-23T13:07:59Z

Is the prompt used for educational scoring of content part of this repo? Did you use Mixtral to score/classify the content, or was a dedicated classifier trained?

loubnabnl · 2024-03-25T12:52:21Z

We used Mixtral to score the content of the clusters. You can find the prompt here: https://github.com/huggingface/text-clustering/blob/7815f8b37d91b75cf160ed3f0ec8550c0b58cabb/run_pipeline.py#L12

zidsi · 2024-03-26T20:23:09Z

Thank you for the kind reply. So, if I understand the pipeline correctly, you let Mixtral classify/score the clusters created via embeddings, based on n representative samples (since not all samples would fit in the 32k context for the large clusters that could emerge in a 100k batch)?
Doesn't that mean you are effectively mapping/classifying the embedding space (for each identified cluster), and that this mapping could in turn be used to make such predictions directly from embeddings? If the embedding model used is multilingual, such a "distilled" classifier would lower the barrier for many low-resource languages.
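
To make the idea concrete, here is a minimal sketch of what such a "distilled" classifier could look like. It is not taken from the repo: the function names (`fit_cluster_scores`, `predict_score`), the toy 2-D embeddings, and the per-cluster scores are all hypothetical, and a nearest-centroid lookup is just one simple way to reuse cluster-level scores. Each cluster keeps its centroid plus the score the LLM assigned to it, and a new document inherits the score of the closest centroid by cosine similarity.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def fit_cluster_scores(embeddings, cluster_ids, llm_scores):
    """Build {cluster_id: (centroid, score)} from clustered embeddings.

    `llm_scores` maps a cluster id to the score an LLM gave that cluster
    (hypothetical input; clusters without a score are skipped).
    """
    grouped = {}
    for vector, cluster_id in zip(embeddings, cluster_ids):
        grouped.setdefault(cluster_id, []).append(vector)
    return {
        cluster_id: (centroid(vectors), llm_scores[cluster_id])
        for cluster_id, vectors in grouped.items()
        if cluster_id in llm_scores
    }


def predict_score(model, embedding):
    """Return the score of the cluster whose centroid is most similar."""
    best_centroid, best_score = max(
        model.values(), key=lambda entry: cosine(entry[0], embedding)
    )
    return best_score


# Toy data: two clusters in a made-up 2-D embedding space.
embeddings = [[1.0, 0.1], [0.9, 0.0], [0.1, 1.0], [0.0, 0.8]]
cluster_ids = [0, 0, 1, 1]
llm_scores = {0: 4, 1: 1}  # assumed cluster-level educational scores

model = fit_cluster_scores(embeddings, cluster_ids, llm_scores)
high = predict_score(model, [0.8, 0.2])  # lands near cluster 0
low = predict_score(model, [0.2, 0.9])   # lands near cluster 1
```

With a multilingual embedding model, the same lookup would apply to documents in any language the model covers, although how well cluster-level scores transfer to individual documents would need to be checked.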
