Format conversion tools for post tuning datasets #514

Open
wants to merge 12 commits into
base: main

Collaborator

@HYLcool HYLcool commented Dec 18, 2024

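Use the Query-Response format as the intermediate format for Data-Juicer (DJ). This PR adds conversion tools in both directions between DJ and each of the following formats: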

  • LLaMA-Factory ShareGPT <--> DJ
  • MS-Swift ShareGPT <--> DJ
  • Alpaca <--> DJ
  • Messages <--> DJ
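
To make the idea of an intermediate format concrete, here is a minimal sketch of the Alpaca <--> DJ direction. It is not the code from this PR. The Query-Response field names (`system`, `instruction`, `query`, `response`, `history`) and the function names `alpaca_to_qr` / `qr_to_alpaca` are assumptions for illustration, and the optional `system` and `history` keys on the Alpaca side follow a common variant of that format rather than anything stated here.

```python
from typing import Any, Dict


def alpaca_to_qr(sample: Dict[str, Any]) -> Dict[str, Any]:
    """Map one Alpaca-style record to the assumed Query-Response layout."""
    return {
        # Field names on the left are hypothetical, not taken from the PR.
        "system": sample.get("system", ""),
        "instruction": sample.get("instruction", ""),
        "query": sample.get("input", ""),
        "response": sample.get("output", ""),
        # Earlier turns, each stored as a [query, response] pair.
        "history": [list(turn) for turn in sample.get("history", [])],
    }


def qr_to_alpaca(sample: Dict[str, Any]) -> Dict[str, Any]:
    """Inverse mapping: assumed Query-Response layout back to Alpaca style."""
    return {
        "system": sample.get("system", ""),
        "instruction": sample.get("instruction", ""),
        "input": sample.get("query", ""),
        "output": sample.get("response", ""),
        "history": [list(turn) for turn in sample.get("history", [])],
    }


if __name__ == "__main__":
    alpaca_record = {
        "system": "",
        "instruction": "Translate to French.",
        "input": "Good morning",
        "output": "Bonjour",
        "history": [["Hi", "Hello!"]],
    }
    intermediate = alpaca_to_qr(alpaca_record)
    print(intermediate)
    print(qr_to_alpaca(intermediate))
```

With a shared intermediate record, each external format needs only one pair of converters, and converting between two external formats is a matter of chaining two of them through DJ.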

@HYLcool HYLcool added the documentation (Improvements or additions to documentation) and enhancement (New feature or request) labels Dec 18, 2024
@HYLcool HYLcool self-assigned this Dec 18, 2024
Collaborator

@yxdyc yxdyc left a comment

The prototype implementation LGTM. Later we may need to discuss some of the terminology and improve the clarity of the docs.

@yxdyc yxdyc mentioned this pull request Dec 19, 2024
3 tasks
@yxdyc yxdyc linked an issue Dec 19, 2024 that may be closed by this pull request
3 tasks
@HYLcool HYLcool marked this pull request as ready for review December 24, 2024 03:18
@HYLcool HYLcool changed the title [WIP] format conversion tools for post tuning datasets Format conversion tools for post tuning datasets Dec 24, 2024
@HYLcool HYLcool added the dj:tools (issues/PRs about specific tools) label Dec 24, 2024
Labels
dj:tools (issues/PRs about specific tools), documentation (Improvements or additions to documentation), enhancement (New feature or request)
Development

Successfully merging this pull request may close these issues.

sharegpt format support
2 participants