Support for parsing custom corpus #47
Comments
Yeah, totally agree with that. Things are kinda busy for me on the life end, but PRs are welcome 🙏
On this note: is there any way to see the format of the document that is fed to createDataset.py? I realize the actual files contain personal info; I just want to know the structure so I can recreate it. The fbchat-archive-parser doesn't work for my message data, so I want to re-create the file that would have resulted from running fbcap ./messages.htm > fbMessages.txt
Actually, now that I think back to this project (it's been a while for me), I think the fbMessages.txt file is pretty similar to the format @oveddan was talking about (correct me if I'm wrong though). #28 Basically just
OK, so what you're saying is that it expects a file in the format:
somePerson: hi how's it going
Would it work if given a file like this? It would be great if there were a sample file showing the FB message format somewhere in this repo, considering the FB message gathering repo is obsolete.
It would be great if you supported parsing a custom corpus that a user provides, already in a recognizable format, so that it can be fed into createDataset.py.
For example, if you could pass a text file that looks like:
message: How's it going today?
response: It's going alright
message: What's for dinner tonight?
response: Chicken baked with cheese
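To make the proposed format concrete, here is a minimal sketch of how such a file could be read into (message, response) pairs. It is only an illustration under assumptions: the function name parse_corpus is hypothetical, it is not part of createDataset.py, and how the pairs would then be wired into the existing pipeline is left open.

```python
from typing import Iterable, List, Tuple


def parse_corpus(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Hypothetical helper: collect (message, response) pairs from lines
    labelled 'message:' and 'response:' as in the example above."""
    pairs: List[Tuple[str, str]] = []
    pending = None  # most recent message still waiting for its response
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first colon only, so colons inside the text survive.
        label, sep, text = line.partition(":")
        if not sep:
            continue  # no label on this line; ignore it
        label = label.strip().lower()
        text = text.strip()
        if label == "message":
            pending = text
        elif label == "response" and pending is not None:
            pairs.append((pending, text))
            pending = None
    return pairs


sample = """\
message: How's it going today?
response: It's going alright
message: What's for dinner tonight?
response: Chicken baked with cheese
"""

pairs = parse_corpus(sample.splitlines())
for message, response in pairs:
    print(repr(message), "->", repr(response))
```

A response with no preceding message is dropped, and a message that never gets a response is discarded; a real implementation might prefer to warn about these instead.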