Hi, splitting FileDataset is a bit awkward. You could split the dataset text files into multiple files and load them as multiple datasets. You can also do something like this:

    from neuralogic.core import BuiltDataset

    mutagenesis_dataset = ...
    model = template.build(...)
    dataset = model.build_dataset(mutagenesis_dataset)

    train_size = ...
    train_dataset = BuiltDataset(dataset.samples[:train_size], batch_size=1)
    test_dataset = BuiltDataset(dataset.samples[train_size:], batch_size=1)

or

    mutagenesis_dataset = ...
    model = template.build(...)
    dataset = model.build_dataset(mutagenesis_dataset)

    train_size = ...

    # Training
    model(dataset.samples[:train_size], train=True)

    # Testing
    model(dataset.samples[train_size:], train=False)

Both of those solutions require building the dataset first. File datasets are passed directly into the backend and are not read on the Python side.
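If the samples are stored in some fixed order (for example grouped by label), slicing at `train_size` alone may give an unbalanced split. Below is a minimal sketch of a shuffled, reproducible split. It is plain Python and not part of the NeuraLogic API: `split_samples`, `train_fraction` and `seed` are hypothetical names, and it assumes `dataset.samples` behaves like an ordinary Python sequence, which the slicing above suggests.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_samples(
    samples: Sequence[T], train_fraction: float = 0.8, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle a copy of `samples` and split it into (train, test) lists."""
    if not 0.0 <= train_fraction <= 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(samples)               # copy, so the input order is untouched
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is reproducible
    train_size = int(len(shuffled) * train_fraction)
    return shuffled[:train_size], shuffled[train_size:]


# Stand-in data; in the setting above this would be `dataset.samples`.
samples = list(range(10))
train_samples, test_samples = split_samples(samples, train_fraction=0.8, seed=42)
```

Under the same assumption, the two resulting lists could then be used in place of the slices, e.g. `BuiltDataset(train_samples, batch_size=1)`.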
What were the problems you came across?
-
Another question: when using this dataset, how can I get the queries? I would like to evaluate the model, but I haven't been able to find a way to get the queries back from the built dataset. My code looks somewhat like the following:
-
Hello,
I am using the Mutagenesis dataset, which is already built into your framework, but I would like to split it into train and test datasets. Simple Python indexing does not seem to be supported (for FileDataset), so is there a dedicated function for this? I tried reading the files and creating a new dataset, but it seems I have to build the examples and queries first, which produces more problems. Should I
I would appreciate it if you could give me some advice on this and on what the preferred method of doing a train/test split would be.
Thank you!