
Chap 15, pg 513 : ModuleNotFoundError: No module named 'torchdata.datapipes' #199

Open
Emmanuel-Ibekwe opened this issue Dec 24, 2024 · 14 comments



Emmanuel-Ibekwe commented Dec 24, 2024

```python
from torchtext.datasets import IMDB

train_dataset = IMDB(split='train')
test_dataset = IMDB(split='test')
```

I keep getting this error despite manually installing torchdata. When I tried installing the exact version of torchtext used in the chapter (0.10.0), pip did not recognize it as a valid version.

I can't find any solution to it online.


kostuyn commented Jan 4, 2025

@Emmanuel-Ibekwe I installed version 0.17.0 of the package and it works (on Colab):

```shell
!pip install portalocker --quiet
!pip install torchtext==0.17.0 --quiet
```

After installing, use the Runtime -> Restart runtime option in the Colab menu.

(The latest version of torchtext has a problem: pytorch/text#2272)

rasbt (Owner) commented Jan 4, 2025

@Emmanuel-Ibekwe It looks like you are right, and the PyTorch maintainers removed torchtext 0.10.0 from PyPI for some reason. The ch15 notebook here on GitHub should be updated to work with newer versions of torchtext, though, as @kostuyn mentioned; that would also require installing portalocker, as described above. Let us know if this still doesn't work.


Emmanuel-Ibekwe commented Jan 7, 2025

Thanks @rasbt and @kostuyn for the responses. I found out through ChatGPT (great tool) that the `datasets` package from the Hugging Face community includes the IMDB dataset, so I used that.
Using the `datasets` package, I got training and validation accuracies across epochs that differed from the ones in the text. The model overfitted: at some point both accuracies reached 100% and stayed there, yet the model performed terribly on the test dataset, with an accuracy of 68.5%.
Thanks one more time.

Edit: I built a custom dataset class with torch.utils.data to handle data loading for the IMDB dataset.
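For anyone reading along, a minimal sketch of the kind of custom dataset described above. It assumes records shaped like those returned by Hugging Face's `datasets.load_dataset("imdb")` (each a dict with `"text"` and `"label"` keys); a small in-memory list stands in for the real split so the sketch is self-contained, and `IMDBDataset` is a hypothetical name. A map-style dataset for `torch.utils.data.DataLoader` only needs `__len__` and `__getitem__`, so the same class body works if you subclass `torch.utils.data.Dataset` and pass the real split.

```python
class IMDBDataset:
    """Map-style dataset wrapping IMDB-style records.

    In practice, subclass torch.utils.data.Dataset and pass the object
    returned by datasets.load_dataset("imdb", split="train").
    """

    def __init__(self, records):
        self.records = records  # each record: {"text": ..., "label": ...}

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        record = self.records[idx]
        return record["label"], record["text"]


# Toy records standing in for the real IMDB split.
toy_records = [
    {"text": "A wonderful film.", "label": 1},
    {"text": "Dull and overlong.", "label": 0},
]
ds = IMDBDataset(toy_records)
print(len(ds))   # 2
print(ds[0])     # (1, 'A wonderful film.')
```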


rasbt commented Jan 8, 2025

Thanks for the feedback. Yes, I think the dataset would nowadays be easier to get from the `datasets` library, although the splits are different. I am surprised about the low test set accuracy, though. Both the training and validation accuracies were 100%? This is an interesting case of overfitting where the validation accuracy seems almost too good to be true (and the test accuracy unexpectedly bad).


Emmanuel-Ibekwe commented Jan 15, 2025

Yes sir, both the training and validation accuracies reached 100% at some point during training and stayed there until the last epoch. Here's a link to the repo containing the code, in case you want to take a look: https://github.com/Emmanuel-Ibekwe/Machine-learning-by-S.-Raschka-notebooks

I had commented out the training code and saved the model.

Please, if you are still interested in the code, copy the link manually with your mouse, because clicking on it just leads to a non-existent issue.

Also, sorry for the lack of comments and headings in the code (I hadn't cared much about them since it was basically for learning purposes). The training code is at the very bottom of the file, with the related code that builds up to it preceding it.


rasbt commented Jan 16, 2025

Thanks for sharing, but it seems the link doesn't work:

[attached screenshot]


Emmanuel-Ibekwe commented Jan 17, 2025

Good day sir. I finally figured out why it kept redirecting to the wrong address: GitHub seemed to be embedding the wrong URL in the link. I've fixed that.

It works now:
https://github.com/Emmanuel-Ibekwe/Machine-learning-by-S.-Raschka-notebooks

The training code is towards the bottom of the file. Sorry once again for the lack of comments and headings.


rasbt commented Jan 30, 2025

It looks like you are correctly splitting the training set into train and validation subsets. Honestly, I can't see why the accuracies would both be exactly 100% in your case. Sorry!
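For reference, the kind of train/validation split being discussed can be sketched as a seeded index shuffle; this is a minimal plain-Python version (`train_val_split` is a hypothetical helper, `torch.utils.data.random_split` would be the idiomatic torch equivalent, and 25,000 is the size of IMDB's train split).

```python
import random

def train_val_split(n_items, val_fraction=0.2, seed=1):
    """Return disjoint train/validation index lists.

    A seeded shuffle keeps the split reproducible, so the same examples
    land in the same subset across runs and cannot leak between them.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_val = int(n_items * val_fraction)
    return indices[n_val:], indices[:n_val]

train_idx, val_idx = train_val_split(25000)  # 25,000 = IMDB train-split size
print(len(train_idx), len(val_idx))  # 20000 5000
```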

Emmanuel-Ibekwe commented

Good day sir. Did you get something different? Did you use the Hugging Face dataset? I've long since moved past the chapter, though; I got the same accuracies after many iterations.
Thanks for the responses so far.


rasbt commented Feb 1, 2025

No worries, I was just looking at your code to see if there was anything suspicious that could explain why the training and validation accuracies would both be 100%. I couldn't find an issue (like an accidentally double-assigned variable). In any case, please don't worry about it and feel free to move on to the next chapter(s) :).


Emmanuel-Ibekwe commented Feb 3, 2025

Ok sir. Thank you for the help so far. :)
By the way, I'm done with the chapters I needed in the book (1 through part of 17).


Emmanuel-Ibekwe commented Feb 19, 2025

Good afternoon sir. How's it going? Sorry to disturb you. I would love it if you could help me resolve an issue I'm getting with code from the book NLP with Transformers by Lewis Tunstall et al. I tried creating an issue on that book's GitHub for a previous, similar issue, but I've had no response for over three weeks now. It's ok if you decline; I very much understand your busy schedule, and moreover it's not from your book.
This is the link to the Kaggle notebook: https://www.kaggle.com/code/immanuelibekwe/fork-of-nlp-chapter-4 (you only have to click the edit button to get the actual notebook).

The trainer.train() call of the Hugging Face Trainer API seems to run indefinitely, and neither the GPU nor the CPU is utilized while it runs. I searched online and queried ChatGPT, yet found nothing.

I had run into the Trainer API in your book, but because it took forever to run, and you had not yet introduced the free online GPU platforms at that point, I skipped it.


rasbt commented Feb 20, 2025

Sorry, unfortunately I don't currently have the capacity to help with other books.

Emmanuel-Ibekwe commented

Ok sir. I do understand. Thank you for your time.
