Can you suggest how many dimensions to use for word vectors? #22
Yeah, I guess the answer to this depends on the complexity of your NLP task, the computational power you have, and the amount of training data you have. Because of all these factors, it's difficult to recommend one particular size, and I don't think there are any papers that say one size is better than another. My intuition is that the larger your vectors are, the more information you can pack into them, and thus the better they will be. However, this also depends on whether your training corpus is large enough to learn accurate vector representations at that size. Long story short, I think this is more of a trial-and-error hyperparameter you need to play around with. That said, I would love to hear if a particular size worked best for you.
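To make the trade-off concrete: the embedding table has one row per vocabulary word, so its parameter count grows linearly with the vector dimension, which is why larger vectors demand more data and compute. A minimal sketch (the vocabulary size and candidate dimensions below are made-up illustration values, not numbers from this repo):

```python
# Hypothetical sweep over candidate vector sizes. The embedding
# table is (vocab_size x dim), so parameters scale linearly in dim.
vocab_size = 50000  # made-up vocabulary size for illustration

param_counts = {dim: vocab_size * dim for dim in (100, 300, 500)}
for dim, params in param_counts.items():
    print(f"{dim}-dim vectors -> {params:,} embedding parameters")
```

In practice you would train with each candidate size and compare downstream task performance, keeping the smallest dimension that doesn't hurt accuracy.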
I'm not seeing where the result of that prompt is actually used in the Python code provided. Can you point me to where in the code it gets used?
The prompt appears at line 149 of Seq2Seq.py if you do not use the word2vec.py script.
Yes, I see the prompt, but I don't see the result passed into anything after it's read in from the terminal. Where are the word dimensions ever used?
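For context on what such a prompted value would typically feed into: in a seq2seq model the dimension usually sizes the embedding matrix that maps token ids to dense vectors. A hypothetical sketch (the variable names, vocabulary size, and token ids below are invented for illustration and are not taken from Seq2Seq.py):

```python
import numpy as np

# Hypothetical: the dimension typed at the prompt would normally
# determine the width of the embedding matrix.
word_vec_dimensions = 100   # value entered at the prompt (made up)
vocab_size = 10000          # made-up vocabulary size

# One dense vector per vocabulary word.
embedding = np.random.randn(vocab_size, word_vec_dimensions)

# Looking up a batch of token ids yields their word vectors.
token_ids = [3, 17, 42]
vectors = embedding[token_ids]
print(vectors.shape)  # (3, 100)
```

If the prompted value never reaches code like this, that would indeed suggest the input is read but unused.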
Hi Adit - Great repo! I am trying to use Seq2Seq.py to generate the word vectors, and it asks for the number of dimensions. I read somewhere that it is ideal to use between 300 and 500 dimensions. Can you suggest how many should be used? Or how many did you use for your runs?