Commit ef966d6 (1 parent: 0bb7bcf)
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 84 changed files with 452,869 additions and 1 deletion.
Binary file not shown.
@@ -0,0 +1,2 @@
# NMT Assignment
Note: Heavily inspired by the https://github.com/pcyin/pytorch_nmt repository
Empty file.
@@ -0,0 +1,2 @@
rm -f assignment4.zip
zip -r assignment4.zip *.py ./en_es_data ./sanity_check_en_es_data ./outputs
Large diffs are not rendered by default.
@@ -0,0 +1,3 @@
nltk
docopt
tqdm==4.29.1
@@ -0,0 +1,13 @@
name: local_nmt
channels:
  - pytorch
  - defaults
dependencies:
  - python=3.5
  - numpy
  - scipy
  - tqdm
  - docopt
  - pytorch
  - nltk
  - torchvision
@@ -0,0 +1,58 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

"""
CS224N 2019-20: Homework 4
model_embeddings.py: Embeddings for the NMT model
Pencheng Yin <pcyin@cs.cmu.edu>
Sahil Chopra <schopra8@stanford.edu>
Anand Dhoot <anandd@stanford.edu>
Vera Lin <veralin@stanford.edu>
"""

import torch.nn as nn


class ModelEmbeddings(nn.Module):
    """
    Class that converts input words to their embeddings.
    """
    def __init__(self, embed_size, vocab):
        """
        Init the Embedding layers.
        @param embed_size (int): Embedding size (dimensionality)
        @param vocab (Vocab): Vocabulary object containing src and tgt languages
                              See vocab.py for documentation.
        """
        super(ModelEmbeddings, self).__init__()
        self.embed_size = embed_size

        # default values
        self.source = None
        self.target = None

        src_pad_token_idx = vocab.src['<pad>']
        tgt_pad_token_idx = vocab.tgt['<pad>']

        ### YOUR CODE HERE (~2 Lines)
        ### TODO - Initialize the following variables:
        ###     self.source (Embedding Layer for source language)
        ###     self.target (Embedding Layer for target language)
        ###
        ### Note:
        ###     1. `vocab` object contains two vocabularies:
        ###            `vocab.src` for source
        ###            `vocab.tgt` for target
        ###     2. You can get the length of a specific vocabulary by running:
        ###             `len(vocab.<specific_vocabulary>)`
        ###     3. Remember to include the padding token for the specific vocabulary
        ###        when creating your Embedding.
        ###
        ### Use the following docs to properly initialize these variables:
        ###     Embedding Layer:
        ###         https://pytorch.org/docs/stable/nn.html#torch.nn.Embedding

        ### END YOUR CODE
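The TODO above asks for two `nn.Embedding` layers, one per language, sized to the full vocabulary (padding token included) with the pad index passed as `padding_idx`. A minimal completion sketch follows; it is one plausible answer, not the official solution, and the `TinyVocab`/`TinyVocabEntry` classes are hypothetical stand-ins for the assignment's real `Vocab` object from `vocab.py`:

```python
import torch
import torch.nn as nn


class TinyVocabEntry:
    """Hypothetical stand-in for one side of the assignment's Vocab:
    maps tokens to integer ids, with '<pad>' fixed at index 0."""
    def __init__(self, tokens):
        self.word2id = {'<pad>': 0}
        for t in tokens:
            self.word2id.setdefault(t, len(self.word2id))

    def __getitem__(self, word):
        return self.word2id[word]

    def __len__(self):
        return len(self.word2id)


class TinyVocab:
    """Stand-in container with src and tgt vocabularies, like vocab.py's Vocab."""
    def __init__(self):
        self.src = TinyVocabEntry(['hola', 'mundo'])
        self.tgt = TinyVocabEntry(['hello', 'world'])


class ModelEmbeddings(nn.Module):
    def __init__(self, embed_size, vocab):
        super(ModelEmbeddings, self).__init__()
        self.embed_size = embed_size

        src_pad_token_idx = vocab.src['<pad>']
        tgt_pad_token_idx = vocab.tgt['<pad>']

        # The ~2 lines the TODO asks for: one embedding table per language,
        # covering the whole vocabulary (len() already includes '<pad>').
        # padding_idx keeps the pad row at zero and excludes it from gradients.
        self.source = nn.Embedding(len(vocab.src), embed_size,
                                   padding_idx=src_pad_token_idx)
        self.target = nn.Embedding(len(vocab.tgt), embed_size,
                                   padding_idx=tgt_pad_token_idx)


vocab = TinyVocab()
emb = ModelEmbeddings(8, vocab)
# Embed a (batch=1, seq_len=3) tensor of source-word ids.
out = emb.source(torch.tensor([[0, 1, 2]]))
print(out.shape)  # torch.Size([1, 3, 8])
```

Passing `padding_idx` is the detail note 3 hints at: lookups of `<pad>` return an all-zero vector, so padded positions contribute nothing to the encoder input.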
Large diffs are not rendered by default.
Empty file.