Commit 5ea826a
Author: Yh Tian
Message: update readme and requirements
Parent: 399054c

4 files changed (+27, −2 lines)

README.md (+12)

@@ -26,6 +26,18 @@ Our code works with the following environment.
 * `python=3.6`
 * `pytorch=1.1`
 
+To run [Stanford CoreNLP Toolkit](https://stanfordnlp.github.io/CoreNLP/cmdline.html), you need
+* `Java 8`
+
+To run [Berkeley Neural Parser](https://github.com/nikitakit/self-attentive-parser), you need
+* `tensorflow==1.13.1`
+* `benepar[cpu]`
+* `cython`
+
+Note that Berkeley Neural Parser does not support `TensorFlow 2.0`.
+
+You can refer to their websites for more information.
+
 ## Downloading BERT and ZEN
 
 In our paper, we use BERT ([paper](https://www.aclweb.org/anthology/N19-1423/)) and ZEN ([paper](https://arxiv.org/abs/1911.00720)) as the encoder.
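Since the README notes that Berkeley Neural Parser does not support TensorFlow 2.0, a quick version guard before running the pipeline can save a failed run. A minimal sketch — the helper name `tf1_compatible` is illustrative and not part of this repository:

```python
def tf1_compatible(version: str) -> bool:
    """Return True if a TensorFlow version string is 1.x.

    Berkeley Neural Parser does not support TensorFlow 2.0, so any
    version with major number >= 2 is rejected.
    """
    major = int(version.split(".")[0])
    return major < 2
```

For example, `tf1_compatible("1.13.1")` accepts the pinned version, while `tf1_compatible("2.0.0")` rejects it; one could assert this against `tensorflow.__version__` before launching preprocessing.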

data_preprocessing/README.md (+2, −2)

@@ -9,8 +9,8 @@ Run `getdata.sh` under that directory to obtain and pre-process the data. This s
 
 This script will also download the [Stanford CoreNLP Toolkit v3.9.2](https://stanfordnlp.github.io/CoreNLP/history.html) (SCT) and [Berkeley Neural Parser](https://github.com/nikitakit/self-attentive-parser) (BNP) from their official websites, which are used to obtain the auto-analyzed syntactic knowledge. If you only want to use the knowledge from SCT, you can comment out the script to download BNP in `getdata.sh`. If you want to use the auto-analyzed knowledge from BNP, you need to download both SCT and BNP, because BNP relies on the segmentation results from SCT.
 
-To run SCT, you need `java 1.8`; to run BNP, you need `tensorflow`.
+To run SCT, you need `Java 8`; to run BNP, you need `tensorflow==1.13.1`.
 
-You can refer to their website for more information.
+You can refer to their websites for more information.
 
 All processed data will appear in the `data` directory organized by the datasets, where each of them contains the files with the same file names in the `sample_data` folder.

data_preprocessing/getdata.sh (+1)

@@ -1,6 +1,7 @@
 ############## process data ##############
 
 # download Universal Dependencies 2.4
+# If this step fails, you can manually download the file and put it under this directory
 wget https://lindat.mff.cuni.cz/repository/xmlui/bitstream/handle/11234/1-2988/ud-treebanks-v2.4.tgz
 
 tar zxvf ud-treebanks-v2.4.tgz
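The new comment in `getdata.sh` suggests a manual fallback when the `wget` step fails. The same logic can be sketched in Python — a hedged illustration, not part of the repository; the `local/` directory for a hand-placed archive is an assumption:

```python
import os
import urllib.request


def fetch_with_fallback(url: str, local_dir: str = "local") -> str:
    """Try to download `url` into the current directory; on failure,
    look for a manually placed copy under `local_dir`.

    Returns "downloaded", "local", or "missing".
    """
    name = url.rsplit("/", 1)[-1]
    try:
        urllib.request.urlretrieve(url, name)
        return "downloaded"
    except OSError:  # URLError/HTTPError are OSError subclasses
        if os.path.exists(os.path.join(local_dir, name)):
            return "local"
        return "missing"
```

If the download fails and no local copy exists, the caller can stop and print the manual-download instruction instead of letting `tar` fail on a missing archive.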

requirements.txt (+12)

@@ -0,0 +1,12 @@
+torch==1.1.0
+tensorflow-gpu==1.13.1
+tqdm
+nltk
+pandas
+boto3
+requests
+regex
+seqeval
+psutil
+cython
+benepar[cpu]
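The new `requirements.txt` mixes exact pins (`torch==1.1.0`, `tensorflow-gpu==1.13.1`) with unpinned names and an extras marker (`benepar[cpu]`). A tiny parser makes that distinction explicit — a sketch for illustration only; `parse_requirement` is not part of this repository:

```python
def parse_requirement(line):
    """Split a requirements.txt line into (name, version).

    Handles `name==version` pins and bare names (version is None);
    extras such as `benepar[cpu]` stay attached to the name.
    Returns None for blank lines and comments.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    if "==" in line:
        name, version = line.split("==", 1)
        return name.strip(), version.strip()
    return line, None
```

For example, `parse_requirement("tensorflow-gpu==1.13.1")` yields `("tensorflow-gpu", "1.13.1")`, while `parse_requirement("benepar[cpu]")` yields `("benepar[cpu]", None)`, signalling that pip may resolve any available version for the unpinned entries.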
