Website for Question Paper Search
# Count the total number of records
$ jq '. | length' data/data.json
# Count the links that are neither direct PDFs nor Google Drive links
$ jq -r '.[].Link' data/data.json | awk '!/pdf$/ && !/drive\.google\.com/' | wc -l
# Find the original number of papers
$ jq '.[].Link' data/data.json | wc -l
# Find the number of unique records
$ jq '.[].Link' data/data.json | sort -u | wc -l
# Subtract the result of the second command
# from the first to get the number of duplicates
# One-liner to find the number of duplicates:
# `uniq -D` prints every member of each duplicated group, while `uniq -d`
# prints one line per group, so their difference is the surplus count.
$ echo $(($(jq '.[].Link' data/data.json | sort | uniq -D | wc -l) - $(jq '.[].Link' data/data.json | sort | uniq -d | wc -l)))
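The same duplicate arithmetic can be sketched in Python. The sample records below are hypothetical, mirroring only the shape of data/data.json (a list of objects with a "Link" field):

```python
import json
from collections import Counter

# Hypothetical sample shaped like data/data.json (the real file is much larger).
records = json.loads("""[
  {"Link": "http://10.17.32.9/a.pdf"},
  {"Link": "http://10.17.32.9/b.pdf"},
  {"Link": "http://10.17.32.9/a.pdf"},
  {"Link": "http://10.17.32.9/a.pdf"}
]""")

counts = Counter(r["Link"] for r in records)
# `uniq -D` counts every member of a duplicated group; `uniq -d` counts
# each group once. Their difference is the number of surplus copies.
all_members = sum(c for c in counts.values() if c > 1)
groups = sum(1 for c in counts.values() if c > 1)
print(all_members - groups)  # 2 surplus copies of a.pdf
```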
Run the following from the data folder:
python3 ../scripts/pdfFinder.py data.json
You need Beautiful Soup 4 for that. To install it, run:
pip3 install beautifulsoup4 --user
This will update the data.json file with the PDF links found on the library site.
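The core of the PDF-link scraping can be sketched as follows. pdfFinder.py itself uses Beautiful Soup; this stdlib-only sketch shows the same idea on a hypothetical page, so the file names here are purely illustrative:

```python
from html.parser import HTMLParser

# Collect every <a href> ending in ".pdf" from an HTML listing page,
# the same extraction pdfFinder.py performs with Beautiful Soup.
class PdfLinkParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href", "")
            if href.lower().endswith(".pdf"):
                self.pdf_links.append(href)

# Hypothetical page resembling a directory listing on the library site.
page = '<a href="MA20101_2019.pdf">paper</a> <a href="index.html">back</a>'
parser = PdfLinkParser()
parser.feed(page)
print(parser.pdf_links)  # ['MA20101_2019.pdf']
```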
Then from the root directory of the repository, run:
python3 remove_dups.py
This will prune all duplicate entries.
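The pruning amounts to keeping the first record seen for each Link. A minimal sketch (remove_dups.py's actual logic lives in this repo; the sample records are hypothetical):

```python
# Keep the first record for each Link, drop later repeats, preserve order.
records = [
    {"Link": "a.pdf", "Year": "2019"},
    {"Link": "b.pdf", "Year": "2018"},
    {"Link": "a.pdf", "Year": "2019"},
]

seen, pruned = set(), []
for rec in records:
    if rec["Link"] not in seen:
        seen.add(rec["Link"])
        pruned.append(rec)

print(len(pruned))  # 2 records remain
```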
Library site is down? ( http://10.17.32.9 )
Run the following command, then commit the new data.json file and push it to this repository:
sed -i "s|http://10\.17\.32\.9|https://static.metakgp.org|g" data/data.json
or if you need to go back to the library site:
sed -i "s|https://static\.metakgp\.org|http://10.17.32.9|g" data/data.json
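The substitution the sed commands perform, shown on a single hypothetical link in Python:

```python
import re

# Rewrite the library-site host to the static mirror in one link;
# sed applies the same pattern across the whole data/data.json file.
link = "http://10.17.32.9/pdf/MA20101_2019.pdf"
mirrored = re.sub(r"http://10\.17\.32\.9", "https://static.metakgp.org", link)
print(mirrored)  # https://static.metakgp.org/pdf/MA20101_2019.pdf
```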
Licensed under GNU General Public License v3.0 (GPLv3).
Please read the CONTRIBUTING.md guide to learn more.