Read a file of random tokens and store them in a database as quickly and efficiently as possible, without storing any token twice, and produce a list of all non-unique tokens.
- python3.6
- postgres
- ubuntu 18.04
-
install postgres:
sudo apt update
sudo apt install postgresql postgresql-contrib
-
create python env:
virtualenv -p python3.6 .token-env
-
activate it:
. .token-env/bin/activate
-
install python requirements:
pip install -r requirements.txt
-
Create Database
-
sudo -i -u postgres
-
createdb test-db
-
psql test-db
-
change the postgres authentication method !!
from peer to md5 (or trust) inside the file:
/etc/postgresql/10/main/pg_hba.conf
local all postgres trust
ref: https://gist.github.com/AtulKsol/4470d377b448e56468baef85af7fd614
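for example (assuming the stock Ubuntu 18.04 / postgres 10 install):
sudo nano /etc/postgresql/10/main/pg_hba.conf
# change:  local   all   postgres   peer
# to:      local   all   postgres   trust
sudo systemctl restart postgresql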
-
-
token generator:
generate_tokens.py
has the function generate_tokens which does the task (a rough sketch follows below)
python generate_tokens.py
args:
--token_len : default 7
--num : default 10 million
--file : default tokens.txt
--method : choices [parallel, sequential]
--secure : if present, a more cryptographically secure method is used for random generation
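A minimal sketch of the idea, assuming an alphanumeric alphabet; the names, defaults, and single-process flow are assumptions rather than the actual generate_tokens.py, and the parallel --method variant is omitted for brevity:

```python
import argparse
import random
import string

ALPHABET = string.ascii_letters + string.digits

def generate_tokens(num, token_len, path, secure=False):
    # random.SystemRandom() draws from os.urandom(); plain random is a faster PRNG
    rng = random.SystemRandom() if secure else random
    with open(path, "w") as f:
        for _ in range(num):
            f.write("".join(rng.choices(ALPHABET, k=token_len)) + "\n")

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--token_len", type=int, default=7)
    parser.add_argument("--num", type=int, default=10_000_000)
    parser.add_argument("--file", default="tokens.txt")
    parser.add_argument("--secure", action="store_true")
    args = parser.parse_args()
    generate_tokens(args.num, args.token_len, args.file, args.secure)
```

The only thing the --secure flag changes is which random source backs choices.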
-
token reader:
read_tokens.py
has the function read_tokens_postgres (and the database schema); a rough sketch follows below
example:
python read_tokens.py
args:
--file : default tokens.txt
--database : currently unused
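A minimal sketch of how duplicates can be avoided and collected, assuming psycopg2 and the local test-db created above; the table name and schema here are illustrative, not necessarily the ones in read_tokens.py:

```python
import psycopg2

def read_tokens_postgres(path, dbname="test-db", user="postgres"):
    # connects over the local unix socket, matching the "local ... trust" rule above
    conn = psycopg2.connect(dbname=dbname, user=user)
    cur = conn.cursor()
    # illustrative schema: the PRIMARY KEY lets postgres reject duplicates for us
    cur.execute("CREATE TABLE IF NOT EXISTS tokens (token TEXT PRIMARY KEY)")
    duplicates = []
    with open(path) as f:
        for line in f:
            token = line.strip()
            if not token:
                continue
            # ON CONFLICT DO NOTHING skips tokens already stored;
            # rowcount == 0 means this token was seen before
            cur.execute(
                "INSERT INTO tokens (token) VALUES (%s) ON CONFLICT DO NOTHING",
                (token,),
            )
            if cur.rowcount == 0:
                duplicates.append(token)
    conn.commit()
    cur.close()
    conn.close()
    return duplicates

if __name__ == "__main__":
    dups = read_tokens_postgres("tokens.txt")
    print(f"{len(dups)} non-unique tokens")
```

For 10 million tokens a row-at-a-time INSERT is slow in practice; batching (executemany, or COPY into a staging table and de-duplicating in SQL) keeps the same conflict-handling idea while being much faster.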
-
token generator performance
random.SystemRandom().choices uses os.urandom(), which generates operating-system-dependent random bytes that can safely be considered cryptographically secure
|  | PRNG algorithm (random.choices) | CSPRNG algorithm (random.SystemRandom().choices) |
| --- | --- | --- |
| Sequential | ~20 seconds | ~190 seconds |
| Parallel | ~6 seconds | ~130 seconds |
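The gap comes purely from the random source behind choices; an illustrative timing snippet (not part of the repo):

```python
import random
import string
import timeit

ALPHABET = string.ascii_letters + string.digits
secure_rng = random.SystemRandom()

# PRNG: Mersenne Twister, fast but not cryptographically secure
prng = lambda: "".join(random.choices(ALPHABET, k=7))
# CSPRNG: backed by os.urandom(), slower but safe for secret tokens
csprng = lambda: "".join(secure_rng.choices(ALPHABET, k=7))

print("PRNG  :", timeit.timeit(prng, number=100_000))
print("CSPRNG:", timeit.timeit(csprng, number=100_000))
```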
-
token reader performance
~40 seconds (reading and counting duplicates)