Merge pull request #44 from DataONEorg/develop
Release v0.1.1
Showing 24 changed files with 4,737 additions and 3,136 deletions.
@@ -2,6 +2,7 @@
instance/
dbs/
.vscode
nohup.out

docs/diagrams/C4-PlantUML
@@ -0,0 +1,44 @@
#uWSGI configuration for mnlite
[uwsgi]
strict = true
master = true
processes = 5
enable-threads = true
vacuum = true ; Delete sockets during shutdown
single-interpreter = true
die-on-term = true ; Shutdown when receiving SIGTERM (default is respawn)
need-app = true

#disable-logging = true ; Disable built-in logging
#log-4xx = true ; but log 4xx's anyway
#log-5xx = true ; and 5xx's

##harakiri = 60 ; forcefully kill workers after 60 seconds
#py-callos-afterfork = true ; allow workers to trap signals

##max-requests = 1000 ; Restart workers after this many requests
##max-worker-lifetime = 3600 ; Restart workers after this many seconds
##reload-on-rss = 2048 ; Restart workers after this much resident memory
##worker-reload-mercy = 60 ; How long to wait before forcefully killing workers

#cheaper-algo = busyness
#processes = 128 ; Maximum number of workers allowed
#cheaper = 8 ; Minimum number of workers allowed
#cheaper-initial = 16 ; Workers created at startup
#cheaper-overload = 1 ; Length of a cycle in seconds
#cheaper-step = 16 ; How many workers to spawn at a time
#cheaper-busyness-multiplier = 30 ; How many cycles to wait before killing workers
#cheaper-busyness-min = 20 ; Below this threshold, kill workers (if stable for multiplier cycles)
#cheaper-busyness-max = 70 ; Above this threshold, spawn new workers
##cheaper-busyness-backlog-alert = 16 ; Spawn emergency workers if more than this many requests are waiting in the queue
##cheaper-busyness-backlog-step = 2 ; How many emergency workers to create if there are too many requests in the queue

##plugins = python
##virtualenv = /home/mnlite/miniconda3/envs/mnlite
module = mnlite:create_app()
socket = /home/mnlite/WORK/mnlite/mnlite/tmp/mnlite.sock
chmod-socket = 664

#stats = /tmp/stats.socket
##stats = 127.0.0.1:9191
##stats-http = true
@@ -0,0 +1,66 @@
# mnonboard

This module provides a wrapper around `opersist` and `mnlite` to streamline the [DataONE member node onboarding process](https://github.com/DataONEorg/mnlite/blob/feature/onboarding/docs/operation.md).
It takes as input either a JSON document edited manually from a template or direct user input, which it converts to a JSON document.

## Usage

This script requires working installations of both [sonormal](https://github.com/datadavev/sonormal) and [mnlite](https://github.com/DataONEorg/mnlite) to function properly.

### CLI options

```
Usage: cli [ OPTIONS ]
where OPTIONS := {
    -c | --check=[ NUMBER ]
            number of random metadata files to check for schema.org compliance
    -d | --dump=[ FILE ]
            dump default member node json file to configure manually
    -h | --help
            display this help message
    -i | --init
            initialize a new member node from scratch
    -l | --load=[ FILE ]
            initialize a new member node from a json file
    -P | --production
            run this script in production mode (uses the D1 cn API in searches)
    -L | --local
            run this script in local mode (will not scrape the remote site for new metadata)
}
```
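
As a rough illustration of how these options can be parsed, here is a minimal, self-contained sketch using Python's standard `getopt` module. The `parse_args` helper and the defaults in the dict it returns are assumptions made for this example only; the script itself fills a shared `CFG` dict imported from `mnonboard.defs`, and `-h` and `-d` are accepted here but not acted on.

```python
import getopt


def parse_args(argv):
    """Hypothetical helper: map the CLI options onto a plain config dict."""
    # Assumed defaults for illustration; the real defaults live in mnonboard.defs.
    cfg = {'info': None, 'json_file': None, 'mode': 'testing',
           'local': False, 'check_files': None}
    # Each long option is its own list item; options that take a value end in '='.
    opts, _ = getopt.getopt(argv, 'hiPLd:l:c:',
                            ['help', 'init', 'production', 'local',
                             'dump=', 'load=', 'check='])
    for o, a in opts:
        if o in ('-i', '--init'):
            cfg['info'] = 'user'
        elif o in ('-l', '--load'):
            cfg['info'], cfg['json_file'] = 'json', a
        elif o in ('-P', '--production'):
            cfg['mode'] = 'production'
        elif o in ('-L', '--local'):
            cfg['local'] = True
        elif o in ('-c', '--check'):
            # 'all' is passed through unchanged; anything else must be an integer
            cfg['check_files'] = a if a == 'all' else int(a)
    return cfg


print(parse_args(['-P', '-l', 'node.json', '--check=5']))
```

Because the mode is only ever switched to `production`, never reset, the order in which `-P` appears relative to the other options does not matter in this sketch.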

### Onboarding process

Let's say you are in the `mnlite` base directory.
Start by activating the `mnlite` virtual environment and changing the working directory to `./mnonboard`:

```
workon mnlite
cd mnonboard
```

**Note:** Node data is stored in `instance/nodes/<NODENAME>`
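
In `cli.py`, the directory name is taken from the last colon-separated segment of the node's `node_id`. The following is a minimal sketch of that mapping. The `node_dir` helper and the identifier `urn:node:BONARES` are assumptions for illustration; the script delegates the real path handling to `utils.node_path`, which is not shown in this diff.

```python
import os

NODE_PATH_REL = 'instance/nodes/'  # relative to the mnlite base directory


def node_dir(node_id, base='.'):
    """Hypothetical helper: build the storage path for a node."""
    # Keep only the final segment, e.g. 'urn:node:BONARES' -> 'BONARES'
    name = node_id.split(':')[-1]
    return os.path.join(base, NODE_PATH_REL, name)


print(node_dir('urn:node:BONARES'))
```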

#### Using an existing `node.json`

To onboard a member node with an existing `node.json` file:

```
python cli.py -l ../instance/nodes/BONARES/node.json
```

The script will guide you through the steps to set up the node and harvest its metadata.

#### No existing `node.json`

The script can also ask the user questions to set up the `node.json` file in an assisted manner. To do so, use the `-i` (initialize) flag:

```
python cli.py -i
```

Keep in mind that you should always check the `node.json` file to ensure correct values.
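
One quick way to do such a check is to confirm that the fields the script reads are present and non-empty. The sketch below is an illustration under stated assumptions, not the script's own validation (that is handled by `info_chx.input_test`): the `missing_fields` helper is hypothetical, the example document is made up, and the key list covers only the fields that `cli.py` reads directly.

```python
import json

# Keys read by cli.py: two at the top level, two under 'node'
TOP_LEVEL = ('default_owner', 'default_submitter')
NODE_LEVEL = ('node_id', 'contact_subject')


def missing_fields(fields):
    """Hypothetical helper: list required keys that are absent or empty."""
    missing = [k for k in TOP_LEVEL if not fields.get(k)]
    node = fields.get('node') or {}
    missing += ['node.' + k for k in NODE_LEVEL if not node.get(k)]
    return missing


# Made-up document for illustration; a real one would be loaded from node.json
example = json.loads(
    '{"node": {"node_id": "urn:node:EXAMPLE"}, "default_owner": "someone"}'
)
print(missing_fields(example))
```

An empty list means that all four fields are set; it says nothing about whether their values are correct, so a manual review is still needed.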

## Other functionality

Coming soon (see [#21](https://github.com/DataONEorg/mnlite/issues/21))
@@ -0,0 +1,63 @@
import os
import logging
from datetime import datetime

from opersist.cli import LOG_DATE_FORMAT, LOG_FORMAT
from mnlite.mnode import DEFAULT_NODE_CONFIG

DEFAULT_JSON = DEFAULT_NODE_CONFIG

__version__ = 'v0.0.1'

LOG_FORMAT = "%(asctime)s %(funcName)s:%(levelname)s: %(message)s" # overrides import

FN_DATE = datetime.now().strftime('%Y-%m-%d')
HM_DATE = datetime.now().strftime('%Y-%m-%d-%H%M')
YM_DATE = datetime.now().strftime('%Y-%m')
LOG_DIR = '/var/log/mnlite/'
LOG_NAME = 'mnonboard-%s.log' % (FN_DATE)
LOG_LOC = os.path.join(LOG_DIR, LOG_NAME)

HARVEST_LOG_NAME = '-crawl-%s.log' % YM_DATE

def start_logging():
    """
    Initialize logger.
    :returns: The logger to use
    :rtype: logging.Logger
    """
    logger = logging.getLogger('mnonboard')
    logger.setLevel(logging.DEBUG)
    formatter = logging.Formatter(fmt=LOG_FORMAT, datefmt=LOG_DATE_FORMAT)
    s = logging.StreamHandler()
    s.setLevel(logging.INFO)
    s.setFormatter(formatter)
    # this initializes logging to file
    f = logging.FileHandler(LOG_LOC)
    f.setLevel(logging.DEBUG)
    f.setFormatter(formatter)
    # warnings also go to file
    # initialize logging
    logger.addHandler(s) # stream
    logger.addHandler(f) # file
    logger.info('----- mnonboard %s start -----' % __version__)
    return logger

L = start_logging()

# absolute path of current file
CUR_PATH_ABS = os.path.dirname(os.path.abspath(__file__))

# relative path from root of mnlite dir to nodes directory
NODE_PATH_REL = 'instance/nodes/'

def default_json(fx='Unspecified'):
    """
    A function that spits out a dict to be used in onboarding.
    :returns: A dict of values to be used in member node creation
    :rtype: dict
    """
    L.info('%s function loading default json template.' % (fx))
    return DEFAULT_JSON
@@ -0,0 +1,155 @@
import os, sys
import getopt
import time

from mnonboard import utils
from mnonboard import info_chx
from mnonboard import data_chx
from mnonboard import cn
from mnonboard.defs import CFG, HELP_TEXT, SO_SRVR, CN_SRVR, CN_SRVR_BASEURL, CN_CERT_LOC, APPROVE_SCRIPT_LOC
from mnonboard import default_json, L

def run(cfg):
    """
    Wrapper around opersist that simplifies the process of onboarding a new
    member node to DataONE.
    :param dict cfg: Dict containing config variables
    """
    # auth
    if not cfg['token']:
        cfg['token'] = os.environ.get('D1_AUTH_TOKEN')
    if not cfg['token']:
        print('Your DataONE auth token is missing. Please enter it here and/or store it in the env variable "D1_AUTH_TOKEN".')
        cfg['token'] = info_chx.req_input('Please enter your DataONE authentication token: ')
        os.environ['D1_AUTH_TOKEN'] = cfg['token']
    cfg['cert_loc'] = CN_CERT_LOC[cfg['mode']]
    DC = cn.init_client(cn_url=cfg['cn_url'], auth_token=cfg['token'])
    if cfg['info'] == 'user':
        # do the full user-driven info gathering process
        ufields = info_chx.user_input()
        fields = info_chx.transfer_info(ufields)
    else:
        # grab the info from a json
        fields = utils.load_json(cfg['json_file'])
        info_chx.input_test(fields)
        # still need to ask the user for some names
    # now we're cooking
    # get the node path using the end of the path in the 'node_id' field
    end_node_subj = fields['node']['node_id'].split(':')[-1]
    loc = utils.node_path(nodedir=end_node_subj)
    # initialize a repository there (step 5)
    utils.init_repo(loc)
    names = {}
    for f in ('default_owner', 'default_submitter', 'contact_subject'):
        # add a subject for owner and submitter (may not be necessary if they exist already)
        # add subject for technical contact (step 6)
        # 'contact_subject' lives under fields['node']; the others are top-level keys
        val = fields[f] if f != 'contact_subject' else fields['node'][f]
        name = utils.get_or_create_subj(loc=loc, value=val, cn_url=cfg['cn_url'], title=f)
        # store this for a few steps later
        names[val] = name
    # set the update schedule and set the state to up
    fields['node']['schedule'] = utils.set_schedule()
    fields['node']['state'] = 'up'
    # okay, now overwrite the default node.json with our new one (step 8)
    utils.save_json(loc=os.path.join(loc, 'node.json'), jf=fields)
    # add node as a subject (step 7)
    utils.get_or_create_subj(loc=loc, value=fields['node']['node_id'],
                             cn_url=cfg['cn_url'],
                             name=end_node_subj)
    # restart the mnlite process to pick up the new node.json (step 9)
    utils.restart_mnlite()
    # run scrapy to harvest metadata (step 10)
    if not cfg['local']:
        utils.harvest_data(loc, end_node_subj)
    # now run tests
    data_chx.test_mdata(loc, num_tests=cfg['check_files'])
    # create xml to upload for validation (step 15)
    files = utils.create_names_xml(loc, node_id=fields['node']['node_id'], names=names)
    # uploading xml (proceed to step 14 and ssh to find xml in ~/d1_xml)
    ssh, work_dir, node_id = utils.start_ssh(server=cfg['cn_url'],
                                             node_id=fields['node']['node_id'],
                                             loc=loc,
                                             ssh=cfg['ssh'])
    time.sleep(0.5)
    utils.upload_xml(ssh=ssh, server=CN_SRVR[cfg['mode']], files=files, node_id=node_id, loc=loc)
    # create and validate the subject in the accounts service (step 16)
    utils.create_subj_in_acct_svc(ssh=ssh, cert=cfg['cert_loc'], files=files, cn=cfg['cn_url'], loc=loc)
    utils.validate_subj_in_acct_svc(ssh=ssh, cert=cfg['cert_loc'], names=names, cn=cfg['cn_url'], loc=loc)
    # download the node capabilities and register the node
    node_filename = utils.dl_node_capabilities(ssh=ssh, baseurl=SO_SRVR[cfg['mode']], node_id=node_id, loc=loc)
    utils.register_node(ssh=ssh, cert=cfg['cert_loc'], node_filename=node_filename, cn=cfg['cn_url'], loc=loc)
    utils.approve_node(ssh=ssh, script_loc=APPROVE_SCRIPT_LOC, loc=loc)
    # close connection
    if ssh:
        ssh.close()

def main():
    """
    Uses getopt to set config values, then calls
    :py:func:`mnlite.mnonboard.cli.run` with the resulting config dict.
    """
    # get arguments
    try:
        opts = getopt.getopt(sys.argv[1:], 'hiPvLd:l:c:',
            ['help', 'init', 'production', 'verbose', 'local', 'dump=', 'load=', 'check=']
            )[0]
    except getopt.GetoptError as e:
        L.error('Error: %s' % e)
        print(HELP_TEXT)
        sys.exit(1)
    production = False
    for o, a in opts:
        if o in ('-h', '--help'):
            # help
            print(HELP_TEXT)
            sys.exit(0)
        if o in ('-i', '--init'):
            # do data gathering
            CFG['info'] = 'user'
        if o in ('-P', '--production'):
            # production case
            production = True
        if o in ('-d', '--dump'):
            # dump default json to file
            utils.save_json(a, default_json())
            sys.exit(0)
        if o in ('-l', '--load'):
            # load from json file
            CFG['info'] = 'json'
            CFG['json_file'] = a
        if o in ('-c', '--check'):
            try:
                CFG['check_files'] = int(a)
            except ValueError:
                if a == 'all': # this should probably not be used unless necessary!
                    CFG['check_files'] = a
                else:
                    L.error('Option -c (--check) requires an integer number of files to check.')
                    print(HELP_TEXT)
                    sys.exit(1)
        if o in ('-L', '--local'):
            CFG['local'] = True
            L.info('Local mode (-L) will not scrape the remote site and will only test local files.')
    # set the mode once, after all options are parsed, so that an option
    # following -P cannot reset it to testing
    mode = 'production' if production else 'testing'
    CFG['cn_url'] = CN_SRVR_BASEURL % CN_SRVR[mode]
    CFG['mode'] = mode
    L.info('running mnonboard in %s mode.\n'
           'data gathering from: %s\n'
           'cn_url: %s\n'
           'metadata files to check: %s' % (CFG['mode'],
                                            CFG['info'],
                                            CFG['cn_url'],
                                            CFG['check_files']))
    try:
        run(CFG)
    except KeyboardInterrupt:
        print()
        L.error('Caught KeyboardInterrupt, quitting...')
        sys.exit(1)

if __name__ == '__main__':
    main()