Feat/bagit #26

Open
wants to merge 43 commits into
base: master
67d8abc
chore(dev_settings): keep config that run locally
giangbui Mar 22, 2018
56f9af1
Merge branch 'master' of https://github.com/uc-cdis/peregrine
giangbui Mar 22, 2018
7595ebf
fix(unittes): change unittest since due to submitter fixture
giangbui Mar 22, 2018
0d6a577
feat(bagit): clean up the code
giangbui Mar 23, 2018
af6fd0a
fix(bagit): fix bagit response
giangbui Mar 23, 2018
f8ec016
fix(bug): fix bug
giangbui Mar 23, 2018
d60fc10
chore(cleanup): clean up the code
giangbui Mar 23, 2018
6907e65
chore(code): refactor code
giangbui Mar 26, 2018
e282e1a
feat(alias): support alias
giangbui Mar 26, 2018
4db6a94
fix(bug): fix bugs
giangbui Mar 26, 2018
c3e1ef8
feat(streamming): support data streaming
giangbui Mar 28, 2018
09e5038
fix(tmpfile): cleanup after sending file
giangbui Mar 28, 2018
3b7be5c
feat(bagit): support generate multiple tsv files
giangbui Mar 28, 2018
c368680
fix(bagit): change output tsv names
giangbui Mar 28, 2018
3779f8e
chore(bagit): change method names
giangbui Mar 28, 2018
6b6ec5c
chore(column): remove columns which contain all None
giangbui Apr 4, 2018
ac0a183
chore(dep): bump dictutils
philloooo Apr 13, 2018
8853972
Merge branch 'chore/dep' into feat/bagit
philloooo Apr 13, 2018
459b84d
feat(bagit): generate fetch.txt
giangbui Apr 13, 2018
98b6ae3
Merge branch 'feat/bagit' of https://github.com/uc-cdis/peregrine int…
giangbui Apr 13, 2018
aa0307c
fix(dependencies): add indexclient to requirment.txt
giangbui Apr 14, 2018
0636695
feat(uuid): add size and path to bag
giangbui Apr 14, 2018
7a37ab2
fix(docker): fix dockerfile
giangbui Apr 14, 2018
66adfda
fix(requirments): update requirments.txt
giangbui Apr 14, 2018
48fff4c
feat(presignUrl): push bag to s3 and return presigned url
giangbui Apr 25, 2018
c0ce565
fix(requirement): add boto3
giangbui Apr 25, 2018
4a75878
chore(refactor): refactor code
giangbui Apr 25, 2018
914d2c2
feat(sort): sort headers
zflamig Jun 14, 2018
3065b57
feat(entity): use entity name
zflamig Jun 14, 2018
df4bd6e
feat(guid): detect data guids
zflamig Jun 15, 2018
711aabf
feat(dos): uri
zflamig Jun 15, 2018
aae2a21
feat(dos): hack for dos uri
zflamig Jun 15, 2018
d054277
fix(dos): proper search
zflamig Jun 18, 2018
0e43629
fix(uuid): search for UUID still
zflamig Jul 6, 2018
5a9eaa3
feat(tweak): tweak timeout
zflamig Jul 15, 2018
13cbacc
feat(cleanup): clean up a few things
zflamig Jul 16, 2018
5871e85
Merge branch 'master' into feat/bagit
zflamig Jul 19, 2018
0621490
feat(unique-id)
paulineribeyre Sep 26, 2018
ee37322
feat(unique-id): generate a uuid for each row
paulineribeyre Sep 26, 2018
83ed8e8
feat(missing-files): ignore missing files
paulineribeyre Sep 26, 2018
32fd73c
fix(missing-files): fix typo
paulineribeyre Sep 26, 2018
a16f7da
fix(missing-files): not checking if there is a 'file' header because …
paulineribeyre Sep 27, 2018
61f63ce
Merge branch 'master' into feat/bagit
zflamig Mar 7, 2019
1 change: 1 addition & 0 deletions deployment/uwsgi/uwsgi.ini
Original file line number Diff line number Diff line change
@@ -6,6 +6,7 @@ chmod-socket = 666
master = true
processes = 2
harakiri-verbose = true
disable-logging = true
harakiri = 45
http-timeout = 45
socket-timeout = 45
3 changes: 0 additions & 3 deletions dev-requirements.txt
@@ -9,10 +9,7 @@ codacy-coverage
moto==0.4.5
Sphinx==1.3.1
sphinxcontrib-httpdomain==1.3.0
-e git+https://git@github.com/uc-cdis/indexclient.git@1.0#egg=indexclient
-e git+https://git@github.com/NCI-GDC/signpost.git@c8e2aa5ff572c808cba9b522b64f7b497e79c524#egg=signpost
-e git+https://git@github.com/uc-cdis/cdisutils-test.git@0.0.1#egg=cdisutilstest
-e git+https://git@github.com/uc-cdis/flask-postgres-session.git@0.1.1#egg=flask_postgres_session
# dependency of sheepdog
envelopes==0.4
-e git+https://git@github.com/uc-cdis/sheepdog.git@1.1.1#egg=sheepdog
20 changes: 20 additions & 0 deletions dockerrun.bash
@@ -0,0 +1,20 @@
#!/bin/bash

cd /var/www/peregrine

export PYTHONUNBUFFERED=TRUE

(
# Wait for nginx to create uwsgi.sock
let count=0
while [[ (! -e uwsgi.sock) && count -lt 10 ]]; do
sleep 2
let count="$count+1"
done
if [[ ! -e uwsgi.sock ]]; then
echo "WARNING: /var/www/peregrine/uwsgi.sock does not exist!!!"
fi
uwsgi --ini /etc/uwsgi/uwsgi.ini
) &

nginx -g 'daemon off;'
8 changes: 8 additions & 0 deletions peregrine/api.py
@@ -12,6 +12,7 @@
import datamodelutils
from dictionaryutils import DataDictionary, dictionary as dict_init
from cdispyutils.log import get_handler
from indexclient.client import IndexClient
from cdispyutils.uwsgi import setup_user_harakiri

import peregrine
@@ -66,6 +67,13 @@ def db_init(app):
)


app.logger.info('Initializing Indexd driver')
app.index_client = IndexClient(
app.config['SIGNPOST']['host'],
version=app.config['SIGNPOST']['version'],
auth=app.config['SIGNPOST']['auth'])


# Set CORS options on app configuration
def cors_init(app):
accepted_headers = [
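Note on the `db_init` change above: the new Indexd client is constructed from the `SIGNPOST` settings block (host, version, auth) that the settings files in this PR define. The following is only a minimal sketch of that wiring, not peregrine code. `StubIndexClient`, `load_signpost_settings`, and `make_index_client` are hypothetical names introduced here so the example runs without `indexclient` installed; the stub merely mirrors the three constructor arguments visible in the diff.

```python
import os


class StubIndexClient(object):
    """Hypothetical stand-in that accepts the same arguments as the call in the diff."""

    def __init__(self, baseurl, version="v0", auth=None):
        self.url = baseurl
        self.version = version
        self.auth = auth


def load_signpost_settings(environ=None):
    """Build a SIGNPOST dict shaped like the one in the settings files."""
    environ = os.environ if environ is None else environ
    return {
        "host": environ.get("SIGNPOST_HOST", "http://localhost:8888"),
        "version": "v0",
        "auth": None,
    }


def make_index_client(config, client_cls=StubIndexClient):
    """Create a client from config['SIGNPOST']; client_cls is injectable for tests."""
    signpost = config["SIGNPOST"]
    return client_cls(
        signpost["host"],
        version=signpost["version"],
        auth=signpost["auth"],
    )
```

Passing the client class as a parameter is an illustrative choice: it would let unit tests exercise the wiring without a running index service.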
40 changes: 17 additions & 23 deletions peregrine/dev_settings.example.py
@@ -4,7 +4,14 @@

# Auth
AUTH = 'https://gdc-portal.nci.nih.gov/auth/keystone/v3/'
INTERNAL_AUTH = env.get('INTERNAL_AUTH', 'https://gdc-portal.nci.nih.gov/auth/')
INTERNAL_AUTH = env.get(
'INTERNAL_AUTH', 'https://gdc-portal.nci.nih.gov/auth/')

# Signpost
SIGNPOST = {
'host': env.get('SIGNPOST_HOST', 'http://localhost:8888'),
'version': 'v0',
'auth': None}

AUTH_ADMIN_CREDS = {
'domain_name': env.get('KEYSTONE_DOMAIN'),
@@ -13,31 +20,18 @@
'auth_url': env.get('KEYSTONE_AUTH_URL'),
'user_domain_name': env.get('KEYSTONE_DOMAIN')}

# Storage
CLEVERSAFE_HOST = env.get('CLEVERSAFE_HOST', 'cleversafe.service.consul')
STORAGE = {
"s3":
{
"access_key": '',
'secret_key': ''
}
}

STORAGE = {"s3": {
"keys": {
"cleversafe.service.consul": {
"access_key": os.environ.get('CLEVERSAFE_ACCESS_KEY'),
'secret_key': os.environ.get('CLEVERSAFE_SECRET_KEY')},
"localhost": {
"access_key": os.environ.get('CLEVERSAFE_ACCESS_KEY'),
'secret_key': os.environ.get('CLEVERSAFE_SECRET_KEY')},
}, "kwargs": {
'cleversafe.service.consul': {
'host': 'cleversafe.service.consul',
"is_secure": False,
"calling_format": OrdinaryCallingFormat()},
'localhost': {
'host': 'localhost',
"is_secure": False,
"calling_format": OrdinaryCallingFormat()},
}}}
SUBMISSION = {
"bucket": 'test_submission',
"host": CLEVERSAFE_HOST,
"bucket": ''
}

# Postgres
PSQLGRAPH = {
'host': os.getenv("GDC_PG_HOST", "localhost"),
79 changes: 45 additions & 34 deletions peregrine/dev_settings.py
@@ -2,15 +2,16 @@
from boto.s3.connection import OrdinaryCallingFormat
from os import environ as env

# Signpost
SIGNPOST = {
'host': env.get('SIGNPOST_HOST', 'http://localhost:8888'),
'version': 'v0',
'auth': None}

# Auth
AUTH = 'https://gdc-portal.nci.nih.gov/auth/keystone/v3/'
INTERNAL_AUTH = env.get('INTERNAL_AUTH', 'https://gdc-portal.nci.nih.gov/auth/')
INTERNAL_AUTH = env.get(
'INTERNAL_AUTH', 'https://gdc-portal.nci.nih.gov/auth/')

# Signpost
SIGNPOST = {
'host': env.get('SIGNPOST_HOST', 'http://localhost:8888'),
'version': 'v0',
'auth': None}

AUTH_ADMIN_CREDS = {
'domain_name': env.get('KEYSTONE_DOMAIN'),
@@ -22,53 +23,63 @@
# Storage
CLEVERSAFE_HOST = env.get('CLEVERSAFE_HOST', 'cleversafe.service.consul')

STORAGE = {"s3": {
"keys": {
"cleversafe.service.consul": {
"access_key": os.environ.get('CLEVERSAFE_ACCESS_KEY'),
'secret_key': os.environ.get('CLEVERSAFE_SECRET_KEY')},
"localhost": {
"access_key": os.environ.get('CLEVERSAFE_ACCESS_KEY'),
'secret_key': os.environ.get('CLEVERSAFE_SECRET_KEY')},
}, "kwargs": {
'cleversafe.service.consul': {
'host': 'cleversafe.service.consul',
"is_secure": False,
"calling_format": OrdinaryCallingFormat()},
'localhost': {
'host': 'localhost',
"is_secure": False,
"calling_format": OrdinaryCallingFormat()},
}}}
STORAGE = {
"s3":
{
"access_key": '',
'secret_key': ''
}
}


SUBMISSION = {
"bucket": 'test_submission',
"host": CLEVERSAFE_HOST,
"bucket": 'test_submission'
}
# Postgres
PSQLGRAPH = {
'host': os.getenv("GDC_PG_HOST", "localhost"),
'user': os.getenv("GDC_PG_USER", "test"),
'password': os.getenv("GDC_PG_PASSWORD", "test"),
'database': os.getenv("GDC_PG_DBNAME", "automated_test")
'database': os.getenv("GDC_PG_DBNAME", "sheepdog_automated_test")
}

# API server
PEREGRINE_HOST = os.getenv("PEREGRINE_HOST", "localhost")
PEREGRINE_PORT = int(os.getenv("PEREGRINE_PORT", "5000"))
PEREGRINE_PORT = int(os.getenv("PEREGRINE_PORT", "5555"))

# FLASK_SECRET_KEY should be set to a secure random string with an appropriate
# length; 50 is reasonable. For the random generation to be secure, use
# ``random.SystemRandom()``
FLASK_SECRET_KEY = 'eCKJOOw3uQBR5pVDz3WIvYk3RsjORYoPRdzSUNJIeUEkm1Uvtq'

DICTIONARY_URL = os.environ.get('DICTIONARY_URL','https://s3.amazonaws.com/dictionary-artifacts/datadictionary/develop/schema.json')
DICTIONARY_URL = os.environ.get(
'DICTIONARY_URL', 'https://s3.amazonaws.com/dictionary-artifacts/datadictionary/develop/schema.json')

OIDC_ISSUER = 'http://localhost/user'

HMAC_ENCRYPTION_KEY = os.environ.get('CDIS_HMAC_ENCRYPTION_KEY', '')
# OAUTH2 = {
# "client_id": os.environ.get('CDIS_PEREGRINE_CLIENT_ID'),
# "client_secret": os.environ.get("CDIS_PEREGRINE_CLIENT_SECRET"),
# "oauth_provider": os.environ.get("CDIS_USER_API_OAUTH", 'http://localhost:8000/oauth2/'),
# "redirect_uri": os.environ.get("CDIS_PEREGRINE_OAUTH_REDIRECT", 'localhost:5000/v0/oauth2/authorize'),
#}

OAUTH2 = {
"client_id": os.environ.get('CDIS_PEREGRINE_CLIENT_ID'),
"client_secret": os.environ.get("CDIS_PEREGRINE_CLIENT_SECRET"),
"oauth_provider": os.environ.get("CDIS_USER_API_OAUTH", 'http://localhost:8000/oauth2/'),
"redirect_uri": os.environ.get("CDIS_PEREGRINE_OAUTH_REDIRECT", 'localhost:5000/v0/oauth2/authorize'),
'client_id': '',
'client_secret': '',
'api_base_url': 'http://localhost/user/',
'authorize_url': 'http://localhost/user/oauth2/authorize',
'access_token_url': 'http://localhost/user/oauth2/token',
'refresh_token_url': 'http://localhost/user/oauth2/token',
'client_kwargs': {
'redirect_uri': 'http://localhost/api/v0/oauth2/authorize',
'scope': 'openid data user',
},
# deprecated key values, should be removed after all commons use new oidc
'internal_oauth_provider': 'http://localhost/oauth2/',
'oauth_provider': 'http://localhost/user/oauth2/',
'redirect_uri': 'http://localhost/api/v0/oauth2/authorize'
}

USER_API = "http://localhost:8000/"
44 changes: 40 additions & 4 deletions peregrine/resources/submission/__init__.py
@@ -3,7 +3,14 @@
:py:mod:``peregrine``.
"""

import datetime
import os
import os.path

import uuid
import shutil
from flask import Response, send_file, stream_with_context

import json
import time
import fcntl
@@ -14,9 +21,11 @@

from peregrine.auth import current_user, get_program_project_roles
import peregrine.blueprints
from peregrine.utils import jsonify_check_errors
from peregrine.resources.submission import graphql



def get_open_project_ids():
"""
List project ids corresponding to projects with ``availability_type ==
@@ -118,23 +127,50 @@ def set_read_access_projects():
@peregrine.blueprints.blueprint.route('/graphql', methods=['POST'])
def root_graphql_query():
"""
Run a graphql query.
Run a graphql query and export to supported formats (json, bdbag).

"""
# Short circuit if user is not recognized. Make sure that the list of
# projects that the user has read access to is set.

try:
set_read_access_projects()
except AuthZError:
data = flask.jsonify({'data': {}, 'errors': ['Unauthorized query.']})
return data, 403
payload = peregrine.utils.parse_request_json()
query = payload.get('query')
export_format = payload.get('format')
variables, errors = peregrine.utils.get_variables(payload)
if errors:
return flask.jsonify({'data': None, 'errors': errors}), 400
return peregrine.utils.jsonify_check_errors(
graphql.execute_query(query, variables)
)

return_data = jsonify_check_errors(graphql.execute_query(query, variables))
data, code = return_data

if code != 200:
return data, code

if export_format == 'bdbag':
res = peregrine.utils.flatten_json(json.loads(data.data), '', "-")

bag_info = {'organization': 'CDIS',
'data_type': 'TOPMed',
'date_created': datetime.date.today().isoformat()}
args = dict(
bag_info=bag_info,
payload=res)

bag = peregrine.utils.create_bdbag(**args) # bag is a compressed file
key_name = str(flask.g.user.id) + "/" + \
str(uuid.uuid4()) + '_' + datetime.datetime.now().strftime('%s')
peregrine.utils.put_data_to_s3(bag, key_name)
url = peregrine.utils.generate_presigned_url(key_name)
shutil.rmtree(os.path.abspath(os.path.join(bag, os.pardir)))

return flask.Response(url), 200
else:
return return_data
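Note on the `bdbag` branch above: the handler names the S3 object `<user id>/<uuid4>_<timestamp>` and then deletes the bag's parent directory. Two details may deserve a second look. The `'%s'` directive is not among Python's documented `strftime` codes and depends on the platform C library, and the `rmtree` call is skipped if the upload raises. The sketch below shows one possible alternative under those assumptions; `make_key_name` and `export_and_cleanup` are hypothetical helpers, and the `build_archive` and `upload` callables stand in for `create_bdbag` and the S3 utilities, whose real signatures are not shown in this diff.

```python
import os
import shutil
import tempfile
import time
import uuid


def make_key_name(user_id):
    """Build '<user id>/<uuid4>_<epoch seconds>' without relying on strftime('%s')."""
    return "{}/{}_{}".format(user_id, uuid.uuid4(), int(time.time()))


def export_and_cleanup(build_archive, upload):
    """Create a work directory, build and upload an archive, then always clean up.

    build_archive(workdir) must return the archive path; upload(path) returns
    whatever the caller wants to hand back (for example a presigned URL).
    """
    workdir = tempfile.mkdtemp()
    try:
        archive_path = build_archive(workdir)
        return upload(archive_path)
    finally:
        # Runs on success and on failure, so temporary files do not accumulate.
        shutil.rmtree(workdir, ignore_errors=True)
```

Owning the temporary directory in one place also avoids deriving it from the archive path with `os.pardir`.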


def generate_schema_file(graphql_schema, app_logger):
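Note on `flatten_json(json.loads(data.data), '', "-")`: the diff does not include the body of `peregrine.utils.flatten_json`, so its exact output is unknown here. As a rough illustration of what flattening a nested GraphQL response with a `-` separator could look like, here is a sketch. `flatten_nested` is a hypothetical function written for this note; in particular, indexing list items by position is an assumption, not something the PR confirms.

```python
def flatten_nested(obj, prefix="", separator="-"):
    """Flatten nested dicts and lists into one dict keyed by joined paths."""
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        items = enumerate(obj)
    else:
        # Leaf value: store it under the path accumulated so far.
        return {prefix: obj}

    flat = {}
    for key, value in items:
        path = "{}{}{}".format(prefix, separator, key) if prefix else str(key)
        flat.update(flatten_nested(value, path, separator))
    return flat


response = {"data": {"case": [{"id": "a", "n": 1}, {"id": "b"}]}}
flat = flatten_nested(response)
# flat == {"data-case-0-id": "a", "data-case-0-n": 1, "data-case-1-id": "b"}
```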
6 changes: 5 additions & 1 deletion peregrine/utils/__init__.py
@@ -1,2 +1,6 @@
from .payload import get_variables,jsonify_check_errors,parse_request_json
from .payload import get_variables,jsonify_check_errors,parse_request_json,get_keys,contain_node_with_category
from .pybdbag import create_bdbag
from .scheduling import AsyncPool
from .json2csv import flatten_obj, json2tsv, dicts2tsv, flatten_json
from .response import format_response
from .s3 import put_data_to_s3, generate_presigned_url