Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

updated models and load functions for almeida #18

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

sherrie9
Copy link
Collaborator

No description provided.

Copy link
Member

@dfornika dfornika left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great @sherrie9. I've added a few notes, mostly just asking to remove lines of commented-out code if they're not needed.

tb_db/parsers.py Outdated

with open(speciation_path, 'r') as f:
reader = csv.DictReader(f)
count = 0
#count = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove this count variable if it isn't going to be used rather than leaving it commented out

tb_db/parsers.py Outdated
Comment on lines 324 to 325
#species.append(speci)
#count+=1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove these lines?

tb_db/parsers.py Outdated
qcs = []
with open(qc_path, 'r') as f:
reader = csv.DictReader(f)
for row in reader:
qc = {
'sample_id': row['sample_id'],
'sample_name': row['sample_id'],
#'sample_name': row['sample_id'],
'sequencing_run_id' : sequencing_run_id,
'most_abundant_species_name':row['most_abundant_species_name'],
'most_abundant_species_fraction_total_reads': float(row['most_abundant_species_fraction_total_reads']),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is optional but I have a suggestion:

I try to wrap conversions to int and float in try / catch blocks like this:

try:
    total_bases = int(row['total_bases'])
except ValueError as e:
    total_bases = None

If there are several fields that need to be converted then I'll do something like this:

float_fields = [
    'average_base_quality',
    'percent_bases_above_q30',
    'percent_gc',
]
for field in float_fields:
    try:
        row[field] = float(row[field])
    except ValueError as e:
        row[field] = None

tb_db/parsers.py Outdated
@@ -185,7 +185,7 @@ def parse_cgmlst(cgmlst_path: str, uncalled='-'):
reader = csv.DictReader(f)
for row in reader:
sample_id = row.pop('sample_id')
sample_id = sample_id[:6]
#sample_id = sample_id
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we remove this line?

tb_db/models.py Outdated
Comment on lines 60 to 61
#accession = Column(String)
#collection_date = Column(Date)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove these if they aren't being used

tb_db/models.py Outdated
@@ -71,9 +71,9 @@ class Library(Base):
"""

sample_id = Column(Integer, ForeignKey("sample.id"), nullable=False)
sample_name = Column(String)
#sample_name = Column(String)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we remove this?

tb_db/models.py Outdated
sequencing_run_id = Column(String)
library_id = Column(String)
#library_id = Column(String)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we remove this?

@@ -349,7 +349,7 @@ def create_cgmlst_profile(self, sample, profile,runid):
assert(created_cgmlst_profile.library_id == created_libraries[0].id)



TestSampleCrudMachine = SampleCrudMachine.TestCase
#comment the below out as it says 'dict' object has no attribute 'startswith' that i am not sure how to fix, refer to line 305 in crud.py
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll have to take a look at the overall testing processes and make sure everything passes, but I'm not going to let it block merging this. We can address that later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants