Can IsoCon be used on nontargeted Iso-Seq data sets? #2
Comments
Do you have an example of how to do it?
Aligning is the better expression; any aligner that aligns CCS reads to a genome or to transcripts should work. Thanks! As for an implemented example, I don't have one, but this simple procedure should work:
The "trimming" part is the only step that doesn't have a standard tool. But it's possible it could work without this step, especially if the resulting dataset is small (say, fewer than 10,000 reads).
The problem with my CCS fastq is that the quality score is always 5.
The subreads fastq file has ! throughout.
These look like placeholders written during the SMRT analysis. Do you have any opinion on this?
Thanks!
Won Cheol Yim
From: Kristoffer
Sent: Thursday, October 4, 5:45 PM
Subject: Re: [ksahlin/IsoCon] Can IsoCon be used on nontargeted Iso-Seq data sets? (#2)
To: ksahlin/IsoCon
Cc: Won C Yim, Comment
Aligning is the better expression; any aligner that aligns CCS reads to a genome or to transcripts should work. Thanks!
As for an implemented example, I don't have one, but this simple procedure should work:
1. Align the CCS reads to a reference of your choice (genome or transcripts) using minimap2 with -a set to produce a SAM file. Minimap2 should have a parameter combination customized for aligning Iso-Seq reads.
2. Use samtools to extract the reads aligning to the region of interest.
3. Either run IsoCon directly on these reads, or try to trim the reads based on the start and stop coordinates of their alignments.
The "trimming" part is the only step that doesn't have a standard tool. But it's possible it could work without this step, especially if the resulting dataset is small (say, fewer than 10,000 reads).
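The align-and-extract part of this procedure could be scripted roughly as below. This is only a sketch, not something shipped with IsoCon: the file names, the `prefix` argument and the region string are hypothetical, and the `splice:hq` preset with `-uf` is minimap2's documented setting for Iso-Seq reads against a genome (a transcript reference would need a different preset). The function only builds the command lines, so they can be inspected before anything is run.

```python
import subprocess


def build_commands(reference, ccs_reads, region, prefix="region"):
    """Return the command lines for align -> sort -> index -> extract -> FASTQ.

    All file names derived from `prefix` are illustrative assumptions.
    """
    sam = f"{prefix}.all.sam"
    bam = f"{prefix}.all.sorted.bam"
    region_bam = f"{prefix}.bam"
    return [
        # -a makes minimap2 emit SAM; splice:hq with -uf targets Iso-Seq reads vs. a genome
        ["minimap2", "-ax", "splice:hq", "-uf", "-o", sam, reference, ccs_reads],
        # region queries need a coordinate-sorted, indexed BAM
        ["samtools", "sort", "-o", bam, sam],
        ["samtools", "index", bam],
        # keep only alignments overlapping the region of interest
        ["samtools", "view", "-b", "-o", region_bam, bam, region],
        # convert back to FASTQ (written to stdout)
        ["samtools", "fastq", region_bam],
    ]


def run_pipeline(reference, ccs_reads, region, out_fastq, prefix="region"):
    """Run the commands in order; the last command's stdout goes to out_fastq."""
    *steps, to_fastq = build_commands(reference, ccs_reads, region, prefix)
    for cmd in steps:
        subprocess.run(cmd, check=True)
    with open(out_fastq, "w") as fh:
        subprocess.run(to_fastq, check=True, stdout=fh)


# Example call (hypothetical paths and coordinates):
# run_pipeline("genome.fa", "ccs.fastq", "chr1:1000-5000", "gene_reads.fastq")
```

The resulting FASTQ can then be given to IsoCon directly, or trimmed first.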
Running PacBio's CCS caller
Where the flnc file can be obtained, e.g., from
IsoCon can, however, also be run with only a fasta file (meaning that you would only have to convert the fastq to a fasta):
However, since individual base qualities play a key role in the algorithm, IsoCon will likely give more accurate results with quality values.
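For the fastq-to-fasta conversion mentioned above, a minimal sketch in plain Python might look like this. It assumes strictly four-line FASTQ records (no wrapped sequences) and simply drops the quality lines; the function name is made up for illustration.

```python
def fastq_to_fasta(fastq_lines):
    """Yield FASTA lines from an iterable of 4-line FASTQ records.

    Assumes unwrapped records: header, sequence, '+' line, qualities.
    Blank lines are ignored.
    """
    lines = [ln.rstrip("\n") for ln in fastq_lines if ln.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("input does not consist of complete 4-line records")
    for i in range(0, len(lines), 4):
        header, seq = lines[i], lines[i + 1]
        if not header.startswith("@"):
            raise ValueError(f"record {i // 4 + 1} does not start with '@'")
        yield ">" + header[1:]  # keep the read name, swap '@' for '>'
        yield seq               # the quality line is discarded


# Example use on files (hypothetical names):
# with open("flnc.fastq") as fq, open("flnc.fasta", "w") as fa:
#     fa.write("\n".join(fastq_to_fasta(fq)) + "\n")
```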
In general: No. IsoCon is designed for targeted sequencing where the CCS flnc reads are cut at relatively precise positions (i.e., at the start and stop primer sites). If this is not the case, it may affect both the runtime and the quality of the output.
However, if a nontargeted Iso-Seq dataset is processed such that the flnc reads from a particular gene are extracted (e.g., by using the pre-cluster module from ToFU, or by aligning CCS reads to a genome/transcriptome and separating them by region) and these reads are cut at the same start and end positions, IsoCon should work well. Keep in mind, though, that if reads are "cut", the quality values associated with the CCS reads have to be cut in the same way, so that each quality value stays with its base. This could be done relatively easily from the bam file.
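One way the cutting could be done from the alignment records is sketched below. It is an illustration under stated assumptions, not part of IsoCon: the function name and the 0-based, half-open coordinate convention are my own choices, and the inputs are the SEQ, QUAL and CIGAR fields of a SAM/BAM record plus its leftmost reference position (SAM POS minus 1). The sequence and quality strings are sliced with the same indices, so each quality value stays attached to its base.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def trim_to_region(seq, qual, cigar, ref_start, region_start, region_end):
    """Cut seq and qual to the read bases aligned within [region_start, region_end).

    ref_start is the 0-based leftmost aligned reference position.
    Returns ("", "") if no aligned base falls inside the region.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    ref_pos, read_pos = ref_start, 0
    first = last = None
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "M=X":  # consumes both reference and read
            lo = max(ref_pos, region_start)
            hi = min(ref_pos + length, region_end)
            if lo < hi:
                if first is None:
                    first = read_pos + (lo - ref_pos)
                last = read_pos + (hi - ref_pos)
            ref_pos += length
            read_pos += length
        elif op in "IS":  # insertion / soft clip: read only
            read_pos += length
        elif op in "DN":  # deletion / intron: reference only
            ref_pos += length
        # H and P consume neither the read nor the reference
    if first is None:
        return "", ""
    return seq[first:last], qual[first:last]


# A read with 2 soft-clipped bases on each side, aligned at reference position 100:
print(trim_to_region("AACCGGTTAA", "0123456789", "2S6M2S", 100, 102, 106))
```

Since SEQ and QUAL in a SAM record are both stored in alignment orientation, reads mapped to the reverse strand are trimmed consistently as well; they would still need to be reverse-complemented (and their qualities reversed) if the original read orientation matters downstream.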