Rmarkdown report fails: Detected 7 column names but the data has 5 columns #72

Closed
hoelzer opened this issue Apr 16, 2024 · 6 comments

@hoelzer
Member

hoelzer commented Apr 16, 2024

Hey,

I am using the pipeline on SARS-CoV-2 Capture-Seq data (Illumina paired-end).

nextflow run rki-mf1/covPipe2 -r v0.5.1 --fastq fq_list.csv --list --kraken --update -profile slurm,singularity --output results-covpipe2-martin 

Everything runs fine, but then

[a1/556128] process > summary_report:rmarkdown_report (1)                     [100%] 1 of 1, failed: 1 ✘

Here is the full error print:

Error executing process > 'summary_report:rmarkdown_report (1)'

Caused by:
  Process `summary_report:rmarkdown_report (1)` terminated with an error exit status (1)

Command executed:

  cp -L summary_report.Rmd report.Rmd
  Rscript -e "rmarkdown::render('report.Rmd', params=list(mode='paired', fastp_table_stats='read_stats.csv', fastp_table_stats_filter='read_stats_filter.csv', kraken_table='species_filtering.csv', flagstat_table='mapping_stats.csv', fragment_size_table='fragment_sizes.csv', fragment_size_median_table='fragment_sizes_median.csv', coverage_table='coverage_table.csv', positive='positive_samples.csv', negative='negative_samples.csv', sample_cov='coverage_samples.csv', president_results='president_results.tsv', pangolin_results='pangolin_results.csv', nextclade_results='nextclade_results.tsv', nextclade_version='nextclade 3.3.1',  nextclade_dataset_info='sars-cov-2, 2024-04-15--15-08-22Z', sc2rf_results='sc2rf_results.csv', vois_results='none', cns_min_cov='20', run_id='none', pipeline_version='https://github.com/rki-mf1/covPipe2 - v0.5.1 [453eb8fca67179cfc6a21bfdb23aab8248e758de]'), output_file='report.html')"

Command exit status:
  1

Command output:

    |
    |                                                                      |   0%
    |
    |.                                                                     |   2%
    ordinary text without R code


    |
    |...                                                                   |   4%
  label: setup (with options)
  List of 1
   $ include: logi FALSE


    |
    |....                                                                  |   6%
    ordinary text without R code


    |
    |......                                                                |   8%
  label: get_cmd_line_parameters

Command error:


  processing file: report.Rmd
  Quitting from lines 86-108 (report.Rmd)
  Error in setnames(x, value) :
    Can't assign 5 names to a 7 column data.table
  Calls: <Anonymous> ... eval -> eval -> names<- -> names<-.data.table -> setnames
  In addition: Warning message:
  In FUN(X[[i]], ...) :
    Detected 7 column names but the data has 5 columns. Filling rows automatically. Set fill=TRUE explicitly to avoid this warning.
  Execution halted

Could this be caused by changes to the pangolin/nextclade output?

Thanks!

(PS: It would be great if I could get this run working by Monday next week, so I can use the data for a report.)

@hoelzer added the bug label (Something isn't working) on Apr 16, 2024
@anfarr

anfarr commented Apr 18, 2024

Hey,
I had the same error; see the issue I opened for reference (#71). As far as I can tell, the cause is not pangolin/nextclade but the output of sc2rf. Splitting that output on commas yields seven columns, but the Rmd script assigns only five column names:

names(dt.sc2rf_results) <- c('sample','examples','intermissions','breakpoints','regions')

As a temporary fix, you could replace line 104 of the Rmd script with
colnames(dt.sc2rf_results)[1:5] <- c('sample','examples','intermissions','breakpoints','regions')
in your local installation of the pipeline.
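
To check the column mismatch described above on your own data, a minimal sketch like the following may help. It tallies how many comma-separated fields each row has. The file contents here are made up purely for illustration; point the `awk` call at your real `sc2rf_results.csv` instead. Note that a plain comma split ignores CSV quoting, so this is only a rough check.

```shell
#!/bin/sh
# Hypothetical stand-in for sc2rf_results.csv; these rows are invented for
# illustration and do not reflect real sc2rf output.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
sampleA,0,0,0,,x,y
sampleB,1,2,3,a,b,c
EOF

# Split each row on commas and tally how often each field count occurs.
# Caveat: quoted commas inside a field would be over-counted by this naive split.
field_counts=$(awk -F',' '{ n[NF]++ } END { for (k in n) print k " fields: " n[k] " rows" }' "$tmp" | sort -n)
echo "$field_counts"

rm -f "$tmp"
```

If the tally reports a field count other than five, assigning exactly five names to the whole table will fail in the way shown in the error above.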

I am not sure where the regression occurred. I compared sc2rf output files from the last few months and the format did not change. At first glance, I can't find a commit that might be the cause. Unfortunately, I do not have time right now to investigate further.

In any case, the result files should still be present in your specified publishing directory under ./Report/single_tables.

Best regards
Anton

@hoelzer
Member Author

hoelzer commented Apr 22, 2024

Thanks @anfarr !

With that change, the pipeline at least ran through and produced the final HTML report.

@MarieLataretu I can also submit a PR with that change, but I'm not sure it's the best way to fix this. Please feel free to do something else and reject my PR ;)

@hoelzer
Member Author

hoelzer commented Apr 22, 2024

See #73

@MarieLataretu
Collaborator

MarieLataretu commented Apr 24, 2024

Hi all, thanks for reporting, @hoelzer and @anfarr!

Could you please test the branch MarieLataretu/issue72?

nextflow pull rki-mf1/CoVpipe2
nextflow run rki-mf1/CoVpipe2 -r MarieLataretu/issue72 ...

@hoelzer
Member Author

hoelzer commented Apr 24, 2024

Hey @MarieLataretu, thanks! I tested the issue72 branch and it worked!

@MarieLataretu
Collaborator

Nice, I'll prepare the release then!
