Failed to mmap memory dataSize=0 File=./NS_eyo9ogxk/seq.db_h. Error 22. #21

ghost · 2021-05-07T17:37:45Z

Hello, I am new to using these tools, so please excuse me if I don't explain this well. I am trying to create a pangenome of Borrelia spp. so that I can map my tick microbiome reads against it and quantify Borrelia presence in my samples.

I have 11 GFF files representing 11 Borrelia species, which I downloaded from NCBI. I have the FASTA files too, but I believe the GFF files from NCBI will suffice as input. Is this correct?
My files are:
GCF_000165595.2_ASM16559v2_genomic.fna.gz
GCF_000165595.2_ASM16559v2_genomic.gff.gz
GCF_000181575.2_ASM18157v2_genomic.fna.gz
GCF_000181575.2_ASM18157v2_genomic.gff.gz
GCF_000181895.2_ASM18189v2_genomic.fna.gz
GCF_000181895.2_ASM18189v2_genomic.gff.gz
GCF_000512145.1_ASM51214v2_genomic.fna.gz
GCF_000512145.1_ASM51214v2_genomic.gff.gz
GCF_000956315.1_ASM95631v1_genomic.fna.gz
GCF_000956315.1_ASM95631v1_genomic.gff.gz
GCF_001936255.1_ASM193625v1_genomic.fna.gz
GCF_001936255.1_ASM193625v1_genomic.gff.gz
GCF_001936295.1_ASM193629v1_genomic.fna.gz
GCF_001936295.1_ASM193629v1_genomic.gff.gz
GCF_002741785.1_ASM274178v1_genomic.fna.gz
GCF_002741785.1_ASM274178v1_genomic.gff.gz
GCF_003606285.1_ASM360628v1_genomic.fna.gz
GCF_003606285.1_ASM360628v1_genomic.gff.gz
GCF_003814405.1_ASM381440v1_genomic.fna.gz
GCF_003814405.1_ASM381440v1_genomic.gff.gz
GCF_014525745.1_ASM1452574v1_genomic.fna.gz
GCF_014525745.1_ASM1452574v1_genomic.gff.gz

Currently, I get a series of errors when I input the following:

Current Behavior

2021-05-07 12:25:21.570015 COMMAND: /home/sean/.local/bin/PEPPAN -p borrelia_files/BORR -t 4 --clust_identity 0.5 --clust_match_prop 0.6 --match_identity 0.4 borrelia_files/GCF_000165595.2_ASM16559v2_genomic.gff.gz borrelia_files/GCF_000181575.2_ASM18157v2_genomic.gff.gz borrelia_files/GCF_000181895.2_ASM18189v2_genomic.gff.gz borrelia_files/GCF_000512145.1_ASM51214v2_genomic.gff.gz borrelia_files/GCF_000956315.1_ASM95631v1_genomic.gff.gz borrelia_files/GCF_001936255.1_ASM193625v1_genomic.gff.gz borrelia_files/GCF_001936295.1_ASM193629v1_genomic.gff.gz borrelia_files/GCF_002741785.1_ASM274178v1_genomic.gff.gz borrelia_files/GCF_003606285.1_ASM360628v1_genomic.gff.gz borrelia_files/GCF_003814405.1_ASM381440v1_genomic.gff.gz borrelia_files/GCF_014525745.1_ASM1452574v1_genomic.gff.gz
2021-05-07 12:25:22.032943 Run MMSeqs linclust to get exemplar sequences. Params: 0.5 identities and 0.8 align ratio
Failed to mmap memory dataSize=0 File=./NS_eyo9ogxk/seq.db_h. Error 22.
Traceback (most recent call last):
  File "/home/sean/.local/bin/PEPPAN", line 8, in <module>
    sys.exit(ortho())
  File "/home/sean/.local/lib/python3.8/site-packages/PEPPAN/PEPPAN.py", line 1884, in ortho
    params['clust'] = iterClust(params['prefix'], params['genes'], groups, dict(identity=params['clust_identity'], coverage=params['clust_match_prop'], n_thread=params['n_thread'], translate=False))
  File "/home/sean/.local/lib/python3.8/site-packages/PEPPAN/PEPPAN.py", line 1784, in iterClust
    g, clust = getClust(prefix, g, params)
  File "/home/sean/.local/lib/python3.8/site-packages/PEPPAN/modules/clust.py", line 67, in getClust
    with open(tabFile) as fin :
FileNotFoundError: [Errno 2] No such file or directory: './NS_eyo9ogxk/clust.tab'

Steps to Reproduce (for bugs)

PEPPAN -p borrelia_files/BORR -t 4 --clust_identity 0.5 --clust_match_prop 0.6 --match_identity 0.4 borrelia_files/*.gff.gz

This does generate some output files with my desired prefix:
BORR.encode.csv, BORR.genes and BORR.old_prediction.npz

Context

I searched online for clues, which is why I changed the values of --clust_identity, --clust_match_prop and --match_identity. I also changed -t to use fewer threads, in case it was a memory issue.
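
A `dataSize=0` failure on `seq.db_h` suggests that the sequence database handed to MMseqs was empty. One possible cause, offered only as a guess, is that the input GFF files hold annotations but no sequences. The sketch below checks whether a GFF file carries an embedded `##FASTA` section, which the GFF3 format allows. The helper names `open_text` and `gff_has_embedded_fasta` are hypothetical and are not part of PEPPAN.

```python
import gzip
from pathlib import Path


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def gff_has_embedded_fasta(path):
    """Return True if the file contains a '##FASTA' directive line.

    Assumption: a GFF that embeds its sequences marks them with a
    '##FASTA' line, as described in the GFF3 format.
    """
    with open_text(path) as handle:
        for line in handle:
            if line.strip() == "##FASTA":
                return True
    return False
```

Calling `gff_has_embedded_fasta("borrelia_files/GCF_000165595.2_ASM16559v2_genomic.gff.gz")` on each input would show whether any of them contains sequence data at all.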

Environment

Details of my environment:
To install, I did the following:
conda config --add channels defaults
conda config --add channels conda-forge
conda config --add channels bioconda
conda install mmseqs2
conda install blast
conda install diamond
conda install rapidnj
conda install fasttree

command -v mmseqs blastn rapidnj diamond fasttree

/home/sean/miniconda3/envs/peppaninstall/bin/mmseqs
/usr/bin/blastn

pip3 install peppan
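
`command -v` prints a path only for the commands it finds, so the two lines of output above for five names may mean that `rapidnj`, `diamond` and `fasttree` are not on the `PATH` (unless the output was trimmed when posting). As a minimal sketch, the same check can be done from Python with the standard library's `shutil.which`. The function names `locate_tools` and `missing_tools` are hypothetical.

```python
import shutil


def locate_tools(names):
    """Map each executable name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names):
    """Return the names that could not be found on PATH."""
    return [name for name, path in locate_tools(names).items() if path is None]
```

For example, `missing_tools(["mmseqs", "blastn", "rapidnj", "diamond", "fasttree"])` returns the tools that the current environment cannot see.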

I ran the test data and it all worked great.
I hope this makes sense!

Naclist · 2021-09-04T08:58:10Z

GFF files from NCBI are not enough on their own, without pretreatment, for PEPPAN to build a pangenome for you. Read the Quickstart and you will see that a FASTA file needs to be supplied as well. Alternatively, you can run Prokka on your FASTA files to generate GFF files that include the sequences.
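
As a rough sketch of one possible pretreatment, the annotations and sequences can be written into a single GFF3 file with a `##FASTA` section, which is the layout Prokka uses for its GFF output. Whether PEPPAN accepts a file merged this way is an assumption that is not confirmed in this thread; the Quickstart remains the reference. The helper names `open_text` and `merge_gff_and_fasta` are hypothetical, and the sequence IDs in the first GFF column are assumed to match the FASTA header names.

```python
import gzip
from pathlib import Path


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def merge_gff_and_fasta(gff_path, fasta_path, out_path):
    """Write the GFF annotations, a '##FASTA' line, then the sequences."""
    with open(out_path, "wt") as out:
        with open_text(gff_path) as gff:
            for line in gff:
                # Stop early if the GFF already has its own FASTA section.
                if line.strip() == "##FASTA":
                    break
                out.write(line if line.endswith("\n") else line + "\n")
        out.write("##FASTA\n")
        with open_text(fasta_path) as fasta:
            for line in fasta:
                out.write(line if line.endswith("\n") else line + "\n")
```

A call such as `merge_gff_and_fasta("x_genomic.gff.gz", "x_genomic.fna.gz", "x_merged.gff")` (file names are placeholders) would produce one uncompressed file per genome.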
