6. GBS Sexing

We took advantage of the improved Pigeon Genome Cliv_2.0 to try to sex our pigeons based on differences of coverage regarding the Z chromosome and an autosomal chromosome of similar size.

Generates a `.fasta` file including just the Z chromosome:

samtools faidx GCA_001887795.1_colLiv2_genomic.fasta CM007524.1 > GCA_001887795.1_colLiv2_genomic_Z.fasta

Creates a `.bed` fie with our enzyme cutsites from the above `.fasta` file:

$SCRIPTS/appz/p5-bpwrapper/bin/bioseq --restrict-coord EcoT22I ~/data/Pigeons/Reference/GCA_001887795.1_colLiv2_genomic_Z.fasta > ~/data/Pigeons/Reference/GCA_001887795.1_colLiv2_genomic_Z.bed

Number of CUTSITES: 14,927

Maps this .fasta file against our reference genome:

xsbatch -c 15 --mem-per-cpu 2000 -J Sexing --time 1-00 -- "blastn -num_threads 15 -query ~/data/Pigeons/Reference/GCA_001887795.1_colLiv2_genomic_Z.fasta -db ~/data/Pigeons/Reference/DanishTumbler_Dovetail_ReRun.fasta -evalue 1e-5 -perc_identity 95 -outfmt '7 std slen sstrand' > ~/data/Pigeons/Reference/GCA_001887795.1_colLiv2_genomic_Z--DanishTumbler_Dovetail_ReRun.tsv"

Selects those regions that mapped well enough and create a new `.bed` file out of this selection:

awk '!/#/ && $13>10000 && $4>=1000 && $5/$4<0.01' ~/data/Pigeons/Reference/GCA_001887795.1_colLiv2_genomic_Z--DanishTumbler_Dovetail_ReRun.tsv | sort -k 2,2 -k 9,9g -k 10,10g | awk '{start=$9; end=$10} $14=="minus"{start=$10; end=$9} {print $2"\t"start"\t"end"\t"$13}' | awk '{x[$1]+=$3-$2+1; s[$1]=$4} END{for(i in x) print i"\t"x[i]"\t"x[i]/s[i]}' | sort -k 3,3gr | awk 'NR>1 && $3>0.9{print $1}' | fgrep -w -f - ~/data/Pigeons/Reference/PBGP_FinalRun.EcoT22I_Extended_Merged_RemovedBadLoci-PossibleParalogs-g650.bed > ~/data/Pigeons/Reference/GCA_001887795.1_colLiv2_genomic_Z--DanishTumbler_Dovetail_ReRun.bed

Generates a `.fasta` file including just the 6 chromosome:

samtools faidx GCA_001887795.1_colLiv2_genomic.fasta CM007530.1 > GCA_001887795.1_colLiv2_genomic_6.fasta

Creates a `.bed` fie with our enzyme cutsites from the `.fasta` file above:

$SCRIPTS/appz/p5-bpwrapper/bin/bioseq --restrict-coord EcoT22I ~/data/Pigeons/Reference/GCA_001887795.1_colLiv2_genomic_6.fasta > ~/data/Pigeons/Reference/GCA_001887795.1_colLiv2_genomic_6.bed

Number of CUTSITES: 13.348

Maps this `.fasta` file against our reference genome:

xsbatch -c 8 --mem-per-cpu 1024 -J Chr-6 --time 1-00 -- "blastn -num_threads 8 -query ~/data/Pigeons/Reference/GCA_001887795.1_colLiv2_genomic_6.fasta -db ~/data/Pigeons/Reference/DanishTumbler_Dovetail_ReRun.fasta -evalue 1e-5 -perc_identity 95 -outfmt '7 std slen sstrand' > ~/data/Pigeons/Reference/GCA_001887795.1_colLiv2_genomic_6--DanishTumbler_Dovetail_ReRun.tsv"

Selects those regions that mapped well enough and create a new `.bed` file out of this selection:

awk '!/#/ && $13>10000 && $4>=1000 && $5/$4<0.01' ~/data/Pigeons/Reference/GCA_001887795.1_colLiv2_genomic_6--DanishTumbler_Dovetail_ReRun.tsv | sort -k 2,2 -k 9,9g -k 10,10g | awk '{start=$9; end=$10} $14=="minus"{start=$10; end=$9} {print $2"\t"start"\t"end"\t"$13}' | awk '{x[$1]+=$3-$2+1; s[$1]=$4} END{for(i in x) print i"\t"x[i]"\t"x[i]/s[i]}' | sort -k 3,3gr | awk 'NR>1 && $3>0.9{print $1}' | fgrep -w -f - ~/data/Pigeons/Reference/PBGP_FinalRun.EcoT22I_Extended_Merged_RemovedBadLoci-PossibleParalogs-g650.bed > ~/data/Pigeons/Reference/GCA_001887795.1_colLiv2_genomic_6--DanishTumbler_Dovetail_ReRun.bed

Finally, we used these two `.bed` files to compare the coverage values obtained.

PBG--Pipeline

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

6. GBS Sexing

Generates a `.fasta` file including just the Z chromosome:

Creates a `.bed` fie with our enzyme cutsites from the above `.fasta` file:

Number of CUTSITES: 14,927

Maps this .fasta file against our reference genome:

Selects those regions that mapped well enough and create a new `.bed` file out of this selection:

Generates a `.fasta` file including just the 6 chromosome:

Creates a `.bed` fie with our enzyme cutsites from the `.fasta` file above:

Number of CUTSITES: 13.348

Maps this `.fasta` file against our reference genome:

Selects those regions that mapped well enough and create a new `.bed` file out of this selection:

Finally, we used these two `.bed` files to compare the coverage values obtained.

Clone this wiki locally

6. GBS Sexing

Generates a .fasta file including just the Z chromosome:

Creates a .bed fie with our enzyme cutsites from the above .fasta file:

Number of CUTSITES: 14,927

Maps this .fasta file against our reference genome:

Selects those regions that mapped well enough and create a new .bed file out of this selection:

Generates a .fasta file including just the 6 chromosome:

Creates a .bed fie with our enzyme cutsites from the .fasta file above:

Number of CUTSITES: 13.348

Maps this .fasta file against our reference genome:

Selects those regions that mapped well enough and create a new .bed file out of this selection:

Finally, we used these two .bed files to compare the coverage values obtained.

Clone this wiki locally

Generates a `.fasta` file including just the Z chromosome:

Creates a `.bed` fie with our enzyme cutsites from the above `.fasta` file:

Selects those regions that mapped well enough and create a new `.bed` file out of this selection:

Generates a `.fasta` file including just the 6 chromosome:

Creates a `.bed` fie with our enzyme cutsites from the `.fasta` file above:

Maps this `.fasta` file against our reference genome:

Selects those regions that mapped well enough and create a new `.bed` file out of this selection:

Finally, we used these two `.bed` files to compare the coverage values obtained.