Genomes

Clifford Bohm edited this page Jul 27, 2017 · 8 revisions

Genomes are lists of values which can be read from, written to, mutated, and recombined. All genome types share some common characteristics:

Genomes have an alphabet size. This is the base of the genome, that is, the number of different values a single site can take. A bit genome has alphabet size 2. A byte genome has alphabet size 256. Biological genomes have an alphabet size of 4 (ATCG). It is possible to set the alphabet size to any value (that the computer can handle).

Genomes have a site type. The type of the site defines how the computer will represent this genome in memory. A genome with alphabet size 2 will behave the same if its sites are chars (bytes), bools (bits), or integers. The alphabet size must not exceed the number of distinct values that the chosen type can represent.
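The relationship between alphabet size and site type can be sketched as a small check. This is a minimal illustration, not MABE code: `alphabetFits` is a hypothetical helper, and it assumes site values run from 0 to alphabetSize - 1. The alphabet size is taken as a double only because `getAlphabetSize()` returns one.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper (not part of MABE): true when every value in
// [0, alphabetSize) can be stored in a site of type SiteT.
template <typename SiteT>
bool alphabetFits(double alphabetSize) {
    assert(alphabetSize >= 1.0);  // an alphabet needs at least one symbol
    // Assumption: the largest site value is alphabetSize - 1.
    const double largestSite = alphabetSize - 1.0;
    return largestSite <= static_cast<double>(std::numeric_limits<SiteT>::max());
}
```

Under these assumptions a bool site holds an alphabet of size 2 but not 4, and an unsigned char site holds an alphabet of up to 256.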

Genomes all provide methods to produce mutated copies from single or multiple parents.

Genome Types

  • Circular Genome - a simple genome constructed from a single circular chromosome.
  • Multi-Genome - a genome with one or more non-circular chromosomes, which can be multi-ploidy.
### [Genome Handlers](Genome-Handlers)

[Genome Handlers](Genome-Handlers) are created by genomes upon request. A genome may have multiple genome handlers at the same time. Genome handlers are used to interact with the genome that created them (reading, writing, etc.). In addition to allowing random access to genomes, genome handlers standardize the interface between genomes and the rest of the code.

### Mutations

Different types of genomes allow different methods of mutation, so a given genome type may not provide every option. The common mutation methods are summarized here.
  • Point Mutation - a single site in the genome is selected and its value is randomized.
  • Insertion Mutation - one or more sites are inserted into the genome at a random location. The values of these sites are randomized.
  • Copy Mutation - a section of the genome is selected, copied and inserted at a random location in the genome.
  • Deletion Mutation - a section of the genome is selected and deleted.
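The four mutation methods listed above can be sketched on a toy genome. This is a minimal illustration under stated assumptions, not MABE's implementation: `Sites` is a plain linear `std::vector<int>` (so sections never wrap, unlike in a circular genome), and all function names and the explicit `len` and `count` arguments are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

using Sites = std::vector<int>;  // toy linear genome; not a MABE type

// Uniform random index in [0, n).
std::size_t randomIndex(std::size_t n, std::mt19937& rng) {
    assert(n > 0);
    return std::uniform_int_distribution<std::size_t>(0, n - 1)(rng);
}

// Point mutation: randomize the value of one randomly chosen site.
void pointMutate(Sites& g, int alphabetSize, std::mt19937& rng) {
    assert(!g.empty() && alphabetSize > 0);
    std::uniform_int_distribution<int> value(0, alphabetSize - 1);
    g[randomIndex(g.size(), rng)] = value(rng);
}

// Insertion mutation: insert `count` randomized sites at a random location.
void insertMutate(Sites& g, std::size_t count, int alphabetSize, std::mt19937& rng) {
    assert(alphabetSize > 0);
    std::uniform_int_distribution<int> value(0, alphabetSize - 1);
    Sites fresh(count);
    for (int& s : fresh) s = value(rng);
    const std::size_t at = randomIndex(g.size() + 1, rng);  // may land at the end
    g.insert(g.begin() + at, fresh.begin(), fresh.end());
}

// Copy mutation: copy a section of `len` sites and insert it at a random location.
void copyMutate(Sites& g, std::size_t len, std::mt19937& rng) {
    assert(len > 0 && len <= g.size());
    const std::size_t from = randomIndex(g.size() - len + 1, rng);
    const Sites section(g.begin() + from, g.begin() + from + len);  // copy first
    const std::size_t at = randomIndex(g.size() + 1, rng);
    g.insert(g.begin() + at, section.begin(), section.end());
}

// Deletion mutation: remove a randomly placed section of `len` sites.
void deleteMutate(Sites& g, std::size_t len, std::mt19937& rng) {
    assert(len > 0 && len <= g.size());
    const std::size_t from = randomIndex(g.size() - len + 1, rng);
    g.erase(g.begin() + from, g.begin() + from + len);
}
```

Point mutation leaves the genome length unchanged, insertion and copy mutations grow it, and deletion shrinks it. In a real run the section lengths would presumably come from parameters rather than being passed in directly.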
### Make Mutated Offspring

Genomes can produce offspring (copies with mutation) asexually (from one parent) or sexually (from many parents).

When asexual reproduction is invoked, the parent genome is copied and then mutations are applied to the copy.

When sexual reproduction is invoked, either crossover or recombination is used:
  • If the parent genomes are single ploidy then crossover will be performed between the parent chromosomes to produce the child genome which will then be subjected to mutation.
  • If the parents are multi ploidy then recombination will be used. That is, each parent will use crossover to produce a new version of each of its own chromosomes. The child genome will be the collection of all of the resulting chromosomes. The child genome will then be subjected to mutation.
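The difference between crossover and recombination can be sketched as follows. This is a simplified illustration, not MABE's implementation: the `Chromosome` and `Genome` aliases and both function names are hypothetical, crossover uses a single cut point on equal-length chromosomes, each parent is assumed to be diploid with one chromosome pair, and the final mutation step is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

using Chromosome = std::vector<int>;
using Genome = std::vector<Chromosome>;  // toy genome: one entry per chromosome copy

// Single-point crossover: sites before the cut come from `a`, the rest from `b`.
Chromosome crossover(const Chromosome& a, const Chromosome& b, std::mt19937& rng) {
    assert(!a.empty() && a.size() == b.size());
    const std::size_t cut =
        std::uniform_int_distribution<std::size_t>(0, a.size())(rng);
    Chromosome child(a.begin(), a.begin() + cut);
    child.insert(child.end(), b.begin() + cut, b.end());
    return child;
}

// Recombination for diploid parents: each parent crosses its own two
// chromosome copies into one new chromosome, and the child collects one
// chromosome from every parent.
Genome recombine(const std::vector<Genome>& parents, std::mt19937& rng) {
    Genome child;
    for (const Genome& parent : parents) {
        assert(parent.size() == 2);  // assumption: exactly one chromosome pair
        child.push_back(crossover(parent[0], parent[1], rng));
    }
    return child;
}
```

With two diploid parents the child is again diploid, which matches the note that the number of parents generally has to match the ploidy.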
### Genome Parameters
  • genomeType - type of genome being used
  • genomeAlphabetSize - the number of different values possible in a single site of the genome
  • genomeSitesType - the data structure being used by MABE to store sites. This can be bool, char, int, or double.
### Genome Variables
  • dataMap - a DataMap used to store data for output
  • genomeFileColumns - a list of columns used by the archivist when writing genomes to file
  • aveFileColumns - a list of columns of values that are interesting (as determined by each derived genome type). Used by Archivist when writing ave files.
### Critical Genome Interface
shared_ptr<AbstractGenome> makeLike()
returns a genome like this genome. Sites are not copied, which saves some time.
shared_ptr<AbstractGenome::Handler> newHandler(shared_ptr<AbstractGenome> _genome, bool _readDirection = true)
returns a handler to this genome. The genome used to call newHandler should be the same genome passed in the argument list. The handler is initialized pointing to the first site of the genome, with !EoG and !EoC (not at end of genome, not at end of chromosome). A _readDirection of true indicates that the handler reads forward; false indicates that it reads backward, in which case the handler is initialized to the last site in the genome.
double getAlphabetSize()
return alphabetSize
void copyFrom(shared_ptr<AbstractGenome> from)
copy the genome from to this genome
void fillRandom()
fill this genome with random values bounded by 0 and alphabetSize
string genomeToStr()
convert this genome to a string
void printGenome()
print this genome to the terminal. used for debugging
void loadGenomeFile(string fileName, vector<shared_ptr<AbstractGenome>> &genomes)
load all genomes in "fileName" into genomes
shared_ptr<AbstractGenome> loadGenome(string fileName, string key, string value)
call loadGenomeFile() and then return the genome that matches the key/value pair. Commonly the key will be update (in the case of an LOD genome file) or ID (in the case of a snapshot file).
bool isEmpty()
return true if the number of sites in this genome is == 0
void mutate()
apply mutations to this genome. The type of mutations is determined by the derived class. Generally the values that determine the amount of mutation will be maintained by a ParametersTable.
shared_ptr<AbstractGenome> makeMutatedGenomeFrom(shared_ptr<AbstractGenome> parent)
return a new genome which is a mutated copy of the parent
shared_ptr<AbstractGenome> makeMutatedGenomeFromMany(vector<shared_ptr<AbstractGenome>> parents)
create a new genome which is the result of either direct crossover of the parents' genomes (in the case of mono-ploidy) or recombination (in the case of multi-ploidy). Generally, for multi-ploidy the number of parents must match the ploidy. Mutation is then applied to the new genome and the genome is returned.
int countSites()
return the number of sites in this genome
vector<string> getStats()
looks up relevant data (as determined by each derived genome type) and returns a list of key/value pairs
void recordDataMap()
used to add data to this genome's dataMap
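The handler behaviour described for newHandler() can be illustrated with a toy model. This sketch does not reproduce MABE's handler API: `ToyGenome`, `ToyHandler`, `readInt`, and `atEOG` are hypothetical names, and setting the end-of-genome flag when the handler wraps around is an assumption about how a circular genome might behave.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct ToyGenome {  // hypothetical stand-in, not AbstractGenome
    std::vector<int> sites;
};

class ToyHandler {  // hypothetical stand-in, not AbstractGenome::Handler
public:
    // readDirection true: start at the first site and read forward.
    // readDirection false: start at the last site and read backward.
    explicit ToyHandler(std::shared_ptr<ToyGenome> genome, bool readDirection = true)
        : genome_(std::move(genome)), forward_(readDirection) {
        assert(genome_ && !genome_->sites.empty());
        index_ = forward_ ? 0 : genome_->sites.size() - 1;
    }

    // Return the current site, then step one site in the read direction.
    int readInt() {
        const int value = genome_->sites[index_];
        advance();
        return value;
    }

    // True once the handler has stepped past the end of the genome.
    bool atEOG() const { return eog_; }

private:
    void advance() {
        const std::size_t n = genome_->sites.size();
        if (forward_) {
            index_ = (index_ + 1) % n;
            if (index_ == 0) eog_ = true;  // wrapped past the last site
        } else if (index_ == 0) {
            index_ = n - 1;
            eog_ = true;                   // wrapped past the first site
        } else {
            --index_;
        }
    }

    std::shared_ptr<ToyGenome> genome_;
    bool forward_;
    std::size_t index_ = 0;
    bool eog_ = false;  // starts as !EoG, as in the description above
};
```

Because each handler keeps its own position and flags, several handlers can read the same genome at once, for example one forward and one backward, without interfering with each other.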