add fitness cache
KRM7 committed Sep 1, 2024
1 parent 7937be5 commit 69f663d
Showing 22 changed files with 1,576 additions and 69 deletions.
2 changes: 1 addition & 1 deletion docs/api/Doxyfile
@@ -1,4 +1,4 @@
# Doxyfile 1.9.1
# Doxyfile 1.9.1

#---------------------------------------------------------------------------
# Project related configuration options
15 changes: 14 additions & 1 deletion docs/api/core/fitness_function.rst
@@ -31,4 +31,17 @@ class FitnessFunctionBase
:project: gapp
:members:
:protected-members:
:private-members:
:private-members:


class FitnessFunctionInfo
---------------------------------------------------

.. code-block::
#include <core/fitness_function.hpp>
.. doxygenclass:: gapp::FitnessFunctionInfo
:project: gapp
:members:
:protected-members:
2 changes: 1 addition & 1 deletion docs/api/generate_api_docs.sh
@@ -1,4 +1,4 @@
#!/bin/sh
#!/bin/sh

echo -e "Generating API documentation...\n"

27 changes: 16 additions & 11 deletions docs/encodings.md
@@ -48,10 +48,10 @@ BinaryGA{}.solve(problems::Sphere{});
## Solution representation
The gene type determines how the candidate solutions to the problem
are going be encoded in the population. The representation of the solutions
will be a vector of the gene type used in all cases. Currently, there is no
way to change this to use some other data structure, so this should be taken
How the candidate solutions to a problem are going be encoded in the GA is
determined by the gene type. The representation of the solutions will always
be a vector of the gene type used. There is currently no way to change this
to use some other data structure instead of a vector, so this should be taken
into account when defining new encodings.
```cpp
@@ -60,12 +60,17 @@ using Chromosome = std::vector<GeneType>;
```

The candidates contain some more information in addition to their
chromosomes, for example their fitness vectors, but these are
independent of the gene type.
chromosomes, like their fitness vectors, but these are independent
of the gene type. They are represented by the `Candidate` class.

The population is then made up of several candidates encoded in
A population is then made up of several candidates encoded in
this way.

```cpp
template<typename GeneType>
using Population = std::vector<Candidate<GeneType>>;
```

### Variable chromosome lengths

The length of the chromosomes is specified as part of the fitness function.
@@ -92,7 +97,7 @@ new GA class. In order to do this, you have to:
- define a specialization for `GaTraits<GeneType>`
- specialize `is_bounded<GeneType>` if needed
- define the GA class, derived from `GA<GeneType>`
- define crossover and mutation operators for this encoding
- define crossover and mutation operators for the new encoding

The gene type may be anything, with one restriction: the types
already used for the existing encodings are reserved and can't
@@ -105,9 +110,9 @@ using MyGeneType = std::variant<double, int>;

The specialization of `GaTraits<T>` for the gene type is required
in order to define some attributes of the GA. These are the default
crossover and mutation operators that will be used by the GA if none
are specified explicitly, and the default mutation probability used
for the mutation operators:
crossover and mutation operators that will be used by the GA when
they are not specified explicitly, and the default mutation probability
used for the mutation operator:

```cpp
namespace gapp
71 changes: 54 additions & 17 deletions docs/fitness-functions.md
@@ -133,22 +133,29 @@ returned by `invoke`.
## Other fitness function properties
There are some other parameters of the fitness function that can
be specified, but these generally don't have to be changed from
their default values. However, more complex fitness functions might
have to set their values different from these defaults in some cases.
There are some additional parameters of the fitness function that can
be specified, but these typically don't have to be changed from
their default values. More complex fitness functions, however, might
have to set their values differently from the defaults in some cases.
### Dynamic fitness functions
By default, the fitness function is assumed to always return the
same fitness vector for a particular chromosome passed to it as an
The fitness functions are, by default, assumed to always return the
same fitness vector for a given chromosome passed to them as an
argument. This assumption is used to prevent unnecessary fitness
function calls, but this optimization would cause incorrect fitness
vectors to be assigned to some solutions if the assumption is false.
function calls, but it would also cause potentially incorrect fitness
vectors to be assigned to some solutions if the assumption is not true.
For fitness functions where this assumption would not be true, the value
of the `dynamic` parameter in the constructor of `FitnessFunctionBase`
or `FitnessFunction` has to be set to `true`.
In order to prevent this, the fitness functions have a type parameter
associated with them, which can either be `Static` or `Dynamic`. The type
of a fitness function can be set in its constructor, with the default type
being `Static`.
For fitness functions where this default behaviour would be incorrect, the
value of the `type` parameter in the constructor of the fitness function
should be set to `Dynamic`. This will disable any kind of caching that
might be used in the GAs, and cause the solutions to be evaluated using
the fitness function every time it's needed.
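
As a rough illustration of what this distinction allows, the sketch below re-evaluates a candidate only when it has no fitness yet or when the fitness function is dynamic. It is a minimal sketch under stated assumptions, not the library's evaluation code: `SketchCandidate`, its `is_evaluated` flag, and `evaluate_population` are hypothetical names, and only the `Static` and `Dynamic` values come from the text above.

```cpp
#include <cstddef>
#include <vector>

// Mirrors the two fitness function types described above.
enum class Type { Static, Dynamic };

// Hypothetical, simplified candidate (not the library's Candidate class).
struct SketchCandidate
{
    std::vector<double> chromosome;
    std::vector<double> fitness;
    bool is_evaluated = false;
};

// Evaluates the population and returns the number of fitness function calls made.
template<typename F>
std::size_t evaluate_population(std::vector<SketchCandidate>& population, F&& fitness_function, Type type)
{
    std::size_t num_calls = 0;
    for (SketchCandidate& candidate : population)
    {
        // A stored fitness can only be trusted if the fitness function is static.
        if (type == Type::Static && candidate.is_evaluated) continue;

        candidate.fitness = fitness_function(candidate.chromosome);
        candidate.is_evaluated = true;
        ++num_calls;
    }
    return num_calls;
}
```

With a static fitness function, a second pass over an unchanged population makes no calls at all, while a dynamic one is called once per candidate on every pass.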
### Variable chromosome lengths
@@ -164,19 +171,48 @@ the initial population is generated instead of explicitly specified.
// implementation of a dynamic fitness function
class MyFitnessFunction : public FitnessFunction<RealGene, 1>
{
MyFitnessFunction() : FitnessFunction(/* dynamic = */ true) {}
MyFitnessFunction() : FitnessFunction(/* type = */ Type::Dynamic) {}
FitnessVector invoke(const Chromosome<RealGene>& x) const override;
};
```

## The number of objective function evaluations

The number of times the fitness function is evaluated during a run of the GA
is determined by the number of candidates in the population, and the number
of generations. Naively, the number of fitness function calls during a run
would be:

```
N = population_size * generations
```

While there are cases where this will really be the number of fitness
function calls, such as when the fitness function is dynamic, the library
generally tries to minimize the number of calls to the fitness function
where possible, which means that the actual number will typically be
smaller than this.

By default, only a simple method is used to achieve this, with minimal
overhead during the runs, but it is also possible to cache the fitnesses
of the candidate solutions during a run to further reduce the number of
fitness function calls. Doing this has a larger overhead though, so it's
only worth doing if the fitness function is relatively expensive to evaluate.

This cache can be turned on using the `cache_size` method of the GAs:

```cpp
GA.cache_size(2); // cache the last 2 generations
```
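
The general idea behind such a cache can be sketched as plain memoization keyed on the chromosome. This is only an illustration, not the library's implementation: `SimpleFitnessCache` and `ChromosomeHash` are hypothetical names, the hash combination scheme is an assumption, and the sketch never evicts entries, whereas `cache_size` limits the cache to a number of generations.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the library's types, for illustration only.
using Chromosome = std::vector<double>;
using FitnessVector = std::vector<double>;

// Hashes a chromosome by combining the hashes of its genes (assumed scheme).
struct ChromosomeHash
{
    std::size_t operator()(const Chromosome& chrom) const noexcept
    {
        std::size_t seed = chrom.size();
        for (double gene : chrom)
        {
            seed ^= std::hash<double>{}(gene) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
        }
        return seed;
    }
};

// Memoizes a static fitness function and counts the real evaluations.
class SimpleFitnessCache
{
public:
    explicit SimpleFitnessCache(std::function<FitnessVector(const Chromosome&)> f) :
        fitness_function_(std::move(f))
    {}

    const FitnessVector& evaluate(const Chromosome& chrom)
    {
        auto it = cache_.find(chrom);
        if (it == cache_.end())
        {
            // Cache miss: this is the only place the fitness function is called.
            ++num_evals_;
            it = cache_.emplace(chrom, fitness_function_(chrom)).first;
        }
        return it->second;
    }

    std::size_t num_evals() const noexcept { return num_evals_; }

private:
    std::function<FitnessVector(const Chromosome&)> fitness_function_;
    std::unordered_map<Chromosome, FitnessVector, ChromosomeHash> cache_;
    std::size_t num_evals_ = 0;
};
```

Every lookup pays for hashing and comparing a chromosome, which is the kind of overhead that only pays off when the fitness function itself is expensive. A cache like this is also only valid for static fitness functions.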

## Other concerns

### Numeric issues

The library in general only assumes that the fitness values returned
The library, in general, only assumes that the fitness values returned
by the fitness function are valid numbers (i.e. no `NaN` values will
be returned by it).
be returned by the fitness function).

Whether infinite fitness values are allowed or not depends on the
selection method used in the GA. If the fitness function can return
@@ -190,8 +226,9 @@ any issues.

### Thread safety

The candidate solutions in the population are evaluated concurrently
in each generation of a run. As a result of this, the implementation
of the `invoke` method in the derived fitness functions must be thread-safe.
The candidate solutions of a population are evaluated concurrently
in each generation of a run. This means that the implementation
of the `invoke` method in the derived fitness functions should
be thread-safe.
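
One common way to meet this requirement is to keep `invoke` free of shared mutable state, or to make any such state atomic. A minimal sketch, assuming a hypothetical `CountingFitness` class that does not derive from the library's base classes and uses plain `std::vector<double>` in place of `Chromosome` and `FitnessVector`:

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical fitness function that counts how many times it was called.
class CountingFitness
{
public:
    std::vector<double> invoke(const std::vector<double>& x) const
    {
        // fetch_add is atomic, so concurrent calls can't lose increments.
        num_calls_.fetch_add(1, std::memory_order_relaxed);

        // The rest of the function only touches its argument and local variables.
        double sum = 0.0;
        for (double gene : x) sum += gene * gene;
        return { -sum };
    }

    std::size_t num_calls() const noexcept { return num_calls_.load(std::memory_order_relaxed); }

private:
    mutable std::atomic<std::size_t> num_calls_{ 0 };
};
```

A plain `std::size_t` counter incremented inside `invoke` would be a data race once the candidates are evaluated concurrently.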

------------------------------------------------------------------------------------------------
2 changes: 1 addition & 1 deletion examples/4_fitness_functions.cpp
@@ -32,7 +32,7 @@ class XSquareMulti : public FitnessFunction<RealGene, 1>
class XSquareDynamic : public FitnessFunction< RealGene, 1>
{
public:
XSquareDynamic() : FitnessFunction(/* dynamic = */ true) {}
XSquareDynamic() : FitnessFunction(Type::Dynamic) {}

FitnessVector invoke(const Chromosome<RealGene>& x) const override
{
28 changes: 26 additions & 2 deletions gapp.natvis
@@ -2,7 +2,7 @@
<AutoVisualizer xmlns="http://schemas.microsoft.com/vstudio/debugger/natvis/2010">

<Type Name="gapp::detail::Matrix&lt;*,*&gt;">
<DisplayString>{{ Matrix&lt;{"$T1",sb}, {"$T2",sb}&gt;{{ size = {nrows_}x{ncols_} }} }}</DisplayString>
<DisplayString>{{ {{ size = {nrows_}x{ncols_} }} }}</DisplayString>
<Expand>
<Item Name="[nrows]">nrows_</Item>
<Item Name="[ncols]">ncols_</Item>
@@ -17,7 +17,7 @@
</Type>

<Type Name="gapp::small_vector&lt;*,*,*&gt;">
<DisplayString>{{ small_vector&lt;{"$T1",sb}, {"$T2",sb}&gt;{{ size = { last_ - first_ }, capacity = { last_alloc_ - first_ } }} }}</DisplayString>
<DisplayString>{{ {{ size = { last_ - first_ }, capacity = { last_alloc_ - first_ } }} }}</DisplayString>
<Expand>
<ArrayItems>
<Size>last_ - first_</Size>
@@ -26,4 +26,28 @@
</Expand>
</Type>

<Type Name="gapp::detail::dynamic_bitset">
<DisplayString>
{{ {{ size = { size_ }, capacity = { block_size * (blocks_.last_alloc_ - blocks_.first_) } }} }}
</DisplayString>
<Expand>
<IndexListItems>
<Size>size_</Size>
<ValueNode>(bool)(blocks_.first_[$i / block_size] &amp; (block_type(1) &lt;&lt; ($i % block_size)))</ValueNode>
</IndexListItems>
</Expand>
</Type>

<Type Name="gapp::detail::circular_buffer&lt;*,*&gt;">
<DisplayString>
{{ {{ size = { size_ }, capacity = { capacity_ } }} }}
</DisplayString>
<Expand>
<IndexListItems>
<Size>size_</Size>
<ValueNode>*(buffer_ + (first_ + $i &gt;= capacity_ ? first_ + $i - capacity_ : first_ + $i))</ValueNode>
</IndexListItems>
</Expand>
</Type>

</AutoVisualizer>
13 changes: 13 additions & 0 deletions src/core/candidate.hpp
@@ -202,4 +202,17 @@ namespace gapp

} // namespace gapp

namespace std
{
template<typename T>
struct hash<gapp::Candidate<T>>
{
std::size_t operator()(const gapp::Candidate<T>& candidate) const noexcept
{
return gapp::CandidateHasher<T>{}(candidate);
}
};

} // namespace std

#endif // !GA_CORE_CANDIDATE_HPP
47 changes: 32 additions & 15 deletions src/core/fitness_function.hpp
@@ -23,18 +23,32 @@ namespace gapp
class FitnessFunctionInfo
{
public:
/**
* The list of potential fitness function types.
* A fitness function may either be static or dynamic.
*
* @var Type::Static The value representing a static fitness function. A fitness function
* is considered static if it always returns the same fitness vector for a particular
* candidate solution.
* @var Type::Dynamic The value representing a dynamic fitness function. A fitness function
* is considered to be dynamic if it may return different fitness vectors for the same
* candidate solution over multiple calls to the fitness function.
*/
enum class Type { Static = 0, Dynamic = 1 };

/**
* Create a fitness function.
*
* @param chrom_len The chromosome length that is expected by the fitness function,
* and will be used for the candidate solutions in the GA. \n
* Must be at least 1, and a value must be specified even if the chromosome lengths are variable,
* as it will still be used to generate the initial population.
* @param is_dynamic Should be true if the fitness vector returned for a chromosome will not
* always be the same for the same chromosome (eg. it changes over time or isn't deterministic).
* and will be used for the candidate solutions in the GA.
* Must be at least 1, and a value must be specified even if the chromosome length
* is variable, as it will still be used to generate the initial population.
* @param type The type of the fitness function. The value should be either Type::Static
* or Type::Dynamic, based on whether the fitness function always returns the same
* fitness vector for a solution (static) or not (dynamic).
*/
constexpr FitnessFunctionInfo(Positive<size_t> chrom_len, bool is_dynamic = false) noexcept :
chrom_len_(chrom_len), is_dynamic_(is_dynamic)
constexpr FitnessFunctionInfo(Positive<size_t> chrom_len, Type type = Type::Static) noexcept :
chrom_len_(chrom_len), type_(type)
{}

/** @returns The chromosome length the fitness function expects. */
@@ -43,7 +57,7 @@ namespace gapp

/** @returns True if the fitness function is dynamic. */
[[nodiscard]]
constexpr bool is_dynamic() const noexcept { return is_dynamic_; }
constexpr bool is_dynamic() const noexcept { return type_ == Type::Dynamic; }

/** Destructor. */
virtual ~FitnessFunctionInfo() = default;
@@ -57,16 +71,16 @@ namespace gapp

private:
Positive<size_t> chrom_len_;
bool is_dynamic_ = false;
Type type_;
};

/**
* The base class of the fitness functions used in the GAs.
* The fitness functions take a candidate solution (chromosome) as a parameter
* and return a fitness vector after evaluating the chromosome.
*
* This should be used as the base class for fitness functions if the chromosome length
* is not known at compile time.
* This should be used as the base class for fitness functions if the chromosome
* length is not known at compile time.
* If the chromosome length is known at compile time, use FitnessFunction as the base class instead.
*
* @tparam T The gene type expected by the fitness function.
@@ -114,14 +128,17 @@ namespace gapp
class FitnessFunction : public FitnessFunctionBase<T>
{
public:
using Type = FitnessFunctionInfo::Type;

/**
* Create a fitness function.
*
* @param dynamic Should be true if the fitness vector returned for a chromosome will not
* always be the same for the same chromosome (eg. it changes over time or isn't deterministic).
* @param type The type of the fitness function. The value should be either Type::Static
* or Type::Dynamic, based on whether the fitness function always returns the same
* fitness vector for a solution (static) or not (dynamic).
*/
constexpr FitnessFunction(bool dynamic = false) noexcept :
FitnessFunctionBase<T>(ChromLen, dynamic)
constexpr FitnessFunction(Type type = Type::Static) noexcept :
FitnessFunctionBase<T>(ChromLen, type)
{}
};
