Scripts with Evolutionary Optimization (Experimental)
This section is experimental meaning that the content described here has not been tested thoroughly yet and might undergo patches. |
The Idea
The idea of this approach is to provide a universal interface for marking certain paramters as optimizable.
Then we built upon this idea by using the evolutionary algorithm to perform the optimization. The Arena
module pits bots against each other and helps to determine the fitness function for the evolutionary algorithm - that is - the way to tell how good the current values of the parameters are.
EvoParams and EvoScripts
The EvoParam
is the base class for optimizable parameters. It is a template/generic class. Do not inherit from it but instead use it directly and provide the type for your parameter.
Because the optimization process is discrete, it is necessary to also set the domain for the parameter. The domain is a list of possible values that the parameter may have. The optimization algorithm will be allowed to test various values from the given domain.
Below, you can find an example of an EvoParam
that encodes base damage done by a weapon in a game. Let say, during the design process, you are not sure how strong the weapon should be, but you know it should be somewhere between 50 and 70.
-
C++
-
C#
grail::evolution::EvoParam<float> damageParam({ 50.0f, 55.0f, 60.0f, 65.0f, 70.0f });
EvoParam<float> damageParam = new EvoParam<float>(50.0f, 55.0f, 60.0f, 65.0f, 70.0f);
Please note that any EvoParam
is implicitly convertible to the template/generic type it encapsulates. So you can, for example, do this:
if(damageParam > 20.0f) //you can do this even though damageParam is of type EvoParam<float> { //Do something }
In order to take advantage of such parameters, a class needs to inherit from EvoScript
. The base class has several methods, but below we show the two most important ones.
-
The first one
UserClone()
is the only one that needs to be implemented in the derived class. Return the copy of your script here. You can also find a non-virtualSystemClone()
method available inEvoScript
. This method calls the providedUserClone()
and also copies values of all the registeredEvoParams
as well as the current fitness value of the script. This happens internally, so inUserClone()
you only need to copy the actual data that is introduced in the derived class. -
The second one -
RegisterParam
- is the one to be used to tell the algorithm which parameter is optimizable in a script.
-
C++
-
C#
//EvoScript.hh
class EvoScript
{
...
/// <summary>
/// Make and return a new copy of the type of your derived class from EvoScript - your complete player being optimized.
/// You do not need to set concrete values of EvoParams, because they will be synchronized automatically.
/// </summary>
virtual std::unique_ptr<EvoScript> UserClone() const = 0;
void RegisterParam(IEvoParam& parameter);
...
}
public abstract class EvoScript : IComparable<EvoScript>, IComparer<EvoScript>, IFitnessProvider
{
...
/// <summary>
/// Make and return a new copy of the type of your derived class from EvoScript - your complete player being optimized.
/// You do not need to set concrete values of EvoParams, because they will be synchronized automatically.
/// </summary>
public abstract EvoScript UserClone();
protected void RegisterParam(IEvoParam parameter);
...
}
The EvoScript
allows us to use the RegisterParam
method. An optimizable parameter has to be registered, otherwise it will not be seen by the optimization algorithm.
There are more functions to make your life easier with using EvoScripts
, so you may need to consult the code or API reference. They will be described in the manual when the module has no longer the experimental status.
Below, you can find an example of a class that inherits from EvoScript
.
-
C++
-
C#
#include "GrailEvolution/EvoScript.hh"
using namespace grail::evolution; //for simplicity of the example
struct TestBot : public EvoScript
{
EvoParam<int> Attack = EvoParam<int>({ 50, 100, 150 });
EvoParam<int> Health = EvoParam<int>({ 100, 200, 300 });
TestBot();
virtual std::unique_ptr<EvoScript> UserClone() const override;
};
TestBot::TestBot()
{
RegisterParam(Attack);
RegisterParam(Health);
}
//no need to clone anything inside because the only fields are EvoParams which are cloned automatically by SystemClone()
std::unique_ptr<EvoScript> TestBot::UserClone() const
{
return std::unique_ptr<EvoScript>(new TestBot());
}
using Grail.Evolution;
public class TestBot : EvoScript
{
public EvoParam<int> Attack { get; private set} = new EvoParam<int>(50, 100, 150);
public EvoParam<int> Health { get; private set} = new EvoParam<int>(50, 100, 150);
public TestBot()
{
RegisterParams(Attack, Health);
//or
//RegisterParam(Attack);
//RegisterParam(Health);
}
//no need to clone anything inside because the only fields are EvoParams which are cloned automatically by SystemClone()
public override EvoScript UserClone() => new TestBot();
}
Arena
Arena is an object that evaluates the quality of scripts in some automatic manner. The most common way to do this are:
-
Tournaments - let the various bots play against each other and assign higher fitness value to bots that win more often.
-
Corresponding competition - let the bots play the game on their own and assign fitness value proportional to some measure, e.g. the game score (if there is one) or completion time (the lower the better). You can use any desired measure here. Maybe you want the bots to be neither too fast nor too slow - assign the fitness that is proportional to the distance from some sensible desired value.
In future versions, we will release a dedicated Arena
that works in a client-server architecture and communicates using the GRPC
protocol.
This enables to take advantage of multiple CPU cores and speed up the optimization process.
As of current, you can provide your own Arena
specialization that inherits from the Arena
class.
The base class looks as follows:
-
C++
-
C#
//grail/Evolution/Arena.hh
namespace grail
{
namespace evolution
{
class EvoScript;
/**
The Arena class is an interface for the evaluator of the quality of players represented by EvoScripts.
It is used either stand-alone or in the in the evaluation phase of Evolutionary Algorithm optimization */
class Arena
{
public:
/// <summary>
/// This function's responsibility is to evaluate EvoScripts and assign the value to them using setFitness().
/// If your method of evaluation is deterministic and absolute (i.e. fitness only depends on the EvoScript object itself) then you might ignore the @onceEvaluatedList.
/// If your method is non-deterministic or relative (e.g., tournament-based) to the whole population, then you might evaluate @onceEvaluatedList again within the scope of the current population.
/// </summary>
/// <param name="onceEvaluatedList">EvoScript objects that have been evaluated at least once in previous iterations. The evaluation might need to be updated.</param>
/// <param name="neverEvaluatedList">EvoScript objects that have never been evaluated. Assign Fitness to them in this method.</param>
virtual void Evaluate(std::vector<std::unique_ptr<EvoScript>>& onceEvaluatedPlayers, std::vector<std::unique_ptr<EvoScript>>& neverEvaluatedPlayers) = 0;
virtual ~Arena();
};
}
}
namespace Grail.Evolution
{
/** The Arena class is an interface for the evaluator of the quality of players represented by EvoScripts.
It is used either stand-alone or in the in the evaluation phase of Evolutionary Algorithm optimization */
public abstract class Arena
{
/// <summary>
/// This function's responsibility is to evaluate EvoScripts and assign the value to them using the [.Fitness] getter.
/// If your method of evaluation is deterministic and absolute (i.e. fitness only depends on the EvoScript object itself) then you might ignore the @onceEvaluatedList.
/// If your method is non-deterministic or relative (e.g., tournament-based) to the whole population, then you might evaluate @onceEvaluatedList again within the scope of the current population.
/// </summary>
/// <param name="onceEvaluatedList">EvoScript objects that have been evaluated at least once in previous iterations. The evaluation might need to be updated.</param>
/// <param name="neverEvaluatedList">EvoScript objects that have never been evaluated. Assign Fitness to them in this method.</param>
public abstract void Evaluate(List<EvoScript> onceEvaluatedList, List<EvoScript> neverEvaluatedList);
/// <summary>
/// A helper class that will: for each *bot* in @bots test it against randomly chosen @trialCount another bots and call the compareTwoFunction to score the bot.
/// This function will assign Fitness to each bot based on the average score obtained in the process.
/// </summary>
/// <param name="bots">The population of bots to be tested.</param>
/// <param name="trialCount">The number of times, each bot is tested against some subset of population.</param>
/// <param name="maxScoreInTrial">The maximum score that a bot can get in a single test.</param>
/// <param name="compareTwoFunction">The function that test a pair of bots and returns score from the first one's perspective.</param>
/// <param name="random">Random numbers generator.</param>
protected virtual void TwoPlayerTournamentSelection(List<EvoScript> bots, int trialCount, float maxScoreInTrial, Func<EvoScript, EvoScript, float> compareTwoFunction, Random random)
{
float normalization = 1.0f / (trialCount * maxScoreInTrial);
foreach (EvoScript bot in bots)
{
float wins = 0;
for (int i = 0; i < trialCount; ++i)
{
wins += compareTwoFunction(bot, bots[random.Next(bots.Count)]);
}
bot.Fitness = wins * normalization;
}
}
}
}
Inside the Evaluate() function assign fitness by using the SetFitness() function in C++ or Fitness property in C# of the EvoScript .
|
Please read the code comments the methods carefully. The Evaluate
method has two parameters - onceEvaluatedList
and neverEvaluatedList
in case the evaluation may change in time.
An example of such changing evaluation is when it is based on the number of wins against other bots. Because the other bots may change, then such a relative evaluation may change too.
Below is an example of Arena implementation. Although the fitness proportional to both health and attack of a bot makes sense, it is only to show the idea. Naturally, in a real-world scenario, the fitness would not be calculated so easily and would require running some games.
-
C++
-
C#
#include "grail/Evolution/Arena.hh"
using namespace grail::evolution; //for simplicity of the example
class BotArena : public Arena
{
public:
virtual void Evaluate(std::vector<std::unique_ptr<EvoScript>>& onceEvaluatedList, std::vector<std::unique_ptr<EvoScript>>& neverEvaluatedList) override;
virtual ~BotArena() { } override;
};
void BotArena::Evaluate(std::vector<std::unique_ptr<EvoScript>>& /*onceEvaluatedList*/, std::vector<std::unique_ptr<EvoScript>>& neverEvaluatedList)
{
for (auto& script : neverEvaluatedList)
{
auto ourBot = static_cast<TestBot&>(*script);
//the fitness is higher proportionally to the Health and Attack
script->SetFitness(static_cast<float>(ourBot.Health*ourBot.Attack));
}
}
using Grail.Evolution;
public class BotArena : Arena
{
public override void Evaluate(List<EvoScript> onceEvaluatedList, List<EvoScript> neverEvaluatedList)
{
foreach (var script in neverEvaluatedList)
{
var ourBot = script as TestBot;
//the fitness is higher proportionally to the Health and Attack
script.Fitness = ourBot.Attack * ourBot.Health;
}
}
}
Evolutionary Algorithm
This is the optimization algorithm that uses the Arena
and EvoScripts
introduced earlier.
First construct the EAOptimizer
object. The constructor looks as follows:
-
C++
-
C#
EAOptimizer(std::vector<std::unique_ptr<EvoScript>>& initialPopulation,
std::unique_ptr<Arena> arenaEvaluation,
std::unique_ptr<Crossover> crossover = std::make_unique<Crossover>(),
std::unique_ptr<Mutation> mutation = std::make_unique<Mutation>(),
std::unique_ptr<Selection> selection = std::make_unique<Selection>(),
std::mt19937_64::result_type seed = std::random_device{}());
public EAOptimizer(List<EvoScript> initialPopulation, Arena arenaEvaluation, Mutation mutation = null, Crossover crossover = null, Selection selection = null)
{
this.selection = selection ?? new SelectionDeterministicTopN();
this.mutation = mutation ?? new Mutation();
this.crossover = crossover ?? new Crossover();
this.tempPopulation = new List<EvoScript>();
//additional internal initialization
...
}
The mutation
, crossover
and selection
objects are optional. If they are not provided, the default ones will be used.
Finally, run the evolutionary process by calling the Run()
method and passing the number of epochs (iterations). Below you can find the example:
-
C++
-
C#
#include "grail/Evolution/EAOptimizer.hh"
#include "grail/Evolution/Selection.hh"
#include "grail/Evolution/Crossover.hh"
#include "grail/Evolution/Mutation.hh"
#include "grail/Evolution/Arena.hh"
using namespace grail::evolution; //for simplicity of the example
//constructor with default mutation, crossover and selection
EAOptimizer ea(initialPopulation, std::make_unique<BotArena>());
//50 epochs of the algorithm
ea.Run(50);
//getting the winner script
auto winner = static_cast<TestBot&>(ea.GetBestIndividual());
using Grail.Evolution;
//constructor with default mutation, crossover and selection
EAOptimizer ea = new EAOptimizer(initialPopulation, new BotArena();
//50 epochs of the algorithm
ea.Run(50);
//getting the winner script
auto winner = ea.GetBestIndividual() as TestBot;
Some properties of our implementation
-
The evolutionary algorithm first performs crossover and then mutation.
-
The algorithm maintains both the parent and children populations before the selection.
-
The evolutionary algorithm will maintain the size of the initial population after the selection phase. If the children population is too small (e.g. too low crossover rate) then it will be filled with individuals from the parent population.
Crossover
The following configuration parameters are available.
-
CrossoverPhaseRate - probability that the crossover phase will occur in an iteration. This is the first check regarding the crossover - if equal to
FALSE
then no children population will be produced. The default value is1.0
which means that crossover is always considered. -
CrossoverSelectionType - the way of choosing individuals for crossover. The default value is
PSEUDO_ROULETTE
.-
RANDOM
- individuals are chosen for crossover using uniform random distribution. -
ROULETTE
- individuals are chosen for crossover using the roulette wheel selection proportially to theirFitness
value. -
TOP
- only the best fit individuals are chosen for crossover. There is no randomness here. -
PSEUDO_ROULETTE
-PseudoRouletteRandomPortion
*PopulationSize
of individuals will be chosen at random and the rest according to the roulette wheel selection. -
CONSECUTIVE_PAIRS
- as the name implies, the individual will cross over with its successor in the list maintained by the algorithm. Here theCrossoverIndividualRate
parameter is ignored, meaning that each individual (if the population size is even) will participate in the crossover. If the population size is odd, then the last individual will not be used in the crossover. This is a rather unusual procedure, that sometimes works.
-
-
PseudoRouletteRandomPortion - used if
CrossoverSelectionType
=PSEUDO_ROULETTE
. The default value is0.3
. -
CrossoverIndividualRate - the average rate of crossover. If
N
is thePopulationSize
, then the expected number of children is equal toN
*CrossoverIndividualRate
. The default value is0.45
.
It is recommended to use the ready-to-use default Crossover object. However, if you want to provide your own implementation, then inherit from Crossover and override the CrossoverIndividuals method. This way you take advantage of the built-in selection of individuals for crossover and only implement the actual procedure of combining two individuals.
|
Mutation
The following configuration parameters are available.
-
MutationPhaseRate - probability that the mutation phase will occur in an iteration. This is the first check regarding the mutation - if equal to
FALSE
then no further actions regarding mutation will be done in the current iteration (epoch) of the algorithm. The default value is1.0
which means that mutation is always considered. -
RandomResolutionTypeForIndividual - the formula of selecting individuals for mutation. Two options are possible: {
ITERATE_AND_TEST_P
,SELECT_P_OF_N
}. The default value isSELECT_P_OF_N
.-
ITERATE_AND_TEST_P
- a loop will iterate over individuals and mutate each one with probability equal toMutationIndividualRate
. -
SELECT_P_OF_N
- the number equal toMutationIndividualRate
*PopulationSize
clamped to the [0
,PopulationSize
] interval will be selected for mutation.
-
-
MutationIndividualRate - probability that an individual will mutate (provided that the mutation happens in the current iteration). See above how this parameter interferes with
MutationSelectionType
. The default value is0.09
. This means that approximately9%
of the population will be mutated. -
MutationChromosomeRate - probability that a chromosome of an individual that designated for mutation will change. In the case of EvoScripts, the chromosome is a single EvoParam. A vector of chromosome (EvoParams) makes the individual - the portion of an EvoScript that can be optimized. The default value is
0.5
. This means that half of the individual will be mutated. -
RandomResolutionTypeForChromosome - an analogous to
MutationSelectionType
but this time for selecting chromosomes from a individual rather than individuals from the population. -
MutateBothParentsAndChildren - if
TRUE
then both individuals coming from the previous population and children can be mutated. IfFALSE
then only the individuals after crossover can be mutated. The default value isTRUE
.
It is recommended to use the ready-to-use default Mutation object. However, if you want to provide your own implementation, then inherit from Mutation and you can choose at which point you want to change the default procedure. You can override Perform to change it completely, MutateIndividual to change mutation of the EvoScript or MutateChromosome to just change the mutation of EvoParam .
|
Selection
There is a default selection object, which is "fed" by individuals by EAOptimizer
and among them performs the top N
cutoff from individuals sorted by Fitness
value.
Using the EAOptimizer
object, you can set the ElitismType
enum and optionally the ElitismRate
field to affect which individuals are "fed" to the selection.
-
ElitismType - affects how individuals are chosen to the next iteration. The default value is
ROULETTE
.-
NONE
- just the the topN
individuals from combined populations of children and parents will advance to the next iteration, whereN = PopulationSize
. -
FITNESS_RANKING
- theElitismRate
*PopulationSize
best fit individuals from the previous population (parents) must remain. The rest of the new population, i.e.,(1-ElitismRate)
*PopulationSize
will be chosen amongst the combined populations of parents and children. -
ROULETTE
- the number equal toElitismRate
*PopulationSize
individuals from previous population (parents) will be chosen by roulette sampling proportional to their fitness value. The rest of the new population, i.e.,(1-ElitismRate)
*PopulationSize
will be chosen amongst the combined populations of parents and children.
-
-
ElitismRate - a parameter used when
ElitismType
=FITNESS_RANKING
orROULETTE
. The default value is0.2
.
Although there is a default selection object available, you can also override the Selection
class and provide logic that choses the number of individuals equal to populationMaxSize
to advance to the next iteration. The population passed to the selection object as argument is already sorted by Fitness
value.
Just a reminder that the Fitness value has to be assigned to individuals by the Arena .
|
-
C++
-
C#
class Selection
{
public:
virtual void Perform(std::vector<std::unique_ptr<EvoScript>>& population, size_t populationMaxSize);
virtual ~Selection();
};
public abstract class Selection
{
public abstract List<EvoScript> Perform(List<EvoScript> currentPopulation, int populationMaxSize);
}
Inspiration for Usage
The two main usages are:
-
Direct optimization of bot’s parameters - inherit your bot from
EvoScript
and make its optimizable parameters asEvoParams
. This is a simpler case for which we have shown examples above. -
Optimization of any other parameter in a game (e.g. a map) - prepare some construction script that inherits from
EvoScript
and make some of its parameters asEvoParams
. Such a script may then create a bot or some part of the game environment in which the bots play. Then, those bots may play on the Arena and indirectly evaluate the construction scripts that were used in the beginning. In the first case you could directly useEvoParams
. Here we have a trickier case that will often require doing some preprocessing of the environment based upon theEvoParams
each time a newEvoScript
object is cloned.
INVALID API LINK