Scripts with Evolutionary Optimization (Experimental)

This section is experimental meaning that the content described here has not been tested thoroughly yet and might undergo patches.

The Idea

The idea of this approach is to provide a universal interface for marking certain paramters as optimizable. Then we built upon this idea by using the evolutionary algorithm to perform the optimization. The Arena module pits bots against each other and helps to determine the fitness function for the evolutionary algorithm - that is - the way to tell how good the current values of the parameters are.

EvoParams and EvoScripts

The EvoParam is the base class for optimizable parameters. It is a template/generic class. Do not inherit from it but instead use it directly and provide the type for your parameter. Because the optimization process is discrete, it is necessary to also set the domain for the parameter. The domain is a list of possible values that the parameter may have. The optimization algorithm will be allowed to test various values from the given domain.

Below, you can find an example of an EvoParam that encodes base damage done by a weapon in a game. Let say, during the design process, you are not sure how strong the weapon should be, but you know it should be somewhere between 50 and 70.

  • C++

  • C#

EvoParam<float> damageParam = EvoParam<float>({ 50.0f, 55.0f, 60.0f, 65.0f, 70.0f });
EvoParam<float> damageParam = new EvoParam<float>(50.0f, 55.0f, 60.0f, 65.0f, 70.0f);

Please note that any EvoParam is implicitly convertible to the template/generic type it encapsulates. So you can, for example, do this:

if(damageParam > 20.0f) //you can do this even though damageParam is of type EvoParam<float>
{
    //Do something
}

In order to take advantage of such parameters, a class needs to inherit from EvoScript. The base class has several methods, but below we show the two most important ones.

  1. The first one UserClone() is the only one that needs to be implemented in the derived class. Return the copy of your script here. You can also find a non-virtual SystemClone() method available in EvoScript. This method calls the provided UserClone() and also copies values of all the registered EvoParams as well as the current fitness value of the script. This happens internally, so in UserClone() you only need to copy the actual data that is introduced in the derived class.

  2. The second one - RegisterParam - is the one to be used to tell the algorithm which parameter is optimizable in a script.

  • C++

  • C#

//EvoScript.hh

class EvoScript
{
    ...

   /// <summary>
   /// Make and return a new copy of the type of your derived class from EvoScript - your complete player being optimized.
   /// You do not need to set concrete values of EvoParams, because they will be synchronized automatically.
   /// </summary>
   virtual std::unique_ptr<EvoScript> UserClone() const = 0;

   void RegisterParam(IEvoParam& parameter);

    ...
}
public abstract class EvoScript : IComparable<EvoScript>, IComparer<EvoScript>, IFitnessProvider
{
    ...

    /// <summary>
    /// Make and return a new copy of the type of your derived class from EvoScript - your complete player being optimized.
    /// You do not need to set concrete values of EvoParams, because they will be synchronized automatically.
    /// </summary>
    public abstract EvoScript UserClone();

    protected void RegisterParam(IEvoParam parameter);

    ...
}

The EvoScript allows us to use the RegisterParam method. An optimizable parameter has to be registered, otherwise it will not be seen by the optimization algorithm. There are more functions to make your life easier with using EvoScripts, so you may need to consult the code or API reference. They will be described in the manual when the module has no longer the experimental status.

Below, you can find an example of a class that inherits from EvoScript.

  • C++

  • C#

#include "grail/EvoScripts/EvoScript.hh"
using namespace grail::evolution; //for simplicity of the example

struct TestBot : public EvoScript
{
  EvoParam<int> Attack = EvoParam<int>({ 50, 100, 150 });
  EvoParam<int> Health = EvoParam<int>({ 100, 200, 300 });

  TestBot();

  virtual std::unique_ptr<EvoScript> UserClone() const override;
};

TestBot::TestBot()
{
    RegisterParam(Attack);
    RegisterParam(Health);
}

//no need to clone anything inside because the only fields are EvoParams which are cloned automatically by SystemClone()
std::unique_ptr<EvoScript> TestBot::UserClone() const
{
  return std::unique_ptr<EvoScript>(new TestBot());
}
using Grail.Evolution;

public class TestBot : EvoScript
{
    public EvoParam<int> Attack { get; private set} = new EvoParam<int>(50, 100, 150);
    public EvoParam<int> Health { get; private set} = new EvoParam<int>(50, 100, 150);

    public TestBot()
    {
        RegisterParams(Attack, Health);

        //or
        //RegisterParam(Attack);
        //RegisterParam(Health);
    }

    //no need to clone anything inside because the only fields are EvoParams which are cloned automatically by SystemClone()
    public override EvoScript UserClone() => new TestBot();
}

Arena

Arena is an object that evaluates the quality of scripts in some automatic manner. The most common way to do this are:

  1. Tournaments - let the various bots play against each other and assign higher fitness value to bots that win more often.

  2. Corresponding competition - let the bots play the game on their own and assign fitness value proportional to some measure, e.g. the game score (if there is one) or completion time (the lower the better). You can use any desired measure here. Maybe you want the bots to be neither too fast nor too slow - assign the fitness that is proportional to the distance from some sensible desired value.

In future versions, we will release a dedicated Arena that works in a client-server architecture and communicates using the GRPC protocol. This enables to take advantage of multiple CPU cores and speed up the optimization process.

As of current, you can provide your own Arena specialization that inherits from the Arena class. The base class looks as follows:

  • C++

  • C#

//grail/Evolution/Arena.hh
namespace grail
{
  namespace evolution
  {
    class EvoScript;
    /**
    The Arena class is an interface for the evaluator of the quality of players represented by EvoScripts.
    It is used either stand-alone or in the in the evaluation phase of Evolutionary Algorithm optimization */
    class Arena
    {
    public:
      /// <summary>
      /// This function's responsibility is to evaluate EvoScripts and assign the value to them using setFitness().
      /// If your method of evaluation is deterministic and absolute (i.e. fitness only depends on the EvoScript object itself) then you might ignore the @onceEvaluatedList.
      /// If your method is non-deterministic or relative (e.g., tournament-based) to the whole population, then you might evaluate @onceEvaluatedList again within the scope of the current population.
      /// </summary>
      /// <param name="onceEvaluatedList">EvoScript objects that have been evaluated at least once in previous iterations. The evaluation might need to be updated.</param>
      /// <param name="neverEvaluatedList">EvoScript objects that have never been evaluated. Assign Fitness to them in this method.</param>
    virtual void Evaluate(std::vector<std::unique_ptr<EvoScript>>& onceEvaluatedPlayers, std::vector<std::unique_ptr<EvoScript>>& neverEvaluatedPlayers) = 0;

      virtual ~Arena();
    };
  }
}
namespace Grail.Evolution
{
  /** The Arena class is an interface for the evaluator of the quality of players represented by EvoScripts.
  It is used either stand-alone or in the in the evaluation phase of Evolutionary Algorithm optimization */
  public abstract class Arena
  {
    /// <summary>
    /// This function's responsibility is to evaluate EvoScripts and assign the value to them using the [.Fitness] getter.
    /// If your method of evaluation is deterministic and absolute (i.e. fitness only depends on the EvoScript object itself) then you might ignore the @onceEvaluatedList.
    /// If your method is non-deterministic or relative (e.g., tournament-based) to the whole population, then you might evaluate @onceEvaluatedList again within the scope of the current population.
    /// </summary>
    /// <param name="onceEvaluatedList">EvoScript objects that have been evaluated at least once in previous iterations. The evaluation might need to be updated.</param>
    /// <param name="neverEvaluatedList">EvoScript objects that have never been evaluated. Assign Fitness to them in this method.</param>
    public abstract void Evaluate(List<EvoScript> onceEvaluatedList, List<EvoScript> neverEvaluatedList);


    /// <summary>
    /// A helper class that will: for each *bot* in @bots test it against randomly chosen @trialCount another bots and call the compareTwoFunction to score the bot.
    /// This function will assign Fitness to each bot based on the average score obtained in the process.
    /// </summary>
    /// <param name="bots">The population of bots to be tested.</param>
    /// <param name="trialCount">The number of times, each bot is tested against some subset of population.</param>
    /// <param name="maxScoreInTrial">The maximum score that a bot can get in a single test.</param>
    /// <param name="compareTwoFunction">The function that test a pair of bots and returns score from the first one's perspective.</param>
    /// <param name="random">Random numbers generator.</param>
    protected virtual void TwoPlayerTournamentSelection(List<EvoScript> bots, int trialCount, float maxScoreInTrial, Func<EvoScript, EvoScript, float> compareTwoFunction, Random random)
    {
      float normalization = 1.0f / (trialCount * maxScoreInTrial);
      foreach (EvoScript bot in bots)
      {
        float wins = 0;
        for (int i = 0; i < trialCount; ++i)
        {
          wins += compareTwoFunction(bot, bots[random.Next(bots.Count)]);
        }
        bot.Fitness = wins * normalization;
      }
    }
  }
}
Inside the Evaluate() function assign fitness by using the SetFitness() function in C++ or Fitness property in C# of the EvoScript.

Please read the code comments the methods carefully. The Evaluate method has two parameters - onceEvaluatedList and neverEvaluatedList in case the evaluation may change in time. An example of such changing evaluation is when it is based on the number of wins against other bots. Because the other bots may change, then such a relative evaluation may change too.

Below is an example of Arena implementation. Although the fitness proportional to both health and attack of a bot makes sense, it is only to show the idea. Naturally, in a real-world scenario, the fitness would not be calculated so easily and would require running some games.

  • C++

  • C#

#include "grail/Evolution/Arena.hh"
using namespace grail::evolution; //for simplicity of the example

class BotArena : public Arena
{
public:
  virtual void Evaluate(std::vector<std::unique_ptr<EvoScript>>& onceEvaluatedList, std::vector<std::unique_ptr<EvoScript>>& neverEvaluatedList) override;
  virtual ~BotArena() { } override;
};

void BotArena::Evaluate(std::vector<std::unique_ptr<EvoScript>>& /*onceEvaluatedList*/, std::vector<std::unique_ptr<EvoScript>>& neverEvaluatedList)
{
  for (auto& script : neverEvaluatedList)
  {
    auto ourBot = static_cast<TestBot&>(*script);

    //the fitness is higher proportionally to the Health and Attack
    script->SetFitness(static_cast<float>(ourBot.Health*ourBot.Attack));
  }
}
using Grail.Evolution;

public class BotArena : Arena
{
  public override void Evaluate(List<EvoScript> onceEvaluatedList, List<EvoScript> neverEvaluatedList)
  {
    foreach (var script in neverEvaluatedList)
    {
      var ourBot = script as TestBot;

      //the fitness is higher proportionally to the Health and Attack
      script.Fitness = ourBot.Attack * ourBot.Health;
    }
  }
}

API Reference

  • C++

  • C#

Evolutionary Algorithm

This is the optimization algorithm that uses the Arena and EvoScripts introduced earlier.

First construct the EAOptimizer object. The constructor looks as follows:

  • C++

  • C#

EAOptimizer(std::vector<std::unique_ptr<EvoScript>>& initialPopulation,
            std::unique_ptr<Arena> arenaEvaluation,
            std::unique_ptr<Crossover> crossover = std::make_unique<Crossover>(),
            std::unique_ptr<Mutation> mutation = std::make_unique<Mutation>(),
            std::unique_ptr<Selection> selection = std::make_unique<Selection>(),
            std::mt19937_64::result_type seed = std::random_device{}());
public EAOptimizer(List<EvoScript> initialPopulation, Arena arenaEvaluation,  Mutation mutation = null, Crossover crossover = null, Selection selection = null)
{
    this.selection = selection ?? new SelectionDeterministicTopN();
    this.mutation = mutation ?? new Mutation();
    this.crossover = crossover ?? new Crossover();
    this.tempPopulation = new List<EvoScript>();

    //additional internal initialization
    ...
}

The mutation, crossover and selection objects are optional. If they are not provided, the default ones will be used.

Finally, run the evolutionary process by calling the Run() method and passing the number of epochs (iterations). Below you can find the example:

  • C++

  • C#

#include "grail/Evolution/EAOptimizer.hh"
#include "grail/Evolution/Selection.hh"
#include "grail/Evolution/Crossover.hh"
#include "grail/Evolution/Mutation.hh"
#include "grail/Evolution/Arena.hh"

using namespace grail::evolution; //for simplicity of the example

//constructor with default mutation, crossover and selection
EAOptimizer ea(initialPopulation, std::make_unique<BotArena>());

//50 epochs of the algorithm
ea.Run(50);

//getting the winner script
auto winner = static_cast<TestBot&>(ea.GetBestIndividual());
using Grail.Evolution;

//constructor with default mutation, crossover and selection
EAOptimizer ea = new EAOptimizer(initialPopulation, new BotArena();

//50 epochs of the algorithm
ea.Run(50);

//getting the winner script
auto winner = ea.GetBestIndividual() as TestBot;

Some properties of our implementation

  • The evolutionary algorithm first performs crossover and then mutation.

  • The algorithm maintains both the parent and children populations before the selection.

  • The evolutionary algorithm will maintain the size of the initial population after the selection phase. If the children population is too small (e.g. too low crossover rate) then it will be filled with individuals from the parent population.

Crossover

The following configuration parameters are available.

  1. CrossoverPhaseRate - probability that the crossover phase will occur in an iteration. This is the first check regarding the crossover - if equal to FALSE then no children population will be produced. The default value is 1.0 which means that crossover is always considered.

  2. CrossoverSelectionType - the way of choosing individuals for crossover. The default value is PSEUDO_ROULETTE.

    1. RANDOM - individuals are chosen for crossover using uniform random distribution.

    2. ROULETTE - individuals are chosen for crossover using the roulette wheel selection proportially to their Fitness value.

    3. TOP - only the best fit individuals are chosen for crossover. There is no randomness here.

    4. PSEUDO_ROULETTE - PseudoRouletteRandomPortion * PopulationSize of individuals will be chosen at random and the rest according to the roulette wheel selection.

    5. CONSECUTIVE_PAIRS - as the name implies, the individual will cross over with its successor in the list maintained by the algorithm. Here the CrossoverIndividualRate parameter is ignored, meaning that each individual (if the population size is even) will participate in the crossover. If the population size is odd, then the last individual will not be used in the crossover. This is a rather unusual procedure, that sometimes works.

  3. PseudoRouletteRandomPortion - used if CrossoverSelectionType = PSEUDO_ROULETTE. The default value is 0.3.

  4. CrossoverIndividualRate - the average rate of crossover. If N is the PopulationSize, then the expected number of children is equal to N * CrossoverIndividualRate. The default value is 0.45.

It is recommended to use the ready-to-use default Crossover object. However, if you want to provide your own implementation, then inherit from Crossover and override the CrossoverIndividuals method. This way you take advantage of the built-in selection of individuals for crossover and only implement the actual procedure of combining two individuals.

Mutation

The following configuration parameters are available.

  1. MutationPhaseRate - probability that the mutation phase will occur in an iteration. This is the first check regarding the mutation - if equal to FALSE then no further actions regarding mutation will be done in the current iteration (epoch) of the algorithm. The default value is 1.0 which means that mutation is always considered.

  2. RandomResolutionTypeForIndividual - the formula of selecting individuals for mutation. Two options are possible: {ITERATE_AND_TEST_P, SELECT_P_OF_N}. The default value is SELECT_P_OF_N.

    1. ITERATE_AND_TEST_P - a loop will iterate over individuals and mutate each one with probability equal to MutationIndividualRate.

    2. SELECT_P_OF_N - the number equal to MutationIndividualRate*PopulationSize clamped to the [0, PopulationSize] interval will be selected for mutation.

  3. MutationIndividualRate - probability that an individual will mutate (provided that the mutation happens in the current iteration). See above how this parameter interferes with MutationSelectionType. The default value is 0.09. This means that approximately 9% of the population will be mutated.

  4. MutationChromosomeRate - probability that a chromosome of an individual that designated for mutation will change. In the case of EvoScripts, the chromosome is a single EvoParam. A vector of chromosome (EvoParams) makes the individual - the portion of an EvoScript that can be optimized. The default value is 0.5. This means that half of the individual will be mutated.

  5. RandomResolutionTypeForChromosome - an analogous to MutationSelectionType but this time for selecting chromosomes from a individual rather than individuals from the population.

  6. MutateBothParentsAndChildren - if TRUE then both individuals coming from the previous population and children can be mutated. If FALSE then only the individuals after crossover can be mutated. The default value is TRUE.

It is recommended to use the ready-to-use default Mutation object. However, if you want to provide your own implementation, then inherit from Mutation and you can choose at which point you want to change the default procedure. You can override Perform to change it completely, MutateIndividual to change mutation of the EvoScript or MutateChromosome to just change the mutation of EvoParam.

Selection

There is a default selection object, which is "fed" by individuals by EAOptimizer and among them performs the top N cutoff from individuals sorted by Fitness value.
Using the EAOptimizer object, you can set the ElitismType enum and optionally the ElitismRate field to affect which individuals are "fed" to the selection.

  1. ElitismType - affects how individuals are chosen to the next iteration. The default value is ROULETTE.

    1. NONE - just the the top N individuals from combined populations of children and parents will advance to the next iteration, where N = PopulationSize.

    2. FITNESS_RANKING - the ElitismRate * PopulationSize best fit individuals from the previous population (parents) must remain. The rest of the new population, i.e., (1-ElitismRate) * PopulationSize will be chosen amongst the combined populations of parents and children.

    3. ROULETTE - the number equal to ElitismRate* PopulationSize individuals from previous population (parents) will be chosen by roulette sampling proportional to their fitness value. The rest of the new population, i.e., (1-ElitismRate)*PopulationSize will be chosen amongst the combined populations of parents and children.

  2. ElitismRate - a parameter used when ElitismType = FITNESS_RANKING or ROULETTE. The default value is 0.2.

Although there is a default selection object available, you can also override the Selection class and provide logic that choses the number of individuals equal to populationMaxSize to advance to the next iteration. The population passed to the selection object as argument is already sorted by Fitness value.

Just a reminder that the Fitness value has to be assigned to individuals by the Arena.
  • C++

  • C#

class Selection
{
    public:
        virtual void Perform(std::vector<std::unique_ptr<EvoScript>>& population, size_t populationMaxSize);
        virtual ~Selection();
};
public abstract class Selection
{
    public abstract List<EvoScript> Perform(List<EvoScript> currentPopulation, int populationMaxSize);
}

Inspiration for Usage

The two main usages are:

  1. Direct optimization of bot’s parameters - inherit your bot from EvoScript and make its optimizable parameters as EvoParams. This is a simpler case for which we have shown examples above.

  2. Optimization of any other parameter in a game (e.g. a map) - prepare some construction script that inherits from EvoScript and make some of its parameters as EvoParams. Such a script may then create a bot or some part of the game environment in which the bots play. Then, those bots may play on the Arena and indirectly evaluate the construction scripts that were used in the beginning. In the first case you could directly use EvoParams. Here we have a trickier case that will often require doing some preprocessing of the environment based upon the EvoParams each time a new EvoScript object is cloned.