pyshgp.gp package

pyshgp.gp.estimators module

The estimator module defines a PushEstimator class.

class pyshgp.gp.estimators.PushEstimator(spawner: pyshgp.gp.genome.GeneSpawner, search: str = 'GA', selector: Union[pyshgp.gp.selection.Selector, str] = 'lexicase', variation_strategy: Union[pyshgp.gp.variation.VariationStrategy, dict, str] = 'umad', population_size: int = 300, max_generations: int = 100, initial_genome_size: Tuple[int, int] = (20, 100), simplification_steps: int = 2000, last_str_from_stdout: bool = False, interpreter: pyshgp.push.interpreter.PushInterpreter = 'default', parallelism: Union[int, bool] = False, push_config: pyshgp.push.config.PushConfig = 'default', verbose: int = 0, **kwargs)[source]

Bases: object

Simple estimator that synthesizes Push programs.

Parameters
  • spawner (Union[GeneSpawner, str], optional) – The GeneSpawner to use when producing Genomes during initialization and variation. Default is all core instructions, no literals, and no ERC Generators.

  • search (Union[SearchAlgorithm, str], optional) – The search algorithm, or its abbreviation, to use when synthesizing Push programs.

  • selector (Union[Selector, str], optional) – The selector, or name of selector, to use when selecting parents. The default is lexicase selection.

  • variation_strategy (Union[VariationStrategy, dict, str]) – A VariationStrategy describing a collection of VariationOperators and how frequently to use them. If a dict is supplied, keys should be operator names and values should be the corresponding probabilities. If a string is provided, the VariationOperator with that name will always be used. Default is "umad".

  • population_size (int, optional) – The number of individuals held in the population each generation. Default is 300.

  • max_generations (int, optional) – The number of generations to run the search algorithm. Default is 100.

  • initial_genome_size (Tuple[int, int], optional) – The range of genome sizes to produce during initialization. Default is (20, 100).

  • simplification_steps (int) – The number of simplification iterations to apply to the best Push program produced by the search algorithm. Default is 2000.

  • interpreter (PushInterpreter, optional) – The PushInterpreter to use when making predictions. Also holds the instruction set to use.

  • parallelism (Union[int, bool], optional) – Set the number of processes to spawn for use when performing embarrassingly parallel tasks. If False, no processes will spawn and computation will be serial. If True, one process is spawned per available CPU. Default is False.

  • verbose (int, optional) – Indicates if verbose printing should be used during searching. Default is 0. Options are 0, 1, or 2.

  • **kwargs – Arbitrary keyword arguments. Examples of supported arguments are epsilon (bool or float) when using Lexicase as the selector, and tournament_size (int) when using tournament selection.

fit(X, y)[source]

Run the search algorithm to synthesize a Push program.

Parameters
  • X (pandas dataframe of shape = [n_samples, n_features]) – The training input samples.

  • y (list, array-like, or pandas dataframe) – The target values (class labels in classification, real numbers in regression). Shape = [n_samples] or [n_samples, n_outputs].

load(filepath: str)[source]

Load a found solution from a JSON file.

Parameters

filepath – Filepath to read the serialized search result from.

predict(X)[source]

Execute the synthesized Push program on a dataset.

Parameters

X (pandas dataframe of shape = [n_samples, n_features]) – The set of cases to predict.

Returns

y_hat

Return type

pandas dataframe of shape = [n_samples, n_outputs]

save(filepath: str)[source]

Save the found solution to a JSON file.

Parameters

filepath – Filepath to write the serialized search result to.

score(X, y)[source]

Evaluate the synthesized Push program on a labeled dataset.

Parameters
  • X (pandas dataframe of shape = [n_samples, n_features]) – The training input samples.

  • y (list, array-like, or pandas dataframe) – The target values (class labels in classification, real numbers in regression). Shape = [n_samples] or [n_samples, n_outputs].

pyshgp.gp.evaluation module

The evaluation module defines classes to evaluate program CodeBlocks.

class pyshgp.gp.evaluation.DatasetEvaluator(X, y, interpreter: pyshgp.push.interpreter.PushInterpreter = 'default', penalty: float = 1000000.0)[source]

Bases: pyshgp.gp.evaluation.Evaluator

Evaluator driven by a labeled dataset.

evaluate(program: pyshgp.push.program.Program) → numpy.array[source]

Evaluate the program and return the error vector.

Parameters

program – Program (CodeBlock of Push code) to evaluate.

Returns

The error vector of the program.

Return type

np.ndarray

class pyshgp.gp.evaluation.Evaluator(interpreter: pyshgp.push.interpreter.PushInterpreter = 'default', penalty: float = 1000000.0)[source]

Bases: abc.ABC

Base class for evaluators.

Parameters
  • interpreter (PushInterpreter, optional) – PushInterpreter used to run programs and get their output. Default is an interpreter with the default configuration and all core instructions registered.

  • penalty (float, optional) – When a program’s output cannot be evaluated on a particular case, the penalty error is assigned. Default is 1e6.

  • verbosity_config (Optional[VerbosityConfig] (default = None)) – A VerbosityConfig controlling what is logged during evaluation. Default is no verbosity.

default_error_function(actuals, expecteds) → numpy.array[source]

Produce errors of actual program output given expected program output.

The default error function is intended to be a universal error function for Push programs which only output a subset of the standard data types.

Parameters
  • actuals (list) – The values produced by running a Push program on a sequence of cases.

  • expecteds (list) – The ground truth values for the sequence of cases used to produce the actuals.

Returns

An array of error values describing the program’s performance.

Return type

np.array

abstract evaluate(program: pyshgp.push.program.Program) → numpy.ndarray[source]

Evaluate the program and return the error vector.

Parameters

program – Program (CodeBlock of Push code) to evaluate.

Returns

The error vector of the program.

Return type

np.ndarray

class pyshgp.gp.evaluation.FunctionEvaluator(error_function: Callable)[source]

Bases: pyshgp.gp.evaluation.Evaluator

Evaluator driven by an error function.

evaluate(program: pyshgp.push.program.Program) → numpy.ndarray[source]

Evaluate the program and return the error vector.

Parameters

program – Program (CodeBlock of Push code) to evaluate.

Returns

The error vector of the program.

Return type

np.ndarray

pyshgp.gp.evaluation.damerau_levenshtein_distance(a: Union[str, Sequence], b: Union[str, Sequence]) → int[source]

Damerau Levenshtein Distance that works for both strings and lists.

https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance. This implementation is heavily inspired by the implementation in the jellyfish package. https://github.com/jamesturk/jellyfish
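As a rough illustration of an edit distance that accepts both strings and lists, the sketch below implements the restricted (optimal string alignment) variant of the Damerau-Levenshtein distance. It is not the PyshGP or jellyfish implementation; the name osa_distance is hypothetical, and the restricted variant can differ from the unrestricted distance on some inputs.

```python
def osa_distance(a, b):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Works on any pair of indexable sequences, e.g. two strings or two lists.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] is the distance between the first i items of a and first j items of b.
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Transposition of two adjacent items.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


string_distance = osa_distance("abc", "acb")
list_distance = osa_distance([1, 2, 3], [1, 3, 2])
```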

pyshgp.gp.genome module

The genome module defines the Genome type and provides genome translation, spawning and simplification.

A Genome is a persistent collection of gene Atoms (any Atom that isn’t a CodeBlock). It can be translated into a CodeBlock.

The GeneSpawner is a factory capable of generating random genes and random genomes. It is used for initializing a population as well as producing new genes used by variation operators (i.e., mutation).

The genome simplification process is useful for removing superfluous genes from a genome without negatively impacting the behavior of the program produced by the genome. This process has many benefits including: improving generalization, shrinking the size of the serialized solution, and in some cases making the program easier to explain.

class pyshgp.gp.genome.GeneSpawner(n_inputs: int, instruction_set: Union[pyshgp.push.instruction_set.InstructionSet, str], literals: Sequence[Any], erc_generators: Sequence[Callable], distribution: pyshgp.utils.DiscreteProbDistrib = 'proportional')[source]

Bases: object

A factory of random Genes (Atoms) and Genomes.

When spawning a random gene, the result can be one of three types of Atoms. An Instruction, a Closer, or a Literal. If the Atom is a Literal, it may be one of the supplied Literals, or it may be the result of running one of the Ephemeral Random Constant generators.

Reference for ERCs: “A field guide to genetic programming”, Section 3.1 Riccardo Poli and William B. Langdon and Nicholas Freitag McPhee, http://www.gp-field-guide.org.uk/

n_inputs

Number of input instructions that could appear in the genomes.

Type

int

instruction_set

InstructionSet containing instructions to use when spawning genes and genomes.

Type

pyshgp.push.instruction_set.InstructionSet

literals

A list of Literal objects to pull from when spawning genes and genomes.

Type

Sequence[pyshgp.push.instruction_set.atoms.Literal]

erc_generators

A list of functions (aka Ephemeral Random Constant generators). When one of these functions is called, the output is placed in a Literal and returned as the spawned gene.

Type

Sequence[Callable]

distribution

A probability distribution describing how frequently to produce Instructions, Closers, Literals, and ERCs.

Type

pyshgp.utils.DiscreteProbDistrib

random_erc() → pyshgp.push.atoms.Literal[source]

Materialize a random ERC generator into a Literal and return it.

Returns

A Literal whose value comes from running an ERC generator function.

Return type

pyshgp.push.atoms.Literal

random_gene() → pyshgp.push.atoms.Atom[source]

Return a random Atom based on the GeneSpawner’s distribution.

Returns

A random Atom. Either an Instruction, Closer, or Literal.

Return type

pyshgp.push.atoms.Atom
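The two-stage spawning described for GeneSpawner (pick a gene type from the distribution, then sample a gene of that type) might be sketched as follows. This is a minimal illustration, not the PyshGP implementation: genes are plain tuples, the instruction names are placeholder strings, and the weights dictionary stands in for the DiscreteProbDistrib.

```python
import random


def random_gene(rng, instructions, literals, erc_generators, n_inputs, weights):
    """Pick a gene type according to `weights`, then sample a gene of that type."""
    kinds = list(weights)
    kind = rng.choices(kinds, weights=[weights[k] for k in kinds], k=1)[0]
    if kind == "INPUT":
        return ("input", rng.randrange(n_inputs))
    if kind == "INSTRUCTION":
        return ("instruction", rng.choice(instructions))
    if kind == "CLOSE":
        return ("close",)
    if kind == "LITERAL":
        return ("literal", rng.choice(literals))
    # ERC: run one generator function and wrap its output as a literal.
    return ("literal", rng.choice(erc_generators)())


rng = random.Random(0)
weights = {"INPUT": 1, "INSTRUCTION": 5, "CLOSE": 1, "LITERAL": 2, "ERC": 1}
gene = random_gene(rng, ["int_add", "int_mult"], [0, 1], [rng.random], 2, weights)
```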

random_input() → pyshgp.push.atoms.Input[source]

Return a random Input.

Returns

Return type

pyshgp.push.atoms.Input

random_instruction() → pyshgp.push.atoms.InstructionMeta[source]

Return a random Instruction from the InstructionSet.

Returns

A randomly selected Instruction.

Return type

pyshgp.push.atoms.InstructionMeta

random_literal() → pyshgp.push.atoms.Literal[source]

Return a random Literal from the set of Literals.

Returns

A randomly selected Literal.

Return type

pyshgp.push.atoms.Literal

spawn_genome(size: Union[int, Sequence[int]]) → pyshgp.gp.genome.Genome[source]

Return a random Genome based on the GeneSpawner’s distribution.

The genome will contain the specified number of Atoms if size is an integer. If size is a pair of integers, the genome will be of a random size in the range of the two integers.

Parameters

size – The resulting genome will contain this many Atoms if size is an integer. If size is a pair of integers, the genome will be of a random size in the range of the two integers.

Returns

A Genome with random contents of a given size.

Return type

pyshgp.gp.genome.Genome
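The handling of the size argument can be sketched as below. This is only an illustration: the helper spawn_gene is a hypothetical stand-in for random_gene, and treating both bounds of the pair as inclusive is an assumption that the text above does not confirm.

```python
import random


def spawn_genome(size, spawn_gene, rng):
    """Build a list of genes whose length is fixed or drawn from a (low, high) pair."""
    if not isinstance(size, int):
        low, high = size
        size = rng.randint(low, high)  # assumption: both bounds are inclusive
    return [spawn_gene() for _ in range(size)]


rng = random.Random(0)
fixed = spawn_genome(5, lambda: "gene", rng)
ranged = spawn_genome((20, 100), lambda: "gene", rng)
```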

class pyshgp.gp.genome.GeneTypes(value)[source]

Bases: enum.Enum

An Enum denoting the different types of genes that can appear in a Genome.

CLOSE = 3
ERC = 5
INPUT = 1
INSTRUCTION = 2
LITERAL = 4

class pyshgp.gp.genome.Genome(initial=())[source]

Bases: pyrsistent._checked_types.CheckedPVector

A linear sequence of genes (aka any atom that isn’t a CodeBlock).

PyshGP uses the Plushy genome representation.

See: http://gpbib.cs.ucl.ac.uk/gp-html/Spector_2019_GPTP.html

class pyshgp.gp.genome.GenomeSimplifier(evaluator: pyshgp.gp.evaluation.Evaluator, program_signature: pyshgp.push.program.ProgramSignature)[source]

Bases: object

Simplifies a genome while preserving, or improving, its error.

Genomes, and Push programs, can contain superfluous Push code. This extra code often has no effect on the program behavior, but occasionally it can introduce subtle errors or behaviors that are not covered by the training cases. Removing the superfluous code makes genomes (and thus programs) smaller and easier to understand. More importantly, simplification can improve the generalization of the given genome/program.

The process of genome simplification is iterative and closely resembles simple hill climbing. For each iteration, the simplifier will select a small number of random genes to remove. The Genome is re-evaluated and if its error gets worse, the change is reverted. After repeating this for some number of steps, the resulting genome will be the same size or smaller, with the same (or better) error.

Reference: “Improving generalization of evolved programs through automatic simplification” Thomas Helmuth, Nicholas Freitag McPhee, Edward Pantridge, and Lee Spector. 2017. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO ‘17). ACM, New York, NY, USA, 937-944. DOI: https://doi.org/10.1145/3071178.3071330

See: https://dl.acm.org/citation.cfm?id=3071178.3071330

simplify(genome: pyshgp.gp.genome.Genome, original_errors: numpy.ndarray, steps: int = 2000) → Tuple[pyshgp.gp.genome.Genome, numpy.ndarray][source]

Simplify the given genome while maintaining error.

Parameters
  • genome – The Genome to simplify.

  • original_errors – Error vector of the genome to simplify.

  • steps – Number of simplification iterations to perform. Default is 2000.

Returns

The simplified Genome, which expresses the same computation, and its error vector.

Return type

Tuple[pyshgp.gp.genome.Genome, numpy.ndarray]
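The hill-climbing loop described for GenomeSimplifier might look roughly like the sketch below. It is a simplified illustration rather than the PyshGP code: genomes are plain lists, error_fn is a hypothetical callable that returns a single total error instead of an error vector, and removing between one and three genes per step is an assumption.

```python
import random


def simplify(genome, error_fn, steps=2000, max_removed=3, seed=None):
    """Repeatedly drop a few random genes, keeping changes that do not hurt the error."""
    rng = random.Random(seed)
    best, best_error = list(genome), error_fn(list(genome))
    for _ in range(steps):
        if not best:
            break
        n = rng.randint(1, min(max_removed, len(best)))
        drop = set(rng.sample(range(len(best)), n))
        candidate = [gene for i, gene in enumerate(best) if i not in drop]
        candidate_error = error_fn(candidate)
        # Keep the smaller genome only if its error did not get worse.
        if candidate_error <= best_error:
            best, best_error = candidate, candidate_error
    return best, best_error


# Toy error: zero only when both "a" and "b" are still present.
def toy_error(genome):
    return 0 if "a" in genome and "b" in genome else 1


original = ["a", "x", "b", "y", "z", "x"]
simplified, error = simplify(original, toy_error, steps=200, seed=1)
```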

class pyshgp.gp.genome.Opener(**kwargs)[source]

Bases: pyrsistent._precord.PRecord

Marks the start of one or more CodeBlocks.

dec() → pyshgp.gp.genome.Opener[source]

Create an Opener with count decremented.

pyshgp.gp.genome.genome_to_code(genome: pyshgp.gp.genome.Genome) → pyshgp.push.atoms.CodeBlock[source]

Translate a Genome into nested CodeBlocks.

These CodeBlocks can be considered the Push program representation of the Genome which can be executed by a PushInterpreter and evaluated by an Evaluator.
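To make the translation from a linear genome to nested code more concrete, here is a small sketch of the idea using Openers and close genes. It is not the PyshGP implementation: genes are plain Python values, nested lists stand in for CodeBlocks, and the blocks_opened mapping (instruction name to the number of code blocks it opens) is a hypothetical stand-in for instruction metadata.

```python
CLOSE = "close"


class Opener:
    """Marks where a code block starts and how many blocks remain to be opened."""

    def __init__(self, count):
        self.count = count


def last_opener_index(buffer):
    for i in range(len(buffer) - 1, -1, -1):
        if isinstance(buffer[i], Opener):
            return i
    return None


def to_nested_code(genome, blocks_opened):
    remaining = list(genome)
    out = []
    while True:
        idx = last_opener_index(out)
        if not remaining and idx is None:
            return out
        if not remaining or remaining[0] == CLOSE:
            if idx is not None:
                # Everything after the most recent Opener becomes one nested block.
                opener = out[idx]
                out = out[:idx] + [out[idx + 1:]]
                if opener.count > 1:
                    out.append(Opener(opener.count - 1))
            remaining = remaining[1:]  # a close gene with no open block is ignored
        else:
            gene = remaining.pop(0)
            out.append(gene)
            if blocks_opened.get(gene, 0) > 0:
                out.append(Opener(blocks_opened[gene]))


nested = to_nested_code(["exec_if", 1, CLOSE, 2, CLOSE, 3], {"exec_if": 2})
```

Blocks that are still open when the genome ends are closed implicitly, so every genome translates to some valid nesting.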

pyshgp.gp.individual module

The individual module defines an Individual in an evolutionary population.

Individuals are made up of Genomes, which are the linear Push program representations which can be manipulated by search algorithms.

class pyshgp.gp.individual.Individual(genome: pyshgp.gp.genome.Genome, signature: pyshgp.push.program.ProgramSignature)[source]

Bases: pyshgp.utils.Saveable, pyshgp.utils.Copyable

An individual in an evolutionary population.

genome

The Genome of the Individual.

Type

Genome

error_vector

An array of error values produced by evaluating the Individual’s program.

Type

np.array

total_error

The sum of all error values in the Individual’s error_vector.

Type

float

error_vector_bytes

Hashable Byte representation of the individual’s error vector.

property error_vector

Numpy array of numeric error values.

property error_vector_bytes

Hashable Byte representation of the individual’s error vector.

genome

id

property program

Push program of individual. Translated from the Plushy genome.

signature

property total_error

Numeric sum of the error vector.
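The two derived values of an Individual, total_error and error_vector_bytes, can be illustrated with a short stdlib-only sketch. The function names mirror the properties above, but the byte encoding shown here (packed 8-byte floats) is an assumption chosen for illustration, not necessarily what PyshGP produces.

```python
import struct


def total_error(error_vector):
    """Sum of all error values."""
    return float(sum(error_vector))


def error_vector_bytes(error_vector):
    """Hashable byte string built from the errors, each packed as an 8-byte float."""
    values = [float(e) for e in error_vector]
    return struct.pack(f"{len(values)}d", *values)


vectors = [[0, 1], [0.0, 1.0], [1, 0]]
distinct = {error_vector_bytes(v) for v in vectors}
```

Because bytes objects are hashable, they can be used as dictionary keys or set members, which is convenient when grouping individuals by behavior.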

pyshgp.gp.population module

The population module defines an evolutionary population of Individuals.

class pyshgp.gp.population.Population(individuals: list = None)[source]

Bases: collections.abc.Sequence

A sequence of Individuals kept in sorted order, with respect to their total errors.

add(individual: pyshgp.gp.individual.Individual)[source]

Add an Individual to the population.

all_error_vectors()[source]

2D array containing all Individuals’ error vectors.

all_total_errors()[source]

1D array containing all Individuals’ total errors.

best()[source]

Return the best individual in the population.

best_n(n: int)[source]

Return the best n individuals in the population.
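One way to keep individuals in sorted order, so that best and best_n are simple slices, is to insert with bisect. The class below is a hypothetical sketch of that idea and is not how Population is necessarily implemented; individuals are arbitrary objects paired with their total error.

```python
import bisect


class SortedPopulation:
    """Keeps individuals ordered by ascending total error."""

    def __init__(self):
        self._errors = []       # total errors, ascending
        self._individuals = []  # individuals, in the same order as _errors

    def add(self, individual, total_error):
        i = bisect.bisect_right(self._errors, total_error)
        self._errors.insert(i, total_error)
        self._individuals.insert(i, individual)

    def best(self):
        return self._individuals[0]

    def best_n(self, n):
        return self._individuals[:n]


pop = SortedPopulation()
for name, err in [("a", 3.0), ("b", 1.0), ("c", 2.0)]:
    pop.add(name, err)
```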

error_diversity()[source]

Proportion of unique error vectors.

evaluate(evaluator: pyshgp.gp.evaluation.Evaluator)[source]

Evaluate all unevaluated individuals in the population.

evaluated
genome_diversity()[source]

Proportion of unique genomes.

mean_genome_length()[source]

Average genome length across all individuals.

median_error()[source]

Median total error in the population.

p_evaluate(evaluator_proxy, pool: multiprocessing.context.BaseContext.Pool)[source]

Evaluate all unevaluated individuals in the population in parallel.

program_diversity()[source]

Proportion of unique programs.
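The diversity measures above (error, genome, and program diversity) share one formula: the number of distinct items divided by the population size. A minimal sketch, assuming each item can be converted to a tuple and that an empty population has diversity 0.0:

```python
def proportion_unique(items):
    """Number of distinct items divided by the total number of items."""
    hashable = [tuple(item) for item in items]
    if not hashable:
        return 0.0  # assumption: define diversity of an empty population as 0
    return len(set(hashable)) / len(hashable)


error_vectors = [[0, 1], [0, 1], [1, 0], [2, 2]]
diversity = proportion_unique(error_vectors)
```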

unevaluated

pyshgp.gp.search module

The search module defines algorithms to search for Push programs.

class pyshgp.gp.search.GeneticAlgorithm(config: pyshgp.gp.search.SearchConfiguration)[source]

Bases: pyshgp.gp.search.SearchAlgorithm

Genetic algorithm to synthesize Push programs.

An initial Population of random Individuals is created. Each generation begins by evaluating all Individuals in the population. Then the current Population is replaced with children produced by selecting parents from the Population and applying VariationOperators to them.

step()[source]

Perform one generation (step) of the genetic algorithm.

The step method assumes an evaluated Population and performs parent selection and variation (producing children).
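The generational loop can be sketched on a toy problem as follows. This is an illustration only: genomes are bit lists, the error is the number of zeros, and the helpers evaluate, tournament, and flip_one_bit are hypothetical stand-ins for an Evaluator, a Selector, and a VariationOperator.

```python
import random


def evaluate(population, error_fn):
    """Pair every genome with its error."""
    return [(error_fn(genome), genome) for genome in population]


def ga_step(scored, select, vary, rng):
    """One generation: replace the evaluated population with children."""
    return [vary(select(scored, rng), rng) for _ in scored]


def tournament(scored, rng, size=3):
    return min(rng.sample(scored, size), key=lambda pair: pair[0])[1]


def flip_one_bit(genome, rng):
    i = rng.randrange(len(genome))
    return genome[:i] + [1 - genome[i]] + genome[i + 1:]


def count_zeros(genome):
    return genome.count(0)


rng = random.Random(42)
population = [[rng.randint(0, 1) for _ in range(8)] for _ in range(20)]
best_seen = None
for generation in range(30):
    scored = evaluate(population, count_zeros)
    best = min(scored, key=lambda pair: pair[0])
    if best_seen is None or best[0] < best_seen[0]:
        best_seen = best
    population = ga_step(scored, tournament, flip_one_bit, rng)
```

Keeping evaluation outside of ga_step mirrors the description above, where step assumes the population has already been evaluated.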

class pyshgp.gp.search.ParallelContext(spawner: pyshgp.gp.genome.GeneSpawner, evaluator: pyshgp.gp.evaluation.Evaluator, n_proc: Optional[int] = None)[source]

Bases: object

Holds the objects needed to coordinate parallelism.

close()[source]

class pyshgp.gp.search.SearchAlgorithm(config: pyshgp.gp.search.SearchConfiguration)[source]

Bases: abc.ABC

Base class for all search algorithms.

Parameters

config (SearchConfiguration) – The configuration of the search algorithm.

config

The configuration of the search algorithm.

Type

SearchConfiguration

generation

The current generation, or iteration, of the search.

Type

int

best_seen

The best Individual, with respect to total error, seen so far.

Type

Individual

population

The current Population of individuals.

Type

Population

init_population()[source]

Initialize the population.

is_solved() → bool[source]

Return True if the search algorithm has found a solution or False otherwise.

run() → pyshgp.gp.individual.Individual[source]

Run the algorithm until termination.

abstract step() → bool[source]

Perform one generation (step) of the search.

The step method should assume an evaluated Population, and must only perform parent selection and variation (producing children). The step method should modify the search algorithm’s population in-place, or assign a new Population to the population attribute.

class pyshgp.gp.search.SearchConfiguration(signature: pyshgp.push.program.ProgramSignature, evaluator: pyshgp.gp.evaluation.Evaluator, spawner: pyshgp.gp.genome.GeneSpawner, selection: Union[pyshgp.gp.selection.Selector, pyshgp.utils.DiscreteProbDistrib, str] = 'lexicase', variation: Union[pyshgp.gp.variation.VariationOperator, pyshgp.utils.DiscreteProbDistrib, str] = 'umad', population_size: int = 500, max_generations: int = 100, error_threshold: float = 0.0, initial_genome_size: Tuple[int, int] = (10, 50), simplification_steps: int = 2000, parallelism: Union[int, bool] = True, **kwargs)[source]

Bases: object

Configuration of a search algorithm.

Parameters
  • evaluator (Evaluator) – The Evaluator to use when evaluating individuals.

  • spawner (GeneSpawner) – The GeneSpawner to use when producing Genomes during initialization and variation.

  • selection (Union[Selector, DiscreteProbDistrib, str], optional) – A Selector, or DiscreteProbDistrib of selectors, to use when selecting parents. The default is lexicase selection.

  • variation (Union[VariationOperator, DiscreteProbDistrib, str], optional) – A VariationOperator, or DiscreteProbDistrib of VariationOperators, to use during variation. Default is SIZE_NEUTRAL_UMAD.

  • population_size (int, optional) – The number of individuals held in the population each generation. Default is 500.

  • max_generations (int, optional) – The number of generations to run the search algorithm. Default is 100.

  • error_threshold (float, optional) – If the search algorithm finds an Individual with a total error less than this value, stop searching. Default is 0.0.

  • initial_genome_size (Tuple[int, int], optional) – The range of genome sizes to produce during initialization. Default is (10, 50).

  • simplification_steps (int, optional) – The number of simplification iterations to apply to the best Push program produced by the search algorithm. Default is 2000.

  • parallelism (Union[int, bool], optional) – Set the number of processes to spawn for use when performing embarrassingly parallel tasks. If False, no processes will spawn and computation will be serial. Default is True, which spawns one process per available CPU.

  • verbosity_config (Union[VerbosityConfig, str], optional) – A VerbosityConfig controlling what is logged during the search. Default is no verbosity.

get_selector()[source]

Return a Selector.

get_variation_op()[source]

Return a VariationOperator.

tear_down()[source]

class pyshgp.gp.search.SimulatedAnnealing(config: pyshgp.gp.search.SearchConfiguration)[source]

Bases: pyshgp.gp.search.SearchAlgorithm

Algorithm to synthesize Push programs with Simulated Annealing.

At each step (generation), the simulated annealing heuristic mutates the current Individual, and probabilistically decides between accepting or rejecting the child. If the child is accepted, it becomes the new current Individual.

After each step, the simulated annealing system cools its temperature. As the temperature lowers, the probability of accepting a child that does not have a lower total error than the current Individual decreases.

step()[source]

Perform one generation, or step, of the Simulated Annealing.

The step method assumes an evaluated Population of one Individual and produces a single candidate Individual. If the candidate individual passes the acceptance function, it becomes the Individual in the Population.
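The accept-or-reject decision and the cooling schedule can be sketched as below. The text above does not specify the acceptance function, so this sketch assumes the common Metropolis criterion and a geometric cooling rate; the names acceptance_probability, anneal, start_temp, and cooling are hypothetical.

```python
import math
import random


def acceptance_probability(current_error, candidate_error, temperature):
    """Always accept improvements; accept worse candidates less often as it cools."""
    if candidate_error < current_error:
        return 1.0
    if temperature <= 0:
        return 0.0
    return math.exp(-(candidate_error - current_error) / temperature)


def anneal(genome, error_fn, mutate, steps, rng, start_temp=1.0, cooling=0.95):
    current, current_error = genome, error_fn(genome)
    temperature = start_temp
    for _ in range(steps):
        candidate = mutate(current, rng)
        candidate_error = error_fn(candidate)
        if rng.random() < acceptance_probability(current_error, candidate_error, temperature):
            current, current_error = candidate, candidate_error
        temperature *= cooling  # cool after every step
    return current, current_error


def flip_one_bit(genome, rng):
    i = rng.randrange(len(genome))
    return genome[:i] + [1 - genome[i]] + genome[i + 1:]


rng = random.Random(7)
result, result_error = anneal([0] * 8, lambda g: g.count(0), flip_one_bit, 200, rng)
```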

pyshgp.gp.search.get_search_algo(name: str, **kwargs) → pyshgp.gp.search.SearchAlgorithm[source]

Return the search algorithm class with the given name.

pyshgp.gp.selection module

The selection module defines classes to select Individuals from Populations.

class pyshgp.gp.selection.CaseStream(n_cases: int)[source]

Bases: object

A generator of indices yielded in a random order.
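A generator that yields case indices in random order could be written as an incremental Fisher-Yates shuffle, so that a consumer that stops early never pays for shuffling the remaining indices. This is a sketch of the idea with a hypothetical function name, not the CaseStream source.

```python
import random


def case_stream(n_cases, rng):
    """Yield each index in range(n_cases) exactly once, in random order."""
    indices = list(range(n_cases))
    for i in range(n_cases):
        # Swap a random not-yet-yielded index into position i, then yield it.
        j = rng.randrange(i, n_cases)
        indices[i], indices[j] = indices[j], indices[i]
        yield indices[i]


order = list(case_stream(5, random.Random(0)))
```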

class pyshgp.gp.selection.Elite[source]

Bases: pyshgp.gp.selection.Selector

Returns the best N individuals by total error.

select(population: pyshgp.gp.population.Population, n: int = 1) → Sequence[pyshgp.gp.individual.Individual][source]

Return n individuals from the population.

Parameters
  • population – A Population of Individuals.

  • n (int) – The number of parents to select from the population. Default is 1.

Returns

The selected Individuals.

Return type

Sequence[Individual]

select_one(population: pyshgp.gp.population.Population) → pyshgp.gp.individual.Individual[source]

Return a single individual from the population.

Parameters

population – A Population of Individuals.

Returns

The selected Individual.

Return type

Individual

class pyshgp.gp.selection.FitnessProportionate[source]

Bases: pyshgp.gp.selection.Selector

Fitness proportionate selection, also known as roulette wheel selection.

See: https://en.wikipedia.org/wiki/Fitness_proportionate_selection

select(population: pyshgp.gp.population.Population, n: int = 1) → Sequence[pyshgp.gp.individual.Individual][source]

Return n individuals from the population.

Parameters
  • population – A Population of Individuals.

  • n (int) – The number of parents to select from the population. Default is 1.

Returns

The selected Individuals.

Return type

Sequence[Individual]

select_one(population: pyshgp.gp.population.Population) → pyshgp.gp.individual.Individual[source]

Return a single individual from the population.

Parameters

population – A Population of Individuals.

Returns

The selected Individual.

Return type

Individual

class pyshgp.gp.selection.Lexicase(epsilon: Union[bool, float, numpy.ndarray] = False)[source]

Bases: pyshgp.gp.selection.SimpleMultiSelectorMixin, pyshgp.gp.selection.Selector

Lexicase Selection.

All training cases are considered iteratively in a random order. For each training case, the population is filtered to only contain the Individuals which have an error value within epsilon of the best error value on that case. This filtering is repeated until the population is down to a single Individual or all cases have been used. After the filtering iterations, a random Individual from the remaining set is returned as the selected Individual.

See: https://ieeexplore.ieee.org/document/6920034

select(population: pyshgp.gp.population.Population, n: int = 1) → Sequence[pyshgp.gp.individual.Individual][source]

Return n individuals from the population.

Parameters
  • population – A Population of Individuals.

  • n (int) – The number of parents to select from the population. Default is 1.

Returns

The selected Individuals.

Return type

Sequence[Individual]

select_one(population: pyshgp.gp.population.Population) → pyshgp.gp.individual.Individual[source]

Return a single individual from the population.

Parameters

population – A Population of Individuals.

Returns

The selected Individual.

Return type

Individual
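The filtering procedure described for Lexicase can be sketched in a few lines. This is a minimal illustration that works on a list of error vectors and returns the index of the selected individual; it only handles a fixed float epsilon, and the function name is hypothetical.

```python
import random


def lexicase_select(error_vectors, rng, epsilon=0.0):
    """Return the index of one individual chosen by (epsilon) lexicase selection."""
    candidates = list(range(len(error_vectors)))
    cases = list(range(len(error_vectors[0])))
    rng.shuffle(cases)  # consider the training cases in a random order
    for case in cases:
        if len(candidates) <= 1:
            break
        best = min(error_vectors[i][case] for i in candidates)
        # Keep only candidates within epsilon of the best error on this case.
        candidates = [i for i in candidates if error_vectors[i][case] <= best + epsilon]
    return rng.choice(candidates)


errors = [[0, 5], [5, 0], [1, 1]]
picks = {lexicase_select(errors, random.Random(seed)) for seed in range(50)}
relaxed = {lexicase_select(errors, random.Random(seed), epsilon=1.0) for seed in range(20)}
```

With epsilon set to 0, only the two specialists (indices 0 and 1) can ever be selected in this example. With epsilon set to 1.0, the generalist at index 2 survives every filter and is always selected.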

class pyshgp.gp.selection.Selector[source]

Bases: abc.ABC

Base class for all selection algorithms.

abstract select(population: pyshgp.gp.population.Population, n: int = 1) → Sequence[pyshgp.gp.individual.Individual][source]

Return n individuals from the population.

Parameters
  • population (Population) – A Population of Individuals.

  • n (int) – The number of parents to select from the population. Default is 1.

Returns

The selected Individuals.

Return type

Sequence[Individual]

abstract select_one(population: pyshgp.gp.population.Population) → pyshgp.gp.individual.Individual[source]

Return a single individual from the population.

Parameters

population – A Population of Individuals.

Returns

The selected Individual.

Return type

Individual

class pyshgp.gp.selection.SimpleMultiSelectorMixin[source]

Bases: object

A mixin for Selector classes where selecting many individuals is done by repeated calls to select_one.

select(population: pyshgp.gp.population.Population, n: int = 1) → Sequence[pyshgp.gp.individual.Individual][source]

Return n individuals from the population.

Parameters
  • population (Population) – A Population of Individuals.

  • n (int) – The number of parents to select from the population. Default is 1.

Returns

The selected Individuals.

Return type

Sequence[Individual]
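The mixin pattern described here, where select is derived from repeated calls to select_one, might be sketched as follows. The class names RepeatedSelectMixin and UniformRandom are hypothetical and exist only to show the pattern.

```python
import random


class RepeatedSelectMixin:
    """Provides select() for any class that defines select_one()."""

    def select(self, population, n=1):
        return [self.select_one(population) for _ in range(n)]


class UniformRandom(RepeatedSelectMixin):
    """Toy selector that ignores errors and picks uniformly at random."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)

    def select_one(self, population):
        return self._rng.choice(population)


parents = UniformRandom(seed=0).select(["a", "b", "c"], n=5)
```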

class pyshgp.gp.selection.Tournament(tournament_size: int = 7)[source]

Bases: pyshgp.gp.selection.SimpleMultiSelectorMixin, pyshgp.gp.selection.Selector

Tournament selection.

See: https://en.wikipedia.org/wiki/Tournament_selection

Parameters

tournament_size (int, optional) – Number of individuals selected uniformly randomly to participate in the tournament. Default is 7.

tournament_size

Number of individuals selected uniformly randomly to participate in the tournament. Default is 7.

Type

int, optional

select_one(population: pyshgp.gp.population.Population) → pyshgp.gp.individual.Individual[source]

Return a single individual from the population.

Parameters

population – A Population of Individuals.

Returns

The selected Individual.

Return type

Individual
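Tournament selection can be illustrated with a short sketch that works on a list of total errors and returns the index of the winner. Sampling entrants without replacement and capping the tournament at the population size are assumptions of this sketch, and the function name is hypothetical.

```python
import random


def tournament_select(total_errors, rng, tournament_size=7):
    """Sample entrants uniformly at random and return the index with the lowest error."""
    size = min(tournament_size, len(total_errors))
    entrants = rng.sample(range(len(total_errors)), size)  # without replacement
    return min(entrants, key=lambda i: total_errors[i])


total_errors = [9, 4, 7, 1, 8]
winner = tournament_select(total_errors, random.Random(0), tournament_size=5)
```

Larger tournaments increase selection pressure: when every individual enters, the best one always wins.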

pyshgp.gp.selection.choice(a, size=None, replace=True, p=None)

Generates a random sample from a given 1-D array

New in version 1.7.0.

Note

New code should use the choice method of a default_rng() instance instead; see random-quick-start.

Parameters
  • a (1-D array-like or int) – If an ndarray, a random sample is generated from its elements. If an int, the random sample is generated as if a were np.arange(a)

  • size (int or tuple of ints, optional) – Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.

  • replace (boolean, optional) – Whether the sample is with or without replacement

  • p (1-D array-like, optional) – The probabilities associated with each entry in a. If not given the sample assumes a uniform distribution over all entries in a.

Returns

samples – The generated random samples

Return type

single item or ndarray

Raises

ValueError – If a is an int and less than zero, if a or p are not 1-dimensional, if a is an array-like of size 0, if p is not a vector of probabilities, if a and p have different lengths, or if replace=False and the sample size is greater than the population size

See also

randint(), shuffle(), permutation()

Generator.choice()

which should be used in new code

Notes

Sampling random rows from a 2-D array is not possible with this function, but is possible with Generator.choice through its axis keyword.

Examples

Generate a uniform random sample from np.arange(5) of size 3:

>>> np.random.choice(5, 3)
array([0, 3, 4]) # random
>>> #This is equivalent to np.random.randint(0,5,3)

Generate a non-uniform random sample from np.arange(5) of size 3:

>>> np.random.choice(5, 3, p=[0.1, 0, 0.3, 0.6, 0])
array([3, 3, 0]) # random

Generate a uniform random sample from np.arange(5) of size 3 without replacement:

>>> np.random.choice(5, 3, replace=False)
array([3,1,0]) # random
>>> #This is equivalent to np.random.permutation(np.arange(5))[:3]

Generate a non-uniform random sample from np.arange(5) of size 3 without replacement:

>>> np.random.choice(5, 3, replace=False, p=[0.1, 0, 0.3, 0.6, 0])
array([2, 3, 0]) # random

Any of the above can be repeated with an arbitrary array-like instead of just integers. For instance:

>>> aa_milne_arr = ['pooh', 'rabbit', 'piglet', 'Christopher']
>>> np.random.choice(aa_milne_arr, 5, p=[0.5, 0.1, 0.1, 0.3])
array(['pooh', 'pooh', 'pooh', 'Christopher', 'piglet'], # random
      dtype='<U11')

pyshgp.gp.selection.get_selector(name: str, **kwargs) → pyshgp.gp.selection.Selector[source]

Get the selector class with the given name.

pyshgp.gp.selection.median_absolute_deviation(x: numpy.ndarray) → numpy.float64[source]

Return the median absolute deviation (MAD).

Parameters

x (array-like, shape = (n,)) –

Returns

mad

Return type

float
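For reference, the median absolute deviation is the median of the absolute differences between each value and the median of all values. A stdlib-only sketch (the function above operates on NumPy arrays; this version accepts any list of numbers):

```python
from statistics import median


def median_absolute_deviation(x):
    """Median of the absolute deviations from the median of x."""
    center = median(x)
    return median(abs(value - center) for value in x)


mad = median_absolute_deviation([1, 1, 2, 2, 4, 6, 9])
```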

pyshgp.gp.selection.one_individual_per_error_vector(population: pyshgp.gp.population.Population) → Sequence[pyshgp.gp.individual.Individual][source]

Preselect one individual per distinct error vector.

Crucial for avoiding the worst case runtime of lexicase selection but does not impact the behavior of which individual gets selected.
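The pre-selection step can be sketched by grouping individuals on their error vector and keeping one member per group. In this illustration individuals are identified by index, and the kept member is chosen at random from each group, which is an assumption; the function name is shortened and hypothetical.

```python
import random


def one_per_error_vector(error_vectors, rng):
    """Return one index per distinct error vector."""
    groups = {}
    for index, errors in enumerate(error_vectors):
        groups.setdefault(tuple(errors), []).append(index)
    return [rng.choice(indices) for indices in groups.values()]


kept = one_per_error_vector([[0, 1], [0, 1], [1, 0]], random.Random(0))
```

Lexicase filtering cannot distinguish individuals with identical error vectors, so collapsing each group up front reduces the number of candidates to filter.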

pyshgp.gp.selection.random(size=None)

Return random floats in the half-open interval [0.0, 1.0). Alias for random_sample to ease forward-porting to the new random API.

pyshgp.gp.selection.shuffle(x)

Modify a sequence in-place by shuffling its contents.

This function only shuffles the array along the first axis of a multi-dimensional array. The order of sub-arrays is changed but their contents remains the same.

Note

New code should use the shuffle method of a default_rng() instance instead; see random-quick-start.

Parameters

x (array_like) – The array or list to be shuffled.

Returns

Return type

None

See also

Generator.shuffle()

which should be used for new code.

Examples

>>> arr = np.arange(10)
>>> np.random.shuffle(arr)
>>> arr
[1 7 5 2 9 4 3 6 0 8] # random

Multi-dimensional arrays are only shuffled along the first axis:

>>> arr = np.arange(9).reshape((3, 3))
>>> np.random.shuffle(arr)
>>> arr
array([[3, 4, 5], # random
       [6, 7, 8],
       [0, 1, 2]])

pyshgp.gp.variation module

The variation module defines classes for variation operators.

Variation operators (aka genetic operators) are used in evolutionary/genetic algorithms to create “child” genomes from “parent” genomes.

class pyshgp.gp.variation.AdditionMutation(addition_rate: float = 0.01)[source]

Bases: pyshgp.gp.variation.VariationOperator

Uniformly randomly adds some Atoms to parent.

Parameters

addition_rate (float) – The probability of adding a new Atom at any given point in the parent Genome. Default is 0.01.

rate

The probability of adding a new Atom at any given point in the parent Genome. Default is 0.01.

Type

float

num_parents

Number of parent Genomes the operator needs to produce a child Individual.

Type

int

produce(parents: Sequence[pyshgp.gp.genome.Genome], spawner: pyshgp.gp.genome.GeneSpawner) → pyshgp.gp.genome.Genome[source]

Produce a child Genome from parent Genomes and an optional GeneSpawner.

Parameters
  • parents – A list of parent Genomes given to the operator.

  • spawner – A GeneSpawner that can be used to produce new genes (aka Atoms).
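The effect of the addition rate can be sketched with plain lists. This is an illustration under stated assumptions, not pyshgp's code: genomes are lists of strings, spawn_gene is a hypothetical stand-in for a GeneSpawner, and the exact set of insertion points (before every gene and once at the end) is a guess.

```python
import random
from typing import Callable, List, Sequence


def toy_addition_mutation(parent: Sequence[str],
                          rate: float,
                          spawn_gene: Callable[[], str],
                          rng: random.Random) -> List[str]:
    """Insert a new gene at each insertion point with probability `rate`."""
    child: List[str] = []
    for gene in parent:
        if rng.random() < rate:
            child.append(spawn_gene())  # new gene before the existing one
        child.append(gene)
    if rng.random() < rate:
        child.append(spawn_gene())      # possible new gene at the end
    return child


parent = ["a", "b", "c"]
unchanged = toy_addition_mutation(parent, 0.0, lambda: "NEW", random.Random(1))
saturated = toy_addition_mutation(parent, 1.0, lambda: "NEW", random.Random(1))
```

With a rate of 0.0 the child is a copy of the parent, and with a rate of 1.0 every insertion point receives a new gene.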

class pyshgp.gp.variation.Alternation(alternation_rate=0.01, alignment_deviation=10)[source]

Bases: pyshgp.gp.variation.VariationOperator

Uniformly alternates between the two parent genomes.

Parameters
  • alternation_rate (float, optional (default=0.01)) – The probability of switching which parent program elements are being copied from. Must be 0 <= alternation_rate <= 1. Defaults to 0.01.

  • alignment_deviation (int, optional (default=10)) – The standard deviation of how far alternation may jump between indices when switching between parents.

rate

The probability of switching which parent program elements are being copied from. Must be 0 <= rate <= 1. Defaults to 0.01.

Type

float, optional (default=0.01)

alignment_deviation

The standard deviation of how far alternation may jump between indices when switching between parents.

Type

int, optional (default=10)

num_parents

Number of parent Genomes the operator needs to produce a child Individual.

Type

int

produce(parents: Sequence[pyshgp.gp.genome.Genome], spawner: pyshgp.gp.genome.GeneSpawner = None) → pyshgp.gp.genome.Genome[source]

Produce a child Genome from parent Genomes and an optional GeneSpawner.

Parameters
  • parents – A list of parent Genomes given to the operator.

  • spawner – A GeneSpawner that can be used to produce new genes (aka Atoms).
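How the two parameters interact can be sketched as follows. This is a simplified illustration and not pyshgp's implementation: genomes are lists of strings, the function name toy_alternation is hypothetical, and capping the child at the length of the longer parent is an assumption.

```python
import random
from typing import List, Sequence


def toy_alternation(parent_a: Sequence[str],
                    parent_b: Sequence[str],
                    rate: float,
                    alignment_deviation: float,
                    rng: random.Random) -> List[str]:
    """Copy genes from one parent, occasionally switching to the other."""
    # This sketch rejects rate == 1, which would switch forever without copying.
    if not 0 <= rate < 1:
        raise ValueError("rate must satisfy 0 <= rate < 1 in this sketch")
    parents = (parent_a, parent_b)
    max_len = max(len(parent_a), len(parent_b))  # assumed size cap
    current = 0  # which parent is being copied from
    i = 0        # read position in the current parent
    child: List[str] = []
    while i < len(parents[current]) and len(child) < max_len:
        if rng.random() < rate:
            # Switch parents and jump by a normally distributed offset.
            current = 1 - current
            i = max(0, i + round(rng.gauss(0, alignment_deviation)))
        else:
            child.append(parents[current][i])
            i += 1
    return child


mother = ["a1", "a2", "a3", "a4"]
father = ["b1", "b2", "b3", "b4", "b5"]
copied = toy_alternation(mother, father, 0.0, 10, random.Random(0))
mixed = toy_alternation(mother, father, 0.3, 1.0, random.Random(0))
```

A rate of 0.0 never switches, so the child is a copy of the first parent. A larger alignment_deviation lets the read position drift further each time the parents are switched.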

class pyshgp.gp.variation.Cloning[source]

Bases: pyshgp.gp.variation.VariationOperator

Clones the parent genome.

num_parents

Number of parent Genomes the operator needs to produce a child Individual.

Type

int

produce(parents: Sequence[pyshgp.gp.genome.Genome], spawner: pyshgp.gp.genome.GeneSpawner = None) → pyshgp.gp.genome.Genome[source]

Produce a child Genome from parent Genomes and an optional GeneSpawner.

Parameters
  • parents – A list of parent Genomes given to the operator.

  • spawner – A GeneSpawner that can be used to produce new genes (aka Atoms).

class pyshgp.gp.variation.DeletionMutation(deletion_rate: float = 0.01)[source]

Bases: pyshgp.gp.variation.VariationOperator

Uniformly randomly removes some Atoms from parent.

Parameters

deletion_rate (float) – The probability of removing any given Atom in the parent Genome. Default is 0.01.

rate

The probability of removing any given Atom in the parent Genome. Default is 0.01.

Type

float

num_parents

Number of parent Genomes the operator needs to produce a child Individual.

Type

int

produce(parents: Sequence[pyshgp.gp.genome.Genome], spawner: pyshgp.gp.genome.GeneSpawner) → pyshgp.gp.genome.Genome[source]

Produce a child Genome from parent Genomes and an optional GeneSpawner.

Parameters
  • parents – A list of parent Genomes given to the operator.

  • spawner – A GeneSpawner that can be used to produce new genes (aka Atoms).
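Uniform deletion can be sketched in a few lines. The function below is a hypothetical illustration on plain lists, not pyshgp's implementation; it drops each gene independently with probability rate.

```python
import random
from typing import List, Sequence


def toy_deletion_mutation(parent: Sequence[str],
                          rate: float,
                          rng: random.Random) -> List[str]:
    """Keep each gene unless a uniform draw falls below `rate`."""
    return [gene for gene in parent if rng.random() >= rate]


parent = ["a", "b", "c", "d"]
kept_all = toy_deletion_mutation(parent, 0.0, random.Random(2))
kept_none = toy_deletion_mutation(parent, 1.0, random.Random(2))
thinned = toy_deletion_mutation(parent, 0.5, random.Random(2))
```

The surviving genes keep their original order, so the child is always a subsequence of the parent.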

class pyshgp.gp.variation.Genesis(*, size: Union[int, Sequence[int]])[source]

Bases: pyshgp.gp.variation.VariationOperator

Creates an entirely new (and random) genome.

Parameters

size – The child genome will contain this many Atoms if size is an integer. If size is a pair of integers, the genome will be of a random size in the range of the two integers.

size

The child genome will contain this many Atoms if size is an integer. If size is a pair of integers, the genome will be of a random size in the range of the two integers.

num_parents

Number of parent Genomes the operator needs to produce a child Individual.

Type

int

produce(parents: Sequence[pyshgp.gp.genome.Genome], spawner: pyshgp.gp.genome.GeneSpawner) → pyshgp.gp.genome.Genome[source]

Produce a child Genome from parent Genomes and an optional GeneSpawner.

Parameters
  • parents – A list of parent Genomes given to the operator.

  • spawner – A GeneSpawner that can be used to produce new genes (aka Atoms).
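The two accepted forms of size can be sketched as below. This is an illustration only: toy_genesis and spawn_gene are hypothetical names, and treating both bounds of the pair as inclusive is an assumption that the text above does not confirm.

```python
import random
from typing import Callable, List, Sequence, Union


def toy_genesis(size: Union[int, Sequence[int]],
                spawn_gene: Callable[[], str],
                rng: random.Random) -> List[str]:
    """Build a random genome of a fixed size or of a size drawn from a range."""
    if isinstance(size, int):
        n = size
    else:
        low, high = size
        n = rng.randint(low, high)  # assumption: both bounds are inclusive
    return [spawn_gene() for _ in range(n)]


rng = random.Random(3)
fixed = toy_genesis(5, lambda: "gene", rng)
ranged = toy_genesis((2, 4), lambda: "gene", rng)
```

No parent genes are used here, which is why the operator ignores the parents it is given.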

class pyshgp.gp.variation.LiteralMutation(push_type: pyshgp.push.types.PushType, rate: float = 0.01)[source]

Bases: pyshgp.gp.variation.VariationOperator, abc.ABC

Base class for mutations of literal Atoms.

Parameters
  • push_type (pyshgp.push.types.PushType) – The PushType which the operator can mutate.

  • rate (float) – The probability of applying the mutation to a given Literal.

push_type

The PushType which the operator can mutate.

Type

pyshgp.push.types.PushType

rate

The probability of applying the mutation to a given Literal.

Type

float

num_parents

Number of parent Genomes the operator needs to produce a child Individual.

Type

int

produce(parents: Sequence[pyshgp.gp.genome.Genome], spawner: pyshgp.gp.genome.GeneSpawner = None) → pyshgp.gp.genome.Genome[source]

Produce a child Genome from parent Genomes and an optional GeneSpawner.

Parameters
  • parents – A list of parent Genomes given to the operator.

  • spawner – A GeneSpawner that can be used to produce new genes (aka Atoms).

class pyshgp.gp.variation.VariationOperator(num_parents: int)[source]

Bases: abc.ABC

Base class of all VariationOperators.

Parameters

num_parents (int) – Number of parent Genomes the operator needs to produce a child Individual.

num_parents

Number of parent Genomes the operator needs to produce a child Individual.

Type

int

checknum_parents(parents: Sequence[pyshgp.gp.genome.Genome])[source]

Raise error if given too few parents.

Parameters

parents – A list of parent Genomes given to the operator.

abstract produce(parents: Sequence[pyshgp.gp.genome.Genome], spawner: pyshgp.gp.genome.GeneSpawner) → pyshgp.gp.genome.Genome[source]

Produce a child Genome from parent Genomes and an optional GeneSpawner.

Parameters
  • parents – A list of parent Genomes given to the operator.

  • spawner – A GeneSpawner that can be used to produce new genes (aka Atoms).
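The base-class pattern described above (store num_parents, validate the parents, leave produce abstract) can be sketched with the standard abc module. The class names and the choice of ValueError are assumptions made for this example; they are not taken from pyshgp.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class ToyOperator(ABC):
    """Hypothetical base class mirroring the num_parents contract."""

    def __init__(self, num_parents: int):
        self.num_parents = num_parents

    def check_parents(self, parents: Sequence[List[str]]) -> None:
        # Assumption: too few parents is reported with a ValueError.
        if len(parents) < self.num_parents:
            raise ValueError(
                f"need {self.num_parents} parent(s), got {len(parents)}")

    @abstractmethod
    def produce(self, parents: Sequence[List[str]], spawner=None) -> List[str]:
        """Return a child genome built from the parents."""


class ToyCloning(ToyOperator):
    """Single-parent operator that copies its parent."""

    def __init__(self):
        super().__init__(1)

    def produce(self, parents: Sequence[List[str]], spawner=None) -> List[str]:
        self.check_parents(parents)
        return list(parents[0])


parent = ["a", "b"]
clone = ToyCloning().produce([parent])
```

Subclasses only pass their parent count to the base constructor and implement produce; the check is shared.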

class pyshgp.gp.variation.VariationPipeline(operators: Sequence[pyshgp.gp.variation.VariationOperator])[source]

Bases: pyshgp.gp.variation.VariationOperator

Variation operator that sequentially applies multiple other variation operators.

Parameters

operators (Sequence[VariationOperator]) – A list of operators to apply in order to produce the child Genome.

operators

A list of operators to apply in order to produce the child Genome.

Type

Sequence[VariationOperator]

num_parents

Number of parent Genomes the operator needs to produce a child Individual.

Type

int

produce(parents: Sequence[pyshgp.gp.genome.Genome], spawner: pyshgp.gp.genome.GeneSpawner) → pyshgp.gp.genome.Genome[source]

Produce a child Genome from parent Genomes and an optional GeneSpawner.

Parameters
  • parents – A list of parent Genomes given to the operator.

  • spawner – A GeneSpawner that can be used to produce new genes (aka Atoms).
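Sequential application means that the output of one operator becomes the input of the next, so the order of the operators matters. A minimal sketch, using plain functions on lists in place of VariationOperator objects (all names here are hypothetical):

```python
from typing import Callable, List, Sequence

Genome = List[str]
Step = Callable[[Genome], Genome]


def toy_pipeline(steps: Sequence[Step]) -> Step:
    """Compose single-parent steps into one step applied left to right."""
    def produce(parent: Genome) -> Genome:
        child = list(parent)
        for step in steps:
            child = step(child)  # each step sees the previous step's output
        return child
    return produce


def double(genome: Genome) -> Genome:
    return genome + genome


def drop_first(genome: Genome) -> Genome:
    return genome[1:]


forward = toy_pipeline([double, drop_first])(["a", "b"])
backward = toy_pipeline([drop_first, double])(["a", "b"])
```

Here forward is ["b", "a", "b"] while backward is ["b", "b"], which shows that reordering the same operators changes the child.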

class pyshgp.gp.variation.VariationStrategy[source]

Bases: pyshgp.utils.DiscreteProbDistrib

A collection of VariationOperators and how frequently to use them.

add(op: pyshgp.gp.variation.VariationOperator, p: float)[source]

Add an element with a relative probability.

Parameters
  • op (VariationOperator) – The VariationOperator to add to the variation strategy.

  • p (float) – The probability of using the given operator relative to the other operators that have been added to the VariationStrategy.
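Relative probabilities can be read as weights that are normalized over everything added so far. The sketch below illustrates that reading with the standard library; ToyStrategy is a hypothetical class, operators are represented by their names only, and the normalization is an assumption about the behavior rather than pyshgp's code.

```python
import random
from typing import List


class ToyStrategy:
    """Hypothetical weighted collection of operator names."""

    def __init__(self) -> None:
        self.elements: List[str] = []
        self._weights: List[float] = []

    def add(self, op: str, p: float) -> None:
        self.elements.append(op)
        self._weights.append(p)

    def probabilities(self) -> List[float]:
        # Normalize the relative weights so that they sum to 1.
        total = sum(self._weights)
        return [w / total for w in self._weights]

    def sample(self, rng: random.Random) -> str:
        return rng.choices(self.elements, weights=self._weights, k=1)[0]


strategy = ToyStrategy()
strategy.add("umad", 3.0)
strategy.add("alternation", 1.0)
picked = strategy.sample(random.Random(4))
```

With weights 3.0 and 1.0, the first operator is used three times as often as the second, whatever the absolute values are.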

elements

pyshgp.gp.variation.choice(a, size=None, replace=True, p=None)

Generates a random sample from a given 1-D array.

New in version 1.7.0.

Note

New code should use the choice method of a default_rng() instance instead; see random-quick-start.

Parameters
  • a (1-D array-like or int) – If an ndarray, a random sample is generated from its elements. If an int, the random sample is generated as if a were np.arange(a)

  • size (int or tuple of ints, optional) – Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.

  • replace (boolean, optional) – Whether the sample is with or without replacement

  • p (1-D array-like, optional) – The probabilities associated with each entry in a. If not given the sample assumes a uniform distribution over all entries in a.

Returns

samples – The generated random samples

Return type

single item or ndarray

Raises

ValueError – If a is an int and less than zero, if a or p are not 1-dimensional, if a is an array-like of size 0, if p is not a vector of probabilities, if a and p have different lengths, or if replace=False and the sample size is greater than the population size

See also

randint(), shuffle(), permutation()

Generator.choice()

which should be used in new code

Notes

Sampling random rows from a 2-D array is not possible with this function, but is possible with Generator.choice through its axis keyword.

Examples

Generate a uniform random sample from np.arange(5) of size 3:

>>> np.random.choice(5, 3)
array([0, 3, 4]) # random
>>> #This is equivalent to np.random.randint(0,5,3)

Generate a non-uniform random sample from np.arange(5) of size 3:

>>> np.random.choice(5, 3, p=[0.1, 0, 0.3, 0.6, 0])
array([3, 3, 0]) # random

Generate a uniform random sample from np.arange(5) of size 3 without replacement:

>>> np.random.choice(5, 3, replace=False)
array([3,1,0]) # random
>>> #This is equivalent to np.random.permutation(np.arange(5))[:3]

Generate a non-uniform random sample from np.arange(5) of size 3 without replacement:

>>> np.random.choice(5, 3, replace=False, p=[0.1, 0, 0.3, 0.6, 0])
array([2, 3, 0]) # random

Any of the above can be repeated with an arbitrary array-like instead of just integers. For instance:

>>> aa_milne_arr = ['pooh', 'rabbit', 'piglet', 'Christopher']
>>> np.random.choice(aa_milne_arr, 5, p=[0.5, 0.1, 0.1, 0.3])
array(['pooh', 'pooh', 'pooh', 'Christopher', 'piglet'], # random
      dtype='<U11')

pyshgp.gp.variation.get_variation_operator(name: str, **kwargs) → pyshgp.gp.variation.VariationOperator[source]

Get the variation operator class with the given name.

pyshgp.gp.variation.random(size=None)

Return random floats in the half-open interval [0.0, 1.0). Alias for random_sample to ease forward-porting to the new random API.

Module contents

pyshgp.push.gp.