pyshgp.gp package¶
pyshgp.gp.estimators module¶
The estimator
module defines a PushEstimator
class.
-
class
pyshgp.gp.estimators.
PushEstimator
(spawner: pyshgp.gp.genome.GeneSpawner, search: str = 'GA', selector: Union[pyshgp.gp.selection.Selector, str] = 'lexicase', variation_strategy: Union[pyshgp.gp.variation.VariationStrategy, dict, str] = 'umad', population_size: int = 300, max_generations: int = 100, initial_genome_size: Tuple[int, int] = 20, 100, simplification_steps: int = 2000, last_str_from_stdout: bool = False, interpreter: pyshgp.push.interpreter.PushInterpreter = 'default', parallelism: Union[int, bool] = False, push_config: pyshgp.push.config.PushConfig = 'default', verbose: int = 0, **kwargs)[source]¶ Bases:
object
Simple estimator that synthesizes Push programs.
- Parameters
spawner (Union[GeneSpawner, str], optional) – The GeneSpawner to use when producing Genomes during initialization and variation. Default is all core instructions, no literals, and no ERC Generators.
search (Union[SearchAlgorithm, str], optional) – The search algorithm, or its abbreviation, to use when synthesizing Push programs.
selector (Union[Selector, str], optional) – The selector, or name of selector, to use when selecting parents. The default is lexicase selection.
variation_strategy (Union[VariationStrategy, dict, str]) – A VariationStrategy describing a collection of VariationOperators and how frequently to use them. If a dict is supplied, keys should be operator names and values should be the corresponding probabilities. If a string is provided, the VariationOperator with that name will always be used. Default is "umad".
population_size (int, optional) – The number of individuals held in the population each generation. Default is 300.
max_generations (int, optional) – The number of generations to run the search algorithm. Default is 100.
initial_genome_size (Tuple[int, int], optional) – The range of genome sizes to produce during initialization. Default is (20, 100)
simplification_steps (int) – The number of simplification iterations to apply to the best Push program produced by the search algorithm. Default 2000.
interpreter (PushInterpreter, optional) – The PushInterpreter to use when making predictions. Also holds the instruction set to use
parallelism (Union[int, bool], optional) – Set the number of processes to spawn for use when performing embarrassingly parallel tasks. If False, no processes will spawn and computation will be serial. If True, one process is spawned per available CPU. Default is False.
verbose (int, optional) – Indicates if verbose printing should be used during searching. Default is 0. Options are 0, 1, or 2.
**kwargs – Arbitrary keyword arguments. Examples of supported arguments are epsilon (bool or float) when using Lexicase as the selector, and tournament_size (int) when using tournament selection.
-
fit
(X, y)[source]¶ Run the search algorithm to synthesize a push program.
- Parameters
X (pandas dataframe of shape = [n_samples, n_features]) – The training input samples.
y (list, array-like, or pandas dataframe.) – The target values (class labels in classification, real numbers in regression). Shape = [n_samples] or [n_samples, n_outputs]
-
load
(filepath: str)[source]¶ Load a found solution from a JSON file.
- Parameters
filepath – Filepath to read the serialized search result from.
-
predict
(X)[source]¶ Execute the synthesized push program on a dataset.
- Parameters
X (pandas dataframe of shape = [n_samples, n_features]) – The set of cases to predict.
- Returns
y_hat
- Return type
pandas dataframe of shape = [n_samples, n_outputs]
-
save
(filepath: str)[source]¶ Save the found solution to a JSON file.
- Parameters
filepath – Filepath to write the serialized search result to.
-
score
(X, y)[source]¶ Evaluate the synthesized Push program on a labeled dataset.
- Parameters
X (pandas dataframe of shape = [n_samples, n_features]) – The input samples to evaluate the program on.
y (list, array-like, or pandas dataframe.) – The target values (class labels in classification, real numbers in regression). Shape = [n_samples] or [n_samples, n_outputs]
pyshgp.gp.evaluation module¶
The evaluation
module defines classes to evaluate program CodeBlocks.
-
class
pyshgp.gp.evaluation.
DatasetEvaluator
(X, y, interpreter: pyshgp.push.interpreter.PushInterpreter = 'default', penalty: float = 1000000.0)[source]¶ Bases:
pyshgp.gp.evaluation.Evaluator
Evaluator driven by a labeled dataset.
-
class
pyshgp.gp.evaluation.
Evaluator
(interpreter: pyshgp.push.interpreter.PushInterpreter = 'default', penalty: float = 1000000.0)[source]¶ Bases:
abc.ABC
Base class for evaluators.
- Parameters
interpreter (PushInterpreter, optional) – PushInterpreter used to run programs and get their output. Default is an interpreter with the default configuration and all core instructions registered.
penalty (float, optional) – When a program’s output cannot be evaluated on a particular case, the penalty error is assigned. Default is 1e6.
verbosity_config (Optional[VerbosityConfig] (default = None)) – A VerbosityConfig controlling what is logged during evaluation. Default is no verbosity.
-
default_error_function
(actuals, expecteds) → numpy.array[source]¶ Produce errors of actual program output given expected program output.
The default error function is intended to be a universal error function for Push programs which only output a subset of the standard data types.
- Parameters
actuals (list) – The values produced by running a Push program on a sequence of cases.
expecteds (list) – The ground truth values for the sequence of cases used to produce the actuals.
- Returns
An array of error values describing the program’s performance.
- Return type
np.array
-
class
pyshgp.gp.evaluation.
FunctionEvaluator
(error_function: Callable)[source]¶ Bases:
pyshgp.gp.evaluation.Evaluator
Evaluator driven by an error function.
-
pyshgp.gp.evaluation.
damerau_levenshtein_distance
(a: Union[str, Sequence], b: Union[str, Sequence]) → int[source]¶ Damerau Levenshtein Distance that works for both strings and lists.
https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance. This implementation is heavily inspired by the implementation in the jellyfish package. https://github.com/jamesturk/jellyfish
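As a rough illustration of an edit distance that accepts both strings and lists, the sketch below implements the restricted "optimal string alignment" variant of the Damerau-Levenshtein distance. It is not PyshGP's or jellyfish's code, and the function name osa_distance is hypothetical.

```python
from typing import Sequence


def osa_distance(a: Sequence, b: Sequence) -> int:
    """Optimal string alignment distance over any two indexable sequences."""
    rows, cols = len(a) + 1, len(b) + 1
    # dist[i][j] holds the distance between a[:i] and b[:j].
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i
    for j in range(cols):
        dist[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution
            )
            # Adjacent transposition, e.g. "ab" -> "ba".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                dist[i][j] = min(dist[i][j], dist[i - 2][j - 2] + 1)
    return dist[-1][-1]
```

Because the function only indexes and compares elements, the same code handles a str and a list of arbitrary comparable items.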
pyshgp.gp.genome module¶
The genome
module defines the Genome
type and provides genome translation, spawning and simplification.
A Genome
is a persistent collection of gene Atoms (any Atom that isn’t a CodeBlock
). It can be translated
into a CodeBlock
.
The GeneSpawner
is a factory capable of generating random genes and random genomes. It is used for initializing a
population as well as producing new genes used by variation operators (i.e., mutation).
The genome simplification process is useful for removing superfluous genes from a genome without negatively impacting the behavior of the program produced by the genome. This process has many benefits including: improving generalization, shrinking the size of the serialized solution, and in some cases making the program easier to explain.
-
class
pyshgp.gp.genome.
GeneSpawner
(n_inputs: int, instruction_set: Union[pyshgp.push.instruction_set.InstructionSet, str], literals: Sequence[Any], erc_generators: Sequence[Callable], distribution: pyshgp.utils.DiscreteProbDistrib = 'proportional')[source]¶ Bases:
object
A factory of random Genes (Atoms) and Genomes.
When spawning a random gene, the result can be one of three types of Atoms. An Instruction, a Closer, or a Literal. If the Atom is a Literal, it may be one of the supplied Literals, or it may be the result of running one of the Ephemeral Random Constant generators.
Reference for ERCs: “A field guide to genetic programming”, Section 3.1 Riccardo Poli and William B. Langdon and Nicholas Freitag McPhee, http://www.gp-field-guide.org.uk/
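The spawning behaviour described above can be sketched without PyshGP. ToySpawner below is a hypothetical stand-in, not the real GeneSpawner: genes are plain (kind, value) tuples, the type weighting is an assumption made for the sketch, and a size pair is treated as an inclusive range.

```python
import random
from typing import Any, Callable, Sequence, Tuple, Union


class ToySpawner:
    """Hypothetical stand-in for a gene factory; not PyshGP's GeneSpawner."""

    def __init__(self, instructions: Sequence[str], literals: Sequence[Any],
                 erc_generators: Sequence[Callable[[], Any]], seed: int = 0):
        self.instructions = list(instructions)
        self.literals = list(literals)
        self.erc_generators = list(erc_generators)
        self.rng = random.Random(seed)

    def random_gene(self) -> Tuple[str, Any]:
        # Weight each gene type by how many options it offers, plus one closer.
        # This weighting is an assumption made for the sketch.
        kinds = ["instruction", "close", "literal", "erc"]
        weights = [len(self.instructions), 1, len(self.literals), len(self.erc_generators)]
        kind = self.rng.choices(kinds, weights=weights, k=1)[0]
        if kind == "instruction":
            return kind, self.rng.choice(self.instructions)
        if kind == "close":
            return kind, None
        if kind == "literal":
            return kind, self.rng.choice(self.literals)
        # An ERC generator is called and its output is wrapped as a literal.
        return "literal", self.rng.choice(self.erc_generators)()

    def spawn_genome(self, size: Union[int, Sequence[int]]) -> list:
        if not isinstance(size, int):
            low, high = size
            size = self.rng.randint(low, high)  # inclusive bounds, an assumption
        return [self.random_gene() for _ in range(size)]


spawner = ToySpawner(["int_add", "int_dup"], [1, 2], [random.random])
genome = spawner.spawn_genome((3, 6))
```

Note that an ERC never appears in the genome as such; only the literal it produced does.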
-
n_inputs
¶ Number of input instructions that could appear in the genomes.
- Type
int
-
instruction_set
¶ InstructionSet containing instructions to use when spawning genes and genomes.
-
literals
¶ A list of Literal objects to pull from when spawning genes and genomes.
- Type
Sequence[pyshgp.push.instruction_set.atoms.Literal]
-
erc_generators
¶ A list of functions (aka Ephemeral Random Constant generators). When one of these functions is called, the output is placed in a Literal and returned as the spawned gene.
- Type
Sequence[Callable]
-
distribution
¶ A probability distribution describing how frequently to produce Instructions, Closers, Literals, and ERCs.
-
random_erc
() → pyshgp.push.atoms.Literal[source]¶ Materialize a random ERC generator into a Literal and return it.
- Returns
A Literal whose value comes from running an ERC generator function.
- Return type
-
random_gene
() → pyshgp.push.atoms.Atom[source]¶ Return a random Atom based on the GeneSpawner’s distribution.
- Returns
A random Atom. Either an Instruction, Closer, or Literal.
- Return type
-
random_input
() → pyshgp.push.atoms.Input[source]¶ Return a random Input.
- Returns
- Return type
-
random_instruction
() → pyshgp.push.atoms.InstructionMeta[source]¶ Return a random Instruction from the InstructionSet.
- Returns
A randomly selected Instruction.
- Return type
-
random_literal
() → pyshgp.push.atoms.Literal[source]¶ Return a random Literal from the set of Literals.
- Returns
A randomly selected Literal.
- Return type
-
spawn_genome
(size: Union[int, Sequence[int]]) → pyshgp.gp.genome.Genome[source]¶ Return a random Genome based on the GeneSpawner’s distribution.
The genome will contain the specified number of Atoms if size is an integer. If size is a pair of integers, the genome will be of a random size in the range of the two integers.
- Parameters
size – The resulting genome will contain this many Atoms if size is an integer. If size is a pair of integers, the genome will be of a random size in the range of the two integers.
- Returns
A Genome with random contents of a given size.
- Return type
-
-
class
pyshgp.gp.genome.
GeneTypes
(value)[source]¶ Bases:
enum.Enum
An Enum denoting the different types of genes that can appear in a Genome.
-
CLOSE
= 3¶
-
ERC
= 5¶
-
INPUT
= 1¶
-
INSTRUCTION
= 2¶
-
LITERAL
= 4¶
-
-
class
pyshgp.gp.genome.
Genome
(initial=())[source]¶ Bases:
pyrsistent._checked_types.CheckedPVector
A linear sequence of genes (aka any atom that isn’t a CodeBlock).
PyshGP uses the Plushy genome representation.
See: http://gpbib.cs.ucl.ac.uk/gp-html/Spector_2019_GPTP.html
-
class
pyshgp.gp.genome.
GenomeSimplifier
(evaluator: pyshgp.gp.evaluation.Evaluator, program_signature: pyshgp.push.program.ProgramSignature)[source]¶ Bases:
object
Simplifies a genome while preserving, or improving, its error.
Genomes, and Push programs, can contain superfluous Push code. This extra code often has no effect on the program behavior, but occasionally it can introduce subtle errors or behaviors that are not covered by the training cases. Removing the superfluous code makes genomes (and thus programs) smaller and easier to understand. More importantly, simplification can improve the generalization of the given genome/program.
The process of genome simplification is iterative and closely resembles simple hill climbing. For each iteration, the simplifier will randomly select a small number of random genes to remove. The Genome is re-evaluated and if its error gets worse, the change is reverted. After repeating this for some number of steps, the resulting genome will be the same size or smaller while having the same (or better) error value.
Reference: “Improving generalization of evolved programs through automatic simplification” Thomas Helmuth, Nicholas Freitag McPhee, Edward Pantridge, and Lee Spector. 2017. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO ‘17). ACM, New York, NY, USA, 937-944. DOI: https://doi.org/10.1145/3071178.3071330
See: https://dl.acm.org/citation.cfm?id=3071178.3071330
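The hill-climbing loop described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the GenomeSimplifier implementation: the genome is a plain list, error_fn is a hypothetical callable returning a single number, and removing one to three genes per step is an arbitrary choice.

```python
import random
from typing import Callable, List, Sequence, Tuple


def simplify(genome: Sequence, error_fn: Callable[[Sequence], float],
             steps: int = 2000, seed: int = 0) -> Tuple[List, float]:
    """Hill-climbing simplification: drop genes, keep the change unless error worsens."""
    rng = random.Random(seed)
    best = list(genome)
    best_error = error_fn(best)
    for _ in range(steps):
        if not best:
            break
        # Remove between 1 and 3 randomly chosen genes (the range is an assumption).
        n_remove = min(len(best), rng.randint(1, 3))
        drop = set(rng.sample(range(len(best)), n_remove))
        candidate = [g for i, g in enumerate(best) if i not in drop]
        candidate_error = error_fn(candidate)
        if candidate_error <= best_error:
            best, best_error = candidate, candidate_error
        # Otherwise the removal is reverted by simply keeping `best`.
    return best, best_error


# Toy problem: genes are integers, the "program" sums them, and zeros are superfluous.
target = 10
original = [3, 0, 7, 0, 0, 0]
smaller, err = simplify(original, lambda g: abs(sum(g) - target), steps=200)
```

In the toy problem, any removal that changes the sum raises the error and is rejected, so only the zero genes can be dropped.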
-
simplify
(genome: pyshgp.gp.genome.Genome, original_errors: numpy.ndarray, steps: int = 2000) → Tuple[pyshgp.gp.genome.Genome, numpy.ndarray][source]¶ Simplify the given genome while maintaining error.
- Parameters
genome – The Genome to simplify.
original_errors – Error vector of the genome to simplify.
steps – Number of simplification iterations to perform. Default is 2000.
- Returns
The shorter Genome that expresses the same computation.
- Return type
-
-
class
pyshgp.gp.genome.
Opener
(**kwargs)[source]¶ Bases:
pyrsistent._precord.PRecord
Marks the start of one or more CodeBlocks.
-
dec
() → pyshgp.gp.genome.Opener[source]¶ Create an Opener with count decremented.
-
-
pyshgp.gp.genome.
genome_to_code
(genome: pyshgp.gp.genome.Genome) → pyshgp.push.atoms.CodeBlock[source]¶ Translate into nested CodeBlocks.
These CodeBlocks can be considered the Push program representation of the Genome which can be executed by a PushInterpreter and evaluated by an Evaluator.
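A sketch of how a linear, Plushy-style genome might be translated into nested blocks is shown below. It is an assumption-laden illustration, not PyshGP's code: genes are strings, nested lists stand in for CodeBlocks, the "close" marker and the opens table (instruction name to number of blocks it opens) are hypothetical, and _Open plays the role of the Opener record.

```python
from typing import Dict, List, Sequence

CLOSE = "close"  # hypothetical marker for a Closer gene


class _Open:
    """Placeholder marking the start of `count` pending code blocks."""

    def __init__(self, count: int):
        self.count = count


def genome_to_nested(genome: Sequence[str], opens: Dict[str, int]) -> List:
    """Translate a linear genome into nested lists standing in for CodeBlocks."""
    # Follow every block-opening instruction with an opener placeholder.
    genes: List = []
    for gene in genome:
        genes.append(gene)
        if opens.get(gene, 0) > 0:
            genes.append(_Open(opens[gene]))

    push: List = []

    def close_innermost() -> bool:
        # Find the most recent opener and wrap everything after it in a block.
        for idx in range(len(push) - 1, -1, -1):
            if isinstance(push[idx], _Open):
                opener, block = push[idx], push[idx + 1:]
                del push[idx:]
                push.append(block)
                if opener.count > 1:
                    # More blocks remain, so re-open with the count decremented.
                    push.append(_Open(opener.count - 1))
                return True
        return False  # unmatched close: ignored

    for gene in genes:
        if gene == CLOSE:
            close_innermost()
        else:
            push.append(gene)
    # Close any blocks still open at the end of the genome.
    while close_innermost():
        pass
    return push
```

For example, with opens = {"exec_if": 2}, the genome ["a", "exec_if", "b", "close", "c", "close", "d"] becomes ["a", "exec_if", ["b"], ["c"], "d"].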
pyshgp.gp.individual module¶
The individual
module defines an Individual in an evolutionary population.
Individuals are made up of Genomes, which are the linear Push program representations which can be manipulated by search algorithms.
-
class
pyshgp.gp.individual.
Individual
(genome: pyshgp.gp.genome.Genome, signature: pyshgp.push.program.ProgramSignature)[source]¶ Bases:
pyshgp.utils.Saveable
,pyshgp.utils.Copyable
An individual in an evolutionary population.
-
error_vector
¶ An array of error values produced by evaluating the Individual’s program.
- Type
np.array
-
total_error
¶ The sum of all error values in the Individual’s error_vector.
- Type
float
-
error_vector_bytes
¶ Hashable Byte representation of the individual’s error vector.
-
property
error_vector
¶ Numpy array of numeric error values.
-
property
error_vector_bytes
¶ Hashable Byte representation of the individual’s error vector.
-
genome
¶
-
id
¶
-
property
program
¶ Push program of individual. Taken from Plush genome.
-
signature
¶
-
property
total_error
¶ Numeric sum of the error vector.
-
pyshgp.gp.population module¶
The population
module defines an evolutionary population of Individuals.
-
class
pyshgp.gp.population.
Population
(individuals: list = None)[source]¶ Bases:
collections.abc.Sequence
A sequence of Individuals kept in sorted order, with respect to their total errors.
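One simple way to keep a collection ordered by total error is to insert with bisect. The class below is a hypothetical sketch, not the Population class: it stores (total_error, name) pairs and ignores unevaluated individuals.

```python
import bisect
from typing import List, Tuple


class SortedPopulation:
    """Hypothetical container keeping (total_error, name) pairs in ascending order."""

    def __init__(self) -> None:
        self._items: List[Tuple[float, str]] = []

    def add(self, total_error: float, name: str) -> None:
        # insort keeps the list ordered, so the best individual is always first.
        bisect.insort(self._items, (total_error, name))

    def best(self) -> Tuple[float, str]:
        return self._items[0]

    def __len__(self) -> int:
        return len(self._items)


pop = SortedPopulation()
for total_error, name in [(3.0, "a"), (1.0, "b"), (2.0, "c")]:
    pop.add(total_error, name)
```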
-
add
(individual: pyshgp.gp.individual.Individual)[source]¶ Add an Individual to the population.
-
evaluate
(evaluator: pyshgp.gp.evaluation.Evaluator)[source]¶ Evaluate all unevaluated individuals in the population.
-
evaluated
¶
-
p_evaluate
(evaluator_proxy, pool: multiprocessing.context.BaseContext.Pool)[source]¶ Evaluate all unevaluated individuals in the population in parallel.
-
unevaluated
¶
-
pyshgp.gp.search module¶
The search
module defines algorithms to search for Push programs.
-
class
pyshgp.gp.search.
GeneticAlgorithm
(config: pyshgp.gp.search.SearchConfiguration)[source]¶ Bases:
pyshgp.gp.search.SearchAlgorithm
Genetic algorithm to synthesize Push programs.
An initial Population of random Individuals is created. Each generation begins by evaluating all Individuals in the population. Then the current Population is replaced with children produced by selecting parents from the Population and applying VariationOperators to them.
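The generational loop can be illustrated on a toy problem. The sketch below is not the GeneticAlgorithm class: genomes are bit lists, a size-3 tournament stands in for the configured selector, bit-flip mutation stands in for the variation operators, and all names are hypothetical.

```python
import random
from typing import Callable, List


def toy_genetic_algorithm(error_fn: Callable[[List[int]], float], genome_len: int = 8,
                          population_size: int = 30, max_generations: int = 50,
                          seed: int = 0) -> List[int]:
    """Generational loop: evaluate, stop if solved, else replace population with children."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(population_size)]
    best = min(population, key=error_fn)
    for _ in range(max_generations):
        # Evaluate every individual and remember the best seen so far.
        population.sort(key=error_fn)
        if error_fn(population[0]) < error_fn(best):
            best = population[0]
        if error_fn(best) == 0:
            break
        children = []
        for _ in range(population_size):
            # Parent selection: a size-3 tournament stands in for the configured selector.
            parent = min(rng.sample(population, 3), key=error_fn)
            # Variation: flip each bit with a small probability.
            children.append([bit ^ 1 if rng.random() < 0.1 else bit for bit in parent])
        population = children
    return best


# Toy objective: the error is the number of zero bits.
best = toy_genetic_algorithm(lambda g: g.count(0))
```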
-
class
pyshgp.gp.search.
ParallelContext
(spawner: pyshgp.gp.genome.GeneSpawner, evaluator: pyshgp.gp.evaluation.Evaluator, n_proc: Optional[int] = None)[source]¶ Bases:
object
Holds the objects needed to coordinate parallelism.
-
class
pyshgp.gp.search.
SearchAlgorithm
(config: pyshgp.gp.search.SearchConfiguration)[source]¶ Bases:
abc.ABC
Base class for all search algorithms.
- Parameters
config (SearchConfiguration) – The configuration of the search algorithm.
-
config
¶ The configuration of the search algorithm.
- Type
-
generation
¶ The current generation, or iteration, of the search.
- Type
int
-
best_seen
¶ The best Individual, with respect to total error, seen so far.
- Type
-
population
¶ The current Population of individuals.
- Type
-
is_solved
() → bool[source]¶ Return True if the search algorithm has found a solution or False otherwise.
-
run
() → pyshgp.gp.individual.Individual[source]¶ Run the algorithm until termination.
-
abstract
step
() → bool[source]¶ Perform one generation (step) of the search.
The step method should assume an evaluated Population, and must only perform parent selection and variation (producing children). The step method should modify the search algorithm’s population in-place, or assign a new Population to the population attribute.
-
class
pyshgp.gp.search.
SearchConfiguration
(signature: pyshgp.push.program.ProgramSignature, evaluator: pyshgp.gp.evaluation.Evaluator, spawner: pyshgp.gp.genome.GeneSpawner, selection: Union[pyshgp.gp.selection.Selector, pyshgp.utils.DiscreteProbDistrib, str] = 'lexicase', variation: Union[pyshgp.gp.variation.VariationOperator, pyshgp.utils.DiscreteProbDistrib, str] = 'umad', population_size: int = 500, max_generations: int = 100, error_threshold: float = 0.0, initial_genome_size: Tuple[int, int] = 10, 50, simplification_steps: int = 2000, parallelism: Union[int, bool] = True, **kwargs)[source]¶ Bases:
object
Configuration of a search algorithm.
- Parameters
evaluator (Evaluator) – The Evaluator to use when evaluating individuals.
spawner (GeneSpawner) – The GeneSpawner to use when producing Genomes during initialization and variation.
selection (Union[Selector, DiscreteProbDistrib, str], optional) – A Selector, or DiscreteProbDistrib of selectors, to use when selecting parents. The default is lexicase selection.
variation (Union[VariationOperator, DiscreteProbDistrib, str], optional) – A VariationOperator, or DiscreteProbDistrib of VariationOperators, to use during variation. Default is SIZE_NEUTRAL_UMAD.
population_size (int, optional) – The number of individuals held in the population each generation. Default is 500.
max_generations (int, optional) – The number of generations to run the search algorithm. Default is 100.
error_threshold (float, optional) – If the search algorithm finds an Individual with a total error less than this value, stop searching. Default is 0.0.
initial_genome_size (Tuple[int, int], optional) – The range of genome sizes to produce during initialization. Default is (10, 50).
simplification_steps (int, optional) – The number of simplification iterations to apply to the best Push program produced by the search algorithm. Default is 2000.
parallelism (Union[int, bool], optional) – Set the number of processes to spawn for use when performing embarrassingly parallel tasks. If False, no processes will spawn and computation will be serial. Default is True, which spawns one process per available CPU.
verbosity_config (Union[VerbosityConfig, str], optional) – A VerbosityConfig controlling what is logged during the search. Default is no verbosity.
-
class
pyshgp.gp.search.
SimulatedAnnealing
(config: pyshgp.gp.search.SearchConfiguration)[source]¶ Bases:
pyshgp.gp.search.SearchAlgorithm
Algorithm to synthesize Push programs with Simulated Annealing.
At each step (generation), the simulated annealing heuristic mutates the current Individual, and probabilistically decides between accepting or rejecting the child. If the child is accepted, it becomes the new current Individual.
After each step, the simulated annealing system cools its temperature. As the temperature lowers, the probability of accepting a child that does not have a lower total error than the current Individual decreases.
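The accept-or-reject step and the cooling schedule can be sketched on a one-dimensional toy problem. This is not the SimulatedAnnealing class: the candidate is an integer, the acceptance rule exp(-delta / T) and the geometric cooling factor are conventional choices assumed for illustration, and all names are hypothetical.

```python
import math
import random
from typing import Callable, Tuple


def toy_simulated_annealing(start: int, error_fn: Callable[[int], float],
                            steps: int = 500, start_temp: float = 10.0,
                            cooling: float = 0.99, seed: int = 0) -> Tuple[int, float]:
    """Mutate the current candidate, accept it probabilistically, then cool."""
    rng = random.Random(seed)
    current, current_error = start, error_fn(start)
    best, best_error = current, current_error
    temp = start_temp
    for _ in range(steps):
        child = current + rng.choice([-1, 1])  # mutation
        child_error = error_fn(child)
        delta = child_error - current_error
        # Improvements are always accepted; worse children pass with prob. exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_error = child, child_error
        if current_error < best_error:
            best, best_error = current, current_error
        temp *= cooling  # geometric cooling schedule, an assumption
    return best, best_error


best, best_error = toy_simulated_annealing(start=0, error_fn=lambda x: abs(x - 7))
```

Early on, the high temperature lets the search accept worse candidates and escape local minima; as the temperature falls the search behaves more like plain hill climbing.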
-
pyshgp.gp.search.
get_search_algo
(name: str, **kwargs) → pyshgp.gp.search.SearchAlgorithm[source]¶ Return the search algorithm class with the given name.
pyshgp.gp.selection module¶
The selection
module defines classes to select Individuals from Populations.
-
class
pyshgp.gp.selection.
CaseStream
(n_cases: int)[source]¶ Bases:
object
A generator of indices yielded in a random order.
-
class
pyshgp.gp.selection.
Elite
[source]¶ Bases:
pyshgp.gp.selection.Selector
Returns the best N individuals by total error.
-
select
(population: pyshgp.gp.population.Population, n: int = 1) → Sequence[pyshgp.gp.individual.Individual][source]¶ Return n individuals from the population.
- Parameters
population – A Population of Individuals.
n (int) – The number of parents to select from the population. Default is 1.
- Returns
The selected Individuals.
- Return type
Sequence[Individual]
-
select_one
(population: pyshgp.gp.population.Population) → pyshgp.gp.individual.Individual[source]¶ Return single individual from population.
- Parameters
population – A Population of Individuals.
- Returns
The selected Individual.
- Return type
-
-
class
pyshgp.gp.selection.
FitnessProportionate
[source]¶ Bases:
pyshgp.gp.selection.Selector
Fitness proportionate selection, also known as roulette wheel selection.
See: https://en.wikipedia.org/wiki/Fitness_proportionate_selection
-
select
(population: pyshgp.gp.population.Population, n: int = 1) → Sequence[pyshgp.gp.individual.Individual][source]¶ Return n individuals from the population.
- Parameters
population – A Population of Individuals.
n (int) – The number of parents to select from the population. Default is 1.
- Returns
The selected Individuals.
- Return type
Sequence[Individual]
-
select_one
(population: pyshgp.gp.population.Population) → pyshgp.gp.individual.Individual[source]¶ Return single individual from population.
- Parameters
population – A Population of Individuals.
- Returns
The selected Individual.
- Return type
-
-
class
pyshgp.gp.selection.
Lexicase
(epsilon: Union[bool, float, numpy.ndarray] = False)[source]¶ Bases:
pyshgp.gp.selection.SimpleMultiSelectorMixin
,pyshgp.gp.selection.Selector
Lexicase Selection.
All training cases are considered iteratively in a random order. For each training case, the population is filtered to only contain the Individuals which have an error value within epsilon of the best error value on that case. This filtering is repeated until the population is down to a single Individual or all cases have been used. After the filtering iterations, a random Individual from the remaining set is returned as the selected Individual.
See: https://ieeexplore.ieee.org/document/6920034
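The filtering procedure described above can be sketched over plain lists of error values. This is a minimal illustration, not the Lexicase class: individuals are identified by index, epsilon is a single float, and the function name is hypothetical.

```python
import random
from typing import List, Sequence


def lexicase_select(error_vectors: Sequence[Sequence[float]], epsilon: float = 0.0,
                    seed: int = 0) -> int:
    """Return the index of the individual chosen by (epsilon-)lexicase selection."""
    rng = random.Random(seed)
    candidates: List[int] = list(range(len(error_vectors)))
    cases = list(range(len(error_vectors[0])))
    rng.shuffle(cases)  # consider training cases in a random order
    for case in cases:
        if len(candidates) <= 1:
            break
        best = min(error_vectors[i][case] for i in candidates)
        # Keep only individuals within epsilon of the best error on this case.
        candidates = [i for i in candidates if error_vectors[i][case] <= best + epsilon]
    return rng.choice(candidates)


errors = [
    [0.0, 5.0, 0.0],  # individual 0
    [1.0, 0.0, 3.0],  # individual 1
    [0.0, 0.0, 0.0],  # individual 2: best on every case
    [2.0, 2.0, 2.0],  # individual 3
]
```

With epsilon at 0.0, individual 2 survives every filter regardless of case order, so it is always the one selected from this example.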
-
select
(population: pyshgp.gp.population.Population, n: int = 1) → Sequence[pyshgp.gp.individual.Individual][source]¶ Return n individuals from the population.
- Parameters
population – A Population of Individuals.
n (int) – The number of parents to select from the population. Default is 1.
- Returns
The selected Individuals.
- Return type
Sequence[Individual]
-
select_one
(population: pyshgp.gp.population.Population) → pyshgp.gp.individual.Individual[source]¶ Return single individual from population.
- Parameters
population – A Population of Individuals.
- Returns
The selected Individual.
- Return type
-
-
class
pyshgp.gp.selection.
Selector
[source]¶ Bases:
abc.ABC
Base class for all selection algorithms.
-
abstract
select
(population: pyshgp.gp.population.Population, n: int = 1) → Sequence[pyshgp.gp.individual.Individual][source]¶ Return n individuals from the population.
- Parameters
population (Population) – A Population of Individuals.
n (int) – The number of parents to select from the population. Default is 1.
- Returns
The selected Individuals.
- Return type
Sequence[Individual]
-
abstract
select_one
(population: pyshgp.gp.population.Population) → pyshgp.gp.individual.Individual[source]¶ Return single individual from population.
- Parameters
population – A Population of Individuals.
- Returns
The selected Individual.
- Return type
-
abstract
-
class
pyshgp.gp.selection.
SimpleMultiSelectorMixin
[source]¶ Bases:
object
A mixin for Selector classes where selecting many individuals is done by repeated calls to select_one.
-
select
(population: pyshgp.gp.population.Population, n: int = 1) → Sequence[pyshgp.gp.individual.Individual][source]¶ Return n individuals from the population.
- Parameters
population (Population) – A Population of Individuals.
n (int) – The number of parents to select from the population. Default is 1.
- Returns
The selected Individuals.
- Return type
Sequence[Individual]
-
-
class
pyshgp.gp.selection.
Tournament
(tournament_size: int = 7)[source]¶ Bases:
pyshgp.gp.selection.SimpleMultiSelectorMixin
,pyshgp.gp.selection.Selector
Tournament selection.
See: https://en.wikipedia.org/wiki/Tournament_selection
- Parameters
tournament_size (int, optional) – Number of individuals selected uniformly randomly to participate in the tournament. Default is 7.
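A minimal sketch of tournament selection over a list of total errors follows. It is not the Tournament class: individuals are identified by index, and drawing entrants with replacement is an assumption.

```python
import random
from typing import Sequence


def tournament_select(total_errors: Sequence[float], tournament_size: int = 7,
                      seed: int = 0) -> int:
    """Return the index of the lowest-error entrant in a random tournament."""
    rng = random.Random(seed)
    # Entrants are drawn uniformly at random; drawing with replacement is an assumption.
    entrants = [rng.randrange(len(total_errors)) for _ in range(tournament_size)]
    return min(entrants, key=lambda i: total_errors[i])
```

Larger tournaments increase selection pressure, because the best individual is more likely to be among the entrants.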
-
tournament_size
¶ Number of individuals selected uniformly randomly to participate in the tournament. Default is 7.
- Type
int, optional
-
select_one
(population: pyshgp.gp.population.Population) → pyshgp.gp.individual.Individual[source]¶ Return single individual from population.
- Parameters
population – A Population of Individuals.
- Returns
The selected Individual.
- Return type
-
pyshgp.gp.selection.
choice
(a, size=None, replace=True, p=None)¶ Generates a random sample from a given 1-D array
New in version 1.7.0.
Note
New code should use the choice method of a default_rng() instance instead; see random-quick-start.
- Parameters
a (1-D array-like or int) – If an ndarray, a random sample is generated from its elements. If an int, the random sample is generated as if a were np.arange(a)
size (int or tuple of ints, optional) – Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
replace (boolean, optional) – Whether the sample is with or without replacement
p (1-D array-like, optional) – The probabilities associated with each entry in a. If not given the sample assumes a uniform distribution over all entries in a.
- Returns
samples – The generated random samples
- Return type
single item or ndarray
- Raises
ValueError – If a is an int and less than zero, if a or p are not 1-dimensional, if a is an array-like of size 0, if p is not a vector of probabilities, if a and p have different lengths, or if replace=False and the sample size is greater than the population size
Notes
Sampling random rows from a 2-D array is not possible with this function, but is possible with Generator.choice through its axis keyword.
Examples
Generate a uniform random sample from np.arange(5) of size 3:
>>> np.random.choice(5, 3)
array([0, 3, 4]) # random
>>> #This is equivalent to np.random.randint(0,5,3)
Generate a non-uniform random sample from np.arange(5) of size 3:
>>> np.random.choice(5, 3, p=[0.1, 0, 0.3, 0.6, 0])
array([3, 3, 0]) # random
Generate a uniform random sample from np.arange(5) of size 3 without replacement:
>>> np.random.choice(5, 3, replace=False)
array([3,1,0]) # random
>>> #This is equivalent to np.random.permutation(np.arange(5))[:3]
Generate a non-uniform random sample from np.arange(5) of size 3 without replacement:
>>> np.random.choice(5, 3, replace=False, p=[0.1, 0, 0.3, 0.6, 0])
array([2, 3, 0]) # random
Any of the above can be repeated with an arbitrary array-like instead of just integers. For instance:
>>> aa_milne_arr = ['pooh', 'rabbit', 'piglet', 'Christopher']
>>> np.random.choice(aa_milne_arr, 5, p=[0.5, 0.1, 0.1, 0.3])
array(['pooh', 'pooh', 'pooh', 'Christopher', 'piglet'], # random
      dtype='<U11')
-
pyshgp.gp.selection.
get_selector
(name: str, **kwargs) → pyshgp.gp.selection.Selector[source]¶ Get the selector class with the given name.
-
pyshgp.gp.selection.
median_absolute_deviation
(x: numpy.ndarray) → numpy.float64[source]¶ Return the MAD.
- Parameters
x (array-like, shape = (n,)) –
- Returns
mad
- Return type
float
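For reference, the median absolute deviation is the median of the absolute differences between each value and the median of the data. The stdlib-only sketch below illustrates the definition; the module's own function operates on NumPy arrays.

```python
from statistics import median
from typing import Sequence


def median_absolute_deviation(x: Sequence[float]) -> float:
    """Median of the absolute deviations from the median of x."""
    center = median(x)
    return median(abs(value - center) for value in x)
```

For [1, 1, 2, 2, 4, 6, 9] the median is 2, the absolute deviations are [1, 1, 0, 0, 2, 4, 7], and their median is 1.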
-
pyshgp.gp.selection.
one_individual_per_error_vector
(population: pyshgp.gp.population.Population) → Sequence[pyshgp.gp.individual.Individual][source]¶ Preselect one individual per distinct error vector.
Crucial for avoiding the worst case runtime of lexicase selection but does not impact the behavior of which individual gets selected.
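The preselection step can be sketched with a dictionary keyed by the error vector. This is an illustration only: individuals are identified by index, tuples play the role of the hashable error_vector_bytes, and keeping the first individual seen for each vector is an assumption.

```python
from typing import Dict, List, Sequence, Tuple


def one_per_error_vector(error_vectors: Sequence[Sequence[float]]) -> List[int]:
    """Return the index of one representative per distinct error vector."""
    representatives: Dict[Tuple[float, ...], int] = {}
    for index, errors in enumerate(error_vectors):
        # Tuples are hashable, playing the role of error_vector_bytes here.
        representatives.setdefault(tuple(errors), index)
    return list(representatives.values())
```

Individuals with identical error vectors can never be separated by lexicase filtering, so collapsing them up front shrinks the candidate pool without changing which error vector wins.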
-
pyshgp.gp.selection.
random
(size=None)¶ Return random floats in the half-open interval [0.0, 1.0). Alias for random_sample to ease forward-porting to the new random API.
-
pyshgp.gp.selection.
shuffle
(x)¶ Modify a sequence in-place by shuffling its contents.
This function only shuffles the array along the first axis of a multi-dimensional array. The order of sub-arrays is changed but their contents remains the same.
Note
New code should use the shuffle method of a default_rng() instance instead; see random-quick-start.
- Parameters
x (array_like) – The array or list to be shuffled.
- Returns
- Return type
None
See also
Generator.shuffle()
which should be used for new code.
Examples
>>> arr = np.arange(10)
>>> np.random.shuffle(arr)
>>> arr
[1 7 5 2 9 4 3 6 0 8] # random
Multi-dimensional arrays are only shuffled along the first axis:
>>> arr = np.arange(9).reshape((3, 3))
>>> np.random.shuffle(arr)
>>> arr
array([[3, 4, 5], # random
       [6, 7, 8],
       [0, 1, 2]])
pyshgp.gp.variation module¶
The variation
module defines classes for variation operators.
Variation operators (aka genetic operators) are used in evolutionary/genetic algorithms to create “child” genomes from “parent” genomes.
-
class
pyshgp.gp.variation.
AdditionMutation
(addition_rate: float = 0.01)[source]¶ Bases:
pyshgp.gp.variation.VariationOperator
Uniformly randomly adds some Atoms to parent.
- Parameters
addition_rate (float) – The probability of adding a new Atom at any given point in the parent Genome. Default is 0.01.
-
rate
¶ The probability of adding a new Atom at any given point in the parent Genome. Default is 0.01.
- Type
float
-
num_parents
¶ Number of parent Genomes the operator needs to produce a child Individual.
- Type
int
-
produce
(parents: Sequence[pyshgp.gp.genome.Genome], spawner: pyshgp.gp.genome.GeneSpawner) → pyshgp.gp.genome.Genome[source]¶ Produce a child Genome from parent Genomes and an optional GeneSpawner.
- Parameters
parents – A list of parent Genomes given to the operator.
spawner – A GeneSpawner that can be used to produce new genes (aka Atoms).
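Uniform addition can be sketched as a single pass over the parent. This is not the AdditionMutation class: the genome is a plain list, new_gene is a hypothetical callable standing in for the spawner, and placing the new gene before or after with equal odds is an assumption.

```python
import random
from typing import Callable, List, Sequence


def addition_mutation(parent: Sequence, new_gene: Callable[[], object],
                      rate: float = 0.01, seed: int = 0) -> List:
    """Insert a fresh gene next to each existing gene with probability `rate`."""
    rng = random.Random(seed)
    child: List = []
    for gene in parent:
        if rng.random() < rate:
            # Place the new gene before or after the existing one with equal odds.
            if rng.random() < 0.5:
                child.extend([new_gene(), gene])
            else:
                child.extend([gene, new_gene()])
        else:
            child.append(gene)
    return child
```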
-
class
pyshgp.gp.variation.
Alternation
(alternation_rate=0.01, alignment_deviation=10)[source]¶ Bases:
pyshgp.gp.variation.VariationOperator
Uniformly alternates between the two parent genomes.
- Parameters
alternation_rate (float, optional (default=0.01)) – The probability of switching which parent program elements are being copied from. Must be 0 <= rate <= 1. Defaults to 0.01.
alignment_deviation (int, optional (default=10)) – The standard deviation of how far alternation may jump between indices when switching between parents.
-
rate
¶ The probability of switching which parent program elements are being copied from. Must be 0 <= rate <= 1. Defaults to 0.01.
- Type
float, optional (default=0.01)
-
alignment_deviation
¶ The standard deviation of how far alternation may jump between indices when switching between parents.
- Type
int, optional (default=10)
-
num_parents
¶ Number of parent Genomes the operator needs to produce a child Individual.
- Type
int
-
produce
(parents: Sequence[pyshgp.gp.genome.Genome], spawner: pyshgp.gp.genome.GeneSpawner = None) → pyshgp.gp.genome.Genome[source]¶ Produce a child Genome from parent Genomes and an optional GeneSpawner.
- Parameters
parents – A list of parent Genomes given to the operator.
spawner – A GeneSpawner that can be used to produce new genes (aka Atoms).
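Alternation can be pictured as reading along one parent and occasionally jumping to the other at a nearby position. The sketch below is an assumption-based illustration, not the Alternation class: genomes are plain lists, the jump is Gaussian noise with standard deviation alignment_deviation, and max_iterations is a hypothetical guard against very long runs.

```python
import random
from typing import List, Sequence


def alternation(parent_a: Sequence, parent_b: Sequence, rate: float = 0.01,
                alignment_deviation: float = 10, seed: int = 0,
                max_iterations: int = 10_000) -> List:
    """Copy genes from one parent, occasionally jumping to the other parent."""
    rng = random.Random(seed)
    parents = (parent_a, parent_b)
    current, index, child = 0, 0, []
    for _ in range(max_iterations):  # guard against pathological settings
        if index >= len(parents[current]):
            break
        if rng.random() < rate:
            # Switch parents and shift the read position by Gaussian noise.
            current = 1 - current
            index = max(0, index + round(rng.gauss(0, alignment_deviation)))
        else:
            child.append(parents[current][index])
            index += 1
    return child
```

With a rate of 0 the child is a copy of the first parent; higher rates interleave longer or shorter runs from both parents.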
-
class
pyshgp.gp.variation.
Cloning
[source]¶ Bases:
pyshgp.gp.variation.VariationOperator
Clones the parent genome.
-
num_parents
¶ Number of parent Genomes the operator needs to produce a child Individual.
- Type
int
-
produce
(parents: Sequence[pyshgp.gp.genome.Genome], spawner: pyshgp.gp.genome.GeneSpawner = None) → pyshgp.gp.genome.Genome[source]¶ Produce a child Genome from parent Genomes and an optional GeneSpawner.
- Parameters
parents – A list of parent Genomes given to the operator.
spawner – A GeneSpawner that can be used to produce new genes (aka Atoms).
-
-
class
pyshgp.gp.variation.
DeletionMutation
(deletion_rate: float = 0.01)[source]¶ Bases:
pyshgp.gp.variation.VariationOperator
Uniformly randomly removes some Atoms from parent.
- Parameters
deletion_rate (float) – The probability of removing any given Atom in the parent Genome. Default is 0.01.
-
rate
¶ The probability of removing any given Atom in the parent Genome. Default is 0.01.
- Type
float
-
num_parents
¶ Number of parent Genomes the operator needs to produce a child Individual.
- Type
int
-
produce
(parents: Sequence[pyshgp.gp.genome.Genome], spawner: pyshgp.gp.genome.GeneSpawner) → pyshgp.gp.genome.Genome[source]¶ Produce a child Genome from parent Genomes and an optional GeneSpawner.
- Parameters
parents – A list of parent Genomes given to the operator.
spawner – A GeneSpawner that can be used to produce new genes (aka Atoms).
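Uniform deletion can be pictured as one independent random draw per Atom. The following is a minimal sketch on a plain list, assuming that reading of the description; deletion_sketch and its rng argument are hypothetical names and not part of pyshgp.

```python
import random
from typing import List, Optional, Sequence


def deletion_sketch(
    genome: Sequence,
    rate: float = 0.01,
    rng: Optional[random.Random] = None,
) -> List:
    """Hypothetical helper: drop each element independently with probability `rate`."""
    rng = rng or random.Random()
    # Keep an atom only when its own random draw is not below `rate`.
    return [atom for atom in genome if rng.random() >= rate]


parent = ["a", "b", "c", "d"]
unchanged = deletion_sketch(parent, rate=0.0)  # no atom can be removed
emptied = deletion_sketch(parent, rate=1.0)    # every atom is removed
```

Because each Atom is tested separately, the expected child length is roughly (1 - rate) times the parent length.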
-
class
pyshgp.gp.variation.
Genesis
(*, size: Union[int, Sequence[int]])[source]¶ Bases:
pyshgp.gp.variation.VariationOperator
Creates an entirely new (and random) genome.
- Parameters
size – The child genome will contain this many Atoms if size is an integer. If size is a pair of integers, the genome will be of a random size in the range of the two integers.
-
size
¶ The child genome will contain this many Atoms if size is an integer. If size is a pair of integers, the genome will be of a random size in the range of the two integers.
-
num_parents
¶ Number of parent Genomes the operator needs to produce a child Individual.
- Type
int
-
produce
(parents: Sequence[pyshgp.gp.genome.Genome], spawner: pyshgp.gp.genome.GeneSpawner) → pyshgp.gp.genome.Genome[source]¶ Produce a child Genome from parent Genomes and an optional GeneSpawner.
- Parameters
parents – A list of parent Genomes given to the operator.
spawner – A GeneSpawner that can be used to produce new genes (aka Atoms).
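The two accepted forms of size can be sketched as follows. This is an illustration only: genesis_sketch is a hypothetical name, a plain list called gene_pool stands in for the GeneSpawner, and treating both ends of the size pair as inclusive is an assumption the description does not confirm.

```python
import random
from typing import List, Optional, Sequence, Union


def genesis_sketch(
    size: Union[int, Sequence[int]],
    gene_pool: Sequence,
    rng: Optional[random.Random] = None,
) -> List:
    """Hypothetical helper: build a random genome of a fixed or random length."""
    rng = rng or random.Random()
    if isinstance(size, int):
        n = size  # a single integer fixes the genome length
    else:
        low, high = size
        n = rng.randint(low, high)  # assumption: both bounds are inclusive
    # Draw each gene independently from the stand-in gene pool.
    return [rng.choice(gene_pool) for _ in range(n)]


fixed = genesis_sketch(5, ["x", "y", "z"])
ranged = genesis_sketch((2, 6), ["x", "y", "z"], rng=random.Random(3))
```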
-
class
pyshgp.gp.variation.
LiteralMutation
(push_type: pyshgp.push.types.PushType, rate: float = 0.01)[source]¶ Bases:
pyshgp.gp.variation.VariationOperator, abc.ABC
Base class for mutations of literal Atoms.
- Parameters
push_type (pyshgp.push.types.PushType) – The PushType which the operator can mutate.
rate (float) – The probability of applying the mutation to a given Literal.
-
push_type
¶ The PushType which the operator can mutate.
-
rate
¶ The probability of applying the mutation to a given Literal.
- Type
float
-
num_parents
¶ Number of parent Genomes the operator needs to produce a child Individual.
- Type
int
-
produce
(parents: Sequence[pyshgp.gp.genome.Genome], spawner: pyshgp.gp.genome.GeneSpawner = None) → pyshgp.gp.genome.Genome[source]¶ Produce a child Genome from parent Genomes and an optional GeneSpawner.
- Parameters
parents – A list of parent Genomes given to the operator.
spawner – A GeneSpawner that can be used to produce new genes (aka Atoms).
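One way to picture a literal mutation base class is an abstract hook that subclasses fill in, applied only to literals of the matching type. The sketch below is a stand-in and not pyshgp code: LiteralMutationSketch, mutate_value, IntIncrement, and the (type name, value) tuples used as atoms are all hypothetical, and applying the hook with probability rate is an assumption based on the parameter description.

```python
import random
from abc import ABC, abstractmethod
from typing import Any, List, Optional, Sequence, Tuple

Atom = Tuple[str, Any]  # hypothetical stand-in for a literal: (type name, value)


class LiteralMutationSketch(ABC):
    """Hypothetical base class: mutate literals of one type with probability `rate`."""

    def __init__(self, push_type: str, rate: float = 0.01):
        self.push_type = push_type
        self.rate = rate

    @abstractmethod
    def mutate_value(self, value: Any, rng: random.Random) -> Any:
        """Return a mutated copy of `value`; supplied by subclasses."""

    def produce(self, parent: Sequence[Atom],
                rng: Optional[random.Random] = None) -> List[Atom]:
        rng = rng or random.Random()
        child = []
        for type_name, value in parent:
            # Only literals of the configured type are candidates for mutation.
            if type_name == self.push_type and rng.random() < self.rate:
                value = self.mutate_value(value, rng)
            child.append((type_name, value))
        return child


class IntIncrement(LiteralMutationSketch):
    """Toy subclass that adds one to every selected integer literal."""

    def mutate_value(self, value, rng):
        return value + 1


genome = [("int", 1), ("str", "a"), ("int", 5)]
child = IntIncrement("int", rate=1.0).produce(genome)
```

With rate=1.0 every "int" literal is incremented and the "str" literal passes through unchanged.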
-
class
pyshgp.gp.variation.
VariationOperator
(num_parents: int)[source]¶ Bases:
abc.ABC
Base class of all VariationOperators.
- Parameters
num_parents (int) – Number of parent Genomes the operator needs to produce a child Individual.
-
num_parents
¶ Number of parent Genomes the operator needs to produce a child Individual.
- Type
int
-
checknum_parents
(parents: Sequence[pyshgp.gp.genome.Genome])[source]¶ Raise error if given too few parents.
- Parameters
parents – A list of parent Genomes given to the operator.
-
abstract
produce
(parents: Sequence[pyshgp.gp.genome.Genome], spawner: pyshgp.gp.genome.GeneSpawner) → pyshgp.gp.genome.Genome[source]¶ Produce a child Genome from parent Genomes and an optional GeneSpawner.
- Parameters
parents – A list of parent Genomes given to the operator.
spawner – A GeneSpawner that can be used to produce new genes (aka Atoms).
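The interface documented above (a num_parents attribute, a parent-count check, and an abstract produce method) can be mirrored in a small stand-alone class. This is a sketch for illustration rather than the pyshgp source: OperatorSketch and ReverseSketch are hypothetical names, genomes are plain lists, and raising ValueError is an assumption because the documentation only says an error is raised.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class OperatorSketch(ABC):
    """Hypothetical stand-in for a variation operator base class."""

    def __init__(self, num_parents: int):
        self.num_parents = num_parents

    def checknum_parents(self, parents: Sequence[list]) -> None:
        # Assumption: ValueError is used here; the real error type is not documented.
        if len(parents) < self.num_parents:
            raise ValueError(
                f"expected at least {self.num_parents} parents, got {len(parents)}"
            )

    @abstractmethod
    def produce(self, parents: Sequence[list], spawner=None) -> List:
        """Return a child genome built from the parents."""


class ReverseSketch(OperatorSketch):
    """Toy one-parent operator that reverses the parent genome."""

    def __init__(self):
        super().__init__(1)

    def produce(self, parents, spawner=None):
        self.checknum_parents(parents)
        return list(reversed(parents[0]))


child = ReverseSketch().produce([[1, 2, 3]])
```

Calling the parent-count check at the start of produce keeps a subclass from indexing into a parents list that is too short.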
-
class
pyshgp.gp.variation.
VariationPipeline
(operators: Sequence[pyshgp.gp.variation.VariationOperator])[source]¶ Bases:
pyshgp.gp.variation.VariationOperator
Variation operator that sequentially applies multiple other variation operators.
- Parameters
operators (Sequence[VariationOperator]) – A list of operators to apply in order to produce the child Genome.
-
operators
¶ A list of operators to apply in order to produce the child Genome.
- Type
Sequence[VariationOperator]
-
num_parents
¶ Number of parent Genomes the operator needs to produce a child Individual.
- Type
int
-
produce
(parents: Sequence[pyshgp.gp.genome.Genome], spawner: pyshgp.gp.genome.GeneSpawner) → pyshgp.gp.genome.Genome[source]¶ Produce a child Genome from parent Genomes and an optional GeneSpawner.
- Parameters
parents – A list of parent Genomes given to the operator.
spawner – A GeneSpawner that can be used to produce new genes (aka Atoms).
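Sequential application means each operator works on the result of the previous one. The sketch below shows that idea with plain functions and lists; pipeline_sketch, drop_first, and duplicate are hypothetical names, and feeding the intermediate child back in as the first parent while passing the remaining parents through unchanged is an assumption, not documented behaviour.

```python
from typing import Callable, List, Sequence

# Hypothetical operator type: takes a list of parent genomes, returns a child.
Operator = Callable[[Sequence[list]], list]


def pipeline_sketch(operators: Sequence[Operator],
                    parents: Sequence[list]) -> List:
    """Hypothetical helper: apply operators in order, threading the child through."""
    child = list(parents[0])
    for op in operators:
        # Assumption: the intermediate child replaces the first parent and
        # any remaining parents are passed along unchanged.
        child = op([child] + list(parents[1:]))
    return child


def drop_first(parents):
    return parents[0][1:]


def duplicate(parents):
    return parents[0] * 2


drop_then_dup = pipeline_sketch([drop_first, duplicate], [[1, 2, 3]])
dup_then_drop = pipeline_sketch([duplicate, drop_first], [[1, 2, 3]])
```

The two results differ ([2, 3, 2, 3] versus [2, 3, 1, 2, 3]), which shows why the order of the operators list matters.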
-
class
pyshgp.gp.variation.
VariationStrategy
[source]¶ Bases:
pyshgp.utils.DiscreteProbDistrib
A collection of VariationOperators and how frequently to use them.
-
add
(op: pyshgp.gp.variation.VariationOperator, p: float)[source]¶ Add an element with a relative probability.
- Parameters
op (VariationOperator) – The VariationOperator to add to the variation strategy.
p (float) – The probability of using the given operator relative to the other operators that have been added to the VariationStrategy.
-
elements
¶
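Relative probabilities mean that the weights passed to add need not sum to one; they are compared against each other when an operator is drawn. A minimal sketch of that idea follows, assuming normalisation at sampling time. StrategySketch, probabilities, and sample are hypothetical names, and operators are represented by strings purely for illustration.

```python
import random
from typing import Any, List, Optional


class StrategySketch:
    """Hypothetical stand-in for a weighted collection of operators."""

    def __init__(self):
        self.elements: List[Any] = []
        self._weights: List[float] = []

    def add(self, op: Any, p: float) -> None:
        # Store the operator with its relative weight.
        self.elements.append(op)
        self._weights.append(p)

    def probabilities(self) -> List[float]:
        # Normalise the relative weights so they sum to one.
        total = sum(self._weights)
        return [w / total for w in self._weights]

    def sample(self, rng: Optional[random.Random] = None) -> Any:
        rng = rng or random.Random()
        # random.choices accepts unnormalised weights directly.
        return rng.choices(self.elements, weights=self._weights, k=1)[0]


strategy = StrategySketch()
strategy.add("umad", 3)
strategy.add("alternation", 1)
probs = strategy.probabilities()  # [0.75, 0.25]
picked = strategy.sample(random.Random(0))
```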
-
-
pyshgp.gp.variation.
choice
(a, size=None, replace=True, p=None)¶ Generates a random sample from a given 1-D array.
New in version 1.7.0.
Note
New code should use the choice method of a default_rng() instance instead; see random-quick-start.
- Parameters
a (1-D array-like or int) – If an ndarray, a random sample is generated from its elements. If an int, the random sample is generated as if a were np.arange(a)
size (int or tuple of ints, optional) – Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
replace (boolean, optional) – Whether the sample is with or without replacement.
p (1-D array-like, optional) – The probabilities associated with each entry in a. If not given the sample assumes a uniform distribution over all entries in a.
- Returns
samples – The generated random samples
- Return type
single item or ndarray
- Raises
ValueError – If a is an int and less than zero, if a or p are not 1-dimensional, if a is an array-like of size 0, if p is not a vector of probabilities, if a and p have different lengths, or if replace=False and the sample size is greater than the population size
See also
randint(), shuffle(), permutation()
Generator.choice(), which should be used in new code
Notes
Sampling random rows from a 2-D array is not possible with this function, but is possible with Generator.choice through its axis keyword.
Examples
Generate a uniform random sample from np.arange(5) of size 3:
>>> np.random.choice(5, 3)
array([0, 3, 4]) # random
>>> #This is equivalent to np.random.randint(0,5,3)
Generate a non-uniform random sample from np.arange(5) of size 3:
>>> np.random.choice(5, 3, p=[0.1, 0, 0.3, 0.6, 0])
array([3, 3, 0]) # random
Generate a uniform random sample from np.arange(5) of size 3 without replacement:
>>> np.random.choice(5, 3, replace=False)
array([3,1,0]) # random
>>> #This is equivalent to np.random.permutation(np.arange(5))[:3]
Generate a non-uniform random sample from np.arange(5) of size 3 without replacement:
>>> np.random.choice(5, 3, replace=False, p=[0.1, 0, 0.3, 0.6, 0])
array([2, 3, 0]) # random
Any of the above can be repeated with an arbitrary array-like instead of just integers. For instance:
>>> aa_milne_arr = ['pooh', 'rabbit', 'piglet', 'Christopher']
>>> np.random.choice(aa_milne_arr, 5, p=[0.5, 0.1, 0.1, 0.3])
array(['pooh', 'pooh', 'pooh', 'Christopher', 'piglet'], # random
      dtype='<U11')
-
pyshgp.gp.variation.
get_variation_operator
(name: str, **kwargs) → pyshgp.gp.variation.VariationOperator[source]¶ Get the variation operator class with the given name.
-
pyshgp.gp.variation.
random
(size=None)¶ Return random floats in the half-open interval [0.0, 1.0). Alias for random_sample to ease forward-porting to the new random API.
Module contents¶
pyshgp.push.gp.