dhd.evolve

List of functions to run the evolutive algorithm seeking for the best Terminal Steiner Tree (TST) connecting all sinks and sources of the district heating network.

Terminal Steiner Tree (TST)

The tree of minimal length connecting a set of terminals (subset of the graph nodes) is called a Steiner Tree. Its exact computation is a NP complete problem, so that an approximate algorithm is used. The worst possible ratio between the weight of the approximated tree and the weight of the exact Steiner Tree is 2-2/t, where t is the number of terminals.

In the district heating design problem, the further constraint of single terminal connection is imposed. The problem goes under the name of Terminal Steiner Tree problem. No efficient algorithm exists, so that we use an evolutive heuristic method.

Algorithm idea

For each terminal (T_i) there is a set of connections to the network. A configuration of single connections is represented as a set of genes, called an individual. The weight of an individual is the weight of its associated Steiner Tree, which is a TST by construction.

https://gitlab.com/crem-repository/dhd/raw/master/docs/images/evolution.png

The idea is to consider a population of n individual, to class them in increasing order, to select the n1, respectively n2, lighter individuals to form the elites, respectively the parents, populations. The elites are merely mutated and reinjected in the population of the next generation. However the parents are paired into n-n1 couples which each procreate one child, by mixing their genes. These n-n1 children are then mutated and added to the next generation population. This reproduction process is applied for N generations.

Inputs

The algorithm is designed to work on the dataframes vertices and terminals constructed with the module dhd.connect. The user must also provide the number of generation (N) and may specify the whole population size (n), the elite population size (n1), the parent population size (n2) and the rate of gene mutation (mutation_rate).

Outputs

The whole evolution is saved in the dataframe evolution returned by the function dhd.evolve.run_evolution. The TST dataframe associated to the best configuration is returned by the function dhd.evolve.get_best_terminal_steiner_tree.

Functions

dhd.evolve.connect_terminals(vertices, terminals, genes)[source]

Add the connections from terminals associated to the given genes to the graph vertices.

Parameters:
  • vertices (DataFrame) – Dataframe of the street graph plus the connection nodes.
  • terminals (DataFrame) – Dataframe of the terminals to connect to the graph.
  • genes (list) – List of terminal single connections of the considered configuration.
Returns:

Updated vertices dataframe.

Return type:

DataFrame

dhd.evolve.dataframe_compatibility(vertices, terminals)[source]

Check the compatibility of the input dataframes.

Parameters:
  • vertices (DataFrame) – Dataframe of the street graph plus the connection nodes.
  • terminals (DataFrame) – Dataframe of the terminals to connect to the graph.
Returns:

True if the dataframes are compatible.

Return type:

bool

dhd.evolve.get_best_individual(evolution)[source]

Select the best connections configuration of the whole evolution.

Parameters:evolution (DataFrame) – Dataframe of the population evolution.
Returns:
Return type:List of the genes (connection configuration) of the best individual.
dhd.evolve.get_best_terminal_steiner_tree(vertices, terminals, evolution)[source]

Find the TST dataframe with the smallest weight from those computed during the evolution.

Parameters:
  • vertices (DataFrame) – Dataframe of the street graph plus the connection nodes.
  • terminals (DataFrame) – Dataframe of the terminals to connect to the graph.
  • evolution (DataFrame) – Dataframe of the population evolution.
Returns:

tst

Dataframe of the best computed TST with the following structure:
  • INDEX: Integers.
  • COLUMNS:
    • ’idA’: first vertex end node index,
    • ’idB’: second vertex end node index,
    • ’pA’: coordinates (shapely Point) of the node ‘idA’,
    • ’pB’: coordinates (shapely Point) of the node ‘idB’,
    • ’weight’: vertex weight,
    • ’geometry’: vertex geometry.

Return type:

DataFrame

dhd.evolve.get_configuration_number_exponent(allele_number)[source]

Compute the approximate log10 of the total number of connection configurations.

Parameters:allele_number (list) – List of the number of possible connections at each terminal.
Returns:Integer approximation of the log10 of the total number of connection configurations.
Return type:int
dhd.evolve.get_genes_variation(evolution)[source]

Measure the “distance” between each following pair of best genes.

The distance is defined as the cosine of the angle between the two considered gene vectors. The closer the returned number is to one, the less variation.

Parameters:evolution (DataFrame) – Dataframe of the population evolution.
Returns:List of number between 0 and 1 representing the distance between each following genes pairs.
Return type:list
dhd.evolve.get_next_generation(elites, parents, n1, n, allele_number, mutation_rate)[source]

Generate the next generation from the elites and parents populations.

Parameters:
  • elites (list) – Elites population.
  • parents (list) – Parents population.
  • n1 (int) – Elites population size.
  • n (int) – Population size.
  • allele_number (list) – List of the number of possible connections at each terminal.
  • mutation_rate (float) – Rate of genes mutation. Must belong to the open interval (0,1).
Returns:

List of individuals constituting the new generation.

Return type:

list

dhd.evolve.get_statistics(population)[source]

Get population weights mean and standard deviation.

Parameters:population (list) – List of individual tuples (weight, genes).
Returns:
  • float – Mean of the population weights.
  • float – Standard deviation of the population weigths.
dhd.evolve.get_terminal_steiner_tree(vertices, terminals)[source]

Compute the Steiner Tree (TST by construction) associated to the updated graph vertices (with connection edges) of the considered configuration.

Parameters:
  • vertices (DataFrame) – Dataframe of the street graph plus the connection nodes concatenated with a single connection at each terminal.
  • terminals (DataFrame) – Dataframe of the terminals to connect to the graph.
Returns:

Approximate Terminal Steiner Tree of the considered single connections configuration.

Return type:

Graph

dhd.evolve.get_tree_weight(individual, args)[source]

Find the approximate Steiner Tree associated to the genes of the given individual and compute its weight.

Parameters:
  • individual (tuple = float, list) – Individual weight and genes. The weight has not yet been computed and is None.
  • args (tuple = Dataframe, DataFrame) – Dataframes vertices and terminals.
Returns:

Individual with the updated weight associated to its genes.

Return type:

tuple = float, list

dhd.evolve.individual_mutation(individual, allele_number, mutation_rate, inbred=False)[source]

Genes mutation of the given individual.

Each gene is randomly mutated with probability mutation_rate.

Parameters:
  • individual (tuple = float, list) – Tuple of the individual weight and genes.
  • allele_number (list) – List of the number of possible connections at each terminal.
  • mutation_rate (float) – Rate of genes mutation. Must belong to the open interval (0,1).
  • inbred (bool, optional) – True if the two parents are identical. Default is False
Returns:

Individual with weight reset to None and mutated genes.

Return type:

tuple = float, list

dhd.evolve.init_evolution_dataframe(N, save_population)[source]

Initilization of the dataframe to save the population evolution.

Parameters:
  • N (int) – Number of generation.
  • save_population (bool) – True if each generation population must be saved.
Returns:

Dataframe of the population evolution.

Return type:

DataFrame

dhd.evolve.init_individual_genes(allele_number)[source]

Generate a random genes list.

Parameters:allele_number (list) – List of the number of possible connections at each terminal.
Returns:List of terminal single connections.
Return type:list
dhd.evolve.init_population_genes(n, allele_number)[source]

Generate random genes for the n individuals of the population according to the possible connections at each terminal allele_number.

Parameters:
  • n (int) – Population size.
  • allele_number (list) – List of the number of possible connections at each terminal.
Returns:

List of n individual tuples (weight=None, genes).

Return type:

list

dhd.evolve.parameters_compatibility(n, n1, n2, mutation_rate)[source]

Assert that the input parameters are compatible with eachother.

Parameters:
  • n (int) – Population size.
  • n1 (int) – Elite population size.
  • n2 (int) – Parent population size.
  • mutation_rate (float) – Rate of genes mutation. Must belong to the open interval (0,1).
Returns:

True if the parameters are compatible.

Return type:

bool

dhd.evolve.procreate(mum, dad)[source]

Gene mixing of the two parent individuals.

Each gene of the children is either the one of its dad or its mum, with equal probability.

Parameters:
  • mum (tuple = float, list) – First parent individual.
  • dad (tuple = float, list) – Second parent individual.
Returns:

Child individual with weight=None and a mixing of its parents genes.

Return type:

tuple = float, list

dhd.evolve.remove_no_connection_terminals(terminals)[source]

Remove terminals which cannot be connected to the district heating network.

Parameters:terminals (DataFrame) – Dataframe of the terminals to connect to the graph.
Returns:Updated terminals dataframe.
Return type:DataFrame
dhd.evolve.reproduction(parents, allele_number, mutation_rate)[source]

Production of a new individual out of the set of parents.

Two parents are first selected, then mated and finally the child is mutated.

Parameters:
  • parents (list) – Parents population.
  • allele_number (list) – List of the number of possible connections at each terminal.
  • mutation_rate (float) – Rate of genes mutation. Must belong to the open interval (0,1).
Returns:

Child individual with weight=None and a mixing of its parents genes followed by a mutation.

Return type:

tuple = float, list

dhd.evolve.run_evolution(vertices, terminals, N, n=64, n1=8, n2=32, mutation_rate=0.1, save_population=False, pool_number=6)[source]

Evolves a sample of n configurations of single terminal connections seeking for the lowest Terminal Steiner Tree (TST) weight.

The algorithm swipes the space of single connections

\[V = \otimes_{i=0}^{d-1} \:\: \{0,\cdots,A_i-1\}\]

namely the discrete space of dimension d = # of terminals with each direction spanned by the integers between 0 and A_i, with A_i = # of possible connection to terminal i.

A set a n vectors is first chosen randomly and then evolved through the configuration space generation after generation. At each generation, the n2 best individuals (parents) reproduce amongst themselves as couples. These n-n1 children as well as the n1 best individual (elites) of the previous generation then undergo a gene mutation of parameter mutation_rate.

Parameters:
  • vertices (DataFrame) – Dataframe of the street graph plus the connection nodes.
  • terminals (DataFrame) – Dataframe of the terminals to connect to the graph.
  • N (int) – Number of generation.
  • n (int) – Population size.
  • n1 (int) – Elite population size.
  • n2 (int) – Parent population size.
  • mutation_rate (float) – Rate of genes mutation. Must belong to the open interval (0,1).
  • save_population (bool) – True if each generation population must be saved.
  • pool_number (int, optional) – Number of processes working in parallel. Default is 6.
Returns:

evolution

Dataframe of the population evolution with the following structure:
  • INDEX: generation integers from 0 to N-1.
  • COLUMNS:
    • ’weight’: weight of the best individual,
    • ’mean’: mean weight of the population,
    • ’stddev’: standard deviation of the population weights,
    • ’genes’: genes of the best individual,
    • ’population’: whole population weights and genes (only if save_population is True).

Return type:

DataFrame

dhd.evolve.save_generation(i, evolution, population, elites, save_population)[source]

Update the evolution dataframe with the informations on the current generation ; in-place.

Parameters:
  • i (int) – Generation number.
  • evolution (DataFrame) – Dataframe of the population evolution.
  • population (list) – List of individuals constituting the population.
  • elites (list) – Elites population.
  • save_population (bool) – True if the each generation population must be saved.
dhd.evolve.select_elites_and_parents(population, elites, n1, n2)[source]

Select the new elites and parents from the whole population and old elites.

Parameters:
  • population (list) – List of individuals constituting the population.
  • elites (list) – Elites population.
  • n1 (int) – Elites population size.
  • n2 (int) – Parents population size.
Returns:

  • list – New elites population.
  • list – New parents population.

dhd.evolve.select_random_parents(parents)[source]

Randomly select two parents from the parent population.

If the two parents are identical the procreation is classified as ‘inbred’.

Parameters:parents (list) – Parents population.
Returns:
  • tuple = float, list – First parent individual.
  • tuple = float, list – second parent individual.
  • bool – True if the two parents are identical.
dhd.evolve.set_allele_number(terminals)[source]

Counts the number of possible connection at each terminal.

Parameters:terminals (DataFrame) – Dataframe of the terminals to connect to the graph.
Returns:List of the number of possible connections at each terminal.
Return type:list
dhd.evolve.set_nodes_coordinates(tst)[source]

Find all TST nodes coordinates and save them as shapely Points ; in-place.

The key ‘pA’ (‘pB’) correspond to the first (last) shapely Point of the edge geometry.

Parameters:tst (DataFrame) – DataFrame of the TST edges.
dhd.evolve.show_evolution_statistics(evolution)[source]

Plot the evolution of the best weight, the mean weight, the standard deviation and the genes variations.

The gene variation is defined as the cosine of the angle between the best gene of a generation and the best gene of its previous generation. The number lies between 0 and 1. The closer iit is to 0, the more mutation there is.

Parameters:evolution (DataFrame) – Dataframe of the population evolution.