ilustrado package¶
Subpackages¶
Submodules¶
ilustrado.adapt module¶
This file contains a wrapper for mutation and crossover.
-
ilustrado.adapt.
adapt
(possible_parents, mutation_rate, crossover_rate, mutations=None, max_num_mutations=3, max_num_atoms=40, structure_filter=None, minsep_dict=None, debug=False)[source]¶ Take a list of possible parents and randomly adapt according to given mutation weightings.
- Parameters
- Keyword Arguments
mutations (list(str)) – list of desired mutations to choose from (as strings),
max_num_mutations (int) – rand(1, this) mutations will be performed,
max_num_atoms (int) – any structures with more than this many atoms will be filtered out.
structure_filter (callable(dict)) – custom filter to pass to check_feasible.
minsep_dict (dict) – dictionary containing element-specific minimum separations, e.g. {(‘K’, ‘K’): 2.5, (‘K’, ‘P’): 2.0}.
- Returns
the mutated/newborn structure.
- Return type
-
ilustrado.adapt.
check_feasible
(mutant, parents, max_num_atoms, structure_filter=None, minsep_dict=None, debug=False)[source]¶ Check if a mutated/newly-born cell is “feasible”. Here, feasible means:
number density within 25% of pre-mutation/birth level,
no overlapping atoms, parameterised by minsep_dict,
cell angles between 50 and 130 degrees,
fewer than max_num_atoms in the cell,
ensure number of atomic types is maintained,
any custom filter is obeyed.
- Parameters
- Keyword Arguments
- Returns
True if structure is feasible, else False.
- Return type
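The geometric criteria listed for check_feasible can be sketched in a few lines. This is only an illustration under stated assumptions, not ilustrado's implementation: the helper names (angles_ok, density_ok, sketch_feasible) and the plain-number inputs are hypothetical, and whether the bounds are inclusive is a guess. The overlap, atomic-type and custom-filter checks are omitted.

```python
def angles_ok(cell_angles, low=50.0, high=130.0):
    """Return True if every cell angle (in degrees) lies between low and high.

    Inclusive bounds are an assumption of this sketch.
    """
    return all(low <= angle <= high for angle in cell_angles)


def density_ok(num_atoms, volume, parent_density, tolerance=0.25):
    """Return True if the number density is within `tolerance` of the parent's."""
    density = num_atoms / volume
    return abs(density - parent_density) <= tolerance * parent_density


def sketch_feasible(num_atoms, volume, cell_angles, parent_density, max_num_atoms):
    """Combine the checks; any single failure rejects the trial structure."""
    return (
        num_atoms < max_num_atoms  # "fewer than max_num_atoms in the cell"
        and angles_ok(cell_angles)
        and density_ok(num_atoms, volume, parent_density)
    )
```

For example, sketch_feasible(8, 100.0, (90, 90, 120), 0.08, 40) accepts a cell whose density matches its parent, while the same cell with a 135 degree angle would be rejected.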
ilustrado.analysis module¶
Some assorted analysis functions.
ilustrado.crossover module¶
This file implements crossover functionality.
-
ilustrado.crossover.
crossover
(parents, method='random_slice', debug=False)[source]¶ Attempt to create a child structure from two parent structures.
-
ilustrado.crossover.
random_slice
(parent_seeds, standardize=True, supercell=True, shift=True, debug=False)[source]¶ Simple cut-and-splice crossover of two parents.
The overall size of the child can vary between 0.5 and 1.5 times the size of the parent structures. Both parent structures are cut and spliced along the same crystallographic axis.
- Parameters
- Returns
newborn structure from parents.
- Return type
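The cut-and-splice idea can be illustrated on bare fractional coordinates. The sketch below is a simplification and not the random_slice implementation: splice_along_axis is a hypothetical helper, the range of the cut position is an assumption, and standardisation, supercell construction, shifting and the variation in child size are all left out.

```python
import random


def splice_along_axis(parent_a, parent_b, axis=None, cut=None, rng=random):
    """Take atoms below `cut` from parent_a and atoms at or above it from parent_b.

    Each parent is a list of (element, (x, y, z)) with fractional coordinates.
    """
    if axis is None:
        axis = rng.randrange(3)  # the same axis is used for both parents
    if cut is None:
        cut = rng.uniform(0.25, 0.75)  # hypothetical range for the cut plane
    child = [atom for atom in parent_a if atom[1][axis] < cut]
    child += [atom for atom in parent_b if atom[1][axis] >= cut]
    return child
```

Calling splice_along_axis(a, b, axis=0, cut=0.5) keeps the atoms of a with x < 0.5 and the atoms of b with x >= 0.5.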
ilustrado.fitness module¶
This file implements all notions of fitness.
-
class
ilustrado.fitness.
FitnessCalculator
(fitness_metric='dummy', fitness_function=None, hull=None, sandbagging=False, debug=False)[source]¶ Bases:
object
This class calculates the fitnesses of generations, by some global definition of generation-agnostic fitness.
- Parameters
fitness_metric (str) – either ‘dummy’, ‘hull’ or ‘hull_test’.
fitness_function (callable) – function to operate on numpy array of raw fitness values,
hull (QueryConvexHull) – matador hull from which to calculate metastability,
sandbagging (bool) – whether or not to “sandbag” particular compositions, i.e. lower a structure’s fitness based on the number of nearby phases
-
evaluate
(generation)[source]¶ Assign normalised fitnesses to an entire generation. Normalisation uses a tanh (rescaled logistic) function such that
fitness = 1 - tanh(2*distance_from_hull).
- Parameters
generation (Generation/list) – list/iterator over optimised structures,
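The normalisation formula quoted for evaluate can be written directly. The function name normalised_fitness is hypothetical, and the sketch assumes the hull distance is non-negative; it shows only the mapping, not how evaluate stores the result.

```python
import math


def normalised_fitness(distance_from_hull):
    """Map a hull distance onto a fitness via 1 - tanh(2 * distance)."""
    return 1.0 - math.tanh(2.0 * distance_from_hull)
```

A structure on the hull (distance 0) receives a fitness of exactly 1, and the fitness decays smoothly towards 0 as the distance grows.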
-
update_sandbag_multipliers
(generation, modifier=0.95)[source]¶ Assign composition penalty based on number of nearby structures. Updates fitness.sandbag_multipliers to a dictionary with chemical concentration as keys and values of fitness penalty.
- Parameters
generation (Generation) – list of optimised structures.
-
apply_sandbag_multipliers
(generation, locality=0.05)[source]¶ Scale the generation’s fitness by the sandbag modifier. This updates the ‘fitness’ key and the ‘modifier’ key (total scaling) of each document in the generation.
- Parameters
generation (Generation) – list of optimised structures.
- Keyword Arguments
locality (float) – tolerance by which two structures are “nearby”
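One way to picture sandbagging is sketched below. Everything in it beyond the documented default values is an assumption made for illustration: the penalty rule (one factor of modifier per structure at a concentration), the meaning of "nearby" (every concentration component within locality), the 'concentration' key and both function names are hypothetical and may differ from what FitnessCalculator does.

```python
def sketch_update_multipliers(generation, modifier=0.95):
    """Return {concentration: multiplier}, shrinking with each structure found there.

    Assumption: one factor of `modifier` per structure at that concentration.
    """
    multipliers = {}
    for doc in generation:
        key = tuple(doc["concentration"])
        multipliers[key] = multipliers.get(key, 1.0) * modifier
    return multipliers


def sketch_apply_multipliers(generation, multipliers, locality=0.05):
    """Scale each doc's 'fitness' and record the total scaling under 'modifier'."""
    for doc in generation:
        total = 1.0
        for conc, multiplier in multipliers.items():
            # "nearby" here: every concentration component differs by <= locality
            if all(abs(a - b) <= locality for a, b in zip(doc["concentration"], conc)):
                total *= multiplier
        doc["modifier"] = total
        doc["fitness"] *= total
```

Under these assumptions, two structures sharing a concentration are each scaled by 0.95 squared, while an isolated composition is scaled by 0.95 only.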
ilustrado.generation module¶
This file implements the Generation class which is used to store each generation of structures, and to evaluate their fitness.
-
class
ilustrado.generation.
Generation
(run_hash: str, generation_idx: int, num_survivors: int, num_accepted: int, populace=None, dumpfile=None, fitness_calculator=None)[source]¶ Bases:
object
Stores each generation of structures.
- Parameters
- Keyword Arguments
-
dump
(gen_suffix)[source]¶ Dump the current generation to JSON file.
- Parameters
gen_suffix (str) – typically gen<gen_number>.
-
dump_bourgeoisie
(gen_suffix)[source]¶ Dump the current generation’s bourgeoisie to JSON file.
- Parameters
gen_suffix (str) – typically gen<gen_number>.
-
load
(gen_fname)[source]¶ Load populace of the generation from a JSON dump.
- Parameters
gen_fname (str) – filename to load.
-
load_bourgeoisie
(bourge_fname)[source]¶ Load bourgeoisie of the generation from a JSON dump.
- Parameters
bourge_fname (str) – filename to load.
-
birth
(populum: dict)[source]¶ Add a structure to the populace.
- Parameters
populum (dict) – structure to add.
-
clean
()[source]¶ Remove structures with pathological formation enthalpies.
- Returns
number of pathological structures removed.
- Return type
num_removed (int)
-
set_bourgeoisie
(elites=None, best_from_stoich=True)[source]¶ Set the structures that will continue to the next generation, i.e. the bourgeoisie.
- Keyword Arguments
elites (list) – list of elite structures to include from the previous generation,
best_from_stoich (bool) – whether to include one structure from each stoichiometry.
-
is_dupe
(doc, sim_tol=0.05, extra_pdfs=None)[source]¶ Compare doc with all other structures at same stoichiometry via PDF overlap.
-
property
pdfs
¶ Returns list of PDFs for generation, calculating if necessary.
-
property
fitnesses
¶ Return list of normalised fitnesses for population.
-
property
raw_fitnesses
¶ Return list of raw fitnesses for population.
-
property
average_pleb_fitness
¶ Return the average normalised fitness of the generation.
-
property
average_bourgeois_fitness
¶ Return the average normalised fitness of the bourgeoisie.
ilustrado.ilustrado module¶
This file implements the genetic algorithm (GA) and acts as main().
-
class
ilustrado.ilustrado.
ArtificialSelector
(**kwargs)[source]¶ Bases:
object
ArtificialSelector takes an initial gene pool and applies a genetic algorithm to optimise some fitness function.
- Keyword Arguments
gene_pool (list(dict)) – initial cursor to use as “Generation 0”,
seed (str) – seed name of cell and param files for CASTEP,
seed_prefix (str) – if not specifying a seed, this name will prefix all runs
fitness_metric (str) – currently either ‘hull’ or ‘test’,
hull (QueryConvexHull) – matador QueryConvexHull object to calculate distances,
res_path (str) – path to folder of res files to create hull, if no hull object passed
mutation_rate (float) – rate at which to perform single-parent mutations (DEFAULT: 0.5)
crossover_rate (float) – rate at which to perform crossovers (DEFAULT: 0.5)
num_generations (int) – number of generations to breed before quitting (DEFAULT: 5)
num_survivors (int) – number of structures to survive to next generation for breeding (DEFAULT: 10)
population (int) – number of structures to breed in any given generation (DEFAULT: 25)
failure_ratio (int) – maximum number of attempts per success (DEFAULT: 5)
elitism (float) – fraction of next generation to be comprised of elite structures from previous generation (DEFAULT: 0.2)
best_from_stoich (bool) – whether to always include the best structure from a stoichiometry in the next generation,
structure_filter (fn(doc)) – any function that takes a matador doc and returns True or False,
check_dupes (bool) – if True, filter relaxed structures for uniqueness on-the-fly (DEFAULT: True)
check_dupes_hull (bool) – compare pdf with all hull structures (DEFAULT: True)
sandbagging (bool) – whether or not to disfavour nearby compositions (DEFAULT: False)
minsep_dict (dict) – dictionary containing element-specific minimum separations, e.g. {(‘K’, ‘K’): 2.5, (‘K’, ‘P’): 2.0}. These should only be set such that atoms do not overlap; let the DFT deal with bond lengths. No effort is made to push apart atoms that are too close; the trial will simply be discarded. (DEFAULT: None)
max_num_mutations (int) – maximum number of mutations to perform on a single structure,
max_num_atoms (int) – most atoms allowed in a structure post-mutation/crossover,
ncores (int or list(int)) – specifies the number of cores used by listed nodes per thread,
nprocs (int) – total number of processes,
recover_from (str) – recover from a previous run_hash; by default, ilustrado will recover if it finds only one run hash in the folder
load_only (bool) – only load structures, do not continue breeding (DEFAULT: False)
executable (str) – path to DFT binary (DEFAULT: castep)
compute_mode (str) – either direct, slurm, manual (DEFAULT: direct)
max_num_nodes (int) – number of array jobs to run per generation in slurm mode,
walltime_hrs (int) – maximum walltime for a SLURM array job,
slurm_template (str) – path to template slurm script that includes module loads etc,
entrypoint (str) – path to script that initialised this object, such that it can be called by SLURM
debug (bool) – maximum printing level
testing (bool) – run test code only if true
verbosity (int) – extra printing level,
loglevel (str) – follows std library logging levels.
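The keyword arguments above might be collected into a settings dictionary as sketched here. Values commented as DEFAULT repeat the documented defaults; the seed prefix and res path are hypothetical placeholders, and the commented-out call assumes ilustrado is installed and that a gene_pool list has already been loaded.

```python
# Settings sketch for ArtificialSelector; names are the documented keyword arguments.
settings = {
    "seed_prefix": "example-run",        # hypothetical prefix for all runs
    "fitness_metric": "hull",
    "res_path": "./hull_structures",     # hypothetical folder of res files
    "mutation_rate": 0.5,                # DEFAULT
    "crossover_rate": 0.5,               # DEFAULT
    "num_generations": 5,                # DEFAULT
    "num_survivors": 10,                 # DEFAULT
    "population": 25,                    # DEFAULT
    "failure_ratio": 5,                  # DEFAULT
    "elitism": 0.2,                      # DEFAULT
    "check_dupes": True,                 # DEFAULT
    "sandbagging": False,                # DEFAULT
    # only large enough to stop atoms overlapping; bond lengths are left to DFT
    "minsep_dict": {("K", "K"): 2.5, ("K", "P"): 2.0},
    "compute_mode": "direct",            # DEFAULT
    "executable": "castep",              # DEFAULT
}

# With ilustrado available, the settings would be passed as keyword arguments:
# from ilustrado.ilustrado import ArtificialSelector
# ArtificialSelector(gene_pool=gene_pool, **settings)
```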
-
breed_generation
()[source]¶ Build next generation from mutations/crossover of current and perform relaxations if necessary.
-
write_unrelaxed_generation
()[source]¶ Perform mutations and write res files for the resulting structures. Additionally, dump an unrelaxed json file.
-
batch_birth
()[source]¶ Assess whether a generation has been relaxed already. This is done by checking for the existence of a file called <run_hash>-genunrelaxed.json.
If so, match the relaxations up with the cached unrelaxed structures and rank them ready for the next generation.
If not, create a new generation of structures, dump the unrelaxed structures to file, create the jobscripts to relax them, submit them and the job to check up on the relaxations, then exit.
-
continuous_birth
()[source]¶ Create new generation and relax “as they come”, filling the compute resources allocated.
-
enforce_elitism
()[source]¶ Add elite structures from previous generations to bourgeoisie of current generation, through the merit of their ancestors alone.
-
reset_and_dump
()[source]¶ Add now complete generation to generation list, reset the next_gen variable and write dump files.
-
birth_new_structure
()[source]¶ Generate a new structure from current settings.
- Returns
newborn structure to be optimised
- Return type
-
scrape_result
(result, proc=None, newborns=None)[source]¶ Check process for result and scrape into self.next_gen if successful, with duplicate detection if desired. If the optional arguments are provided, extra logging info will be found when running in direct mode.
-
kill_all
(procs)[source]¶ Loop over processes and kill them all.
- Parameters
procs (list) – list of NewbornProcess in form documented above.
-
recover
()[source]¶ Attempt to recover previous generations from files in cwd named ‘<run_hash>_gen{}.json’.format(gen_idx).
ilustrado.mutate module¶
This file implements all possible single mutant mutations.
-
ilustrado.mutate.
mutate
(parent, mutations=None, max_num_mutations=2, debug=False)[source]¶ Wrap _mutate to check for null/invalid mutations.
-
ilustrado.mutate.
permute_atoms
(mutant, debug=False)[source]¶ Swap the positions of random pairs of atoms.
- Parameters
mutant (dict) – structure to mutate in-place.
- Raises
RuntimeError – if only one type of atom is present.
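A minimal sketch of an in-place permutation follows. It is not the permute_atoms implementation: sketch_permute is a hypothetical name, the 'atom_types' and 'positions_frac' keys are assumed to describe the structure, and only a single pair of unlike atoms is swapped.

```python
import random


def sketch_permute(mutant, rng=random):
    """Swap the positions of one random pair of unlike atoms, in place."""
    types = mutant["atom_types"]
    if len(set(types)) < 2:
        raise RuntimeError("Only one type of atom present, cannot permute.")
    i = rng.randrange(len(types))
    # choose a partner of a different species so that the swap changes the structure
    j = rng.choice([k for k, species in enumerate(types) if species != types[i]])
    positions = mutant["positions_frac"]
    positions[i], positions[j] = positions[j], positions[i]
```

Swapping two atoms of the same species would leave the structure unchanged, which is why the sketch raises when only one species is present.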
-
ilustrado.mutate.
transmute_atoms
(mutant, debug=False)[source]¶ Transmute one atom for another type in the cell.
- Parameters
mutant (dict) – structure to mutate in-place.
- Raises
RuntimeError – if only one type of atom is present.
-
ilustrado.mutate.
vacancy
(mutant, debug=False)[source]¶ Remove a random atom from the structure.
- Parameters
mutant (dict) – structure to mutate in-place.
-
ilustrado.mutate.
voronoi_shuffle
(mutant, element_to_remove=None, preserve_stoich=False, debug=False, testing=False)[source]¶ Remove all atoms of type element_to_remove, then perform Voronoi analysis on the remaining sublattice. Cluster the nodes with KMeans, then repopulate the clustered Voronoi nodes with atoms of the removed element.
- Parameters
mutant (dict) – structure to mutate in-place.
- Keyword Arguments
- Raises
RuntimeError – if unable to perform Voronoi shuffle.
-
ilustrado.mutate.
random_strain
(mutant, debug=False)[source]¶ Apply random strain tensor to unit cell from 6 epsilon_i components with values between -1 and 1. The cell is then scaled to the parent’s volume.
- Parameters
mutant (dict) – structure to mutate in-place.
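The strain-then-rescale step can be sketched with plain lists. This is an illustration under assumptions rather than the random_strain implementation: the helper names are hypothetical, the six components are arranged in Voigt order as a symmetric matrix added to the identity, near-singular cells are resampled, and the function returns a new lattice instead of mutating a document in place.

```python
import random


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def sketch_random_strain(lattice, rng=random):
    """Return a strained copy of `lattice` (rows are lattice vectors) at the parent volume."""
    parent_volume = abs(det3(lattice))
    while True:
        e = [rng.uniform(-1.0, 1.0) for _ in range(6)]
        # symmetric matrix I + epsilon; the Voigt ordering is an assumption
        strain = [
            [1.0 + e[0], e[5] / 2.0, e[4] / 2.0],
            [e[5] / 2.0, 1.0 + e[1], e[3] / 2.0],
            [e[4] / 2.0, e[3] / 2.0, 1.0 + e[2]],
        ]
        strained = [
            [sum(lattice[i][k] * strain[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)
        ]
        volume = abs(det3(strained))
        if volume > 1e-3 * parent_volume:  # reject near-singular cells and retry
            break
    # isotropic rescale so that the strained cell has the parent's volume
    scale = (parent_volume / volume) ** (1.0 / 3.0)
    return [[scale * x for x in row] for row in strained]
```

Because the rescaling is isotropic, the shape change introduced by the strain is preserved while the volume returns to that of the parent.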
ilustrado.util module¶
Catch-all file for utility functions.
-
ilustrado.util.
strip_useless
(doc, to_run=False)[source]¶ Strip useless information from a matador doc.
-
class
ilustrado.util.
FakeComputeTask
(*args, **kwargs)[source]¶ Bases:
matador.compute.compute.ComputeTask
Fake Relaxer for testing, with same parameters as the real one from matador.compute.