ipsuite.configuration_selection package¶
Submodules¶
ipsuite.configuration_selection.base module¶
Base Node for ConfigurationSelection.
- class ipsuite.configuration_selection.base.BatchConfigurationSelection(*args, **kwargs)[source]¶
Bases: ConfigurationSelection
Base node for BatchConfigurationSelection.
Attributes¶
- data: list[ase.Atoms]
The atoms data to process. This must be an input to the Node.
- train_data: list[ase.Atoms]
Batch active learning methods usually take into account the data a model was trained on. The training dataset has to be supplied with this argument.
- atoms: list[ase.Atoms]
The processed atoms data. This is an output of the Node. It does not have to be ‘field.Atoms’ but can also be e.g. a ‘property’.
- train_data: list[Atoms]¶
- class ipsuite.configuration_selection.base.ConfigurationSelection(*args, **kwargs)[source]¶
Bases: IPSNode
Base Node for ConfigurationSelection.
Attributes¶
- data: list[Atoms]|list[list[Atoms]]|utils.types.SupportsAtoms
The data to select from.
- exclude_configurations: dict[str, list]|utils.types.SupportsSelectedConfigurations
Atoms to exclude from the selection.
- exclude: list[zntrack.Node]|zntrack.Node|None
Exclude the selected configurations from these nodes.
- data: list[Atoms]¶
- property excluded_frames: list[Atoms]¶
Get a list of the atoms objects that were not selected.
- property frames: list[Atoms]¶
Get a list of the selected atoms objects.
- img_selection: Path = PosixPath('$nwd$/selection.png')¶
- select_atoms(atoms_lst: List[Atoms]) -> List[int][source]¶
Run the selection method.
Parameters¶
- atoms_lst: List[ase.Atoms]
List of ase Atoms objects to select configurations from.
Returns¶
- List[int]:
A list of the selected indices, each in the range 0 to len(atoms_lst) - 1.
- selected_ids: list[int] = NOT_AVAILABLE¶
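The relationship between selected_ids, frames and excluded_frames can be illustrated with a minimal sketch. It assumes the selected indices simply partition the input list; the helper name `partition_frames` is hypothetical and not part of ipsuite, and plain strings stand in for ase.Atoms objects.

```python
def partition_frames(frames, selected_ids):
    """Split frames into (selected, excluded) lists given the selected indices."""
    selected = set(selected_ids)
    chosen = [frames[i] for i in selected_ids]
    excluded = [frame for i, frame in enumerate(frames) if i not in selected]
    return chosen, excluded


# Strings stand in for ase.Atoms objects in this illustration.
frames = ["a", "b", "c", "d", "e"]
chosen, excluded = partition_frames(frames, [0, 3])
print(chosen)    # the frames at indices 0 and 3
print(excluded)  # everything else, in original order
```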
ipsuite.configuration_selection.filter module¶
- class ipsuite.configuration_selection.filter.FilterOutlier(*args, **kwargs)[source]¶
Bases: IPSNode
Remove outliers from the data based on a given property.
Attributes¶
- key: str, default="energy"
The property to filter on.
- threshold: float, default=3
The threshold for filtering in units of standard deviations.
- direction: {"above", "below", "both"}, default="both"
The direction to filter in.
- data: list[Atoms]¶
- direction: Literal['above', 'below', 'both'] = 'both'¶
- property excluded_frames¶
- filtered_indices: list = NOT_AVAILABLE¶
- property frames: list[Atoms]¶
- histogram: str = PosixPath('$nwd$/histogram.png')¶
- key: str = 'energy'¶
- threshold: float = 3¶
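A threshold expressed in standard deviations can be sketched as follows. This is an illustration under stated assumptions, not the FilterOutlier implementation: the function name `filter_outliers` is hypothetical, the population standard deviation is used, and the function returns the indices that are kept.

```python
import statistics


def filter_outliers(values, threshold=3.0, direction="both"):
    """Return indices of values within `threshold` standard deviations of the mean."""
    if direction not in ("above", "below", "both"):
        raise ValueError(f"unknown direction: {direction!r}")
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    kept = []
    for i, value in enumerate(values):
        z = (value - mean) / std if std > 0 else 0.0
        too_high = z > threshold
        too_low = z < -threshold
        if direction == "above":
            is_outlier = too_high
        elif direction == "below":
            is_outlier = too_low
        else:
            is_outlier = too_high or too_low
        if not is_outlier:
            kept.append(i)
    return kept


energies = [1.0] * 20 + [100.0]  # the last value lies far above the rest
print(filter_outliers(energies))
```

With direction="below" the same data keeps every index, because the single extreme value lies above the mean.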
ipsuite.configuration_selection.index module¶
Select configurations by item, e.g. slice or list of indices.
- class ipsuite.configuration_selection.index.IndexSelection(*args, **kwargs)[source]¶
Bases: ConfigurationSelection
Select configurations by explicit indices or slice parameters.
Parameters¶
- data: list[ase.Atoms]
The atomic configurations to select from.
- indices: list[int], optional
Explicit list of indices to select. Cannot be used with slice parameters.
- start: int, optional
Start index for slice selection.
- stop: int, optional
Stop index for slice selection.
- step: int, optional
Step size for slice selection.
Attributes¶
- selected_ids: list[int]
Indices of selected configurations.
- frames: list[ase.Atoms]
The selected atomic configurations.
- excluded_frames: list[ase.Atoms]
The atomic configurations that were not selected.
Examples¶
>>> with project:
...     data = ips.AddData(file="ethanol.xyz")
...     selector = ips.IndexSelection(data=data.frames, indices=[0, 5, 10, 15])
>>> project.repro()
>>> print(f"Selected {len(selector.selected_ids)} configurations with IDs: "
...       f"{selector.selected_ids}")
Selected 4 configurations with IDs: [0, 5, 10, 15]
- data: list[ase.Atoms]¶
- indices: list[int] | None = None¶
- select_atoms(atoms_lst: List[Atoms]) -> List[int][source]¶
Select Atoms by explicit indices or slice parameters.
- start: int | None = None¶
- step: int | None = None¶
- stop: int | None = None¶
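The rule that explicit indices cannot be combined with slice parameters can be sketched in plain Python. The function `select_by_index` is a hypothetical stand-in, and raising ValueError on a conflict is an assumption about how such a check might behave.

```python
def select_by_index(n_frames, indices=None, start=None, stop=None, step=None):
    """Return selected indices from either an explicit list or slice parameters."""
    uses_slice = any(value is not None for value in (start, stop, step))
    if indices is not None and uses_slice:
        raise ValueError("indices cannot be combined with start/stop/step")
    if indices is not None:
        return list(indices)
    # With no parameters at all, slice(None, None, None) selects everything.
    return list(range(n_frames))[slice(start, stop, step)]


print(select_by_index(20, indices=[0, 5, 10, 15]))
print(select_by_index(10, start=2, stop=8, step=3))
```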
ipsuite.configuration_selection.kernel module¶
ipsuite.configuration_selection.random module¶
Module for selecting Atoms randomly.
- class ipsuite.configuration_selection.random.RandomSelection(*args, **kwargs)[source]¶
Bases: ConfigurationSelection
Select configurations randomly without replacement.
Parameters¶
- data: list[ase.Atoms]
The atomic configurations to select from.
- n_configurations: int
Number of configurations to select.
- seed: int, default=1234
Random seed for reproducible selection.
Attributes¶
- selected_ids: list[int]
Indices of selected configurations.
- frames: list[ase.Atoms]
The selected atomic configurations.
- excluded_frames: list[ase.Atoms]
The atomic configurations that were not selected.
Examples¶
>>> with project:
...     data = ips.AddData(file="ethanol.xyz")
...     selector = ips.RandomSelection(data=data.frames, n_configurations=10, seed=42)
>>> project.repro()
>>> print(f"Selected {len(selector.selected_ids)} configurations with IDs: "
...       f"{selector.selected_ids}")
Selected 10 configurations with IDs: [83, 53, 70, 45, 44, 39, 22, 80, 10, 0]
- data: list[ase.Atoms]¶
- n_configurations: int¶
- seed: int = 1234¶
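Seeded sampling without replacement can be sketched with the standard library. The function `select_random` is hypothetical, and because it uses Python's `random` module rather than whatever generator RandomSelection uses, it is not expected to reproduce the IDs shown in the example above.

```python
import random


def select_random(n_frames, n_configurations, seed=1234):
    """Draw n_configurations distinct indices from range(n_frames)."""
    rng = random.Random(seed)  # a local generator keeps global state untouched
    return rng.sample(range(n_frames), n_configurations)


first = select_random(100, 10, seed=42)
second = select_random(100, 10, seed=42)
print(first == second)  # the same seed gives the same selection
```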
ipsuite.configuration_selection.split module¶
Module for selecting the first fraction of Atoms from a dataset.
- class ipsuite.configuration_selection.split.SplitSelection(*args, **kwargs)[source]¶
Bases: ConfigurationSelection
Select the first n% of configurations from the dataset.
Parameters¶
- data: list[ase.Atoms]
The atomic configurations to select from.
- split: float
Fraction of the data to select (0.0 to 1.0).
Attributes¶
- selected_ids: list[int]
Indices of selected configurations.
- frames: list[ase.Atoms]
The selected atomic configurations.
- excluded_frames: list[ase.Atoms]
The atomic configurations that were not selected.
Examples¶
>>> with project:
...     data = ips.AddData(file="ethanol.xyz")  # contains 100 frames
...     selector = ips.SplitSelection(data=data.frames, split=0.1)
>>> project.repro()
>>> print(f"Selected {len(selector.selected_ids)} configurations with IDs: "
...       f"{selector.selected_ids}")
Selected 10 configurations with IDs: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
- data: list[ase.Atoms]¶
- select_atoms(atoms_lst: List[Atoms]) -> List[int][source]¶
Run the selection method.
Parameters¶
- atoms_lst: List[ase.Atoms]
List of ase Atoms objects to select configurations from.
Returns¶
- List[int]:
A list of the selected indices, each in the range 0 to len(atoms_lst) - 1.
- split: float¶
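Selecting the leading fraction of a dataset amounts to a short index computation. The sketch below uses a hypothetical function `select_split` and assumes the frame count is truncated toward zero; the rounding used by SplitSelection itself is not stated here.

```python
def select_split(n_frames, split):
    """Return the indices of the first `split` fraction of n_frames."""
    if not 0.0 <= split <= 1.0:
        raise ValueError("split must lie between 0.0 and 1.0")
    n_selected = int(n_frames * split)  # assumption: truncate toward zero
    return list(range(n_selected))


print(select_split(100, 0.1))
```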
ipsuite.configuration_selection.threshold module¶
Selecting atoms based on a threshold.
- class ipsuite.configuration_selection.threshold.ThresholdSelection(*args, **kwargs)[source]¶
Bases: ConfigurationSelection
Select atoms based on a given threshold.
Select atoms above a given threshold or the n_configurations with the highest / lowest value. Typically useful for uncertainty based selection.
Attributes¶
- key: str
The key in ‘calc.results’ to select from
- threshold: float, optional
All values above (or below if negative) this threshold will be selected. If n_configurations is given, ‘self.threshold’ will be prioritized, but a maximum of n_configurations will be selected.
- reference: str, optional
For visualizing the selection a reference value can be given. For ‘energy_uncertainty’ this would typically be ‘energy’.
- n_configurations: int, optional
Number of configurations to select.
- min_distance: int, optional
Minimum distance between selected configurations.
- dim_reduction: str, optional
Reduces the dimensionality of the chosen uncertainty along the specified axis by calculating either the maximum or mean value.
Choose from [“max”, “mean”]
- reduction_axis: tuple(int), optional
Specifies the axis along which the reduction occurs.
- data: list[ase.Atoms]¶
- dim_reduction: str = None¶
- key: str = 'energy_uncertainty'¶
- min_distance: int = 1¶
- n_configurations: int | None = None¶
- reduction_axis: list[int] = (1, 2)¶
- reference: str = 'energy'¶
- select_atoms(atoms_lst: List[Atoms], save_fig: bool = True) -> List[int][source]¶
Select atoms based on the configured threshold.
Parameters¶
- atoms_lst: typing.List[ase.Atoms]
List of atoms objects to select from.
Returns¶
- typing.List[int]:
List containing the selected indices.
- threshold: float | None = None¶
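How threshold, n_configurations and min_distance might interact can be sketched as a greedy pass over the values in descending order. This is one plausible reading of the attribute descriptions above, not the ThresholdSelection implementation: the function `select_above_threshold` is hypothetical, it covers only the "above" direction, and it treats min_distance as a minimum gap between selected indices.

```python
def select_above_threshold(values, threshold=None, n_configurations=None,
                           min_distance=1):
    """Greedily pick the largest values, respecting a cap and an index spacing."""
    # Visit indices from the largest value to the smallest.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    chosen = []
    for i in order:
        if threshold is not None and values[i] <= threshold:
            break  # all remaining values are at or below the threshold
        if n_configurations is not None and len(chosen) >= n_configurations:
            break  # the cap on the number of configurations is reached
        if all(abs(i - j) >= min_distance for j in chosen):
            chosen.append(i)
    return sorted(chosen)


uncertainty = [0.1, 0.9, 0.8, 0.2, 0.7, 0.05]
print(select_above_threshold(uncertainty, threshold=0.5, min_distance=2))
```

In this example index 2 exceeds the threshold but is skipped, because it sits next to the already selected index 1.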
ipsuite.configuration_selection.uniform_arange module¶
Selecting atoms with a given step between them.
- class ipsuite.configuration_selection.uniform_arange.UniformArangeSelection(*args, **kwargs)[source]¶
Bases: ConfigurationSelection
Select configurations with uniform spacing using a step size.
Parameters¶
- data: list[ase.Atoms]
The atomic configurations to select from.
- step: int
Step size for selection. Every nth configuration will be selected.
Attributes¶
- selected_ids: list[int]
Indices of selected configurations.
- frames: list[ase.Atoms]
The selected atomic configurations.
- excluded_frames: list[ase.Atoms]
The atomic configurations that were not selected.
Examples¶
>>> with project:
...     data = ips.AddData(file="ethanol.xyz")  # contains 100 frames
...     selector = ips.UniformArangeSelection(data=data.frames, step=10)
>>> project.repro()
>>> print(f"Selected {len(selector.selected_ids)} configurations with IDs: "
...       f"{selector.selected_ids}")
Selected 10 configurations with IDs: [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
- data: list[ase.Atoms]¶
- select_atoms(atoms_lst: List[Atoms]) -> List[int][source]¶
Take every nth (step) object of a given atoms list.
Parameters¶
- atoms_lst: typing.List[ase.Atoms]
List of atoms objects to select from.
Returns¶
- typing.List[int]:
List containing the selected indices.
- step: int¶
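Taking every nth configuration corresponds to a stepped range over the frame indices. The function name `select_every_nth` below is hypothetical; starting at index 0 is consistent with the example output above.

```python
def select_every_nth(n_frames, step):
    """Return every `step`-th index, starting at 0."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return list(range(0, n_frames, step))


print(select_every_nth(100, 10))
```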
ipsuite.configuration_selection.uniform_energetic module¶
Module for selecting atoms uniformly in energy space.
- class ipsuite.configuration_selection.uniform_energetic.UniformEnergeticSelection(*args, **kwargs)[source]¶
Bases: ConfigurationSelection
A class to perform data selection based on uniform global energy selection.
- data: list[ase.Atoms]¶
- n_configurations: int¶
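The entry above does not spell out what "uniform global energy selection" means in detail. One plausible reading, sketched below, is to sort the configurations by energy and take evenly spaced positions in that ordering. The function `select_uniform_in_energy` and its rounding are assumptions for illustration, not the library's algorithm.

```python
def select_uniform_in_energy(energies, n_configurations):
    """Pick indices evenly spaced along the energy-sorted ordering."""
    if not 1 <= n_configurations <= len(energies):
        raise ValueError("n_configurations must be between 1 and len(energies)")
    # Indices of the configurations, ordered from lowest to highest energy.
    order = sorted(range(len(energies)), key=lambda i: energies[i])
    if n_configurations == 1:
        return [order[0]]
    last = len(order) - 1
    positions = [round(k * last / (n_configurations - 1))
                 for k in range(n_configurations)]
    return sorted(order[p] for p in positions)


energies = [5.0, 1.0, 4.0, 2.0, 3.0]
print(select_uniform_in_energy(energies, 3))  # lowest, median and highest energy
```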
ipsuite.configuration_selection.uniform_temporal module¶
Module for selecting atoms uniformly in time.
- class ipsuite.configuration_selection.uniform_temporal.UniformTemporalSelection(*args, **kwargs)[source]¶
Bases: ConfigurationSelection
Select configurations uniformly distributed across time.
Parameters¶
- data: list[ase.Atoms]
The atomic configurations to select from.
- n_configurations: int
Number of configurations to select uniformly across the trajectory.
Attributes¶
- selected_ids: list[int]
Indices of selected configurations.
- frames: list[ase.Atoms]
The selected atomic configurations.
- excluded_frames: list[ase.Atoms]
The atomic configurations that were not selected.
Examples¶
>>> with project:
...     data = ips.AddData(file="ethanol.xyz")  # contains 100 frames
...     selector = ips.UniformTemporalSelection(data=data.frames, n_configurations=5)
>>> project.repro()
>>> print(f"Selected {len(selector.selected_ids)} configurations with IDs: "
...       f"{selector.selected_ids}")
Selected 5 configurations with IDs: [0, 25, 50, 74, 99]
- data: list[ase.Atoms]¶
- n_configurations: int¶
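Evenly spaced selection across a trajectory can be sketched by spreading n_configurations points between the first and last index and rounding each to an integer. The function `select_uniform_in_time` is hypothetical and the rounding rule (Python's built-in round) is an assumption; for 100 frames and 5 configurations it happens to give the same IDs as the example above.

```python
def select_uniform_in_time(n_frames, n_configurations):
    """Return n_configurations indices spread evenly over range(n_frames)."""
    if n_configurations < 1:
        raise ValueError("n_configurations must be at least 1")
    if n_configurations == 1:
        return [0]
    last = n_frames - 1
    # Evenly spaced positions from 0 to last, rounded to the nearest index.
    return [round(k * last / (n_configurations - 1))
            for k in range(n_configurations)]


print(select_uniform_in_time(100, 5))
```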
Module contents¶
Configuration Selection Nodes.
- class ipsuite.configuration_selection.ConfigurationSelection(*args, **kwargs)[source]¶
Bases: IPSNode
Base Node for ConfigurationSelection. Documented in full above as ipsuite.configuration_selection.base.ConfigurationSelection.
- class ipsuite.configuration_selection.FilterOutlier(*args, **kwargs)[source]¶
Bases: IPSNode
Remove outliers from the data based on a given property. Documented in full above as ipsuite.configuration_selection.filter.FilterOutlier.
- class ipsuite.configuration_selection.IndexSelection(*args, **kwargs)[source]¶
Bases: ConfigurationSelection
Select configurations by explicit indices or slice parameters. Documented in full above as ipsuite.configuration_selection.index.IndexSelection.
- class ipsuite.configuration_selection.RandomSelection(*args, **kwargs)[source]¶
Bases: ConfigurationSelection
Select configurations randomly without replacement. Documented in full above as ipsuite.configuration_selection.random.RandomSelection.
- class ipsuite.configuration_selection.SplitSelection(*args, **kwargs)[source]¶
Bases: ConfigurationSelection
Select the first n% of configurations from the dataset. Documented in full above as ipsuite.configuration_selection.split.SplitSelection.
- class ipsuite.configuration_selection.ThresholdSelection(*args, **kwargs)[source]¶
Bases: ConfigurationSelection
Select atoms based on a given threshold. Documented in full above as ipsuite.configuration_selection.threshold.ThresholdSelection.
- class ipsuite.configuration_selection.UniformArangeSelection(*args, **kwargs)[source]¶
Bases: ConfigurationSelection
Select configurations with uniform spacing using a step size. Documented in full above as ipsuite.configuration_selection.uniform_arange.UniformArangeSelection.
- class ipsuite.configuration_selection.UniformEnergeticSelection(*args, **kwargs)[source]¶
Bases: ConfigurationSelection
A class to perform data selection based on uniform global energy selection.
- data: list[ase.Atoms]¶
- n_configurations: int¶
- class ipsuite.configuration_selection.UniformTemporalSelection(*args, **kwargs)[source]¶
Bases: ConfigurationSelection
Select configurations uniformly distributed across time. Documented in full above as ipsuite.configuration_selection.uniform_temporal.UniformTemporalSelection.