napistu_torch.labels.create
Functions
create_relation_labels | Create edge/relation labels from edge_strata for relation-aware tasks.
create_vertex_labels | Create vertex labels for single-label prediction tasks.
encode_labels | Prepare labels for PyTorch Geometric based on task type.
- napistu_torch.labels.create._prepare_continuous_labels(labels: Series, missing_value: float = nan, dtype: torch.dtype = torch.float32) torch.Tensor
Convert a pandas Series of continuous labels to a float tensor for regression.
- Parameters:
labels (pd.Series) – Series containing continuous numeric values (can include NaN)
missing_value (float, optional) – Value to use for missing entries (default: nan)
dtype (torch.dtype, optional) – Torch dtype for the output tensor (default: torch.float32)
- Returns:
Continuous labels as a tensor
- Return type:
torch.Tensor
- Raises:
ValueError – If the Series dtype is not numeric
Examples
>>> labels = pd.Series([1.5, 2.3, np.nan, 4.1])
>>> tensor = _prepare_continuous_labels(labels)
>>> print(tensor)
tensor([1.5000, 2.3000, nan, 4.1000])
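The conversion described above can be approximated without napistu_torch. The following is a minimal sketch, not the library's implementation: `continuous_labels_sketch` is a hypothetical name, and it stops at a NumPy float32 array rather than a torch.Tensor to keep the example dependency-light. It assumes that a non-NaN `missing_value` simply replaces missing entries.

```python
import numpy as np
import pandas as pd


def continuous_labels_sketch(labels: pd.Series, missing_value: float = np.nan) -> np.ndarray:
    """Hypothetical stand-in: numeric Series -> float32 array with a fill value."""
    # Reject non-numeric input, mirroring the documented ValueError
    if not pd.api.types.is_numeric_dtype(labels):
        raise ValueError(f"Expected a numeric dtype, got {labels.dtype}")
    # astype returns a copy, so the fill below never touches the caller's data
    values = labels.astype("float64").to_numpy().astype(np.float32)
    # Only rewrite missing entries when a non-NaN fill value is requested
    if not np.isnan(missing_value):
        values[np.isnan(values)] = missing_value
    return values


labels = pd.Series([1.5, 2.3, np.nan, 4.1])
kept = continuous_labels_sketch(labels)                        # NaN stays NaN
filled = continuous_labels_sketch(labels, missing_value=0.0)   # NaN becomes 0.0
```

Keeping NaN as the default lets a downstream loss mask unlabeled entries explicitly instead of silently training on a placeholder value.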
- napistu_torch.labels.create._prepare_discrete_labels(labels: Series, missing_value: int = -1) Tuple[torch.Tensor, Dict[int, str | int | float]]
Convert a pandas Series of discrete/categorical labels to integer encoding.
Supports:
- String/object dtype
- Categorical dtype
- Integer dtype
- Float dtype (treated as discrete categories)
- Parameters:
labels (pd.Series) – Series containing discrete labels (can include NaN/None/pd.NA)
missing_value (int, optional) – Integer to use for missing values (default: -1)
- Returns:
encoded (torch.Tensor) – Integer-encoded labels as a tensor (dtype=torch.long)
lookup (Dict[int, any]) – Mapping from integer codes to original label values
- Raises:
ValueError – If the Series dtype is not supported
Examples
>>> labels = pd.Series(['A', 'B', 'A', None, 'C'])
>>> encoded, lookup = _prepare_discrete_labels(labels)
>>> print(encoded)
tensor([0, 1, 0, -1, 2])
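To make the encoding and its lookup concrete, here is a rough sketch built on `pandas.factorize`. It is an illustration under assumptions, not the library's code: `discrete_labels_sketch` is a hypothetical name, codes are assigned in order of first appearance (which matches the example output above, though the real function may order categories differently), and the result is a NumPy int64 array rather than a torch.long tensor.

```python
import pandas as pd


def discrete_labels_sketch(labels: pd.Series, missing_value: int = -1):
    """Hypothetical stand-in: integer codes plus a code -> label lookup."""
    # factorize numbers categories by first appearance and marks missing entries as -1
    codes, uniques = pd.factorize(labels)
    codes = codes.astype("int64")  # copy, so the in-place fill below is safe
    if missing_value != -1:
        codes[codes == -1] = missing_value
    # Missing entries get no lookup key; only observed categories are mapped
    lookup = dict(enumerate(uniques))
    return codes, lookup


labels = pd.Series(["A", "B", "A", None, "C"])
encoded, lookup = discrete_labels_sketch(labels)
print(encoded.tolist())  # [0, 1, 0, -1, 2]
print(lookup)            # {0: 'A', 1: 'B', 2: 'C'}

# A different sentinel for missing entries (the value -100 is an arbitrary choice)
masked, _ = discrete_labels_sketch(labels, missing_value=-100)
```

A negative sentinel keeps missing entries distinguishable from every valid class index, so they can be filtered out before computing a loss.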
- napistu_torch.labels.create.create_relation_labels(edge_strata: Series) Tuple[torch.Tensor, Dict[int, Any] | None, LabelingManager]
Create edge/relation labels from edge_strata for relation-aware tasks.
- Parameters:
edge_strata (pd.Series) – Edge categories (e.g., from create_composite_edge_strata). The index should be a MultiIndex with ‘from’ and ‘to’ levels.
- Returns:
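labels (torch.Tensor) – Integer-encoded relation labels
lookup (Dict[int, Any] or None) – Mapping from integer codes to the original edge_strata categories, or None (see the return annotation in the signature)
labeling_manager (LabelingManager) – A LabelingManager configured for relation labels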
Examples
>>> edge_strata = create_composite_edge_strata(napistu_graph)
>>> labels, lookup, manager = create_relation_labels(edge_strata)
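The expected input shape can be illustrated without a NapistuGraph. The sketch below builds a toy edge_strata Series by hand and integer-encodes it with `pandas.factorize`. The vertex names and category strings are invented for illustration, and the encoding step is only an assumption about what happens inside create_relation_labels; no LabelingManager is constructed here.

```python
import pandas as pd

# Toy edges: each row is one edge, indexed by its 'from' and 'to' vertices (names are made up)
index = pd.MultiIndex.from_tuples(
    [("glucose", "R1"), ("R1", "G6P"), ("ATP", "R1")],
    names=["from", "to"],
)

# Hypothetical edge categories; real values would come from create_composite_edge_strata
edge_strata = pd.Series(
    ["species -> reaction", "reaction -> species", "species -> reaction"],
    index=index,
    name="edge_strata",
)

# One integer per edge, plus a lookup from integer code back to the category
codes, categories = pd.factorize(edge_strata)
relation_lookup = dict(enumerate(categories))

print(codes.tolist())     # [0, 1, 0]
print(relation_lookup)    # {0: 'species -> reaction', 1: 'reaction -> species'}
```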
- napistu_torch.labels.create.create_vertex_labels(napistu_graph: NapistuGraph, label_type: str | LabelingManager = 'species_type', task_type: str = 'classification', labeling_managers: Dict[str, LabelingManager] = {'node_type': LabelingManager(label_attribute='node_type', exclude_vertex_attributes=['node_type', 'species_type'], augment_summary_types=['sources'], label_names=None), 'species_type': LabelingManager(label_attribute='species_type', exclude_vertex_attributes=['species_type'], augment_summary_types=['sources'], label_names=None)})
Create vertex labels for single-label prediction tasks.
- Parameters:
napistu_graph (NapistuGraph) – A network-based representation of the SBML_dfs
label_type (Union[str, LabelingManager]) – Either a string naming the type of labels to generate (used to look up a strategy from LABELING_MANAGERS) or a LabelingManager.
The supported strings with pre-configured strategies are:
- species_type: protein, metabolite, drug, etc.
- node_type: protein, metabolite, drug, etc.
labeling_managers (Dict[str, LabelingManager]) – A dictionary of LabelingManager objects for each label type. Ignored if label_type is a LabelingManager.
- Returns:
labels (pd.Series) – A Series with labels as values and vertex names as an index
labeling_manager (LabelingManager) – The LabelingManager for the label type
- napistu_torch.labels.create.encode_labels(labels: Series, task_type: str = 'classification', missing_value: int | float = None) Tuple[torch.Tensor, Dict[int, any]] | torch.Tensor
Prepare labels for PyTorch Geometric based on task type.
This is a convenience wrapper that calls either _prepare_discrete_labels or _prepare_continuous_labels, depending on the task type.
- Parameters:
labels (pd.Series) – Series containing labels
task_type ({'classification', 'regression'}, optional) – Type of task (default: ‘classification’)
missing_value (int or float, optional) – Value to use for missing entries. Defaults: -1 for classification, nan for regression
- Returns:
For classification –
- encoded (torch.Tensor) – Integer-encoded labels
- lookup (Dict[int, any]) – Mapping from integers to original labels
For regression –
- torch.Tensor – Continuous labels as float tensor
Examples
>>> # Classification
>>> labels = pd.Series(['A', 'B', 'A', None])
>>> encoded, lookup = encode_labels(labels, task_type='classification')

>>> # Regression
>>> labels = pd.Series([1.5, 2.3, np.nan, 4.1])
>>> tensor = encode_labels(labels, task_type='regression')
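The dispatch on task_type, including the per-task default for missing_value, might look roughly like the sketch below. `encode_labels_sketch` is a hypothetical stand-in that returns NumPy arrays instead of tensors. Raising ValueError for an unrecognized task_type is an assumption, since the documentation above does not say how that case is handled.

```python
import numpy as np
import pandas as pd


def encode_labels_sketch(labels: pd.Series, task_type: str = "classification", missing_value=None):
    """Hypothetical stand-in for the task-type dispatch (NumPy arrays, not tensors)."""
    if task_type == "classification":
        # Default sentinel for missing class labels is -1
        fill = -1 if missing_value is None else missing_value
        codes, uniques = pd.factorize(labels)
        codes = codes.astype("int64")
        codes[codes == -1] = fill
        return codes, dict(enumerate(uniques))
    if task_type == "regression":
        # Default for missing regression targets is NaN
        fill = np.nan if missing_value is None else missing_value
        values = labels.astype("float64").to_numpy().astype(np.float32)
        values[np.isnan(values)] = fill
        return values
    # Assumed behavior: the documented options are only these two task types
    raise ValueError(f"Unsupported task_type: {task_type!r}")


# Classification returns a (codes, lookup) pair
encoded, lookup = encode_labels_sketch(pd.Series(["A", "B", "A", None]))

# Regression returns a single float array
targets = encode_labels_sketch(pd.Series([1.5, 2.3, np.nan, 4.1]), task_type="regression")
```

Note that the return type differs by task: callers unpack two values for classification and one for regression, as in the examples above.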