napistu_torch.labels.create

Functions

create_relation_labels(edge_strata)

Create edge/relation labels from edge_strata for relation-aware tasks.

create_vertex_labels(napistu_graph[, ...])

Create vertex labels for single-label prediction tasks.

encode_labels(labels[, task_type, missing_value])

Prepare labels for PyTorch Geometric based on task type.

napistu_torch.labels.create._prepare_continuous_labels(labels: Series, missing_value: float = nan, dtype: torch.dtype = torch.float32) → torch.Tensor

Convert a pandas Series of continuous labels to a float tensor for regression.

Parameters:
  • labels (pd.Series) – Series containing continuous numeric values (can include NaN)

  • missing_value (float, optional) – Value to use for missing entries (default: nan)

  • dtype (torch.dtype, optional) – Torch dtype for the output tensor (default: torch.float32)

Returns:

Continuous labels as a tensor

Return type:

torch.Tensor

Raises:

ValueError – If the Series dtype is not numeric

Examples

>>> labels = pd.Series([1.5, 2.3, np.nan, 4.1])
>>> tensor = _prepare_continuous_labels(labels)
>>> print(tensor)
tensor([1.5000, 2.3000,    nan, 4.1000])
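
The role of missing_value can be illustrated without pandas or torch. The sketch below is a pure-Python stand-in, not the library's implementation: fill_missing is a hypothetical helper, and it assumes that both None and NaN count as missing and that non-numeric entries trigger the documented ValueError.

```python
import math
from typing import List, Optional


def fill_missing(values: List[Optional[float]], missing_value: float = math.nan) -> List[float]:
    """Return floats with None/NaN entries replaced by missing_value (hypothetical helper)."""
    filled = []
    for value in values:
        if value is None:
            filled.append(float(missing_value))
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            # NaN is treated as missing; everything else is cast to float
            filled.append(float(missing_value) if math.isnan(value) else float(value))
        else:
            # Assumption: mirrors the ValueError documented for non-numeric input
            raise ValueError(f"non-numeric label: {value!r}")
    return filled


filled = fill_missing([1.5, 2.3, math.nan, 4.1], missing_value=0.0)
print(filled)  # [1.5, 2.3, 0.0, 4.1]
```

With the default missing_value of nan, missing entries stay NaN, which matches the example output above.
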
napistu_torch.labels.create._prepare_discrete_labels(labels: Series, missing_value: int = -1) → Tuple[torch.Tensor, Dict[int, str | int | float]]

Convert a pandas Series of discrete/categorical labels to integer encoding.

Supports:

  • String/object dtype

  • Categorical dtype

  • Integer dtype

  • Float dtype (treated as discrete categories)

Parameters:
  • labels (pd.Series) – Series containing discrete labels (can include NaN/None/pd.NA)

  • missing_value (int, optional) – Integer to use for missing values (default: -1)

Returns:

  • encoded (torch.Tensor) – Integer-encoded labels as a tensor (dtype=torch.long)

  • lookup (Dict[int, Any]) – Mapping from integer codes to original label values

Raises:

ValueError – If the Series dtype is not supported

Examples

>>> labels = pd.Series(['A', 'B', 'A', None, 'C'])
>>> encoded, lookup = _prepare_discrete_labels(labels)
>>> print(encoded)
tensor([ 0,  1,  0, -1,  2])
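
To make the relationship between the encoded tensor and the lookup concrete, here is a minimal pure-Python sketch of the same idea. It is not the library's code: encode_discrete and is_missing are hypothetical names, and assigning codes in sorted order is an assumption (the actual function may assign codes in order of appearance).

```python
import math
from typing import Any, Dict, List, Tuple


def is_missing(value: Any) -> bool:
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def encode_discrete(values: List[Any], missing_value: int = -1) -> Tuple[List[int], Dict[int, Any]]:
    """Map each distinct non-missing value to an integer code (hypothetical helper)."""
    # Assumption: codes follow the sorted order of the distinct labels
    categories = sorted({v for v in values if not is_missing(v)})
    code_of = {category: code for code, category in enumerate(categories)}
    # Missing entries receive the sentinel instead of a category code
    encoded = [missing_value if is_missing(v) else code_of[v] for v in values]
    # Invert the mapping so codes can be translated back to labels
    lookup = {code: category for category, code in code_of.items()}
    return encoded, lookup


encoded, lookup = encode_discrete(["A", "B", "A", None, "C"])
print(encoded)  # [0, 1, 0, -1, 2]
print(lookup)   # {0: 'A', 1: 'B', 2: 'C'}
```

In this sketch the sentinel never appears as a key in the lookup, so downstream code can mask entries equal to missing_value before decoding.
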
napistu_torch.labels.create.create_relation_labels(edge_strata: Series) → Tuple[torch.Tensor, Dict[int, Any] | None, LabelingManager]

Create edge/relation labels from edge_strata for relation-aware tasks.

Parameters:

edge_strata (pd.Series) – Edge categories (e.g., from create_composite_edge_strata). The index should be a MultiIndex with 'from' and 'to' levels.

Returns:

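  • labels (torch.Tensor) – Integer-encoded relation labels

  • lookup (Dict[int, Any] or None) – Mapping from integer codes to the original edge strata values; may be None

  • labeling_manager (LabelingManager) – A LabelingManager configured for relation labels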

Examples

>>> edge_strata = create_composite_edge_strata(napistu_graph)
>>> labels, lookup, manager = create_relation_labels(edge_strata)
napistu_torch.labels.create.create_vertex_labels(napistu_graph: NapistuGraph, label_type: str | LabelingManager = 'species_type', task_type: str = 'classification', labeling_managers: Dict[str, LabelingManager] = {'node_type': LabelingManager(label_attribute='node_type', exclude_vertex_attributes=['node_type', 'species_type'], augment_summary_types=['sources'], label_names=None), 'species_type': LabelingManager(label_attribute='species_type', exclude_vertex_attributes=['species_type'], augment_summary_types=['sources'], label_names=None)})

Create vertex labels for single-label prediction tasks.

Parameters:
  • napistu_graph (NapistuGraph) – A network-based representation of the SBML_dfs

  • label_type (Union[str, LabelingManager]) –

    Either a string indicating the type of labels to generate (which looks up a strategy in LABELING_MANAGERS) or a LabelingManager.

    The supported strings with pre-configured strategies are:

    - species_type: protein, metabolite, drug, etc.
    - node_type: protein, metabolite, drug, etc.

  • task_type (str, optional) – Type of task (default: 'classification')

  • labeling_managers (Dict[str, LabelingManager]) – A dictionary of LabelingManager objects for each label type. Ignored if label_type is a LabelingManager.

Returns:

  • labels (pd.Series) – A Series with labels as values and vertex names as an index

  • labeling_manager (LabelingManager) – The LabelingManager for the label type
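
The way label_type is resolved (a string is looked up in labeling_managers, while a LabelingManager is used directly) can be sketched as follows. This is an illustration only: ManagerSketch is a hypothetical stand-in for LabelingManager, resolve_manager is not part of the library, and raising ValueError for an unknown string is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class ManagerSketch:
    """Hypothetical stand-in for LabelingManager, keeping only one field."""
    label_attribute: str


# Mirrors the two pre-configured strategies named in the signature above
DEFAULT_MANAGERS: Dict[str, ManagerSketch] = {
    "species_type": ManagerSketch("species_type"),
    "node_type": ManagerSketch("node_type"),
}


def resolve_manager(
    label_type: Union[str, ManagerSketch],
    managers: Dict[str, ManagerSketch] = DEFAULT_MANAGERS,
) -> ManagerSketch:
    """Return the manager to use for a given label_type (hypothetical helper)."""
    if isinstance(label_type, ManagerSketch):
        # A manager passed directly takes precedence; the dictionary is ignored
        return label_type
    if label_type not in managers:
        # Assumption: unknown strings are rejected
        raise ValueError(f"unknown label_type: {label_type!r}")
    return managers[label_type]


print(resolve_manager("node_type").label_attribute)  # node_type
```

Passing a custom manager bypasses the dictionary entirely, which is why labeling_managers is documented as ignored in that case.
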

napistu_torch.labels.create.encode_labels(labels: Series, task_type: str = 'classification', missing_value: int | float = None) → Tuple[torch.Tensor, Dict[int, any]] | torch.Tensor

Prepare labels for PyTorch Geometric based on task type.

This is a convenience wrapper that calls either _prepare_discrete_labels or _prepare_continuous_labels, depending on the task type.

Parameters:
  • labels (pd.Series) – Series containing labels

  • task_type ({'classification', 'regression'}, optional) – Type of task (default: 'classification')

  • missing_value (int or float, optional) – Value to use for missing entries. Defaults: -1 for classification, nan for regression

Returns:

  • For classification:

    encoded (torch.Tensor) – Integer-encoded labels

    lookup (Dict[int, Any]) – Mapping from integers to original labels

  • For regression:

    torch.Tensor – Continuous labels as a float tensor

Examples

>>> # Classification
>>> labels = pd.Series(['A', 'B', 'A', None])
>>> encoded, lookup = encode_labels(labels, task_type='classification')
>>> # Regression
>>> labels = pd.Series([1.5, 2.3, np.nan, 4.1])
>>> tensor = encode_labels(labels, task_type='regression')
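
The task-dependent default for missing_value described above can be sketched in a few lines. This is a hedged illustration rather than the library's code: resolve_missing_value and DEFAULT_MISSING are hypothetical names, and rejecting unknown task types with ValueError is an assumption.

```python
import math
from typing import Optional, Union

# Documented defaults: -1 for classification, nan for regression
DEFAULT_MISSING = {"classification": -1, "regression": math.nan}


def resolve_missing_value(
    task_type: str = "classification",
    missing_value: Optional[Union[int, float]] = None,
) -> Union[int, float]:
    """Pick the sentinel for missing labels (hypothetical helper)."""
    if task_type not in DEFAULT_MISSING:
        # Assumption: only the two documented task types are accepted
        raise ValueError(f"unsupported task_type: {task_type!r}")
    # An explicit value wins; otherwise fall back to the per-task default
    return DEFAULT_MISSING[task_type] if missing_value is None else missing_value


print(resolve_missing_value("classification"))   # -1
print(resolve_missing_value("regression", 0.0))  # 0.0
```

Using None as the signature default lets the wrapper distinguish "not provided" from an explicit sentinel such as 0.
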