BINN

This is the API reference for the BINN package. For usage examples, see Examples. Note that the API is still stabilizing and may change.

BINN

Bases: Module

A biologically informed neural network (BINN) in pure PyTorch.

If heads_ensemble=False, we build a standard sequential network with layer-to-layer connections.

If heads_ensemble=True, we build an 'ensemble of heads' network: each hidden layer also produces a separate head (dimension = n_outputs) which is passed through a sigmoid, then summed at the end.
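For intuition, the sum-of-heads aggregation can be sketched in plain Python. This is a simplified illustration only: the real heads are linear layers applied to each hidden representation, but the final combination step is the same shape.

```python
import math

def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))

def sum_of_heads(head_logits):
    """Each hidden layer's head emits n_outputs logits; sigmoid each
    head's logits, then sum element-wise across heads."""
    n_outputs = len(head_logits[0])
    return [
        sum(sigmoid(head[j]) for head in head_logits)
        for j in range(n_outputs)
    ]

# Three hidden layers, n_outputs = 2: one row of logits per head.
heads = [[0.0, 2.0], [1.0, -1.0], [-2.0, 0.5]]
out = sum_of_heads(heads)
```

Because every head contributes a bounded (sigmoid) vote, earlier layers influence the final prediction directly rather than only through the layers above them.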

Parameters:

Name Type Description Default

data_matrix DataFrame

A DataFrame of input features (samples x features). If not needed, can be None.

None

network_source str

If "reactome", loads mapping and pathways from load_reactome_db(), ignoring any provided here.

None

input_source str

The identifier type of the input features, passed to load_reactome_db() when network_source="reactome".

'uniprot'

mapping DataFrame

A DataFrame describing how each input feature maps into the pathway graph. If None, the user must rely on network_source="reactome".

None

pathways DataFrame

A DataFrame describing the edges among pathway nodes.

None

entity_col str

Datamatrix: the column holding the entity identifiers in the data matrix file.

'Protein'

input_col str

Mapping: the input column in the mapping file. Should correspond to the entity column in the data matrix.

'input'

translation_col str

Mapping: the translation column in the mapping file.

'translation'

target_col str

Pathways: the target column in the pathways file.

'target'

source_col str

Pathways: the source column in the pathways file.

'source'

activation str

The activation function to use in each layer. Defaults to "tanh".

'tanh'

n_layers int

Number of layers in the network (depth). Defaults to 4.

4

n_outputs int

Dimension of the final output (e.g., 2 for binary classification). Defaults to 2.

2

dropout float

Dropout probability. Defaults to 0.

0

heads_ensemble bool

If True, build an ensemble-of-heads network. Otherwise, a standard MLP.

False

device str

The PyTorch device to place this model on. Defaults to "cpu".

'cpu'

Attributes:

Name Type Description
inputs List[str]

The list of input feature names derived from the first connectivity matrix.

layers Module

The built network (either standard sequential or ensemble-of-heads).

layer_names List[List[str]]

The node (feature) names for each layer, for interpretability.

connectivity_matrices List[DataFrame]

The adjacency (pruning) masks for each layer, derived from the pathway network.
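A minimal construction sketch, using toy mapping and pathways DataFrames with the default column names (the identifiers and graph below are made up for illustration; the `from binn import BINN` import path is an assumption, adjust to your install):

```python
import pandas as pd

# Two input proteins mapped into a tiny two-level pathway hierarchy.
# Column names follow the constructor defaults: entity_col="Protein",
# input_col="input", translation_col="translation",
# source_col="source", target_col="target".
mapping = pd.DataFrame({
    "input": ["P04637", "P38398"],
    "translation": ["pathway_a", "pathway_b"],
})
pathways = pd.DataFrame({
    "source": ["pathway_a", "pathway_b"],
    "target": ["root", "root"],
})
data_matrix = pd.DataFrame({"Protein": ["P04637", "P38398"]})

try:
    from binn import BINN  # requires the binn package (and torch)
    model = BINN(
        data_matrix=data_matrix,
        mapping=mapping,
        pathways=pathways,
        n_layers=2,
        n_outputs=2,
    )
except Exception:  # binn/torch not installed, or toy graph rejected
    model = None
```

With real data, passing network_source="reactome" instead of mapping/pathways loads the Reactome graph via load_reactome_db().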

Source code in binn/model/binn.py
class BINN(nn.Module):
    """
    A biologically informed neural network (BINN) in pure PyTorch.

    If `heads_ensemble=False`, we build a standard sequential network
    with layer-to-layer connections.

    If `heads_ensemble=True`, we build an 'ensemble of heads' network:
    each hidden layer also produces a separate head (dimension = n_outputs)
    which is passed through a sigmoid, then summed at the end.

    Args:
        data_matrix (pd.DataFrame, optional):
            A DataFrame of input features (samples x features). If not needed, can be None.
        network_source (str, optional):
            If "reactome", loads `mapping` and `pathways` from `load_reactome_db()`, ignoring the ones provided.
        input_source (str, optional):
            The identifier type of the input features, passed to `load_reactome_db()`. Defaults to "uniprot".
        mapping (pd.DataFrame, optional):
            A DataFrame describing how each input feature maps into the pathway graph.
            If None, the user must rely on `network_source="reactome"`.
        pathways (pd.DataFrame, optional):
            A DataFrame describing the edges among pathway nodes.
        entity_col (str, optional):
            **Datamatrix**: The column holding the entity identifiers in the data matrix file.
        input_col (str, optional):
            **Mapping**: The input column in the mapping file. Should correspond to the entity column in the data matrix.
        translation_col (str, optional):
            **Mapping**: The translation column in the mapping file.
        target_col (str, optional):
            **Pathways**: The target column in the pathways file.
        source_col (str, optional):
            **Pathways**: The source column in the pathways file.
        activation (str, optional):
            The activation function to use in each layer. Defaults to "tanh".
        n_layers (int, optional):
            Number of layers in the network (depth). Defaults to 4.
        n_outputs (int, optional):
            Dimension of the final output (e.g., 2 for binary classification). Defaults to 2.
        dropout (float, optional):
            Dropout probability. Defaults to 0.
        heads_ensemble (bool, optional):
            If True, build an ensemble-of-heads network. Otherwise, a standard MLP.
        device (str, optional):
            The PyTorch device to place this model on. Defaults to "cpu".


    Attributes:
        inputs (List[str]):
            The list of input feature names derived from the first connectivity matrix.
        layers (nn.Module):
            The built network (either standard sequential or ensemble-of-heads).
        layer_names (List[List[str]]):
            The node (feature) names for each layer, for interpretability.
        connectivity_matrices (List[pd.DataFrame]):
            The adjacency (pruning) masks for each layer, derived from the pathway network.
    """

    def __init__(
        self,
        data_matrix: pd.DataFrame = None,
        network_source: str = None,
        input_source: str = "uniprot",
        mapping: pd.DataFrame = None,
        pathways: pd.DataFrame = None,
        entity_col: str = "Protein",
        input_col: str = "input",
        translation_col: str = "translation",
        target_col: str = "target",
        source_col: str = "source",
        activation: str = "tanh",
        n_layers: int = 4,
        n_outputs: int = 2,
        dropout: float = 0,
        heads_ensemble: bool = False,
        device: str = "cpu",
    ):
        super().__init__()

        self.device = device
        self.to(self.device)

        self.n_layers = n_layers
        self.heads_ensemble = heads_ensemble

        # Build the pathway network from dataframes

        if network_source == "reactome":
            reactome_db = load_reactome_db(input_source=input_source)
            mapping = reactome_db["mapping"]
            pathways = reactome_db["pathways"]

        # Build connectivity from the pathway network
        pn = dataframes_to_pathway_network(
            data_matrix=data_matrix,
            pathway_df=pathways,
            mapping_df=mapping,
            input_col=input_col,
            target_col=target_col,
            source_col=source_col,
            entity_col=entity_col,
            translation_col=translation_col,
        )

        # The connectivity matrices for each layer
        self.connectivity_matrices = pn.get_connectivity_matrices(n_layers=n_layers)

        # Collect layer sizes
        layer_sizes = []
        self.layer_names = []

        # First matrix => input layer size
        mat_first = self.connectivity_matrices[0]
        in_features, _ = mat_first.shape
        layer_sizes.append(in_features)

        self.inputs = mat_first.index.tolist()  # feature names
        self.layer_names.append(mat_first.index.tolist())

        # Additional layers
        for mat in self.connectivity_matrices[1:]:
            i, _ = mat.shape
            layer_sizes.append(i)
            self.layer_names.append(mat.index.tolist())

        # Build actual layers
        if heads_ensemble:
            self.layers = _generate_ensemble_of_heads(
                layer_sizes,
                self.connectivity_matrices,
                activation=activation,
                n_outputs=n_outputs,
                bias=True,
            )
        else:
            self.layers = _generate_sequential(
                layer_sizes,
                self.connectivity_matrices,
                activation=activation,
                n_outputs=n_outputs,
                dropout=dropout,
                bias=True,
            )

        # Weight init
        self.apply(_init_weights)

        # Print device info
        print(f"\n[INFO] BINN is on device: {self.device}")

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """Standard forward pass; if heads_ensemble=True, sum-of-heads is used."""
        return self.layers(x)

forward(x)

Standard forward pass; if heads_ensemble=True, sum-of-heads is used.

Source code in binn/model/binn.py
def forward(self, x: torch.Tensor) -> torch.Tensor:
    """Standard forward pass; if heads_ensemble=True, sum-of-heads is used."""
    return self.layers(x)
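The forward pass simply applies the pre-built layers module; the biological pruning is baked in at construction time, when each layer's weights are masked by its connectivity matrix. A plain-Python sketch of one such masked linear step (illustrative only; the actual layers are built in PyTorch by the network generators, and the helper below is hypothetical):

```python
def masked_linear(x, weight, mask, bias):
    """y = (weight * mask) @ x + bias, with nested lists standing in
    for tensors. Zeros in the mask delete feature->node connections."""
    out = []
    for row_w, row_m, b in zip(weight, mask, bias):
        out.append(sum(w * m * xi for w, m, xi in zip(row_w, row_m, x)) + b)
    return out

# 3 input features -> 2 pathway nodes.
weight = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.5]]
mask = [[1, 1, 0], [0, 0, 1]]  # node 0 sees features 0,1; node 1 sees feature 2
bias = [0.0, 0.0]
x = [1.0, 2.0, 4.0]
y = masked_linear(x, weight, mask, bias)  # -> [-1.5, -2.0]
```

Each connectivity matrix in connectivity_matrices plays the role of mask for its layer, so only biologically plausible edges carry weight.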