BAITS.VDJ.tl.compute_grouped_index

BAITS.VDJ.tl.compute_grouped_index(df, group_by, Cgene_col, clone_col, groups, count_basis='location', loc_x_col='X', loc_y_col='Y', Umi_col=None, index='shannon_entropy')

Compute a diversity index (e.g., Shannon entropy) per group.

Parameters:
  • df (pandas.DataFrame) – Input dataframe containing clone data.

  • group_by (str) – Column name for primary grouping.

  • Cgene_col (str) – Column name for the chain, identified by its constant (C) gene.

  • clone_col (str) – Column containing clone identifiers.

  • groups (list of str) – Columns to group by for computing index.

  • count_basis (str, default='location') – Basis for counting clones: ‘location’ or ‘UMI’.

  • loc_x_col (str, default='X') – X-coordinate column (for location-based counts).

  • loc_y_col (str, default='Y') – Y-coordinate column (for location-based counts).

  • Umi_col (str, optional) – Column containing UMI counts (for UMI-based counts).

  • index (str, default='shannon_entropy') – Diversity index to compute. Supported indices include ‘shannon_entropy’, ‘renyi_entropy’, etc.

Returns:

DataFrame containing the computed index per group. For ‘renyi_entropy’, the result also includes an ‘alpha’ column.

Return type:

pandas.DataFrame
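
To illustrate what a per-group Shannon entropy with location-based counting can look like, here is a minimal pandas sketch. It is not the BAITS implementation: the helper names `shannon_entropy` and `grouped_shannon`, the natural-log base, and the example columns `sample` and `clone_id` are all assumptions made for illustration. Each clone is counted once per distinct (X, Y) position within a group, and the entropy is taken over those counts.

```python
import numpy as np
import pandas as pd


def shannon_entropy(counts):
    """Shannon entropy (natural log) of a vector of non-negative counts."""
    counts = np.asarray(counts, dtype=float)
    p = counts[counts > 0] / counts.sum()
    # H = sum_i p_i * log(1 / p_i)
    return float(np.sum(p * np.log(1.0 / p)))


def grouped_shannon(df, groups, clone_col, loc_x_col="X", loc_y_col="Y"):
    """Hypothetical helper: per-group entropy of location-based clone counts."""
    # Keep one row per (group, clone, location) so a spot is counted once.
    dedup = df.drop_duplicates(subset=[*groups, clone_col, loc_x_col, loc_y_col])
    # Number of distinct locations occupied by each clone within each group.
    counts = dedup.groupby([*groups, clone_col]).size()
    # Entropy of the clone-count distribution within each group.
    entropy = counts.groupby(level=groups).agg(shannon_entropy)
    return entropy.rename("shannon_entropy").reset_index()


# Toy data (hypothetical column names): sample A has two equally sized
# clones, sample B has a single clone.
df = pd.DataFrame(
    {
        "sample": ["A"] * 4 + ["B"] * 4,
        "clone_id": ["c1", "c1", "c2", "c2", "c1", "c1", "c1", "c1"],
        "X": [0, 1, 0, 1, 0, 1, 2, 3],
        "Y": [0, 0, 1, 1, 0, 0, 0, 0],
    }
)

result = grouped_shannon(df, groups=["sample"], clone_col="clone_id")
print(result)
```

In this toy input, sample A has two clones at two locations each, so its entropy is ln 2, while sample B contains a single clone and has entropy 0. A UMI-based variant would sum a UMI count column per clone instead of counting distinct locations.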