BAITS.VDJ.tl.aggregate_clone_df


BAITS.VDJ.tl.aggregate_clone_df(df, group_by, Cgene_col, clone_col, groups, count_basis='location', loc_x_col='X', loc_y_col='Y', Umi_col='UMI')

Aggregate clone counts and frequencies per group.

Parameters:
  • df (pandas.DataFrame) – Input dataframe containing clone data.

  • group_by (str) – Column name for primary grouping (e.g., sample).

  • Cgene_col (str) – Column name for chain (Cgene).

  • clone_col (str) – Column containing clone identifiers.

  • groups (list of str) – Columns to group by for aggregation.

  • count_basis (str, default='location') – Basis for counting: 'location' (uses the loc_x_col and loc_y_col coordinates) or 'UMI' (uses Umi_col).

  • loc_x_col (str, default='X') – X-coordinate column for location-based counting.

  • loc_y_col (str, default='Y') – Y-coordinate column for location-based counting.

  • Umi_col (str, default='UMI') – UMI count column for UMI-based counting.
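
The two count_basis options can be illustrated with plain pandas. This is only a sketch of one plausible reading, not the library's implementation: it assumes 'location' counts distinct (X, Y) positions per clone and 'UMI' sums the UMI column. The column names sample, Cgene, and clone_id are hypothetical; X, Y, and UMI follow the documented defaults.

```python
import pandas as pd

# Toy data. 'sample', 'Cgene', and 'clone_id' are hypothetical column names;
# 'X', 'Y', and 'UMI' match the documented default column names.
df = pd.DataFrame({
    "sample": ["s1", "s1", "s1", "s1"],
    "Cgene": ["IGHG1", "IGHG1", "IGHG1", "IGHG1"],
    "clone_id": ["c1", "c1", "c1", "c2"],
    "X": [0, 0, 1, 2],
    "Y": [0, 0, 1, 2],
    "UMI": [3, 2, 1, 4],
})

keys = ["sample", "Cgene", "clone_id"]

# Assumed 'location' basis: number of distinct (X, Y) positions per clone.
by_location = (
    df.drop_duplicates(keys + ["X", "Y"])
      .groupby(keys)
      .size()
      .rename("count")
      .reset_index()
)

# Assumed 'UMI' basis: total UMI count per clone.
by_umi = (
    df.groupby(keys)["UMI"]
      .sum()
      .rename("count")
      .reset_index()
)

print(by_location)
print(by_umi)
```

Under these assumptions, clone c1 occupies two distinct positions but carries six UMIs, so the two bases can rank clones differently.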

Returns:

Aggregated dataframe containing frequency ('freq') and count ('count') per clone per group.

Return type:

pandas.DataFrame
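
As a rough picture of the returned columns, the sketch below derives 'freq' from 'count' in plain pandas. It assumes that frequency means a clone's count divided by the total count within its group; the documentation above does not state the normalization, so treat this as an illustration only. The sample and clone_id column names are hypothetical.

```python
import pandas as pd

# Hypothetical per-clone counts, shaped like the documented output
# (one row per clone per group, with a 'count' column).
counts = pd.DataFrame({
    "sample": ["s1", "s1", "s2"],
    "clone_id": ["c1", "c2", "c1"],
    "count": [6, 4, 5],
})

# Assumption: 'freq' is the clone's share of the total count in its group.
group_total = counts.groupby("sample")["count"].transform("sum")
counts["freq"] = counts["count"] / group_total

print(counts)

# A call with the documented signature might look like this
# (not executed here; argument values are hypothetical):
# result = BAITS.VDJ.tl.aggregate_clone_df(
#     df, group_by="sample", Cgene_col="Cgene",
#     clone_col="clone_id", groups=["sample", "Cgene"],
#     count_basis="UMI",
# )
```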