BAITS.VDJ.tl.calculate_cdr3_length

BAITS.VDJ.tl.calculate_cdr3_length(df, sample_col, Cgene_col, cdr3_col, cdr3_type='nt', plot=True, figsize=(9, 3))

Calculate the CDR3 length for each clone and optionally plot the length distribution.

Parameters:
  • df (pandas.DataFrame) – Input dataframe containing clone and CDR3 information.

  • sample_col (str) – Column name for sample or library.

  • Cgene_col (str) – Column name for the chain (Cgene).

  • cdr3_col (str) – Column containing CDR3 sequences.

  • cdr3_type (str, default='nt') – Type of CDR3 sequence: ‘nt’ for nucleotide or ‘aa’ for amino acid.

  • plot (bool, default=True) – Whether to plot CDR3 length distribution.

  • figsize (tuple, default=(9,3)) – Figure size for the plot.

Returns:

Original dataframe with an additional ‘cdr3_length’ column.

Return type:

pandas.DataFrame
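
Example:

The following is a minimal sketch of the kind of result described above, written with plain pandas rather than BAITS itself. The helper name cdr3_length_sketch, the example column names, and the toy sequences are hypothetical. It assumes that the length is the character count of the CDR3 string and that the input is copied rather than modified in place; the source does not state how BAITS handles either point, how cdr3_type affects the calculation, or how missing sequences are treated.

```python
import pandas as pd


def cdr3_length_sketch(df, cdr3_col):
    """Hypothetical stand-in: add a 'cdr3_length' column holding the
    character count of each CDR3 string (an assumption, not BAITS code)."""
    out = df.copy()  # assumption: leave the caller's dataframe untouched
    out["cdr3_length"] = out[cdr3_col].str.len()
    return out


# Toy clone table; the column names and sequences are illustrative only.
clones = pd.DataFrame(
    {
        "sample": ["S1", "S1", "S2"],
        "Cgene": ["IGHM", "IGHG1", "IGHM"],
        "cdr3_nt": ["TGTGCGAGAGAT", "TGTGCGAAA", "TGTGCGAGAGAT"],
    }
)

result = cdr3_length_sketch(clones, "cdr3_nt")

# Per-sample, per-chain length counts: the kind of table a length
# distribution plot could be drawn from.
length_counts = (
    result.groupby(["sample", "Cgene", "cdr3_length"])
    .size()
    .reset_index(name="n_clones")
)
print(result)
print(length_counts)
```

With the real function, the equivalent call would presumably pass the same dataframe along with sample_col='sample', Cgene_col='Cgene', and cdr3_col='cdr3_nt', and set plot=False to skip the figure.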