scmagnify.tools.extract_regfactor_genes

scmagnify.tools.extract_regfactor_genes(data, regfactor_key='regfactors', mode='TF', threshold=0.0, n_top=None, percentile=None, plot=False, ncols=3, figsize=(15, 8), bins=30, kde=True, palette=None, context=None, font_scale=1, default_context=None, theme='whitegrid', save=None, show=None)

Extract TFs or TGs with high loadings for each RegFactor and optionally plot their distributions.

Parameters:
  • data (AnnData | MuData | GRNMuData) – Single-cell data object. Can be an anndata.AnnData, a mudata.MuData, or a scmagnify.GRNMuData.

  • threshold (float (default: 0.0)) – The minimum loading value to include a gene (used if n_top and percentile are None).

  • n_top (Optional[int] (default: None)) – The number of top genes to extract for each RegFactor.

  • percentile (Optional[float] (default: None)) – The percentile of loadings to use as a threshold.

  • regfactor_key (str (default: 'regfactors')) – The key in data.uns where the RegFactor loadings are stored.

  • mode (str (default: 'TF')) – Which gene set to extract from: ‘TF’ (transcription factors) or ‘TG’ (target genes).

  • plot (bool (default: False)) – Whether to plot the distribution of loadings with thresholds.

  • ncols (int (default: 3)) – Number of columns in the plot grid.

  • figsize (tuple[int, int] (default: (15, 8))) – Size of the figure.

  • bins (int (default: 30)) – Number of bins for the histogram.

  • kde (bool (default: True)) – Whether to overlay a KDE on the histogram.

  • palette (Optional[str] (default: None)) – Color palette for the plots.

  • context (Optional[str] (default: None)) – Seaborn context for the plots.

  • font_scale (float | None (default: 1)) – Scaling factor for fonts in the plots.

  • default_context (Optional[dict] (default: None)) – Default context settings for the plots.

  • theme (str | None (default: 'whitegrid')) – Seaborn theme for the plots.

  • save (Union[bool, str, None] (default: None)) – Whether to save the plot. If a string is provided, it is used as the filename.

  • show (Optional[bool] (default: None)) – Whether to display the plot.
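
The three selection parameters (threshold, n_top, percentile) can be illustrated with a small pandas sketch. This is not the scmagnify implementation: `select_genes` is a hypothetical helper, the gene names and loadings are made up, the precedence of n_top over percentile is an assumption, and so are the 0-100 percentile scale and the inclusive (`>=`) comparison.

```python
import pandas as pd


def select_genes(loadings, threshold=0.0, n_top=None, percentile=None):
    """Hypothetical helper: pick high-loading genes for one RegFactor."""
    if n_top is not None:
        # Keep the n_top genes with the largest loadings.
        return loadings.nlargest(n_top)
    if percentile is not None:
        # Assumption: percentile is on a 0-100 scale; convert it to a cutoff.
        cutoff = loadings.quantile(percentile / 100)
        return loadings[loadings >= cutoff]
    # Fallback: fixed minimum loading (assumed inclusive).
    return loadings[loadings >= threshold]


# Made-up loadings of five genes on a single RegFactor.
loadings = pd.Series(
    {"GATA1": 0.9, "TAL1": 0.6, "KLF1": 0.3, "SPI1": 0.0, "CEBPA": -0.2}
)

top_two = select_genes(loadings, n_top=2)
above_p40 = select_genes(loadings, percentile=40)
above_half = select_genes(loadings, threshold=0.5)
```

Under these assumptions, `n_top=2` and `threshold=0.5` both keep GATA1 and TAL1, while `percentile=40` also keeps KLF1.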

Return type:

dict[str, DataFrame]

Returns:

A dictionary where keys are RegFactor names and values are DataFrames of genes with high loadings.
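
A result of this shape (a dict of DataFrames keyed by RegFactor name) can be flattened into one long table for downstream use. The sketch below works on a mock dictionary rather than real output: the RegFactor names, the gene-name index, and the `loading` column are all assumptions about the layout, not documented facts.

```python
import pandas as pd

# Mock return value; the real DataFrames may use different columns or index.
result = {
    "RegFactor_1": pd.DataFrame({"loading": [0.9, 0.6]}, index=["GATA1", "TAL1"]),
    "RegFactor_2": pd.DataFrame({"loading": [0.8, 0.5]}, index=["SPI1", "GATA1"]),
}

# Stack the per-factor tables into one long table tagged by RegFactor name.
combined = pd.concat(result, names=["regfactor", "gene"]).reset_index()

# Genes that were selected for more than one RegFactor.
counts = combined["gene"].value_counts()
shared = sorted(counts[counts > 1].index)
```

Here `combined` has one row per (RegFactor, gene) pair, and `shared` lists genes that appear under several RegFactors.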