scmagnify.tools.connect_peaks_genes

scmagnify.tools.connect_peaks_genes(data, meta_mdata, gene_selected=None, rna_key='RNA', atac_key='ATAC', path_to_gtf=None, span=100000, n_rand_samples=100, cor_cutoff=0.1, pval_cutoff=0.1, n_jobs=1, save_tmp=False)

Calculate the correlation between ATAC-seq peaks and gene expression for a list of genes.

Parameters:
  • gene_selected (Optional[list[str]] (default: None)) – List of gene names.

  • meta_mdata (MuData) – MuData object with the multi-omics data.

  • rna_key (str (default: 'RNA')) – Key for the RNA data in the MuData object, by default “RNA”.

  • atac_key (str (default: 'ATAC')) – Key for the ATAC data in the MuData object, by default “ATAC”.

  • path_to_gtf (Optional[str] (default: None)) – Path to the GTF file. An hg38 annotation can be downloaded from https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/genes/hg38.gtf.gz

  • span (int (default: 100000)) – Span around the gene to consider, by default 100000

  • n_rand_samples (int (default: 100)) – Number of random samples to calculate the correlation, by default 100

  • cor_cutoff (float (default: 0.1)) – Correlation cutoff, by default 0.1

  • pval_cutoff (float (default: 0.1)) – P-value cutoff, by default 0.1

  • save_tmp (bool (default: False)) – Save the results to a temporary file, by default False

  • data (AnnData | MuData)

  • n_jobs (int)
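
To illustrate what the span parameter might control, here is a minimal sketch of selecting the peaks that fall within a window around a gene. It assumes the window is the gene body extended by span on each side, which this page does not confirm. The helper name peaks_near_gene and the toy coordinates are hypothetical and are not part of scmagnify.

```python
# Hypothetical helper for illustration only; not part of the scmagnify API.
# Assumption: the window is the gene body extended by `span` on both sides.
def peaks_near_gene(peaks, gene_start, gene_end, span=100_000):
    """Return IDs of peaks overlapping [gene_start - span, gene_end + span]."""
    window_start = max(0, gene_start - span)
    window_end = gene_end + span
    return [
        peak_id
        for peak_id, (start, end) in peaks.items()
        if start <= window_end and end >= window_start
    ]


# Toy peaks, all on one chromosome: peak ID -> (start, end)
peaks = {
    "chr1:50000-50500": (50_000, 50_500),
    "chr1:180000-180500": (180_000, 180_500),
    "chr1:400000-400500": (400_000, 400_500),
}

# Toy gene spanning 200,000-220,000 with the default span of 100,000
nearby = peaks_near_gene(peaks, gene_start=200_000, gene_end=220_000, span=100_000)
```

With these toy values only the middle peak lies inside the window. Widening span to 200000 would bring all three peaks into consideration.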

Return type:

AnnData | MuData

Returns:

gene_peak_correlations

Series with the correlation between ATAC-seq peaks and gene expression.

Index: Gene name

Values: DataFrame with the correlation between ATAC-seq peaks and gene expression. Columns:

  • Peak_ID (str)

  • Correlation (float)

  • P-value (float)
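
The columns above can be pictured with a small standard-library sketch. This page does not state how the P-value is derived, so the sketch swaps in a simple permutation null: the peak's accessibility values are shuffled n_rand_samples times and the observed Pearson correlation is compared against the shuffled ones. The function names and toy values are hypothetical, and scmagnify's actual procedure may differ.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def correlate_with_null(expr, access, n_rand_samples=100, seed=0):
    """Return (correlation, empirical one-sided p-value) under a permutation null.

    Assumption: a permutation null stands in for whatever background
    scmagnify samples internally.
    """
    rng = random.Random(seed)
    observed = pearson(expr, access)
    shuffled = list(access)
    hits = 0
    for _ in range(n_rand_samples):
        rng.shuffle(shuffled)
        if pearson(expr, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly above zero.
    return observed, (hits + 1) / (n_rand_samples + 1)


# Toy data: one gene's expression and two peaks' accessibility across 8 cells
expr = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
peaks = {
    "peak_a": [1.1, 1.9, 3.2, 3.8, 5.1, 6.2, 6.9, 8.3],
    "peak_b": [4.0, 1.0, 6.0, 3.0, 8.0, 2.0, 7.0, 5.0],
}

rows = []
for peak_id, access in peaks.items():
    cor, pval = correlate_with_null(expr, access, n_rand_samples=100)
    rows.append({"Peak_ID": peak_id, "Correlation": cor, "P-value": pval})
```

Each entry in rows mirrors one row of the per-gene table: a peak identifier, its correlation with the gene, and a p-value.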

data.uns["filtered_peak_gene_corrs"]

Series with the filtered peak-gene correlations.

Index: Peak_ID

Values: Gene
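
As a rough picture of how cor_cutoff and pval_cutoff could turn the per-gene tables into a peak-to-gene mapping, the following sketch keeps a link when its correlation exceeds cor_cutoff and its p-value is below pval_cutoff. The strict inequalities, the function name filter_peak_gene_corrs, and the toy genes and values are assumptions made for illustration. A plain dict stands in for the Series.

```python
# Hypothetical helper for illustration only; not part of the scmagnify API.
# Assumption: a link is kept when Correlation > cor_cutoff and P-value < pval_cutoff.
def filter_peak_gene_corrs(gene_peak_correlations, cor_cutoff=0.1, pval_cutoff=0.1):
    """Map Peak_ID -> gene for links that pass both cutoffs."""
    filtered = {}
    for gene, rows in gene_peak_correlations.items():
        for row in rows:
            if row["Correlation"] > cor_cutoff and row["P-value"] < pval_cutoff:
                # A dict keeps one gene per peak; a Series could repeat index labels.
                filtered[row["Peak_ID"]] = gene
    return filtered


# Toy per-gene tables with the documented columns
gene_peak_correlations = {
    "GATA1": [
        {"Peak_ID": "peak_1", "Correlation": 0.45, "P-value": 0.01},
        {"Peak_ID": "peak_2", "Correlation": 0.05, "P-value": 0.02},  # fails cor_cutoff
        {"Peak_ID": "peak_3", "Correlation": 0.30, "P-value": 0.40},  # fails pval_cutoff
    ],
    "TAL1": [
        {"Peak_ID": "peak_4", "Correlation": 0.22, "P-value": 0.05},
    ],
}

filtered = filter_peak_gene_corrs(gene_peak_correlations)
```

With the default cutoffs of 0.1, only peak_1 and peak_4 survive in this toy input.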