Adding peak edges to network -- Add feature request #111

Open
mriken opened this issue Jul 13, 2021 · 6 comments

mriken commented Jul 13, 2021

Hi,
I find ANANSE rather useful in inferring regulatory interactions. I would like to use it to connect the peaks to the target genes directly. Is there a way to do that? Can I somehow extract the peaks from binding.tsv that are contributing to the regulation inferred in network.txt?
Or, would it be possible to modify the output of the ananse network command to also return the peak region used to infer the TF-target gene link?

thanks

mriken changed the title from "Adding peak edges to netowrk -- Add feature request" to "Adding peak edges to network -- Add feature request" on Jul 13, 2021

Member

simonvh commented Jul 14, 2021

Hi @mriken, I'm not sure there is a way to do that. I guess what would be possible is to get the peak to gene mapping that is used to create the network, which will show which peak is associated with which gene, and what the weight of that association is. Would that be what you're interested in?

Author

mriken commented Jul 15, 2021

Hi @simonvh,

thanks for your reply.
That would already be terrific, yes.

cheers

Member

simonvh commented Jul 15, 2021

We may add this to ANANSE in a later release. Meanwhile, this would be the Python code to do this. Just update fname (line 5) to the location of your binding.h5 file, save this as peaks2enhancer.py, and run it with python peaks2enhancer.py (in your ananse environment).

import pyranges as pr
import pandas as pd
from ananse.network import Network

fname = "/path/to/binding.h5"

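# Read the "_index" table from binding.h5; its index holds the peak regions
with pd.HDFStore(fname) as hdf:
    peaks = hdf.get("_index")

# Split "chrom:start-end" region strings into the columns PyRanges expects
enhancer_pr = pr.PyRanges(
    peaks
    .index.to_series()
    .str.split(r"[:-]", expand=True)
    .rename(columns={0: "Chromosome", 1: "Start", 2: "End"})
)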

n = Network(genome="hg38") # If genome is not hg38 or mm10 you need to specify gene_bed as well!

# Link enhancers to genes on basis of distance to annotated TSS
gene_df = n.enhancer2gene(
    enhancer_pr,
)
gene_df = gene_df.dropna()

gene_df.to_csv("enhancer2gene.txt", sep="\t")
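
Once enhancer2gene.txt exists, a follow-up along these lines could pull out the peaks linked to a given gene. This is only a sketch: the column names loc, gene and weight are assumptions that are not confirmed in this thread, so check the header of your own file and pass the right names. The toy table below is made up and stands in for the real file.

```python
import csv
import io
from collections import defaultdict


def peaks_by_gene(handle, gene_col="gene", peak_col="loc", weight_col="weight"):
    """Group (peak, weight) pairs by gene from a tab-separated table.

    The default column names are hypothetical; pass the ones in your file.
    """
    grouped = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        grouped[row[gene_col]].append((row[peak_col], float(row[weight_col])))
    # Within each gene, list the highest-weight peaks first
    for pairs in grouped.values():
        pairs.sort(key=lambda pair: pair[1], reverse=True)
    return dict(grouped)


# Toy stand-in for enhancer2gene.txt (made-up values, not real ANANSE output)
toy = io.StringIO(
    "loc\tgene\tweight\n"
    "chr1:1000-1200\tGENE_A\t0.25\n"
    "chr1:5000-5200\tGENE_A\t0.90\n"
    "chr2:300-500\tGENE_B\t0.50\n"
)
links = peaks_by_gene(toy)
print(links["GENE_A"])
```

With the real output you would pass open("enhancer2gene.txt") instead of the toy table.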

Author

mriken commented Jul 16, 2021

Wonderful, much appreciated!!

However, after running ananse binding I do not see the binding.h5 file in the output directory at all. I only have the files binding.tsv and factor_activity.tsv, other than the atac/h3k27ac tsv files.

Is this in the development version of ananse, perhaps? Or does it mean that the prediction is incomplete?
I don't see any error messages printed on screen when I run it.

Member

simonvh commented Jul 16, 2021

Ah, sorry! This is indeed in the latest version, which was just released yesterday (0.3.0). This version is faster for ananse network and uses less memory.
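
If binding.h5 is missing, checking the installed version can rule out an older release. A minimal sketch, assuming the package is installed under the distribution name ananse and that 0.3.0 is the first release that writes binding.h5, as mentioned above. The helper names are made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn a string like '0.3.0' into (0, 3, 0), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def writes_binding_h5(version_text):
    """Assumption from this thread: binding.h5 is produced from 0.3.0 onwards."""
    return parse_version(version_text) >= (0, 3, 0)


def installed_ananse_version():
    """Return the installed ananse version string, or None if it is not installed."""
    try:
        return version("ananse")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_ananse_version()
    if found is None:
        print("ananse is not installed in this environment")
    elif writes_binding_h5(found):
        print(f"ananse {found}: ananse binding should write binding.h5")
    else:
        print(f"ananse {found}: older than 0.3.0, expect binding.tsv instead")
```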

Author

mriken commented Jul 16, 2021

Cool, I'll upgrade then and try.
Cheers
