use malloc #303

Draft
wants to merge 2 commits into
base: develop

@siebrenf (Member) commented Mar 17, 2023

The C function pfmscan() now stores the intermediate scores on the heap instead of the stack (this is my first tangle with C).

With my test FASTA, two arrays are created that are far larger than the stack size. We could store these arrays on the heap, at the cost of some speed. However, if the sequences are normally not that long, we could throw an error or warning instead.
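To make the two options concrete, here is a minimal sketch in C. It is not the actual pfmscan() code: the function name alloc_scores and the limit MAX_SEQ_LEN are hypothetical, chosen only for illustration. It allocates one score per sequence position on the heap and returns NULL for sequences above the limit, so the caller can raise an error or warning.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical limit, for illustration only; not a value from gimmemotifs. */
#define MAX_SEQ_LEN 1000000

/*
 * Allocate one score per sequence position on the heap.
 * Returns NULL if seq_len is zero, exceeds MAX_SEQ_LEN, or malloc fails,
 * so the caller can report an error or warning instead of crashing.
 * The caller must free() the returned pointer.
 */
double *alloc_scores(size_t seq_len)
{
    if (seq_len == 0 || seq_len > MAX_SEQ_LEN) {
        return NULL;
    }

    double *scores = malloc(seq_len * sizeof *scores);
    if (scores == NULL) {
        return NULL;
    }

    /* malloc does not initialise memory, so zero the scores explicitly. */
    for (size_t i = 0; i < seq_len; i++) {
        scores[i] = 0.0;
    }
    return scores;
}
```

A caller would check the result for NULL before use and call free() on it once the scores are no longer needed.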

Note: One thing that I could not get to work is deciding at runtime whether to use the stack or the heap (a variable declared inside an if-else block in C goes out of scope at the end of that block, so it cannot be used afterwards).
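One common way around this scoping limit, sketched here under assumptions rather than taken from pfmscan(), is to declare a fixed-size stack buffer and a pointer before the if, then point the pointer at a heap allocation only when the data does not fit. The names sum_with_scratch and STACK_SCORES are hypothetical, and the summing loop merely stands in for the real scoring work.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical threshold: buffers up to this many doubles stay on the stack. */
#define STACK_SCORES 1024

/*
 * Sum the values 0..n-1 using a scratch buffer that lives on the stack for
 * small n and on the heap for large n. Returns -1.0 if allocation fails.
 */
double sum_with_scratch(size_t n)
{
    double stack_buf[STACK_SCORES]; /* always declared, used only if n fits */
    double *scores = stack_buf;     /* declared before the if, valid after it */
    int on_heap = 0;

    if (n > STACK_SCORES) {
        scores = malloc(n * sizeof *scores);
        if (scores == NULL) {
            return -1.0;
        }
        on_heap = 1;
    }

    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        scores[i] = (double)i;
        total += scores[i];
    }

    /* Only free memory that came from malloc, never the stack buffer. */
    if (on_heap) {
        free(scores);
    }
    return total;
}
```

The trade-off is that the stack buffer is reserved on every call, even when the heap path is taken, so the threshold has to stay well below the stack size.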

How to test:

Command line:

git clone git@github.com:vanheeringen-lab/gimmemotifs.git
cd gimmemotifs
git checkout option_1
python setup.py build && pip install .

Python:

from gimmemotifs.motif import Motif, read_motifs
from gimmemotifs.fasta import Fasta


jaspar_motif_file = "data/motif_databases/JASPAR2022_vertebrates.pfm"
motifs = read_motifs(jaspar_motif_file)
print(len([m for m in motifs if m.id.startswith("MA0046.2")]))  # prints 1

m = [m for m in motifs if m.id.startswith("MA0046.2")][0]
print(m.id)  # prints 'MA0046.2_MA0046.2.HNF1A'

ff = "test/data/scan/genome/scan_test.fa"  # segfaults when the score arrays are on the stack

f = Fasta(ff)
m.scan_all(f)
