Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A way to mark C extensions as thread-safe, so they can be executed in parallel #2136

Open
eregon opened this issue Nov 5, 2020 · 10 comments

Comments

@eregon
Copy link
Member

eregon commented Nov 5, 2020

TruffleRuby supports C extensions, but for scalability it is important to run at least some of them in parallel (e.g., HTTP parsing in Puma). This was notably mentioned in my RubyKaigi talk.
So we need a way to mark a given C extension as thread-safe, so we know we do not need to acquire a global lock when executing its methods.
More details on the MRI tracker: https://bugs.ruby-lang.org/issues/17307

Currently there is the experimental option --cexts-lock=false but that's global so safe only if we know all the extensions used are thread-safe.

@eregon eregon self-assigned this Nov 5, 2020
@aardvark179
Copy link
Contributor

aardvark179 commented Nov 10, 2020

There are two main places where we would need to fix things, when creating the stubs that wrap C methods, and when making calls into the Ruby runtime which would release and reacquire the lock. I think we can do this through a macro (call it __TRUFFLERUBY_NO_MUTEX__) which can be set to 1 before including our standard headers to indicate that this extension should not use locks, but we would then have to rename a large number of standard functions based on this as they would have to behave slightly differently. The other option would be to use some thread local to indicate whether a mutex was required, but this is unlikely to be as performant as we would really like.

@eregon
Copy link
Member Author

eregon commented Nov 10, 2020

I was thinking the main place where we take the lock is rb_define_method and friends, and those we can simply read from a thread-local when defining functions to know whether we define a variant acquiring the lock or not. But it seems there are quite a few other usages of Primitive.call_with_c_mutex*, we should review them.

@eregon
Copy link
Member Author

eregon commented Jul 19, 2021

Given https://bugs.ruby-lang.org/issues/17307#note-21
The way to mark C extensions as thread-safe should be:

rb_define_method(mod, "need C ext lock", func1, 0);

#ifdef HAVE_RB_EXT_THREAD_SAFE
  rb_ext_thread_safe(true);
#endif

rb_define_method(mod, "does not need C ext lock and can run in parallel", func2, 0);

#ifdef HAVE_RB_EXT_THREAD_SAFE
  rb_ext_thread_safe(false);
#endif

rb_define_method(mod, "need C ext lock", func3, 0);

@eregon
Copy link
Member Author

eregon commented Jul 19, 2021

@ioquatix says a way to define it explicitly per method would be nice, like:

rb_define_method(mod, "does not need C ext lock and can run in parallel", func2, RB_NO_EXT_LOCK(arity));

It's more verbose if many methods, but explicit vs implicit thread-local state. Maybe we can have both.

@ioquatix
Copy link
Contributor

ioquatix commented Jul 19, 2021

Here is a way to efficiently pack flags into args which gives us plenty of head room:

#include <limits.h>

const int MASK = 15;
const int NOGVL = 16; // space for ~26 more bit flags.

int combine(int count, int flags) {
    if (count >= 0) return count | flags;
    else return count & ~flags;
}

int count(int arity) {
    if (arity & INT_MIN) {
        return (arity | ~MASK); 
    } else {
        return (arity & MASK);
    }
}

int flags(int arity) {
    if (arity & INT_MIN) {
        return (~arity & ~MASK); 
    } else {
        return (arity & ~MASK);
    }
}

@eregon
Copy link
Member Author

eregon commented Jun 16, 2023

The plan here is to not use the C extensions lock if RB_EXT_RACTOR_SAFE is used for an extension (already used in some extensions), or if RB_EXT_THREAD_SAFE is used (would be new and less restrictive than RB_EXT_RACTOR_SAFE)

@loid345
Copy link

loid345 commented Mar 19, 2024

Is there any progress on this issue?

@eregon eregon assigned eregon and andrykonchin and unassigned eregon Jul 22, 2024
@eregon
Copy link
Member Author

eregon commented Jul 24, 2024

https://github.com/ruby/ruby/blob/993bb55d982f315f5137fc9b5d9d087e971270c2/doc/extension.rdoc#label-Appendix+F.+Ractor+support describes the conditions for rb_ext_ractor_safe(true).
We should have a similar document explaining the necessary conditions for RB_EXT_THREAD_SAFE.

It could be useful to also have a way to override the safe value for a given gem/extension, i.e. as if the Init_ function called rb_ext_ractor_safe(true)/rb_ext_thread_safe(true) for experimenting and when gems are known to be safe but not marked yet.
But extension names are not always nice, e.g. Init_puma_http11: https://github.com/puma/puma/blob/62f7cd6fbd9bffa3564cba0d111abe2d27f2b2dc/ext/puma_http11/puma_http11.c#L462 (which is in some lib/puma/3.2/puma_http11.so).

@mohamedhafez
Copy link
Contributor

mohamedhafez commented Sep 23, 2024

Totally agree with the previous post, for situations like ruby/nkf#19 (comment), or for when gem maintainers just haven't responded even though we know for certain their C-extension is safe. I'd suggest a config file where we can just list the gems for which we want to assume their C-extension(s) are threadsafe, but any way of being able to do so, even if it was listing paths or Init function names would be great!

@eregon
Copy link
Member Author

eregon commented Sep 25, 2024

We plan to work on this for the next release.
Related: #3672

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants