Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: memory_view: finalizer GC safety issue #5012

Closed
wants to merge 1 commit into from

Conversation

XrXr
Copy link
Member

@XrXr XrXr commented Oct 22, 2021

The finalizer of Fiddle::memory_view uses rb_memory_view_release(),
which traverses the super chain to find the memory view entry. Trouble
is, while finalizing object x, it's unsafe to access any object other
than x. Depending on finalization order, other objects might have been
finalized and released already.

This memory safety issue manifests as a crash on my machine reliably
when I run the following:

make test-all TESTS='test/fiddle/test_memory_view.rb -n/test_memory_view_from_pointer/'

On CI, there are many instances of this crash. For example:

This change checks for object liveness before access, however, I don't
think this fix is completely correct while it does avoid the crash.

In the event that the klass which stores the memory view entry is
finalized before rb_memory_view_release is called, the memory view entry
will be leaked. Also, it's possible that during finalization, the
address of the super class has been repurposed for a different object.

It seems that memory view needs to handle double-free coming from the
above situation. One free comes the class storing the entry going away,
and another when the Fiddle object finalizes.

cc: @ko1 @mrkn

The finalizer of Fiddle::memory_view uses rb_memory_view_release(),
which traverses the super chain to find the memory view entry. Trouble
is, while finalizing object x, it's unsafe to access any object other
than x. Depending on finalization order, other objects might have been
finalized and released already.

This memory safety issue manifests as a crash on my machine reliably
when I run the following:

    make test-all TESTS='test/fiddle/test_memory_view.rb -n/test_memory_view_from_pointer/'

On CI, there are many instances of this crash. For example:
 - http://rubyci.s3.amazonaws.com/osx1015/ruby-master/log/20211022T114502Z.fail.html.gz
 - http://rubyci.s3.amazonaws.com/solaris10-sunc/ruby-master/log/20211022T090002Z.fail.html.gz

This change checks for object liveness before access, however, I don't
think this fix is completely correct while it does avoid the crash.

In the event that the klass which stores the memory view entry is
finalized before rb_memory_view_release is called, the memory view entry
will be leaked. Also, it's possible that during finalization, the
address of the super class has been repurposed for a different object.

It seems that memory view needs to handle double-free coming from the
above situation. One free comes the class storing the entry going away,
and another when the Fiddle object finalizes.
@@ -781,8 +782,16 @@ rb_memory_view_get_item(rb_memory_view_t *view, const ssize_t *indices)
static const rb_memory_view_entry_t *
lookup_memory_view_entry(VALUE klass)
{
RUBY_ASSERT(rb_objspace_markable_object_p(klass));
Copy link
Member Author

@XrXr XrXr Oct 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even this assert fails. The "touch other objects during finalization" issue extends one level up to fiddle_memview_release(). When the wrapper object is finalizing, view->obj is already gone and T_NONE.

@kou
Copy link
Member

kou commented Oct 23, 2021

ruby/fiddle#79 may be related.

@@ -781,8 +782,16 @@ rb_memory_view_get_item(rb_memory_view_t *view, const ssize_t *indices)
static const rb_memory_view_entry_t *
lookup_memory_view_entry(VALUE klass)
{
RUBY_ASSERT(rb_objspace_markable_object_p(klass));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, we can and should do nothing?

Suggested change
RUBY_ASSERT(rb_objspace_markable_object_p(klass));
if (!rb_objspace_markable_object_p(klass)) return NULL;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but I don't think it covers all the cases. Since it's a stale pointer, it could be a live object and a completely unrelated class. We could leak or even end up getting an unrelated rb_memory_view_entry_t and use a bad release function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to free zombies after finalization calls?

diff --git a/gc.c b/gc.c
index 0c739ba709b..47f5cfc5b2a 100644
--- a/gc.c
+++ b/gc.c
@@ -4069,12 +4069,10 @@ run_final(rb_objspace_t *objspace, VALUE zombie)
 static void
 finalize_list(rb_objspace_t *objspace, VALUE zombie)
 {
+    VALUE next_zombie;
     while (zombie) {
-        VALUE next_zombie;
-        struct heap_page *page;
         asan_unpoison_object(zombie, false);
         next_zombie = RZOMBIE(zombie)->next;
-        page = GET_HEAP_PAGE(zombie);
 
 	run_final(objspace, zombie);
 
@@ -4084,20 +4082,28 @@ finalize_list(rb_objspace_t *objspace, VALUE zombie)
             if (FL_TEST(zombie, FL_SEEN_OBJ_ID)) {
                 obj_free_object_id(objspace, zombie);
             }
-
-            GC_ASSERT(heap_pages_final_slots > 0);
-            GC_ASSERT(page->final_slots > 0);
-
-            heap_pages_final_slots--;
-            page->final_slots--;
-            page->free_slots++;
-            heap_page_add_freeobj(objspace, page, zombie);
-            objspace->profile.total_freed_objects++;
         }
         RB_VM_LOCK_LEAVE();
 
         zombie = next_zombie;
     }
+
+    RB_VM_LOCK_ENTER();
+    while (zombie) {
+        struct heap_page *page;
+        next_zombie = RZOMBIE(zombie)->next;
+        page = GET_HEAP_PAGE(zombie);
+        GC_ASSERT(heap_pages_final_slots > 0);
+        GC_ASSERT(page->final_slots > 0);
+
+        heap_pages_final_slots--;
+        page->final_slots--;
+        page->free_slots++;
+        heap_page_add_freeobj(objspace, page, zombie);
+        objspace->profile.total_freed_objects++;
+        zombie = next_zombie;
+    }
+    RB_VM_LOCK_LEAVE();
 }
 
 static void

Copy link
Member Author

@XrXr XrXr Oct 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that fixes the crash on my machine. I was thinking of adding a rb_memory_view_entry_t * to rb_memory_view_t to avoid the lookup in rb_memory_view_release() but maybe with this change that won't be necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

zombie must be reset before the second loop.

diff --git a/gc.c b/gc.c
index bd3915fb470..c0bcbfb2a5b 100644
--- a/gc.c
+++ b/gc.c
@@ -4068,12 +4068,10 @@ run_final(rb_objspace_t *objspace, VALUE zombie)
 static void
 finalize_list(rb_objspace_t *objspace, VALUE zombie)
 {
+    VALUE next_zombie, first_zombie = zombie;
     while (zombie) {
-        VALUE next_zombie;
-        struct heap_page *page;
         asan_unpoison_object(zombie, false);
         next_zombie = RZOMBIE(zombie)->next;
-        page = GET_HEAP_PAGE(zombie);
 
 	run_final(objspace, zombie);
 
@@ -4083,20 +4081,29 @@ finalize_list(rb_objspace_t *objspace, VALUE zombie)
             if (FL_TEST(zombie, FL_SEEN_OBJ_ID)) {
                 obj_free_object_id(objspace, zombie);
             }
-
-            GC_ASSERT(heap_pages_final_slots > 0);
-            GC_ASSERT(page->final_slots > 0);
-
-            heap_pages_final_slots--;
-            page->final_slots--;
-            page->free_slots++;
-            heap_page_add_freeobj(objspace, page, zombie);
-            objspace->profile.total_freed_objects++;
         }
         RB_VM_LOCK_LEAVE();
 
         zombie = next_zombie;
     }
+
+    zombie = first_zombie;
+    RB_VM_LOCK_ENTER();
+    while (zombie) {
+        struct heap_page *page;
+        next_zombie = RZOMBIE(zombie)->next;
+        page = GET_HEAP_PAGE(zombie);
+        GC_ASSERT(heap_pages_final_slots > 0);
+        GC_ASSERT(page->final_slots > 0);
+
+        heap_pages_final_slots--;
+        page->final_slots--;
+        page->free_slots++;
+        heap_page_add_freeobj(objspace, page, zombie);
+        objspace->profile.total_freed_objects++;
+        zombie = next_zombie;
+    }
+    RB_VM_LOCK_LEAVE();
 }
 
 static void

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

zombie must be reset before the second loop.

I tried the new patch and it also fixes the crash.

@eregon
Copy link
Member

eregon commented Oct 25, 2021

The finalizer of Fiddle::memory_view uses rb_memory_view_release(),
which traverses the super chain to find the memory view entry. Trouble
is, while finalizing object x, it's unsafe to access any object other
than x. Depending on finalization order, other objects might have been
finalized and released already.

Shouldn't this work fine if the GC is correct and consider references that the finalizer can access?
I think that's how it'd work e.g. on JVM.

Also accessing x while it's being finalized is a problem in itself (it would never be finalized, AFAIK also an issue on CRuby). So finalizers always need to capture just what they need, and not the object itself.

@XrXr
Copy link
Member Author

XrXr commented Oct 25, 2021

Shouldn't this work fine if the GC is correct and consider references that the finalizer can access?

Sorry, I'm using the terminology loosely here. I think the normal Ruby finalizers does allow for accessing other objects, but the callbacks I'm talking about in the ticket are the ones for doing deallocation exposed in the C API. These callbacks have different constraints from Ruby level finalizers.

@eregon
Copy link
Member

eregon commented Oct 27, 2021

Rereading a bit more about the issue, maybe Fiddle::MemoryView shouldn't try to autorelease (or if it does use a Ruby-level finalizer)?
I think autorelease is anyway a bad pattern, efficient resource handling requires explicitly releasing (e.g., at the end of a block) and not waiting for the GC to do it much later if ever.

@kou
Copy link
Member

kou commented Oct 28, 2021

I think autorelease is anyway a bad pattern, efficient resource handling requires explicitly releasing (e.g., at the end of a block) and not waiting for the GC to do it much later if ever.

ruby/fiddle#80 is for it.

@sergiodj

This comment has been minimized.

@XrXr

This comment has been minimized.

@mrkn
Copy link
Member

mrkn commented Nov 6, 2021

I consulted with @ko1 about how to fix this issue yesterday.
As the consequence, we found that we can stop looking up memory view entry in class at rb_memory_view_release.
I'm trying this approach in #5088.

@XrXr
Copy link
Member Author

XrXr commented Nov 6, 2021

Avoiding the lookup altogether sounds good to me since I also had the same idea: #5012 (comment) :). Closing this in favor of #5088.

@XrXr XrXr closed this Nov 6, 2021
@XrXr XrXr deleted the memview-bug branch November 6, 2021 16:20
mrkn added a commit to mrkn/ruby that referenced this pull request Nov 6, 2021
To stop looking up the memory_view_entry from the class of the original
object of a memory_view in rb_memory_view_release, a pointer of
rb_memory_view_entry_t is added as a member of rb_memory_view_t.

Fix rubyGH-5012
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

6 participants