@bcardiff
Member

If the segfault handler runs in a thread the runtime knows nothing about, Fiber.current will not be initialized. Accessing it then triggers a GC allocation, which is not allowed there.

This was discovered in bcardiff/crystal-fswatch#5 (comment)

Without this change, an unknown thread (in this case one created on Darwin via dispatch_queue_create) will block:

    2566 Thread_5240282   DispatchQueue_23: fswatch_event_queue  (serial)
      2566 start_wqthread  (in libsystem_pthread.dylib) + 15  [0x7ff8106e8843]
        2566 _pthread_wqthread  (in libsystem_pthread.dylib) + 298  [0x7ff8106e9861]
          2566 _dispatch_workloop_worker_thread  (in libdispatch.dylib) + 688  [0x7ff8105514e3]
            2566 _dispatch_root_queue_drain_deferred_wlh  (in libdispatch.dylib) + 275  [0x7ff810551b96]
              2566 _dispatch_lane_invoke  (in libdispatch.dylib) + 382  [0x7ff810548dad]
                2566 _dispatch_lane_serial_drain  (in libdispatch.dylib) + 319  [0x7ff810548193]
                  2566 _dispatch_source_invoke  (in libdispatch.dylib) + 2207  [0x7ff8105547fa]
                    2566 _dispatch_continuation_pop  (in libdispatch.dylib) + 518  [0x7ff8105450c3]
                      2566 _dispatch_client_callout  (in libdispatch.dylib) + 6  [0x7ff8105579fc]
                        2566 receive_and_dispatch_rcv_msg  (in FSEvents) + 294  [0x7ff81a09fb00]
                          2566 FSEventsD2F_server  (in FSEvents) + 55  [0x7ff81a09bad7]
                            2566 _Xcallback_rpc  (in FSEvents) + 218  [0x7ff81a09bbce]
                              2566 implementation_callback_rpc  (in FSEvents) + 4862  [0x7ff81a09cf4f]
                                2566 fsw::fsevents_monitor::fsevents_callback(__FSEventStream const*, void*, unsigned long, void*, unsigned int const*, unsigned long long const*)  (in libfswatch.13.dylib) + 851  [0x1097ddbd5]
                                  2566 fsw::monitor::notify_events(std::vector<fsw::event> const&) const  (in libfswatch.13.dylib) + 1102  [0x1097d57d8]
                                    2566 libfsw_cpp_callback_proxy(std::vector<fsw::event> const&, void*)  (in libfswatch.13.dylib) + 476  [0x1097c8416]
                                      2566 ~procProc(Pointer(LibFSWatch::Cevent), UInt32, Pointer(Void), (Int32 | Nil))@src/session.cr:37  (in crystal-run-01_monitor_curdir.tmp) + 368  [0x1095ea930]  session.cr:41
                                        2566 *FSWatch::ThreadPortal(Slice(FSWatch::Event))@FSWatch::ThreadPortal(T)#send<Slice(FSWatch::Event)>:Int32  (in crystal-run-01_monitor_curdir.tmp) + 85  [0x1096986a5]  thread_portal.cr:25
                                          2566 *IO+@IO#write_bytes<Int32>:Nil  (in crystal-run-01_monitor_curdir.tmp) + 42  [0x109698a6a]  io.cr:916
                                            2566 *Int32@Int#to_io<IO+, IO::ByteFormat::LittleEndian:Module>:Nil  (in crystal-run-01_monitor_curdir.tmp) + 9  [0x1095ff579]  int.cr:845
                                              2566 *IO::ByteFormat::LittleEndian::encode<Int32, IO+>:Nil  (in crystal-run-01_monitor_curdir.tmp) + 724  [0x109668dd4]  byte_format.cr:123
                                                2566 *IO::FileDescriptor+@IO::Buffered#write<Slice(UInt8)>:Nil  (in crystal-run-01_monitor_curdir.tmp) + 175  [0x10964d7df]  buffered.cr:147
                                                  2566 *IO::FileDescriptor+@IO::FileDescriptor#unbuffered_write<Slice(UInt8)>:Nil  (in crystal-run-01_monitor_curdir.tmp) + 102  [0x10964aa56]  file_descriptor.cr:335
                                                    2566 *IO::FileDescriptor+@Crystal::System::FileDescriptor#system_write<Slice(UInt8)>:Int32  (in crystal-run-01_monitor_curdir.tmp) + 215  [0x10964ab67]  file_descriptor.cr:405
                                                      2566 *IO::FileDescriptor+@Crystal::System::FileDescriptor#event_loop:Crystal::EventLoop+  (in crystal-run-01_monitor_curdir.tmp) + 9  [0x10964a389]  file_descriptor.cr:47
                                                        2566 *Crystal::EventLoop::current:Crystal::EventLoop+  (in crystal-run-01_monitor_curdir.tmp) + 9  [0x10964df69]  event_loop.cr:43
                                                          2566 *Crystal::Scheduler::event_loop:Crystal::EventLoop+  (in crystal-run-01_monitor_curdir.tmp) + 9  [0x109652ef9]  scheduler.cr:25
                                                            2566 *Thread::current:Thread  (in crystal-run-01_monitor_curdir.tmp) + 9  [0x109652fb9]  thread.cr:211
                                                              2566 *Crystal::System::Thread::current_thread:Thread  (in crystal-run-01_monitor_curdir.tmp) + 54  [0x109653836]  pthread.cr:65
                                                                2566 *Thread::new:Thread  (in crystal-run-01_monitor_curdir.tmp) + 278  [0x1096530d6]  thread.cr:149
                                                                  2566 *Thread#initialize:Nil  (in crystal-run-01_monitor_curdir.tmp) + 81  [0x109653131]  thread.cr:152
                                                                    2566 *Fiber::new<Pointer(Void), Thread>:Fiber  (in crystal-run-01_monitor_curdir.tmp) + 97  [0x109651b21]  fiber.cr:137
                                                                      2566 *Fiber#initialize<Pointer(Void), Thread>:Nil  (in crystal-run-01_monitor_curdir.tmp) + 146  [0x109651bc2]  fiber.cr:150
                                                                        2566 *GC::current_thread_stack_bottom:Tuple(Pointer(Void), Pointer(Void))  (in crystal-run-01_monitor_curdir.tmp) + 17  [0x109618681]  boehm.cr:419
                                                                          2566 0x0
                                                                            2566 _sigtramp  (in libsystem_platform.dylib) + 29  [0x7ff81072531d]
                                                                              2566 ~procProc(Int32, Pointer(LibC::SiginfoT), Pointer(Void), Nil)@/usr/local/Cellar/crystal/1.19.1/share/crystal/src/crystal/system/unix/signal.cr:180  (in crystal-run-01_monitor_curdir.tmp) + 32  [0x1095e9e10]  signal.cr:193
                                                                                2566 *Fiber::current:Fiber  (in crystal-run-01_monitor_curdir.tmp) + 9  [0x109651cb9]  fiber.cr:211
                                                                                  2566 *Thread::current:Thread  (in crystal-run-01_monitor_curdir.tmp) + 9  [0x109652fb9]  thread.cr:211
                                                                                    2566 *Crystal::System::Thread::current_thread:Thread  (in crystal-run-01_monitor_curdir.tmp) + 54  [0x109653836]  pthread.cr:65
                                                                                      2566 *Thread::new:Thread  (in crystal-run-01_monitor_curdir.tmp) + 18  [0x109652fd2]  thread.cr:149
                                                                                        2566 __crystal_malloc64  (in crystal-run-01_monitor_curdir.tmp) + 17  [0x1095e3b91]  gc.cr:24
                                                                                          2566 *GC::malloc<UInt64>:Pointer(Void)  (in crystal-run-01_monitor_curdir.tmp) + 59  [0x10961859b]  boehm.cr:193
                                                                                            2566 GC_malloc_kind_global  (in libgc.1.5.5.dylib) + 49  [0x10980f745]
                                                                                              2566 _pthread_mutex_firstfit_lock_slow  (in libsystem_pthread.dylib) + 217  [0x7ff8106e85a1]
                                                                                                2566 _pthread_mutex_firstfit_lock_wait  (in libsystem_pthread.dylib) + 78  [0x7ff8106ea758]
                                                                                                  2566 __psynch_mutexwait  (in libsystem_kernel.dylib) + 10  [0x7ff8106ace72]

After this change, we get a crash with a stack trace instead:

No current fiber found while handling segfault. Probable unregistered thread
Invalid memory access (signal 11) at address 0x20
[0x1024e173c] *Exception::CallStack::print_backtrace:Nil +112 in /Users/bcardiff/.cache/crystal/crystal-run-issue-5.tmp
[0x1024b6bb4] ~procProc(Int32, Pointer(LibC::SiginfoT), Pointer(Void), Nil)@/Users/bcardiff/Projects/crystal/master/src/crystal/system/unix/signal.cr:180 +288 in /Users/bcardiff/.cache/crystal/crystal-run-issue-5.tmp
[0x185c1d6a4] _sigtramp +56 in /usr/lib/system/libsystem_platform.dylib
[0x10277ba3c] GC_get_my_stackbottom +56 in /opt/homebrew/Cellar/bdw-gc/8.2.10/lib/libgc.1.5.5.dylib
[0x1024e3524] *GC::current_thread_stack_bottom:Tuple(Pointer(Void), Pointer(Void)) +20 in /Users/bcardiff/.cache/crystal/crystal-run-issue-5.tmp
[0x1024f9d60] *Fiber#initialize<Pointer(Void), Thread>:Nil +120 in /Users/bcardiff/.cache/crystal/crystal-run-issue-5.tmp
[0x1024f9cd8] *Fiber::new<Pointer(Void), Thread>:Fiber +116 in /Users/bcardiff/.cache/crystal/crystal-run-issue-5.tmp
[0x1024faee8] *Thread#initialize:Nil +84 in /Users/bcardiff/.cache/crystal/crystal-run-issue-5.tmp
[0x1024fae84] *Thread::new:Thread +220 in /Users/bcardiff/.cache/crystal/crystal-run-issue-5.tmp
[0x1024fb6cc] *Crystal::System::Thread::current_thread:Thread +64 in /Users/bcardiff/.cache/crystal/crystal-run-issue-5.tmp
[0x1024fada0] *Thread::current:Thread +12 in /Users/bcardiff/.cache/crystal/crystal-run-issue-5.tmp
[0x102500510] *Crystal::Scheduler::event_loop:Crystal::EventLoop+ +12 in /Users/bcardiff/.cache/crystal/crystal-run-issue-5.tmp
[0x102501794] *Crystal::EventLoop::current:Crystal::EventLoop+ +12 in /Users/bcardiff/.cache/crystal/crystal-run-issue-5.tmp
[0x102531cc4] *IO::FileDescriptor+@Crystal::System::FileDescriptor#event_loop:Crystal::EventLoop+ +12 in /Users/bcardiff/.cache/crystal/crystal-run-issue-5.tmp
[0x102532538] *IO::FileDescriptor+@Crystal::System::FileDescriptor#system_write<Slice(UInt8)>:Int32 +232 in /Users/bcardiff/.cache/crystal/crystal-run-issue-5.tmp
[0x10253240c] *IO::FileDescriptor+@IO::FileDescriptor#unbuffered_write<Slice(UInt8)>:Nil +116 in /Users/bcardiff/.cache/crystal/crystal-run-issue-5.tmp
[0x10253522c] *IO::FileDescriptor+@IO::Buffered#write<Slice(UInt8)>:Nil +156 in /Users/bcardiff/.cache/crystal/crystal-run-issue-5.tmp
[0x102536258] *IO::ByteFormat::LittleEndian::encode<Int32, IO+>:Nil +884 in /Users/bcardiff/.cache/crystal/crystal-run-issue-5.tmp
[0x1024cf90c] *Int32@Int#to_io<IO+, IO::ByteFormat::LittleEndian:Module>:Nil +12 in /Users/bcardiff/.cache/crystal/crystal-run-issue-5.tmp
[0x102585204] *IO+@IO#write_bytes<Int32>:Nil +52 in /Users/bcardiff/.cache/crystal/crystal-run-issue-5.tmp
[0x10259a154] *FSWatch::ThreadPortal(Slice(FSWatch::Event))@FSWatch::ThreadPortal(T)#send<Slice(FSWatch::Event)>:Int32 +96 in /Users/bcardiff/.cache/crystal/crystal-run-issue-5.tmp
[0x1024b7de8] ~procProc(Pointer(LibFSWatch::Cevent), UInt32, Pointer(Void), (Int32 | Nil))@src/session.cr:45 +364 in /Users/bcardiff/.cache/crystal/crystal-run-issue-5.tmp
[0x102725880] _ZL25libfsw_cpp_callback_proxyRKNSt3__16vectorIN3fsw5eventENS_9allocatorIS2_EEEEPv +420 in /opt/homebrew/Cellar/fswatch/1.18.3/lib/libfswatch.13.dylib
[0x102733c50] _ZNK3fsw7monitor13notify_eventsERKNSt3__16vectorINS_5eventENS1_9allocatorIS3_EEEE +1048 in /opt/homebrew/Cellar/fswatch/1.18.3/lib/libfswatch.13.dylib
[0x10273c408] _ZN3fsw16fsevents_monitor17fsevents_callbackEPK15__FSEventStreamPvmS4_PKjPKy +756 in /opt/homebrew/Cellar/fswatch/1.18.3/lib/libfswatch.13.dylib
[0x18f411bb0] implementation_callback_rpc +3696 in /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/FSEvents.framework/Versions/A/FSEvents
[0x18f410cb8] _Xcallback_rpc +220 in /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/FSEvents.framework/Versions/A/FSEvents
[0x18f410bac] FSEventsD2F_server +68 in /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/FSEvents.framework/Versions/A/FSEvents
[0x18f4145bc] receive_and_dispatch_rcv_msg +340 in /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/FSEvents.framework/Versions/A/FSEvents
[0x185a4585c] _dispatch_client_callout +16 in /usr/lib/system/libdispatch.dylib
[0x185a305e0] _dispatch_continuation_pop +596 in /usr/lib/system/libdispatch.dylib
[0x185a43620] _dispatch_source_latch_and_call +396 in /usr/lib/system/libdispatch.dylib
[0x185a422f8] _dispatch_source_invoke +844 in /usr/lib/system/libdispatch.dylib
[0x185a341b8] _dispatch_lane_serial_drain +332 in /usr/lib/system/libdispatch.dylib
[0x185a34e2c] _dispatch_lane_invoke +388 in /usr/lib/system/libdispatch.dylib
[0x185a3f264] _dispatch_root_queue_drain_deferred_wlh +292 in /usr/lib/system/libdispatch.dylib
[0x185a3eae8] _dispatch_workloop_worker_thread +540 in /usr/lib/system/libdispatch.dylib
[0x185bdfe20] _pthread_wqthread +292 in /usr/lib/system/libsystem_pthread.dylib
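
To illustrate the difference between the two traces, here is a minimal sketch in C of a lenient per-thread lookup. It is an analogy, not the Crystal runtime code: `thread_ctx`, `register_current_thread`, `current_ctx_or_null` and `report_fault` are hypothetical names. The point is that the accessor used from the fault path returns NULL for a thread nobody registered instead of lazily creating state, and the diagnostic is emitted with `write(2)` only. It assumes thread-local access itself does not allocate, which depends on the platform's TLS model.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for the runtime's per-thread state. */
struct thread_ctx {
    const char *name;
};

/* Stays NULL on any thread the runtime never registered. */
static _Thread_local struct thread_ctx *current_ctx = NULL;

/* Called by the runtime on threads it creates itself. */
void register_current_thread(struct thread_ctx *ctx) {
    assert(ctx != NULL);
    current_ctx = ctx;
}

/* Lenient accessor: never allocates, returns NULL for unknown threads. */
struct thread_ctx *current_ctx_or_null(void) {
    return current_ctx;
}

/* Body of a fault handler: no allocation, output through write(2) only.
 * Returns 1 if the faulting thread is registered, 0 otherwise. */
int report_fault(int fd) {
    static const char unknown[] =
        "No current fiber found while handling segfault\n";
    static const char known[] = "fault on a registered thread\n";
    struct thread_ctx *ctx = current_ctx_or_null();
    ssize_t n;

    if (ctx == NULL) {
        n = write(fd, unknown, sizeof unknown - 1);
        (void)n;
        return 0;
    }
    n = write(fd, known, sizeof known - 1);
    (void)n;
    return 1;
}
```

A real handler would call something like `report_fault(STDERR_FILENO)` and then print the backtrace. The lazy variant, which allocates the missing state on first access, is what the first trace shows blocking inside the allocator.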

Co-authored-by: Julien Portalier <julien@portalier.com>
@ysbaddaden ysbaddaden added this to the 1.20.0 milestone Jan 26, 2026
@straight-shoota straight-shoota changed the title Lenient Fiber.current access on segfault handler Lenient Fiber.current access on segfault handler Jan 26, 2026
@bcardiff
Member Author

Something I noticed is that when a signal is raised from a GC worker thread, it is flagged as "No current fiber found while handling segfault. Probable unregistered thread". That is correct yet maybe misleading, since the user is not necessarily doing anything wrong with threads in that situation. I'm not sure how to identify a worker thread programmatically. I tried to get the thread's name but got nothing...

@straight-shoota
Member

Perhaps we could check Thread.threads to see if the thread was even created by the Crystal runtime?
And if not, do something else...?

@bcardiff
Member Author

GC worker threads are not created by Crystal directly. They won't be in Thread.threads.

I don't see how your idea helps differentiate GC worker threads from other threads not created by Crystal.

@straight-shoota
Member

No, I only want to differentiate Crystal threads, which are expected to have a current fiber, from other threads (including ones from the GC), which don't.
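
A rough sketch of that idea, again in C rather than Crystal: keep an append-only registry of the threads the runtime created and ask whether the current thread is in it. `known_threads`, `register_known_thread` and `is_known_thread` are hypothetical names, and the fixed capacity is an assumption made to keep the lookup allocation-free.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_KNOWN_THREADS 64

/* Append-only list of threads created by the runtime (hypothetical). */
static pthread_t known_threads[MAX_KNOWN_THREADS];
static _Atomic size_t known_count = 0;
static pthread_mutex_t known_lock = PTHREAD_MUTEX_INITIALIZER;

/* Called from normal (non-handler) code when the runtime starts a thread.
 * Returns false if the registry is full. */
bool register_known_thread(pthread_t t) {
    bool ok = false;
    pthread_mutex_lock(&known_lock);
    size_t n = atomic_load_explicit(&known_count, memory_order_relaxed);
    if (n < MAX_KNOWN_THREADS) {
        known_threads[n] = t;
        /* Publish the new entry only after it has been written. */
        atomic_store_explicit(&known_count, n + 1, memory_order_release);
        ok = true;
    }
    pthread_mutex_unlock(&known_lock);
    return ok;
}

/* Lock-free, allocation-free lookup, intended for the fault path. */
bool is_known_thread(pthread_t t) {
    size_t n = atomic_load_explicit(&known_count, memory_order_acquire);
    for (size_t i = 0; i < n; i++) {
        if (pthread_equal(known_threads[i], t)) {
            return true;
        }
    }
    return false;
}
```

A handler could branch on `is_known_thread(pthread_self())` before touching any per-thread state. As noted earlier in the thread, this separates runtime threads from everything else but says nothing about whether an unknown thread is a GC worker or some other foreign thread.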

@ysbaddaden
Collaborator

Is the message actually useful? I think it's more confusing than anything.
