Open
Conversation
* Add worker thread pool for high-throughput Python operations Implement a general-purpose worker thread pool that eliminates per-request GIL acquisition overhead. Each worker holds the GIL (or has its own subinterpreter with OWN_GIL on Python 3.12+) and processes requests from a shared MPSC queue. Key features: - Sync API: call, apply, eval, exec, asgi_run, wsgi_run - Async API: all *_async variants returning request_id for non-blocking calls - await/1,2 for waiting on async results - Per-worker module caching to avoid reimport overhead - Support for FREE_THREADED (3.13+), SUBINTERP (3.12+), and FALLBACK modes * Fix eval locals_term initialization and add benchmark results - Fix potential crash when locals_term is uninitialized (check for 0) - Add benchmark results directory with baseline comparisons Known issue: ~0.5-1% of concurrent sync calls may timeout under high load (100+ concurrent callers). Async API unaffected. * Fix two race conditions in worker pool 1. Use-after-free on request_id: Save request_id BEFORE enqueueing the request to the worker pool. Once enqueued, a worker can process and free the request at any time. Accessing req->request_id after py_pool_enqueue() is undefined behavior. 2. Double-free of msg_env: After a successful enif_send(), the message environment is consumed/invalidated by the Erlang runtime. We must set req->msg_env = NULL to prevent py_pool_request_free() from calling enif_free_env() on an already-freed environment. These bugs caused ~0.5-1% of concurrent calls to timeout under high load because request IDs could be corrupted, leading to message/response mismatch. Also adds debug counters (responses_sent, responses_failed) to pool stats for monitoring send success rate. * Fix worker pool ASGI to use hornbeam run_asgi interface Changed py_pool_process_asgi to call run_asgi(module_name, callable_name, scope, body) instead of run(app, scope, body), matching hornbeam's hornbeam_asgi_runner interface. Also updated extract_asgi_response to handle both dict and tuple return formats, supporting hornbeam's dict-based response. * Add py_resource_pool and subinterpreter support with mutex locking - Add compile-time detection of PyInterpreterConfig_OWN_GIL (Python 3.12+) - Add mutex to py_subinterp_worker_t for thread-safe parallel access - Add nif_subinterp_asgi_run for ASGI on subinterpreters - Add py_resource_pool module with lock-free round-robin scheduling - Benchmark shows 8-10x improvement with subinterpreters enabled * Implement process-per-context architecture with reentrant callbacks Replace worker pool with process-per-context model where each Python context is owned by a dedicated Erlang process. Enables reentrant callbacks via suspension-based mechanism without deadlock. - Add py_context.erl with recursive receive pattern for inline callback handling - Add py_context_router.erl for scheduler-affinity based routing - Add nif_context_resume for Python replay with cached callback results - Support sequential callbacks via callback_results array accumulation - Remove old pool modules (py_pool, py_worker, py_worker_pool, etc.) * Fix timeout handling and add contexts_started helper - Pass timeout parameter through py:eval/3 and do_call/5 - Add py:contexts_started/0 and py_context_router:is_started/0 - Fix test_timeout to use time.sleep for reliable delay - Fix thread callback suite to check existing contexts * Fix thread worker handlers not re-registering after app restart When the application restarts, py_thread_handler registers as the new coordinator, but existing thread workers in the NIF-level pool still had has_handler=true from the previous run. This caused them to skip spawning new handler processes and write to dead pipes. Reset has_handler=false on all existing workers when a new coordinator is registered. * Fix subinterpreter cleanup and thread worker re-registration Two fixes: 1. suspended_context_state_destructor: For subinterpreters with OWN_GIL, use PyThreadState_Swap to switch to the correct interpreter before releasing Python objects. PyGILState_Ensure only works for the main interpreter and causes memory corruption with subinterpreter objects. 2. thread_worker_set_coordinator: Reset has_handler=false on all existing workers when a new coordinator registers (e.g., after app restart). Old workers kept has_handler=true but their handler processes were dead. * Unify erlang Python module with callback and event loop API - Rename priv/erlang/ to priv/_erlang_impl/ to avoid C module shadowing - Add _extend_erlang_module() helper in py_callback.c to re-export Python package functions (run, new_event_loop, EventLoopPolicy, etc.) - Update py_event_loop.erl to call extension during initialization - Delete buggy erlang_asyncio.py (blocking sleep replaced by proper asyncio.sleep backed by Erlang timers via call_later) - Add test infrastructure in priv/tests/ for event loop integration The unified erlang module now provides uvloop-compatible API: - erlang.run(coro) - run async code with Erlang event loop - erlang.new_event_loop() - create ErlangEventLoop instance - erlang.install() - install ErlangEventLoopPolicy (deprecated 3.12+) - erlang.call() / erlang.async_call() - call Erlang functions - asyncio.sleep() works via Erlang timers * Fix tests to use erlang.run() instead of removed erlang_asyncio module - Update py_erlang_sleep_SUITE to use erlang.run() with standard asyncio instead of the removed erlang_asyncio module - Skip py_asyncio_compat_SUITE: tests create standalone ErlangEventLoop instances via erlang.new_event_loop() and call loop.run_forever(). Timer scheduling for standalone loops needs work - timers fire immediately instead of after the scheduled delay. * Fix timer scheduling for standalone ErlangEventLoop instances - Add isolated parameter to ErlangEventLoop.__init__() that creates a per-loop capsule via _loop_new() for proper event routing - Update all loop methods (call_at, _run_once, stop, close, add_reader, remove_reader, add_writer, remove_writer) to use per-loop capsule APIs when running as isolated instance - new_event_loop() now passes isolated=True by default - Fix run_forever() to honor stop() called before run_forever() by not resetting _stopping flag at start - Simplify async_test_runner to run tests synchronously without erlang.run() wrapper, avoiding nested event loop issues - Add timeout fallback to test_add_remove_writer to prevent hanging - Remove skip from py_asyncio_compat_SUITE to enable tests Test results: 46 tests run, 42 passed, 4 failures (edge cases) * Replace async worker pthread backend with event loop model The pthread+usleep polling async workers have been replaced with an event-driven model using py_event_loop and enif_select: - Add _run_and_send wrapper in Python for result delivery via erlang.send() - Add nif_event_loop_run_async NIF for direct coroutine submission - Add py_event_loop:run_async/2 Erlang API - Add py_event_loop_pool.erl for managing event loop-based async execution - Rewrite py_async_pool.erl to delegate to event_loop_pool - Update supervisor tree to include py_event_loop_pool - Remove py_async_worker.erl and py_async_worker_sup.erl - Stub deprecated async_worker NIFs to return errors - Remove async_event_loop_thread and async_future_callback C code Performance improvements: - Latency: ~10-20ms polling -> <1ms (enif_select) - CPU idle: 100 wakeups/sec -> Zero - Threads: N pthreads -> 0 extra threads API unchanged: py:async_call/3,4 and py:await/1,2 work the same. * Remove global state from py_event_loop.c for per-interpreter isolation Replace global variables with module state structure stored in the Python module, enabling proper per-interpreter/per-context event loop isolation. Changes: - Add py_event_loop_module_state_t struct containing event_loop, shared_router, shared_router_valid, and isolation_mode - Update PyModuleDef to allocate module state (m_size) - Update get_interpreter_event_loop() to read from module state - Update set_interpreter_event_loop() to write to module state - Update nif_set_python_event_loop() to use module state - Update nif_set_isolation_mode() to use module state - Update nif_set_shared_router() to use module state - Update py_get_isolation_mode() to read from module state - Update py_loop_new() to read shared_router from module state - Update event_loop_destructor() to clear module state - Update create_default_event_loop() to use module state - Remove g_python_event_loop, g_shared_router, g_shared_router_valid, and g_isolation_mode global variables * Fix py_asyncio_compat_SUITE tests and consolidate erlang module - Remove erlang_loop.py, use _erlang_impl as the single implementation - Add get_event_loop_policy() export to _erlang_impl and erlang module - Fix signal tests: ErlangEventLoop has limited signal support (SIGINT, SIGTERM, SIGHUP only), other signals raise ValueError - Skip subprocess tests for Erlang (not yet implemented) - Update all imports to use erlang module (public API) with _erlang_impl as internal fallback - Update docs and examples to use erlang module imports * Fix unawaited coroutine warnings in tests - test_run_until_complete_nested_raises: Use asyncio.sleep(0.1) to ensure timer path (not fast path), properly close coroutine in finally block - test_run_until_complete_on_closed_raises: Store coroutine in variable and close it in finally block - tearDown: Cancel pending tasks and shutdown async generators before closing loop to prevent resource leaks - Add test_asyncio_sleep_zero_fast_path: Verify sleep(0) uses fast path - test_add_remove_writer: Use socketpair for reliable write readiness * Fix FD stealing and UDP connected socket issues - Share fd_resource per fd to prevent enif_select stealing errors - Add NIF functions for fd resource management - Use send() instead of sendto() for connected UDP sockets - Fix TCP EOF handling to call connection_lost properly * Fix context test expectations for Python contextvars behavior await coro() runs in shared context (changes visible to caller), while create_task(coro()) runs in copied context (changes isolated). Updated test_context_in_task and test_multiple_context_vars to reflect correct Python behavior. * Remove subprocess support from ErlangEventLoop Subprocess is not supported because Python's subprocess module uses fork() which corrupts the Erlang VM when called from within the NIF. Users should use Erlang ports directly via erlang.call() instead, which provides superior subprocess management with built-in supervision, monitoring, and fault tolerance. Changes: - Replace _subprocess.py with NotImplementedError stub and docs - Remove subprocess event handling from _loop.py - Remove subprocess functions from py_event_loop.c - Update tests to verify NotImplementedError is raised - Set HAS_SUBPROCESS_SUPPORT = False in test base * Add ETF encoding for pids/refs and fix executor/socket tests ETF encoding for pids and references: - Add decode_etf_string() helper in py_callback.c to convert __etf__:base64 encoded strings back to Erlang terms - Add ETF encoding in term_to_python_repr for pids and refs in py_context.erl and py_thread_handler.erl Test fixes: - Skip ProcessPoolExecutor test inside Erlang NIF (fork issues) - Use 'spawn' multiprocessing context instead of 'fork' - Accept OSError in addition to TimeoutError for connect timeout test Cleanup: - Remove obsolete multi_loop test files * Add erlang.reactor module for fd-based protocol handling Implement low-level fd-based API where Erlang handles I/O scheduling via enif_select and Python handles protocol logic. - Add priv/_erlang_impl/_reactor.py with Protocol base class and registry - Add src/py_reactor_context.erl for Erlang reactor context process - Expose erlang.reactor via sys.modules for 'import erlang.reactor' syntax - Add test suite (py_reactor_SUITE.erl) with 6 tests - Add Python tests (py_test_reactor.py) with 3 tests - Add examples/reactor_echo.erl as usage example Works with any fd - TCP, UDP, Unix sockets, pipes, etc. * Add audit hook sandbox and remove signal support - Add _sandbox.py with Python audit hooks (PEP 578) to block dangerous operations: fork, exec, spawn, subprocess, os.system, os.popen - Install sandbox automatically when running inside Erlang VM - Remove signal handling support (not applicable in Erlang context) - Update policy to always return ErlangEventLoop - Fix ExecutionMode test to check correct enum values - Remove signal tests and AIO subprocess tests from test suite * Update CHANGELOG for unreleased changes since 1.8.1 * Add security and reactor documentation, update asyncio docs New documentation: - docs/security.md: Document audit hook sandbox, blocked operations (fork, exec, subprocess), and Erlang port alternatives - docs/reactor.md: Document erlang.reactor module for FD-based protocol handling with Protocol base class and examples Updated documentation: - docs/asyncio.md: Update for unified erlang module, mark erlang.install() as deprecated in 3.12+, add Limitations section for subprocess/signal handling, add ExecutionMode documentation - docs/getting-started.md: Add Security Considerations section, update asyncio section to use erlang.run() - README.md: Add security sandbox to features, add doc links Also fixed edoc errors in source files: - src/py_nif.erl: Fix angle bracket syntax in reactor function docs - src/py_context_router.erl: Replace markdown code blocks with <pre> * Rename call_async to cast and add benchmark API change: py:call_async/3,4 renamed to py:cast/3,4 following gen_server convention (call=sync, cast=async). Add benchmark_compare.erl for comparing performance between versions. Current version shows ~2-3x improvement over v1.8.1: - Sync calls: 0.011ms -> 0.004ms (2.9x faster) - Cast single: 0.011ms -> 0.004ms (2.8x faster) - Throughput: ~90K -> ~250K calls/sec * Add migration guide for v1.8.x to v2.0 Covers: - py:call_async -> py:cast rename - py:bind/unbind removal (use py_context_router) - py:ctx_* removal (use py_context directly) - erlang_asyncio -> erlang module consolidation - Subprocess removal (use Erlang ports) - Signal handler removal (use Erlang level) - New features: context router, reactor, erlang.send() - Performance comparison table * Add subinterpreter event loop isolation Each subinterpreter context now gets its own event worker for asyncio support. This ensures asyncio.sleep() and timers work correctly in subinterpreter contexts. Changes: - Add nif_context_get_event_loop/1 NIF to retrieve event loop reference - Create dedicated event worker per subinterpreter context in py_context - Extend erlang module with run/new_event_loop in each subinterpreter - Handle EXIT signals properly (shutdown from supervisor vs normal exits) - Initialize event loop for worker pool subinterpreters Worker mode contexts (Python < 3.12) continue to use the shared router. * Skip tests incompatible with subinterpreters test_memory_stats and test_reload use modules (tracemalloc) that don't support Python subinterpreters. Skip these tests when running with subinterpreter support enabled (Python 3.12+). * Add OWN_GIL subinterpreter support for true Python parallelism Subinterpreters with PyInterpreterConfig_OWN_GIL run in dedicated threads, each with its own GIL, enabling true parallel Python execution on Python 3.12+. Key changes: - Thread pool manages subinterpreter lifecycle and context switching - Atomic state machine for thread-safe subinterpreter state management - Support blocking callbacks in thread-model subinterpreters - ProcessError exception class lookup for correct identity in subinterpreters - Test adjustments for subinterpreter path isolation and error messages * Fix ASan/TSan LD_PRELOAD in CI * Simplify CI: ASan only, fix LD_PRELOAD for all steps * Remove debug counters step from ASan builds * Document OWN_GIL subinterpreter parallelism * Fix py_async_e2e_SUITE for subinterpreters - Use explicit context for all tests - Relax timing constraint (0.3s instead of 0.15s) for CI - Add diagnostic messages to assertions
- reactor_register_fd now calls dup() on the fd automatically - Sets owns_fd=true so the duplicated fd is closed on cleanup - Users can safely close Erlang's socket after handoff - Updated documentation to reflect automatic dup()
- py:reset_context/1 - Soft reset that clears namespace (keeps builtins/erlang) - py:reload_context/2 - Reload specified modules using importlib.reload() - NIF functions: context_reset, context_reload - Full test coverage in py_context_reset_SUITE (8 tests)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
py:reset_context/1- Soft reset that clears namespace (keeps builtins/erlang)py:reload_context/2- Reload specified modules using importlib.reload()context_reset,context_reloadAPI
Use Cases