Fix Io\Poll memory-safety issues by iliaal · Pull Request #22316 · php/php-src

iliaal · 2026-06-15T10:58:21Z

Fixes memory-safety bugs in the new Io\Poll API, found by review and confirmed under valgrind.

- Use-after-free: a Watcher kept a raw pointer to its Context's poll context with no reference, so dropping the Context while still holding a Watcher made remove()/modify() touch freed memory. The Context now clears its watchers (active=false, poll_ctx=NULL) before destruction, so those calls throw InactiveWatcherException.
- Descriptor leak: StreamPollHandle referenced the stream resource in its constructor but never released it. Released in the handle cleanup.
- Missing get_gc on Watcher and Context, so cycles through Watcher::$data leaked. Added for both.
- clone of a Context/Watcher/StreamPollHandle went through the default handler, which copied the backing poll context and watcher map by pointer and double-freed them. All three are now uncloneable.
- Calling __construct() twice on a Context or StreamPollHandle replaced the backing state without releasing the first, leaking it. Now throws if already constructed.
- add()/modify()/remove()/wait() accepted a NULL ctx and forwarded it to php_poll_set_error(), which dereferenced it. The userland layer already gates on an active context before reaching the C API, so these now assert a non-NULL ctx.
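
To illustrate the use-after-free fix from the first item, here is a minimal sketch in plain C. The types and functions (demo_backend, demo_watcher, demo_context, demo_watcher_remove, demo_context_destroy) are hypothetical stand-ins and not the php-src structs; only the active=false / poll_ctx=NULL neutralization step comes from the description above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the backing poll context. */
typedef struct demo_backend {
    int nwatched;
} demo_backend;

/* Hypothetical watcher: holds a raw, non-owning pointer to the backend. */
typedef struct demo_watcher {
    bool active;
    demo_backend *poll_ctx;
} demo_watcher;

/* Hypothetical context: owns the backend and knows its watchers. */
typedef struct demo_context {
    demo_backend *poll_ctx;
    demo_watcher **watchers;
    size_t nwatchers;
} demo_context;

/* Returns 0 on success, -1 if the watcher is inactive. In userland the
 * -1 path would correspond to throwing InactiveWatcherException. */
static int demo_watcher_remove(demo_watcher *w)
{
    if (!w->active || w->poll_ctx == NULL) {
        return -1;
    }
    w->poll_ctx->nwatched--;
    w->active = false;
    w->poll_ctx = NULL;
    return 0;
}

/* Neutralize every watcher BEFORE freeing the backend, so that no
 * watcher is left holding a dangling pointer. */
static void demo_context_destroy(demo_context *c)
{
    for (size_t i = 0; i < c->nwatchers; i++) {
        c->watchers[i]->active = false;
        c->watchers[i]->poll_ctx = NULL;
    }
    free(c->poll_ctx);
    c->poll_ctx = NULL;
}
```

With this ordering, a watcher that outlives its context fails the active check in demo_watcher_remove() instead of dereferencing freed memory.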

Tests cover the clone guard, the double-construct guard, the watcher-outlives-context UAF, the fd release, and the get_gc cycle.

The wait()-error recording and the kqueue buffer cap from the original PR were split into #22326 and #22327.

devnexen · 2026-06-15T11:31:30Z

 					/* New FD, create new event */
-					ZEND_ASSERT(unique_events < max_events);
+					if (unique_events >= max_events) {
+						continue;


It seems the continue also skips the oneshot bookkeeping below it, so any oneshot fd dropped past the cap leaves the backend's tracking desynced. Gate only the buffer write on unique_events < max_events and still fall through to the tracking.

Need to dig into it

Right, the continue also skipped the oneshot tracking. Gated only the buffer write now, so the bookkeeping runs regardless of the cap.
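
The difference between skipping the whole iteration and gating only the buffer write can be sketched as follows. This is a simplified model, not the kqueue backend code: demo_raw_event, demo_out_event and demo_collect are hypothetical names, every raw event is assumed to belong to a new fd, and the "bookkeeping" is reduced to a counter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical raw event as dequeued from the kernel. */
typedef struct demo_raw_event {
    int fd;
    bool oneshot;
} demo_raw_event;

/* Hypothetical event slot in the caller's buffer. */
typedef struct demo_out_event {
    int fd;
} demo_out_event;

/* Writes at most max_events entries into out and returns how many were
 * written. The oneshot bookkeeping (modelled by *tracked) runs for every
 * raw event, including those that did not fit into the buffer. */
static size_t demo_collect(const demo_raw_event *raw, size_t nraw,
                           demo_out_event *out, size_t max_events,
                           size_t *tracked)
{
    size_t unique_events = 0;

    *tracked = 0;
    for (size_t i = 0; i < nraw; i++) {
        /* Only the buffer write is gated on the cap. */
        if (unique_events < max_events) {
            out[unique_events].fd = raw[i].fd;
            unique_events++;
        }
        /* Deliberately no "continue" above: the tracking below must see
         * every oneshot event, or it desyncs from the kernel state. */
        if (raw[i].oneshot) {
            (*tracked)++;
        }
    }
    return unique_events;
}
```

With three oneshot events and a buffer of two, two events are written and all three are tracked.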

devnexen · 2026-06-15T11:39:28Z

+		php_poll_set_current_errno_error(ctx);
+		return -1;
+	}
+	if (nfds == 0) {


nit: I guess you can just merge these as if (...) else if (...) here, wdyt?

iliaal · 2026-06-15T12:44:58Z

@ndossche fixes pushed: clone now throws (Context/Watcher/StreamPollHandle marked uncloneable), double __construct throws instead of leaking, and php_poll_set_error() is NULL-safe so the add/modify/remove/wait guards no longer deref a NULL ctx. get_gc and the watcher-after-context-destroy UAF were already in the PR with tests (poll_watcher_gc_cycle, poll_watcher_outlives_context).

Couldn't reproduce the getAvailableBackends() foreach UAF under valgrind on Linux (poll and epoll), even mutating the array mid-iteration. Which platform/backend and build (ASAN?) did you hit it on?

ndossche · 2026-06-15T12:46:28Z

> Couldn't reproduce the getAvailableBackends() foreach UAF under valgrind on Linux (poll and epoll), even mutating the array mid-iteration. Which platform/backend and build (ASAN?) did you hit it on?

See Slack, I already fixed that yesterday ;)

I'll have a review look at this PR this evening.

ndossche

Most of this seems right and straightforward.

ndossche · 2026-06-15T18:30:59Z

 /* Create new poll context */
 PHPAPI php_poll_ctx *php_poll_create_by_name(const char *preferred_backend, uint32_t flags)
 {
+	if (!preferred_backend) {


Is this defensive NULL check necessary? We could just make it part of the API contract that preferred_backend must not be NULL, which seems reasonable to me.

ndossche · 2026-06-15T18:31:55Z

 static inline void php_poll_set_error(php_poll_ctx *ctx, php_poll_error error)
 {
-	ctx->last_error = error;
+	if (ctx) {


This was added for the inverted NULL check on the {add,modify,remove,...} APIs. But I wonder whether it should check for a NULL ctx in the first place. We could make it part of the API contract that they should not be called on a NULL ctx.

Agreed, it's all new code so no reason to keep the NULL tolerance. Made non-NULL ctx the contract: ZEND_ASSERT(ctx) in set_max_events_hint, add, modify, remove, wait, and dropped the set_error guard.
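
A rough sketch of what "non-NULL ctx is the contract" looks like, using the standard assert() as a stand-in for ZEND_ASSERT (which, as far as I know, maps to assert() in debug builds). The demo_ctx struct, the error codes and the validation inside demo_set_max_events_hint are hypothetical and only serve to show the shape: the entry point asserts, and the error setter no longer guards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error codes and context; not the php_poll types. */
typedef enum {
    DEMO_ERR_NONE = 0,
    DEMO_ERR_INVALID = 1
} demo_error;

typedef struct demo_ctx {
    demo_error last_error;
    int max_events_hint;
} demo_ctx;

/* No NULL guard here: every caller guarantees a valid ctx. */
static inline void demo_set_error(demo_ctx *ctx, demo_error error)
{
    ctx->last_error = error;
}

/* Entry point: the contract is checked once, at the API boundary. */
static int demo_set_max_events_hint(demo_ctx *ctx, int max_events)
{
    assert(ctx != NULL); /* stand-in for ZEND_ASSERT(ctx) */

    if (max_events <= 0) {
        demo_set_error(ctx, DEMO_ERR_INVALID);
        return -1;
    }
    ctx->max_events_hint = max_events;
    return 0;
}
```

The point of the design is that a NULL ctx becomes a programming error caught in debug builds, rather than a runtime condition that every helper has to tolerate.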

ndossche · 2026-06-15T18:32:58Z

+	}
+
+	zend_get_gc_buffer_use(gc_buffer, table, n);
+	return zend_std_get_properties(obj);


Could just be return NULL due to the class being final and with no properties.

Done, returns NULL now (final class, no declared properties).

ndossche · 2026-06-15T18:33:03Z

+	}
+
+	zend_get_gc_buffer_use(gc_buffer, table, n);
+	return zend_std_get_properties(obj);


Could just be return NULL due to the class being final and with no properties.

ndossche · 2026-06-15T18:35:40Z

-					events[unique_events].revents = revents;
-					events[unique_events].data = data;
-					unique_events++;
+					if (unique_events < max_events) {


Doesn't this check risk silently ignoring some new events?

Yes for oneshot, no for level-triggered. Dropped level-triggered fds stay ready and re-fire on the next wait(). Oneshot is the gap: kevent() has already disarmed them, so one past the cap is consumed but never delivered.

It happens because grouped mode dequeues up to max_events * 2 kevents (headroom to coalesce a read+write pair) while the caller buffer only holds max_events. Capping the kevent() count to max_events makes unique_events unable to exceed the buffer, so nothing is consumed-and-dropped; the tradeoff is a read+write fd may split across two wait() calls instead of coalescing. Want that? I can't valgrind kqueue here (no macOS), so it'd ride on macOS CI.

I need to think about this one - not sure if this is the right solution.

Can you separate this into its own PR?

Makes sense, pulled the buffer cap into #22327 with the oneshot-bookkeeping change that goes with it. Out of scope for this one, and we can settle the right approach there.
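
For reference, the consumed-but-dropped scenario and the capping proposal discussed in this thread can be modelled in a few lines of C. This is a toy model under stated assumptions, not the kqueue backend: demo_event, the DEMO_READ/DEMO_WRITE flags and demo_coalesce are hypothetical, and nraw plays the role of the event count requested from kevent().

```c
#include <stddef.h>

enum { DEMO_READ = 1, DEMO_WRITE = 2 };

/* Hypothetical event: one fd plus a bitmask of ready conditions. */
typedef struct demo_event {
    int fd;
    unsigned revents;
} demo_event;

/* Coalesces raw events by fd into out (capacity max_events) and returns
 * the number of unique fds written. *dropped counts raw events that were
 * dequeued but could not be delivered because the buffer was full. */
static size_t demo_coalesce(const demo_event *raw, size_t nraw,
                            demo_event *out, size_t max_events,
                            size_t *dropped)
{
    size_t unique = 0;

    *dropped = 0;
    for (size_t i = 0; i < nraw; i++) {
        size_t j;

        /* Look for an existing slot for this fd. */
        for (j = 0; j < unique; j++) {
            if (out[j].fd == raw[i].fd) {
                break;
            }
        }
        if (j < unique) {
            out[j].revents |= raw[i].revents; /* merge read + write */
        } else if (unique < max_events) {
            out[unique++] = raw[i];           /* new fd, room left */
        } else {
            (*dropped)++;                     /* new fd, buffer full */
        }
    }
    return unique;
}
```

Requesting max_events * 2 raw events with all-distinct fds can overflow the buffer (dropped > 0), while requesting only max_events cannot. The cost is that the read and write events of one fd may no longer arrive in the same batch.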

bukka · 2026-06-15T21:34:43Z

 			events[i].revents = epoll_events_from_native(backend_data->events[i].events);
 			events[i].data = backend_data->events[i].data.ptr;
 		}
+	} else if (nfds < 0) {


Hmm, I remember that I was setting it in php_poll_wait but then dropped it. I think it should be done as it makes sense. That said, this should apply to all backends I think.

Agreed it should be uniform. Pulled the wait()-error recording into its own PR, #22326, covering epoll, poll and kqueue (eventport and wsapoll already record). Kept it per-backend rather than central in php_poll_wait because wsapoll sets its error from WSAGetLastError(), not errno, so a central php_poll_set_current_errno_error() would clobber the Windows error.
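
The reasoning for keeping error recording per-backend can be sketched like this. Everything here is a hypothetical model: demo_ctx stores a raw error number instead of a mapped php_poll_error, the nfds parameter simulates the return value of the underlying wait call, and demo_fake_wsa_last_error stands in for WSAGetLastError() so the sketch stays portable.

```c
#include <errno.h>

/* Hypothetical context that just remembers the last raw error code. */
typedef struct demo_ctx {
    int last_error;
} demo_ctx;

/* Shared helper for errno-based backends (epoll, poll, kqueue). */
static void demo_set_current_errno_error(demo_ctx *ctx)
{
    ctx->last_error = errno;
}

/* Stand-in for WSAGetLastError(): a separate error source from errno. */
static int demo_fake_wsa_last_error = 0;

/* errno-based backend: records errno when the wait call fails. */
static int demo_errno_backend_wait(demo_ctx *ctx, int nfds)
{
    if (nfds < 0) {
        demo_set_current_errno_error(ctx);
        return -1;
    }
    return nfds;
}

/* wsapoll-like backend: records its own error source. A central
 * errno-based recorder running after this would overwrite the value. */
static int demo_wsa_like_backend_wait(demo_ctx *ctx, int nfds)
{
    if (nfds < 0) {
        ctx->last_error = demo_fake_wsa_last_error;
        return -1;
    }
    return nfds;
}
```

Each backend knows which error source is authoritative for its own syscall, which is why the recording sits next to the call rather than in a shared wrapper.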

bukka · 2026-06-15T21:36:35Z

Can you please separate the kqueue thing into its own PR, and the nfds checks as well? The first one needs more checking, and the second one needs some other work that I would like to have done in a different commit. The rest looks fine.

Several memory-safety issues in the new Io\Poll API, found by review and confirmed under valgrind:

- Watcher kept a raw pointer to its Context's php_poll_ctx with no reference, so dropping the Context while holding a Watcher left remove()/modify() dereferencing freed memory (use-after-free). The Context now neutralizes its watchers (active=false, poll_ctx=NULL) before it is destroyed, so those calls throw InactiveWatcherException.
- StreamPollHandle took a reference on the stream resource in the constructor but never released it, leaking the descriptor for the rest of the request. Store the zend_resource and release it in the handle cleanup; the php_stream may already be freed by then (e.g. the user closed it), so the cleanup must not dereference it.
- Watcher and Context had no get_gc handler, so reference cycles through Watcher::$data were uncollectable. Add get_gc for both.
- Context, Watcher and StreamPollHandle were cloneable through the default handler, which shallow-copied the backing php_poll_ctx and the watcher map by pointer and double-freed them on destruction. Mark all three uncloneable.
- Calling __construct() a second time on a Context or StreamPollHandle replaced the backing context or handle data without releasing the first, leaking it. Throw if the object is already constructed.
- The add(), modify(), remove() and wait() entry points accepted a NULL ctx and forwarded it to php_poll_set_error(), which dereferenced it. The userland layer already gates on an active context before reaching the C API, so assert a non-NULL ctx in those entry points instead.

Closes phpGH-22316

iliaal requested a review from bukka as a code owner June 15, 2026 10:58

iliaal mentioned this pull request Jun 15, 2026

Fix Io\Poll memory-safety and error-handling issues iliaal/php-src#85

Closed

github-actions Bot added the Extension: standard label Jun 15, 2026

iliaal requested a review from ndossche June 15, 2026 10:58

devnexen reviewed Jun 15, 2026


iliaal force-pushed the fix/io-poll-memory-safety branch from ef34a6e to 71ddf7d Compare June 15, 2026 12:41

ndossche reviewed Jun 15, 2026


iliaal force-pushed the fix/io-poll-memory-safety branch from 71ddf7d to dc8d0ce Compare June 15, 2026 20:59

bukka reviewed Jun 15, 2026


iliaal force-pushed the fix/io-poll-memory-safety branch from dc8d0ce to 64c39ca Compare June 15, 2026 22:03

This was referenced Jun 15, 2026

main/poll: Record wait() error on every backend #22326

Open

main/poll: Cap kqueue grouped-event buffer write at runtime #22327

Open

iliaal changed the title ~~Fix Io\Poll memory-safety and error-handling issues~~ Fix Io\Poll memory-safety issues Jun 15, 2026

iliaal requested review from bukka and ndossche June 16, 2026 00:14

ndossche approved these changes Jun 16, 2026

