Fix Io\Poll memory-safety issues#22316
Conversation
| /* New FD, create new event */ | ||
| ZEND_ASSERT(unique_events < max_events); | ||
| if (unique_events >= max_events) { | ||
| continue; |
There was a problem hiding this comment.
it seems continue also skips the oneshot bookkeeping below it, so any oneshot fd dropped past the cap leaves the backend's tracking desynced. Gate only the buffer write on unique_events < max_events and still fall through to the tracking.
There was a problem hiding this comment.
Right, the continue also skipped the oneshot tracking. Gated only the buffer write now, so the bookkeeping runs regardless of the cap.
| php_poll_set_current_errno_error(ctx); | ||
| return -1; | ||
| } | ||
| if (nfds == 0) { |
There was a problem hiding this comment.
nit: I guess you can just merge as if (...) else if (...) here wdyt?
ef34a6e to
71ddf7d
Compare
|
@ndossche fixes pushed: clone now throws (Context/Watcher/StreamPollHandle marked uncloneable), double Couldn't reproduce the |
See Slack, I already fixed that yesterday ;) I'll have a review look at this PR this evening. |
ndossche
left a comment
There was a problem hiding this comment.
Most of this seems right and straight-forward
| /* Create new poll context */ | ||
| PHPAPI php_poll_ctx *php_poll_create_by_name(const char *preferred_backend, uint32_t flags) | ||
| { | ||
| if (!preferred_backend) { |
There was a problem hiding this comment.
Is this defensive NULL check necessary? We could just make it part of the API contract that preferred_backend must not be NULL, which seems reasonable to me.
| static inline void php_poll_set_error(php_poll_ctx *ctx, php_poll_error error) | ||
| { | ||
| ctx->last_error = error; | ||
| if (ctx) { |
There was a problem hiding this comment.
This was added for the inverted NULL on the {add,modify,remove,...} APIs. But I wonder whether it should check for a NULL ctx in the first place. We could make it part of the API contract that they should not be called on a NULL ctx.
There was a problem hiding this comment.
Agreed, it's all new code so no reason to keep the NULL tolerance. Made non-NULL ctx the contract: ZEND_ASSERT(ctx) in set_max_events_hint, add, modify, remove, wait, and dropped the set_error guard.
| } | ||
|
|
||
| zend_get_gc_buffer_use(gc_buffer, table, n); | ||
| return zend_std_get_properties(obj); |
There was a problem hiding this comment.
Could just be return NULL due to the class being final and with no properties.
There was a problem hiding this comment.
Done, returns NULL now (final class, no declared properties).
| } | ||
|
|
||
| zend_get_gc_buffer_use(gc_buffer, table, n); | ||
| return zend_std_get_properties(obj); |
There was a problem hiding this comment.
Could just be return NULL due to the class being final and with no properties.
| events[unique_events].revents = revents; | ||
| events[unique_events].data = data; | ||
| unique_events++; | ||
| if (unique_events < max_events) { |
There was a problem hiding this comment.
Doesn't this check risk silently ignoring some new events?
There was a problem hiding this comment.
Yes for oneshot, no for level-triggered. Dropped level-triggered fds stay ready and re-fire on the next wait(). Oneshot is the gap: kevent() has already disarmed them, so one past the cap is consumed but never delivered.
It happens because grouped mode dequeues up to max_events * 2 kevents (headroom to coalesce a read+write pair) while the caller buffer only holds max_events. Capping the kevent() count to max_events makes unique_events unable to exceed the buffer, so nothing is consumed-and-dropped; the tradeoff is a read+write fd may split across two wait() calls instead of coalescing. Want that? I can't valgrind kqueue here (no macOS), so it'd ride on macOS CI.
There was a problem hiding this comment.
I need to think about this one - not sure if this is the right solution.
There was a problem hiding this comment.
Makes sense, pulled the buffer cap into #22327 with the oneshot-bookkeeping change that goes with it. Out of scope for this one, and we can settle the right approach there.
71ddf7d to
dc8d0ce
Compare
| events[i].revents = epoll_events_from_native(backend_data->events[i].events); | ||
| events[i].data = backend_data->events[i].data.ptr; | ||
| } | ||
| } else if (nfds < 0) { |
There was a problem hiding this comment.
hmm I remember that I was setting it in php_poll_wait but then dropped. I think it should be done as it makes sense. That said, this should apply to all backends I think.
There was a problem hiding this comment.
Agreed it should be uniform. Pulled the wait()-error recording into its own PR, #22326, covering epoll, poll and kqueue (eventport and wsapoll already record). Kept it per-backend rather than central in php_poll_wait because wsapoll sets its error from WSAGetLastError(), not errno, so a central php_poll_set_current_errno_error() would clobber the Windows error.
|
Can you please sepearate the kqueue thing to itw own PR and the nfds checks as well as the first one needs more checking and the second one needs some other work and would like to have it done in a different commit. The rest looks fine. |
Several memory-safety issues in the new Io\Poll API, found by review and confirmed under valgrind: - Watcher kept a raw pointer to its Context's php_poll_ctx with no reference, so dropping the Context while holding a Watcher left remove()/modify() dereferencing freed memory (use-after-free). The Context now neutralizes its watchers (active=false, poll_ctx=NULL) before it is destroyed, so those calls throw InactiveWatcherException. - StreamPollHandle took a reference on the stream resource in the constructor but never released it, leaking the descriptor for the rest of the request. Store the zend_resource and release it in the handle cleanup; the php_stream may already be freed by then (e.g. the user closed it), so the cleanup must not dereference it. - Watcher and Context had no get_gc handler, so reference cycles through Watcher::$data were uncollectable. Add get_gc for both. - Context, Watcher and StreamPollHandle were cloneable through the default handler, which shallow-copied the backing php_poll_ctx and the watcher map by pointer and double-freed them on destruction. Mark all three uncloneable. - Calling __construct() a second time on a Context or StreamPollHandle replaced the backing context or handle data without releasing the first, leaking it. Throw if the object is already constructed. - The add(), modify(), remove() and wait() entry points accepted a NULL ctx and forwarded it to php_poll_set_error(), which dereferenced it. The userland layer already gates on an active context before reaching the C API, so assert a non-NULL ctx in those entry points instead. Closes phpGH-22316
dc8d0ce to
64c39ca
Compare
Fixes memory-safety bugs in the new Io\Poll API, found by review and confirmed under valgrind.
Tests cover the clone guard, the double-construct guard, the watcher-outlives-context UAF, the fd release, and the get_gc cycle.
The wait()-error recording and the kqueue buffer cap from the original PR were split into #22326 and #22327.