fix: critical session cache, async tools, and pairing security gaps#1862
fix: critical session cache, async tools, and pairing security gaps#1862cursor[bot] wants to merge 1 commit into
Conversation
- UnifiedSessionStore: invalidate in-memory cache when on-disk mtime changes to prevent cross-process message loss between TUI and interactive CLI - _arun_framework: use ToolResolver.resolve(..., instantiate=True) to match sync path and avoid async YAML tool initialisation failures - Pairing: reject approval callbacks from non-owner users; share gateway PairingStore with channel bots and reload store before is_paired checks Adds regression tests for each fix. Co-authored-by: Mervin Praison <MervinPraison@users.noreply.github.com>
|
@coderabbitai review |
|
/review |
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more β On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
β Action performedReview finished.
|
|
Complex PR? Review this PR in Change Stack to move by importance, not file order. π WalkthroughWalkthroughThe PR introduces infrastructure improvements and security controls for pairing management and tool resolution. It adds file-mtime-based cache invalidation to session and pairing stores, enforces owner-only pairing approval, wires gateway pairing state into channel bots, and improves async framework tool resolution via explicit instantiation. ChangesCache Invalidation and Pairing Security Infrastructure
Estimated code review effortπ― 3 (Moderate) | β±οΈ ~22 minutes Possibly related PRs
Poem
π₯ Pre-merge checks | β 5β Passed checks (5 passed)
βοΈ Tip: You can configure your own custom pre-merge checks in the settings. β¨ Finishing Touchesπ Generate docstrings
π§ͺ Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@copilot Do a thorough review of this PR. Read ALL existing reviewer comments above from Qodo, Coderabbit, and Gemini first β incorporate their findings. Review areas:
|
There was a problem hiding this comment.
Actionable comments posted: 2
π§Ή Nitpick comments (2)
src/praisonai/praisonai/cli/session/unified.py (1)
145-157: βοΈ Poor tradeoffCache validity logic may incorrectly trust replaced files.
The comparison
current_mtime <= cached_mtimeconsiders the cache valid when the on-disk file is older than the cached version. If a session file is deleted and recreated with an earlier timestamp, or replaced via restoration from backup, the stale in-memory cache will be returned instead of the updated file content.Using strict equality (
current_mtime == cached_mtime) would detect all file replacements, though it may cause false invalidations if filesystem timestamp granularity differs across writes. The<=operator trades correctness for tolerance of clock skew or timestamp precision loss.Alternative: strict equality check
def _is_cache_valid(self, session_id: str) -> bool: """Return True if the in-memory cache matches the on-disk file.""" if session_id not in self._cache: return False path = self._get_session_path(session_id) if not path.exists(): return False try: current_mtime = path.stat().st_mtime_ns cached_mtime = self._cache_mtimes.get(session_id, 0) - return current_mtime <= cached_mtime + return current_mtime == cached_mtime except OSError: return Falseπ€ Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/praisonai/praisonai/cli/session/unified.py` around lines 145 - 157, In _is_cache_valid, the mtime comparison currently uses "current_mtime <= cached_mtime" which can incorrectly consider the cache valid when the on-disk file is older; change the comparison in the _is_cache_valid method to use strict equality (compare current_mtime == cached_mtime) so only an exact mtime match treats the cache as valid, leaving the rest of the method (missing cache entry, missing path, OSError handling via returning False) intact.src/praisonai/praisonai/gateway/server.py (1)
1722-1737: π€ Low valueConsider adding a comment explaining pairing support by platform.
Telegram, Discord, and Slack bots have their pairing stores wired to the gateway's shared instance (lines 1725, 1730, 1736), while WhatsApp, Linear, Email, and AgentMail do not. While this appears intentional based on the PR objectives, a brief comment would help future maintainers understand which platforms support pairing and avoid confusion.
π Suggested documentation addition
agent = apply_bot_smart_defaults(agent, config) + # Wire gateway pairing store for platforms that support owner-approval pairing flow. + # Currently: Telegram, Discord, Slack. Other platforms use alternative auth mechanisms. if channel_type == "telegram": from praisonai.bots import TelegramBotπ€ Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/praisonai/praisonai/gateway/server.py` around lines 1722 - 1737, The gatekeeper wiring for pairing is only applied for channel_type values "telegram", "discord", and "slack" by calling self._wire_gateway_pairing_store(bot) after instantiating TelegramBot, DiscordBot, and SlackBot; add a brief inline comment above these branches (near the channel_type conditional and above the calls to self._wire_gateway_pairing_store) explaining that pairing store wiring is intentionally supported for Telegram/Discord/Slack and that other channels (WhatsApp, Linear, Email, AgentMail) do not wire the pairing store per current design to avoid confusion for future maintainers; reference the channel_type conditional, the TelegramBot/DiscordBot/SlackBot instantiations and the _wire_gateway_pairing_store method in your comment.
π€ Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/praisonai/praisonai/bots/_pairing_ui.py`:
- Around line 204-216: Add a startup-time validation to fail-fast when pairing
approvals are allowed but no owner is configured: in the bot
initialization/validation path (e.g., BotAdapter.__init__ or its config
validation routine) check config.unknown_user_policy == "pair" and that
config.owner_user_id is present/non-empty, and if not raise/raise a
ConfigurationError or similar so the process won't run with an unset
owner_user_id; this prevents the runtime authorization bypass around the
PairingApprovalResult path that inspects bot_adapter.config.owner_user_id and
owner_user_id.
In `@src/praisonai/praisonai/cli/session/unified.py`:
- Line 134: The type annotation for self._cache_mtimes is incorrect: st_mtime_ns
returns an int (nanoseconds) but _cache_mtimes is annotated Dict[str, float];
change the annotation to Dict[str, int] (or consistently convert values to float
seconds if you prefer) so stored st_mtime_ns values match the declared
typeβupdate the declaration of _cache_mtimes and any code that reads/writes it
(references to _cache_mtimes and calls that use os.stat(...).st_mtime_ns) to use
the chosen integer representation.
---
Nitpick comments:
In `@src/praisonai/praisonai/cli/session/unified.py`:
- Around line 145-157: In _is_cache_valid, the mtime comparison currently uses
"current_mtime <= cached_mtime" which can incorrectly consider the cache valid
when the on-disk file is older; change the comparison in the _is_cache_valid
method to use strict equality (compare current_mtime == cached_mtime) so only an
exact mtime match treats the cache as valid, leaving the rest of the method
(missing cache entry, missing path, OSError handling via returning False)
intact.
In `@src/praisonai/praisonai/gateway/server.py`:
- Around line 1722-1737: The gatekeeper wiring for pairing is only applied for
channel_type values "telegram", "discord", and "slack" by calling
self._wire_gateway_pairing_store(bot) after instantiating TelegramBot,
DiscordBot, and SlackBot; add a brief inline comment above these branches (near
the channel_type conditional and above the calls to
self._wire_gateway_pairing_store) explaining that pairing store wiring is
intentionally supported for Telegram/Discord/Slack and that other channels
(WhatsApp, Linear, Email, AgentMail) do not wire the pairing store per current
design to avoid confusion for future maintainers; reference the channel_type
conditional, the TelegramBot/DiscordBot/SlackBot instantiations and the
_wire_gateway_pairing_store method in your comment.
πͺ Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
βΉοΈ Review info
βοΈ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 88edab21-89ac-4017-bc34-a3d35101d731
π Files selected for processing (8)
src/praisonai/praisonai/agents_generator.pysrc/praisonai/praisonai/bots/_pairing_ui.pysrc/praisonai/praisonai/cli/session/unified.pysrc/praisonai/praisonai/gateway/pairing.pysrc/praisonai/praisonai/gateway/server.pysrc/praisonai/tests/integration/bots/test_pairing_owner_dm.pysrc/praisonai/tests/unit/cli/test_unified_session.pytests/test_wrapper_layer_regression.py
| # Only the configured bot owner may approve or deny pairing requests | ||
| config = getattr(bot_adapter, "config", None) | ||
| expected_owner = getattr(config, "owner_user_id", None) if config else None | ||
| if expected_owner and str(expected_owner) != str(owner_user_id): | ||
| logger.warning( | ||
| "Rejected pairing callback from non-owner user %s (expected %s)", | ||
| owner_user_id, | ||
| expected_owner, | ||
| ) | ||
| return PairingApprovalResult( | ||
| success=False, | ||
| message="Only the bot owner can approve pairing requests", | ||
| ) |
There was a problem hiding this comment.
Consider requiring owner_user_id when pairing approval is enabled.
When expected_owner is None or empty, the authorization check is skipped (line 207), allowing any user to approve pairing requests. While the test suite shows this enables CLI fallback for deployments without a configured owner, it creates a security gap if owner_user_id is accidentally omitted in production.
Consider adding validation during bot initialization to enforce that owner_user_id is set when unknown_user_policy="pair", rather than silently allowing unauthenticated approvals at runtime.
π€ Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/praisonai/praisonai/bots/_pairing_ui.py` around lines 204 - 216, Add a
startup-time validation to fail-fast when pairing approvals are allowed but no
owner is configured: in the bot initialization/validation path (e.g.,
BotAdapter.__init__ or its config validation routine) check
config.unknown_user_policy == "pair" and that config.owner_user_id is
present/non-empty, and if not raise/raise a ConfigurationError or similar so the
process won't run with an unset owner_user_id; this prevents the runtime
authorization bypass around the PairingApprovalResult path that inspects
bot_adapter.config.owner_user_id and owner_user_id.
| self.session_dir = Path(session_dir) if session_dir else DEFAULT_SESSION_DIR | ||
| self.session_dir.mkdir(parents=True, exist_ok=True) | ||
| self._cache: Dict[str, UnifiedSession] = {} | ||
| self._cache_mtimes: Dict[str, float] = {} |
There was a problem hiding this comment.
Type annotation mismatch with st_mtime_ns.
_cache_mtimes is typed as Dict[str, float], but st_mtime_ns returns int (nanoseconds since epoch). This creates a type inconsistency between the annotation and the actual values stored.
π§ Proposed fix
- self._cache_mtimes: Dict[str, float] = {}
+ self._cache_mtimes: Dict[str, int] = {}π Committable suggestion
βΌοΈ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| self._cache_mtimes: Dict[str, float] = {} | |
| self._cache_mtimes: Dict[str, int] = {} |
π€ Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/praisonai/praisonai/cli/session/unified.py` at line 134, The type
annotation for self._cache_mtimes is incorrect: st_mtime_ns returns an int
(nanoseconds) but _cache_mtimes is annotated Dict[str, float]; change the
annotation to Dict[str, int] (or consistently convert values to float seconds if
you prefer) so stored st_mtime_ns values match the declared typeβupdate the
declaration of _cache_mtimes and any code that reads/writes it (references to
_cache_mtimes and calls that use os.stat(...).st_mtime_ns) to use the chosen
integer representation.
|
@claude You are the FINAL architecture reviewer. If the branch is under MervinPraison/PraisonAI (not a fork), you are able to make modifications to this branch and push directly. SCOPE: Focus ONLY on Python packages (praisonaiagents, praisonai). Do NOT modify praisonai-rust or praisonai-ts. Read ALL comments above from Gemini, Qodo, CodeRabbit, and Copilot carefully before responding. Phase 1: Review per AGENTS.md
Phase 2: FIX Valid Issues Phase 3: Final Verdict |
|
Claude finished @MervinPraison's task ββ View job FINAL Architecture ReviewPhase 1: Architecture Review per AGENTS.md
Phase 2: Address Valid Issues from Previous Reviewers
Phase 3: Final Verdict
Architecture Analysisβ Protocol-Driven Design: COMPLIANT
β Backward Compatibility: MAINTAINED
β Performance: NO REGRESSIONS
Critical Issues Found
|
|
Merge conflicts blocking this PR Branch Please:
Do not merge β fix and push only. |
|
Claude finished @MervinPraison's task ββ View job Resolving Merge ConflictsCRITICAL: This is a fork PR - I cannot push changes directly. I'll provide detailed resolution guidance. Todo List:
|
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more β On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
|
@claude Draft + conflicts β likely superseded (PR #1862) Merge state: draft, Hot files: Recommend: compare diff to |
|
Claude finished @MervinPraison's task ββ View job PR Supersession AnalysisBranch: Todo List:
π Supersession Analysis ResultsVERDICT: FULLY SUPERSEDED
|
Greptile SummaryThis PR targets three claimed high-severity bugs: cross-process session cache staleness, async YAML tool resolution missing
Confidence Score: 3/5The pairing security fix is incomplete: it protects configured deployments but leaves those without Two real defects remain on the changed path: the owner check in
|
| Filename | Overview |
|---|---|
| src/praisonai/praisonai/bots/_pairing_ui.py | Adds owner-ID check before accepting pairing callbacks, but the guard is a no-op when owner_user_id is not configured, leaving unconfigured deployments fully vulnerable to the original attack. |
| src/praisonai/praisonai/gateway/server.py | Shares the gateway PairingStore with Telegram, Discord, and Slack bots via _wire_gateway_pairing_store, but WhatsApp and any other bot types in the same switch are left unwired. |
| src/praisonai/praisonai/cli/session/unified.py | Adds mtime-based cache invalidation using st_mtime_ns for cross-process stale-cache detection; logic is correct in the common case but has a narrow TOCTOU window when capturing mtime after the write lock is released. |
| src/praisonai/praisonai/gateway/pairing.py | Adds _reload_if_stale() driven by float-precision os.path.getmtime(); correctly called within self._lock in is_paired() and mtime is refreshed after each _save(). Uses float mtime (vs the session store's st_mtime_ns), which is slightly less precise but adequate for pairing workflows. |
| src/praisonai/praisonai/agents_generator.py | Async YAML tool resolution now calls resolve(tool_name, instantiate=True) matching the sync path; removes redundant manual class-instantiation guard. |
| src/praisonai/tests/unit/cli/test_unified_session.py | New test accurately models the cross-process stale-cache scenario using two separate UnifiedSessionStore instances writing and reading the same session file. |
| src/praisonai/tests/integration/bots/test_pairing_owner_dm.py | Adds test_non_owner_cannot_approve_pairing which covers the rejection path, but doesn't test the case where owner_user_id is absent (bypasses the guard entirely). |
| tests/test_wrapper_layer_regression.py | Regression test verifying _arun_framework calls resolve(..., instantiate=True); correctly uses AsyncMock for the adapter. |
Sequence Diagram
sequenceDiagram
participant TG as Telegram User (attacker)
participant CB as PairingCallbackHandler
participant CFG as bot.config
participant PS as PairingStore (shared)
TG->>CB: "handle_approval_callback(callback_data, owner_user_id="attacker")"
CB->>CB: parse_and_verify_callback() β HMAC passes
CB->>CFG: getattr(config, "owner_user_id", None)
alt owner_user_id IS configured
CFG-->>CB: "real-owner-id"
CB->>CB: ""attacker" != "real-owner-id" β reject"
CB-->>TG: "PairingApprovalResult(success=False)"
else owner_user_id NOT configured
CFG-->>CB: None
CB->>CB: if None β skip check (vulnerability)
CB->>PS: verify_and_pair(code, channel_id, channel_type)
PS-->>CB: True
CB-->>TG: "PairingApprovalResult(success=True)"
end
Comments Outside Diff (1)
-
src/praisonai/praisonai/gateway/server.py, line 1738-1744 (link)WhatsApp bot not wired to the shared pairing store
Telegram, Discord, and Slack bots all receive
_wire_gateway_pairing_store(), but theWhatsAppBotbranch (and any futureelifbranches below it) is left unchanged. If WhatsApp supports pairing-related flows, it will still hold a differentPairingStoreinstance than the gateway, reproducing the desync that this PR intends to fix for the other three channels.
Reviews (1): Last reviewed commit: "fix: critical session cache, async tools..." | Re-trigger Greptile
| if expected_owner and str(expected_owner) != str(owner_user_id): | ||
| logger.warning( | ||
| "Rejected pairing callback from non-owner user %s (expected %s)", | ||
| owner_user_id, | ||
| expected_owner, | ||
| ) | ||
| return PairingApprovalResult( | ||
| success=False, | ||
| message="Only the bot owner can approve pairing requests", | ||
| ) |
There was a problem hiding this comment.
Owner check silently bypassed when
owner_user_id is not configured
The guard if expected_owner and ... short-circuits to False when expected_owner is None or an empty string, meaning any Telegram user can approve pairing in deployments that have not set owner_user_id. The PR description frames this as a "pairing security" fix, but unconfigured deployments retain the exact vulnerability it claims to close. The safer default is to reject all approval callbacks when no owner identity can be verified β if the owner is unknown, there is no one to approve as.
| json_data = json.dumps(session.to_dict(), indent=2).encode('utf-8') | ||
| f.write(json_data) | ||
|
|
||
| # Update cache | ||
| # Update cache and track file mtime for cross-process invalidation | ||
| self._cache[session.session_id] = session | ||
| try: |
There was a problem hiding this comment.
mtime captured in a separate
stat() call after the write completes
There is a narrow window between the f.flush() / os.fsync() inside the lock and the path.stat().st_mtime_ns call below (outside the file lock). A concurrent writer could finish its own write in that window, and the locally cached mtime would then reflect the foreign write rather than this process's write. On the next load(), _is_cache_valid() would return True (the disk mtime matches what we stored), but the in-memory session object would be stale. The exposure is narrow and hard to hit in practice, but it partly undermines the cross-process safety guarantee this PR is adding.
|
@claude Merge conflicts β rebase required (2026-06-12 audit) Status: Fix: git fetch origin
git checkout cursor/critical-bug-investigation-9404
git rebase origin/mainResolve conflicts in these hot files (all touched on this PR):
git push --force-with-leaseThen confirm CI: Overlap with #1861 / draft #1868 on session + agents_generator β prefer one coherent rebase stack rather than three conflicting fixes. |
|
Claude finished @MervinPraison's task ββ View job Analyzing Merge Conflicts and PR StatusBranch: Todo List:
π Supersession Analysis ResultsVERDICT: FULLY SUPERSEDED
|
Rebase required β resolve conflicts in these filesBranch: Steps
Conflict hot files (verified with
|
| File | Guidance (main vs this PR) |
|---|---|
src/praisonai/praisonai/agents_generator.py |
Content conflict. Main (#1884, #1881) unified sync/async prep and adapter lifecycle β keep mainβs structure; port only any unique async-tool or session-cache hooks from this branch. |
src/praisonai/praisonai/cli/session/unified.py |
Content conflict. Main (#1892, #1885, #1854) fixed concurrent save / message loss and pairing-related session paths β merge your cache/security changes into mainβs locking/persistence logic, do not revert #1892. |
src/praisonai/tests/unit/cli/test_unified_session.py |
Content conflict. Main added/extended tests with #1892/#1885 β combine assertions (concurrent save + your regression cases) rather than picking one side. |
tests/test_wrapper_layer_regression.py |
Content conflict. Main (#1896) expanded wrapper regression coverage β keep mainβs tests and add only net-new cases from this PR. |
Also touched (rebase cleanly β re-run tests)
src/praisonai/praisonai/bots/_pairing_ui.py,gateway/pairing.py,gateway/server.pyβ verify against fix: pairing approval channel_type mismatch and UnifiedSessionStore message lossΒ #1892 pairingchannel_typefix before keeping duplicate UI/routes.
Supersession check
Session + pairing fixes largely landed in #1892 and #1885. If rebase leaves only duplicate changes, close this PR and open a narrow follow-up for any remaining gap.
|
Claude finished @MervinPraison's task ββ View job Analyzing Merge Conflicts and PR StatusBranch: Todo List:
π Supersession Analysis ResultsVERDICT: FULLY SUPERSEDED
|
|
@claude Rebase conflict ping (fresh) Please rebase onto Conflicts (merge-tree vs
Do not merge; reply here when rebased and checks pass. |
|
Claude finished @MervinPraison's task ββ View job Analyzing Rebase Conflicts and PR StatusBranch: Todo List:
π Supersession Analysis ResultsVERDICT: FULLY SUPERSEDED
|
|
@claude resolve merge conflict note that you have access to this branch , so you can update and fix src/praisonai/praisonai/agents_generator.py |
|
Claude finished @MervinPraison's task ββ View job Resolving Merge Conflicts and PR AnalysisBranch: Todo List:
π Supersession Analysis ResultsVERDICT: FULLY SUPERSEDED
|
|
@claude resolve merge conflict Before fixing: run an in-depth analysis using multi-agents to understand overlap with recent merges on main (#1849, #1896, #1908, bot PRs #1918β#1922, session fixes on main, etc.) and whether this PR is still needed or should be closed. Then rebase onto
You have access to this branch β update and fix directly, then |
|
Claude finished @MervinPraison's task ββ View job Analyzing Merge Conflicts and PR StatusBranch: Todo List:
π Supersession Analysis ResultsVERDICT: FULLY SUPERSEDED
|
|
@claude resolve merge conflict You have direct access to this branch on MervinPraison/PraisonAI β you can checkout, rebase onto origin/main, edit files, commit, and push --force-with-lease yourself. Do not only analyse or comment; apply the fix on the branch. Fresh audit (2026-06-13T11:47:19Z UTC): Before fixing: run in-depth multi-agent analysis (overlap with #1861 wrapper changes, #1868 draft session fixes, and #1949 session store β close if superseded). Rebase onto
Push when mergeable is CLEAN and CI ( |
|
Claude finished @MervinPraison's task ββ View job Analyzing Merge Conflicts and PR StatusBranch: Todo List:
π Supersession Analysis ResultsVERDICT: FULLY SUPERSEDED
|
Summary
Scheduled critical-bug scan found three high-severity issues in recent commits and fixed them with minimal, targeted changes.
Bugs fixed
1. UnifiedSessionStore cross-process message loss (data loss)
--interactive) could lose messages when one process held a stale in-memory cache and later saved.load()returned_cache[session_id]without checking whether the on-disk file had been updated by another process. File locking on write did not help because the stale reader never re-read disk.st_mtime_nsand invalidate cache onload()when the file is newer than the cached write.2. Async YAML tool resolution regression (user-facing breakage)
await praisonai.arun("agents.yaml")could fail at tool execution for class-based/registry tools that work on the sync path._arun_framework()calledtool_resolver.resolve(name)withoutinstantiate=True, diverging from sync_run_framework()after commit fix: wrapper layer key gaps - native async, template substitution, per-agent LLM (fixes #1758)Β #1794.resolve(tool_name, instantiate=True)in the async path.3. Pairing security / gateway desync
PairingStorethan channel bots, so approved users could remain blocked until restart.handle_approval_callback()never compared clicker ID toconfig.owner_user_id; gateway and bots each constructed their ownPairingStore, andis_paired()never reloaded disk.gateway.pairing_storewith Telegram/Discord/Slack bots; reload pairing store from disk when mtime changes beforeis_paired().Validation
test_load_invalidates_stale_cache_after_external_writetest_arun_framework_uses_instantiate_truetest_non_owner_cannot_approve_pairingOut of scope (reported, not fixed)
allowed_usersallowing all users is intentional (documented + tested).content[0]truncation (silent data loss; separate fix).Summary by CodeRabbit
Release Notes
Bug Fixes
Improvements