8000
Skip to content

fix(studio): ensure first chat tool call starts in session sandbox#4810

Merged
danielhanchen merged 8 commits intounslothai:mainfrom
neodon:fix/studio-sandbox-thread-init
Apr 3, 2026
Merged

fix(studio): ensure first chat tool call starts in session sandbox#4810
danielhanchen merged 8 commits intounslothai:mainfrom
neodon:fix/studio-sandbox-thread-init

Conversation

@neodon
Copy link
Copy Markdown
Contributor
@neodon neodon commented Apr 3, 2026

Fixes #4809

Summary

On a new Studio chat, the first tool call can start before the frontend initializes the thread ID.

That means the first request can go out without a session_id, so the backend starts the tool in the shared sandbox root instead of the chat's session sandbox. After the thread finishes initializing, later tool calls use the correct session sandbox.

Changes

Frontend

  • eagerly initialize the thread when switching to a new chat
  • resolve the thread ID once at request time using unstable_threadId first and activeThreadId as a fallback
  • use the resolved thread ID consistently for threadKey and session_id

Backend

  • use ~/studio_sandbox/_default when a request arrives without a session_id

The backend change is only a safety net to prevent access to all session sandboxes when session_id is not provided.

Test plan

  • Start a new chat in Studio
  • Prompt touch tmp.txt
  • Confirm the file is created at ~/studio_sandbox/<session_id>/tmp.txt
  • Confirm the file is not created at ~/studio_sandbox/tmp.txt

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@neodon neodon force-pushed the fix/studio-sandbox-thread-init branch from c5f6ca0 to bebcb4c Compare April 3, 2026 00:30
@neodon neodon marked this pull request as ready for review April 3, 2026 11:49
@neodon neodon force-pushed the fix/studio-sandbox-thread-init branch from bebcb4c to 17c0e32 Compare April 3, 2026 11:50
Copy link
Copy Markdown
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bebcb4c42d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copy link
Copy Markdown
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4577fcbcd1

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".


// Re-read store after potential auto-load / model ready wait
runtime = useChatRuntimeStore.getState();
resolvedThreadId = unstable_threadId ?? runtime.activeThreadId ?? undefined;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Guard session fallback against stale active thread

resolvedThreadId now falls back to runtime.activeThreadId when unstable_threadId is missing, but during a new-chat transition activeThreadId can still point to the previous thread (it is driven by ActiveThreadSync from mainThreadId in runtime-provider.tsx and only later overwritten after ThreadNewChatSwitch finishes initialize()). In that window, the first run can send session_id for the old conversation, so tool execution and adapter selection happen in the wrong thread sandbox and can mix state/files across chats. This regression is triggered by quickly sending the first message after creating a new chat, before the new remote thread ID is finalized.

Useful? React with 👍 / 👎.

@neodon neodon force-pushed the fix/studio-sandbox-thread-init branch from d1ca422 to 4577fcb Compare April 3, 2026 12:08
@neodon
Copy link
Copy Markdown
Contributor Author
neodon commented Apr 3, 2026

I temporarily reverted the last commit so I can take a closer look at it.

Copy link
Copy Markdown
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d1ca4226eb

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +705 to +707
return () => {
cancelled = true;
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Clear init flag when new-chat switch is cancelled

If ThreadNewChatSwitch unmounts before initialize() resolves (for example, the user clicks New Chat and immediately selects another thread), cleanup only flips cancelled and never resets newThreadInitializing. Because both success and error paths guard setNewThreadInitializing(false) behind !cancelled, the flag can stay true indefinitely; then ActiveThreadSync keeps bailing out and activeThreadId stops updating, which can route later tool calls with a missing/stale session_id.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor
@danielhanchen danielhanchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR! The goal of this PR is to fix the first tool call in a new Studio chat starting outside the session sandbox. As a summary, this PR resolves the thread ID eagerly at request time using unstable_threadId/activeThreadId as a fallback, eagerly initializes the thread when switching to a new chat, and adds a backend safety net to use a _default subdirectory instead of the sandbox root.

I pushed a follow-up commit (ce0becc) that fixes three issues found during review:

Reviewers Severity Finding
9/10 High ActiveThreadSync can repopulate stale activeThreadId from previous chat during new-thread initialization, routing first tool call to the wrong sandbox
7/10 High resolvedThreadId was re-read after model load waits, allowing it to drift if the user switches chats during auto-load
5/10 Medium initialize() in the fire-and-forget IIFE had no error handling, leaving unhandled rejections and inconsistent store state

Changes in the follow-up commit:

  1. Disable ActiveThreadSync during new-chat creation -- pass !newThreadNonce to the enabled prop so the old mainThreadId does not get written back into activeThreadId while initialize() is resolving.

  2. Clear activeThreadId synchronously at the start of the ThreadNewChatSwitch effect, before the async IIFE. This ensures the adapter never picks up a stale thread ID from a previous chat during the initialization window.

  3. Make resolvedThreadId stable -- captured once with const at the start of run() instead of re-read after model load. This prevents the request from being retargeted if the user switches chats while waiting for auto-load.

  4. Add try/catch around the async initialize() call so failures are logged and the store is reset to null rather than left in a broken state.

@danielhanchen
Copy link
Copy Markdown
Contributor

/gemini review

Copy link
Copy Markdown
Contributor
@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the backend to use a _default subdirectory for shared sandboxes and improves frontend thread ID stability during chat initialization. The frontend now captures the thread ID at the start of the stream adapter and manages active thread IDs more robustly when creating new chats. Feedback was provided regarding the fallback session ID in the frontend, which should be updated to match the backend's _default directory name to prevent 404 errors when retrieving images.

if (imgIdx !== -1) {
const text = rawResult.slice(0, imgIdx);
const sessionId = unstable_threadId || "";
const sessionId = resolvedThreadId || "";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To ensure that images generated during tool calls without a session ID can be correctly retrieved from the backend, the fallback sessionId should match the backend's _default directory name. Currently, it defaults to an empty string, which causes a 404. Per repository rules, when duplicating default values between frontend and backend, define them as named constants and add comments to cross-reference them to ensure they stay in sync.

const DEFAULT_SESSION_ID = "_default"; // Matches DEFAULT_SESSION_ID in studio/backend/app/constants.py
const sessionId = resolvedThreadId || DEFAULT_SESSION_ID;
References
  1. When default values must be duplicated between frontend and backend, define them as named constants in both places and add comments to cross-reference them, rather than creating a shared API endpoint.

Copy link
Copy Markdown
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ce0becc504

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

// Capture the thread ID once at the start so it stays stable even if
// the user switches chats while waiting for model load / auto-load.
const resolvedThreadId =
unstable_threadId ?? runtime.activeThreadId ?? undefined;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Gate active-thread fallback to single-chat requests

The new resolvedThreadId fallback can misroute compare runs into an unrelated single-chat sandbox. In compare mode, activeThreadId is not synchronized (ActiveThreadSync is disabled when pairId is set in runtime-provider.tsx), so it commonly retains the last single-chat thread ID. When a compare run starts before unstable_threadId is populated (the same missing-thread window this patch targets), this line falls back to that stale ID and then sends it as session_id / adapter lookup, which can mix tool state/files across conversations.

Useful? React with 👍 / 👎.

@neodon
Copy link
Copy Markdown
Contributor Author
neodon commented Apr 3, 2026

I revised the last commit I made (and reverted) and ran some testing against it. I'll reconcile it with your changes and rerun my testing, then push it.

@neodon neodon marked this pull request as draft April 3, 2026 13:22
@neodon
Copy link
Copy Markdown
Contributor Author
neodon commented Apr 3, 2026

@danielhanchen Apologies, but I'm having trouble with us both working on the branch at the same time. Can you pause until I re-mark the PR Ready for Review?

@neodon
Copy link
Copy Markdown
Contributor Author
neodon commented Apr 3, 2026

Pardon the churn. I'm pretty new at contributing.

Your changes fix the issue with a simpler approach, and I see you are also working on sandbox hardening. I'll mark the PR Ready for Review and if I come across anything else I'll open a separate issue/PR. I appreciate the engagement!

@neodon neodon marked this pull request as ready for review April 3, 2026 13:31
Neodon and others added 6 commits April 3, 2026 08:34
- Clear activeThreadId synchronously when switching to a new chat so the
  adapter never picks up the previous thread's ID during initialization
- Disable ActiveThreadSync while a new-chat nonce is active to prevent it
  from re-populating a stale mainThreadId before initialize() resolves
- Capture resolvedThreadId once at the start of run() and keep it stable
  through model load / auto-load waits
- Add try/catch around the async initialize() call so failures are logged
  and do not leave the store in an inconsistent state
When resolvedThreadId is undefined (during new-chat initialization),
the image sessionId now falls back to "_default" instead of "" to match
the backend's _get_workdir fallback directory. This ensures tool-generated
images can be retrieved even if the first tool call fires before thread
initialization completes.
Prevents compare panes from inheriting a stale single-chat thread ID
as a fallback for session_id routing. Without this, the first compare
tool call could route to the previous single-chat's sandbox.
7292
Both handleNewCompare and enterCompare transition into compare mode,
but only handleNewCompare was clearing activeThreadId. This makes
enterCompare consistent so compare panes never inherit a stale
single-chat thread ID for session_id routing.
@neodon neodon force-pushed the fix/studio-sandbox-thread-init branch from a7846cc to badd9ef Compare April 3, 2026 13:35
restore context usage from saved view on exit

- Clear activeThreadId in the training handoff compare path so it is
  consistent with handleNewCompare and enterCompare
- In exitCompare, read the thread ID from viewBeforeCompare instead of
  the store, since activeThreadId may have been cleared on compare entry
@danielhanchen
Copy link
Copy Markdown
Contributor

/gemini review

Copy link
Copy Markdown
Contributor
@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request improves session and thread ID management to ensure consistent sandbox routing and prevent state leakage between chat sessions. Key changes include defaulting backend workdirs to a _default subdirectory, capturing stable thread IDs at the start of inference in the frontend, and clearing active thread IDs when entering comparison modes or initializing new chats. One piece of feedback suggests refining the cleanup logic in the ThreadNewChatSwitch component's effect to more robustly handle potential race conditions during asynchronous initialization.

Comment on lines +691 to +704
void (async () => {
try {
aui.threads().switchToNewThread();
const { remoteId } = await aui.threadListItem().initialize();
if (!cancelled) {
useChatRuntimeStore.getState().setActiveThreadId(remoteId);
}
} catch (error) {
if (!cancelled) {
useChatRuntimeStore.getState().setActiveThreadId(null);
}
console.error("Failed to initialize new chat thread", error);
}
})();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The async IIFE inside the useEffect hook is not properly awaited or handled in a way that ensures the cleanup logic (setting activeThreadId to null) is consistently applied if the component unmounts during the initialization process. While the cancelled flag is used, it is safer to ensure that the cleanup logic is explicitly handled in the return function of the effect to avoid potential race conditions or stale state.

@danielhanchen
Copy link
Copy Markdown
Contributor

/gemini review

Copy link
Copy Markdown
Contributor
@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request improves session and thread ID management across the frontend and backend to ensure consistent sandbox directory usage. Key changes include defaulting to a '_default' sandbox directory in the backend, stabilizing thread IDs in the chat adapter, and ensuring active thread IDs are cleared during compare mode or new thread initialization to prevent stale state. I have reviewed the suggested fix for the resolvedThreadId initialization in the chat adapter and agree that coercing falsy values to undefined is necessary to prevent inconsistent behavior between tool calls and image URL construction.

Comment on lines +426 to +427
const resolvedThreadId =
unstable_threadId ?? runtime.activeThreadId ?? undefined;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

An empty string for threadId can cause inconsistent behavior. If resolvedThreadId is "":

  • session_id sent to the backend for tool calls will be "". The backend will then use an _invalid sandbox directory.
  • sessionId used for constructing image URLs will become "_default" because "" || "_default" evaluates to "_default".

This mismatch will lead to broken image URLs.

By coercing a falsy threadId (like an empty string) to undefined, we ensure that it's handled consistently as the default case (no session) throughout the adapter.

Suggested change
const resolvedThreadId =
unstable_threadId ?? runtime.activeThreadId < 992E span class="pl-c1">?? undefined;
const resolvedThreadId = (unstable_threadId ?? runtime.activeThreadId) || undefined;

An empty string threadId would cause a mismatch: the backend sanitizes
it to _invalid while the frontend image URL falls back to _default.
Using || instead of ?? ensures any falsy value (including "") is
treated as the no-session case consistently.
@danielhanchen
Copy link
Copy Markdown
Contributor

/gemini review

Copy link
Copy Markdown
Contributor
@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request improves thread ID management and session routing consistency across the frontend and backend. Key changes include stabilizing the thread ID during model loading, ensuring compare panes do not inherit stale thread IDs, and updating the backend to use a '_default' directory for unassigned sessions. I have no feedback to provide on the current implementation.

Copy link
Copy Markdown
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d6054a344c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

// Capture the thread ID once at the start so it stays stable even if
// the user switches chats while waiting for model load / auto-load.
const resolvedThreadId =
(unstable_threadId ?? runtime.activeThreadId) || undefined;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Restrict active-thread fallback for sidebar compare switches

resolvedThreadId now falls back to runtime.activeThreadId for all runs, but there is still a compare-entry path where that store value is stale: selecting an existing compare conversation from the sidebar goes through handleThreadSelect (in chat-page.tsx), which only calls setView(nextView) and does not clear activeThreadId. If a compare run starts before its pane has unstable_threadId (the startup window this patch is addressing), this fallback sends the previous single-chat ID as session_id, routing tool execution/files into the wrong sandbox. Fresh evidence: only handleNewCompare, enterCompare, and the training handoff path clear activeThreadId; sidebar selection does not.

Useful? React with 👍 / 👎.

@danielhanchen danielhanchen merged commit c027ec1 into unslothai:main Apr 3, 2026
1 check passed
shibizhao pushed a commit to shibizhao/unsloth-npu that referenced this pull request Apr 7, 2026
…nslothai#4810)

Fixes unslothai#4809

On a new Studio chat, the first tool call could start before the frontend
initializes the thread ID. That meant the first request could go out without
a session_id, so the backend started the tool in the shared sandbox root
instead of the chat's session sandbox.

Frontend:
- Eagerly initialize the thread when switching to a new chat
- Resolve the thread ID once at request time and keep it stable through
  async model-load waits
- Disable ActiveThreadSync during new-chat initialization to prevent
  stale thread IDs from being written back
- Add error handling for thread initialization failures
- Clear activeThreadId on all compare-mode entry paths to prevent
  cross-session leakage
- Fix exitCompare to restore context usage from the saved view
- Coerce falsy thread IDs to undefined for consistent backend/frontend
  fallback behavior
- Use _default as the image sessionId fallback to match the backend

Backend:
- Use ~/studio_sandbox/_default when a request arrives without a session_id
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] First tool call in a new Studio chat starts outside the session sandbox

2 participants

0