
add support for latest Gemini preview model ids #424

Open
w0wl0lxd wants to merge 1 commit into BeehiveInnovations:main from w0wl0lxd:feat/latest-gemini-preview-models

Conversation

w0wl0lxd commented Apr 5, 2026

Summary

  • add the latest Gemini preview model ids to PAL's Gemini registry, including gemini-3.1-pro-preview, gemini-3-flash-preview, and gemini-3.1-flash-lite-preview
  • add regression tests covering alias resolution and GOOGLE_ALLOWED_MODELS allowlists for the new canonical model names
  • align requires-python with the effective dependency floor so local uv run resolution succeeds consistently
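
The alias-resolution and allowlist coverage described above could be sketched roughly as follows. The registry contents, alias names, and helper functions here are illustrative stand-ins, not PAL's actual API; the real tests live in tests/test_supported_models_aliases.py and tests/test_model_restrictions.py.

```python
# Illustrative sketch of the regression coverage described above.
# GEMINI_REGISTRY and resolve_alias are hypothetical stand-ins for
# PAL's real Gemini registry and alias-resolution code.

GEMINI_REGISTRY = {
    "gemini-3.1-pro-preview": {"aliases": ["pro-preview"]},
    "gemini-3-flash-preview": {"aliases": ["flash-preview"]},
    "gemini-3.1-flash-lite-preview": {"aliases": ["flash-lite-preview"]},
}

def resolve_alias(name: str) -> str:
    """Map an alias (or canonical id) to its canonical model id."""
    for canonical, cfg in GEMINI_REGISTRY.items():
        if name == canonical or name in cfg["aliases"]:
            return canonical
    raise ValueError(f"unknown model: {name}")

def test_new_preview_ids_resolve():
    # Canonical ids and aliases should both resolve to the canonical name.
    assert resolve_alias("gemini-3.1-pro-preview") == "gemini-3.1-pro-preview"
    assert resolve_alias("flash-preview") == "gemini-3-flash-preview"

def test_allowlist_accepts_new_ids():
    # A GOOGLE_ALLOWED_MODELS-style comma list of the new ids should
    # resolve entirely to known registry entries (no startup failure).
    raw = "gemini-3.1-pro-preview,gemini-3-flash-preview"
    allowed = {resolve_alias(m) for m in raw.split(",")}
    assert allowed <= set(GEMINI_REGISTRY)

test_new_preview_ids_resolve()
test_allowlist_accepts_new_ids()
```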

Why

Google's current Gemini API docs and examples use the newer 3.1/3.0 preview model ids, but PAL's bundled Gemini registry still only recognizes older entries such as gemini-3-pro-preview. That causes valid GOOGLE_ALLOWED_MODELS configurations to fail startup in auto mode because PAL treats the latest official model ids as unknown.

Validation

  • uv run --directory /home/w0w/dev/pal-mcp-server --with pytest pytest tests/test_supported_models_aliases.py tests/test_model_restrictions.py
  • env PATH="/etc/profiles/per-user/w0w/bin:/usr/local/bin:/usr/bin:/bin" GEMINI_API_KEY=*** DEFAULT_MODEL=auto GOOGLE_ALLOWED_MODELS="gemini-3.1-pro-preview,gemini-3-flash-preview" LOG_LEVEL=INFO /etc/profiles/per-user/w0w/bin/uv run --directory /home/w0w/dev/pal-mcp-server pal-mcp-server --help

Google's current Gemini API exposes 3.1 Pro and 3 Flash preview model IDs that PAL does not recognize yet, which causes valid allowlists to fail during startup. Add the new Gemini registry entries with regression coverage and align the declared Python floor with the mcp dependency requirements so local runs resolve cleanly.
gemini-code-assist (bot) left a comment


Code Review

This pull request adds support for new Gemini preview models (gemini-3.1-pro-preview, gemini-3-flash-preview, and gemini-3.1-flash-lite-preview) by updating the model configuration and adding corresponding tests for model restrictions and aliases. It also bumps the minimum Python version to 3.10. A potential issue was identified in the configuration for gemini-3.1-flash-lite-preview, which enables extended thinking but lacks the required max_thinking_tokens field, rendering the feature unusable.

"description": "Latest Gemini 3.1 Flash-Lite preview for low-cost, high-volume multimodal workloads",
"context_window": 1048576,
"max_output_tokens": 65536,
"supports_extended_thinking": true,

Severity: medium

The model gemini-3.1-flash-lite-preview has supports_extended_thinking set to true, but it is missing the max_thinking_tokens field. In providers/gemini.py, the thinking budget calculation (lines 196-202) requires max_thinking_tokens > 0. Without this field, the model will claim to support extended thinking but will never actually use it, because the budget calculation is skipped. Given that other 'Lite' models (such as gemini-2.0-flash-lite on line 170) do not support this feature, please verify whether this should be false or whether a max_thinking_tokens value needs to be added.
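
The gating behavior the reviewer describes can be sketched roughly as follows. The function and constant names, and the mode-to-fraction mapping, are hypothetical; the actual logic lives in providers/gemini.py and may differ in detail.

```python
# Sketch of the thinking-budget gating described in the review above.
# resolve_thinking_budget and THINKING_MODE_BUDGETS are hypothetical
# names; the fractions per mode are illustrative assumptions.

THINKING_MODE_BUDGETS = {"minimal": 0.05, "low": 0.2, "medium": 0.5, "high": 0.8}

def resolve_thinking_budget(capabilities: dict, thinking_mode: str) -> int:
    """Return a thinking-token budget, or 0 when thinking is skipped."""
    if not capabilities.get("supports_extended_thinking"):
        return 0
    max_tokens = capabilities.get("max_thinking_tokens", 0)
    if max_tokens <= 0:
        # This is the silent failure mode flagged in the review: the entry
        # advertises extended thinking but defines no budget, so the
        # thinking config is never sent.
        return 0
    return int(max_tokens * THINKING_MODE_BUDGETS.get(thinking_mode, 0.5))

# A flash-lite-style entry without max_thinking_tokens yields a zero budget:
lite = {"supports_extended_thinking": True}
print(resolve_thinking_budget(lite, "high"))  # 0
```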

chatgpt-codex-connector (bot) left a comment


πŸ’‘ Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d14c983829


"description": "Latest Gemini 3.1 Flash-Lite preview for low-cost, high-volume multimodal workloads",
"context_window": 1048576,
"max_output_tokens": 65536,
"supports_extended_thinking": true,


P2: Define thinking budget for flash-lite preview entry

This new model is marked with supports_extended_thinking: true, but the same object omits max_thinking_tokens. In GeminiModelProvider.generate_content (providers/gemini.py), the thinking config is only sent when max_thinking_tokens > 0, so thinking_mode will be silently ignored for gemini-3.1-flash-lite-preview even though capability metadata advertises thinking support. That can cause inconsistent behavior in auto/reasoning flows that choose this model based on the thinking flag.
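
One possible fix along the lines both reviewers suggest is to add an explicit budget to the new entry. The 24576 value below is purely illustrative (it mirrors budgets used by earlier Flash-family entries) and would need to be confirmed against Google's documented limits for this preview model; alternatively, supports_extended_thinking could be set to false, matching the precedent of other Lite models.

```json
{
  "gemini-3.1-flash-lite-preview": {
    "description": "Latest Gemini 3.1 Flash-Lite preview for low-cost, high-volume multimodal workloads",
    "context_window": 1048576,
    "max_output_tokens": 65536,
    "supports_extended_thinking": true,
    "max_thinking_tokens": 24576
  }
}
```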


