Gemini 2.5 Flash - OpenAI SDK - API Reference
- Call gemini-2.5-flash model using OpenAI SDK format
- Synchronous processing mode, returns conversation content in real-time
- Plain text conversation: Single-turn or multi-turn contextual dialogue, see simple_text and multi_turn examples in code samples
- System prompt: Customize AI role and behavior, see system_prompt example in code samples
- Multimodal input: Supports text + image mixed input, see vision and multi_image examples in code samples
Authorization
##All APIs require Bearer Token authentication## **Get API Key:** Visit [API Key Management Page](https://starmagic.ai/app/api-keys) to get your API Key **Add to request header:** ``` Authorization: Bearer YOUR_API_KEY ```
Authorization: Bearer YOUR_API_KEYRequest body
application/jsonChat model name
"gemini-2.5-flash"List of chat messages, supports multi-turn dialogue and multimodal input
[
{
"role": "user",
"content": null,
"tool_call_id": "string"
}
]Whether to return response in streaming mode - `true`: Streaming return, receives content in real-time chunks - `false`: Returns complete response at once
falseMaximum number of completion tokens for the generated response, corresponding to Gemini's maxOutputTokens.
2000Maximum number of tokens for the generated response, compatible with the legacy OpenAI parameter.
2000Sampling temperature, controls output randomness **Description**: - Lower values (e.g., 0.2): More deterministic, focused output - Higher values (e.g., 1.5): More random, creative output
0.7Nucleus Sampling parameter **Description**: - Controls sampling from tokens with cumulative probability - For example, 0.9 means selecting from tokens with cumulative probability up to 90% - Default: 1.0 (considers all tokens) **Recommendation**: Do not adjust temperature and top_p simultaneously
0.9Frequency penalty coefficient. Range: -2.0 to 2.0. Corresponds to Gemini's frequencyPenalty.
0Presence penalty coefficient. Range: -2.0 to 2.0. Corresponds to Gemini's presencePenalty.
0Stop sequences. Supports a string or string array, corresponding to Gemini's stopSequences.
Number of generated candidates.
1Limits reasoning effort. Gemini 2.5 Flash and Flash Lite support none to disable thinking; low/medium/high map to different reasoning budgets.
"medium"Random seed used to make output as reproducible as possible, corresponding to Gemini's seed.
12345Whether to return token logprob information, corresponding to Gemini's responseLogprobs.
trueNumber of top logprob values returned for each token, corresponding to Gemini's logprobs.
5Response format settings, supporting JSON mode and JSON Schema, corresponding to Gemini's responseMimeType, responseSchema and responseJsonSchema.
Streaming response options. Can be set when stream is true.
{
"include_usage": true
}List of tool definitions for Function Calling.
[
{
"type": "function",
"function": {
"name": "string",
"description": "string",
"parameters": {}
}
}
]Controls tool-calling behavior.
Gemini extension parameters.
{
"google": {
"cached_content": "string",
"thinking_config": {}
}
}Response
application/jsonResponse body
Unique identifier for the chat completion
"chatcmpl-20251010015944503180122WJNB8Eid"Model name actually used
"gemini-2.5-flash"Response type
"chat.completion"Creation timestamp
1760032810List of chat completion choices
[
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm pleased to introduce myself.\n\nI'm a Large Language Model, trained and developed by Google.\n\nSimply put, you can think of me as a \"smart brain\" that has been trained on massive amounts of text data and is able to understand and generate human language. My core capability is processing and generating text. Specifically, I can do the following:\n\n**1. Information Query & Knowledge Answering**\nI can act like a \"talking encyclopedia,\" answering various questions, whether they're about scientific knowledge, historical events, or everyday facts.\n\n**2. Creative Writing & Text Generation**\nI can create various types of text based on your requirements, such as:\n* **Writing**: Poetry, stories, scripts, emails, speeches, advertising copy, etc.\n* **Planning**: Travel plans, study outlines, event proposals, etc.\n* **Brainstorming**: Working with you to generate new ideas and spark creativity.\n\n**3. Translation & Language Processing**\nI'm proficient in multiple languages and can provide fast, fluent translation services. I can also help you polish, proofread, summarize, or rewrite text to make your expression clearer and more professional.\n\n**4. Programming & Code Assistance**\nI can write code snippets, explain code logic, debug errors, or \"translate\" code from one programming language to another, making me a helpful companion for programmers.\n\n**5. Logical Analysis & Reasoning**\nI can help you analyze complex problems, organize logical chains, and make inferences and summaries based on the information you provide.\n\n---\n\n**In summary**, my goal is to be a powerful and useful tool that helps you obtain information more efficiently, complete tasks, and spark creativity through natural language communication.\n\n**Remember:** I'm an artificial intelligence, my knowledge comes from the data I've learned, and it may not be the most up-to-date. Sometimes I may also make mistakes, so for very important information, I recommend you verify it again.",
"tool_calls": [
null
]
},
"logprobs": {
"content": [
{
"token": null,
"logprob": null,
"bytes": null,
"top_logprobs": null
}
]
},
"finish_reason": "stop"
}
]Token usage statistics
{
"prompt_tokens": 13,
"completion_tokens": 1891,
"total_tokens": 1904,
"prompt_tokens_details": {
"cached_tokens": 0,
"text_tokens": 13,
"audio_tokens": 0,
"image_tokens": 0
},
"completion_tokens_details": {
"text_tokens": 0,
"audio_tokens": 0,
"reasoning_tokens": 1480
},
"input_tokens": 0,
"output_tokens": 0,
"input_tokens_details": null
}
