MiniMax-M3 - Anthropic-Compatible API
- Use the Anthropic Messages protocol to call the MiniMax-M3 model
- Request / response structure aligns with the Anthropic API
- Multimodal conversation: `content` supports text and image content blocks
- System prompts: Passed via the top-level `system` field
- Thinking mode: Controlled via the `thinking` object; thinking content is returned via a `content[type=thinking]` block
- Streaming output: SSE event stream
- Tool calling: Compatible with the Anthropic `tool_use` / `tool_result` flow
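To illustrate the multimodal point in the list above, here is a minimal sketch of a user message that mixes an image block and a text block. It assumes this endpoint accepts the image block shape used by the Anthropic Messages protocol (a base64 `source` object); the helper name `build_image_message` and the placeholder bytes are hypothetical.

```python
import base64


def build_image_message(prompt: str, image_bytes: bytes,
                        media_type: str = "image/png") -> dict:
    """Build one user message whose content holds an image block and a text block."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            # Image block in the Anthropic Messages shape (assumed to be accepted here).
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": media_type,
                    "data": encoded,
                },
            },
            # Plain text block carrying the question about the image.
            {"type": "text", "text": prompt},
        ],
    }


# Placeholder bytes stand in for a real image file read from disk.
message = build_image_message("Describe this image.", b"placeholder image bytes")
```

The resulting dictionary would be one entry in the request's message list.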
Authorization
All APIs require Bearer Token authentication.

**Get API Key:** Visit the [API Key Management Page](https://starmagic.ai/app/api-keys) to get your API Key.

**Add to request header:**

```
Authorization: Bearer YOUR_API_KEY
```

**Note**: EvoLink uses Bearer Token authentication uniformly for `/v1/messages`.

Request body
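As a rough sketch of how the Bearer header and a JSON request body fit together, the snippet below prepares a POST to `/v1/messages` without sending it. The base URL `https://api.example.com` is a hypothetical placeholder (this page does not state the host), and the body field names `model`, `max_tokens`, and `messages` are assumed from the Anthropic Messages protocol.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical placeholder; substitute the real host
API_KEY = "YOUR_API_KEY"


def build_request(payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST to /v1/messages with Bearer authentication."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Minimal body: field names assumed from the Anthropic Messages protocol.
payload = {
    "model": "MiniMax-M3",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello"}],
}
request = build_request(payload)
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without network access or a real key.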
application/json

`model`: Model to call

Example: "MiniMax-M3"

`max_tokens`: Upper limit for generated content length (in tokens)

**Notes**:
- Recommended value for MiniMax-M3: **131,072** (128K); maximum: **524,288** (512K)
- Tokens generated by thinking also count toward this limit
- Content exceeding the limit is truncated; if generation stops with the `max_tokens` stop reason, try increasing this value
Example: 1024

`messages`: List of conversation messages, alternating user / assistant turns

**Notes**:
- Must contain at least 1 message
- The last message is typically `role=user`
Example:

[
  {
    "role": "user",
    "content": null
  }
]

`system`: System prompt, used to set the AI's role and behavior

**Notes**:
- Supports a string or an array of strings
- Passed via the top-level `system` field
`temperature`: Sampling temperature

**Notes**:
- Range: `[0, 2]`
- Default: 1. Higher values produce more divergent output; lower values produce more deterministic output
Example: 1

`top_p`: Nucleus sampling threshold

**Notes**:
- Range: `[0, 1]`; the MiniMax-M3 default is 0.95
- Adjusting `temperature` and `top_p` at the same time is not recommended
Example: 0.95

`stream`: Whether to return the response via SSE streaming

- `true`: Server-Sent Events streaming response
- `false`: Wait for the complete response before returning (default)
Example: false

`thinking`: Controls deep thinking. When thinking is enabled, thinking blocks must be passed back as-is in multi-turn conversations

**Notes**:
- **Defaults to `adaptive`**: The model decides whether to engage in deep thinking based on problem difficulty
- When enabled, the response `content` array includes a `type="thinking"` reasoning block (billed as output tokens)
Example:

{
  "type": "adaptive"
}

`tools`: Tool definition list

**Notes**:
- Follows the Anthropic tool definition specification
- `input_schema` uses a JSON Schema object
Example:

[
  {
    "name": "string",
    "description": "string",
    "input_schema": {},
    "cache_control": {
      "type": "ephemeral"
    }
  }
]

`tool_choice`: Tool selection strategy. Only `auto` and `none` are supported
Example:

{
  "type": "auto"
}

`metadata`: Request metadata
Example:

{
  "user_id": "string"
}

Response
application/json

Response body
`id`: Unique message ID

Example: "string"

`type`: Response object type

Example: "message"

`role`

Example: "assistant"

`model`: Model actually used

Example: "MiniMax-M3"

`content`: Response content block list

**Possible block types**:
- `thinking`: Reasoning process (only when thinking is active)
- `text`: Final answer text
- `tool_use`: Tool call initiated by the model
Example:

[
  {
    "type": "text",
    "text": "string",
    "thinking": "string",
    "signature": "string",
    "id": "string",
    "name": "string",
    "input": {}
  }
]

`stop_reason`: Stop reason

- `end_turn`: Natural completion
- `max_tokens`: Reached the `max_tokens` limit
- `tool_use`: Model triggered a tool call

Example: "end_turn"

`usage`: Token usage statistics (Anthropic specification)
Example:

{
  "input_tokens": 7,
  "output_tokens": 77,
  "cache_creation_input_tokens": 0,
  "cache_read_input_tokens": 0
}
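The response handling described above can be sketched as follows. The sketch splits content blocks by type, flags output cut off by the token limit, passes the assistant content back unchanged for the next turn, and wraps a tool output for the follow-up request. It assumes the response keys `content` and `stop_reason` and the `tool_result` block shape from the Anthropic Messages protocol. The helper names and the sample response (including its `id`, `signature`, and tool values) are hypothetical illustrations, not captured API output.

```python
def summarize_response(response: dict) -> dict:
    """Split content blocks by type and flag output cut off by the token limit."""
    blocks = response.get("content", [])
    return {
        "thinking": [b["thinking"] for b in blocks if b.get("type") == "thinking"],
        "text": "".join(b["text"] for b in blocks if b.get("type") == "text"),
        "tool_calls": [b for b in blocks if b.get("type") == "tool_use"],
        "truncated": response.get("stop_reason") == "max_tokens",
    }


def append_assistant_turn(messages: list, response: dict) -> list:
    """Return a new history with the assistant content passed back unchanged,
    so thinking blocks and their signatures survive into the next request."""
    return messages + [{"role": "assistant", "content": response["content"]}]


def tool_result_message(tool_use_block: dict, output: str) -> dict:
    """Wrap a tool's output as a user turn that answers one tool_use block."""
    return {
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": tool_use_block["id"],
                "content": output,
            }
        ],
    }


# Hypothetical sample response, shaped like the fields documented above.
sample = {
    "id": "msg_example",
    "type": "message",
    "role": "assistant",
    "model": "MiniMax-M3",
    "content": [
        {"type": "thinking", "thinking": "Need the weather tool.", "signature": "sig"},
        {"type": "text", "text": "Let me check."},
        {"type": "tool_use", "id": "call_1", "name": "get_weather", "input": {"city": "Paris"}},
    ],
    "stop_reason": "tool_use",
    "usage": {"input_tokens": 7, "output_tokens": 77},
}

summary = summarize_response(sample)
history = append_assistant_turn([{"role": "user", "content": "Weather in Paris?"}], sample)
history.append(tool_result_message(summary["tool_calls"][0], "18 C, cloudy"))
```

Reusing `response["content"]` as-is, rather than rebuilding it from the extracted text, is what keeps the thinking blocks intact across turns.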
