Install the dependencies:
pip3 install openai aws-bedrock-token-generator
Verify that the dependencies installed correctly:
python3 -c "import openai; print(f'OpenAI SDK version: {openai.__version__}')"
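The output should look similar to: OpenAI SDK version: 2.43.0
Generate a token:
# 1. Set the AWS region
export AWS_REGION="us-east-1"
# 2. Use the Bedrock OpenAI-compatible endpoint as the base URL
export OPENAI_BASE_URL="https://bedrock-mantle.${AWS_REGION}.api.aws/openai/v1"
# 3. Choose the model (GPT-5.5 invoked through Bedrock)
export MODEL_ID="openai.gpt-5.5"
# 4. Generate a temporary token from your AWS credentials and use it as the OpenAI API key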
export OPENAI_API_KEY=$(python3 -c "from aws_bedrock_token_generator import provide_token; print(provide_token(region='$AWS_REGION'))")
The bearer token is valid for about 12 hours, so long-running jobs need to generate a new one before it expires.
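Because the token expires, a long-running process may want to cache it and regenerate it shortly before the deadline. The following is a minimal sketch, not part of either library: the CachedToken class, its parameter names, and the 15-minute refresh margin are hypothetical, and the 12-hour lifetime is taken from the approximate figure above rather than read from the token itself.

```python
import time
from typing import Callable, Optional


class CachedToken:
    """Caches a bearer token and regenerates it shortly before it expires."""

    def __init__(
        self,
        generate: Callable[[], str],
        lifetime_seconds: float = 12 * 60 * 60,   # assumption: roughly 12 hours
        refresh_margin_seconds: float = 15 * 60,  # assumption: refresh 15 minutes early
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._generate = generate
        self._lifetime = lifetime_seconds
        self._margin = refresh_margin_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._issued_at = 0.0

    def get(self) -> str:
        """Return the cached token, generating a new one if it is missing or stale."""
        now = self._clock()
        age = now - self._issued_at
        if self._token is None or age >= self._lifetime - self._margin:
            self._token = self._generate()
            self._issued_at = now
        return self._token
```

One way to use it would be CachedToken(lambda: provide_token(region="us-east-1")), calling get() each time a client is created. The generator is passed in as a callable so that the refresh logic can be exercised without AWS credentials.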
Send a request:
curl -s "${OPENAI_BASE_URL}/responses" \
-H "Authorization: Bearer ${OPENAI_API_KEY}" \
-H "Content-Type: application/json" \
-d "{\"model\": \"${MODEL_ID}\", \"input\": \"Respond with a fun fact about dinosaurs.\"}" \
| jq '{status, error, text: ([.output[]? | select(.type=="message") | .content[].text] | join("\n")), usage}'

The same kind of request can be made from Python with the OpenAI SDK:

import os
from openai import OpenAI
from aws_bedrock_token_generator import provide_token
region = "us-east-1"
os.environ["OPENAI_BASE_URL"] = f"https://bedrock-mantle.{region}.api.aws/openai/v1"
os.environ["OPENAI_API_KEY"] = provide_token(region=region)
client = OpenAI()
response = client.responses.create(
    model="openai.gpt-5.5",
    input=[
        {"role": "user", "content": "Hello! How can you help me today?"}
    ]
)
print(response.output_text)
The script prints the model's reply to the terminal.

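The Responses API is stateful by default. Each response is stored on the server and can be referenced by ID in later requests, so there is no need to resend the full conversation history.
Copy the following code into a file named openai_multi_turn.py: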
import os
from openai import OpenAI
from aws_bedrock_token_generator import provide_token
region = "us-east-1"
os.environ["OPENAI_BASE_URL"] = f"https://bedrock-mantle.{region}.api.aws/openai/v1"
os.environ["OPENAI_API_KEY"] = provide_token(region=region)
client = OpenAI()
# First turn
response1 = client.responses.create(
    model="openai.gpt-5.5",
    input="What is the capital of France?",
    store=True
)
print(f"Turn 1: {response1.output_text}")
# Follow-up: the server recalls the full conversation context
response2 = client.responses.create(
    model="openai.gpt-5.5",  # the model ID must be a quoted string
    input="What is its population?",
    previous_response_id=response1.id,
    store=True
)
print(f"Turn 2: {response2.output_text}")
Run the file with python3 openai_multi_turn.py. Note that store=True is the default, so it can be omitted.
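The two-turn pattern extends to any number of turns by carrying the latest response ID forward. The sketch below illustrates that chaining. The run_conversation helper is hypothetical and not part of the SDK; it only assumes a client object exposing responses.create(...) and responses with id and output_text attributes, as used in the script above.

```python
from typing import Any, List, Optional


def run_conversation(client: Any, model: str, prompts: List[str]) -> List[str]:
    """Send prompts in order, linking each request to the previous response."""
    replies: List[str] = []
    previous_id: Optional[str] = None
    for prompt in prompts:
        kwargs = {"model": model, "input": prompt}
        if previous_id is not None:
            # Only follow-up turns reference an earlier response.
            kwargs["previous_response_id"] = previous_id
        response = client.responses.create(**kwargs)
        replies.append(response.output_text)
        previous_id = response.id  # the next turn builds on this response
    return replies
```

With the client from the script above, run_conversation(client, "openai.gpt-5.5", ["What is the capital of France?", "What is its population?"]) would be expected to reproduce the two turns. store is left at its default here, on the assumption stated above that it defaults to True.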

To stream a response, pass stream=True as an additional parameter.
Copy the following code into a file named openai_stream_response.py:
import os
from openai import OpenAI
from aws_bedrock_token_generator import provide_token
region = "us-east-1"
os.environ["OPENAI_BASE_URL"] = f"https://bedrock-mantle.{region}.api.aws/openai/v1"
os.environ["OPENAI_API_KEY"] = provide_token(region=region)
client = OpenAI()
stream = client.responses.create(
    model="openai.gpt-5.5",
    input=[
        {"role": "user", "content": "Hello! How can you help me today?"}
    ],
    stream=True
)
# Print each text chunk as soon as it arrives
for event in stream:
    if event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)
print()
Run the file:
python3 openai_stream_response.py
The response is printed incrementally as it is generated.

Streaming also works from the command line:
# Run the token-generation commands shown earlier first; they are omitted here
curl -N "${OPENAI_BASE_URL}/responses" \
-H "Authorization: Bearer ${OPENAI_API_KEY}" \
-H "Content-Type: application/json" \
-d "{
\"model\": \"${MODEL_ID}\",
\"input\": \"Write a Python function that calculates the Fibonacci sequence using memoization.\",
\"stream\": true
}"

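The response arrives as a Server-Sent Events (SSE) stream. Each line begins with data:, followed by a JSON object whose type field says what happened:
response.created: the request was accepted and generation has started
response.output_text.delta: a chunk of text (the delta field holds the new characters)
response.completed: generation has finished; this event includes the final usage statistics
On macOS/Linux, use curl -N (no buffering) to see tokens as they arrive. Without it, curl may buffer the entire response before displaying anything.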
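To make the event format concrete, here is a small sketch that reads SSE lines and reassembles the generated text from the delta events. The function names iter_sse_events and collect_text are hypothetical. The code assumes each data: payload is a single-line JSON object, as described above, and skipping a [DONE] sentinel is a defensive assumption borrowed from other OpenAI-compatible streams, not something this endpoint is confirmed to send.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield the JSON object carried by every `data:` line of an SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields such as `event:`
        payload = line[len("data:"):].strip()
        if not payload or payload == "[DONE]":  # assumption: tolerate an end sentinel
            continue
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the `delta` of every response.output_text.delta event."""
    parts = []
    for event in iter_sse_events(lines):
        if event.get("type") == "response.output_text.delta":
            parts.append(event.get("delta", ""))
    return "".join(parts)
```

Piping the curl -N output into a script that calls collect_text(sys.stdin) would be one way to try it. Events of other types, such as response.created and response.completed, are parsed but ignored by collect_text.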