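What is the OpenAI Batch API?

The OpenAI Batch API is a service that processes large volumes of API requests asynchronously. You upload a JSONL file containing many requests, and OpenAI processes them within 24 hours at a 50% discount compared with real-time API calls.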

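What JSONL format does the Batch API require?

Each line must be a JSON object with four fields: custom_id (a unique request identifier), method (always POST), url (the API endpoint path, such as /v1/chat/completions), and body (the request payload in the format that endpoint expects).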

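How much does the Batch API cost?

The Batch API offers a 50% discount off standard API pricing on all supported models. For example, if gpt-4o-mini normally costs $0.15 per million input tokens, it costs $0.075 per million input tokens through the Batch API.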
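As a worked example of the discount arithmetic, the sketch below estimates the input-token cost of a workload with and without batching. The $0.15 price is the illustrative gpt-4o-mini figure quoted above, the 20 million token workload is made up, and the function name estimate_input_cost is hypothetical; check current pricing before relying on the numbers.

```python
BATCH_DISCOUNT = 0.5  # 50% off the standard price, as stated above


def estimate_input_cost(input_tokens: int, price_per_million: float, batch: bool = False) -> float:
    """Return the estimated input-token cost in USD."""
    cost = input_tokens / 1_000_000 * price_per_million
    return cost * (1 - BATCH_DISCOUNT) if batch else cost


# Hypothetical workload: 20 million input tokens at $0.15 per million
standard = estimate_input_cost(20_000_000, 0.15)
batched = estimate_input_cost(20_000_000, 0.15, batch=True)
print(f"standard: ${standard:.2f}, batch: ${batched:.2f}")
```

Output tokens are billed separately and would need their own price; the same 50% factor applies to them according to the text above.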

What is the maximum batch size?

A single batch can contain up to 50,000 requests, and the input JSONL file can be up to 200MB. For larger workloads, create multiple batches. Each organization can have up to 100 concurrent batches.
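One way to stay inside these limits is to split a large workload before writing any files. The following is a minimal sketch, not an official utility: split_into_batches is a hypothetical helper, and 200MB is interpreted conservatively as 200,000,000 bytes, which is an assumption.

```python
import json

MAX_REQUESTS = 50_000        # per-batch request limit quoted above
MAX_BYTES = 200 * 1000 * 1000  # 200MB file limit, read conservatively as decimal megabytes (assumption)


def split_into_batches(requests, max_requests=MAX_REQUESTS, max_bytes=MAX_BYTES):
    """Yield lists of serialized JSONL lines, each list within both limits."""
    chunk, size = [], 0
    for req in requests:
        line = json.dumps(req) + "\n"
        line_bytes = len(line.encode("utf-8"))
        # Start a new chunk if adding this line would break either limit
        if chunk and (len(chunk) >= max_requests or size + line_bytes > max_bytes):
            yield chunk
            chunk, size = [], 0
        chunk.append(line)
        size += line_bytes
    if chunk:
        yield chunk


# Small demo with a tiny limit so the split is visible
demo_requests = [{"custom_id": f"r{i}"} for i in range(5)]
for n, lines in enumerate(split_into_batches(demo_requests, max_requests=2)):
    print(f"batch {n}: {len(lines)} requests")
```

Each yielded list can be written to its own file and submitted as a separate batch. A single line larger than max_bytes is still emitted as its own chunk, so oversized requests need a separate check.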

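How long does batch processing take?

OpenAI sets a 24-hour completion window, but most batches finish much sooner, typically within 1 to 4 hours depending on the number of requests and current system load. You can poll the batch status to track progress.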

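How do I handle errors in batch responses?

The output JSONL file includes both successful and failed requests. Each line has a response field on success or an error field on failure. Parse each line and check whether error is null to determine whether the request succeeded. Failed requests include an error code and message for debugging.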

OpenAI Batch API JSONL Format Guide

Learn how to structure JSONL files for the OpenAI Batch API and cut costs by 50%

Last updated: February 2026

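What is the OpenAI Batch API?

The OpenAI Batch API lets you send a large number of API requests as a single batch job. Instead of making thousands of individual API calls, you upload one JSONL file containing all the requests, and OpenAI processes them asynchronously within 24 hours.

The main benefit is cost: batch requests are 50% cheaper than real-time API calls. This makes the Batch API a good fit for work that does not need an immediate response, such as content generation, data classification, embedding computation, and evaluation pipelines.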

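JSONL request format

Each line of the batch input JSONL file represents one API request. Every line must contain four required fields and follow a specific structure.

custom_id: A unique identifier for each request, used to match requests with responses. Any string up to 512 characters is allowed.

method: The HTTP method. Only POST is currently supported.

url: The API endpoint path. Chat Completions: /v1/chat/completions. Embeddings: /v1/embeddings.

body: The request body, in the same format as the request body of the target API endpoint.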

Basic request structure

{"custom_id": "req-001", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello"}]}}
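A quick local check of these fields can catch malformed lines before upload. The sketch below is a hypothetical validator (validate_batch_line is not part of any SDK) that only tests the rules described in this guide, including the 512-character custom_id limit stated above; the API may enforce further rules.

```python
import json

REQUIRED_FIELDS = ("custom_id", "method", "url", "body")


def validate_batch_line(line: str) -> list[str]:
    """Return a list of problems found in one JSONL line (empty if none)."""
    try:
        obj = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(obj, dict):
        return ["line is not a JSON object"]

    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in obj]
    if "method" in obj and obj["method"] != "POST":
        problems.append("method must be POST")
    if "custom_id" in obj and len(str(obj["custom_id"])) > 512:
        problems.append("custom_id longer than 512 characters")
    if "body" in obj and not isinstance(obj["body"], dict):
        problems.append("body must be a JSON object")
    return problems


good = ('{"custom_id": "req-001", "method": "POST", "url": "/v1/chat/completions", '
        '"body": {"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello"}]}}')
print(validate_batch_line(good))
print(validate_batch_line('{"custom_id": "req-002", "method": "GET"}'))
```

Running the validator over every line of a file and reporting the line number with each problem makes large files much easier to fix.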

Chat Completions batches

The most common use of the Batch API is sending many Chat Completion requests. Each line holds a full request with model, messages, and optional parameters such as max_tokens or temperature.

Single request

Each line contains a complete Chat Completion request with model, messages, and any optional parameters.

{"custom_id": "task-001", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o-mini", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Summarize the benefits of renewable energy."}], "max_tokens": 500}}

Multi-request JSONL file

A batch file typically contains hundreds or thousands of requests. The file below contains three different tasks: summarization, classification, and translation.

{"custom_id": "summary-001", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Summarize: Solar energy is a renewable source of power that harnesses sunlight using photovoltaic cells."}], "max_tokens": 200}}
{"custom_id": "classify-001", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Classify this review as positive or negative: Great product, works perfectly!"}], "max_tokens": 50}}
{"custom_id": "translate-001", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Translate to French: Hello, how are you today?"}], "max_tokens": 100}}
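Because custom_id is the only link between a request and its response, duplicate IDs in a multi-request file make the results ambiguous. The following sketch, using a hypothetical helper named find_duplicate_ids, scans JSONL text and reports any custom_id that appears more than once.

```python
import json
from collections import Counter


def find_duplicate_ids(jsonl_text: str) -> list[str]:
    """Return the custom_id values that occur more than once, sorted."""
    counts = Counter(
        json.loads(line)["custom_id"]
        for line in jsonl_text.splitlines()
        if line.strip()  # skip blank lines
    )
    return sorted(cid for cid, n in counts.items() if n > 1)


sample = "\n".join([
    '{"custom_id": "summary-001", "method": "POST", "url": "/v1/chat/completions", "body": {}}',
    '{"custom_id": "classify-001", "method": "POST", "url": "/v1/chat/completions", "body": {}}',
    '{"custom_id": "summary-001", "method": "POST", "url": "/v1/chat/completions", "body": {}}',
])
print(find_duplicate_ids(sample))
```

The empty body objects here only keep the demo short; real lines carry a full request body as in the file above.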

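Embeddings batches

The Batch API also supports embedding requests, which is useful for processing large document collections. Embedding batches use the same JSONL structure but target the /v1/embeddings endpoint.

Embedding requests use a simpler body structure containing the model and the input text.

Embeddings request format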

{"custom_id": "embed-001", "method": "POST", "url": "/v1/embeddings", "body": {"model": "text-embedding-3-small", "input": "JSONL is a text format for storing structured data."}}
{"custom_id": "embed-002", "method": "POST", "url": "/v1/embeddings", "body": {"model": "text-embedding-3-small", "input": "Each line contains one valid JSON object."}}
{"custom_id": "embed-003", "method": "POST", "url": "/v1/embeddings", "body": {"model": "text-embedding-3-small", "input": "JSONL files use the .jsonl extension."}}
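Lines like these are usually generated from a list of documents rather than written by hand. The sketch below shows one way to do that; embedding_batch_lines and its prefix parameter are hypothetical names, and the model default simply mirrors the example above.

```python
import json


def embedding_batch_lines(texts, model="text-embedding-3-small", prefix="embed"):
    """Build one /v1/embeddings batch line per input text."""
    return [
        json.dumps(
            {
                "custom_id": f"{prefix}-{i:03d}",  # embed-001, embed-002, ...
                "method": "POST",
                "url": "/v1/embeddings",
                "body": {"model": model, "input": text},
            },
            ensure_ascii=False,  # keep non-ASCII text readable in the file
        )
        for i, text in enumerate(texts, start=1)
    ]


docs = [
    "JSONL is a text format for storing structured data.",
    "Each line contains one valid JSON object.",
]
lines = embedding_batch_lines(docs)
print("\n".join(lines))
```

Writing the file is then a matter of joining the lines with newlines and saving them with UTF-8 encoding.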

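Response format

When the batch job completes, OpenAI provides an output JSONL file in which each line holds the response to one request. Each response includes the original custom_id so you can match it to its request.

Successful response

A successful response contains the custom_id, the response status code, and the full API response body.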

{"id": "batch_req_abc123", "custom_id": "task-001", "response": {"status_code": 200, "body": {"id": "chatcmpl-xyz", "object": "chat.completion", "choices": [{"index": 0, "message": {"role": "assistant", "content": "Renewable energy provides numerous benefits including reduced greenhouse gas emissions, lower long-term costs, and energy independence."}}]}}, "error": null}

Error response

A failed request contains an error object with a code and a message describing what went wrong.

{"id": "batch_req_def456", "custom_id": "task-002", "response": null, "error": {"code": "invalid_request_error", "message": "The model 'gpt-5' does not exist."}}
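The two response shapes above can be separated with a short parser. This is a minimal sketch that assumes the output lines look like the samples in this section (custom_id, response, error); parse_batch_output is a hypothetical helper, and the sample data is shortened from the examples above.

```python
import json


def parse_batch_output(jsonl_text):
    """Return (successes, failures), both keyed by custom_id."""
    successes, failures = {}, {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        cid = record["custom_id"]
        response = record.get("response")
        ok = (
            record.get("error") is None
            and response is not None
            and response.get("status_code") == 200
        )
        if ok:
            successes[cid] = response["body"]
        else:
            failures[cid] = record  # keep the full record for debugging
    return successes, failures


sample_output = "\n".join(json.dumps(r) for r in [
    {"id": "batch_req_abc123", "custom_id": "task-001",
     "response": {"status_code": 200, "body": {"choices": [
         {"index": 0, "message": {"role": "assistant", "content": "Renewable energy provides numerous benefits."}}]}},
     "error": None},
    {"id": "batch_req_def456", "custom_id": "task-002", "response": None,
     "error": {"code": "invalid_request_error", "message": "The model 'gpt-5' does not exist."}},
])

successes, failures = parse_batch_output(sample_output)
for cid, body in successes.items():
    print(cid, "->", body["choices"][0]["message"]["content"])
for cid, record in failures.items():
    print(cid, "failed:", record["error"])
```

Keying both dictionaries by custom_id makes it straightforward to look up the original request for any failure and resubmit it in a later batch.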

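Complete Batch API workflow

This is a complete five-step Batch API workflow using the Python SDK, from creating the JSONL file to downloading and parsing the results.

1. Create a JSONL file with one request per line
2. Upload the file through the Files API with purpose set to batch
3. Create a batch job that references the uploaded file ID
4. Poll until the batch status is completed, failed, expired, or cancelled
5. Download the output JSONL file and parse it to extract the results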

Python — Batch API Workflow

from openai import OpenAI
import json
import time

client = OpenAI()

# Step 1: Create the JSONL batch file
prompts = [
    "Summarize the benefits of renewable energy.",
    "Classify this text as positive or negative: Great product!",
    "Translate to French: Hello, how are you?",
]

requests = [
    {
        "custom_id": f"req-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 200,
        },
    }
    for i, prompt in enumerate(prompts)
]

with open("batch_input.jsonl", "w") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")

# Step 2: Upload the file (the context manager closes the handle afterwards)
with open("batch_input.jsonl", "rb") as f:
    batch_file = client.files.create(file=f, purpose="batch")

# Step 3: Create the batch
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
print(f"Batch created: {batch.id}")

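# Step 4: Poll until the batch reaches a terminal status
# ("expired" is included so the loop also ends if the 24h window runs out)
while True:
    status = client.batches.retrieve(batch.id)
    print(f"Status: {status.status}")
    if status.status in ("completed", "failed", "expired", "cancelled"):
        break
    time.sleep(60)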

# Step 5: Download results
if status.output_file_id:
    content = client.files.content(status.output_file_id)
    with open("batch_output.jsonl", "wb") as f:
        f.write(content.read())
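
The polling step can also be wrapped in a reusable helper with a timeout. The sketch below is one possible shape, not part of the SDK: wait_for_batch is a hypothetical function that takes the retrieve call as a parameter, so it can be exercised with a stub and without network access. The stub and its status sequence are made up for the demo.

```python
import time
from types import SimpleNamespace

TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}


def wait_for_batch(retrieve, batch_id, interval=60.0, timeout=24 * 3600, sleep=time.sleep):
    """Poll retrieve(batch_id) until a terminal status, or raise after `timeout` seconds of waiting."""
    waited = 0.0
    while True:
        batch = retrieve(batch_id)
        if batch.status in TERMINAL_STATUSES:
            return batch
        if waited >= timeout:
            raise TimeoutError(f"batch {batch_id} still {batch.status} after {waited:.0f}s")
        sleep(interval)
        waited += interval  # counts requested sleep time, not wall-clock time


# Demo with a stub in place of a real client
_statuses = iter(["validating", "in_progress", "completed"])


def fake_retrieve(batch_id):
    return SimpleNamespace(id=batch_id, status=next(_statuses))


result = wait_for_batch(fake_retrieve, "batch_demo", interval=0, sleep=lambda s: None)
print(result.status)
```

With the workflow above, the call would be wait_for_batch(client.batches.retrieve, batch.id). Passing the retrieve function in keeps the helper easy to test and independent of any particular client object.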

Batch API vs real-time API

Knowing when to use the Batch API and when to use the standard real-time API helps you optimize both cost and performance.

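| Feature | Batch API | Real-time API |
|---|---|---|
| Cost | 50% discount | Standard pricing |
| Latency | Up to 24 hours | Seconds |
| Rate limits | Higher limits | Standard limits |
| Input format | JSONL file upload | Individual API calls |
| Best for | Bulk processing, evaluations, data generation | Interactive apps, real-time chat, streaming |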

For details on the JSONL format used for OpenAI fine-tuning, see the OpenAI JSONL format guide.

Prepare your batch file

Before submitting to OpenAI, use the free tools below to create, validate, and inspect your JSONL batch file.

OpenAI JSONL format guide

JSONL splitter

JSONL validator

Work with JSONL files online

View, validate, and convert JSONL files of up to 1GB in your browser. No upload required, 100% private.
