Claude Batch API in TypeScript: Bulk Extraction (2026)

June 11, 2026

#claude #batch api #message batches #typescript #zod #structured outputs #anthropic #llm #ai-ml #tutorial

The Claude Batch API processes up to 100,000 Messages requests asynchronously at 50% off standard token prices, with most batches finishing in under an hour. This tutorial builds a TypeScript pipeline that submits structured-output extraction requests, polls status, and validates streamed results with Zod.

TL;DR

You will build a bulk-extraction pipeline — three runnable scripts plus a tested result handler — on @anthropic-ai/sdk 0.104.1: turn a JSONL file of raw customer reviews into batch requests that each carry a Zod-derived output_config.format schema, submit them as one Message Batch, poll processing_status until it reads ended, then stream the results file and sort every entry into extracted / retryable / rejected with an exhaustively typed handler. Every code block type-checks under TypeScript 6 strict, and the result handler ships with nine offline tests that run without an API key. Budget roughly 30–40 minutes of build time, plus however long your batch takes to process.

What you'll learn

How the Message Batches API works — lifecycle, limits, and why everything costs half price
How to define one Zod schema that drives both the request format and result validation
How to build batch requests with valid custom_id values and structured outputs
How to submit a batch and poll its status with live request counts
How to handle all four result types — including the two-layer error envelope the official TypeScript example gets wrong
How to stream results to JSONL files keyed by custom_id
How to test the result handler offline, without an API key
What a 10,000-review extraction batch actually costs

How does the Claude Batch API work?

You send the Message Batches API a list of up to 100,000 Messages requests (or 256 MB, whichever limit you hit first), each tagged with a unique custom_id. The batch starts in processing_status: "in_progress", each request is processed independently, and the status flips to ended when every request has finished — most batches complete in less than an hour, and anything still unprocessed at 24 hours expires. All usage is charged at 50% of standard API prices, on both input and output tokens.¹

Four things make batch the right lane for bulk extraction rather than a for loop over messages.create():

Price. Claude Haiku 4.5 drops from $1/$5 per million input/output tokens to $0.50/$2.50. Claude Sonnet 4.6 drops from $3/$15 to $1.50/$7.50.¹
Throughput. The Batches API has its own rate limits, separate from the Messages API — at Tier 1 you can have 100,000 batch requests sitting in the processing queue while your synchronous limits stay untouched.²
No connection babysitting. Results land in a downloadable JSONL file that stays available for 29 days after batch creation.¹
Structured outputs work inside batches. The official feature-compatibility list calls this out directly: "Process structured outputs at scale with 50% discount."³

The full pipeline you are about to build:

flowchart TD
    A[reviews.jsonl — raw text] --> B[buildRequest — Zod schema as output_config.format]
    B --> C[messages.batches.create — status in_progress]
    C --> D[poll retrieve every 60s]
    D -->|processing_status == ended| E[stream results JSONL]
    E --> F{result.type}
    F -->|succeeded| G[stop-reason guards, then JSON.parse + Zod safeParse]
    F -->|errored| H[invalid_request — rejected / server error — retryable]
    F -->|canceled or expired| I[retryable]
    G -->|valid| J[extracted.jsonl]
    G -->|refusal, truncation, mismatch| K[failures.jsonl]

This Claude Batch API tutorial uses the GA endpoints — anthropic.messages.batches.* with no beta headers. The SDK's client.beta.messages.batches namespace still exists, but you only need it for beta features such as extended output (covered in the gotchas section).

Prerequisites

Node.js 24 LTS (Active LTS, supported through April 2028)⁴
An Anthropic API key in ANTHROPIC_API_KEY (a real batch run costs money; the offline tests in Step 8 do not)
Pinned packages, current on npm as of June 11, 2026:⁵

mkdir claude-batch-extraction && cd claude-batch-extraction
npm init -y
npm pkg set type=module
npm install @anthropic-ai/sdk@0.104.1 zod@4.4.3
npm install -D typescript@6.0.3 tsx@4.22.4 @types/node@24.13.2

Step 1 — Scaffold the project

Create a strict tsconfig.json. Every TypeScript block in this tutorial type-checks against it as written:

{
  "compilerOptions": {
    "target": "es2023",
    "module": "nodenext",
    "moduleResolution": "nodenext",
    "strict": true,
    "noUncheckedIndexedAccess": true,
    "verbatimModuleSyntax": true,
    "allowImportingTsExtensions": true,
    "noEmit": true,
    "skipLibCheck": true,
    "types": ["node"]
  },
  "include": ["src/**/*.ts", "test/**/*.ts"]
}

The "type": "module" from npm pkg set matters: top-level await in the submit and poll scripts requires ES modules, and verbatimModuleSyntax will refuse to compile ESM syntax in a CommonJS package.

Add the input data — a reviews.jsonl in the project root with one review per line. Three rows are enough to exercise the whole pipeline before you point it at your real backlog:

{"id": "1001", "text": "Battery dies before lunch and the proprietary charger is a pain. The screen is gorgeous though. Wish it charged over USB-C like everything else I own."}
{"id": "1002", "text": "Exactly what I wanted. Setup took five minutes and it has worked flawlessly for three months. Recommending it to my whole team."}
{"id": "1003", "text": "It's fine. Does what the box says, nothing more. Would be great if the app had an offline mode."}
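The submit script in Step 4 trusts every line of reviews.jsonl. For a real backlog you may prefer a parser that reports bad rows by line number instead of crashing on the first one. The sketch below is one way to do that; parseReviews and ParseReport are hypothetical names introduced here, not part of the tutorial's files, and the checks (string id, non-empty string text) are assumptions about what a usable row looks like.

```typescript
// Hypothetical helper (not part of the tutorial's scripts): parse reviews.jsonl
// text and collect per-line errors instead of throwing on the first bad row.
interface Review {
  id: string;
  text: string;
}

interface ParseReport {
  reviews: Review[];
  errors: string[]; // one human-readable message per rejected line
}

function parseReviews(jsonl: string): ParseReport {
  const reviews: Review[] = [];
  const errors: string[] = [];

  jsonl.split("\n").forEach((line, index) => {
    if (line.trim().length === 0) return; // skip blank lines, as submit.ts does
    const lineNo = index + 1;

    let value: unknown;
    try {
      value = JSON.parse(line);
    } catch {
      errors.push(`line ${lineNo}: not valid JSON`);
      return;
    }

    if (typeof value !== "object" || value === null) {
      errors.push(`line ${lineNo}: expected a JSON object`);
      return;
    }

    const row = value as Record<string, unknown>;
    const id = row["id"];
    const text = row["text"];
    // Assumed rule: both fields must be strings and text must not be blank.
    if (typeof id !== "string" || typeof text !== "string" || text.trim().length === 0) {
      errors.push(`line ${lineNo}: needs string "id" and non-empty string "text"`);
      return;
    }

    reviews.push({ id, text });
  });

  return { reviews, errors };
}
```

If errors is non-empty, you can print it and stop before submitting, which keeps malformed rows from turning into billed requests with useless input.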

Step 2 — Define one Zod schema as the single source of truth

The same schema does double duty: zodOutputFormat() turns it into the JSON Schema that constrains Claude's output on the server, and safeParse() validates every result locally when the batch comes back. Create src/schema.ts:

import { z } from "zod";

export const ReviewExtraction = z.object({
  sentiment: z.enum(["positive", "negative", "mixed", "neutral"]),
  productIssues: z
    .array(z.string())
    .describe("Specific product problems the reviewer reports, empty if none"),
  featureRequests: z
    .array(z.string())
    .describe("Features the reviewer asks for, empty if none"),
  wouldRecommend: z
    .boolean()
    .describe("Whether the reviewer would recommend the product"),
  summary: z.string().describe("One-sentence summary of the review"),
});

export type ReviewExtraction = z.infer<typeof ReviewExtraction>;

One behavior worth knowing before you trust the wire format: printing zodOutputFormat(ReviewExtraction) on SDK 0.104.1 shows the helper demotes the enum constraint into description text — the sentiment property goes over the wire as "type": "string" with "description": "{enum: [\"positive\",\"negative\",\"mixed\",\"neutral\"]}" — rather than emitting a JSON Schema enum keyword. The server grammar guarantees sentiment is a string; your local Zod validation is what enforces that it is one of the four allowed values. That is not a defect in this pipeline — it is the reason Step 6's schema-mismatch branch is reachable and worth testing.⁶

Local validation also covers the trust gap that batch processing adds on top: results sit in a downloadable file for up to 29 days, so they may be weeks old by the time a consumer reads them — revalidating at the point of use is cheap insurance.

Step 3 — Build batch requests with custom_id and structured outputs

Each entry in a batch pairs a custom_id with the exact params you would pass to messages.create() — minus a few parameters that make no sense asynchronously (stream: true is the most obvious one; including it returns a validation error).¹ Create src/requests.ts:

import { zodOutputFormat } from "@anthropic-ai/sdk/helpers/zod";
import type Anthropic from "@anthropic-ai/sdk";
import { ReviewExtraction } from "./schema.ts";

export const MODEL = "claude-haiku-4-5-20251001";

export interface Review {
  id: string;
  text: string;
}

const SYSTEM_PROMPT = `You extract structured data from customer product reviews.
Read the review and fill every field of the output schema.
Base every value only on what the review says. Keep arrays empty when the
review mentions no issues or no feature requests.`;

export function buildRequest(
  review: Review
): Anthropic.Messages.BatchCreateParams.Request {
  return {
    custom_id: `review-${review.id}`,
    params: {
      model: MODEL,
      max_tokens: 1024,
      system: SYSTEM_PROMPT,
      output_config: { format: zodOutputFormat(ReviewExtraction) },
      messages: [{ role: "user", content: review.text }],
    },
  };
}

Three constraints on this file are load-bearing:

custom_id must match ^[a-zA-Z0-9_-]{1,64}$ and be unique within the batch. Results can come back in any order, so custom_id is the only way to reattach an answer to its source review.¹
params is typed as MessageCreateParamsNonStreaming, and output_config is a first-class field on it — the installed SDK types confirm structured outputs are valid inside batch requests with no beta flag.⁶
Haiku 4.5 is the deliberate model choice. Classification-grade extraction does not need a frontier model, and at batch rates Haiku 4.5 costs $0.50/$2.50 per million tokens. Swap MODEL to claude-sonnet-4-6 ($1.50/$7.50 batch) if your documents need genuine reasoning.¹
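Because a bad custom_id is cheap to catch locally, a pre-flight check before submission can be worthwhile. The sketch below applies the regex quoted above and flags duplicates; findCustomIdProblems and IdProblem are hypothetical names for illustration, not SDK functions.

```typescript
// Pattern quoted in this tutorial for valid custom_id values.
const CUSTOM_ID_PATTERN = /^[a-zA-Z0-9_-]{1,64}$/;

interface IdProblem {
  customId: string;
  problem: "invalid_format" | "duplicate";
}

// Hypothetical pre-flight check: returns one entry per offending ID,
// in input order. An empty array means the IDs look safe to submit.
function findCustomIdProblems(customIds: readonly string[]): IdProblem[] {
  const seen = new Set<string>();
  const problems: IdProblem[] = [];

  for (const customId of customIds) {
    if (!CUSTOM_ID_PATTERN.test(customId)) {
      problems.push({ customId, problem: "invalid_format" });
      continue;
    }
    if (seen.has(customId)) {
      problems.push({ customId, problem: "duplicate" });
    }
    seen.add(customId);
  }

  return problems;
}
```

In this pipeline you would feed it the custom_id of every request that buildRequest produces, for example reviews.map((r) => buildRequest(r).custom_id), and refuse to submit if anything comes back.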

Step 4 — Submit the batch

Create src/submit.ts:

import { readFileSync } from "node:fs";
import Anthropic from "@anthropic-ai/sdk";
import { buildRequest, type Review } from "./requests.ts";

const anthropic = new Anthropic();

const reviews: Review[] = readFileSync("reviews.jsonl", "utf8")
  .split("\n")
  .filter((line) => line.trim().length > 0)
  .map((line) => JSON.parse(line) as Review);

const batch = await anthropic.messages.batches.create({
  requests: reviews.map(buildRequest),
});

console.log(`Submitted batch ${batch.id}`);
console.log(`Status: ${batch.processing_status}`);
console.log(`Requests processing: ${batch.request_counts.processing}`);

Run it:

node --import tsx src/submit.ts

The shape of the output (the msgbatch_ ID is whatever the API assigns to your batch):

Submitted batch msgbatch_...
Status: in_progress
Requests processing: 3

One submission detail that surprises people: per-request validation is asynchronous. A bad params object (a typo'd model name, a max_tokens: 0) does not fail at submission; it surfaces as an errored result when the batch ends. The official docs recommend dry-running one request through the synchronous Messages API first, so you catch shape errors before you submit 10,000 copies of them.¹
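The submit script sends every review as a single batch, which is fine until a backlog crosses the 100,000-request or 256 MB ceiling. A splitter along the following lines could group requests into batch-sized chunks. This is a sketch under assumptions: chunkRequests is a hypothetical helper, "256 MB" is interpreted as 256 × 1024 × 1024 bytes, and the size estimate counts only each serialized request, not the wrapper the SDK adds around the list, so you may want to pass a smaller maxBytes for headroom.

```typescript
// Limits quoted in this tutorial. Treating 256 MB as 256 * 1024 * 1024 bytes
// is an assumption made for this sketch.
const MAX_REQUESTS_PER_BATCH = 100_000;
const MAX_BYTES_PER_BATCH = 256 * 1024 * 1024;

// Hypothetical helper: split requests into chunks that respect both a count
// limit and an approximate serialized-size limit.
function chunkRequests<T>(
  requests: readonly T[],
  maxCount: number = MAX_REQUESTS_PER_BATCH,
  maxBytes: number = MAX_BYTES_PER_BATCH
): T[][] {
  const encoder = new TextEncoder();
  const chunks: T[][] = [];
  let current: T[] = [];
  let currentBytes = 0;

  for (const request of requests) {
    // UTF-8 size of the serialized request, plus one byte for a separator.
    const size = encoder.encode(JSON.stringify(request)).length + 1;
    if (size > maxBytes) {
      throw new Error("a single request exceeds the byte budget");
    }

    const countFull = current.length >= maxCount;
    const bytesFull = currentBytes + size > maxBytes;
    if (current.length > 0 && (countFull || bytesFull)) {
      chunks.push(current);
      current = [];
      currentBytes = 0;
    }

    current.push(request);
    currentBytes += size;
  }

  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

Each chunk would then go through its own messages.batches.create() call, and you would record every returned batch ID so the poll and results scripts can be run once per batch.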

Step 5 — Poll the message batch status

How long does a Claude batch take? Most batches finish in under an hour, but the documented contract is that results become available when all messages have completed or after 24 hours, whichever comes first — so poll rather than assume.¹ Create src/poll.ts:

import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic();

const batchId = process.argv[2];
if (!batchId) {
  console.error("Usage: node --import tsx src/poll.ts <batch_id>");
  process.exit(1);
}

const POLL_INTERVAL_MS = 60_000;

let batch = await anthropic.messages.batches.retrieve(batchId);
while (batch.processing_status !== "ended") {
  const c = batch.request_counts;
  console.log(
    `${new Date().toISOString()} ${batch.processing_status} — ` +
      `processing: ${c.processing}, succeeded: ${c.succeeded}, ` +
      `errored: ${c.errored}, canceled: ${c.canceled}, expired: ${c.expired}`
  );
  await new Promise((resolve) => setTimeout(resolve, POLL_INTERVAL_MS));
  batch = await anthropic.messages.batches.retrieve(batchId);
}

console.log(`Batch ${batch.id} ended.`);
console.log(JSON.stringify(batch.request_counts, null, 2));

Run it with the batch ID that the submit script printed:

node --import tsx src/poll.ts msgbatch_YOUR_ID_HERE

request_counts gives you live progress across the five buckets (processing, succeeded, errored, canceled, expired), so long-running batches are observable rather than a black box. A 60-second interval is plenty; each poll is a normal HTTP request that counts against the Batches API requests-per-minute limit (50 RPM at Tier 1), not against your Messages API limits.²
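If you want a single progress number in the poll log rather than five counters, a small pure function can derive it. The sketch below assumes the five buckets together account for every request in the batch; RequestCounts is a local mirror of the fields named above, and summarizeProgress is a hypothetical helper.

```typescript
// Local mirror of the five request_counts buckets described above.
interface RequestCounts {
  processing: number;
  succeeded: number;
  errored: number;
  canceled: number;
  expired: number;
}

interface Progress {
  total: number;
  finished: number;
  percentDone: number; // rounded to one decimal place
}

// Hypothetical helper. Assumes the five buckets partition the batch,
// so total = finished + processing.
function summarizeProgress(counts: RequestCounts): Progress {
  const finished =
    counts.succeeded + counts.errored + counts.canceled + counts.expired;
  const total = finished + counts.processing;
  const percentDone =
    total === 0 ? 0 : Math.round((finished / total) * 1000) / 10;
  return { total, finished, percentDone };
}
```

Inside the polling loop you could log summarizeProgress(batch.request_counts).percentDone next to the raw counts.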

If you need to abort, await anthropic.messages.batches.cancel(batchId) moves the batch to canceling; it still finishes as ended and may contain partial results for requests processed before the cancellation took effect.¹

Step 6 — Handle all four result types

Every line of the results file is one request's outcome, and there are exactly four result types: succeeded, errored, canceled, and expired. You are billed only for succeeded requests.¹ The handler below is a pure function — no I/O — which is what makes Step 8's offline tests possible. Create src/handle-result.ts:

import type Anthropic from "@anthropic-ai/sdk";
import { ReviewExtraction } from "./schema.ts";

export type Outcome =
  | { kind: "extracted"; customId: string; data: ReviewExtraction }
  | { kind: "retryable"; customId: string; reason: string }
  | { kind: "rejected"; customId: string; reason: string };

export function handleResult(
  entry: Anthropic.Messages.MessageBatchIndividualResponse
): Outcome {
  const { custom_id: customId, result } = entry;

  switch (result.type) {
    case "succeeded": {
      const message = result.message;

      if (message.stop_reason === "refusal") {
        return { kind: "rejected", customId, reason: "model refused" };
      }
      if (message.stop_reason === "max_tokens") {
        return {
          kind: "rejected",
          customId,
          reason: "output truncated at max_tokens; JSON may be incomplete",
        };
      }

      const text = message.content
        .filter((block) => block.type === "text")
        .map((block) => block.text)
        .join("");

      let parsed: unknown;
      try {
        parsed = JSON.parse(text);
      } catch {
        return { kind: "rejected", customId, reason: "response is not valid JSON" };
      }

      const validated = ReviewExtraction.safeParse(parsed);
      if (!validated.success) {
        return {
          kind: "rejected",
          customId,
          reason: `schema mismatch: ${validated.error.issues
            .map((issue) => issue.path.join(".") + " " + issue.message)
            .join("; ")}`,
        };
      }

      return { kind: "extracted", customId, data: validated.data };
    }

    case "errored": {
      // result.error is an ErrorResponse envelope: { type: "error", error: {...} }.
      // The error CLASS lives one level deeper, on result.error.error.type.
      const errorType = result.error.error.type;
      if (errorType === "invalid_request_error") {
        return {
          kind: "rejected",
          customId,
          reason: `invalid request: ${result.error.error.message}`,
        };
      }
      return { kind: "retryable", customId, reason: `server error: ${errorType}` };
    }

    case "canceled":
      return { kind: "retryable", customId, reason: "batch canceled before processing" };

    case "expired":
      return { kind: "retryable", customId, reason: "not processed within 24 hours" };
  }
}

The branching logic encodes the operational playbook:

errored + invalid_request_error → rejected. The request body itself is wrong; resubmitting the same bytes fails the same way. Fix the request first.
errored + anything else (such as api_error) → retryable. Server-side failures can be resubmitted as-is in a follow-up batch.
canceled and expired → retryable. These requests never reached the model and were never billed.
succeeded is not the same as usable. A refusal, a max_tokens truncation, or an out-of-enum value all arrive inside a "succeeded" result. The stop-reason guards and the Zod safeParse are what separate "the API returned a message" from "the data is safe to load."

Note the two-layer error envelope in the errored branch. result.error is an ErrorResponse whose own type field is always the literal "error"; the class you branch on — invalid_request_error, api_error, and friends — lives one level deeper at result.error.error.type. As of June 11, 2026, the TypeScript example in the official batch-processing doc checks the shallower result.result.error.type, which can never equal "invalid_request_error" per the SDK's own shipped types in 0.104.1 (the Python example in the same doc uses the correct deeper path).⁶

Step 7 — Stream batch results to JSONL by custom_id

Results live at a results_url as JSONL, and the SDK's results() method returns an async-iterable decoder, so you process one entry at a time instead of buffering a potentially huge file — the official guidance for large batches.¹ Create src/results.ts:

import { createWriteStream } from "node:fs";
import Anthropic from "@anthropic-ai/sdk";
import { handleResult } from "./handle-result.ts";

const anthropic = new Anthropic();

const batchId = process.argv[2];
if (!batchId) {
  console.error("Usage: node --import tsx src/results.ts <batch_id>");
  process.exit(1);
}

const extracted = createWriteStream("extracted.jsonl");
const failures = createWriteStream("failures.jsonl");
const counts = { extracted: 0, retryable: 0, rejected: 0 };

for await (const entry of await anthropic.messages.batches.results(batchId)) {
  const outcome = handleResult(entry);
  counts[outcome.kind] += 1;

  if (outcome.kind === "extracted") {
    extracted.write(
      JSON.stringify({ customId: outcome.customId, ...outcome.data }) + "\n"
    );
  } else {
    failures.write(JSON.stringify(outcome) + "\n");
  }
}

extracted.end();
failures.end();

console.log(
  `Done. extracted: ${counts.extracted}, ` +
    `retryable: ${counts.retryable}, rejected: ${counts.rejected}`
);
console.log("Wrote extracted.jsonl and failures.jsonl");

Run it once the poll script reports ended:

node --import tsx src/results.ts msgbatch_YOUR_ID_HERE

Results may arrive in a different order than you submitted them — the docs are explicit that ordering is not guaranteed and custom_id is the join key.¹ Because every output line carries customId, downstream consumers can rejoin extractions to source reviews regardless of order, and failures.jsonl is ready to drive a follow-up batch for the retryable entries.
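To turn failures.jsonl into that follow-up batch, you need the original reviews for the retryable entries only. The sketch below shows one way to select them. selectRetries is a hypothetical helper; it assumes the failure lines have the { kind, customId, reason } shape that src/results.ts writes and that IDs carry the review- prefix used in buildRequest.

```typescript
interface Review {
  id: string;
  text: string;
}

// Shape of one line in failures.jsonl as written by src/results.ts.
interface Failure {
  kind: "retryable" | "rejected";
  customId: string;
  reason: string;
}

// Hypothetical helper: pick the source reviews whose failures are retryable.
// Rejected entries are skipped because resubmitting them fails the same way.
function selectRetries(
  failureLines: readonly string[],
  reviews: readonly Review[],
  prefix = "review-"
): Review[] {
  const byId = new Map(reviews.map((r) => [r.id, r] as const));
  const retries: Review[] = [];

  for (const line of failureLines) {
    if (line.trim().length === 0) continue;
    const failure = JSON.parse(line) as Failure;
    if (failure.kind !== "retryable") continue;
    if (!failure.customId.startsWith(prefix)) continue;

    const review = byId.get(failure.customId.slice(prefix.length));
    if (review) retries.push(review);
  }

  return retries;
}
```

The returned reviews can be mapped through buildRequest again and submitted as a new batch, reusing the same custom_id values since uniqueness only applies within one batch.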

Step 8 — Test the result handler offline, free

handleResult is pure, so every branch is testable with fixture objects — no API key, no tokens spent. Create test/handle-result.test.ts:

import { test } from "node:test";
import assert from "node:assert/strict";
import type Anthropic from "@anthropic-ai/sdk";
import { handleResult } from "../src/handle-result.ts";

type Entry = Anthropic.Messages.MessageBatchIndividualResponse;

function succeededEntry(
  text: string,
  stopReason: "end_turn" | "refusal" | "max_tokens" = "end_turn"
): Entry {
  return {
    custom_id: "review-1001",
    result: {
      type: "succeeded",
      message: {
        id: "msg_test",
        type: "message",
        role: "assistant",
        model: "claude-haiku-4-5-20251001",
        content: [{ type: "text", text, citations: null }],
        container: null,
        stop_details: null,
        stop_reason: stopReason,
        stop_sequence: null,
        usage: {
          input_tokens: 100,
          output_tokens: 50,
          cache_creation: null,
          cache_creation_input_tokens: null,
          cache_read_input_tokens: null,
          inference_geo: null,
          output_tokens_details: null,
          server_tool_use: null,
          service_tier: null,
        },
      },
    },
  };
}

const VALID = JSON.stringify({
  sentiment: "negative",
  productIssues: ["battery drains overnight"],
  featureRequests: ["usb-c charging"],
  wouldRecommend: false,
  summary: "Disappointed by battery life and the proprietary charger.",
});

test("succeeded + valid JSON + valid schema -> extracted", () => {
  const outcome = handleResult(succeededEntry(VALID));
  assert.equal(outcome.kind, "extracted");
  if (outcome.kind === "extracted") {
    assert.equal(outcome.data.sentiment, "negative");
    assert.equal(outcome.data.productIssues.length, 1);
  }
});

test("refusal stop reason -> rejected", () => {
  const outcome = handleResult(succeededEntry(VALID, "refusal"));
  assert.equal(outcome.kind, "rejected");
});

test("max_tokens stop reason -> rejected", () => {
  const outcome = handleResult(succeededEntry('{"sentiment":"neg', "max_tokens"));
  assert.equal(outcome.kind, "rejected");
});

test("invalid JSON -> rejected", () => {
  const outcome = handleResult(succeededEntry("not json at all"));
  assert.equal(outcome.kind, "rejected");
  if (outcome.kind === "rejected") {
    assert.match(outcome.reason, /not valid JSON/);
  }
});

test("valid JSON but schema mismatch -> rejected with issue path", () => {
  const outcome = handleResult(
    succeededEntry(JSON.stringify({ sentiment: "angry" }))
  );
  assert.equal(outcome.kind, "rejected");
  if (outcome.kind === "rejected") {
    assert.match(outcome.reason, /schema mismatch/);
    assert.match(outcome.reason, /sentiment/);
  }
});

test("errored with invalid_request_error -> rejected", () => {
  const entry: Entry = {
    custom_id: "review-1002",
    result: {
      type: "errored",
      error: {
        type: "error",
        request_id: "req_test",
        error: { type: "invalid_request_error", message: "max_tokens must be at least 1" },
      },
    },
  };
  const outcome = handleResult(entry);
  assert.equal(outcome.kind, "rejected");
  if (outcome.kind === "rejected") {
    assert.match(outcome.reason, /invalid request/);
  }
});

test("errored with api_error -> retryable", () => {
  const entry: Entry = {
    custom_id: "review-1003",
    result: {
      type: "errored",
      error: {
        type: "error",
        request_id: "req_test2",
        error: { type: "api_error", message: "internal server error" },
      },
    },
  };
  const outcome = handleResult(entry);
  assert.equal(outcome.kind, "retryable");
});

test("canceled -> retryable", () => {
  const entry: Entry = {
    custom_id: "review-1004",
    result: { type: "canceled" },
  };
  assert.equal(handleResult(entry).kind, "retryable");
});

test("expired -> retryable", () => {
  const entry: Entry = {
    custom_id: "review-1005",
    result: { type: "expired" },
  };
  assert.equal(handleResult(entry).kind, "retryable");
});

Run the suite:

node --import tsx --test test/handle-result.test.ts

# tests 9
# pass 9
# fail 0

The fixtures type-check against the real MessageBatchIndividualResponse type, which keeps them honest: on 0.104.1 the fixture must include container, stop_details, and usage.output_tokens_details, among other required Message fields — and if a future SDK version adds more, the compiler flags the fixture instead of letting your tests drift from reality.

What does a 10,000-review batch cost?

The Batch API charges exactly half the standard per-token price. For Claude Haiku 4.5, that is $0.50 per million input tokens and $2.50 per million output tokens, versus $1/$5 synchronous.¹

Concrete estimate — assuming an average of 400 input tokens per request (review text plus the system prompt and the format instructions the API injects for structured outputs) and 130 output tokens per extraction:

Line item	Input	Output	Total
Tokens (10,000 reviews)	4.0M	1.3M	—
Synchronous Haiku 4.5 ($1/$5)	$4.00	$6.50	$10.50
Batch Haiku 4.5 ($0.50/$2.50)	$2.00	$3.25	$5.25

Those token counts are assumptions to make the math concrete — your reviews will differ; check usage on a dry-run request to calibrate. The ratio does not change: batch is half of whatever synchronous costs. Two compounding notes from the official docs: structured outputs add some input overhead because the API injects a format-instructions system prompt, and the first request with a new schema pays a one-time grammar-compilation latency (compiled grammars are cached for 24 hours from last use).³ Prompt caching also works inside batches and stacks with the 50% discount, but hits are best-effort under concurrent processing — observed rates range from 30% to 98% depending on traffic patterns.¹
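To rerun that arithmetic with your own calibrated token counts, a few lines suffice. The sketch below reproduces the table's numbers from the prices quoted in this section; estimateCost and Pricing are hypothetical names, and the function deliberately ignores prompt caching and any structured-output overhead beyond what you fold into the average input count.

```typescript
// Synchronous list prices in USD per million tokens, as quoted above.
interface Pricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

const HAIKU_4_5: Pricing = { inputPerMTok: 1, outputPerMTok: 5 };

interface CostEstimate {
  synchronous: number; // USD, rounded to cents
  batch: number; // USD, rounded to cents
}

// Hypothetical estimator: batch is modeled as exactly half the synchronous
// cost, matching the 50% discount described in this tutorial.
function estimateCost(
  requests: number,
  avgInputTokens: number,
  avgOutputTokens: number,
  pricing: Pricing
): CostEstimate {
  const inputMTok = (requests * avgInputTokens) / 1_000_000;
  const outputMTok = (requests * avgOutputTokens) / 1_000_000;
  const synchronous =
    inputMTok * pricing.inputPerMTok + outputMTok * pricing.outputPerMTok;

  const toCents = (usd: number): number => Math.round(usd * 100) / 100;
  return { synchronous: toCents(synchronous), batch: toCents(synchronous * 0.5) };
}

// The scenario from the table: 10,000 reviews, 400 input and 130 output tokens each.
console.log(estimateCost(10_000, 400, 130, HAIKU_4_5));
```

Swap in the usage figures from your dry-run request to get an estimate for your own data.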

Verification

Three checks confirm the pipeline end to end:

The batch ended cleanly. Poll output should reach ended with all requests in succeeded:

Batch msgbatch_... ended.
{
  "processing": 0,
  "succeeded": 3,
  "errored": 0,
  "canceled": 0,
  "expired": 0
}

Raw API agrees with the SDK. The retrieve endpoint should show the same status and a non-null results_url:

curl "https://api.anthropic.com/v1/messages/batches/msgbatch_YOUR_ID_HERE" \
  --header "x-api-key: $ANTHROPIC_API_KEY" \
  --header "anthropic-version: 2023-06-01"

Every line of extracted.jsonl re-validates. The file should have one line per succeeded review, each carrying a customId that maps back to reviews.jsonl, and each re-parsing under the same Zod schema. Spot-check a few rows against their source reviews — confirm the sentiment and the issue/feature arrays reflect what each review actually says. Extraction is still model inference, so exact strings vary run to run; the schema guarantees shape, not judgment.
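A related check that is easy to automate is reconciliation: every submitted custom_id should appear exactly once across extracted.jsonl and failures.jsonl. The sketch below is one way to express that; reconcile is a hypothetical helper, and how you collect the two ID lists from your files is left to you.

```typescript
interface Reconciliation {
  missing: string[]; // submitted but never returned
  duplicates: string[]; // returned more than once
  unexpected: string[]; // returned but never submitted
}

// Hypothetical helper: compare the custom_id values you submitted with the
// ones found in your output files. All three arrays empty means a clean join.
function reconcile(
  submitted: readonly string[],
  returned: readonly string[]
): Reconciliation {
  const expected = new Set(submitted);
  const seen = new Set<string>();
  const duplicates: string[] = [];
  const unexpected: string[] = [];

  for (const id of returned) {
    if (!expected.has(id)) {
      unexpected.push(id);
      continue;
    }
    if (seen.has(id)) duplicates.push(id);
    seen.add(id);
  }

  const missing = submitted.filter((id) => !seen.has(id));
  return { missing, duplicates, unexpected };
}
```

Order does not matter here, which suits the Batch API: results can come back in any sequence and the comparison still holds.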

Before scaling to your real backlog, dry-run one request through messages.create() with the same params object. Per-request validation runs asynchronously after submission, and the docs recommend exactly this to keep shape errors from multiplying across a large batch.¹

Troubleshooting

413 request_too_large on create. The batch exceeded 256 MB total. Split the input — the 100,000-request ceiling and the 256 MB ceiling are independent, and with large documents you hit bytes first.¹

A custom_id fails validation. IDs must match ^[a-zA-Z0-9_-]{1,64}$ and be unique within the batch. Spaces, dots, and emails fail the regex; duplicate IDs fail the uniqueness check.¹

Everything comes back errored with invalid_request_error. Per-request validation runs asynchronously, so a systematic mistake (wrong model string, max_tokens: 0, an unsupported parameter like stream: true) shows up as a batch full of errored results rather than a failed create call. Fix the params, verify with one synchronous request, resubmit.¹

TypeScript error TS2367: the types '"error"' and '"invalid_request_error"' have no overlap. You are checking the envelope, not the error. Use result.error.error.type — the outer ErrorResponse.type is always the literal "error" in the shipped SDK types.⁶

Requests show as expired. The batch hit the 24-hour window before those requests were processed — heavy demand or very large batches make this more likely. Expired requests are not billed; collect them from failures.jsonl and resubmit.¹

Results stop downloading weeks later. Results are available for 29 days after the batch's created_at (not its ended_at). After that you can still view the batch object, but its results are no longer available for download — persist outputs promptly.¹
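Both time windows in this section can be computed up front and stored next to the batch ID, so a scheduler knows when to stop waiting and when results disappear. The sketch below derives them from created_at. batchDeadlines is a hypothetical helper, and it treats the windows as exactly 24 hours and 29 × 24 hours, which is an assumption; if the batch object in your SDK version already exposes an expiry timestamp, prefer that value.

```typescript
const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

interface BatchDeadlines {
  expiresAt: string; // unprocessed requests expire after this instant
  resultsUntil: string; // results are no longer downloadable after this instant
}

// Hypothetical helper: derive both deadlines from the batch's created_at.
// Assumes exact 24-hour and 29-day windows measured from creation.
function batchDeadlines(createdAt: string): BatchDeadlines {
  const created = Date.parse(createdAt);
  if (Number.isNaN(created)) {
    throw new Error(`unparseable created_at: ${createdAt}`);
  }
  return {
    expiresAt: new Date(created + 24 * HOUR_MS).toISOString(),
    resultsUntil: new Date(created + 29 * DAY_MS).toISOString(),
  };
}
```

For a batch created at 2026-06-11T12:00:00Z, this yields an expiry of 2026-06-12T12:00:00.000Z and a results cutoff of 2026-07-10T12:00:00.000Z.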

Limits and gotchas worth knowing

Queue limits scale with your tier. Tier 1 allows 100,000 batch requests in the processing queue; Tier 4 allows 500,000. The per-batch cap stays 100,000 at every tier.²
Batches are Workspace-scoped. Keys from another Workspace cannot see your batches or results.¹
Not Zero Data Retention eligible. Batch processing stores requests and results for up to 29 days; you can DELETE /v1/messages/batches/{id} after processing (cancel first if in progress).¹
Extended output is batch-only. The output-300k-2026-03-24 beta header raises max_tokens to 300,000 on Opus 4.8/4.7/4.6 and Sonnet 4.6 — via client.beta.messages.batches.create() with a betas array — for exhaustive extraction or long-form generation. It is not available on the synchronous Messages API.¹
Server tools run in batches too, and the batch worker runs more agentic-loop iterations per turn than a synchronous request before returning pause_turn.¹

Next steps

The schema-first pattern here is the batch-scale version of our Claude structured outputs in TypeScript tutorial — read it for the synchronous messages.parse() path, strict tool use, and the full guard taxonomy. To regression-test the extraction prompt before each large submission, wire it into CI with promptfoo. And if your batch jobs feed an agent, the Claude tool use agentic-loop tutorial covers the synchronous side of the same SDK.

¹ Anthropic, "Batch processing", Claude API documentation, https://platform.claude.com/docs/en/build-with-claude/batch-processing (fetched June 11, 2026). Limits, pricing table, result types, billing rules, prompt-caching hit rates, extended-output beta, data retention.
² Anthropic, "Rate limits", Claude API documentation, https://platform.claude.com/docs/en/api/rate-limits (fetched June 11, 2026). Message Batches API tier table: RPM, processing-queue, and per-batch caps; separation from Messages API limits.
³ Anthropic, "Structured outputs", Claude API documentation, https://platform.claude.com/docs/en/build-with-claude/structured-outputs (fetched June 11, 2026). Batch compatibility, grammar compilation and 24-hour caching, format-instructions overhead.
⁴ Node.js release schedule, https://endoflife.date/nodejs (Node 24 Active LTS; maintenance through April 2028; checked June 11, 2026).
⁵ npm registry, checked June 11, 2026: @anthropic-ai/sdk 0.104.1, zod 4.4.3, typescript 6.0.3, tsx 4.22.4, @types/node 24.13.2.
⁶ Shipped type definitions and source of @anthropic-ai/sdk 0.104.1 (resources/messages/batches.d.ts, resources/shared.d.ts, helpers/zod.d.ts), inspected June 11, 2026, plus runtime output of zodOutputFormat() on the tutorial schema.