"API" is a word adults use to scare you. It's not scary. An API is a URL you POST to, with a body, and you get a JSON response back. That's it.
The clever part isn't the call — it's what you ask Claude for, and what you do with what comes back. The first kid your age to learn this in a serious way ends up with abilities most adult developers don't have yet.
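Here's the whole shape, sketched in JavaScript. The model name is an example and the real response carries a few more fields than shown; this is just the part you'll actually touch:

```javascript
// The body you POST to Claude's Messages API, roughly.
// (Sent from a server, never from the browser: see the key warning below.)
const request = {
  model: 'claude-haiku-4-5',    // which model (example name)
  max_tokens: 200,              // cap on the reply length
  system: 'You suggest books.', // your voice and rules live here
  messages: [
    { role: 'user', content: 'What should I read next?' },
  ],
};

// The JSON that comes back is shaped roughly like this:
const exampleResponse = {
  content: [{ type: 'text', text: 'Annihilation — Jeff VanderMeer.' }],
  stop_reason: 'end_turn',
};

// Pulling the text out is one line:
const text = exampleResponse.content.map(b => b.text || '').join('');
```

A URL, a body, a JSON response. Everything else in this project is what you do around that.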
If you put your API key into the page directly, anyone who visits the page can steal it and run up your bill. We're going to use a small backend (Cloudflare Worker — free, takes 5 minutes) so the key stays on a server, not on the visitor's computer. The full Worker code is in the worked example below.
Step by step

- **Get an API key.** (Probably with a grown-up.) Go to console.anthropic.com. Sign up. Add $5 of credit (this lasts most kids a long time). Go to Keys and create a new one. Treat it like a password. Don't paste it in chat. Don't put it in a screenshot. Don't commit it to GitHub.
- **Stand up a Cloudflare Worker (the safe-key trick).** A Worker is a tiny free server that runs a single JavaScript file. It will hold your API key and forward your requests to Claude. Sign up at workers.cloudflare.com (the free tier covers 100,000 requests/day, more than you'll ever need). Install `wrangler` (their CLI), then `wrangler login`, then `wrangler deploy`. The full Worker code, including CORS handling, retries, and error responses, is in the example below.
- **Call the Worker from your page.** Now your browser talks to your Worker, not to Anthropic directly. The Worker forwards. The key stays safe. Use `async/await` — promise chains will hurt you when the call gets more complex. Always wrap the call in `try/catch` and surface the error to the UI; never leave the user staring at a spinner.
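A minimal sketch of that call, assuming you've filled in your own Worker URL (the function names here are placeholders, not part of the worked example further down):

```javascript
const WORKER_URL = 'https://YOUR-WORKER.workers.dev'; // fill in your own

// Build the request body. Pure function: easy to test, easy to tweak.
function buildRequest(question) {
  return {
    model: 'claude-haiku-4-5',
    max_tokens: 200,
    messages: [{ role: 'user', content: question }],
  };
}

// async/await + try/catch: the error always reaches the UI, never a dead spinner.
async function ask(question, showAnswer, showError) {
  try {
    const r = await fetch(WORKER_URL, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify(buildRequest(question)),
    });
    if (!r.ok) throw new Error('api ' + r.status);
    const data = await r.json();
    // Claude's reply text lives in an array of content blocks.
    showAnswer(data.content.map(b => b.text || '').join(''));
  } catch (e) {
    showError(String(e.message || e));
  }
}
```

Notice the browser never sees an API key: the only URL in this file is your own Worker's.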
- **Make the prompt do the part only you'd ask.** Default prompts produce default answers. The trick of this whole project is that your system prompt has your taste in it. Build a system prompt that scopes Claude tightly: who the user is, what voice to use, what to refuse, what JSON shape to return.
- **Add streaming. The page should feel alive.** Streaming makes the response appear word-by-word instead of all at once. The user feels heard. The page feels alive. Set `stream: true` in your request body and read the response as a stream of Server-Sent Events. About 30 lines of code. The Worker code below already supports it.
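If you're curious what those ~30 lines actually chew on: each chunk of the stream arrives as `data:` lines, and you pull the text fragment out of each one. A standalone sketch (the function names are mine):

```javascript
// Extract the text fragment from one Server-Sent-Events line, or null.
// Claude's stream sends lines like:
//   data: {"type":"content_block_delta","delta":{"type":"text_delta","text":"hel"}}
function deltaText(line) {
  if (!line.startsWith('data: ')) return null;
  try {
    const j = JSON.parse(line.slice(6));
    if (j.type === 'content_block_delta') return j.delta.text || '';
  } catch {
    // keepalive or marker lines aren't JSON; skip them
  }
  return null;
}

// Feed it a whole stream's worth of lines, accumulate the answer.
function collect(lines) {
  let text = '';
  for (const line of lines) {
    const piece = deltaText(line);
    if (piece !== null) text += piece;
  }
  return text;
}
```

In the real page you'd call `deltaText` on each line as it arrives and re-render the partial answer, which is exactly what makes the page feel alive.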
- **Handle the four real failure modes.** (1) Network down: the user has no internet. Show a clear message, not a spinner. (2) Rate limit (429): the user asked too fast. Wait and retry with exponential backoff. (3) Bad JSON: Claude occasionally returns text that doesn't parse. Catch it and fall back. (4) Empty response: the model returned nothing usable. Detect it and show the empty state. The harness below has all four built in.
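Here's the triage as a small sketch you can unit-test without a page (the function names are mine; the harness below does its own version inline):

```javascript
// Map a failed call to one of the four modes the UI must handle.
function classifyFailure({ online, status, rawText }) {
  if (!online) return 'offline';           // (1) network down
  if (status === 429) return 'rate-limit'; // (2) asked too fast
  try {
    const parsed = JSON.parse(rawText);
    if (!parsed || !parsed.content || parsed.content.length === 0) {
      return 'empty';                      // (4) nothing usable came back
    }
  } catch {
    return 'bad-json';                     // (3) reply didn't parse
  }
  return 'ok';
}

// Exponential backoff for the rate-limit case: 1s, 2s, 4s, capped at 8s.
function backoffMs(attempt) {
  return Math.min(1000 * 2 ** attempt, 8000);
}
```

Each mode maps to one friendly message in the UI; the classifier just makes sure you never show the wrong one.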
- **Cap your spend. Set a budget. Telemetry lite.** An API key with no spend cap is a vulnerability. In the Anthropic console, set a hard monthly limit (start at $5). In your Worker, count tokens used per session and refuse politely above a threshold. Log the usage to a simple `localStorage` counter on the client. Real engineers do this on day one.
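The refuse-above-a-threshold idea, boiled down to a sketch you can test without a page (the cap number is just a starting point, not a recommendation):

```javascript
// A tiny session budget: count requests, refuse past the cap.
function makeBudget(cap) {
  let used = 0;
  return {
    tryUse() {
      if (used >= cap) return false; // refuse politely above the threshold
      used++;
      return true;
    },
    remaining: () => cap - used,
  };
}
```

In the real page you'd persist `used` (e.g. to `localStorage`) so a refresh doesn't reset the budget.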
A complete worked example, every file
The full Owen book picker, but now with Claude generating the why-this-book line. Ready to copy. Worker is deployable to Cloudflare in 5 minutes.
```js
/**
 * Cloudflare Worker — Claude API forwarder.
 * - Holds the API key in env.ANTHROPIC_KEY (set via `wrangler secret put`).
 * - Adds CORS so the browser can call it.
 * - Streams responses through to the client.
 * - Retries 429s with exponential backoff.
 * - Refuses suspicious requests (basic abuse guard).
 */
const CORS = {
  'access-control-allow-origin': '*',
  'access-control-allow-methods': 'POST, OPTIONS',
  'access-control-allow-headers': 'content-type',
};

export default {
  async fetch(req, env) {
    if (req.method === 'OPTIONS') return new Response(null, { headers: CORS });
    if (req.method !== 'POST') return new Response('POST only', { status: 405, headers: CORS });

    const body = await req.text();
    // Basic abuse guard — reject huge payloads and malformed requests.
    if (body.length > 50_000) return new Response('payload too large', { status: 413, headers: CORS });

    let parsed;
    try {
      parsed = JSON.parse(body);
    } catch {
      return new Response('bad json', { status: 400, headers: CORS });
    }
    if (!parsed.model || !parsed.messages) {
      return new Response('missing model or messages', { status: 400, headers: CORS });
    }

    return await callClaudeWithRetry(parsed, env, 0);
  },
};

async function callClaudeWithRetry(payload, env, attempt) {
  const r = await fetch('https://api.anthropic.com/v1/messages', {
    method: 'POST',
    headers: {
      'x-api-key': env.ANTHROPIC_KEY,
      'anthropic-version': '2023-06-01',
      'content-type': 'application/json',
    },
    body: JSON.stringify(payload),
  });

  // Retry rate limits with exponential backoff, max 3 attempts.
  if (r.status === 429 && attempt < 3) {
    const wait = Math.min(1000 * Math.pow(2, attempt), 8000);
    await new Promise(res => setTimeout(res, wait));
    return await callClaudeWithRetry(payload, env, attempt + 1);
  }

  // Stream straight through if streaming was requested.
  if (payload.stream && r.body) {
    return new Response(r.body, {
      status: r.status,
      headers: {
        ...CORS,
        'content-type': 'text/event-stream',
        'cache-control': 'no-cache',
      },
    });
  }

  // Non-streamed — pass through with CORS.
  return new Response(await r.text(), {
    status: r.status,
    headers: { ...CORS, 'content-type': 'application/json' },
  });
}
```
```toml
name = "owen-claude-forwarder"
main = "worker.js"
compatibility_date = "2026-04-01"

[vars]
# non-secret vars go here; the API key is set via:
#   wrangler secret put ANTHROPIC_KEY
```
```js
import { OWEN_SYSTEM, recommendPrompt } from './prompts.js';

const ENDPOINT = 'https://owen-claude-forwarder.YOUR-USERNAME.workers.dev';
const SPEND_CAP_REQUESTS = 50; // hard cap per session

const result = document.getElementById('result');
const status = document.getElementById('status');
const usage = document.getElementById('usage');

let inFlight = false; // prevent rapid double-clicks
let sessionRequests = parseInt(localStorage.getItem('owen-req-count') || '0', 10);

function renderUsage() {
  usage.textContent = `requests this session: ${sessionRequests}/${SPEND_CAP_REQUESTS}`;
}

function bumpUsage() {
  sessionRequests++;
  localStorage.setItem('owen-req-count', String(sessionRequests));
  renderUsage();
}

renderUsage(); // show the counter on load without inflating it

async function recommend(vibe) {
  if (inFlight) return;
  if (sessionRequests >= SPEND_CAP_REQUESTS) {
    setState('error', "you've hit the session cap — clear the counter in localStorage to reset.");
    return;
  }
  inFlight = true;
  setState('loading', 'thinking…');
  try {
    const text = await streamClaude({
      model: 'claude-haiku-4-5',
      max_tokens: 200,
      stream: true,
      system: OWEN_SYSTEM,
      messages: [{ role: 'user', content: recommendPrompt(vibe) }],
    });
    bumpUsage();
    setState('done', text || '(empty response — try again)');
  } catch (e) {
    setState('error', friendly(e));
  } finally {
    inFlight = false;
  }
}

async function streamClaude(body) {
  const r = await fetch(ENDPOINT, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (!r.ok) throw new Error('api ' + r.status);

  const reader = r.body.getReader();
  const dec = new TextDecoder();
  let buf = '', text = '';

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buf += dec.decode(value, { stream: true });
    const lines = buf.split('\n');
    buf = lines.pop(); // keep the trailing partial line for the next chunk
    for (const line of lines) {
      if (!line.startsWith('data: ')) continue;
      try {
        const j = JSON.parse(line.slice(6));
        if (j.type === 'content_block_delta') {
          text += j.delta.text || '';
          setState('streaming', text); // partial render
        }
      } catch {
        // ignore non-JSON keepalive lines
      }
    }
  }
  return text;
}

function setState(state, message) {
  result.dataset.state = state;
  result.textContent = message;
  result.setAttribute('aria-live', 'polite');
  status.className = 'pill ' + state;
  status.textContent = state;
}

function friendly(err) {
  if (!navigator.onLine) return "no internet — try again when you're back online.";
  const msg = String(err.message || err);
  if (msg.includes('429')) return "i'm being asked too fast — wait 30 seconds, then try again.";
  if (msg.includes('401')) return 'the API key is wrong on the server. ask whoever set it up to check.';
  if (/api 5\d\d/.test(msg)) return 'claude is having a bad moment. try in a minute.';
  return 'something broke: ' + msg + '. if it keeps happening, tell whoever built this page.';
}

// wire up the buttons
document.querySelectorAll('.vibe').forEach(b =>
  b.addEventListener('click', () => recommend(b.dataset.vibe))
);
```
```js
/**
 * The system prompt is the most important file in this project.
 * It's where Owen's specific taste — and the rules of the page —
 * are encoded. Keep it readable; you'll iterate it a dozen times.
 */
export const OWEN_SYSTEM = `
You suggest the next book for Owen, age 13.

About Owen:
- reads weird sci-fi (VanderMeer, Strugatsky, Le Guin)
- reads narrative history that reads like a novel (Grann, Lansing)
- reads graphic novels with strong art (Saga, Persepolis)
- actively dislikes YA fantasy unless the prose is sharp
- has tried Stephen King twice, didn't stick. don't suggest him.

Output rules:
- suggest exactly ONE book
- format: "Title — Author."
- no padding. no "I think you'll love this!"
- never apologize. never list multiple options. never explain
  that you're an AI. just suggest the book.

If the user asks for something outside Owen's reading taste:
- politely note this isn't Owen's vibe
- suggest a book from his actual taste that comes close

Refuse, with a redirect, if asked anything not about books for Owen.
`.trim();

export function recommendPrompt(vibe) {
  const VIBE_HINTS = {
    'weird-scifi': "Owen wants weird sci-fi today. lean into the strange.",
    'real-history': "Owen wants narrative history that reads like a novel.",
    'graphic': "Owen wants a graphic novel where the art does work the prose can't.",
    'surprise': "pick anything from Owen's taste. but pick something that would surprise him — not the obvious choice.",
  };
  return VIBE_HINTS[vibe] || `Owen says: "${vibe}". interpret in line with his taste.`;
}
```
```sh
## one-time setup

# 1. install wrangler (Cloudflare's CLI)
npm install -g wrangler

# 2. log in
wrangler login

# 3. deploy your worker
wrangler deploy

# 4. set the API key (do NOT put it in code)
wrangler secret put ANTHROPIC_KEY
# (paste your key when prompted; it's stored encrypted on Cloudflare)

# 5. note your worker URL — looks like
#    owen-claude-forwarder.YOUR-USERNAME.workers.dev
#    put it in script.js as ENDPOINT

## redeploy after edits
wrangler deploy

## see live logs
wrangler tail

## set a hard spend cap
# 1. log in to console.anthropic.com
# 2. settings → billing → set monthly spend limit to $5
# 3. anthropic will refuse calls above the cap (better than a $200 bill)
```
Live demo 1: how specific is your system prompt?
Below: a frozen sample. Type a system prompt and a user question; we'll show you what kind of answer Claude would produce given that combination, by mocking the call. (Real calls require a key — your page will do them; this demo just simulates the shape.)
Prompt sketcher (mock)
Live demo 2: simulate the four real failure modes
Click each button to simulate a failure. Watch how the UI handles it. Now copy these patterns into your own code — every real production app handles all four.
Failure-mode simulator
Live demo 3: what does a streaming response feel like?
Most kids haven't experienced the difference. Click "instant" — the answer arrives all at once. Click "stream" — it appears word by word, the way Claude actually delivers it. Same total time. Completely different feel. This is one of the highest-leverage 30 lines of code you'll ever write.
Stream vs. instant
What makes this hard
Three things bite kids on this project: (1) the API key on the client — you'll be tempted to skip the Worker because it's "just a side project." Don't. Anyone with browser dev tools can read your key. The Worker is the right move.
(2) Default prompts produce default answers. The Claude API will give you the median book recommendation if you ask for one. The whole project is iterating the system prompt until the recommendations sound like you would actually agree with them. That iteration is taste.
(3) Forgetting to handle the failures. The happy path is easy. The four failure modes (network, rate limit, bad JSON, empty) are what separates a demo from a tool. Most kid projects ship without them. Yours won't.
Self-check before you ship
- API key is on a Worker, never in the browser source. (Proof: open devtools → sources, search for "sk-". Should find nothing.)
- The system prompt has my voice in it — at least one constraint or preference an adult wouldn't have written.
- The page handles all 4 failure modes (offline / 429 / bad JSON / empty) with friendly messages.
- Spend cap set in the Anthropic console (≤ $5/mo to start).
- Session request counter visible to the user (so I notice if I'm burning tokens).
- The named user has tried it and one of Claude's suggestions actually surprised them.
- I can write one paragraph: "what AI made better, what AI made worse" — based on real evidence.
Push further · for the harder end of 15+
Calling Claude is the floor. Here's how serious it gets.
- Ship a tool-use agent in the browser. Add a tool to the Claude call: `look_up_book_metadata(title)`, which hits the OpenLibrary API. Claude will sometimes choose to call your tool; you handle it, return the data, and Claude continues. This is the foundational pattern of every agent on Earth — and you can implement it in ~50 lines.
- Add structured output validation with retries. Ask Claude to return JSON of a specific shape. Validate it client-side (use the schema-validator pattern from Skills 02). When it doesn't match, automatically re-prompt with the validation error and ask Claude to fix it. Real LLM-powered apps do this. Yours will.
- Build a tiny eval harness. 10 inputs, 10 expected qualities of the response. Run them all through your Worker, score each automatically (with another Claude call as the judge — see Project 11 for the pattern). Track your "win rate" over prompt iterations. The first kid in your school who can show "my prompt went from 60% to 85% on the eval" is having a different conversation than everyone else.
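The validate-and-retry idea from the structured-output bullet can be sketched like this. `callClaude` is a stand-in for your Worker call, and the shape check is deliberately tiny (a real schema validator would do more):

```javascript
// Check the shape we asked Claude for: { title, author, why }, all strings.
function validateRec(text) {
  let j;
  try { j = JSON.parse(text); } catch { return 'not valid JSON'; }
  for (const key of ['title', 'author', 'why']) {
    if (typeof j[key] !== 'string' || !j[key]) return `missing or empty "${key}"`;
  }
  return null; // null means it passed
}

// Ask, validate, and on failure re-prompt with the validation error.
async function askForJson(callClaude, prompt, maxAttempts = 3) {
  let lastError = '';
  for (let i = 0; i < maxAttempts; i++) {
    const fullPrompt = lastError
      ? `${prompt}\n\nYour last reply failed validation: ${lastError}. Return ONLY the JSON.`
      : prompt;
    const reply = await callClaude(fullPrompt);
    const err = validateRec(reply);
    if (!err) return JSON.parse(reply);
    lastError = err;
  }
  throw new Error('gave up after ' + maxAttempts + ' attempts: ' + lastError);
}
```

Feeding the validation error back into the next prompt is the whole trick: Claude is much better at fixing a named mistake than at avoiding an unnamed one.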