Handling User Requests: REPL and Telegram Bot

💬 So far each script asked Claude one fixed question and exited. Now you'll hand the keyboard to a real person. A coding agent is, at heart, a loop: read input, build messages, call Claude, emit output, repeat. The Claude call stays the same; only the I/O around it changes - so in this chapter you wrap that one call in two different frontends, a terminal REPL and a Telegram bot.

  read input → build messages → call Claude → emit output → repeat
                                │
                                ▼
                                stop_reason 'end_turn' ends the turn
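
The loop in the diagram can be sketched independently of any frontend. This is a minimal illustration rather than code from the chapter's example files: runLoop, Frontend, and Ask are hypothetical names, and the ask callback stands in for the Claude call so the sketch runs without a network.

```typescript
// Hypothetical names throughout: runLoop, Frontend, and Ask are illustrations,
// not part of the Anthropic SDK or the chapter's example files.
type Msg = { role: 'user' | 'assistant'; content: string };
type Ask = (messages: Msg[]) => Promise<string>;
type Frontend = {
  inputs: Iterable<string> | AsyncIterable<string>; // read input
  emit: (text: string) => void | Promise<void>; // emit output
};

async function runLoop(frontend: Frontend, ask: Ask): Promise<Msg[]> {
  const messages: Msg[] = [];
  for await (const input of frontend.inputs) {
    messages.push({ role: 'user', content: input }); // build messages
    const reply = await ask(messages); // call Claude (stubbed below)
    await frontend.emit(reply); // emit output
    messages.push({ role: 'assistant', content: reply });
  } // repeat until the input source is exhausted
  return messages;
}

// A stub in place of the real model call: it reports how many messages it saw.
const echoAsk: Ask = async (messages) => `seen ${messages.length} message(s)`;

const printed: string[] = [];
const done = runLoop(
  { inputs: ['hi', 'again'], emit: (text) => { printed.push(text); } },
  echoAsk,
);
```

Under these assumptions, moving from the terminal to Telegram means swapping inputs and emit while runLoop and ask stay untouched.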

You already have new Anthropic(), the env vars, and the content-array narrowing from Chapter 1, plus stream(), .on('text'), and finalMessage() from Chapter 2. Here you only add the input/output plumbing. The one new env var is TELEGRAM_BOT_TOKEN, which lives in .env next to your Anthropic credentials. Bun loads .env automatically, and as with the API key, the token comes from the environment; never hardcode it.

The terminal REPL

The friendliest place to start is your own terminal. In Bun, the console global is an async iterable over the lines of stdin, so for await (const line of console) reads one line per turn with no readline import. You keep a messages: Anthropic.MessageParam[] array and push each user line and each assistant reply onto it, so the model sees the whole session as context.

// bun run examples/03-repl-telegram/repl.ts

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();
const model = process.env.ANTHROPIC_DEFAULT_SONNET_MODEL ?? 'claude-sonnet-4-6';

// One messages array, reused across turns, is this session's whole memory.
const messages: Anthropic.MessageParam[] = [];

process.stdout.write('you: ');
// The loop ends at EOF (Ctrl-D, or end of piped input).
for await (const line of console) {
  const text = line.trim();
  if (!text) {
    process.stdout.write('you: ');
    continue;
  }

  messages.push({ role: 'user', content: text });
  const message = await client.messages.create({ model, max_tokens: 1024, messages });

  // content is a block array; narrow on type === 'text' before reading the reply.
  const first = message.content[0];
  const reply = first?.type === 'text' ? first.text : '';
  console.log(`claude: ${reply}`);

  messages.push({ role: 'assistant', content: reply });
  process.stdout.write('you: ');
}

Notice the two pushes per turn - { role: 'user', content: text } before the call, { role: 'assistant', content: reply } after - and the type === 'text' narrowing on the first content block. Run it and chat:

bun run examples/03-repl-telegram/repl.ts

The reply only appears once Claude has finished the whole answer. Swap create for stream and the same loop prints tokens as they land - the .on('text', ...) and finalMessage() you met in Chapter 2:

// bun run examples/03-repl-telegram/repl-stream.ts

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();
const model = process.env.ANTHROPIC_DEFAULT_SONNET_MODEL ?? 'claude-sonnet-4-6';

const messages: Anthropic.MessageParam[] = [];

process.stdout.write('you: ');
for await (const line of console) {
  const input = line.trim();
  if (input) {
    messages.push({ role: 'user', content: input });

    process.stdout.write('claude: ');
    const stream = client.messages.stream({ model, max_tokens: 1024, messages });
    // 'text' fires once per chunk; write each delta to paint the reply live.
    stream.on('text', (delta) => process.stdout.write(delta));
    const final = await stream.finalMessage();
    process.stdout.write('\n');

    const first = final.content[0];
    const reply = first?.type === 'text' ? first.text : '';
    messages.push({ role: 'assistant', content: reply });
  }
  process.stdout.write('you: ');
}

The assistant text you push back onto messages comes from final.content, narrowed to its first text block, so the next turn carries the full reply as context.
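
Both REPLs read only the first content block. If a reply ever arrives as several text blocks, a small helper could join them instead. The sketch below assumes a simplified Block type that mirrors only the type and text fields narrowed above; textOf is a hypothetical name, not an SDK function.

```typescript
// Simplified stand-in for the SDK's content blocks: only the fields used here.
type Block = { type: string; text?: string };

// Hypothetical helper: concatenate every text block, skipping other block types.
function textOf(content: Block[]): string {
  let out = '';
  for (const block of content) {
    if (block.type === 'text' && typeof block.text === 'string') {
      out += block.text;
    }
  }
  return out;
}

// Sample content, invented for illustration.
const sample: Block[] = [
  { type: 'text', text: 'Hello, ' },
  { type: 'tool_use' },
  { type: 'text', text: 'world.' },
];
console.log(textOf(sample)); // Hello, world.
```

For the plain chat turns in this chapter the first block is normally the whole reply, so this only matters once replies mix text with other block types.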

Talking to Telegram

Let's put the same loop in front of the world. First, message @BotFather on Telegram, send /newbot, and it hands you a token - drop that into .env as TELEGRAM_BOT_TOKEN. The Bot API is plain HTTP: you POST JSON to https://api.telegram.org/bot<token>/<method> with raw fetch, no third-party library.

How do updates reach you? Two ways: polling, where you long-poll getUpdates and Telegram holds the request open until a message arrives, and webhooks, where Telegram POSTs to a public URL of yours. Polling needs nothing but an outbound connection, so it's what you'll use here.

Going deeper: webhooks

Webhooks flip the direction - Telegram calls you - which is more efficient at scale but needs a public HTTPS endpoint (a deployed server or a tunnel). The handler logic is identical; only the transport differs. See the Telegram Bot API docs for setWebhook when you deploy.
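
With a webhook, each POST body carries a single update instead of the array that getUpdates returns. Here is a minimal sketch of the parsing step only. It assumes the webhook was registered with setWebhook's secret_token parameter, which Telegram echoes back in the X-Telegram-Bot-Api-Secret-Token header. parseWebhook and the lowercased header map are hypothetical, and the HTTP server around it is left out.

```typescript
type Update = {
  update_id: number;
  message?: { chat: { id: number }; text?: string };
};

// Hypothetical helper: check the shared secret, then parse the single Update
// that Telegram sends as the POST body. Returns null for anything unexpected.
function parseWebhook(
  headers: Record<string, string | undefined>, // assumed: header names already lowercased
  rawBody: string,
  secret: string,
): Update | null {
  if (headers['x-telegram-bot-api-secret-token'] !== secret) return null;
  try {
    const parsed = JSON.parse(rawBody) as Partial<Update> | null;
    return typeof parsed?.update_id === 'number' ? (parsed as Update) : null;
  } catch {
    return null; // the body was not valid JSON
  }
}

// Sample request data, invented for illustration.
const body = JSON.stringify({ update_id: 7, message: { chat: { id: 42 }, text: 'hi' } });
const accepted = parseWebhook({ 'x-telegram-bot-api-secret-token': 's3cret' }, body, 's3cret');
const rejected = parseWebhook({}, body, 's3cret');
console.log(accepted?.message?.text, rejected); // hi null
```

From there the update can go through the same message handling as a polled one.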

Each getUpdates call returns an array of updates; you pull message.chat.id and message.text from each one, ask Claude, and send the first text block back with sendMessage. Setting offset to update_id + 1 acknowledges every update up to that one, so the next call never returns it again.

// bun run examples/03-repl-telegram/telegram-bot.ts

import Anthropic from '@anthropic-ai/sdk';

const token = process.env.TELEGRAM_BOT_TOKEN;
if (!token) {
  throw new Error('Set TELEGRAM_BOT_TOKEN in your .env (from BotFather).');
}

const base = `https://api.telegram.org/bot${token}`;
const client = new Anthropic();
const model = process.env.ANTHROPIC_DEFAULT_SONNET_MODEL ?? 'claude-sonnet-4-6';

type Update = {
  update_id: number;
  message?: {
    chat: { id: number };
    text?: string;
  };
};

type TgResult<T> = {
  ok: boolean;
  result?: T;
  description?: string;
};

async function tg<T>(method: string, body: object): Promise<TgResult<T>> {
  const response = await fetch(`${base}/${method}`, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify(body),
  });
  return response.json() as Promise<TgResult<T>>;
}

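async function* pollUpdates(): AsyncGenerator<Update> {
  let offset = 0;
  while (true) {
    const { ok, result = [], description } = await tg<Update[]>('getUpdates', { offset, timeout: 30 });
    if (!ok) {
      // Stop on a failed call (bad token, a second poller) instead of retrying in a tight loop.
      throw new Error(`getUpdates failed: ${description ?? 'no description'}`);
    }
    for (const update of result) {
      // Moving the offset past this update acknowledges it on the next call.
      offset = update.update_id + 1;
      yield update;
    }
  }
}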

for await (const update of pollUpdates()) {
  const message = update.message;
  if (!message?.text) {
    continue;
  }
  const reply = await client.messages.create({
    model,
    max_tokens: 1024,
    messages: [{ role: 'user', content: message.text }],
  });
  const first = reply.content[0];
  const answer = first?.type === 'text' ? first.text : '...';
  await tg('sendMessage', { chat_id: message.chat.id, text: answer });
}

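Minimal types describe just the fields you read off each Update and response - no unknown anywhere. The timeout: 30 argument is what makes this a long poll: Telegram holds each request open for up to 30 seconds, so the loop waits for messages rather than hammering the API in a tight loop. Note that each message goes to Claude on its own, so unlike the REPL this bot keeps no memory between messages.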
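
The offset bookkeeping and message filtering inside the polling loop can be isolated as two pure functions, which makes them easy to check without a token. This is a sketch under assumptions: nextOffset and textMessages are hypothetical helpers, and the batch is invented sample data.

```typescript
type Update = {
  update_id: number;
  message?: { chat: { id: number }; text?: string };
};

// Hypothetical helper: the offset for the next getUpdates call. Telegram treats
// every update with an update_id below the offset as confirmed.
function nextOffset(current: number, batch: Update[]): number {
  let offset = current;
  for (const update of batch) {
    offset = Math.max(offset, update.update_id + 1);
  }
  return offset;
}

// Hypothetical helper: keep only the updates that carry a text message.
function textMessages(batch: Update[]): { chatId: number; text: string }[] {
  const found: { chatId: number; text: string }[] = [];
  for (const { message } of batch) {
    if (message?.text) found.push({ chatId: message.chat.id, text: message.text });
  }
  return found;
}

// Sample data, invented for illustration.
const batch: Update[] = [
  { update_id: 100, message: { chat: { id: 1 }, text: 'hello' } },
  { update_id: 101 }, // an update type this bot ignores: no message field
  { update_id: 102, message: { chat: { id: 2 } } }, // a message without text
];

console.log(nextOffset(0, batch)); // 103
console.log(JSON.stringify(textMessages(batch))); // [{"chatId":1,"text":"hello"}]
```

The ignored updates still advance the offset, so they are acknowledged rather than redelivered.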

Streaming into a live-edited message

Here's the fun part: a Telegram message can be edited, so you can stream Claude's reply into one growing bubble. You sendMessage a placeholder to get a message_id, then call editMessageText as deltas arrive.

The headline gotcha: throttle your edits

Editing once per streamed token trips Telegram's flood limit almost instantly. Batch the deltas and edit at most about once per second (track the last-edit time), plus one final edit with the complete text. Two responses you must handle: a 400 whose description says message is not modified (the text didn't change - ignore it), and a 429 carrying parameters.retry_after (wait that many seconds, then retry).
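
The two responses named above can be told apart by one small classifier before any retry logic runs. This is a sketch, not the chapter's implementation: classifyEdit and EditOutcome are hypothetical names, and the description strings in the sample calls are illustrative rather than quoted from Telegram.

```typescript
type TgError = {
  ok: boolean;
  error_code?: number;
  description?: string;
  parameters?: { retry_after?: number };
};

type EditOutcome =
  | { kind: 'done' }
  | { kind: 'retry'; afterMs: number }
  | { kind: 'failed'; reason: string };

// Hypothetical helper: decide what to do with an editMessageText response.
function classifyEdit(result: TgError): EditOutcome {
  if (result.ok) return { kind: 'done' };
  if (result.error_code === 400 && result.description?.includes('message is not modified')) {
    return { kind: 'done' }; // the text did not change, so there is nothing to do
  }
  if (result.error_code === 429) {
    // retry_after is in seconds; fall back to one second if it is missing.
    return { kind: 'retry', afterMs: (result.parameters?.retry_after ?? 1) * 1000 };
  }
  return { kind: 'failed', reason: result.description ?? 'unknown error' };
}

// Illustrative responses; the description wording is an assumption.
const unchanged = classifyEdit({
  ok: false,
  error_code: 400,
  description: 'Bad Request: message is not modified',
});
const flooded = classifyEdit({
  ok: false,
  error_code: 429,
  description: 'Too Many Requests: retry after 3',
  parameters: { retry_after: 3 },
});
console.log(unchanged.kind, flooded.kind); // done retry
```

Keeping a separate failed outcome makes other errors visible instead of silently dropping them.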

// bun run examples/03-repl-telegram/telegram-stream.ts

import Anthropic from '@anthropic-ai/sdk';

const token = process.env.TELEGRAM_BOT_TOKEN;
if (!token) {
  throw new Error('Set TELEGRAM_BOT_TOKEN in your .env');
}

const base = `https://api.telegram.org/bot${token}`;
const client = new Anthropic();
const model = process.env.ANTHROPIC_DEFAULT_SONNET_MODEL ?? 'claude-sonnet-4-6';

type Update = {
  update_id: number;
  message?: {
    chat: { id: number };
    text?: string;
  };
};

type TgResult<T> = {
  ok: boolean;
  result?: T;
  // Telegram sends the error fields only when ok is false.
  error_code?: number;
  description?: string;
  parameters?: { retry_after?: number };
};

async function tg<T>(method: string, body: object): Promise<TgResult<T>> {
  const response = await fetch(`${base}/${method}`, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify(body),
  });
  return response.json() as Promise<TgResult<T>>;
}

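async function* pollUpdates(): AsyncGenerator<Update> {
  let offset = 0;
  while (true) {
    const { ok, result = [], description } = await tg<Update[]>('getUpdates', { offset, timeout: 30 });
    if (!ok) {
      // Stop on a failed call (bad token, a second poller) instead of retrying in a tight loop.
      throw new Error(`getUpdates failed: ${description ?? 'no description'}`);
    }
    for (const update of result) {
      // Moving the offset past this update acknowledges it on the next call.
      offset = update.update_id + 1;
      yield update;
    }
  }
}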

async function edit(chat_id: number, message_id: number, text: string): Promise<void> {
  const result = await tg('editMessageText', { chat_id, message_id, text });
  // Return on success, on "not modified" (the text was unchanged), and on any error other than 429.
  if (result.ok || result.description?.includes('not modified') || result.error_code !== 429) {
    return;
  }
  // 429: wait the number of seconds Telegram asks for, then retry the same edit.
  const retryAfter = result.parameters?.retry_after ?? 1;
  await new Promise((resolve) => setTimeout(resolve, retryAfter * 1000));
  await edit(chat_id, message_id, text);
}

for await (const update of pollUpdates()) {
  const chat_id = update.message?.chat.id;
  const prompt = update.message?.text?.trim();
  if (chat_id === undefined || !prompt) {
    continue;
  }

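  const placeholder = await tg<{ message_id: number }>('sendMessage', {
    chat_id,
    text: '...',
  });
  const message_id = placeholder.result?.message_id;
  if (message_id === undefined) {
    // sendMessage failed, so there is no bubble to edit; skip this update.
    continue;
  }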

  const stream = client.messages.stream({
    model,
    max_tokens: 1024,
    messages: [{ role: 'user', content: prompt }],
  });

  let text = '';
  let lastEdit = 0;
  // Editing on every token trips the flood limit; batch deltas and edit at most ~1/sec.
  stream.on('text', (delta) => {
    text += delta;
    if (Date.now() - lastEdit < 1000) {
      return;
    }
    lastEdit = Date.now();
    // Fire and forget so the stream keeps flowing; log failures instead of leaving them unhandled.
    edit(chat_id, message_id, text).catch(console.error);
  });

  await stream.finalMessage();
  await edit(chat_id, message_id, text);
}

The edit fires only when at least a second has passed since the last one, and the final edit after the stream guarantees the message ends complete even if the last batch was throttled.
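
The once-per-second rule is easier to reason about when the clock is injectable, because you can simulate a stream without waiting. A minimal sketch, assuming a hypothetical makeThrottle helper; the 300 ms delta spacing is invented for the simulation.

```typescript
// Hypothetical helper: returns a function that answers "may I edit now?".
// The clock is injectable so the sketch can be exercised without real delays.
function makeThrottle(intervalMs: number, now: () => number = Date.now): () => boolean {
  let last = Number.NEGATIVE_INFINITY; // the first call is always allowed
  return () => {
    const t = now();
    if (t - last < intervalMs) return false;
    last = t;
    return true;
  };
}

// Simulate deltas arriving every 300 ms for three seconds with a fake clock.
let clock = 0;
const mayEdit = makeThrottle(1000, () => clock);
const editsAt: number[] = [];
for (clock = 0; clock <= 3000; clock += 300) {
  if (mayEdit()) editsAt.push(clock);
}
console.log(editsAt.join(', ')); // 0, 1200, 2400
```

Eleven deltas produce three edits here, and the deltas after 2400 ms are exactly the ones the final post-stream edit has to flush.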

Verifying the bots

The Telegram bots are long-running and need a real TELEGRAM_BOT_TOKEN, so this repo typechecks and reviews them rather than running them end to end. The REPLs you can run right now.

One guard for both frontends

Both frontends face the same messy input, so they share a few pure helpers. You skip empty or whitespace-only lines, answer /start and /help locally before spending an API call, and split any reply over Telegram's 4096-character limit into chunks.

// bun run examples/03-repl-telegram/input-guard.ts

export function isBlank(text: string): boolean {
  return text.trim().length === 0;
}

// Answer /start and /help locally, before spending a token on Claude.
export function handleCommand(text: string): string | null {
  const command = text.trim().split(/\s+/)[0];
  if (command === '/start') return 'Hi! Send me a message and I will ask Claude.';
  if (command === '/help') return 'Just type a question. /start and /help are handled here.';
  return null;
}

// Telegram caps one message at 4096 characters; split longer replies to fit.
export function chunk(text: string, size = 4096): string[] {
  const pieces: string[] = [];
  for (let i = 0; i < text.length; i += size) {
    pieces.push(text.slice(i, i + size));
  }
  return pieces.length > 0 ? pieces : [''];
}

if (import.meta.main) {
  console.log('isBlank("")        ->', isBlank(''));
  console.log('isBlank("   ")     ->', isBlank('   '));
  console.log('isBlank("hi")      ->', isBlank('hi'));
  console.log('handleCommand("/start") ->', handleCommand('/start'));
  console.log('handleCommand("/help")  ->', handleCommand('/help'));
  console.log('handleCommand("weather?")->', handleCommand('weather?'));
  console.log('chunk(9000 chars)  -> lengths', chunk('x'.repeat(9000)).map((p) => p.length));
}

isBlank, handleCommand, and chunk are network-free and independently testable - the file runs standalone and prints each on sample inputs. Wire them in ahead of every Claude call and your loop stays calm under real-world input.
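
One way to wire the helpers in ahead of the Claude call is a small router that decides, per input, whether to skip, answer locally, or ask the model. The sketch below inlines compact copies of the three helpers so it runs on its own; route and Route are hypothetical names, and in the example project you would presumably import the helpers from input-guard.ts instead.

```typescript
// Compact copies of the three helpers, inlined so the sketch runs on its own.
const isBlank = (text: string): boolean => text.trim().length === 0;

function handleCommand(text: string): string | null {
  const command = text.trim().split(/\s+/)[0];
  if (command === '/start') return 'Hi! Send me a message and I will ask Claude.';
  if (command === '/help') return 'Just type a question. /start and /help are handled here.';
  return null;
}

function chunk(text: string, size = 4096): string[] {
  const pieces: string[] = [];
  for (let i = 0; i < text.length; i += size) pieces.push(text.slice(i, i + size));
  return pieces.length > 0 ? pieces : [''];
}

// Hypothetical router: decide what to do with one incoming line or message.
type Route =
  | { kind: 'skip' }
  | { kind: 'local'; reply: string }
  | { kind: 'ask'; prompt: string };

function route(text: string): Route {
  if (isBlank(text)) return { kind: 'skip' };
  const local = handleCommand(text);
  if (local !== null) return { kind: 'local', reply: local };
  return { kind: 'ask', prompt: text.trim() }; // only this branch costs an API call
}

for (const input of ['   ', '/help', 'What is a REPL?']) {
  console.log(JSON.stringify(input), '->', route(input).kind);
}

// A long reply becomes several messages, each within the 4096-character cap.
console.log(chunk('x'.repeat(9000)).map((piece) => piece.length).join(', ')); // 4096, 4096, 808
```

On the Telegram side, each piece returned by chunk would go out as its own sendMessage call; the REPL can print the reply whole.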

What's next: Chapter 4 - Context and Conversation Management.
