Shipping an AI‑powered chatbot is more than calling an API. You need a solid client experience (instant feedback, streaming), a secure server boundary (keys stay server‑side), robust validation, and a plan for costs, logging, and deployment. In this tutorial, you’ll build a production‑ready chatbot using React (TypeScript) for the UI, Node.js (Express) for the API, and OpenAI’s GPT API for responses.
If you prefer Next.js with App Router and typed routes, check our step‑by‑step guide to integrating OpenAI in Next.js: Integrate OpenAI’s API into a Next.js app. For grounding answers on your own docs, see our RAG blueprint: RAG production guide.
TL;DR
- React client with a minimal chat UI; progressive streaming updates.
- Express API with strict input validation and server‑only GPT calls.
- Streaming via Server‑Sent Events (SSE) for responsive UX.
- Safety, rate limiting, and observability you can actually maintain.
- Deployment pointers and internal links to expand the system.
Also revisit your metadata and link strategy for higher engagement: Next.js SEO best practices.
Architecture Overview
+-------------------+ POST /api/chat +-------------------+
| React Client | ───────────────────────────────▶ | Node Express |
| (TypeScript) | | (TypeScript) |
| - Chat UI | ◀─────── SSE: token stream ───── | - OpenAI client |
| - Streaming render| | - Validation |
| - Abort/retry | | - Rate limiting |
+-------------------+ +-------------------+
│
▼
+---------------+
| OpenAI GPT |
+---------------+

Key rules:
- Keep API keys on the server; never call GPT from the browser directly.
- Validate every request (shape, size, temperature bounds).
- Stream for responsiveness; fall back to non‑stream when needed.
- Log request IDs and failure categories for observability.
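Because both tiers exchange the same message shape, it helps to define that contract once and guard it at runtime on the way in. A minimal sketch; the file path and guard name are my own choices, not part of the setup below:

```typescript
// src/shared/chat.ts -- illustrative shared message contract
export type Role = "system" | "user" | "assistant";

export interface ChatMessage {
  role: Role;
  content: string;
}

// Runtime guard for values arriving over the wire (mirrors the zod schema
// the server will use, but is dependency-free for the client).
export const isChatMessage = (v: unknown): v is ChatMessage => {
  if (typeof v !== "object" || v === null) return false;
  const m = v as Record<string, unknown>;
  return (
    (m.role === "system" || m.role === "user" || m.role === "assistant") &&
    typeof m.content === "string" &&
    m.content.length > 0
  );
};
```

Both the React client and the Express handlers can import from this module, so a change to the shape breaks the build instead of failing at runtime.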
Prerequisites
- Node.js 18+
- React 18+
- TypeScript throughout
1) Project Setup
mkdir ai-chatbot && cd ai-chatbot
pnpm init
pnpm add express cors zod openai dotenv
pnpm add -D typescript ts-node @types/node @types/express
npx tsc --init --rootDir src --outDir dist --esModuleInterop true --module commonjs --target es2022
mkdir -p src/server src/client src/shared

Create a .env for secrets:
OPENAI_API_KEY=sk-...
PORT=3001

Update package.json scripts:
{
"scripts": {
"dev": "ts-node src/server/index.ts",
"build": "tsc",
"start": "node dist/server/index.js"
}
}

2) Server: Express API with Validation and OpenAI Client
// src/server/openai.ts
import OpenAI from "openai";
export const getOpenAI = () => {
const apiKey = process.env.OPENAI_API_KEY;
if (!apiKey) throw new Error("Missing OPENAI_API_KEY");
return new OpenAI({ apiKey });
};

// src/server/index.ts
import "dotenv/config";
import express from "express";
import cors from "cors";
import crypto from "node:crypto"; // Node 18 has no global crypto in CommonJS
import { z } from "zod";
import { getOpenAI } from "./openai";
const app = express();
app.use(cors());
app.use(express.json({ limit: "1mb" }));
// Simple in-memory limiter (swap with Redis in production)
const requestsByIp = new Map<string, { count: number; ts: number }>();
const RATE_LIMIT = { windowMs: 60_000, max: 60 };
const limit = (ip: string) => {
const now = Date.now();
const rec = requestsByIp.get(ip);
if (!rec || now - rec.ts > RATE_LIMIT.windowMs) {
requestsByIp.set(ip, { count: 1, ts: now });
return false;
}
rec.count += 1;
return rec.count > RATE_LIMIT.max;
};
const ChatBody = z.object({
messages: z
.array(
z.object({
role: z.enum(["system", "user", "assistant"]),
content: z.string().min(1),
})
)
.min(1),
temperature: z.number().min(0).max(2).optional(),
});
app.post("/api/chat", async (req, res) => {
const ip = (req.headers["x-forwarded-for"] as string)?.split(",")[0]?.trim() || req.ip || "unknown";
if (limit(ip)) return res.status(429).json({ error: "Too many requests" });
const parsed = ChatBody.safeParse(req.body);
if (!parsed.success) return res.status(400).json({ error: "Invalid body" });
const { messages, temperature } = parsed.data;
try {
const openai = getOpenAI();
const completion = await openai.chat.completions.create({
model: "gpt-4o-mini",
messages,
temperature: temperature ?? 0.5,
});
const text = completion.choices[0]?.message?.content ?? "";
return res.json({ id: crypto.randomUUID(), text });
} catch (e: any) {
return res.status(500).json({ error: e?.message || "Server error" });
}
});
// Streaming via SSE
app.post("/api/chat/stream", async (req, res) => {
const ip = (req.headers["x-forwarded-for"] as string)?.split(",")[0]?.trim() || req.ip || "unknown";
if (limit(ip)) return res.status(429).end();
const parsed = ChatBody.safeParse(req.body);
if (!parsed.success) return res.status(400).end();
res.setHeader("Content-Type", "text/event-stream; charset=utf-8");
res.setHeader("Cache-Control", "no-cache, no-transform");
res.setHeader("Connection", "keep-alive");
res.flushHeaders(); // push headers now so the client can start reading immediately
const send = (data: string) => res.write(`data: ${data}\n\n`);
try {
const openai = getOpenAI();
const stream = await openai.chat.completions.create({
  model: "gpt-4o-mini",
  stream: true,
  temperature: parsed.data.temperature ?? 0.5,
  messages: parsed.data.messages,
});
for await (const chunk of stream) {
const delta = chunk.choices[0]?.delta?.content;
if (delta) send(JSON.stringify({ delta }));
}
send(JSON.stringify({ done: true }));
} catch (e: any) {
send(JSON.stringify({ error: e?.message || "stream_error" }));
} finally {
res.end();
}
});
const port = Number(process.env.PORT || 3001);
app.listen(port, () => {
console.log(`Server listening on http://localhost:${port}`);
});

Notes:
- This uses simple in‑memory rate limiting. Replace with Redis or a provider in production.
- Streaming uses SSE (Server‑Sent Events); it’s simple and fits token streams well.
- Keep OPENAI_API_KEY server‑only; do not expose it to the client.
If you’re building with Next.js API routes, see the streaming approach in our Next.js integration guide.
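The SSE wire format used above reduces to two tiny helpers, shown here as a sketch (encodeFrame and decodeFrames are illustrative names, not part of the server code). Keeping them in one place makes the framing testable on its own:

```typescript
// Illustrative SSE helpers: one "data:" line per frame, frames separated
// by a blank line, payloads serialized as JSON.
export const encodeFrame = (payload: unknown): string =>
  `data: ${JSON.stringify(payload)}\n\n`;

// Splits buffered text into complete frames; returns the parsed payloads
// plus any trailing partial frame to carry over into the next read.
export const decodeFrames = (
  buffer: string
): { payloads: unknown[]; rest: string } => {
  const parts = buffer.split("\n\n");
  const rest = parts.pop() ?? "";
  const payloads: unknown[] = [];
  for (const part of parts) {
    const line = part.trim();
    if (!line.startsWith("data:")) continue;
    try {
      payloads.push(JSON.parse(line.slice(5).trim()));
    } catch {
      // ignore malformed frames
    }
  }
  return { payloads, rest };
};
```

Returning the unparsed remainder (`rest`) is what makes the parser safe against network chunks that split a frame in half.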
3) Client: React Chat UI (TypeScript)
We’ll keep the UI minimal and focus on correctness: optimistic append, incremental rendering for streams, and an abort option.
// src/client/ChatApp.tsx
import { useCallback, useMemo, useRef, useState } from "react";
type Role = "system" | "user" | "assistant";
type ChatMessage = { role: Role; content: string };
const ERROR_GENERIC = "Something went wrong. Please try again." as const;
export const ChatApp = () => {
const [messages, setMessages] = useState<ChatMessage[]>([
{ role: "system", content: "You are a helpful, concise assistant." },
]);
const [input, setInput] = useState("");
const [thinking, setThinking] = useState(false);
const [streaming, setStreaming] = useState(true);
const abortRef = useRef<AbortController | null>(null);
const canSend = useMemo(() => input.trim().length > 0 && !thinking, [input, thinking]);
const sendNonStream = useCallback(async (payload: ChatMessage[]) => {
const res = await fetch("http://localhost:3001/api/chat", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ messages: payload }),
});
return (await res.json()) as { text?: string; error?: string };
}, []);
const sendStream = useCallback(async (payload: ChatMessage[]) => {
abortRef.current?.abort();
abortRef.current = new AbortController();
const res = await fetch("http://localhost:3001/api/chat/stream", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ messages: payload }),
signal: abortRef.current.signal,
});
if (!res.body) throw new Error("No stream");
const reader = res.body.getReader();
const decoder = new TextDecoder();
    let acc = "";
    let buffer = "";
    setMessages((prev) => [...prev, { role: "assistant", content: "" }]);
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      buffer += decoder.decode(value, { stream: true });
      // SSE frames are separated by a blank line; keep any partial frame in
      // the buffer so chunk boundaries can't corrupt a JSON payload.
      const frames = buffer.split("\n\n");
      buffer = frames.pop() ?? "";
      for (const frame of frames) {
        const line = frame.trim();
        if (!line.startsWith("data:")) continue;
        const json = line.slice(5).trim();
        let payload: { delta?: string; done?: boolean; error?: string };
        try {
          payload = JSON.parse(json);
        } catch {
          continue; // ignore malformed frames
        }
        if (payload.error) throw new Error(payload.error);
        if (payload.done) return;
        if (payload.delta) {
          acc += payload.delta;
          setMessages((prev) => {
            const copy = [...prev];
            copy[copy.length - 1] = { role: "assistant", content: acc };
            return copy;
          });
        }
      }
    }
}, []);
const onSend = useCallback(async () => {
if (!canSend) return;
const userMsg: ChatMessage = { role: "user", content: input.trim() };
setInput("");
setThinking(true);
setMessages((prev) => [...prev, userMsg]);
try {
const payload = [...messages, userMsg];
if (streaming) {
await sendStream(payload);
} else {
const data = await sendNonStream(payload);
const content = data.text ?? data.error ?? ERROR_GENERIC;
setMessages((prev) => [...prev, { role: "assistant", content }]);
}
} catch (err) {
  // A user-initiated abort is not an error; only surface real failures.
  if ((err as Error)?.name !== "AbortError") {
    setMessages((prev) => [...prev, { role: "assistant", content: ERROR_GENERIC }]);
  }
} finally {
setThinking(false);
}
}, [canSend, input, messages, sendNonStream, sendStream, streaming]);
return (
<div style={{ maxWidth: 720, margin: "0 auto", padding: 16 }}>
<h1>AI Chatbot</h1>
<div style={{ marginBottom: 8, opacity: 0.75 }}>
<label>
<input type="checkbox" checked={streaming} onChange={(e) => setStreaming(e.target.checked)} /> Stream tokens
</label>
{" "}
{thinking && <button onClick={() => abortRef.current?.abort()}>Abort</button>}
</div>
<div style={{ border: "1px solid #ddd", borderRadius: 8, padding: 12, minHeight: 240 }}>
{messages.map((m, i) => (
<div key={i} style={{ whiteSpace: "pre-wrap", margin: "6px 0" }}>
<strong>{m.role}:</strong> {m.content}
</div>
))}
</div>
<div style={{ display: "flex", gap: 8, marginTop: 8 }}>
<input
value={input}
onChange={(e) => setInput(e.target.value)}
placeholder="Ask anything…"
style={{ flex: 1, padding: 8, border: "1px solid #ccc", borderRadius: 6 }}
/>
<button onClick={onSend} disabled={!canSend}>
{thinking ? "Thinking…" : "Send"}
</button>
</div>
</div>
);
};

Mount this component in your SPA (e.g., Vite or CRA) and ensure CORS is allowed from http://localhost:3000 to your API on http://localhost:3001.
4) Safety, Limits, and Observability
- Input validation: bound message length, limit total messages per request.
- Rate limiting: enforce per‑IP and per‑user limits; store counters in Redis.
- Logging: record a request ID, timing, and outcome (success, 4xx, 5xx).
- Cost control: cap temperature, limit maximum tokens, cache deterministic prompts.
- Moderation: consider pre‑ or post‑moderation for public‑facing bots.
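The cost-control bullet is easy to act on even before you add real token accounting: trim old turns before each request. A rough sketch, assuming roughly four characters per token as a crude estimate; the helper name and budget are illustrative:

```typescript
type Role = "system" | "user" | "assistant";
type ChatMessage = { role: Role; content: string };

// Crude cost control: always keep system prompts, then keep as many of the
// most recent turns as fit a character budget (~4 chars per token).
export const trimHistory = (
  messages: ChatMessage[],
  maxChars = 8_000
): ChatMessage[] => {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  const kept: ChatMessage[] = [];
  let used = 0;
  // Walk backwards so the newest turns win the budget.
  for (let i = rest.length - 1; i >= 0; i--) {
    used += rest[i].content.length;
    if (used > maxChars) break;
    kept.unshift(rest[i]);
  }
  return [...system, ...kept];
};
```

Call this on the server just before the OpenAI request so a long-running conversation cannot grow the bill without bound; swap in a real tokenizer when precision matters.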
For enterprise reliability, ground answers on your knowledge and require citations; see our RAG production guide.
5) Testing the Flow
# 1) Start the server
pnpm dev
# 2) Test non-streaming
curl -s -X POST http://localhost:3001/api/chat \
-H 'Content-Type: application/json' \
-d '{"messages":[{"role":"user","content":"Say hello in 5 words"}]}' | jq
# 3) Test streaming (SSE)
curl -N -X POST http://localhost:3001/api/chat/stream \
-H 'Content-Type: application/json' \
-d '{"messages":[{"role":"user","content":"Give 3 bullet tips for React"}]}'

Add a minimal UI integration test to verify the stream renders incrementally and abort works.
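One cheap way to test incremental rendering without a browser is to simulate network chunks that split an SSE frame and assert the accumulated text. A sketch whose splitting logic mirrors the client's; the function name is illustrative:

```typescript
// Simulates reading SSE text in arbitrary chunks, accumulating deltas the
// way the client does (partial frames are buffered between reads).
export const accumulate = (chunks: string[]): string => {
  let buffer = "";
  let acc = "";
  for (const chunk of chunks) {
    buffer += chunk;
    const frames = buffer.split("\n\n");
    buffer = frames.pop() ?? ""; // carry partial frame to the next chunk
    for (const frame of frames) {
      const line = frame.trim();
      if (!line.startsWith("data:")) continue;
      try {
        const { delta } = JSON.parse(line.slice(5).trim());
        if (typeof delta === "string") acc += delta;
      } catch {
        // ignore malformed frames
      }
    }
  }
  return acc;
};
```

Feeding it chunks that break a frame mid-JSON is exactly the edge case a naive per-chunk parser gets wrong.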
6) Production Deployment Pointers
- Host API and client behind a reverse proxy; enable gzip and keep‑alive.
- Use a managed Redis for rate limiting and short‑term caches.
- Rotate the OPENAI_API_KEY; restrict its scope as needed.
- Add health checks, log aggregation, and error alerts.
If you ship with containers or a VPS, follow the practical steps in Deploy Next.js on a VPS for Node runtime pointers (many apply to Express too). For metadata hygiene that boosts CTR, revisit Next.js SEO best practices.
7) Enhancements You’ll Want Next
- Conversations with persistence (DB) and auth‑scoped histories.
- Tools: retrieval over your docs, search, or actions like “create ticket”.
- Guardrails: structured outputs (JSON) validated with zod and rendered safely.
- Analytics: latency percentiles, token usage per request, and cost dashboards.
To wire subscriptions when you monetize, pair the chatbot with a clean billing flow - our Paddle integration for SaaS shows a production setup. For faster developer loops, see Vibe coding with Cursor.
8) Common Pitfalls (and Fixes)
- Calling GPT directly from the client - always call from your server to protect keys.
- No input caps - bound message lengths and total tokens to avoid surprise bills.
- Streaming only on the client - keep a server stream and progressively render.
- Swallowing errors - return clear 4xx/5xx with messages suitable for end users.
- Ignoring observability - add IDs and categories to error logs.
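The observability pitfall above is mostly solved by mapping failures to a small, stable set of categories at the API boundary. A sketch; the category names and helper are assumptions, not part of the server code:

```typescript
type ErrorCategory = "rate_limited" | "invalid_input" | "upstream" | "unknown";

// Map an HTTP status to a stable log category so dashboards can group
// failures instead of bucketing on free-text error messages.
export const categorize = (status: number): ErrorCategory => {
  if (status === 429) return "rate_limited";
  if (status >= 400 && status < 500) return "invalid_input";
  if (status >= 500) return "upstream";
  return "unknown";
};
```

Log the category alongside a request ID and duration on every response, success or failure, and the "what broke, how often" question becomes a query instead of an archaeology session.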
9) FAQ
Can I use WebSockets instead of SSE? Yes. SSE is simpler and good for token streams; WebSockets are useful if you also push events from server to client outside request scope.
Edge vs Node runtime? The official SDK supports fetch, but confirm streaming behavior in your chosen environment. If you pivot to Next.js Edge, validate streaming against the Next.js integration guide.
How do I ground answers on my docs? Implement a retrieval layer before generation; see the RAG production guide.
Conclusion
You now have a secure, responsive chatbot with React and Node.js that can scale as you add features. Keep secrets on the server, validate inputs, stream for responsiveness, and log what matters. From here, consider grounding with RAG, monetization, and deployment hardening. When you’re ready to scale traffic, tighten your hosting setup using our VPS deployment guide and keep your content strategy healthy with the SEO checklist.
