The Challenge: Real-Time AI Interaction
One of the core features of Dojo Genesis is a live chat interface where users can interact with an AI agent. But a simple request-response cycle feels slow and clunky. To create a truly modern and engaging experience, the AI's response needs to stream in token-by-token, just like ChatGPT. This presented a fun technical challenge: how do you efficiently stream data from a Python (FastAPI) backend to a React (Next.js) frontend? As a freelance developer specializing in AI integration, this is a pattern I've been perfecting.
The Build: A Three-Part Architecture
The solution involved three key components working in harmony: the backend streaming endpoint, a Next.js API route acting as a proxy, and the frontend chat component powered by the Vercel AI SDK.
- The Backend (FastAPI with SSE): The first step was to create a backend endpoint that could handle streaming. Instead of a standard JSON response, we used Server-Sent Events (SSE), a simple and efficient protocol for pushing real-time updates from a server to a client. The sse-starlette library makes this incredibly easy. The endpoint calls our AI model router (litellm) with stream=True, and as each chunk of data arrives from the AI model, it's immediately yielded to the client as a JSON object.

```python
# backend/api/ai.py (simplified)
import json

from sse_starlette.sse import EventSourceResponse

@router.post("/ai/chat")
async def run_chat(request: CompletionRequest):
    async def event_stream():
        # litellm's completion() with stream=True returns an async iterator of chunks.
        response_stream = await completion(
            model=request.model, messages=request.messages, stream=True
        )
        async for chunk in response_stream:
            if chunk.choices[0].delta.content:
                # Each token is pushed to the client as soon as it arrives.
                yield {"data": json.dumps({"content": chunk.choices[0].delta.content})}

    return EventSourceResponse(event_stream())
```
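On the wire, each yielded chunk becomes a standard SSE frame: a `data: {"content":"..."}` line followed by a blank line. As a rough sketch of what a hand-rolled consumer would do (the `content` field matches the backend above; in practice the Vercel AI SDK handles this for us), a client can split the buffered stream on blank lines and pull out the text:

```typescript
// Parse a buffered run of SSE frames like:
//   data: {"content":"Hello"}\n\ndata: {"content":" world"}\n\n
// and return the extracted content strings in order.
function parseSSEContent(buffer: string): string[] {
  const tokens: string[] = [];
  // Frames are separated by a blank line per the SSE spec.
  for (const frame of buffer.split("\n\n")) {
    for (const line of frame.split("\n")) {
      if (!line.startsWith("data:")) continue; // skip comments and other fields
      const payload = line.slice("data:".length).trim();
      try {
        const parsed = JSON.parse(payload);
        if (typeof parsed.content === "string") tokens.push(parsed.content);
      } catch {
        // Ignore non-JSON data lines (e.g. keep-alive pings).
      }
    }
  }
  return tokens;
}
```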
- The Proxy (Next.js API Route): We didn't want to expose our backend directly to the public internet or deal with CORS issues in the browser. The solution was a Next.js API route (/api/chat) that acts as an authenticated proxy. It receives the request from the frontend, attaches the user's Supabase JWT for authentication, and forwards the request to the FastAPI backend. It then streams the backend's response directly to the frontend client using the Vercel AI SDK's AIStream and StreamingTextResponse helpers. This keeps our backend secure and our frontend code clean.
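The proxy boils down to "forward the body, attach the token, pipe the stream back." Here's a minimal sketch using only web-standard Request/Response streaming rather than the SDK helpers; BACKEND_URL is a placeholder, and pulling the JWT from the incoming Authorization header is a simplification of the real Supabase session lookup:

```typescript
// app/api/chat/route.ts -- hypothetical sketch of an authenticated streaming proxy.
const BACKEND_URL = process.env.BACKEND_URL ?? "http://localhost:8000";

// Build the forwarded request: pass the body through and attach the user's JWT.
export function buildBackendRequest(body: unknown, jwt: string): Request {
  return new Request(`${BACKEND_URL}/ai/chat`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${jwt}`, // FastAPI backend verifies this token
    },
    body: JSON.stringify(body),
  });
}

// Next.js App Router handler: forward the chat request, stream the reply back.
export async function POST(req: Request): Promise<Response> {
  const body = await req.json();
  // Simplification: real code would read the JWT from the Supabase session.
  const jwt = req.headers.get("authorization")?.replace("Bearer ", "") ?? "";
  const upstream = await fetch(buildBackendRequest(body, jwt));
  // Pipe the backend's SSE body straight through to the browser.
  return new Response(upstream.body, {
    status: upstream.status,
    headers: { "Content-Type": "text/event-stream" },
  });
}
```

Because `Request` and `Response` are web standards, the same shape works in a Next.js route handler without framework-specific imports.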
- The Frontend (Vercel AI SDK): This is where the magic happens. The Vercel AI SDK provides a powerful useChat hook that handles all the complexity of managing chat state: message history, user input, and form submission. We simply point it at our Next.js API route (api: "/api/chat"), and it handles the streaming response, updating the UI in real time as new tokens arrive. This allowed us to build a beautiful, responsive chat interface with just a few lines of code.
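The frontend wiring is roughly as follows (a sketch assuming the SDK's useChat hook from ai/react; component names and styling are illustrative, not from the actual codebase):

```typescript
// components/Chat.tsx -- hypothetical sketch of the chat UI.
"use client";

import { useChat } from "ai/react";

export function Chat() {
  // useChat manages message history, input state, and streaming UI updates.
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    api: "/api/chat", // our Next.js proxy route
  });

  return (
    <div>
      {messages.map((m) => (
        <p key={m.id}>
          <strong>{m.role}:</strong> {m.content}
        </p>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} placeholder="Say something" />
      </form>
    </div>
  );
}
```

As tokens stream in through the proxy, useChat appends them to the last assistant message and re-renders, so the response appears word by word with no manual state management.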
Key Insights
- SSE is Perfect for AI Streaming: While WebSockets are powerful, SSE is a simpler, one-way communication channel that is perfectly suited for streaming AI responses from a server to a client. It's lighter weight and easier to implement on the backend.
- The Proxy Pattern is Your Friend: Using a Next.js API route as a proxy is a powerful pattern for full-stack TypeScript applications. It provides a secure and clean way to communicate with your backend, handle authentication, and hide your backend's implementation details from the client.
- Abstract, Don't Rebuild: The Vercel AI SDK is a prime example of leveraging a well-built library to handle complexity. Instead of manually managing streaming state, message history, and UI updates, we were able to focus on building the UI itself, saving hours of development time.
Results: A Seamless Chat Experience
The final result is a fast, fluid, and fully functional chat interface at the heart of Dojo Genesis. Users can type a message, press send, and immediately see the AI's response streaming in, creating a dynamic and engaging experience. This full-stack integration, from the FastAPI backend to the Next.js frontend, is a testament to the power of modern web development tools and a well-architected system. It's a core piece of my work as a freelance developer building cutting-edge AI applications.
Takeaway
Building a streaming AI chat interface doesn't have to be complicated. By combining the power of FastAPI with SSE, the security of a Next.js proxy route, and the simplicity of the Vercel AI SDK, you can create a world-class user experience with surprisingly little code. The key is to choose the right tool for each part of the job.
