How I built a ChatGPT clone in just 1 week

April 02, 2024

I still remember the ChatGPT craze that took the world by storm when it first launched. Sometimes I'm not so quick to jump in on things that are brand new. After I first heard about it, a few weeks went by of me reading articles and posts on Twitter from devs who had started using it to write code and were saying how good it was. Finally, I decided to give it a shot. When I did, I was, just like the rest of the world, in shock. But the more time I spent using it, the more I started to notice not just how good ChatGPT was, but how good the UI/UX was. The user interface was clean, polished and simple, and it was very easy to use. I thought to myself, "man, I wish I could have been part of the team who built this".

Of course, the UI/UX has only been getting better and better, and I'm even more of a fan of it than I was last year. Recently, I thought to myself, "I'm pretty sure I can build it from scratch, and it probably wouldn't take me more than a week to do it." So I decided to give myself a week to build it. I got to work, and here's how I built it.

Determining Scope

The very first order of business was determining just how far I wanted to go with this. Did I want a complete clone, or just a basic interface? I've seen a couple of other "clones" out there that are just a very basic chat UI and nothing more. I wasn't satisfied with that. I started reviewing every part of the web app to identify all of its features and decide which ones would be part of the clone. After a few minutes of clicking around, it was clear that I would build nearly every feature of the web app, except for the user management and plan selection parts. Those felt irrelevant for a clone, so I didn't build them, even though I easily could have. Here's what's included in the clone:

  1. Streaming Chat UI
  2. Animated Chat Suggestions
  3. Model Selection Dropdown
  4. Chat History
  5. ChatGPT-powered Chat titles with typewriter effect
  6. Create and store chats

The Tech Stack

Now that I had determined the scope, next was figuring out which tools and technologies to use. This was pretty easy, as I already had a few ideas in the back of my head. Here's what I went with:

  1. Remix (React)
  2. Tailwind CSS
  3. Vercel AI SDK with OpenAI
  4. Shadcn/UI
  5. Jotai
  6. IndexedDB

Remix

I went with Remix as my React framework of choice because I've been a fan of it ever since it launched. I love how simple it is to use, and how it makes developing apps easy and straightforward, without too many bells and whistles and without a huge learning curve. It really does make for a fun and productive DX. Plus, I love that the whole foundation of the framework is based on web fundamentals, so it pushes web accessibility and progressive enhancement forward. Also, since ChatGPT (web) itself is built with Next.js, I wanted the challenge of building it with something other than Next. Some other options I contemplated were SvelteKit (which I love) and Qwik (which I also love), but ultimately Remix was an especially great fit because I wanted to build the UI with Shadcn/UI, which is React-only.

Tailwind CSS

When I first saw Tailwind, I did not like it. All I could see was the big mess of utility classes in the HTML, and I immediately thought BOOTSTRAP. Yuck. But over time, I saw how many devs were raving about it. I read on Twitter some time ago how Adam Wathan (Tailwind's creator) said "you just got to try it out. You won't regret it." Eventually, I decided to give it a shot. I was blown away by how much it helped me write consistent CSS, and how quickly I could build UI with it. I then realized it's NOT Bootstrap. It's a framework that helps you build your own custom design system, exposing APIs for maximum customizability and composability. It's amazing, and now I won't build anything without it. Did I mention that ChatGPT itself is built on Tailwind?

Shadcn/UI

Shadcn/UI is easily the most popular React component library on the web right now (although technically it says it's not a library). I love the DX it offers, and because it's built on Radix UI primitives, it provides great accessibility and maintainability out of the box. It's also based on Tailwind! Plus, the design is one of the best I've ever seen in UI components. I'm a graphic designer turned front-end engineer, so I have a strong eye for design aesthetics and a high standard for design quality. I love everything about the design of these components, both visually and technically.

Vercel AI SDK

The guys over at Vercel are doing a remarkable job pushing AI on the front-end forward. I first saw the Vercel AI SDK when a couple of the devs working at Vercel were teasing V0, which was blowing people away at the time, and still is. It's a platform that generates UI components from simple text prompts. They call it "generative UI". Neat. That platform is built on the Vercel AI SDK, which its website describes like this: "The Vercel AI SDK is an open-source library designed to help developers build conversational streaming user interfaces in JavaScript and TypeScript. The SDK supports React/Next.js, Svelte/SvelteKit, and Vue/Nuxt as well as Node.js, Serverless, and the Edge Runtime."

Taking a look at their docs, code samples and examples, and seeing how easy it is to get started building AI-powered web apps, I decided it was the best fit for this project. The SDK handles a lot of the plumbing needed to get streaming text responses from OpenAI (and many other providers): managing the message state between client and server, loading and error states, event hooks and much more. While I'm more than capable of doing all of this work myself, why bother when there's already an AI SDK that can do it for me? This was the secret that let me build the clone in a week! Had I handled all of this myself, it could have taken me weeks.

Jotai

One of the biggest things React lacks is a global state management solution. It's one of the reasons I don't like React all that much anymore. The fragmented state of state management in React is a mess, and there's a handful of other solutions out there, some good and some bad. Jotai is one of the good ones, and arguably one of the best out there. I love how simple it is to use. I'm a big fan of atomic state, which keeps things simple and straightforward, yet powerful and expressive enough to build anything you need. I just create small pieces of state called atoms for different parts of the UI, export them, and consume them with useState-style hooks anywhere I want in the app. No messy boilerplate, and no contexts. It just works, and it's extremely scalable.
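To make that concrete, here's a minimal sketch of the pattern. The atom name and values here are mine for illustration, not necessarily what the clone uses:

import { atom, useAtom } from 'jotai'

// a small piece of global state for the currently selected model
// (hypothetical names for illustration)
export const modelAtom = atom({ label: 'GPT-3.5', value: 'gpt-3.5-turbo' })

export function ModelBadge() {
  // reads like useState, but the state lives outside the component
  // tree, so any component can read or update it with no context provider
  const [model, setModel] = useAtom(modelAtom)

  return (
    <button onClick={() => setModel({ label: 'GPT-4', value: 'gpt-4' })}>
      {model.label}
    </button>
  )
}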

IndexedDB

I decided to go with IndexedDB to store the chat data because I didn't want to deal with the overhead of storing user data in the cloud for a simple project like this. If this were a real app, I would definitely have reached for Vercel's KV or Postgres stores, since the clone is deployed on Vercel. If you haven't noticed already, the tech stack is very much a Vercel stack: Shadcn/UI, the Vercel AI SDK, and deployment on Vercel infrastructure via @vercel/remix.

I wanted a solution that was quick and minimal for the purposes of the demo, and it would only need to store the chat history. I initially thought of using localStorage, but while localStorage has an excellent KV-store API and is easy to use, it has a low storage limit of roughly 2.5-10MB depending on the browser, which a growing chat history could easily surpass. So the only other good alternative was IndexedDB. While its API is not as easy and straightforward to use, it has a much higher storage limit: up to 80% of the user's available disk space, with up to 60% usable by a single origin. For example, if you have 100GB free, IndexedDB can use up to 80GB of it, 60GB of that by a single origin. Plenty of space for a bunch of chat threads. Not to mention, it's really fast, and its transactions support access modes (readonly, readwrite) that help optimize for performance. Here's a good article that talks more about how IndexedDB works.
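If you're curious how much headroom your origin actually has, the standard Storage API will tell you:

// logs how many bytes this origin may use and how many it is using
navigator.storage.estimate().then(({ quota, usage }) => {
  console.log(`usage: ${usage} of ${quota} bytes`)
})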

Implementation

Building the ChatGPT clone was fun. I learned a fair deal, and that's really the point of a project like this: learning. Building something from scratch is the best way to understand how it works under the hood, and even though I didn't build every single thing from scratch, I now have a much better understanding of how it all works and what it takes to pull it off.

Chat UI

The star of the show here is the chat UI. It's front and center, and I love that. It's the main experience. Building with the Vercel AI SDK made this a breeze. Here's all the code I needed to get, initiate, and update the chat state:

const {
  messages,
  input,
  handleInputChange,
  handleSubmit,
  append,
  isLoading
} = useChat({
  id: chatID,
  initialMessages: history,
  body: { model: model.value }
})

With the useChat hook, I have access to the messages state that it manages for me, along with a bunch of other things: handlers for the form, the current input value, an isLoading flag, and helpers to append messages programmatically, which is what I used to implement the chat suggestions (see the sketch below). It also takes a config object to initialize the hook with a chat ID, chat history, and a body field that lets you pass arbitrary extra data along with the API request to the /api/chat endpoint, which is really nice. That's how I pass along the currently selected model from global state, which is set by the model selector component.
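For example, the handleSuggestionClick handler (used in the form code later in this post) could plausibly be as small as this; a sketch, not necessarily the clone's exact handler:

// append() sends the suggestion as a user message and kicks off
// the streaming response, no form submission required
const handleSuggestionClick = (suggestion: string) => {
  append({ role: 'user', content: suggestion })
}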


Streaming Text

The useChat hook also handles streaming the response to the UI for you. On the server, the SDK provides OpenAIStream, a utility function that transforms the response from OpenAI's completion and chat models into a ReadableStream.
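To give a sense of what the server half looks like, here's a minimal sketch of a Remix action for the /api/chat endpoint using the SDK. This is a simplified illustration, not the clone's exact code:

import OpenAI from 'openai'
import { OpenAIStream, StreamingTextResponse } from 'ai'
import type { ActionFunctionArgs } from '@remix-run/node'

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })

export async function action({ request }: ActionFunctionArgs) {
  // messages comes from useChat; model is the extra body field
  const { messages, model } = await request.json()

  // ask OpenAI for a streaming chat completion
  const response = await openai.chat.completions.create({
    model,
    stream: true,
    messages
  })

  // pipe the token stream straight back to useChat on the client
  return new StreamingTextResponse(OpenAIStream(response))
}

Back on the client, useChat consumes that stream chunk by chunk and surfaces it through the messages array, which you then map over like so: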

{messages.map(({ id, content, role }, i) => {
  const isChatGPT = role === 'assistant'
  const isStreaming = isChatGPT && isLoading && i === messages.length - 1
  const chunks = chunkContent(content)

  return (
    <div key={id} className='mb-8'>
      <div className='flex items-center mb-1'>
        <div
          className={cn('w-4 h-4 rounded-full mr-2', {
            'bg-purple-500': isChatGPT,
            'bg-white/15': !isChatGPT
          })}
        />
        <strong>{isChatGPT ? 'ChatGPT' : 'You'}</strong>
      </div>

      <Show when={role !== 'data'}>
        {chunks.map((chunk, i) => {
          // only add the text cursor to the last streaming chunk
          const lastChunk = i === chunks.length - 1
          return (
            <p
              key={i}
              className={cn('antialiased ml-6 mb-4 last:mb-0', {
                relative: isStreaming && lastChunk,
                'after:w-4 after:h-4 after:rounded-full after:absolute after:bottom-1 after:ml-1 after:bg-white':
                  isStreaming && lastChunk
              })}
            >
              {chunk}
            </p>
          )
        })}
      </Show>
    </div>
  )
})}
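One helper in that snippet, chunkContent, isn't part of the SDK; it's an app function that isn't shown here. A plausible, purely illustrative version just splits the streamed text into paragraphs so each one renders as its own <p>:

// hypothetical implementation: split on blank lines, drop empties
const chunkContent = (content: string): string[] =>
  content.split('\n\n').filter(Boolean)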

The Form

The form was super simple and easy to build. It's just a textarea, which is needed for multi-line input (as opposed to an input, which is single-line only), styled to look like a text input. Here's the code for the form, which includes the animated chat suggestions:

<Form method='POST' onSubmit={handleSubmit}>
  <div className='mb-2'>
    <Show when={messages.length === 0 && history.length === 0}>
      <ChatSuggestions onSuggestionClick={handleSuggestionClick} />
    </Show>
  </div>

  <Textarea
    placeholder='Message ChatGPT...'
    name='message'
    onChange={handleInputChange}
    value={input}
    className='resize-none'
  />

  <p className='text-center text-xs text-white/50 mt-3 mb-1'>
    ChatGPT can make mistakes. Consider checking important information.
  </p>
</Form>

Notice that I'm using the handleSubmit handler from the useChat hook and passing it into the form. This handles the form submission for me: it takes the current value of the input and the messages array, and formats them the way the chat completions API expects. I'm also passing the handleInputChange handler from useChat to the textarea so that the text input state stays in sync. That state is exposed as the input value from useChat, which I pass back into the textarea's value prop so that when the form submits and the UI re-renders, the textarea gets re-initialized with the input state.

Aside from this, there isn't much left for the chat UI implementation. I implemented a /chat/:id route in order to render a chat thread from history after the browser reloads. Here's how it works at a high level. When a user clicks on a chat in the history sidebar, I route them to /chat/:id. When that route is requested, I access the request params from the loader and call getMessages, a function that takes the chat ID and resolves with the chat data stored in the browser, like so:

export async function clientLoader({ params }: ClientLoaderFunctionArgs) {
  const { id } = params

  if (id) {
    return {
      id,
      messages: await getMessages(id)
    }
  }
}

On the front-end, I then access the data returned from the loader using useLoaderData. This is one of the things I LOVE about Remix: the loader and the route UI are co-located in the same file!

export default function ChatPage() {
  const data = useLoaderData<typeof clientLoader>()

  return (
    <main>
      <ChatLayout id={data?.id} history={data?.messages} />
    </main>
  )
}

Now, I just pass the chat ID and the messages to <ChatLayout />, a component I built in order to reuse the same UI and logic across multiple pages. In my case, I use <ChatLayout /> on the root page and on the chat detail page, so I built it in a way that lets me share the same code in both locations without duplicating it. It exposes a render prop, so that I can share the internal state of <ChatLayout /> while retaining the flexibility to decide what UI to render based on its state. The props interface for <ChatLayout /> looks like this:

interface ChatLayoutProps {
  id?: string
  children?: ReactNode | ((state: State) => ReactNode)
  history?: Messages
}

So the children prop accepts either a ReactNode OR a function that receives the current state and returns a ReactNode. That's why it's called a render prop. Yea, it's fabulous. On the root page, I use it like this:

export default function Index() {
  return (
    <ChatLayout>
      {({ messages }) => {
        return (
          <>
            {messages.length === 0 ? (
              <div className='flex flex-col items-center'>
                <ChatGPTLogo className='scale-90' />

                <h1 className='font-semibold text-2xl text-center antialiased mt-3'>
                  How can I help you today?
                </h1>
              </div>
            ) : null}
          </>
        )
      }}
    </ChatLayout>
  )
}

and in <ChatLayout />, I have this in the render: {typeof children === 'function' ? children({ messages, isLoading }) : children}

This pattern allows developers to create highly composable UIs, which is why I chose to use it in this app.

Chat History

The <ChatHistory /> component was very interesting to build, and I had a blast doing it. It's pretty simple in nature: when the app loads, fetch all of the stored chats from the db and render them grouped by day. Pretty straightforward. This all happens on the client side, since the (indexed) db lives in the browser. Traditionally in React, you'd employ a useEffect to fetch data once on client load, like this:

import { getChatHistory } from '~/lib/indexedDB'
...
// only run this once on load
useEffect(() => {
  const getHistory = async () => {
    const chatHistory = await getChatHistory()
    setHistory(chatHistory)
  }
  getHistory()
}, [])

I started to implement it this way. However, because I'm using Remix v2, a light bulb went off in my head: I can use the new clientLoader API for this! I get the same DX as data fetching on the server, but on the client, and it still uses the same useLoaderData convention; I just type it as useLoaderData<typeof clientLoader>. With this approach, I can effectively eliminate the need for useEffect and useState, which is GREAT. This is all I need to do:

// loader (runs on the client, with server-loader ergonomics)
export async function clientLoader() {
  return { chatHistory: await getChats() }
}

// route component
export default function App() {
  const { chatHistory } = useLoaderData<typeof clientLoader>()
  return (...)
}

All of this was pretty easy to implement. The next thing this component has to do is dynamically render new chats that are created in the chat UI. There were multiple ways I could do this. Some options included:

  1. Global state
  2. Observables
  3. Custom events

At first, I wondered if there was any way to listen for IndexedDB events on the client side. Then I would just set up an event listener and handle some action with the data. Turns out, nothing like this exists natively in the browser's IndexedDB API itself. A quick search yielded Dexie, a wrapper library that, among other things, can emit an observable stream of IndexedDB changes. Pretty cool. But I didn't want to download a whole library just for this.

For this reason, observables were eliminated. Global state was an option, but I didn't want to go through all of the plumbing to set it up. I wanted something extremely simple: to listen to events from IndexedDB as they happened, without doing anything extra. Enter custom events! Much like event emitters in Node, the CustomEvent interface allows you to create arbitrary events that carry data, and subscribe to them anywhere in your app. This is exactly what I was looking for. So now, the implementation looks like this:

const [chats, setChats] = useState<Chat[]>(props.history)
const [latestChat, setLatestChat] = useState<Chat | null>(null)

useEffect(() => {
  // listen for custom chat-created events
  const onCreated = (event: Event) => {
    const chat: Chat = (event as CustomEvent).detail
    setChats(prev => [...prev, chat])
  }
  // listen for custom chat-classified events
  const onClassified = (event: Event) => {
    const chat: Chat = (event as CustomEvent).detail
    setLatestChat(chat)
  }

  window.addEventListener('chat-created', onCreated)
  window.addEventListener('chat-classified', onClassified)

  // remove the listeners on unmount so they don't stack up
  return () => {
    window.removeEventListener('chat-created', onCreated)
    window.removeEventListener('chat-classified', onClassified)
  }
}, [])

The <ChatHistory /> component accepts a history prop, which is an array of chats, initializes its own chat state from that, and then sets up two event listeners to receive chat events and update the state accordingly. The first, chat-created, is fired when a chat is created; the second, chat-classified, is fired once a chat has been given a title by ChatGPT. The first lets me render any chats created after initialization, while the second lets me update the chat title with the signature ChatGPT typewriter effect as soon as I get the classified title back from the API:

useEffect(() => {
  if (chats.length > 0 && latestChat) {
    const { title, id } = latestChat
    const element = document.getElementById(id)

    if (element) {
      typewriter(title as string, element, 50).then(() => {
        setLatestChat(null)
      })
    }
  }
}, [latestChat])
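The typewriter helper itself isn't shown in this post, but a minimal version might look like this; my own sketch, assuming it types text into the element one character at a time:

// types `text` into `element` one character every `delay` ms,
// resolving once the full title has been rendered
const typewriter = (text: string, element: HTMLElement, delay: number) =>
  new Promise<void>(resolve => {
    if (!text) return resolve()
    element.textContent = ''
    let i = 0
    const tick = () => {
      element.textContent += text[i++]
      if (i < text.length) setTimeout(tick, delay)
      else resolve()
    }
    tick()
  })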

And that is essentially it for the <ChatHistory /> component. Again, a really simple component, but super fun to build, and I even learned something new with the CustomEvent API!

Storage

Now we've come to the last part of the implementation. I didn't have a ton of requirements for the db; I just needed something simple, scalable, and fast. I had fun working with IndexedDB. To store the chat data, I wrote a simple set of getter/setter functions like storeMessages, updateChatTitle, getChat, getChats, etc. Overall, it went pretty smoothly. Here's the storeMessages function:

export const storeMessages = async (id: string, messages: Messages) => {
  const db = await openDB()

  if (db) {
    const tx = db.transaction(STORE_NAME, 'readwrite')
    const os = tx.objectStore(STORE_NAME)

    const chat = await new Promise<Chat | undefined>((resolve, reject) => {
      const req = os.get(id)
      req.onerror = () => reject(req.error)
      req.onsuccess = () => resolve(req.result)
    })

    // optimistically create a new chat if it doesn't exist
    if (!chat) {
      return await createChat({
        id,
        messages,
        createdAt: new Date()
      })
    }

    chat.messages = messages
    const req = os.put(chat)

    req.onerror = () => {
      console.error('Failed to store messages:', req.error?.message)
    }
    req.onsuccess = () => {
      console.log('Messages stored successfully!')
    }
  }
}

As you can see, this function takes a chat ID and an array of messages, and stores them in the db. It works very well. I also optimistically create a new chat in the event that the chat doesn't exist for whatever reason. createChat looks like this:


const createChat = async (chat: Chat) => {
  const db = await openDB()

  if (db) {
    const tx = db.transaction(STORE_NAME, 'readwrite')
    const os = tx.objectStore(STORE_NAME)

    const req = os.add(chat)

    req.onerror = () => {
      console.error('Failed to create chat:', req.error?.message)
    }
    req.onsuccess = () => {
      console.log('Chat created successfully!')
      window.dispatchEvent(new CustomEvent('chat-created', { detail: chat }))
    }
  }
}

You can see the event I dispatch after successfully creating the chat. The { detail: chat } is the CustomEvent API's way of carrying whatever data you want to send.
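For reference, both functions rely on an openDB helper that isn't shown above. A minimal version, with hypothetical database and store names, could look like this:

const DB_NAME = 'chatgpt-clone' // hypothetical name
const STORE_NAME = 'chats' // hypothetical name

const openDB = (): Promise<IDBDatabase | null> =>
  new Promise(resolve => {
    const req = indexedDB.open(DB_NAME, 1)

    // create the object store keyed by chat id on first open
    req.onupgradeneeded = () => {
      if (!req.result.objectStoreNames.contains(STORE_NAME)) {
        req.result.createObjectStore(STORE_NAME, { keyPath: 'id' })
      }
    }
    req.onsuccess = () => resolve(req.result)
    req.onerror = () => {
      console.error('Failed to open DB:', req.error?.message)
      resolve(null)
    }
  })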

Challenges

Although the app was a blast to build and went pretty smoothly, it wasn't without its challenges. There were two issues I had to overcome while building this app.

Too Many Re-Renders

The first was the classic re-rendering issue that plagues React apps using useEffect hooks. In the <ChatLayout /> component, I need to take the latest messages returned from the useChat hook and store them in the db. Well, the initial useEffect looked something like this:

import { storeMessages } from '~/lib/indexedDB'

const { messages, isLoading } = useChat()
const chatID = useMemo(() => props.id ?? generateUniqueId(), [props.id])

useEffect(() => {
  // await isn't allowed directly in a useEffect callback,
  // so the async work lives in an inner function
  const syncMessages = async () => {
    if (!isLoading) {
      if (messages.length === 2) {
        // we can use the first message to classify the chat subject
        console.log('classifying chat...')

        const req = await fetch('/api/classify', {
          method: 'POST',
          body: JSON.stringify({ data: messages[0] })
        })
        const res: ChatCompletion = await req.json()
        const title = res.choices[0].message.content

        if (title) {
          // store the messages first, then update the chat title
          await storeMessages(chatID, messages)
          await updateChatTitle(chatID, title)
          navigate(`/chat/${chatID}`)
        }
      } else if (messages.length) {
        await storeMessages(chatID, messages)
        setMessageCount(messages.length)
      }
    }
  }
  syncMessages()
}, [messages])

The problem with this is that the first conditional was supposed to run only on the initial round of messages, but it was running every time messages was updated! This resulted in 4+ calls to storeMessages and updateChatTitle when there should have been only one, with storeMessages then running once for every new message. To fix this, I had to be very specific about when these effects should run, in order to avoid duplicate API calls, db operations, and re-renders. All of this would be detrimental to performance (and very expensive) had this been used by thousands or millions of people.

What I ended up doing to solve the problem was changing the two conditionals to this:

if (messages.length === 2 && !id) {...} // store messages & classify chat
if (messages.length > messageCount && history.length !== messages.length) {...} // store messages

I needed a way to keep track of the message count, and then store the messages only if messages.length is greater than that count, signifying that there were new messages to store. After implementing these changes, the first conditional only runs after the first round of messages (the initial query and ChatGPT's response), and the second runs only when there's a new message available. Nice!

TypeError: nodeResponse.headers.raw is not a function

After I built the app, I was excited to deploy it to Vercel and try it out in production. The deploy went without a hitch, and the UI loaded perfectly fine. I thought, great! Until I submitted my first query. The UI didn't crash, but the streaming response I expected from ChatGPT never came, which was odd. I had a look at the logs on the Vercel dashboard to see what was up, and saw several 500 errors that said: TypeError: nodeResponse.headers.raw is not a function. After an hour or so of Googling, I found that this was a known issue for Remix apps deployed to Vercel using the @vercel/remix package.

Apparently, the issue stems from a compatibility problem between StreamingTextResponse and Remix action functions when using the Vercel AI package. More specifically, it's a compatibility issue between the OpenAI client and Remix, related to their use of node-fetch and @remix-run/web-fetch, respectively. Remix's fetch implementation relies on a polyfilled ReadableStream, which checks for a polyfill-specific property during pipeThrough. The OpenAI client returns a response using a non-polyfilled TransformStream that fails these checks because the polyfill property is absent, leading to the error above.

At the time of this writing, the issue remains unresolved as of Remix v2.7.2. Following this GitHub issue, some people suggested a couple of patches, one of which resolved my issue. It basically involved writing a custom StreamingTextResponse instead of using the one from Vercel AI, one which manually adds in the headers.raw function that's missing in Remix at the moment.
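I won't reproduce the patch from the issue verbatim, but the idea looks roughly like this sketch: build the streaming Response yourself and shim in a node-fetch style headers.raw() so @vercel/remix stops throwing:

// rough sketch of the workaround, not the exact patch from the issue
function remixStreamingTextResponse(stream: ReadableStream): Response {
  const response = new Response(stream, {
    status: 200,
    headers: { 'Content-Type': 'text/plain; charset=utf-8' }
  })

  // re-add the node-fetch style raw() method that Remix's
  // server runtime expects to find on the headers object
  ;(response.headers as any).raw = () => {
    const raw: Record<string, string[]> = {}
    response.headers.forEach((value, key) => {
      raw[key] = [value]
    })
    return raw
  }

  return response
}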

Summary

So this is how I built ChatGPT in one week. OK, I admit, it's not every feature, just the critical parts. There were other parts I did not implement, like function calling, and formatting all of the possible responses to handle code snippets, markdown, etc. But I didn't need to, because it's not the real thing. I just wanted to build enough to learn, at a high level, how it works and what it takes to get it to a usable state.

I'm pretty proud of what I built. It's so good that I've used it for many queries instead of the real ChatGPT UI. Although, I've since stopped doing that, because I'm ultimately paying for those API calls! Who knows, maybe one day I'll get to work on the team that builds ChatGPT at OpenAI, so I can make it better!

Thanks for reading. Oh, and one more thing. It's open source! You can find all of the source code for the ChatGPT clone that I built here. You can try it out here!