ChatGPT API Tutorial: Build Your Own AI Chatbot with Python

The ChatGPT API has opened up extraordinary possibilities for developers who want to integrate conversational AI into their applications. Whether you are building a customer support bot, a coding assistant, or a creative writing companion, the OpenAI API gives you direct access to the same powerful language models behind ChatGPT. In this tutorial, we will walk through building a fully functional AI chatbot from scratch using Python, covering everything from basic API calls to streaming responses and maintaining conversation context.

Setting Up Your Environment

Before writing any code, you need an OpenAI API key and the official Python library installed. Head to platform.openai.com, create an account, and generate an API key from your dashboard. Then set up your Python environment:

pip install openai python-dotenv

Create a .env file in your project root to store your API key securely:

OPENAI_API_KEY=sk-your-api-key-here

Now create your main Python file and load the environment variables:

import os
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

The openai library version 1.x uses a client-based pattern, which is cleaner and more explicit than the older module-level approach. The OpenAI client handles authentication, retries, and connection management automatically.

Making Your First Chat Completion Request

The core of the ChatGPT API is the chat completions endpoint. It accepts a list of messages, each with a role (system, user, or assistant) and content. The system message sets the behavior of the assistant, while user and assistant messages form the conversation history.

def get_chat_response(user_message: str) -> str:
    """Send a single message and return the assistant's reply."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": "You are a helpful programming assistant. "
                           "You give concise, accurate answers with code examples."
            },
            {
                "role": "user",
                "content": user_message
            }
        ],
        temperature=0.7,
        max_tokens=1024
    )
    return response.choices[0].message.content


# Test it out
answer = get_chat_response("How do I reverse a string in Python?")
print(answer)

The temperature parameter controls randomness. A value of 0.0 gives nearly deterministic responses (the API does not guarantee exact repeatability), while values closer to 1.0 produce more varied, creative output. For a chatbot, 0.7 strikes a good balance between helpful and varied answers.

Building a Chatbot with Conversation Memory

A single request-response is useful, but a real chatbot needs to remember the conversation. The API itself is stateless, so you must send the full message history with each request. Here is a complete chatbot class that manages conversation context:

class Chatbot:
    def __init__(self, system_prompt: str = "You are a helpful assistant.",
                 model: str = "gpt-4o"):
        self.model = model
        self.messages: list[dict] = [
            {"role": "system", "content": system_prompt}
        ]

    def chat(self, user_input: str) -> str:
        """Send a message and get a response, maintaining history."""
        self.messages.append({"role": "user", "content": user_input})

        response = client.chat.completions.create(
            model=self.model,
            messages=self.messages,
            temperature=0.7,
            max_tokens=1024
        )

        assistant_message = response.choices[0].message.content
        self.messages.append({"role": "assistant", "content": assistant_message})

        return assistant_message

    def reset(self):
        """Clear conversation history, keeping the system prompt."""
        self.messages = [self.messages[0]]


def main():
    bot = Chatbot(
        system_prompt="You are ByteBot, a friendly coding tutor. "
                      "Explain concepts clearly with examples."
    )

    print("ByteBot is ready! Type 'quit' to exit, 'reset' to start over.\n")

    while True:
        user_input = input("You: ").strip()

        if not user_input:
            continue
        if user_input.lower() == "quit":
            print("Goodbye!")
            break
        if user_input.lower() == "reset":
            bot.reset()
            print("Conversation reset.\n")
            continue

        response = bot.chat(user_input)
        print(f"\nByteBot: {response}\n")


if __name__ == "__main__":
    main()

This chatbot stores every exchange in self.messages, so the model receives full context each time. Keep in mind that longer conversations consume more tokens, so you may want to implement a sliding window or summarization strategy for production use.
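As a starting point for the sliding-window idea, here is a minimal sketch of a trimming helper. The `trim_history` function and its `max_turns` parameter are illustrative names, not part of the OpenAI library; in practice you might trim by token count instead of message count.

```python
def trim_history(messages: list[dict], max_turns: int = 10) -> list[dict]:
    """Keep the system prompt plus only the last max_turns messages."""
    system, rest = messages[:1], messages[1:]
    return system + rest[-max_turns:]


# Example: a long conversation trimmed to the last 4 messages
history = [{"role": "system", "content": "You are helpful."}]
for i in range(6):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_history(history, max_turns=4)
print(len(trimmed))           # 5: system prompt + last 4 messages
print(trimmed[1]["content"])  # question 4
```

You would call trim_history on self.messages just before each API request, leaving the stored history intact while bounding what you send.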

Adding Streaming Responses

For a better user experience, you can stream the response token by token instead of waiting for the complete answer. This makes the chatbot feel much more responsive, especially for longer replies:

def chat_stream(self, user_input: str) -> str:
    """Send a message and stream the response in real-time."""
    self.messages.append({"role": "user", "content": user_input})

    stream = client.chat.completions.create(
        model=self.model,
        messages=self.messages,
        temperature=0.7,
        max_tokens=1024,
        stream=True
    )

    full_response = ""
    for chunk in stream:
        if not chunk.choices:
            continue  # some chunks (e.g., usage reports) carry no choices
        delta = chunk.choices[0].delta
        if delta.content:
            print(delta.content, end="", flush=True)
            full_response += delta.content

    print()  # newline after streaming completes

    self.messages.append({"role": "assistant", "content": full_response})
    return full_response

When stream=True, the API returns an iterator of chunk objects. Each chunk contains a delta with partial content. You print each piece immediately and accumulate the full response for your conversation history.

Error Handling and Production Considerations

A production chatbot needs robust error handling. The OpenAI library provides specific exception classes for different failure scenarios:

from openai import (
    APIConnectionError,
    RateLimitError,
    APIStatusError
)
import time

def chat_with_retry(self, user_input: str, max_retries: int = 3) -> str:
    """Chat with automatic retry on transient failures."""
    self.messages.append({"role": "user", "content": user_input})

    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model=self.model,
                messages=self.messages,
                temperature=0.7,
                max_tokens=1024
            )
            assistant_message = response.choices[0].message.content
            self.messages.append({
                "role": "assistant",
                "content": assistant_message
            })
            return assistant_message

        except RateLimitError:
            wait_time = 2 ** attempt
            print(f"Rate limited. Retrying in {wait_time}s...")
            time.sleep(wait_time)

        except APIConnectionError:
            print("Connection error. Check your network.")
            if attempt == max_retries - 1:
                self.messages.pop()  # remove failed user message
                raise

        except APIStatusError as e:
            print(f"API error {e.status_code}: {e.message}")
            self.messages.pop()
            raise

    self.messages.pop()
    raise RuntimeError("Max retries exceeded")

This implements exponential backoff for rate limits, which is essential when your chatbot handles many concurrent users. Always remove the user message from history if the request ultimately fails, so your conversation state remains consistent.

Conclusion

You now have a solid foundation for building AI chatbots with the OpenAI API. We covered the basics of chat completions, built a class with conversation memory, added streaming for real-time responses, and implemented production-grade error handling. From here, you can extend the chatbot with function calling for tool use, integrate it into a web framework like FastAPI or Flask, or add a database to persist conversation history across sessions. The key takeaway is that the API is stateless by design, so your application controls the context, giving you complete flexibility over how your chatbot behaves.
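As a minimal sketch of the persistence idea, here is a file-based round trip using a JSON file in place of a real database. The `save_history` and `load_history` helpers are hypothetical names invented for this example, not library functions:

```python
import json
from pathlib import Path


def save_history(messages: list[dict], path: str) -> None:
    """Write the conversation history to disk as JSON."""
    Path(path).write_text(json.dumps(messages, indent=2))


def load_history(path: str) -> list[dict]:
    """Reload a saved conversation, or start fresh if none exists."""
    p = Path(path)
    if p.exists():
        return json.loads(p.read_text())
    return [{"role": "system", "content": "You are a helpful assistant."}]


# Example round trip
history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
save_history(history, "session.json")
print(load_history("session.json") == history)  # True
```

Because the messages list is already plain dictionaries, the same serialization works unchanged when you swap the JSON file for a database table keyed by session ID.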
