GPT4Free aggregates multiple LLM providers into one unified Python client

CLI
66,455 stars

GPT4Free: One Python Client to Rule All the LLMs

If you've ever tried to switch between OpenAI, Cohere, or a dozen other LLM providers, you know the pain. Each API has its own client, authentication, and quirks. It's like speaking a different language every time you want to ask a question.

That's where GPT4Free comes in. It's a Python library that aggregates multiple large language model providers into a single, unified client. No more juggling API keys, different request formats, or wondering which SDK to install next. One import, one interface, many models.

What It Does

GPT4Free is a Python package that wraps multiple LLM APIs and endpoints behind a common interface. You write your code once, and it handles the provider-specific details under the hood. It currently supports a growing list of providers, from well-known ones like OpenAI and Cohere to community-maintained endpoints and even some free tiers.

You don't have to handle each provider's authentication scheme or response parsing yourself. Just pass your prompt, pick the model or provider, and get your response back in a consistent format. Providers can still be down or rate-limited, which is where the fallback feature described below helps.

Why It’s Cool

One API to learn.
You learn g4f.ChatCompletion.create() once. That's it. Whether you're hitting GPT-4, Claude, or a local model, the method stays the same. This is a huge time saver when prototyping or when you want to switch models mid-project.

Provider fallback.
If one provider is down or rate-limited, GPT4Free can automatically fall back to another. You can set a list of providers and let the library try each one until it gets a response. This makes your apps more resilient without extra code.
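To make the fallback idea concrete, here is a minimal sketch of the pattern in plain Python. It is not GPT4Free's own implementation: the names complete_with_fallback, AllProvidersFailed, flaky, and steady are hypothetical, and each "provider" is just a callable that takes a prompt and returns text.

```python
from typing import Callable, Iterable, List, Tuple


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the list has failed (hypothetical name)."""


def complete_with_fallback(
    providers: Iterable[Tuple[str, Callable[[str], str]]],
    prompt: str,
) -> Tuple[str, str]:
    """Try each (name, callable) pair in order and return the first success.

    Returns a (provider_name, response_text) tuple.
    """
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors) or "no providers given")


# Stand-in providers for illustration only; real ones would call a remote API.
def flaky(prompt: str) -> str:
    raise ConnectionError("rate limited")


def steady(prompt: str) -> str:
    return f"answer to {prompt!r}"


name, text = complete_with_fallback([("flaky", flaky), ("steady", steady)], "hi")
print(name, text)
```

The order of the list is the priority order, so put the provider you trust most first. Collecting the error messages instead of discarding them makes it much easier to see why every provider failed when that happens.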

Free tier support.
Yes, there are community-maintained providers that offer free access to some models. While reliability varies, it's incredibly useful for experimentation, learning, or low-budget projects. You can test ideas without spending a dime.

Async and streaming support.
You can stream responses as they arrive, and async support lets you handle multiple requests concurrently without blocking.
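Here is a rough sketch of what "concurrently without blocking" looks like with Python's asyncio. The coroutine fake_completion is a hypothetical stand-in for an async provider call, not a GPT4Free function; swap in whatever async call your installed version offers.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def fake_completion(prompt: str, delay: float = 0.01) -> str:
    """Stand-in for an async provider call (hypothetical, no network access)."""
    await asyncio.sleep(delay)  # simulates waiting on a remote service
    return f"echo: {prompt}"


async def ask_many(
    prompts: Sequence[str],
    complete: Callable[[str], Awaitable[str]] = fake_completion,
) -> List[str]:
    """Send all prompts at once and return the answers in input order."""
    # gather runs the coroutines concurrently and preserves argument order.
    return await asyncio.gather(*(complete(p) for p in prompts))


def run_demo() -> List[str]:
    return asyncio.run(ask_many(["a", "b", "c"]))


if __name__ == "__main__":
    print(run_demo())
```

Because the waits overlap, three requests take roughly as long as the slowest one rather than the sum of all three.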

How to Try It

Getting started takes two minutes.

pip install g4f

Then, in Python:

import g4f

response = g4f.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello, what can you do?"}],
    provider=g4f.Provider.OpenaiChat,  # optional, defaults to smart selection
    stream=True,
)

# With stream=True the response is iterable; print each chunk as it arrives.
for message in response:
    print(message, end="", flush=True)

That's it. You can swap the provider to g4f.Provider.Cohere or g4f.Provider.Bing (where available).
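If you want the full text rather than live output, you can collect the streamed chunks into one string. The helper below is a small sketch with a hypothetical name, collect_stream. It assumes the stream yields text chunks as strings and skips anything else, which is an assumption about the response shape rather than documented behavior.

```python
from typing import Iterable


def collect_stream(chunks: Iterable[object]) -> str:
    """Join the string chunks of a streamed response into a single string.

    Non-string items are skipped (assumption: only str chunks carry text).
    """
    parts = []
    for chunk in chunks:
        if isinstance(chunk, str):
            parts.append(chunk)
    return "".join(parts)


# Works with any iterable of chunks, including a plain list for testing.
print(collect_stream(["Hel", "lo", None, "!"]))
```

With the example above, you would call collect_stream(response) instead of looping and printing.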

Last updated: Jun 17, 2026