Ollama now offers built-in support for the OpenAI Chat Completions API, enabling seamless integration with more tools and applications locally.
## Setup
Start by downloading Ollama and pulling a model, such as Llama 2 or Mistral:
```shell
ollama pull llama2
```
## Usage
### Using cURL
To interact with Ollama’s API, which is compatible with OpenAI’s format, adjust the hostname to http://localhost:11434:
```shell
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama2",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'
```
### Using the OpenAI Python Library
```python
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1',
    api_key='ollama',  # required but not used
)

response = client.chat.completions.create(
    model="llama2",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The LA Dodgers won in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)

print(response.choices[0].message.content)
```
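Streaming also works through the standard stream parameter, as the Vercel AI SDK example below demonstrates. Here is a minimal sketch using the same Python client (the prompt is illustrative):

```python
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1',
    api_key='ollama',  # required but not used
)

# stream=True returns an iterator of chunks instead of a single response
stream = client.chat.completions.create(
    model="llama2",
    messages=[{"role": "user", "content": "Write a haiku about local LLMs."}],
    stream=True,
)

# Each chunk carries an incremental delta; content can be None on the final chunk
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```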
### Using the OpenAI JavaScript Library
```javascript
import OpenAI from 'openai'

const openai = new OpenAI({
  baseURL: 'http://localhost:11434/v1',
  apiKey: 'ollama', // required but not used
})

const completion = await openai.chat.completions.create({
  model: 'llama2',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})

console.log(completion.choices[0].message.content)
```
## Examples
### Vercel AI SDK
The Vercel AI SDK is an open-source tool for building conversational streaming applications. To get started, clone the example repo with create-next-app:
```shell
npx create-next-app --example https://github.com/vercel/ai/tree/main/examples/next-openai example
cd example
```
Make these two modifications in app/api/chat/route.ts to use Ollama:
```typescript
const openai = new OpenAI({
  baseURL: 'http://localhost:11434/v1',
  apiKey: 'ollama',
});

const response = await openai.chat.completions.create({
  model: 'llama2',
  stream: true,
  messages,
});
```
Run the app:
```shell
npm run dev
```
Then open the example app in your browser at http://localhost:3000.
### Autogen
Autogen is a popular open-source framework from Microsoft for creating multi-agent applications. For this example, use the Code Llama model:
```shell
ollama pull codellama
```
Install Autogen:
```shell
pip install pyautogen
```
Create a Python script example.py to integrate Ollama with Autogen:
```python
from autogen import AssistantAgent, UserProxyAgent

# Point Autogen at Ollama's OpenAI-compatible endpoint
config_list = [
    {
        "model": "codellama",
        "base_url": "http://localhost:11434/v1",
        "api_key": "ollama",
    }
]

# The assistant writes code; the user proxy runs it locally in ./coding
assistant = AssistantAgent("assistant", llm_config={"config_list": config_list})
user_proxy = UserProxyAgent("user_proxy", code_execution_config={"work_dir": "coding", "use_docker": False})
user_proxy.initiate_chat(assistant, message="Plot a chart of NVDA and TESLA stock price change YTD.")
```
Run the example to have the assistant generate code for plotting the chart:
```shell
python example.py
```
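By default, UserProxyAgent pauses to ask for human input between turns. For an unattended run, a variant sketch (the human_input_mode and max_consecutive_auto_reply values here are illustrative assumptions, not part of the original example):

```python
from autogen import AssistantAgent, UserProxyAgent

config_list = [
    {
        "model": "codellama",
        "base_url": "http://localhost:11434/v1",
        "api_key": "ollama",
    }
]

assistant = AssistantAgent("assistant", llm_config={"config_list": config_list})

# "NEVER" skips the human-input prompt; the reply cap keeps the loop bounded
user_proxy = UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=5,
    code_execution_config={"work_dir": "coding", "use_docker": False},
)
user_proxy.initiate_chat(assistant, message="Plot a chart of NVDA and TESLA stock price change YTD.")
```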
## Future Enhancements
This is the initial experimental integration with the OpenAI API. Planned improvements include:
- Support for the Embeddings API
- Function calling
- Vision support
- Logprobs
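For a sense of what the first of these could look like, an Embeddings API call through the same client would presumably mirror OpenAI’s standard shape. This is a hypothetical sketch of a planned feature, not something that works today:

```python
from openai import OpenAI

client = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')

# Hypothetical: assumes the planned Embeddings API follows OpenAI's format
response = client.embeddings.create(
    model="llama2",
    input="The sky is blue because of Rayleigh scattering.",
)
print(len(response.data[0].embedding))  # dimensionality of the returned vector
```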
For more information, see the OpenAI compatibility docs.