Ollama now offers built-in support for the OpenAI Chat Completions API, enabling seamless integration with more tools and applications locally.
## Setup
Start by downloading Ollama and pulling a model, such as Llama 2 or Mistral:
```shell
ollama pull llama2
```
## Usage
### Using cURL
To interact with Ollama’s API, which is compatible with OpenAI’s format, adjust the hostname to http://localhost:11434:
```shell
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama2",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'
```
### Using the OpenAI Python Library
```python
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1',
    api_key='ollama',  # required but not used
)

response = client.chat.completions.create(
    model="llama2",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The LA Dodgers won in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)

print(response.choices[0].message.content)
```
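Streaming also works through the standard stream parameter, as the Vercel AI SDK example below demonstrates. Here is a minimal sketch using the same Python client (the prompt is illustrative):

```python
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1',
    api_key='ollama',  # required but not used
)

# stream=True returns an iterator of chunks instead of a single response
stream = client.chat.completions.create(
    model="llama2",
    messages=[{"role": "user", "content": "Write a haiku about local LLMs."}],
    stream=True,
)

# Each chunk carries an incremental delta; content can be None on the final chunk
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```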
### Using the OpenAI JavaScript Library
```javascript
import OpenAI from 'openai'

const openai = new OpenAI({
  baseURL: 'http://localhost:11434/v1',
  apiKey: 'ollama', // required but not used
})

const completion = await openai.chat.completions.create({
  model: 'llama2',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})

console.log(completion.choices[0].message.content)
```
## Examples
### Vercel AI SDK
The Vercel AI SDK is an open-source tool for building conversational streaming applications. To get started, clone the example repo with create-next-app:
```shell
npx create-next-app --example https://github.com/vercel/ai/tree/main/examples/next-openai example
cd example
```
Make these two modifications in app/api/chat/route.ts to use Ollama:
```typescript
const openai = new OpenAI({
  baseURL: 'http://localhost:11434/v1',
  apiKey: 'ollama',
});

const response = await openai.chat.completions.create({
  model: 'llama2',
  stream: true,
  messages,
});
```
Run the app:
```shell
npm run dev
```
Then open the example app in your browser at http://localhost:3000.
### Autogen
Autogen is a popular open-source framework from Microsoft for creating multi-agent applications. For this example, use the Code Llama model:
```shell
ollama pull codellama
```
Install Autogen:
```shell
pip install pyautogen
```
Create a Python script example.py to integrate Ollama with Autogen:
```python
from autogen import AssistantAgent, UserProxyAgent

# Point Autogen at Ollama's OpenAI-compatible endpoint
config_list = [
    {
        "model": "codellama",
        "base_url": "http://localhost:11434/v1",
        "api_key": "ollama",
    }
]

# The assistant writes code; the user proxy runs it locally in ./coding
assistant = AssistantAgent("assistant", llm_config={"config_list": config_list})
user_proxy = UserProxyAgent("user_proxy", code_execution_config={"work_dir": "coding", "use_docker": False})
user_proxy.initiate_chat(assistant, message="Plot a chart of NVDA and TESLA stock price change YTD.")
```
Run the example to have the assistant generate code for plotting the chart:
```shell
python example.py
```
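By default, UserProxyAgent pauses to ask for human input between turns. For an unattended run, a variant sketch (the human_input_mode and max_consecutive_auto_reply values here are illustrative assumptions, not part of the original example):

```python
from autogen import AssistantAgent, UserProxyAgent

config_list = [
    {
        "model": "codellama",
        "base_url": "http://localhost:11434/v1",
        "api_key": "ollama",
    }
]

assistant = AssistantAgent("assistant", llm_config={"config_list": config_list})

# "NEVER" skips the human-input prompt; the reply cap keeps the loop bounded
user_proxy = UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=5,
    code_execution_config={"work_dir": "coding", "use_docker": False},
)
user_proxy.initiate_chat(assistant, message="Plot a chart of NVDA and TESLA stock price change YTD.")
```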
## Future Enhancements
This is the initial experimental integration with the OpenAI API. Planned improvements include:
- Support for the Embeddings API
- Function calling
- Vision support
- Logprobs
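For a sense of what the first of these could look like, an Embeddings API call through the same client would presumably mirror OpenAI’s standard shape. This is a hypothetical sketch of a planned feature, not something that works today:

```python
from openai import OpenAI

client = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')

# Hypothetical: assumes the planned Embeddings API follows OpenAI's format
response = client.embeddings.create(
    model="llama2",
    input="The sky is blue because of Rayleigh scattering.",
)
print(len(response.data[0].embedding))  # dimensionality of the returned vector
```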
For more information, see the OpenAI compatibility docs.