Self-host Llama3 for Bytebase AI Assistant
For data security reasons, you may want to enable AI Assistant with a self-deployed LLM. Here we choose the powerful open-source model Llama3, using Ollama to run it and One API as a relay to convert between Bytebase's OpenAI API-compliant requests and Llama3 API requests.
Prerequisites
Before you begin, make sure you have:
- Docker installed and running
- A self-hosted Bytebase instance
Get Llama3 running in Docker
Run the following command in your terminal to start an Ollama container in Docker:
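A typical invocation, assuming the official `ollama/ollama` image and the default port 11434:

```bash
# Start Ollama in the background, persist models in a named volume,
# and expose the Ollama API on port 11434
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```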
The container starts and returns its ID. Then enter the container with the following command:
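For example, assuming the container is named `ollama` as above:

```bash
# Open an interactive shell inside the running Ollama container
docker exec -it ollama /bin/bash
```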
Pull and run the Llama3 model. Due to mapping issues, the model needs to be renamed to `gpt-3.5-turbo` (or mapped in One API). After renaming, the model name is `gpt-3.5-turbo`, but it is still Llama3 underneath.
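A sketch of this step inside the container, assuming the `llama3` model tag and using `ollama cp` to create the renamed copy:

```bash
# Pull Llama3, copy it under the name gpt-3.5-turbo, then run it
ollama pull llama3
ollama cp llama3 gpt-3.5-turbo
ollama run gpt-3.5-turbo
```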
Now that the model is running, you can test whether the API is working properly in a new terminal window:
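For example, using Ollama's chat endpoint on port 11434 with the renamed model (the exact test request from the original is not shown; this is a minimal sketch):

```bash
# Send a test chat request; Ollama streams the response by default
curl http://localhost:11434/api/chat -d '{
  "model": "gpt-3.5-turbo",
  "messages": [{ "role": "user", "content": "Why is the sky blue?" }]
}'
```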
Seeing the results stream out means the API is working well.
Configure One API
Choose a directory with read and write permissions (replace `YOUR_PATH` in the following command) to save data and logs. For example, you can use the `pwd` command in the macOS terminal to print the current path and replace `YOUR_PATH` with it.
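A minimal sketch of the command, assuming the `justsong/one-api` image from the One API documentation and its default port 3000:

```bash
# Start One API on port 3000, persisting its data and logs under YOUR_PATH
docker run --name one-api -d --restart always \
  -p 3000:3000 \
  -v YOUR_PATH:/data \
  justsong/one-api
```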
Seeing the Docker container start and print its ID means the deployment succeeded. If you encounter any issues, refer to the One API documentation for solutions.
In the Docker dashboard, you can see the one-api container and its address as well. Open `localhost:3000` to log in to the One API dashboard.
Configure Channel
Go to the Channel page and select Add a new channel. Fill in the model information:
- Type: `ollama`
- Name: `Llama3`
- Group: `default`
- Model: `gpt-3.5-turbo`
- Key: anything (for example `SSSS|sssss|1111`) in the format `APPID|APISecret|APIKey`, if Ollama has no key configured
- Proxy: the address of the Ollama container, `http://host.docker.internal:11434`
Furthermore, as mentioned above, the model name can instead be mapped in One API. This can be done in the Model redirection field on this page using a JSON string that maps the requested model name to the actual model name (for example, `{"gpt-3.5-turbo": "llama3"}`).
Configure API Keys
On the API Keys page, click Add New Token, and fill in the Name (for example `Llama3`) and Model scope (for example `gpt-3.5-turbo`).
After clicking Submit, you will see the new API key in the My keys list on the API Keys page. Click Copy to get a token starting with `sk-`, with which you can replace `YOUR_TOKEN` in the code below. If the code runs successfully in your terminal, the One API configuration is complete.
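A minimal test along these lines, assuming One API's OpenAI-compatible chat completions endpoint at `http://localhost:3000`:

```bash
# Replace YOUR_TOKEN with the sk- token copied from One API
curl http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{ "role": "user", "content": "Say hello" }]
  }'
```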
Configure Bytebase and run
In the Bytebase workspace, go to Settings -> General and scroll down to the AI Assistant section. Fill the OpenAI API Key field with the `YOUR_TOKEN` value generated in One API, and fill the OpenAI API Endpoint field with `http://localhost:3000`. Click Update.
Enter SQL Editor from the top of any page. You will see an OpenAI icon in the top-right corner. Click it to start a conversation with the AI Assistant: ask questions in natural language and get SQL results.