
The Firefox chatbot sidebar with a local LLM

Firefox has a chatbot sidebar that can be used to interact with the popular LLM chatbot providers, such as ChatGPT, Claude, and Gemini. It's also possible to point it at a local LLM, although that option isn't readily visible.

Firefox running open-webui with ollama

The steps, roughly, involved installing ollama, setting up open-webui, and configuring Firefox.

Ollama

Installing ollama was simple enough; there's a convenience script which also sets it up as a systemd service.
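For reference, it's the usual curl-into-shell affair (worth checking ollama.com for the current instructions before running it):

curl -fsSL https://ollama.com/install.sh | sh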

The only change I made was to the /etc/systemd/system/ollama.service file, to make it listen on all interfaces. I added this line to the [Service] section:

...
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
...
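After editing the unit file, the change takes effect once systemd is reloaded and the service restarted:

sudo systemctl daemon-reload
sudo systemctl restart ollama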

Of course I also pulled a few models locally:

ollama pull llama3.2:1b
ollama pull qwen2.5:1.5b
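To confirm the models were downloaded and that the API is reachable, ollama can list them from the CLI, or the API can be queried directly on its default port, 11434:

ollama list
curl http://localhost:11434/api/tags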

open-webui

Ollama just provides an API, but no web interface. The Firefox chatbot sidebar needs to load a web interface, which is where open-webui comes in.

I decided to run it in Docker.

docker run -d -p 8080:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main

open-webui running in Docker

I then quickly tested it by browsing to http://localhost:8080.

Since ollama is listening on all interfaces, the open-webui container can reach it via the host.docker.internal mapping set up above. open-webui also conveniently lists all the models that ollama has downloaded.
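If open-webui doesn't detect the ollama instance automatically, my understanding is that it can be pointed at it explicitly by adding an environment variable to the docker run command above:

  -e OLLAMA_BASE_URL=http://host.docker.internal:11434

host.docker.internal resolves to the host thanks to the --add-host flag, and 11434 is ollama's default port.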

Firefox config

The final bit was telling Firefox to use the local open-webui, which was done by setting a preference.

In about:config, I searched for browser.ml.chat.hideLocalhost and set it to false. Firefox will then look, by default, for an interface running on http://localhost:8080, which is exactly where open-webui is running.

That was it; the chatbot sidebar started showing “localhost” as an option in the top dropdown.
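For reference, the same thing can be expressed in a user.js file; my understanding is that the chosen provider URL ends up in browser.ml.chat.provider, so the equivalent prefs would look roughly like this (worth verifying the pref names in about:config first):

// show the localhost option in the chatbot sidebar
user_pref("browser.ml.chat.hideLocalhost", false);
// assumption: the custom provider URL is stored in this pref
user_pref("browser.ml.chat.provider", "http://localhost:8080");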

Notes

Although it’s possible, and great for privacy as well as tinkering, I don’t generally like messing about in the about:config settings. It’s too easy to forget what’s been changed, and why.

If I wanted to make this a more permanent solution, I'd probably look at running open-webui under systemd too. I don't think it would be a huge strain on the system, since ollama unloads models from memory when they're not in use.
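A rough sketch of what such a unit might look like, assuming the open-webui container created by the docker run command above already exists (the file name and paths here are just placeholders):

[Unit]
# hypothetical unit, e.g. /etc/systemd/system/open-webui.service
Description=open-webui container
After=docker.service
Requires=docker.service

[Service]
# start the existing container in the foreground so systemd can track it
ExecStart=/usr/bin/docker start -a open-webui
ExecStop=/usr/bin/docker stop open-webui
Restart=on-failure

[Install]
WantedBy=multi-user.target

In that case the --restart always flag from the earlier docker run would probably want dropping, so that systemd alone decides when the container comes back up.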