
Ollama

Ollama is a tool for running large language models locally. Simply download Ollama and install it on your local machine. After installation, open a terminal window and use the ollama CLI command to work with it.

Many models are available for use with Ollama; see Ollama's models page for the full list.

We can use ollama pull to download a model. Here we are using Qwen3.

Pull a model
ollama pull qwen3:0.6b

At only 523 MB, qwen3:0.6b is well suited to local development and testing.

Once the model is pulled, it can be run using ollama run.

Run a model
ollama run qwen3:0.6b
Info

The ollama run command automatically pulls a model that has not been downloaded yet.

ollama run starts an interactive command-line session with the model. Simply type any text to receive a completion from the LLM.
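Besides the interactive CLI, a running Ollama instance also serves a REST API on localhost port 11434, which is handy for calling the model from code. Below is a minimal Python sketch against that API, assuming the default port and the /api/generate endpoint with a non-streaming request; it uses only the standard library.

```python
import json
import urllib.request

# Default local endpoint of the Ollama server (assumed default port 11434).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {
        "model": model,      # e.g. "qwen3:0.6b", the model pulled above
        "prompt": prompt,
        "stream": False,     # ask for a single JSON response instead of a stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def generate(model: str, prompt: str) -> str:
    """Send the prompt to the local Ollama server and return the completion text."""
    with urllib.request.urlopen(build_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # Requires Ollama to be running locally with the model pulled.
    print(generate("qwen3:0.6b", "Why is the sky blue?"))
```

This is the same operation as typing a prompt into the ollama run session, just driven programmatically.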