How to install Ollama and run models locally

If you want to run open models on your own machine without building an inference stack from scratch, Ollama is one of the easiest starting points.

It gives you a local runtime, a simple CLI, and access to a large model library through a single workflow.

Step 1: Install Ollama

On macOS, install the app from the official Ollama site, or use Homebrew if that is your preferred path.

On Linux, run the official install script:

curl -fsSL https://ollama.com/install.sh | sh

On Windows, download and install Ollama from the official website.

On desktop installs, Ollama usually starts as part of the app or background service.

You can confirm the CLI is available by running:

ollama --help
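If you would rather check this from a script, the same test can be sketched in Python. This is a minimal sketch, not part of Ollama itself: the function name ollama_cli_status is hypothetical, and it assumes only that the ollama binary is on your PATH once installed and that ollama --help exits successfully.

```python
import shutil
import subprocess


def ollama_cli_status(binary: str = "ollama") -> str:
    """Return "missing", "ok", or "error" for the given CLI binary."""
    path = shutil.which(binary)  # look the binary up on PATH
    if path is None:
        return "missing"
    try:
        # Same check as the manual step: `ollama --help` should exit with 0.
        result = subprocess.run(
            [path, "--help"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return "error"
    return "ok" if result.returncode == 0 else "error"


print(ollama_cli_status())
```

A result of "missing" usually means the install did not finish or your shell has not picked up the new PATH yet.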

Once the CLI works, a simple starting example is to run a model:

ollama run gemma3

The first time you run a model, Ollama downloads it. After that, it launches the model locally and opens an interactive prompt.

You can also choose an explicit size tag, or a different model entirely, for example:

ollama run gemma3:12b
ollama run llama3.1:8b
ollama run mistral
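The model references above follow a name:tag pattern, where the tag usually names a size or variant. The sketch below splits a reference into its two parts. The helper split_model_ref is hypothetical, and it assumes that a reference without a tag resolves to the latest tag, which is how Ollama treats names such as mistral.

```python
from typing import Tuple


def split_model_ref(ref: str) -> Tuple[str, str]:
    """Split "name:tag" into (name, tag); assume "latest" when no tag is given."""
    name, sep, tag = ref.rpartition(":")
    # No colon at all, or the colon belongs to a host:port prefix
    # (the part after it still contains a "/"): treat the ref as untagged.
    if not sep or "/" in tag:
        return ref, "latest"
    return name, tag


for ref in ("gemma3:12b", "llama3.1:8b", "mistral"):
    print(ref, "->", split_model_ref(ref))
```

Spelling the tag out in scripts makes it obvious which size you are actually running.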

To list downloaded models:

ollama list

To remove one you no longer need:

ollama rm gemma3:12b

Choosing a model size is where many people get stuck.

Do not start with the biggest model you can find. Start with something that fits your machine comfortably.

As a rough rule, if you care about speed and experimentation, smaller is often the better starting point.
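To make "fits your machine" a little more concrete, here is a back-of-the-envelope estimate in Python. It is an assumption-laden sketch, not a figure from Ollama: it treats weight memory as parameters times bits per weight divided by 8, assumes 4-bit quantized weights by default, and adds a hypothetical 20% overhead for context and runtime buffers. Real usage varies with quantization, context length, and hardware. Under these assumptions, 8 billion parameters at 4 bits per weight is about 4 GB of weights before overhead.

```python
def estimate_model_memory_gb(
    params_billions: float,
    bits_per_weight: float = 4.0,  # assumption: 4-bit quantized weights
    overhead: float = 1.2,         # assumption: about 20% extra for context and buffers
) -> float:
    """Rough memory estimate in GB (1 GB = 1e9 bytes)."""
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9


def fits_in_memory(params_billions: float, available_gb: float) -> bool:
    """True if the rough estimate is within the memory you can spare."""
    return estimate_model_memory_gb(params_billions) <= available_gb


for size in (4, 8, 12, 70):
    print(f"{size}B parameters: about {estimate_model_memory_gb(size):.1f} GB")
```

Treat the output as a sanity check before a large download, not as a guarantee that a model will run well.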

Ollama also exposes a local HTTP API, by default at:

http://localhost:11434

That means you can build local tools, scripts, chat interfaces, and RAG systems on top of it without sending requests to a cloud provider.
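As a starting point for that kind of tool, here is a minimal Python sketch that sends one prompt to the local API using only the standard library. It assumes the server is running at the default address, that the model has already been downloaded, and that you use the /api/generate endpoint with the model, prompt, and stream fields described in Ollama's API documentation. The helper names build_generate_request and generate are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default local address


def build_generate_request(
    model: str, prompt: str, base_url: str = OLLAMA_URL
) -> urllib.request.Request:
    """Build a POST request for the /api/generate endpoint."""
    # stream=False asks for a single JSON object instead of a token stream.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(
    model: str, prompt: str, base_url: str = OLLAMA_URL, timeout: float = 120.0
) -> str:
    """Send one prompt and return the generated text."""
    request = build_generate_request(model, prompt, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["response"]


# Usage (needs a running Ollama server and a downloaded model):
# print(generate("gemma3", "Explain what a local model runtime is in one sentence."))
```

Keeping request construction separate from the network call makes the payload easy to inspect or test without a server running.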

If you are new to local models, start with one general-purpose model and learn its limits before downloading five more.

A good first pass is:

The point is not to collect models. The point is to pick one that fits your machine and your task.

Ollama is one of the simplest ways to run open models locally on macOS, Linux, or Windows without building a custom inference stack from scratch.

Workollab

15 Mar 2026

