Ollama Fundamentals
Running Your First Model
3 min read
Let's pull and run your first local LLM. In just a few minutes, you'll have a powerful AI running entirely on your machine.
Your First Model: Llama 3.2
# Pull the model (one-time download)
ollama pull llama3.2
# Output:
# pulling manifest
# pulling 8934d96d3f08... 100% ▕████████████████▏ 4.7 GB
# pulling 8c17c2ebb0ea... 100% ▕████████████████▏ 7.0 KB
# verifying sha256 digest
# writing manifest
# success
The default model is about 4.7 GB on disk; download time depends on your internet speed.
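As a rough sanity check on download time, you can divide the model size by your connection speed. The 50 Mbps figure below is just an assumed example bandwidth, not anything Ollama reports:

```python
# Rough download-time estimate: size in bits / bandwidth.
model_gb = 4.7                       # size reported by `ollama pull`
bandwidth_mbps = 50                  # assumed connection speed (example)

size_megabits = model_gb * 1000 * 8  # GB -> megabits (decimal units)
seconds = size_megabits / bandwidth_mbps
print(f"~{seconds / 60:.0f} minutes at {bandwidth_mbps} Mbps")  # ~13 minutes at 50 Mbps
```

At 50 Mbps the 4.7 GB download works out to roughly 13 minutes; halve or double the bandwidth and the estimate scales accordingly.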
Running the Model
# Start an interactive chat session
ollama run llama3.2
# You'll see:
>>>
# Type your message and press Enter
>>> What is the capital of France?
The capital of France is Paris.
>>>
Basic Chat Interactions
>>> Explain quantum computing in simple terms
Quantum computing uses quantum mechanics principles to process
information in ways classical computers cannot.
Key concepts:
1. **Qubits**: Unlike classical bits (0 or 1), qubits can exist
in superposition - being 0 and 1 simultaneously.
2. **Entanglement**: Qubits can be connected so measuring one
instantly affects the other, regardless of distance.
3. **Quantum speedup**: For specific problems, quantum computers
can explore many solutions at once, dramatically faster.
>>> /bye
Useful Chat Commands
Inside the chat session, you can use special commands:
>>> /help # Show available commands
>>> /clear # Clear the conversation history
>>> /set parameter # Change runtime parameters
>>> /show info # Display model information
>>> /bye # Exit the chat session (or Ctrl+D)
Multi-line Input
For longer prompts, use triple quotes:
>>> """
... Write a Python function that:
... 1. Takes a list of numbers
... 2. Returns the sum of even numbers
... 3. Includes docstring and type hints
... """
def sum_even_numbers(numbers: list[int]) -> int:
    """
    Calculate the sum of even numbers in a list.

    Args:
        numbers: A list of integers

    Returns:
        The sum of all even numbers in the list
    """
    return sum(n for n in numbers if n % 2 == 0)
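If you paste the generated function into a Python session, you can confirm it does what the prompt asked for:

```python
def sum_even_numbers(numbers: list[int]) -> int:
    """Calculate the sum of even numbers in a list."""
    return sum(n for n in numbers if n % 2 == 0)

print(sum_even_numbers([1, 2, 3, 4, 5, 6]))  # 2 + 4 + 6 = 12
print(sum_even_numbers([1, 3, 5]))           # no even numbers -> 0
```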
Running Different Models
# List downloaded models
ollama list
# NAME ID SIZE MODIFIED
# llama3.2:latest a80c4f17acd5 4.7 GB 5 minutes ago
# Pull and run Mistral
ollama pull mistral
ollama run mistral
# Pull a specific size variant
ollama pull llama3.1:70b # Larger, more capable (llama3.2 itself only ships 1b and 3b)
ollama pull llama3.2:1b # Smaller, faster
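For scripting, the columnar `ollama list` output parses cleanly with whitespace splitting. A minimal sketch using the sample listing shown above (in a real script you would capture the command's output instead of hardcoding it):

```python
# Sample output from `ollama list` (copied from above).
listing = """NAME              ID              SIZE      MODIFIED
llama3.2:latest   a80c4f17acd5    4.7 GB    5 minutes ago"""

models = []
for line in listing.splitlines()[1:]:        # skip the header row
    name, model_id, size, unit = line.split()[:4]
    models.append({"name": name, "id": model_id, "size": f"{size} {unit}"})

print(models[0]["name"], models[0]["size"])  # llama3.2:latest 4.7 GB
```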
One-liner Queries
For scripting and quick queries:
# Single prompt (no interactive session)
ollama run llama3.2 "What is 2+2?"
# Output: 2 + 2 = 4
# Pipe a prompt to the model
echo "Why is the sky blue?" | ollama run llama3.2
# Read from file
cat document.txt | ollama run llama3.2 "Summarize:"
Comparing Models
Try the same prompt with different models:
# Test coding ability
echo "Write a binary search in Python" | ollama run llama3.2
echo "Write a binary search in Python" | ollama run mistral
echo "Write a binary search in Python" | ollama run deepseek-coder
Performance Indicators
Watch the output speed to understand model performance:
# Run with verbose output
ollama run llama3.2 --verbose
>>> Hello
Hello! How can I help you today?
# Stats shown after response:
# total duration: 1.234s
# load duration: 0.123s
# prompt eval count: 5 token(s)
# prompt eval duration: 0.050s
# prompt eval rate: 100.00 tokens/s
# eval count: 12 token(s)
# eval duration: 0.800s
# eval rate: 15.00 tokens/s <-- Generation speed
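Both rates are simply token counts divided by durations. Plugging in the numbers from the stats above:

```python
# Reproduce the two rates from the verbose stats above.
prompt_eval_count, prompt_eval_duration = 5, 0.050  # tokens, seconds
eval_count, eval_duration = 12, 0.800

prompt_rate = prompt_eval_count / prompt_eval_duration
gen_rate = eval_count / eval_duration

print(f"prompt eval rate: {prompt_rate:.2f} tokens/s")  # 100.00 tokens/s
print(f"eval rate: {gen_rate:.2f} tokens/s")            # 15.00 tokens/s
```

The eval rate (generation speed) is the number to watch: it tells you how fast the model produces new tokens on your hardware.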
Model Storage
# Check model storage location
ls ~/.ollama/models/
# See disk usage
du -sh ~/.ollama/models/*
# Remove a model to free space
ollama rm mistral
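If you prefer checking disk usage from Python, here is a minimal sketch of what `du` computes, pointed at the default Ollama model directory (the path is an assumption based on the default install location shown above):

```python
import os

def dir_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files under `path` (like `du`)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):  # skip broken symlinks
                total += os.path.getsize(full)
    return total

# Default model location on Linux/macOS; prints 0.0 GB if it doesn't exist.
models_dir = os.path.expanduser("~/.ollama/models")
print(f"{dir_size_bytes(models_dir) / 1e9:.1f} GB")
```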
Quick Reference
| Command | Description |
|---|---|
| `ollama pull <model>` | Download a model |
| `ollama run <model>` | Start interactive chat |
| `ollama run <model> "prompt"` | Single query |
| `ollama list` | Show downloaded models |
| `ollama rm <model>` | Delete a model |
| `ollama show <model>` | Show model details |
You now have a local LLM running on your machine. No internet required, no API costs, complete privacy. In the next lesson, we'll explore more CLI commands and parameters.