Knowledge Cutoffs & Updates

Every LLM has a knowledge cutoff date—the point where its training data ends. For agents handling real-world tasks, bridging this gap is essential.

Current Knowledge Cutoffs (December 2025)

| Model | Knowledge Cutoff | Notes |
|---|---|---|
| GPT-4o | ~June 2024 | Has web search capability |
| Claude Sonnet 4 | ~March 2025 | Newer Claude models available |
| Gemini 2.5 Pro | ~January 2025 | Has grounding/search |
| Llama 3.3 | ~December 2023 | Open weights model |

Important: Cutoff dates change frequently with model updates. Always check official documentation for current values. Many models now include real-time search capabilities that supplement their training data.
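
Before wiring in search tools, it helps to make the cutoff check explicit in code. A minimal sketch, using the approximate dates from the table above (verify them against official documentation before relying on them) and a hypothetical `needs_fresh_data` helper:

```python
from datetime import date

# Approximate cutoffs from the table above. These drift with every model
# update, so treat them as illustrative, not authoritative.
MODEL_CUTOFFS = {
    "gpt-4o": date(2024, 6, 1),
    "claude-sonnet-4": date(2025, 3, 1),
    "gemini-2.5-pro": date(2025, 1, 1),
    "llama-3.3": date(2023, 12, 1),
}

def needs_fresh_data(model: str, event_date: date) -> bool:
    """Return True if the event postdates the model's training cutoff."""
    cutoff = MODEL_CUTOFFS.get(model)
    if cutoff is None:
        return True  # unknown model: assume a live source is needed
    return event_date > cutoff
```

Defaulting to `True` for unknown models is the safe choice: an unnecessary search costs latency, while a stale answer costs correctness.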

Strategies for Current Information

1. Real-time Search Integration

from langchain_community.tools import DuckDuckGoSearchRun

search = DuckDuckGoSearchRun()

# Assumes `llm` is an initialized chat model. Replace the hardcoded date
# with your model's actual knowledge cutoff.
CUTOFF = "June 2024"

def get_current_info(query):
    """Fetch current information for time-sensitive queries."""
    # Ask the model whether the query likely postdates its training data
    verdict = llm.invoke(
        f"Does this query require information after {CUTOFF}?\n"
        f"Query: {query}\n"
        "Answer (yes/no):"
    )

    if verdict.content.strip().lower().startswith("yes"):
        search_results = search.run(query)
        return f"Current information (as of today): {search_results}"
    return None

2. Scheduled Knowledge Updates

import asyncio

import schedule

class KnowledgeUpdater:
    def __init__(self, vectorstore, sources):
        self.vectorstore = vectorstore
        self.sources = sources

    async def update(self):
        """Run daily to keep knowledge current."""
        for source in self.sources:
            # Fetch new content from the source
            new_docs = await source.fetch_updates()

            # Compare each document against its nearest existing neighbor
            for doc in new_docs:
                existing = self.vectorstore.similarity_search(doc.content, k=1)
                if not existing:
                    self.vectorstore.add_documents([doc])
                elif self.is_significantly_different(doc, existing[0]):
                    # Replace the stale entry with the fresh one
                    self.vectorstore.delete([existing[0].id])
                    self.vectorstore.add_documents([doc])

# Schedule daily updates (assumes `updater = KnowledgeUpdater(...)` exists).
# `schedule` can't await coroutines, so wrap the async method in asyncio.run.
schedule.every().day.at("02:00").do(lambda: asyncio.run(updater.update()))

3. Source Attribution

Always tell users where information comes from:

def answer_with_attribution(query):
    # Retrieve supporting documents from the knowledge base
    docs = retriever.invoke(query)

    response = llm.invoke(f"""
    Based on these sources, answer the question.
    Always cite your sources.

    Sources:
    {format_sources(docs)}

    Question: {query}
    """)

    return {
        "answer": response.content,
        "sources": [
            {
                "title": d.metadata["title"],
                "date": d.metadata["date"],
                "url": d.metadata["url"],
            }
            for d in docs
        ],
    }

Handling Outdated Information

def check_freshness(query, response):
    """Warn users when information might be outdated"""

    # Topics that change frequently
    volatile_topics = [
        "stock price", "weather", "news",
        "latest", "current", "today"
    ]

    if any(topic in query.lower() for topic in volatile_topics):
        return f"""
        {response}

        ⚠️ Note: This information may have changed.
        Last verified: {get_source_date()}
        Consider checking current sources for the latest data.
        """

    return response

Best Practices

| Practice | Implementation |
|---|---|
| Declare limitations | "My knowledge was last updated..." |
| Use real-time tools | Search, APIs for current data |
| Date your sources | Include when info was retrieved |
| Update regularly | Schedule knowledge base refreshes |
| Validate critical info | Cross-reference important facts |
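
The first three practices can be combined into the system prompt itself. A minimal sketch, assuming you know your model's cutoff and the retrieval timestamp (the function name and wording are illustrative, not from any library):

```python
from datetime import date

def build_system_prompt(cutoff: str, retrieved_at: date) -> str:
    """Compose a system prompt that declares knowledge limitations up front."""
    return (
        f"My training data was last updated around {cutoff}. "
        f"The retrieved documents below were fetched on {retrieved_at.isoformat()}. "
        "If the user asks about events after these dates, say so explicitly "
        "and recommend checking a current source."
    )
```

Putting the limitation in the system prompt means every answer inherits it, rather than relying on the model to volunteer the caveat.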

Key Takeaways

  1. Know your model's cutoff and communicate it
  2. Use tools to bridge knowledge gaps
  3. Attribute sources to build trust
  4. Update proactively for time-sensitive domains

Test your memory and knowledge understanding in the quiz!
