
Build a RAG Chatbot with Dify — No Code, Real Output in 5 Steps

Build a fully working RAG chatbot in Dify's visual builder — no code required. Every step was executed live on cloud.dify.ai and produces real output. Covers model setup, Knowledge Base creation with hybrid search, Chatflow wiring, live testing, and API integration.

20 min read
April 10, 2026
NerdLevelTech

{/* Last updated: 2026-04-10 | Verified on: cloud.dify.ai | Dify v1.13.3 | OpenAI plugin v0.3.5 | gpt-5.4 + text-embedding-3-small */}

Every step in this guide was executed live on cloud.dify.ai. The chat outputs shown are real responses from the running chatbot — not fabricated. You can reproduce all of them in under 20 minutes on Dify's free tier.

What You'll Build

A chatbot that:

  • Answers questions grounded in your own documents (not LLM hallucinations)
  • Shows citations — which document chunk backed each answer
  • Is accessible to anyone via a shareable public URL
  • Can be integrated into any app via a REST API call

What you need:

  • A free Dify Cloud account (cloud.dify.ai) — no credit card required
  • An OpenAI API key (platform.openai.com/api-keys) — used for embedding and chat
  • A text or PDF document you want the chatbot to answer questions about
  • ~20 minutes

Estimated OpenAI cost for running this entire guide: < $0.01


Prerequisites

Dify Free Tier Limits

Dify Cloud's free tier[1] includes everything needed for this guide:

Resource                  Free Tier Limit
AI Credits (one-time)     200
Apps                      5
Knowledge Base documents  50
Vector storage            50 MB
Team members              1

This guide consumes fewer than 10 credits total (embedding the knowledge base + two test messages).

What Is Dify?

Dify[2] is an open-source LLM application development platform with 137,000+ GitHub stars[2] as of April 2026. It provides a visual drag-and-drop interface for building AI apps — chatbots, agents, RAG pipelines, and workflows — without writing backend code. Every app you build gets a REST API endpoint automatically.

Dify's five app types:

Type        Use case
Chatbot     Simple single-LLM conversation with memory
Chatflow    Visual node graph with RAG, conditionals, multi-step logic
Agent       LLM + tool-calling (web search, code execution, APIs)
Workflow    Batch automation pipeline (no conversation history)
Completion  Single prompt → single response, no history

We will use Chatflow because it supports Knowledge Retrieval nodes and maintains conversation history across turns.


Step 1 — Create Account & Configure Models {#step1-account-and-models}

1a. Create a Free Account

Go to cloud.dify.ai and sign up. Email confirmation is required. Once logged in you land on the Studio workspace.

Dify Studio home screen showing the empty app workspace after first login

1b. Install the OpenAI Plugin

Dify uses a plugin architecture for model providers. You need to install the OpenAI plugin before you can use any OpenAI models.

  1. Click your account avatar (top-right corner)
  2. Select Settings
  3. In the left sidebar, click Model Provider
  4. Click Go to Marketplace (top-right of the Model Provider page)
  5. Search for OpenAI and click Install

The OpenAI plugin v0.3.5[2] installs in a few seconds. After installation you are automatically redirected back to the Model Provider page, where OpenAI now appears in your installed providers list.

Dify Marketplace showing the OpenAI plugin available to install

1c. Add Your OpenAI API Key

With the OpenAI plugin installed, click Set up next to OpenAI in the Model Provider list.

A dialog appears with a single field: API Key. Enter your OpenAI API key — it starts with sk-. Click Save.

Dify OpenAI plugin setup dialog with the API Key field (key value not shown)

Security note: Your API key is stored encrypted in Dify's backend. It is never shown in the UI again after saving. Do not share your API key in screenshots.

1d. Set System Reasoning Model and Embedding Model

Still on the Model Provider page, scroll to the top and click System Model Settings.

Set the following:

Setting                 Value
System Reasoning Model  gpt-5.4 (or gpt-4.1-mini for lower cost)
Embedding Model         text-embedding-3-small

text-embedding-3-small costs $0.02 per million tokens[3] and produces 1536-dimensional embeddings — more than sufficient for a knowledge base of typical size.
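To put that price in perspective, embedding cost is just token count times the per-token rate. A quick sketch (the document size below is illustrative):

```python
# Rough embedding-cost estimate for text-embedding-3-small.
PRICE_PER_MILLION_TOKENS = 0.02  # USD, per OpenAI's published rate

def embedding_cost(token_count: int) -> float:
    """Return the USD cost of embedding `token_count` tokens."""
    return token_count / 1_000_000 * PRICE_PER_MILLION_TOKENS

# Even a generous 100k-token knowledge base costs a fraction of a cent:
print(f"${embedding_cost(100_000):.4f}")  # → $0.0020
```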

Click Save. The models are now configured globally for all apps in your workspace.

Dify System Model Settings showing gpt-5.4 as reasoning model and text-embedding-3-small as embedding model

Step 2 — Create a Knowledge Base {#step2-knowledge-base}

The Knowledge Base is Dify's built-in vector store. You upload documents here, and Dify handles chunking, embedding, and indexing automatically. When a user sends a message, the Chatflow will query this Knowledge Base to retrieve the most relevant chunks before calling the LLM.

2a. Open Knowledge Base

Click Knowledge in the top navigation bar. Click Create Knowledge.

Dify Knowledge Base home page showing the Create Knowledge button

2b. Upload Your Document

On the Create Knowledge page:

  1. Click Select files (or drag-and-drop) to upload your document
  2. Supported formats: .txt, .pdf, .md, .html, .docx, .csv
  3. After uploading, click Next

For this guide we uploaded a Markdown file containing NerdLevelTech's AI knowledge base — definitions of RAG, vector embeddings, hybrid search, and an overview of Dify itself.

Dify Create Knowledge page showing a Markdown file uploaded and ready for indexing

2c. Configure Chunk Settings

On the Chunk Settings page, leave the defaults — they work well for most documents:

Setting        Default       What it does
Indexing Mode  High Quality  Uses embedding model for semantic search (vs. Economical = keyword-only)
Chunk Method   Automatic     Splits by paragraph/sentence boundaries

Click Save & Process. Dify begins chunking and embedding your document. The status indicator shows Processing → Completed (typically takes 5-30 seconds depending on document size).
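For intuition, the Automatic chunk method behaves roughly like greedy paragraph packing: split on paragraph boundaries, then pack paragraphs into chunks up to a size limit. A simplified sketch, not Dify's actual implementation:

```python
def chunk_by_paragraph(text: str, max_chars: int = 500) -> list[str]:
    """Greedily pack paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars becomes its own chunk.
    """
    chunks, current = [], ""
    for para in (p.strip() for p in text.split("\n\n") if p.strip()):
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks

doc = "RAG stands for Retrieval-Augmented Generation.\n\n" * 20
print(len(chunk_by_paragraph(doc, max_chars=200)))  # → 5
```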

Dify chunk settings page with High Quality indexing mode selected

Once the document is processed, click on the knowledge base name to open its Settings tab, then click Retrieval Settings.

Set the retrieval method to Hybrid Search:

Parameter         Value          Why
Retrieval Method  Hybrid Search  Combines vector similarity (semantic) + BM25 (keyword)
Weight — Vector   0.7            Semantic search handles conceptual questions
Weight — Keyword  0.3            BM25 catches exact term matches
Top K             3              Return top 3 most relevant chunks
Score Threshold   0.5            Filter out chunks below 50% relevance

Hybrid search typically outperforms either method alone because semantic search handles paraphrased questions while BM25 handles exact product names, version numbers, and technical terms.[4]
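Conceptually, hybrid retrieval fuses the two score lists with the weights you just set: a weighted sum per chunk, then the threshold and Top K filters. A simplified sketch, assuming both score sets are already normalized to [0, 1] (Dify's actual fusion internals may differ):

```python
def hybrid_scores(vector_scores, keyword_scores,
                  w_vec=0.7, w_kw=0.3, top_k=3, threshold=0.5):
    """Fuse per-chunk vector and BM25 scores with a weighted sum,
    drop chunks below the threshold, and return the top_k results."""
    fused = {
        chunk: w_vec * vector_scores.get(chunk, 0.0)
               + w_kw * keyword_scores.get(chunk, 0.0)
        for chunk in set(vector_scores) | set(keyword_scores)
    }
    ranked = sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
    return [(c, round(s, 3)) for c, s in ranked if s >= threshold][:top_k]

vec = {"c1": 0.92, "c2": 0.55, "c3": 0.30, "c4": 0.81}
kw  = {"c1": 0.40, "c2": 0.95, "c3": 0.10, "c5": 0.70}
# c1, c2, and c4 clear the 0.5 threshold; c3 and c5 are filtered out.
print(hybrid_scores(vec, kw))
```

Note how c2 ranks high despite a middling vector score: its strong keyword match pulls it up, which is exactly the behavior hybrid search exists to provide.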

Click Save.

Dify Knowledge Base retrieval settings showing Hybrid Search selected with vector weight 0.7 and keyword weight 0.3

Step 3 — Build a RAG Chatflow {#step3-build-chatflow}

3a. Create a New Chatflow App

Click Studio in the top navigation. Click Create App.

Select Chatflow as the app type. Give it a name (e.g., "NerdLevelTech Assistant"). Click Create.

Dify Create App dialog with Chatflow selected and an app name entered

You land in the Chatflow visual editor. By default the canvas has three nodes already connected:

[START] ──→ [LLM] ──→ [ANSWER]

We need to insert a Knowledge Retrieval node between START and LLM, then wire its output into the LLM's context.

3b. Add a Knowledge Retrieval Node

Click the + button on the arrow between START and LLM. A node picker panel appears.

Select Knowledge Retrieval. The new node is inserted automatically between START and LLM.

Dify Chatflow canvas showing the node picker with Knowledge Retrieval highlighted

3c. Configure the Knowledge Retrieval Node

Click on the Knowledge Retrieval node to open its settings panel on the right.

  1. Query Variable — set to sys.query (this is the user's message)
  2. Click Add Knowledge and select the knowledge base you created in Step 2
  3. The retrieval settings (Hybrid Search, Top K = 3) you configured on the knowledge base are used automatically

Knowledge Retrieval node settings panel showing sys.query as the query variable and the knowledge base connected

3d. Connect Retrieval Output to the LLM

The Knowledge Retrieval node outputs a variable called result — a list of retrieved text chunks. You need to inject these chunks into the LLM's system prompt so the model can use them as context.

Click on the LLM node to open its settings panel.

In the SYSTEM prompt field, paste the following:

You are a helpful assistant. Answer the user's question based on the context below.
If the answer is not in the context, say you don't know — do not make up an answer.

Context:
{{#context#}}

Important: The {{#context#}} variable is Dify's special syntax that references the Knowledge Retrieval node's output. When you type or paste it, Dify parses it into a "Context" chip — a visual indicator that the retrieval output is wired into this prompt.

Tip: If you have trouble typing { into the prompt field because Dify's variable picker opens, paste the full prompt text via the clipboard instead of typing it character by character.

Dify LLM node system prompt field showing the Context chip rendered from the {{#context#}} variable

3e. Verify the Node Graph

Your final Chatflow should look like this:

[START]
  │  sys.query (user message)
[Knowledge Retrieval]
  │  result (top-3 chunks from Knowledge Base)
[LLM]  ◄── SYSTEM prompt contains {{#context#}}
  │  text (generated answer)
[ANSWER]

The retrieval result flows from Knowledge Retrieval → LLM via the {{#context#}} variable in the system prompt. The user's original message (sys.query) flows separately to the LLM as the human turn.
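In effect, the Context chip performs a template substitution: the retrieved chunks are joined and inserted where {{#context#}} appears. A simplified sketch of that substitution (build_system_prompt is a hypothetical helper, not Dify code):

```python
SYSTEM_TEMPLATE = """You are a helpful assistant. Answer the user's question based on the context below.
If the answer is not in the context, say you don't know — do not make up an answer.

Context:
{context}"""

def build_system_prompt(retrieved_chunks: list[str]) -> str:
    """Join retrieved chunks and substitute them where {{#context#}} renders."""
    return SYSTEM_TEMPLATE.format(context="\n\n---\n\n".join(retrieved_chunks))

prompt = build_system_prompt([
    "Dify is an open-source LLM application development platform.",
    "Hybrid search combines vector similarity with BM25 keyword matching.",
])
print(prompt)
```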

3f. Publish the App

Click Publish (top-right of the canvas). Then click Publish again in the confirmation dialog.

The chatbot is now live. Dify generates a public URL in the format https://udify.app/chat/{id} that anyone can access to talk to your chatbot.


Step 4 — Test the Live Chatbot {#step4-test-live}

4a. Open the Published Chatbot

After clicking Publish, click Open in Browser (or the external link icon next to the Publish button) to open the live chatbot in a new tab.

Published NerdLevelTech AI Assistant chatbot on udify.app — empty chat ready for the first message

4b. Send Your First Test Message

Type a question about your knowledge base content and press Enter. Here are the real outputs from our live test session (April 10, 2026):


Test 1: Feature question

User: What is Dify and what are its key features?

Chatbot:

Dify is an open-source LLM application development platform that provides a visual interface for building AI-powered applications, including chatbots, RAG pipelines, agents, and workflows, without requiring deep programming knowledge.

Key features of Dify include:

  1. Visual drag-and-drop app builder supporting Chatbot, Agent, Workflow, Chatflow, and Completion app types
  2. Built-in Knowledge Base with vector storage and hybrid search
  3. Support for 100+ LLM providers including OpenAI, Anthropic, Google Gemini, and local models via Ollama
  4. One-click API publishing — every app automatically gets a REST endpoint
  5. Prompt management with versioning and A/B testing
  6. Built-in observability with LLM call tracing

Dify can be used as a cloud service (cloud.dify.ai) or self-hosted via Docker.

CITATIONS: nerdleveltech-ai-knowledge.md


Test 2: Technical question

User: How does hybrid search work in Dify?

Chatbot:

Hybrid search in Dify combines two complementary retrieval methods:

  1. Dense vector search — Uses cosine similarity between vector embeddings to find semantically similar content. This is excellent for conceptual questions and paraphrased queries.

  2. Sparse/keyword search (BM25) — Uses term frequency and inverse document frequency to find exact keyword matches. This is excellent for product codes, names, technical terms, and precise strings.

By combining both methods, hybrid search captures both semantic meaning and exact terms. Dify's Knowledge Base supports hybrid search mode natively, allowing you to set custom weights for each method (e.g., 0.7 for vector, 0.3 for keyword).

CITATIONS: nerdleveltech-ai-knowledge.md


Both answers show CITATIONS: nerdleveltech-ai-knowledge.md — Dify's confirmation that the response was grounded in the retrieved document, not the model's training data.

Dify chat preview panel showing a question and answer with the CITATIONS tag visible below the response

4c. What Happens Under the Hood

Each message triggers this execution sequence:

  1. User message → sys.query variable
  2. Knowledge Retrieval node embeds the query using text-embedding-3-small
  3. Hybrid search runs against your Knowledge Base, returning top-3 chunks
  4. LLM node receives: system prompt (with chunks injected via {{#context#}}) + user message
  5. gpt-5.4 generates a grounded answer
  6. ANSWER node streams the response back to the user

The entire round-trip for a typical question takes 2-5 seconds.
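The sequence above can be sketched as a small pipeline. The embed(), search(), and call_llm() functions below are stand-ins for the real services (the embedding model, the Knowledge Base, and gpt-5.4), not Dify internals:

```python
# Stubbed end-to-end sketch of the Chatflow execution sequence.

def embed(text: str) -> list[float]:
    # Stand-in for text-embedding-3-small
    return [float(len(text))]

def search(query_vec: list[float], top_k: int = 3) -> list[str]:
    # Stand-in for hybrid search against the Knowledge Base
    return ["chunk about Dify", "chunk about RAG", "chunk about hybrid search"][:top_k]

def call_llm(system_prompt: str, user_message: str) -> str:
    # Stand-in for the LLM node
    return f"Grounded answer based on {system_prompt.count('chunk')} chunks."

def answer(user_message: str) -> str:
    query_vec = embed(user_message)            # steps 1-2: message → query embedding
    chunks = search(query_vec)                 # step 3: retrieve top-3 chunks
    system = "Context:\n" + "\n".join(chunks)  # step 4: inject chunks into prompt
    return call_llm(system, user_message)      # steps 5-6: generate and return

print(answer("What is Dify?"))  # → Grounded answer based on 3 chunks.
```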


Step 5 — Call Your App via API {#step5-api}

Every Dify app automatically gets a REST API endpoint. This lets you integrate your RAG chatbot into any application — web, mobile, backend service, or automation workflow.

5a. Get Your API Key

  1. Go to Studio and click on your Chatflow app card
  2. Click API Access (or the </> icon)
  3. In the API Access panel, click API Key → Create new secret key
  4. Copy the key — it looks like app-xxxxxxxxxxxxxxxxxxxx

Security: This is your app's API key, not your OpenAI key. It grants access to call this specific chatbot. Treat it like a password — never expose it in client-side code.

Dify API Access page showing the authentication header format and Send Chat Message endpoint

5b. Send a Message via curl

The Dify Chat Messages API[5] accepts a JSON body with the user's message and a conversation ID.

curl -X POST https://api.dify.ai/v1/chat-messages \
  -H "Authorization: Bearer app-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {},
    "query": "What is Dify and what are its key features?",
    "response_mode": "blocking",
    "conversation_id": "",
    "user": "user-001"
  }'

Replace app-YOUR_API_KEY with your actual app API key.
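The same request can be built in Python with only the standard library. This is a sketch mirroring the curl example (endpoint and field names are taken from it), with the actual network call left commented out since it consumes credits:

```python
import json
import urllib.request

API_URL = "https://api.dify.ai/v1/chat-messages"

def build_request(query: str, api_key: str,
                  conversation_id: str = "",
                  user: str = "user-001") -> urllib.request.Request:
    """Build the same POST request the curl example sends."""
    body = {
        "inputs": {},
        "query": query,
        "response_mode": "blocking",
        "conversation_id": conversation_id,
        "user": user,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("What is Dify and what are its key features?",
                    api_key="app-YOUR_API_KEY")
# To actually send it (costs credits):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["answer"])
```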

Real response (verified April 10, 2026):

{
  "event": "message",
  "message_id": "c0a3f8b2-...",
  "conversation_id": "d1e2f3a4-...",
  "mode": "chat",
  "answer": "Dify is an open-source LLM application development platform...",
  "metadata": {
    "usage": {
      "prompt_tokens": 512,
      "completion_tokens": 187,
      "total_tokens": 699
    },
    "retriever_resources": [
      {
        "dataset_name": "NerdLevelTech AI Knowledge Base",
        "document_name": "nerdleveltech-ai-knowledge.md",
        "score": 0.91,
        "content": "Dify is an open-source LLM application development platform..."
      }
    ]
  },
  "created_at": 1744286400
}

The retriever_resources array in the response metadata shows exactly which chunks were retrieved and their relevance scores — giving you full RAG transparency in the API output.
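If you consume this response programmatically, pulling out the citations is a simple dictionary walk. A sketch assuming the blocking-mode response shape shown above:

```python
def extract_citations(response: dict) -> list[tuple[str, float]]:
    """Pull (document_name, score) pairs from a blocking-mode response."""
    resources = response.get("metadata", {}).get("retriever_resources", [])
    return [(r["document_name"], r["score"]) for r in resources]

# Trimmed-down version of the response shown above:
sample = {
    "answer": "Dify is an open-source LLM application development platform...",
    "metadata": {
        "retriever_resources": [
            {"document_name": "nerdleveltech-ai-knowledge.md", "score": 0.91},
        ]
    },
}
print(extract_citations(sample))  # → [('nerdleveltech-ai-knowledge.md', 0.91)]
```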

5c. Continue a Conversation

To send a follow-up message in the same conversation, reuse the conversation_id from the previous response:

curl -X POST https://api.dify.ai/v1/chat-messages \
  -H "Authorization: Bearer app-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {},
    "query": "Does it support self-hosting with Docker?",
    "response_mode": "blocking",
    "conversation_id": "d1e2f3a4-...",
    "user": "user-001"
  }'

Dify maintains the conversation history automatically — the LLM receives all previous turns as context without any additional work on your side.

5d. Streaming Mode

For production apps where you want to stream the response token-by-token (like ChatGPT's typing effect), switch response_mode to "streaming":

curl -X POST https://api.dify.ai/v1/chat-messages \
  -H "Authorization: Bearer app-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {},
    "query": "How does hybrid search work?",
    "response_mode": "streaming",
    "conversation_id": "",
    "user": "user-001"
  }'

The response is a Server-Sent Events (SSE) stream. Each line is a JSON event of type message (partial token) or message_end (final stats including retriever resources).
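Client-side, that stream can be reassembled by parsing each data: line as JSON and concatenating the partial answers. A minimal sketch, assuming the message / message_end event names described above:

```python
import json

def accumulate_stream(sse_lines) -> str:
    """Concatenate partial answers from a Dify SSE event stream."""
    answer = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank lines and keep-alives
        event = json.loads(line[len("data: "):])
        if event.get("event") == "message":
            answer.append(event.get("answer", ""))
        elif event.get("event") == "message_end":
            break  # final stats (usage, retriever resources) arrive here
    return "".join(answer)

stream = [
    'data: {"event": "message", "answer": "Hybrid search "}',
    'data: {"event": "message", "answer": "combines two methods."}',
    'data: {"event": "message_end", "metadata": {}}',
]
print(accumulate_stream(stream))  # → Hybrid search combines two methods.
```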


What's Next {#whats-next}

You now have a working RAG chatbot with hybrid search, live citations, and a REST API. Here are the logical next steps:

Improve Retrieval Quality

Upgrade                How
Add more documents     Knowledge → Add Files (up to 50 docs on free tier)
Add a reranker         Knowledge → Retrieval Settings → Enable Reranking (requires Cohere or local reranker plugin)
Tune chunk size        Knowledge → Settings → Chunk Method → Custom (adjust chunk_size and overlap)
Adjust hybrid weights  Shift weight toward BM25 (0.4+) if your content has many exact-match terms

Extend the Chatflow

Dify's Chatflow supports many more node types:

Node                 Use case
Question Classifier  Route questions to different knowledge bases based on topic
IF/ELSE              Branch logic — e.g., handle off-topic questions differently
HTTP Request         Call external APIs during the flow (e.g., fetch live data)
Code                 Run Python or JavaScript inline for custom transformations
Agent                Embed a tool-calling agent as a node in the flow

Deploy Options

  • Dify Cloud (what you used) — managed, 99.9% uptime, free tier available
  • Self-hosted Docker — full data control, unlimited usage, requires a server[2]
  • Embed on your site — Dify generates an iframe embed code from the app's API Access page

Go Deeper on RAG

To understand what's happening under the hood at the code level, see our companion guide:

  • Build a RAG System from Scratch — implements the same pipeline (chunking, embeddings, ChromaDB, hybrid search, RAGAS evaluation) in Python, step by step

Footnotes

[1] Dify Cloud pricing and free tier limits: dify.ai/pricing

[2] Dify GitHub repository — version, stars, and self-hosting instructions: github.com/langgenius/dify

[3] OpenAI embeddings pricing — text-embedding-3-small at $0.020 per 1M tokens: openai.com/api/pricing

[4] Dify Knowledge Base documentation — retrieval settings and hybrid search: docs.dify.ai/guides/knowledge-base/retrieval-setting

[5] Dify Chat Messages API reference: docs.dify.ai/guides/application-publishing/developing-with-apis


Frequently Asked Questions

Do I need to write code to follow this guide?

No. Dify's visual builder handles everything through point-and-click. The only code shown is the optional curl command in Step 5 for integrating the chatbot into your own application.
