Build a RAG knowledge-base bot with Dify

Deploy Dify on Virtual Kubernetes Service (VKS), wire it up to LLMs and a knowledge base, and ship an agent that answers questions from your own business data

Dify is an open-source LLMOps platform that lets you stand up an AI app without a deep AI background. It ships with knowledge base management, prompt orchestration, model-provider switching, and more. This guide deploys Dify on Virtual Kubernetes Service (VKS) and builds a customer-support agent on top of your own business data.

Prerequisites

VKS is provisioned
VKS is connected
Business data is ready (PDF, Word, or plain text all work)

One-click Dify deployment

Download the deployment manifest that matches your VKS region (Beijing Zone 1, Zone 2, and Zone 3 each have their own YAML). The example below uses Beijing Zone 1:

kubectl apply -f dify.yaml

Once it is up, query the access URL:

kubectl describe serviceexporter dify-web-se -n dify

The web URL appears after the url field in the output. VKS uses the fixed external port 22443, so the address looks like:

https://<domain>:22443

The initial password on first login is password.

Dify application

Apply for an LLM API key

Every model invocation consumes tokens, so you first need to obtain an API key from the model provider's website.

Go to Settings → Model Provider, pick the provider, click Add Model, fill in the model type, name, and API key, and save.

Add a model

To plug in a model you have deployed yourself with Xinference (for example, QwQ), pick Xorbits Inference as shown below:

Xinference provider

Fill in the model name, server URL, and model UID, then save:

Xinference setup

Create a knowledge base

Open Knowledge and load your business data into it.

Pick a data source: existing files, Notion, web import, or just create an empty knowledge base.
Upload the business documents. Dify handles chunking and cleaning. Two indexing modes are available: High Quality (consumes tokens, requires an Embedding model API key, but produces higher accuracy) and Economy. High Quality is recommended.
Save and process. Once text embedding finishes, the knowledge base is ready.

Create an app

Go to Studio → Create Blank App, pick the app type and orchestration method, and fill in a name and description:

Create an app

Inside the app, do the following:

Prompt orchestration: define the role, tone, and answer scope (for example, "only answer questions related to this company's products").
Add an opening message: greet the user when they open the chat box.
Bind the knowledge base: hook in your business data.
Pick a model: choose the model and tune temperature and other parameters.
Chat: debug from the right-hand panel.

App chat

Bind a knowledge base to the app

Back in the app, in the Context section click Add, pick the knowledge base you just created, and confirm:

Bind knowledge base

Debug and publish

Use the chat panel to debug and verify answers. Once you are satisfied, click Publish to obtain the public Web/API entry points for downstream integration.

Summary

Dify's value is bundling the LLM, the knowledge base, prompt templates, and the calling interface into a single UI, which turns building a customer-support agent from "writing a pile of RAG code" into a few clicks. With VKS, both the model inference and the application itself can run in the same cluster, so data never leaves the boundary.

Build a RAG knowledge-base bot with Dify

On this page