Developing RAG Systems with DeepSeek R1 & Ollama (Complete Code Included)

March 7, 2025 · 4 minutes · Industry Information, Tutorial

Ever wished you could ask questions directly of a PDF or technical manual? This guide shows you how to build a Retrieval-Augmented Generation (RAG) system using DeepSeek R1, an open-source reasoning model, and Ollama, a lightweight framework for running AI models locally.

Pro tip: Streamline API Testing with Apidog


Looking to simplify your API workflows? Apidog acts as an all-in-one solution for creating, managing, and running tests and mock servers. With Apidog, you can:

  • Automate critical workflows without juggling multiple tools or writing extensive scripts.
  • Maintain smooth CI/CD pipelines.
  • Identify bottlenecks to ensure API reliability.

Save time and focus on perfecting your product.

Why DeepSeek R1?

DeepSeek R1, a model with reasoning performance comparable to OpenAI's o1 at roughly 95% lower cost, is a strong fit for RAG systems. Developers like it for:

  • Focused retrieval: in this setup, each answer draws on only the top 3 document chunks.
  • Strict prompting: the prompt forces an explicit "I don't know" when the context lacks an answer, which curbs hallucinations.
  • Local execution: eliminates cloud API latency and per-token fees.

What You’ll Need to Build a Local RAG System

1. Ollama

Ollama lets you run models like DeepSeek R1 locally.

  • Download: https://ollama.com
  • Setup: install Ollama, then run the following command in your terminal.
ollama run deepseek-r1  # For the 7B model (default)  
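To confirm the download worked, Ollama also exposes a small local REST API. This optional check (a minimal sketch, assuming Ollama's default port 11434) lists the models currently installed:

import requests

# Ollama's local REST API lists installed models (default port 11434)
resp = requests.get("http://localhost:11434/api/tags")
for model in resp.json().get("models", []):
    print(model["name"])  # e.g. "deepseek-r1:latest"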

2. DeepSeek R1 Model Variants

DeepSeek R1 ranges from 1.5B to 671B parameters. Start small with the 1.5B model for lightweight RAG applications.

ollama run deepseek-r1:1.5b

Pro Tip: Larger models (e.g., 70B) offer better reasoning but need more RAM.
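Whichever variant you choose, a quick sanity check through LangChain's Ollama wrapper (the same one used later in this tutorial) confirms the model responds before you build on top of it. A minimal sketch, assuming the 1.5B model has already been pulled:

from langchain_community.llms import Ollama

# Sanity check: make sure the local model answers at all
llm = Ollama(model="deepseek-r1:1.5b")
print(llm.invoke("Reply with the single word: ready"))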

Step-by-Step Guide to Building the RAG Pipeline

Step 1: Import Libraries

We’ll use:

  • LangChain for document processing and retrieval.
  • Streamlit for the user-friendly web interface.
import streamlit as st
from langchain_community.document_loaders import PDFPlumberLoader
from langchain_experimental.text_splitter import SemanticChunker
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.llms import Ollama
from langchain.prompts import PromptTemplate  # used in Steps 5-6
from langchain.chains import LLMChain, RetrievalQA, StuffDocumentsChain  # used in Step 6

Step 2: Upload & Process PDFs

Leverage Streamlit’s file uploader to select a local PDF. Use PDFPlumberLoader to extract text efficiently without manual parsing.

# Streamlit file uploader
uploaded_file = st.file_uploader("Upload a PDF file", type="pdf")

if uploaded_file:
    # Save PDF temporarily
    with open("temp.pdf", "wb") as f:
        f.write(uploaded_file.getvalue())

    # Load PDF text
    loader = PDFPlumberLoader("temp.pdf")
    docs = loader.load()

Step 3: Chunk Documents Strategically

Split the extracted text into semantic chunks. SemanticChunker groups sentences by embedding similarity, so each chunk stays topically coherent instead of breaking mid-idea.

# Split text into semantic chunks  
text_splitter = SemanticChunker(HuggingFaceEmbeddings())
documents = text_splitter.split_documents(docs)
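Note that SemanticChunker embeds every sentence, which can be slow on long PDFs. If chunking becomes a bottleneck, a fixed-size splitter is a common fallback; the sketch below uses LangChain's RecursiveCharacterTextSplitter with illustrative sizes you should tune for your documents.

from langchain.text_splitter import RecursiveCharacterTextSplitter

# Fallback: fixed-size chunks with overlap (sizes are illustrative)
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,    # characters per chunk
    chunk_overlap=200,  # overlap preserves context across boundaries
)
documents = text_splitter.split_documents(docs)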

Step 4: Create a Searchable Knowledge Base

Generate vector embeddings for the chunks and store them in a FAISS index.

  • Embeddings allow fast, contextually relevant searches.
# Generate embeddings  
embeddings = HuggingFaceEmbeddings()
vector_store = FAISS.from_documents(documents, embeddings)

# Connect retriever
retriever = vector_store.as_retriever(search_kwargs={"k": 3}) # Fetch top 3 chunks
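Before wiring the retriever into a chain, a quick smoke test helps confirm it returns sensible chunks. The query below is a placeholder; swap in something your PDF actually covers:

# Smoke test: inspect the top 3 chunks for a sample query
sample_docs = retriever.get_relevant_documents("What is this document about?")
for i, doc in enumerate(sample_docs, 1):
    print(f"Chunk {i}: {doc.page_content[:200]}")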

Step 5: Configure DeepSeek R1

Set up a RetrievalQA chain using the DeepSeek R1 1.5B model.

  • This ensures answers are grounded in the PDF’s content rather than relying on the model’s training data.
llm = Ollama(model="deepseek-r1:1.5b")  # Our 1.5B parameter model  

# Craft the prompt template
prompt = """
1. Use ONLY the context below.
2. If unsure, say "I don’t know".
3. Keep answers under 4 sentences.

Context: {context}

Question: {question}

Answer:
"""
QA_CHAIN_PROMPT = PromptTemplate.from_template(prompt)
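One quirk worth knowing: DeepSeek R1 is a reasoning model, and when run through Ollama it usually emits its chain of thought inside <think>...</think> tags before the final answer. If you only want the answer shown in the UI, a small post-processing helper (hypothetical, not part of the original tutorial) can strip that block:

import re

def strip_reasoning(text: str) -> str:
    """Remove DeepSeek R1's <think>...</think> reasoning block, if present."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()

# Usage: st.write(strip_reasoning(qa(user_input)["result"]))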

Step 6: Assemble the RAG Chain

Integrate uploading, chunking, and retrieval into a cohesive pipeline.

  • This approach gives the model verified context, enhancing accuracy.
# Chain 1: Generate answers
llm_chain = LLMChain(llm=llm, prompt=QA_CHAIN_PROMPT)

# Chain 2: Combine document chunks
document_prompt = PromptTemplate(
    template="Context:\ncontent:{page_content}\nsource:{source}",
    input_variables=["page_content", "source"]
)

# Final RAG pipeline
qa = RetrievalQA(
    combine_documents_chain=StuffDocumentsChain(
        llm_chain=llm_chain,
        document_prompt=document_prompt,
        document_variable_name="context"  # slot the chunks into {context}
    ),
    retriever=retriever
)
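With the chain assembled, you can query it directly before adding a UI. The question below is a placeholder:

# Ad-hoc test of the full pipeline (placeholder question)
result = qa("What is the main topic of this document?")["result"]
print(result)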

Step 7: Launch the Web Interface

Streamlit enables users to type questions and receive instant answers.

  • Queries retrieve matching chunks, feed them to the model, and display results in real-time.
# Streamlit UI
user_input = st.text_input("Ask your PDF a question:")

if user_input:
    with st.spinner("Thinking..."):
        response = qa(user_input)["result"]
        st.write(response)
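To try it out, save the complete script as app.py (the filename is just an example) and launch it from your terminal:

streamlit run app.py

Streamlit serves the app locally (by default at http://localhost:8501), where you can upload a PDF and start asking questions.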

You can find the complete code here: https://gist.github.com/lisakim0/0204d7504d17cefceaf2d37261c1b7d5

The Future of RAG with DeepSeek

DeepSeek R1 is just the beginning. With upcoming features like self-verification and multi-hop reasoning, future RAG systems could debate and refine their logic autonomously.

Build your own RAG system today and unlock the full potential of document-based AI!

Reposted from: Developing RAG Systems with DeepSeek R1 & Ollama (Complete Code Included) | by Sebastian Petrus | Jan, 2025 | Medium.

Tags: DeepSeek R1, Ollama
