
Deploying DeepSeek Locally with Ollama, Building Personal and Organizational Knowledge Bases, and Integrating with Spring AI

This guide explains how to locally deploy the DeepSeek large‑language model using Ollama on Windows, macOS, and Linux, configure model storage and CORS, build personal and enterprise RAG knowledge bases with AnythingLLM and Open WebUI, and integrate the model into a Spring AI application via Docker and Docker‑Compose.


This article introduces the end‑to‑end process of deploying the DeepSeek model locally, constructing knowledge bases, and integrating the model with Spring AI.

1. DeepSeek Local Deployment

1.1 Windows and macOS Deployment

We use Ollama to simplify installation and internal-network migration. Download the installer for Windows or macOS from the official site, then configure environment variables: OLLAMA_MODELS changes the model storage path, and OLLAMA_ORIGINS set to * relaxes CORS so that browser clients such as Open WebUI can reach the API.
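On Linux or macOS these variables can be set in the shell before starting the server; the storage path below is a placeholder (on Windows, use setx or the system environment-variable dialog instead):

```shell
# Store downloaded models outside the default ~/.ollama/models
# (example path; pick any directory with enough disk space)
export OLLAMA_MODELS="$HOME/ollama-models"
mkdir -p "$OLLAMA_MODELS"

# Allow cross-origin requests from any host so Open WebUI can call the API
export OLLAMA_ORIGINS="*"

echo "models dir: $OLLAMA_MODELS, allowed origins: $OLLAMA_ORIGINS"
```

Restart the Ollama service after changing these variables so they take effect.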

Hardware requirements scale with model size: the original article tabulates GPU memory and recommended cards for the 1.5B, 8B, 14B, 32B, 70B, and 671B variants, each installed with the corresponding ollama run deepseek-r1:<tag> command.

Example command to install a model:

ollama run deepseek-r1:32b

For Windows internal-network migration, package the model files, copy them to the internal server, and reinstall Ollama there. Token throughput can be verified with ollama run deepseek-r1:32b --verbose, and GPU usage checked via nvidia-smi or Task Manager.
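A minimal sketch of the packaging step, assuming the default model directory (~/.ollama/models; on Windows it is %USERPROFILE%\.ollama\models) and placeholder host names:

```shell
# Package the local model store for transfer to the offline server.
# MODELS_DIR is Ollama's default location; adjust if OLLAMA_MODELS was changed.
MODELS_DIR="${OLLAMA_MODELS:-$HOME/.ollama/models}"
mkdir -p "$MODELS_DIR"   # no-op on a machine that already has models

tar -czf /tmp/ollama-models.tgz \
    -C "$(dirname "$MODELS_DIR")" "$(basename "$MODELS_DIR")"

# Copy the archive to the internal server (host and path are placeholders),
# then unpack it into the same location before starting Ollama there:
#   scp /tmp/ollama-models.tgz user@internal-server:/tmp/
#   tar -xzf /tmp/ollama-models.tgz -C ~/.ollama/
ls -lh /tmp/ollama-models.tgz
```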

1.2 Linux Deployment

Containerized deployment uses Docker images from Huawei Cloud. Pull and run commands:

sudo docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/ollama/ollama
sudo docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/ollama/ollama

After the container starts, find its ID, open a shell inside it, and pull the model:

docker ps | grep ollama
docker exec -it ${containerId} /bin/bash
ollama run deepseek-r1:${tag}

2. Knowledge Base Construction

For confidential internal documents, a local RAG knowledge base can be built with LlamaIndex or LangChain. If model training is required, full-parameter fine-tuning or LoRA can be applied.

2.1 Personal Knowledge Base

AnythingLLM is used as the desktop client. After installing it, set the LLM provider to Ollama and point the base URL at the local endpoint (http://localhost:11434 by default). Upload documents, move them into the workspace, and click "Save and Embed" to generate embeddings.

2.2 Organizational Knowledge Base

Open WebUI provides a clean UI for team knowledge bases. Deploy it with Docker:

docker run -d -p 3030:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.nju.edu.cn/open-webui/open-webui:main

After registering an account, set the model's visibility to public so team members can use it, then run test queries.

3. Code Integration with Spring AI

Add the Spring AI dependencies (including spring-ai-ollama-spring-boot-starter) to the Maven pom.xml. Implement a Completion component that maintains a bounded message history, sends prompts to the Ollama client, and supports both synchronous and streaming chat interfaces.
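A sketch of the dependency declaration (the version is omitted here on the assumption that it is managed through the Spring AI BOM):

```xml
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    <!-- version typically inherited from the spring-ai-bom -->
</dependency>
```

The starter reads the Ollama endpoint from the spring.ai.ollama.base-url property, which defaults to http://localhost:11434.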

Expose REST endpoints. The method bodies below are a minimal sketch against the Spring AI Ollama API (accessor names such as getContent() vary slightly across Spring AI versions):

import java.io.IOException;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.servlet.mvc.method.annotation.SseEmitter;

@RestController
@RequestMapping("/api")
public class OllamaTestController {

    @Autowired
    private OllamaChatModel ollamaChatClient;

    // Synchronous chat: block until the model returns the full answer
    @RequestMapping("/chat")
    public String chat(@RequestParam(defaultValue = "Hello") String message) {
        return ollamaChatClient.call(message);
    }

    // Streaming chat: forward model tokens to the browser as Server-Sent Events
    @RequestMapping("/stream")
    public SseEmitter stream(@RequestParam(defaultValue = "Hello") String message) {
        SseEmitter emitter = new SseEmitter(0L); // no timeout
        ollamaChatClient.stream(new Prompt(message)).subscribe(
                chunk -> {
                    try {
                        emitter.send(chunk.getResult().getOutput().getContent());
                    } catch (IOException e) {
                        emitter.completeWithError(e);
                    }
                },
                emitter::completeWithError,
                emitter::complete);
        return emitter;
    }
}

4. Appendix

4.1 Containerization Basics

Instructions for installing Docker on CentOS, basic Docker commands (pull, tag, run, rmi), and writing a simple Dockerfile (example with Nginx).

4.2 Single‑Node Service Orchestration

Install Docker‑Compose and use common commands ( docker-compose up -d , logs , down , etc.) to manage multi‑container setups, such as the combined Ollama + Open WebUI stack.
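As an illustration, a single-node compose file for the combined stack might look like this (service names and volume names are assumptions; images, ports, and mounts match the commands used earlier in the article):

```yaml
version: "3.8"
services:
  ollama:
    image: swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/ollama/ollama
    volumes:
      - ollama:/root/.ollama
    ports:
      - "11434:11434"
    restart: always
  open-webui:
    image: ghcr.nju.edu.cn/open-webui/open-webui:main
    depends_on:
      - ollama
    extra_hosts:
      - "host.docker.internal:host-gateway"
    volumes:
      - open-webui:/app/backend/data
    ports:
      - "3030:8080"
    restart: always
volumes:
  ollama:
  open-webui:
```

Start the stack with docker-compose up -d and tear it down with docker-compose down.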


Written by Rare Earth Juejin Tech Community (Juejin, a tech community that helps developers grow).
