Building a Simple Local AI Question‑Answer System with Java, LangChain4J, Ollama, and ChromaDB
This article guides readers through the concepts of large language models, embeddings, vector databases, and Retrieval‑Augmented Generation, then demonstrates step‑by‑step how to set up Ollama, install a local Chroma vector store, configure Maven dependencies, and write Java code using LangChain4J to build and test a functional AI Q&A application.
Introduction
The author, interested in AI large models, shares a concise guide for building a local AI question‑answer system using Java, avoiding the complexities of ChatGPT and OpenAI APIs by opting for open‑source models like LLaMA and Qwen.
(1) Large Language Model (LLM)
LLMs are deep‑learning models with billions of parameters, typically based on the Transformer architecture, trained on massive text corpora to perform tasks such as text generation, translation, and question answering.
(2) Embedding
Embeddings map words, sentences, or documents into high‑dimensional vectors that capture semantic similarity. Common methods include Word2Vec, GloVe, FastText, BERT, ELMo, and Sentence‑Transformers.
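Semantic similarity between embeddings is typically measured with cosine similarity. A minimal sketch, with invented three-dimensional vectors standing in for real model output (actual embeddings have hundreds of dimensions):

```java
// Cosine similarity: 1.0 means same direction (semantically close),
// 0.0 means orthogonal (unrelated). The vectors here are toy values
// invented for illustration, not real model embeddings.
public class CosineSimilarity {

    public static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    public static void main(String[] args) {
        double[] cat    = {0.9, 0.1, 0.3};
        double[] kitten = {0.8, 0.2, 0.3};
        double[] car    = {0.1, 0.9, 0.1};
        // "cat" should score closer to "kitten" than to "car"
        System.out.printf("cat~kitten = %.3f%n", cosine(cat, kitten));
        System.out.printf("cat~car    = %.3f%n", cosine(cat, car));
    }
}
```

This single function is the similarity measure used, in essence, by every embedding-based retrieval step later in the article.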
(3) Vector Database
Vector databases store and index high‑dimensional vectors for efficient similarity search, supporting ANN queries, hybrid filtering, scalability, and real‑time updates. Examples: FAISS, Pinecone, Weaviate, Qdrant, Milvus.
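Conceptually, the core operation of a vector database is "find the stored vector most similar to a query vector." A toy in-memory sketch using an exact linear scan (ids and vectors are made up; real systems replace the scan with ANN indexes such as HNSW or IVF to stay fast at millions of vectors):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy in-memory "vector store": exact nearest-neighbor by linear scan.
// Production vector databases (FAISS, Milvus, Chroma, ...) use ANN
// indexes for sub-linear search; this sketch only shows the semantics.
public class NaiveVectorStore {

    private final Map<String, double[]> vectors = new LinkedHashMap<>();

    public void add(String id, double[] vector) {
        vectors.put(id, vector);
    }

    // Returns the id of the stored vector most similar to the query.
    public String nearest(double[] query) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, double[]> e : vectors.entrySet()) {
            double score = cosine(query, e.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = e.getKey();
            }
        }
        return best;
    }

    private static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    public static void main(String[] args) {
        NaiveVectorStore store = new NaiveVectorStore();
        store.add("doc-animals", new double[]{0.9, 0.1});
        store.add("doc-cars",    new double[]{0.1, 0.9});
        System.out.println(store.nearest(new double[]{0.8, 0.2})); // prints doc-animals
    }
}
```

ChromaDB, used later in this article, plays exactly this role, plus persistence, collections, and metadata filtering.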
(4) Retrieval‑Augmented Generation (RAG)
RAG combines retrieval of relevant documents from an external knowledge base with generative LLM output, reducing hallucinations, improving factuality, and enabling domain‑specific answers.
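The RAG flow itself is small: embed the question, retrieve the most similar knowledge snippet, and splice it into the prompt. A stubbed sketch with hand-made two-dimensional "embeddings" (the generation step is omitted because it needs a live model; the full example later uses Ollama for it):

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Sketch of the RAG pipeline: retrieve, then ground the prompt.
// Embeddings are invented 2-d vectors; a real system computes them
// with an embedding model and stores them in a vector database.
public class RagSketch {

    public static class Doc {
        public final String text;
        public final double[] embedding;
        public Doc(String text, double[] embedding) {
            this.text = text;
            this.embedding = embedding;
        }
    }

    // Retrieval: pick the snippet whose embedding is closest to the query's.
    public static String retrieve(double[] queryEmbedding, List<Doc> docs) {
        return docs.stream()
                .max(Comparator.comparingDouble(d -> cosine(queryEmbedding, d.embedding)))
                .map(d -> d.text)
                .orElse("");
    }

    // Augmentation: ground the model with the retrieved context.
    public static String buildPrompt(String context, String question) {
        return "Answer based on the following information:\n" + context
                + "\nQuestion:\n" + question;
    }

    private static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    public static void main(String[] args) {
        List<Doc> knowledgeBase = Arrays.asList(
                new Doc("Ollama serves local models on port 11434.", new double[]{0.9, 0.1}),
                new Doc("ChromaDB stores embedding vectors.",        new double[]{0.1, 0.9}));
        double[] questionEmbedding = {0.8, 0.2}; // pretend embedding of the question
        String context = retrieve(questionEmbedding, knowledgeBase);
        System.out.println(buildPrompt(context, "Which port does Ollama listen on?"));
        // A chat model would now generate an answer grounded in this context.
    }
}
```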
AI Application Development Framework
(1) LangChain
LangChain is a framework that simplifies LLM application development by providing chains, agents, memory, loaders, prompt engineering, and integrations with external data sources.
(2) LangChain4J
LangChain4J brings LangChain’s capabilities to the Java ecosystem, offering modular components, multi‑model support, memory, tool integration, and chain execution.
Local Environment Preparation
(1) Start a Local Model with Ollama
Download Ollama, install it, and pull open‑source models (e.g., llama3, qwen) via ollama pull modelName. Verify with ollama list and run a model using ollama run modelName.
(2) Launch a Local Vector Database (ChromaDB)
Install with pip install chromadb and start the service using chroma run.
Implementing the Local AI Q&A in Java
(1) Maven Dependencies
<properties>
    <maven.compiler.source>8</maven.compiler.source>
    <maven.compiler.target>8</maven.compiler.target>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <langchain4j.version>0.31.0</langchain4j.version>
</properties>
<dependencies>
    <!-- LangChain4J core -->
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-core</artifactId>
        <version>${langchain4j.version}</version>
    </dependency>
    <!-- Other LangChain4J modules (ollama, chroma, embeddings) -->
    ...
</dependencies>
(2) Core Java Code
Key steps are illustrated in the code below.
// Imports below assume LangChain4J 0.31.0; Client and ApiException come
// from the ChromaDB Java client used alongside it, and log is the class's logger.
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.loader.FileSystemDocumentLoader;
import dev.langchain4j.data.document.splitter.DocumentByLineSplitter;
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.input.Prompt;
import dev.langchain4j.model.input.PromptTemplate;
import dev.langchain4j.model.ollama.OllamaChatModel;
import dev.langchain4j.model.ollama.OllamaEmbeddingModel;
import dev.langchain4j.model.openai.OpenAiTokenizer;
import dev.langchain4j.model.output.Response;
import dev.langchain4j.store.embedding.EmbeddingSearchRequest;
import dev.langchain4j.store.embedding.EmbeddingSearchResult;
import dev.langchain4j.store.embedding.EmbeddingStore;
import dev.langchain4j.store.embedding.chroma.ChromaEmbeddingStore;

public static void main(String[] args) throws ApiException {
    // Load a local text file as knowledge base
    Document document = getDocument("笑话.txt");
    // Split the document into segments (max 200 tokens, no overlap)
    DocumentByLineSplitter lineSplitter =
            new DocumentByLineSplitter(200, 0, new OpenAiTokenizer());
    List<TextSegment> segments = lineSplitter.split(document);
    // Embed segments using the local Ollama model
    OllamaEmbeddingModel embeddingModel = OllamaEmbeddingModel.builder()
            .baseUrl("http://localhost:11434")
            .modelName("llama3")
            .build();
    // Store embeddings in ChromaDB
    Client client = new Client(CHROMA_URL);
    EmbeddingStore<TextSegment> embeddingStore = ChromaEmbeddingStore.builder()
            .baseUrl(CHROMA_URL)
            .collectionName(CHROMA_DB_DEFAULT_COLLECTION_NAME)
            .build();
    segments.forEach(segment -> {
        Embedding e = embeddingModel.embed(segment).content();
        embeddingStore.add(e, segment);
    });
    // Retrieve the most relevant segment for a query
    String qryText = "北极熊";
    Embedding queryEmbedding = embeddingModel.embed(qryText).content();
    EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
            .queryEmbedding(queryEmbedding)
            .maxResults(1)
            .build();
    EmbeddingSearchResult<TextSegment> result = embeddingStore.search(request);
    TextSegment textSegment = result.matches().get(0).embedded();
    // Build the prompt and query the LLM
    PromptTemplate promptTemplate = PromptTemplate.from(
            "基于如下信息用中文回答:\n{{context}}\n提问:\n{{question}}");
    Map<String, Object> vars = new HashMap<>();
    vars.put("context", textSegment.text());
    vars.put("question", "北极熊干了什么");
    Prompt prompt = promptTemplate.apply(vars);
    OllamaChatModel chatModel = OllamaChatModel.builder()
            .baseUrl("http://localhost:11434")
            .modelName("llama3")
            .build();
    Response<AiMessage> resp = chatModel.generate(prompt.toUserMessage());
    System.out.println("Answer: " + resp.content().text());
}

private static Document getDocument(String fileName) {
    URL docUrl = LangChainMainTest.class.getClassLoader().getResource(fileName);
    if (docUrl == null) {
        log.error("File not found");
        return null;
    }
    try {
        Path path = Paths.get(docUrl.toURI());
        return FileSystemDocumentLoader.loadDocument(path);
    } catch (URISyntaxException e) {
        log.error("Error loading file", e);
        return null;
    }
}
(3) Testing
The sample text "有一只北极熊和一只企鹅…" ("there was a polar bear and a penguin…") is loaded, split, embedded, stored, and queried. When asked "北极熊干了什么" ("what did the polar bear do"), the system correctly returns "北极熊把自己的身上的毛一根一根地拔了下来" ("the polar bear plucked out its own fur one hair at a time").
Conclusion
The guide demonstrates a minimal end‑to‑end AI Q&A pipeline using Java, LangChain4J, Ollama, and ChromaDB, and suggests extending it with Spring Boot, advanced prompting, tool calling, and memory features.
References
LangChain official site
LangChain4J GitHub repository
Ollama documentation
ChromaDB project
JD Tech
Official JD technology sharing platform. All the cutting‑edge JD tech, innovative insights, and open‑source solutions you’re looking for, all in one place.