Artificial Intelligence · 9 min read

Resolving 02_DocQA.py Errors and Using LangChain to Call Large Models Locally

This guide explains how to fix the ArkNotFoundError in the 02_DocQA.py script by configuring a Doubao‑embedding endpoint, shows how to set up a Conda environment with the latest LangChain packages, and walks through step‑by‑step code examples for invoking both the Zhipu glm‑4 and Volcano Engine large language models via LangChain.

Rare Earth Juejin Tech Community

The article first addresses a common error when running 02_DocQA.py: the script raises an ArkNotFoundError because the embedding model endpoint is missing. The solution is to create a new inference endpoint on Volcano Engine, select the Doubao‑embedding model, and add the endpoint to the .cloudiderc configuration file:

export EMBEDDING_MODELEND="your_modelend_point"
source ~/.cloudiderc

After sourcing the configuration, the script runs without errors and the web UI can be accessed for Q&A.
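This failure mode can be caught before the script gets anywhere near the model. A minimal sketch of a fail-fast check (the helper name is illustrative, not part of the original script):

```python
import os

def require_env(name: str) -> str:
    """Return a required environment variable, failing fast with a clear hint."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; did you add it to ~/.cloudiderc and run "
            "`source ~/.cloudiderc` in this shell?"
        )
    return value
```

Calling `require_env("EMBEDDING_MODELEND")` at the top of the script turns a confusing ArkNotFoundError into an immediate, self-explanatory message.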

The second part details how to set up a fresh Conda environment for LangChain (version 0.3) using Python 3.9, install the latest LangChain packages, and add required dependencies:

conda create --name bytedance python=3.9
conda activate bytedance
conda install conda-forge::langchain
conda install conda-forge::langchain-openai
conda install conda-forge::langchain-community
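A quick way to confirm the install succeeded is to query the package metadata from Python, which prints each version or flags a missing package instead of crashing later:

```python
from importlib.metadata import version, PackageNotFoundError

# Print the installed version of each package from the conda steps above,
# or flag it as missing -- a sanity check before running the examples.
for pkg in ("langchain", "langchain-openai", "langchain-community"):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "is NOT installed")
```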

With the environment ready, the guide provides two complete code examples for calling large models.

1. Calling Zhipu glm‑4 (free)

import os
from langchain_openai import ChatOpenAI
from langchain.prompts import (
    ChatPromptTemplate,
    MessagesPlaceholder,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
)
from langchain.chains import LLMChain
from langchain.memory import ConversationBufferMemory

# Read the API key from the environment rather than hard-coding it.
api_key = os.environ.get("ZHIPUAI_API_KEY")

# Zhipu's API is OpenAI-compatible, so ChatOpenAI only needs a custom base URL.
llm = ChatOpenAI(
    model="glm-4",
    max_tokens=100,
    openai_api_key=api_key,
    openai_api_base="https://open.bigmodel.cn/api/paas/v4/",
)

# System role, a placeholder for prior turns, then the current question.
prompt = ChatPromptTemplate(messages=[
    SystemMessagePromptTemplate.from_template("你是一个智能机器人,可以回答人类许多问题。"),
    MessagesPlaceholder(variable_name="chat_history"),
    HumanMessagePromptTemplate.from_template("{question}"),
])

# The memory_key must match the placeholder's variable_name above.
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
chat = LLMChain(llm=llm, prompt=prompt, verbose=True, memory=memory)
response = chat.invoke({"question": "请给我的花店起个名字。"})
print(response["text"])
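ConversationBufferMemory keeps no hidden state: it simply accumulates each turn and replays the whole list into the chat_history placeholder on every call. A minimal pure-Python sketch of that behavior (the class name here is illustrative, not a LangChain API):

```python
class BufferMemory:
    """Illustrative stand-in for ConversationBufferMemory: it stores every
    (role, content) turn and replays them all on each load."""

    def __init__(self):
        self.messages = []

    def save_turn(self, question, answer):
        # After each exchange, both sides of the turn are appended.
        self.messages.append(("human", question))
        self.messages.append(("ai", answer))

    def load(self):
        # The full history is injected into the prompt's chat_history slot,
        # which is why long conversations grow the token count linearly.
        return list(self.messages)
```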

A simplified version removes the prompt template and memory and calls llm.predict directly (predict still works in LangChain 0.3 but is deprecated in favor of invoke):

import os
from langchain_openai import ChatOpenAI

api_key = os.environ.get("ZHIPUAI_API_KEY")
llm = ChatOpenAI(
    openai_api_key=api_key,
    openai_api_base="https://open.bigmodel.cn/api/paas/v4/",
    model="glm-4",
    temperature=0.8,
    max_tokens=60,
)
response = llm.predict("请给我的花店起个名字")
print(response)
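Both Zhipu examples work because ChatOpenAI only needs an OpenAI-compatible endpoint: with a custom openai_api_base it posts a standard chat-completions request to Zhipu's server. A sketch of the equivalent raw request body, built but not sent, where the URL path follows the OpenAI convention:

```python
import json
import os

# The JSON body an OpenAI-compatible client would POST to
# {openai_api_base}/chat/completions, mirroring the parameters above.
payload = {
    "model": "glm-4",
    "temperature": 0.8,
    "max_tokens": 60,
    "messages": [{"role": "user", "content": "请给我的花店起个名字"}],
}
url = "https://open.bigmodel.cn/api/paas/v4/chat/completions"
headers = {"Authorization": f"Bearer {os.environ.get('ZHIPUAI_API_KEY', '')}"}
body = json.dumps(payload, ensure_ascii=False)
```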

2. Calling a Volcano Engine model

Two variants are shown. Version 1 uses the same ChatOpenAI class with the Volcano endpoint:

import os
from langchain_openai import ChatOpenAI

api_key = os.environ.get("ARK_API_KEY")
model = os.environ.get("LLM_MODELEND")
llm = ChatOpenAI(
    openai_api_key=api_key,
    openai_api_base="https://ark.cn-beijing.volces.com/api/v3",
    model=model,
    temperature=0.8,
    max_tokens=60,
)
response = llm.predict("请给我的花店起个名字")
print(response)

Version 2 builds explicit message objects before invoking the model:

import os
from langchain_openai import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage

api_key = os.environ.get("ARK_API_KEY")
model = os.environ.get("LLM_MODELEND")
llm = ChatOpenAI(
    openai_api_key=api_key,
    openai_api_base="https://ark.cn-beijing.volces.com/api/v3",
    model=model,
    temperature=0.8,
    max_tokens=60,
)
messages = [
    SystemMessage(content="你是一个很棒的智能助手"),
    HumanMessage(content="请给我的花店起个名"),
]
response = llm.invoke(messages)
print(response.content)
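Both variants read the same two environment variables, so it is easy to forget one and only find out deep inside a traceback. A small helper that collects the Volcano settings up front and fails early (the function name is illustrative):

```python
import os

ARK_BASE_URL = "https://ark.cn-beijing.volces.com/api/v3"

def ark_config(env=os.environ):
    """Gather the ChatOpenAI keyword arguments used by both examples,
    raising immediately if a required variable is missing."""
    missing = [k for k in ("ARK_API_KEY", "LLM_MODELEND") if not env.get(k)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "openai_api_key": env["ARK_API_KEY"],
        "openai_api_base": ARK_BASE_URL,
        "model": env["LLM_MODELEND"],
    }

# Usage: llm = ChatOpenAI(temperature=0.8, max_tokens=60, **ark_config())
```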

The article concludes with practical tips: always verify environment variables, check the last line of a traceback for the root cause, and consider recreating the Conda environment if dependency issues persist. It also reminds readers that while AI can assist, not every problem is automatically solved.
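The traceback tip can be made concrete: Python prints the chain of frames first and the actual exception last, so the final line usually names the root cause. A small demonstration:

```python
import traceback

# Simulate the kind of failure 02_DocQA.py hits when a variable is missing,
# then pull the last line of the traceback text: it names the real error.
try:
    raise KeyError("EMBEDDING_MODELEND")
except Exception:
    last_line = traceback.format_exc().strip().splitlines()[-1]

print(last_line)  # KeyError: 'EMBEDDING_MODELEND'
```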

Tags: Python, LangChain, embedding, large language model, environment setup
Written by Rare Earth Juejin Tech Community
Juejin, a tech community that helps developers grow.