
Understanding LangChain Callback Mechanism, Custom Async Handlers, and Token Cost Management in Python

This article introduces LangChain's callback mechanism, demonstrates how to implement custom synchronous and asynchronous callbacks in Python, compares them with JavaScript async patterns, and shows how to monitor token usage and control costs using OpenAI callbacks.

Rare Earth Juejin Tech Community

In the preface, the author notes that 2024 will be a harvest year for artificial intelligence and decides to review LangChain, a popular framework for building LLM applications.

AI practice: RAG for self‑service model rooms – introduces document loading and embedding.

LangChain practice: Text‑to‑SQL – showcases a new paradigm for large‑model databases.

LangChain practice: SequentialChain – explains LangChain's chain‑based workflow.

The article then focuses on learning the callback mechanism, referencing a previous discussion of AutoGen callbacks and preparing a comparison.

Callbacks and Asynchrony

For developers familiar with JavaScript, callbacks and asynchronous programming are everyday tools in event listeners, Ajax requests, and timers. The same concepts can be demonstrated in Python with asyncio:

# Python asyncio module for asynchronous programming
import asyncio

def print_result(area):
    print(f"Rectangle area: {area}")

async def rectangleArea(w, h, callback):
    print("Starting rectangle area calculation...")
    await asyncio.sleep(0.5)
    callback(w * h)
    print("Rectangle calculation finished")

async def circleArea():
    print("Starting circle calculation")
    await asyncio.sleep(1)
    print("Circle calculation finished")

async def main():
    print("Main coroutine starts...")
    task1 = asyncio.create_task(rectangleArea(3, 4, print_result))
    task2 = asyncio.create_task(circleArea())
    await task1
    await task2
    print("Main coroutine ends...")

asyncio.run(main())

When execution reaches await asyncio.sleep(), the current task pauses and the event loop runs other tasks, mirroring the asynchronous behavior of JavaScript's async/await.
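The cooperative scheduling above can also be written more compactly with asyncio.gather, which runs several coroutines concurrently and returns their results in argument order regardless of which finishes first:

```python
import asyncio

async def fetch(name: str, delay: float) -> str:
    # Simulates I/O; control returns to the event loop during the sleep
    await asyncio.sleep(delay)
    return name

async def main() -> list:
    # Both coroutines run concurrently; results keep the argument order
    return await asyncio.gather(fetch("rectangle", 0.2), fetch("circle", 0.1))

results = asyncio.run(main())
print(results)  # ['rectangle', 'circle'] even though "circle" finishes first
```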

LangChain Callback Mechanism

LangChain relies heavily on CallbackHandler objects for logging, monitoring, and data flow control. A simple example writes the LLM output to an output.log file:

from loguru import logger
from langchain.callbacks import FileCallbackHandler
from langchain.chains import LLMChain
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate

logfile = "output.log"
logger.add(logfile, colorize=True, enqueue=True)
handler = FileCallbackHandler(logfile)

llm = OpenAI()
prompt = PromptTemplate.from_template("1 + {number} = ")
chain = LLMChain(llm=llm, prompt=prompt, callbacks=[handler], verbose=True)
answer = chain.run(number=2)
logger.info(answer)

During LLM execution, the provided FileCallbackHandler captures the result and writes it to the log file.
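The mechanism behind this is simple: at each lifecycle stage, the runtime calls well-known hook methods on every registered handler. A stdlib-only sketch of that dispatch pattern (hypothetical names and a fake LLM, not LangChain's actual classes):

```python
class BaseHandler:
    # Hooks default to no-ops; subclasses override what they care about
    def on_start(self, prompt): pass
    def on_new_token(self, token): pass
    def on_end(self, output): pass

class CollectingHandler(BaseHandler):
    def __init__(self):
        self.events = []
    def on_start(self, prompt):
        self.events.append(("start", prompt))
    def on_new_token(self, token):
        self.events.append(("token", token))
    def on_end(self, output):
        self.events.append(("end", output))

def run_fake_llm(prompt, handlers):
    # Fan each lifecycle event out to every registered handler
    for h in handlers:
        h.on_start(prompt)
    tokens = ["1", " +", " 2", " =", " 3"]  # pretend streamed completion
    for t in tokens:
        for h in handlers:
            h.on_new_token(t)
    output = "".join(tokens)
    for h in handlers:
        h.on_end(output)
    return output

handler = CollectingHandler()
result = run_fake_llm("1 + 2 = ", [handler])
```

Because handlers only observe events, any number of them can be attached to one run without changing the chain's logic.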

Custom Callback Functions

The following code shows how to create custom synchronous and asynchronous callback handlers for a hypothetical "dry‑food shop" chatbot.

# Imports
import asyncio
from typing import Any, Dict, List
from langchain.chat_models import ChatOpenAI
from langchain.schema import LLMResult, HumanMessage
from langchain.callbacks.base import AsyncCallbackHandler, BaseCallbackHandler

# Synchronous handler based on BaseCallbackHandler
class MyDryFoodShopSyncHandler(BaseCallbackHandler):
    def on_llm_new_token(self, token: str, **kwargs) -> None:
        print(f"Dry goods data: token: {token}")

# Asynchronous handler
class MyDryFoodAsyncHandler(AsyncCallbackHandler):
    async def on_llm_start(self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any) -> None:
        print("Fetching dry goods data...")
        await asyncio.sleep(0.5)
        print("Dry goods data fetched. Preparing suggestions...")

    async def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
        print("Organizing dry goods suggestions...")
        await asyncio.sleep(0.5)
        print("Happy shopping!")

# Main async function invoking the chatbot with both handlers
async def main():
    dryfood_shop_chat = ChatOpenAI(
        max_tokens=100,
        streaming=True,
        callbacks=[MyDryFoodShopSyncHandler(), MyDryFoodAsyncHandler()],
    )
    await dryfood_shop_chat.agenerate(
        [[HumanMessage(content="Which dried goods are best for stewing chicken? Briefly name just 3, in under 60 words.")]]
    )

asyncio.run(main())

When a user asks which dried goods suit stewing chicken, the chatbot prints a message for each streamed token, logs messages before and after the OpenAI call, and finally wishes the user a pleasant purchase.
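How a runtime can fan one event out to both synchronous and asynchronous handlers can be sketched with stdlib asyncio (hypothetical names; LangChain's own dispatcher is more involved): coroutine hooks are awaited, plain hooks are called directly.

```python
import asyncio
import inspect

class SyncHandler:
    def __init__(self):
        self.seen = []
    def on_token(self, token):
        self.seen.append(token)

class AsyncHandler:
    def __init__(self):
        self.seen = []
    async def on_token(self, token):
        await asyncio.sleep(0)  # yield to the event loop, like real async work
        self.seen.append(token)

async def emit(handlers, token):
    # Await coroutine hooks, call plain ones directly
    for h in handlers:
        hook = h.on_token
        if inspect.iscoroutinefunction(hook):
            await hook(token)
        else:
            hook(token)

async def main():
    s, a = SyncHandler(), AsyncHandler()
    for t in ["dried", " shrimp"]:
        await emit([s, a], t)
    return s.seen, a.seen

sync_seen, async_seen = asyncio.run(main())
```

Both handlers receive every token in order, which is why the example above can freely mix MyDryFoodShopSyncHandler and MyDryFoodAsyncHandler in one callbacks list.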

Calculating Token Usage and Cost Control

A ConversationChain with ConversationBufferMemory re-sends the entire dialogue history with every request, so token usage grows turn by turn:

from langchain import OpenAI
from langchain.chains import ConversationChain
from langchain.chains.conversation.memory import ConversationBufferMemory

llm = OpenAI(temperature=0.5, model_name="gpt-3.5-turbo-instruct")
conversation = ConversationChain(llm=llm, memory=ConversationBufferMemory())

# Sample dialogue
conversation("We're having a party at home tomorrow; I need some dried seafood.")
print("Memory after the first turn:", conversation.memory.buffer)
conversation("Grandpa likes dried shrimp, about one liang (50 g) apiece.")
print("Memory after the second turn:", conversation.memory.buffer)
conversation("I'm back. Do you remember why I wanted to buy dried seafood yesterday?")
print("Prompt after the third turn:", conversation.prompt.template)
print("Memory after the third turn:", conversation.memory.buffer)

To obtain precise token counts, the get_openai_callback context manager can be used:

from langchain import OpenAI
from langchain.chains import ConversationChain
from langchain.chains.conversation.memory import ConversationBufferMemory
from langchain.callbacks import get_openai_callback

llm = OpenAI(temperature=0.5, model_name="gpt-3.5-turbo-instruct")
conversation = ConversationChain(llm=llm, memory=ConversationBufferMemory())

with get_openai_callback() as cb:
    conversation("我家明天要开party,我需要一些干海货。")
    conversation("爷爷喜欢虾干,一两一只的。")
    conversation("我又来了,还记得我昨天为什么要买干海货吗?")

print("\nTotal tokens used:", cb.total_tokens)

The callback reports the total number of tokens used (e.g., 1023), allowing developers to monitor and manage LLM costs effectively.
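In recent LangChain versions the callback object also exposes prompt_tokens, completion_tokens, and total_cost. Given such counts, the dollar cost can be estimated from the model's per-1K-token prices (the prices below are placeholders for illustration; check OpenAI's current pricing page):

```python
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  prompt_price_per_1k: float, completion_price_per_1k: float) -> float:
    # Linear pricing: each side is billed separately per 1,000 tokens
    return (prompt_tokens / 1000) * prompt_price_per_1k \
         + (completion_tokens / 1000) * completion_price_per_1k

# Hypothetical split of the 1023 tokens from the example above
cost = estimate_cost(800, 223, 0.0015, 0.002)
print(f"Estimated cost: ${cost:.6f}")
```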

Conclusion

By leveraging LangChain's callback system, developers can handle token accounting, logging, and custom workflow steps, making it easier to control costs and gain insight into LLM interactions.

Tags: Python, LLM, LangChain, callback, asyncio, Token, Cost
Written by

Rare Earth Juejin Tech Community

Juejin, a tech community that helps developers grow.
