Building a Juejin Author Profile Bot with Coze: Data Collection, Processing, and AI Summarization
This tutorial walks through creating a Coze bot that fetches a Juejin author's information and articles via web scraping, processes the data to generate a concise author profile, identifies expertise domains, ranks hot articles, and finally publishes the bot for interactive use.
1. Introduction
The article introduces a Coze bot project that automatically generates a Juejin author profile by collecting the author's basic information and all article metadata, then summarizing the data with AI.
2. Data Acquisition
2.1 User Information
Browser developer tools are used to locate the API that returns user details. A Python call using requests_async then fetches fields such as username, description, registration time, follower count, article count, digg (like) count, and view count.
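As a rough illustration of the field extraction described above, the following sketch maps a decoded JSON response onto a small record. The HTTP call itself is left out, and every key name (`data`, `user_name`, `register_time`, and so on) is an assumption made for illustration; the real names should be checked against the response shown in the browser's network panel.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass
class UserInfo:
    username: str
    description: str
    register_time: int
    follower_count: int
    article_count: int
    digg_count: int
    view_count: int


def parse_user_info(payload: Mapping[str, Any]) -> UserInfo:
    """Pick the profile fields out of an already-decoded JSON response.

    All key names here are hypothetical placeholders, not a documented schema.
    Missing fields fall back to an empty string or zero.
    """
    data = payload.get("data") or {}
    return UserInfo(
        username=data.get("user_name", ""),
        description=data.get("description", ""),
        register_time=int(data.get("register_time", 0)),
        follower_count=int(data.get("follower_count", 0)),
        article_count=int(data.get("article_count", 0)),
        digg_count=int(data.get("digg_count", 0)),
        view_count=int(data.get("view_count", 0)),
    )
```

Keeping the parsing separate from the network call makes it easy to test the mapping with a saved response before wiring it into a code node.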
2.2 Article Information
The article list API is identified, and a POST request with a JSON payload retrieves paginated article data. A code node iterates through the pages, extracts each article's title, brief content, view/collect/digg/comment counts, and tags, and builds a list of ArticleInfo objects.
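The pagination loop can be sketched as follows. This is a minimal outline, not the original code node: the POST request is hidden behind a caller-supplied `fetch_page` function, and the response shape (`data`, `has_more`, `cursor`) and the per-item key names are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ArticleInfo:
    title: str
    brief_content: str
    view_count: int
    collect_count: int
    digg_count: int
    comment_count: int
    tags: List[str] = field(default_factory=list)


Page = Dict[str, Any]


def collect_articles(
    fetch_page: Callable[[str], Page], max_pages: int = 50
) -> List[ArticleInfo]:
    """Walk the paginated list until the API reports no more pages.

    fetch_page(cursor) is a hypothetical helper that performs the POST
    request and returns the decoded JSON for that page.
    """
    articles: List[ArticleInfo] = []
    cursor = "0"
    for _ in range(max_pages):  # hard cap so a bad response cannot loop forever
        page = fetch_page(cursor)
        for item in page.get("data") or []:
            articles.append(
                ArticleInfo(
                    title=item.get("title", ""),
                    brief_content=item.get("brief_content", ""),
                    view_count=int(item.get("view_count", 0)),
                    collect_count=int(item.get("collect_count", 0)),
                    digg_count=int(item.get("digg_count", 0)),
                    comment_count=int(item.get("comment_count", 0)),
                    tags=list(item.get("tags") or []),
                )
            )
        if not page.get("has_more"):
            break
        cursor = str(page.get("cursor", ""))
    return articles
```

Passing the fetch function in as a parameter keeps the loop independent of the HTTP client, so the same logic can be exercised with canned pages.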
2.3 Parallel Crawling
To avoid timeouts when an author has many articles, the crawling task is split across multiple parallel code nodes, each handling a subset of cursors, and a final node merges the results.
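One way to picture the split-and-merge step is the sketch below. It assumes cursors are numeric offsets that advance by a fixed page size, which the source does not state; the function names and the contiguous chunking strategy are likewise illustrative choices.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_cursors(article_count: int, page_size: int, workers: int) -> List[List[int]]:
    """Divide page offsets into contiguous chunks, one chunk per parallel node.

    Assumes (hypothetically) that each cursor is an offset: 0, page_size, ...
    Some chunks may be empty when there are fewer pages than workers.
    """
    if workers < 1 or page_size < 1:
        raise ValueError("workers and page_size must be positive")
    cursors = list(range(0, article_count, page_size))
    chunk = -(-len(cursors) // workers)  # ceiling division
    return [cursors[i * chunk:(i + 1) * chunk] for i in range(workers)]


def merge_results(parts: Sequence[Sequence[T]]) -> List[T]:
    """Flatten the per-node result lists, preserving node order."""
    merged: List[T] = []
    for part in parts:
        merged.extend(part)
    return merged
```

For example, 45 articles at 10 per page across 2 nodes gives the first node offsets 0, 10, 20 and the second node offsets 30, 40.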
3. Data Processing
3.1 Author Profile Generation
The collected article abstracts are sent to a large language model (e.g., Zhipu AI) via an HTTP request to generate a concise author portrait (max 200 characters).
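The prompt preparation around that request might look like the sketch below. The model call itself is omitted because the source gives no endpoint or payload details; the prompt wording, the input cap of 4000 characters, and both helper names are assumptions. Only the 200-character limit comes from the text.

```python
from typing import Iterable

MAX_PORTRAIT_CHARS = 200  # limit stated in the text


def build_portrait_prompt(abstracts: Iterable[str], max_input_chars: int = 4000) -> str:
    """Join non-empty abstracts into one prompt for the language model.

    max_input_chars is a hypothetical guard against oversized requests.
    """
    bullets = "\n".join(f"- {a.strip()}" for a in abstracts if a and a.strip())
    bullets = bullets[:max_input_chars]
    return (
        "Based on the article abstracts below, write an author profile "
        f"of at most {MAX_PORTRAIT_CHARS} characters.\n\n{bullets}"
    )


def clamp_portrait(text: str, limit: int = MAX_PORTRAIT_CHARS) -> str:
    """Trim the model's reply in case it ignores the length instruction."""
    text = text.strip()
    return text if len(text) <= limit else text[:limit]
```

Clamping the reply is a defensive choice: a length instruction in a prompt is a request, not a guarantee.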
3.2 Expertise Domain Analysis
A script counts tag occurrences across all articles, calculates percentages, merges low‑frequency tags into an "Other" category, and formats the top four domains with their share and article count.
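The tag analysis can be approximated with a counter, as in this sketch. It treats every tag outside the top four as low-frequency and folds it into "Other", which is one possible reading of the merge rule; the output format is also an assumption. Shares are computed over tag occurrences, so an article with two tags contributes twice.

```python
from collections import Counter
from typing import Iterable, Sequence


def summarize_domains(article_tags: Iterable[Sequence[str]], top_n: int = 4) -> str:
    """Count tags across articles and format the top domains with their share.

    article_tags holds one tag list per article. Tags outside the top_n are
    merged into a single "Other" line (an assumed merge rule).
    """
    counts = Counter(tag for tags in article_tags for tag in tags)
    total = sum(counts.values())
    if total == 0:
        return "No tags found."
    top = counts.most_common(top_n)
    other = total - sum(n for _, n in top)
    lines = [f"{tag}: {n / total:.1%} ({n} articles)" for tag, n in top]
    if other:
        lines.append(f"Other: {other / total:.1%} ({other} tag occurrences)")
    return "\n".join(lines)
```

A threshold-based merge (for example, folding in any tag below a fixed share) would be an equally reasonable variant if the top-N cut is too coarse.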
3.3 Hot Article Ranking
A custom heat score H = 0.15*R/100 + 0.25*L + 0.35*C + 0.25*F (R: views, L: likes, C: comments, F: collections) is computed for each article. The articles are then sorted by H in descending order, and the top ten are formatted as markdown-style links.
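The heat formula translates directly into code. In this sketch the weights come from the formula above, while the `Article` record, its `url` field, and the numbered-list output format are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Article:
    title: str
    url: str  # hypothetical field; the link target is not specified in the text
    views: int
    likes: int
    comments: int
    collections: int


def heat_score(views: int, likes: int, comments: int, collections: int) -> float:
    """H = 0.15*R/100 + 0.25*L + 0.35*C + 0.25*F."""
    return 0.15 * views / 100 + 0.25 * likes + 0.35 * comments + 0.25 * collections


def top_articles(articles: Iterable[Article], n: int = 10) -> List[str]:
    """Return the n hottest articles as numbered markdown-style links."""
    ranked = sorted(
        articles,
        key=lambda a: heat_score(a.views, a.likes, a.comments, a.collections),
        reverse=True,
    )
    return [f"{i}. [{a.title}]({a.url})" for i, a in enumerate(ranked[:n], start=1)]
```

As a worked example, an article with 1000 views, 10 likes, 2 comments, and 4 collections scores 1.5 + 2.5 + 0.7 + 1.0 = 5.7. Dividing views by 100 keeps raw view counts, which are usually far larger than the other metrics, from dominating the ranking.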
4. Final Output Assembly
The three result strings (author portrait, expertise domains, hot articles) are concatenated with the basic user info to produce a comprehensive response, which is then returned by the bot.
5. Bot Publication
The completed workflow is saved as a Coze bot, configured to trigger when a user sends a Juejin author URL, and published on the Juejin platform for interactive queries.
Rare Earth Juejin Tech Community
Juejin, a tech community that helps developers grow.