
A Data‑Driven Guide to Finding a Partner: From Crawling Zhihu Answers to Ranking Candidates

This article walks through a complete data‑analysis workflow—scraping Zhihu dating‑preference answers, cleaning and filtering the data, deriving gender and activity metrics, designing a four‑step screening process, and finally ranking candidates with a custom like‑to‑comment index—to help a single programmer create a concise, high‑quality list of potential partners.

360 Tech Engineering

On May 20th, the author presents a tongue‑in‑cheek “dating guide” that claims a scientific basis for helping single programmers find a girlfriend by mining answers to the popular Zhihu question “What are your dating criteria?”

Data acquisition: The author crawls the XHR endpoint behind the Zhihu question page and retrieves over 27,000 answers, each with fields such as username, follower count, likes, gender, and timestamps.
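The crawl described above can be sketched roughly as follows. This is a minimal illustration, not the author's actual crawler: the page shape (`data`, `paging`, `is_end`, `next`) and every field name (`voteup_count`, `follower_count`, and so on) are assumptions made for the example, and the HTTP call is injected as a `fetch_page` function so the paging logic can be shown on its own.

```python
from typing import Callable, Dict, Iterator, Optional


def flatten_answer(raw: Dict) -> Dict:
    """Reduce one raw answer record to the fields the analysis needs.

    All key names here are hypothetical; adjust them to the real payload.
    """
    author = raw.get("author", {})
    return {
        "username": author.get("name"),
        "followers": author.get("follower_count", 0),
        "gender": author.get("gender"),
        "likes": raw.get("voteup_count", 0),
        "comments": raw.get("comment_count", 0),
        "created": raw.get("created_time"),
        "updated": raw.get("updated_time"),
        "text": raw.get("content", ""),
    }


def crawl_answers(
    fetch_page: Callable[[str], Dict],
    first_url: str,
    max_pages: int = 10_000,
) -> Iterator[Dict]:
    """Follow 'next' links page by page until the endpoint reports the end."""
    url: Optional[str] = first_url
    for _ in range(max_pages):  # hard cap so a bad 'next' link cannot loop forever
        if url is None:
            break
        page = fetch_page(url)
        for raw in page.get("data", []):
            yield flatten_answer(raw)
        paging = page.get("paging", {})
        # Assumed convention: stop when is_end is true or no next URL is given.
        url = None if paging.get("is_end", True) else paging.get("next")
```

In practice `fetch_page` might be a thin wrapper such as `lambda url: requests.get(url, headers=...).json()`, with a delay between calls; the headers and rate limits the real endpoint requires are not stated in the article.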

Data overview: Approximately 48.9% of respondents are anonymous (default male), leaving about 4,758 identified female users. Answer timestamps are converted from Zhihu’s epoch format to readable dates, revealing spikes in activity around holidays and festivals.
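The timestamp conversion might look like the sketch below. It assumes the timestamps are Unix epoch seconds, which the article does not state explicitly, and it uses UTC by default so the output is reproducible; passing a UTC+8 zone would match local time for most Zhihu users. The function names are illustrative.

```python
from collections import Counter
from datetime import datetime, timezone, tzinfo
from typing import Iterable


def to_date(epoch_seconds: int, tz: tzinfo = timezone.utc) -> str:
    """Convert epoch seconds (assumed unit) to a YYYY-MM-DD string."""
    return datetime.fromtimestamp(epoch_seconds, tz=tz).strftime("%Y-%m-%d")


def answers_per_day(timestamps: Iterable[int], tz: tzinfo = timezone.utc) -> Counter:
    """Count answers per calendar day to expose activity spikes."""
    return Counter(to_date(ts, tz) for ts in timestamps)


# Example: two answers on 2020-05-20 (UTC) and one on the following day.
daily = answers_per_day([1589932800, 1589936400, 1590019200])
print(daily.most_common(1))
```

Sorting the resulting counter by count, as `most_common` does, is one simple way to spot the holiday and festival peaks the article mentions.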

Narrowing the scope: A four‑step filtering process is applied: (1) discard answers shorter than 30 characters; (2) keep only users active within the last 30 days; (3) exclude profiles that have already indicated they found a partner or removed content; (4) filter out high‑profile users (followers > 1500) and answers with many likes (> 150) or comments (> 100) to avoid intense competition.
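The four steps above can be sketched as a single predicate. The thresholds come from the text; the `Answer` record, its field names, and the `unavailable` flag (standing in for "found a partner or removed content") are hypothetical, since the article does not say how those conditions were detected.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Answer:
    text: str
    last_active: datetime
    unavailable: bool  # hypothetical flag: found a partner or removed content
    followers: int
    likes: int
    comments: int


def keep(a: Answer, now: datetime) -> bool:
    """Return True only if the answer survives all four screening steps."""
    if len(a.text) < 30:                              # step 1: too short
        return False
    if now - a.last_active > timedelta(days=30):      # step 2: inactive
        return False
    if a.unavailable:                                 # step 3: no longer looking
        return False
    if a.followers > 1500 or a.likes > 150 or a.comments > 100:
        return False                                  # step 4: too much competition
    return True


def screen(answers: List[Answer], now: datetime) -> List[Answer]:
    """Apply the screening predicate, preserving the original order."""
    return [a for a in answers if keep(a, now)]
```

Passing `now` explicitly rather than calling `datetime.now()` inside the function keeps the 30‑day window reproducible when the same crawl is re-analyzed later.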

After these steps, the candidate pool shrinks from 27,664 answers to about 480 promising profiles.

Introducing an index for ranking: The author defines a “like‑to‑comment ratio” (likes divided by comments) as a measure of how many likes an answer earns per comment, with higher values preferred. Examples illustrate how this index differentiates between answers with similar like counts but different comment volumes.

Using this index, the remaining candidates are sorted, and the top 30 are selected as the final “dating shortlist”.
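A minimal sketch of the index and the final sort follows. The article does not say how answers with zero comments are handled, so treating the divisor as at least 1 is an assumption made here to avoid division by zero; the dictionary keys and sample users are likewise illustrative.

```python
from typing import Dict, List


def like_to_comment(likes: int, comments: int) -> float:
    """Likes divided by comments; zero comments are treated as one (assumption)."""
    return likes / max(comments, 1)


def shortlist(candidates: List[Dict], top_n: int = 30) -> List[Dict]:
    """Sort candidates by the index, highest first, and keep the top N."""
    ranked = sorted(
        candidates,
        key=lambda c: like_to_comment(c["likes"], c["comments"]),
        reverse=True,
    )
    return ranked[:top_n]


# Two answers with the same like count but different comment volumes.
sample = [
    {"username": "user_a", "likes": 60, "comments": 30},  # index 2.0
    {"username": "user_b", "likes": 60, "comments": 10},  # index 6.0
]
print([c["username"] for c in shortlist(sample)])  # user_b ranks first
```

Python's `sorted` is stable, so candidates with equal index values keep their original relative order; a secondary key would be needed if ties should be broken some other way.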

The article concludes with a light‑hearted note that the author’s friend, after receiving the list, treats the author to a barbecue, and wishes all programmers success in finding love.
