
Data‑Driven Dating Guide: Analyzing Zhihu Answers to Identify Potential Partners

In a playful data-driven experiment, the author scraped 27,664 Zhihu answers to the question “What are your dating criteria?” and filtered out posts that were short, outdated, high-profile, or already matched. Follower and engagement thresholds narrowed the pool to 480 candidates, and the top 30 were ranked by like-to-comment ratio. The code and dataset are shared for reproducibility.

Youku Technology

As a playful data‑driven experiment, the author collected answers from Zhihu’s most popular question “What are your dating criteria?” to build a partner‑selection guide.

Data acquisition: over 27,000 answers were scraped by locating the XHR request URL that the question page uses to load answers and replaying it with a spoofed User-Agent header. The dataset includes answer ID, author nickname, follower count, like count, gender, and other fields.
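
The acquisition step can be sketched roughly as follows. This is a minimal illustration using only Python's standard library, not the author's actual scraper: the endpoint URL, the paging parameters (`offset`, `limit`), and the JSON field names are hypothetical placeholders, since the original does not list them.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical placeholder: substitute the XHR URL seen in the browser's network tab.
API_URL = "https://example.com/api/questions/QUESTION_ID/answers"

# A browser-like User-Agent so the request does not identify itself as a script.
USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"


def build_request(offset, limit=20):
    """Build one paged request carrying the spoofed User-Agent header."""
    query = urllib.parse.urlencode({"offset": offset, "limit": limit})
    return urllib.request.Request(
        f"{API_URL}?{query}", headers={"User-Agent": USER_AGENT}
    )


def parse_answers(payload):
    """Flatten one page of JSON into flat rows (field names are assumed)."""
    rows = []
    for item in payload.get("data", []):
        author = item.get("author", {})
        rows.append({
            "answer_id": item.get("id"),
            "nickname": author.get("name"),
            "followers": author.get("follower_count", 0),
            "gender": author.get("gender"),
            "likes": item.get("voteup_count", 0),
        })
    return rows


def fetch_page(offset):
    """Download and parse one page (this performs a real network call)."""
    with urllib.request.urlopen(build_request(offset), timeout=10) as resp:
        return parse_answers(json.load(resp))
```

Calling `fetch_page` in a loop with an increasing `offset` until an empty page comes back would collect the full set of answers, assuming the endpoint pages this way.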

Initial insights: of the 27,664 answers, 13,527 (48.9%) are anonymous; anonymous users are treated as male by default. After removing anonymous users, the gender distribution becomes roughly balanced, with 4,758 non-anonymous female respondents.
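
A breakdown like this could be computed along the following lines. The record layout (`anonymous` and `gender` keys) and the toy rows are assumptions made for illustration; only the figures 13,527 and 27,664 come from the article.

```python
from collections import Counter


def gender_breakdown(answers):
    """Return (anonymous share, gender counts among non-anonymous authors)."""
    named = [a for a in answers if not a["anonymous"]]
    anonymous_share = (len(answers) - len(named)) / len(answers) if answers else 0.0
    return anonymous_share, Counter(a["gender"] for a in named)


# Toy records; the real dataset has 27,664 rows.
sample = [
    {"anonymous": True, "gender": "male"},
    {"anonymous": True, "gender": "male"},
    {"anonymous": False, "gender": "female"},
    {"anonymous": False, "gender": "male"},
]
share, counts = gender_breakdown(sample)
print(f"{share:.1%}", dict(counts))

# Sanity check of the reported share: 13,527 of 27,664 answers.
print(f"{13527 / 27664:.1%}")  # 48.9%
```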

The distribution of answer creation times shows spikes around Chinese holidays (Oct 7, 2018; Dec 13, 2018; Spring Festival 2019), indicating higher activity during festive periods.
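
One way to surface such spikes is to count answers per calendar day. The sketch below assumes each answer carries a creation time as a Unix timestamp in seconds and buckets by UTC date; the real field format and the appropriate time zone are not stated in the article.

```python
from collections import Counter
from datetime import datetime, timezone


def answers_per_day(created_times):
    """Count answers per calendar day (UTC) from Unix timestamps."""
    return Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
        for ts in created_times
    )


def busiest_days(created_times, n=3):
    """Return the n days with the most answers, highest count first."""
    return answers_per_day(created_times).most_common(n)


# Toy timestamps: three answers on 2018-10-07 and one on 2018-10-08.
base = int(datetime(2018, 10, 7, 12, tzinfo=timezone.utc).timestamp())
sample = [base, base + 60, base + 3600, base + 86400]
print(busiest_days(sample, n=1))  # [('2018-10-07', 3)]
```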

Four‑step filtering method:

1. Discard answers shorter than 30 characters, which are considered superficial.

2. Keep only answers updated within the last 30 days, to ensure the author is still active.

3. Exclude answers stating that the photo or content has been removed, a sign that the author has already found a partner.

4. Apply thresholds: remove users with more than 1,500 followers (high-profile) and answers with more than 150 likes or more than 100 comments (high competition).

After these steps the candidate pool shrank from 27,664 to 480 profiles.
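
The four filtering steps could be expressed as a single predicate, as in the sketch below. The thresholds are the ones given in the article; the record keys and the `REMOVED_MARKERS` phrases are hypothetical, since the article does not say how removed posts were detected.

```python
from datetime import datetime, timedelta

# Hypothetical phrases signalling that the author withdrew the post.
REMOVED_MARKERS = ("photo removed", "content removed")


def is_candidate(answer, now):
    """Apply the four filtering steps to one answer record."""
    text = answer["content"]
    if len(text) < 30:  # step 1: too short
        return False
    if now - answer["updated"] > timedelta(days=30):  # step 2: stale
        return False
    if any(marker in text.lower() for marker in REMOVED_MARKERS):  # step 3
        return False
    if answer["followers"] > 1500:  # step 4: high-profile author
        return False
    if answer["likes"] > 150 or answer["comments"] > 100:  # step 4: high competition
        return False
    return True


def filter_candidates(answers, now):
    """Keep only the answers that pass every step."""
    return [a for a in answers if is_candidate(a, now)]
```

Passing the reference date in as `now` rather than reading the clock inside the function keeps the filter reproducible on a fixed dataset.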

Ranking: for each remaining answer, a “like-to-comment ratio” (likes ÷ comments) was calculated to weigh popularity (likes) against competition (comments). A higher ratio indicates a more favorable candidate. The 30 profiles with the highest ratios formed the final recommendation list.
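
The ranking can be sketched in a few lines. The article does not say how answers with zero comments were handled, so clamping the comment count to at least 1 is an assumption made here to avoid division by zero; the record keys are likewise illustrative.

```python
def like_to_comment_ratio(answer):
    """Likes divided by comments, the article's ranking metric."""
    # Assumption: treat zero comments as one so the ratio stays defined.
    return answer["likes"] / max(answer["comments"], 1)


def top_candidates(answers, n=30):
    """Return the n answers with the highest like-to-comment ratio."""
    return sorted(answers, key=like_to_comment_ratio, reverse=True)[:n]


sample = [
    {"id": "a", "likes": 120, "comments": 60},  # ratio 2.0
    {"id": "b", "likes": 90, "comments": 10},   # ratio 9.0
    {"id": "c", "likes": 30, "comments": 0},    # ratio 30.0 after clamping
]
print([a["id"] for a in top_candidates(sample, n=2)])  # ['c', 'b']
```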

The author provides the scraping code, raw dataset, and filtering scripts for readers to reproduce or customize the analysis.

Tags: data analysis, ranking, filtering, dating, Zhihu