Analyzing WeChat Friend Data with Python: Gender, Avatar, Signature, and Location Insights
This tutorial demonstrates how to use Python libraries such as itchat, jieba, matplotlib, SnowNLP, and Tencent Youtu SDK to collect WeChat friend information and perform data analysis on gender distribution, avatar characteristics, signature text (including word‑cloud and sentiment analysis), and geographic location, presenting the results with visual charts and maps.
Because WeChat is so widely used, this article treats it as a data source and shows how to analyze a user's friend list with Python along four dimensions: gender, avatar, signature, and location.
The analysis relies on the following third-party modules: itchat for accessing WeChat data, jieba for Chinese word segmentation, matplotlib for charting, snownlp for sentiment analysis, PIL and numpy for image handling, wordcloud for visualizing text, and the TencentYoutuyun SDK for face detection and image tagging.
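Before running the analysis it can help to confirm that these modules are importable. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the import names are assumed to match the package names given in the text.

```python
from importlib.util import find_spec

# Import names assumed from the module list in the text.
REQUIRED_MODULES = [
    "itchat", "jieba", "matplotlib", "snownlp",
    "PIL", "numpy", "wordcloud", "TencentYoutuyun",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

Checking with `find_spec` avoids actually importing heavy packages such as matplotlib just to see whether they are installed.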
First, the friend list is obtained with itchat:
import itchat

# hotReload=True caches the login session so later runs can skip the QR scan
itchat.auto_login(hotReload=True)
friends = itchat.get_friends(update=True)

Each friend is represented as a dictionary containing fields such as Sex, HeadImgUrl, Signature, Province, and City. The first element is the logged-in user, so the analysis always skips it and works on friends[1:].
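To make the record shape concrete, here is a small sketch that reduces each friend dictionary to the fields used in this article and skips the logged-in user. The helper `extract_fields` and the sample records are made up for illustration; real records come from itchat and contain many more keys.

```python
FIELDS = ("NickName", "Sex", "HeadImgUrl", "Signature", "Province", "City")


def extract_fields(friends, fields=FIELDS):
    """Return one small dict per friend, skipping the logged-in user at index 0.

    Missing keys are filled with an empty string.
    """
    return [
        {field: friend.get(field, "") for field in fields}
        for friend in friends[1:]
    ]


# Made-up records standing in for the list returned by itchat.get_friends().
sample_friends = [
    {"NickName": "me", "Sex": 1, "Province": "Shaanxi", "City": "Xi'an"},
    {"NickName": "friend_a", "Sex": 2, "Signature": "hello",
     "Province": "Ningxia", "City": "Yinchuan"},
    {"NickName": "friend_b", "Sex": 0},
]

rows = extract_fields(sample_friends)
for row in rows:
    print(row["NickName"], row["Sex"], row["Province"] or "(no province)")
```

Using `dict.get` with a default keeps the later steps from failing on friends who have left a field blank.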
Gender analysis extracts the Sex value, counts Male, Female, and Unknown, and draws a pie chart with matplotlib:
from collections import Counter
import matplotlib.pyplot as plt

def analyseSex(friends):
    # Collect the Sex code of every friend, skipping the logged-in user
    sexs = [friend['Sex'] for friend in friends[1:]]
    # Look counts up by code so they line up with the labels below
    # (assumes the coding 0 = unknown, 1 = male, 2 = female)
    sex_counts = Counter(sexs)
    counts = [sex_counts[0], sex_counts[1], sex_counts[2]]
    labels = ['Unknown', 'Male', 'Female']
    colors = ['red', 'yellowgreen', 'lightskyblue']
    plt.figure(figsize=(8, 5), dpi=80)
    plt.axes(aspect=1)
    plt.pie(counts, labels=labels, colors=colors, labeldistance=1.1,
            autopct='%3.1f%%', shadow=False, startangle=90, pctdistance=0.6)
    plt.legend(loc='upper right')
    # Title reads: "Gender composition of <nickname>'s WeChat friends"
    plt.title(u"%s的微信好友性别组成" % friends[0]['NickName'])
    plt.show()

Avatar analysis downloads each friend's head image, uses the Tencent Youtu FaceAPI to detect whether a face is present, and extracts image tags. The results are visualized with a pie chart of the face-avatar ratio and a word cloud of the collected tags:
import os
import time
import itchat

def analyseHeadImage(friends):
    # FaceAPI is assumed to be a wrapper around the Tencent Youtu SDK
    # defined elsewhere; its definition is not shown in this article
    baseFolder = os.path.join(os.path.abspath('.'), 'HeadImages')
    os.makedirs(baseFolder, exist_ok=True)
    faceApi = FaceAPI()
    use_face = 0
    not_use_face = 0
    image_tags = ''
    for index in range(1, len(friends)):
        friend = friends[index]
        # Download the avatar only if it is not already saved on disk
        imgFile = os.path.join(baseFolder, 'Image%s.jpg' % index)
        if not os.path.exists(imgFile):
            imgData = itchat.get_head_img(userName=friend['UserName'])
            with open(imgFile, 'wb') as file:
                file.write(imgData)
        # Wait one second before each round of API requests
        time.sleep(1)
        if faceApi.detectFace(imgFile):
            use_face += 1
        else:
            not_use_face += 1
        tags = faceApi.extractTags(imgFile)
        # The trailing comma keeps tags of consecutive avatars from running together
        image_tags += ','.join([t['tag_name'] for t in tags]) + ','
    # plot pie chart and wordcloud (code omitted for brevity)

Signature analysis processes the Signature field. jieba extracts the top keywords to build a word cloud, while SnowNLP scores the sentiment of each signature; the scores are then grouped into positive, neutral, and negative categories:
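import re
import jieba.analyse
from snownlp import SnowNLP

def analyseSignature(friends):
    signatures = ''
    emotions = []
    # Skip the logged-in user, as in the other analyses
    for friend in friends[1:]:
        signature = friend.get('Signature')
        if signature:
            # Cut the text at the first emoji code ("1f" followed by a digit)
            signature = re.sub(r'1f(\d.+)', '', signature)
        if signature:
            nlp = SnowNLP(signature)
            # sentiments is a score in [0, 1]; higher means more positive
            emotions.append(nlp.sentiments)
            # Keep the top 5 keywords; the trailing space separates friends
            signatures += ' '.join(jieba.analyse.extract_tags(signature, 5)) + ' '
    # generate wordcloud and sentiment bar chart (code omitted)

Location analysis extracts the Province and City fields and writes them to a CSV file, which is later loaded into an external map tool (BDP) to visualize the geographic distribution: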
import csv

def analyseLocation(friends):
    headers = ['NickName', 'Province', 'City']
    # newline='' lets the csv module control line endings itself
    with open('location.csv', 'w', encoding='utf-8', newline='') as csvFile:
        writer = csv.DictWriter(csvFile, headers)
        writer.writeheader()
        for friend in friends[1:]:
            row = {
                'NickName': friend['NickName'],
                'Province': friend['Province'],
                'City': friend['City']
            }
            writer.writerow(row)

The visual results show that roughly 25% of friends use a face avatar and that the gender distribution is roughly balanced. Common avatar tags include "girl", "tree", and "house", and frequent signature keywords include "努力" (hard work) and "快乐" (happiness). Sentiment analysis indicates about 55% positive, 32% neutral, and 12% negative attitudes. Geographically, the friends are concentrated in Ningxia and Shaanxi provinces.
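The grouping of sentiment scores into positive, neutral, and negative shares is part of the omitted code, so the following is only a sketch of how it might be done. SnowNLP's sentiments value lies between 0 and 1; the cut-offs 0.33 and 0.66 and the helper names are assumptions for illustration, not values confirmed by the article.

```python
def bucket_sentiments(scores, low=0.33, high=0.66):
    """Count scores in [0, 1] as positive (> high), negative (< low), or neutral.

    The default thresholds are illustrative assumptions.
    """
    counts = {"positive": 0, "neutral": 0, "negative": 0}
    for score in scores:
        if score > high:
            counts["positive"] += 1
        elif score < low:
            counts["negative"] += 1
        else:
            counts["neutral"] += 1
    return counts


def to_percentages(counts):
    """Convert raw counts to percentages of the total (all zeros if empty)."""
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in counts}
    return {label: 100.0 * n / total for label, n in counts.items()}


# Made-up scores standing in for the `emotions` list built in analyseSignature.
example_scores = [0.9, 0.8, 0.5, 0.1]
example_counts = bucket_sentiments(example_scores)
print(example_counts)
print(to_percentages(example_counts))
```

The resulting dictionary can be passed to a matplotlib bar chart, with the labels as categories and the counts or percentages as heights.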
In conclusion, the article emphasizes that data visualization is a means to uncover insights rather than an end in itself, encouraging readers to apply similar Python‑based analysis to their own WeChat data.
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.