
Analyzing WeChat Friend Data with Python: Gender, Avatar, Signature, and Location Insights

This tutorial demonstrates how to use Python libraries such as itchat, jieba, matplotlib, SnowNLP, and Tencent Youtu SDK to collect WeChat friend information and perform data analysis on gender distribution, avatar characteristics, signature text (including word‑cloud and sentiment analysis), and geographic location, presenting the results with visual charts and maps.

Python Programming Learning Circle

Mar 21, 2023

With the widespread use of WeChat, the article treats the platform as a data source and shows how to analyze a user's friend list using Python, focusing on four dimensions: gender, avatar, signature, and location.

The analysis relies on the following third‑party modules: itchat for accessing WeChat data, jieba for Chinese word segmentation, matplotlib for charting, snownlp for sentiment analysis, PIL (installed as Pillow) and numpy for image handling, wordcloud for visualizing text, and the TencentYoutuyun SDK for face detection and image tagging.

First, the friend list is obtained with itchat:

itchat.auto_login(hotReload = True)
friends = itchat.get_friends(update = True)

Each friend is represented as a dictionary containing fields such as Sex, HeadImgUrl, Signature, Province, and City. The analysis always skips the first element (the logged‑in user) and works on friends[1:].

Gender analysis extracts the Sex value, counts Male, Female, and Unknown, and draws a pie chart with matplotlib:

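from collections import Counter
import matplotlib.pyplot as plt

def analyseSex(friends):
    # Sex is a numeric code: 0 = unknown, 1 = male, 2 = female.
    # friends[0] is the logged-in user, so it is excluded from the tally.
    sexs = Counter(x['Sex'] for x in friends[1:])
    # Look each count up by code so it always lines up with its label
    counts = [sexs.get(0, 0), sexs.get(1, 0), sexs.get(2, 0)]
    labels = ['Unknown', 'Male', 'Female']
    colors = ['red', 'yellowgreen', 'lightskyblue']
    plt.figure(figsize=(8,5), dpi=80)
    plt.axes(aspect=1)
    plt.pie(counts, labels=labels, colors=colors, labeldistance=1.1,
            autopct='%3.1f%%', shadow=False, startangle=90, pctdistance=0.6)
    plt.legend(loc='upper right')
    plt.title("Gender composition of %s's WeChat friends" % friends[0]['NickName'])
    plt.show()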
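
The pie chart is only meaningful if every count lines up with its label. The counting step can be sketched on its own, without itchat or matplotlib, assuming the `Sex` field uses the codes 0 (unknown), 1 (male), and 2 (female). `SEX_LABELS`, `count_by_sex`, and the sample list are hypothetical names used only for illustration:

```python
from collections import Counter

# Assumed encoding of the Sex field: 0 = unknown, 1 = male, 2 = female.
SEX_LABELS = {0: "Unknown", 1: "Male", 2: "Female"}

def count_by_sex(friends):
    """Return (labels, counts) in a fixed order, skipping the logged-in user."""
    tally = Counter(friend.get("Sex", 0) for friend in friends[1:])
    labels = list(SEX_LABELS.values())
    # Missing codes become 0 so the two lists always have the same length.
    counts = [tally.get(code, 0) for code in SEX_LABELS]
    return labels, counts

# Made-up sample data; the first entry stands in for the logged-in user.
sample = [
    {"NickName": "me", "Sex": 1},
    {"NickName": "a", "Sex": 2},
    {"NickName": "b", "Sex": 1},
    {"NickName": "c", "Sex": 2},
]
labels, counts = count_by_sex(sample)
print(dict(zip(labels, counts)))
```

Looking counts up by code rather than relying on the order in which values were first seen keeps the chart correct even when one category is absent from the friend list.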

Avatar analysis downloads each friend's head image, uses the Tencent Youtu FaceAPI to detect whether a face is present, and extracts image tags. The results are visualized with a pie chart for face‑avatar ratio and a word‑cloud for the collected tags:

import os
import time
import itchat

def analyseHeadImage(friends):
    # FaceAPI is not defined in this excerpt; it is presumably a wrapper
    # around the Tencent Youtu SDK that exposes detectFace and extractTags.
    baseFolder = os.path.join(os.path.abspath('.'), 'HeadImages')
    os.makedirs(baseFolder, exist_ok=True)
    faceApi = FaceAPI()
    use_face = 0
    not_use_face = 0
    image_tags = []
    for index in range(1, len(friends)):
        friend = friends[index]
        imgFile = os.path.join(baseFolder, 'Image%s.jpg' % index)
        # Download only avatars that are not already cached on disk
        if not os.path.exists(imgFile):
            imgData = itchat.get_head_img(userName=friend['UserName'])
            with open(imgFile, 'wb') as file:
                file.write(imgData)
        time.sleep(1)  # throttle requests
        if faceApi.detectFace(imgFile):
            use_face += 1
        else:
            not_use_face += 1
        tags = faceApi.extractTags(imgFile)
        image_tags.extend(t['tag_name'] for t in tags)
    # Join once so tags from different friends stay comma-separated
    image_tags = ','.join(image_tags)
    # plot pie chart and wordcloud (code omitted for brevity)
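
The plotting code is omitted above, but the numbers it needs are simple to derive. The following is a minimal sketch of that aggregation step, assuming each friend's result has already been reduced to a `(has_face, tag_names)` pair. `summarise_avatars` and the sample data are hypothetical and not part of the original article:

```python
from collections import Counter

def summarise_avatars(results):
    """results: list of (has_face, tag_names) pairs, one per friend."""
    total = len(results)
    with_face = sum(1 for has_face, _ in results if has_face)
    # Count how often each tag appears across all avatars.
    tag_counts = Counter(tag for _, tags in results for tag in tags)
    face_ratio = with_face / total if total else 0.0
    return face_ratio, tag_counts

# Made-up detection results for four friends.
sample_results = [
    (True, ["girl", "tree"]),
    (False, ["house"]),
    (False, ["tree", "sky"]),
    (False, []),
]
ratio, tags = summarise_avatars(sample_results)
print(f"{ratio:.0%} of avatars contain a face")
print(tags.most_common(2))
```

A frequency mapping like `tags` can be passed to `WordCloud.generate_from_frequencies`, which avoids re-tokenizing a joined string.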

Signature analysis processes the Signature field. Jieba extracts the top keywords to build a word‑cloud, while SnowNLP evaluates sentiment scores and groups them into positive, neutral, and negative categories:

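import re
import jieba.analyse
from snownlp import SnowNLP

def analyseSignature(friends):
    keywords = []
    emotions = []
    for friend in friends[1:]:  # skip the logged-in user, as in the other analyses
        # Remove emoji-code residue such as "1f604"; the greedy pattern also drops any text after it
        signature = re.sub(r'1f(\d.+)', '', friend.get('Signature') or '').strip()
        if signature:
            emotions.append(SnowNLP(signature).sentiments)
            keywords.extend(jieba.analyse.extract_tags(signature, 5))
    # Join once so keywords from different friends stay separated
    signatures = ' '.join(keywords)
    # generate wordcloud and sentiment bar chart (code omitted)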
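
SnowNLP's `sentiments` value is a score between 0 and 1, where higher means more positive. The grouping step is not shown above, so here is one way it might look. The cut-offs at one third and two thirds are an assumption, not thresholds confirmed by the article, and `bucket_sentiments` is a hypothetical helper:

```python
def bucket_sentiments(scores, low=1/3, high=2/3):
    """Count scores in [0, 1] as negative, neutral, or positive.

    The default thresholds are illustrative assumptions.
    """
    buckets = {"positive": 0, "neutral": 0, "negative": 0}
    for score in scores:
        if score > high:
            buckets["positive"] += 1
        elif score < low:
            buckets["negative"] += 1
        else:
            buckets["neutral"] += 1
    return buckets

# Made-up scores standing in for the `emotions` list.
sample_scores = [0.91, 0.72, 0.50, 0.40, 0.10]
print(bucket_sentiments(sample_scores))
```

The resulting dictionary maps directly onto the three bars of a sentiment bar chart.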

Location analysis extracts Province and City fields, writes them to a CSV file, and later visualizes the geographic distribution with an external map tool (BDP):

import csv

def analyseLocation(friends):
    headers = ['NickName', 'Province', 'City']
    # newline='' stops the csv module from writing blank rows on Windows
    with open('location.csv', 'w', encoding='utf-8', newline='') as csvFile:
        writer = csv.DictWriter(csvFile, headers)
        writer.writeheader()
        for friend in friends[1:]:  # skip the logged-in user
            row = {
                'NickName': friend['NickName'],
                'Province': friend['Province'],
                'City': friend['City']
            }
            writer.writerow(row)
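
Before handing the CSV to a map tool, it can help to check the distribution in Python. This is a minimal sketch, assuming the file has the same three columns as above. `count_provinces` is a hypothetical helper, and the rows are made-up sample data read from an in-memory string instead of `location.csv`:

```python
import csv
import io
from collections import Counter

def count_provinces(csv_file):
    """Tally rows per Province from an open CSV file object."""
    reader = csv.DictReader(csv_file)
    # Friends who left the field empty are grouped under "Unknown".
    return Counter(row["Province"] or "Unknown" for row in reader)

# Stand-in for open('location.csv', encoding='utf-8', newline='').
sample_csv = io.StringIO(
    "NickName,Province,City\n"
    "a,Shaanxi,Xi'an\n"
    "b,Ningxia,Yinchuan\n"
    "c,Shaanxi,Baoji\n"
    "d,,\n"
)
province_counts = count_provinces(sample_csv)
print(province_counts.most_common())
```

Many friends leave their location empty, so grouping blanks explicitly keeps them from silently skewing the totals.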

The resulting charts show that roughly 25% of friends use a face as their avatar and that the gender distribution is roughly balanced. Common avatar tags include "girl", "tree", and "house", and frequent signature keywords include "努力" (hard work) and "快乐" (happiness). Sentiment analysis classifies about 55% of signatures as positive, 32% as neutral, and 12% as negative. Geographically, the friends are concentrated in Ningxia and Shaanxi.

In conclusion, the article emphasizes that data visualization is a means to uncover insights rather than an end in itself, encouraging readers to apply similar Python‑based analysis to their own WeChat data.
