
Analyzing WeChat Friend Data with Python: Gender, Avatar, Signature, and Location Insights

This tutorial demonstrates how to use Python libraries such as itchat, jieba, matplotlib, snownlp, and Tencent Youtu SDK to collect WeChat friend information and perform data analysis on gender distribution, avatar characteristics, signature sentiment, and geographic location, presenting the results with charts and word clouds.

Python Programming Learning Circle

Dec 19, 2023

Given WeChat's popularity, the author uses Python and itchat to collect friend information and analyzes it along four dimensions: gender, avatar, signature, and location.

The analysis relies on these third‑party modules, all installable via pip: itchat, jieba, matplotlib, snownlp, PIL (installed as Pillow), numpy, wordcloud, and TencentYoutuyun.

Log in to WeChat Web with itchat.auto_login(hotReload=True) and retrieve the friend list with friends = itchat.get_friends(update=True). The first element is the logged‑in user, so friends[1:] is used for analysis.

Gender analysis extracts the Sex field, counts Male, Female and Unknown, and draws a pie chart with matplotlib:
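
A minimal sketch of the login-and-fetch step follows. The helper names fetch_friends and split_self are illustrative and do not come from the original article, and the sample records are made up. The itchat import is deferred so that the splitting logic can be tried without a WeChat session.

```python
def fetch_friends():
    """Log in to WeChat Web and return the raw friend list (needs itchat and a QR scan)."""
    import itchat  # imported lazily: only needed when actually logging in
    itchat.auto_login(hotReload=True)  # hotReload reuses a recent session if one is cached
    return itchat.get_friends(update=True)


def split_self(friends):
    """Separate the logged-in user's own record (index 0) from the real friends."""
    if not friends:
        raise ValueError("friend list is empty")
    return friends[0], list(friends[1:])


# Stand-in data shaped like itchat's friend records (made-up values).
sample = [
    {"NickName": "me", "Sex": 1},
    {"NickName": "alice", "Sex": 2},
    {"NickName": "bob", "Sex": 1},
]
me, others = split_self(sample)
print(me["NickName"], len(others))
```

In a real run, split_self(fetch_friends()) would replace the sample list.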

from collections import Counter
import matplotlib.pyplot as plt

def analyseSex(friends):
    # itchat encodes Sex as 0 (unknown), 1 (male), 2 (female)
    sexs = [friend['Sex'] for friend in friends[1:]]
    counter = Counter(sexs)
    # Look counts up by code so they always line up with the labels
    counts = [counter.get(code, 0) for code in (0, 1, 2)]
    labels = ['Unknown', 'Male', 'Female']
    colors = ['red', 'yellowgreen', 'lightskyblue']
    plt.figure(figsize=(8,5), dpi=80)
    plt.axes(aspect=1)
    plt.pie(counts, labels=labels, colors=colors, labeldistance=1.1,
            autopct='%3.1f%%', shadow=False, startangle=90, pctdistance=0.6)
    plt.legend(loc='upper right')
    plt.title(u"Gender distribution of %s's WeChat friends" % friends[0]['NickName'])
    plt.show()
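
Pairing counts with labels is the step most likely to go wrong here, since Counter yields keys in first-seen order rather than by gender code. Below is a minimal sketch of an order-safe count, assuming the codes 0, 1 and 2 mean unknown, male and female. The helper name count_by_sex and the sample data are illustrative.

```python
from collections import Counter

# Assumed meaning of the integer codes in the Sex field.
SEX_LABELS = {0: "Unknown", 1: "Male", 2: "Female"}


def count_by_sex(friends):
    """Return (labels, counts) in a fixed code order, including zero counts."""
    counter = Counter(f.get("Sex", 0) for f in friends)
    codes = sorted(SEX_LABELS)
    labels = [SEX_LABELS[code] for code in codes]
    counts = [counter.get(code, 0) for code in codes]
    return labels, counts


# Made-up records; only the Sex field matters for this step.
sample = [{"Sex": 1}, {"Sex": 2}, {"Sex": 1}, {"Sex": 0}, {"Sex": 1}]
labels, counts = count_by_sex(sample)
print(dict(zip(labels, counts)))
```

The returned lists can be passed straight to plt.pie as counts and labels.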

Avatar analysis downloads each friend's avatar, uses the Tencent Youtu SDK to detect faces and extract image tags, then visualises the share of face avatars with a pie chart and builds a word cloud from the image tags:

import os
import time
import itchat
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from wordcloud import WordCloud

def analyseHeadImage(friends):
    # Save avatars under ./HeadImages (os.path.join keeps the path portable)
    baseFolder = os.path.join(os.path.abspath('.'), 'HeadImages')
    if not os.path.exists(baseFolder):
        os.makedirs(baseFolder)
    # FaceAPI is not defined in this excerpt; it appears to be the author's
    # wrapper around the Tencent Youtu SDK
    faceApi = FaceAPI()
    use_face = 0
    not_use_face = 0
    image_tags = []
    for index in range(1, len(friends)):
        friend = friends[index]
        imgFile = os.path.join(baseFolder, 'Image%s.jpg' % index)
        # Download only avatars that are not cached on disk yet
        if not os.path.exists(imgFile):
            imgData = itchat.get_head_img(userName=friend['UserName'])
            with open(imgFile, 'wb') as file:
                file.write(imgData)
        time.sleep(1)  # throttle requests
        if faceApi.detectFace(imgFile):
            use_face += 1
        else:
            not_use_face += 1
        result = faceApi.extractTags(imgFile)
        # Collect tags in a list so tags of different friends stay separated
        image_tags.extend(x['tag_name'] for x in result)
    labels = ['Face avatar', 'Non-face avatar']
    counts = [use_face, not_use_face]
    colors = ['red', 'yellowgreen']
    plt.figure(figsize=(8,5), dpi=80)
    plt.axes(aspect=1)
    plt.pie(counts, labels=labels, colors=colors, labeldistance=1.1,
            autopct='%3.1f%%', shadow=False, startangle=90, pctdistance=0.6)
    plt.legend(loc='upper right')
    plt.title(u"Face avatar usage among %s's WeChat friends" % friends[0]['NickName'])
    plt.show()
    # The tags appear to arrive as UTF-8 bytes decoded as ISO-8859-1; re-decode them
    image_tags = ','.join(image_tags).encode('iso8859-1').decode('utf-8')
    # face.jpg is the mask image that shapes the word cloud
    back_coloring = np.array(Image.open('face.jpg'))
    wordcloud = WordCloud(font_path='simfang.ttf', background_color='white',
                          max_words=1200, mask=back_coloring, max_font_size=75,
                          random_state=45, width=800, height=480, margin=15)
    wordcloud.generate(image_tags)
    plt.imshow(wordcloud)
    plt.axis('off')
    plt.show()
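
The tag aggregation behind the word cloud can be tried without the Youtu SDK. The sketch below counts tag occurrences across avatars, assuming each tag result is a list of dicts with a tag_name key, as the loop above implies. The function name tag_frequencies and the sample tags are made up for illustration.

```python
from collections import Counter


def tag_frequencies(tag_lists):
    """Count how often each tag appears across all avatars."""
    counter = Counter()
    for tags in tag_lists:
        # Each entry is assumed to look like {'tag_name': ...}.
        counter.update(t["tag_name"] for t in tags if t.get("tag_name"))
    return dict(counter)


# Made-up results standing in for three tag-extraction responses.
sample = [
    [{"tag_name": "girl"}, {"tag_name": "smile"}],
    [{"tag_name": "cat"}],
    [{"tag_name": "girl"}],
]
freqs = tag_frequencies(sample)
print(freqs)
```

With the wordcloud package installed, such a dict could be passed to WordCloud.generate_from_frequencies instead of joining the tags into one string for generate.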

Signature analysis extracts the Signature field, pulls keywords with jieba for a word cloud, and scores each signature with SnowNLP, grouping the scores into negative (below 0.33), neutral (0.33 to 0.66) and positive (above 0.66):

import re
import jieba.analyse
import matplotlib.pyplot as plt
from snownlp import SnowNLP

def analyseSignature(friends):
    signatures = ''
    emotions = []
    for friend in friends[1:]:  # skip the logged-in user, as in the other analyses
        signature = friend['Signature']
        if signature:
            # Strip what appear to be remnants of emoji markup (span/class/emoji1f...)
            signature = signature.strip().replace('span','').replace('class','').replace('emoji','')
            signature = re.sub(r'1f(\d.+)', '', signature)
            if len(signature) > 0:
                nlp = SnowNLP(signature)
                emotions.append(nlp.sentiments)  # score in [0, 1]
                # Trailing space keeps keywords of different friends apart
                signatures += ' '.join(jieba.analyse.extract_tags(signature, 5)) + ' '
    # word cloud generation omitted for brevity
    count_good = len(list(filter(lambda x: x>0.66, emotions)))
    count_normal = len(list(filter(lambda x: 0.33<=x<=0.66, emotions)))
    count_bad = len(list(filter(lambda x: x<0.33, emotions)))
    labels = ['Negative', 'Neutral', 'Positive']
    values = (count_bad, count_normal, count_good)
    # One colour per bar; the string 'rgb' is not a valid colour spec
    plt.bar(range(3), values, color=['r', 'g', 'b'])
    plt.xticks(range(3), labels)
    plt.title(u"Sentiment analysis of %s's WeChat friends' signatures" % friends[0]['NickName'])
    plt.show()
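
The cleaning and bucketing steps can be separated from SnowNLP and tested on their own. The sketch below assumes that emoji arrive as empty span tags with a class starting with "emoji". That markup shape is an assumption, and the names clean_signature and bucket_sentiments are illustrative. The thresholds are the 0.33 and 0.66 cut-offs used above.

```python
import re

# Assumed shape of the emoji markup, e.g. <span class="emoji emoji1f60a"></span>
EMOJI_SPAN = re.compile(r'<span class="emoji[^"]*"></span>')


def clean_signature(text):
    """Drop emoji markup and surrounding whitespace from a signature."""
    return EMOJI_SPAN.sub("", text).strip()


def bucket_sentiments(scores, low=0.33, high=0.66):
    """Return (negative, neutral, positive) counts for scores in [0, 1]."""
    negative = sum(1 for s in scores if s < low)
    positive = sum(1 for s in scores if s > high)
    neutral = len(scores) - negative - positive
    return negative, neutral, positive


cleaned = clean_signature('Carpe diem <span class="emoji emoji1f60a"></span>')
buckets = bucket_sentiments([0.1, 0.5, 0.9, 0.66, 0.33])
print(cleaned, buckets)
```

Removing the whole tag in one pass avoids deleting the words "span", "class" or "emoji" when they appear in a signature's actual text.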

Location analysis extracts Province and City and writes them to a CSV file, which is then visualised as a geographic distribution in Baidu BDP:

import csv

def analyseLocation(friends):
    # Export one row per friend; the resulting CSV is the input for Baidu BDP
    headers = ['NickName','Province','City']
    with open('location.csv','w',encoding='utf-8',newline='') as csvFile:
        writer = csv.DictWriter(csvFile, headers)
        writer.writeheader()
        for friend in friends[1:]:  # skip the logged-in user
            row = {'NickName': friend['NickName'],
                   'Province': friend['Province'],
                   'City': friend['City']}
            writer.writerow(row)
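
Before uploading the file anywhere, a quick local tally can confirm the export looks sensible. This is a minimal sketch that reads a CSV with the same three columns and ranks provinces by friend count. The function name count_provinces, the "(blank)" placeholder and the sample rows are all made up for illustration.

```python
import csv
import io
from collections import Counter


def count_provinces(csv_file, top=5):
    """Return the most common provinces from a NickName/Province/City CSV."""
    reader = csv.DictReader(csv_file)
    # Friends who left the field blank are grouped under a placeholder label.
    counter = Counter((row["Province"] or "(blank)") for row in reader)
    return counter.most_common(top)


# In-memory stand-in for location.csv with made-up rows.
sample = io.StringIO(
    "NickName,Province,City\n"
    "alice,Guangdong,Shenzhen\n"
    "bob,Guangdong,Guangzhou\n"
    "carol,,\n"
    "dave,Shaanxi,Xi'an\n"
)
ranking = count_provinces(sample)
print(ranking)
```

For the real file, open('location.csv', encoding='utf-8', newline='') would replace the StringIO object.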

The article concludes that visualisation is a means rather than an end, encouraging readers to derive meaningful insights from the generated charts and word clouds.
