
Analyzing Mid‑Autumn Festival Mooncake Sales on Taobao with Python

This article demonstrates how to collect, clean, and visualize Taobao mooncake sales data using Python libraries such as Pandas, Pyecharts, jieba and collections, revealing top‑selling flavors, regional distribution, price ranges and shop rankings through step‑by‑step data‑processing and charting techniques.

Python Programming Learning Circle

The Mid‑Autumn Festival is a traditional Chinese holiday whose customs include eating mooncakes. This tutorial analyzes mooncake sales data from Taobao to identify the most popular flavors and the regions where sales are concentrated.

Data is processed with Python libraries: Pandas for data handling, Pyecharts for visualization, jieba for text segmentation, and collections for statistical aggregation.

1. Import Modules – Load the required libraries.

2. Data Processing with Pandas
2.1 Read the raw CSV data.
2.2 Remove duplicate entries, reducing 4,520 rows to 1,885 unique records.
2.3 Handle missing values, especially empty buyer‑count fields.
2.4 Convert buyer counts expressed with the Chinese "万" (ten thousand) unit back to numeric values, then estimate sales as price × number of buyers.
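
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the article's original code: the column names (`title`, `price`, `buyers`), the sample rows, the helper `parse_buyers`, and the choice to treat a missing buyer count as zero are all assumptions. In practice the frame would come from `pd.read_csv` on the scraped file rather than from an inline dictionary.

```python
import re

import pandas as pd

# Hypothetical sample rows standing in for the scraped CSV (2.1).
# Real column names and string formats in the Taobao export may differ.
raw = pd.DataFrame({
    "title": ["Lava mooncake gift box", "Lava mooncake gift box",
              "Five-nut mooncake", "Red-bean mooncake"],
    "price": [59.0, 59.0, 35.5, 20.0],
    "buyers": ["1.5万+人付款", "1.5万+人付款", "300+人付款", None],
})


def parse_buyers(text: str) -> int:
    """Turn strings like '1.5万+人付款' or '300+人付款' into an integer count."""
    match = re.search(r"(\d+(?:\.\d+)?)(万?)", text)
    if match is None:
        return 0
    number = float(match.group(1))
    if match.group(2):  # 万 means ten thousand
        number *= 10_000
    return int(round(number))


# 2.2 Drop exact duplicate rows.
df = raw.drop_duplicates().reset_index(drop=True)

# 2.3 Fill missing buyer counts; treating them as zero is an assumption.
df["buyers"] = df["buyers"].fillna("0人付款")

# 2.4 Convert the buyer strings to numbers and estimate sales.
df["buyers"] = df["buyers"].map(parse_buyers)
df["sales"] = df["price"] * df["buyers"]

print(df)
```

Because Taobao reports buyer counts as rounded lower bounds (for example "300+"), the resulting `sales` column is an estimate rather than an exact figure.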

3. Visualization with Pyecharts
3.1 Plot the top‑10 mooncake products by sales.
3.2 Show the top‑10 shops selling mooncakes.
3.3 Map nationwide sales distribution, highlighting concentrations in Beijing, Shandong, Zhejiang, Guangdong and Yunnan.
3.4 Display sales proportion across price intervals, noting that over 50% of sales are for mooncakes under ¥50 and 85% under ¥100.
3.5 Visualize flavor distribution with a bar chart and a word‑cloud, emphasizing popular flavors such as lava, five‑nut, egg‑yolk lotus‑seed paste and red‑bean paste.
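
Each chart in this step is driven by a small aggregated table, so most of the work is in preparing that data. The sketch below shows one plausible way to compute the rankings, price-interval shares, and flavor counts with Pandas and `collections.Counter`. The sample rows, column names, price bands, and flavor keyword list are hypothetical, and the made-up numbers do not reproduce the article's figures. For brevity it substitutes simple substring matching on English titles for the jieba segmentation the article applies to Chinese titles.

```python
from collections import Counter

import pandas as pd

# Hypothetical cleaned data; real titles would be Chinese and far more numerous.
df = pd.DataFrame({
    "title": ["Lava custard mooncake", "Five-nut mooncake",
              "Egg-yolk lotus-seed paste mooncake", "Red-bean paste mooncake",
              "Lava mooncake gift box"],
    "shop": ["Shop A", "Shop B", "Shop A", "Shop C", "Shop B"],
    "price": [45.0, 30.0, 88.0, 120.0, 49.0],
    "sales": [9000.0, 3000.0, 4400.0, 1200.0, 4900.0],
})

# 3.1 Top products by estimated sales.
top_products = df.nlargest(10, "sales")[["title", "sales"]]

# 3.2 Top shops by total estimated sales.
top_shops = (
    df.groupby("shop")["sales"].sum().sort_values(ascending=False).head(10)
)

# 3.4 Share of sales per price interval (left-closed bins, assumed boundaries).
bins = [0, 50, 100, float("inf")]
labels = ["under 50", "50 to 100", "100 and up"]
df["price_band"] = pd.cut(df["price"], bins=bins, labels=labels, right=False)
band_share = (
    df.groupby("price_band", observed=False)["sales"].sum() / df["sales"].sum()
)

# 3.5 Count how many titles mention each flavor keyword.
flavors = ["lava", "five-nut", "egg-yolk", "red-bean"]
flavor_counts = Counter(
    flavor
    for title in df["title"].str.lower()
    for flavor in flavors
    if flavor in title
)

# (name, value) pairs are a convenient shape to hand to a charting library.
shop_pairs = list(zip(top_shops.index.tolist(), top_shops.tolist()))

print(top_products, band_share, flavor_counts.most_common(), shop_pairs, sep="\n")
```

Lists of names and values like `shop_pairs` or `flavor_counts.most_common()` can then be passed to the chart classes in Pyecharts (bar, pie, map, word cloud) to produce the figures described above.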

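The analysis concludes that a handful of flavors (lava, five‑nut, egg‑yolk lotus‑seed paste and red‑bean paste) dominate the market and that most sales come from mooncakes priced under ¥100, findings that sellers and marketers can use when planning their product range and pricing.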

Tags: big data, Python, data analysis, visualization, pandas, pyecharts, mooncake, sales
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
