Scraping Stock Data with Python: From Extracting Stock Codes to Saving Results in Excel
This tutorial shows how to scrape stock data for all listed companies over a specified date range, save the results in one file per stock code, and optionally load them into a database.
The project draws on three skills: analyzing web pages, writing Python network programs, and working with regular expressions and Excel.
Step 1 – Scrape Stock Codes: Identify a website that lists stock codes (e.g., http://quote.eastmoney.com/stocklist.html). Open the browser's developer tools (F12), view the page source, and locate the HTML element that contains each code. Every stock code appears in the same fixed HTML pattern, so a regular-expression template such as SS(.*?) can capture it. Write a Python function (e.g., urlTolist) that fetches the page with urllib.request.urlopen, reads and decodes the content, compiles the regex, and extracts all matching codes, keeping only those that start with a valid prefix (6 for Shanghai; 0 and 3 for Shenzhen). The function returns a list of stock codes. Printing the first ten entries is a quick check: they should all belong to the Shanghai Stock Exchange.
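Step 1 might be sketched as follows. This is a minimal illustration, not the original code: the link markup in CODE_PATTERN, the gbk encoding, and the helper name extract_codes are all assumptions, and the pattern will need adjusting to whatever the page source actually shows. Only urlTolist and the prefix filter come from the description above.

```python
import re
import urllib.request

# ASSUMPTION: each code appears in a link such as
#   href="http://quote.eastmoney.com/sh600000.html"
# where two non-space characters (the exchange prefix) precede the code.
CODE_PATTERN = re.compile(r'href="http://quote\.eastmoney\.com/\S\S(.*?)\.html"')

# Codes starting with 6 are Shanghai; 0 and 3 are Shenzhen.
VALID_PREFIXES = ("6", "0", "3")


def extract_codes(html):
    """Return the stock codes found in html, keeping only valid prefixes."""
    # str.startswith accepts a tuple, so one call checks all three prefixes.
    return [c for c in CODE_PATTERN.findall(html) if c.startswith(VALID_PREFIXES)]


def urlTolist(url, encoding="gbk"):
    """Fetch the listing page and return the stock codes it contains."""
    # ASSUMPTION: the page is gbk-encoded; undecodable bytes are skipped.
    with urllib.request.urlopen(url) as response:
        html = response.read().decode(encoding, errors="ignore")
    return extract_codes(html)
```

Calling urlTolist("http://quote.eastmoney.com/stocklist.html") and printing the first ten items of the result gives the verification step described above. Keeping the parsing in a separate function means the regex can be tested against saved page source without touching the network.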
Step 2 – Scrape Stock Content: Use the stock codes to call an external service (e.g., the NetEase API) that returns detailed price data for a specified period. Iterate over all codes and, for each one, request the data and save the response as a file that Excel can open. urllib.request.urlretrieve downloads a URL straight to a local file, so a separate urllib.request.urlopen call is only needed to inspect the response first. Set the time window (e.g., 2016-11-30 to 2016-12-31) and make sure the destination folder (e.g., D:\all_stock_data) exists before running the script. The script produces one file of historical data per stock code, which can later be imported into MySQL or another storage system.
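The download loop in Step 2 could look roughly like this. The text above does not give the service URL, so BASE_URL, the query parameter names, the market digit placed before each code, and the .csv extension are all assumptions that should be checked against the service before use. build_url and download_all are hypothetical helper names.

```python
import os
import urllib.parse
import urllib.request

# ASSUMPTION: endpoint and parameter names are not stated in the text above.
BASE_URL = "http://quotes.money.163.com/service/chddata.html"


def build_url(code, start, end):
    """Build the download URL for one stock code and a YYYYMMDD date range."""
    # ASSUMPTION: the service expects a market digit before the code
    # (0 for Shanghai codes starting with 6, 1 for everything else).
    market = "0" if code.startswith("6") else "1"
    query = urllib.parse.urlencode({"code": market + code, "start": start, "end": end})
    return BASE_URL + "?" + query


def download_all(codes, start, end, folder):
    """Save one file per stock code into folder; return the codes that failed."""
    os.makedirs(folder, exist_ok=True)  # create the destination folder if missing
    failed = []
    for code in codes:
        # ASSUMPTION: the response is CSV text, which Excel opens directly.
        target = os.path.join(folder, code + ".csv")
        try:
            urllib.request.urlretrieve(build_url(code, start, end), target)
        except OSError:  # URLError and HTTPError are subclasses of OSError
            failed.append(code)
    return failed
```

A call such as download_all(codes, "20161130", "20161231", r"D:\all_stock_data") would then cover the example window. Collecting failures instead of stopping at the first error lets a long run finish and makes it easy to retry only the codes that did not download.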
With these techniques in hand, readers can adapt the scraper to collect other web-based information they need.
Python Programming Learning Circle