Python Script for Analyzing Lottery Historical Data and Generating Excel Reports
This article demonstrates how to use Python to scrape historical lottery data, apply a custom pattern‑matching algorithm to identify winning opportunities, aggregate the results over multiple days, and export the analysis to an Excel file for further review.
The article introduces a lottery‑analysis task that requires fetching historical data for two games ("极速飞艇" and "极速赛车"), applying a rule‑based strategy where matching numbers in the previous two draws determine the next bet, and counting how often a streak of six or more bets leads to a win.
Data acquisition : The author uses the request URL https://api.api68.com/pks/getPksHistoryList.do?date=2021-{date}&lotCode=10037 with appropriate User‑Agent headers, detects the response encoding via chardet , and parses the JSON payload.
Utility function for reversing a list:
def reverse_list(lst):
"""Reverse the order of a list.
:param lst: original list
:return: reversed list"""
return [ele for ele in reversed(lst)]Preparation of date list to iterate over a month of data:
data_list = []
for h in range(31):
data_list.append(f'1-{h+1}')
for h in range(28):
data_list.append(f'2-{h+1}')
for h in range(31):
data_list.append(f'3-{h+1}')
for h in range(20):
data_list.append(f'4-{h+1}')Main analysis loop processes each day’s data, extracts the three most recent draw codes, and checks the custom betting rule. It tracks total "ladder" counts (N), successful hits (n), and records the streak length (record_number) together with the corresponding timestamps in dict_time_record :
for data in data_list:
low_list = ["01","02","03","04","05"]
high_list = ["06","07","08","09","10"]
N = 0
n = 0
url = f'https://api.api68.com/pks/getPksHistoryList.do?date=2021-{data}&lotCode=10037'
response = requests.get(url=url, headers=headers)
response.encoding = chardet.detect(response.content)['encoding']
new_response = json.loads(response.text)
# reverse data order for chronological processing
for k in range(1152):
if k < 1150:
new_result1 = reverse_list(new_response["result"]["data"])[k]
new_result2 = reverse_list(new_response["result"]["data"])[k+1]
new_result3 = reverse_list(new_response["result"]["data"])[k+2]
data1 = new_result1['preDrawCode'].split(',')
data2 = new_result2['preDrawCode'].split(',')
data3 = new_result3['preDrawCode'].split(',')
for m in range(10):
if m == 0:
if data2[0] == data1[1]:
N += 1
if (data2[0] in low_list and data3[0] in low_list) or (data2[0] in high_list and data3[0] in high_list):
n += 1
list_data_number.append(record_number)
if f"{record_number}" in dict_time_record:
dict_time_record[f"{record_number}"].append(new_result3['preDrawTime'][11:])
else:
dict_time_record[f"{record_number}"] = [new_result3['preDrawTime'][11:]]
record_number = 1
else:
record_number += 1
break
elif m == 9:
if data2[9] == data1[8]:
N += 1
if (data2[9] in low_list and data3[9] in low_list) or (data2[9] in high_list and data3[9] in high_list):
n += 1
list_data_number.append(record_number)
if f"{record_number}" in dict_time_record:
dict_time_record[f"{record_number}"].append(new_result3['preDrawTime'][11:])
else:
dict_time_record[f"{record_number}"] = [new_result3['preDrawTime'][11:]]
record_number = 1
else:
record_number += 1
break
else:
if data2[m] == data1[m+1] or data2[m] == data1[m-1]:
N += 1
if (data2[m] in low_list and data3[m] in low_list) or (data2[m] in high_list and data3[m] in high_list):
n += 1
list_data_number.append(record_number)
if f"{record_number}" in dict_time_record:
dict_time_record[f"{record_number}"].append(new_result3['preDrawTime'][11:])
else:
dict_time_record[f"{record_number}"] = [new_result3['preDrawTime'][11:]]
record_number = 1
else:
record_number += 1
break
# write daily summary to Excel
sh.write(sheet_seek_position, 0, new_response['result']['data'][0]['preDrawTime'][:10])
sh.write(sheet_seek_position, 1, N)
sh.write(sheet_seek_position, 2, n)
sh.write(sheet_seek_position, 3, N - n)
# further aggregation of streak lengths
list_data_number.sort()
dict_record = {}
for i in list_data_number:
if f"{i}" in dict_record:
dict_record[f"{i}"] += 1
else:
dict_record[f"{i}"] = 1
for j in dict_record:
if int(j) >= 6 and int(j) < 15:
sh.write(sheet_seek_position, 4 + (int(j)-6)*2, dict_record[j])
times = ", ".join(dict_time_record[j])
sh.write(sheet_seek_position, 5 + (int(j)-6)*2, times[:-2])
sheet_seek_position += 1
print(f"Date:{new_response['result']['data'][0]['preDrawTime'][:10]}, total ladders:{N}, hits:{n}, misses:{N-n}")
wb.save('极速飞艇彩票分析结果.xls')After processing all dates, the script prints summary statistics for each day and aggregates how many times a streak of six or more consecutive bets resulted in a win, displaying the corresponding timestamps. The final Excel file contains columns for date, total ladder count, hit count, miss count, and detailed streak information.
Conclusion : The author reflects that, although the task is straightforward, it reinforced several Python programming techniques such as web requests, JSON handling, list manipulation, and Excel file generation, providing a practical example of data‑driven lottery analysis.
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.