Fundamentals 15 min read

Python Script for Analyzing Lottery Historical Data and Generating Excel Reports

This article demonstrates how to use Python to scrape historical lottery data, apply a custom pattern‑matching algorithm to identify winning opportunities, aggregate the results over multiple days, and export the analysis to an Excel file for further review.

Python Programming Learning Circle

Jun 1, 2021

Python Script for Analyzing Lottery Historical Data and Generating Excel Reports

The article introduces a lottery‑analysis task that requires fetching historical data for two games ("极速飞艇" and "极速赛车"), applying a rule‑based strategy where matching numbers in the previous two draws determine the next bet, and counting how often a streak of six or more bets leads to a win.

Data acquisition : The author uses the request URL

https://api.api68.com/pks/getPksHistoryList.do?date=2021-{date}&lotCode=10037

with appropriate User‑Agent headers, detects the response encoding via chardet, and parses the JSON payload.

Utility function for reversing a list:

def reverse_list(lst):
    """Reverse the order of a list.
    :param lst: original list
    :return: reversed list"""
    return [ele for ele in reversed(lst)]

Preparation of date list to iterate over a month of data:

data_list = []
for h in range(31):
    data_list.append(f'1-{h+1}')
for h in range(28):
    data_list.append(f'2-{h+1}')
for h in range(31):
    data_list.append(f'3-{h+1}')
for h in range(20):
    data_list.append(f'4-{h+1}')

Main analysis loop processes each day’s data, extracts the three most recent draw codes, and checks the custom betting rule. It tracks total "ladder" counts (N), successful hits (n), and records the streak length (record_number) together with the corresponding timestamps in dict_time_record:

for data in data_list:
    low_list = ["01","02","03","04","05"]
    high_list = ["06","07","08","09","10"]
    N = 0
    n = 0
    url = f'https://api.api68.com/pks/getPksHistoryList.do?date=2021-{data}&lotCode=10037'
    response = requests.get(url=url, headers=headers)
    response.encoding = chardet.detect(response.content)['encoding']
    new_response = json.loads(response.text)
    # reverse data order for chronological processing
    for k in range(1152):
        if k < 1150:
            new_result1 = reverse_list(new_response["result"]["data"])[k]
            new_result2 = reverse_list(new_response["result"]["data"])[k+1]
            new_result3 = reverse_list(new_response["result"]["data"])[k+2]
            data1 = new_result1['preDrawCode'].split(',')
            data2 = new_result2['preDrawCode'].split(',')
            data3 = new_result3['preDrawCode'].split(',')
            for m in range(10):
                if m == 0:
                    if data2[0] == data1[1]:
                        N += 1
                        if (data2[0] in low_list and data3[0] in low_list) or (data2[0] in high_list and data3[0] in high_list):
                            n += 1
                            list_data_number.append(record_number)
                            if f"{record_number}" in dict_time_record:
                                dict_time_record[f"{record_number}"].append(new_result3['preDrawTime'][11:])
                            else:
                                dict_time_record[f"{record_number}"] = [new_result3['preDrawTime'][11:]]
                            record_number = 1
                        else:
                            record_number += 1
                        break
                elif m == 9:
                    if data2[9] == data1[8]:
                        N += 1
                        if (data2[9] in low_list and data3[9] in low_list) or (data2[9] in high_list and data3[9] in high_list):
                            n += 1
                            list_data_number.append(record_number)
                            if f"{record_number}" in dict_time_record:
                                dict_time_record[f"{record_number}"].append(new_result3['preDrawTime'][11:])
                            else:
                                dict_time_record[f"{record_number}"] = [new_result3['preDrawTime'][11:]]
                            record_number = 1
                        else:
                            record_number += 1
                        break
                else:
                    if data2[m] == data1[m+1] or data2[m] == data1[m-1]:
                        N += 1
                        if (data2[m] in low_list and data3[m] in low_list) or (data2[m] in high_list and data3[m] in high_list):
                            n += 1
                            list_data_number.append(record_number)
                            if f"{record_number}" in dict_time_record:
                                dict_time_record[f"{record_number}"].append(new_result3['preDrawTime'][11:])
                            else:
                                dict_time_record[f"{record_number}"] = [new_result3['preDrawTime'][11:]]
                            record_number = 1
                        else:
                            record_number += 1
                        break
    # write daily summary to Excel
    sh.write(sheet_seek_position, 0, new_response['result']['data'][0]['preDrawTime'][:10])
    sh.write(sheet_seek_position, 1, N)
    sh.write(sheet_seek_position, 2, n)
    sh.write(sheet_seek_position, 3, N - n)
    # further aggregation of streak lengths
    list_data_number.sort()
    dict_record = {}
    for i in list_data_number:
        if f"{i}" in dict_record:
            dict_record[f"{i}"] += 1
        else:
            dict_record[f"{i}"] = 1
    for j in dict_record:
        if int(j) >= 6 and int(j) < 15:
            sh.write(sheet_seek_position, 4 + (int(j)-6)*2, dict_record[j])
            times = ", ".join(dict_time_record[j])
            sh.write(sheet_seek_position, 5 + (int(j)-6)*2, times[:-2])
    sheet_seek_position += 1
print(f"Date:{new_response['result']['data'][0]['preDrawTime'][:10]}, total ladders:{N}, hits:{n}, misses:{N-n}")
wb.save('极速飞艇彩票分析结果.xls')

After processing all dates, the script prints summary statistics for each day and aggregates how many times a streak of six or more consecutive bets resulted in a win, displaying the corresponding timestamps. The final Excel file contains columns for date, total ladder count, hit count, miss count, and detailed streak information.

Conclusion : The author reflects that, although the task is straightforward, it reinforced several Python programming techniques such as web requests, JSON handling, list manipulation, and Excel file generation, providing a practical example of data‑driven lottery analysis.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Python Lottery data-analysis web-scraping

Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.