Fundamentals 12 min read

Practical Python Implementation of Credit Card User Profiling Using SQL and Data Visualization

This tutorial demonstrates a complete workflow for building credit‑card user profiles by first extracting and transforming the KDD99 dataset with SQL, then applying Python libraries such as pandas, matplotlib, and seaborn to perform descriptive statistics and produce visualizations of demographic, transaction, and financial characteristics.

Python Programming Learning Circle
Python Programming Learning Circle
Python Programming Learning Circle
Practical Python Implementation of Credit Card User Profiling Using SQL and Data Visualization

The article presents a practical case study of constructing credit‑card user profiles with Python, divided into two major sections: (1) an introduction to descriptive statistics using Python as the foundation for profiling, and (2) a hands‑on demonstration using a real credit‑card dataset.

In the first part, the author explains the basic plotting capabilities of matplotlib (including the pandas wrapper pandas.plot ) and introduces the more advanced seaborn library for richer visual styles, emphasizing their use for exploratory data analysis.

The second part focuses on the KDD99 dataset, a public financial dataset from a 1999 data‑mining competition. The workflow shows how to categorize data into static, point‑in‑time, and period windows, then extracts the necessary fields with a series of SQL statements. The key SQL scripts are:

<code>create table kdd99.var1 as
    select a.*,c.sex,datediff(a.issued,c.birth_date)/365 as age
    from kdd99.kdd99_card a
        left join kdd99.kdd99_disp b on a.disp_id=b.disp_id
        left join kdd99.kdd99_clients c on b.client_id=c.client_id;</code>
<code>create table kdd99.var2 as
  select a.*,c.date as trans_date,datediff(a.issued,c.date) as last_days,c.balance
  from kdd99.kdd99_card a
        left join kdd99.kdd99_disp b on a.disp_id=b.disp_id
        left join kdd99.kdd99_Trans c on b.account_id=c.account_id
  join (select card_id,max(trans_date) as max_date
        from kdd99.rfm_balance1
        group by card_id) d on a.card_id=d.card_id and c.date=d.max_date
  order by a.card_id;</code>
<code>create table kdd99.var3 as
  select a.card_id,
         count(trans_id) as trans_count,
         sum(if(c.type='借',amount,0)) as trans_in_amount,
         sum(if(c.type='贷',amount,0)) as trans_out_amount,
         sum(if(c.type='贷',amount,0))/avg(c.balance) as out_ratio,
         sum(if(c.type='贷',amount,0))/sum(if(c.type='借',amount,0)) as out_in_ratio
  from kdd99.kdd99_card a
    left join kdd99.kdd99_disp b on a.disp_id=b.disp_id
    left join kdd99.kdd99_Trans c on b.account_id=c.account_id and a.issued>c.date and c.date>a.issued-180
  group by a.card_id;</code>

Finally, the three intermediate tables are merged to produce a comprehensive Creditcard table:

<code>create table kdd99.Creditcard as
  select a.*,b.last_days,b.balance,c.trans_count,c.trans_in_amount,c.trans_out_amount,c.out_ratio,c.out_in_ratio
  from kdd99.var1 as a
        left join kdd99.var2 as b on a.card_id=b.card_id
        left join kdd99.var3 as c on a.card_id=c.card_id
  order by a.card_id;</code>

With the prepared data, Python visualizations are generated. Examples include a bar chart of credit‑card type counts, box‑plots of age distribution, transaction count, and account balance, and a gender‑vs‑card‑type cross‑tab bar chart. Relevant Python snippets are:

<code>from pylab import mpl
mpl.rcParams['font.sans-serif'] = ['Songti SC']
mpl.rcParams['axes.unicode_minus'] = False
Creditcard.type.value_counts().plot(kind='bar')
</code>
<code>import seaborn as sns
sns.boxplot(x='type', y='age', data=Creditcard)
</code>
<code>pd.crosstab(Creditcard.sex, Creditcard.type).plot(kind='bar')
</code>

These visualizations reveal that ordinary cards dominate issuance, age distributions differ across card types, and gender patterns vary by card category, providing a clear, data‑driven user profile.

The tutorial demonstrates how to combine SQL data extraction with Python analytics and visualization to build comprehensive user profiles for credit‑card customers.

SQLuser profilingData Visualizationseaborncredit card
Python Programming Learning Circle
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.