
Building a Simple Celebrity Face Matching System Using Baidu AI API and MySQL in Python

This article describes how to create a Python‑based face‑matching tool that crawls celebrity photos, stores them in a MySQL database, and uses Baidu's AI face‑compare API to identify a person from an uploaded image, presenting the matching results with similarity scores.

Python Programming Learning Circle

The author was inspired by movie scenes that instantly identify a person from a photo and decided to build a small prototype that can match a given image against a database of celebrity faces.

The workflow consists of five main steps:

Collect celebrity data by crawling online sources for images and related information.

Normalize the data and store the image file names and metadata in a MySQL table.

Encode each image in base64 and send the pair to Baidu's face‑compare API for similarity scoring.

Upload a target image, compare it with every entry in the database, and record the similarity scores.

Output the best match along with the corresponding celebrity information and similarity percentage.
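Step 2 could look roughly like the sketch below. The article does not list the real table columns, so the `face` schema here (`name`, `profession`, `birthplace`, `image`) and the helper names are hypothetical, and an in-memory `sqlite3` database stands in for MySQL so that the example runs without a server.

```python
import sqlite3

# Hypothetical schema: the article does not list the real columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS face (
    name       TEXT NOT NULL,
    profession TEXT,
    birthplace TEXT,
    image      TEXT NOT NULL UNIQUE
)
"""


def normalize_filename(name):
    """Lower-case a file name and replace whitespace with underscores."""
    return "_".join(name.strip().lower().split())


def store_celebrities(conn, records):
    """Insert (name, profession, birthplace, image) tuples, skipping duplicates."""
    conn.execute(SCHEMA)
    rows = [(n, p, b, normalize_filename(img)) for n, p, b, img in records]
    # Placeholders keep the values out of the SQL text itself.
    conn.executemany(
        "INSERT OR IGNORE INTO face (name, profession, birthplace, image) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()


def lookup_by_image(conn, image):
    """Return the metadata row for one image file name, or None."""
    cur = conn.execute(
        "SELECT name, profession, birthplace FROM face WHERE image = ?",
        (image,),
    )
    return cur.fetchone()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store_celebrities(conn, [
        ("Example Star", "singer", "Example City", "Example Star.JPG"),
        ("Example Star", "singer", "Example City", "example star.jpg"),
    ])
    print(lookup_by_image(conn, "example_star.jpg"))
    conn.close()
```

With MySQLdb the same idea applies, but the placeholder is `%s` instead of `?`, and MySQL spells the duplicate-skipping insert `INSERT IGNORE`.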

The core Python script (shown below) implements these steps, handling database connections, web requests, image encoding, and result parsing.

<code># -*- coding: utf-8 -*-
# Python 3 port of the original Python 2 script.
# Requires the mysqlclient package, which provides the MySQLdb module.
import base64
import json
from os import listdir
from urllib.parse import urlencode
from urllib.request import Request, urlopen

import MySQLdb


def conmysql():
    # Connection settings are the author's local test values; change them for your setup.
    conn = MySQLdb.connect(
        host='localhost',
        port=3306,
        user='root',
        passwd='123456',
        db='xxnlove',
        charset='utf8'
    )
    return conn


def read_base64(path):
    # Read an image file and return its contents as a base64 text string.
    with open(path, 'rb') as f:
        return base64.b64encode(f.read()).decode('ascii')


# Face comparison interface
def facecompar(image01, image02):
    matchUrl = "https://aip.baidubce.com/rest/2.0/face/v2/match"
    params = urlencode({"images": read_base64(image01) + ',' + read_base64(image02)})
    # Replace with your own access token.
    access_token = '24.1a060b87a0dfcab77317999d.25922220.1505832798.282335-10029360'
    matchUrl = matchUrl + "?access_token=" + access_token
    request = Request(url=matchUrl, data=params.encode('utf-8'))
    request.add_header('Content-Type', 'application/x-www-form-urlencoded')
    with urlopen(request) as response:
        content = response.read()
    if not content:
        return None
    content = json.loads(content)
    # Skip responses without a usable 'result' (for example, an API error).
    result = content.get('result')
    if not result:
        return None
    return result[0]['score']


def compare():
    # Compare the target image with every file in ./star/ and return
    # (score, filename) pairs for the comparisons that produced a score.
    matches = []
    for img in listdir('./star/'):
        similarvalue = facecompar('compar.jpg', './star/' + img)
        if similarvalue is not None:
            matches.append((similarvalue, img))
    return matches


if __name__ == "__main__":
    matches = compare()
    if not matches:
        raise SystemExit("No comparison produced a score")
    similarkey, starname = max(matches)
    conn = conmysql()
    cur = conn.cursor()
    # 'iamge' is kept as spelled in the original query; it must match your column name.
    cur.execute("select * from face where iamge=%s", (starname,))
    results = cur.fetchall()
    print("Compared against " + str(len(matches)) + " database records in total")
    for info in results:
        print("Matched celebrity info:", info[0], info[1], info[2], "Similarity:" + str(similarkey))
    conn.close()
</code>
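To see what the script actually sends to the face‑compare endpoint, the request body can be built and inspected without any network call. The sketch below is written for Python 3; `build_match_body` and `decode_match_body` are hypothetical helper names, and the comma-joined `images` field follows the script above rather than any official specification.

```python
import base64
from urllib.parse import parse_qs, urlencode


def build_match_body(image1_bytes, image2_bytes):
    """Return the URL-encoded POST body for two raw image byte strings."""
    encoded = [base64.b64encode(b).decode("ascii") for b in (image1_bytes, image2_bytes)]
    # The script joins the two base64 strings with a comma under 'images'.
    return urlencode({"images": ",".join(encoded)}).encode("ascii")


def decode_match_body(body):
    """Inverse of build_match_body, useful for checking what would be sent."""
    images = parse_qs(body.decode("ascii"))["images"][0]
    return [base64.b64decode(part) for part in images.split(",")]


if __name__ == "__main__":
    body = build_match_body(b"first image bytes", b"second image bytes")
    print(decode_match_body(body))
```

The URL-encoding step matters because base64 output can contain `+`, `/`, and `=`, which a form-encoded body would otherwise misinterpret. The comma is safe as a separator because it is not part of the base64 alphabet.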

Running the program on a test image matched the celebrity "曾轶可" with a similarity of 63.69%, and the script also printed the total number of comparisons performed. The author notes two limitations: inconsistent image sizes affect accuracy, and the core face‑matching logic relies on Baidu's API rather than a custom algorithm. The second point prompted the author to study data structures and algorithms further.
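A best score of 63.69% leaves some room for doubt, so listing the top few candidates may be more informative than reporting a single winner. The following is a minimal Python 3 sketch of that idea. `top_matches`, its `threshold` parameter, and the sample scores are all hypothetical, and only the 63.69 figure echoes the article.

```python
import heapq


def top_matches(scores, n=3, threshold=0.0):
    """Return up to n (score, filename) pairs, best first, at or above threshold.

    scores maps file name -> similarity score, or None for a failed comparison.
    """
    valid = [(s, name) for name, s in scores.items() if s is not None and s >= threshold]
    return heapq.nlargest(n, valid)


if __name__ == "__main__":
    # Made-up scores for illustration only.
    sample = {"a.jpg": 63.69, "b.jpg": 41.2, "c.jpg": None, "d.jpg": 63.69, "e.jpg": 12.0}
    for score, name in top_matches(sample, n=3):
        print(name, score)
```

Keeping `(score, filename)` pairs instead of a dictionary keyed by score means that two images with the same score both survive, and failed comparisons are dropped rather than ranked.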
