Hand Gesture Recognition Using OpenCV: Video Capture, Skin Detection, and Contour Processing in Python
This article demonstrates a Python-based hand‑gesture recognition pipeline using OpenCV, covering video capture, skin detection via YCrCb conversion, contour extraction, and visualization, with complete source code and step‑by‑step explanations suitable for beginners.
The author presents a beginner‑friendly project that implements hand‑gesture detection using Python and OpenCV. The workflow includes acquiring video frames, applying skin‑color detection, extracting hand contours, and displaying the results.
Video Capture
The video source is opened with cap = cv2.VideoCapture("C:/Users/lenovo/Videos/1.mp4") (or a webcam with cv2.VideoCapture(0)). A loop reads frames, resizes them, draws a rectangle to define the region of interest, and extracts that ROI for further processing.
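One detail worth noting here is the coordinate order: OpenCV's rectangle corners are (x, y) pairs, while NumPy slicing of a frame is [row, column], i.e. [y, x]. A minimal sketch with a synthetic frame (no camera or video file required; the sizes mirror the values used in the loop below):

```python
import numpy as np

# Stand-in for one resized frame: cv2.resize takes (width, height) = (400, 350),
# which yields a NumPy array of shape (height, width, channels) = (350, 400, 3)
src = np.zeros((350, 400, 3), np.uint8)

# cv2.rectangle corners (90, 60) and (300, 300) are (x, y) pairs,
# but NumPy slicing is [rows, cols] = [y, x], so the matching ROI is:
roi = src[60:300, 90:300]
print(roi.shape)  # (240, 210, 3)
```

Mixing up the two orders is a common source of off-by-one-axis bugs in OpenCV code, so it is worth checking the ROI shape once before building the rest of the pipeline.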
Skin Detection (Function A)
The ROI is converted from BGR to YCrCb color space to reduce the influence of illumination. The Cr channel is blurred, then Otsu’s thresholding isolates skin pixels. The mask is applied with cv2.bitwise_and to obtain a skin‑only image.
<code>def A(img):
    # Convert BGR to YCrCb; the Cr channel is relatively insensitive to illumination
    YCrCb = cv2.cvtColor(img, cv2.COLOR_BGR2YCR_CB)
    (y, cr, cb) = cv2.split(YCrCb)                 # split channels
    cr1 = cv2.GaussianBlur(cr, (5, 5), 0)          # smooth Cr before thresholding
    # Otsu's method selects the threshold automatically
    _, skin = cv2.threshold(cr1, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    res = cv2.bitwise_and(img, img, mask=skin)     # keep only the skin pixels
    return res</code>
Contour Extraction (Function B)
After the skin-detected image is converted to grayscale and passed through a Laplacian filter, contours are found with cv2.findContours. The contours are sorted by area, and the largest one (assumed to be the hand) is drawn on a white canvas.
<code>def B(img):
    # findContours returns (contours, hierarchy) in OpenCV 4 and
    # (image, contours, hierarchy) in OpenCV 3; indexing [-2] works for both
    h = cv2.findContours(img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    contour = h[-2]
    contour = sorted(contour, key=cv2.contourArea, reverse=True)
    bg = np.ones(img.shape, np.uint8) * 255        # white background, same size as input
    # wrap the largest contour in a list so it is drawn as a single contour,
    # not as a set of individual points
    ret = cv2.drawContours(bg, [contour[0]], -1, (0, 0, 0), 3)
    return ret</code>
Main Loop
The script continuously reads frames, extracts the ROI, runs A(roi) for skin detection, converts the result to grayscale, applies a Laplacian filter, then calls B(Laplacian) to obtain the hand contour. The original ROI, skin mask, and contour image are displayed in separate windows. Pressing ‘q’ exits the loop.
<code>import cv2
import numpy as np

cap = cv2.VideoCapture(0)  # or a video file path

while True:
    ret, frame = cap.read()
    if not ret:                # stop when the stream ends or a frame read fails
        break
    src = cv2.resize(frame, (400, 350), interpolation=cv2.INTER_CUBIC)
    cv2.rectangle(src, (90, 60), (300, 300), (0, 255, 0))
    roi = src[60:300, 90:300]
    res = A(roi)               # skin detection
    cv2.imshow("0", roi)
    cv2.imshow("1", res)       # skin-detection result
    gray = cv2.cvtColor(res, cv2.COLOR_BGR2GRAY)
    dst = cv2.Laplacian(gray, cv2.CV_16S, ksize=3)
    Laplacian = cv2.convertScaleAbs(dst)   # scale back to 8-bit
    contour = B(Laplacian)
    cv2.imshow("2", contour)
    key = cv2.waitKey(50) & 0xFF
    if key == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()</code>
The article also includes animated GIFs and images illustrating the final hand-gesture detection result, as well as promotional material for a free Python course, which is not part of the technical content.