TacRefineNet: A Tactile‑Driven Model for Millimeter‑Precision Robotic Grasp Refinement

TacRefineNet leverages high‑resolution tactile sensors, multimodal fusion of fingertip touch and proprioception, and a goal‑conditioned refinement network to achieve millimeter‑level grasp adjustments without vision or 3D models, demonstrating zero‑shot deployment and robust generalization across diverse automotive‑factory parts in both simulation and real‑world tests.

Xiaomi Tech

Feb 5, 2026

Introduction

Thanks to joint progress in AI and robotics, humanoid robots are advancing rapidly, yet the "last meter" challenge of precise manipulation remains. Tasks that humans can perform "blind", such as peeling an eggshell, show that high‑precision tactile perception is essential for fine‑grained manipulation. TacRefineNet is presented as a way for robots to learn a comparable tactile capability.

Key Features

No vision: Based on a high‑spatial‑resolution tactile sensor, the system is immune to lighting and occlusion, enabling reliable perception under complex contact.

Millimeter‑level refinement: Tactile data from multiple fingertips is fused with proprioceptive information to drive pose convergence, reducing grasp error to the millimeter scale.

No 3D model required: The approach eliminates dependence on prior geometric models, turning grasp adjustment into a target‑alignment problem within tactile space.

One model, many uses: A single model handles fine‑grasp adjustments for various automotive‑factory parts, performing well in both simulation and real environments.

Generalization to unseen objects: The model retains high robustness on unseen but similar objects and under dynamic disturbances without additional training.

Goal‑driven: Users can specify any target grasp pose without retraining, achieving plug‑and‑play, zero‑shot deployment.

TacRefineNet: seamless sim‑to‑real transfer, trained purely in simulation, deployed zero‑shot

Performance Demonstration

Extensive simulation and real‑robot tests show that TacRefineNet can refine grasps on multiple factory objects to millimeter‑level accuracy: over successive adjustment steps, the average positional error quickly converges to the millimeter range.
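To make the reported metric concrete, the sketch below shows one way an error-versus-iteration curve could be computed from per-trial logs. The function names, the 1.0 mm threshold, and the numbers in the usage lines are illustrative assumptions; they are placeholders, not results from the TacRefineNet experiments.

```python
import numpy as np

def mean_error_curve(errors_mm):
    """Average positional error per refinement step.

    errors_mm has shape (num_trials, num_steps): one row per grasp trial,
    one column per refinement iteration, values in millimeters.
    """
    return np.asarray(errors_mm, dtype=float).mean(axis=0)

def steps_to_converge(curve_mm, threshold_mm=1.0):
    """First step from which the mean error stays at or below the threshold.

    Returns None if the curve never settles below the threshold.
    The 1.0 mm default is an assumed threshold, not one quoted by the article.
    """
    below = np.asarray(curve_mm) <= threshold_mm
    for step in range(len(below)):
        if below[step:].all():
            return step
    return None

# Made-up placeholder logs for two trials (NOT measured data).
placeholder_logs = [
    [6.0, 3.0, 1.5, 0.8, 0.6],
    [4.0, 2.0, 1.0, 0.7, 0.5],
]
curve = mean_error_curve(placeholder_logs)
settled_at = steps_to_converge(curve)
```

Requiring the error to stay below the threshold, rather than merely touch it once, avoids counting a curve as converged when it later drifts back up.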

Curve of grasp‑adjustment accuracy versus iteration steps for multiple factory objects

Overall evaluation metrics for grasp adjustment across multiple factory objects

Zero‑Shot Target Specification

TacRefineNet can refine a grasp toward any target pose within the range covered by its training data, with no extra training, enabling immediate plug‑and‑play deployment. Starting from diverse initial grasps, the system adjusts the grasp until the tactile reading matches the tactile image of the desired pose.
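The idea of specifying a goal as a tactile image can be illustrated with a toy loop. In this sketch the learned network is replaced by a hand-written proxy that compares pressure centroids, and the tactile image is a synthetic Gaussian blob; `render_tactile`, `refine_step`, the gain, and the blob width are all assumptions made for illustration. Only the 11 × 9 grid and 1.1 mm spacing come from the article.

```python
import numpy as np

ROWS, COLS, PITCH_MM = 11, 9, 1.1  # array geometry quoted in the article

def _grid():
    """Taxel center coordinates in mm, centered on the pad."""
    ys = (np.arange(ROWS) - (ROWS - 1) / 2) * PITCH_MM
    xs = (np.arange(COLS) - (COLS - 1) / 2) * PITCH_MM
    return np.meshgrid(xs, ys)

def render_tactile(contact_xy, sigma_mm=1.5):
    """Toy tactile image: a Gaussian pressure blob at contact_xy (mm)."""
    x, y = _grid()
    d2 = (x - contact_xy[0]) ** 2 + (y - contact_xy[1]) ** 2
    return np.exp(-d2 / (2 * sigma_mm ** 2))

def centroid(image):
    """Pressure-weighted mean position of an image, in mm."""
    x, y = _grid()
    return np.array([(x * image).sum(), (y * image).sum()]) / image.sum()

def refine_step(current_img, goal_img, gain=0.8):
    """Hypothetical proxy for the learned network: a correction (mm)
    derived only from the mismatch between current and goal images."""
    return gain * (centroid(goal_img) - centroid(current_img))

def refine(start_xy, goal_img, steps=15):
    """Iterate corrections; the goal is supplied purely as a tactile image."""
    xy = np.array(start_xy, dtype=float)
    for _ in range(steps):
        xy = xy + refine_step(render_tactile(xy), goal_img)
    return xy

# Same "model", two different goals, nothing retrained in between.
goal_a = np.array([1.0, -2.0])
goal_b = np.array([-1.5, 2.5])
final_a = refine([-2.0, 2.0], render_tactile(goal_a))
final_b = refine([-2.0, 2.0], render_tactile(goal_b))
```

The point of the sketch is the interface rather than the controller: the goal enters only as an image, so changing the target means passing a different image and leaving the function unchanged.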

Dynamic Long‑Range Tracking

The network maintains stable long‑range tracking; even when object position and orientation shift frequently, real‑time feedback commands correct the grasp, keeping the object in the desired grasp state.
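The closed-loop behavior described here can be sketched as a correction loop running under repeated disturbances. This is a simplified assumption-laden toy: it reads the pose error directly, whereas the real system would have to infer the correction from tactile feedback, and the gain, step limit, and disturbance schedule are invented for illustration.

```python
import numpy as np

def track(goal, steps=200, gain=0.5, max_step_mm=1.0,
          disturb_every=20, disturb_mm=3.0, seed=0):
    """Simulate closed-loop correction of a 2D in-hand offset (mm).

    Every `disturb_every` steps the object is pushed by `disturb_mm`
    in a random direction; each step applies a clipped proportional
    correction toward the goal. Returns the error after every step.
    """
    rng = np.random.default_rng(seed)
    goal = np.asarray(goal, dtype=float)
    pose = goal.copy()
    errors = []
    for t in range(steps):
        if t > 0 and t % disturb_every == 0:
            direction = rng.normal(size=2)
            pose = pose + disturb_mm * direction / np.linalg.norm(direction)
        correction = gain * (goal - pose)
        norm = np.linalg.norm(correction)
        if norm > max_step_mm:  # limit how far one command may move the grasp
            correction = correction * (max_step_mm / norm)
        pose = pose + correction
        errors.append(float(np.linalg.norm(goal - pose)))
    return np.array(errors)

errors = track(goal=[0.0, 0.0])
```

In this toy the error spikes at each disturbance and decays again before the next one arrives, which is the qualitative pattern the tracking claim describes.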

Generalization to Unseen Objects

Without object‑specific training, TacRefineNet picks up on shared physical features and generalizes to unseen objects whose geometry resembles that of the training objects.

Data + Algorithm + Hardware Synergy

The high‑precision performance stems from deep integration of data, algorithm, and hardware.

Data: Using the MuJoCo physics engine, a tactile simulator aligned with real physics was built, enabling high‑fidelity contact force simulation. Large‑scale data collection on diverse automotive parts yields strong zero‑shot generalization.
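One plausible way to turn a tactile simulator into training data for goal-conditioned refinement is to sample pairs of poses and record the motion between them. The sketch below assumes that supervision scheme; the article does not spell out the exact data format. `sample_pairs` and `toy_render` are hypothetical names, and `toy_render` is a placeholder blob renderer standing in for the MuJoCo-based simulator, not a physics model.

```python
import numpy as np

def sample_pairs(render, num_pairs, pose_range_mm=3.0, seed=0):
    """Build (current image, goal image, pose delta) training triples.

    `render` is a hypothetical callable that maps a 2D in-hand offset (mm)
    to a tactile image; a real pipeline would call the simulator here.
    """
    rng = np.random.default_rng(seed)
    current = rng.uniform(-pose_range_mm, pose_range_mm, size=(num_pairs, 2))
    goal = rng.uniform(-pose_range_mm, pose_range_mm, size=(num_pairs, 2))
    current_imgs = np.stack([render(p) for p in current])
    goal_imgs = np.stack([render(p) for p in goal])
    labels = goal - current  # assumed supervision: motion aligning current with goal
    return current_imgs, goal_imgs, labels

def toy_render(offset_xy, rows=11, cols=9, pitch_mm=1.1, sigma_mm=1.5):
    """Placeholder renderer: Gaussian pressure blob on an 11 x 9 grid."""
    ys = (np.arange(rows) - (rows - 1) / 2) * pitch_mm
    xs = (np.arange(cols) - (cols - 1) / 2) * pitch_mm
    x, y = np.meshgrid(xs, ys)
    d2 = (x - offset_xy[0]) ** 2 + (y - offset_xy[1]) ** 2
    return np.exp(-d2 / (2 * sigma_mm ** 2))

current_imgs, goal_imgs, labels = sample_pairs(toy_render, num_pairs=4)
```

Because the goal is sampled independently of the current pose, a model trained on such triples would see many different targets, which is consistent with specifying new goals at deployment without retraining.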

Algorithm: An end‑to‑end multimodal network fuses tactile images, proprioception, and spatial actions via a goal‑conditioned refinement mechanism, learning a nonlinear mapping from perception to motion. A single model supports multiple object types, showing cross‑object generalization.
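As a shape-level illustration of goal-conditioned multimodal fusion, the sketch below encodes the current tactile images, the goal tactile images, and the proprioceptive state, concatenates the features, and maps them to a spatial action. Everything about the architecture here is an assumption: the class name, two fingers, a 7-value proprioceptive vector, a 6-value action, the layer sizes, and the random untrained weights are placeholders, not the published network.

```python
import numpy as np

class ToyFusionNet:
    """Untrained, randomly initialized stand-in for a fusion network."""

    def __init__(self, num_fingers=2, rows=11, cols=9,
                 proprio_dim=7, action_dim=6, hidden=64, seed=0):
        rng = np.random.default_rng(seed)
        tactile_dim = num_fingers * rows * cols
        # One tactile encoder shared by current and goal images.
        self.w_tac = rng.normal(scale=0.1, size=(tactile_dim, hidden))
        self.w_pro = rng.normal(scale=0.1, size=(proprio_dim, hidden))
        self.w_out = rng.normal(scale=0.1, size=(3 * hidden, action_dim))

    def forward(self, tactile, goal_tactile, proprio):
        """Map (current tactile, goal tactile, proprioception) to an action."""
        f_cur = np.tanh(tactile.reshape(-1) @ self.w_tac)
        f_goal = np.tanh(goal_tactile.reshape(-1) @ self.w_tac)
        f_pro = np.tanh(proprio @ self.w_pro)
        fused = np.concatenate([f_cur, f_goal, f_pro])  # simple late fusion
        return fused @ self.w_out

net = ToyFusionNet()
tactile = np.zeros((2, 11, 9))
tactile[0, 5, 4] = 1.0            # one active taxel on finger 0
goal_tactile = np.zeros((2, 11, 9))
goal_tactile[0, 6, 4] = 1.0       # goal contact one row over
proprio = np.zeros(7)
action = net.forward(tactile, goal_tactile, proprio)
```

Sharing the tactile encoder between current and goal inputs is one common design choice for goal-conditioned models, since it places both readings in the same feature space before they are compared.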

Hardware: The fingertip integrates an 11 × 9 piezoresistive tactile array with 1.1 mm spacing. It captures fine pressure distributions and provides high‑fidelity tactile input, allowing operation under complete visual occlusion.
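The array geometry quoted above implies 99 taxels covering roughly 11.0 mm by 8.8 mm between the outermost taxel centers. The sketch below works through that arithmetic and shows how a pressure image could be reduced to a contact location in millimeters. Treating the array as 11 rows by 9 columns, and the helper names themselves, are assumptions for illustration.

```python
import numpy as np

ROWS, COLS = 11, 9   # taxel grid from the article (row/column order assumed)
PITCH_MM = 1.1       # taxel spacing from the article

def taxel_coordinates():
    """Return (x, y) arrays of taxel centers in mm, centered on the pad."""
    ys = (np.arange(ROWS) - (ROWS - 1) / 2) * PITCH_MM
    xs = (np.arange(COLS) - (COLS - 1) / 2) * PITCH_MM
    return np.meshgrid(xs, ys)  # both arrays have shape (ROWS, COLS)

def center_of_pressure(image):
    """Pressure-weighted mean contact location in mm, or None without contact."""
    total = image.sum()
    if total <= 0:
        return None
    x, y = taxel_coordinates()
    return float((x * image).sum() / total), float((y * image).sum() / total)

num_taxels = ROWS * COLS                # 11 * 9 taxels
span_rows_mm = (ROWS - 1) * PITCH_MM    # distance between outermost rows
span_cols_mm = (COLS - 1) * PITCH_MM    # distance between outermost columns

# Two equally loaded neighboring taxels: the contact sits halfway between them.
image = np.zeros((ROWS, COLS))
image[5, 4] = 1.0
image[5, 5] = 1.0
cop_x, cop_y = center_of_pressure(image)
```

Because the centroid interpolates between taxels, a weighted estimate like this can resolve contact position more finely than the 1.1 mm spacing itself.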

Open Resources

Technical details and experiment videos are publicly available. Project page: https://sites.google.com/view/tacrefinenet. arXiv paper: https://arxiv.org/pdf/2509.25746.

We believe that without real touch there is no true generalization; tactile perception is the "last piece of the puzzle" for robot intelligence, and TacRefineNet marks a new step for Xiaomi Robotics from the lab to production lines.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
