Designing Effective Voice User Interfaces: A Practical Guide
This guide walks designers through the essential stages of creating voice user interfaces: exploring environmental constraints, defining interaction devices, mapping user scenarios, handling technical limitations, and applying design rules for triggers, feedback, and conversational flows, all in service of building trustworthy and engaging VUI experiences.
Introduction
With the proliferation of smart devices, voice interaction scenarios are becoming more common. This guide, based on Atlassian design director Justin Baker’s recommendations, outlines the unique design challenges of Voice User Interfaces (VUI) and provides a three‑part framework.
Phase 1: Exploration
Understanding the environmental constraints that affect voice interaction helps designers grasp where and how users speak to devices.
Define Interaction Devices
Device form factor determines the interaction mode. Common categories include:
Mobile phones – iPhone, Google Pixel, Samsung Galaxy; network via cellular, Wi‑Fi, Bluetooth; visual, auditory, and haptic feedback.
Wearables – smartwatches, bands, shoes; strong scenario focus; network via cellular, Wi‑Fi, Bluetooth; varied feedback.
Fixed devices – desktops, smart home controllers, TVs; network via cellular, Wi‑Fi, Bluetooth; used in fixed locations.
Non‑fixed devices – laptops, tablets, car media systems; network via cellular, wired, Wi‑Fi, Bluetooth; primarily non‑voice interaction.
User Case Analysis Table
Identify primary, secondary, and non‑essential use cases for each device. Create a matrix to understand why users engage with the device, which interactions are core, and which are optional.
Phase 2: Input
After exploring constraints, focus on how devices listen to user commands. Key interaction nodes include trigger cues, wake‑up feedback, listening feedback, and end‑of‑listening signals.
Trigger Cues
Voice cue – a spoken phrase activates listening.
Touch cue – a button press.
Gesture cue – a hand wave.
Self‑wake – the device wakes itself based on predefined conditions, without an explicit user cue.
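The four trigger types above can be thought of as inputs to one dispatch step that decides how listening begins. A minimal sketch (the enum names and action strings are illustrative assumptions, not part of any real assistant API):

```python
from enum import Enum, auto

class TriggerCue(Enum):
    """Hypothetical trigger categories mirroring the list above."""
    VOICE = auto()    # spoken wake phrase
    TOUCH = auto()    # button press
    GESTURE = auto()  # e.g. a hand wave detected by a sensor
    SELF = auto()     # device wakes itself on a predefined condition

def handle_trigger(cue: TriggerCue) -> str:
    """Map each trigger cue to the wake behavior it should start."""
    actions = {
        TriggerCue.VOICE: "start listening after wake-phrase match",
        TriggerCue.TOUCH: "start listening immediately",
        TriggerCue.GESTURE: "confirm gesture, then start listening",
        TriggerCue.SELF: "announce self-wake, then start listening",
    }
    return actions[cue]
```

Keeping the mapping in one place makes it easy to audit that every trigger type leads to a clear, user-visible wake behavior.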
Wake‑Up Feedback
Devices should provide immediate auditory, visual, or haptic signals to confirm they are listening.
Listening Feedback
Timely visual feedback (e.g., Siri’s waveform).
Audio playback of recorded speech.
Real‑time transcription.
External visual signals such as LEDs.
End‑of‑Listening Feedback
Signal clearly that the device has stopped listening and is now processing the command; tune the pause length so it feels neither abrupt nor sluggish, and allow flexibility for users who pause mid-sentence.
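The interaction nodes of Phase 2 (wake-up, listening, end-of-listening) form a small state machine, with a feedback signal emitted at each transition. A rough sketch, where the feedback log stands in for real auditory, visual, or haptic cues (the class and signal strings are assumptions for illustration):

```python
from enum import Enum, auto

class VUIState(Enum):
    IDLE = auto()
    LISTENING = auto()
    PROCESSING = auto()

class VUISession:
    """Minimal listening lifecycle: wake -> listen -> end of listening."""

    def __init__(self):
        self.state = VUIState.IDLE
        self.feedback_log = []  # stand-in for audio/visual/haptic output

    def _feedback(self, signal: str):
        # In a real device this would play a chime, pulse an LED, etc.
        self.feedback_log.append(signal)

    def wake(self):
        if self.state is VUIState.IDLE:
            self.state = VUIState.LISTENING
            self._feedback("wake-up: chime + light, now listening")

    def end_listening(self):
        if self.state is VUIState.LISTENING:
            self.state = VUIState.PROCESSING
            self._feedback("end-of-listening: light pulse, processing")
```

Modeling the lifecycle explicitly guarantees that no transition happens silently, which is the core design rule for this phase.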
Phase 3: Conversation
Simple commands may not require dialogue, but complex tasks do. Design rules include providing clear affirmation, allowing user correction, and showing empathy.
Give explicit confirmation (e.g., “Okay, I’ll turn off the lights”).
Allow users to correct misunderstandings.
Show empathy when the AI cannot fulfill a request.
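The three conversation rules above can be expressed as a single response policy. A hedged sketch, assuming a recognizer that returns an intent string and a confidence score; the threshold and phrasings are illustrative, not from the source:

```python
def respond(intent: str, confidence: float) -> str:
    """Apply the three rules: confirm, invite correction, show empathy."""
    if intent == "unsupported":
        # Rule 3: empathy when the request cannot be fulfilled.
        return "Sorry, I can't do that yet, but I'm still learning."
    if confidence < 0.6:
        # Rule 2: low confidence, so invite the user to correct us.
        return f"Did you mean '{intent}'? Say no to correct me."
    # Rule 1: explicit confirmation of the action being taken.
    return f"Okay, I'll {intent}."
```

For example, a high-confidence "turn off the lights" yields the explicit confirmation "Okay, I'll turn off the lights."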
Advanced Scenarios
Personified Interaction
Adding personality through lights, animations, or synthetic voices builds trust and emotional connection.
Dynamic Interaction
Maintain seamless transitions, lively feedback, and efficient processing cues, especially for complex tasks like cooking guidance.
Conclusion
Voice UI design is multifaceted and still evolving. As digital devices become more pervasive, the time spent interacting with them may surpass human conversation, making VUI a potential mainstream interaction method.
We-Design
Tencent WeChat Design Center, handling design and UX research for WeChat products.