Implementing Voice Playback for iOS Push Notifications Using Notification Service Extension and Baidu TTS

This article details the background, development steps, and debugging process for enabling dynamic voice playback in iOS push notifications via Notification Service Extension, covering iOS version constraints, integration of system AVSpeechSynthesizer and Baidu offline TTS SDK, code examples, and deployment considerations.

Sohu Tech Products

1. Background

The goal is to read the text of a push notification aloud when it arrives, similar to the payment voice alerts in Alipay and WeChat. Playing audio after a push wakes the app is only possible on iOS 10 and later; on earlier versions a notification can only play a fixed sound.

Starting with iOS 12, the system restricts background audio in the Notification Service Extension, which makes the implementation harder.

If the app is to be published on the App Store, the only option is to set the notification sound to a fixed audio file or to a file concatenated from pre-recorded clips.

For internal distribution, background playback can be enabled manually for the Notification Service Extension.

2. Development Process

a. Notification Service Extension

Once a Notification Service Extension target has been added, the system invokes it when a push arrives, so the title, body, and sound can be modified before the notification is displayed.

The notification banner stays on screen for roughly 6 seconds. The extension has about 30 seconds to call the content handler; if it has not done so by then, the system calls serviceExtensionTimeWillExpire.

Ensure new files are added to the correct target.

Resources such as sound files can be shared via App Groups.
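
As a rough illustration of sharing a sound file through an App Group, the sketch below looks up the group container's Library/Sounds directory and points the notification at a file stored there. The group identifier and file name are hypothetical, and the sketch assumes the App Groups capability is enabled with the same identifier for both the main app and the extension. According to Apple's UNNotificationSound documentation, Library/Sounds in a shared group container is one of the locations searched by soundNamed:.

```objective-c
#import <Foundation/Foundation.h>
#import <UserNotifications/UserNotifications.h>

// Hypothetical identifiers; replace with your own App Group and sound file.
static NSString * const kAppGroupID = @"group.com.example.app";
static NSString * const kSoundFileName = @"payment_received.caf";

// Returns <group container>/Library/Sounds, creating it if needed,
// or nil when the group container is unavailable.
static NSURL *SharedSoundsDirectory(void) {
    NSFileManager *fm = [NSFileManager defaultManager];
    NSURL *container = [fm containerURLForSecurityApplicationGroupIdentifier:kAppGroupID];
    if (!container) return nil; // capability missing or identifier mismatch
    NSURL *sounds = [container URLByAppendingPathComponent:@"Library/Sounds" isDirectory:YES];
    NSError *error = nil;
    if (![fm createDirectoryAtURL:sounds withIntermediateDirectories:YES attributes:nil error:&error]) {
        NSLog(@"Could not create %@: %@", sounds.path, error);
        return nil;
    }
    return sounds;
}

// Sets the notification sound only if the file exists in the shared directory;
// otherwise the sound from the payload is left untouched.
static void ApplySharedSound(UNMutableNotificationContent *content) {
    NSURL *file = [SharedSoundsDirectory() URLByAppendingPathComponent:kSoundFileName];
    if (file && [[NSFileManager defaultManager] fileExistsAtPath:file.path]) {
        content.sound = [UNNotificationSound soundNamed:kSoundFileName];
    }
}
```

The main app would write the file into the same directory ahead of time; the extension then only needs to call ApplySharedSound(self.bestAttemptContent) before invoking the content handler.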

Creation steps:

Create a Notification Service Extension target in Xcode (File → New → Target).

Enter a product name and finish the wizard.

Open NotificationService.m to handle the push.

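#import "NotificationService.h"
#import <AVFoundation/AVFoundation.h>
#import "AppleTtsUtils.h"
#import "BaiDuTtsUtils.h"

@interface NotificationService ()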

@property (nonatomic, strong) void (^contentHandler)(UNNotificationContent *contentToDeliver);
@property (nonatomic, strong) UNMutableNotificationContent *bestAttemptContent;

@end

@implementation NotificationService

- (void)didReceiveNotificationRequest:(UNNotificationRequest *)request withContentHandler:(void (^)(UNNotificationContent * _Nonnull))contentHandler {
    self.contentHandler = contentHandler;
    self.bestAttemptContent = [request.content mutableCopy];
    // Modify the notification content here…
    [self playVoiceWithInfo:self.bestAttemptContent.userInfo];
    self.contentHandler(self.bestAttemptContent);
}

- (void)serviceExtensionTimeWillExpire {
    self.contentHandler(self.bestAttemptContent);
}

- (void)playVoiceWithInfo:(NSDictionary *)userInfo {
    NSString *title = userInfo[@"aps"][@"alert"][@"title"];
    NSString *isRead = userInfo[@"isRead"];
    NSString *isUseBaiDu = userInfo[@"isBaiDu"];
    [[AVAudioSession sharedInstance] setCategory:AVAudioSessionCategoryPlayback withOptions:AVAudioSessionCategoryOptionDuckOthers error:nil];
    [[AVAudioSession sharedInstance] setActive:YES withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation error:nil];
    if ([isRead isEqual:@"1"]) {
        if ([isUseBaiDu isEqual:@"1"]) {
            [[BaiDuTtsUtils shared] playBaiDuTTSVoiceWithContent:title];
        } else {
            [[AppleTtsUtils shared] playAppleTtsVoiceWithContent:title];
        }
    }
}

@end
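
In the NotificationService listing above, the content handler is called immediately after playback starts. Because the system may suspend the extension once the handler has been called, one option is to deliver the notification only when speech has finished. The following is a minimal sketch under that assumption, using a hypothetical SpeechPlayer class (not part of the article's code) built on the AVSpeechSynthesizerDelegate callbacks; the zh-CN voice is also an assumption.

```objective-c
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: speaks a string and reports when playback ends.
@interface SpeechPlayer : NSObject <AVSpeechSynthesizerDelegate>
- (void)speak:(NSString *)text completion:(void (^)(void))completion;
@end

@implementation SpeechPlayer {
    AVSpeechSynthesizer *_synthesizer;
    void (^_completion)(void);
}

- (instancetype)init {
    if ((self = [super init])) {
        _synthesizer = [AVSpeechSynthesizer new];
        _synthesizer.delegate = self;
    }
    return self;
}

- (void)speak:(NSString *)text completion:(void (^)(void))completion {
    _completion = [completion copy];
    AVSpeechUtterance *utterance = [AVSpeechUtterance speechUtteranceWithString:text];
    utterance.voice = [AVSpeechSynthesisVoice voiceWithLanguage:@"zh-CN"]; // assumed language
    [_synthesizer speakUtterance:utterance];
}

// Runs the stored completion block at most once.
- (void)finish {
    void (^completion)(void) = _completion;
    _completion = nil;
    if (completion) completion();
}

- (void)speechSynthesizer:(AVSpeechSynthesizer *)synthesizer
 didFinishSpeechUtterance:(AVSpeechUtterance *)utterance {
    [self finish];
}

- (void)speechSynthesizer:(AVSpeechSynthesizer *)synthesizer
 didCancelSpeechUtterance:(AVSpeechUtterance *)utterance {
    [self finish];
}
@end
```

With such a helper, the extension could call [self.player speak:title completion:^{ self.contentHandler(self.bestAttemptContent); }], where player is a hypothetical strong property. Since serviceExtensionTimeWillExpire also calls the handler, a guard against invoking it twice would be prudent.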

Key points in AppleTtsUtils:

The actual playback volume is the product of the utterance volume and the system volume.

Numbers are read out correctly, digit by digit, when a space is inserted after each digit.

#import "AppleTtsUtils.h"
#import <AVFoundation/AVFoundation.h>

@interface AppleTtsUtils ()
@property (nonatomic, strong) AVSpeechSynthesizer *speechSynthesizer;
@property (nonatomic, strong) AVSpeechSynthesisVoice *speechSynthesisVoice;
@end

@implementation AppleTtsUtils
+ (instancetype)shared {
    static id instance = nil;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{ instance = [[self class] new]; });
    return instance;
}
- (void)playAppleTtsVoiceWithContent:(NSString *)content {
    if (!content.length) return;
    // Insert a space after every digit so the synthesizer reads digits one at a time.
    NSMutableString *spoken = [NSMutableString stringWithCapacity:content.length * 2];
    for (NSUInteger i = 0; i < content.length; i++) {
        NSString *ch = [content substringWithRange:NSMakeRange(i, 1)];
        [spoken appendString:ch];
        if ([self deptNumInputShouldNumber:ch]) {
            [spoken appendString:@" "];
        }
    }
    AVSpeechUtterance *utterance = [AVSpeechUtterance speechUtteranceWithString:spoken];
    utterance.rate = AVSpeechUtteranceDefaultSpeechRate;
    utterance.voice = self.speechSynthesisVoice;
    utterance.volume = 1.0; // actual loudness is this value scaled by the system volume
    [self.speechSynthesizer speakUtterance:utterance];
}
// ... deptNumInputShouldNumber:, property setup, and delegate methods omitted for brevity ...
@end
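
The listing above calls deptNumInputShouldNumber: without showing it. A minimal sketch of what such a check could look like is given below as a free function; the name IsSingleASCIIDigit and the choice to accept only ASCII digits are assumptions, not the article's implementation.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical stand-in for deptNumInputShouldNumber:.
// Returns YES only when the string is exactly one ASCII digit ('0' through '9').
static BOOL IsSingleASCIIDigit(NSString *s) {
    if (s.length != 1) return NO;
    unichar c = [s characterAtIndex:0];
    return c >= '0' && c <= '9';
}
```

With a check like this, the loop in playAppleTtsVoiceWithContent: would turn @"Paid1024" into @"Paid1 0 2 4 " (including the trailing space), so each digit is spoken separately.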

b. Baidu TTS Offline SDK Integration

Create a Baidu AI console application and use the Notification Service Extension bundle ID.

Download the offline SDK, obtain AppId, AppKey, SecretKey, and SN.

Add the SDK files (BDSClientHeaders, BDSClientLib, BDSClientResource) to the extension target, making sure the "Copy items if needed" option is checked.

Link required system libraries as shown in the Baidu sample project.

#import "BaiDuTtsUtils.h"
#import "BDSSpeechSynthesizer.h"

NSString *BaiDuTTSAPP_ID = @"Your_APP_ID";
NSString *BaiDuTTSAPI_KEY = @"Your_APP_KEY";
NSString *BaiDuTTSSECRET_KEY = @"Your_SECRET_KEY";
NSString *BaiDuTTSSN = @"Your_SN";

@implementation BaiDuTtsUtils
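+ (instancetype)shared {
    // Same dispatch_once singleton pattern as AppleTtsUtils.
    static id instance = nil;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{ instance = [[self class] new]; });
    return instance;
}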
- (void)configureOfflineTTS {
    NSString *offlineSpeechData = [[NSBundle mainBundle] pathForResource:@"bd_etts_common_speech_m15_mand_eng_high_am-mgc_v3.6.0_20190117" ofType:@"dat"];
    NSString *offlineTextData = [[NSBundle mainBundle] pathForResource:@"bd_etts_common_text_txt_all_mand_eng_middle_big_v3.4.2_20210319" ofType:@"dat"];
    if (!offlineSpeechData || !offlineTextData) { NSLog(@"Offline resources missing"); return; }
    NSError *err = [[BDSSpeechSynthesizer sharedInstance] loadOfflineEngine:offlineTextData speechDataPath:offlineSpeechData licenseFilePath:nil withAppCode:BaiDuTTSAPP_ID withSn:BaiDuTTSSN];
    if (err) { NSLog(@"Offline TTS init failed"); }
}
- (void)playBaiDuTTSVoiceWithContent:(NSString *)voiceText {
    [[BDSSpeechSynthesizer sharedInstance] setSynthesizerDelegate:self];
    [self configureOfflineTTS];
    [[BDSSpeechSynthesizer sharedInstance] setPlayerVolume:10];
    [[BDSSpeechSynthesizer sharedInstance] setSynthParam:@(5) forKey:BDS_SYNTHESIZER_PARAM_SPEED];
    NSError *speakError = nil;
    [[BDSSpeechSynthesizer sharedInstance] speakSentence:voiceText withError:&speakError];
    if (speakError) { NSLog(@"Error: %ld, %@", (long)speakError.code, speakError.localizedDescription); }
}
// Delegate callbacks omitted
@end

c. Debugging

Run the main app first, then run the Notification Service Extension target and set a breakpoint in didReceiveNotificationRequest:withContentHandler:. Verify that the payload sets mutable-content to 1; without it, the system does not invoke the extension. Example payload:

{
  "aps": {
    "alert": {
      "title": "Title",
      "subtitle": "Subtitle",
      "body": "Content"
    },
    "badge": 1,
    "sound": "default",
    "mutable-content": 1
  },
  "isRead": "1",
  "isBaiDu": "0"
}
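
The playVoiceWithInfo: method shown earlier indexes userInfo[@"aps"][@"alert"][@"title"] directly, which assumes alert is always a dictionary. In an APNs payload, alert may also be a plain string, so a more defensive read can help during debugging. The sketch below is one way to do that; the function names are hypothetical, and accepting numeric flags in addition to the string @"1" is an assumption about what the server might send.

```objective-c
#import <Foundation/Foundation.h>

// Returns the text to speak, or nil when the payload has no usable title.
// Handles both shapes of aps.alert: a plain string or a dictionary.
static NSString *SpokenTitleFromUserInfo(NSDictionary *userInfo) {
    id aps = userInfo[@"aps"];
    if (![aps isKindOfClass:[NSDictionary class]]) return nil;
    id alert = aps[@"alert"];
    if ([alert isKindOfClass:[NSString class]]) return alert;
    if ([alert isKindOfClass:[NSDictionary class]]) {
        id title = alert[@"title"];
        if ([title isKindOfClass:[NSString class]]) return title;
    }
    return nil;
}

// Returns YES when a custom flag such as isRead or isBaiDu is set.
// Accepts the string @"1" (as compared in playVoiceWithInfo:) and,
// as an assumption, the number 1.
static BOOL FlagIsSet(NSDictionary *userInfo, NSString *key) {
    id value = userInfo[key];
    if ([value isKindOfClass:[NSString class]]) return [value isEqualToString:@"1"];
    if ([value isKindOfClass:[NSNumber class]]) return [value integerValue] == 1;
    return NO;
}
```

A breakpoint placed after these calls makes it easy to confirm that the title and flags arrive as expected before any audio code runs.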

On iOS 12 and later, a common error is that the audio queue fails to start because background playback is not allowed. The fix is to enable the Audio, AirPlay, and Picture in Picture background mode under Signing & Capabilities for the main app, and to add Required background modes (UIBackgroundModes) with the item "App plays audio or streams audio/video using AirPlay" (audio) to the extension's Info.plist.

3. Conclusion

For internal distribution, enabling background audio in the Notification Service Extension allows dynamic voice playback using either the system AVSpeechSynthesizer or the Baidu offline TTS SDK. For App Store submissions, only fixed or concatenated audio files set as the notification sound can be used; dynamic synthesis is not permitted.

