[react-native-callingx] iOS: AudioSessionManager hardcodes .defaultToSpeaker, blocking earpiece-default for voice-only calls

**Which package/packages do you use?**

- [ ] `@stream-io/video-react-sdk`
- [x] `@stream-io/video-react-native-sdk`
- [ ] `@stream-io/video-client`

Also affects: `@stream-io/react-native-callingx@0.1.1`, `@stream-io/react-native-webrtc@137.1.3`.

## Summary

On iOS, `@stream-io/react-native-callingx@0.1.1` ignores `setDefaultAudioDeviceEndpointType('earpiece')` for two reasons: `CallManager.start()` short-circuits when callingx is active, and `AudioSessionManager.swift` hardcodes `.defaultToSpeaker` in `categoryOptions` regardless of intent. Removing `.defaultToSpeaker` via a patch achieves the earpiece default but **also exposes a separate, deeper bug**: the default `AudioSessionManager` config destabilizes `AudioEngineDevice` (the LiveKit-style `AVAudioEngine` audio device introduced in WebRTC 137), so CallKit's lock-screen speaker button desyncs from the actual route and **requires a double-tap (or worse) to disable** speakerphone. We fixed that second issue as well by mirroring `react-native-callkeep`'s `configureAudioSession` config in callingx, but the result is a non-trivial chain of patches that we believe should be addressed upstream.

## Environment

- `@stream-io/video-react-native-sdk`: 1.32.3
- `@stream-io/react-native-callingx`: 0.1.1
- `@stream-io/react-native-webrtc`: 137.1.3
- React Native: 0.79.6 (Expo 53)
- iOS: 26.x on physical device (iPhone)
- Use case: 1:1 audio-only calls (no video) — receiver/earpiece is the expected default per business rule, like a phone call

## Configuration

```ts
import { AudioSettingsRequest } from '@stream-io/video-react-native-sdk'

export const AudioDefaultSettings: AudioSettingsRequest = {
  mic_default_on: true,
  speaker_default_on: false,
  default_device: 'earpiece',
}
```

This config is passed to `call.getOrCreate({ data: { settings_override: { audio: AudioDefaultSettings } } })` and `call.join()` is called normally.

## Expected

Call starts with audio routed to the **built-in receiver (earpiece)**, matching `default_device: 'earpiece'`.

## Actual

Call starts with audio routed to the **built-in speaker**, ignoring the config entirely.

## Root cause

Two interacting layers:

### 1. `CallManager.start()` bypasses on iOS+callingx

[`packages/react-native-sdk/src/modules/call-manager/CallManager.ts`](https://github.com/GetStream/stream-video-js/blob/main/packages/react-native-sdk/src/modules/call-manager/CallManager.ts) (lines 113–131):

```ts
start = (config?: StreamInCallManagerConfig): void => {
  if (shouldBypassForCallKit()) {
    videoLoggerSystem
      .getLogger('CallManager')
      .debug('start: skipping start as callkit is handling the audio session');
    return;  // <-- early return, config is ignored
  }
  NativeManager.setAudioRole(config?.audioRole ?? 'communicator');
  if (config?.audioRole === 'communicator') {
    const type = config.deviceEndpointType ?? 'speaker';
    NativeManager.setDefaultAudioDeviceEndpointType(type);
  }
  ...
};
```

`shouldBypassForCallKit()` returns `true` when `Platform.OS === 'ios'` and callingx is set up — i.e., the standard production setup for any consumer following the 1.32 migration guide. Result: `setDefaultAudioDeviceEndpointType` is **never called** on iOS in this configuration.
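
As a rough illustration of what forwarding the preference could look like on the JS side, the sketch below moves the endpoint call ahead of the CallKit early return. Everything here is an assumption for illustration: `createStart`, `NativeAudio`, and `StartConfig` are simplified stand-ins, not the SDK's actual types, and the native callingx layer would still have to read the forwarded value for this to have any effect (see section 2).

```ts
// Simplified, hypothetical types; they only mirror the fields quoted above.
type EndpointType = 'speaker' | 'earpiece';
type AudioRole = 'communicator' | 'listener';

interface StartConfig {
  audioRole?: AudioRole;
  deviceEndpointType?: EndpointType;
}

// Hypothetical stand-in for the native module used by CallManager.
interface NativeAudio {
  setAudioRole(role: AudioRole): void;
  setDefaultAudioDeviceEndpointType(type: EndpointType): void;
}

function createStart(
  native: NativeAudio,
  shouldBypassForCallKit: () => boolean,
): (config?: StartConfig) => void {
  return (config?: StartConfig): void => {
    const role: AudioRole = config?.audioRole ?? 'communicator';
    if (role === 'communicator') {
      // Forward the preference before the bypass check, so it is
      // recorded even when CallKit owns the audio session.
      native.setDefaultAudioDeviceEndpointType(
        config?.deviceEndpointType ?? 'speaker',
      );
    }
    if (shouldBypassForCallKit()) {
      return; // CallKit still handles session activation
    }
    native.setAudioRole(role);
  };
}

// Usage with a recording fake instead of the real native module.
const recorded: string[] = [];
const fakeNative: NativeAudio = {
  setAudioRole: (role) => recorded.push(`role:${role}`),
  setDefaultAudioDeviceEndpointType: (type) => recorded.push(`endpoint:${type}`),
};
const start = createStart(fakeNative, () => true);
start({ audioRole: 'communicator', deviceEndpointType: 'earpiece' });
console.log(recorded); // [ 'endpoint:earpiece' ]
```

The point of the sketch is only the ordering: the preference is recorded on the bypass path, while session setup stays with CallKit.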

### 2. `AudioSessionManager.swift` in callingx hardcodes `.defaultToSpeaker`

[`packages/react-native-callingx/ios/AudioSessionManager.swift`](https://github.com/GetStream/stream-video-js/blob/main/packages/react-native-callingx/ios/AudioSessionManager.swift) (around lines 12–21):

```swift
let categoryOptions: AVAudioSession.CategoryOptions
#if compiler(>=6.2) // For Xcode 26.0+
    categoryOptions = [.allowBluetoothHFP, .defaultToSpeaker]
#else
    categoryOptions = [.allowBluetooth, .defaultToSpeaker]
#endif
let mode: AVAudioSession.Mode = .voiceChat
```

`.defaultToSpeaker` is unconditional. There's no setter, no config path, no way to opt out at runtime.

For comparison, the **non-CallKit code path already handles this correctly** in [`packages/react-native-sdk/ios/StreamInCallManager.swift:128`](https://github.com/GetStream/stream-video-js/blob/main/packages/react-native-sdk/ios/StreamInCallManager.swift#L128):

```swift
intendedOptions = defaultAudioDevice == .speaker
    ? [bluetoothOption, .defaultToSpeaker]
    : [bluetoothOption]
```

`StreamInCallManager` reads `defaultAudioDevice` (set via `setDefaultAudioDeviceEndpointType`) and gates `.defaultToSpeaker` on it. `AudioSessionManager` (callingx) does not — likely because the callingx path was developed under the assumption of video-first calls, where speaker default is correct.

## Step 1: remove `.defaultToSpeaker` from callingx

Patch `AudioSessionManager.swift` to remove `.defaultToSpeaker`:

```swift
#if compiler(>=6.2)
    categoryOptions = [.allowBluetoothHFP]
#else
    categoryOptions = [.allowBluetooth]
#endif
```

A parallel patch is also needed in `react-native-webrtc`'s `WebRTCModule+RTCMediaStream.m` (lines 691–693, `ensureAudioSessionWithRecording`). It is the same removal, required because that method defensively reapplies `categoryOptions` if the session is reset.

This **achieves earpiece-default** ✓ but exposes a separate side effect.

## Step 2: side effect — CallKit lock-screen speaker button gets stuck

After removing `.defaultToSpeaker`, the speaker button on iOS's native CallKit UI (lock screen, in-call screen) misbehaves:

1. User taps speaker on CallKit → button becomes active, audio routes to speaker ✓
2. ~2 seconds later, **the button flips to inactive while audio stays on the speaker**: UI and route desync
3. Tapping again to disable the speaker is interpreted by CallKit as "turn on again" (since its UI thinks the state is off), so the tap is effectively a no-op and the speaker stays on
4. The in-app UI toggle keeps working correctly (one tap in each direction), because we route through `callManager.speaker.setForceSpeakerphoneOn(false)` → `overrideOutputAudioPort(.none)` directly

Using device logs and a side-by-side comparison with `react-native-callkeep` (the library used in the previous major version of our production app, which works correctly with the same Stream WebRTC 137 binary), we traced this to **`AudioEngineDevice`** (LiveKit's `AVAudioEngine`-based audio device, introduced in WebRTC 137):

1. CallKit fires `overrideOutputAudioPort(.speaker)` on the user's tap
2. `AudioEngineDevice` reacts to internal route reconciliation and rebuilds the `AVAudioEngine`
3. The rebuild fires an implicit `setCategory`, which **clears the `AVAudioSession` override flag**
4. CallKit reads that flag for its UI state → button flips to "off" while audio remains on the speaker
5. Subsequent taps are no-ops: CallKit believes the speaker is off, so the next tap requests the speaker route that is already active

The root cause is that callingx's default `AudioSessionManager` config (mode `.voiceChat`, options HFP-only, no explicit `sampleRate`/`ioBufferDuration`, no reapplication on `didActivateAudioSession`) **destabilizes `AudioEngineDevice`**, triggering the spontaneous engine rebuild on route changes.
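
The sequence above can be condensed into a toy state model. It is not CallKit or WebRTC code: `ToyAudioSession`, `implicitSetCategory`, and `callKitTap` are made-up names, and the model simply encodes our reading of the logs (the button mirrors the override flag, and the engine rebuild clears the flag without changing the route).

```ts
type Route = 'receiver' | 'speaker';

// Toy stand-in for the audio session state relevant to this bug.
class ToyAudioSession {
  route: Route = 'receiver';
  overrideFlag = false; // assumed to be what the CallKit button mirrors

  overrideOutputAudioPort(port: 'speaker' | 'none'): void {
    this.overrideFlag = port === 'speaker';
    this.route = port === 'speaker' ? 'speaker' : 'receiver';
  }

  // Assumed effect of the engine rebuild: the flag is cleared,
  // but audio keeps playing on the current route.
  implicitSetCategory(): void {
    this.overrideFlag = false;
  }
}

// CallKit toggles relative to what its button currently shows.
function callKitTap(session: ToyAudioSession): void {
  const buttonShowsOn = session.overrideFlag;
  session.overrideOutputAudioPort(buttonShowsOn ? 'none' : 'speaker');
}

function describeState(session: ToyAudioSession): string {
  return `route=${session.route} button=${session.overrideFlag ? 'on' : 'off'}`;
}

const session = new ToyAudioSession();
callKitTap(session);
console.log(describeState(session)); // route=speaker button=on
session.implicitSetCategory(); // engine rebuild about 2 s later
console.log(describeState(session)); // route=speaker button=off (desync)
callKitTap(session); // user intends to turn the speaker off
console.log(describeState(session)); // route=speaker button=on (tap was a no-op)
session.overrideOutputAudioPort('none'); // in-app toggle path
console.log(describeState(session)); // route=receiver button=off
```

In this model a second CallKit tap only resynchronizes the button, which matches the double-tap behaviour we observed; if another rebuild lands between taps, the cycle repeats.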

## Step 3: fix — mirror `react-native-callkeep`'s `configureAudioSession`

`react-native-callkeep` operates on the same Stream WebRTC 137 binary in production without this issue. The difference is its `configureAudioSession`, which uses a different (and stable) set of defaults. Mirroring those values in callingx fixes the side effect:

```diff
// AudioSessionManager.swift::createAudioSessionIfNeeded

- let mode: AVAudioSession.Mode = .voiceChat
+ let mode: AVAudioSession.Mode = .default

#if compiler(>=6.2)
-    categoryOptions = [.allowBluetoothHFP]
+    categoryOptions = [.allowBluetoothHFP, .allowBluetoothA2DP]
#else
-    categoryOptions = [.allowBluetooth]
+    categoryOptions = [.allowBluetooth, .allowBluetoothA2DP]
#endif

// Add explicit values to keep the audio engine from renegotiating:
+ rtcConfig.sampleRate = 44100
+ rtcConfig.ioBufferDuration = 0.005
```

```diff
// CallingxImpl.swift::provider:didActivateAudioSession

  RTCAudioSession.sharedInstance().audioSessionDidActivate(audioSession)

+ // callkeep does this on every CXProviderDelegate event; previously
+ // callingx only called it from performStartCallAction/performAnswerCallAction
+ AudioSessionManager.createAudioSessionIfNeeded()
```

With these in place on top of the Step 1 patch, the CallKit lock-screen toggle works correctly (one tap each direction) on iOS 26.3.1.

## Why this matters

Voice-only 1:1 calling (WhatsApp/Phone-style UX) is a legitimate first-class use case, and the SDK already exposes the right API surface for it via `setDefaultAudioDeviceEndpointType` with `'earpiece'`. The gap is that on iOS + callingx that config doesn't reach the audio session, which forces consumers to maintain a chain of native patches just to reach the documented behaviour:

1. Patch callingx + react-native-webrtc to drop the hardcoded `.defaultToSpeaker`
2. Patch callingx again to mirror `react-native-callkeep`'s audio session config so the CallKit UI doesn't desync from the actual route

Both layers feel like things that should be solvable inside callingx itself — either by gating `.defaultToSpeaker` on `setDefaultAudioDeviceEndpointType` (analogous to what `StreamInCallManager` already does on the non-CallKit path), or by adopting the known-stable `AudioSessionManager` defaults that match `react-native-callkeep`'s production behaviour with the same WebRTC binary.

## Thanks

Huge thanks to the Stream Video team — the 1.32 line is a real step up, and the callingx migration is clearly a lot of careful work, especially around CallKit and iOS 26. This report is meant as a friendly heads-up from a consumer who hit the audio-only edge of an otherwise great release, not a complaint. Really appreciate everything you all ship, and happy to test patches, share repro projects, or jump on anything that helps narrow this down. 🙏

