How to Add Arabic + English Voice Control to a Flutter App

Voqal Team · June 9, 2026

Short answer: To add Arabic voice control to a Flutter app today, bridge the native Voqal SDK (iOS + Android) into Flutter through a thin `MethodChannel` plugin. Voqal handles Arabic dialects (Egyptian, Gulf, Levantine, Maghrebi, Iraqi) and English, recognizes speech in under a second, and turns speech into actions inside your app rather than just transcribed text. An official Flutter package is on the roadmap; until it ships, the MethodChannel wrapper described below is the supported integration path. If you would rather start from a cross-platform bridge that already exists, the React Native package is the closest pre-built reference.

This tutorial walks through the whole thing: choosing your approach, writing the bridge, wiring credentials, presenting the assistant UI, and handling Arabic dialects correctly in production.

Prerequisites

Before you start, you'll need:

  • Flutter 3.16+ and Dart 3.x.
  • An iOS target of 16.0+ (the native Voqal SDK floor) and Android minSdk 24+.
  • Xcode 15+ and a recent Android Studio / Gradle toolchain for native builds.
  • A Voqal publishable key (pk_live_…). Grab one and read the contract on the Voqal docs. If you don't have access yet, join the waitlist.
  • Microphone permission declared on both platforms (covered below).
  • Basic comfort with platform channels — if you've never written one, the Flutter platform-channels guide is worth a 10-minute skim first.

The options for Flutter voice (and why they differ)

Flutter has no first-party voice stack, so you're choosing between three categories. They are not interchangeable.

| Option | What you get | Arabic dialects | Voice-to-action | Latency |
| --- | --- | --- | --- | --- |
| `speech_to_text` plugin | Raw on-device transcription | Limited; OS-dependent, MSA-leaning | No; you parse text yourself | Variable, OS-bound |
| Picovoice / wake-word SDKs | Keyword spotting + intent | Weak Arabic coverage | Partial, rule-based | Good |
| Voqal (native SDK + bridge) | Transcription + intent + action execution + drop-in UI | Egyptian, Gulf, Levantine, Maghrebi, Iraqi + English | Yes; speech maps to actions | Sub-1s |

Honest take: raw STT plugins are transcription, not voice control

The popular `speech_to_text` Flutter plugin is excellent at one thing: turning audio into a string using the device's built-in recognizer. That's it. It is explicitly built for short commands and phrases, not continuous conversation, and its Arabic support inherits whatever the underlying OS engine offers — which skews toward Modern Standard Arabic and stumbles on real dialect speech. Community packages like `speech_to_text_btn` add Arabic by layering Google Speech APIs or Vosk on top, but you still get back only text.

That leaves the hard 80% to you: detecting intent, mapping it to an action, confirming sensitive operations, rendering a response, and handling RTL Arabic UI. Voqal does that whole pipeline. Speech becomes a structured action your app executes, with a themed assistant UI rendered for you. If your goal is genuine voice control — "show my last five transactions," "create a payment link for 250 pounds" — transcription alone won't get you there. For a deeper breakdown of where voice genuinely helps versus where it's friction, see when voice actually works in mobile apps.

Bridge the native Voqal SDK via MethodChannel

The concept: install the native Voqal SDK on each platform, then expose two calls to Dart — setup (configure once) and present (open the assistant). A MethodChannel carries them across.

1. Add the native SDKs

iOS — add the Swift Package in your ios/ project (or Podfile if you wrap it):

bash
# In Xcode: File ▸ Add Package Dependencies…
# URL: https://github.com/VoqalAI/voqal-ios  @ 1.0.0  ▸ library "VoqalSDK"

Android — add the Voqal Android dependency in android/app/build.gradle:

groovy
// android/app/build.gradle
dependencies {
    implementation "ai.voqal:voqal-sdk:1.0.0"
}

2. Declare the Dart channel

Create lib/voqal.dart — a thin Dart facade over the channel:

dart
import 'package:flutter/services.dart';

/// Thin Flutter facade over the native Voqal SDK.
/// Replace with the official Voqal Flutter package when it ships.
class Voqal {
  static const _channel = MethodChannel('ai.voqal/sdk');

  /// Configure the SDK once, at app launch.
  static Future<void> setup({
    required String apiKey,
    required String agentUrl,
  }) async {
    await _channel.invokeMethod('setup', {
      'apiKey': apiKey,
      'agentUrl': agentUrl,
    });
  }

  /// Open the drop-in voice assistant.
  static Future<void> present() => _channel.invokeMethod('present');

  /// Open the connection early so the first turn is fast.
  static Future<void> prewarm() => _channel.invokeMethod('prewarm');
}
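
Because the arguments cross the channel untyped, it can help to reject obviously bad values in Dart before they reach native code. The following is a minimal sketch, not part of the Voqal API: `validateVoqalConfig` is a hypothetical helper, and the requirement that `agentUrl` be an absolute `https` URL is an assumption based on the example URL used later in this tutorial.

```dart
/// Hypothetical pre-flight check to run before `Voqal.setup`.
/// Throws [ArgumentError] when a value is clearly unusable.
void validateVoqalConfig({
  required String apiKey,
  required String agentUrl,
}) {
  if (apiKey.trim().isEmpty) {
    throw ArgumentError.value(apiKey, 'apiKey', 'must not be empty');
  }
  // Assumption: the agent endpoint is always an absolute https URL.
  final uri = Uri.tryParse(agentUrl);
  if (uri == null || uri.scheme != 'https' || uri.host.isEmpty) {
    throw ArgumentError.value(
      agentUrl,
      'agentUrl',
      'must be an absolute https URL',
    );
  }
}
```

Calling it as the first line of `setup` keeps a typo such as a missing scheme from surfacing later as a native-side failure.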

3. Implement the iOS side (Swift)

In ios/Runner/AppDelegate.swift, register the channel and forward calls to VoqalSDKManager:

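swift
// ios/Runner/AppDelegate.swift
// The imports go at the top of the file; the rest goes inside
// application(_:didFinishLaunchingWithOptions:), before the call to super.
import Flutter
import VoqalSDK

// The root Flutter view controller hosts the assistant UI.
let controller = window?.rootViewController as! FlutterViewController
// `delegate` is your own object implementing the Voqal delegate contract (see the docs).
let channel = FlutterMethodChannel(
  name: "ai.voqal/sdk",
  binaryMessenger: controller.binaryMessenger)

channel.setMethodCallHandler { call, result in
  switch call.method {
  case "setup":
    // Reject malformed arguments instead of crashing on a force cast.
    guard let args = call.arguments as? [String: Any],
          let apiKey = args["apiKey"] as? String,
          let agentUrl = args["agentUrl"] as? String else {
      result(FlutterError(code: "bad_args",
                          message: "setup expects apiKey and agentUrl strings",
                          details: nil))
      return
    }
    VoqalSDKManager.shared.setup(apiKey: apiKey, agentURL: agentUrl)
    result(nil)
  case "present":
    VoqalSDKManager.shared.presentChat(from: controller, delegate: delegate)
    result(nil)
  case "prewarm":
    VoqalSDKManager.shared.prewarm(delegate: delegate)
    result(nil)
  default:
    result(FlutterMethodNotImplemented)
  }
}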

The Android side mirrors this in MainActivity.kt against the Voqal Android manager. The delegate supplies the live auth token and host view controller — see the delegate contract on the docs.

Configure credentials

Voqal resolves your tenant from a publishable key and authenticates the end user with a live token read on every request. Never hardcode the user token — return a fresh one from the delegate each time.
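
One way to keep that token fresh on the app side is a small provider that caches the current token and refreshes it shortly before expiry. This is a sketch under stated assumptions: `AuthToken`, `LiveTokenProvider`, and the `fetch` callback (your own backend call) are hypothetical names, and how the token is actually handed to the native delegate is defined by the Voqal delegate contract, not by this code.

```dart
/// A token plus the moment it stops being valid (hypothetical shape).
class AuthToken {
  const AuthToken(this.value, this.expiresAt);
  final String value;
  final DateTime expiresAt;
}

/// Returns a valid end-user token on every call and refreshes it
/// [refreshMargin] before expiry. `fetch` is your own backend call.
class LiveTokenProvider {
  LiveTokenProvider({
    required this.fetch,
    this.refreshMargin = const Duration(seconds: 30),
    DateTime Function()? now,
  }) : _now = now ?? DateTime.now;

  final Future<AuthToken> Function() fetch;
  final Duration refreshMargin;
  final DateTime Function() _now; // injectable clock, useful in tests

  AuthToken? _cached;
  Future<AuthToken>? _inFlight;

  Future<String> currentToken() async {
    final cached = _cached;
    if (cached != null &&
        _now().isBefore(cached.expiresAt.subtract(refreshMargin))) {
      return cached.value;
    }
    // Share a single refresh between concurrent callers.
    final pending = _inFlight ??= fetch().whenComplete(() {
      _inFlight = null;
    });
    final fresh = await pending;
    _cached = fresh;
    return fresh.value;
  }
}
```

The injectable clock is a design choice for testability; in the app you would construct the provider with only `fetch` and call `currentToken()` whenever the delegate asks for a token.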

Call setup once at launch, then prewarm so the first spoken turn isn't cold:

dart
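// lib/main.dart
import 'package:flutter/material.dart';

import 'voqal.dart';

void main() {
  WidgetsFlutterBinding.ensureInitialized();
  // Fire and forget so SDK setup never delays the first frame.
  Voqal.setup(
    apiKey: 'pk_live_xxx',          // publishable, safe in the client
    agentUrl: 'https://api.voqal.ai',
  )
      .then((_) => Voqal.prewarm())  // open the connection early
      .catchError((Object e) => debugPrint('Voqal setup failed: $e'));
  runApp(const MyApp());             // MyApp is your existing root widget
}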
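
If you use different keys per environment, you may prefer to pass the publishable key in at build time instead of editing source. The sketch below uses Dart's compile-time `String.fromEnvironment`; the variable name `VOQAL_PUBLISHABLE_KEY` and the helper `requireVoqalKey` are assumptions made for illustration, not Voqal conventions.

```dart
/// Filled in at compile time by
/// `--dart-define=VOQAL_PUBLISHABLE_KEY=<your key>`.
/// The variable name is an assumption for this sketch.
const String voqalPublishableKey =
    String.fromEnvironment('VOQAL_PUBLISHABLE_KEY');

/// Fails fast with a readable message when the build forgot the define.
String requireVoqalKey([String key = voqalPublishableKey]) {
  if (key.isEmpty) {
    throw StateError(
      'Missing VOQAL_PUBLISHABLE_KEY. Pass it with '
      '--dart-define=VOQAL_PUBLISHABLE_KEY=<your key>.',
    );
  }
  return key;
}
```

You would then build with `flutter run --dart-define=VOQAL_PUBLISHABLE_KEY=pk_live_xxx` and pass `requireVoqalKey()` as the `apiKey` argument to `Voqal.setup`.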

Declare microphone permission on both platforms:

xml
<!-- ios/Runner/Info.plist -->
<key>NSMicrophoneUsageDescription</key>
<string>Used for voice control of the app.</string>

<!-- android/app/src/main/AndroidManifest.xml -->
<uses-permission android:name="android.permission.RECORD_AUDIO"/>

Present the assistant

With the bridge in place, opening the voice assistant is a single Dart call. Drop a button anywhere in your widget tree:

dart
FloatingActionButton(
  child: const Icon(Icons.mic),
  onPressed: () => Voqal.present(),
)

The native SDK renders the full assistant interior — the listening/thinking/speaking states, the response widgets, and any confirmation cards — over your Flutter app. You don't build voice UI. Speech goes in; an action (and a themed visual response) comes out.

Arabic dialect handling

This is where Voqal differs from generic STT. Voqal recognizes and responds across Egyptian, Gulf, Levantine, Maghrebi, and Iraqi Arabic plus English, and it follows two rules that matter for MENA UX:

  • Recognition is dialect-aware. A user speaking Egyptian colloquial and another speaking Gulf are both understood — you don't force users into Modern Standard Arabic, which is how most OS recognizers fail real speakers.
  • Spoken replies default to Modern Standard Arabic (فصحى). Output stays clean and universally legible while input stays casual and natural.

You don't configure a per-dialect model — detection is automatic. On the Flutter side, the only thing you own is RTL layout for any of your own surfaces that display returned text. Wrap Arabic content in a directionality-aware widget:

dart
Directionality(
  textDirection: TextDirection.rtl,
  child: Text(responseText),
)
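
Since sessions can switch between Arabic and English, a hardcoded `TextDirection.rtl` will misalign English replies. A rough heuristic is to pick the direction from the first letter of the text. The helper below is a sketch, not a full Unicode bidi implementation: `startsWithArabic` is a hypothetical name, and it only distinguishes Arabic-script characters from basic Latin letters.

```dart
/// Returns true when the first letter found in [text] is in an Arabic
/// script block. ASCII digits, spaces, and punctuation are skipped.
bool startsWithArabic(String text) {
  for (final rune in text.runes) {
    if (_isArabic(rune)) return true;
    if (_isLatinLetter(rune)) return false;
  }
  return false; // no letters at all: treat as left-to-right
}

bool _isArabic(int r) =>
    (r >= 0x0600 && r <= 0x06FF) || // Arabic
    (r >= 0x0750 && r <= 0x077F) || // Arabic Supplement
    (r >= 0x08A0 && r <= 0x08FF) || // Arabic Extended-A
    (r >= 0xFB50 && r <= 0xFDFF) || // Arabic Presentation Forms-A
    (r >= 0xFE70 && r <= 0xFEFC); //   Arabic Presentation Forms-B

bool _isLatinLetter(int r) =>
    (r >= 0x41 && r <= 0x5A) || (r >= 0x61 && r <= 0x7A);
```

In the widget above this would read `textDirection: startsWithArabic(responseText) ? TextDirection.rtl : TextDirection.ltr`. Mixed-script strings take the direction of their first letter, which is usually what you want for a sentence but is worth checking against your own content.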

For the full picture of dialect coverage and how Arabic ASR quality varies across providers, see the complete Arabic voice SDK guide and the best Arabic voice & speech-to-text APIs for 2026.

Test

Validate on a real device, not just the simulator — microphone behavior and recognition latency differ:

bash
flutter run --release -d <device-id>

A quick checklist:

  • Speak an Egyptian command and a Gulf command for the same intent — both should resolve to the same action.
  • Speak in English mid-session — language should switch without re-setup.
  • Confirm a sensitive action triggers the confirm card rather than executing silently.
  • Measure first-turn latency with and without prewarm — the prewarmed path should feel near-instant.
  • Verify Arabic text in your own widgets renders right-to-left.

Production notes

  • Always prewarm at launch. The first turn after a cold start pays a connection + model-cache cost; prewarming hides it. Subsequent turns settle around 2.5–3s end-to-end including the network round trip, with recognition itself sub-second.
  • Never commit secrets. The publishable pk_live_… key is safe in the client; the end-user auth token is not — always source it live from your delegate.
  • Pin SDK versions. Pin the native iOS/Android SDK versions in your lockfiles so CI builds are reproducible. Bump deliberately.
  • Handle the no-permission path. If the user denies microphone access, fall back gracefully to your existing tap UI rather than leaving a dead button.
  • Watch the roadmap. When the official Voqal Flutter package ships it will replace this hand-rolled bridge with a pub.dev dependency and a typed Dart API — the channel names and method shapes here are intentionally close to that future surface, so migration is mostly deleting glue code. Track it on the docs.
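
The no-permission fallback from the list above can be sketched on the Dart side as a small wrapper around the present call. This assumes your native handler completes the channel call with an error when the microphone is unavailable; the bridge shown earlier does not do that as written, so you would need to add it. `presentOrFallback` and `VoiceLaunch` are hypothetical names.

```dart
/// Outcome of trying to open the voice assistant.
enum VoiceLaunch { opened, fellBack }

/// Runs [present]; if it fails, calls [onUnavailable] so the app can
/// show its regular tap UI instead of leaving a dead button.
Future<VoiceLaunch> presentOrFallback({
  required Future<void> Function() present,
  required void Function(Object error) onUnavailable,
}) async {
  try {
    await present();
    return VoiceLaunch.opened;
  } on Exception catch (error) {
    // Assumption: the native side reports a denied microphone (or a
    // missing plugin) by failing the channel call with an exception.
    onUnavailable(error);
    return VoiceLaunch.fellBack;
  }
}
```

A button handler could then call `presentOrFallback(present: Voqal.present, onUnavailable: (_) => showTapMenu())`, where `showTapMenu` stands in for whatever tap-based UI your app already has.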

FAQ

Is there an official Voqal Flutter package today? Not yet — it's on the roadmap. Voqal ships native iOS and Android SDKs and a React Native package now. For Flutter, the supported path today is the thin MethodChannel bridge shown above. The React Native integration is the closest pre-built wrapper if you'd rather start from existing cross-platform code.

How is this different from the speech_to_text Flutter plugin? `speech_to_text` gives you a transcribed string from the device recognizer and nothing more — you build intent detection, action mapping, confirmation, and UI yourself, and Arabic support leans on the OS (mostly MSA). Voqal returns a structured action plus a rendered assistant UI, and understands five Arabic dialects natively.

Which Arabic dialects does Voqal support? Egyptian, Gulf, Levantine, Maghrebi, and Iraqi, plus English. Recognition is dialect-aware automatically; spoken responses default to Modern Standard Arabic for clarity.

What latency should I expect? Recognition is sub-1-second. A full warm turn — speech in, action executed, response spoken — typically lands around 2.5–3s including network. Call prewarm() at launch so the first turn isn't cold.

Do I need my own backend or speech models? No. Voqal runs the speech, intent, and action pipeline for you. You supply a publishable key and a live user token via your delegate; the SDK handles the rest. See the docs for the request contract, or join the waitlist to get a key.
