There is a tax most MENA apps pay without ever seeing it on a balance sheet: the cost of asking Arabic-speaking users to type. On mobile, typing is already one of the worst sources of friction in any checkout flow. In Arabic, that friction is multiplied by script complexity, diacritics, right-to-left layout, code-switching, and the everyday reality that millions of users don't type in Arabic script at all; they type in Arabizi. Add it all up and the result is an outsized, largely invisible drag on checkout completion in a mobile-first market. Voice input removes that tax at the source: instead of fighting a keyboard, the user just says what they want.
The short answer
Mobile checkout abandonment is brutal everywhere. Mobile shoppers abandon at roughly 85%, far above desktop, and a major driver is simply how hard it is to enter information on a phone: [nearly 40% of mobile shoppers cite difficulty entering information](https://buildgrowscale.com/reduce-form-abandonment-mobile-checkout) as a reason they quit. In MENA, where commerce is overwhelmingly mobile-first, that input friction lands harder because typing Arabic on a touch keyboard carries problems Latin-script users never face. The fix isn't a cleverer form; it's removing the keyboard from the critical path. Voice input has been measured at roughly 3x the speed of mobile typing, and it bypasses script shaping, diacritics, and RTL cursor handling entirely.
Why typing Arabic on mobile is uniquely hard
The friction isn't one problem. It's a stack of them, and they compound.
The script and its shaping
Arabic is cursive: letters change shape depending on whether they sit at the start, middle, or end of a word, and several letters look nearly identical apart from dot placement. When an app's rendering pipeline isn't fully RTL-aware, text shaping breaks and the user sees disconnected, distorted, or reversed letters. The user is now proofreading their own input mid-checkout, the last place you want hesitation.
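Positional shaping is visible in Unicode itself. The sketch below is illustrative and not tied to any particular app: it uses Python's standard `unicodedata` module to show that the single letter beh (ب) has four legacy presentation-form code points, and that NFKC normalization folds each of them back to the base letter. The helper name `fold_presentation_forms` is an assumption made for this example.

```python
import unicodedata

BEH = "\u0628"  # ARABIC LETTER BEH, the base letter a keyboard emits
# Legacy presentation forms of beh: isolated, final, initial, medial.
BEH_FORMS = ["\ufe8f", "\ufe90", "\ufe91", "\ufe92"]


def fold_presentation_forms(text: str) -> str:
    """Map presentation-form characters back to base letters via NFKC.

    Note: NFKC also folds other compatibility characters (for example
    full-width Latin letters), so apply it only where that is acceptable.
    """
    return unicodedata.normalize("NFKC", text)


for form in BEH_FORMS:
    folded = fold_presentation_forms(form)
    print(f"U+{ord(form):04X}", unicodedata.name(form), "folds to base:", folded == BEH)
```

Text stored as base letters leaves the choice of shape to the rendering engine. Input that arrives already frozen into presentation forms, for example pasted from an older system, can be one source of the disconnected-looking letters described above.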
Diacritics (tashkeel)
Arabic's short vowels and related marks (fatha, damma, kasra, sukun, shadda, tanween) usually aren't typed in casual text. When precision matters, though, adding a diacritic typically means tapping the letter first and then the mark, and many default mobile keyboards offer no diacritic entry at all, so the user has to switch to a special keyboard. Every extra tap is a chance to lose the user.
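Because most users skip tashkeel, a form that compares or validates Arabic text probably should not care whether the marks are present. A minimal sketch, assuming the marks of interest are the Unicode range U+064B (fathatan) through U+0652 (sukun); the function name `strip_tashkeel` is hypothetical:

```python
import re

# Fathatan (U+064B) through sukun (U+0652): tanween, fatha, damma,
# kasra, shadda, and sukun all fall inside this range.
TASHKEEL = re.compile("[\u064b-\u0652]")


def strip_tashkeel(text: str) -> str:
    """Remove short-vowel and related marks so comparisons ignore them."""
    return TASHKEEL.sub("", text)


with_marks = "\u0645\u064f\u062d\u064e\u0645\u0651\u064e\u062f"  # the name Muhammad, fully marked
without_marks = "\u0645\u062d\u0645\u062f"  # the same name as usually typed

print(strip_tashkeel(with_marks) == without_marks)
```

Comparing the stripped forms means a user who never types a diacritic still matches a stored value that contains them, and the reverse.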
Right-to-left cursor behavior
RTL inverts the most basic motor habit in text entry. As input guides note, moving the cursor visually left in Arabic text moves it forward in logical (reading) order, the opposite of what happens in Latin script. Editing fields that mix Arabic with numbers (think addresses, card numbers, building numbers) regularly produces flipped, misordered, or copy-paste-corrupted text. Address and payment fields are exactly where this bites hardest.
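Two defensive checks follow from this. The sketch below is an assumption about how an app might handle it, not a prescribed method: it removes invisible bidirectional control characters that often ride along with pasted text, and it flags fields that mix right-to-left letters with digits or Latin letters, using the bidirectional classes exposed by Python's standard `unicodedata` module. The function names are hypothetical.

```python
import unicodedata

# Invisible direction controls that can survive copy-paste.
BIDI_CONTROLS = frozenset(
    "\u061c\u200e\u200f"              # Arabic letter mark, LRM, RLM
    "\u202a\u202b\u202c\u202d\u202e"  # embeddings, overrides, pop
    "\u2066\u2067\u2068\u2069"        # isolates
)


def strip_bidi_controls(text: str) -> str:
    """Drop invisible bidi control characters from user input."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)


def is_mixed_direction(text: str) -> bool:
    """True if the visible text mixes RTL letters with LTR letters or digits."""
    visible = strip_bidi_controls(text)
    classes = {unicodedata.bidirectional(ch) for ch in visible}
    has_rtl = bool(classes & {"R", "AL"})            # Hebrew / Arabic letters
    has_ltr_or_digits = bool(classes & {"L", "EN", "AN"})
    return has_rtl and has_ltr_or_digits


street = "\u0634\u0627\u0631\u0639"  # Arabic word for "street"
print(is_mixed_direction(street + " 12"))
print(is_mixed_direction("12 Main St"))
```

A form could use a check like `is_mixed_direction` to decide when to split a value into separate fields (street name in one, building number in another) so that the user never edits across a direction boundary.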
Arabizi and code-switching
Here's the part most product teams miss. A large share of Arabic speakers, especially younger users, simply don't type in Arabic script. They use [Arabizi—Arabic written in Latin letters and numerals](https://en.wikipedia.org/wiki/Arabizi), where digits stand in for sounds with no Latin equivalent (3 for ع, 7 for ح). This convention [emerged because early phones and the internet only supported Latin script](https://en.wikipedia.org/wiki/Arabizi), and it stuck: a Saudi study found that [even users with full access to an Arabic keyboard still choose Arabizi](https://en.wikipedia.org/wiki/Arabizi) as a marker of identity. On top of that, MENA users constantly code-switch—Arabic, English, and Arabizi in a single sentence—forcing repeated keyboard-language toggles mid-form. A form field that expects one script and one language is fighting the way the region actually communicates.
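To make the mismatch concrete, here is a rough heuristic for classifying what a user actually typed into a field. It is a sketch under stated assumptions, not a production detector: it only knows the two digit conventions mentioned above (3 and 7), treats any Latin-letter token containing one of those digits as Arabizi, and will misfire on strings such as product codes. All names are hypothetical.

```python
import re

# Only the two conventions cited above; real Arabizi uses more digits.
ARABIZI_DIGITS = {"3": "\u0639", "7": "\u062d"}  # ain, hah

ARABIC_LETTER = re.compile("[\u0621-\u064a]")
LATIN_LETTER = re.compile("[A-Za-z]")


def is_arabizi_token(token: str) -> bool:
    """Heuristic: Latin letters plus an Arabizi digit inside one token."""
    has_latin = bool(LATIN_LETTER.search(token))
    has_digit = any(digit in token for digit in ARABIZI_DIGITS)
    return has_latin and has_digit


def classify_input(text: str) -> str:
    """Return 'arabic', 'arabizi', 'latin', 'mixed', or 'other'."""
    tokens = text.split()
    found = []
    if any(ARABIC_LETTER.search(t) for t in tokens):
        found.append("arabic")
    if any(is_arabizi_token(t) for t in tokens):
        found.append("arabizi")
    if any(LATIN_LETTER.search(t) and not is_arabizi_token(t) for t in tokens):
        found.append("latin")
    if not found:
        return "other"
    return found[0] if len(found) == 1 else "mixed"


for sample in ["mar7aba", "hello", "mar7aba hello", "flat 37"]:
    print(sample, "->", classify_input(sample))
```

Logging a label like this per field, rather than rejecting input outright, would show how often users type in a script the form did not expect.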
A thin Arabic-content layer makes it worse
All of this happens against a backdrop where the digital Arabic content layer hasn't kept pace with smartphone adoption, so Arabic input experiences are routinely an afterthought bolted onto Latin-first designs.
The data table: Arabic input friction → conversion impact
| Arabic input friction | What the user experiences | Checkout / conversion impact |
|---|---|---|
| Cursive script + dot-only letter differences | Misreads, typos, re-checking own text | Hesitation and re-entry in name/address fields |
| Diacritics (tashkeel) | Multi-tap per mark; many keyboards lack support | Added taps per field; abandonment risk on long forms |
| RTL cursor inversion in mixed fields | Cursor jumps; flipped numbers | Errors in address & card-number fields; high-stakes drop-off |
| Arabizi vs. Arabic script | Users type Latin+numerals instead of script | Input mismatches forms expecting Arabic script |
| Code-switching (AR/EN/Arabizi) | Constant keyboard-language toggling | Friction stacks per field; momentum lost |
| Long mobile forms generally | 81% abandon if a form feels too long | Compounds every Arabic-specific cost above |
Why this lands so hard in MENA specifically
MENA is not a market where mobile is a secondary channel—it's the primary one. Internet penetration across the Arab world sits around 70%, roughly 348 million users, with Gulf markets like the UAE, Qatar, and Bahrain exceeding 90% smartphone penetration. Over half the region's population is under 30—precisely the cohort most likely to use Arabizi and to code-switch. So the input friction described above isn't an edge case affecting a sliver of traffic; it's the default condition for the majority of checkouts in a fast-growing, mobile-first commerce market.
Now layer in the universal mobile-checkout reality. Cart abandonment globally is around 70% by Baymard's measure, and mobile specifically runs much higher, at around 85% on phones. Trimming form fields alone is reported to deliver a 25–35% conversion lift on mobile. If shaving fields off a Latin-script form moves the needle that much, removing the entire Arabic-typing burden is plausibly a lever of the same order. That inference is where the "30–40% conversion tax" in this article's title comes from: it is an estimate of the compounded cost of script, diacritics, RTL, and Arabizi friction sitting on top of already-high mobile abandonment.
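To put the figures above on one scale, the arithmetic can be sketched as follows. Two assumptions are baked in and are not claims from the cited sources: the 25–35% lift is treated as a relative lift, and it is applied to the completion rate implied by 85% abandonment. The function name is hypothetical.

```python
def completion_after_lift(abandonment_rate: float, relative_lift: float) -> float:
    """Completion rate after a relative lift on the baseline completion rate."""
    baseline_completion = 1.0 - abandonment_rate
    return baseline_completion * (1.0 + relative_lift)


# Assumed inputs: 85% mobile abandonment, 25% and 35% relative lift.
baseline = 1.0 - 0.85
low = completion_after_lift(0.85, 0.25)
high = completion_after_lift(0.85, 0.35)

print(f"baseline completion: {baseline:.2%}")
print(f"after lift: {low:.2%} to {high:.2%}")
```

Under these assumptions a 15% completion rate rises to somewhere between 18.75% and 20.25%: a few percentage points in absolute terms, but a quarter to a third more completed checkouts.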
Voice removes the tax at the source
Every friction above is a property of typing. Remove typing from the critical path and the entire stack collapses. A Stanford/UW/Baidu study of English and Mandarin text entry measured smartphone voice input at ~3x the speed of typing, with a 20%+ lower error rate. For Arabic the gap is arguably wider, because the keyboard penalties voice skips (shaping, diacritics, RTL cursor inversion, script/Arabizi mismatch, language toggling) are precisely the ones that make Arabic typing slow in the first place.
There's a deeper alignment, too: people speak Arabic far more fluidly than they type it, and a good voice layer doesn't care whether the user would have typed in script or Arabizi—it just understands the spoken intent. That's why voice-to-action is a structurally better fit for MENA than yet another form optimization. (For the full picture on building voice that handles the region's spoken reality, see our Arabic voice SDK complete guide and our breakdown of Arabic dialects in voice recognition.)
The broader thesis—that your users would rather not type at all—is one we've argued before in [your users don't want to type](/resources/blog/your-users-dont-want-to-type), and the checkout-specific evidence is in [voice commerce checkout conversion for retail and delivery](/resources/blog/voice-commerce-checkout-conversion-retail-delivery). The flip side of input friction is confirmation friction; we covered how badly a clumsy confirm step bleeds users in how we lost 60% of users at the confirmation screen.
What a voice layer needs to actually work in MENA
Voice only removes the tax if it genuinely handles the region's speech: Modern Standard Arabic and dialects, English, and the natural code-switching between them—not a brittle Latin-first engine with Arabic bolted on. The quality of the underlying speech-to-text matters enormously here; we compare the options in the best Arabic voice / speech-to-text APIs for 2026. The implementation goal is to make voice a first-class input across iOS, Android, React Native, and Flutter, so a user can simply say "send 200 to my brother" or "track my order" instead of fighting a keyboard. See the Voqal docs for how that drops into an existing app.
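The step from a spoken phrase to an app action can be pictured with a toy example. This is not the Voqal SDK or its API; it is a hypothetical, English-only pattern matcher that assumes a speech-to-text layer has already produced a transcript. A real system would need dialect-aware recognition and proper language understanding rather than regular expressions.

```python
import re
from typing import Optional

# Hypothetical patterns for the two example phrases above.
SEND_MONEY = re.compile(
    r"^send (?P<amount>\d+(?:\.\d+)?) to (?P<recipient>.+)$", re.IGNORECASE
)
TRACK_ORDER = re.compile(r"^track my order$", re.IGNORECASE)


def parse_intent(transcript: str) -> Optional[dict]:
    """Map a transcript to a structured action, or None if unrecognized."""
    text = " ".join(transcript.split())  # trim and collapse whitespace
    match = SEND_MONEY.match(text)
    if match:
        return {
            "action": "send_money",
            "amount": float(match.group("amount")),
            "recipient": match.group("recipient"),
        }
    if TRACK_ORDER.match(text):
        return {"action": "track_order"}
    return None


print(parse_intent("send 200 to my brother"))
print(parse_intent("track my order"))
```

The point of the structure is that the app receives an action with typed parameters, so no form field ever has to be filled in by keyboard.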
Frequently asked questions
Why is typing Arabic on a phone harder than typing English?
Arabic is a cursive, right-to-left script whose letters change shape by position, and several letters are distinguished only by dots. Precise text can also require multi-tap diacritic entry that many mobile keyboards don't support. RTL also inverts cursor movement, which makes mixed text-and-number fields like addresses and card numbers prone to flipped or misordered input.
What is Arabizi and why does it break checkout forms?
Arabizi is Arabic written in Latin letters and numerals (e.g., 3 = ع, 7 = ح). Many users, even those with an Arabic keyboard, type this way out of habit and as a marker of identity. Forms built to expect Arabic script (or clean English) don't match what users actually enter, creating validation errors and drop-off.
How much does input friction actually cost at checkout?
Mobile checkout abandonment runs around 85%, and nearly 40% of mobile shoppers cite difficulty entering information. Simply reducing form fields can lift conversion 25–35%—so removing the heavier Arabic-typing burden is a lever of comparable magnitude.
Is voice input really faster than typing?
Yes. A Stanford-led study found smartphone voice input is about 3x faster than typing with a 20%+ lower error rate. The advantage is arguably larger in Arabic because voice skips exactly the steps (script shaping, diacritics, RTL, script/Arabizi mismatch) that slow Arabic typing.
Does this only matter in the Gulf, or across MENA?
Across MENA. Internet penetration in the Arab world is around 70% (~348M users), Gulf smartphone penetration exceeds 90%, and over half the population is under 30—the cohort most likely to use Arabizi and code-switch. It's the default condition, not an edge case.
How do I add voice to my existing app?
Voqal provides a voice-to-actions SDK supporting Arabic and English across iOS, Android, React Native, and Flutter. Start with the documentation or join the waitlist to get early access.
The bottom line
The Arabic keyboard tax is real, it's compounding, and it's mostly invisible in standard analytics because it shows up as generic "mobile abandonment." But the underlying cause—asking users to fight script, diacritics, RTL, and Arabizi on a touch keyboard—is specific, and so is the remedy. Stop optimizing the form. Remove it from the critical path with voice. Join the waitlist or read the docs to see what that looks like in your app.