Migrating from anyascii¶
disarm's transliterate() replaces anyascii for Unicode-to-ASCII conversion.
Quick migration¶
# Before
from anyascii import anyascii
result = anyascii("café")
# After
from disarm import transliterate
result = transliterate("café")
Or use the compatibility alias:
from disarm import unidecode as anyascii
result = anyascii("café")
API comparison¶
| anyascii | disarm | Notes |
|---|---|---|
| anyascii(s) | transliterate(s) | |
| (none) | transliterate(s, lang="de") | New: language profiles |
| (none) | transliterate(s, errors="ignore") | New: error modes |
| (none) | transliterate(s, errors="preserve") | New: preserve unmapped |
Behavioral differences¶
Transliteration approach¶
anyascii and disarm both transliterate Unicode to ASCII, but they use different lookup tables. Core Latin-script mappings are very similar; edge cases may differ. A character-level comparison across all 83 supported languages shows:
- 49,089 codepoints tested across all Unicode blocks, exhaustively rather than by sampling
- 48,415 codepoints mapped by disarm versus 48,761 by anyascii
- anyascii has broader coverage of some extended script blocks; disarm maps 1,136 characters that anyascii does not and adds language-aware romanization through its 83 language profiles
- Most differences are systematic: CJK pinyin casing, Korean romanization, and language-specific national standards
from anyascii import anyascii
from disarm import transliterate

# Common cases: identical output
assert anyascii("café") == 'cafe'
assert transliterate("café") == 'cafe'

# CJK: romanization style may differ
anyascii("北京")  # romanization varies
assert transliterate("北京") == 'bei jing'
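A comparison of the kind summarised above could be reproduced with a small harness. The following is a minimal sketch, not the script behind the published figures: `compare_coverage` is a hypothetical helper, and it assumes a codepoint counts as "mapped" when the transliterator returns a non-empty string. Two toy lookup tables stand in for the real libraries so the block runs on its own.

```python
from typing import Callable, Dict, Iterable


def compare_coverage(
    first: Callable[[str], str],
    second: Callable[[str], str],
    codepoints: Iterable[int],
) -> Dict[str, int]:
    """Count how two transliterators treat each codepoint."""
    stats = {"tested": 0, "first_mapped": 0, "second_mapped": 0,
             "only_first": 0, "only_second": 0, "differ": 0}
    for cp in codepoints:
        ch = chr(cp)
        a, b = first(ch), second(ch)
        stats["tested"] += 1
        stats["first_mapped"] += bool(a)   # assumption: non-empty means mapped
        stats["second_mapped"] += bool(b)
        if a and not b:
            stats["only_first"] += 1
        elif b and not a:
            stats["only_second"] += 1
        elif a != b:
            stats["differ"] += 1           # both mapped, different output
    return stats


# Toy stand-ins so the sketch runs without either library installed.
table_a = {"é": "e", "ö": "o", "北": "bei "}
table_b = {"é": "e", "ö": "oe", "♠": "spades"}
report = compare_coverage(
    lambda ch: table_a.get(ch, ""),
    lambda ch: table_b.get(ch, ""),
    [ord(c) for c in "éö北♠"],
)
print(report)
```

With both libraries installed, the two callables would presumably be `anyascii` and something like `lambda ch: transliterate(ch, errors="ignore")`, applied to whichever codepoint range is of interest.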
Language awareness¶
anyascii has no language parameter. disarm provides 83 language-specific profiles:
from disarm import transliterate
# anyascii can't do this
assert transliterate("München", lang="de") == 'Muenchen'
assert transliterate("Malmö", lang="sv") == 'Malmoe'
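To make the idea of a language profile concrete, here is a toy model in which a per-language override table is consulted before a generic fallback. This is only an illustration of the concept and makes no claim about how disarm is implemented: `LANG_OVERRIDES`, `_generic`, and `transliterate_with_profile` are hypothetical names, and the table entries are illustrative rather than copied from disarm's profiles.

```python
import unicodedata
from typing import Optional

# Hypothetical override tables (illustrative entries only).
LANG_OVERRIDES = {
    "de": {"ä": "ae", "ö": "oe", "ü": "ue",
           "Ä": "Ae", "Ö": "Oe", "Ü": "Ue", "ß": "ss"},
    "sv": {"ö": "oe", "Ö": "Oe"},
}


def _generic(ch: str) -> str:
    """Language-neutral fallback: strip accents via NFKD decomposition."""
    decomposed = unicodedata.normalize("NFKD", ch)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def transliterate_with_profile(text: str, lang: Optional[str] = None) -> str:
    """Apply language-specific overrides first, then the generic fallback."""
    overrides = LANG_OVERRIDES.get(lang, {})
    # Normalise to NFC so precomposed letters match the override keys.
    text = unicodedata.normalize("NFC", text)
    return "".join(overrides.get(ch, _generic(ch)) for ch in text)


print(transliterate_with_profile("München"))             # generic path
print(transliterate_with_profile("München", lang="de"))  # German override
```

The point of the layering is that a profile only needs to list the characters where a language's convention departs from the generic table.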
Error handling¶
anyascii silently drops characters with no mapping. disarm gives you control:
from disarm import transliterate
assert transliterate("♠", errors="replace", replace_with="?") == '?'
assert transliterate("♠", errors="ignore") == ''
assert transliterate("♠", errors="preserve") == '♠'
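The three modes above can be modelled in a few lines. The sketch below is a toy, assuming a character is "unmapped" when a simple accent-stripping lookup yields nothing; `_lookup` and `transliterate_with_errors` are hypothetical names, and the real library's tables and defaults may differ. The `errors` argument is deliberately required here so the sketch does not imply any particular default.

```python
import unicodedata

_MODES = {"replace", "ignore", "preserve"}


def _lookup(ch: str) -> str:
    """Toy mapping: ASCII passes through, accented Latin loses its accent."""
    if ch.isascii():
        return ch
    decomposed = unicodedata.normalize("NFKD", ch)
    return "".join(c for c in decomposed if c.isascii())


def transliterate_with_errors(text: str, *, errors: str,
                              replace_with: str = "?") -> str:
    """Transliterate text, handling unmapped characters per `errors`."""
    if errors not in _MODES:
        raise ValueError(f"unknown errors mode: {errors!r}")
    out = []
    for ch in unicodedata.normalize("NFC", text):
        mapped = _lookup(ch)
        if mapped:
            out.append(mapped)
        elif errors == "replace":
            out.append(replace_with)   # substitute a visible marker
        elif errors == "preserve":
            out.append(ch)             # keep the original character
        # errors == "ignore": drop the character entirely
    return "".join(out)


for mode in ("replace", "ignore", "preserve"):
    print(mode, repr(transliterate_with_errors("a♠é", errors=mode)))
```

When migrating code that relied on unmapped characters disappearing, `errors="ignore"` is the mode that keeps that behavior.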
New features in disarm¶
Beyond basic transliteration, disarm also provides:
- slugify(): URL slug generation
- sanitize_filename(): OS-safe filenames
- normalize_confusables(): homoglyph normalization
- TextPipeline: composable text processing
- strip_accents(), fold_case(): granular text cleaning
All in a single package with one consistent API.