Getting Started
To use fine-grained control, use the SDK, the API, or the Playground.
SDK/API: Phoneme tags are preserved by text normalization, so you can keep the default normalization behavior for pronunciation control. Set "normalize": false only when you want to prevent normalization from rewriting the surrounding text, such as numbers, dates, or URLs.
Playground: Select the V1.6 Control Model. No other options need to be set.
Disabling normalization may reduce the stability of reading numbers, dates,
and URLs. You’ll need to handle these cases manually for best results.
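As a minimal sketch of the SDK/API option above, the snippet below builds a request body with normalization turned off. Only the "normalize" field comes from this page. The "text" field name and the build_payload helper are assumptions made for illustration, so check the API reference for the actual request schema.

```python
import json


def build_payload(text: str, normalize: bool = True) -> dict:
    """Build a hypothetical TTS request body.

    Only "normalize" is documented on this page; "text" is an assumed field name.
    """
    return {"text": text, "normalize": normalize}


# With normalization off, spell out numbers and dates the way they should be read.
payload = build_payload(
    "The meeting is on March third at two thirty.",
    normalize=False,
)
print(json.dumps(payload))
```

Leaving normalize at its default keeps the built-in handling of numbers, dates, and URLs, which is the safer choice unless the surrounding text must reach the model unchanged.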
Phoneme Control
Phoneme control allows you to specify exact pronunciations for words, characters, or short phrases. Wrap the desired pronunciation in <|phoneme_start|> and <|phoneme_end|> tags.
The replacement scope depends on the language:
- English: replace one word with CMU Arpabet.
- Chinese: replace one character or syllable with tone-number pinyin.
- Japanese: replace a short Japanese word or phrase with OpenJTalk-style romaji and pitch accent markers.
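The English case can be sketched as a small helper that swaps one word for its tagged pronunciation. The tag strings come from this page. The helper names tag_phonemes and replace_word are hypothetical, and the Arpabet string for the past tense of "read" is written with CMUdict-style stress digits, which is an assumption about what the model expects.

```python
import re

PHONEME_START = "<|phoneme_start|>"
PHONEME_END = "<|phoneme_end|>"


def tag_phonemes(pronunciation: str) -> str:
    """Wrap a pronunciation string in the phoneme tags."""
    return f"{PHONEME_START}{pronunciation.strip()}{PHONEME_END}"


def replace_word(text: str, word: str, pronunciation: str, count: int = 1) -> str:
    """Replace whole-word occurrences of `word` with a tagged pronunciation.

    Hypothetical helper: by default only the first match is replaced.
    """
    pattern = re.compile(rf"\b{re.escape(word)}\b")
    # A function replacement avoids backslash-escape handling in the pronunciation.
    return pattern.sub(lambda _match: tag_phonemes(pronunciation), text, count=count)


sentence = "I read the report yesterday."
tagged = replace_word(sentence, "read", "R EH1 D")  # past tense, rhymes with "red"
print(tagged)
# I <|phoneme_start|>R EH1 D<|phoneme_end|> the report yesterday.
```

The same wrapping applies to Chinese and Japanese, but the replacement unit differs (one character or syllable, or a short word or phrase), so the whole-word regex here is only suitable for English.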
English
CMU Arpabet examples for names, homographs, acronyms, and technical terms.
Chinese
Tone-number pinyin examples for multi-character words, tones, and polyphonic
characters.
Japanese
OpenJTalk romaji phonemes with pitch accent digits.
Quick Examples
English:
Paralanguage
Paralanguage controls allow you to add natural speech elements and pauses to make the generated speech sound more human-like. There are two main types of controls:
Pause Words
You can use common pause words like “um”, “uh”, “嗯”, “啊” to control the rhythm of the speech.
Special Effects
The following special effects can be added using parentheses:

| Effect | Description | First Available | Stage |
|---|---|---|---|
| (break) | Short pause | V1.6 | Experimental |
| (long-break) | Extended pause | V1.6 | Experimental |
| (breath) | Breathing sound | V1.6 | Experimental |
| (laugh) | Laughter sound | V1.6 | Experimental |
| (cough) | Coughing sound | V1.6 | Experimental |
| (lip-smacking) | Lip smacking sound | V1.6 | Experimental |
| (sigh) | Sighing sound | V1.6 | Experimental |
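Because these effects are written as plain parenthesized words, a typo is easy to miss. The sketch below checks a script for parenthesized tokens that are not in the table above. The effect names are taken from the table, while the unknown_effects helper and the matching rule (lowercase letters and hyphens inside parentheses) are assumptions for illustration.

```python
import re
from typing import List

# Effect names listed in the table above.
KNOWN_EFFECTS = {
    "break", "long-break", "breath", "laugh",
    "cough", "lip-smacking", "sigh",
}

# Matches tokens such as "(laugh)" or "(long-break)".
_TAG = re.compile(r"\(([a-z][a-z-]*)\)")


def unknown_effects(text: str) -> List[str]:
    """Return parenthesized tokens that are not documented effects."""
    return [name for name in _TAG.findall(text) if name not in KNOWN_EFFECTS]


script = (
    "Well (breath) I did not expect that. (laugh) "
    "Give me a second (long-break) okay. (giggle)"
)
print(unknown_effects(script))
# ['giggle']
```

Ordinary parenthetical asides that happen to be a single lowercase word would also be flagged, so treat the result as a hint to review rather than an error.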




