BasharBotV2/ARABIC_TTS.md
2025-12-05 23:27:43 -05:00

Arabic Text-to-Speech Feature

Overview

The bot now speaks its responses in Arabic using Coqui TTS, an open-source neural text-to-speech toolkit.

Features

  • Verbal Responses: Bot speaks in Arabic when executing commands
  • Lightweight Model: Uses tts_models/ar/cv/vits - a fast VITS-based Arabic model
  • Automatic Fallback: Falls back to pyttsx3 if Arabic TTS fails
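The fallback behaviour can be pictured as trying a list of engines in order. The sketch below is not the bot's actual code: `synthesize_with_fallback`, `coqui_synth`, and `pyttsx3_synth` are hypothetical names, and the only library calls used are `TTS(model_name=...)` / `tts_to_file` from Coqui and `init` / `save_to_file` / `runAndWait` from pyttsx3.

```python
def coqui_synth(text, out_path):
    # Lazy import so the fallback still works when Coqui TTS is not installed.
    from TTS.api import TTS
    # Loads the model on every call for brevity; a real bot would load it once.
    TTS(model_name="tts_models/ar/cv/vits").tts_to_file(text=text, file_path=out_path)


def pyttsx3_synth(text, out_path):
    import pyttsx3
    engine = pyttsx3.init()
    engine.save_to_file(text, out_path)
    engine.runAndWait()


def synthesize_with_fallback(text, out_path, engines):
    """Try each (name, synth) pair in order; return the name of the first that succeeds."""
    errors = []
    for name, synth in engines:
        try:
            synth(text, out_path)
            return name
        except Exception as exc:  # any engine failure triggers the next engine
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all TTS engines failed: " + "; ".join(errors))


# Assumed default order: Arabic model first, pyttsx3 second.
DEFAULT_ENGINES = [("coqui", coqui_synth), ("pyttsx3", pyttsx3_synth)]
```

Calling `synthesize_with_fallback("حسناً", "reply.wav", DEFAULT_ENGINES)` would return `"coqui"` when the Arabic model works and `"pyttsx3"` when it does not.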

Verbal Responses

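| Command | Arabic Response | Translation |
|---------|-----------------|-------------|
| join | نعم، أنا هنا | "Yes, I am here" |
| leave | مع السلامة | "Goodbye" |
| play | حسناً | "Okay" |
| skip | التالي | "Next" |
| stop | توقف | "Stop" |
| unknown | ماذا تريد؟ | "What do you want?" |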
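The table maps naturally onto a dictionary lookup with a default. A minimal sketch, assuming the keys match the command names above and that unrecognised commands resolve to the "unknown" entry (`response_for` is a hypothetical helper, not a function from the bot):

```python
VERBAL_RESPONSES = {
    "join": "نعم، أنا هنا",
    "leave": "مع السلامة",
    "play": "حسناً",
    "skip": "التالي",
    "stop": "توقف",
    "unknown": "ماذا تريد؟",
}


def response_for(command: str) -> str:
    """Return the Arabic phrase for a command, or the 'unknown' phrase otherwise."""
    return VERBAL_RESPONSES.get(command, VERBAL_RESPONSES["unknown"])
```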

Configuration

Environment Variables

# Enable/disable Arabic TTS
USE_ARABIC_TTS=true

# TTS model to use (default is lightweight Arabic VITS)
ARABIC_TTS_MODEL=tts_models/ar/cv/vits
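One way these variables might be read at startup is sketched below. `TTSConfig` and `load_tts_config` are hypothetical names; treating a missing `USE_ARABIC_TTS` as disabled follows the "Disabling Arabic TTS" section, while accepting `1`, `yes`, and `on` as true is an assumption.

```python
import os
from typing import Mapping, NamedTuple, Optional


class TTSConfig(NamedTuple):
    enabled: bool
    model: str


def load_tts_config(env: Optional[Mapping[str, str]] = None) -> TTSConfig:
    """Read the Arabic TTS settings from the environment (or a supplied mapping)."""
    env = os.environ if env is None else env
    raw = env.get("USE_ARABIC_TTS", "false").strip().lower()
    enabled = raw in {"1", "true", "yes", "on"}
    model = env.get("ARABIC_TTS_MODEL", "tts_models/ar/cv/vits")
    return TTSConfig(enabled=enabled, model=model)
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to exercise in tests.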

Available Arabic Models

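Coqui TTS provides several Arabic models. The default is optimized for speed. Model availability can differ between releases, so run tts --list_models to confirm that an identifier exists in your installed version: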

  1. tts_models/ar/cv/vits (Default - Recommended)

    • Fast inference
    • Good quality
    • Small model size (~50MB)
    • Based on Common Voice dataset
  2. tts_models/ar/cv/glow-tts

    • Alternative model
    • Slightly different voice characteristics

Installation

Arabic TTS support is provided by the Coqui TTS package, which is installed with:

pip install TTS==0.22.0

On first run, the model will be downloaded automatically (~50MB).

Usage

Once enabled, the bot will automatically speak responses when:

  • Joining a voice channel
  • Leaving a voice channel
  • Playing, skipping, or stopping music
  • Receiving unknown commands

No additional commands are needed; the responses play automatically.

Performance

  • Model Load Time: ~2-3 seconds on first use
  • Inference Time: ~0.5-1 second per response
  • Memory Usage: ~200MB additional RAM
  • Disk Space: ~50MB for model files
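Because the load cost is paid on first use, it helps to build the model once and reuse it for every response. The following is a sketch of that pattern, not the bot's actual code: `LazyModel` and `load_coqui_model` are hypothetical names, and only `TTS(model_name=...)` is a Coqui call.

```python
import time


class LazyModel:
    """Build an expensive object on first use and reuse it afterwards."""

    def __init__(self, factory):
        self._factory = factory
        self._model = None
        self.load_seconds = None  # filled in after the first load

    def get(self):
        if self._model is None:
            start = time.perf_counter()
            self._model = self._factory()
            self.load_seconds = time.perf_counter() - start
        return self._model


def load_coqui_model(model_name="tts_models/ar/cv/vits"):
    # Imported lazily so this module still imports where Coqui TTS is absent.
    from TTS.api import TTS
    return TTS(model_name=model_name)


# Nothing is loaded until arabic_tts.get() is called for the first time.
arabic_tts = LazyModel(load_coqui_model)
```

After the first `arabic_tts.get()`, `arabic_tts.load_seconds` gives a measured load time to compare against the figures above on your own hardware.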

Disabling Arabic TTS

To disable Arabic TTS and use only English TTS:

USE_ARABIC_TTS=false

Or remove the environment variable entirely.

Troubleshooting

Model Download Fails

If the model fails to download:

  1. Check your internet connection
  2. Trigger the download manually: tts --model_name tts_models/ar/cv/vits --text "test"
  3. Check the model cache; on Linux, models are stored in ~/.local/share/tts/
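To check whether a download actually landed in the cache, something like the sketch below may help. It assumes the Linux cache location given above and assumes that Coqui names each model directory by replacing `/` with `--` in the model identifier; verify both against your installation. `cached_model_dir` and `is_model_cached` are hypothetical helpers.

```python
from pathlib import Path


def cached_model_dir(model_name, cache_root=None):
    """Return the directory where the model is expected to be cached."""
    root = Path(cache_root) if cache_root else Path.home() / ".local" / "share" / "tts"
    # Assumption: "tts_models/ar/cv/vits" is stored as "tts_models--ar--cv--vits".
    return root / model_name.replace("/", "--")


def is_model_cached(model_name, cache_root=None):
    """True if the expected cache directory exists and contains at least one entry."""
    model_dir = cached_model_dir(model_name, cache_root)
    return model_dir.is_dir() and any(model_dir.iterdir())
```

An empty or missing directory suggests the download was interrupted and should be retried.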

Audio Quality Issues

  • Ensure FFmpeg is properly installed
  • Check Discord voice bitrate settings
  • Try a different model from the list above
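A quick way to confirm the first point is to look for the FFmpeg executable on the PATH and ask it for its version. This is a small diagnostic sketch using only the standard library; `ffmpeg_version` is a hypothetical helper name.

```python
import shutil
import subprocess


def ffmpeg_version():
    """Return the first line of `ffmpeg -version`, or None if FFmpeg is not found."""
    exe = shutil.which("ffmpeg")
    if exe is None:
        return None
    result = subprocess.run([exe, "-version"], capture_output=True, text=True, check=False)
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    print(ffmpeg_version() or "FFmpeg was not found on PATH")
```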

High CPU Usage

The VITS model is already optimized for CPU. If it is still too heavy:

  1. Set USE_ARABIC_TTS=false so the bot uses the pyttsx3 fallback instead
  2. Consider running on a more powerful machine

Customization

To add more responses, edit bot.py:

VERBAL_RESPONSES = {
    "join": "نعم، أنا هنا",
    "your_command": "your arabic text here",
}

Then call speak_response where the command is handled:

await speak_response(state.voice_client, "your_command")
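For orientation, here is a sketch of what a function like `speak_response` could look like. It is not the bot's implementation: the keyword arguments `synthesize` and `make_source` are hypothetical additions that inject the TTS engine and the audio source factory (with discord.py, `make_source` would typically be `discord.FFmpegPCMAudio`), and the `"pause"` entry is an illustrative new command.

```python
import asyncio
import os
import tempfile

VERBAL_RESPONSES = {
    "join": "نعم، أنا هنا",
    "pause": "انتظر",  # hypothetical new command: "Wait"
}


async def speak_response(voice_client, command, *, synthesize, make_source):
    """Synthesize the phrase for `command` and play it; return False if nothing is played."""
    text = VERBAL_RESPONSES.get(command)
    if text is None or voice_client is None:
        return False

    # Reserve a temporary file name for the synthesized audio.
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
        wav_path = tmp.name

    # Run the blocking synthesis in a worker thread so the event loop stays responsive.
    await asyncio.to_thread(synthesize, text, wav_path)

    def cleanup(error):
        # Invoked by the voice client once playback ends (or fails).
        if os.path.exists(wav_path):
            os.unlink(wav_path)

    voice_client.play(make_source(wav_path), after=cleanup)
    return True
```

Injecting `synthesize` and `make_source` keeps the function testable without a live voice connection; in the bot itself they could be bound once, for example with `functools.partial`.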