# Arabic Text-to-Speech Feature

## Overview

The bot speaks short Arabic responses through Coqui TTS, an open-source text-to-speech toolkit, using a compact VITS model that runs locally.

## Features

- **Verbal Responses**: Bot speaks in Arabic when executing commands
- **Lightweight Model**: Uses `tts_models/ar/cv/vits` - a fast VITS-based Arabic model
- **Automatic Fallback**: Falls back to pyttsx3 if Arabic TTS fails
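
The automatic fallback can be pictured as a try/except around the primary engine. The sketch below is an illustration only: `speak_with_fallback`, `arabic_engine`, and `fallback_engine` are hypothetical names, and the actual code in `bot.py` may be organized differently.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def speak_with_fallback(
    text: str,
    arabic_engine: Callable[[str], str],
    fallback_engine: Callable[[str], str],
) -> str:
    """Return the audio path produced by the first engine that succeeds.

    Both engines are hypothetical callables that take text and return
    the path of a synthesized audio file.
    """
    try:
        return arabic_engine(text)
    except Exception as exc:  # any model-loading or synthesis error
        logger.warning("Arabic TTS failed (%s); using fallback engine", exc)
        return fallback_engine(text)
```

Passing the engines in as callables keeps the fallback logic independent of Coqui TTS and pyttsx3, so it can be exercised without either package installed.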

## Verbal Responses

| Command | Arabic Response | Translation |
|---------|----------------|-------------|
| join    | نعم، أنا هنا    | "Yes, I am here" |
| leave   | مع السلامة     | "Goodbye" |
| play    | حسناً          | "Okay" |
| skip    | التالي         | "Next" |
| stop    | توقف           | "Stop" |
| unknown | ماذا تريد؟     | "What do you want?" |
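
The table above maps naturally onto a dictionary lookup that falls back to the `unknown` phrase. This is a minimal sketch: `VERBAL_RESPONSES` is the name used in the Customization section, while `response_for` is a hypothetical helper, and treating `unknown` as a dictionary key is an assumption about how the bot stores it.

```python
# Phrases copied from the table above.
VERBAL_RESPONSES = {
    "join": "نعم، أنا هنا",
    "leave": "مع السلامة",
    "play": "حسناً",
    "skip": "التالي",
    "stop": "توقف",
    "unknown": "ماذا تريد؟",
}


def response_for(command: str) -> str:
    """Return the Arabic phrase for a command (hypothetical helper).

    Commands without an entry get the "unknown" phrase.
    """
    return VERBAL_RESPONSES.get(command, VERBAL_RESPONSES["unknown"])


if __name__ == "__main__":
    print(response_for("skip"))
    print(response_for("dance"))  # not in the table, so the "unknown" phrase
```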

## Configuration

### Environment Variables

```bash
# Enable/disable Arabic TTS
USE_ARABIC_TTS=true

# TTS model to use (default is lightweight Arabic VITS)
ARABIC_TTS_MODEL=tts_models/ar/cv/vits
```
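
One way the bot might read these variables is sketched below. `env_flag` is a hypothetical helper, and the defaults are assumptions: the feature is treated as off when `USE_ARABIC_TTS` is unset, which matches the note under "Disabling Arabic TTS".

```python
import os


def env_flag(name: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean (hypothetical helper)."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# Assumed defaults: feature off when unset, default model from this document.
USE_ARABIC_TTS = env_flag("USE_ARABIC_TTS")
ARABIC_TTS_MODEL = os.environ.get("ARABIC_TTS_MODEL", "tts_models/ar/cv/vits")
```

Accepting a few spellings (`true`, `1`, `yes`, `on`) avoids surprises when the value comes from a `.env` file or a container configuration.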

### Available Arabic Models

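The following model names can be set in `ARABIC_TTS_MODEL`. The default is chosen for speed. Run `tts --list_models` to confirm which models your installed Coqui TTS version offers: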

1. **tts_models/ar/cv/vits** (Default - Recommended)
   - Fast inference
   - Good quality
   - Small model size (~50MB)
   - Based on Common Voice dataset

2. **tts_models/ar/cv/glow-tts**
   - Alternative model
   - Slightly different voice characteristics

## Installation

Arabic TTS requires the Coqui `TTS` package, which can be installed with:

```bash
pip install TTS==0.22.0
```

On first run, the model will be downloaded automatically (~50MB).
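
For reference, a minimal sketch of loading the model and synthesizing a phrase with the Coqui Python API is shown below. `get_tts` and `synthesize_to_file` are hypothetical helper names, not the bot's own functions. The `TTS` import is deferred so that nothing is downloaded or loaded until the first call.

```python
import functools

DEFAULT_MODEL = "tts_models/ar/cv/vits"  # model name taken from this document


@functools.lru_cache(maxsize=1)
def get_tts(model_name: str = DEFAULT_MODEL):
    """Load the model once and reuse it (hypothetical helper)."""
    from TTS.api import TTS  # deferred: importing and loading is slow

    return TTS(model_name=model_name)


def synthesize_to_file(text: str, out_path: str, model_name: str = DEFAULT_MODEL) -> str:
    """Synthesize text to a WAV file and return its path."""
    get_tts(model_name).tts_to_file(text=text, file_path=out_path)
    return out_path


if __name__ == "__main__":
    # Triggers the model download on first run.
    synthesize_to_file("نعم، أنا هنا", "join.wav")
```

Caching the loaded model means the load cost is paid once per process instead of once per response.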

## Usage

Once enabled, the bot will automatically speak responses when:
- Joining a voice channel
- Leaving a voice channel
- Playing, skipping, or stopping music
- Receiving unknown commands

No additional commands are needed.

## Performance

- **Model Load Time**: ~2-3 seconds on first use
- **Inference Time**: ~0.5-1 second per response
- **Memory Usage**: ~200MB additional RAM
- **Disk Space**: ~50MB for model files
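
These figures depend on hardware, so it can be worth measuring them on the target machine. The helper below is a small sketch for that purpose. `timed` is a hypothetical name, and the function you pass in would be whatever loads the model or synthesizes a response in your setup.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in workload; replace with a model load or synthesis call.
    _, seconds = timed(sum, range(1_000_000))
    print(f"elapsed: {seconds:.3f}s")
```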

## Disabling Arabic TTS

To disable Arabic TTS and use only English TTS (pyttsx3), set:

```bash
USE_ARABIC_TTS=false
```

Or remove the environment variable entirely.

## Troubleshooting

### Model Download Fails

If the model fails to download:
1. Check your internet connection
2. Trigger the download manually: `tts --model_name tts_models/ar/cv/vits --text "test"`
3. Check the model cache, which is `~/.local/share/tts/` on Linux, for a missing or partial download
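
A quick way to check the cache from Python is sketched below. The function names are hypothetical, the default path is the Linux location given above, and the `--`-joined folder name is an assumption about how Coqui TTS lays out its cache, so verify it against your own cache directory.

```python
from pathlib import Path
from typing import Optional, Union


def model_cache_dir(model_name: str, cache_root: Optional[Union[str, Path]] = None) -> Path:
    """Return the folder where a model is expected to be cached.

    Assumes folders are named after the model with "/" replaced by "--".
    """
    root = Path(cache_root) if cache_root is not None else Path.home() / ".local" / "share" / "tts"
    return root / model_name.replace("/", "--")


def is_model_cached(model_name: str, cache_root: Optional[Union[str, Path]] = None) -> bool:
    """True if the expected folder exists and is not empty."""
    path = model_cache_dir(model_name, cache_root)
    return path.is_dir() and any(path.iterdir())


if __name__ == "__main__":
    name = "tts_models/ar/cv/vits"
    print(model_cache_dir(name), is_model_cached(name))
```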

### Audio Quality Issues

- Ensure FFmpeg is properly installed
- Check Discord voice bitrate settings
- Try a different model from the list above

### High CPU Usage

The VITS model is already optimized for CPU inference. If it is still too heavy:
1. Set `USE_ARABIC_TTS=false` to use the pyttsx3 fallback instead
2. Consider running on a more powerful machine

## Customization

To add more responses, extend the `VERBAL_RESPONSES` dictionary in `bot.py`:

```python
VERBAL_RESPONSES = {
    # Existing entry ("Yes, I am here")
    "join": "نعم، أنا هنا",
    # New entry: command name mapped to the Arabic phrase to speak
    "your_command": "your arabic text here",
}
```

Then call `speak_response` with the new key from the command's handler:

```python
await speak_response(state.voice_client, "your_command")
```