# Voice Receiving Setup Guide

## discord.py vs py-cord
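
**py-cord** supports voice receiving through `discord.sinks`. Upstream **discord.py 2.0+** does not ship a `discord.sinks` module; voice receiving there is provided by an extension such as discord-ext-voice-recv. Your bot now uses discord.py.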
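
Which receive API is available depends on the package that actually provides `discord` in your environment, so it can help to check before relying on `discord.sinks`. A minimal sketch using only the standard library; the helper name `has_module` is illustrative and not part of either library:

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if module `name` can be found.

    For dotted names, parent packages are imported to perform the lookup.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. `discord`) is missing or broken.
        return False


if __name__ == "__main__":
    print("discord installed:", has_module("discord"))
    print("discord.sinks available:", has_module("discord.sinks"))
```

If the second line prints `False`, the sink-based calls shown later in this guide will not be importable from the installed package.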

## Key Changes Made

1. **Switched to discord.py** - More actively maintained, better voice support
2. **Added opuslib** - Required for voice receiving on Windows
3. **Simplified connection logic** - Let the library handle reconnection internally

## Installation Steps

### 1. Install Opus (Windows)

```powershell
# Using Chocolatey (recommended)
choco install opus-tools -y

# Or download manually from:
# https://opus-codec.org/downloads/
```

### 2. Reinstall Python dependencies

```bash
pip uninstall py-cord discord.py -y
pip install -r requirements.txt
```

### 3. Set Opus path (if needed)

If Opus still doesn't load, add the library path to your `.env`:

```
OPUS_LIB=C:\path\to\opus.dll
```

Common locations:
- `C:\ProgramData\chocolatey\lib\opus-tools\tools\opus.dll`
- `C:\Windows\System32\opus.dll`
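
One way a bot could use this setting is to check `OPUS_LIB` first and then fall back to the common locations. The sketch below only resolves the path; `find_opus_library` and `COMMON_LOCATIONS` are hypothetical names, and how your bot actually reads `.env` is not shown in this guide.

```python
import os
from pathlib import Path
from typing import Iterable, Mapping, Optional

# Locations listed in this guide; adjust for your machine.
COMMON_LOCATIONS = (
    r"C:\ProgramData\chocolatey\lib\opus-tools\tools\opus.dll",
    r"C:\Windows\System32\opus.dll",
)


def find_opus_library(
    env: Mapping[str, str] = os.environ,
    candidates: Iterable[str] = COMMON_LOCATIONS,
) -> Optional[Path]:
    """Return the first existing Opus library path, preferring OPUS_LIB."""
    configured = env.get("OPUS_LIB")
    search = ([configured] if configured else []) + list(candidates)
    for raw in search:
        path = Path(raw)
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    found = find_opus_library()
    print(f"Opus library: {found}" if found else "No Opus library found")
    # With discord installed, the path could then be loaded like this
    # (an assumption about how your bot wires it up):
    #   import discord
    #   if found and not discord.opus.is_loaded():
    #       discord.opus.load_opus(str(found))
```

Passing `env` and `candidates` as parameters keeps the lookup easy to test without touching the real environment.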

## How Voice Receiving Works

### Recording Audio

```python
# Start recording (already in your bot).
# `sink` receives the decoded audio; `callback` is called once recording stops.
voice_client.start_recording(sink, callback)

# Stop recording
voice_client.stop_recording()
```

### The Sink Pattern

Your `HotwordStreamSink` receives PCM audio data:
- **48 kHz sample rate**
- **2 channels (stereo)**
- **16-bit PCM**

The sink's `write()` method is called continuously with audio chunks from each user.
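
The format above fixes the byte rate: 48,000 samples/s × 2 channels × 2 bytes = 192,000 bytes per second, so a 20 ms frame is 3,840 bytes. The sketch below shows how a `write()`-style method could accumulate per-user chunks until a minimum duration is reached. `PerUserBuffer` is a hypothetical stand-in, not your `HotwordStreamSink` and not a library class, and the `write(data, user)` signature is an assumption modelled on sink-style APIs.

```python
from collections import defaultdict
from typing import Dict, Hashable, Optional

SAMPLE_RATE = 48_000      # samples per second
CHANNELS = 2              # stereo
SAMPLE_WIDTH = 2          # bytes per 16-bit sample
BYTES_PER_SECOND = SAMPLE_RATE * CHANNELS * SAMPLE_WIDTH  # 192_000


def pcm_seconds(num_bytes: int) -> float:
    """Duration in seconds of `num_bytes` of 48 kHz stereo 16-bit PCM."""
    return num_bytes / BYTES_PER_SECOND


class PerUserBuffer:
    """Collects PCM per user and releases it once it is long enough."""

    def __init__(self, min_chunk_seconds: float = 1.0) -> None:
        self.min_bytes = int(min_chunk_seconds * BYTES_PER_SECOND)
        self._buffers: Dict[Hashable, bytearray] = defaultdict(bytearray)

    def write(self, data: bytes, user: Hashable) -> Optional[bytes]:
        """Append a chunk; return the buffered audio when the minimum is met."""
        buf = self._buffers[user]
        buf.extend(data)
        if len(buf) < self.min_bytes:
            return None
        chunk = bytes(buf)
        buf.clear()
        return chunk
```

Keeping one buffer per user avoids interleaving audio from different speakers before it reaches the transcriber.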

## Troubleshooting

### Error 4006 (Session No Longer Valid)

This happens when Discord thinks you're already connected. It is fixed by:
- Cleaning up properly before reconnecting
- Passing `reconnect=True` to `channel.connect()`
- Waiting 1 second after disconnecting
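
The three steps above can be sketched as one coroutine. It is duck-typed so it runs without Discord: `existing` is assumed to expose an awaitable `disconnect(force=...)` and `channel` an awaitable `connect(reconnect=...)`, matching the voice client and voice channel methods in discord.py. The helper name `safe_connect` and the `settle_seconds` parameter are illustrative, not taken from your bot.

```python
import asyncio
from typing import Any, Optional


async def safe_connect(
    channel: Any,
    existing: Optional[Any] = None,
    settle_seconds: float = 1.0,
) -> Any:
    """Disconnect any stale voice client, pause, then connect afresh."""
    if existing is not None:
        # Proper cleanup: drop the old session even if it looks half-dead.
        await existing.disconnect(force=True)
        # Give Discord a moment to register the disconnect.
        await asyncio.sleep(settle_seconds)
    # Let the library retry internally if the connection drops.
    return await channel.connect(reconnect=True)
```

In a bot this would be called roughly as `await safe_connect(channel, guild.voice_client)`.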

### No Audio Received

1. Check that Opus is loaded: look for "Loaded opus library" in the logs
2. Verify the bot has the "Use Voice Activity" permission
3. Ensure users aren't muted

### High CPU Usage

Continuous transcription can be CPU-heavy. Consider:
- Increasing `min_chunk_seconds` in `HotwordStreamSink`
- Using a lighter STT model
- Only transcribing when a volume threshold is met
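
The volume-threshold idea can be sketched with the standard library alone: compute the RMS level of a 16-bit PCM chunk and skip transcription when it falls below a cutoff. `rms_level`, `should_transcribe`, and the default threshold of 500 are illustrative assumptions to tune against real audio, not values from your bot.

```python
import math
from array import array


def rms_level(pcm: bytes) -> float:
    """Root-mean-square amplitude of signed 16-bit PCM in native byte order."""
    samples = array("h")
    # Drop a trailing odd byte so the data is a whole number of samples.
    samples.frombytes(pcm[: len(pcm) - len(pcm) % 2])
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def should_transcribe(pcm: bytes, threshold: float = 500.0) -> bool:
    """True when the chunk is loud enough to be worth sending to STT."""
    return rms_level(pcm) >= threshold
```

Silence and low background noise then never reach the STT model, which is usually where most of the CPU time goes.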

## Testing

1. Start the bot: `python bot.py`
2. Join a voice channel
3. Send "hey bashar join" in text chat
4. The bot should join and start listening
5. Speak in the voice channel; the bot transcribes in real time

## Alternative: discord.py Voice Recv

If you want even more control, check out:
https://github.com/imayhaveborkedit/discord-ext-voice-recv

This is a discord.py extension specifically for voice receiving.