Getting Started
Welcome to Voicepad! This guide walks you through installation, setup, and your first transcribed recording.
Installation
Install Voicepad using pip:
pip install voicepad
Requirements: Python 3.13 or higher
GPU Acceleration Available
For 4-5x faster transcription, install with GPU support:
pip install "voicepad[gpu]"
First-Time Setup
1. Check Your Microphone
List available audio devices:
voicepad config input
This shows all audio input devices on your system. Note the index number of your microphone.
2. Create Configuration File
Create a voicepad.yaml file in your project directory:
# Audio input (use device index from previous step)
input_device_index: 2
# Output directories
recordings_path: data/recordings
markdown_path: data/markdown
# Transcription settings
transcription_model: medium
transcription_device: auto
Configuration Location
You can also place this file at ~/.config/voicepad/voicepad.yaml for global configuration.
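To see which of the two documented locations actually holds a configuration file on your machine, a small check like the one below can help. It is a minimal sketch: the function name `find_voicepad_configs` is hypothetical, and the sketch only reports which files exist. It makes no claim about which file Voicepad prefers when both are present.

```python
from pathlib import Path


def find_voicepad_configs(project_dir, home_dir=None):
    """Return the voicepad.yaml files that exist at the two documented locations.

    project_dir: directory you run Voicepad from.
    home_dir: override for the home directory (defaults to the current user's).
    """
    home = Path(home_dir) if home_dir is not None else Path.home()
    candidates = [
        Path(project_dir) / "voicepad.yaml",              # project-level file
        home / ".config" / "voicepad" / "voicepad.yaml",  # global file
    ]
    # Keep only the candidates that are real files on disk.
    return [path for path in candidates if path.is_file()]


if __name__ == "__main__":
    found = find_voicepad_configs(Path.cwd())
    if not found:
        print("No voicepad.yaml found in the project directory or ~/.config/voicepad/")
    for path in found:
        print(f"Found configuration: {path}")
```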
3. Verify System Capabilities
Check if your system is ready:
voicepad config system
This displays RAM, CPU, and GPU information to help you choose the right model.
Your First Recording
Simple Recording
Start recording (press Ctrl+C to stop):
voicepad record start
This will:
- ✅ Record audio from your configured microphone
- ✅ Save audio to data/recordings/recording_TIMESTAMP.wav
- ✅ Automatically transcribe after stopping
- ✅ Save transcription to data/markdown/recording_TIMESTAMP.md
Recording with VAD Chunking
For long recordings, use VAD (Voice Activity Detection) chunking:
voicepad record start --vad --min-chunk-duration 60
This enables:
- 🎯 Real-time transcription while recording
- 📝 Live markdown updates
- 🔄 Smart chunk splitting at speech boundaries
- ⚡ Reduced wait time at the end
VAD Chunking Recommended For
- Meetings longer than 5 minutes
- Lectures and presentations
- Interviews
- Any recording where you want to see transcription progress in real-time
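To make "smart chunk splitting at speech boundaries" more concrete, here is a toy sketch of one way such a splitter could behave. It is not Voicepad's implementation. The function name `split_at_pauses`, the input shape (a sorted list of `(start, end)` speech segments in seconds), and the rule "close a chunk at the end of the first speech segment that makes it at least `min_chunk_duration` long" are all assumptions made for illustration.

```python
def split_at_pauses(speech_segments, min_chunk_duration=60.0):
    """Group speech segments into chunks that always end at a pause.

    speech_segments: sorted list of (start, end) times in seconds, such as a
    voice activity detector might emit. A chunk is closed at the end of the
    first segment that makes it at least min_chunk_duration long, so a cut
    never falls in the middle of speech.
    """
    chunks = []
    chunk_start = None
    for start, end in speech_segments:
        if chunk_start is None:
            chunk_start = start  # first speech of a new chunk
        if end - chunk_start >= min_chunk_duration:
            chunks.append((chunk_start, end))
            chunk_start = None
    # Flush trailing speech that never reached the minimum duration.
    if chunk_start is not None:
        chunks.append((chunk_start, speech_segments[-1][1]))
    return chunks


if __name__ == "__main__":
    segments = [(0.0, 20.0), (22.0, 72.0), (75.0, 100.0), (103.0, 144.0)]
    for start, end in split_at_pauses(segments, min_chunk_duration=60.0):
        print(f"chunk from {start:.0f}s to {end:.0f}s")
```

With a 60-second minimum, the sample segments produce two chunks, each a little longer than a minute, because the splitter waits for the next pause instead of cutting at exactly 60 seconds.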
Understanding the Output
Audio Files
Located in data/recordings/:
- Format: WAV (16-bit PCM, 16 kHz mono)
- Naming: {prefix}_{timestamp}.wav
- Example: recording_20260218_033000.wav
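If you want to confirm that a recording matches the documented format, Python's standard `wave` module can read the header. The following is a minimal sketch; the function name `check_wav_format` is hypothetical and not part of Voicepad.

```python
import wave

# Expected format as documented above: 16-bit PCM, 16 kHz, mono.
EXPECTED_SAMPLE_WIDTH_BYTES = 2
EXPECTED_SAMPLE_RATE_HZ = 16000
EXPECTED_CHANNELS = 1


def check_wav_format(source):
    """Return a list of mismatches; an empty list means the file matches.

    source: a filename string or a binary file object opened for reading.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {wav.getsampwidth() * 8}-bit, expected 16-bit")
        if wav.getframerate() != EXPECTED_SAMPLE_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected 16000 Hz")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"{wav.getnchannels()} channels, expected 1 (mono)")
    return problems
```

For example, `check_wav_format("data/recordings/recording_20260218_033000.wav")` would return an empty list for a file in the documented format.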
Transcription Files
Located in data/markdown/:
- Format: Markdown with metadata
- Naming: Same as audio file with .md extension
- Content: Full transcription with timestamps and statistics
Example transcription output:
# Transcription: recording_20260218_033000.wav
**Status:** Recording complete
---
## Chunk 1 (0:00 - 1:12)
This is the transcribed text from the first chunk of audio...
## Chunk 2 (1:12 - 2:24)
Continuing with the second chunk...
---
**Recording Complete**
- Total chunks: 2
- Total duration: 2:24
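Because the transcription is plain Markdown, it is easy to post-process. The sketch below pulls chunk numbers, time ranges, and text out of a file shaped like the example above. It assumes the heading layout shown there (`## Chunk N (M:SS - M:SS)`); the function name `parse_chunks` is hypothetical, and other timestamp layouts, such as one with an hours field, are not handled.

```python
import re

# Matches headings like "## Chunk 1 (0:00 - 1:12)" from the example output.
CHUNK_HEADING = re.compile(r"^## Chunk (\d+) \((\d+):(\d{2}) - (\d+):(\d{2})\)$")


def parse_chunks(markdown_text):
    """Return a list of (chunk_number, start_seconds, end_seconds, text) tuples."""
    chunks = []
    current = None
    for raw_line in markdown_text.splitlines():
        line = raw_line.strip()
        match = CHUNK_HEADING.match(line)
        if match:
            number, start_min, start_sec, end_min, end_sec = map(int, match.groups())
            current = [number, start_min * 60 + start_sec, end_min * 60 + end_sec, []]
            chunks.append(current)
        elif line == "---":
            current = None  # a horizontal rule ends the current chunk body
        elif current is not None and line:
            current[3].append(line)
    return [(n, start, end, " ".join(body)) for n, start, end, body in chunks]


if __name__ == "__main__":
    with open("data/markdown/recording_20260218_033000.md", encoding="utf-8") as handle:
        for number, start, end, text in parse_chunks(handle.read()):
            print(f"Chunk {number}: {start}s to {end}s, {len(text.split())} words")
```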
Common Tasks
Record with Custom Filename
voicepad record start --prefix meeting_notes
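To predict the filename a custom prefix will produce, you can apply the {prefix}_{timestamp}.wav pattern from the Audio Files section. This is a sketch under two assumptions: the timestamp uses the YYYYMMDD_HHMMSS layout inferred from the example filename recording_20260218_033000.wav, and the default prefix is `recording`. The helper name `expected_recording_name` is hypothetical.

```python
from datetime import datetime


def expected_recording_name(prefix="recording", when=None):
    """Build a filename following the documented {prefix}_{timestamp}.wav pattern.

    The YYYYMMDD_HHMMSS timestamp layout is inferred from the example filename,
    not confirmed by the Voicepad source.
    """
    when = when if when is not None else datetime.now()
    return f"{prefix}_{when.strftime('%Y%m%d_%H%M%S')}.wav"


if __name__ == "__main__":
    print(expected_recording_name("meeting_notes"))
```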
Record for Fixed Duration
voicepad record start --duration 300 # 5 minutes
Record Without Transcription
voicepad record start --no-transcribe
Check Current Configuration
voicepad record info
Next Steps
Now that you've made your first recording, explore:
- Configuration Guide - Customize all settings
- VAD Chunking - Learn about smart audio splitting
- CLI Reference - Complete command options
- Transcription Models - Choose the right model
Troubleshooting
No Audio Recorded
- Check microphone selection with voicepad config input
- Verify microphone permissions in your OS
- Test microphone in another application
Slow Transcription
- Consider using a smaller model (e.g., tiny instead of medium)
- Install GPU support: pip install "voicepad[gpu]"
- See GPU Acceleration
Import Errors
- Ensure Python 3.13+ is installed: python --version
- Reinstall: pip install --force-reinstall voicepad
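If `python --version` looks right but imports still fail, it can help to check the interpreter from inside Python itself, since the `python` on your PATH is not always the one your tools run. The snippet below is a minimal sketch; the helper name `meets_minimum` is hypothetical, and the 3.13 threshold comes from the requirement stated under Installation.

```python
import sys

MINIMUM_PYTHON = (3, 13)  # from the Installation requirements


def meets_minimum(version_info=None):
    """Return True if the given (or running) interpreter is new enough."""
    info = version_info if version_info is not None else sys.version_info
    # Compare only (major, minor) against the minimum.
    return tuple(info[:2]) >= MINIMUM_PYTHON


if __name__ == "__main__":
    print(f"Interpreter: {sys.executable}")
    print(f"Version: {sys.version.split()[0]}")
    print("OK" if meets_minimum() else "Too old: Voicepad requires Python 3.13 or higher")
```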
Need Help?
Visit the GitHub repository to report issues or ask questions.