Getting Started

Welcome to Voicepad! This guide walks you through installing Voicepad, configuring it, and making your first transcribed recording.

Installation

Install Voicepad using pip:

pip install voicepad

Requirements: Python 3.13 or higher
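
To confirm that the interpreter you plan to use meets this requirement, a small check like the following can help. It is only a sketch: `meets_requirement` is a made-up helper name, and the only thing it relies on is the standard `sys.version_info`.

```python
import sys

MIN_VERSION = (3, 13)  # taken from the requirement stated above


def meets_requirement(version=None, minimum=MIN_VERSION):
    """Return True if `version` (default: the running interpreter) is at least `minimum`."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    found = ".".join(map(str, sys.version_info[:3]))
    status = "OK" if meets_requirement() else "too old for Voicepad"
    print(f"Python {found}: {status}")
```

Run it with the same `python` executable that you use for `pip`, since a mismatch between the two is a common source of confusion.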

GPU Acceleration Available

For 4-5x faster transcription, install with GPU support:

pip install "voicepad[gpu]"
See the GPU Acceleration Guide for setup. The quotes keep shells such as zsh from treating the square brackets as a glob pattern.

First-Time Setup

1. Check Your Microphone

List available audio devices:

voicepad config input

This shows all audio input devices on your system. Note the index number of your microphone.

2. Create Configuration File

Create a voicepad.yaml file in your project directory:

# Audio input (use device index from previous step)
input_device_index: 2

# Output directories
recordings_path: data/recordings
markdown_path: data/markdown

# Transcription settings
transcription_model: medium
transcription_device: auto

Configuration Location

You can also place this file at ~/.config/voicepad/voicepad.yaml for global configuration.
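
If you keep both a project file and a global file, it helps to know which one you are looking at. The sketch below checks the two locations named in this guide and returns the first one that exists. It assumes the project file takes precedence over the global one, which this guide does not state, and `find_config` is a hypothetical helper, not part of Voicepad.

```python
from pathlib import Path


def find_config(project_dir=".", home=None):
    """Return the first existing voicepad.yaml, or None if neither exists.

    ASSUMPTION: the project-level file is checked before the global one.
    The guide names both locations but does not document their precedence.
    """
    home = Path(home) if home is not None else Path.home()
    candidates = [
        Path(project_dir) / "voicepad.yaml",
        home / ".config" / "voicepad" / "voicepad.yaml",
    ]
    for path in candidates:
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    print(find_config() or "no voicepad.yaml found")
```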

3. Verify System Capabilities

Check if your system is ready:

voicepad config system

This displays RAM, CPU, and GPU information to help you choose the right model.

Your First Recording

Simple Recording

Start recording (press Ctrl+C to stop):

voicepad record start

This will:

  1. ✅ Record audio from your configured microphone
  2. ✅ Save audio to data/recordings/recording_TIMESTAMP.wav
  3. ✅ Automatically transcribe after stopping
  4. ✅ Save transcription to data/markdown/recording_TIMESTAMP.md

Recording with VAD Chunking

For long recordings, use VAD (Voice Activity Detection) chunking:

voicepad record start --vad --min-chunk-duration 60

This enables:

  • 🎯 Real-time transcription while recording
  • 📝 Live markdown updates
  • 🔄 Smart chunk splitting at speech boundaries
  • ⚡ Reduced wait time at the end
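
To build intuition for how a minimum chunk duration and speech boundaries interact, here is a small sketch. It is not Voicepad's implementation: it assumes a voice activity detector has already produced a list of pause timestamps (in seconds) and simply cuts at the first pause that falls at least `min_chunk` seconds after the previous cut. The function name and its inputs are hypothetical.

```python
def split_at_pauses(pauses, total_duration, min_chunk=60.0):
    """Cut a recording into (start, end) chunks at detected pauses.

    ILLUSTRATION ONLY: this is not Voicepad's algorithm. A cut is made at the
    first pause that is at least `min_chunk` seconds after the previous cut,
    so chunks end at pauses rather than mid-speech, and only the final chunk
    can be shorter than the minimum.
    """
    chunks = []
    start = 0.0
    for pause in sorted(pauses):
        if pause >= total_duration:
            break  # pauses at or past the end cannot start a new chunk
        if pause - start >= min_chunk:
            chunks.append((start, pause))
            start = pause
    if start < total_duration:
        chunks.append((start, total_duration))  # whatever remains at the end
    return chunks


if __name__ == "__main__":
    # Pauses at 0:20, 1:12 and 1:40 in a 2:24 recording, minimum chunk of 60 s.
    print(split_at_pauses([20, 72, 100], total_duration=144, min_chunk=60))
```

In this sketch the pause at 0:20 is ignored because the chunk would be too short, so a larger minimum yields fewer, longer chunks.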

VAD Chunking Recommended For

  • Meetings longer than 5 minutes
  • Lectures and presentations
  • Interviews
  • Any recording where you want to see transcription progress in real-time

Understanding the Output

Audio Files

Located in data/recordings/:

  • Format: WAV (16-bit PCM, 16 kHz, mono)
  • Naming: {prefix}_{timestamp}.wav
  • Example: recording_20260218_033000.wav
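
If another tool needs to consume these files, you can confirm the format with Python's standard `wave` module. The sketch below is independent of Voicepad; the helper names are made up for illustration.

```python
import wave


def describe_wav(path):
    """Return (channels, bits per sample, sample rate in Hz) for a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnchannels(), wav.getsampwidth() * 8, wav.getframerate()


def matches_documented_format(path):
    """True if the file is mono, 16-bit, 16 kHz, as listed above."""
    return describe_wav(path) == (1, 16, 16000)
```

For example, `matches_documented_format("data/recordings/recording_20260218_033000.wav")` should return `True` for a file in the format described above.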

Transcription Files

Located in data/markdown/:

  • Format: Markdown with metadata
  • Naming: Same as audio file with .md extension
  • Content: Full transcription with timestamps and statistics
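
The naming rules make it possible to predict where a recording and its transcription will land. The following sketch builds both paths from a prefix and a time. The timestamp layout is inferred from the example name `recording_20260218_033000.wav`, and whether Voicepad uses the start or stop time is not stated here, so treat both as assumptions; `expected_outputs` is a hypothetical helper.

```python
from datetime import datetime
from pathlib import PurePosixPath

# ASSUMPTION: layout inferred from the example name
# recording_20260218_033000.wav (date, underscore, time).
TIMESTAMP_FORMAT = "%Y%m%d_%H%M%S"


def expected_outputs(prefix, when,
                     recordings_path="data/recordings",
                     markdown_path="data/markdown"):
    """Return the (audio, transcription) paths implied by the naming rules above."""
    stem = f"{prefix}_{when.strftime(TIMESTAMP_FORMAT)}"
    audio = PurePosixPath(recordings_path) / f"{stem}.wav"
    transcription = PurePosixPath(markdown_path) / f"{stem}.md"
    return audio, transcription


if __name__ == "__main__":
    audio, transcription = expected_outputs("recording", datetime(2026, 2, 18, 3, 30, 0))
    print(audio)
    print(transcription)
```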

Example transcription output:

# Transcription: recording_20260218_033000.wav

**Status:** Recording complete

---

## Chunk 1 (0:00 - 1:12)

This is the transcribed text from the first chunk of audio...

## Chunk 2 (1:12 - 2:24)

Continuing with the second chunk...

---

**Recording Complete**
- Total chunks: 2
- Total duration: 2:24
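
Because the transcription is plain Markdown, it is easy to post-process. As a sketch, the following reads the chunk headings and returns each chunk's start and end in seconds. It assumes headings look exactly like those in the sample above (`## Chunk N (M:SS - M:SS)`); other layouts, such as hour-long timestamps, are not handled.

```python
import re

# Matches headings such as "## Chunk 1 (0:00 - 1:12)" from the sample above.
CHUNK_HEADING = re.compile(
    r"^## Chunk (\d+) \((\d+):(\d{2}) - (\d+):(\d{2})\)\s*$",
    re.MULTILINE,
)


def chunk_spans(markdown_text):
    """Return [(chunk number, start seconds, end seconds), ...] in file order."""
    spans = []
    for match in CHUNK_HEADING.finditer(markdown_text):
        number, m1, s1, m2, s2 = (int(group) for group in match.groups())
        spans.append((number, m1 * 60 + s1, m2 * 60 + s2))
    return spans
```

Applied to the sample, the end of the last chunk is 144 seconds, which agrees with the reported total duration of 2:24.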

Common Tasks

Record with Custom Filename

voicepad record start --prefix meeting_notes

Record for Fixed Duration

voicepad record start --duration 300  # 5 minutes

Record Without Transcription

voicepad record start --no-transcribe

Check Current Configuration

voicepad record info
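
If you start recordings from a script, the flags shown in this guide can be assembled programmatically. The sketch below only builds the argument list. It assumes the flags can be combined freely, which this guide does not confirm, and `build_record_command` is a hypothetical helper rather than a Voicepad API.

```python
import shlex


def build_record_command(prefix=None, duration=None, transcribe=True,
                         vad=False, min_chunk_duration=None):
    """Build an argv list for `voicepad record start` from the flags in this guide.

    ASSUMPTION: the flags can be combined freely; the guide only shows them
    one at a time (plus --vad together with --min-chunk-duration).
    """
    command = ["voicepad", "record", "start"]
    if prefix is not None:
        command += ["--prefix", str(prefix)]
    if duration is not None:
        command += ["--duration", str(int(duration))]
    if not transcribe:
        command.append("--no-transcribe")
    if vad:
        command.append("--vad")
        if min_chunk_duration is not None:
            command += ["--min-chunk-duration", str(int(min_chunk_duration))]
    return command


if __name__ == "__main__":
    argv = build_record_command(prefix="meeting_notes", duration=300)
    # Print the command; pass argv to subprocess.run(argv, check=True) to run it.
    print(shlex.join(argv))
```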

Next Steps

Now that you've made your first recording, explore the rest of the documentation, starting with the GPU Acceleration Guide.

Troubleshooting

No Audio Recorded

  • Check microphone selection with voicepad config input
  • Verify microphone permissions in your OS
  • Test microphone in another application

Slow Transcription

  • Consider using a smaller model (e.g., tiny instead of medium)
  • Install GPU support: pip install "voicepad[gpu]"
  • See the GPU Acceleration Guide

Import Errors

  • Ensure Python 3.13+ is installed: python --version
  • Reinstall: pip install --force-reinstall voicepad
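
Import errors often mean the package was installed for a different interpreter than the one running your code. A quick diagnostic, sketched below, prints which interpreter is in use and whether it can find the package. It assumes the import name is `voicepad`, matching the package name, which this guide does not state.

```python
import importlib.util
import sys


def is_importable(module_name):
    """True if the running interpreter can locate `module_name`."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # ASSUMPTION: the import name matches the package name, "voicepad".
    print("interpreter:", sys.executable)
    print("voicepad importable:", is_importable("voicepad"))
```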

Need Help?

Visit the GitHub repository to report issues or ask questions.