Whisper API — Speech to Text with OpenAI
Introduction
Whisper is OpenAI's speech recognition model; it converts speech to text with high accuracy. This guide covers integrating it through the OpenAI API.
Getting Started
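from openai import OpenAI

client = OpenAI(api_key="sk-...")

# Open the audio file in binary mode and send it for transcription
with open("audio.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=f
    )

# The response object exposes the transcribed text
print(transcript.text)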
Features
- Multiple languages
- High accuracy
- Automatic punctuation
- Timestamps available
- Reasonable pricing
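As a rough illustration of the timestamps feature, the sketch below requests a verbose JSON response and prints one line per segment. It assumes a recent version of the Python SDK in which `response_format="verbose_json"` returns segments as objects with `start`, `end`, and `text` attributes; the helper names `format_timestamp` and `print_segments` are made up for this example.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def print_segments(path: str) -> None:
    """Transcribe a file and print each segment with its time range."""
    # Imported here so format_timestamp can be used without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(path, "rb") as f:
        transcript = client.audio.transcriptions.create(
            model="whisper-1",
            file=f,
            response_format="verbose_json",  # assumption: segments are included in this format
        )
    # segments may be missing, so fall back to an empty list
    for seg in transcript.segments or []:
        start = format_timestamp(seg.start)
        end = format_timestamp(seg.end)
        print(f"[{start} -> {end}] {seg.text.strip()}")
```

Calling `print_segments("audio.mp3")` would list each segment next to its time range; the comma-separated millisecond style was chosen only because subtitle tools commonly use it.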
Supported Formats
MP3, MP4, MPEG, MPGA, M4A, WAV, WEBM
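A small pre-flight check can reject unsupported files before any upload is attempted. The following is a minimal sketch that looks only at the file extension, using the list above; `is_supported_format` is a hypothetical helper, and it does not inspect the actual audio encoding.

```python
from pathlib import Path

# Extensions taken from the format list above.
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}


def is_supported_format(path: str) -> bool:
    """Return True if the file extension is in the supported list (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

A file with a supported extension can still fail if its contents are corrupt or mislabeled, so this check complements error handling rather than replacing it.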
Pricing
$0.006 per minute of audio, so a one-hour recording costs about $0.36.
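For budgeting, the per-minute rate can be turned into a quick estimate. This sketch assumes the rate quoted above still applies and that billing is proportional to duration; `estimate_cost` is a hypothetical helper name.

```python
PRICE_PER_MINUTE_USD = 0.006  # rate quoted above; check current pricing before relying on it


def estimate_cost(duration_seconds: float) -> float:
    """Estimate the transcription cost in USD for a given audio duration."""
    return duration_seconds / 60 * PRICE_PER_MINUTE_USD
```

For example, a 45-minute meeting (`estimate_cost(45 * 60)`) comes out at roughly $0.27.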
Tips
- Clean audio produces better results
- Specify the audio language (as an ISO-639-1 code such as "en") to improve accuracy and latency
- Handle errors gracefully
- Cache transcripts
- Monitor API usage
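Several of these tips can be combined in a few lines. The sketch below caches transcripts on disk keyed by a hash of the audio file, passes an explicit language, and retries on rate-limit or connection errors. The cache directory, the function names, and the backoff schedule are all assumptions chosen for illustration, not part of the API.

```python
import hashlib
import json
import time
from pathlib import Path
from typing import Callable

CACHE_DIR = Path(".transcript_cache")  # hypothetical cache location


def file_digest(path: str) -> str:
    """Hash the file contents so the cache key survives renames."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def cached_transcribe(
    path: str, transcribe: Callable[[str], str], cache_dir: Path = CACHE_DIR
) -> str:
    """Return a cached transcript if present; otherwise call `transcribe` and store the result."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry = cache_dir / f"{file_digest(path)}.json"
    if entry.exists():
        return json.loads(entry.read_text(encoding="utf-8"))["text"]
    text = transcribe(path)
    entry.write_text(json.dumps({"text": text}), encoding="utf-8")
    return text


def transcribe_with_retry(path: str, language: str = "en", attempts: int = 3) -> str:
    """Call the API with an explicit language, retrying transient failures."""
    import openai  # imported lazily so the cache helpers work without the SDK

    client = openai.OpenAI()  # reads OPENAI_API_KEY from the environment
    for attempt in range(1, attempts + 1):
        try:
            with open(path, "rb") as f:
                result = client.audio.transcriptions.create(
                    model="whisper-1", file=f, language=language
                )
            return result.text
        except (openai.RateLimitError, openai.APIConnectionError):
            if attempt == attempts:
                raise  # out of retries: surface the error to the caller
            time.sleep(2 ** attempt)  # simple exponential backoff
    raise ValueError("attempts must be at least 1")
```

With these pieces, `cached_transcribe("audio.mp3", transcribe_with_retry)` only reaches the API the first time a given file is seen, which also helps keep usage down.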
Use Cases
Transcription services, voice commands, accessibility, meeting notes, customer feedback.
Limitations
- File size limit of 25 MB per upload
- No real-time streaming
- Language detection is sometimes inaccurate
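The size limit is easy to check before uploading. The sketch below assumes the 25 MB limit is counted in binary megabytes (25 × 1024 × 1024 bytes), which may not match the service exactly; `fits_upload_limit` and `chunks_needed` are hypothetical helpers.

```python
import math
import os

MAX_UPLOAD_BYTES = 25 * 1024 * 1024  # assumption: limit measured in binary megabytes


def fits_upload_limit(path: str) -> bool:
    """Return True if the file on disk is within the upload limit."""
    return os.path.getsize(path) <= MAX_UPLOAD_BYTES


def chunks_needed(size_bytes: int) -> int:
    """Minimum number of pieces needed so that each stays within the limit."""
    return max(1, math.ceil(size_bytes / MAX_UPLOAD_BYTES))
```

The chunk count is only a lower bound: the actual split has to be done with an audio tool, ideally at pauses rather than mid-sentence, and each piece is then transcribed separately.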
Conclusion
Whisper is an excellent choice for speech-to-text needs.
FAQ
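Q: How accurate is it? A: Very accurate on clean audio, comparable to human transcription; noisy recordings reduce accuracy.
Q: Does it support other languages? A: Yes, it supports 99+ languages, though accuracy varies by language.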