MCP Server for Claude, Cursor & AI Agents
Beta
Give your AI agent native access to voice-to-text transcription, YouTube transcript extraction, and text-to-speech, with no API calls and no glue code. Add one config entry to Claude Desktop, Cursor, or Windsurf and start using it immediately.
What is the Model Context Protocol?
MCP (Model Context Protocol) is an open standard created by Anthropic that lets AI tools call external services as native functions — no custom integration, no REST plumbing. Add VoiceToTextOnline once and every project using that config can transcribe audio, pull YouTube captions, and generate speech directly inside the AI conversation.
Tools, not endpoints
Instead of writing fetch() calls, your AI agent calls tools like youtube_transcript and text_to_speech as if they were built-in functions.
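To make the "tools, not endpoints" idea concrete, here is a minimal sketch of the message an MCP client sends on the agent's behalf. MCP messages are JSON-RPC 2.0 and tool invocations use the `tools/call` method; the tool name and parameters come from this page, while the helper name `build_tool_call` and the placeholder video ID are illustrative assumptions.

```python
import json
from itertools import count

# JSON-RPC request ids should be unique within a session.
_request_ids = count(1)


def build_tool_call(name: str, arguments: dict) -> dict:
    """Build the JSON-RPC 2.0 message an MCP client sends to invoke a tool.

    Hypothetical helper: real MCP clients construct this for you.
    """
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Placeholder video ID, used only to show the shape of the arguments.
request = build_tool_call(
    "youtube_transcript",
    {"url": "https://www.youtube.com/watch?v=abcdefghijk", "format": "text"},
)
print(json.dumps(request, indent=2))
```

In normal use you never write this yourself: the client builds the request when the agent decides to call the tool.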
Works mid-conversation
Pull a transcript or generate audio during any task — without leaving the chat or switching tools. Claude handles it automatically.
One config, every project
Add your API key to your MCP config once. Every project sharing that config gets instant access, no per-project setup.
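As a rough sketch of what "one config entry" could look like when scripted, the snippet below merges a server entry into an existing MCP config file without touching other servers. The server name, URL, and Authorization header follow the setup section of this page; the helper name `add_server`, the `V2T_API_KEY` environment variable, and the temporary file used for the demo are assumptions.

```python
import json
import os
import tempfile
from pathlib import Path

SERVER_NAME = "voicetotextonline"
SERVER_URL = "https://voicetotextonline.com/api/mcp"


def add_server(config_path: Path, api_key: str) -> dict:
    """Add or update the server entry, keeping any other configured servers."""
    text = config_path.read_text() if config_path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers[SERVER_NAME] = {
        "url": SERVER_URL,
        "headers": {"Authorization": f"Bearer {api_key}"},
    }
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a throwaway file rather than a real client config.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "mcp.json"
    path.write_text(json.dumps({"mcpServers": {"other": {"url": "https://example.com/mcp"}}}))
    # V2T_API_KEY is a hypothetical variable name; the fallback is the page's placeholder.
    merged = add_server(path, os.environ.get("V2T_API_KEY", "v2t_live_..."))
    print(sorted(merged["mcpServers"]))
```

Point `config_path` at your client's actual MCP config file only after backing it up, since the script rewrites the whole file.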
Available MCP Tools
youtube_transcript (live): YouTube Transcript
Extract the full transcript from any YouTube video URL or 11-character video ID. Returns plain text. Perfect for summarisation, research, content repurposing, and note-taking workflows.
Parameters:
url: string (YouTube URL or video ID)
format?: "text" | "srt" (output format)
Limit: 20 requests / day per API key
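Because each API key is limited to 20 requests per day, it can be worth rejecting obviously malformed input before it reaches the tool. The sketch below is an optional client-side pre-check, not the server's own parsing logic (which this page does not document); the helper name `extract_video_id` and the set of URL shapes it accepts are assumptions.

```python
import re
from urllib.parse import parse_qs, urlparse

# YouTube video IDs are 11 characters drawn from letters, digits, "-" and "_".
VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(url_or_id: str) -> str:
    """Return the 11-character video ID from a bare ID or a common YouTube URL."""
    value = url_or_id.strip()
    if VIDEO_ID.fullmatch(value):
        return value

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    parts = [p for p in parsed.path.split("/") if p]
    candidate = ""

    if host == "youtu.be" and parts:
        candidate = parts[0]
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [""])[0]
        elif len(parts) >= 2 and parts[0] in {"embed", "shorts", "live"}:
            candidate = parts[1]

    if not VIDEO_ID.fullmatch(candidate):
        raise ValueError(f"not a recognisable YouTube URL or video ID: {url_or_id!r}")
    return candidate
```

A caller could run this before passing `url` to `youtube_transcript` and surface the `ValueError` to the user instead of spending a request on it.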
text_to_speech (live): Text to Speech
Convert text to lifelike speech using Google Neural voices (Neural2, Chirp3 HD). Returns base64-encoded MP3 audio. Supports 55+ languages and speaking rate control.
Parameters:
text: string (text to synthesize)
voiceName: string (e.g. "en-US-Neural2-J")
languageCode: string (BCP-47, e.g. "en-US")
speed?: number (0.25–4.0, default 1.0)
Limit: 500 chars/request (free) · 2,000 chars/request (paid)
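The limits above can be checked before a call is made, and the returned audio needs decoding before it can be played or saved. The following is a minimal sketch under stated assumptions: the parameter names and limits come from this page, but the helper names are hypothetical, and which field of the tool result carries the base64 string is not documented here, so `decode_mp3` only handles the string itself.

```python
import base64

FREE_CHAR_LIMIT = 500    # chars per request on the free plan
PAID_CHAR_LIMIT = 2000   # chars per request on a paid plan


def build_tts_arguments(
    text: str,
    voice_name: str = "en-US-Neural2-J",
    language_code: str = "en-US",
    speed: float = 1.0,
    paid: bool = False,
) -> dict:
    """Validate inputs against the documented limits and build tool arguments."""
    limit = PAID_CHAR_LIMIT if paid else FREE_CHAR_LIMIT
    if not text:
        raise ValueError("text must not be empty")
    if len(text) > limit:
        raise ValueError(f"text is {len(text)} chars; the limit is {limit}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    return {
        "text": text,
        "voiceName": voice_name,
        "languageCode": language_code,
        "speed": speed,
    }


def decode_mp3(audio_b64: str) -> bytes:
    """Decode the base64-encoded MP3 payload into raw bytes."""
    return base64.b64decode(audio_b64, validate=True)
```

The decoded bytes can then be written to a `.mp3` file or handed to an audio player.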
transcribe (coming soon): Audio File Transcription
Submit an audio or video file URL for AI transcription with speaker labels and timestamps. Returns a full transcript via AssemblyAI.
Parameters:
fileUrl: string
language?: string
speakerLabels?: boolean
Limit: credit-based (same as dashboard)
How to Set Up the MCP Server
Works with Claude Desktop, Cursor, Windsurf, and any MCP-compatible client. The server URL and Authorization header are identical regardless of which client you use.
1. Go to Dashboard → API Keys and click Generate Key. Free accounts include 2 keys with no credit card required.
2. Paste the snippet below into your client's MCP config file. Replace v2t_live_... with your actual key.
3. Tell Claude "get the transcript of this YouTube video" or "convert this text to speech", and it calls VoiceToTextOnline automatically.
{
  "mcpServers": {
    "voicetotextonline": {
      "url": "https://voicetotextonline.com/api/mcp",
      "headers": {
        "Authorization": "Bearer v2t_live_..."
      }
    }
  }
}
Config file names: claude_desktop_config.json · .cursor/mcp.json · .codeiumrc · mcp config
MCP vs REST API: Which Should You Use?
No glue code required
The REST API requires fetch calls, response parsing, and error handling. MCP tools are invoked directly by the AI — the agent decides when to call them and handles the result.
Works mid-conversation
Your agent can pull a YouTube transcript or generate audio mid-task without any tool-switching. It's invisible to the user.
Chain tools together
Transcribe audio → summarise → convert summary to speech. All in one agent flow. MCP makes multi-step pipelines trivial.
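The data flow of such a chain can be sketched as follows. In practice the agent does this sequencing itself; the code only illustrates how one tool's output feeds the next. It uses `youtube_transcript` as the first step because `transcribe` is listed as coming soon. The `call_tool` and `summarise` callables are hypothetical stand-ins for your MCP client and your model, and tool results are assumed to be reduced to plain strings.

```python
from typing import Callable

# Hypothetical signature: (tool name, arguments) -> result as a string.
ToolCaller = Callable[[str, dict], str]


def video_to_spoken_summary(
    url: str,
    call_tool: ToolCaller,
    summarise: Callable[[str], str],
    max_chars: int = 500,
) -> str:
    """Transcript -> summary -> speech; returns the base64 audio string."""
    transcript = call_tool("youtube_transcript", {"url": url, "format": "text"})
    # Trim so the summary fits the per-request character limit (500 on the free plan).
    summary = summarise(transcript)[:max_chars]
    return call_tool(
        "text_to_speech",
        {"text": summary, "voiceName": "en-US-Neural2-J", "languageCode": "en-US"},
    )
```

Passing the tool caller in as an argument keeps the pipeline testable with fakes and independent of any particular MCP client library.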
Same auth, same quota
Your existing API key and plan cover both MCP and REST. No new account, no extra billing. Quota is shared across both.
Prefer direct HTTP? View the REST API docs →
Frequently Asked Questions
Is the MCP server free to use?
Does this work with Cursor and Windsurf?
How is this different from the REST API?
What happens when I hit the daily YouTube limit?
Can I use the MCP server for commercial projects?
Ready to add voice tools to your AI agent?
Generate a free API key, add one config entry, and your AI agent gains access to YouTube transcript extraction and text-to-speech — immediately. No credit card required.