
How to Get Claude Code to Understand YouTube Videos
Set up the yt-analysis YouTube MCP server for Claude Code and Claude Desktop. Lets Claude summarize, transcribe, ask questions about, and search YouTube videos — no downloading needed.
Claude Code and Claude Desktop cannot natively access YouTube videos. Paste a URL and Claude either hallucinates the content or tells you it cannot watch videos. This guide walks through setting up the yt-analysis YouTube MCP server — an open-source Model Context Protocol server that fixes that using Google’s Gemini API to analyze YouTube videos directly, without downloading anything.
I built it because I kept running into the same situation: someone shares a 40-minute conference talk, and I want Claude to pull out the specific parts relevant to what I’m working on. Now it can.
What This Actually Does
The YouTube MCP server adds 8 tools to Claude for working with YouTube content. Gemini receives the YouTube URL directly — no local video processing, no transcription APIs, no downloading. The key is that Gemini has native YouTube URL support in its multimodal API, so it analyzes the actual video frames and audio rather than a separately generated transcript.
What you get out of it:
- Video summaries at three levels of detail
- Question and answer against any video
- Full YouTube transcript extraction
- Frame capture at key moments or specific timestamps
- Search for specific content within a video
- YouTube search (find videos matching a query)
All of this works inside a Claude Code session or Claude Desktop conversation — wherever you normally work.
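To make the "Gemini receives the YouTube URL directly" idea concrete, here is a minimal sketch of what such a request body can look like. It is not taken from the yt-analysis source: the helper name buildVideoRequest is hypothetical, and the endpoint and model name are assumptions you should check against Google's current documentation. The point is only that the video is referenced by URL in a file_data part, next to a text prompt.

```typescript
// Sketch only: the shape of a Gemini generateContent request that references
// a YouTube video by URL. Not taken from the yt-analysis source.

interface GeminiPart {
  text?: string;
  file_data?: { file_uri: string };
}

interface GeminiRequest {
  contents: { parts: GeminiPart[] }[];
}

// Hypothetical helper: pairs a YouTube URL with a text prompt.
function buildVideoRequest(videoUrl: string, prompt: string): GeminiRequest {
  return {
    contents: [
      {
        parts: [
          { file_data: { file_uri: videoUrl } }, // the video, by URL only
          { text: prompt },                      // what to do with it
        ],
      },
    ],
  };
}

// Assumed endpoint and model name; verify both before relying on them.
const endpoint =
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent";

const body = buildVideoRequest(
  "https://www.youtube.com/watch?v=VIDEO_ID",
  "Summarize this video in three sentences.",
);

console.log(endpoint);
console.log(JSON.stringify(body, null, 2));
```

The body would be sent as a JSON POST with your Gemini API key attached. Nothing in it contains video bytes, which is why no download step is needed on your machine.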
What Is an MCP Server
MCP stands for Model Context Protocol, an open standard created by Anthropic that lets AI agents connect to external tools and data sources. Think of MCP servers as plugins for Claude — each one adds a set of tools that Claude can call during a conversation.
Claude cannot browse the web, watch videos, or access external APIs on its own. MCP servers close that gap. You install a YouTube MCP server once, and from that point on Claude has YouTube analysis capabilities in every session. The yt-analysis MCP server specifically bridges Claude to the Gemini YouTube API, giving it the ability to read, summarize, and query any public video.
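Under the hood, an MCP tool call is a JSON-RPC 2.0 message with the method tools/call. The sketch below builds one such message for summarize_video. The envelope follows the MCP specification, but the argument names (url, detail_level) are hypothetical placeholders, not the server's confirmed schema.

```typescript
// Sketch of the JSON-RPC 2.0 message an MCP client sends to invoke a tool.

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;                       // tool name exposed by the server
    arguments: Record<string, unknown>; // tool-specific input
  };
}

function makeToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const request = makeToolCall(1, "summarize_video", {
  // Hypothetical argument names, for illustration only.
  url: "https://www.youtube.com/watch?v=VIDEO_ID",
  detail_level: "brief",
});

console.log(JSON.stringify(request, null, 2));
```

You never write these messages yourself. Claude produces them when it decides a tool is relevant, and the MCP server returns the result into the conversation.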
Prerequisites
- Node.js 18+ and pnpm installed on your system (the installation commands below use pnpm)
- Claude Code or Claude Desktop
- A Gemini API key from Google AI Studio — the free tier works for most use cases
The Gemini free tier gives you 15 requests per minute on Gemini 2.5 Flash. That’s enough for casual use. If you are analyzing videos constantly, the paid tier is cheap enough that it will not noticeably affect your bill.
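If you script against the same API key yourself, a small client-side limiter helps you stay under a per-minute cap. The sketch below is a generic sliding-window counter, assuming the 15 requests per minute figure quoted above. It is not part of yt-analysis, and the class name is made up for this example.

```typescript
// Generic sliding-window limiter: allows at most maxRequests per windowMs.
// Time is passed in explicitly so the logic is easy to test.
class SlidingWindowLimiter {
  private readonly maxRequests: number;
  private readonly windowMs: number;
  private readonly sent: number[] = []; // timestamps of recent requests, oldest first

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns 0 if a request may be sent now, otherwise the ms to wait.
  msUntilAllowed(nowMs: number): number {
    // Drop timestamps that have left the window.
    while (this.sent.length > 0 && this.sent[0] <= nowMs - this.windowMs) {
      this.sent.shift();
    }
    if (this.sent.length < this.maxRequests) return 0;
    return this.sent[0] + this.windowMs - nowMs;
  }

  // Call this right after actually sending a request.
  record(nowMs: number): void {
    this.sent.push(nowMs);
  }
}

// 15 requests per 60 seconds, one request per second starting at t = 0.
const limiter = new SlidingWindowLimiter(15, 60_000);
for (let i = 0; i < 15; i++) {
  limiter.record(i * 1000);
}
console.log(limiter.msUntilAllowed(15_000)); // 45000: wait until the first request ages out
```

In real use you would pass Date.now() for nowMs and sleep for the returned number of milliseconds before sending the next request.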
Installation
Clone the repo and build it:
git clone https://github.com/Legorobotdude/yt-analysis-mcp.git
cd yt-analysis-mcp
pnpm install
pnpm build
The build step compiles TypeScript to dist/index.js. Note the full path to where you cloned it; you will need it in the next step.
Configure Claude Code
Run this command with your Gemini API key and the actual path to the cloned repo:
claude mcp add -s user -e GEMINI_API_KEY=your-key yt-analysis -- node /path/to/yt-analysis-mcp/dist/index.js
The -s user flag makes this MCP server available across all your Claude Code sessions, not just the current project. Restart Claude Code after running this and the tools will be active immediately.
Configure Claude Desktop
Open your Claude Desktop MCP config file. On macOS it lives at ~/Library/Application Support/Claude/claude_desktop_config.json. Add the YouTube MCP server to the mcpServers object:
{
  "mcpServers": {
    "yt-analysis": {
      "command": "node",
      "args": ["/path/to/yt-analysis-mcp/dist/index.js"],
      "env": {
        "GEMINI_API_KEY": "your-key"
      }
    }
  }
}
Replace /path/to/yt-analysis-mcp with the actual path on your machine. Restart Claude Desktop and look for the tools icon in the chat interface to confirm the MCP server connected.
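If your config file already lists other MCP servers, the new entry has to be merged into mcpServers rather than pasted over it. The sketch below shows that merge as a pure function. The function name and the "other-server" entry are hypothetical, and reading or writing the actual file is left out.

```typescript
// Sketch: add a yt-analysis entry to a Claude Desktop config object
// without dropping any servers that are already configured.

interface McpServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

interface DesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown; // keep any other top-level settings untouched
}

// Hypothetical helper; returns a new object and leaves the input unchanged.
function addYtAnalysis(
  config: DesktopConfig,
  repoPath: string,
  apiKey: string,
): DesktopConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      "yt-analysis": {
        command: "node",
        args: [`${repoPath}/dist/index.js`],
        env: { GEMINI_API_KEY: apiKey },
      },
    },
  };
}

// "other-server" stands in for whatever you already have configured.
const existing: DesktopConfig = {
  mcpServers: {
    "other-server": { command: "node", args: ["/path/to/other/index.js"] },
  },
};

const updated = addYtAnalysis(existing, "/path/to/yt-analysis-mcp", "your-key");
console.log(JSON.stringify(updated, null, 2));
```

The same shape applies when you edit the file by hand: yt-analysis becomes one more key inside the existing mcpServers object.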
Tools Reference
Once installed, Claude has access to 8 YouTube analysis tools. You do not need to call them explicitly — just give Claude a YouTube URL and describe what you want. It picks the right tool automatically.
Summarize a Video
Tool: summarize_video
Generates a summary at one of three detail levels: brief, medium (default), or detailed (includes timestamps). Brief is good for quickly deciding if a video is worth your time. Detailed is what you want when you need to reference specific moments later.
Summarize this video in detail with timestamps:
https://www.youtube.com/watch?v=VIDEO_ID
Ask Questions About a Video
Tool: ask_about_video
Ask anything about the video content — what the speaker said about a specific topic, what tools they mentioned, what the conclusion was. Gemini has analyzed the whole video, so answers are grounded in actual content rather than guessed from the title or description.
What does the speaker say about rate limiting?
https://www.youtube.com/watch?v=VIDEO_ID
Get the YouTube Transcript
Tool: get_transcript
Returns a full text transcript of the video. This is one of the most searched use cases — getting a YouTube transcript without using a separate tool or API. Useful when you want to quote exact wording, feed content into another workflow, or do your own analysis on the raw text. Unlike YouTube’s auto-generated captions, this transcript comes from Gemini’s full video analysis, so it handles unclear audio and technical terminology better.
Get the full transcript of this video:
https://www.youtube.com/watch?v=VIDEO_ID
Extract Screenshots
Tools: extract_screenshots and extract_frames
extract_screenshots has Gemini identify the most important moments in the video and capture frames from those points. You can also pass a focus parameter to target specific content — code examples, slide transitions, diagrams. The companion tool extract_frames works the same way but takes explicit timestamps you specify rather than letting Gemini decide. Frames are saved locally to the directory you specify.
Search Within a Video
Tool: search_in_video
Searches for specific content within a video and returns timestamps where it appears. Useful for long videos where you know what you are looking for but do not want to scrub through manually.
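Once you have timestamps back, it is handy to turn them into links that open the video at that moment. The sketch below assumes the timestamps arrive as MM:SS or HH:MM:SS strings, which is an assumption about the output format, and uses YouTube's t= URL parameter. Both function names are made up for this example.

```typescript
// Convert "MM:SS" or "HH:MM:SS" (assumed format) into total seconds.
function timestampToSeconds(stamp: string): number {
  if (stamp.trim() === "") {
    throw new Error("Empty timestamp");
  }
  const parts = stamp.split(":").map(Number);
  if (parts.length > 3 || parts.some((p) => !Number.isInteger(p) || p < 0)) {
    throw new Error(`Unrecognized timestamp: ${stamp}`);
  }
  // Fold left: each step shifts the running total up by a factor of 60.
  return parts.reduce((total, part) => total * 60 + part, 0);
}

// Append a t= parameter so the link opens at the given moment.
function linkAtTimestamp(videoUrl: string, stamp: string): string {
  const separator = videoUrl.includes("?") ? "&" : "?";
  return `${videoUrl}${separator}t=${timestampToSeconds(stamp)}s`;
}

console.log(linkAtTimestamp("https://www.youtube.com/watch?v=VIDEO_ID", "12:34"));
// https://www.youtube.com/watch?v=VIDEO_ID&t=754s
```

A typical request to Claude for this kind of search looks like the following: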
Find every timestamp in this video where they discuss
database indexing:
https://www.youtube.com/watch?v=VIDEO_ID
Search YouTube
Tool: search_youtube
Searches YouTube for videos matching a query and returns results with URLs. This is for finding videos in the first place, not analyzing them. Combine it with the other tools to go from a topic to a summary in one conversation without leaving Claude.
Find videos about React Server Components and summarize
the top result
Practical Examples
Three workflows I use regularly:
Conference talk triage. Someone shares a 3-hour conference recording. Ask Claude for a brief summary, then follow up with specific questions about the parts relevant to your current work. You get the value of the talk in about 5 minutes.
Give me a brief summary of this conference talk:
https://www.youtube.com/watch?v=VIDEO_ID
Then tell me: what specific architectural decisions
did they make for the data layer?
Tutorial code extraction. You found a tutorial video for a library but want the code without watching the whole thing. Ask Claude to get the YouTube transcript and pull out all the code snippets.
Get the transcript of this tutorial and extract every
code example they show:
https://www.youtube.com/watch?v=VIDEO_ID
Research synthesis. You want to understand a topic from multiple angles. Search YouTube for videos on the topic, pick the top 3, summarize each, and ask Claude to synthesize the key points across all of them.
Search YouTube for "raft consensus algorithm explained",
find the top 3 results, summarize each one, and tell me
what key points all of them agree on
Tips and Gotchas
- Private and age-restricted videos will not work. Gemini can only analyze publicly accessible YouTube URLs.
- Very long videos are slower. Gemini processes the whole video before responding. A 3-hour video takes noticeably longer than a 10-minute one.
- The Gemini free tier rate-limits at 15 req/min. If you are chaining multiple analyses in a single session, you may hit this. Wait a minute and retry.
- You do not need to call tools by name. Just tell Claude what you want in natural language. “Summarize this video” triggers summarize_video. “What did they say about X?” triggers ask_about_video. “Get me the transcript” triggers get_transcript. Explicit tool names only matter if you want precise control.
- Shorts and youtu.be URLs both work. The MCP server handles youtube.com/watch?v=, youtu.be/, and youtube.com/shorts/.
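For a sense of what handling those three URL forms involves, here is a sketch that pulls the video ID out of each one. It is an illustration, not the server's actual parsing code. The function name is hypothetical, and the pattern assumes the usual 11-character ID made of letters, digits, hyphens, and underscores.

```typescript
// Sketch: extract a video ID from the three URL forms listed above.
// Returns null for anything that does not look like a YouTube video URL.
function extractVideoId(input: string): string | null {
  const id = "([\\w-]{11})(?![\\w-])"; // assumed ID shape: exactly 11 chars
  const patterns: RegExp[] = [
    // youtube.com/watch?v=ID, with v= anywhere in the query string
    new RegExp(`^https?://(?:www\\.|m\\.)?youtube\\.com/watch\\?(?:.*&)?v=${id}`),
    // youtu.be/ID
    new RegExp(`^https?://youtu\\.be/${id}`),
    // youtube.com/shorts/ID
    new RegExp(`^https?://(?:www\\.|m\\.)?youtube\\.com/shorts/${id}`),
  ];
  for (const pattern of patterns) {
    const match = input.match(pattern);
    if (match) return match[1];
  }
  return null;
}

const samples = [
  "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "https://youtu.be/dQw4w9WgXcQ?t=5",
  "https://www.youtube.com/shorts/dQw4w9WgXcQ",
  "https://example.com/watch?v=dQw4w9WgXcQ",
];
for (const sample of samples) {
  console.log(sample, "->", extractVideoId(sample));
}
```

The first three samples resolve to the same ID and the last one returns null, since it is not a YouTube host.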
Full source and issue tracker are on GitHub. If you are interested in other MCP servers I’ve built, check out the Image Generation MCP Server which uses the same Gemini API for image generation across any MCP-compatible client.
FREQUENTLY ASKED QUESTIONS
What is an MCP server and why do I need one for YouTube?
MCP (Model Context Protocol) is an open standard that lets AI agents like Claude connect to external tools and data sources. Claude cannot access YouTube natively, so installing a YouTube MCP server gives it tools to summarize, transcribe, and query videos directly.
How does the YouTube MCP server analyze videos without downloading them?
The yt-analysis MCP server passes YouTube URLs directly to Google's Gemini API, which has native YouTube URL support. Gemini processes the video on Google's infrastructure — no downloading, no separate transcription API, no local processing required.
Does this YouTube MCP server work with Claude Desktop as well as Claude Code?
Yes. It works with Claude Code via the claude mcp add command and with Claude Desktop via the claude_desktop_config.json configuration file. Any MCP-compatible AI agent or IDE can use it.