Workflow
Scrape YouTube Channel Transcripts to RAG
Fetch videos from a YouTube channel via Supadata, extract transcripts with metadata, and ingest them into a Needle collection with rich labels for retrieval.
Needle Team
Last updated
March 1, 2026
Tags
YouTube, Transcript Extraction, RAG, Knowledge Base, AI Search, Metadata Labeling, Channel Intelligence
Key Takeaways
- Bulk channel ingestion: Pulls up to hundreds of YouTube videos from one channel handle in a single run
- Transcript-first RAG pipeline: Converts each transcript into Markdown and stores it in your Needle collection
- Rich labeling for retrieval: Adds structured labels like videoId, durationMinutes, topic_*, and mentions_*
- Two-stage loop architecture: Fetches IDs first, then iterates over each video for transcript extraction and indexing
- Built for AI agent memory: Gives your collection searchable long-form video knowledge, not just links
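To make the transcript-to-Markdown step concrete, here is a minimal Python sketch of rendering one transcript as a Markdown document. The function name, the `title` field, and the heading layout are assumptions for illustration; the workflow's actual Markdown template is not shown on this page.

```python
import re


def transcript_to_markdown(video_id: str, title: str, transcript: str) -> str:
    """Render one transcript as a small Markdown document (hypothetical layout)."""
    # Collapse runs of whitespace, including newlines, into single spaces.
    text = re.sub(r"\s+", " ", transcript).strip()
    # Standard YouTube watch URL built from the video ID.
    url = f"https://www.youtube.com/watch?v={video_id}"
    return (
        f"# {title}\n\n"
        f"- videoId: {video_id}\n"
        f"- videoUrl: {url}\n\n"
        f"{text}\n"
    )
```

Keeping the ID and URL at the top of the file means a retrieved chunk can still be traced back to its source video, even before labels are consulted.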
What This Workflow Does
This workflow turns a YouTube channel into a searchable RAG knowledge base.
You provide a channel handle (for example @n8n) in the Manual Trigger. The workflow then:
- Gets all video + short IDs from Supadata
- Loops each video ID and fetches transcript content
- Normalizes transcript text and generates Markdown files
- Adds each transcript file to your Needle collection
- Adds metadata labels to power semantic + filter-based retrieval
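The steps above can be sketched as a two-stage loop in Python. This is an illustration under stated assumptions, not the workflow's implementation: `list_video_ids`, `fetch_transcript`, and `add_file` are hypothetical callables standing in for the Supadata and Needle nodes, and the `<videoId>.md` file naming is assumed.

```python
from typing import Callable, Dict, Iterable, List, Optional


def ingest_channel(
    handle: str,
    list_video_ids: Callable[[str], Iterable[str]],
    fetch_transcript: Callable[[str], Optional[str]],
    add_file: Callable[[str, str], None],
) -> Dict[str, List[str]]:
    """Stage 1 collects IDs; stage 2 fetches and stores each transcript."""
    # Stage 1: resolve the channel handle to a list of video IDs only.
    video_ids = list(list_video_ids(handle))

    added: List[str] = []
    skipped: List[str] = []
    # Stage 2: process one video at a time.
    for video_id in video_ids:
        transcript = fetch_transcript(video_id)
        if not transcript or not transcript.strip():
            # No transcript exposed for this video: record it and keep looping.
            skipped.append(video_id)
            continue
        # Normalize whitespace before storing (assumed file name: <videoId>.md).
        add_file(f"{video_id}.md", " ".join(transcript.split()))
        added.append(video_id)
    return {"added": added, "skipped": skipped}
```

Because the three collaborators are passed in, the loop can be exercised with in-memory fakes (a dict of transcripts and a dict used as the "collection") before any real API is wired up.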
Setup
- Create a Supadata account and API key
- Set SUPADATA_API_KEY as a secret workflow variable
- Select your target Needle collection in both Needle nodes
- Run with a channel handle like @channel_name
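If you want to check the same inputs outside the workflow, a pre-flight check might look like the Python sketch below. It assumes the key is available as an environment variable named SUPADATA_API_KEY (in the workflow itself it is a secret workflow variable), and the handle pattern is an assumed approximation, not a rule taken from YouTube or Supadata.

```python
import os
import re
from typing import Mapping, Tuple


def preflight(handle: str, env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (normalized_handle, api_key) or raise ValueError."""
    api_key = env.get("SUPADATA_API_KEY", "").strip()
    if not api_key:
        raise ValueError("SUPADATA_API_KEY is not set")

    # Accept "channel_name" or "@channel_name" and normalize to the "@" form.
    handle = handle.strip()
    if not handle.startswith("@"):
        handle = "@" + handle
    # Assumed handle shape: "@" followed by word characters, periods, or hyphens.
    if not re.fullmatch(r"@[\w.-]+", handle):
        raise ValueError(f"unexpected channel handle: {handle!r}")
    return handle, api_key
```

Failing fast here covers the two inputs most likely to cause an empty run: a missing key and a mistyped handle.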
Labels Added
- Source + identity: source, videoId, videoUrl, channelHandle
- Video shape: isShort, videoType, language
- Stats: wordCount, characterCount, durationMinutes, lengthCategory
- Time buckets: indexedDate, indexedYearMonth, indexedYear, indexedMonth
- Flags: isTutorial, isReview, isNews, hasLiveDemo, hasCTA
- Dynamic topic labels: topic_*
- Dynamic tool labels: mentions_*
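As a rough illustration of how a subset of these labels could be derived, here is a minimal Python sketch. The label names come from the list above; everything else is an assumption: the `build_labels` helper, the "youtube" source value, the lengthCategory cut-offs (5 and 20 minutes), the date formats, and the use of `True` as the value for topic_* and mentions_* labels.

```python
import re
from datetime import date
from typing import Iterable, Optional


def _slug(name: str) -> str:
    """Lowercase a name and replace non-alphanumeric runs with underscores."""
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")


def build_labels(
    video_id: str,
    channel_handle: str,
    transcript: str,
    duration_seconds: float,
    topics: Iterable[str] = (),
    tools: Iterable[str] = (),
    today: Optional[date] = None,
) -> dict:
    today = today or date.today()
    minutes = round(duration_seconds / 60, 1)
    # Assumed buckets; the workflow's real cut-offs are not documented here.
    if minutes < 5:
        category = "short"
    elif minutes < 20:
        category = "medium"
    else:
        category = "long"

    labels = {
        "source": "youtube",  # assumed value
        "videoId": video_id,
        "videoUrl": f"https://www.youtube.com/watch?v={video_id}",
        "channelHandle": channel_handle,
        "wordCount": len(transcript.split()),
        "characterCount": len(transcript),
        "durationMinutes": minutes,
        "lengthCategory": category,
        "indexedDate": today.isoformat(),
        "indexedYearMonth": today.strftime("%Y-%m"),
        "indexedYear": today.year,
        "indexedMonth": today.month,
    }
    # Dynamic labels: one key per detected topic or mentioned tool.
    for topic in topics:
        labels[f"topic_{_slug(topic)}"] = True
    for tool in tools:
        labels[f"mentions_{_slug(tool)}"] = True
    return labels
```

Flat keys such as topic_ai_agents or mentions_n8n can be used directly as filter conditions alongside semantic search, which is the purpose the dynamic labels serve in this workflow.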
Troubleshooting
- No files added: Verify channel handle and API key.
- Transcript missing: Some videos do not expose transcripts; loop continues by design.
- Labels missing: Ensure both Needle nodes target the same collection.
- Run is slow: Lower initial fetch limit or increase wait tolerance.