Workflow

Scrape YouTube Channel Transcripts to RAG

Fetch videos from a YouTube channel via Supadata, extract transcripts with metadata, and ingest them into a Needle collection with rich labels for retrieval.

Needle Team

Last updated

March 1, 2026

Connectors used

Needle

Tags

YouTube, Transcript Extraction, RAG, Knowledge Base, AI Search, Metadata Labeling, Channel Intelligence

Key Takeaways

  • Bulk channel ingestion: Pulls hundreds of YouTube videos from one channel handle in a single run
  • Transcript-first RAG pipeline: Converts each transcript into Markdown and stores it in your Needle collection
  • Rich labeling for retrieval: Adds structured labels like videoId, durationMinutes, topic_*, and mentions_*
  • Two-stage loop architecture: Fetches IDs first, then iterates over each video for transcript extraction and indexing
  • Built for AI agent memory: Gives your collection searchable long-form video knowledge, not just links

What This Workflow Does

This workflow turns a YouTube channel into a searchable RAG knowledge base.

You provide a channel handle (for example @n8n) in the Manual Trigger. The workflow then:

  1. Gets the IDs of all videos and Shorts on the channel from Supadata
  2. Loops over each video ID and fetches its transcript
  3. Normalizes the transcript text and generates a Markdown file per video
  4. Adds each transcript file to your Needle collection
  5. Adds metadata labels to power semantic and filter-based retrieval
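
The two-stage loop described in the steps above can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual node code: list_video_ids, fetch_transcript, and add_file are hypothetical callables standing in for the Supadata and Needle nodes, and the Markdown layout is an assumption.

```python
from typing import Callable, Iterable, Optional


def transcript_to_markdown(video_id: str, text: str) -> str:
    """Collapse whitespace and wrap the transcript in a small Markdown document."""
    normalized = " ".join(text.split())
    url = f"https://www.youtube.com/watch?v={video_id}"
    # Assumed layout: a title, a source link, then the transcript body.
    return f"# Transcript for {video_id}\n\nSource: {url}\n\n{normalized}\n"


def run_pipeline(
    handle: str,
    list_video_ids: Callable[[str], Iterable[str]],    # hypothetical: wraps the Supadata ID fetch
    fetch_transcript: Callable[[str], Optional[str]],  # hypothetical: returns None if no transcript
    add_file: Callable[[str, str], None],              # hypothetical: wraps the Needle file upload
) -> dict:
    """Stage 1 collects all IDs; stage 2 iterates them and skips missing transcripts."""
    video_ids = list(list_video_ids(handle))  # stage 1: fetch IDs first
    added, skipped = [], []
    for video_id in video_ids:  # stage 2: one transcript per iteration
        text = fetch_transcript(video_id)
        if not text or not text.strip():
            skipped.append(video_id)  # keep looping when a transcript is unavailable
            continue
        add_file(f"{video_id}.md", transcript_to_markdown(video_id, text))
        added.append(video_id)
    return {"added": added, "skipped": skipped}


if __name__ == "__main__":
    # In-memory stand-ins so the sketch runs without network access.
    fake_transcripts = {"abc123": "Hello   world,\nthis is a demo.", "def456": None}
    stored = {}
    result = run_pipeline(
        "@channel_name",
        list_video_ids=lambda handle: fake_transcripts.keys(),
        fetch_transcript=fake_transcripts.get,
        add_file=stored.__setitem__,
    )
    print(result)
```

Passing the three operations in as callables keeps the loop logic testable on its own; real implementations would replace the stand-ins with calls to the respective services.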

Setup

  1. Create a Supadata account and API key
  2. Set SUPADATA_API_KEY as a secret workflow variable
  3. Select your target Needle collection in both Needle nodes
  4. Run with a channel handle like @channel_name
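
If you want to sanity-check your inputs before a run, a small pre-flight check along these lines may help. It is only a sketch: reading SUPADATA_API_KEY from the process environment is a stand-in for the workflow's secret variable, and the handle pattern is a rough assumption rather than YouTube's exact naming rules.

```python
import os
import re
from typing import Mapping

# Assumed, simplified handle shape: "@" followed by 3 to 30 letters, digits, ".", "_" or "-".
HANDLE_PATTERN = re.compile(r"^@[A-Za-z0-9._-]{3,30}$")


def normalize_handle(raw: str) -> str:
    """Trim whitespace, ensure a leading '@', and reject obviously malformed handles."""
    handle = raw.strip()
    if not handle.startswith("@"):
        handle = "@" + handle
    if not HANDLE_PATTERN.match(handle):
        raise ValueError(f"Unexpected channel handle format: {raw!r}")
    return handle


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Supadata key, failing early if it is missing or blank."""
    key = env.get("SUPADATA_API_KEY", "").strip()
    if not key:
        raise RuntimeError("SUPADATA_API_KEY is not set")
    return key


if __name__ == "__main__":
    print(normalize_handle(" n8n "))
```

Failing early on a blank key or malformed handle is cheaper than discovering the problem after the fetch stage returns nothing.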

Labels Added

  • Source + identity: source, videoId, videoUrl, channelHandle
  • Video shape: isShort, videoType, language
  • Stats: wordCount, characterCount, durationMinutes, lengthCategory
  • Time buckets: indexedDate, indexedYearMonth, indexedYear, indexedMonth
  • Flags: isTutorial, isReview, isNews, hasLiveDemo, hasCTA
  • Dynamic topic labels: topic_*
  • Dynamic tool labels: mentions_*
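
To make the label groups above concrete, here is a hedged sketch of how such a label set could be derived for one video. The label names come from the list above; the value formats, the "youtube" source value, the lengthCategory thresholds, and the slug helper are assumptions for illustration, and the flag labels (isTutorial and so on) are omitted because the source does not say how they are detected.

```python
import re
from datetime import date
from typing import Iterable


def slug(value: str) -> str:
    """Lowercase and replace runs of non-alphanumerics with '_' for use in label keys."""
    return re.sub(r"[^a-z0-9]+", "_", value.lower()).strip("_")


def length_category(duration_minutes: float) -> str:
    """Assumed thresholds: under 5 min is short, under 20 min is medium, else long."""
    if duration_minutes < 5:
        return "short"
    if duration_minutes < 20:
        return "medium"
    return "long"


def build_labels(
    video_id: str,
    handle: str,
    transcript: str,
    duration_seconds: float,
    is_short: bool,
    topics: Iterable[str],
    tools: Iterable[str],
    indexed_on: date,
) -> dict:
    duration_minutes = round(duration_seconds / 60, 1)
    labels = {
        # Source + identity
        "source": "youtube",
        "videoId": video_id,
        "videoUrl": f"https://www.youtube.com/watch?v={video_id}",
        "channelHandle": handle,
        # Video shape
        "isShort": is_short,
        # Stats
        "wordCount": len(transcript.split()),
        "characterCount": len(transcript),
        "durationMinutes": duration_minutes,
        "lengthCategory": length_category(duration_minutes),
        # Time buckets
        "indexedDate": indexed_on.isoformat(),
        "indexedYearMonth": indexed_on.strftime("%Y-%m"),
        "indexedYear": indexed_on.year,
        "indexedMonth": indexed_on.month,
    }
    # Dynamic labels: one boolean key per detected topic or mentioned tool.
    for topic in topics:
        labels[f"topic_{slug(topic)}"] = True
    for tool in tools:
        labels[f"mentions_{slug(tool)}"] = True
    return labels


if __name__ == "__main__":
    print(build_labels("abc123", "@n8n", "hello big world", 630, False,
                       ["AI Agents"], ["n8n"], date(2026, 3, 1)))
```

Flat, predictable keys such as topic_* and mentions_* make it straightforward to combine semantic search with exact-match filters at query time.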

Troubleshooting

  • No files added: Verify the channel handle and the API key.
  • Transcript missing: Some videos do not expose transcripts; the loop skips them and continues by design.
  • Labels missing: Ensure both Needle nodes target the same collection.
  • Run is slow: Lower the initial fetch limit or increase the wait tolerance.
