Workflow

Access Protected WordPress Pages & Add to Knowledge Base

Scrape password-protected WordPress pages using browser cookies and automatically add the content to a Needle collection. Great for internal wikis and gated content.

Jan Heimes

Last updated

February 4, 2026

Connectors used

Needle

Tags

WordPress · Web Scraping · Knowledge Base · RAG · Protected Content · Internal Wiki · Cookie Authentication

Key Takeaways

  • Scrapes authenticated pages - Access content behind login walls using your browser session cookies
  • AI content extraction - Gemini strips navigation, headers, and footers, returning clean markdown
  • Adds to your Needle collection - Scraped content is indexed for AI-powered semantic search
  • Works with any cookie-authenticated site - WordPress, Drupal, internal wikis, and more
  • One page per run - Handles a single page; extend with a loop for multiple pages

What This Workflow Does

This workflow fetches a password-protected web page using your browser's session cookies, extracts the main content with AI, and adds it to a Needle collection for semantic search. You copy a fetch() request from DevTools, and the workflow makes the authenticated request, converts the HTML to clean markdown, and indexes it in your knowledge base.

Use cases:

  • Index internal wiki pages for AI-powered search
  • Archive gated documentation or member-only content
  • Add protected knowledge base articles to your Needle collection

How It Works

  1. Manual trigger: you paste a fetch() request copied from your browser's DevTools
  2. Parse fetch: a Code node extracts the URL, method, headers, and cookies
  3. HTTP request: fetches the protected page using your session authentication
  4. AI content extraction: Gemini extracts the main text content and formats it as markdown
  5. Add to collection: converts the result to a markdown file and adds it to your Needle collection
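The "Parse fetch" step can be sketched as a small function. This is a minimal, hypothetical version, not the template's actual Code node: it assumes the snippet has the shape Chrome's "Copy as fetch" produces, with a quoted URL followed by a JSON-style options object.

```javascript
// Hypothetical sketch of the "Parse fetch" Code node.
// Extracts the URL, method, and headers (including cookies)
// from a snippet copied via DevTools' "Copy as fetch".
function parseFetchSnippet(snippet) {
  // The URL is the first quoted argument to fetch().
  const urlMatch = snippet.match(/fetch\(\s*["']([^"']+)["']/);
  if (!urlMatch) throw new Error("No fetch() URL found in snippet");
  const url = urlMatch[1];

  // "Copy as fetch" emits the options object in JSON-compatible
  // syntax, so we can slice out the braces and JSON.parse it.
  const start = snippet.indexOf("{");
  const end = snippet.lastIndexOf("}");
  const options =
    start !== -1 && end > start
      ? JSON.parse(snippet.slice(start, end + 1))
      : {};

  return {
    url,
    method: options.method || "GET",
    headers: options.headers || {}, // includes the "cookie" header
  };
}
```

If the snippet contains hand-edited, non-JSON syntax (comments, trailing commas), `JSON.parse` will reject it, so paste the snippet unmodified.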

Setup Instructions

  1. Click "Use template" on this page
  2. Log into the site with the protected content
  3. Open DevTools (F12) and go to the Network tab
  4. Navigate to the protected page
  5. Find the page request (usually the first one), right-click it, and choose Copy → Copy as fetch
  6. Paste the fetch() into the Manual Trigger node
  7. Select your target Needle collection in the last node
  8. Run the workflow
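For orientation, the text you paste in step 6 looks roughly like the following. The URL, cookie name, and cookie value below are made up; your browser fills in the real ones when you use "Copy as fetch".

```javascript
// Illustrative example of a "Copy as fetch" snippet, stored here as a
// string only so it can be shown without making a network request.
// Everything inside is fabricated for demonstration.
const pastedFetch = `fetch("https://wiki.example.com/internal/handbook", {
  "headers": {
    "accept": "text/html",
    "cookie": "wordpress_logged_in_a1b2=jan%7C1700000000%7Cabcdef"
  },
  "method": "GET",
  "body": null
});`;
```

The `cookie` header is what carries your session; without it, the request returns the login page instead of the protected content.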

Customization

  • Target collection: select a different Needle collection in the "Add Files" node
  • Content extraction: edit the AI node prompt to focus on specific parts of the page
  • Multiple pages: wrap the workflow in a loop with a list of URLs
  • Output format: change the code node to produce a different file format
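The "convert to a markdown file" step can be sketched as below. The field names (`fileName`, `mimeType`, `data`) are assumptions about an n8n-style binary item, not the template's exact output shape; the core idea is base64-encoding the markdown so it can be attached as a file.

```javascript
// Hypothetical sketch of the code node that turns the extracted
// markdown into a file object. Field names are illustrative.
function toMarkdownFileItem(markdown, fileName) {
  return {
    fileName,
    mimeType: "text/markdown",
    // File contents are carried as base64 (Node's Buffer is built in).
    data: Buffer.from(markdown, "utf8").toString("base64"),
  };
}
```

To change the output format (the "Output format" customization above), swap the `mimeType` and the encoding logic, e.g. emit `text/plain` or HTML instead of markdown.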

FAQ

Q: Does this work with any website? A: It works with any site that uses cookie-based authentication. Sites that render their content only with JavaScript may need the AI browser tool instead.

Q: How long do session cookies stay valid? A: It depends on the site. Most sessions last between 24 hours and 30 days. Re-copy the fetch() if your session expires.

Q: Can I scrape multiple pages at once? A: The template handles one page per run. You can extend it with a loop node to process a list of URLs.

Q: Is the scraped content searchable immediately? A: Yes, once added to your Needle collection, it is indexed and available for semantic search right away.

