Access Protected WordPress Pages & Add to Knowledge Base
Scrape password-protected WordPress pages using browser cookies and automatically add the content to a Needle collection. Great for internal wikis and gated content.
Key Takeaways
- Scrapes authenticated pages - Access content behind login walls using your browser session cookies
- AI content extraction - Gemini strips navigation, headers, and footers, returning clean markdown
- Adds to your Needle collection - Scraped content is indexed for AI-powered semantic search
- Works with any cookie-authenticated site - WordPress, Drupal, internal wikis, and more
- One page per run - Handles a single page; extend with a loop for multiple pages
What This Workflow Does
This workflow fetches a password-protected web page using your browser's session cookies, extracts the main content with AI, and adds it to a Needle collection for semantic search. You copy a fetch() request from DevTools, and the workflow makes the authenticated request, converts the HTML to clean markdown, and indexes it in your knowledge base.
Use cases:
- Index internal wiki pages for AI-powered search
- Archive gated documentation or member-only content
- Add protected knowledge base articles to your Needle collection
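For reference, here is roughly what a copied snippet looks like; the domain, cookie names, and values below are all placeholders. Note that in Chrome the plain "Copy as fetch" option may omit the cookie header (the browser sends cookies implicitly), in which case "Copy as Node.js fetch" keeps it in the snippet:

```javascript
// Illustrative "Copy as fetch" output; every value here is a placeholder.
fetch("https://wiki.example.com/internal-handbook/", {
  "headers": {
    "accept": "text/html,application/xhtml+xml",
    "accept-language": "en-US,en;q=0.9",
    "cookie": "wordpress_logged_in_abc123=admin%7C1700000000%7C...; wp-settings-time-1=1700000000"
  },
  "referrer": "https://wiki.example.com/",
  "method": "GET",
  "mode": "cors",
  "credentials": "include"
});
```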
How It Works
| Step | What Happens |
|---|---|
| 1. Manual trigger | You paste a fetch() request copied from your browser's DevTools |
| 2. Parse fetch | Code node extracts the URL, method, headers, and cookies (see the sketch after this table) |
| 3. HTTP request | Fetches the protected page using your session authentication |
| 4. AI content extraction | Gemini extracts the main text content and formats it as markdown |
| 5. Add to collection | Converts to a markdown file and adds it to your Needle collection |
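Step 2 is where the pasted text becomes structured data. Here is a minimal sketch of such a Code node, assuming the snippet arrives on the incoming item as a fetchSnippet field and that DevTools emitted the options object as strict JSON (Chrome does); the field name and parsing approach are illustrative, not the template's exact code:

```javascript
// Sketch of a "Parse fetch" Code node (run once for all items).
const snippet = $input.first().json.fetchSnippet; // assumed field name

// URL: the first quoted string inside fetch(...)
const urlMatch = snippet.match(/fetch\(\s*["']([^"']+)["']/);
if (!urlMatch) throw new Error('No URL found in the pasted fetch() snippet');

// Options: everything from the first "{" to the last "}"
const optionsText = snippet.slice(snippet.indexOf('{'), snippet.lastIndexOf('}') + 1);
const options = JSON.parse(optionsText);

const headers = options.headers ?? {};
return [{
  json: {
    url: urlMatch[1],
    method: options.method ?? 'GET',
    headers,
    cookie: headers.cookie ?? headers.Cookie ?? '',
  },
}];
```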
Setup Instructions
- Click "Use template" on this page
- Log into the site with the protected content
- Open DevTools (F12) and go to the Network tab
- Navigate to the protected page
- Find the page request (usually the first one), right-click it, and choose "Copy as fetch"
- Paste the fetch() into the Manual Trigger node
- Select your target Needle collection in the last node
- Run the workflow
Customization
| What You Can Change | How |
|---|---|
| Target collection | Select a different Needle collection in the "Add Files" node |
| Content extraction | Edit the AI node prompt to focus on specific parts of the page |
| Multiple pages | Wrap the workflow in a loop with a list of URLs (see the sketch after this table) |
| Output format | Change the code node to produce a different file format |
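The multiple-pages customization can be sketched as a Code node placed after the trigger that fans out one item per URL; the downstream nodes then run once per item. The URLs and cookie value below are placeholders:

```javascript
// Hypothetical fan-out node for scraping several pages in one run.
// Paste your own cookie header from DevTools in place of this placeholder.
const cookie = 'wordpress_logged_in_abc123=...; wp-settings-time-1=...';

const urls = [
  'https://wiki.example.com/handbook/',
  'https://wiki.example.com/onboarding/',
  'https://wiki.example.com/runbooks/',
];

// n8n executes the downstream nodes once per returned item, so each URL
// flows through the HTTP request, AI extraction, and Needle steps in turn.
return urls.map((url) => ({
  json: { url, method: 'GET', headers: { cookie } },
}));
```

For longer lists, a Loop Over Items (Split In Batches) node in front of the HTTP request keeps the calls sequential, which reduces the chance of the session expiring mid-run or the site rate-limiting you.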
FAQ
Q: Does this work with any website? A: It works with any site that uses cookie-based authentication. Sites using JavaScript-only rendering may need the AI browser tool instead.
Q: How long do session cookies stay valid? A: It depends on the site; most sessions last between 24 hours and 30 days. WordPress login cookies default to 48 hours, or 14 days when "Remember Me" is checked. Re-copy the fetch() if your session expires.
Q: Can I scrape multiple pages at once? A: The template handles one page per run. You can extend it with a loop node to process a list of URLs.
Q: Is the scraped content searchable immediately? A: Yes, once added to your Needle collection, it is indexed and available for semantic search right away.
