Extract Abbreviations from Knowledge Base
Automatically extract abbreviations from documents in your knowledge base and compile them into a structured Google Sheet or Google Doc directory with definitions.
Last updated: November 3, 2025
Key Takeaways
- Scans your document collection: loops through files in a Needle collection to find abbreviations and their definitions.
- Structured AI extraction: uses GPT-4.1 with structured output to return abbreviations as word, abbreviation, and definition triplets.
- Outputs to Google Sheets: writes results to a Google Sheet, checking for duplicates before adding new entries.
- Handles pagination: processes files in batches of 20 with automatic pagination through your entire collection.
- Manual trigger: you run it when you have documents ready to process.
What This Workflow Does
This workflow reads documents from a Needle collection, uses AI to extract abbreviations and their definitions, and compiles them into a Google Sheet. It loops through your files in batches, extracts abbreviations from each file, checks against existing entries in the sheet to avoid duplicates, and adds only new ones. It is designed for building glossaries or abbreviation directories from technical documentation, policies, or internal wikis.
Use cases:
- Building an onboarding glossary from internal documentation
- Maintaining an abbreviation directory for technical or compliance documents
- Creating a searchable reference sheet from research papers or knowledge base articles
How It Works
| Step | What Happens |
|---|---|
| 1. Manual Trigger | You start the workflow when documents are ready. |
| 2. Loop | Paginates through files in your Needle collection, 20 at a time. |
| 3. List Files | Fetches a batch of files from the collection using offset-based pagination. |
| 4. Transform | Flattens the file list for processing. |
| 5. Get File Contents | Extracts text content from each file. |
| 6. AI Extract Abbreviations | GPT-4.1 identifies abbreviations and returns structured triplets (word, abbreviation, definition). |
| 7. Get Values in Range | Reads existing entries from the Google Sheet. |
| 8. Merge | Combines the newly extracted abbreviations with existing sheet data. |
| 9. AI Write to Sheet | An AI agent with Google Sheets tools checks for duplicates and adds new entries. |
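The offset-based pagination in steps 2 and 3 can be sketched in a few lines of Python. This is an illustration only, not the workflow's internal implementation: `iter_batches`, `fake_list_files`, and `FAKE_FILES` are hypothetical names, and the in-memory list stands in for the List Files node.

```python
from typing import Callable, Iterator, List

PAGE_SIZE = 20  # batch size stated in the workflow description


def iter_batches(
    list_files: Callable[[int, int], List[str]],
    page_size: int = PAGE_SIZE,
    max_iterations: int = 20,
) -> Iterator[List[str]]:
    """Yield batches of file IDs until the collection is exhausted."""
    for iteration in range(max_iterations):
        offset = iteration * page_size  # offset-based pagination
        batch = list_files(offset, page_size)
        if not batch:
            break  # nothing left to process
        yield batch
        if len(batch) < page_size:
            break  # a short batch means this was the last page


# Hypothetical stand-in for the List Files node: 45 fake file IDs.
FAKE_FILES = [f"file-{i}" for i in range(45)]


def fake_list_files(offset: int, limit: int) -> List[str]:
    return FAKE_FILES[offset:offset + limit]


batches = list(iter_batches(fake_list_files))
print([len(b) for b in batches])  # prints [20, 20, 5]
```

Stopping on a short or empty batch avoids one extra request when the collection size is not a multiple of the batch size.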
Workflow Nodes
| Node | Role |
|---|---|
| Manual Trigger | Starts the workflow on demand |
| Loop | Paginates through files with a configurable iteration limit (up to 20 iterations) |
| List Files | Fetches files from a Needle collection with offset-based pagination |
| Transform | Flattens nested file data into a single list |
| Get File Contents | Retrieves text content from each document |
| AI Agent (Extract) | Uses GPT-4.1 to extract abbreviations as structured output (word, abbreviation, definition) |
| Google Sheets Get Values in Range | Reads existing data from the target Google Sheet |
| Merge | Combines extracted abbreviations with existing sheet entries |
| AI Agent (Write) | Uses GPT-4.1 with Google Sheets tools to add new, non-duplicate abbreviations to the sheet |
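To make the (word, abbreviation, definition) triplet concrete, here is a minimal Python sketch of how such structured output could be represented and validated. It assumes the extraction step returns a JSON array of objects with those three keys; the exact payload shape is not documented here, and `AbbreviationEntry` and `parse_entries` are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AbbreviationEntry:
    word: str
    abbreviation: str
    definition: str


def parse_entries(raw_json: str) -> List[AbbreviationEntry]:
    """Parse a JSON array of triplets, skipping malformed items."""
    entries = []
    for item in json.loads(raw_json):
        try:
            entries.append(
                AbbreviationEntry(
                    word=item["word"].strip(),
                    abbreviation=item["abbreviation"].strip(),
                    definition=item["definition"].strip(),
                )
            )
        except (KeyError, AttributeError, TypeError):
            continue  # missing field or non-string value: skip the item
    return entries


# Assumed example payload: one complete triplet and one incomplete item.
sample = json.dumps([
    {
        "word": "Application Programming Interface",
        "abbreviation": "API",
        "definition": "A contract for how software components communicate.",
    },
    {"word": "Incomplete"},
])
entries = parse_entries(sample)
print(len(entries))  # prints 1
```

A file with no abbreviations would correspond to an empty array, for which `parse_entries("[]")` returns an empty list.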
Setup Instructions
1. Add the "Extract Abbreviations from Knowledge Base" template to your Needle workspace.
2. Upload your documents (PDFs, Word docs, markdown, text files) to a Needle collection.
3. Open the List Files node and select your collection.
4. Create a Google Sheet with columns: Word, Abbreviation, Definition.
5. Connect your Google Sheets account by creating a Google Sheets connector in Needle.
6. Update the Get Values in Range node with your Google Sheet URL.
7. Update the AI Write node's system prompt with your Google Sheet URL.
8. Select your Google Sheets connector in the relevant nodes.
9. Click the manual trigger to run the workflow.
Customization
| What You Can Change | How |
|---|---|
| Document collection | Select a different Needle collection in the List Files node |
| Output destination | Replace Google Sheets nodes with Google Docs nodes for a formatted glossary instead of a spreadsheet |
| AI extraction focus | Modify the AI prompt to target specific types of abbreviations (technical, business, industry-specific) |
| Pagination batch size | Adjust the loop condition and offset calculation to process more or fewer files per iteration |
| Google Sheet columns | Update the AI Write node's system prompt to match a different column structure |
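When adjusting the pagination batch size, it helps to check how many loop iterations a collection needs and how many files one run can cover. The arithmetic below is a rough sketch that assumes the defaults mentioned above (batches of 20, up to 20 iterations); the function names are hypothetical.

```python
import math


def iterations_needed(total_files: int, batch_size: int = 20) -> int:
    """Number of loop iterations required to cover every file."""
    return math.ceil(total_files / batch_size)


def max_files_per_run(batch_size: int = 20, max_iterations: int = 20) -> int:
    """Upper bound on files processed in one run under the iteration limit."""
    return batch_size * max_iterations


print(iterations_needed(45))   # prints 3
print(max_files_per_run())     # prints 400
```

Under these assumptions, a collection larger than 400 files would need a larger batch size or a higher iteration limit to be covered in a single run.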
FAQ
Q: What file types does this support? A: It works with any files uploaded to a Needle collection, including PDFs, Word documents, markdown, and plain text files.
Q: How does it handle duplicates? A: The Get Values in Range node reads the existing entries from the Google Sheet, and the AI Write node compares the newly extracted abbreviations against them and adds only those that are not already present.
Q: Can I output to Google Docs instead of Sheets? A: Yes. You can replace the Google Sheets nodes and tools with Google Docs equivalents for a formatted glossary document.
Q: What if a document has no abbreviations? A: The AI extraction node will return an empty list for that file, and the workflow will continue to the next one.
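The duplicate handling described in the FAQ is performed by an AI agent in this workflow. For comparison, the same idea can be expressed as a deterministic filter. The sketch below is a swapped-in technique, not what the template does: it assumes rows are (word, abbreviation, definition) tuples and treats two entries as duplicates when their abbreviations match after trimming and case folding. `normalize` and `new_rows` are hypothetical names.

```python
from typing import Iterable, List, Set, Tuple

Row = Tuple[str, str, str]  # (word, abbreviation, definition)


def normalize(abbreviation: str) -> str:
    """Trim whitespace and fold case so 'api' and ' API ' compare equal."""
    return abbreviation.strip().casefold()


def new_rows(existing: Iterable[Row], extracted: Iterable[Row]) -> List[Row]:
    """Return extracted rows whose abbreviation is not already in the sheet."""
    seen: Set[str] = {normalize(row[1]) for row in existing}
    fresh: List[Row] = []
    for row in extracted:
        key = normalize(row[1])
        if key and key not in seen:
            seen.add(key)  # also drops repeats within the same batch
            fresh.append(row)
    return fresh


existing = [("Application Programming Interface", "API", "An interface between programs.")]
extracted = [
    ("application programming interface", " api ", "Duplicate of an existing row."),
    ("Service Level Agreement", "SLA", "A commitment between provider and client."),
    ("Service Level Agreement", "SLA", "Repeated within the same batch."),
]
fresh = new_rows(existing, extracted)
print(fresh)
```

Keying on the abbreviation alone is a deliberate simplification: two different terms that share an abbreviation would be collapsed into one, which an AI-based check might handle differently.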