furuCRM

PDF Knowledge Search Without RAG — Validation of Agentforce × Multimodal × Custom Objects

March 23, 2026

Introduction

“Can PDF knowledge base search and Q&A be implemented without RAG (Retrieval Augmented Generation)?”

Prompted by this question from a client, we conducted a validation combining Salesforce Agentforce and Multimodal AI. In this blog, we share the results of that validation and the implementation approach.

1. Background and Challenges

Conventional Approach: Intelligent Context / Data Library

Salesforce's Intelligent Context and Data Library features work by indexing documents and retrieving relevant content through semantic search. However, this approach faces challenges such as the following.

Accuracy in interpreting PDFs that contain complex tables, charts, and graphs
The cost of building and operating vector search infrastructure
The complexity of designing and tuning RAG pipelines

Validation Question: “Can we achieve equivalent or better quality without RAG?”

To answer this question, we validated an approach that combines PDF analysis using Multimodal AI (Gemini 2.5 Pro) with custom objects and tag-based search.

2. Validation Approach

2.1 Why Multimodal?

When PDFs contain complex tables, charts, and graphs, conventional text extraction and OCR alone often fail to preserve their structure and numeric relationships accurately.

By leveraging the multimodal capability of Gemini 2.5 Pro, we enabled the model to “look at” and understand PDFs, then structure them as follows.

Tables → Converted into HTML <table>
Charts and graphs → Data points extracted and stored as HTML tables
Headings and paragraphs → Structured into hierarchical semantic HTML (<h1> to <h6>, <p>, <section>)

As a result, we confirmed that it is possible to accumulate knowledge in a way that preserves PDF structure and numeric information without relying on Intelligent Context or a data library.

2.2 Overall Architecture

[PDF File]
    ↓ Processed asynchronously by batch (PDFExtractBatch / PDFExtractScheduler)
    ↓ Extracted and structured with Multimodal (Gemini 2.5 Pro)
[PDFExtract Prompt Template]
    ↓ Chunk split and automatic tag generation
[PDF_Knowledge__c] Custom Object
    - Chunk1 to 10 (HTML format)
    - Tag (for search)
    - FileName, FilePath (source information)
    ↓
[PDFTag__c] Master of all tags
    ↓
[User Question] → Agentforce Topic Instruction
    ↓ Calls the appropriate action
[Evaluate_PDF_Tags] LLM evaluates the relevance between the question and tags
    ↓ Identifies relevant tags
[PDF_Knowledge__c Search] LIKE search by tags
    ↓ Retrieves matching chunks
[Search_PDF_Knowledge] Generates response with sources
    ↓
[HTML Response]

3. Implementation Highlights

3.1 PDF Extraction (PDFExtract)

Input:
PDF files in Salesforce Files (ContentDocument)
Process:
Uses multimodal processing to structure tables, charts, and graphs into HTML, then splits them into up to 10 chunks
Output:
JSON (chunks, tags)
Stored In:
PDF_Knowledge__c, PDFTag__c

We confirmed that even for PDFs containing complex diagrams and tables, extraction can preserve both numerical values and structural relationships.
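To make the output shape concrete, here is a minimal Python sketch of how the JSON returned by the extraction step could be parsed and normalized. The function and field names are hypothetical, and the real pipeline runs as an Apex batch against a prompt template; this only illustrates the `{chunks, tags}` contract.

```python
import json

MAX_CHUNKS = 10  # PDF_Knowledge__c stores at most 10 chunk fields

def parse_extract_output(raw: str) -> dict:
    """Parse the JSON emitted by the PDFExtract prompt template and
    normalize it: cap the chunk count and deduplicate tags."""
    data = json.loads(raw)
    chunks = data.get("chunks", [])[:MAX_CHUNKS]
    # Deduplicate tags while preserving their original order.
    tags = list(dict.fromkeys(data.get("tags", [])))
    return {"chunks": chunks, "tags": tags}

# Example model output (quotes inside strings arrive escaped, per the prompt rules)
raw = ('{"chunks": ["<section><h1>Safety</h1><p>Wear PPE.</p></section>"], '
       '"tags": ["Safety Standards", "Procedure", "Safety Standards"]}')
result = parse_extract_output(raw)
```

Normalizing at this boundary keeps downstream storage logic simple even when the model repeats a tag or emits more chunks than the object can hold.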

3.2 Tag-Based Search (As an Alternative to RAG)

RAG typically retrieves semantically similar documents through vector search. In this validation, we instead adopted the following.

Automatic tag generation: The LLM generates tags during PDF extraction
Relevance evaluation between questions and tags: The Evaluate_PDF_Tags Prompt Template uses an LLM to evaluate the relationship between the user’s question and all tags
Filtering by tags: Relevant tags are used to perform a LIKE search on PDF_Knowledge__c and retrieve matching chunks

By incorporating a domain-expert perspective into each prompt, we improved the stability of tag evaluation.
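The filtering step can be pictured as a simple substring match over each record's tag field. The sketch below simulates the SOQL `LIKE` OR-filter in Python; the record shapes and the `TagSearch__c` field name mirror the validation setup, but the in-memory search is only a stand-in for the real query.

```python
def matches_tags(tag_field: str, related_tags: list[str]) -> bool:
    """Simulate `TagSearch__c LIKE '%tag%' OR ...`: a record matches
    if any relevant tag appears anywhere in its tag field."""
    return any(tag in tag_field for tag in related_tags)

records = [
    {"Name": "Safety Standards Guideline.pdf",
     "TagSearch__c": "Safety Standards;Procedure;Checklist"},
    {"Name": "Product Spec Sheet.pdf",
     "TagSearch__c": "Product Specifications;Tolerance Values"},
]
related = ["Safety Standards", "Procedure", "Checklist"]
hits = [r["Name"] for r in records if matches_tags(r["TagSearch__c"], related)]
```

Because the match is a plain substring test, tag naming matters: a tag that is a substring of another (e.g. "Procedure" vs. "Safety Procedure") will match both, which is usually the desired broad-recall behavior here.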

3.3 Integration with Agentforce

Topic Instruction:
Instructs the system to call the appropriate action depending on the type of question
GenAiFunction:
Defines the Search PDF Knowledge action
Invocable Apex:
SearchPDFKnowledge orchestrates tag evaluation → search → answer generation
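The orchestration above (tag evaluation → search → answer generation) can be sketched end to end. This is a Python simulation, not the Invocable Apex itself: the two LLM calls are replaced by naive stand-ins (a keyword heuristic and a string join), and all function names are hypothetical.

```python
def evaluate_tags(question: str, all_tags: list[str]) -> list[str]:
    """Stand-in for the Evaluate_PDF_Tags prompt: in production an LLM
    judges relevance; a keyword heuristic keeps this sketch runnable."""
    words = question.lower().split()
    return [t for t in all_tags if any(w in t.lower() for w in words)]

def search_chunks(related_tags, records):
    """Stand-in for the LIKE search over PDF_Knowledge__c."""
    return [(r["FileName"], r["Chunk"])
            for r in records
            if any(t in r["Tag"] for t in related_tags)]

def generate_answer(question, hits):
    """Stand-in for the Search_PDF_Knowledge prompt: answer plus sources."""
    sources = sorted({name for name, _ in hits})
    body = " ".join(chunk for _, chunk in hits)
    return {"answer": body, "sources": sources}

records = [
    {"FileName": "Safety Standards Guideline.pdf",
     "Tag": "Safety Standards;Procedure",
     "Chunk": "<p>1) Confirm protective equipment is worn.</p>"},
]
all_tags = ["Product Specifications", "Safety Standards", "Procedure"]
question = "explain the safety procedures"
result = generate_answer(question,
                         search_chunks(evaluate_tags(question, all_tags), records))
```

The point of the structure is that each stage has a narrow, testable contract (tags in, chunks out), so the LLM calls can be swapped or tuned independently.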

4. Validation Scenario Examples

In this validation, we tested the system using a scenario based on manufacturing product manuals and safety standards.

4.1 Example of Registered PDF

(Image: example of a registered PDF)

4.2 Example of Extracted Tags (All_Tags_Str)

"Product Specifications","Handling Instructions","Safety Standards","Protective Equipment","Procedure","Quality Control",
"Inspection Standards","Tolerance Values","Checklist","Record Format","Illustration","Flowchart"

4.3 Mapping of Example Questions and Related Tags

(Image: mapping of example questions to related tags)

4.4 Example Flow (Question Example: “Please explain the safety procedures”)

1. The user asks the Agent a question
→ “Please explain the safety procedures”

2. Topic Instruction interprets the question
→ Calls the Search PDF Knowledge action

3. Evaluate_PDF_Tags evaluates the tags
→ Input: Search_Term="Please explain the safety procedures", All_Tags_Str="Product Specifications","Safety Standards",...
→ Output: relatedTags=["Safety Standards","Procedure","Checklist"]

4. Search PDF_Knowledge__c by tags
→ TagSearch__c LIKE '%Safety Standards%' OR TagSearch__c LIKE '%Procedure%' OR TagSearch__c LIKE '%Checklist%'

5. Retrieve matching chunks
→ Matching section from Safety Standards Guideline.pdf

6. Search_PDF_Knowledge generates the answer
→ “The safety procedures are as follows: 1) Confirm wearing protective equipment 2) ...”
→ Source: Safety Standards Guideline.pdf
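The dynamic filter in step 4 has to be assembled from whatever tags the evaluation step returns. A hedged Python sketch of that clause-building (in the actual Apex, single quotes would be escaped with `String.escapeSingleQuotes` before the dynamic SOQL is executed):

```python
def build_where_clause(related_tags: list[str]) -> str:
    """Join the relevant tags into the OR-ed LIKE filter from step 4,
    escaping single quotes to keep the dynamic query safe."""
    parts = ["TagSearch__c LIKE '%{}%'".format(t.replace("'", "\\'"))
             for t in related_tags]
    return " OR ".join(parts)

clause = build_where_clause(["Safety Standards", "Procedure", "Checklist"])
```

Escaping here is not optional: tag values are LLM-generated text, so treating them as untrusted input guards against malformed or injected query fragments.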

5. Prompt Template Examples

Below are examples of each prompt’s role and sample descriptions tailored to the manufacturing scenario.

5.1 PDFExtract (PDF Extraction and Structuring)

Role: Uses Multimodal AI to analyze the PDF and perform HTML structuring, chunk splitting, and tag generation.

Main Instructions (Excerpt):

- Tables → Convert to HTML <table>
- Charts and graphs → Store data points as HTML tables
- Wrap sections with <section> or <div class='section'>
- Generate 3 to 15 tags from the document content (e.g. "Product Specifications","Safety Standards","Tolerance Values")
- Output in JSON format (chunks, tags), and escape " as \" within strings

5.2 Evaluate_PDF_Tags (Tag Relevance Evaluation)

Role: Evaluates the relationship between the user’s question and all tags, and identifies relevant tags.

Input: `Search_Term`, `All_Tags_Str`

Main Instructions (Excerpt):

ROLE: As a document specialist, pick up all tags that connect to the knowledge needed to answer the question.

EVALUATION:
- Direct match: The question’s keyword is included in a tag
- Partial match: A tag is a compound term that contains the concept
- Semantic relevance: A tag belongs to the same domain
- Means/Method: If the question asks “how,” include tags related to procedures and checklists

DOMAIN HINTS (Manufacturing / Quality Control example):
- Specifications / Tolerance values → Product Specifications, Tolerance Values, Handling Instructions, Inspection Standards
- Procedure / Method → Safety Standards, Procedure, Checklist, Record Format
- Protection / Safety → Protective Equipment, Safety Standards

Output Example:

{"relatedTags": ["Safety Standards", "Procedure", "Checklist"]}
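Because LLM output sometimes wraps the JSON in surrounding prose, the consumer of this response benefits from defensive parsing. A minimal Python sketch (the function name is hypothetical; the real parsing happens in Apex):

```python
import json

def parse_related_tags(raw: str) -> list[str]:
    """Extract the relatedTags array from the tag-evaluation response,
    tolerating surrounding prose by slicing out the outermost braces."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end == -1:
        return []
    try:
        return json.loads(raw[start:end + 1]).get("relatedTags", [])
    except json.JSONDecodeError:
        return []

tags = parse_related_tags(
    'Here are the tags: {"relatedTags": ["Safety Standards", "Procedure", "Checklist"]}')
```

Returning an empty list on any parse failure lets the caller fall back gracefully (for example, to a keyword search) instead of surfacing an error to the user.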

5.3 Search_PDF_Knowledge (Answer Generation)

Role: Refers to the knowledge obtained by the search and generates an answer with source attribution.

Input: `Search_Term`, `Search_Results` (JSON of knowledge + sources)

Main Instructions (Excerpt):

- Refer only to the knowledge in the search results; do not fabricate information
- Always clearly indicate the source (e.g. Source: Internal Manual / Safety Standards Guideline.pdf)
- Return the answer in concise HTML format

6. Validation Results and Benefits

6.1 What Worked Well

Interpretation of complex PDFs: Multimodal AI can structure PDFs containing tables, charts, and graphs with high accuracy
Implementation without RAG: Question answering is possible without a vector database or RAG pipeline
Clear source attribution: File names and paths can be included in responses as evidence
Flexibility with custom objects: The data model and search logic can be adjusted to fit company requirements

6.2 Expected Benefits

(Image: table of expected benefits)

7. Pros, Cons, and Scope of Application

7.1 Advantages of This Approach (Without RAG)

(Image: table of advantages of this approach without RAG)

7.2 Disadvantages and Constraints

(Image: table of disadvantages and constraints)

7.3 Guideline for Scope of Application

(Image: table outlining the scope of application)

Conclusion: If the scope is limited to internal manuals, product documents, regulations, and similar materials, this approach is sufficiently practical without RAG. Once cross-domain or large-scale knowledge search becomes necessary, it is realistic to consider moving to RAG or hybrid search.

8. Gemini 2.5 Pro Full Context Capability — Context Width Beyond RAG

Gemini 2.5 Pro, which was used in this validation, takes a different approach from RAG: rather than injecting retrieved fragments into the prompt, its large context window allows file content to be passed into the prompt directly, as-is.

8.1 Specifications (When Using Prompt Templates)

(Image: specifications table)

8.2 Approximation in Japanese — How Many Characters Can Be Injected?

The conversion between token count and character count varies depending on the language and writing system. For Japanese, it is generally estimated that 1 token ≈ 2 to 3 characters (mixed kanji and kana text).

(Image: Japanese character-count conversion table)

* After reserving output tokens, about 980,000 tokens remain available for input. The figures above are theoretical maximum estimates.

As a practical guideline, for Japanese PDFs or text, roughly 1.5 to 2 million characters can be passed as context in a single prompt call. In terms of ordinary business documents (about 500 to 800 characters per page), this corresponds to approximately 2,000 to 4,000 pages.
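The arithmetic behind these estimates is straightforward; the sketch below just makes the ranges in the text reproducible (the constants are the article's own assumptions, not API-guaranteed figures).

```python
INPUT_TOKENS = 980_000        # approx. input budget after reserving output tokens
CHARS_PER_TOKEN = (2, 3)      # rough estimate for Japanese (mixed kanji/kana)
CHARS_PER_PAGE = (500, 800)   # typical business-document page

# Theoretical character capacity of the input window
low_chars = INPUT_TOKENS * CHARS_PER_TOKEN[0]    # 1,960,000 characters
high_chars = INPUT_TOKENS * CHARS_PER_TOKEN[1]   # 2,940,000 characters

# Page count implied by the conservative 1.5-2 million character guideline
pages_low = 1_500_000 // CHARS_PER_PAGE[1]       # densest pages, smallest budget
pages_high = 2_000_000 // CHARS_PER_PAGE[0]      # sparsest pages, largest budget
```

Note the practical guideline of 1.5 to 2 million characters sits deliberately below the theoretical 1.96 to 2.94 million character ceiling, leaving headroom for prompt instructions and variation in tokenization.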

8.3 Difference from RAG — What Full Context Means

(Image: comparison table of RAG vs. Full Context)

The key point: when the knowledge volume fits within roughly 1 million characters, directly injecting the relevant content into the prompt (the Full Context approach) can be simpler and less prone to information loss than using RAG to search for and retrieve fragments. In this validation, we adopted a hybrid approach: PDFs are chunked and stored in custom objects, filtered by tags, and the relevant chunks are then passed into the prompt.

9. Future Expansion Ideas

Hierarchical tagging and synonym mapping
Pre-generation of chunk summaries and abstracts
Consideration of hybrid search (tags + keywords)
Extension to other document formats (Word, Excel)

Summary

To answer the question, “Can this be achieved without RAG?”, we validated that a combination of multimodal PDF analysis, custom objects, and tag-based search can achieve equal or better quality.

When handling PDFs that contain complex tables, charts, and graphs, structuring with Multimodal AI is effective, and even with a simple architecture that does not rely on RAG, practical knowledge search and answer generation can be achieved.

Reference Links / Tech Stack

Salesforce Agentforce
Einstein Prompt Builder (PDFExtract, Evaluate_PDF_Tags, Search_PDF_Knowledge)
Gemini 2.5 Pro (Multimodal PDF Analysis)
Custom Objects: PDF_Knowledge__c, PDFTag__c

This blog is intended to share validation results. When applying this approach in a production environment, we recommend evaluation and tuning according to your specific environment.