Agentforce and Data Cloud accounts for 20% of the total score on the Salesforce Agentforce Specialist exam. The topic covers data libraries, Retrieval Augmented Generation (RAG), search types, and retrievers.

NOTE

Most of the content in this work was generated with the assistance of AI and carefully reviewed, edited, and curated by the author. If you have found any issues with the content on this page, please do not hesitate to contact me at support@issacc.com.

🤖 Agentforce & Data Cloud

🎯 Learning Objectives

After studying this topic, you should be able to:

  • Explain how an Agentforce Data Library improves agent response accuracy and personalization.
  • Describe how to create and configure data libraries using the Knowledge base or uploaded files.
  • Assign an Agentforce Data Library to an agent via the Agent Builder.
  • Explain how Retrieval Augmented Generation (RAG) enhances AI-generated responses.
  • Use retrievers to ground prompt templates with relevant Data Cloud information.

📚 Agentforce Data Library Overview

💡 A Data Library acts as a structured knowledge repository for Agentforce agents.

🔍 Purpose | 🧩 Description
Accuracy | Ground AI responses in domain-specific knowledge.
Personalization | Provide contextually relevant, organization-specific responses.
Trust | Build user confidence in generative AI output.

Data libraries can use:

  • Salesforce Knowledge base
  • Uploaded files (.txt, .html, .pdf)

⚙️ Core Concepts

🧠 Grounding

  • Injects domain-specific knowledge and customer context into LLM prompts.
  • Leads to more accurate, relevant, and trustworthy responses.

🧩 Chunking

  • Breaks data sources into smaller chunks for efficient search and retrieval.
  • Works across text, images, and audio.
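As a rough illustration of chunking for text, the sketch below splits a string into fixed-size character windows that overlap slightly, so a sentence cut at a boundary still appears whole in a neighboring chunk. The function name `chunk_text` and its parameters are hypothetical; this is not how Data Cloud chunks content internally, only a minimal picture of the idea.

```python
def chunk_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into fixed-size character windows that overlap slightly."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

Smaller chunks tend to give more precise matches, while larger chunks carry more surrounding context into the prompt.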

🗂️ Indexing

  • Organizes chunks for structured search and quick access.
  • Enables semantic matching via similarity scoring.
  • When a query runs, chunks are compared by similarity score.
  • High-scoring chunks are injected into the LLM prompt for context.
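The similarity-scoring step can be pictured with a small sketch. It assumes cosine similarity as the scoring function and uses hand-written three-number vectors in place of real embeddings; `TOY_INDEX`, `top_chunks`, and `build_prompt` are hypothetical names, not Salesforce APIs.

```python
import math

# Hypothetical hand-written embeddings; a real index stores model-generated vectors.
TOY_INDEX = {
    "Reset your password from the login page.": [0.9, 0.1, 0.0],
    "Printers ship within three business days.": [0.0, 0.2, 0.9],
    "Change your login credentials in Settings.": [0.8, 0.3, 0.1],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_chunks(query_vec, index, k=2):
    """Return the k (score, chunk) pairs with the highest similarity."""
    scored = [(cosine_similarity(query_vec, vec), chunk) for chunk, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

def build_prompt(question, index, query_vec, k=2):
    """Inject the highest-scoring chunks into the prompt as context."""
    context = "\n".join(chunk for _, chunk in top_chunks(query_vec, index, k))
    return f"Context:\n{context}\n\nQuestion: {question}"
```

With a query vector close to the two password-related chunks, those two score highest and the shipping chunk is left out of the prompt.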

📥 Retrievers

  • Embedded resources that search and return relevant information from data libraries.
  • Define which datasets in Data Cloud are available to AI agents.

🧰 Data Library Setup & Management

🏗️ Creating a Data Library

Can be done via:

  • Setup → Agentforce Data Library page, or
  • Agent Builder → Knowledge tab

🧾 Data Sources

Two possible sources:

  1. Knowledge Base
    • Choose Identifying Fields (for locating articles).
    • Choose Content Fields (for enriching responses).
    • Optionally restrict to public articles or filter by categories.
  2. Uploaded Files
    • Upload .txt or .html (≤ 4 MB) and .pdf (≤ 100 MB).
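The upload limits above can be expressed as a small pre-upload check. This is a sketch only: `check_upload` is a hypothetical helper, and treating 1 MB as 1024 × 1024 bytes is an assumption, since the notes do not say how the limit is measured.

```python
from pathlib import Path

MB = 1024 * 1024  # assumption: binary megabytes

# Limits taken from the notes above: .txt/.html up to 4 MB, .pdf up to 100 MB.
MAX_BYTES_BY_EXTENSION = {".txt": 4 * MB, ".html": 4 * MB, ".pdf": 100 * MB}

def check_upload(filename: str, size_bytes: int) -> tuple[bool, str]:
    """Return (allowed, reason) for a candidate data library file."""
    ext = Path(filename).suffix.lower()
    limit = MAX_BYTES_BY_EXTENSION.get(ext)
    if limit is None:
        return False, f"unsupported file type: {ext or '(none)'}"
    if size_bytes > limit:
        return False, f"{ext} files must be at most {limit // MB} MB"
    return True, "ok"
```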

🧠 Knowledge Settings

  • Define which Knowledge Articles to index.
  • Filter by data category or availability.

🧩 Data Space

  • Specifies the Data Cloud data space (a logical partition of data) that the library's data source belongs to.

⚡ Automated Configuration

When created, Salesforce automatically:

  • Pushes data to Data Cloud
  • Builds search index and retriever
  • Links agents to that data

🧑‍💻 Assignment & Usage

🔧 Feature | 💬 Function
Agent Builder | Select or create a data library under Knowledge tab
AI Features Supported | Agentforce Agents, Agentforce Service Agent, Einstein Service Replies
Restriction | Each feature → only one data library at a time
Agent Action | Answer Questions with Knowledge (uses assigned library for responses)

⚠️ Requirements

  • Must have Data Cloud enabled
  • Data Cloud Admin permissions required

🧠 Retrieval Augmented Generation (RAG)

🌐 Concept Overview

RAG improves LLM output by grounding prompts in accurate, current, and contextually relevant data.

It uses retrievers to pull structured/unstructured information from vector databases in Data Cloud.


⚙️ Process Breakdown

🏗️ Offline Preparation

  1. Connect unstructured data.
  2. Create Search Index Configuration (chunk + vectorize).
  3. Store and manage the search index in Data Cloud.

⚡ Online Usage

  1. Retriever is called inside a Prompt Template.
  2. Retrieves relevant info from search index.
  3. Augments original prompt with retrieved context.
  4. Sends prompt to LLM → generates response.
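The four online steps can be sketched as one function. Everything here is a stand-in: `toy_retrieve` ranks chunks by shared words rather than querying a real search index, `echo_llm` simply returns the prompt instead of calling a model, and the template text is invented for illustration.

```python
import re

SEARCH_INDEX = [
    "Reset a password from the login page.",
    "Orders ship within three business days.",
    "Refunds are issued within ten days.",
]

TEMPLATE = (
    "Answer using only the context below.\n"
    "Context:\n{context}\n"
    "Question: {question}"
)

def words(text):
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def toy_retrieve(query):
    """Stand-in retriever: rank stored chunks by words shared with the query."""
    return sorted(SEARCH_INDEX, key=lambda c: len(words(query) & words(c)), reverse=True)

def echo_llm(prompt):
    """Stand-in LLM: returns the prompt so the augmentation is visible."""
    return prompt

def run_rag(question, retrieve, generate, template, max_results=3):
    chunks = retrieve(question)[:max_results]                         # steps 1-2: call retriever, get chunks
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    prompt = template.format(context=context, question=question)      # step 3: augment the prompt
    return generate(prompt)                                           # step 4: send to the LLM
```

Passing the retriever and the model in as plain callables keeps the sketch independent of any particular platform API.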

🔍 Search Index Configuration

Step | Description
1️⃣ Choose Setup Type | Easy Setup, Advanced Setup, or From a Data Kit.
2️⃣ Select Search Type & DMO | Choose data model object for the index.
3️⃣ Add Fields & Chunking Strategy | Define how data will be segmented.
4️⃣ Select Vectorization Strategy | Determine semantic encoding for unstructured data.
5️⃣ Set Related Fields | Add optional filters for targeted retrieval.

🧮 Search Types

Vector Search

  • Converts text into numerical embeddings to measure semantic similarity.
  • Recognizes meaning beyond keywords.
  • Example:
    • “How do I reset my password?” ≈ “How can I change my login credentials?”

Keyword Search

  • Focuses on lexical similarity (matching the words and tokens themselves).
  • Example:
    • “Model X200 Printer” ≈ “Model X210 Printer”

Hybrid Search

  • Combines both semantic and lexical matching for optimal accuracy.
  • Creates both vector index and keyword index within Data Cloud.
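To make the difference between the search types concrete, here is a toy scoring sketch. It assumes Jaccard token overlap for keyword matching, cosine similarity for vector matching, and a simple weighted sum for hybrid scoring; these are illustrative choices, and the way Data Cloud actually combines the two indexes may differ.

```python
import math
import re

def tokens(text):
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def keyword_score(query, doc):
    """Lexical similarity: share of tokens the two strings have in common."""
    q, d = tokens(query), tokens(doc)
    return len(q & d) / len(q | d) if q | d else 0.0

def vector_score(query_vec, doc_vec):
    """Semantic similarity: cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(query_vec, doc_vec))
    norms = math.sqrt(sum(x * x for x in query_vec)) * math.sqrt(sum(y * y for y in doc_vec))
    return dot / norms if norms else 0.0

def hybrid_score(query, doc, query_vec, doc_vec, alpha=0.5):
    """Weighted blend of both scores; alpha is a hypothetical tuning knob."""
    return alpha * vector_score(query_vec, doc_vec) + (1 - alpha) * keyword_score(query, doc)
```

The two printer model names share two of four distinct tokens, so their keyword score is high. The two password questions share only filler words, so a keyword index scores them low even though a good embedding model would place them close together.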

🧩 Data Preparation

To use retrievers:

  1. Load, chunk, vectorize, and store content in Data Cloud.
  2. Associate search index with a Data Space and Data Model Object (DMO).
  3. Make it searchable via retrievers.

🧠 Retrievers Overview

Type | Description
Default Retriever | Auto-created when a Search Index is configured; not customizable.
Custom Retriever | Built in Einstein Studio; supports filters and versions.
Dynamic Retriever | Placeholder defined at runtime (used in standard templates).

🔍 Filtering

Custom retrievers can use filters to refine results for specific use cases.
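Conceptually, a filter narrows the candidate chunks before they are ranked. The sketch below shows simple equality filters over hypothetical chunk records; the field names `language` and `product` are invented for illustration and are not real DMO fields.

```python
# Hypothetical chunk records; field names are illustrative only.
CHUNKS = [
    {"text": "Reset your password from the login page.", "language": "en", "product": "Portal"},
    {"text": "Réinitialisez votre mot de passe.", "language": "fr", "product": "Portal"},
    {"text": "Replace the toner cartridge.", "language": "en", "product": "Printer"},
]

def apply_filters(records, filters):
    """Keep only records whose fields equal every filter value."""
    return [
        record for record in records
        if all(record.get(name) == value for name, value in filters.items())
    ]
```

For example, a service agent for the English-language portal might filter on both fields so that printer articles and French articles never reach the prompt.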

🧩 Versioning

Each edit → new version; only one active version at a time.
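The versioning rule can be modeled in a few lines. This is a sketch with hypothetical class names, and it assumes activation is a separate, explicit step after saving an edit, which the notes above do not confirm.

```python
from dataclasses import dataclass, field

@dataclass
class RetrieverVersion:
    number: int
    filters: dict
    active: bool = False

@dataclass
class CustomRetriever:
    name: str
    versions: list = field(default_factory=list)

    def save_edit(self, filters):
        """Every edit is stored as a new version; earlier versions are kept."""
        version = RetrieverVersion(number=len(self.versions) + 1, filters=dict(filters))
        self.versions.append(version)
        return version

    def activate(self, number):
        """Activate one version and deactivate all others."""
        if not any(v.number == number for v in self.versions):
            raise ValueError(f"no version {number}")
        for v in self.versions:
            v.active = (v.number == number)

    @property
    def active_version(self):
        """The single active version, or None if nothing is active yet."""
        return next((v for v in self.versions if v.active), None)
```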


⚙️ Retriever in Prompt Builder

🧭 Resource Field

Displays active retrievers available for a prompt (default + custom).

🧩 Retriever Settings (Side Panel)

Setting | Description
Search Text | Dynamic field for semantic queries or merge fields.
Output Fields | Select which DMO fields to return in the results.
Number of Results | Limit how many chunks are injected (e.g., 10).
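The three settings can be read as three small transformations applied at runtime. In this sketch the names `RetrieverSettings` and `run_retriever` are hypothetical, Python `{placeholder}` syntax stands in for Prompt Builder merge fields, and the records are assumed to arrive already ranked by relevance.

```python
from dataclasses import dataclass

@dataclass
class RetrieverSettings:
    search_text: str            # query text, may contain placeholders
    output_fields: list         # which fields of each result to return
    number_of_results: int = 10 # cap on how many chunks are injected

def run_retriever(settings, ranked_records, merge_values):
    """Resolve the search text, then trim results to the configured fields and count."""
    query = settings.search_text.format(**merge_values)
    top = ranked_records[: settings.number_of_results]
    rows = [{name: record.get(name) for name in settings.output_fields} for record in top]
    return query, rows
```

A lower result count keeps the prompt short and focused, while a higher one gives the LLM more context at the cost of prompt length.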

🧩 Summary Table

Component | Purpose
Data Library | Centralized source of knowledge (Knowledge base or files).
Grounding | Adds context to LLM for accurate responses.
Chunking | Splits data for efficient retrieval.
Indexing | Organizes chunks for fast semantic search.
Retriever | Fetches relevant data from Data Cloud for grounding.
Search Index | Stores vectorized data for retrieval.
RAG Process | Combines retrievers + LLM for contextual, reliable outputs.

✅ Key Takeaways

  • Agentforce Data Libraries and Retrievers together make AI responses more accurate, contextual, and trustworthy.
  • Data Libraries provide the ground truth (Knowledge base or files).
  • RAG framework ensures AI outputs are grounded in real, organization-specific data.
  • Retrievers + Search Indexes enable fast semantic matching.
  • Agent Builder links agents directly to their data sources.
  • Data Cloud underpins it all — required for setup and operation.

📈 Flow Charts

1) Agentforce Data Library — lifecycle

flowchart LR
  A[Create Data Library] --> B{Choose Data Source}
  B -->|Knowledge| C[Select Identifying & Content Fields]
  B -->|Uploaded Files| D[Upload TXT/HTML/PDF]
  C --> E[Chunk & Index in Data Cloud]
  D --> E
  E --> F[Retriever Created Automatically]
  F --> G[Assign Library in Agent Builder]
  G --> H[Agent Action: Answer Questions with Knowledge]

2) RAG — offline vs online usage

flowchart TB
  subgraph OFFLINE Preparation
    O1[Connect Unstructured Data]
    O2[Create Search Index Configuration]
    O3[Chunk & Vectorize]
    O4[Store Index in Data Cloud]
    O1 --> O2 --> O3 --> O4
  end

  subgraph ONLINE Usage
    N1[Call Retriever in Prompt Template]
    N2[Retrieve Relevant Chunks]
    N3[Augment Prompt]
    N4[LLM Generates Response]
    N1 --> N2 --> N3 --> N4
  end

  O4 --> N1

3) Search index types & flow

flowchart TB
  A[Create Search Index] --> B{Search Type}
  B -->|Vector| C[Embeddings for Semantic Similarity]
  B -->|Keyword| D[Lexical Matching]
  B -->|Hybrid| E[Vector + Keyword]
  C --> F[Index for DMO/UDMO]
  D --> F
  E --> F
  F --> G[Supports Retriever Queries]

4) Retriever in Prompt Builder — configuration to runtime

flowchart LR
  A[Active Retriever] --> B[Prompt Builder: Resource Field]
  B --> C[Add to Prompt Template]
  C --> D{Configure Settings}
  D -->|Search Text| E[Query or Merge Fields]
  D -->|Output Fields| F[Select DMO Fields]
  D -->|Number of Results| G[Set Max Chunks]
  C --> H[Runtime: Retrieve -> Inject -> Respond]

5) Knowledge vs files — decision mini-flow

flowchart LR
  K[Choose Data Source] --> J{Use Knowledge?}
  J --> |Yes| K1[Select Articles, Categories, Public Only]
  J --> |No| K2[Upload Files: TXT, HTML, PDF]
  K1 --> L[Index, Retriever, Assign]
  K2 --> L


📚 Flashcards