Agentforce and Data Cloud accounts for 20% of the total score in the Salesforce Agentforce Specialist Exam. The topic covers data libraries, retrieval augmented generation (RAG), search types, and retrievers.

NOTE

Most of the content in this work was generated with the assistance of AI and carefully reviewed, edited, and curated by the author. If you have found any issues with the content on this page, please do not hesitate to contact me at support@issacc.com.

🤖 Agentforce & Data Cloud

🎯 Learning Objectives

After studying this topic, you should be able to:

Explain how an Agentforce Data Library improves agent response accuracy and personalization.
Describe how to create and configure data libraries using the Knowledge base or uploaded files.
Assign an Agentforce Data Library to an agent via the Agent Builder.
Explain how Retrieval Augmented Generation (RAG) enhances AI-generated responses.
Use retrievers to ground prompt templates with relevant Data Cloud information.

📚 Agentforce Data Library Overview

💡 A Data Library acts as a structured knowledge repository for Agentforce agents.

🔍 Purpose	🧩 Description
Accuracy	Ground AI responses in domain-specific knowledge.
Personalization	Provide contextually relevant, organization-specific responses.
Trust	Build user confidence in generative AI output.

Data libraries can use:

Salesforce Knowledge base
Uploaded files (.txt, .html, .pdf)

⚙️ Core Concepts

🧠 Grounding

Injects domain-specific knowledge and customer context into LLM prompts.
Leads to more accurate, relevant, and trustworthy responses.

🧩 Chunking

Breaks data sources into smaller chunks for efficient search and retrieval.
Works across text, images, and audio.
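
To make chunking concrete, here is a minimal Python sketch of word-based chunking with overlap, for text only. The function name `chunk_text` and the `size` and `overlap` parameters are illustrative assumptions, not Data Cloud settings; Data Cloud applies its own chunking strategy when the search index is built.

```python
def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into chunks of `size` words, repeating `overlap` words between neighbors."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must be in [0, size)")
    words = text.split()
    step = size - overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks


print(chunk_text("a b c d e f g h i j", size=4, overlap=1))
# ['a b c d', 'd e f g', 'g h i j']
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighboring chunks, which is one common reason to use it.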

🗂️ Indexing

Organizes chunks for structured search and quick access.
Enables semantic matching via similarity scoring.

🔎 Search

When a query runs, chunks are compared by similarity score.
High-scoring chunks are injected into the LLM prompt for context.
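
The scoring step can be sketched with cosine similarity, a common similarity measure for vectors. This is an illustration only: the three-dimensional vectors below are made up by hand, and the helper names `cosine` and `top_chunks` are hypothetical. The notes above do not say which similarity measure Data Cloud uses.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_chunks(query_vec: list[float], indexed: list[tuple[str, list[float]]], k: int = 2):
    """Score every (text, vector) pair against the query and keep the k best."""
    scored = [(cosine(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hand-made vectors standing in for real embeddings (assumption).
INDEX = [
    ("Reset your password from the login page.", [0.9, 0.1, 0.0]),
    ("Printer X200 supports duplex printing.", [0.0, 0.2, 0.9]),
    ("Change login credentials in Settings.", [0.8, 0.3, 0.1]),
]

for score, text in top_chunks([1.0, 0.0, 0.0], INDEX, k=2):
    print(f"{score:.2f}  {text}")
```

With this query vector, the two login-related chunks score highest and would be the ones injected into the prompt.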

📥 Retrievers

Embedded resources that search and return relevant information from data libraries.
Define which datasets in Data Cloud are available to AI agents.

🧰 Data Library Setup & Management

🏗️ Creating a Data Library

Can be done via:

Setup → Agentforce Data Library page, or
Agent Builder → Knowledge tab

🧾 Data Sources

Two possible sources:

Knowledge Base
- Choose Identifying Fields (for locating articles).
- Choose Content Fields (for enriching responses).
- Optionally restrict to public articles or filter by categories.
Uploaded Files
- Upload .txt or .html (≤ 4 MB) and .pdf (≤ 100 MB).
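
As a memory aid for the limits above, the following Python sketch checks a file against them before upload. It is a local pre-check written for these notes, not a Salesforce API; `check_upload` is a hypothetical helper, and treating 1 MB as 1,048,576 bytes is an assumption.

```python
from pathlib import Path

MB = 1024 * 1024  # assumption: binary megabytes

# Limits as listed in these notes.
SIZE_LIMITS = {".txt": 4 * MB, ".html": 4 * MB, ".pdf": 100 * MB}


def check_upload(filename: str, size_bytes: int) -> tuple[bool, str]:
    """Return (accepted, reason) for a candidate data library file."""
    ext = Path(filename).suffix.lower()
    limit = SIZE_LIMITS.get(ext)
    if limit is None:
        return False, f"unsupported file type: {ext or 'none'}"
    if size_bytes > limit:
        return False, f"{ext} files must be at most {limit // MB} MB"
    return True, "ok"


print(check_upload("faq.txt", 3 * MB))
print(check_upload("manual.pdf", 101 * MB))
```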

🧠 Knowledge Settings

Define which Knowledge Articles to index.
Filter by data category or availability.

🧩 Data Space

Defines which Data Cloud data source the library uses.

⚡ Automated Configuration

When created, Salesforce automatically:

Pushes data to Data Cloud
Builds search index and retriever
Links agents to that data

🧑‍💻 Assignment & Usage

🔧 Feature	💬 Function
Agent Builder	Select or create a data library under Knowledge tab
AI Features Supported	Agentforce Agents, Agentforce Service Agent, Einstein Service Replies
Restriction	Each feature → only one data library at a time
Agent Action	Answer Questions with Knowledge (uses assigned library for responses)

⚠️ Requirements

Must have Data Cloud enabled

Data Cloud Admin permissions required

🧠 Retrieval Augmented Generation (RAG)

🌐 Concept Overview

RAG improves LLM output by grounding prompts in accurate, current, and contextually relevant data.

It uses retrievers to pull structured/unstructured information from vector databases in Data Cloud.

⚙️ Process Breakdown

🏗️ Offline Preparation

Connect unstructured data.
Create Search Index Configuration (chunk + vectorize).
Store and manage the search index in Data Cloud.
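
The offline steps can be pictured as a small pipeline that chunks each document, turns each chunk into a vector, and stores the result. The Python sketch below is a toy under stated assumptions: `vectorize` uses bag-of-words counts as a stand-in for a real embedding model, and `build_index` and its record layout are hypothetical, not the Data Cloud index format.

```python
import re
from collections import Counter


def vectorize(text: str) -> Counter:
    """Stand-in for an embedding model: a bag-of-words term count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(documents: dict[str, str], chunk_words: int = 8) -> list[dict]:
    """Chunk every document, vectorize each chunk, and collect the entries."""
    index = []
    for doc_id, text in documents.items():
        words = text.split()
        for n, start in enumerate(range(0, len(words), chunk_words)):
            chunk = " ".join(words[start:start + chunk_words])
            index.append({"doc": doc_id, "chunk_no": n, "text": chunk, "vector": vectorize(chunk)})
    return index


index = build_index({"faq": "one two three four five six seven eight nine ten"})
for entry in index:
    print(entry["doc"], entry["chunk_no"], entry["text"])
```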

⚡ Online Usage

Retriever is called inside a Prompt Template.
Retrieves relevant info from search index.
Augments original prompt with retrieved context.
Sends prompt to LLM → generates response.
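
The online steps can be sketched in a few lines of Python. Everything here is illustrative: `retrieve` is a toy word-overlap ranker rather than a Data Cloud retriever, `TEMPLATE` is a made-up prompt template, and the final LLM call is left out because it depends on the model in use.

```python
import re


def words(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank chunks by how many query words they share."""
    q = words(query)
    ranked = sorted(chunks, key=lambda c: len(q & words(c)), reverse=True)
    return ranked[:k]


def augment(template: str, question: str, context: list[str]) -> str:
    """Insert the retrieved chunks and the question into the prompt template."""
    bullet_list = "\n".join(f"- {c}" for c in context)
    return template.format(context=bullet_list, question=question)


TEMPLATE = "Answer using only this context:\n{context}\n\nQuestion: {question}"

CHUNKS = [
    "Reset your password from the login page",
    "Printers ship within five business days",
]

question = "How do I reset my password?"
prompt = augment(TEMPLATE, question, retrieve(question, CHUNKS, k=1))
print(prompt)  # this augmented prompt is what would be sent to the LLM
```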

🔍 Search Index Configuration

Step	Description
1️⃣ Choose Setup Type	Easy Setup, Advanced Setup, or From a Data Kit.
2️⃣ Select Search Type & DMO	Choose the search type and the data model object (DMO) to index.
3️⃣ Add Fields & Chunking Strategy	Define how data will be segmented.
4️⃣ Select Vectorization Strategy	Determine semantic encoding for unstructured data.
5️⃣ Set Related Fields	Add optional filters for targeted retrieval.

🧮 Search Types

🧭 Vector Search

Converts text into numerical embeddings to measure semantic similarity.
Recognizes meaning beyond keywords.
Example:
- “How do I reset my password?” ≈ “How can I change my login credentials?”

🔤 Keyword Search

Focuses on lexical similarity.
Example:
- “Model X200 Printer” ≈ “Model X210 Printer”
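
Lexical similarity can be illustrated with a token-overlap score. The Python sketch below uses Jaccard similarity (shared tokens divided by all distinct tokens), chosen here only as a simple example; the notes do not state which lexical scoring Data Cloud applies.

```python
import re


def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Shared tokens divided by all distinct tokens across both strings."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


print(jaccard("Model X200 Printer", "Model X210 Printer"))  # 0.5
print(jaccard("How do I reset my password?",
              "How can I change my login credentials?"))    # 0.3
```

The two printer names share most of their tokens and score higher, while the two password questions mean the same thing but share only filler words. That gap is what vector search is meant to close.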

⚗️ Hybrid Search

Combines both semantic and lexical matching for optimal accuracy.
Creates both vector index and keyword index within Data Cloud.
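
One simple way to picture hybrid ranking is a weighted sum of the two scores. The sketch below is an assumption for illustration: the `alpha` weight, the `hybrid_rank` helper, and the sample scores are all made up, and Data Cloud's actual fusion method is not described in these notes.

```python
def hybrid_rank(candidates: dict[str, tuple[float, float]], alpha: float = 0.5) -> list[str]:
    """Rank candidates by alpha * vector_score + (1 - alpha) * keyword_score.

    candidates maps a name to (vector_score, keyword_score), both in [0, 1].
    """
    combined = {
        name: alpha * vec + (1 - alpha) * kw
        for name, (vec, kw) in candidates.items()
    }
    return sorted(combined, key=combined.get, reverse=True)


# Made-up scores for the query "X200 printer" (assumption).
CANDIDATES = {
    "X200 printer manual": (0.25, 1.00),
    "X210 printer manual": (0.20, 0.50),
    "Reset password article": (0.93, 0.30),
}

print(hybrid_rank(CANDIDATES, alpha=0.5))
print(hybrid_rank(CANDIDATES, alpha=1.0))  # vector score only
```

Setting `alpha` to 1.0 or 0.0 reduces the ranking to pure vector or pure keyword search, which makes the trade-off easy to see.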

🧩 Data Preparation

To use retrievers:

Load, chunk, vectorize, and store content in Data Cloud.
Associate search index with a Data Space and Data Model Object (DMO).
Make it searchable via retrievers.

🧠 Retrievers Overview

Type	Description
Default Retriever	Auto-created when a Search Index is configured; not customizable.
Custom Retriever	Built in Einstein Studio; supports filters and versions.
Dynamic Retriever	Placeholder defined at runtime (used in standard templates).

🔍 Filtering

Custom retrievers can use filters to refine results for specific use cases.
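
Conceptually, a filter narrows the candidate chunks before they are ranked. The Python sketch below shows the idea with plain dictionaries; `apply_filters` and the `product` and `language` fields are hypothetical and do not reflect how filters are configured in Einstein Studio.

```python
def apply_filters(chunks: list[dict], filters: dict) -> list[dict]:
    """Keep only chunks whose fields equal every filter value."""
    return [
        chunk for chunk in chunks
        if all(chunk.get(field) == value for field, value in filters.items())
    ]


CHUNKS = [
    {"text": "Replace the X200 toner.", "product": "X200", "language": "en"},
    {"text": "X200 Toner ersetzen.", "product": "X200", "language": "de"},
    {"text": "Replace the X210 drum.", "product": "X210", "language": "en"},
]

for chunk in apply_filters(CHUNKS, {"product": "X200", "language": "en"}):
    print(chunk["text"])
```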

🧩 Versioning

Each edit → new version; only one active version at a time.
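
The versioning rule can be modeled with a small class. This is a mental model rather than the Einstein Studio implementation: the class name, the method names, and the choice that a new version must be activated explicitly are all assumptions.

```python
class RetrieverVersions:
    """Tracks retriever configurations; exactly one version is active."""

    def __init__(self, config: dict):
        self.versions = [dict(config)]
        self.active = 1  # version numbers start at 1

    def edit(self, **changes) -> int:
        """Save an edit as a new version and return its number."""
        self.versions.append({**self.versions[-1], **changes})
        return len(self.versions)

    def activate(self, number: int) -> None:
        """Make one version active; the previous one stops being active."""
        if not 1 <= number <= len(self.versions):
            raise ValueError(f"no such version: {number}")
        self.active = number

    def active_config(self) -> dict:
        return self.versions[self.active - 1]


retriever = RetrieverVersions({"number_of_results": 5})
new_version = retriever.edit(number_of_results=10)
retriever.activate(new_version)
print(retriever.active, retriever.active_config())
```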

⚙️ Retriever in Prompt Builder

🧭 Resource Field

Displays active retrievers available for a prompt (default + custom).

🧩 Retriever Settings (Side Panel)

Setting	Description
Search Text	Dynamic field for semantic queries or merge fields.
Output Fields	Select which DMO fields to return in the results.
Number of Results	Limit how many chunks are injected (e.g., 10).
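
The three settings map naturally onto a small data structure. The Python sketch below models them and applies them to a list of records; the `RetrieverSettings` class, the `run_retriever` function, and the `title_match` scorer are hypothetical stand-ins, not Prompt Builder APIs.

```python
from dataclasses import dataclass


@dataclass
class RetrieverSettings:
    search_text: str          # the query, possibly built from merge fields
    output_fields: list[str]  # which fields to return
    number_of_results: int = 10


def title_match(search_text: str, record: dict) -> float:
    """Toy scorer: 1.0 if the search text appears in the title, else 0.0."""
    return 1.0 if search_text.lower() in record.get("Title", "").lower() else 0.0


def run_retriever(settings: RetrieverSettings, records: list[dict]) -> list[dict]:
    """Rank records, keep the top N, and return only the selected fields."""
    ranked = sorted(records, key=lambda r: title_match(settings.search_text, r), reverse=True)
    top = ranked[:settings.number_of_results]
    return [{f: r.get(f) for f in settings.output_fields} for r in top]


RECORDS = [
    {"Title": "Reset password", "Chunk": "Go to the login page.", "Internal": "x"},
    {"Title": "Printer setup", "Chunk": "Plug in the printer.", "Internal": "y"},
]

settings = RetrieverSettings("printer", ["Title", "Chunk"], number_of_results=1)
print(run_retriever(settings, RECORDS))
```

Lowering the number of results keeps the prompt shorter, while limiting the output fields keeps unneeded data out of the context sent to the LLM.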

🧩 Summary Table

Component	Purpose
Data Library	Centralized source of knowledge (Knowledge base or files).
Grounding	Adds context to LLM for accurate responses.
Chunking	Splits data for efficient retrieval.
Indexing	Organizes chunks for fast semantic search.
Retriever	Fetches relevant data from Data Cloud for grounding.
Search Index	Stores vectorized data for retrieval.
RAG Process	Combines retrievers + LLM for contextual, reliable outputs.

✅ Key Takeaways

Agentforce Data Libraries and Retrievers together make AI responses more accurate, more contextual, and more trustworthy.
Data Libraries provide the ground truth (Knowledge base or files).
The RAG framework ensures AI outputs are grounded in real, organization-specific data.
Retrievers + Search Indexes enable fast semantic matching.
Agent Builder links agents directly to their data sources.
Data Cloud underpins it all — required for setup and operation.

📈 Flow Charts

1) Agentforce Data Library — lifecycle

flowchart LR
  A[Create Data Library] --> B{Choose Data Source}
  B -->|Knowledge| C[Select Identifying & Content Fields]
  B -->|Uploaded Files| D[Upload TXT/HTML/PDF]
  C --> E[Chunk & Index in Data Cloud]
  D --> E
  E --> F[Retriever Created Automatically]
  F --> G[Assign Library in Agent Builder]
  G --> H[Agent Action: Answer Questions with Knowledge]

2) RAG — offline vs online usage

flowchart TB
  subgraph OFFLINE Preparation
    O1[Connect Unstructured Data]
    O2[Create Search Index Configuration]
    O3[Chunk & Vectorize]
    O4[Store Index in Data Cloud]
    O1 --> O2 --> O3 --> O4
  end

  subgraph ONLINE Usage
    N1[Call Retriever in Prompt Template]
    N2[Retrieve Relevant Chunks]
    N3[Augment Prompt]
    N4[LLM Generates Response]
    N1 --> N2 --> N3 --> N4
  end

  O4 --> N1

3) Search index types & flow

flowchart TB
  A[Create Search Index] --> B{Search Type}
  B -->|Vector| C[Embeddings for Semantic Similarity]
  B -->|Keyword| D[Lexical Matching]
  B -->|Hybrid| E[Vector + Keyword]
  C --> F[Index for DMO/UDMO]
  D --> F
  E --> F
  F --> G[Supports Retriever Queries]

4) Retriever in Prompt Builder — configuration to runtime

flowchart LR
  A[Active Retriever] --> B[Prompt Builder: Resource Field]
  B --> C[Add to Prompt Template]
  C --> D{Configure Settings}
  D -->|Search Text| E[Query or Merge Fields]
  D -->|Output Fields| F[Select DMO Fields]
  D -->|Number of Results| G[Set Max Chunks]
  C --> H[Runtime: Retrieve -> Inject -> Respond]

5) Knowledge vs files — decision mini-flow

flowchart LR
  K[Choose Data Source] --> J{Use Knowledge?}
  J --> |Yes| K1[Select Articles, Categories, Public Only]
  J --> |No| K2[Upload Files TXT HTML PDF]
  K1 --> L[Index Retriever Assign]
  K2 --> L[Index Retriever Assign]

📚 Flashcards

What is an Agentforce Data Library?

A structured repository of knowledge that improves accuracy, personalization, and trust in AI responses. It can source data from the Salesforce Knowledge base or uploaded files like text, HTML, and PDFs.

What are the main benefits of using a Data Library?

It enhances AI accuracy, adds personalization, builds trust, and ensures responses are grounded in verified information.

What is grounding in Agentforce?

The process of using data from a Data Library to provide domain-specific and contextual information to an LLM prompt for more accurate and relevant responses.

What is chunking?

The act of breaking data into smaller pieces called chunks to improve search efficiency and relevance.

What is indexing?

Organizing and categorizing data chunks for easier search and retrieval during AI query processing.

What is a retriever?

A component that searches for and returns relevant data from the Data Library to enrich AI responses. Retrievers determine which datasets in Data Cloud are accessible to agents.

Where can a Data Library be created?

In Setup on the Agentforce Data Library page or directly from the Knowledge tab in the Agent Builder.

What are the two possible data sources for a Data Library?

Salesforce Knowledge base or uploaded files (TXT, HTML, PDF).

What fields are configured when using Knowledge as a data source?

Identifying Fields help locate the correct articles, and Content Fields enrich AI responses with relevant details.

What are the size limits for uploaded files?

Up to 4 MB for text or HTML files and 100 MB for PDF files.

How many data libraries can each AI feature use?

Each AI feature, such as an Agentforce Agent or Einstein Service Replies, can use only one data library at a time.

What is required to use Agentforce Data Libraries?

Data Cloud must be enabled, and Data Cloud admin permissions are required for setup.

What happens automatically when a Data Library is created?

Data is pushed to Data Cloud, a search index and retriever are created, and the agent is linked to that data source.

What is Retrieval Augmented Generation (RAG)?

A framework that grounds LLM prompts with relevant information retrieved from Data Cloud, improving accuracy and relevance.

What are the two phases of RAG?

Offline preparation (connect, chunk, vectorize, store) and online usage (retrieve, augment, generate response).

What is a search index in Data Cloud?

A repository of vectorized and chunked data that allows efficient retrieval of semantically relevant information.

What are the three types of search in RAG?

Vector search (semantic), keyword search (lexical), and hybrid search (combines both).

What is the difference between default and custom retrievers?

Default retrievers are created automatically with a search index, while custom retrievers can be created in Einstein Studio and customized with filters and versions.

What are the retriever settings in Prompt Builder?

Search Text (query or merge field), Output Fields (fields returned), and Number of Results (limit of retrieved chunks).

What is a dynamic retriever?

A placeholder retriever specified at runtime, used in standard prompt templates for flexible context retrieval.
