We’re delighted to announce a new feature in Griptape Cloud that improves the experience for developers building applications that integrate up-to-date and private data with large language models. Griptape Cloud has supported Retrieval Augmented Generation (RAG) applications through similarity search for some time; check out this post for more details on the existing RAG features in Griptape Cloud.
Retrievers
Today, we are adding to that capability with Griptape Cloud Retrievers. Retrievers are a fully-managed implementation of the RAG Engine within Griptape Framework, and add query modification, reranking capabilities, and the ability to apply rules to query responses. These features enable you to generate more accurate and tailored results in your RAG applications built with Griptape Cloud.
Query Modification
Query modification in RAG improves matching by transforming a query before its embedding is generated and used for similarity search against the data stored in a vector store. Common techniques include query expansion, where an LLM adds context or terms to improve each query, and Hypothetical Document Embedding (HyDE), where an LLM generates hypothetical answers to a query that are then embedded and used alongside the original query when searching the vector store.
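To make the idea concrete, here’s a minimal sketch of HyDE using Griptape Framework’s Agent; the prompt wording and variable names are our own illustration, and Griptape Cloud performs this step for you inside the Retriever.
from griptape.structures import Agent

agent = Agent()
query = "What is the capital of France?"

# HyDE: ask an LLM for a hypothetical answer to the query...
hypothetical = agent.run(f"Write a short passage that plausibly answers: {query}").output.value

# ...then combine it with the original query before embedding and similarity search.
enhanced_query = f"{query}\n\n{hypothetical}"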
What is Reranking?
Reranking works by comparing the relatedness of search results returned from a vector search query to the original query and reordering the results in descending order of their relatedness. This gives a ‘reranked’ list of results that you can use in your application. Let’s explore an example of reranking with Griptape Framework to illustrate why reranking is valuable, and then we’ll move on to explain how to get started with Retrievers in Griptape Cloud. Let’s assume we asked the question “What is the capital of France?” and the results that came back from a vector search operation across our data sources were as follows:
results = ["Hotdog", "San Francisco", "Lille", "Paris", "Rome", "Baguette", "Eiffel Tower", "French Hotdog"]
To rerank these results, we compare the embedding for each result to the embedding for the original question “What is the capital of France?” and then order the results in descending order of relatedness. In this use-case, answering a question, we would likely use only the top result. In other use-cases, such as a research agent, we might instead take the top n results from the reranking operation and perform a secondary operation on them.
Implementing reranking is simple with Griptape Framework. It supports reranking with a local rerank driver that uses a simple relatedness calculation, and it can also use Cohere’s reranking model through the CohereRerankDriver. The sample code below uses the local rerank driver to rerank the results from the example query.
from griptape.artifacts import TextArtifact
from griptape.drivers.rerank.local import LocalRerankDriver

# Wrap each raw search result in a TextArtifact so it can be reranked.
results = ["Hotdog", "San Francisco", "Lille", "Paris", "Rome", "Baguette", "Eiffel Tower", "French Hotdog"]
artifact_list = [TextArtifact(result) for result in results]

# Rerank the results against the original query.
rerank_driver = LocalRerankDriver()
artifacts = rerank_driver.run("What is the capital of France?", artifact_list)

print("Reranked list:")
for artifact in artifacts:
    print("\t", artifact.value)
Reranked list:
Paris
Eiffel Tower
Lille
San Francisco
Rome
Baguette
French Hotdog
Hotdog
You can see that the reranking operation correctly identifies Paris as the best answer to the question that we posed.
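As an aside, the ‘simple relatedness calculation’ a local rerank driver relies on can be pictured as cosine similarity between embeddings. The sketch below is illustrative, assuming an OpenAI embedding driver and the results list from above; LocalRerankDriver’s internal scoring may differ.
import numpy as np

from griptape.drivers.embedding.openai import OpenAiEmbeddingDriver

embedding_driver = OpenAiEmbeddingDriver()

def relatedness(query: str, result: str) -> float:
    # Cosine similarity between the query embedding and a result embedding.
    q = np.array(embedding_driver.embed_string(query))
    r = np.array(embedding_driver.embed_string(result))
    return float(np.dot(q, r) / (np.linalg.norm(q) * np.linalg.norm(r)))

# Higher scores indicate greater relatedness to the query.
scores = {result: relatedness("What is the capital of France?", result) for result in results}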
Our example shows the LocalRerankDriver being used as a standalone module, but it is more commonly used within a Griptape Framework RagEngine. The code sample below shows how we might create a tool for an Agent to use, where we define a RetrievalRagStage that includes a VectorStoreRetrievalRagModule and a TextChunksRerankRagModule, using the LocalRerankDriver as the rerank_driver.
from griptape.drivers.prompt.openai import OpenAiChatPromptDriver
from griptape.drivers.rerank.local import LocalRerankDriver
from griptape.engines.rag import RagEngine
from griptape.engines.rag.modules import (
    PromptResponseRagModule,
    TextChunksRerankRagModule,
    VectorStoreRetrievalRagModule,
)
from griptape.engines.rag.stages import ResponseRagStage, RetrievalRagStage
from griptape.tools import RagTool

# vector_store_driver is assumed to be a previously configured vector
# store driver that contains the indexed documents.
rag_tool = RagTool(
    description="Contains information about the judgements and applications relating to legal cases",
    off_prompt=False,
    rag_engine=RagEngine(
        retrieval_stage=RetrievalRagStage(
            retrieval_modules=[
                VectorStoreRetrievalRagModule(
                    vector_store_driver=vector_store_driver,
                    query_params={"namespace": "legal documents", "top_n": 20},
                )
            ],
            # Rerank the retrieved chunks before they reach the response stage.
            rerank_module=TextChunksRerankRagModule(rerank_driver=LocalRerankDriver()),
        ),
        response_stage=ResponseRagStage(
            response_modules=[
                PromptResponseRagModule(
                    prompt_driver=OpenAiChatPromptDriver(model="gpt-4o")
                )
            ]
        ),
    ),
)
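To put the tool to work, we might attach it to an Agent as in the sketch below; the query is purely illustrative.
from griptape.structures import Agent

# Attach the RAG tool defined above and let the Agent decide when to use it.
agent = Agent(tools=[rag_tool])
agent.run("Summarize the most recent judgement in the case.")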
As we mentioned earlier, Retrievers are a fully-managed implementation of the RAG Engine within the Griptape Framework, so you don’t need to worry about this complexity if you’re using Griptape Cloud.
Query Response Types
Retrievers support two response types: Text Chunk and Prompt with Rulesets. The Text Chunk response type is intended for use in conjunction with an LLM in a Retrieval Augmented Generation use-case.
The Prompt with Rulesets response type is used to generate natural language responses to your queries directly from the Retriever, without the need to pass text chunks to an LLM for response generation. As you might expect from the name of this response type, you can control the behavior of the Retriever in generating natural language responses by attaching a Ruleset.
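If you’ve used Rulesets in Griptape Framework, the concept is the same. Here’s a minimal sketch using the framework’s Rule and Ruleset classes; the rule text is illustrative, and in Griptape Cloud you attach the equivalent Ruleset to the Retriever through the console.
from griptape.rules import Rule, Ruleset

# An illustrative Ruleset for shaping natural language responses.
response_ruleset = Ruleset(
    name="Retriever responses",
    rules=[
        Rule("Answer in one short paragraph."),
        Rule("If the retrieved data does not contain the answer, say so and decline to answer."),
    ],
)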
Using Retrievers in Griptape Cloud
Retrievers bring the benefits of query modification, reranking, and control over query responses to Griptape Cloud. Retrievers can rerank the results returned from multiple Knowledge Bases, which makes them particularly valuable when combining results from different data sources: they help ensure that your applications get the results most relevant to the search queries they are making.
Let’s walk through the process of setting up a Retriever on Griptape Cloud. In this example, we’re going to create a Retriever that uses the Knowledge Base that we configured when we set up the Assistant in the ‘Brush Up on NVIDIA’s Q4 Earnings’ sample on the Griptape Cloud console home page. If you want to learn more about that sample application, it’s covered in this blog post.
Retrievers can be found under the Libraries header in the left navigation menu of the Griptape Cloud console. To create a Retriever, select the highlighted Retrievers option and then select Create Retriever on the Retrievers page.

You will then be prompted to provide the details for your new Retriever. We’re going to create a Retriever for the RAG use-case and connect it to an Assistant, so we completed the Retriever details as shown below. Once the details have been entered, click the Create button to create the Retriever.

We can then connect the new Retriever to our Assistant and use it to retrieve answers to our questions from the NVIDIA Q4 Earnings Knowledge Base.

In this example, we are using a Ruleset to guide the Assistant’s behavior. If you want to experiment with this, you can use our rules as inspiration for your own, or simply copy them. The rules we used are as follows:
- Only provide answers that you can verify using the Knowledge Base or Retriever. Check all answers against either the Knowledge Base or Retriever. If you cannot verify an answer from the Knowledge Base or Retriever, say so and decline to answer. Do not make things up that cannot be verified.
- Only answer questions related to NVIDIA's Q4 2025 earnings. Decline to answer all other questions.
We hope you find Griptape Cloud Retrievers a valuable addition to your RAG toolkit. As usual, we’re excited to hear how you put this new capability to work. Please join us in the Griptape Discord if you have any questions or use-cases you would like to share.