Retrieval Augmented Generation with Griptape Cloud

All Generative AI applications depend on data. A substantial amount of knowledge is encoded into large language models during training, in the form of the parameters that the models learn. However, training finishes at a fixed point in time, so a model only has access to data that existed up to that point, the so-called ‘knowledge cut-off’. There is another limiting factor too: some data is private or proprietary, which means it cannot be used to train large language models.

To overcome these limitations, generative AI application developers commonly use two approaches. One approach is to give applications access to the internet through tools, such as the Griptape Cloud WebSearch and WebScraper tools. This can be a good solution for providing up-to-the-minute data to LLM-powered applications, and there are use cases where it is the right choice. The drawback is that the information reachable through web search and web scraping is effectively unbounded, and working with unbounded data sets can be challenging and relatively costly. More importantly, web search and web scrapers do not normally address access to private or proprietary data. To solve this, a different approach is needed.

Using Data with Griptape Cloud

Since its creation, Griptape has provided a solution for integrating private or proprietary data with your LLM-powered applications through its Data Source, Data Lake, and Knowledge Base features. A Data Source ingests your data and stores it in your Data Lake in Griptape Cloud. Once you have created a Data Source, you can configure Knowledge Bases that are linked to it. When you add Data Sources to a Knowledge Base, the data in those Data Sources is split into chunks, embeddings are calculated for each chunk, and those embeddings are stored in a vector store alongside the chunks they represent.
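To make the ingestion pipeline concrete, here is a minimal, stdlib-only Python sketch of the chunk-embed-store idea. This is not Griptape Cloud's implementation: the fixed-size splitter and the hashing-trick "embedding" are illustrative stand-ins for the real chunking strategy and embedding model.

```python
import hashlib
import math

def chunk(text: str, max_words: int = 40) -> list[str]:
    """Naive fixed-size splitter; production splitters respect sentence boundaries."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def embed(text: str, dims: int = 64) -> list[float]:
    """Toy hashing-trick embedding; a real pipeline calls an embedding model."""
    vec = [0.0] * dims
    for word in text.lower().split():
        word = word.strip(".,?!")
        vec[int(hashlib.md5(word.encode()).hexdigest(), 16) % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]  # unit-normalize for cosine similarity later

# Ingest: each chunk is stored next to its embedding, as in a vector store.
document = ("Griptape Cloud ingests your data, splits it into chunks, "
            "and stores the embeddings alongside them. ") * 8
store = [(embed(c), c) for c in chunk(document)]
print(len(store))  # number of (embedding, chunk) pairs ingested
```

At query time, the same embedding function is applied to the question so that question and chunks live in the same vector space.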

Knowledge Bases can then be connected to a Griptape Cloud Assistant, or to a Griptape Structure such as an Agent or Workflow, using the GriptapeCloudVectorStoreDriver. This allows you to ‘augment’ the knowledge encoded within a large language model with up-to-date or private data stored in a Knowledge Base, enabling you to use the popular Retrieval Augmented Generation (RAG) pattern in your Generative AI applications.

With Griptape Cloud, when you ask a question that the LLM determines would be best answered using data in a Knowledge Base, a vector similarity search query is performed, the most relevant chunks and their metadata are returned from the vector store, and those results are used to augment the response that the LLM generates to answer your question.
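The retrieval-and-augmentation step can be sketched in a few lines of pure Python. Again, this is an illustration of the mechanism rather than Griptape Cloud internals: the embedding function is a toy hashing trick, and the two "Knowledge Base" chunks are made-up example text.

```python
import hashlib
import math

def embed(text: str, dims: int = 64) -> list[float]:
    """Toy hashing-trick embedding, applied identically at ingestion and query time."""
    vec = [0.0] * dims
    for word in text.lower().split():
        word = word.strip(".,?!")
        vec[int(hashlib.md5(word.encode()).hexdigest(), 16) % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are unit-normalized, so the dot product equals cosine similarity.
    return sum(x * y for x, y in zip(a, b))

def retrieve(question: str, store, top_k: int = 1) -> list[str]:
    """Vector similarity search: rank stored chunks against the question embedding."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[0]), reverse=True)
    return [text for _, text in ranked[:top_k]]

# A tiny pre-ingested "Knowledge Base" of illustrative chunks.
chunks = [
    "Data center revenue was the top driver of growth in the quarter.",
    "The company also announced a new developer conference date.",
]
store = [(embed(c), c) for c in chunks]

# Augment: the retrieved chunks are prepended to the prompt sent to the LLM.
question = "What was the top driver of growth?"
context = "\n".join(retrieve(question, store))
augmented = f"Use this context to answer:\n{context}\n\nQuestion: {question}"
```

The `augmented` string is what actually reaches the model, which is why the LLM can answer from data it was never trained on.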

You can also use Rules in your applications to control the behavior of the LLM that you are using, ensuring that it only provides responses that are grounded in the results of queries to a Knowledge Base.
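One common way to apply such grounding rules is to fold them into the prompt and to short-circuit when retrieval finds nothing, so the model is never asked to guess. The sketch below shows that pattern in plain Python; the rule wording, function names, and stand-in `llm` callable are all hypothetical, not the Griptape Rules API.

```python
# Illustrative grounding rules, prepended to every prompt.
RULES = [
    "Answer only from the Knowledge Base context provided below.",
    "If the context does not contain the answer, reply 'I don't know.'",
]

def grounded_prompt(question: str, retrieved: list[str]) -> str:
    """Compose a prompt that instructs the model to stay within the retrieved chunks."""
    rules = "\n".join(f"- {r}" for r in RULES)
    context = "\n".join(retrieved)
    return f"Rules:\n{rules}\n\nContext:\n{context}\n\nQuestion: {question}"

def answer(question: str, retrieved: list[str], llm=None) -> str:
    # Short-circuit before calling the model when retrieval found nothing relevant.
    if not retrieved:
        return "I don't know."
    # Stand-in for a real LLM call, so the sketch runs without credentials.
    llm = llm or (lambda prompt: f"(model response grounded in {len(retrieved)} chunk(s))")
    return llm(grounded_prompt(question, retrieved))

print(answer("What grew fastest?", []))  # -> I don't know.
print(answer("What grew fastest?", ["Data center revenue grew fastest."]))
```

Keeping the refusal path outside the model call makes the grounding behavior deterministic, rather than relying on the LLM to decline on its own.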

You can experiment with RAG on Griptape Cloud using one of our samples. We update the RAG sample application periodically because we think it is important to demonstrate that RAG provides a way to access up-to-date information created after the knowledge cut-off of the LLM that you are using.

At the time of writing, our ‘Brush Up on NVIDIA's Q4 Earnings’ sample provides a simple way to experiment with RAG on Griptape Cloud. This sample ingests the transcript of NVIDIA’s 2025 Q4 earnings call and the earnings announcement from NVIDIA’s website, then creates a Knowledge Base and a Griptape Cloud Assistant. This allows you to ask questions about NVIDIA’s Q4 results.

Getting started

To try the sample for yourself, follow these four simple steps.

  1. Log in to Griptape Cloud, or create a free account at https://cloud.griptape.ai/account?signup=true
  2. Select the ‘Brush Up on NVIDIA's Q4 Earnings’ sample on the Griptape Cloud console home page
  3. Follow the instructions at the top of the page, which will guide you through the three-step process of creating the sample application
  4. Ask questions of the Griptape Cloud Assistant. In this example, I ask what the top driver of earnings growth was in Q4, and the LLM powering the Assistant retrieves the correct section from the Knowledge Base to augment its response to me.

We hope you enjoy experimenting with RAG on Griptape Cloud. If you have any questions about the sample, or about retrieval augmented generation more generally, please jump on to the Griptape Discord, and our team will be happy to help.