Tuning into The Latest Update v0.26.0

Griptape v0.26.0 is now available on PyPI. This update brings many new features designed to improve your development experience by simplifying configuration, extending functionality, and improving system interactions. Let’s dive into how these new tools and updates can bring efficiency and power to your projects.

Powering Up Audio and Real-Time Functionalities

With the introduction of the AzureOpenAiStructureConfig, integrating Azure OpenAI services becomes much easier. This config acts as a central hub for all your Azure OpenAI Driver settings, streamlining the setup process and reducing complexity. Imagine setting up a new smart home system where this tool serves as the central control panel, allowing you to manage all your connections and configurations from one place.

Before the AzureOpenAiStructureConfig, you had to initialize every driver you needed with the same repetitive values, as seen below.

import os

from griptape.drivers import AzureOpenAiChatPromptDriver, AzureOpenAiEmbeddingDriver
from griptape.structures import Agent

agent_old = Agent(
    embedding_driver=AzureOpenAiEmbeddingDriver(
        model=os.environ["AZURE_OPENAI_EMBEDDING_MODEL_NAME"],
        azure_deployment=os.environ["AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME"],
        azure_endpoint=os.environ["AZURE_OPENAI_DEFAULT_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"]
    ),
    prompt_driver=AzureOpenAiChatPromptDriver(
        model=os.environ["AZURE_OPENAI_PROMPT_MODEL_NAME"],
        azure_deployment=os.environ["AZURE_OPENAI_PROMPT_DEPLOYMENT_NAME"],
        azure_endpoint=os.environ["AZURE_OPENAI_DEFAULT_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"]
    ),
)

With the new Config, you no longer need to repeat yourself: it initializes all base drivers for you from the single Config you specify.

from griptape.config import AzureOpenAiStructureConfig

agent_new = Agent(
    config=AzureOpenAiStructureConfig(
        azure_endpoint=os.environ["AZURE_OPENAI_DEFAULT_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"]
    )
)

Furthermore, for developers working with image and audio content, the new AzureOpenAiVisionImageQueryDriver and AudioLoader are game changers of a different sort. The AzureOpenAiVisionImageQueryDriver extends your application’s ability to understand and analyze images, bringing support for OpenAI’s multimodal models into the Azure ecosystem. Think of it as giving your app a pair of 'intelligent eyes'.

The AudioLoader, on the other hand, simplifies importing audio files into your system, much like an efficient digital librarian who organizes and prepares all audio materials for quick access and processing. It lets you load audio files from disk and transform them into Griptape Artifacts. Once loaded, these artifacts integrate smoothly with functionality such as the newly introduced Audio Transcription Task, streamlining the process of transforming spoken words into written text. Whether you're developing voice-activated systems or automating transcription workflows, this feature simplifies the first step of getting audio content into your projects.

Our release also focuses on improving audio processing capabilities. The AudioTranscriptionTask and AudioTranscriptionClient work together to transform audio content into text. This is perfect for applications such as voice-activated tools or automated transcription services: like having an assistant who takes notes for you, ensuring no piece of information is missed.

The following code example takes a downloaded audio file, transcribes it, translates the transcription, and outputs new audio files with the translations. It uses the ElevenLabsTextToSpeechDriver introduced in Griptape 0.25.

import os

from griptape.artifacts import TextArtifact
from griptape.drivers import OpenAiAudioTranscriptionDriver
from griptape.drivers import ElevenLabsTextToSpeechDriver
from griptape.engines import TextToSpeechEngine, AudioTranscriptionEngine
from griptape.structures import Workflow
from griptape.tasks import (
    PromptTask,
    ToolTask,
    CodeExecutionTask,
    TextToSpeechTask,
    BaseTask,
)
from griptape.tools.audio_transcription_client.tool import AudioTranscriptionClient
from dotenv import load_dotenv

load_dotenv()

transcription_task = ToolTask(
    "Transcribe the audio file {{ args[0] }}.",
    tool=AudioTranscriptionClient(
        off_prompt=False,
        engine=AudioTranscriptionEngine(
            audio_transcription_driver=OpenAiAudioTranscriptionDriver(
                model="whisper-1"
            ),
        ),
    ),
    id="transcription_task",
)


def make_translation_task(lang: str) -> list[BaseTask]:
    lang_prompt = f"Translate the following text into {lang}."
    common_prompt = """

        {{ parent_outputs['transcription_task'] }}
    """

    revision_task = PromptTask(
        f"""
        {lang_prompt}
        {common_prompt}
        """,
        id=f"translation_task_{lang}",
    )

    tts_task = TextToSpeechTask(
        lambda task: task.parents[0].output,
        output_dir="demolangs/",
        text_to_speech_engine=TextToSpeechEngine(
            text_to_speech_driver=ElevenLabsTextToSpeechDriver(
                api_key=os.environ["ELEVEN_LABS_API_KEY"],
                model="eleven_multilingual_v2",
                voice="Matilda",
            )
        ),
        id=f"tts_task_{lang}",
    )

    return [revision_task, tts_task]


end_task = CodeExecutionTask(
    run_fn=lambda _: TextArtifact("Done."),
    id="end_task",
)

langs = ["spanish", "portuguese"]
translation_tasks = [make_translation_task(lang) for lang in langs]

workflow = Workflow()
workflow.add_task(transcription_task)
workflow.add_task(end_task)
for translation_task in translation_tasks:
    workflow.insert_tasks(transcription_task, [translation_task[0]], end_task)
    workflow.insert_tasks(translation_task[0], [translation_task[1]], end_task)

workflow.run("blog_post.mp3")

For real-time application needs, the PusherEventListenerDriver enables instantaneous communication within your projects using Pusher WebSockets. This is essential for applications that depend on real-time updates, from live sports apps to instant messaging platforms, where speed and responsiveness are crucial.

Here is an example of how you would set up an Agent to respond to events pushed to it. You can see the Agent’s response to each event in the Pusher console.

import os
from griptape.drivers import PusherEventListenerDriver
from griptape.events import (
    StartTaskEvent,
    FinishTaskEvent,
    StartActionsSubtaskEvent,
    FinishActionsSubtaskEvent,
    StartPromptEvent,
    FinishPromptEvent,
    EventListener,
    FinishStructureRunEvent,
)

from griptape.structures import Agent

agent = Agent(
    event_listeners=[
        EventListener(
            event_types=[
                StartTaskEvent,
                FinishTaskEvent,
                StartActionsSubtaskEvent,
                FinishActionsSubtaskEvent,
                StartPromptEvent,
                FinishPromptEvent,
                FinishStructureRunEvent,
            ],
            driver=PusherEventListenerDriver(
                batched=True,
                app_id=os.environ["PUSHER_APP_ID"],
                key=os.environ["PUSHER_KEY"],
                secret=os.environ["PUSHER_SECRET"],
                cluster=os.environ["PUSHER_CLUSTER"],
                channel="my-channel",
                event_name="my-event",
            ),
        ),
    ],
)

agent.run("Analyze the pros and cons of remote work vs. office work")

Streamlining and Securing Environment Management

Finally, our update enhances environment management with the introduction of the env parameter on BaseStructureRunDriver. This gives you more precise control over the execution environments of your structures when using the LocalStructureRunDriver or the GriptapeCloudStructureRunDriver: you can define and adjust the environment variables that dictate how a structure operates at runtime, tailoring the execution environment to the specific needs of your application. This level of control is valuable for maintaining stability and efficiency, especially when deploying complex systems that must operate consistently across varied configurations.

The following code uses the env parameter to pass runtime configuration to a specific run of the structure.

import os
from griptape.drivers import LocalStructureRunDriver
from griptape.rules import Rule
from griptape.structures import Agent, Pipeline
from griptape.tasks import StructureRunTask
from dotenv import load_dotenv

load_dotenv()


def build_joke_teller():
    joke_teller = Agent(
        rules=[
            Rule(
                value="You are very funny.",
            )
        ],
    )

    return joke_teller


def build_joke_rewriter():
    joke_rewriter = Agent(
        rules=[
            Rule(
                value="You are the editor of a joke book. But you only speak in riddles.",
            ),
            Rule(
                value=f"Your output should be in the following language: {os.environ['LANGUAGE']}"
            ),
        ],
    )

    return joke_rewriter


joke_coordinator = Pipeline(
    tasks=[
        StructureRunTask(
            driver=LocalStructureRunDriver(
                structure_factory_fn=build_joke_teller,
            ),
        ),
        StructureRunTask(
            ("Rewrite this joke: {{ parent_output }}",),
            driver=LocalStructureRunDriver(
                structure_factory_fn=build_joke_rewriter, env={"LANGUAGE": "spanish"}
            ),
        ),
    ]
)

joke_coordinator.run("Tell me a joke")

Together, these features and improvements not only strengthen the robustness and capability of your applications but also simplify the management and integration of complex systems. So, whether you are building highly interactive applications, processing extensive data, or integrating sophisticated AI models, Griptape v0.26.0 is built to lift your projects to new heights. We can’t wait to see how you make it to the top.

Breaking Changes

We also want to re-introduce you to the off_prompt functionality that is configured on all Griptape Tools. This feature lets developers choose whether a Tool's results are stored in TaskMemory or returned immediately to the Large Language Model (LLM). Setting off_prompt to True saves results in TaskMemory, which is ideal for sensitive or large data that you might want to process or analyze later. Conversely, setting it to False returns results directly to the LLM, speeding up interactions when an immediate response is preferred. This flexibility gives you fine-grained control over data handling, so you can optimize performance and security based on your application's needs.

In Griptape 0.26.0, all Tools now default to off_prompt=False; you must set off_prompt=True if you wish to opt in to using TaskMemory. A more detailed write-up and examples can be found in the technical documentation.