Introducing Guardrails for Amazon Bedrock: Safeguarding LLMs with AWS

We are all talking about the business gains from using large language models, but these models have a lot of known issues, and constraining the answers a model can give is one way to apply some control to these powerful technologies. Today, at AWS re:Invent in Las Vegas, AWS CEO Adam Selipsky announced Guardrails for Amazon Bedrock.

“With Guardrails for Amazon Bedrock, you can consistently implement safeguards to deliver relevant and safe user experiences aligned with your company policies and principles,” the company wrote in a blog post this morning.

The new tool lets companies define and limit the kinds of language a model can use. If someone asks a question that isn’t relevant to the bot you are creating, the model will decline to answer rather than provide a very convincing but wrong answer or, worse, something offensive that could harm a brand.

At its most basic level, the tool lets you define topics that are out of bounds for the model, so it simply doesn’t answer irrelevant questions. As an example, Amazon points to a financial services company that may want to keep its bot from giving investment advice, for fear it could provide inappropriate recommendations that customers might take seriously. A scenario like this could work as follows:

“I specify a denied topic with the name ‘Investment advice’ and provide a natural language description, such as ‘Investment advice refers to inquiries, guidance, or recommendations regarding the management or allocation of funds or assets with the goal of generating returns or achieving specific financial objectives.’”

In addition, you can filter out specific words and phrases to remove content that could be offensive, applying filter strengths to different words and phrases to signal to the model what is out of bounds. Finally, you can filter out personally identifiable information (PII) to keep private data out of the model’s answers.
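To make the three controls described above more concrete (a denied topic, blocked words and phrases, and PII filtering), here is a rough sketch of how such a guardrail might be expressed in code. It assumes the field names of the `create_guardrail` call that later appeared in the AWS SDK for Python (boto3); the preview announced today may not have looked like this. The guardrail name, the blocked phrase, the PII types and the two refusal messages are hypothetical values chosen for illustration, and the sketch only builds the request locally without checking length limits or other service-side validation.

```python
import json

# Description quoted from the article's financial services example.
INVESTMENT_ADVICE_DEFINITION = (
    "Investment advice refers to inquiries, guidance, or recommendations "
    "regarding the management or allocation of funds or assets with the goal "
    "of generating returns or achieving specific financial objectives."
)


def build_guardrail_request(name, blocked_words, pii_types):
    """Build keyword arguments shaped like boto3's `create_guardrail` call.

    Assumption: field names follow the Bedrock API as later documented;
    the preview described in the article may have differed.
    """
    return {
        "name": name,
        # Denied topic: the model should refuse questions on this subject.
        "topicPolicyConfig": {
            "topicsConfig": [
                {
                    "name": "Investment advice",
                    "definition": INVESTMENT_ADVICE_DEFINITION,
                    "type": "DENY",
                }
            ]
        },
        # Specific words and phrases to block.
        "wordPolicyConfig": {
            "wordsConfig": [{"text": word} for word in blocked_words]
        },
        # PII entity types to mask in model answers.
        "sensitiveInformationPolicyConfig": {
            "piiEntitiesConfig": [
                {"type": pii_type, "action": "ANONYMIZE"} for pii_type in pii_types
            ]
        },
        # Hypothetical refusal messages shown when the guardrail intervenes.
        "blockedInputMessaging": "Sorry, I can't help with that topic.",
        "blockedOutputsMessaging": "Sorry, I can't share that response.",
    }


request = build_guardrail_request(
    name="finance-assistant-guardrail",  # hypothetical name
    blocked_words=["guaranteed returns"],  # hypothetical phrase
    pii_types=["EMAIL", "PHONE"],
)
print(json.dumps(request, indent=2))

# With AWS credentials and boto3 installed, the request could be sent as:
#   import boto3
#   boto3.client("bedrock").create_guardrail(**request)
```

Keeping the request as a plain dictionary means the policy can be reviewed or versioned like any other configuration before anything is sent to AWS.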

The guardrails feature was announced in preview today. It will probably be available to all customers sometime next year.

 

Author: admin