This module provides the BedrockEncoder class for generating embeddings using Amazon’s Bedrock Platform.

The BedrockEncoder class is a subclass of DenseEncoder and utilizes the TextEmbeddingModel from the Amazon’s Bedrock Platform to generate embeddings for given documents. It requires an AWS Access Key ID and AWS Secret Access Key and supports customization of the pre-trained model, score threshold, and region.

Example usage:

from semantic_router.encoders.bedrock_encoder import BedrockEncoder

encoder = BedrockEncoder(access_key_id=“your-access-key-id”, secret_access_key=“your-secret-key”, region=“your-region”) embeddings = encoder([“document1”, “document2”])

Classes: BedrockEncoder: A class for generating embeddings using the Bedrock Platform.

BedrockEncoder Objects

class BedrockEncoder(DenseEncoder)

Dense encoder using Amazon Bedrock embedding API. Requires an AWS Access Key ID and AWS Secret Access Key.

The BedrockEncoder class is a subclass of DenseEncoder and utilizes the TextEmbeddingModel from the Amazon’s Bedrock Platform to generate embeddings for given documents. It supports customization of the pre-trained model, score threshold, and region.

Example usage:

from semantic_router.encoders.bedrock_encoder import BedrockEncoder

encoder = BedrockEncoder(
    access_key_id="your-access-key-id",
    secret_access_key="your-secret-key",
    region="your-region"
)
embeddings = encoder(["document1", "document2"])

__init__

def __init__(name: str = EncoderDefault.BEDROCK.value["embedding_model"],
             input_type: Optional[str] = "search_query",
             score_threshold: float = 0.3,
             access_key_id: Optional[str] = None,
             secret_access_key: Optional[str] = None,
             session_token: Optional[str] = None,
             region: Optional[str] = None)

Initializes the BedrockEncoder.

Arguments:

  • name (str): The name of the pre-trained model to use for embedding. If not provided, the default model specified in EncoderDefault will be used.
  • input_type (str): The type of input to use for the embedding. If not provided, the default input type specified in EncoderDefault will be used.
  • score_threshold (float): The threshold for similarity scores.
  • access_key_id (str): The AWS access key id for an IAM principle. If not provided, it will be retrieved from the access_key_id environment variable.
  • secret_access_key (str): The secret access key for an IAM principle. If not provided, it will be retrieved from the AWS_SECRET_KEY environment variable.
  • str0: The session token for an IAM principle. If not provided, it will be retrieved from the AWS_SESSION_TOKEN environment variable.
  • str1 (str): The location of the Bedrock resources. If not provided, it will be retrieved from the AWS_REGION environment variable, defaulting to “us-west-1”

Raises:

  • str3: If the Bedrock Platform client fails to initialize.

__call__

def __call__(docs: List[Union[str, Dict]],
             model_kwargs: Optional[Dict] = None) -> List[List[float]]

Generates embeddings for the given documents.

Arguments:

  • docs (list[str]): A list of strings representing the documents to embed.
  • model_kwargs (dict): A dictionary of model-specific inference parameters.

Raises:

  • ValueError: If the Bedrock Platform client is not initialized or if the API call fails.

Returns:

list[list[float]]: A list of lists, where each inner list contains the embedding values for a document.

chunk_strings

def chunk_strings(strings, MAX_WORDS=20)

Breaks up a list of strings into smaller chunks.

Arguments:

  • strings (list): A list of strings to be chunked.
  • max_chunk_size (int): The maximum size of each chunk. Default is 20.

Returns:

list[list[str]]: A list of lists, where each inner list contains a chunk of strings.

get_env_variable

@staticmethod
def get_env_variable(var_name, provided_value, default=None)

Retrieves environment variable or uses a provided value.

Arguments:

  • var_name (str): The name of the environment variable.
  • provided_value (Optional[str]): The provided value to use if not None.
  • default (Optional[str]): The default value if the environment variable is not set.

Raises:

  • ValueError: If no value is provided and the environment variable is not set.

Returns:

str: The value of the environment variable or the provided/default value.