VitEncoder Objects
class VitEncoder(DenseEncoder)
Encoder for Vision Transformer models.
This class encodes images into dense embeddings using a Vision Transformer
model loaded via Hugging Face. It supports various image processing and model
initialization options.
__init__
Initialize the VitEncoder.
Arguments:
**data (dict): Additional keyword arguments for the encoder.
__call__
def __call__(imgs: List[Any], batch_size: int = 32) -> List[List[float]]
Encode a list of images into embeddings using the Vision Transformer model.
Arguments:
imgs (List[Any]): The images to encode.
batch_size (int): The number of images to encode per batch. Defaults to 32.
Returns:
List[List[float]]: The embeddings for the images.
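The effect of batch_size can be illustrated with a small standalone sketch. This is not the library's implementation: encode_in_batches and fake_embed_batch are hypothetical names introduced here, and the stand-in "model" simply maps each input to a two-element vector. The sketch only shows the general pattern of splitting the input into chunks of at most batch_size items and returning one embedding per input, in input order.

```python
from typing import Any, Callable, List


def encode_in_batches(
    imgs: List[Any],
    embed_batch: Callable[[List[Any]], List[List[float]]],
    batch_size: int = 32,
) -> List[List[float]]:
    """Encode imgs in chunks of at most batch_size, preserving input order.

    Hypothetical helper for illustration; not part of VitEncoder.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    embeddings: List[List[float]] = []
    for start in range(0, len(imgs), batch_size):
        batch = imgs[start : start + batch_size]
        # Each batch yields one embedding per item; append them in order.
        embeddings.extend(embed_batch(batch))
    return embeddings


def fake_embed_batch(batch: List[Any]) -> List[List[float]]:
    """Hypothetical stand-in for a model forward pass (assumption, not a real model)."""
    return [[float(len(str(img))), 1.0] for img in batch]


# Three inputs with batch_size=2 are processed as two batches: [a, bb] and [ccc].
vectors = encode_in_batches(["a", "bb", "ccc"], fake_embed_batch, batch_size=2)
print(vectors)
```

With the real encoder, the equivalent call follows the signature above, for example encoder(imgs, batch_size=8). A smaller batch_size generally lowers peak memory use at the cost of more forward passes.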