Embed images

vision, image, embedding, Clip

Embed images using pretrained deep learning models.

An embedding vector is a numerical representation of an image (or text, etc.) in which the different components of the vector capture different aspects of the image's content. Embeddings can be used, for example, to calculate the semantic similarity between pairs of images (see link_embeddings for one application: creating a network of images connected by similarity).
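As a rough illustration of comparing two images by their embeddings, the sketch below computes cosine similarity, a common choice for this kind of comparison. The three-dimensional vectors and their names are made-up stand-ins (real embeddings have many more components), and this is not necessarily how link_embeddings measures similarity.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, invented for illustration only.
cat_photo = [0.9, 0.1, 0.0]
kitten_photo = [0.8, 0.2, 0.1]
truck_photo = [0.0, 0.2, 0.9]

sim_cat_kitten = cosine_similarity(cat_photo, kitten_photo)
sim_cat_truck = cosine_similarity(cat_photo, truck_photo)

# Vectors pointing in similar directions score closer to 1.
print(sim_cat_kitten > sim_cat_truck)
```

Values near 1 indicate vectors pointing in nearly the same direction, values near 0 indicate unrelated directions.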

In its current form the step calculates image embeddings using CLIP, a model trained on 400 million image/text pairs to pick out an image's correct caption from a list of candidates.
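The caption-matching idea can be sketched with toy numbers: given one image embedding and one embedding per candidate caption, choose the caption whose vector is most similar to the image's. Everything here is an assumption for illustration: the vectors are invented, best_caption is a hypothetical helper, and the step itself outputs image embeddings only.

```python
import math

def normalize(vec):
    """Scale a non-zero vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def best_caption(image_vec, caption_vecs):
    """Return the caption with the highest cosine similarity, plus all scores."""
    image_unit = normalize(image_vec)
    scores = {
        caption: sum(x * y for x, y in zip(image_unit, normalize(vec)))
        for caption, vec in caption_vecs.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical embeddings; a real model would produce these vectors.
image_vec = [0.7, 0.1, 0.2]
caption_vecs = {
    "a photo of a dog": [0.8, 0.0, 0.1],
    "a photo of a car": [0.1, 0.9, 0.0],
    "a bowl of soup": [0.0, 0.2, 0.9],
}

best, scores = best_caption(image_vec, caption_vecs)
print(best)
```

Because images and captions share one vector space in this setup, the same similarity measure works for image-to-image comparisons as well.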

Usage


The following are the step's expected inputs and outputs and their specific types.

Step signature
embed_images(images: url, {
    "param": value
}) -> (embedding: list[number])

where the object {"param": value} is optional in most cases and, if present, may contain any of the parameters described in the Parameters section below.

Example

The step has no required parameters, so the simplest call is:

Example call (in recipe editor)
embed_images(ds.image_url) -> (ds.embedding)

Inputs


images: column:url

A column of image URLs to calculate embeddings for.

Outputs


embedding: column:list[number]

A column of embedding vectors capturing the meaning of each input image.

Parameters


normalize: boolean = True

Whether to normalize embedding vectors to unit length (a norm of 1.0).
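To show what normalization does to a vector, here is a minimal sketch that assumes "length" means the Euclidean (L2) norm; the step's actual implementation is not documented here. The helper l2_normalize and the sample vectors are hypothetical.

```python
import math

def l2_normalize(vec):
    """Scale a vector so its Euclidean length is 1.0 (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical raw embeddings.
a = [3.0, 4.0]
b = [4.0, 3.0]

a_unit = l2_normalize(a)
b_unit = l2_normalize(b)

# Length of the normalized vector.
length_a_unit = math.sqrt(dot(a_unit, a_unit))

# For unit vectors, the plain dot product equals the cosine similarity
# of the original vectors.
cosine_from_raw = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
cosine_from_unit = dot(a_unit, b_unit)
print(abs(cosine_from_raw - cosine_from_unit) < 1e-12)
```

One practical consequence under this assumption: with normalized embeddings, similarity comparisons reduce to dot products, and vector magnitude no longer influences the result.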