embed_images
Embed images using pretrained DL models.
An embedding vector is a numerical representation of an image (or text etc.), such that different numerical components
of the vector capture different dimensions of the image’s content. Embeddings can be used, for example, to calculate
the semantic similarity between pairs of images (see link_embeddings
, for example, to create a network of images
connected by similarity).
In its current form the step calculates image embeddings using Clip, which has been trained on 400M image/text pairs to pick out an image’s correct caption from a list of candidates.
Was this page helpful?