Embedding
docarray.typing.tensor.embedding.embedding
AnyEmbedding
Bases: AnyTensor, EmbeddingMixin
Represents an embedding tensor object that can be used with TensorFlow, PyTorch, and NumPy tensor types.
```python
from docarray import BaseDoc
from docarray.typing import AnyEmbedding


class MyEmbeddingDoc(BaseDoc):
    embedding: AnyEmbedding


# Example usage with TensorFlow:
import tensorflow as tf

# tf.zeros takes the shape as a single argument (the second positional argument is dtype)
doc = MyEmbeddingDoc(embedding=tf.zeros((1000, 2)))
type(doc.embedding)  # TensorFlowEmbedding

# Example usage with PyTorch:
import torch

doc = MyEmbeddingDoc(embedding=torch.zeros(1000, 2))
type(doc.embedding)  # TorchEmbedding

# Example usage with NumPy:
import numpy as np

doc = MyEmbeddingDoc(embedding=np.zeros((1000, 2)))
type(doc.embedding)  # NdArrayEmbedding
```
Raises: TypeError: If the type of the value is not one of [torch.Tensor, tensorflow.Tensor, numpy.ndarray]
Source code in docarray/typing/tensor/embedding/embedding.py
__docarray_validate_shape__(t, shape)
classmethod
Every tensor has to implement this method in order to enable syntax of the form AnyTensor[shape]. It is called when a tensor is assigned to a field of this type, i.e. when a tensor is passed to a Document field of type AnyTensor[shape].
The intended behaviour is as follows:
- If the shape of `t` is equal to `shape`, return `t`.
- If the shape of `t` is not equal to `shape`, but can be reshaped to `shape`, return `t` reshaped to `shape`.
- If the shape of `t` is not equal to `shape` and cannot be reshaped to `shape`, raise a ValueError.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
t | T | The tensor to validate. | required |
shape | Tuple[Union[int, str], ...] | The shape to validate against. | required |
Returns:
Type | Description |
---|---|
T | The validated tensor. |
Source code in docarray/typing/tensor/abstract_tensor.py
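The three cases listed above can be sketched with plain NumPy. This is only an illustration of the intended behaviour, not docarray's implementation: the function name `validate_shape` is hypothetical, and the sketch handles integer dimensions only, whereas the documented `shape` type also allows strings.

```python
from typing import Tuple

import numpy as np


def validate_shape(t: np.ndarray, shape: Tuple[int, ...]) -> np.ndarray:
    """Hypothetical helper mirroring the three documented cases."""
    # Case 1: the shape already matches, return the tensor unchanged.
    if t.shape == shape:
        return t
    # Case 2: the shape differs but the element count allows a reshape.
    try:
        return t.reshape(shape)
    except ValueError:
        # Case 3: the tensor cannot be reshaped to the requested shape.
        raise ValueError(
            f"Tensor shape mismatch. Expected {shape}, got {t.shape}"
        )
```

For instance, a `(2, 3)` array passed with `shape=(3, 2)` comes back reshaped, while `shape=(4, 4)` raises a ValueError because six elements cannot fill sixteen slots.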
docarray.typing.tensor.embedding.embedding_mixin
docarray.typing.tensor.embedding.ndarray
docarray.typing.tensor.embedding.tensorflow
docarray.typing.tensor.embedding.torch
TorchEmbedding
Bases: TorchTensor, EmbeddingMixin
Source code in docarray/typing/tensor/embedding/torch.py
__deepcopy__(memo)
Custom implementation of deepcopy for TorchTensor to avoid storage sharing issues.
Source code in docarray/typing/tensor/torch_tensor.py
__docarray_validate_shape__(t, shape)
classmethod
Every tensor has to implement this method in order to enable syntax of the form AnyTensor[shape]. It is called when a tensor is assigned to a field of this type, i.e. when a tensor is passed to a Document field of type AnyTensor[shape].
The intended behaviour is as follows:
- If the shape of `t` is equal to `shape`, return `t`.
- If the shape of `t` is not equal to `shape`, but can be reshaped to `shape`, return `t` reshaped to `shape`.
- If the shape of `t` is not equal to `shape` and cannot be reshaped to `shape`, raise a ValueError.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
t | T | The tensor to validate. | required |
shape | Tuple[Union[int, str], ...] | The shape to validate against. | required |
Returns:
Type | Description |
---|---|
T | The validated tensor. |
Source code in docarray/typing/tensor/abstract_tensor.py
__getitem__(item)
abstractmethod
__iter__()
abstractmethod
__setitem__(index, value)
abstractmethod
from_ndarray(value)
classmethod
Create a TorchTensor from a numpy array
Parameters:
Name | Type | Description | Default |
---|---|---|---|
value | ndarray | the numpy array | required |
Returns:
Type | Description |
---|---|
T | a TorchTensor |
Source code in docarray/typing/tensor/torch_tensor.py
from_protobuf(pb_msg)
classmethod
Read a tensor from an NdArrayProto protobuf message
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pb_msg | NdArrayProto | | required |
Returns:
Type | Description |
---|---|
T | a |
Source code in docarray/typing/tensor/torch_tensor.py
get_comp_backend()
staticmethod
Return the computational backend of the tensor
new_empty(*args, **kwargs)
This method enables the deepcopy of TorchEmbedding by returning another instance of this subclass. If this method is not implemented, the deepcopy will raise a RuntimeError from Torch.
Source code in docarray/typing/tensor/embedding/torch.py
to_protobuf()
Transform self into an NdArrayProto protobuf message
Source code in docarray/typing/tensor/torch_tensor.py
unwrap()
Return the original torch.Tensor without any memory copy.
The original view remains intact and is still a Document TorchTensor, while the returned object is a pure torch.Tensor. Both objects share the same underlying memory.
```python
from docarray.typing import TorchTensor
import torch
from pydantic import parse_obj_as

t = parse_obj_as(TorchTensor, torch.zeros(3, 224, 224))
# here t is a docarray TorchTensor
t2 = t.unwrap()
# here t2 is a pure torch.Tensor but t is still a docarray TorchTensor
# both share the same underlying memory
```
Returns:
Type | Description |
---|---|
Tensor | a torch.Tensor |
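The zero-copy behaviour described for unwrap() can be illustrated with a NumPy analogue. This is a sketch under stated assumptions, not TorchTensor's code: `WrappedArray` is a hypothetical ndarray subclass standing in for a docarray tensor type, and `ndarray.view` plays the role of the unwrapping step.

```python
import numpy as np


class WrappedArray(np.ndarray):
    """Hypothetical ndarray subclass standing in for a docarray tensor type."""

    def unwrap(self) -> np.ndarray:
        # view() reinterprets the same buffer as a plain ndarray; no data is copied.
        return self.view(np.ndarray)


wrapped = np.zeros(4).view(WrappedArray)  # the "wrapped" tensor
plain = wrapped.unwrap()                  # a plain ndarray over the same buffer

plain[0] = 1.0  # a write through the unwrapped array is visible in the wrapper
```

Because both objects point at one buffer, unwrapping is cheap, but any in-place change made through one object shows up in the other.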