AudioTensor
docarray.typing.tensor.audio.audio_ndarray
AudioNdArray
Bases: `AbstractAudioTensor`, `NdArray`

Subclass of `NdArray` to represent an audio tensor. Adds audio-specific features to the tensor.
```python
from typing import Optional

import numpy as np

from docarray import BaseDoc
from docarray.typing import AudioBytes, AudioNdArray, AudioUrl


class MyAudioDoc(BaseDoc):
    title: str
    audio_tensor: Optional[AudioNdArray] = None
    url: Optional[AudioUrl] = None
    bytes_: Optional[AudioBytes] = None


# from tensor
doc_1 = MyAudioDoc(
    title='my_first_audio_doc',
    audio_tensor=np.random.rand(1000, 2),
)

# doc_1.audio_tensor.save(file_path='/tmp/file_1.wav')
doc_1.bytes_ = doc_1.audio_tensor.to_bytes()

# from url
doc_2 = MyAudioDoc(
    title='my_second_audio_doc',
    url='https://www.kozco.com/tech/piano2.wav',
)

doc_2.audio_tensor, _ = doc_2.url.load()
# doc_2.audio_tensor.save(file_path='/tmp/file_2.wav')
doc_2.bytes_ = doc_2.audio_tensor.to_bytes()
```
Source code in docarray/typing/tensor/audio/audio_ndarray.py
docarray.typing.tensor.audio.abstract_audio_tensor
AbstractAudioTensor
Bases: `AbstractTensor`, `ABC`
Source code in docarray/typing/tensor/audio/abstract_audio_tensor.py
__docarray_validate_getitem__(item)
classmethod

This method validates the input to `AbstractTensor.__class_getitem__`.

It is called at "class creation time", i.e. when a class is created with syntax of the form `AnyTensor[shape]`.

The default implementation tries to cast any `item` to a tuple of ints. A subclass can override this method to implement custom validation logic.

The output of this is eventually passed to `AbstractTensor.__docarray_validate_shape__` as its `shape` argument.

Raises `ValueError` if the input `item` does not pass validation.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`item` | `Any` | The item to validate, passed to `__class_getitem__`. | required |

Returns:

Type | Description |
---|---|
`Tuple[int]` | The validated item == the target shape of this tensor. |
Source code in docarray/typing/tensor/abstract_tensor.py
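The default behaviour described above can be sketched as a standalone helper (a hypothetical illustration, not docarray's actual source):

```python
from typing import Any, Tuple


def validate_getitem(item: Any) -> Tuple[int, ...]:
    """Mimic the default behaviour: cast `item` to a tuple of ints,
    raising ValueError when that is not possible."""
    if isinstance(item, int):
        # AnyTensor[128] -> shape (128,)
        return (item,)
    try:
        # AnyTensor[1000, 2] -> shape (1000, 2)
        return tuple(int(i) for i in item)
    except (TypeError, ValueError):
        raise ValueError(f'{item!r} is not a valid tensor shape')


validate_getitem(128)        # (128,)
validate_getitem((1000, 2))  # (1000, 2)
```

A subclass overriding `__docarray_validate_getitem__` would follow the same contract: accept the raw item, return the shape tuple that is later handed to `__docarray_validate_shape__`.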
__docarray_validate_shape__(t, shape)
classmethod

Every tensor has to implement this method in order to enable syntax of the form `AnyTensor[shape]`. It is called when a tensor is assigned to a field of this type, i.e. when a tensor is passed to a Document field of type `AnyTensor[shape]`.

The intended behaviour is as follows:

- If the shape of `t` is equal to `shape`, return `t`.
- If the shape of `t` is not equal to `shape` but can be reshaped to `shape`, return `t` reshaped to `shape`.
- If the shape of `t` is not equal to `shape` and cannot be reshaped to `shape`, raise a `ValueError`.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`t` | `T` | The tensor to validate. | required |
`shape` | `Tuple[Union[int, str], ...]` | The shape to validate against. | required |

Returns:

Type | Description |
---|---|
`T` | The validated tensor. |
Source code in docarray/typing/tensor/abstract_tensor.py
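The three cases above can be sketched with plain NumPy (a hypothetical helper for illustration, not docarray's actual implementation, and ignoring symbolic `str` dimensions):

```python
from typing import Tuple

import numpy as np


def validate_shape(t: np.ndarray, shape: Tuple[int, ...]) -> np.ndarray:
    shape = tuple(shape)
    if t.shape == shape:
        # case 1: shapes already match, return as-is
        return t
    if t.size == int(np.prod(shape)):
        # case 2: same number of elements, reshape is possible
        return t.reshape(shape)
    # case 3: incompatible shape
    raise ValueError(f'cannot validate tensor of shape {t.shape} against {shape}')
```

For example, a `(2, 3)` tensor validated against `(3, 2)` would be reshaped, while validating it against `(4, 4)` would raise a `ValueError`.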
__getitem__(item)
abstractmethod
__iter__()
abstractmethod
__setitem__(index, value)
abstractmethod
display(rate=44100)
Play the audio data from the tensor in a notebook.
Source code in docarray/typing/tensor/audio/abstract_audio_tensor.py
get_comp_backend()
abstractmethod
staticmethod
save(file_path, format='wav', frame_rate=44100, sample_width=2, pydub_args={})

Save the audio tensor to an audio file. Mono/stereo is preserved.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
`file_path` | `Union[str, BinaryIO]` | Path to an audio file. If file is a string, open the file by that name; otherwise treat it as a file-like object. | required |
`format` | `str` | Format for the audio file ('mp3', 'wav', 'raw', 'ogg' or other ffmpeg/avconv supported formats). | `'wav'` |
`frame_rate` | `int` | Sampling frequency. | `44100` |
`sample_width` | `int` | Sample width in bytes. | `2` |
`pydub_args` | `Dict[str, Any]` | Dictionary of additional arguments for the `pydub.AudioSegment.export` function. | `{}` |
Source code in docarray/typing/tensor/audio/abstract_audio_tensor.py
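To make `sample_width` concrete, here is a small sketch in plain NumPy (independent of docarray and pydub) of how float audio samples map to 2-byte integers, the default width:

```python
import numpy as np

# Float stereo audio in [-1.0, 1.0]: 1000 frames, 2 channels
audio = np.random.uniform(-1.0, 1.0, size=(1000, 2))

# sample_width=2 means each sample is stored as a 2-byte signed
# integer (int16), so floats are scaled to the int16 range
pcm16 = (audio * np.iinfo(np.int16).max).astype(np.int16)
raw = pcm16.tobytes()

# 1000 frames * 2 channels * 2 bytes per sample
assert len(raw) == 1000 * 2 * 2
```

A higher `sample_width` (e.g. 4 bytes) gives more dynamic range per sample at the cost of file size; `frame_rate` independently controls how many frames represent one second of audio.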
to_bytes()

Convert the audio tensor to `AudioBytes`.
Source code in docarray/typing/tensor/audio/abstract_audio_tensor.py
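Raw audio bytes can be turned back into samples; a minimal NumPy round-trip sketch (assuming int16 samples for illustration, not the exact `AudioBytes` layout):

```python
import numpy as np

# Four int16 samples serialized to raw bytes and restored
pcm = np.array([0, 16384, -16384, 32767], dtype=np.int16)
raw = pcm.tobytes()
restored = np.frombuffer(raw, dtype=np.int16)

assert np.array_equal(restored, pcm)
```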
to_protobuf()
abstractmethod
docarray.typing.tensor.audio.audio_tensorflow_tensor
AudioTensorFlowTensor
Bases: `AbstractAudioTensor`, `TensorFlowTensor`

Subclass of `TensorFlowTensor` to represent an audio tensor. Adds audio-specific features to the tensor.
```python
from typing import Optional

import tensorflow as tf

from docarray import BaseDoc
from docarray.typing import AudioBytes, AudioTensorFlowTensor, AudioUrl


class MyAudioDoc(BaseDoc):
    title: str
    audio_tensor: Optional[AudioTensorFlowTensor] = None
    url: Optional[AudioUrl] = None
    bytes_: Optional[AudioBytes] = None


doc_1 = MyAudioDoc(
    title='my_first_audio_doc',
    audio_tensor=tf.random.normal((1000, 2)),
)

# doc_1.audio_tensor.save(file_path='file_1.wav')
doc_1.bytes_ = doc_1.audio_tensor.to_bytes()

doc_2 = MyAudioDoc(
    title='my_second_audio_doc',
    url='https://www.kozco.com/tech/piano2.wav',
)

doc_2.audio_tensor, _ = doc_2.url.load()
doc_2.bytes_ = doc_2.audio_tensor.to_bytes()
```
Source code in docarray/typing/tensor/audio/audio_tensorflow_tensor.py
docarray.typing.tensor.audio.audio_torch_tensor
AudioTorchTensor
Bases: `AbstractAudioTensor`, `TorchTensor`

Subclass of `TorchTensor` to represent an audio tensor. Adds audio-specific features to the tensor.
```python
from typing import Optional

import torch

from docarray import BaseDoc
from docarray.typing import AudioBytes, AudioTorchTensor, AudioUrl


class MyAudioDoc(BaseDoc):
    title: str
    audio_tensor: Optional[AudioTorchTensor] = None
    url: Optional[AudioUrl] = None
    bytes_: Optional[AudioBytes] = None


doc_1 = MyAudioDoc(
    title='my_first_audio_doc',
    audio_tensor=torch.zeros(1000, 2),
)

# doc_1.audio_tensor.save(file_path='/tmp/file_1.wav')
doc_1.bytes_ = doc_1.audio_tensor.to_bytes()

doc_2 = MyAudioDoc(
    title='my_second_audio_doc',
    url='https://www.kozco.com/tech/piano2.wav',
)

doc_2.audio_tensor, _ = doc_2.url.load()
# doc_2.audio_tensor.save(file_path='/tmp/file_2.wav')
doc_2.bytes_ = doc_2.audio_tensor.to_bytes()
```