DocArray supports many modalities, including video. This section shows you how to load and handle video data using DocArray.
Video support requires the `av` dependency. You can install all necessary dependencies via:
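A likely install command, assuming the `video` extra of DocArray bundles `av` (check the installation docs for the exact extra name):

```shell
pip install "docarray[video]"
```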
## Load video data
In DocArray, video data is represented by a video tensor, an audio tensor, and the key frame indices. Check out our predefined `VideoDoc` to get started and play around with our video features.
First, let's define a `MyVideo` class with all of those attributes and instantiate an object with a local or remote URL:

```python
from docarray import BaseDoc
from docarray.typing import AudioNdArray, NdArray, VideoNdArray, VideoUrl


class MyVideo(BaseDoc):
    url: VideoUrl
    video: VideoNdArray = None
    audio: AudioNdArray = None
    key_frame_indices: NdArray = None


doc = MyVideo(
    url='https://github.com/docarray/docarray/blob/main/tests/toydata/mov_bbb.mp4?raw=true'
)

# Load the video data from the URL into the three attributes
doc.video, doc.audio, doc.key_frame_indices = doc.url.load()
```
- The video tensor is a 4-dimensional array of shape `(n_frames, height, width, channels)`:
  - The first dimension represents the frame index.
  - The last three dimensions represent the image data of the corresponding frame.
- If the video contains audio, it will be stored as an `AudioNdArray`.
- Additionally, the key frame indices will be stored. A key frame is defined as the starting point of any smooth transition.
For the given example, you can infer from `doc.video`'s shape that the video contains 250 frames of size 176x320 in RGB mode. Based on the overall length of the video (10 seconds), you can infer that the frame rate is approximately 250/10 = 25 frames per second (fps).
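As a sketch of that inference, here's the arithmetic on a stand-in tensor with the same shape as the toy example (the zeros array is a placeholder, not real video data):

```python
import numpy as np

# Stand-in for doc.video: 250 frames, each a 176x320 RGB image
video = np.zeros((250, 176, 320, 3), dtype=np.uint8)

n_frames, height, width, channels = video.shape
duration_seconds = 10  # known length of the example clip
fps = n_frames / duration_seconds  # 25.0
```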
DocArray offers several `VideoTensor` types to store your data:

- `VideoNdArray`
- `VideoTorchTensor`
- `VideoTensorFlowTensor`

If you specify the type of your tensor as one of the above, it will be cast to that type automatically:
```python
from docarray import BaseDoc
from docarray.typing import VideoTensorFlowTensor, VideoTorchTensor, VideoUrl


class MyVideo(BaseDoc):
    url: VideoUrl
    tf_tensor: VideoTensorFlowTensor = None
    torch_tensor: VideoTorchTensor = None


doc = MyVideo(
    url='https://github.com/docarray/docarray/blob/main/tests/toydata/mov_bbb.mp4?raw=true'
)

doc.tf_tensor = doc.url.load().video
doc.torch_tensor = doc.url.load().video

assert isinstance(doc.tf_tensor, VideoTensorFlowTensor)
assert isinstance(doc.torch_tensor, VideoTorchTensor)
```
You can also store the raw bytes of a video, loaded via `load_bytes()`:

```python
from docarray import BaseDoc
from docarray.typing import VideoBytes, VideoTensor, VideoUrl


class MyVideo(BaseDoc):
    url: VideoUrl
    bytes_: VideoBytes = None
    video: VideoTensor = None


doc = MyVideo(
    url='https://github.com/docarray/docarray/blob/main/tests/toydata/mov_bbb.mp4?raw=true'
)

doc.bytes_ = doc.url.load_bytes()
doc.video = doc.url.load().video
```
## Key frame extraction
A key frame is defined as the starting point of any smooth transition. Given the key frame indices, you can access selected scenes:
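A minimal sketch of scene access, using placeholder data (the key frame indices here are made up; in practice they come from `doc.url.load()`):

```python
import numpy as np

# Placeholder video tensor and hypothetical key frame indices marking
# the start of each scene
video = np.zeros((250, 176, 320, 3), dtype=np.uint8)
key_frame_indices = np.array([0, 95, 170])

# The second scene spans from its key frame up to the next key frame
start, end = key_frame_indices[1], key_frame_indices[2]
scene = video[start:end]
```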
Or you can access the first frame of all new scenes and display them in a notebook:
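Indexing the video tensor with all key frame indices at once yields the first frame of every scene; a sketch with the same placeholder data (displaying the frames, e.g. via `PIL.Image.fromarray`, only makes sense in a notebook):

```python
import numpy as np

# Placeholder video tensor and hypothetical key frame indices
video = np.zeros((250, 176, 320, 3), dtype=np.uint8)
key_frame_indices = np.array([0, 95, 170])

# One frame per scene: shape (n_scenes, height, width, channels)
key_frames = video[key_frame_indices]
```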
## Save video to file
You can save your video tensor to a file. If you save the example video with a frame rate of 60 fps, the same 250 frames play back in roughly 4 seconds, instead of the original 10 seconds at 25 fps.
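The duration change is just frame count divided by frame rate; the `save()` call shown in the comment is an assumption about the `VideoTensor` API (check the parameter names before use):

```python
# Re-encoding the same frames at a higher frame rate shortens playback:
# duration = n_frames / fps
n_frames = 250
original_duration = n_frames / 25  # 10.0 seconds
new_duration = n_frames / 60  # roughly 4.2 seconds

# Hypothetical save call (parameter names are an assumption):
# doc.video.save(file_path='mov_bbb_60fps.mp4', video_frame_rate=60)
```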
## Display video in a notebook
You can play a video in a notebook from its URL as well as from its tensor, by calling `.display()` on either one. For the latter, you can optionally pass the corresponding `AudioTensor` as a parameter.
## Getting started - Predefined VideoDoc

To get started and play around with your video data, DocArray provides a predefined `VideoDoc`, which includes all of the previously mentioned functionality.
You can use this class directly or extend it to your preference:
```python
from typing import Optional

from docarray.documents import VideoDoc


# Extend the predefined VideoDoc with a custom field
class MyVideo(VideoDoc):
    name: Optional[str] = None


video = MyVideo(
    url='https://github.com/docarray/docarray/blob/main/tests/toydata/mov_bbb.mp4?raw=true'
)
video.name = 'My first video doc!'
video.tensor = video.url.load().video
```