Introduction
This user guide shows you how to use DocArray with most of its features.
There are three main sections:
- Representing data: This section will show you how to represent your data. This is a great starting point if you want to better organize the data in your ML models, or if you are looking for a "Pydantic for ML".
- Sending data: This section will show you how to send your data. This is a great starting point if you want to serve your ML model, for example through FastAPI.
- Storing data: This section will show you how to store your data. This is a great starting point if you are looking for an "ORM for vector databases".
Start with the Representing data section; the Sending data and Storing data sections can then be read in any order.
You will first need to install DocArray in your Python environment.
Install DocArray
To install DocArray, you can use the following command:
This will install the main dependencies of DocArray and will work with all the supported data modalities.
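Once the installation has finished, it can be useful to confirm that the package is visible from the Python environment you intend to use. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of DocArray, and the distribution name `docarray` is assumed to match the name used on the package index.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "docarray" is the assumed distribution name.
    version = installed_version("docarray")
    if version is None:
        print("docarray is not installed in this environment")
    else:
        print(f"docarray {version} is installed")
```

If the check reports that the package is missing, the most likely cause is that `pip` installed into a different environment than the one running the script.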
Note
To install a very light version of DocArray with only the core dependencies, you can use the following command:
If you want to use protobuf and DocArray, you can run:
Depending on your usage, you might want to use DocArray with only a couple of specific modalities and their dependencies.
For instance, if you only want to work with images, you can install DocArray using the following command:
...or with images and audio:
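The installation variants described in this section all follow pip's "extras" syntax, where optional dependency groups are listed in square brackets after the package name. The sketch below builds the corresponding command lines without running them. The helper `pip_install_command` is hypothetical, and the extras names (`full`, `proto`, `image`, `audio`) are assumptions that should be checked against the package's published metadata before use.

```python
from typing import Iterable


def pip_install_command(extras: Iterable[str] = ()) -> str:
    """Build a `pip install` command line for docarray with optional extras."""
    extras = list(extras)
    requirement = "docarray"
    if extras:
        requirement += "[" + ",".join(extras) + "]"
    # Quote the requirement so shells do not interpret the square brackets.
    return f'pip install "{requirement}"'


# Extras names below are assumptions; confirm them against the package metadata.
variants = {
    "all supported modalities": ["full"],
    "core dependencies only": [],
    "protobuf support": ["proto"],
    "images only": ["image"],
    "images and audio": ["image", "audio"],
}

for label, extras in variants.items():
    print(f"{label}: {pip_install_command(extras)}")
```

Several extras can be combined in one command by separating them with commas, which is how the images-and-audio variant is expressed.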