ElasticV7DocIndex
docarray.index.backends.elasticv7.ElasticV7DocIndex
Bases: ElasticDocIndex
Source code in docarray/index/backends/elasticv7.py
DBConfig
dataclass
Bases: DBConfig
Dataclass that contains all "static" configurations of ElasticDocIndex.
Source code in docarray/index/backends/elasticv7.py
QueryBuilder
Bases: QueryBuilder
Source code in docarray/index/backends/elasticv7.py
build(*args, **kwargs)
Build the Elasticsearch v7 query object.
Source code in docarray/index/backends/elasticv7.py
filter(query, limit=10)
Find documents in the index based on a filter query
Parameters:
Name | Type | Description | Default |
---|---|---|---|
query | Dict[str, Any] | the query to execute | required |
limit | int | maximum number of documents to return | 10 |
Returns:
Type | Description |
---|---|
self |
Source code in docarray/index/backends/elastic.py
find(query, search_field='embedding', limit=10, num_candidates=None)
Find k-nearest neighbors of the query.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
query | Union[AnyTensor, BaseDoc] | query vector for KNN/ANN search; must have a single axis | required |
search_field | str | name of the field to search on | 'embedding' |
limit | int | maximum number of documents to return per query | 10 |
Returns:
Type | Description |
---|---|
self |
Source code in docarray/index/backends/elasticv7.py
text_search(query, search_field='text', limit=10)
Find documents in the index based on a text search query
Parameters:
Name | Type | Description | Default |
---|---|---|---|
query | str | The text to search for | required |
search_field | str | name of the field to search on | 'text' |
limit | int | maximum number of documents to find | 10 |
Returns:
Type | Description |
---|---|
self |
Source code in docarray/index/backends/elastic.py
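The QueryBuilder methods above are documented as returning self, which allows calls to be chained before build(). The following is a minimal sketch of that chaining pattern only. `ToyQueryBuilder` and the dictionary layout it produces are hypothetical and are not docarray's implementation.

```python
from typing import Any, Dict, List


class ToyQueryBuilder:
    """Hypothetical stand-in that mimics the chaining style only."""

    def __init__(self) -> None:
        self._filters: List[Dict[str, Any]] = []
        self._must: List[Dict[str, Any]] = []
        self._size = 10

    def filter(self, query: Dict[str, Any], limit: int = 10) -> "ToyQueryBuilder":
        self._filters.append(query)
        self._size = limit
        return self  # returning self is what enables chaining

    def text_search(
        self, query: str, search_field: str = "text", limit: int = 10
    ) -> "ToyQueryBuilder":
        self._must.append({"match": {search_field: query}})
        self._size = limit
        return self

    def build(self) -> Dict[str, Any]:
        # Assumed layout: an Elasticsearch-style bool query.
        return {
            "size": self._size,
            "query": {"bool": {"must": self._must, "filter": self._filters}},
        }


q = (
    ToyQueryBuilder()
    .filter({"range": {"price": {"lte": 10}}})
    .text_search("red shoes", search_field="title", limit=5)
    .build()
)
print(q)
```

In this sketch the last call's limit wins; how the real builder combines limits across chained calls is not stated in this reference.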
RuntimeConfig
dataclass
Bases: RuntimeConfig
Dataclass that contains all "dynamic" configurations of ElasticDocIndex.
__contains__(item)
Checks if a given document exists in the index.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
item | BaseDoc | The document to check. It must be an instance of BaseDoc or its subclass. | required |
Returns:
Type | Description |
---|---|
bool | True if the document exists in the index, False otherwise. |
Source code in docarray/index/abstract.py
__delitem__(key)
Delete one or multiple Documents from the index, by id.
If no document is found, a KeyError is raised.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
key | Union[str, Sequence[str]] | id or ids to delete from the Document index | required |
Source code in docarray/index/abstract.py
__getitem__(key)
Get one or multiple Documents from the index, by id.
If no document is found, a KeyError is raised.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
key | Union[str, Sequence[str]] | id or ids to get from the Document index | required |
Source code in docarray/index/abstract.py
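Both `__getitem__` and `__delitem__` accept either a single id or a sequence of ids and raise a KeyError when no document is found. A rough sketch of that key handling follows, using a hypothetical in-memory `ToyDocStore`. It assumes "no document is found" means that none of the requested ids exist; the real index may treat partial misses differently.

```python
from typing import Dict, List, Sequence, Union


class ToyDocStore:
    """Hypothetical in-memory stand-in, used only to illustrate key handling."""

    def __init__(self) -> None:
        self._docs: Dict[str, dict] = {}

    def add(self, doc_id: str, doc: dict) -> None:
        self._docs[doc_id] = doc

    def __getitem__(self, key: Union[str, Sequence[str]]) -> Union[dict, List[dict]]:
        ids = [key] if isinstance(key, str) else list(key)
        found = [self._docs[i] for i in ids if i in self._docs]
        if not found:
            raise KeyError(f"No document with id {key} found")
        # A single id yields a single document, a sequence yields a list.
        return found[0] if isinstance(key, str) else found

    def __delitem__(self, key: Union[str, Sequence[str]]) -> None:
        ids = [key] if isinstance(key, str) else list(key)
        present = [i for i in ids if i in self._docs]
        if not present:
            raise KeyError(f"No document with id {key} found")
        for i in present:
            del self._docs[i]


store = ToyDocStore()
store.add("a", {"text": "hello"})
store.add("b", {"text": "world"})
single = store["a"]
several = store[["a", "b"]]
del store["a"]
```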
__init__(db_config=None, **kwargs)
Initialize ElasticV7DocIndex
Source code in docarray/index/backends/elasticv7.py
build_query(**kwargs)
Build a query for ElasticDocIndex.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
kwargs |  | parameters to forward to QueryBuilder initialization | {} |
Returns:
Type | Description |
---|---|
QueryBuilder | QueryBuilder object |
Source code in docarray/index/backends/elastic.py
configure(runtime_config=None, **kwargs)
Configure the DocumentIndex.
You can either pass a config object to runtime_config or pass individual config parameters as keyword arguments.
If a configuration object is passed, it will replace the current configuration.
If keyword arguments are passed, they will update the current configuration.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
runtime_config |  | the configuration to apply | None |
kwargs |  | individual configuration parameters | {} |
Source code in docarray/index/abstract.py
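The difference between replacing and updating the configuration can be sketched with a plain dataclass. `ToyRuntimeConfig`, its fields, and `ToyConfigurable` are hypothetical names chosen for illustration. The branch order (a passed object takes precedence over keyword arguments) is also an assumption.

```python
from dataclasses import dataclass, replace
from typing import Any, Optional


@dataclass
class ToyRuntimeConfig:
    # Hypothetical fields; the real RuntimeConfig fields are not listed here.
    chunk_size: int = 500
    timeout_s: float = 30.0


class ToyConfigurable:
    def __init__(self) -> None:
        self._runtime_config = ToyRuntimeConfig()

    def configure(
        self, runtime_config: Optional[ToyRuntimeConfig] = None, **kwargs: Any
    ) -> None:
        if runtime_config is None:
            # Keyword arguments update the current configuration in place.
            self._runtime_config = replace(self._runtime_config, **kwargs)
        else:
            # A config object replaces the current configuration entirely.
            self._runtime_config = runtime_config


idx = ToyConfigurable()
idx.configure(chunk_size=100)
after_update = idx._runtime_config
idx.configure(ToyRuntimeConfig(timeout_s=5.0))
after_replace = idx._runtime_config
```

After the update, only chunk_size differs from the defaults. After the replacement, chunk_size is back to its default because the earlier update is discarded along with the old object.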
execute_query(query, *args, **kwargs)
Execute a query on the ElasticDocIndex.
Can take two kinds of inputs:
- A native query of the underlying database. This is meant as a passthrough so that you can enjoy any functionality that is not available through the Document index API.
- The output of this Document index's QueryBuilder.build() method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
query | Dict[str, Any] | the query to execute | required |
Returns:
Type | Description |
---|---|
Any | the result of the query |
Source code in docarray/index/backends/elasticv7.py
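As an illustration of what a native query might look like, the sketch below builds an Elasticsearch 7.x script_score request body that ranks documents by cosine similarity on a dense_vector field. The helper name `knn_script_score_query`, the field name `embedding`, and the choice of cosine similarity are assumptions. Whether execute_query accepts exactly this body shape is not confirmed by this reference.

```python
import json
from typing import Any, Dict, List


def knn_script_score_query(
    vector: List[float], field: str = "embedding", limit: int = 10
) -> Dict[str, Any]:
    """Build an exact (brute-force) vector scoring query as a plain dict."""
    return {
        "size": limit,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    # Adding 1.0 keeps the score non-negative.
                    "source": f"cosineSimilarity(params.query_vector, '{field}') + 1.0",
                    "params": {"query_vector": vector},
                },
            }
        },
    }


body = knn_script_score_query([0.1, 0.2, 0.3], limit=3)
print(json.dumps(body, indent=2))
```

A dictionary of this kind is the sort of input the first bullet above describes as a passthrough to the underlying database.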
filter(filter_query, limit=10, **kwargs)
Find documents in the index based on a filter query
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filter_query | Any | the DB specific filter query to execute | required |
limit | int | maximum number of documents to return | 10 |
Returns:
Type | Description |
---|---|
DocList | a DocList containing the documents that match the filter query |
Source code in docarray/index/abstract.py
filter_batched(filter_queries, limit=10, **kwargs)
Find documents in the index based on multiple filter queries.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filter_queries | Any | the DB specific filter queries to execute | required |
limit | int | maximum number of documents to return | 10 |
Returns:
Type | Description |
---|---|
List[DocList] | a list of DocLists, one per filter query, each containing the documents that match that query |
Source code in docarray/index/abstract.py
filter_subindex(filter_query, subindex, limit=10, **kwargs)
Find documents at the subindex level based on a filter query.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filter_query | Any | the DB specific filter query to execute | required |
subindex | str | name of the subindex to search on | required |
limit | int | maximum number of documents to return | 10 |
Returns:
Type | Description |
---|---|
DocList | a DocList containing the subindex level documents that match the filter query |
Source code in docarray/index/abstract.py
find(query, search_field='', limit=10, **kwargs)
Find documents in the index using nearest neighbor search.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
query | Union[AnyTensor, BaseDoc] | query vector for KNN/ANN search. Can be either a tensor-like (np.array, torch.Tensor, etc.) with a single axis, or a Document | required |
search_field | str | name of the field to search on. Documents in the index are retrieved based on the similarity of this field to the query. | '' |
limit | int | maximum number of documents to return | 10 |
Returns:
Type | Description |
---|---|
FindResult | a named tuple containing |
Source code in docarray/index/abstract.py
find_batched(queries, search_field='', limit=10, **kwargs)
Find documents in the index using nearest neighbor search.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
queries | Union[AnyTensor, DocList] | query vectors for KNN/ANN search. Can be either a tensor-like (np.array, torch.Tensor, etc.) or a DocList. If a tensor-like is passed, it should have shape (batch_size, vector_dim) | required |
search_field | str | name of the field to search on. Documents in the index are retrieved based on the similarity of this field to the query. | '' |
limit | int | maximum number of documents to return per query | 10 |
Returns:
Type | Description |
---|---|
FindResultBatched | a named tuple containing |
Source code in docarray/index/abstract.py
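The shape requirement for batched queries, (batch_size, vector_dim), can be checked before calling find_batched. The helper below is a hypothetical, dependency-free sketch that works on nested Python sequences. With NumPy or PyTorch the same check would normally be done on the tensor's shape attribute.

```python
from typing import Sequence, Tuple


def batch_shape(queries: Sequence[Sequence[float]]) -> Tuple[int, int]:
    """Return (batch_size, vector_dim), or raise ValueError for ragged input."""
    if len(queries) == 0:
        raise ValueError("expected at least one query vector")
    dim = len(queries[0])
    for i, row in enumerate(queries):
        if len(row) != dim:
            raise ValueError(f"row {i} has length {len(row)}, expected {dim}")
    return len(queries), dim


shape = batch_shape([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
print(shape)
```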
find_subindex(query, subindex='', search_field='', limit=10, **kwargs)
Find documents at the subindex level.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
query | Union[AnyTensor, BaseDoc] | query vector for KNN/ANN search. Can be either a tensor-like (np.array, torch.Tensor, etc.) with a single axis, or a Document | required |
subindex | str | name of the subindex to search on | '' |
search_field | str | name of the field to search on | '' |
limit | int | maximum number of documents to return | 10 |
Returns:
Type | Description |
---|---|
SubindexFindResult | a named tuple containing root docs, subindex docs and scores |
Source code in docarray/index/abstract.py
index(docs, **kwargs)
Index Documents into the index.
Note
Passing a sequence of Documents that is not a DocList (such as a List of Docs) comes at a performance penalty. This is because the Index needs to check compatibility between itself and the data. With a DocList as input this is a single check; for other inputs compatibility needs to be checked for every Document individually.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
docs | Union[BaseDoc, Sequence[BaseDoc]] | Documents to index. | required |
Source code in docarray/index/abstract.py
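The performance note above comes down to how many compatibility checks are needed. A toy sketch of the idea follows, assuming a hypothetical `TypedList` container that records its element type once, loosely analogous to a typed DocList. The function simply counts the checks performed and is not how docarray validates input.

```python
from typing import Iterable, Type


class TypedList(list):
    """Hypothetical container that knows its element type up front."""

    def __init__(self, doc_type: Type, items: Iterable = ()) -> None:
        super().__init__(items)
        self.doc_type = doc_type


def count_compat_checks(docs: Iterable, schema: Type) -> int:
    """Validate docs against schema and return the number of checks made."""
    if isinstance(docs, TypedList):
        # One check on the declared element type covers every item.
        if not issubclass(docs.doc_type, schema):
            raise TypeError("container type is incompatible with the schema")
        return 1
    checks = 0
    for doc in docs:
        # Untyped sequences need one check per document.
        checks += 1
        if not isinstance(doc, schema):
            raise TypeError("document is incompatible with the schema")
    return checks


class Doc:
    pass


plain = [Doc() for _ in range(1000)]
checks_plain = count_compat_checks(plain, Doc)
checks_typed = count_compat_checks(TypedList(Doc, plain), Doc)
```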
num_docs()
python_type_to_db_type(python_type)
Map python type to database type. Takes any python type and returns the corresponding database column type.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
python_type | Type | a python type. | required |
Returns:
Type | Description |
---|---|
Any | the corresponding database column type, or None if |
Source code in docarray/index/backends/elastic.py
subindex_contains(item)
Checks if a given BaseDoc item is contained in the index or any of its subindices.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
item | BaseDoc | the given BaseDoc | required |
Returns:
Type | Description |
---|---|
bool | True if the given BaseDoc item is contained in the index/subindices, False otherwise |
Source code in docarray/index/abstract.py
text_search(query, search_field='', limit=10, **kwargs)
Find documents in the index based on a text search query.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
query | Union[str, BaseDoc] | The text to search for | required |
search_field | str | name of the field to search on | '' |
limit | int | maximum number of documents to return | 10 |
Returns:
Type | Description |
---|---|
FindResult | a named tuple containing |
Source code in docarray/index/abstract.py
text_search_batched(queries, search_field='', limit=10, **kwargs)
Find documents in the index based on multiple text search queries.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
queries | Union[Sequence[str], Sequence[BaseDoc]] | The texts to search for | required |
search_field | str | name of the field to search on | '' |
limit | int | maximum number of documents to return | 10 |
Returns:
Type | Description |
---|---|
FindResultBatched | a named tuple containing |