reduce
docarray.utils.reduce
reduce(left, right, left_id_map=None)
Reduces the left and right DocLists into a single DocList in place. Changes are applied to the left DocList. Reducing two DocLists means adding each Document from the right DocList to the left DocList if it is not already present. If a Document exists in both DocLists (matched by ID), its data properties are merged, with priority given to the left Document.
Nested DocLists are also reduced in the same way.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
left | DocList | First DocList to be reduced. Changes are applied to it in place. | required |
right | DocList | Second DocList to be reduced. | required |
left_id_map | Optional[Dict] | Optional map from Document ID to its offset in the left DocList. Pass it in repeated calls to avoid rebuilding the map each time. | None |
Returns:
Type | Description |
---|---|
DocList | The reduced DocList |
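The merge rules described above can be sketched in plain Python. This is a simplified model, not the code in docarray/utils/reduce.py: Documents are represented as plain dicts, `reduce_sketch` is a hypothetical name, an "unset" field is assumed to be one whose value is `None`, and nested DocLists are not handled.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for a Document: a dict with an "id" key plus data fields.
Doc = Dict[str, object]


def reduce_sketch(
    left: List[Doc],
    right: List[Doc],
    left_id_map: Optional[Dict[str, int]] = None,
) -> List[Doc]:
    """Merge `right` into `left` in place, following the documented rules."""
    # Build the ID -> offset map unless the caller supplied one.
    if left_id_map is None:
        left_id_map = {doc["id"]: i for i, doc in enumerate(left)}
    for doc in right:
        offset = left_id_map.get(doc["id"])
        if offset is None:
            # Unknown ID: append a copy and record its offset in the map.
            left_id_map[doc["id"]] = len(left)
            left.append(dict(doc))
        else:
            target = left[offset]
            for key, value in doc.items():
                # Left has priority: only fill fields that are unset (None) on the left.
                if target.get(key) is None:
                    target[key] = value
    return left


left = [{"id": "a", "text": "left a", "score": None}]
right = [
    {"id": "a", "text": "right a", "score": 0.5},
    {"id": "b", "text": "right b", "score": None},
]
result = reduce_sketch(left, right)
```

In this model, the Document with ID "a" keeps its left-hand `text` and picks up `score` from the right, while "b" is appended. Passing the same `left_id_map` to successive calls avoids rebuilding the map for every merge, which is the optimization the parameter description refers to.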
Source code in docarray/utils/reduce.py
reduce_all(docs)
Reduces a list of DocLists into one DocList. Changes are applied to the first DocList in-place.
The resulting DocList contains the Documents of all DocLists. If a Document (identified by its ID) appears in several DocLists, its data properties are merged with priority given to the left-most DocList: if a data attribute is set in more than one copy of the Document, the value from the left-most DocList is kept. Nested DocLists that appear in several DocLists are reduced in the same way.
Note
- The order of nested DocLists does not follow any specific rule. You might want to re-sort them in a later step.
- The final result depends on the order in which the DocLists are passed to the reduction.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
docs | List[DocList] | List of DocLists to be reduced | required |
Returns:
Type | Description |
---|---|
DocList | The resulting DocList |
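To illustrate why the result depends on the order of the inputs, here is a minimal plain-Python model of the reduction as a left-to-right fold. It is a sketch under stated assumptions, not the library's implementation: Documents are plain dicts, `reduce_all_sketch` and `make_lists` are hypothetical names, "unset" is assumed to mean `None`, and nested DocLists are ignored.

```python
from typing import Dict, List

# Hypothetical stand-in for a Document: a dict with an "id" key plus data fields.
Doc = Dict[str, object]


def reduce_all_sketch(doc_lists: List[List[Doc]]) -> List[Doc]:
    """Fold every list into the first one, in place, left to right."""
    first, *others = doc_lists
    # One shared ID -> offset map is reused across all merges.
    id_map = {doc["id"]: i for i, doc in enumerate(first)}
    for other in others:
        for doc in other:
            offset = id_map.get(doc["id"])
            if offset is None:
                # Unknown ID: append a copy and record its offset.
                id_map[doc["id"]] = len(first)
                first.append(dict(doc))
            else:
                target = first[offset]
                for key, value in doc.items():
                    # Earlier (left-most) values win; only unset fields are filled.
                    if target.get(key) is None:
                        target[key] = value
    return first


def make_lists():
    a = [{"id": "x", "tag": None}]
    b = [{"id": "x", "tag": "from b"}]
    c = [{"id": "x", "tag": "from c"}, {"id": "y", "tag": "only c"}]
    return a, b, c


a, b, c = make_lists()
forward = reduce_all_sketch([a, b, c])      # "x" takes its tag from b

a2, b2, c2 = make_lists()
backward = reduce_all_sketch([c2, b2, a2])  # "x" keeps its tag from c
```

Both calls produce Documents with the same IDs, but the `tag` of "x" differs because the first list that sets the attribute determines its value.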