Parsing Objects & Resolving References
Objects are parsed twice:
First, closest to disk, immediately after reading in the byte blob, all non-reference props are parsed and returned as their respective Golang types (e.g.
*models.PhoneNumber).
A second time, at the root level of the
db.DB type, the whole request is parsed again (recursively) and cross-refs are resolved as requested by the user (through
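The first, close-to-disk pass can be illustrated with a minimal sketch. It assumes JSON as the stored encoding and a hypothetical `schema` map from prop name to data type; the `PhoneNumber` and `Ref` structs and the `parseStored` function are illustrative stand-ins, not the actual storage format or API. The point is only that typed props are materialized right away while references are carried along unresolved.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PhoneNumber is a simplified stand-in for *models.PhoneNumber.
type PhoneNumber struct {
	Input          string `json:"input"`
	DefaultCountry string `json:"defaultCountry"`
}

// Ref is a hypothetical unresolved reference; only its beacon is known.
type Ref struct {
	Beacon string `json:"beacon"`
}

// parseStored is the "close to disk" pass: it turns a raw blob into typed
// props but deliberately leaves references unresolved. The schema argument
// (prop name -> data type) is an assumption made for this sketch.
func parseStored(blob []byte, schema map[string]string) (map[string]interface{}, error) {
	var raw map[string]json.RawMessage
	if err := json.Unmarshal(blob, &raw); err != nil {
		return nil, err
	}
	props := make(map[string]interface{}, len(raw))
	for name, msg := range raw {
		switch schema[name] {
		case "phoneNumber":
			var p PhoneNumber
			if err := json.Unmarshal(msg, &p); err != nil {
				return nil, fmt.Errorf("prop %q: %w", name, err)
			}
			props[name] = &p
		case "ref":
			var refs []Ref
			if err := json.Unmarshal(msg, &refs); err != nil {
				return nil, fmt.Errorf("prop %q: %w", name, err)
			}
			props[name] = refs // kept as pointers for the second pass
		default:
			var v interface{}
			if err := json.Unmarshal(msg, &v); err != nil {
				return nil, fmt.Errorf("prop %q: %w", name, err)
			}
			props[name] = v
		}
	}
	return props, nil
}

func main() {
	blob := []byte(`{"name":"Ada","phone":{"input":"030 1234567","defaultCountry":"DE"},"livesIn":[{"beacon":"example-beacon-1"}]}`)
	schema := map[string]string{"phone": "phoneNumber", "livesIn": "ref"}
	props, err := parseStored(blob, schema)
	if err != nil {
		panic(err)
	}
	// Print the Go type each prop was parsed into.
	fmt.Printf("%T %T %T\n", props["name"], props["phone"], props["livesIn"])
}
```

In this sketch the reference prop stays a list of beacons, so the result is usable in isolation and can be enriched later by a caller that sees the whole request.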
Motivation behind split-parsing
Generally, shards (and also indices) are self-contained units. It is thus
natural that they return objects which work in isolation and can be interpreted
by the rest of the application (usually in the form of a
search.Results, both defined as
However, cross-references aren't predictable. They could point to an item in
another shard or even to an item in another index (because it belongs to a
different Class). When running in multi-node mode (horizontal replication),
the shards could be distributed on any node in the cluster.
Furthermore, it is more efficient (see cached resolver) to resolve references for a list of objects than for a single object. At shard level, we do not know whether a specific object is part of a list, or whether this list spans shards or indices.
Thus the second parsing - to enrich the desired cross-references - happens at
the outermost layer of the persistence package in the
assembling the index/shards parts.
Cached Resolver Logic
The cached resolver is a helper struct with a two-step process:
Cacher: The input object list (in the form of a
search.Results) is analyzed for references. This is a recursive process, as each resolved reference might point to another object which the user (as specified through the
traverser.SelectProperties) wants to resolve. However, step 1 ("the cacher") stores all results in a flat list (technically a map). This limits complexity: only the "finding references" part is recursive, while the storage part stays simple.
Resolver: In a second step, the schema is parsed recursively again, and each reference pointer (in the form of a
Beacon string) is replaced with the resolved reference content (in the form of a
search.LocalRef). If the result again contains reference pointers to other objects, these are resolved in the same fashion, recursively, until everything the user requested is resolved.
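The two steps can be sketched in a few lines of Go. This is a simplified illustration, not the actual implementation: `Beacon`, `Object`, `Select`, `cacher`, and `resolve` are hypothetical stand-ins for the real types (such as search.Results, traverser.SelectProperties, and search.LocalRef), and the `lookup` function stands in for whatever fetches an object from storage.

```go
package main

import "fmt"

// Beacon is a hypothetical stand-in for a reference pointer to another object.
type Beacon string

// Object is a simplified stored object; props hold plain values or Beacons.
type Object map[string]interface{}

// Select is a simplified property selection: for each reference prop the
// user wants resolved, it holds the nested selection for the target object.
type Select map[string]Select

// cacher collects every requested reference target into a flat map.
type cacher struct {
	lookup func(Beacon) (Object, bool) // hypothetical storage lookup
	store  map[Beacon]Object
}

// build finds references recursively, but stores the results flat.
func (c *cacher) build(objs []Object, sel Select) {
	for _, obj := range objs {
		for prop, nested := range sel {
			b, ok := obj[prop].(Beacon)
			if !ok {
				continue // prop missing or not a reference
			}
			if _, seen := c.store[b]; !seen {
				target, found := c.lookup(b)
				if !found {
					continue // dangling reference, nothing to cache
				}
				c.store[b] = target
			}
			// The target may itself contain references the user asked for.
			c.build([]Object{c.store[b]}, nested)
		}
	}
}

// resolve returns a copy of obj in which each selected Beacon is replaced
// with the cached target, itself resolved using the nested selection.
func resolve(obj Object, sel Select, cache map[Beacon]Object) Object {
	out := make(Object, len(obj))
	for k, v := range obj {
		out[k] = v
	}
	for prop, nested := range sel {
		b, ok := obj[prop].(Beacon)
		if !ok {
			continue
		}
		if target, found := cache[b]; found {
			out[prop] = resolve(target, nested, cache)
		}
	}
	return out
}

func main() {
	db := map[Beacon]Object{
		"city/1":    {"name": "Berlin", "inCountry": Beacon("country/1")},
		"country/1": {"name": "Germany"},
	}
	c := &cacher{
		lookup: func(b Beacon) (Object, bool) { o, ok := db[b]; return o, ok },
		store:  map[Beacon]Object{},
	}
	people := []Object{
		{"name": "Ada", "livesIn": Beacon("city/1")},
		{"name": "Bob", "livesIn": Beacon("city/1")},
	}
	sel := Select{"livesIn": {"inCountry": {}}}

	c.build(people, sel) // step 1: fill the flat cache
	for _, p := range people {
		fmt.Println(resolve(p, sel, c.store)) // step 2: replace pointers
	}
}
```

Because the cache is flat and keyed by beacon, both people in this example share a single lookup for the city and a single lookup for the country, which is where resolving a whole list pays off compared to resolving objects one by one.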
- The reference Cacher and its unit tests
- The reference Resolver and its unit tests
- Integration tests for nested refs and refs of different types