Details
Description
Currently, virtual nodes use the regular Java UUID generation method. That UUID generation is synchronized and somewhat slow. For normal (real) nodes it is fast enough and causes no
problem. For virtual nodes, however, it can become a blocking point, because a large number of nodes is sometimes generated in a short time. I have also seen this behavior during stress testing with several different sites.
I would therefore like to propose generating the UUIDs for virtual
nodes differently, in a non-synchronized way, by exploiting cluster node
locality and using namespace prefixing.
For example:
- all virtual node uuids start with "cafeface-"
- virtual nodes on a cluster node are generated incrementally from
"cafeface-0000-0000-0000-000000000000" to
"cafeface-ffff-ffff-ffff-ffffffffffff"
The advantages are:
- faster (parallel) creation of the virtual layer
- easy and safe isVirtual() check: just do
node.getIdentifier().startsWith("cafeface-") (also works on the
nodeState level in the itemstatemanagers)
- easy to check and scan the db for wrongly persisted virtual nodes in
the consistencyCheck
- easy for the human eye to recognize, for example in log files
- prevents UUID collisions with real nodes
- very easy to implement (just add one method to the HippoNodeId)
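A minimal sketch of what such a generator could look like, assuming a lock-free counter per cluster node. The class name VirtualNodeIdGenerator and its next() method are hypothetical and are not the actual HippoNodeId code. The sketch also assumes that the counter may restart from zero with the JVM and that 64 bits of counter are enough, so the two groups directly after the prefix stay zero.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper for illustration; not the actual HippoNodeId code.
class VirtualNodeIdGenerator {
    // Prefix taken from the proposal above.
    static final String PREFIX = "cafeface-";

    // Lock-free counter, local to this cluster node (JVM).
    private final AtomicLong counter = new AtomicLong();

    // Returns the next identifier in the sequence, starting at
    // cafeface-0000-0000-0000-000000000000.
    String next() {
        long n = counter.getAndIncrement();
        // The 64-bit counter fills the last two groups (4 + 12 hex digits);
        // the two groups after the prefix stay zero in this sketch.
        return String.format("%s0000-0000-%04x-%012x",
                PREFIX, (n >>> 48) & 0xFFFFL, n & 0xFFFFFFFFFFFFL);
    }

    public static void main(String[] args) {
        VirtualNodeIdGenerator generator = new VirtualNodeIdGenerator();
        System.out.println(generator.next());
        String second = generator.next();
        System.out.println(second);
        // The result still parses as a syntactically valid UUID.
        System.out.println(UUID.fromString(second));
    }
}
```

AtomicLong.getAndIncrement() needs no synchronized block, so many threads can request identifiers in parallel without waiting on a shared lock.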
Comment Berry:
Your idea to boost performance has no drawbacks. However, NO-ONE should
ever make ANY assumption about the form of the IDs, except the
deep internals of the repository. So no isVirtual check, because that will be
copied over and over.