Commit Graph

1 Commits

Author SHA1 Message Date
boatbomber
85655ca84f Heuristic hydration matching instead of hard dependency on GetChildren ordering (#1266)
### Summary

When two or more sibling instances share the same `Name` and
`ClassName`, Rojo's reconciler previously paired them with their
server-side counterparts purely by child order (first-unvisited match in
`GetChildren()` order). If the workspace child order ever diverged from
the server's, the wrong instance got paired so each duplicate could
inherit a sibling's properties.

This is the root of the #1257 bug: the welded parts would oscillate
between positions on each connect/disconnect because hydration kept
mis-pairing them. (#1265 stopped the sync fallback from scrambling child
order in the first place but this PR makes hydration robust even when
order *does* diverge.)

This PR makes `hydrate` break ties by comparing properties: when several
existing children match on `Name`+`ClassName`, it scores each candidate
by how many of the virtual instance's properties match the candidate's
live values, and picks the best. Order remains the tiebreak when scores
are equal, so behavior is unchanged for uniquely-named instances and for
indistinguishable siblings.

### Changes

- `trueEquals.lua`: extracted verbatim from `diff.lua` (the fuzzy
value-equality helper) so it can be shared. No behavior change;
`diff.lua` now requires it.
- `countMatchingProperties.lua`: added
`countMatchingProperties(instance, virtualInstance, instanceMap) ->
number`. Skips `Ref` properties (the instanceMap isn't fully built
mid-hydrate, so refs can't be decoded reliably, and they're a poor
disambiguator anyway) and any property that can't be read or decoded.
- `hydrate.lua`: See details below.

### Hydrate Changes

This touches `hydrate`, which runs over the whole tree on every
connect/resync, so I want state clearly that **the common path is faster
than before, not slower** even for parents with thousands of children!

The old algorithm was a nested scan: for each of `V` virtual children,
scan existing children until the first unvisited `Name`+`ClassName`
match. Two costs stand out:
- A `pcall` (to guard DataModel permission errors) ran on every
comparison (up to `V*E` `pcall`s per parent).
- Even for in-order trees the re-scanning of the visited prefix made it
`O(V^2)`.

The new algorithm does a single bucketing pass, then `O(1)` lookups:
1. One `O(E)` pass groups existing children into nested
`buckets[name][className]` tables. This runs exactly `E` `pcall`s total
(one per child), down from the `V*E` worst case.
2. Each virtual child does an `O(1)` bucket lookup to find its
candidates.
3. A per-bucket cursor skips already-paired children, so order-based
matching is amortized `O(1)` per child instead of rescanning.

| Scenario | Old | New |
| ------------------------------------------------ |
--------------------------------------- |
--------------------------------------------- |
| Unique-named children (typical, incl. thousands) | `O(V^2)`, plus up
to `V*E` pcalls | `O(V + E)`, plus exactly `E` pcalls |
| `C <= 32` candidates | `O(C^2)` | `O(P * C^2)` scoring |
| `C > 32` candidates | `O(C^2)` | `O(C)` |

Property scoring (`getProperty`/`decodeValue`/`trueEquals`) is the only
new expense, and it's gated two ways:
- It runs only when a `Name`+`ClassName` group has >=2 candidates (i.e.
never for uniquely-named instances).
- A cap, `MAX_CANDIDATES_TO_SCORE = 32`, means scoring only kicks in
once a group has <=32 unvisited candidates. A folder of thousands of
identically-named parts therefore does not trigger scoring; it falls
back to the original order-based pairing. The worst-case added scoring
work is bounded to roughly `32^2` property comparisons per group,
independent of group size.

So overall this is faster when you have unique names or many children.
It is slower but more robust when you have small groups of duplicate
names. Memory usage is increased as it creates the candidate buckets.
2026-06-01 17:29:22 -07:00