ari-party/rojo

Fork 0

forked from rojo-rbx/rojo

Commit Graph

Author	SHA1	Message	Date
boatbomber	85655ca84f	Heuristic hydration matching instead of hard dependency on GetChildren ordering (#1266 ) ### Summary When two or more sibling instances share the same `Name` and `ClassName`, Rojo's reconciler previously paired them with their server-side counterparts purely by child order (first-unvisited match in `GetChildren()` order). If the workspace child order ever diverged from the server's, the wrong instance got paired so each duplicate could inherit a sibling's properties. This is the root of the #1257 bug: the welded parts would oscillate between positions on each connect/disconnect because hydration kept mis-pairing them. (#1265 stopped the sync fallback from scrambling child order in the first place but this PR makes hydration robust even when order does diverge.) This PR makes `hydrate` break ties by comparing properties: when several existing children match on `Name`+`ClassName`, it scores each candidate by how many of the virtual instance's properties match the candidate's live values, and picks the best. Order remains the tiebreak when scores are equal, so behavior is unchanged for uniquely-named instances and for indistinguishable siblings. ### Changes - `trueEquals.lua`: extracted verbatim from `diff.lua` (the fuzzy value-equality helper) so it can be shared. No behavior change; `diff.lua` now requires it. - `countMatchingProperties.lua`: added `countMatchingProperties(instance, virtualInstance, instanceMap) -> number`. Skips `Ref` properties (the instanceMap isn't fully built mid-hydrate, so refs can't be decoded reliably, and they're a poor disambiguator anyway) and any property that can't be read or decoded. - `hydrate.lua`: See details below. ### Hydrate Changes This touches `hydrate`, which runs over the whole tree on every connect/resync, so I want state clearly that the common path is faster than before, not slower even for parents with thousands of children! The old algorithm was a nested scan: for each of `V` virtual children, scan existing children until the first unvisited `Name`+`ClassName` match. Two costs stand out: - A `pcall` (to guard DataModel permission errors) ran on every comparison (up to `VE` `pcall`s per parent). - Even for in-order trees the re-scanning of the visited prefix made it `O(V^2)`. The new algorithm does a single bucketing pass, then `O(1)` lookups: 1. One `O(E)` pass groups existing children into nested `buckets[name][className]` tables. This runs exactly `E` `pcall`s total (one per child), down from the `VE` worst case. 2. Each virtual child does an `O(1)` bucket lookup to find its candidates. 3. A per-bucket cursor skips already-paired children, so order-based matching is amortized `O(1)` per child instead of rescanning. \| Scenario \| Old \| New \| \| ------------------------------------------------ \| --------------------------------------- \| --------------------------------------------- \| \| Unique-named children (typical, incl. thousands) \| `O(V^2)`, plus up to `VE` pcalls \| `O(V + E)`, plus exactly `E` pcalls \| \| `C <= 32` candidates \| `O(C^2)` \| `O(P C^2)` scoring \| \| `C > 32` candidates \| `O(C^2)` \| `O(C)` \| Property scoring (`getProperty`/`decodeValue`/`trueEquals`) is the only new expense, and it's gated two ways: - It runs only when a `Name`+`ClassName` group has >=2 candidates (i.e. never for uniquely-named instances). - A cap, `MAX_CANDIDATES_TO_SCORE = 32`, means scoring only kicks in once a group has <=32 unvisited candidates. A folder of thousands of identically-named parts therefore does not trigger scoring; it falls back to the original order-based pairing. The worst-case added scoring work is bounded to roughly `32^2` property comparisons per group, independent of group size. So overall this is faster when you have unique names or many children. It is slower but more robust when you have small groups of duplicate names. Memory usage is increased as it creates the candidate buckets.	2026-06-01 17:29:22 -07:00

Author

SHA1

Message

Date

boatbomber

85655ca84f

Heuristic hydration matching instead of hard dependency on GetChildren ordering (#1266 )

### Summary

When two or more sibling instances share the same `Name` and
`ClassName`, Rojo's reconciler previously paired them with their
server-side counterparts purely by child order (first-unvisited match in
`GetChildren()` order). If the workspace child order ever diverged from
the server's, the wrong instance got paired so each duplicate could
inherit a sibling's properties.

This is the root of the #1257 bug: the welded parts would oscillate
between positions on each connect/disconnect because hydration kept
mis-pairing them. (#1265 stopped the sync fallback from scrambling child
order in the first place but this PR makes hydration robust even when
order *does* diverge.)

This PR makes `hydrate` break ties by comparing properties: when several
existing children match on `Name`+`ClassName`, it scores each candidate
by how many of the virtual instance's properties match the candidate's
live values, and picks the best. Order remains the tiebreak when scores
are equal, so behavior is unchanged for uniquely-named instances and for
indistinguishable siblings.

### Changes

- `trueEquals.lua`: extracted verbatim from `diff.lua` (the fuzzy
value-equality helper) so it can be shared. No behavior change;
`diff.lua` now requires it.
- `countMatchingProperties.lua`: added
`countMatchingProperties(instance, virtualInstance, instanceMap) ->
number`. Skips `Ref` properties (the instanceMap isn't fully built
mid-hydrate, so refs can't be decoded reliably, and they're a poor
disambiguator anyway) and any property that can't be read or decoded.
- `hydrate.lua`: See details below.

### Hydrate Changes

This touches `hydrate`, which runs over the whole tree on every
connect/resync, so I want state clearly that **the common path is faster
than before, not slower** even for parents with thousands of children!

The old algorithm was a nested scan: for each of `V` virtual children,
scan existing children until the first unvisited `Name`+`ClassName`
match. Two costs stand out:
- A `pcall` (to guard DataModel permission errors) ran on every
comparison (up to `V*E` `pcall`s per parent).
- Even for in-order trees the re-scanning of the visited prefix made it
`O(V^2)`.

The new algorithm does a single bucketing pass, then `O(1)` lookups:
1. One `O(E)` pass groups existing children into nested
`buckets[name][className]` tables. This runs exactly `E` `pcall`s total
(one per child), down from the `V*E` worst case.
2. Each virtual child does an `O(1)` bucket lookup to find its
candidates.
3. A per-bucket cursor skips already-paired children, so order-based
matching is amortized `O(1)` per child instead of rescanning.

| Scenario | Old | New |
| ------------------------------------------------ |
--------------------------------------- |
--------------------------------------------- |
| Unique-named children (typical, incl. thousands) | `O(V^2)`, plus up
to `V*E` pcalls | `O(V + E)`, plus exactly `E` pcalls |
| `C <= 32` candidates | `O(C^2)` | `O(P * C^2)` scoring |
| `C > 32` candidates | `O(C^2)` | `O(C)` |

Property scoring (`getProperty`/`decodeValue`/`trueEquals`) is the only
new expense, and it's gated two ways:
- It runs only when a `Name`+`ClassName` group has >=2 candidates (i.e.
never for uniquely-named instances).
- A cap, `MAX_CANDIDATES_TO_SCORE = 32`, means scoring only kicks in
once a group has <=32 unvisited candidates. A folder of thousands of
identically-named parts therefore does not trigger scoring; it falls
back to the original order-based pairing. The worst-case added scoring
work is bounded to roughly `32^2` property comparisons per group,
independent of group size.

So overall this is faster when you have unique names or many children.
It is slower but more robust when you have small groups of duplicate
names. Memory usage is increased as it creates the candidate buckets.

2026-06-01 17:29:22 -07:00

1 Commits