The most important factor here is that curve timelines (the most commonly used kind) keep their frame data in a single buffer unless the curve is a Bezier one, and most often it is not. Spine-c, on the other hand, uses two buffers: one for curve data and one for frame data. See https://github.com/Chobolabs/spine-cpp/blob/master/include/spinecpp/Timelines.h for how the frames of curve timelines inherit from CurveFrame. In our implementation a Bezier curve would still cause an L1 cache miss for each frame, but most curves are not Bezier.
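To illustrate the single-buffer idea, here is a minimal sketch. It is not the actual spine-cpp code: `RotateFrame`, `RotateTimeline`, `bezierIndex`, `bezierData`, and the lookup-table treatment of Bezier curves are assumptions made for the example. Only the idea of frames inheriting from a `CurveFrame` base comes from the header linked above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout for illustration; the real declarations live in Timelines.h.
enum class CurveType : std::uint8_t { Linear, Stepped, Bezier };

// The curve type is stored inline with the frame, so linear and stepped
// frames never need a second buffer.
struct CurveFrame {
    float time = 0.0f;
    CurveType curve = CurveType::Linear;
    std::uint32_t bezierIndex = 0;  // offset into bezierData, used only for Bezier
};

struct RotateFrame : CurveFrame {
    float angle = 0.0f;
};

struct RotateTimeline {
    static constexpr std::size_t kSamples = 8;  // assumed table size per Bezier frame

    std::vector<RotateFrame> frames;  // one dense buffer, sorted by time
    std::vector<float> bezierData;    // touched only when a frame is Bezier

    // Maps linear progress p in [0, 1] to eased progress for frame f.
    float curvePercent(const RotateFrame& f, float p) const {
        switch (f.curve) {
        case CurveType::Linear:
            return p;
        case CurveType::Stepped:
            return 0.0f;
        case CurveType::Bezier: {
            // Simplification: a table of eased values at uniform steps of p.
            // This is the extra memory access that non-Bezier frames avoid.
            const float* table = &bezierData[f.bezierIndex];
            float pos = p * (kSamples - 1);
            std::size_t i = static_cast<std::size_t>(pos);
            if (i >= kSamples - 1) return table[kSamples - 1];
            float frac = pos - static_cast<float>(i);
            return table[i] + (table[i + 1] - table[i]) * frac;
        }
        }
        return p;
    }

    // Returns the interpolated angle at time t.
    float sample(float t) const {
        if (frames.empty()) return 0.0f;
        if (t <= frames.front().time) return frames.front().angle;
        if (t >= frames.back().time) return frames.back().angle;
        std::size_t i = 0;
        while (frames[i + 1].time <= t) ++i;  // linear search keeps the sketch short
        const RotateFrame& a = frames[i];
        const RotateFrame& b = frames[i + 1];
        float p = (t - a.time) / (b.time - a.time);
        return a.angle + (b.angle - a.angle) * curvePercent(a, p);
    }
};
```

For linear and stepped frames, `sample` reads only from `frames`, so everything needed for the interpolation sits in adjacent memory.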
The other thing is that the bones of a skeleton are in a single dense buffer (instead of being individually allocated, as in spine-c). Iterating through the bones therefore incurs fewer L1 cache misses, since large chunks of the buffer are present in the L1 cache at the same time.
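A rough sketch of what a dense bone buffer can look like follows. The `Bone` and `Skeleton` types here are hypothetical and much simpler than the real ones: they handle translation only, and parents are referenced by index on the assumption that a parent is always stored before its children.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, translation-only bone; real bones also carry rotation and scale.
struct Bone {
    float x = 0.0f, y = 0.0f;            // local offset from the parent
    float worldX = 0.0f, worldY = 0.0f;  // computed world position
    int parent = -1;                     // index into the same buffer, -1 for the root
};

struct Skeleton {
    // Bones are stored by value, so iteration walks contiguous memory
    // instead of chasing one pointer per bone.
    std::vector<Bone> bones;

    // Assumes every parent appears before its children in the buffer.
    void updateWorldTransforms() {
        for (Bone& b : bones) {
            if (b.parent < 0) {
                b.worldX = b.x;
                b.worldY = b.y;
            } else {
                const Bone& p = bones[static_cast<std::size_t>(b.parent)];
                b.worldX = p.worldX + b.x;
                b.worldY = p.worldY + b.y;
            }
        }
    }
};
```

The pointer-based alternative would be something like `std::vector<Bone*>` with each bone allocated separately, where every iteration step may land on a different cache line.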
There are other minor examples, but our goal was to localize most data structures. We always try to have a buffer of values instead of a buffer of pointers to values.
In the near future we plan to cache-localize timelines as a whole by adding a special timeline allocator. This hasn't been tested yet, but we expect it to gain us a few more percent of performance.
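The allocator itself is not described here, so the following is only one possible shape for it: a simple bump (arena) allocator that hands out consecutive slices of one block, so that the frame buffers of different timelines end up next to each other in memory. The `TimelineArena` name and its interface are assumptions made for this sketch, not planned API.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>

// Hypothetical bump allocator; everything is released at once when the arena dies.
class TimelineArena {
public:
    explicit TimelineArena(std::size_t capacity)
        : storage_(new std::byte[capacity]), capacity_(capacity) {}

    // Returns value-initialized storage for `count` objects of T,
    // or nullptr if the arena does not have enough room left.
    template <typename T>
    T* allocate(std::size_t count) {
        static_assert(std::is_trivially_destructible<T>::value,
                      "the arena never runs destructors");
        static_assert(alignof(T) <= alignof(std::max_align_t),
                      "over-aligned types are not supported");
        // Round the current offset up to the alignment of T.
        std::size_t aligned = (used_ + alignof(T) - 1) / alignof(T) * alignof(T);
        if (aligned > capacity_) return nullptr;
        if (count > (capacity_ - aligned) / sizeof(T)) return nullptr;
        T* first = reinterpret_cast<T*>(storage_.get() + aligned);
        for (std::size_t i = 0; i < count; ++i) {
            ::new (static_cast<void*>(first + i)) T();
        }
        used_ = aligned + count * sizeof(T);
        return first;
    }

    std::size_t used() const { return used_; }

private:
    std::unique_ptr<std::byte[]> storage_;
    std::size_t capacity_;
    std::size_t used_ = 0;
};
```

With such an arena, all timelines of an animation could allocate their frame buffers from the same block. Whether this actually yields the expected gain would have to be measured.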