
perf(NODE-6356): Improve serialization performance #709

Merged: 3 commits into mongodb:main on Sep 16, 2024

Conversation

@SeanReece (Contributor) commented on Aug 22, 2024

Description

This PR improves serialization (and calculateSize) performance by 20-40% in my tests using MFlix documents.

| Test Case | Ops/sec (previous) | Ops/sec (new) | Improvement (%) |
| --- | --- | --- | --- |
| serialize | 3,102,374 | 3,752,681 | 20.96% |
| calculateSize | 23,983,120 | 33,635,923 | 40.25% |
| serialize large doc | 719,650 | 926,643 | 28.76% |
  • Object.prototype.toString.call(x) seems to be quite expensive
  • Re-order the serialization type checks to move these calls out of hot paths
  • Add a WeakMap cache for repeated calls, so calling isDate(x), isRegExp(x), and isUint8Array(x) on the same value results in only one Object.prototype.toString.call(x) (see the sketch after this list)
  • Add a global symbol constant so the symbol registry doesn't have to be searched for every property
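
Below is a minimal, self-contained sketch of the WeakMap-cache idea from the list above. Names and exact checks are illustrative only, not copied from the bson source:

```typescript
// Sketch only: memoize the Object.prototype.toString tag per object so that
// repeated isDate / isRegExp / isUint8Array checks on the same value pay the
// expensive toString call at most once.
const tagCache = new WeakMap<object, string>();

function getTag(value: object): string {
  let tag = tagCache.get(value);
  if (tag === undefined) {
    tag = Object.prototype.toString.call(value);
    tagCache.set(value, tag);
  }
  return tag;
}

function isDate(value: unknown): value is Date {
  return typeof value === 'object' && value !== null && getTag(value) === '[object Date]';
}

function isRegExp(value: unknown): value is RegExp {
  return typeof value === 'object' && value !== null && getTag(value) === '[object RegExp]';
}

function isUint8Array(value: unknown): value is Uint8Array {
  return typeof value === 'object' && value !== null && getTag(value) === '[object Uint8Array]';
}
```

With this in place, a serializer that probes the same value against several types in sequence only computes the tag once per object, and because the cache is a WeakMap it does not prevent cached objects from being garbage collected.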

What is changing?

It may look like a lot of changes, but most of the diff is just a re-ordering of the type checks during serialization. There are no functional changes, and all tests pass.

Is there new documentation needed for these changes?

None

What is the motivation for this change?

Release Highlight

Serialization performance improved.

BSON serialization has been optimized across the board, resulting in gains of up to 20% when serializing a sample of MFlix documents. Many thanks to @SeanReece for the contribution.

Double check the following

  • Ran npm run check:lint script
  • Self-review completed using the steps outlined here
  • PR title follows the correct format: type(NODE-xxxx)[!]: description
    • Example: feat(NODE-1234)!: rewriting everything in coffeescript
  • Changes are covered by tests
  • New TODOs have a related JIRA ticket

@SeanReece changed the title from "perf: Improve serialization performance" to "perf(NODE-6356): Improve serialization performance" on Sep 3, 2024
@durran self-assigned this on Sep 5, 2024
@durran (Member) commented on Sep 5, 2024

Hi @SeanReece, just a note: I rebased your branch for this PR to pick up our latest perf benchmark updates. Now we can check the results directly in CI without needing to run the tests manually.

Review thread on src/parser/utils.ts (outdated, resolved)
@durran (Member) commented on Sep 5, 2024

In the PR's current state, these are the standouts from our benchmarks:

Notable winners:

| Test Case | MB/sec (prev) | MB/sec (new) | Improvement (%) |
| --- | --- | --- | --- |
| timestamp_array_1000._serialize_bson | 93.16 | 201.89 | 116.7% |
| objectid_array_1000._serialize_bson | 121.31 | 224.35 | 84.9% |
| decimal128_array_1000._serialize_bson | 144.92 | 263.05 | 81.5% |
| binary_array_1000._serialize_bson | 250.30 | 439.65 | 75.6% |

Notable losers:

| Test Case | MB/sec (prev) | MB/sec (new) | Decrease (%) |
| --- | --- | --- | --- |
| decimal128_array_1000._deserialize_bson | 46.88 | 22.24 | -110.8% |
| mixed_large._deserialize_bson | 80.60 | 73.77 | -9.2% |

So this dramatically improves large array serialization across the board, with huge gains everywhere. But the deserialization decreases are perplexing, since no deserialization code was touched. Maybe it's just a red herring.

EDIT

I just re-ran them and still see decimal128_array_1000._deserialize_bson at 25.43 MB/sec, so it doesn't seem to be a red herring, but I need to figure out how that was affected. It's a bit perplexing.

@SeanReece (Contributor, Author) commented on Sep 5, 2024

@durran That's great news! The deserialize regression is very odd, though. I've been struggling to get bson-bench running on my M1 Mac, but I was able to get it running in Docker. I'll run some tests and see if I can reproduce it.

@SeanReece (Contributor, Author) commented on Sep 16, 2024

@durran I was able to reproduce the issue with Decimal128 deserialization. Since the Decimal128 constructor calls isUint8Array(), that single call carried some extra overhead; adding an instanceof check improves Decimal128 deserialize performance over main by up to 116% on my machine (M1 Mac). A sketch of that kind of fast path is shown below.
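
For illustration, a hedged sketch of the instanceof fast path being described (not the exact code in this PR):

```typescript
// Sketch only: try the cheap same-realm instanceof check first and keep the
// toString-tag check as a fallback for Uint8Arrays created in another realm.
function isUint8Array(value: unknown): value is Uint8Array {
  return (
    value instanceof Uint8Array ||
    Object.prototype.toString.call(value) === '[object Uint8Array]'
  );
}
```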

Winners 🎉

| Name | MB/sec (prev) | MB/sec (new) | Improvement (%) |
| --- | --- | --- | --- |
| maxkey_array_1000._serialize_bson | 108.2 | 390.3 | 260.7% |
| minkey_array_1000._serialize_bson | 112.0 | 392.7 | 250.6% |
| minkey_array_100._serialize_bson | 89.3 | 275.8 | 209.0% |
| maxkey_array_100._serialize_bson | 88.4 | 260.9 | 195.1% |
| timestamp_array_1000._serialize_bson | 269.0 | 793.2 | 194.9% |
| timestamp_array_100._serialize_bson | 225.6 | 651.2 | 188.6% |
| objectid_array_1000._serialize_bson | 329.1 | 875.9 | 166.2% |
| objectid_array_100._serialize_bson | 307.9 | 739.1 | 140.0% |
| decimal128_array_1000._deserialize_bson | 130.6 | 283.0 | 116.8% |
| decimal128_array_100._deserialize_bson | 124.3 | 258.4 | 107.9% |
| decimal128_array_1000._serialize_bson | 313.9 | 607.2 | 93.4% |
| binary_array_1000._serialize_bson | 607.9 | 1148.4 | 88.9% |
| binary_array_100._serialize_bson | 578.6 | 1088.4 | 88.1% |
| decimal128_array_100._serialize_bson | 302.2 | 552.3 | 82.8% |
| decimal128_array_10._deserialize_bson | 109.4 | 183.1 | 67.3% |
| minkey_array_10._serialize_bson | 45.7 | 74.2 | 62.5% |
| maxkey_array_10._serialize_bson | 48.0 | 76.2 | 58.9% |
| binary_array_10._serialize_bson | 304.1 | 454.3 | 49.4% |
| null_array_1000._serialize_bson | 337.8 | 498.9 | 47.7% |
| timestamp_array_10._serialize_bson | 108.2 | 159.6 | 47.4% |
| null_array_100._serialize_bson | 235.1 | 312.3 | 32.8% |
| mixed_small._deserialize_bson | 155.0 | 204.2 | 31.8% |
| code-without-scope_array_100._serialize_bson | 838.8 | 1073.2 | 27.9% |
| objectid_array_10._serialize_bson | 138.3 | 176.4 | 27.6% |
| code-without-scope_array_10._serialize_bson | 638.8 | 812.5 | 27.2% |
| null_singleElementArray._serialize_bson | 15.5 | 19.5 | 26.2% |
| decimal128_array_10._serialize_bson | 145.2 | 181.5 | 25.0% |
| regex_array_10._serialize_bson | 222.1 | 277.0 | 24.7% |
| code-with-scope_singleFieldDocument._serialize_bson | 135.4 | 167.3 | 23.6% |
| null_array_10._serialize_bson | 33.8 | 41.5 | 22.9% |
| regex_array_1000._serialize_bson | 304.8 | 372.2 | 22.1% |
| decimal128_singleFieldDocument._deserialize_bson | 23.2 | 28.2 | 21.3% |
| code-without-scope_array_1000._serialize_bson | 922.7 | 1115.4 | 20.9% |
| regex_array_100._serialize_bson | 297.2 | 348.6 | 17.3% |
| code-with-scope_singleElementArray._serialize_bson | 122.6 | 143.8 | 17.2% |
| date_array_10._serialize_bson | 100.2 | 115.9 | 15.7% |
| objectid_singleFieldDocument._serialize_bson | 25.1 | 29.0 | 15.4% |
| binary_small._serialize_bson | 1069.7 | 1221.1 | 14.1% |
| timestamp_singleFieldDocument._serialize_bson | 19.6 | 22.4 | 14.1% |
| decimal128_singleFieldDocument._serialize_bson | 30.1 | 34.0 | 13.2% |
| binary_singleFieldDocument._serialize_bson | 48.6 | 55.0 | 13.1% |
| code-without-scope_singleFieldDocument._serialize_bson | 127.8 | 143.5 | 12.3% |
| code-with-scope_array_100._serialize_bson | 535.7 | 599.0 | 11.8% |
| null_singleFieldDocument._serialize_bson | 12.5 | 13.9 | 11.0% |
| code-with-scope_array_1000._serialize_bson | 518.6 | 575.5 | 11.0% |
| long_array_1000._deserialize_bson | 327.1 | 362.1 | 10.7% |
| maxkey_singleFieldDocument._serialize_bson | 8.5 | 9.4 | 10.6% |

Losers: None

@nbbeeken (Contributor) left a review comment:

LGTM! nice work :)

@durran merged commit 61537f5 into mongodb:main on Sep 16, 2024
8 checks passed