Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RCORE-2162: Add compression of strings in Mixed, Lst<String> and Dictionary #7804

Merged
merged 4 commits into from
Jun 19, 2024

Conversation

jedelbo
Copy link
Contributor

@jedelbo jedelbo commented Jun 12, 2024

What, How & Why?

☑️ ToDos

  • 📝 Changelog update
  • 🚦 Tests (or not relevant)
  • C-API, if public C++ API changed
  • bindgen/spec.yml, if public C++ API changed

src/realm/array_string.cpp Show resolved Hide resolved
src/realm/cluster.cpp Show resolved Hide resolved
src/realm/impl/array_writer.hpp Show resolved Hide resolved
src/realm/set.hpp Show resolved Hide resolved
src/realm/table.cpp Show resolved Hide resolved
@jedelbo jedelbo force-pushed the je/string-interning branch 2 times, most recently from eef0397 to af8c988 Compare June 14, 2024 12:09
@jedelbo
Copy link
Contributor Author

jedelbo commented Jun 17, 2024

@ironage please have a look at the modified client reset test. I think there is a problem as the 'after' callback is not synchonized with the main thread so we can't be sure that the callback has happened before the check is done. This has happened many times in ASAN runs. Do you think there is a better solution than the current fix?

@jedelbo jedelbo marked this pull request as ready for review June 17, 2024 12:30
Copy link

coveralls-official bot commented Jun 17, 2024

Pull Request Test Coverage Report for Build jorgen.edelbo_308

Details

  • 206 of 210 (98.1%) changed or added relevant lines in 22 files are covered.
  • 34 unchanged lines in 11 files lost coverage.
  • Overall coverage increased (+0.4%) to 91.236%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/realm/cluster.cpp 6 7 85.71%
src/realm/obj.cpp 36 37 97.3%
src/realm/array_backlink.cpp 0 2 0.0%
Files with Coverage Reduction New Missed Lines %
src/realm/dictionary.cpp 1 89.06%
test/test_dictionary.cpp 1 99.83%
test/test_index_string.cpp 1 93.48%
src/realm/list.hpp 2 96.94%
src/realm/sync/network/http.hpp 2 82.27%
src/realm/sync/noinst/client_impl_base.cpp 2 82.52%
src/realm/table.cpp 2 94.88%
src/realm/sync/noinst/server/server_history.cpp 3 63.57%
src/realm/unicode.cpp 3 83.83%
src/realm/sync/noinst/server/server.cpp 6 74.11%
Totals Coverage Status
Change from base Build jorgen.edelbo_306: 0.4%
Covered Lines: 221433
Relevant Lines: 242704

💛 - Coveralls

Copy link

coveralls-official bot commented Jun 17, 2024

Pull Request Test Coverage Report for Build jorgen.edelbo_311

Details

  • 176 of 181 (97.24%) changed or added relevant lines in 21 files are covered.
  • 111 unchanged lines in 20 files lost coverage.
  • Overall coverage decreased (-0.02%) to 90.821%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/realm/cluster.cpp 6 7 85.71%
src/realm/array_backlink.cpp 0 2 0.0%
src/realm/obj.cpp 24 26 92.31%
Files with Coverage Reduction New Missed Lines %
src/realm/array_mixed.cpp 1 92.38%
src/realm/array_string_short.cpp 1 86.27%
src/realm/util/file.cpp 1 84.84%
src/realm/util/serializer.cpp 1 90.43%
src/realm/uuid.cpp 1 98.48%
src/realm/query_expression.cpp 2 86.62%
src/realm/query_expression.hpp 2 93.89%
test/object-store/util/test_file.cpp 2 86.29%
src/realm/sync/noinst/client_impl_base.cpp 3 82.72%
src/realm/sync/noinst/protocol_codec.hpp 3 74.03%
Totals Coverage Status
Change from base Build jorgen.edelbo_310: -0.02%
Covered Lines: 217974
Relevant Lines: 240005

💛 - Coveralls

Copy link
Member

@nicola-cab nicola-cab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small comments, overall LGTM.
Probably the client reset tests need some small changes, since we don't wait for a particular event, but we just sleep for a while before to checking the after reset callback.

@@ -135,6 +144,7 @@ class ArrayMixed : public ArrayPayload, private Array {
mutable ArrayString m_strings;
// Used to store nested collection refs
mutable ArrayRef m_refs;
mutable StringInterner* m_string_interner = nullptr;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to cache the StringInterner? Shouldn't m_strings have the interner set?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. That should work.

@@ -1168,6 +1173,35 @@ REALM_FORCEINLINE void Obj::sync(Node& arr)
}
}

// helper functions for filtering out calls to set_string_interner()
template <class T>
inline void Obj::set_string_interner(T&, ColKey)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think values.set_string_interner(m_table->get_string_interner(col_key)); can just go in the main template method. The interface is basically the same, isn't it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is an optimization to only call that method for strings and mixed.

if (data_ref)
Array::destroy_deep(data_ref, m_alloc);
m_interner_data.set(col_ndx, 0);
// m_string_interners[col_ndx]->update_from_parent(true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this if not needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -1048,6 +1048,13 @@ TEST_CASE("sync: client reset", "[sync][pbs][client reset][baas]") {
},
std::chrono::seconds(20), std::chrono::milliseconds(500));
}
// We can't be sure that the 'after' callback has been called yet
timed_sleeping_wait_for(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't the wait at line 1049 enough? Maybe we should extend std::chrono::milliseconds(500) to whatever it is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok - I thought I removed the 500 ms addition in line 1049. The 500 ms is just the time we sleep between each time we check the condition. If the condition is true, we will return immediately.

Copy link

coveralls-official bot commented Jun 18, 2024

Pull Request Test Coverage Report for Build jorgen.edelbo_316

Details

  • 176 of 181 (97.24%) changed or added relevant lines in 21 files are covered.
  • 124 unchanged lines in 17 files lost coverage.
  • Overall coverage decreased (-0.02%) to 90.826%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/realm/cluster.cpp 6 7 85.71%
src/realm/array_backlink.cpp 0 2 0.0%
src/realm/obj.cpp 24 26 92.31%
Files with Coverage Reduction New Missed Lines %
src/realm/array_mixed.cpp 1 92.36%
src/realm/array_string_short.cpp 1 86.27%
src/realm/util/file.cpp 1 84.84%
src/realm/util/serializer.cpp 1 90.43%
src/realm/uuid.cpp 1 98.48%
src/realm/query_expression.hpp 3 93.86%
src/realm/sync/noinst/protocol_codec.hpp 3 74.03%
src/realm/table.cpp 3 90.1%
src/realm/unicode.cpp 3 83.83%
src/realm/sync/transform.cpp 4 60.86%
Totals Coverage Status
Change from base Build jorgen.edelbo_310: -0.02%
Covered Lines: 217984
Relevant Lines: 240003

💛 - Coveralls

Copy link

coveralls-official bot commented Jun 19, 2024

Pull Request Test Coverage Report for Build jorgen.edelbo_317

Details

  • 176 of 181 (97.24%) changed or added relevant lines in 21 files are covered.
  • 108 unchanged lines in 16 files lost coverage.
  • Overall coverage decreased (-0.02%) to 90.822%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/realm/cluster.cpp 6 7 85.71%
src/realm/array_backlink.cpp 0 2 0.0%
src/realm/obj.cpp 24 26 92.31%
Files with Coverage Reduction New Missed Lines %
src/realm/array_mixed.cpp 1 92.36%
src/realm/array_string_short.cpp 1 86.27%
src/realm/util/file.cpp 1 84.84%
src/realm/util/serializer.cpp 1 90.43%
src/realm/uuid.cpp 1 98.48%
src/realm/sync/client.cpp 3 91.78%
src/realm/query_expression.hpp 4 93.82%
src/realm/util/assert.hpp 4 87.1%
test/fuzz_tester.hpp 4 57.32%
src/realm/sync/noinst/protocol_codec.hpp 6 73.5%
Totals Coverage Status
Change from base Build jorgen.edelbo_310: -0.02%
Covered Lines: 217977
Relevant Lines: 240004

💛 - Coveralls

@jedelbo jedelbo merged commit fc31117 into feature/string-compression Jun 19, 2024
39 of 40 checks passed
@jedelbo jedelbo deleted the je/string-interning branch June 19, 2024 09:34
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 19, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants