Remove `const FLAGS`. by nnethercote · Pull Request #152791 · rust-lang/rust

nnethercote · 2026-02-18T09:38:22Z

The performance wins provided by these types are meagre, and I don't think they justify the code complexity they introduce.

r? @Zalathar

nnethercote · 2026-02-18T09:38:39Z

@bors try @rust-timer queue

Remove `const FLAGS`.

Zalathar · 2026-02-18T09:55:10Z

See also #151633 ("[EXPERIMENT] Remove static booleans from rustc_query_impl::DynamicConfig"), where I experimented with putting the flags in the vtable.

I wonder if this PR's approach will preserve some of the benefits of static flags via inlining. It's certainly nicer than having FLAGS everywhere.

rust-bors · 2026-02-18T11:50:45Z

☀️ Try build successful (CI)
Build commit: 744269b (744269beec5ae0238f5da62161b8168c2e7f8956, parent: 8387095803f21a256a9a772ac1f9b41ed4d5aa0a)

rust-timer · 2026-02-18T12:30:26Z

Finished benchmarking commit (744269b): comparison URL.

Overall result: ❌ regressions - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	0.3%	[0.1%, 0.9%]	42
Regressions ❌ (secondary)	0.4%	[0.1%, 1.0%]	78
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.3%	[0.1%, 0.9%]	42

Max RSS (memory usage)

Results (primary 2.4%, secondary -0.2%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	2.4%	[1.8%, 3.6%]	3
Regressions ❌ (secondary)	3.3%	[1.8%, 6.0%]	3
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-3.6%	[-4.0%, -3.2%]	3
All ❌✅ (primary)	2.4%	[1.8%, 3.6%]	3

Cycles

Results (primary 3.0%, secondary 3.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	3.0%	[2.4%, 4.1%]	6
Regressions ❌ (secondary)	3.3%	[2.1%, 4.5%]	29
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	3.0%	[2.4%, 4.1%]	6

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 480.396s -> 481.316s (0.19%)
Artifact size: 397.88 MiB -> 397.75 MiB (-0.03%)

nnethercote · 2026-02-19T00:43:52Z

Just enough of a regression that it's hard to justify, alas. I see now that SemiDynamicQueryDispatcher and QueryVTable could be merged if FLAGS went away.

Relatedly, I found the following functions take a FLAGS generic param but don't actually use it:

mk_cycle
handle_cycle_error
cycle_error
wait_for_query
try_load_from_disk_and_cache_in_memory
query_key_hash_verify
try_load_from_on_disk_cache_inner

They can be changed to take a QueryVTable instead of a SemiDynamicQueryDispatcher, though then some methods on the latter need to be moved to the former, which means it's no longer a uniform facade type.
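A minimal sketch of the idea above, under stated assumptions: the struct layouts, the `u8` stand-in for `QueryFlags`, and the function bodies are all hypothetical (stable Rust cannot use a struct as a const generic parameter, so a bitmask is used here). Only the type names and `cycle_error` come from the discussion. It shows a function that is generic over `FLAGS` without reading it, next to an equivalent that takes the vtable directly.

```rust
// Hypothetical, heavily simplified stand-ins for the real rustc types.
struct QueryVTable {
    name: &'static str,
}

// Assumption: `u8` bitmask in place of `const FLAGS: QueryFlags`.
struct SemiDynamicQueryDispatcher<'a, const FLAGS: u8> {
    vtable: &'a QueryVTable,
}

// "Before": generic over FLAGS even though the body never reads it.
// One copy may be monomorphized per distinct FLAGS value it is called with.
fn cycle_error_generic<const FLAGS: u8>(
    query: &SemiDynamicQueryDispatcher<'_, FLAGS>,
) -> String {
    format!("cycle detected when computing `{}`", query.vtable.name)
}

// "After": takes the vtable directly, so there is a single non-generic copy.
fn cycle_error(vtable: &QueryVTable) -> String {
    format!("cycle detected when computing `{}`", vtable.name)
}

fn main() {
    let vtable = QueryVTable { name: "type_of" };
    let a = SemiDynamicQueryDispatcher::<0b001> { vtable: &vtable };
    let b = SemiDynamicQueryDispatcher::<0b110> { vtable: &vtable };

    // Both instantiations behave identically to the non-generic version.
    assert_eq!(cycle_error_generic(&a), cycle_error(&vtable));
    assert_eq!(cycle_error_generic(&b), cycle_error(&vtable));
    println!("{}", cycle_error(&vtable));
}
```

In this sketch the non-generic version only needs `&QueryVTable`, which is why any helper methods it calls would have to live on the vtable type rather than on the dispatcher.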

nnethercote · 2026-02-19T05:25:15Z

@bors try @rust-timer queue

Remove `const FLAGS`.

rust-bors · 2026-02-19T07:40:51Z

☀️ Try build successful (CI)
Build commit: 06d3be4 (06d3be4e20551ccb3f23a19b0d08dabb3542e3fa, parent: e0cb264b814526acb82def4b5810e394a2ed294f)

rust-timer · 2026-02-19T08:21:06Z

Finished benchmarking commit (06d3be4): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	0.3%	[0.2%, 0.5%]	14
Regressions ❌ (secondary)	0.3%	[0.1%, 0.6%]	33
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.4%	[-0.5%, -0.1%]	8
All ❌✅ (primary)	0.3%	[0.2%, 0.5%]	14

Max RSS (memory usage)

Results (secondary -5.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-5.5%	[-5.5%, -5.5%]	1
All ❌✅ (primary)	-	-	0

Cycles

Results (primary -2.8%, secondary -3.2%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-2.8%	[-2.8%, -2.8%]	1
Improvements ✅ (secondary)	-3.2%	[-3.2%, -3.2%]	1
All ❌✅ (primary)	-2.8%	[-2.8%, -2.8%]	1

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 487.064s -> 481.364s (-1.17%)
Artifact size: 397.85 MiB -> 395.69 MiB (-0.54%)

`SemiDynamicQueryDispatcher` is just a `QueryVTable` wrapper with an additional `const FLAGS: QueryFlags` generic parameter that contains three booleans. This arrangement exists as a performance optimization. But the performance effects are very small and it adds quite a bit of complexity to an already overly-complex part of the codebase. If it didn't exist and somebody proposed adding it and asked me to review, I almost certainly wouldn't approve it. This commit removes it. The three booleans in `QueryFlags` are moved into `QueryVTable`. The non-trivial methods of `SemiDynamicQueryDispatcher` become methods of `QueryVTable`.
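The shape of this change can be sketched roughly as follows. This is an illustration only, not the actual rustc code: the type names `VTableBefore`, `Dispatcher`, and `VTableAfter`, the flag names `flag_a`/`flag_b`/`flag_c`, and the `describe` method are all hypothetical, and three `const bool` parameters stand in for `const FLAGS: QueryFlags` because stable Rust does not allow struct-typed const generics.

```rust
// "Before": the vtable carries no flags; a wrapper adds them as const generics.
struct VTableBefore {
    name: &'static str,
}

struct Dispatcher<'a, const FLAG_A: bool, const FLAG_B: bool, const FLAG_C: bool> {
    vtable: &'a VTableBefore,
}

impl<const FLAG_A: bool, const FLAG_B: bool, const FLAG_C: bool>
    Dispatcher<'_, FLAG_A, FLAG_B, FLAG_C>
{
    fn describe(&self) -> String {
        // Each `if` tests a compile-time constant, so every instantiation
        // can have the untaken branches removed by the optimizer.
        let mut out = String::from(self.vtable.name);
        if FLAG_A { out.push_str(" +a"); }
        if FLAG_B { out.push_str(" +b"); }
        if FLAG_C { out.push_str(" +c"); }
        out
    }
}

// "After": the flags are ordinary fields, checked at run time,
// and the wrapper type is no longer needed.
struct VTableAfter {
    name: &'static str,
    flag_a: bool,
    flag_b: bool,
    flag_c: bool,
}

impl VTableAfter {
    fn describe(&self) -> String {
        let mut out = String::from(self.name);
        if self.flag_a { out.push_str(" +a"); }
        if self.flag_b { out.push_str(" +b"); }
        if self.flag_c { out.push_str(" +c"); }
        out
    }
}

fn main() {
    let before = VTableBefore { name: "type_of" };
    let d: Dispatcher<'_, false, true, false> = Dispatcher { vtable: &before };
    let after = VTableAfter { name: "type_of", flag_a: false, flag_b: true, flag_c: false };

    // Same observable behaviour; only where the flags live has changed.
    assert_eq!(d.describe(), after.describe());
    println!("{}", after.describe());
}
```

The trade-off in this sketch mirrors the one discussed in the thread: the const-generic form lets the compiler specialize each instantiation, while the field-based form has a single code path with run-time checks and far fewer generic parameters to thread through callers.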

It's now `query_vtable` because its return type changed. And thanks to the previous commit it can be manually inlined in several places. (The only remaining calls to it are in `make_dep_kind_vtable_for_query`, which are more challenging to remove.)

rustbot · 2026-02-20T02:58:10Z

Zalathar is not on the review rotation at the moment.
They may take a while to respond.

nnethercote · 2026-02-20T03:03:14Z

Looking at icounts for primary benchmarks, there are only 14 runs where the regression is considered significant, and the worst case is 0.5%. 8 of the 14 are doc builds; I'm not sure why those are more affected, but they're arguably less important than non-doc builds.

The story for secondary benchmarks is pretty similar.

Overall, the perf effects of this optimization are just really underwhelming. I don't think it's worth it, especially in a part of the code that is already very complex and where there is lots of ongoing work to reduce complexity.

rust-bors · 2026-02-20T09:48:18Z

☔ The latest upstream changes (presumably #152747) made this pull request unmergeable. Please resolve the merge conflicts.

Zalathar · 2026-02-20T11:35:53Z

> Overall, the perf effects of this optimization are just really underwhelming. I don't think it's worth it, especially in a part of the code that is already very complex and where there is lots of ongoing work to reduce complexity.

When I first tried to get rid of FLAGS, I felt discouraged by the small-but-measurable perf regression.

But every time I run into how annoying it is to deal with having FLAGS everywhere, that regression starts to look more and more like something we should accept for the big simplicity win.

Zalathar · 2026-02-20T11:39:37Z

If we end up doing something like this, it might make sense to incorporate some or all of the changes in #152841 ("Streamline QueryVTableUnerased into GetQueryVTable") at the same time.