utils/loading_cache.hh is an expensive template header that costs
~2,494 seconds of aggregate CPU time across 133 files that include it.
88 of those files include it only transitively via query_processor.hh
through the chain: query_processor.hh -> prepared_statements_cache.hh
-> loading_cache.hh, costing ~1,690s of template instantiation.
Break the chain by:
- Replacing #include of prepared_statements_cache.hh and
  authorized_prepared_statements_cache.hh in query_processor.hh with
  forward declarations and the lightweight prepared_cache_key_type.hh
- Replacing #include of result_message.hh with result_message_base.hh
  (which doesn't pull in prepared_statements_cache.hh)
- Changing prepared_statements_cache and authorized_prepared_statements_cache
  members to std::unique_ptr (PImpl) since forward-declared types
  cannot be held by value
- Moving get_prepared(), execute_prepared(), execute_direct(), and
  execute_batch() method bodies from the header to query_processor.cc
- Updating transport/server.cc to use the concrete type instead of the
  no-longer-visible authorized_prepared_statements_cache::value_type
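
The forward-declaration plus std::unique_ptr pattern described above can be
sketched roughly as follows. This is a minimal single-file illustration, not
the actual query_processor code: the names processor, statement_cache,
prepare() and usage_example() are hypothetical stand-ins, and the "header"
and ".cc" parts are only marked by comments.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// ---- "header" part: all that includers of the (hypothetical) header see ----

class statement_cache;  // forward declaration; the heavy definition stays out

class processor {
    // An incomplete type cannot be a by-value member, but a unique_ptr to it can.
    std::unique_ptr<statement_cache> _cache;
public:
    processor();
    ~processor();  // only declared here; defined where statement_cache is complete
    void prepare(const std::string& key, const std::string& text);
    const std::string* get_prepared(const std::string& key) const;
};

// ---- ".cc" part: the only place that needs the full cache definition ----

class statement_cache {
    std::unordered_map<std::string, std::string> _entries;
public:
    void insert(const std::string& key, const std::string& text) {
        _entries[key] = text;
    }
    const std::string* find(const std::string& key) const {
        auto it = _entries.find(key);
        return it == _entries.end() ? nullptr : &it->second;
    }
};

processor::processor() : _cache(std::make_unique<statement_cache>()) {}
processor::~processor() = default;  // statement_cache is complete at this point

// Method bodies that touch the cache live here rather than in the header.
void processor::prepare(const std::string& key, const std::string& text) {
    _cache->insert(key, text);
}

const std::string* processor::get_prepared(const std::string& key) const {
    return _cache->find(key);
}

// Usage as it might look in a file that includes only the "header" part.
inline void usage_example() {
    processor p;
    p.prepare("q1", "SELECT 1");
    assert(p.get_prepared("q1") != nullptr);
}
```

Note that the destructor has to be defined out of line: std::unique_ptr needs
the complete type at the point where it deletes the object, so an implicit
inline destructor in the header would not compile against the forward
declaration alone.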
Per-file measurement: files including query_processor.hh now show zero
loading_cache template instantiation events (previously 20-32s each).

Wall-clock measurement (clean build, -j16, 16 cores, Seastar cached):
  Baseline (origin/master):       avg 24m01s (24m03s, 23m59s)
  With loading_cache chain break: avg 23m29s (23m32s, 23m29s, 23m27s)
  Improvement:                    ~32s, ~2.2%