Brussels / 31 January & 1 February 2026


Hot-patching ClickHouse in production with LLVM XRay


Ever been debugging a production issue and wished you'd added just one more log statement? Now you have to rebuild, wait for CI, deploy... all that time wasted. We've all been there, cursing our past selves.

We've integrated LLVM's XRay into ClickHouse to solve this. It lets us hot-patch running production systems to inject logging, profiling, and even deliberate delays into any function. No rebuild required.

XRay reserves space at function entry/exit that can be atomically patched with custom handlers at runtime. We built three handler types: LOG to add the trace points you forgot, SLEEP to reproduce (or prevent) timing-sensitive bugs, and PROFILE for deterministic profiling to complement our existing sampling profiler. The performance overhead when inactive is negligible.
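To make the patch-point idea concrete, here is a minimal sketch in plain C++. It is not the XRay runtime API and not ClickHouse's implementation: real XRay rewrites reserved instruction slots in machine code, while this model stands in for that with an atomic function pointer per function. The names `PatchPoint`, `patch`, `unpatch`, `logHandler`, `sleepHandler`, and `captured_log` are all hypothetical, chosen only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical handler signature: receives the instrumented function's name.
using Handler = void (*)(const char * function_name);

// One patchable slot per instrumented function; nullptr means "not patched".
// (Assumption: real XRay patches machine code instead of checking a pointer.)
struct PatchPoint
{
    explicit PatchPoint(const char * name) : function_name(name) {}

    const char * function_name;
    std::atomic<Handler> entry_handler{nullptr};
};

std::mutex log_mutex;
std::vector<std::string> captured_log; // stand-in for a trace log table

// LOG-style handler: records that the function was entered.
void logHandler(const char * function_name)
{
    std::lock_guard<std::mutex> lock(log_mutex);
    captured_log.push_back(std::string("entered ") + function_name);
}

// SLEEP-style handler: injects a deliberate delay at function entry.
void sleepHandler(const char *)
{
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
}

PatchPoint start_query_point("startQuery");

// An "instrumented" function: when no handler is installed, the only
// extra work is one atomic load and a branch.
int startQuery(int x)
{
    if (Handler h = start_query_point.entry_handler.load(std::memory_order_acquire))
        h(start_query_point.function_name);
    return x + 1;
}

// Installing or removing a handler is a single atomic store, so threads
// already running startQuery see either the old or the new handler.
void patch(PatchPoint & p, Handler h) { p.entry_handler.store(h, std::memory_order_release); }
void unpatch(PatchPoint & p) { p.entry_handler.store(nullptr, std::memory_order_release); }
```

Calling `patch(start_query_point, logHandler)` makes every later `startQuery` call append to `captured_log`, and `unpatch(start_query_point)` returns the function to its uninstrumented behavior. The function itself is never rebuilt or changed.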

Control is simple: send a SQL statement such as SYSTEM INSTRUMENT ADD LOG `QueryMetricLog::startQuery` 'This message will be logged at the start of the function' to patch the function instantly. Results show up in system.trace_log. Remove it just as easily when you're done.

I'll cover the integration challenges (ELF parsing, thread-safety, atomic patching), performance numbers (4-7% larger binary, near-zero runtime cost), and real production war stories.

Issue with the description of the task
PR that added XRay integration

Speakers

Pablo Marcos

Links