When an SAP HANA system slows down, the pressure to "just fix it" usually leads teams to throw memory at the problem or restart services. Neither addresses the root cause, and both can make things worse. Effective HANA performance troubleshooting is a diagnostic discipline: you isolate the slow statement, understand why the engine is doing what it is doing, and change one thing at a time. This guide walks through the diagnostic path we use on production HANA systems, from the first symptom to the specific anti-patterns that cause the most common slowdowns.
Start With the Right Diagnostic, Not a Guess
The single biggest time-waster in HANA troubleshooting is guessing. HANA exposes precise, queryable monitoring views that tell you exactly where time and memory are going. Before you change anything, capture the evidence.
The three views that answer most questions:
- `M_EXPENSIVE_STATEMENTS` records statements that exceeded a configured runtime or memory threshold. This is where you find the actual slow queries rather than the ones users happen to complain about.
- `M_ACTIVE_STATEMENTS` shows what is running right now, including blocked statements and their lock dependencies.
- `M_SQL_PLAN_CACHE` aggregates execution statistics per statement hash, so you can see which statements consume the most cumulative time across the day, not just the slowest single run.
A statement that runs in 800 milliseconds but executes 50,000 times a day is a bigger problem than a 12-second report that runs twice. Always rank by total execution time, not peak runtime, when deciding what to fix first.
The Statistics Server: Real Overhead or a Red Herring?
One of the most common questions we get is whether the HANA statistics server itself causes a performance impact. It is a fair concern. The statistics server runs periodic collectors that query system views, and on a busy or undersized system those collectors compete for the same resources as your workload.
The honest answer: the overhead is usually small, but it is not zero, and it becomes visible in two situations.
First, on systems with tight memory, the statistics server's historical data tables (the _SYS_STATISTICS schema) can grow large enough to matter, especially if retention is left at defaults for years. Second, individual collectors can run expensive queries on systems with very large object counts, for example tens of thousands of tables and partitions.
What to check before blaming the statistics server:
- Look at
M_EXPENSIVE_STATEMENTSfiltered to theSAP<SID>or statistics user. If statistics collectors are not appearing there, they are not your problem. - Check the size of the
_SYS_STATISTICStables. If history retention is excessive, reduce it through the configuration rather than disabling collection. - Confirm the statistics server is running in its modern embedded form, not as a separate service left over from an old migration.
Disabling the statistics server to "improve performance" is almost always the wrong move. You lose the alerting and historical data you need to diagnose the very problems you are chasing. Tune retention and collector intervals instead.
Reading an Execution Plan Without Getting Lost
Once you have the slow statement, the next step is the explain plan. In HANA Studio or the Database Explorer, run EXPLAIN PLAN or capture the PlanViz trace for the statement. You are looking for a small number of red flags rather than trying to understand every operator.
The patterns that matter most:
- A large intermediate result feeding a join. If an operator produces millions of rows that are then filtered down to a few thousand, the filter is being applied too late. The fix is usually a missing or misordered predicate.
- Row engine operators on a columnstore query. HANA can fall back to the row engine for certain operations. When a primarily analytical query is running through row-engine operators, you often have a data type mismatch or an unsupported expression forcing the fallback.
- Repeated table scans of the same table. This is the signature of a correlated subquery, covered below.
You do not need to be a query optimizer expert. You need to find the one operator that processes far more rows than the final result requires, and ask why.
The Correlated Subquery Trap
Correlated subqueries are the most reliable way to make a HANA query slow, and they are extremely common in code migrated from older databases. A correlated subquery references the outer query inside the inner query, which forces the engine to re-evaluate the subquery once per outer row.
Consider a pattern like selecting orders where the line item count exceeds the average for that customer. Written as a correlated subquery, HANA may execute the inner aggregation thousands of times. The same logic expressed as a window function or a derived table that the engine can materialize once will often run an order of magnitude faster.
When you find a correlated subquery in a slow statement:
- Rewrite it as a
JOINagainst a grouped derived table where possible. - Use window functions (
OVER (PARTITION BY ...)) for per-group comparisons instead of per-row subqueries. - Be especially careful with correlated subqueries combined with
LIMIT, which people sometimes add to "speed things up." ALIMITinside a correlated subquery does not reduce the number of times the subquery executes; it only caps each execution. The cost stays high.
This is the kind of rewrite where SAP Joule can accelerate the work. It is reliable at suggesting set-based equivalents for row-by-row SQL, though you should always validate the rewritten logic against the original results before trusting it in production.
Function Pushdown and the INITCAP Problem
HANA is fast because it pushes computation down to the columnar engine and operates on compressed data. Anything that prevents pushdown forces row-by-row processing and defeats the architecture.
A classic example is wrapping an indexed or filter column in a function. If you filter on INITCAP(name) = 'Acme Corp', HANA cannot use any optimization on the name column directly, because it must compute INITCAP for every row before comparing. The same applies to UPPER, LOWER, date-formatting functions, and implicit type conversions on the left side of a predicate.
The fixes, in order of preference:
- Move the function to the literal side. Compare the raw column to a pre-computed value rather than transforming the column.
- Normalize at write time. If you frequently search case-insensitively, store a normalized copy of the column and search that.
- Reconsider the function entirely. Many
INITCAPandUPPERcomparisons exist only because the original application was case-sensitive on a different database. HANA's collation may make them unnecessary.
The broader principle: keep filter columns "bare" so the engine can push the predicate down. Every function you wrap around a column is a small tax, and on large tables those taxes compound.
Memory Pressure and Delta Merges
If query slowness is intermittent and correlates with load, the cause is often memory, not SQL. HANA keeps column tables in memory and uses a write-optimized delta store that is periodically merged into the main store. When delta merges fall behind, reads slow down because the engine must scan both stores.
Quick checks:
- `M_CS_TABLES` shows delta store sizes per table. A table with a persistently large delta is not merging often enough for its write volume.
- `M_MERGE_STATISTICS` shows merge history and whether merges are being triggered or skipped.
- Unloads. If
M_CS_UNLOADSshows tables being evicted from memory under pressure, your column store is competing for space and reload latency is hitting query times.
The fix depends on the pattern: adjust merge thresholds for high-write tables, add memory if unloads are frequent and unavoidable, or partition very large tables so that merges and scans operate on smaller units.
A Repeatable Troubleshooting Sequence
When a HANA performance issue lands on your desk, work it in this order rather than jumping to a hypothesis:
- Capture the slow statement from
M_EXPENSIVE_STATEMENTSand rank candidates by total daily execution time fromM_SQL_PLAN_CACHE. - Pull the execution plan and find the single operator processing far more rows than the result needs.
- Check for the three usual culprits: a correlated subquery, a function wrapped around a filter column, or a late-applied predicate.
- If the SQL is clean, look at memory: delta store sizes, merge history, and unloads.
- Change exactly one thing, re-measure against the same statement, and document the before-and-after numbers.
This discipline matters because HANA systems are interconnected. A change that speeds up one query can shift memory pressure onto another workload. Measuring one change at a time is the only way to know what actually helped.
When to Bring in Help
Most slow-query problems are solvable with the views and patterns above. The cases that justify outside help are the structural ones: a data model that forces repeated full scans, a custom code base full of correlated subqueries that needs systematic remediation, or chronic memory pressure that points to an undersized or poorly partitioned system.
If you are facing recurring HANA performance issues rather than a single slow report, our SAP HANA optimization services cover query tuning, memory and partitioning strategy, and the managed monitoring that catches regressions before users do. For teams whose performance problems trace back to an aging landscape, a move to current HANA on the right infrastructure as part of an S/4HANA migration often resolves the root cause rather than the symptom.
The goal is not a one-time fix. It is a system you can reason about, with the monitoring and the habits that turn "HANA is slow" from a fire drill into a routine diagnosis.