I needed to monitor the synchronization status of SQL Server Always On Availability Group databases — but with a constraint: the monitoring agent could only connect to a secondary replica, not the primary. This sounds like it should be simple. After all, the replica knows whether it’s in sync, right? Mostly, yes — but there’s a gotcha that cost me some time.
The goal Link to heading
The objective was straightforward: build a health check that answers two questions from the replica’s perspective:
- Is each database healthy? Is it synchronized, synchronizing, or broken?
- How far behind is the replica? If replication is lagging, by how much?
All the information lives in one DMV: sys.dm_hadr_database_replica_states. The challenge is knowing which columns return useful data when you’re querying from the secondary.
Step 1: Synchronization state & health Link to heading
The first question is easy to answer. These columns are populated on both the primary and secondary:
SELECT
DB_NAME(database_id) as [name],
[synchronization_state_desc],
[synchronization_health_desc],
[database_state_desc]
FROM
sys.dm_hadr_database_replica_states;
| name | synchronization_state_desc | synchronization_health_desc | database_state_desc |
|---|---|---|---|
| DB1 | SYNCHRONIZED | HEALTHY | ONLINE |
| DB2 | SYNCHRONIZING | PARTIALLY_HEALTHY | RECOVERY |
| DB3 | NOT SYNCHRONIZING | NOT_HEALTHY | EMERGENCY |
- synchronization_state: Data-movement state — tells you if the replica is actively receiving and applying log records.
SYNCHRONIZEDmeans fully caught up;SYNCHRONIZINGmeans actively receiving changes;NOT SYNCHRONIZINGindicates data movement has stopped. - synchronization_health: A roll-up state that combines the synchronization state with the availability mode of the replica — useful for alerting.
HEALTHYis the target;PARTIALLY_HEALTHYorNOT_HEALTHYwarrant investigation. - database_state: The database state itself. Expect
ONLINEunder normal operation.
This tells you if something is wrong, but not how far behind the replica is. For that, you need timing data.
Step 2: Timing data Link to heading
The same DMV exposes several timestamps that track when log blocks move through the replication pipeline:
SELECT
DB_NAME(database_id) as [name],
[last_sent_time],
[last_received_time],
[last_commit_time]
FROM
sys.dm_hadr_database_replica_states;
| name | last_sent_time | last_received_time | last_commit_time |
|---|---|---|---|
| DB1 | 2022-06-15 11:47:34.317 | 2022-06-15 11:47:34.317 | 2022-06-15 11:05:26.687 |
| DB2 | 2022-06-15 11:47:34.180 | 2022-06-15 11:47:34.180 | 2022-06-15 11:05:28.533 |
| DB3 | 2022-06-15 11:47:35.020 | 2022-06-15 11:47:35.020 | 2022-06-15 11:47:34.827 |
- last_sent_time: When the last log block was sent
- last_received_time: When the log block ID in the last message received was read on the secondary replica
- last_commit_time: The time corresponding to the last commit record. On a secondary replica, this trails the primary’s commit time by the amount of replication lag.
The key insight: last_received_time is populated on the secondary and tells you exactly when the replica last heard from the primary. By comparing it to GETDATE(), you can calculate how stale the replica is — without ever talking to the primary.
See the full column reference in the sys.dm_hadr_database_replica_states documentation.
The gotcha: secondary_lag_seconds Link to heading
This was my first instinct. The documentation describes it as:
The number of seconds that the secondary replica is behind the primary replica during synchronization.
That sounds like the perfect metric. One column, already computed, measured in seconds. Except — and the documentation does mention this — the value is NULL on a secondary replica. It’s only populated when you query the DMV from the primary. If you can only connect to the replica, secondary_lag_seconds is useless.
Putting it all together Link to heading
With that dead end behind me, I combined the health check from Step 1 with the timing trick from Step 2 into a single query:
SELECT
DB_NAME(database_id) as [name],
synchronization_health_desc,
DATEDIFF(MILLISECOND,last_received_time,GETDATE()) as [last_received_ms],
[last_commit_time]
FROM
sys.dm_hadr_database_replica_states;
This gives the monitoring agent everything it needs from a single connection to the replica:
- name: Which database
- synchronization_health_desc: Is the AG healthy — any value other than
HEALTHYshould trigger an alert - last_received_ms: How many milliseconds have passed since the replica last received data from the primary — a practical measure of replication lag. In practice, values under a few hundred milliseconds are normal for synchronous-commit mode; a steadily growing value indicates the replica is falling behind.
- last_commit_time: When the last transaction was committed on the primary — useful for correlating with application activity
The last_received_ms calculation is the workhorse. Pair it with synchronization_health_desc and you have both a qualitative state and a quantitative lag metric — all from the secondary. This query is lightweight enough to poll on a short interval (e.g., every 30 seconds) and feed into your monitoring system of choice.