Why Time Synchronization (NTP) Matters in Apache Kudu

Why a Kudu cluster won't even start without NTP, actual error messages, and how to configure ntpd and chrony — all in one place.

Data Dynamics | April 12, 2026 | 11 min read

Have you ever installed Apache Kudu for the first time, started the Tablet Server, and seen "Clock unsynchronized" in the logs as the process dies? In a distributed database, "the clocks are out of sync" isn't just a warning — it means data consistency cannot be guaranteed, so Kudu refuses to start entirely. Based on the official Kudu Troubleshooting documentation, this post explains why time synchronization is mandatory, what errors you'll encounter, and how to fix them.

1. Why Kudu Is Sensitive About Clocks

Kudu internally uses a Hybrid Logical Clock (HLC). HLC is a timestamp that combines a physical clock (wall clock) with a logical counter — the core mechanism for determining event ordering across distributed nodes.

Component | Role
--- | ---
Physical clock | The node's system time. Synchronized via NTP
Logical counter | Distinguishes event order within the same physical timestamp
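
As a rough illustration of how the two components combine: Kudu's HybridClock stores the physical time in microseconds in the upper bits of a 64-bit value and keeps the low 12 bits for the logical counter. The sketch below mirrors that layout with shell arithmetic. The function names are hypothetical and the code is not taken from Kudu.

```shell
# Hypothetical helpers illustrating a hybrid timestamp layout:
# upper bits = physical time in microseconds, low 12 bits = logical counter.
HLC_LOGICAL_BITS=12

hlc_encode() {        # usage: hlc_encode <physical_us> <logical>
  echo $(( ($1 << HLC_LOGICAL_BITS) | $2 ))
}

hlc_physical_us() {   # usage: hlc_physical_us <timestamp>
  echo $(( $1 >> HLC_LOGICAL_BITS ))
}

hlc_logical() {       # usage: hlc_logical <timestamp>
  echo $(( $1 & ((1 << HLC_LOGICAL_BITS) - 1) ))
}

# Two events in the same microsecond are still ordered by the logical counter.
a=$(hlc_encode 1776000000000000 0)
b=$(hlc_encode 1776000000000000 1)
[ "$a" -lt "$b" ] && echo "a happened before b"
```

Because the physical part occupies the high bits, comparing two timestamps as plain integers orders them by wall-clock time first and by the logical counter second.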

For HLC to work correctly, it needs to know the error bound of the physical clock. Kudu reads this error bound reported by the NTP daemon to the kernel to determine "how much the current node's time could differ from the actual time at most." If this value is too large — meaning the clock cannot be trusted — the following problems arise:

  • Broken read consistency: Snapshot read timestamps can't be pinpointed, causing "data appears and disappears" phenomena
  • Write conflict misjudgment: When two nodes modify the same row simultaneously, it's impossible to determine which came first
  • Raft consensus delays: Timeout calculations for leader election and log replication are thrown off

To prevent these risks, Kudu shuts down the service entirely when the clock error exceeds the allowed threshold.

2. Actual Error Messages You'll Encounter

You may see the following errors in Kudu Master or Tablet Server logs.

2.1 Clock not synchronized at all

Clock unsynchronized. Status: Service unavailable: Error reading clock.

The NTP daemon is not installed, or hasn't finished synchronizing yet. Kudu checks the kernel's clock status via the ntp_gettime() system call, and this error occurs when the kernel reports "not synchronized."

2.2 Synchronized but error is too high

Clock synchronized, but error: 11130000, is past the maximum allowable error: 10000000

NTP is running but the estimated error exceeds Kudu's allowed limit. The unit is microseconds (us). The example above means "current error 11.13 seconds, allowed limit 10 seconds."
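
To make the unit conversion concrete, here is a small sketch that pulls both microsecond values out of the message and prints them in seconds. The function name is hypothetical, and it assumes the log line has exactly the wording shown above.

```shell
# Hypothetical helper: extract the two microsecond values from the Kudu log
# message and print them in seconds.
clock_error_seconds() {
  local line err max
  line="$1"
  err=$(printf '%s\n' "$line" | sed -n 's/.*error: \([0-9][0-9]*\), is past.*/\1/p')
  max=$(printf '%s\n' "$line" | sed -n 's/.*maximum allowable error: \([0-9][0-9]*\).*/\1/p')
  if [ -z "$err" ] || [ -z "$max" ]; then
    return 1   # not the message format shown above
  fi
  awk -v e="$err" -v m="$max" \
    'BEGIN { printf "error=%.2fs limit=%.2fs\n", e / 1000000, m / 1000000 }'
}

clock_error_seconds "Clock synchronized, but error: 11130000, is past the maximum allowable error: 10000000"
# prints: error=11.13s limit=10.00s
```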

2.3 HybridClock initialization failure at startup

Cannot initialize HybridClock. Clock synchronized but error was too high

When the situation from 2.2 occurs at process startup, this message appears and startup fails.

3. Key Configuration Flags

Kudu's main time synchronization flags are:

Flag | Default | Description
--- | --- | ---
--max_clock_sync_error_usec | 10000000 (10 seconds) | Maximum allowed clock error. Service stops if exceeded
--time_source | system | Time source. system uses the OS clock (NTP required), mock is for testing only

Warning: It's tempting to increase --max_clock_sync_error_usec to avoid errors, but this is not a real fix. Increasing the tolerance weakens read/write consistency guarantees. Properly configuring NTP is the correct solution.
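
One way to catch a quietly raised limit is to scan the flag file during reviews. The sketch below assumes a gflags-style file with one --flag=value entry per line. The function name and the file you pass to it are hypothetical.

```shell
# Hypothetical check: report whether a gflags-style flag file raises the clock
# error limit above Kudu's default. The file path is supplied by the caller.
KUDU_DEFAULT_MAX_ERROR_USEC=10000000

check_clock_error_flag() {   # usage: check_clock_error_flag <flagfile>
  local value
  value=$(sed -n 's/^--max_clock_sync_error_usec=\([0-9][0-9]*\)[[:space:]]*$/\1/p' "$1" | tail -n 1)
  if [ -z "$value" ]; then
    echo "default"           # flag not set, so the default applies
  elif [ "$value" -gt "$KUDU_DEFAULT_MAX_ERROR_USEC" ]; then
    echo "raised:$value"     # tolerance was increased; fix NTP instead
  else
    echo "ok:$value"
  fi
}
```

A result of raised:<value> is a hint that someone worked around a synchronization problem instead of fixing the NTP setup.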

4. Choosing an NTP Daemon: ntpd vs chrony vs systemd-timesyncd

Since Kudu checks the kernel's clock discipline status, the NTP daemon must properly report time information to the kernel. Not all NTP implementations support this.

NTP Implementation | Kudu Compatible | Notes
--- | --- | ---
ntpd (ntp package) | Compatible | Traditional choice. Works on all OSes
chrony (chronyd) | Compatible | Modern alternative. Converges faster in VM/container environments. rtcsync option required
systemd-timesyncd | Not compatible | Doesn't use kernel discipline API, so Kudu reports "unsynchronized"

Why systemd-timesyncd won't work

Debian/Ubuntu systems have systemd-timesyncd enabled by default. While this service does synchronize time as an SNTP client, it does not report status through kernel discipline APIs like ntp_adjtime() / ntp_gettime(). Since Kudu checks synchronization status through exactly these APIs, systemd-timesyncd alone will not make Kudu recognize "the clock is synchronized."

# Disable systemd-timesyncd and replace with chrony or ntpd
sudo systemctl stop systemd-timesyncd
sudo systemctl disable systemd-timesyncd

5. chrony Configuration (Recommended)

chrony synchronizes faster than ntpd and converges reliably even when the clock is significantly off in VM environments. chrony is recommended for Kudu environments.

5.1 Installation

# Debian / Ubuntu
sudo apt-get install chrony
 
# RHEL / CentOS / Rocky
sudo yum install chrony

5.2 Configuration file (/etc/chrony.conf)

# NTP servers (at least 4 recommended)
# maxpoll 7 caps the polling interval at 2^7 = 128 seconds (chrony's default is 2^10 = 1024 seconds).
# maxpoll is a per-source option, so it goes on each server line rather than on a line of its own.
server time1.google.com iburst maxpoll 7
server time2.google.com iburst maxpoll 7
server time3.google.com iburst maxpoll 7
server time4.google.com iburst maxpoll 7

# For AWS environments, add the following (or use instead of the above)
# server 169.254.169.123 prefer iburst

# For GCE environments
# server metadata.google.internal prefer iburst

# Must be enabled for Kudu compatibility
rtcsync

# Step correction if error exceeds 1 second during the first 3 clock updates
makestep 1.0 3

Key points:

  • rtcsync: This option is required for chrony to activate the kernel's clock discipline, allowing Kudu to read "synchronized" status via ntp_gettime(). Omitting this single line will cause Kudu to not recognize the clock.
  • iburst: Sends a short burst of requests (4 to 8 in chrony) right after the daemon starts, so the first clock update arrives sooner.
  • makestep 1.0 3: For the first 3 updates, if the error exceeds 1 second, it corrects immediately via step instead of slew. Especially useful right after VM snapshot restoration or reboot.
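
The key points above can be turned into a quick lint. This is a minimal sketch with a hypothetical function name. It only checks for the three options discussed here and does not validate the rest of the file.

```shell
# Hypothetical lint for a chrony.conf: reports source lines without iburst and
# missing rtcsync/makestep directives. Prints nothing if all checks pass.
lint_chrony_conf() {   # usage: lint_chrony_conf <path-to-chrony.conf>
  awk '
    /^[ \t]*rtcsync([ \t]|$)/ { rtcsync = 1 }
    /^[ \t]*makestep[ \t]/    { makestep = 1 }
    /^[ \t]*(server|pool)[ \t]/ && !/iburst/ { print "no iburst: " $0 }
    END {
      if (!rtcsync)  print "missing: rtcsync"
      if (!makestep) print "missing: makestep"
    }
  ' "$1"
}
```

Commented-out lines such as "# rtcsync" are deliberately not counted, since chronyd ignores them as well.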

5.3 Start the service

sudo systemctl enable chronyd
sudo systemctl start chronyd

5.4 Verify synchronization

chronyc tracking

Items to check in the output:

Reference ID    : D8EF2300 (time1.google.com)
Stratum         : 2
Ref time (UTC)  : Sun Apr 12 03:22:15 2026
System time     : 0.000023420 seconds fast of NTP time
Last offset     : +0.000012345 seconds
RMS offset      : 0.000025678 seconds
Root delay      : 0.012345678 seconds
Root dispersion : 0.001234567 seconds
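Leap status     : Normal

  • Leap status: Normal means chronyd considers the clock synchronized. With rtcsync enabled, this is the state in which Kudu can start normally. "Not synchronised" means chronyd has no usable source yet
  • System time: the current offset from NTP time. Sub-millisecond is good. This is the measured offset, not the error bound Kudu checks; the bound is closer to root dispersion plus half the root delay

# Check NTP source servers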
chronyc sources -v
 
# Per-source statistics
chronyc sourcestats

6. ntpd Configuration (Alternative)

If chrony isn't an option, ntpd works perfectly fine.

6.1 Installation

# Debian / Ubuntu
sudo apt-get install ntp
 
# RHEL / CentOS 7 (RHEL 8 and later no longer ship the ntp package; use chrony there)
sudo yum install ntp

6.2 Initial time correction

When the clock error is large at startup, ntpd can take minutes to tens of minutes to converge. First correct the time with ntpdate or ntpd -q, then start the daemon.

# First ensure ntpd is stopped (the unit is named ntp on Debian/Ubuntu and ntpd on RHEL/CentOS)
sudo systemctl stop ntp
 
# Immediate time correction
sudo ntpdate -b time.google.com
 
# Start daemon
sudo systemctl enable ntp
sudo systemctl start ntp

6.3 Configuration file (/etc/ntp.conf)

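# maxpoll 7 shortens the maximum polling interval to 2^7 = 128 seconds.
# It is a per-server option, so it goes on each server line.
server time1.google.com iburst maxpoll 7
server time2.google.com iburst maxpoll 7
server time3.google.com iburst maxpoll 7
server time4.google.com iburst maxpoll 7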

6.4 Verify synchronization

# Check kernel clock status
ntptime

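If the first line reads "returns code 0 (OK)" and the status field does not contain UNSYNC (a typical healthy value is (PLL,NANO)), the kernel considers the clock synchronized. If you see UNSYNC or "returns code 5 (ERROR)", synchronization hasn't completed yet.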

# NTP peer status summary
ntpstat
 
# Detailed peer information
ntpq -p
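
For scripted checks, the relevant parts of the ntptime output can be extracted with a few lines of awk. The sketch below uses a hypothetical function name and assumes the usual "maximum error N us" wording in the output.

```shell
# Hypothetical parser for `ntptime` output read from stdin. Prints the kernel's
# maximum error in microseconds, or fails if the status contains UNSYNC or no
# maximum error value is found.
kernel_max_error_us() {
  awk '
    /UNSYNC/ { unsync = 1 }
    /maximum error [0-9]+ us/ && !seen {
      for (i = 1; i < NF; i++)
        if ($i == "maximum" && $(i + 1) == "error") { value = $(i + 2); seen = 1 }
    }
    END {
      if (unsync || !seen) exit 1
      print value
    }
  '
}

# Usage (requires ntptime to be installed):
#   ntptime | kernel_max_error_us
```

The printed value can be compared directly against --max_clock_sync_error_usec, since both are in microseconds.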

6.5 -x flag caution

Some distributions include the -x flag in ntpd startup options. This flag raises the step threshold from the default 128 ms to 600 seconds, so any offset below 600 seconds is corrected by slewing only. Because slewing is limited to roughly 500 ppm, synchronization can take hours when the clock is significantly off. For Kudu nodes, removing -x is recommended.

# Check in /etc/sysconfig/ntpd or /etc/default/ntp
# Remove -x from OPTIONS="-x -u ntp:ntp"
OPTIONS="-u ntp:ntp"
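
To get a feel for why slew-only correction is slow, here is a back-of-the-envelope estimate. It assumes the conventional maximum slew rate of 500 ppm; the function name is hypothetical.

```shell
# Rough estimate of how long a slew-only correction takes, assuming a maximum
# slew rate of 500 ppm (0.5 ms of correction per elapsed second).
slew_duration_seconds() {   # usage: slew_duration_seconds <offset_seconds>
  awk -v offset="$1" 'BEGIN { printf "%.0f\n", offset / 0.0005 }'
}

slew_duration_seconds 10    # prints 20000 (about 5.6 hours for a 10 s offset)
```

A step correction removes the same offset immediately, which is why the -x flag is a poor fit for nodes that must pass Kudu's startup check quickly.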

7. Cloud-Specific NTP Servers

In cloud VMs, using the NTP server provided by the cloud platform minimizes network latency and error.

Cloud | NTP Server | Configuration Example
--- | --- | ---
AWS | 169.254.169.123 | server 169.254.169.123 prefer iburst
GCE | metadata.google.internal | server metadata.google.internal prefer iburst
Azure | time.windows.com | server time.windows.com prefer iburst
On-premises | Internal Stratum 1/2 server or public NTP pool | server time.google.com iburst

The prefer keyword means this server will be used as the primary reference. Since cloud-internal NTP servers are physically close, adding prefer provides stability.

8. Troubleshooting Checklist

When Kudu dies with a clock-related error, check in this order.

8.1 Is the NTP daemon running?

# For chrony
sudo systemctl status chronyd
 
# For ntpd
sudo systemctl status ntp

8.2 Is systemd-timesyncd running instead?

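timedatectl status
systemctl is-active systemd-timesyncd

timedatectl shows whether an NTP service is active and whether the system clock is synchronized, but recent systemd versions do not name the service. If the second command prints active, systemd-timesyncd is in use and a replacement is needed.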

8.3 Is the kernel reporting "synchronized"?

# chrony environment
chronyc tracking | grep "Leap status"
# "Normal" means OK
 
# ntpd environment
ntptime | grep status
# No "UNSYNC" in the status flags (e.g. "(PLL,NANO)") = normal, "UNSYNC" = problem

8.4 Is the estimated error within 10 seconds?

# chrony (the error bound is roughly root dispersion + half the root delay)
chronyc tracking | grep -E "Root (delay|dispersion)"
 
# ntpd
ntptime | grep "maximum error"

If the error exceeds 10 seconds (10,000,000 us), Kudu will reject it. Check NTP server connectivity and network.
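
On chrony hosts without ntptime, the bound can be approximated from chronyc tracking. The sketch below adds the current offset, the root dispersion, and half the root delay. This is an approximation under that assumption, not the exact value chronyd hands to the kernel, and the function name is hypothetical.

```shell
# Hypothetical estimator reading `chronyc tracking` output from stdin.
# Approximates the clock error bound as
#   system time offset + root dispersion + root delay / 2
# and prints it in whole microseconds.
chrony_error_bound_us() {
  awk '
    /^System time/     { offset = $4 }
    /^Root delay/      { delay = $4 }
    /^Root dispersion/ { dispersion = $4 }
    END {
      if (offset == "" || delay == "" || dispersion == "") exit 1
      printf "%.0f\n", (offset + dispersion + delay / 2) * 1000000
    }
  '
}

# Usage (requires chronyc):
#   bound=$(chronyc tracking | chrony_error_bound_us) && [ "$bound" -le 10000000 ]
```

For the sample tracking output shown in section 5.4, this yields about 7431 us, far below the 10,000,000 us limit.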

8.5 Can the NTP servers be reached?

# chrony
chronyc sources
 
# ntpd
ntpq -p

If every source is unreachable (state ? with a Reach value of 0 in chronyc sources, or a reach value of 0 in ntpq -p), check whether UDP port 123 is open in the firewall.
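
The reachability check can be automated by counting sources with a non-zero Reach value. This sketch uses a hypothetical function name and assumes the default column layout of chronyc sources, where Reach is the fifth column.

```shell
# Hypothetical helper reading `chronyc sources` output from stdin and printing
# how many sources have a non-zero Reach register (column 5).
# Source lines start with a mode character (^, = or #) followed by a state
# character, which excludes the header and the "=====" separator line.
count_reachable_sources() {
  awk '$1 ~ /^[=#^][*+?x~-]$/ && $5 != "0" { n++ } END { print n + 0 }'
}

# Usage (requires chronyc):
#   chronyc sources | count_reachable_sources
```

A result of 0 means no server has answered recently, which points to a firewall or DNS problem rather than a slow convergence.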

9. Caution with chrony's Local Reference Mode

In air-gapped environments without access to external NTP servers, chrony may be operated in local reference mode. In this case, do not deploy Kudu on the chrony local reference node itself. In chrony 3.4 and earlier, there's a known issue where the kernel on that node reports "unsynchronized" (visible in ntptime). Even if chronyc tracking shows "Normal," the kernel discipline status may report differently, and the kernel status is what Kudu checks.

Solution:

  • Separate the chrony local reference server as a dedicated node
  • Configure Kudu Master/Tablet Servers to reference that server as a client
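
To spot nodes that should not host Kudu, you can look for the local directive in the chrony configuration. This is a minimal sketch with a hypothetical function name. It only detects an uncommented local line and does not follow include files.

```shell
# Hypothetical check: succeeds if the given chrony.conf enables the `local`
# directive, i.e. the host may serve its own clock as a reference.
uses_local_reference() {   # usage: uses_local_reference <path-to-chrony.conf>
  grep -Eq '^[[:space:]]*local([[:space:]]|$)' "$1"
}

# Usage:
#   if uses_local_reference /etc/chrony.conf; then
#     echo "local reference mode enabled: keep Kudu off this node"
#   fi
```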

10. Pre-Startup Verification Script for Kudu

A simple verification script to run on all Kudu nodes before startup.

#!/usr/bin/env bash
set -euo pipefail
 
echo "=== NTP Synchronization Check ==="
 
# 1. Is systemd-timesyncd disabled?
if systemctl is-active --quiet systemd-timesyncd 2>/dev/null; then
  echo "[FAIL] systemd-timesyncd is active. Disable it and install chrony/ntpd."
  exit 1
fi
echo "[OK] systemd-timesyncd is not active"
 
# 2. Is chrony or ntpd running?
if systemctl is-active --quiet chronyd 2>/dev/null; then
  echo "[OK] chronyd running"
  NTP_TYPE="chrony"
elif systemctl is-active --quiet ntp 2>/dev/null || systemctl is-active --quiet ntpd 2>/dev/null; then
  echo "[OK] ntpd running"
  NTP_TYPE="ntpd"
else
  echo "[FAIL] Neither chrony nor ntpd is running."
  exit 1
fi
 
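# 3. Kernel synchronization status
# Capture the output first. With pipefail, a non-zero exit from ntptime (or a
# SIGPIPE when grep -q exits early) would make an `ntptime | grep -q` pipeline
# evaluate as false and skip the failure branch.
if command -v ntptime &>/dev/null; then
  NTPTIME_OUT=$(ntptime 2>&1 || true)
  if grep -q "UNSYNC" <<<"$NTPTIME_OUT"; then
    echo "[FAIL] Kernel clock is not yet synchronized. Wait for NTP daemon to converge."
    exit 1
  fi
  echo "[OK] Kernel clock synchronized"
else
  echo "[WARN] ntptime not found. Kernel synchronization status was not checked."
fi
 
# 4. Report current offset (chrony)
# `|| true` keeps set -e from aborting the script if chronyc fails.
if [ "$NTP_TYPE" = "chrony" ]; then
  OFFSET=$(chronyc tracking 2>/dev/null | awk '/^System time/ {print $4}' || true)
  echo "[INFO] Current system time offset: ${OFFSET:-unknown} seconds"
fi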
 
echo ""
echo "=== Check complete. Kudu is ready to start. ==="

11. Summary

Question | Answer
--- | ---
Is NTP required for Kudu? | Required. Won't even start without it
Is systemd-timesyncd sufficient? | No. chrony or ntpd is needed
Recommended NTP daemon? | chrony (fast convergence, VM-friendly)
Must-have chrony option? | rtcsync
Default error threshold? | 10 seconds (--max_clock_sync_error_usec=10000000)
Can the error limit be raised? | Not recommended. Weakens consistency guarantees
Cloud NTP servers? | AWS 169.254.169.123, GCE metadata.google.internal

Time synchronization is the most fundamental prerequisite for Kudu operations. If you set up NTP first when initially building the cluster, you'll almost never encounter clock-related failures in subsequent operations. Conversely, ignoring this leads to the most difficult-to-understand forms of data inconsistency.


This post was written based on the Apache Kudu official Troubleshooting documentation. If you need help with NTP configuration or Kudu operations in your cluster environment, feel free to reach out.

— Data Dynamics Engineering Team