Further analysis clarifies some of the actual bugs in dhclient/dhcp code. The ‘infinite lease’ problem only happens on 64-bit hosts due to the size of time_t. Plus the problem isn’t just an ‘infinite’ lease issue, it’s a ‘infinite lease minus seconds since the epoch’ problem due to how the dhclient code internally handles timeout calculation. Remember, an ‘infinite’ lease is 0xFFFFFFFF, or UINT_MAX.
On IA-32, where time_t is 32-bit:
- after the lease is ACK-ed, some code adds the lease seconds to gettimeofday() seconds to determine the lease expiry time, which of course is now wrapped
- this same code then checks if the lease expiry is less than gettimeofday() seconds (which of course it is, since it just wrapped around to gettimeofday() – 1) and helpfully resets the lease expiry to INT_MAX
- checks in isc_time_nowplusinterval() which guard against 32-bit overflow (by checking expiry seconds plus gettimeofday() against UINT_MAX) succeed and everything works
On x64, where time_t is 64-bit:
- adding gettimeofday() seconds to the lease seconds does not wrap, thus the lease expiry is larger than UINT_MAX
- since the value did not wrap, the checks for negative lease expiry do not trigger, and the expiry is still UINT_MAX + gettimeofday()
- the overflow checks in isc_time_nowplusinterval() trigger because our values are already larger than UINT_MAX, and the lease fails
The core problem is that most of the dhcp/dhclient code simply doesn’t deal with large time values that may occur on 64-bit platforms. A few suggestions for dhclient:
- Change the TIME define to ‘u_int64_t’ instead of ‘time_t’; that makes the relevant client_lease structure fields large enough to survive a wrap
- Change the fields of struct isc_time_t to ‘u_int64_t’ instead of ‘unsigned int’; maybe even change ‘struct isc_interval_t’ to use 64-bit values just to ensure wrapping doesn’t happen there too
- Stop using ‘struct timeval’; instead use the fixed ‘struct isc_time_t’
- Wrap gettimeofday() in a utility function that returns a ‘struct isc_time_t’ instead of ‘struct timeval’
- Then you don’t need the 32-bit checks in isc_time_nowplusinterval() or at least you could change them to be 64-bit safe instead.
For Fedora, fixed test packages are available here. Not sure about other distros.
May people invite you to all the conferences I will attend in the future.
Also, may someone develop a DHCP test suite.