Why No HTM Forward-Progress Guarantees?

Why are there no forward-progress guarantees for hardware transactional memory (HTM)?

Only the HTM vendors know for sure, but here are a few possible reasons:

  1. HTM transactions cannot tolerate non-idempotent operations such as I/O and system calls, so any forward-progress guarantees will apply only to transactions confined to normal memory accesses.
  2. HTM transactions whose data falls into the same cache/TLB set and whose size exceeds the cache's or TLB's associativity (plus the size of any corresponding victim cache) cannot possibly succeed, which eliminates any possibility of a forward-progress guarantee for these unfortunate transactions. Any forward-progress guarantee must therefore be conditioned on the transaction's memory footprint relative to the smaller of the worst-case cache and TLB capacities. (One way software can detect such capacity-induced aborts is shown in the sketch following this list.)
  3. It is possible that some HTM implementations will be subject to other hardware resource limits, for example, store-buffer size.
  4. The fact that interrupts and exceptions on a given CPU will abort transactions running on that CPU means that a transaction cannot commit if its execution time exceeds the interval between interrupts. Any forward-progress guarantees must therefore be conditioned on the transaction's duration. (Sorry, no successful transactions containing infinite loops!)
  5. A production-quality system must handle a very large number of arbitrary and mutually conflicting concurrent transactions of varying shapes, sizes, and numbers of prior attempts. The difficulty inherent in efficiently resolving the resulting conflicts becomes apparent when you consider the large number of schemes that have been proposed. Perhaps current HTM implementations refrain from providing forward-progress guarantees because it is extremely hard to do so in the general case in a production-quality setting.
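
To make items 2 and 3 concrete, the following minimal sketch uses Intel's RTM intrinsics (one common HTM interface) to distinguish capacity-induced aborts, which retrying cannot cure, from transient conflict aborts. The helper name try_transaction() and the retry limit are illustrative assumptions rather than part of any standard API.

    /*
     * Minimal sketch using Intel RTM intrinsics; compile with -mrtm.
     * The helper name and retry limit are illustrative, not part of
     * any standard API.
     */
    #include <immintrin.h>
    #include <stdbool.h>

    #define MAX_RETRIES 8

    static bool try_transaction(void (*update)(void *), void *arg)
    {
            int i;

            for (i = 0; i < MAX_RETRIES; i++) {
                    unsigned int status = _xbegin();

                    if (status == _XBEGIN_STARTED) {
                            update(arg);            /* Speculative update. */
                            _xend();                /* Attempt to commit. */
                            return true;
                    }

                    /*
                     * A capacity abort means the footprint exceeded a
                     * hardware limit (items 2 and 3 above), so retrying
                     * the same transaction cannot help.
                     */
                    if (status & _XABORT_CAPACITY)
                            break;

                    /* Other aborts, e.g., _XABORT_CONFLICT, may be transient. */
            }
            return false;   /* Caller must fall back to some other mechanism. */
    }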

Until HTM implementations provide some sort of forward-progress guarantee, HTM will be limited by its fallback code sequences. For example, if the fallback code uses locking, then HTM will be just as vulnerable to deadlock as is the fallback code. Some speculate that HTM transactions will frequently succeed, and that the fallback code may therefore use simpler and less-scalable locking designs that are less prone to deadlock. However, this speculation requires demonstration, especially under overload conditions.
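
As a rough illustration of such a fallback, the sketch below pairs an RTM transaction with a pthread mutex in the usual lock-elision style; the names fallback_lock, fallback_held, and critical_section_update() are illustrative assumptions. The transaction reads the fallback-held flag so that any thread actually running the fallback path conflicts with, and thus aborts, concurrent transactions.

    /*
     * Sketch of a lock-elision-style fallback using Intel RTM intrinsics
     * and a pthread mutex; compile with -mrtm -pthread.  All names are
     * illustrative.  Because the fallback is ordinary locking, the
     * combined scheme inherits the locking design's deadlock properties.
     */
    #include <immintrin.h>
    #include <pthread.h>

    static pthread_mutex_t fallback_lock = PTHREAD_MUTEX_INITIALIZER;
    static volatile int fallback_held;      /* Nonzero while fallback runs. */
    static long shared_counter;

    static void critical_section_update(void)
    {
            shared_counter++;               /* Stand-in for the real update. */
    }

    static void do_update(void)
    {
            if (_xbegin() == _XBEGIN_STARTED) {
                    if (fallback_held)      /* Subscribe to the lock state... */
                            _xabort(0xff);  /* ...and abort if it is held. */
                    critical_section_update();
                    _xend();
                    return;
            }

            /* No forward-progress guarantee, so fall back to the lock. */
            pthread_mutex_lock(&fallback_lock);
            fallback_held = 1;
            critical_section_update();
            fallback_held = 0;
            pthread_mutex_unlock(&fallback_lock);
    }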

At the same time, it is important to note that even highly constrained forward-progress guarantees could be quite useful. For example, atomically pushing onto or popping from a stack or queue requires only a few cache lines, as illustrated by the sketch below. Furthermore, if an HTM update is nested within an RCU or hazard-pointer read-side critical section, small changes to large data structures can be handled by HTM implementations offering only small size guarantees. Such guarantees could greatly ease validation of HTM software, because it would no longer be necessary to separately validate fallback code. Of course, the larger the guaranteed size, the better, but even small guarantees can be highly useful.
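
For instance, a stack push needs to touch only the head pointer and the new node, as in the following sketch. It omits the fallback path entirely, which would be legitimate only given a (currently hypothetical) small-footprint forward-progress guarantee; on today's hardware a fallback would still be required.

    /*
     * Sketch of an HTM stack push touching only two cache lines: the
     * head pointer and the new node.  Omitting the fallback path is
     * legitimate only under a (currently hypothetical) small-footprint
     * forward-progress guarantee.
     */
    #include <immintrin.h>

    struct node {
            struct node *next;
            int data;
    };

    static struct node *top;

    static void stack_push(struct node *newnode)
    {
            for (;;) {
                    if (_xbegin() == _XBEGIN_STARTED) {
                            newnode->next = top;    /* First cache line: the node. */
                            top = newnode;          /* Second cache line: the head. */
                            _xend();
                            return;
                    }
                    /* Unconditional retry is safe only given the guarantee above. */
            }
    }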

As always, infeasible perfection should be spurned in favor of achievable goodness.