13.7 Stall Mechanics


13.7.1 Stall Conditions

 

A PipelineSlot may stall (slot.stalled = true) under the following conditions:

 

Memory barrier pending — needsMemoryBarrier && !memoryBarrierCompleted. Detected in stage_MEM(). The slot holds in MEM until CBox signals completion via MemoryBarrierCoordinator. Applies to MB (global visibility) and EXCB (pending exception resolution).

 

Write buffer drain pending — needsWriteBufferDrain && !writeBufferDrained. Detected in stage_MEM(). The slot holds in MEM until the write buffer confirms all prior stores are committed. Applies to WMB (local write ordering).

 

Multi-cycle FP execution — the grain blocks the slot in stage_EX() until the floating-point operation completes. The slot remains in EX for the duration of the operation; stage_EX() re-evaluates completion on each tick.

 

TRAPB/EXCB serialization — the instruction must wait for all prior instructions that may generate traps or exceptions to complete before proceeding. Enforced through the barrier stall mechanism in stage_MEM().

 

Device backpressure — an MMIO access to a device that cannot accept the operation immediately. The slot stalls in EX until the device signals readiness.
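The conditions above can be read as two predicates, one per detecting stage. The following is a minimal sketch, not the emulator's actual code: the four barrier and drain flag names and slot.stalled come from the text, while PipelineSlotSketch, fpBusy, deviceBusy, Stage, exStallPending(), memStallPending(), and evaluateStall() are hypothetical names introduced for illustration.

```cpp
#include <cassert>

// Hypothetical, reduced view of a pipeline slot. Only the barrier/drain
// flag names and `stalled` are taken from the text; the rest is assumed.
struct PipelineSlotSketch {
    bool valid = false;                  // false means the slot is a bubble
    bool stalled = false;
    bool needsMemoryBarrier = false;     // MB, EXCB; TRAPB uses the same barrier path
    bool memoryBarrierCompleted = false;
    bool needsWriteBufferDrain = false;  // WMB
    bool writeBufferDrained = false;
    bool fpBusy = false;                 // assumed: multi-cycle FP operation still running
    bool deviceBusy = false;             // assumed: MMIO device cannot accept the access yet
};

enum class Stage { EX, MEM };

// Conditions evaluated in EX: multi-cycle FP and device backpressure.
inline bool exStallPending(const PipelineSlotSketch& s) {
    return s.fpBusy || s.deviceBusy;
}

// Conditions evaluated in MEM: memory barrier and write-buffer drain.
inline bool memStallPending(const PipelineSlotSketch& s) {
    const bool barrier = s.needsMemoryBarrier && !s.memoryBarrierCompleted;
    const bool drain = s.needsWriteBufferDrain && !s.writeBufferDrained;
    return barrier || drain;
}

// Re-evaluates the stall condition for the given stage and records it
// in the slot. Intended to be called once per tick by the detecting stage.
inline bool evaluateStall(PipelineSlotSketch& s, Stage stage) {
    assert(s.valid && "a bubble carries no instruction and never stalls");
    s.stalled = (stage == Stage::EX) ? exStallPending(s) : memStallPending(s);
    return s.stalled;
}
```

Because the predicate is recomputed from the flags on every call, the slot unstalls as soon as the owning subsystem flips the completion flag; the pipeline side never clears a condition itself.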

 


 

13.7.2 Stall Detection and Enforcement

 

Stalls are detected in two stages. stage_EX() detects multi-cycle operation stalls and device backpressure; stage_MEM() detects barrier and write-buffer-drain stalls. In both cases, the detecting stage sets slot.stalled = true and returns without advancing the slot.

 

Stall clearing: The stall condition is re-evaluated on every subsequent tick. For barrier stalls, CBox sets memoryBarrierCompleted = true via the MemoryBarrierCoordinator when the barrier is satisfied. For write buffer drains, CBox sets writeBufferDrained = true when all prior stores are flushed. For multi-cycle FP, the FBox signals completion and the grain updates the slot. The clearing agent is always the subsystem that owns the completion condition — never the pipeline itself.

 

Critical detail: When a barrier stall clears in stage_MEM(), the grain does not re-execute. Execution already completed in stage_EX() on a prior tick — the result sits in slot.payLoad and slot.m_pending. The stall only delayed the register writeback (commitPending) and progression to WB. On clearing, stage_MEM() performs commitPending() normally and the slot advances to WB on the next tick.
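The "no re-execution" behavior can be sketched as follows. This is an illustration under stated assumptions rather than the emulator's implementation: MemSlot, pendingValue, executeCount, committed, stageEx(), and stageMem() are hypothetical, and commitPending() borrows only its name from the text (its signature here is assumed).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reduced slot; pendingValue stands in for the result that
// the text says is held in the slot after EX.
struct MemSlot {
    bool valid = true;
    bool stalled = false;
    bool needsMemoryBarrier = false;
    bool memoryBarrierCompleted = false;  // set by the owning subsystem, not the pipeline
    std::uint64_t pendingValue = 0;
    int executeCount = 0;                 // assumed counter: how often the grain ran
    bool committed = false;
};

// Stand-in for the grain's work in EX. Runs once per instruction.
inline void stageEx(MemSlot& s, std::uint64_t result) {
    ++s.executeCount;
    s.pendingValue = result;
}

// Stand-in for commitPending(): writes the held result to the register.
inline void commitPending(MemSlot& s, std::uint64_t& reg) {
    reg = s.pendingValue;
    s.committed = true;
}

// Returns true when the slot may advance to WB on the next tick.
inline bool stageMem(MemSlot& s, std::uint64_t& reg) {
    assert(s.valid);
    if (s.needsMemoryBarrier && !s.memoryBarrierCompleted) {
        s.stalled = true;   // hold in MEM; writeback is delayed, not redone
        return false;
    }
    s.stalled = false;
    if (!s.committed) {
        commitPending(s, reg);  // uses the result computed earlier in EX
    }
    return true;
}
```

Calling stageMem() repeatedly while the barrier is pending leaves the register untouched and never invokes stageEx() again; once memoryBarrierCompleted is set externally, the next call commits the stored value.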

 


 

13.7.3 Stall Propagation

 

Stalls propagate upstream only, toward the earlier stages that hold younger instructions. If stage N stalls, stage N−1 cannot advance, and all earlier stages remain frozen. Later stages (toward WB) continue draining. This guarantees forward progress: older instructions always retire before younger ones can proceed.

 

Bubbles vs stalls: A bubble is an empty slot (slot.valid = false) flowing through the pipeline — it occupies a position but performs no work. A stall is a live slot frozen in place — it contains a valid instruction that cannot advance. Bubbles pass through harmlessly. Stalls block all upstream stages until the stall condition clears.
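The propagation rule and the bubble/stall distinction can be illustrated with a toy stage array. This sketch models only the per-stage behavior described in this subsection (the stalled stage and everything upstream of it freeze while downstream stages drain); it does not model the ring buffer. StageSlot, kStages, Pipe, and advance() are hypothetical names, and the five-stage depth is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical slot: valid == false is a bubble; valid && stalled is a stall.
struct StageSlot {
    bool valid = false;
    bool stalled = false;
    int id = 0;  // assumed tag to follow an instruction through the stages
};

constexpr std::size_t kStages = 5;  // assumed depth; index 0 is earliest, kStages - 1 is WB
using Pipe = std::array<StageSlot, kStages>;

// One tick of slot movement, walking from WB back toward the first stage.
inline void advance(Pipe& p) {
    p[kStages - 1] = StageSlot{};  // the slot in WB retires
    for (std::size_t i = kStages - 1; i-- > 0;) {
        if (p[i].valid && p[i].stalled) {
            return;  // stage i and every upstream stage stay frozen
        }
        assert(!p[i + 1].valid);   // downstream stage was vacated earlier in this walk
        p[i + 1] = p[i];           // move one stage toward WB (bubbles move too)
        p[i] = StageSlot{};        // leave a bubble behind
    }
}
```

With a stall in stage 2, repeated calls to advance() empty stages 3 and 4 into bubbles while stages 0 through 2 keep their contents; clearing the stalled flag lets the whole pipeline move again on the following call.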

 

The stall is checked in execute() after all stages run: if isPipelineStalled() returns true, the ring buffer does not advance (advanceRing() is skipped), and BoxResult::stallPipeline() is returned to AlphaCPU. The stalled stage will be re-evaluated on the next tick.
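The control flow of this check might look like the sketch below. Only execute(), isPipelineStalled(), advanceRing(), and BoxResult::stallPipeline() are names from the text; the BoxResult layout, BoxResult::advanced(), RingSlot, conditionPending, kRingSize, PipelineSketch, slotAt(), and ringAdvances() are assumptions made to keep the example self-contained.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the real BoxResult type.
struct BoxResult {
    bool stall = false;
    static BoxResult stallPipeline() { BoxResult r; r.stall = true; return r; }
    static BoxResult advanced() { return BoxResult(); }  // assumed name
};

struct RingSlot {
    bool valid = false;
    bool stalled = false;
    bool conditionPending = false;  // assumed: cleared by the owning subsystem
};

constexpr std::size_t kRingSize = 6;  // assumed ring depth

class PipelineSketch {
public:
    BoxResult execute() {
        runStages();                   // every stage runs first and may set stalled
        if (isPipelineStalled()) {
            return BoxResult::stallPipeline();  // ring is left where it is
        }
        advanceRing();
        return BoxResult::advanced();
    }

    bool isPipelineStalled() const {
        for (const RingSlot& s : slots_) {
            if (s.valid && s.stalled) return true;
        }
        return false;
    }

    RingSlot& slotAt(std::size_t stage) {
        assert(stage < kRingSize);
        return slots_[(head_ + stage) % kRingSize];
    }

    std::size_t ringAdvances() const { return advances_; }

private:
    // Each stage re-evaluates its slot's stall condition on every tick.
    void runStages() {
        for (RingSlot& s : slots_) s.stalled = s.valid && s.conditionPending;
    }
    void advanceRing() {
        head_ = (head_ + 1) % kRingSize;
        ++advances_;
    }

    std::array<RingSlot, kRingSize> slots_{};
    std::size_t head_ = 0;
    std::size_t advances_ = 0;
};
```

A caller in the role of AlphaCPU would simply invoke execute() again on the next tick when it receives a stall result; the ring position only changes once the pending condition has been cleared from outside the pipeline.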

 

See Also: Section 6.5 – Pipeline Behavior (barrier stalls); Chapter 15 – Memory System Implementation Details (MemoryBarrierCoordinator).