Appendix J.4 - SRM Firmware Initialization and PAL Exception Dispatch



This section describes the SRM firmware initialization sequence, the PAL exception dispatch architecture, and the vector computation contracts that govern all PAL mode transitions.

 


 

J.4 SRM Boot Overview

 

The SRM firmware image (ES40_V6_2.EXE) is a compressed binary that self-decompresses into RAM before transferring control to the console. Boot proceeds in two phases.

 

Decompression phase. The ROM is loaded into guest memory at its physical base. CPU0 begins execution at PC=0x5C0 in PAL mode. The decompressor runs for approximately 5,767,331 cycles, expands the firmware image into RAM, then transfers control to the console at the boot handoff PC with PAL_BASE set to 0x600000.

 

Console phase. The SRM console executes from the decompressed image, constructs the HWRPB at PA=0x2000, initializes per-CPU state, and provides the boot environment for the OS loader.

 

Verified boot handoff values for ES40_V6_2.EXE:

 

finalPC = 0x00000000000005C0

finalPalBase = 0x0000000000600000

cyclesExecuted = 5,767,331

romHash = 0x9689831940DA2165 (FNV-1a over payload)

snapshotChecksum = 0xEF23EC3A5607179F (FNV-1a over snapshot file)
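Both hashes above are described as FNV-1a. A minimal sketch of a 64-bit FNV-1a routine follows, assuming the standard published offset basis and prime; the emulator's actual constants and function name are not shown in this section, so fnv1a64 and the two constant names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Standard 64-bit FNV-1a parameters (assumed, not confirmed by this section).
constexpr std::uint64_t kFnvOffsetBasis = 0xcbf29ce484222325ULL;
constexpr std::uint64_t kFnvPrime       = 0x00000100000001b3ULL;

// Hash a byte range: XOR each byte into the state, then multiply by the prime.
inline std::uint64_t fnv1a64(const std::uint8_t* data, std::size_t size)
{
    std::uint64_t hash = kFnvOffsetBasis;
    for (std::size_t i = 0; i < size; ++i) {
        hash ^= data[i];
        hash *= kFnvPrime;   // wraps modulo 2^64 by design
    }
    return hash;
}
```

Under these assumptions, romHash would be the result of hashing the ROM payload bytes, and the footer checksum the result of hashing every snapshot byte that precedes it.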

 


 

J.4.1 Snapshot System

 

The full decompression phase takes approximately 14 minutes in a debug build. The snapshot system captures complete machine state at boot handoff and reloads it in approximately 66 milliseconds, eliminating the decompression penalty on every subsequent boot.

 

The snapshot file format (.axpsnap) uses the following layout. Field order is the serialization order in SrmRomLoader::saveSnapshot() and must be read in exactly the same order by SrmRomLoader::loadSnapshot():

 

+------------------------------------------------------------------+

| HEADER                                                           |

+------------------------------------------------------------------+

| magic            quint64   0x415850534E415001  "AXPSNAP"+0x01    |

| version          quint32   1                                     |

| romHash          quint64   FNV-1a hash of ROM payload            |

| finalPC          quint64   Boot handoff PC (PAL bit included)    |

| finalPalBase     quint64   PAL_BASE at handoff                   |

| cyclesExecuted   quint64   Total decompressor cycles             |

| elapsedMs        double    Wall-clock decompression time         |

+------------------------------------------------------------------+

| MEMORY REGIONS                                                   |

+------------------------------------------------------------------+

| regionCount      quint32   Number of regions                     |

| regions[N]                                                       |

|   basePA         quint64   Physical base address                 |

|   size           quint64   Region size in bytes                  |

|   data           byte[]    Raw memory bytes                      |

+------------------------------------------------------------------+

| INTEGER REGISTERS                                                |

+------------------------------------------------------------------+

| intRegs[32]      quint64   R0-R31 at boot handoff                |

+------------------------------------------------------------------+

| FLOATING-POINT REGISTERS                                         |

+------------------------------------------------------------------+

| fpRegs[32]       quint64   F0-F31 at boot handoff                |

+------------------------------------------------------------------+

| INTERNAL PROCESSOR REGISTERS (IPRs)                              |

+------------------------------------------------------------------+

| iprCount         quint32   Number of IPR entries                 |

| iprEntries[N]                                                    |

|   id             quint32   IPR identifier                        |

|   value          quint64   IPR value                             |

+------------------------------------------------------------------+

| FOOTER                                                           |

+------------------------------------------------------------------+

| checksum         quint64   FNV-1a over all preceding bytes       |

+------------------------------------------------------------------+

 

 

Format note. The header contains exactly seven fields: magic, version, romHash, finalPC, finalPalBase, cyclesExecuted, elapsedMs. There is no emulatorBuild field in the current implementation. saveSnapshot() and loadSnapshot() must remain in strict field-count parity. Any addition of an emulatorBuild field requires a version bump (kSnapshotVersion) and rejection of all v1 snapshots at load time.
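The field-parity rule can be illustrated with a small round-trip sketch. It is not the SrmRomLoader implementation: SnapshotHeader, put(), get(), saveHeader() and loadHeader() are hypothetical names, and host byte order is assumed because this section does not state the on-disk byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

constexpr std::uint64_t kMagic           = 0x415850534E415001ULL;
constexpr std::uint32_t kSnapshotVersion = 1;

// Hypothetical mirror of the seven v1 header fields, in serialization order.
struct SnapshotHeader {
    std::uint64_t magic;
    std::uint32_t version;
    std::uint64_t romHash;
    std::uint64_t finalPC;
    std::uint64_t finalPalBase;
    std::uint64_t cyclesExecuted;
    double        elapsedMs;
};

// Append one fixed-width value in host byte order (an assumption).
template <typename T>
void put(std::vector<std::uint8_t>& out, const T& v)
{
    const auto* p = reinterpret_cast<const std::uint8_t*>(&v);
    out.insert(out.end(), p, p + sizeof(T));
}

// Extract one fixed-width value; fails if the buffer is too short.
template <typename T>
bool get(const std::vector<std::uint8_t>& in, std::size_t& pos, T& v)
{
    if (in.size() - pos < sizeof(T)) return false;
    std::memcpy(&v, in.data() + pos, sizeof(T));
    pos += sizeof(T);
    return true;
}

std::vector<std::uint8_t> saveHeader(const SnapshotHeader& h)
{
    std::vector<std::uint8_t> out;
    put(out, h.magic);        put(out, h.version);      put(out, h.romHash);
    put(out, h.finalPC);      put(out, h.finalPalBase); put(out, h.cyclesExecuted);
    put(out, h.elapsedMs);
    return out;
}

// Reads the same seven fields in the same order; rejects other versions.
std::optional<SnapshotHeader> loadHeader(const std::vector<std::uint8_t>& in)
{
    SnapshotHeader h{};
    std::size_t pos = 0;
    const bool ok = get(in, pos, h.magic) && get(in, pos, h.version)
                 && get(in, pos, h.romHash) && get(in, pos, h.finalPC)
                 && get(in, pos, h.finalPalBase) && get(in, pos, h.cyclesExecuted)
                 && get(in, pos, h.elapsedMs);
    if (!ok || h.magic != kMagic || h.version != kSnapshotVersion)
        return std::nullopt;
    return h;
}
```

Writing the fields one at a time, rather than copying the struct, keeps compiler padding out of the file. In this sketch the seven fields occupy 52 bytes.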

 

On every reload the ROM payload hash is recomputed and compared against the stored romHash. A mismatch rejects the snapshot unconditionally, deletes the stale file, and reruns decompression. The emulator is fully self-healing -- no manual snapshot management is ever required.

 


 

J.4.2 PAL Mode Architecture

 

PAL mode is signaled by PC bit 0 being set. PAL mode entry, exit, and vector dispatch are owned exclusively by PalBox (PalBoxBase.h). PalService (Pal_Service.h) contains only CALL_PAL instruction semantic bodies.

 

PalBox owns:      PAL mode entry, exit, vector dispatch, enterPal(),
                  enterPalCore(), executeREI(), shadow register management
PalService owns:  CALL_PAL instruction bodies (CALLSYS, BPT, SWPPAL, ...)
CPUStateView:     all PAL mode mutation accessors (single source of truth)

 

⚠ ARCHITECTURAL INVARIANT -- PAL MODE TRUTH

PAL mode is defined as h->pc & 1.

No software cache of PAL mode state is maintained anywhere in the system. All callers read PAL mode state exclusively via m_iprGlobalMaster->isInPalMode(), which delegates directly to h->inPalMode(), which returns h->pc & 1.

Any code that tests, sets, or clears PAL mode by any mechanism other than reading or writing PC bit 0 through the CPUStateView accessors is architecturally incorrect and must be removed. This invariant is the foundation of snapshot integrity, exception dispatch correctness, and pipeline flush semantics.

 


 

J.4.3 CPUStateView PAL Mode Accessors

 

All PAL mode mutations go through CPUStateView (global_RegisterMaster_hot.h). No caller reads or writes PC bit 0 directly.

 

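+-----------------------+------------------------------------------+---------------------------------+
| Function              | Effect                                   | Call Sites                      |
+-----------------------+------------------------------------------+---------------------------------+
| isInPalMode()         | Read h->pc & 1                           | All PAL mode guards             |
| enterPalMode(entryPC) | Set new PC and assert PAL bit atomically | enterPalCore() only             |
| exitPalMode(returnPC) | Set new PC and clear PAL bit             | executeREI(), commitPalResult() |
| resetPalMode()        | PC = 0x8001 (cold reset only)            | CPU cold reset path only        |
| setPalMode(bool)      | Set or clear PAL bit on current PC       | Legacy -- to be retired         |
+-----------------------+------------------------------------------+---------------------------------+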

 

Critical distinction. enterPalMode(entryPC) sets the destination PC and the PAL bit atomically. advancePC() preserves the current PAL bit and must never be used for PAL entry -- it would fail to assert PAL mode when entering from a non-PAL context. This was the root cause of one of the three exception dispatch bugs listed in J.4.14.
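The difference between the accessors can be sketched on a toy state object. HotState is a hypothetical stand-in that models only the PC. The method names follow the accessor descriptions in this section, but the bodies are assumptions inferred from those descriptions, not the CPUStateView source.

```cpp
#include <cstdint>

// Hypothetical stand-in for the hot register block; only the PC is modeled.
struct HotState {
    std::uint64_t pc = 0;

    // PAL mode is PC bit 0; there is no cached copy.
    bool inPalMode() const { return (pc & 1ULL) != 0; }

    // Set the destination PC and assert the PAL bit in a single store.
    void enterPalMode(std::uint64_t entryPC) { pc = entryPC | 1ULL; }

    // Set the return PC and clear the PAL bit.
    void exitPalMode(std::uint64_t returnPC) { pc = returnPC & ~1ULL; }

    // Move to a new PC while preserving whatever PAL bit is currently set.
    void advancePC(std::uint64_t newPC) { pc = (newPC & ~1ULL) | (pc & 1ULL); }
};
```

Calling advancePC() from a non-PAL context leaves bit 0 clear, so the CPU would arrive at the PAL entry address without being in PAL mode. That is why only enterPalMode() is valid for PAL entry.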

 


 

J.4.4 Vector Computation

 

All PAL entry addresses are computed by CPUStateView. No caller computes PAL entry addresses independently. Two functions cover all cases.

 

computeExceptionVector(PalVectorId_EV6) -- Fault and exception entry:

 

entryPC = (PAL_BASE & ~0x7FFF) | (vecId & 0x7FFE) | 0x1

 

computeCallPalEntry(quint32 func) -- CALL_PAL dispatch. Illegal function codes (0x40--0x7F, >0xBF) redirect to the OPCDEC vector. Encoding for legal functions:

 

entryPC = (PAL_BASE & ~0x7FFF) | (1 << 13) | (func[7] << 12)
        | (func[5:0] << 6) | 0x1

 

Bit 13 distinguishes CALL_PAL entries from exception vectors. PC bit 0 is always set on entry.
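The two formulas translate directly into constant expressions. The sketch below takes PAL_BASE as an explicit parameter, which is an assumption; the real functions live on CPUStateView and presumably read it from state. The OPCDEC offset 0x0480 is the value listed in J.4.10, and isLegalCallPal() is a hypothetical helper.

```cpp
#include <cstdint>

constexpr std::uint64_t kPalBaseMask  = ~0x7FFFULL;
constexpr std::uint64_t kOpcdecOffset = 0x0480;   // value from the J.4.10 table

// Fault and exception entry: PAL_BASE page | vector offset | PAL bit.
constexpr std::uint64_t computeExceptionVector(std::uint64_t palBase,
                                               std::uint64_t vecId)
{
    return (palBase & kPalBaseMask) | (vecId & 0x7FFEULL) | 0x1ULL;
}

// Legal CALL_PAL function codes are 0x00-0x3F and 0x80-0xBF.
constexpr bool isLegalCallPal(std::uint32_t func)
{
    return func <= 0x3F || (func >= 0x80 && func <= 0xBF);
}

// CALL_PAL entry: bit 13 set, func[7] in bit 12, func[5:0] in bits 11:6.
constexpr std::uint64_t computeCallPalEntry(std::uint64_t palBase,
                                            std::uint32_t func)
{
    if (!isLegalCallPal(func))
        return computeExceptionVector(palBase, kOpcdecOffset);
    const std::uint64_t bit7 = (func >> 7) & 0x1u;
    const std::uint64_t low6 = func & 0x3Fu;
    return (palBase & kPalBaseMask) | (1ULL << 13) | (bit7 << 12)
         | (low6 << 6) | 0x1ULL;
}
```

With PAL_BASE = 0x600000, function 0x83 yields 0x6030C1 and function 0x00 yields 0x602001. An illegal code such as 0x40 falls through to the OPCDEC entry at 0x600481.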

 


 

J.4.5 PAL Entry Contract

 

All PAL entry flows converge at PalBox::enterPalCore(). Two public overloads exist.

 

enterPal(TrapCode_Class, faultPC) -- Overload A. Maps TrapCode_Class to PalVectorId_EV6 via mapTrapToPalVector(), then calls computeExceptionVector(). faultPC is the address of the faulting instruction (HW_REI retries it).

 

enterPal(PalEntryReason, vectorOrSelector, faultPC) -- Overload B. Routes through computeCallPalEntry() for CALL_PAL_INSTRUCTION reason, or computeExceptionVector() for explicit vector dispatch. faultPC is pc+4 (return address after the CALL_PAL instruction).

 

enterPalCore() executes the following steps in order -- no step may be reordered:

 

1. Guard: entryPC == 0 or sentinel value -> escalate to MCHK

2. saveContext() -- snapshot before any state change

3. exc_addr = faultPC -- save return / retry address

4. enterPalMode(entryPC) -- set PC and assert PAL bit atomically

5. setIPL_Unsynced(7) -- mask all interrupts

6. setCM(CM_KERNEL) -- force kernel mode

7. shadowRegsActive = true -- activate PAL shadow registers

8. return BoxResult().flushPipeline()

 

saveContext() must be called before any architectural state is modified. This is the invariant that makes snapshot and recovery reliable.
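The ordering contract can be sketched on a reduced state struct. Everything here is illustrative: CpuState, PalEntryResult and EntryStatus are hypothetical types. Kernel mode is assumed to be encoded as 0, and the guard checks only entryPC == 0 because the sentinel value is not given in this section.

```cpp
#include <cstdint>

enum class EntryStatus { Entered, EscalatedToMchk };

// Hypothetical reduced architectural state.
struct CpuState {
    std::uint64_t pc = 0;
    std::uint64_t excAddr = 0;
    int  ipl = 0;
    int  cm = 3;                    // assumed encoding: 0 = kernel
    bool shadowRegsActive = false;
};

struct PalEntryResult {
    EntryStatus status;
    bool        flushPipeline;
    CpuState    savedContext;       // stands in for saveContext()
};

PalEntryResult enterPalCore(CpuState& s, std::uint64_t entryPC,
                            std::uint64_t faultPC)
{
    // 1. Guard: a null entry PC is reported as an MCHK escalation
    //    (the sentinel check is omitted; its value is not documented here).
    if (entryPC == 0)
        return { EntryStatus::EscalatedToMchk, false, s };

    const CpuState saved = s;       // 2. snapshot before any state change
    s.excAddr = faultPC;            // 3. return / retry address
    s.pc = entryPC | 1ULL;          // 4. set PC and assert PAL bit together
    s.ipl = 7;                      // 5. mask all interrupts
    s.cm = 0;                       // 6. force kernel mode
    s.shadowRegsActive = true;      // 7. activate PAL shadow registers
    return { EntryStatus::Entered, true, saved };   // 8. request flush
}
```

Because the copy in step 2 is taken before steps 3 through 7, the saved context still holds the pre-entry PC, IPL and mode. Taking the copy later would capture a state that is already partly in PAL mode.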

 


 

J.4.6 PAL Exit Contract

 

Three exit mechanisms exist, each with distinct return address semantics.

 

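+--------------------+-----------------------------+---------------+-------------------------------------+
| Instruction        | Return Address Source       | PAL Bit After | Notes                               |
+--------------------+-----------------------------+---------------+-------------------------------------+
| HW_REI (HW_RET)    | Rb register (PC <- Rb)      | Cleared       | EV6 opcode 0x1E. NOT exc_addr.      |
| RETSYS / RTI / RFE | exc_addr                    | Cleared       | CALL_PAL return path                |
| HALT               | Current PC (stays in place) | Preserved     | kHalt flag in PalResult.sideEffects |
+--------------------+-----------------------------+---------------+-------------------------------------+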

 

HW_REI uses Rb, not exc_addr. The EV6 HW_RET instruction reads the return address from the Rb register field of the instruction encoding. exc_addr holds the fault PC for fault retry. SRM uses HW_REI as a general-purpose PC manipulation instruction during initialization, so exc_addr=0 at early HW_REI instructions is expected and correct. See J.4.12 sequence 63.

 


 

J.4.7 commitPalResult() PC Dispatch

 

After a CALL_PAL handler returns a PalResult, commitPalResult() applies the PC change via the following three-way dispatch on PipelineEffect flags:

 

if pr.pcModified:

 if pr.has(kHalt): advancePC(pr.newPC) -- HALT: preserve PAL mode

 elif pr.has(kResetEntry): enterPalMode(pr.newPC) -- re-entry: assert PAL bit

 else: exitPalMode(pr.newPC) -- normal return: clear PAL bit

 

The kResetEntry flag is set by executeRESTART(). The restart vector is computed via computeExceptionVector(PalVectorId_EV6::RESET). The raw pal_base value must not be used as a PC directly -- its low 15 bits are not guaranteed to be zero and the PAL mode bit is absent.

 

Prior bug note. An earlier version of this function called exitPalMode() unconditionally for all pcModified cases. This caused incorrect PAL bit behavior on RESTART and HALT paths. The three-way dispatch above is the corrected form.
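The three-way dispatch described in this section can be written as compilable C++ roughly as follows. The flag bit positions, the PcState type and the function name commitPc() are assumptions for illustration; only the branch order and the accessor chosen in each branch come from the text.

```cpp
#include <cstdint>

// Hypothetical flag encoding; the real bit positions are not given here.
enum SideEffect : std::uint32_t {
    kNone       = 0,
    kHalt       = 1u << 0,
    kResetEntry = 1u << 1
};

struct PalResult {
    bool          pcModified  = false;
    std::uint64_t newPC       = 0;
    std::uint32_t sideEffects = kNone;
    bool has(SideEffect f) const { return (sideEffects & f) != 0; }
};

// Minimal PC model with the three accessors used by the dispatch.
struct PcState {
    std::uint64_t pc = 0;
    void enterPalMode(std::uint64_t p) { pc = p | 1ULL; }
    void exitPalMode(std::uint64_t p)  { pc = p & ~1ULL; }
    void advancePC(std::uint64_t p)    { pc = (p & ~1ULL) | (pc & 1ULL); }
};

void commitPc(PcState& s, const PalResult& pr)
{
    if (!pr.pcModified) return;
    if (pr.has(kHalt))            s.advancePC(pr.newPC);    // keep PAL bit
    else if (pr.has(kResetEntry)) s.enterPalMode(pr.newPC); // assert PAL bit
    else                          s.exitPalMode(pr.newPC);  // clear PAL bit
}
```

Testing kHalt first means a result carrying both flags is treated as a halt. That matches the order shown in the dispatch, but whether both flags can be set together is not stated here.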

 


 

J.4.8 CALL_PAL Grain Dispatch Chain

 

IBox fetch: decode CALL_PAL instruction
  grain->execute(slot)
    slot.m_palBox->executeXXX(slot)              -- PalBox grain entry
      enterPal(PalEntryReason, func, faultPC)    -- PAL mode entry
        computeCallPalEntry(func)                -- vector address
        enterPalCore(reason, entryPC, faultPC)
          saveContext()
          exc_addr = faultPC
          enterPalMode(entryPC)
          setIPL(7), setCM(KERNEL), shadowRegsActive = true
      m_palService->execute(fn, slot, result)    -- OSF/1 semantics
      commitPalResult(slot, result)              -- apply side effects

 

PalService never computes vectors, sets PAL mode, or writes exc_addr. By the time PalService::execute() is called, PAL mode is already active and architectural state is saved.

 


 

J.4.9 Exception Dispatch Chain

 

grain->execute(slot) detects fault
  BoxResult returned: trapCode, faultPending=true, faultPC
runOneInstruction() checks BoxResult
  m_pBox->enterPal(trapCode, faultPC)            -- Overload A
    mapTrapToPalVector(trapCode)                 -- TrapCode -> PalVectorId_EV6
    computeExceptionVector(vecId)                -- PalVectorId_EV6 -> entry PC
    enterPalCore(EXCEPTION, entryPC, faultPC)
      saveContext()
      exc_addr = faultPC                         -- retry address for HW_REI
      enterPalMode(entryPC)                      -- PC set, PAL bit asserted
      setIPL(7), setCM(KERNEL), shadowRegsActive = true
      return BoxResult().flushPipeline()

 

IBox fault routing. When IBox fetches an instruction with no matching grain (grain == nullptr), it populates the pipeline slot fault fields directly: trapCode=ILLEGAL_INSTRUCTION, faultPending=true, faultVA=pc. The FetchStats.illegalInstructions counter is incremented. This fault then flows through the normal BoxResult dispatch path above.
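A minimal sketch of that routing step is shown below. PipelineSlot, FetchStats and routeMissingGrain() are hypothetical reductions of the real types; only the three slot fields and the counter named in the paragraph above are modeled.

```cpp
#include <cstdint>

enum class TrapCode { None, IllegalInstruction };

struct Grain;   // opaque here; only null / non-null matters

// Hypothetical reduction of the pipeline slot fault fields.
struct PipelineSlot {
    TrapCode      trapCode     = TrapCode::None;
    bool          faultPending = false;
    std::uint64_t faultVA      = 0;
};

struct FetchStats {
    std::uint64_t illegalInstructions = 0;
};

// Returns true when the fetch was converted into an ILLEGAL_INSTRUCTION fault.
bool routeMissingGrain(const Grain* grain, std::uint64_t pc,
                       PipelineSlot& slot, FetchStats& stats)
{
    if (grain != nullptr) return false;   // decodable: nothing to do
    slot.trapCode     = TrapCode::IllegalInstruction;
    slot.faultPending = true;
    slot.faultVA      = pc;
    ++stats.illegalInstructions;
    return true;
}
```

The function only marks the slot and never enters PAL mode itself. Entry is left to the BoxResult dispatch path, so there is still a single place where PAL mode is entered.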

 

Prior bug note. An earlier version routed through enterPALVector() which used the static PalVectorTable instead of computeExceptionVector(). This produced incorrect entry PCs for any exception whose vector offset differed between the static table and the architectural formula. The dispatch chain above is the corrected form.

 


 

J.4.10 mapTrapToPalVector

 

mapTrapToPalVector() in Pal_core_inl.h is the mapping table between TrapCode_Class values and PalVectorId_EV6 offsets. An incorrect entry here causes a wrong vector address regardless of the dispatch mechanism's correctness. The table must agree with the EV6 PAL vector table. Key entries for SRM testing:

 

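+---------------------+-----------------+-----------------------------+
| TrapCode_Class      | PalVectorId_EV6 | Vector Offset               |
+---------------------+-----------------+-----------------------------+
| ILLEGAL_INSTRUCTION | OPCDEC          | 0x0480                      |
| ARITHMETIC          | ARITH           | 0x0500                      |
| MEMORY_FAULT        | MM_FAULT        | 0x0300 (ITB) / 0x0400 (DTB) |
| BREAKPOINT          | BPT             | 0x0080 (CALL_PAL 0x80)      |
| UNALIGNED           | UNA             | 0x0600                      |
| PRIVILEGED          | OPCDEC          | 0x0480                      |
| MACHINE_CHECK       | MCHK            | 0x0660                      |
| INTERRUPT           | INTERRUPT       | 0x0680                      |
+---------------------+-----------------+-----------------------------+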

 

With PAL_BASE=0x600000 and OPCDEC offset 0x0480, the ILLEGAL_INSTRUCTION entry PC is (0x600000 & ~0x7FFF) | 0x0480 | 0x1 = 0x600481.
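One way to keep this worked example under regression is to encode the single-offset rows of the mapping as a constexpr function and check the result at compile time. The sketch below copies the offsets from the table above, which J.4.14 flags as not yet audited. The enum and function names are hypothetical. MEMORY_FAULT and BREAKPOINT are omitted because they need extra context (ITB versus DTB, CALL_PAL dispatch).

```cpp
#include <cstdint>

// Hypothetical reduced trap enumeration (single-offset rows only).
enum class TrapCode {
    IllegalInstruction, Arithmetic, Unaligned,
    Privileged, MachineCheck, Interrupt
};

// Offsets copied from the table in this section; pending audit per J.4.14.
constexpr std::uint64_t vectorOffsetFor(TrapCode t)
{
    switch (t) {
    case TrapCode::IllegalInstruction: return 0x0480;
    case TrapCode::Privileged:         return 0x0480;
    case TrapCode::Arithmetic:         return 0x0500;
    case TrapCode::Unaligned:          return 0x0600;
    case TrapCode::MachineCheck:       return 0x0660;
    case TrapCode::Interrupt:          return 0x0680;
    }
    return 0;   // not reached for valid enumerators
}

// Same formula as computeExceptionVector(), with PAL_BASE passed explicitly.
constexpr std::uint64_t entryPcFor(std::uint64_t palBase, TrapCode t)
{
    return (palBase & ~0x7FFFULL) | (vectorOffsetFor(t) & 0x7FFEULL) | 0x1ULL;
}

// The worked example from the text, checked at compile time.
static_assert(entryPcFor(0x600000, TrapCode::IllegalInstruction) == 0x600481,
              "ILLEGAL_INSTRUCTION entry PC must match the documented value");
```

If the audit changes an offset, updating the switch and the static_assert together keeps the documented entry PC and the code from drifting apart.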

 


 

J.4.11 SWPPAL Architectural Contract

 

SWPPAL (CALL_PAL 0x0A) transfers control to the current PALcode, which then performs the switch to a new PALcode image. The CPU performs only three operations. PALcode performs all remaining work.

 

CPU responsibilities (PalService::executeSWPPAL):

1. Validate privilege (kernel mode required, ASA Section 6.5)

2. enterPal(CALL_PAL_INSTRUCTION, SWPPAL, faultPC)

3. Dispatch to SWPPAL vector (R16-R21 preserved for PALcode)

 

PALcode responsibilities (not emulated at CPU level):

1. Validate R16 (PAL variant 0-255 or physical base address)

2. Locate PALcode image

3. Flush icache, invalidate TBs

4. Transfer control to new PALcode

5. R0 = 0 success / 1 unknown variant / 2 not loaded

 

The emulator does not compute variant offsets or perform the PALcode switch. Reference: ASA Section 6.5.6, pages 6-21 to 6-22.

 


 

J.4.12 SRM Execution Trace Reference

 

Regression anchors from the DEC ASM reference trace for SRM console initialization (tracescpu_trace.lst, https://github.com/timothyPeer/EmulatRAppUni/blob/main/Trace%20Output/tracescpu_trace_ASM.zip). All addresses are in the decompressed firmware image at PA 0x0.

 

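+-----+-------+-------------+------------------------------------------------------------------------------------+
| Seq | PC    | Instruction | Expected Behavior                                                                  |
+-----+-------+-------------+------------------------------------------------------------------------------------+
| 63  | 0x6DC | HW_REI      | PC <- R26 = 0x6E0 (BSR return continues). exc_addr=0 is expected and correct here. |
| 70  | 0x6EC | HW_REI      | PC <- Rb = 0x5C8 (HALT sentinel)                                                   |
| 147 | 0x6DC | HW_REI      | PC <- R26 = 0x6E0 (second pass)                                                    |
| 154 | 0x6EC | HW_REI      | PC <- Rb = 0x6F0 (HALT sentinel)                                                   |
| 210 | 0x808 | HW_REI      | PC <- Rb = 0x78C                                                                   |
| 218 | 0x79C | HW_REI      | PC <- Rb = 0x000 (BR spin loop, intentional -- not a runaway)                      |
+-----+-------+-------------+------------------------------------------------------------------------------------+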

 

exc_addr=0 at sequence 63 is expected and correct. SRM uses HW_REI as a general-purpose control transfer instruction during initialization. The exc_addr field is not architecturally relevant until the first real exception entry occurs. See J.4.6 for HW_REI return address semantics.

 


 

J.4.13 CPU Thread Gate (Phase 14c / Phase 15)

 

CPU0 starts halted. ExecutionCoordinator::startCPU() releases it after the snapshot is loaded and all memory regions are mapped, preventing speculative execution before architectural state is fully initialized.

 

Phase 14c: ROM loaded, memory regions registered, FNV-1a hash verified
Phase 15:  ExecutionCoordinator::startCPU() called
           CPU0 halt latch released
           executeLoop() flushes stale pipeline state
           CPU0 begins executing from finalPC = 0x5C0

 

The halt wakeup in executeLoop() flushes the pipeline before the first fetch, preventing stale slot state from the snapshot reload from being interpreted as live instructions.
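The gate and the flush-on-wakeup rule can be sketched in a single-threaded model. GatedCpu is a hypothetical class: the real ExecutionCoordinator releases a latch across threads, and that synchronization is deliberately left out here. The pipeline is modeled as a list of fetched PCs.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical single-threaded model of the halt latch and wakeup flush.
class GatedCpu {
public:
    explicit GatedCpu(std::uint64_t startPC) : m_pc(startPC) {}

    // Simulates slot contents left behind by a snapshot reload.
    void injectStaleSlot(std::uint64_t pc) { m_pipeline.push_back(pc); }

    // Phase 15: release the halt latch and remember that a wakeup occurred.
    void startCPU() { m_halted = false; m_wakePending = true; }

    // One iteration of the execute loop; returns false while halted.
    bool step()
    {
        if (m_halted) return false;
        if (m_wakePending) {          // flush before the first fetch
            m_pipeline.clear();
            m_wakePending = false;
        }
        m_pipeline.push_back(m_pc);   // "fetch" the instruction at m_pc
        m_pc += 4;
        return true;
    }

    std::uint64_t pc() const { return m_pc; }
    std::size_t pipelineDepth() const { return m_pipeline.size(); }
    std::uint64_t oldestSlotPC() const
    {
        return m_pipeline.empty() ? 0 : m_pipeline.front();
    }

private:
    bool m_halted = true;             // CPU0 starts halted
    bool m_wakePending = false;
    std::uint64_t m_pc;
    std::vector<std::uint64_t> m_pipeline;
};
```

In this model, a stale slot injected before startCPU() is gone by the time the first instruction at 0x5C0 is fetched.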

 


 

J.4.14 Known Issues and Pending Verification

 

mapTrapToPalVector audit. The TrapCode_Class-to-PalVectorId_EV6 mapping in Pal_core_inl.h has not been audited against the EV6 vector table. This must be verified before TRAPB testing proceeds.

 

TRAPB / ILLEGAL_INSTRUCTION vector verification. Three bugs in the exception dispatch path have been corrected: (1) enterPALVector() used PalVectorTable instead of computeExceptionVector(); (2) enterPalCore() called advancePC() instead of enterPalMode(); (3) commitPalResult() called exitPalMode() unconditionally. The corrected path must be verified by running to TRAPB and confirming entry PC = (PAL_BASE & ~0x7FFF) | 0x0480 | 0x1.

 

HWRPB construction. The HWRPB signature (0x42707248 "HrpB") is constructed by SRM console code executing from PC=0x5C0, not by the decompressor. The sys_type field, PALcode size fields, and MDP entries at PA=0x2000 have not yet been verified against expected values.

 

Secondary CPU startup. SRM issues b -fl0,0 to start secondary CPUs via IPI handler. setHalted(false) via IPI has not been tested in the SMP configuration.

 

Firmware deployment glob (CMakeLists.txt). The file(GLOB ...) call for deploying *_64.exe firmware files has argument order reversed (CONFIGURE_DEPENDS and the variable name are swapped). As written, FIRMWARE_64_FILES is never populated and firmware files are silently not deployed to the build output directory. Users who do not pre-stage firmware files in the expected path will encounter a missing-file error at runtime rather than a build-time diagnostic. The correct form is:

file(GLOB FIRMWARE_64_FILES CONFIGURE_DEPENDS

 "${PROJECT_SOURCE_DIR}/firmware/*_64.exe"

)

 

Snapshot format versioning. The current kSnapshotVersion = 1 header contains seven fields. Any future addition (e.g., a planned emulatorBuild field) requires incrementing kSnapshotVersion and adding explicit rejection of v1 snapshots in loadSnapshot(). The save and load field counts must remain in strict parity at all times.

 


 

See Also: J.3 - SRM-D Snapshot Mechanics (configuration, file format, initialization flow); J.1 - ROM Loader: Descriptor Derivation and Snapshot Validation; Chapter 14 - Execution Domains ("Boxes"); PalBoxBase.h; Pal_Service.h; PAL_core.h; PAL_core_inl.h; global_RegisterMaster_hot.h; SrmRomLoader.cpp; SrmRomLoader.h; Alpha Architecture Handbook Section 4.11 (PAL Mode); Alpha Architecture Handbook Section 6.5 (CALL_PAL); Alpha 21264/EV6 Hardware Reference Manual.