Appendix J.4 - SRM Firmware Initialization and PAL Exception Dispatch
This section describes the SRM firmware initialization sequence, the PAL exception dispatch architecture, and the vector computation contracts that govern all PAL mode transitions.
The SRM firmware image (ES40_V6_2.EXE) is a compressed binary that self-decompresses into RAM before transferring control to the console. Boot proceeds in two phases.
Decompression phase. The ROM is loaded into guest memory at its physical base. CPU0 begins execution at PC=0x5C0 in PAL mode. The decompressor runs for 5,767,331 cycles, expands the firmware image into RAM, then transfers control to the console at the boot handoff PC with PAL_BASE set to 0x600000.
Console phase. The SRM console executes from the decompressed image, constructs the HWRPB at PA=0x2000, initializes per-CPU state, and provides the boot environment for the OS loader.
Verified boot handoff values for ES40_V6_2.EXE:
finalPC = 0x00000000000005C0
finalPalBase = 0x0000000000600000
cyclesExecuted = 5,767,331
romHash = 0x9689831940DA2165 (FNV-1a over payload)
snapshotChecksum = 0xEF23EC3A5607179F (FNV-1a over snapshot file)
The full decompression phase takes approximately 14 minutes in a debug build. The snapshot system captures complete machine state at boot handoff and reloads it in approximately 66 milliseconds, eliminating the decompression penalty on every subsequent boot.
The snapshot file format (.axpsnap) uses the following layout. Field order is the serialization order in SrmRomLoader::saveSnapshot() and must be read in exactly the same order by SrmRomLoader::loadSnapshot():
+------------------------------------------------------------------+
| HEADER |
+------------------------------------------------------------------+
| magic quint64 0x415850534E415001 ASCII "AXPSNAP" followed by byte 0x01 |
| version quint32 1 |
| romHash quint64 FNV-1a hash of ROM payload |
| finalPC quint64 Boot handoff PC (PAL bit included) |
| finalPalBase quint64 PAL_BASE at handoff |
| cyclesExecuted quint64 Total decompressor cycles |
| elapsedMs double Wall-clock decompression time |
+------------------------------------------------------------------+
| MEMORY REGIONS |
+------------------------------------------------------------------+
| regionCount quint32 Number of regions |
| regions[N] |
| basePA quint64 Physical base address |
| size quint64 Region size in bytes |
| data byte[] Raw memory bytes |
+------------------------------------------------------------------+
| INTEGER REGISTERS |
+------------------------------------------------------------------+
| intRegs[32] quint64 R0-R31 at boot handoff |
+------------------------------------------------------------------+
| FLOATING-POINT REGISTERS |
+------------------------------------------------------------------+
| fpRegs[32] quint64 F0-F31 at boot handoff |
+------------------------------------------------------------------+
| INTERNAL PROCESSOR REGISTERS (IPRs) |
+------------------------------------------------------------------+
| iprCount quint32 Number of IPR entries |
| iprEntries[N] |
| id quint32 IPR identifier |
| value quint64 IPR value |
+------------------------------------------------------------------+
| FOOTER |
+------------------------------------------------------------------+
| checksum quint64 FNV-1a over all preceding bytes |
+------------------------------------------------------------------+
Format note. The header contains exactly seven fields: magic, version, romHash, finalPC, finalPalBase, cyclesExecuted, elapsedMs. There is no emulatorBuild field in the current implementation. saveSnapshot() and loadSnapshot() must remain in strict field-count parity. Any addition of an emulatorBuild field requires a version bump (kSnapshotVersion) and rejection of all v1 snapshots at load time.
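The header contract can be sketched as a plain struct plus an acceptance check. This is a minimal illustration, not the SrmRomLoader implementation: SnapshotHeaderSketch, HeaderVerdict, and checkHeader() are hypothetical names, std::uint64_t and std::uint32_t stand in for quint64 and quint32, and on-disk byte order is not modeled.

```cpp
#include <cstdint>

// Hypothetical in-memory mirror of the seven v1 header fields,
// declared in the serialization order given in the layout above.
struct SnapshotHeaderSketch {
    std::uint64_t magic;
    std::uint32_t version;
    std::uint64_t romHash;
    std::uint64_t finalPC;
    std::uint64_t finalPalBase;
    std::uint64_t cyclesExecuted;
    double        elapsedMs;
};

constexpr std::uint64_t kSnapshotMagic   = 0x415850534E415001ULL; // value from the layout
constexpr std::uint32_t kSnapshotVersion = 1;                     // current format version

enum class HeaderVerdict { Accept, BadMagic, VersionMismatch };

// Reject anything whose magic or version differs from what this build writes.
// After a version bump, every v1 snapshot lands in VersionMismatch.
inline HeaderVerdict checkHeader(const SnapshotHeaderSketch& h) {
    if (h.magic != kSnapshotMagic)     return HeaderVerdict::BadMagic;
    if (h.version != kSnapshotVersion) return HeaderVerdict::VersionMismatch;
    return HeaderVerdict::Accept;
}
```

Comparing the version for equality rather than "at least" is the design choice that keeps the save and load field counts in parity: a loader never tries to parse a layout it did not write.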
On every reload the ROM payload hash is recomputed and compared against the stored romHash. A mismatch rejects the snapshot unconditionally, deletes the stale file, and reruns decompression. The emulator is fully self-healing -- no manual snapshot management is ever required.
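The hash comparison can be sketched as follows. The 64-bit FNV-1a variant is assumed because the stored hashes are 64 bits wide; the offset basis and prime are the standard published FNV constants. fnv1a64() and snapshotMatchesRom() are hypothetical names, and the file deletion and decompression rerun are left to the caller.

```cpp
#include <cstddef>
#include <cstdint>

// 64-bit FNV-1a: XOR each byte into the hash, then multiply by the FNV prime.
inline std::uint64_t fnv1a64(const unsigned char* data, std::size_t len) {
    std::uint64_t hash = 0xCBF29CE484222325ULL;   // FNV-1a 64-bit offset basis
    for (std::size_t i = 0; i < len; ++i) {
        hash ^= data[i];
        hash *= 0x100000001B3ULL;                 // FNV 64-bit prime
    }
    return hash;
}

// A snapshot is reusable only if the hash stored in its header matches
// the hash recomputed over the ROM payload that is currently loaded.
inline bool snapshotMatchesRom(std::uint64_t storedRomHash,
                               const unsigned char* rom, std::size_t romLen) {
    return fnv1a64(rom, romLen) == storedRomHash;
}
```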
PAL mode is signaled by PC bit 0 being set. All PAL mode entry, exit, and vector computation is owned exclusively by PalBox (PalBoxBase.h). PalService (Pal_Service.h) contains only CALL_PAL instruction semantic bodies.
PalBox owns: PAL mode entry, exit, vector dispatch, enterPal(),
enterPalCore(), executeREI(), shadow register management
PalService owns: CALL_PAL instruction bodies (CALLSYS, BPT, SWPPAL, ...)
CPUStateView: all PAL mode mutation accessors (single source of truth)
| ⚠ ARCHITECTURAL INVARIANT -- PAL MODE TRUTH |
|---|
| PAL mode is defined as h->pc & 1. No software cache of PAL mode state is maintained anywhere in the system. All callers read PAL mode state exclusively via m_iprGlobalMaster->isInPalMode(), which delegates directly to h->inPalMode(), which returns h->pc & 1. Any code that tests, sets, or clears PAL mode by any mechanism other than reading or writing PC bit 0 through the CPUStateView accessors is architecturally incorrect and must be removed. This invariant is the foundation of snapshot integrity, exception dispatch correctness, and pipeline flush semantics. |
All PAL mode mutations go through CPUStateView (global_RegisterMaster_hot.h). No caller reads or writes PC bit 0 directly.
| Function | Effect | Call Sites |
|---|---|---|
| isInPalMode() | Read h->pc & 1 | All PAL mode guards |
| enterPalMode(entryPC) | Set new PC and assert PAL bit atomically | enterPalCore() only |
| exitPalMode(returnPC) | Set new PC and clear PAL bit | executeREI(), commitPalResult() |
| resetPalMode() | PC = 0x8001 (cold reset only) | CPU cold reset path only |
| setPalMode(bool) | Set or clear PAL bit on current PC | Legacy -- to be retired |
Critical distinction. enterPalMode(entryPC) sets the destination PC and the PAL bit atomically. advancePC() preserves the current PAL bit and must never be used for PAL entry -- it would fail to assert PAL mode when entering from a non-PAL context. This was the root cause of one of the three exception dispatch bugs corrected prior to J.4.14.
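The difference between the accessors is easiest to see in a reduced model. The sketch below is an assumption-laden illustration, not the contents of global_RegisterMaster_hot.h: HotState and CpuStateViewSketch are hypothetical types, and only the PC is modeled.

```cpp
#include <cstdint>

// Hypothetical stand-in for the hot register block 'h'; only the PC is modeled.
struct HotState { std::uint64_t pc = 0; };

struct CpuStateViewSketch {
    HotState* h;

    // PAL mode is PC bit 0, read live on every call (no cached copy).
    bool isInPalMode() const { return (h->pc & 1ULL) != 0; }

    // Set the destination PC and assert the PAL bit in a single store.
    void enterPalMode(std::uint64_t entryPC) { h->pc = entryPC | 1ULL; }

    // Set the destination PC and clear the PAL bit in a single store.
    void exitPalMode(std::uint64_t returnPC) { h->pc = returnPC & ~1ULL; }

    // Move to newPC while preserving whatever the PAL bit currently is.
    void advancePC(std::uint64_t newPC) {
        h->pc = (newPC & ~1ULL) | (h->pc & 1ULL);
    }
};
```

Starting from a non-PAL PC, advancePC(0x600480) leaves bit 0 clear, so the CPU would run the PAL entry point outside PAL mode; enterPalMode(0x600480) yields 0x600481.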
All PAL entry addresses are computed by CPUStateView. No caller computes PAL entry addresses independently. Two functions cover all cases.
computeExceptionVector(PalVectorId_EV6) -- Fault and exception entry:
entryPC = (PAL_BASE & ~0x7FFF) | (vecId & 0x7FFE) | 0x1
computeCallPalEntry(quint32 func) -- CALL_PAL dispatch. Illegal function codes (0x40--0x7F, >0xBF) redirect to the OPCDEC vector. Encoding for legal functions:
entryPC = (PAL_BASE & ~0x7FFF) | (1 << 13) | (func[7] << 12)
| (func[5:0] << 6) | 0x1
Bit 13 distinguishes CALL_PAL entries from exception vectors. PC bit 0 is always set on entry.
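Both formulas can be written out as a short sketch. Two points here are assumptions: PAL_BASE is passed as an explicit parameter (the CPUStateView functions presumably read it from CPU state), and the OPCDEC offset 0x0480 is taken from the trap mapping table in this section.

```cpp
#include <cstdint>

constexpr std::uint64_t kPalBaseMask  = ~0x7FFFULL; // drop PAL_BASE bits [14:0]
constexpr std::uint64_t kOpcdecOffset = 0x0480;     // OPCDEC offset as listed in this section

// Fault and exception entry: base | vector offset | PAL bit.
constexpr std::uint64_t computeExceptionVector(std::uint64_t palBase,
                                               std::uint64_t vecId) {
    return (palBase & kPalBaseMask) | (vecId & 0x7FFEULL) | 0x1ULL;
}

// Function codes 0x40-0x7F and anything above 0xBF are illegal.
constexpr bool isLegalCallPalFunc(std::uint32_t func) {
    return !((func >= 0x40 && func <= 0x7F) || func > 0xBF);
}

// CALL_PAL entry: bit 13 set, func[7] in bit 12, func[5:0] in bits [11:6].
constexpr std::uint64_t computeCallPalEntry(std::uint64_t palBase,
                                            std::uint32_t func) {
    if (!isLegalCallPalFunc(func))
        return computeExceptionVector(palBase, kOpcdecOffset);
    return (palBase & kPalBaseMask)
         | (1ULL << 13)
         | (static_cast<std::uint64_t>((func >> 7) & 0x1) << 12)
         | (static_cast<std::uint64_t>(func & 0x3F) << 6)
         | 0x1ULL;
}
```

With PAL_BASE=0x600000, function 0x80 maps to 0x603001 and function 0x83 maps to 0x6030C1, while the illegal code 0x40 is redirected to 0x600481.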
All PAL entry flows converge at PalBox::enterPalCore(). Two public overloads exist.
enterPal(TrapCode_Class, faultPC) -- Overload A. Maps TrapCode_Class to PalVectorId_EV6 via mapTrapToPalVector(), then calls computeExceptionVector(). faultPC is the address of the faulting instruction (HW_REI retries it).
enterPal(PalEntryReason, vectorOrSelector, faultPC) -- Overload B. Routes through computeCallPalEntry() for CALL_PAL_INSTRUCTION reason, or computeExceptionVector() for explicit vector dispatch. faultPC is pc+4 (return address after the CALL_PAL instruction).
enterPalCore() executes the following steps in order -- no step may be reordered:
1. Guard: entryPC == 0 or sentinel value -> escalate to MCHK
2. saveContext() -- snapshot before any state change
3. exc_addr = faultPC -- save return / retry address
4. enterPalMode(entryPC) -- set PC and assert PAL bit atomically
5. setIPL_Unsynced(7) -- mask all interrupts
6. setCM(CM_KERNEL) -- force kernel mode
7. shadowRegsActive = true -- activate PAL shadow registers
8. return BoxResult().flushPipeline()
saveContext() must be called before any architectural state is modified. This is the invariant that makes snapshot and recovery reliable.
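The ordering constraint can be illustrated with a flattened model of the state that enterPalCore() touches. Everything below is a sketch under assumptions: CpuSketch and EnterResult are hypothetical types, the current-mode encoding uses 0 for kernel, the unnamed sentinel value is not modeled, and the MCHK escalation is reduced to a flag.

```cpp
#include <cstdint>

// Hypothetical flattened architectural state.
struct CpuSketch {
    std::uint64_t pc = 0;
    std::uint64_t excAddr = 0;
    int  ipl = 0;
    int  cm = 3;                    // assumed encoding: 0 = kernel, 3 = user
    bool shadowRegsActive = false;
};

struct EnterResult { bool flushPipeline; bool escalatedToMchk; };

inline EnterResult enterPalCore(CpuSketch& cpu, CpuSketch& savedContext,
                                std::uint64_t entryPC, std::uint64_t faultPC) {
    if (entryPC == 0)               // 1. guard (sentinel check omitted in this sketch)
        return {false, true};
    savedContext = cpu;             // 2. saveContext() before any state change
    cpu.excAddr = faultPC;          // 3. save return / retry address
    cpu.pc = entryPC | 1ULL;        // 4. set PC and assert PAL bit together
    cpu.ipl = 7;                    // 5. mask all interrupts
    cpu.cm = 0;                     // 6. force kernel mode
    cpu.shadowRegsActive = true;    // 7. activate PAL shadow registers
    return {true, false};           // 8. request a pipeline flush
}
```

Because step 2 copies the state before steps 3 through 7 run, the saved context still holds the pre-entry PC, IPL, and mode, which is what makes it usable for recovery.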
Three exit mechanisms exist, each with distinct return address semantics.
| Instruction | Return Address Source | PAL Bit After | Notes |
|---|---|---|---|
| HW_REI (HW_RET) | Rb register (PC <- Rb) | Cleared | EV6 opcode 0x1E. NOT exc_addr. |
| RETSYS / RTI / RFE | exc_addr | Cleared | CALL_PAL return path |
| HALT | Current PC (stays in place) | Preserved | kHalt flag in PalResult.sideEffects |
HW_REI uses Rb, not exc_addr. The EV6 HW_RET instruction reads the return address from the Rb register field of the instruction encoding. exc_addr holds the fault PC for fault retry. SRM uses HW_REI as a general-purpose PC manipulation instruction during initialization, so exc_addr=0 at early HW_REI instructions is expected and correct. See J.4.12 sequence 63.
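A minimal sketch of the return address selection follows. It assumes the standard Alpha instruction layout (opcode in bits [31:26], Rb in bits [20:16]) and clears bit 0 of the target because the table above lists the PAL bit as cleared after HW_REI. hwRetTarget() is a hypothetical helper, not the PalBox code, and exc_addr is deliberately not an input.

```cpp
#include <cstdint>

constexpr std::uint32_t kOpcodeHwRet = 0x1E;  // EV6 HW_RET opcode, per the table above

// Alpha instruction fields: opcode in bits [31:26], Rb in bits [20:16].
constexpr std::uint32_t opcodeOf(std::uint32_t insn)  { return insn >> 26; }
constexpr std::uint32_t rbFieldOf(std::uint32_t insn) { return (insn >> 16) & 0x1F; }

// Next PC for HW_RET: the value of the integer register named by Rb,
// with the PAL bit cleared. exc_addr plays no part in the computation.
inline std::uint64_t hwRetTarget(std::uint32_t insn,
                                 const std::uint64_t (&intRegs)[32]) {
    return intRegs[rbFieldOf(insn)] & ~1ULL;
}
```

For an HW_RET that names R26 while R26 holds 0x6E0, the target is 0x6E0 whatever exc_addr contains, which is the behavior described for early SRM initialization.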
After a CALL_PAL handler returns a PalResult, commitPalResult() applies the PC change via the following three-way dispatch on PipelineEffect flags:
if pr.pcModified:
    if pr.has(kHalt):          advancePC(pr.newPC)      -- HALT: preserve PAL mode
    elif pr.has(kResetEntry):  enterPalMode(pr.newPC)   -- re-entry: assert PAL bit
    else:                      exitPalMode(pr.newPC)    -- normal return: clear PAL bit
The kResetEntry flag is set by executeRESTART(). The restart vector is computed via computeExceptionVector(PalVectorId_EV6::RESET). The raw pal_base value must not be used as a PC directly: its low 15 bits are not guaranteed to be clear (computeExceptionVector() masks them with ~0x7FFF) and the PAL mode bit is absent.
Prior bug note. An earlier version of this function called exitPalMode() unconditionally for all pcModified cases. This caused incorrect PAL bit behavior on RESTART and HALT paths. The three-way dispatch above is the corrected form.
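The corrected dispatch can be sketched in a few lines. PalResultSketch and the flag bit values are hypothetical; only the names kHalt, kResetEntry, pcModified, and newPC come from this section, and the three accessors are inlined as bit operations on the PC.

```cpp
#include <cstdint>

// Hypothetical bit assignments for the side-effect flags.
enum SideEffect : std::uint32_t { kHalt = 1u << 0, kResetEntry = 1u << 1 };

struct PalResultSketch {
    bool          pcModified  = false;
    std::uint64_t newPC       = 0;
    std::uint32_t sideEffects = 0;
    bool has(SideEffect f) const { return (sideEffects & f) != 0; }
};

// Three-way dispatch on the PC update requested by a CALL_PAL handler.
inline void commitPalResult(std::uint64_t& pc, const PalResultSketch& pr) {
    if (!pr.pcModified) return;
    if (pr.has(kHalt))
        pc = (pr.newPC & ~1ULL) | (pc & 1ULL);  // advancePC: keep current PAL bit
    else if (pr.has(kResetEntry))
        pc = pr.newPC | 1ULL;                   // enterPalMode: assert PAL bit
    else
        pc = pr.newPC & ~1ULL;                  // exitPalMode: clear PAL bit
}
```

The earlier unconditional exitPalMode() corresponds to always taking the last branch, which drops the PAL bit on the HALT and RESTART paths.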
IBox fetch: decode CALL_PAL instruction
  grain->execute(slot)
    slot.m_palBox->executeXXX(slot)             -- PalBox grain entry
      enterPal(PalEntryReason, func, faultPC)   -- PAL mode entry
        computeCallPalEntry(func)               -- vector address
        enterPalCore(reason, entryPC, faultPC)
          saveContext()
          exc_addr = faultPC
          enterPalMode(entryPC)
          setIPL(7), setCM(KERNEL), shadowRegsActive = true
      m_palService->execute(fn, slot, result)   -- OSF/1 semantics
      commitPalResult(slot, result)             -- apply side effects
PalService never computes vectors, sets PAL mode, or writes exc_addr. By the time PalService::execute() is called, PAL mode is already active and architectural state is saved.
grain->execute(slot) detects fault
  BoxResult returned: trapCode, faultPending=true, faultPC
runOneInstruction() checks BoxResult
  m_pBox->enterPal(trapCode, faultPC)           -- Overload A
    mapTrapToPalVector(trapCode)                -- TrapCode -> PalVectorId_EV6
    computeExceptionVector(vecId)               -- PalVectorId_EV6 -> entry PC
    enterPalCore(EXCEPTION, entryPC, faultPC)
      saveContext()
      exc_addr = faultPC                        -- retry address for HW_REI
      enterPalMode(entryPC)                     -- PC set, PAL bit asserted
      setIPL(7), setCM(KERNEL), shadowRegsActive = true
      return BoxResult().flushPipeline()
IBox fault routing. When IBox fetches an instruction with no matching grain (grain == nullptr), it populates the pipeline slot fault fields directly: trapCode=ILLEGAL_INSTRUCTION, faultPending=true, faultVA=pc. The FetchStats.illegalInstructions counter is incremented. This fault then flows through the normal BoxResult dispatch path above.
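The missing-grain path can be sketched as below. SlotSketch, FetchStatsSketch, TrapCodeSketch, and routeMissingGrain() are hypothetical stand-ins for the pipeline slot, FetchStats, TrapCode_Class, and the IBox logic; only the field names and values come from the paragraph above.

```cpp
#include <cstdint>

// Hypothetical subset of TrapCode_Class.
enum class TrapCodeSketch { NONE, ILLEGAL_INSTRUCTION };

struct SlotSketch {
    TrapCodeSketch trapCode     = TrapCodeSketch::NONE;
    bool           faultPending = false;
    std::uint64_t  faultVA      = 0;
};

struct FetchStatsSketch { std::uint64_t illegalInstructions = 0; };

// Marks the slot as faulting when no grain matched the fetched instruction.
// Returns true if a fault was recorded, false if the grain is valid.
inline bool routeMissingGrain(const void* grain, std::uint64_t pc,
                              SlotSketch& slot, FetchStatsSketch& stats) {
    if (grain != nullptr) return false;
    slot.trapCode     = TrapCodeSketch::ILLEGAL_INSTRUCTION;
    slot.faultPending = true;
    slot.faultVA      = pc;
    ++stats.illegalInstructions;
    return true;
}
```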
Prior bug note. An earlier version routed through enterPALVector() which used the static PalVectorTable instead of computeExceptionVector(). This produced incorrect entry PCs for any exception whose vector offset differed between the static table and the architectural formula. The dispatch chain above is the corrected form.
mapTrapToPalVector() in Pal_core_inl.h is the mapping table between TrapCode_Class values and PalVectorId_EV6 offsets. An incorrect entry here causes a wrong vector address regardless of the dispatch mechanism's correctness. The table must agree with the EV6 PAL vector table. Key entries for SRM testing:
| TrapCode_Class | PalVectorId_EV6 | Vector Offset |
|---|---|---|
| ILLEGAL_INSTRUCTION | OPCDEC | 0x0480 |
| ARITHMETIC | ARITH | 0x0500 |
| MEMORY_FAULT | MM_FAULT | 0x0300 (ITB) / 0x0400 (DTB) |
| BREAKPOINT | BPT | 0x0080 (CALL_PAL 0x80) |
| UNALIGNED | UNA | 0x0600 |
| PRIVILEGED | OPCDEC | 0x0480 |
| MACHINE_CHECK | MCHK | 0x0660 |
| INTERRUPT | INTERRUPT | 0x0680 |
With PAL_BASE=0x600000 and OPCDEC offset 0x0480, the ILLEGAL_INSTRUCTION entry PC is (0x600000 & ~0x7FFF) | 0x0480 | 0x1 = 0x600481.
SWPPAL (CALL_PAL 0x0A; function code 0x83 is CALLSYS in OSF/1 PALcode) transfers control to the current PALcode, which then performs the switch to a new PALcode image. The CPU performs only three operations. PALcode performs all remaining work.
CPU responsibilities (PalService::executeSWPPAL):
1. Validate privilege (kernel mode required, ASA Section 6.5)
2. enterPal(CALL_PAL_INSTRUCTION, SWPPAL, faultPC)
3. Dispatch to SWPPAL vector (R16-R21 preserved for PALcode)
PALcode responsibilities (not emulated at CPU level):
1. Validate R16 (PAL variant 0-255 or physical base address)
2. Locate PALcode image
3. Flush icache, invalidate TBs
4. Transfer control to new PALcode
5. R0 = 0 success / 1 unknown variant / 2 not loaded
The emulator does not compute variant offsets or perform the PALcode switch. Reference: ASA Section 6.5.6, pages 6-21 to 6-22.
Regression anchors from the DEC ASM reference trace (tracescpu_trace.lst, https://github.com/timothyPeer/EmulatRAppUni/blob/main/Trace%20Output/tracescpu_trace_ASM.zip) for SRM console initialization. All addresses are in the decompressed firmware image at PA 0x0.
| Seq | PC | Instruction | Expected Behavior |
|---|---|---|---|
| 63 | 0x6DC | HW_REI | PC <- R26 = 0x6E0 (BSR return continues). exc_addr=0 is expected and correct here. |
| 70 | 0x6EC | HW_REI | PC <- Rb = 0x5C8 (HALT sentinel) |
| 147 | 0x6DC | HW_REI | PC <- R26 = 0x6E0 (second pass) |
| 154 | 0x6EC | HW_REI | PC <- Rb = 0x6F0 (HALT sentinel) |
| 210 | 0x808 | HW_REI | PC <- Rb = 0x78C |
| 218 | 0x79C | HW_REI | PC <- Rb = 0x000 (BR spin loop, intentional -- not a runaway) |
exc_addr=0 at sequence 63 is expected and correct. SRM uses HW_REI as a general-purpose control transfer instruction during initialization. The exc_addr field is not architecturally relevant until the first real exception entry occurs. See J.4.6 for HW_REI return address semantics.
CPU0 starts halted. ExecutionCoordinator::startCPU() releases it after the snapshot is loaded and all memory regions are mapped, preventing speculative execution before architectural state is fully initialized.
Phase 14c: ROM loaded, memory regions registered, FNV-1a hash verified
Phase 15: ExecutionCoordinator::startCPU() called
CPU0 halt latch released
executeLoop() flushes stale pipeline state
CPU0 begins executing from finalPC = 0x5C0
The halt wakeup in executeLoop() flushes the pipeline before the first fetch, preventing stale slot state from the snapshot reload from being interpreted as live instructions.
mapTrapToPalVector audit. The TrapCode_Class-to-PalVectorId_EV6 mapping in Pal_core_inl.h has not been audited against the EV6 vector table. This must be verified before TRAPB testing proceeds.
TRAPB / ILLEGAL_INSTRUCTION vector verification. Three bugs in the exception dispatch path have been corrected: (1) enterPALVector() used PalVectorTable instead of computeExceptionVector(); (2) enterPalCore() called advancePC() instead of enterPalMode(); (3) commitPalResult() called exitPalMode() unconditionally. The corrected path must be verified by running to TRAPB and confirming entry PC = (PAL_BASE & ~0x7FFF) | 0x0480 | 0x1.
HWRPB construction. The HWRPB signature (0x42707248 "HrpB") is constructed by SRM console code executing from PC=0x5C0, not by the decompressor. The sys_type field, PALcode size fields, and MDP entries at PA=0x2000 have not yet been verified against expected values.
Secondary CPU startup. SRM issues b -fl0,0 to start secondary CPUs via the IPI handler. Releasing a secondary CPU with setHalted(false) from the IPI path has not been tested in the SMP configuration.
Firmware deployment glob (CMakeLists.txt). The file(GLOB ...) call for deploying *_64.exe firmware files has argument order reversed (CONFIGURE_DEPENDS and the variable name are swapped). As written, FIRMWARE_64_FILES is never populated and firmware files are silently not deployed to the build output directory. Users who do not pre-stage firmware files in the expected path will encounter a missing-file error at runtime rather than a build-time diagnostic. The correct form is:
file(GLOB FIRMWARE_64_FILES CONFIGURE_DEPENDS
"${PROJECT_SOURCE_DIR}/firmware/*_64.exe"
)
Snapshot format versioning. The current kSnapshotVersion = 1 header contains seven fields. Any future addition (e.g., a planned emulatorBuild field) requires incrementing kSnapshotVersion and adding explicit rejection of v1 snapshots in loadSnapshot(). The save and load field counts must remain in strict parity at all times.
See Also:
J.3 - SRM-D Snapshot Mechanics (configuration, file format, initialization flow)
J.1 - ROM Loader: Descriptor Derivation and Snapshot Validation
Chapter 14 - Execution Domains ("Boxes")
PalBoxBase.h; Pal_Service.h; PAL_core.h; PAL_core_inl.h; global_RegisterMaster_hot.h
SrmRomLoader.cpp; SrmRomLoader.h
Alpha Architecture Handbook Section 4.11 (PAL Mode)
Alpha Architecture Handbook Section 6.5 (CALL_PAL)
Alpha 21264/EV6 Hardware Reference Manual