Wednesday, April 6, 2011

MIPS Exceptions Initialization and Handling on Linux

Exceptions and interrupts are unexpected events that disrupt the normal flow of execution. Interrupts come from outside the CPU core. Exceptions come from within the CPU and include memory translation exceptions, cache errors, unusual program conditions such as unaligned loads, system calls, traps, and so on.

The MIPS architecture implements precise exceptions, which makes them easy for software to handle. After any exception, the CP0 EPC register points to the correct place to restart execution once the exception has been dealt with.


The following is a brief list of MIPS exceptions in order of their relative priority.

  1. Reset
  2. NMI -- non-maskable interrupts
  3. Interrupt -- hardware and software interrupts
  4. AdEL -- fetch address alignment error; fetch reference to protected address
  5. TLBL -- fetch TLB miss
  6. ICache Error
  7. Sys -- execution of SYSCALL instruction
  8. Ov -- execution of an arithmetic instruction that overflowed
  9. Tr -- execution of a trap
  10. AdEL -- load address alignment error; load reference to protected address
  11. AdES -- store address alignment error; store reference to protected address
  12. TLBL -- load TLB miss
  13. TLBS -- store TLB miss
  14. DCache Error

The MIPSr2 architecture supports three interrupt modes. Interrupt compatibility mode behaves identically to MIPSr1. The architecture also supports vectored interrupt (VI) mode, and permits the use of an external interrupt controller (EIC).

VI mode adds the ability to prioritize interrupts and to vector each one to a handler dedicated to that interrupt. Each interrupt has its own entry point, selected by the interrupt signal. The spacing between entry points is configurable (VS field of the IntCtl register). In EIC mode, the six independent hardware interrupt signals become a 6-bit binary number: zero means no interrupt, and the other values are 63 distinct interrupt codes.

An exception vector is where exception handling starts. In compatibility mode, MIPS interrupts are handled either through the general exception vector (offset 0x180) or through the special interrupt vector (offset 0x200), depending on the IV bit of the Cause register. Software is required to prioritize the pending interrupts in the handler prologue.

The following is the table of exception entry points (vector addresses).
Here, Base is 0x80000000 by default and can be configured through VA of the EBase register. RBase is 0xBFC00000 by default and can be changed by setting SI_UseExceptionBase to 1. Cache error handler instructions have to be fetched through the uncached KSeg1 window.

Here is what a MIPS CPU does when it decides to take an exception:
  1. It sets up the EPC to point to the restart location.
  2. It forces the CPU into kernel mode and disables interrupts (by setting EXL in the SR register).
  3. Cause register is set up so that software can see the reason for the exception. On address exceptions, BadVAddr is also set. Memory management system exceptions set up some of the MMU registers too.
  4. CPU then starts fetching instructions from the exception entry point, and everything else is up to software.
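
The four steps above can be sketched as a small C model. This is only an illustration under simplifying assumptions: the struct and function names are hypothetical, branch delay slots, BEV, nested exceptions and the dedicated TLB refill vector are ignored, and only the bit positions (EXL is bit 1 of Status, ExcCode is bits 6:2 of Cause) follow the MIPS32 register layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the CP0 registers touched on exception entry. */
struct cp0_model {
    uint32_t epc;
    uint32_t status;
    uint32_t cause;
    uint32_t badvaddr;
};

#define ST0_EXL         0x00000002u  /* Status.EXL, bit 1 */
#define CAUSE_EXC_MASK  0x0000007cu  /* Cause.ExcCode, bits 6:2 */
#define CAUSE_EXC_SHIFT 2

/* Returns the address the CPU would start fetching from. */
static uint32_t take_exception(struct cp0_model *c, uint32_t pc,
                               unsigned exc_code, uint32_t bad_addr,
                               uint32_t base)
{
    assert(exc_code < 32);
    c->epc = pc;                          /* step 1: restart location */
    c->status |= ST0_EXL;                 /* step 2: kernel mode, interrupts off */
    c->cause = (c->cause & ~CAUSE_EXC_MASK)
             | (exc_code << CAUSE_EXC_SHIFT);  /* step 3: reason */
    if (exc_code == 4 || exc_code == 5)   /* AdEL / AdES */
        c->badvaddr = bad_addr;
    return base + 0x180;                  /* step 4: general exception vector */
}
```

For instance, an unaligned load (ExcCode 4) taken at PC 0x00400120 with Base 0x80000000 would continue at 0x80000180 with EPC holding 0x00400120.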

So, let's see how MIPS Linux initializes handling routines and handles exceptions.
  1 void __init trap_init(void)
  2 {
  3         extern char except_vec3_generic, except_vec3_r4000;
  4         extern char except_vec4;
  5         unsigned long i;
  6         int rollback;
  7 
  8         check_wait();
  9         rollback = (cpu_wait == r4k_wait);
 10 
 11 #if defined(CONFIG_KGDB)
 12         if (kgdb_early_setup)
 13                 return;       /* Already done */
 14 #endif
 15 
 16         if (cpu_has_veic || cpu_has_vint) {
 17                 unsigned long size = 0x200 + VECTORSPACING*64;
 18                 ebase = (unsigned long)
 19                         __alloc_bootmem(size, 1 << fls(size), 0);
 20         } else {
 21                 ebase = CKSEG0;
 22                 if (cpu_has_mips_r2)
 23                         ebase += (read_c0_ebase() & 0x3ffff000);
 24         }
 25 
 26         per_cpu_trap_init();
 27 
 28         /*
 29          * Copy the generic exception handlers to their final destination.
 30          * This will be overriden later as suitable for a particular
 31          * configuration.
 32          */
 33         set_handler(0x180, &except_vec3_generic, 0x80);
 34 
line 20~24: It initializes the base address of the exception vectors (ebase) to 0x80000000 (CKSEG0), the power-on default value. MIPSr2 allows ebase to be adjusted according to VA (bits 29:12) of the EBase register.

line 16~19: If VI mode or EIC mode is used, bootmem starting from offset 0 (goal 0) is allocated for the handler routines. __alloc_bootmem returns a virtual address, which will also be 0x80000000. There are 64 handlers in total, and each is of size 0x100 by default. The first 0x200 bytes are also reserved for the TLB Refill and Cache Error handlers.
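
As a rough sketch of the arithmetic in lines 20~24, the helper below reproduces the masking outside the kernel. The function name compute_ebase() and its parameters are hypothetical; the constants CKSEG0 and 0x3ffff000 are the ones used in the listing.

```c
#include <assert.h>
#include <stdint.h>

#define CKSEG0 0x80000000u

/* ebase_reg stands in for the value returned by read_c0_ebase(). */
static uint32_t compute_ebase(int has_mips_r2, uint32_t ebase_reg)
{
    uint32_t ebase = CKSEG0;

    if (has_mips_r2)
        ebase += ebase_reg & 0x3ffff000u;  /* keep bits 29:12 only */

    assert((ebase & 0xfffu) == 0);         /* result is 4 KB aligned */
    return ebase;
}
```

With the power-on value 0x80000000 in EBase the mask yields 0, so ebase stays at 0x80000000. The low bits of the register, which hold the CPU number, are discarded by the mask.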

line 26: Cache Error handler and TLB Refill handler are configured in per_cpu_trap_init().

line 33: The default exception handler is located at 0x80000180 and is 128 bytes long. set_handler() copies 128 bytes from the location of except_vec3_generic to ebase + 0x180.

Take a look at per_cpu_trap_init() and except_vec3_generic.
  1 void __cpuinit per_cpu_trap_init(void)
  2 {
  3         unsigned int cpu = smp_processor_id();
  x         ......
 50         if (cpu_has_veic || cpu_has_vint) {
 51                 unsigned long sr = set_c0_status(ST0_BEV);
 52                 write_c0_ebase(ebase);
 53                 write_c0_status(sr);
 54                 /* Setting vector spacing enables EI/VI mode  */
 55                 change_c0_intctl(0x3e0, VECTORSPACING);
 56         }
 57         if (cpu_has_divec) {
 58                 if (cpu_has_mipsmt) {
 59                         unsigned int vpflags = dvpe();
 60                         set_c0_cause(CAUSEF_IV);
 61                         evpe(vpflags);
 62                 } else
 63                         set_c0_cause(CAUSEF_IV);
 64         }
  x         ......
 72         if (cpu_has_mips_r2) {
 73                 cp0_compare_irq_shift = CAUSEB_TI - CAUSEB_IP;
 74                 cp0_compare_irq = (read_c0_intctl() >> INTCTLB_IPTI) & 7;
 75                 cp0_perfcount_irq = (read_c0_intctl() >> INTCTLB_IPPCI) & 7;
 76                 if (cp0_perfcount_irq == cp0_compare_irq)
 77                         cp0_perfcount_irq = -1;
 78         } else {
 79                 cp0_compare_irq = CP0_LEGACY_COMPARE_IRQ;
 80                 cp0_compare_irq_shift = cp0_compare_irq;
 81                 cp0_perfcount_irq = -1;
 82         }
  x         ......
 99                 cpu_cache_init();
100                 tlb_init();
  x         ......
111 }
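line 50~56: It temporarily sets BEV in the Status register, writes ebase to the EBase register, restores Status, and then configures VS in the IntCtl register. VS occupies bits 9:5 of IntCtl (mask 0x3e0), so writing VECTORSPACING (0x100) under that mask yields VS = 0x100 >> 5 = 8, which selects a 256-byte spacing.

line 57~64: It sets IV in the Cause register so that the special interrupt vector is used for interrupts instead.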
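
The relation between the byte spacing and the VS field can be checked with a few lines of C. This is a sketch with hypothetical helper names; the mask 0x3e0 comes from the change_c0_intctl() call in the listing, and the shift of 5 follows from that mask.

```c
#include <assert.h>

#define INTCTL_VS_MASK  0x3e0u  /* IntCtl bits 9:5 */
#define INTCTL_VS_SHIFT 5

/* Field value that ends up in IntCtl.VS for a given byte spacing. */
static unsigned vs_field(unsigned spacing_bytes)
{
    /* The spacing must fit entirely inside the VS mask. */
    assert((spacing_bytes & ~INTCTL_VS_MASK) == 0);
    return (spacing_bytes & INTCTL_VS_MASK) >> INTCTL_VS_SHIFT;
}

/* Byte spacing selected by a given VS field value. */
static unsigned vs_spacing_bytes(unsigned field)
{
    return field << INTCTL_VS_SHIFT;
}
```

Because the field starts at bit 5, the spacing in bytes can be written into the masked register unchanged, which is why the kernel passes VECTORSPACING directly.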

line 72~82: MIPSr2 provides the flexibility to change the default interrupt numbers for timer interrupt and performance counter interrupt.

line 99: It calls the cache initialization routine for the CPU, r4k_cache_init() for example, which calls set_uncached_handler() to install the Cache Error exception handler by copying except_vec2_generic to ebase + 0x100. set_uncached_handler() is similar to set_handler(); the only difference is that it uses the KSeg1 address of ebase. except_vec2_generic turns off KSeg0 caching (sets K0 to 2 in the Config register) and jumps to cache_parity_error to handle the exception.

line 100: It calls tlb_init(), which calls build_tlb_refill_handler() to build the TLB-related handlers: the TLB refill handler, the TLB load handler (handle_tlbl()), the TLB store handler (handle_tlbs()), and the TLB modify handler for stores to read-only pages (handle_tlbm()). build_r4000_tlb_refill_handler() assembles the instructions into tlb_handler, copies them to final_handler, fixes up relocations, and copies final_handler to ebase + 0.

And except_vec3_generic,
 1 NESTED(except_vec3_generic, 0, sp)
 2         .set    push
 3         .set    noat
 4 #if R5432_CP0_INTERRUPT_WAR
 5         mfc0    k0, CP0_INDEX
 6 #endif
 7         mfc0    k1, CP0_CAUSE
 8         andi    k1, k1, 0x7c
 9 #ifdef CONFIG_64BIT
10         dsll    k1, k1, 1
11 #endif
12 PTR_L   k0, exception_handlers(k1)
13         jr      k0
14         .set    pop
15         END(except_vec3_generic)
line 2: save the current assembler option settings
line 3: prevent the assembler from using the AT register
line 7~8: read the Cause register and keep ExcCode (bits 6:2). Because the field is left in place, k1 holds ExcCode * 4, the byte offset of a 4-byte table entry.

line 10: In a 64-bit environment pointers are 8 bytes wide, twice the size of a 4-byte pointer, so the offset is doubled.

line 12: load the handler address from exception_handlers + k1. exception_handlers is an array of pointers holding the addresses of the exception handlers.

line 13: jump to the correct exception handler (according to ExcCode)
line 14: restore the previous assembler option settings
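
The dispatch performed by except_vec3_generic can be mirrored in C as a table lookup. The following is a sketch, not kernel code: the function names, the handler type and sample_handler() are hypothetical, and only the 0x7c mask and the doubling for 8-byte pointers are taken from the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*handler_fn)(void);

/* Stand-in for the kernel's exception_handlers[] table. */
static handler_fn exception_handlers[32];

/* Byte offset into the table, as computed in k1 by the assembly. */
static size_t table_byte_offset(uint32_t cause, size_t ptr_size)
{
    size_t off = cause & 0x7c;         /* ExcCode * 4 */

    assert(ptr_size == 4 || ptr_size == 8);
    if (ptr_size == 8)
        off <<= 1;                     /* dsll k1, k1, 1 */
    return off;
}

/* The same lookup expressed as an ordinary array index. */
static handler_fn lookup_handler(uint32_t cause)
{
    return exception_handlers[(cause & 0x7c) >> 2];
}

/* Hypothetical handler used only to observe the dispatch. */
static int hits;
static void sample_handler(void) { hits++; }
```

Registering sample_handler in slot 8 and calling lookup_handler() with a Cause value whose ExcCode is 8 returns that handler, regardless of the other bits set in Cause.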

Let us leave per_cpu_trap_init() and except_vec3_generic() there and go back to trap_init().
  1 void __init trap_init(void)
  2 {
  x         ......
 35         /*
 36          * Setup default vectors
 37          */
 38         for (i = 0; i <= 31; i++)
 39                 set_except_vector(i, handle_reserved);
 40 
 41         /*
 42          * Copy the EJTAG debug exception vector handler code to it's final
 43          * destination.
 44          */
 45         if (cpu_has_ejtag && board_ejtag_handler_setup)
 46                 board_ejtag_handler_setup();
 47 
 48         /*
 49          * Only some CPUs have the watch exceptions.
 50          */
 51         if (cpu_has_watch)
 52                 set_except_vector(23, handle_watch);
 53  
line 38~39: set the 32 exception handlers to handle_reserved() by default. set_except_vector() will substitute the i-th entry of exception_handlers[] with the given address.
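
A minimal model of what set_except_vector() does to the table might look like the following. The name set_except_vector_model() is hypothetical, and returning the previous entry is an assumption made here for illustration; the source only states that the i-th entry is replaced.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's exception_handlers[] table. */
static void *exception_handlers[32];

/* Replace entry n with addr; returning the old entry is an assumption. */
static void *set_except_vector_model(int n, void *addr)
{
    void *old;

    assert(n >= 0 && n < 32);
    old = exception_handlers[n];
    exception_handlers[n] = addr;
    return old;
}
```

Calling it in a loop for 0 to 31 with one default address, and then again for individual codes, reproduces the "default first, override later" pattern used by trap_init().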

line 45~46: configure the EJTAG debug exception handler if needed
line 51~52: configure watch exception handler to handle_watch()

handle_reserved() and handle_watch() are both built in arch/mips/kernel/genex.S as follows:
 1         BUILD_HANDLER adel ade ade silent               /* #4  */
 2         BUILD_HANDLER ades ade ade silent               /* #5  */
 3         BUILD_HANDLER ibe be cli silent                 /* #6  */
 4         BUILD_HANDLER dbe be cli silent                 /* #7  */
 5         BUILD_HANDLER bp bp sti silent                  /* #9  */
 6         BUILD_HANDLER ri ri sti silent                  /* #10 */
 7         BUILD_HANDLER cpu cpu sti silent                /* #11 */
 8         BUILD_HANDLER ov ov sti silent                  /* #12 */
 9         BUILD_HANDLER tr tr sti silent                  /* #13 */
10         BUILD_HANDLER fpe fpe fpe silent                /* #15 */
11         BUILD_HANDLER mdmx mdmx sti silent              /* #22 */
12 #ifdef  CONFIG_HARDWARE_WATCHPOINTS
13         /*
14          * For watch, interrupts will be enabled after the watch
15          * registers are read.
16          */
17         BUILD_HANDLER watch watch cli silent            /* #23 */
18 #else
19         BUILD_HANDLER watch watch sti verbose           /* #23 */
20 #endif
21         BUILD_HANDLER mcheck mcheck cli verbose         /* #24 */
22         BUILD_HANDLER mt mt sti silent                  /* #25 */
23         BUILD_HANDLER dsp dsp sti silent                /* #26 */
24         BUILD_HANDLER reserved reserved sti verbose     /* others */ 
Each call to BUILD_HANDLER builds two entry points, handle_\exception and handle_\exception_int. Thus, you can find handle_adel(), handle_adel_int(), handle_ades(), handle_ades_int(), and so on in the symbol table.

How do these handlers get prepared and handle exceptions?
 1         .macro  __BUILD_HANDLER exception handler clear verbose ext
 2         .align  5
 3         NESTED(handle_\exception, PT_SIZE, sp)
 4         .set    noat
 5         SAVE_ALL
 6         FEXPORT(handle_\exception\ext)
 7         __BUILD_clear_\clear
 8         .set    at
 9         __BUILD_\verbose \exception
10         move    a0, sp
11         PTR_LA  ra, ret_from_exception
12         j       do_\handler
13         END(handle_\exception)
14         .endm
15 
16         .macro  BUILD_HANDLER exception handler clear verbose
17         __BUILD_HANDLER \exception \handler \clear \verbose _int
18         .endm 
line 16~18: BUILD_HANDLER invokes __BUILD_HANDLER to build handlers

line 3: The frame size of handle_\exception() is PT_SIZE, which is generated automatically by Kbuild from sizeof(struct pt_regs). pt_regs defines how registers are stored on the stack during the exception.

line 4: tell the assembler not to use the AT register, because its original value has not been saved yet

line 5: save some CP0 registers and all the general purpose registers by calling SAVE_SOME, SAVE_AT, SAVE_TEMP, and SAVE_STATIC. The registers will be stored onto stack in order as defined in pt_regs.

line 6: export the handle_\exception_int() symbol. handle_\exception() and handle_\exception_int() share contents below this line.

line 7: could be CLI, STI, fpe, or ade
line 8: The assembler is free to use AT because we've done our saving job.
line 9: print the information of EPC if verbose is set

line 10~11: prepare to jump to do_\handler(). The first argument is sp, the pointer to pt_regs. The return address is the address of ret_from_exception(), the function called when exception handling is done. ret_from_exception() will restore the registers and return to the original address (discussed later).

line 12: jump to the exception handling function do_\handler(), such as do_ade(), do_be(), do_ri(), and so on.

OK, let's go back to trap_init() again.
  1 void __init trap_init(void)
  2 {
  x         ......
 54         /*
 55          * Initialise interrupt handlers
 56          */
 57         if (cpu_has_veic || cpu_has_vint) {
 58                 int nvec = cpu_has_veic ? 64 : 8;
 59                 for (i = 0; i < nvec; i++)
 60                         set_vi_handler(i, NULL);
 61         }
 62         else if (cpu_has_divec)
 63                 set_handler(0x200, &except_vec4, 0x8);
 64 
 65         /*
 66          * Some CPUs can enable/disable for cache parity detection, but does
 67          * it different ways.
 68          */
 69         parity_protection_init();
 70 
 71         /*
 72          * The Data Bus Errors / Instruction Bus Errors are signaled
 73          * by external hardware.  Therefore these two exceptions
 74          * may have board specific handlers.
 75          */
 76         if (board_be_init)
 77                 board_be_init();
 78 
line 57~61: in VI or EIC mode, set every interrupt handler to NULL by default. BSP providers need to install each interrupt handler as appropriate. set_vi_handler() calls set_vi_srs_handler(), which installs the given handler at ebase + 0x200 + n*VECTORSPACING.

line 62~63: configure except_vec4 as the default special interrupt handler
line 69: enable cache parity detection if it is implemented (ErrCtl register)
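
The address arithmetic for the vectored entry points is easy to verify in isolation. The helpers below are a sketch with hypothetical names; the formula ebase + 0x200 + n*VECTORSPACING and the area size 0x200 + VECTORSPACING*64 are the ones quoted in this post.

```c
#include <assert.h>

#define VECTORSPACING 0x100

/* Entry point of vectored interrupt n. */
static unsigned long vi_vector_addr(unsigned long ebase, unsigned n)
{
    assert(n < 64);  /* at most 64 vectors in EIC mode */
    return ebase + 0x200 + n * VECTORSPACING;
}

/* Size of the area allocated for the vectors in trap_init(). */
static unsigned long vi_area_size(void)
{
    return 0x200 + VECTORSPACING * 64;
}
```

With ebase at 0x80000000, vector 0 starts at 0x80000200 and vector 7 at 0x80000900, and the whole area spans 0x4200 bytes.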
line 76~77: configure the Data/Instruction Bus Error exception handler if supported

  1 void __init trap_init(void)
  2 {
  x         ......
 79         set_except_vector(0, rollback ? rollback_handle_int : handle_int);
 80         set_except_vector(1, handle_tlbm);
 81         set_except_vector(2, handle_tlbl);
 82         set_except_vector(3, handle_tlbs);
 83 
 84         set_except_vector(4, handle_adel);
 85         set_except_vector(5, handle_ades);
 86 
 87         set_except_vector(6, handle_ibe);
 88         set_except_vector(7, handle_dbe);
 89 
 90         set_except_vector(8, handle_sys);
 91         set_except_vector(9, handle_bp);
 92         set_except_vector(10, rdhwr_noopt ? handle_ri :
 93                           (cpu_has_vtag_icache ?
 94                            handle_ri_rdhwr_vivt : handle_ri_rdhwr));
 95         set_except_vector(11, handle_cpu);
 96         set_except_vector(12, handle_ov);
 97         set_except_vector(13, handle_tr);
  x         ......
114         if (board_nmi_handler_setup)
115                 board_nmi_handler_setup();
116 
117         if (cpu_has_fpu && !cpu_has_nofpuex)
118                 set_except_vector(15, handle_fpe);
119 
120         set_except_vector(22, handle_mdmx);
121 
122         if (cpu_has_mcheck)
123                 set_except_vector(24, handle_mcheck);
124 
125         if (cpu_has_mipsmt)
126                 set_except_vector(25, handle_mt);
127 
128         set_except_vector(26, handle_dsp);
129 
line 79~128: configure each exception handler separately. 0~26 are exception codes (ExcCode of the Cause register); their meanings can be found in the MIPS user manual. For example, 0 means an interrupt, and 1 means a TLB modification exception (a store to a read-only page).

  1 void __init trap_init(void)
  2 {
  x         ......
130         if (cpu_has_vce)
131                 /* Special exception: R4[04]00 uses also the divec space. */
132                 memcpy((void *)(ebase + 0x180), &except_vec3_r4000, 0x100);
133         else if (cpu_has_4kex)
134                 memcpy((void *)(ebase + 0x180), &except_vec3_generic, 0x80);
135         else
136                 memcpy((void *)(ebase + 0x080), &except_vec3_generic, 0x80);
137 
138         local_flush_icache_range(ebase, ebase + 0x400);
139         flush_tlb_handlers();
140 
141         sort_extable(__start___dbe_table, __stop___dbe_table);
142 
143         cu2_notifier(default_cu2_call, 0x80000000);    /* Run last  */
144 }
line 130~136: Here the CPU does not raise virtual coherency exceptions (cpu_has_vce is false), so except_vec3_generic is copied to the address of the default exception vector again.
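
The three-way choice in lines 130~136 can be modelled against a plain byte buffer. This is a sketch under stated assumptions: install_general_vector(), the flag parameters and the buffer standing in for the memory at ebase are all hypothetical, while the offsets and sizes are copied from the listing.

```c
#include <assert.h>
#include <string.h>

#define VEC_AREA_SIZE 0x400  /* same range trap_init() flushes from the ICache */

/* Copies the selected handler into area and returns the offset used.
 * vec3_r4000 must provide 0x100 bytes, vec3_generic 0x80 bytes. */
static unsigned long install_general_vector(unsigned char *area,
                                            int has_vce, int has_4kex,
                                            const unsigned char *vec3_r4000,
                                            const unsigned char *vec3_generic)
{
    unsigned long offset, size;
    const unsigned char *src;

    if (has_vce) {            /* R4000/R4400: also uses the 0x200 space */
        offset = 0x180; src = vec3_r4000;   size = 0x100;
    } else if (has_4kex) {    /* R4000-style exception layout */
        offset = 0x180; src = vec3_generic; size = 0x80;
    } else {                  /* older layout: general vector at 0x080 */
        offset = 0x080; src = vec3_generic; size = 0x80;
    }

    assert(offset + size <= VEC_AREA_SIZE);
    memcpy(area + offset, src, size);
    return offset;
}
```

The sketch makes it visible that the R4000 variant overlaps the 0x200 interrupt vector, which is what the comment "uses also the divec space" in the listing refers to.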

line 138~139: flush ICache for address in the range of exception handlers
line 141: sort the exception table for data bus error

An interrupt is a special type of exception that comes from outside the CPU core. An exception with ExcCode equal to 0 is an interrupt. Let's see how interrupts are dealt with:

 1 NESTED(handle_int, PT_SIZE, sp)
 2 +-- 31 lines: #ifdef CONFIG_TRACE_IRQFLAGS------------------
 3         SAVE_ALL
 4         CLI
 5         TRACE_IRQS_OFF
 6 
 7         LONG_L  s0, TI_REGS($28)
 8         LONG_S  sp, TI_REGS($28)
 9         PTR_LA  ra, ret_from_irq
10         j       plat_irq_dispatch
11         END(handle_int) 
line 2, 5: IRQ tracing, which we are not going to discuss here.
line 3: save some CP0 registers and all the general purpose registers to the stack
line 4: CLI disables interrupts

line 7: load the pt_regs pointer currently stored in thread_info into s0, so that it can be restored later
line 8: store the stack pointer (where we saved all registers) into thread_info as the new pt_regs pointer

line 9: set return address to ret_from_irq instead, so that handler will return to ret_from_irq after the interrupt

line 10: jump to the platform dependent interrupt handling routine

As you can see, handle_int is similar to handle_\exception, with the following differences:
  1. handle_int needs to CLI, while handle_\exception may use CLI, STI, fpe, or ade.
  2. the return address of handle_int is ret_from_irq; the return address of handle_\exception is ret_from_exception
  3. handle_int will jump to plat_irq_dispatch, handle_\exception will jump to do_\handler
  4. do_\handler has an argument, a pointer to pt_regs. handle_\exception copies stack pointer to a0 for the usage.
  5. plat_irq_dispatch has no argument, and the pointer to pt_regs will be in current thread_info. So handle_int has to save the current pt_regs to s0 before updating it with stack pointer
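
The save and restore of the pt_regs pointer described in item 5 can be illustrated with a small C model. All type and function names below are hypothetical stand-ins: ti plays the role of the thread_info reached through $28, the local variable s0 plays the role of register s0, and dispatch_model() stands in for plat_irq_dispatch().

```c
#include <assert.h>
#include <stddef.h>

struct pt_regs_model { unsigned long regs[32]; };
struct thread_info_model { struct pt_regs_model *regs; };

static struct thread_info_model ti;            /* reached via $28 in the kernel */
static struct pt_regs_model *seen_in_dispatch; /* what the dispatcher observed */

/* Takes no argument, so it must find pt_regs through thread_info. */
static void dispatch_model(void)
{
    seen_in_dispatch = ti.regs;
}

static void handle_int_model(struct pt_regs_model *sp)
{
    struct pt_regs_model *s0;

    assert(sp != NULL);
    s0 = ti.regs;      /* LONG_L s0, TI_REGS($28): remember the old pointer */
    ti.regs = sp;      /* LONG_S sp, TI_REGS($28): publish the new frame */
    dispatch_model();  /* j plat_irq_dispatch */
    ti.regs = s0;      /* ret_from_irq: LONG_S s0, TI_REGS($28) */
}
```

While the dispatcher runs it sees the frame of the current interrupt, and once the handler returns the previous pointer is back in place, which keeps nested interrupts consistent.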

Interrupts and their handling often differ between platforms, so developers have to implement plat_irq_dispatch() accordingly. A common handler is provided in arch/mips/mipssim/sim_int.c. It finds the first pending interrupt number and calls do_IRQ() with that number to handle it.
 1 asmlinkage void plat_irq_dispatch(void)
 2 {
 3         unsigned int pending = read_c0_cause() & read_c0_status() & ST0_IM;
 4         int irq;
 5 
 6         irq = irq_ffs(pending);
 7 
 8         if (irq > 0)
 9                 do_IRQ(MIPS_CPU_IRQ_BASE + irq);
10         else
11                 spurious_interrupt();
12 } 
line 3: read the pending interrupts from the Cause register, masked by IM7-0 of the Status register

line 6: find the first bit set in pending. This is an example of handling compatibility-mode interrupts, where the first bit found is the highest priority interrupt.

line 8~9: do_IRQ() to handle the interrupt
line 10~11: It is a spurious interrupt (should not happen).
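
A portable sketch of the selection step is shown below. It is not the kernel's irq_ffs(): the function name and the -1 return value for "nothing pending" are assumptions, and it assumes the highest-numbered pending IP bit has the highest priority. ST0_IM (0xff00) and the IP field starting at bit 8 follow the Status and Cause register layout.

```c
#include <assert.h>
#include <stdint.h>

#define ST0_IM    0x0000ff00u  /* Status.IM7-0 / Cause.IP7-0 */
#define CAUSEB_IP 8            /* lowest bit of the IP field */

/* Returns 0..7 for the highest pending, unmasked interrupt, or -1. */
static int highest_pending_irq(uint32_t cause, uint32_t status)
{
    uint32_t pending = cause & status & ST0_IM;
    int irq = -1;

    for (int bit = 15; bit >= CAUSEB_IP; bit--) {
        if (pending & (1u << bit)) {
            irq = bit - CAUSEB_IP;
            break;
        }
    }
    assert(irq >= -1 && irq <= 7);
    return irq;
}
```

Note that the listing above only dispatches when irq > 0, so a result of 0 would fall through to spurious_interrupt() there.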

When an exception or interrupt has been handled, ret_from_exception or ret_from_irq is invoked to return to the original execution.
 1 #ifndef CONFIG_PREEMPT
 2 #define resume_kernel   restore_all
 3 #else
 4 #define __ret_from_irq  ret_from_exception
 5 #endif
 6 
 7         .text
 8         .align  5
 9 #ifndef CONFIG_PREEMPT
10 FEXPORT(ret_from_exception)
11         local_irq_disable                       # preempt stop
12         b       __ret_from_irq
13 #endif
14 FEXPORT(ret_from_irq)
15         LONG_S  s0, TI_REGS($28)
16 FEXPORT(__ret_from_irq)
17         LONG_L  t0, PT_STATUS(sp)               # returning to kernel mode?
18         andi    t0, t0, KU_USER
19         beqz    t0, resume_kernel
20 
21 resume_userspace:
22         local_irq_disable               # make sure we dont miss an
23                                         # interrupt setting need_resched
24                                         # between sampling and return
25         LONG_L  a2, TI_FLAGS($28)       # current->work
26         andi    t0, a2, _TIF_WORK_MASK  # (ignoring syscall_trace)
27         bnez    t0, work_pending
28         j       restore_all 
line 10, 14: ret_from_exception and ret_from_irq share most of their code because they are similar.
line 15: ret_from_irq restores the pt_regs pointer in thread_info from s0, where handle_int saved it.
line 17~18: read the saved Status value and test the user-mode bit to see whether we are returning to kernel mode
line 19, 1~3: if returning to kernel mode, jump to resume_kernel (which is defined as restore_all when CONFIG_PREEMPT is not set)
line 21~28: if returning to user mode, handle any pending work first and then jump to restore_all as well

restore_all invokes RESTORE_TEMP, RESTORE_AT, RESTORE_STATIC, RESTORE_SOME, and RESTORE_SP_AND_RET. RESTORE_SP_AND_RET finally executes the eret instruction, which clears EXL in the Status register and returns to the address stored in EPC, so the original execution continues.
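
The branch structure of the return path can be summarized in C. This is a simplified sketch: the enum, the function name and the parameters are hypothetical, and the value 0x10 for KU_USER (the UM bit on R4000-style CPUs) is an assumption that does not hold for R3000-style CPUs.

```c
#include <assert.h>
#include <stdint.h>

#define KU_USER 0x10u  /* assumed: Status.UM on R4000-style CPUs */

enum resume_path { RESUME_KERNEL, WORK_PENDING, RESTORE_ALL };

/* saved_status is PT_STATUS from the stack frame, ti_flags is TI_FLAGS. */
static enum resume_path pick_return_path(uint32_t saved_status,
                                         uint32_t ti_flags,
                                         uint32_t work_mask)
{
    assert(work_mask != 0);
    if ((saved_status & KU_USER) == 0)
        return RESUME_KERNEL;   /* beqz t0, resume_kernel */
    if (ti_flags & work_mask)
        return WORK_PENDING;    /* bnez t0, work_pending */
    return RESTORE_ALL;         /* j restore_all */
}
```

Without CONFIG_PREEMPT, RESUME_KERNEL and RESTORE_ALL lead to the same code, since resume_kernel is defined as restore_all.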
