The R1 is a simplified load/store microprocessor. The CPU is designed for use in embedded applications and features 32 registers, each 32 bits wide, a large set of register-to-register instructions for higher performance, and a simplified memory architecture based on 16-bit bytes.
The following legends are used throughout the text:
This is the opcode format. It describes the bits in the instruction, for example:
111111.11010.rrrrr
This means that the 6 most significant bits are all 1's, the next 5 bits are 11010, and the last 5 bits form a number referred to as rrrrr.
This is pseudo-code.
Read it line by line, from the top.
Actually, I wanted indents for blocks of code, but HTML does not allow this, so I have used a combination of {} and , instead:
start by doing this
if (a=0): { this is done if a is 0, likewise is this }
if (a=1): { this is only done if a is 1, and this too }
but this is always done
Sometimes it is necessary to reference individual bits in a register. Bits are counted from 0 upwards, where bit 0 is the least significant bit and bit 31 is the MSB (or the sign bit in two's complement). The following notation is used to address individual bits:
rrrrr31
This is the MSB of the register with number given in rrrrr.
Please note that some instructions are not supported by the current version of the emulator, and some are very difficult to implement. The ones not implemented are ROR, ROL (very difficult in pure C++) and INT. The ones that will probably not be present in the final version are MUL, IMUL, DIV and IDIV, because they demand large amounts of either time or hardware (one-cycle execution is hard to reach). Unimplemented instructions cause an OPCODE interrupt in the emulator, for the most correct emulation.
A word is 32 bits long, whereas a byte is 16 bits.
The CPU is equipped with 32 registers, each 32 bits long. The registers are known as R0 to R31.
Register name | Purpose |
---|---|
R00 (IP) | Instruction Pointer - points to next instruction to execute |
R01 (F) | Flags, see below |
R02 (SP) | Stack pointer. The stack grows downwards. |
R03 (BP) | Base pointer. Addresses are calculated as offset from this pointer |
R04 (IT) * | Absolute address of interrupt table (this is NOT an offset from BP) |
R05 * | reserved |
R06 (PTB) * | Page Table Base (N/A, for virtual memory) |
R07 (PTS) * | Page Table Size (N/A, for virtual memory) |
R08-R31 | Free-to-use general purpose |
* Can only be changed by an application running in supervisor mode.
Flags: ssp00000 00000000 00000vbi 000boscz
Bit | Symbol | Meaning |
---|---|---|
0 | z | Zero-flag |
1 | c | Carry/Borrow out of last bit |
2 | s | Sign flag |
3 | o | Overflow, last arithmetic action (often equal to carry, but not on div) |
4 | b | Byte Arithmetics (N/A in CPU1) |
8 | i | Interrupts enabled |
9 | b | Single-Step interrupt |
10 | v | Overflow interrupt |
29 | p | Paging enabled (not available in CPU1) |
31-30 | ss | Supervisor level (not available in CPU1) |
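The flag layout can be turned into a small set of bit helpers. The following is only a sketch: the bit positions are read off the pattern above (bit 0 is the rightmost character), while the names in `FLAG_BITS` and the two helper functions are hypothetical and not part of the R1 specification.

```python
# Bit positions read from the flag layout (bit 0 is the rightmost character).
# The names are hypothetical labels chosen for this sketch.
FLAG_BITS = {
    "zf": 0,       # zero
    "cf": 1,       # carry/borrow
    "sf": 2,       # sign
    "of": 3,       # overflow
    "byte": 4,     # byte arithmetics
    "ie": 8,       # interrupts enabled
    "step": 9,     # single-step interrupt
    "ovi": 10,     # overflow interrupt
    "paging": 29,  # paging enabled
}

def get_flag(flags, name):
    """Return the named flag bit (0 or 1) from a 32-bit flags value."""
    return (flags >> FLAG_BITS[name]) & 1

def set_flag(flags, name, value):
    """Return a copy of flags with the named bit set or cleared."""
    mask = 1 << FLAG_BITS[name]
    if value:
        return flags | mask
    return flags & ~mask & 0xFFFFFFFF

f = set_flag(0, "ie", 1)   # enable interrupts
f = set_flag(f, "zf", 1)   # set the zero flag
print(f"{f:032b}")
```

Keeping the positions in one table means an emulator written along these lines never hard-codes a shift count in the instruction handlers.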
The address space of the R1 is 32-bit, and each cell is 16 bits long. This implementation has been chosen to minimize the number of instructions needed for interfacing to memory. Had the address space been based on 8-bit cells, the complexity of the instruction set would have increased by 6 new instructions, and besides, even strings use 16 bits per character. If an implementation needs 8-bit cells, they can be implemented either by reserving a memory space where all addresses return 0 in the 8 MSBs, or by using a combination of bit-manipulation instructions.
The CPU does not feature a separate I/O bus, but uses memory-mapped I/O. This keeps the number of instructions low and does not add complexity to the system, since the CPU cannot gain the advantages, such as parallel I/O and memory access, found on advanced high-performance systems.
When the RESET pin is pulled low, the CPU is reset. When the CPU resets, almost all registers are set to 0. IP is initialized to the value stored at location 0fffffffch, and SP is set to 1000h. Thus, the first thing the CPU does is read 4 bytes from memory at 0fffffffch.
All the instructions are designed to fill exactly one 16-bit byte, thus simplifying the internal construction. An instruction byte comes in three different flavors: the 2-reg, 1-reg and no-regs versions.
All the instructions use a syntax where sssss and ddddd stand for the 5-bit number of the affected register, for example:
001110.10010.01101 - CMP R18, R13
R0 is never allowed as destination register - R0 can only be modified through JMP, CALL and J??. Trying to use R0 as a destination register results in an OPCODE interrupt.
6 bits | 5 bits | 5 bits |
---|---|---|
OpCode (oooooo) | Source Register (sssss) | Destination Register (ddddd) |
111111 | OpCode (ooooo) | Destination Register (ddddd) |
111111 | 11111 | OpCode (ooooo) |
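The three formats in the table can be told apart by checking the 111111 and 11111 prefixes in turn. Below is a minimal sketch of such a field splitter; the function name `decode_fields` and the dictionary keys are assumptions made for this example, not names used by the emulator.

```python
def decode_fields(instr):
    """Split a 16-bit R1 instruction into its format and bit fields."""
    instr &= 0xFFFF
    top6 = (instr >> 10) & 0x3F   # bits 15..10
    mid5 = (instr >> 5) & 0x1F    # bits 9..5
    low5 = instr & 0x1F           # bits 4..0
    if top6 != 0b111111:
        # 2-reg format: oooooo.sssss.ddddd
        return {"format": "2-reg", "opcode": top6, "src": mid5, "dst": low5}
    if mid5 != 0b11111:
        # 1-reg format: 111111.ooooo.ddddd
        return {"format": "1-reg", "opcode": mid5, "reg": low5}
    # no-regs format: 111111.11111.ooooo
    return {"format": "0-reg", "opcode": low5}

# 001110.10010.01101 uses the CMP opcode with sssss=R18 and ddddd=R13.
print(decode_fields(0b001110_10010_01101))
```

Because 111111 and 11111 are reserved as escape prefixes, the order of the two tests matters: the 2-reg check has to come first.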
NOP is the no-operation instruction, a space-filler. Issuing a NOP only takes up processor cycles and increments the instruction pointer.
Format: 000000.-----.-----
IP ← IP + 1
Copy an operand between two registers or from/to memory. All the instructions come in both a 16-bit version and a 32-bit version, known as MOVE.W and MOVE.L (or just MOVE). There are three basic instructions: fill a register with an immediate value, load/store a memory location with an immediate offset from a base register, and move between registers or between memory (with offset from BP) and registers. When moving words, it is important to remember that the upper 16 bits are unaffected by moves - they are neither copied nor modified.
Examples:
MOVE.B R18, R20          byte(R18) → byte(R20)
MOVE.W R20:[200], R12    word( mem[R20 + 200] ) → word(R12)
MOVE.B R19, [200]        byte(R19) → byte( mem[BP + 200] )
format: 110wmn.sssss.ddddd
m and n select whether to use the registers as offsets (from BP) or as registers. w selects word (1) or byte (0) moves.
IP ← IP + 1
if (m=0, n=0): ddddd ← sssss
if (m=0, n=1): [BP + ddddd] ← sssss
if (m=1, n=0): ddddd ← [BP + sssss]
if (m=1, n=1): [BP + ddddd] ← [BP + sssss]
zf ← value moved = 0
sf ← value moved < 0
format: 1110ws.bbbbb.rrrrr >offset<
Load/Store word/dword in memory with Base-Offset. Unlike MOVE, MOVE_BO lets you select your offset register. The assembler will automatically select BP (or a matching base if an assume is made) if no offset is selected (eg. MOVE [10], R20 is coded as MOVE BP:[10], R20).
IP ← IP + 1
disp ← word( [IP] )
IP ← IP + 4
if (w=0): rrrrr ← byte( [bbbbb+disp] )
if (w=1): rrrrr ← word( [bbbbb+disp] )
zf ← value moved = 0
sf ← value moved < 0
format: 111111.1110w.rrrrr >imm16/imm32<
w selects dword (0) or word (1) load
IP ← IP + 1
if (w=0): { rrrrr ← byte( [IP] ), IP ← IP + 1 }
if (w=1): { rrrrr ← word( [IP] ), IP ← IP + 2 }
zf ← value moved = 0
sf ← value moved < 0
Sets register to either 0 or FFFF FFFF.
format: 111111.0000b.rrrrr
IP ← IP + 1
if (b=0): { rrrrr ← 0000 0000, zf ← 1, sf ← 0 }
if (b=1): { rrrrr ← FFFF FFFF, zf ← 0, sf ← 1 }
The CPU supports a stack. The stack is controlled by the SP pointer, which points to the next location in the stack to use. SP is not calculated as an offset from BP. This protects the stack even if a procedure modifies BP.
format: 111111.10000.rrrrr
IP ← IP + 1
SP ← SP - 2
[SP] ← rrrrr
The POP instruction is the counterpart to the PUSH instruction. POP reads a dword from the top of the stack and increments the stack-pointer accordingly.
format: 111111.10001.rrrrr
IP ← IP + 1
rrrrr ← [SP]
SP ← SP + 2
zf ← rrrrr = 0
sf ← rrrrr31
Add source to destination and place the result in destination. The flags are updated to reflect the result of the addition. This instruction performs the normal binary add operation, so it can be used to add two's-complement numbers.
format: 00100c.sssss.ddddd
IP ← IP + 1
if (c=0): ddddd ← ddddd + sssss
if (c=1): ddddd ← ddddd + sssss + cf
cf ← carry-out-of-addition
of ← overflow-flag updated
zf ← ddddd = 0
sf ← ddddd31
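As an illustration of the pseudo-code above, here is a sketch of a 32-bit add with optional carry-in. The pseudo-code only says that the overflow flag is "updated"; this sketch assumes the conventional two's-complement rule (both operands have the same sign and the result has the other sign). The function name `add32` is made up for the example.

```python
MASK32 = 0xFFFFFFFF

def add32(dst, src, carry_in=0):
    """Return (result, flags) for ddddd + sssss (+ cf when carry_in is 1)."""
    dst &= MASK32
    src &= MASK32
    total = dst + src + carry_in
    result = total & MASK32
    flags = {
        "cf": total >> 32,                                  # carry out of bit 31
        # Assumed rule: operands share a sign, result has the other sign.
        "of": ((~(dst ^ src) & (dst ^ result)) >> 31) & 1,
        "zf": int(result == 0),
        "sf": result >> 31,
    }
    return result, flags

# 64-bit addition built from ADD on the low words and ADC on the high words.
low, low_flags = add32(0xFFFFFFFF, 0x00000001)
high, _ = add32(0x00000000, 0x00000000, low_flags["cf"])
print(hex(high), hex(low))
```

The last three lines show why the carry variant exists: the carry out of the low word is fed into the addition of the high word.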
Subtract source from destination, place result in destination. Update flags to reflect result of subtraction. The SBB is used when handling large integer operations. The subtract-instruction uses two's complement notation to perform the calculation and thus can have overflows.
format: 00110b.sssss.ddddd
IP ← IP + 1
if (b=0): ddddd ← ddddd - sssss
if (b=1): ddddd ← ddddd - sssss - cf
cf ← borrow-out-of-subtraction
of ← overflow-flag updated
zf ← ddddd = 0
sf ← ddddd31
Multiply the two integers in sssss and ddddd. Place the 64-bit result in sssss:ddddd (treat the pair as one 64-bit register). The order of the registers is chosen so that, if the result is known to fit in 32 bits, ddddd alone contains the result.
format: 01110i.sssss.ddddd
IP ← IP + 1
if (i=0): sssss:ddddd ← ddddd*sssss
if (i=1): sssss:ddddd ← ddddd*sssss (sign corrected)
sf ← cleared if unsigned multiply, set to MSB of sssss if signed mult.
zf ← zero-flag updated, set if sssss:ddddd=0, cleared otherwise
Divide the integer in ddddd by the integer in sssss. The remainder of the division is placed in sssss and the quotient in ddddd. If an overflow occurs, the overflow flag will be set. If a division by zero is attempted, a DIVBYZERO exception is raised (DBZ, interrupt 02 in the interrupt table).
format: 01111i.sssss.ddddd
IP ← IP + 1
if (sssss=0): raise EDivByZero
if (i=0): { sssss ← ddddd mod sssss, ddddd ← ddddd div sssss }
if (i=1): { sssss ← ddddd mod sssss, ddddd ← ddddd div sssss } (signed)
sf ← cleared if unsigned division, set to MSB of ddddd if signed division.
zf ← zero-flag updated, set if sssss:ddddd=0, cleared otherwise
The AND instruction will perform the logical bitwise AND between the two registers sssss and ddddd. The result will be placed in ddddd.
format: 010000.sssss.ddddd
IP ← IP + 1
ddddd ← ddddd & sssss (eg: 1101 & 1010 = 1000)
zf ← ddddd = 0
sf ← ddddd31
The OR instruction will perform bitwise OR between the two registers sssss and ddddd.
format: 010010.sssss.ddddd
IP ← IP + 1
ddddd ← ddddd | sssss (eg: 0101 | 1001 = 1101)
zf ← ddddd = 0
sf ← ddddd31
The XOR instruction will perform bitwise XOR between the two registers sssss and ddddd.
format: 010011.sssss.ddddd
IP ← IP + 1
ddddd ← ddddd ^ sssss (eg: 0101 ^ 1001 = 1100)
zf ← ddddd = 0
sf ← ddddd31
Shifts the ddddd reg by the indicated amount. 0's will be placed in the MSB and the bit shifted out of the register will be placed in the carry-flag. The i selects whether or not to treat sssss as an immediate value instead of a register.
format: 10000i.sssss.ddddd
IP ← IP + 1
if (i=0): treat sssss as register number
if (i=1): treat sssss as immediate
repeat sssss times: { cf ← d0, d ← 0, d31,d30,d29,d28...d1 }
zf ← ddddd = 0
sf ← ddddd31
Unlike SHR, SAR will extend the MSB into the new bit-positions.
format: 10001i.sssss.ddddd
IP ← IP + 1
if (i=0): treat sssss as register number
if (i=1): treat sssss as immediate
repeat sssss times: { cf ← d0, d ← d31,d31,d30,d29,d28...d1 }
zf ← ddddd = 0
sf ← ddddd31
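The difference between SHR and SAR is easiest to see side by side. The sketch below follows the two "repeat sssss times" loops literally, one bit per iteration; the function names and the `cf` parameter (the carry value that is kept when the count is 0) are assumptions of this example.

```python
MASK32 = 0xFFFFFFFF

def shr32(value, count, cf=0):
    """Logical shift right: zeros enter at the MSB."""
    value &= MASK32
    for _ in range(count):
        cf = value & 1            # bit shifted out goes to carry
        value >>= 1
    return value, cf

def sar32(value, count, cf=0):
    """Arithmetic shift right: the MSB is copied into the new positions."""
    value &= MASK32
    for _ in range(count):
        cf = value & 1
        value = (value >> 1) | (value & 0x80000000)
    return value, cf

print([hex(x) for x in shr32(0x80000001, 1)])
print([hex(x) for x in sar32(0x80000001, 1)])
```

For a negative two's-complement number SAR therefore behaves like a division by a power of two that rounds towards minus infinity, while SHR treats the value as unsigned.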
The bits in the register are shifted to the left. 0's fill the lower bits. The bit shifted out of the register is placed in the carry-flag.
format: 10010i.sssss.ddddd
IP ← IP + 1
if (i=0): treat sssss as register number
if (i=1): treat sssss as immediate
repeat sssss times: { cf ← d31; ddddd ← d30,d29,d28...d0,0 }
zf ← ddddd = 0
sf ← ddddd31
The bits in the register are rotated to the left. If c is set, the carry is used to rotate through, meaning the LSB is set to the carry and the MSB is shifted into the carry. If c is cleared, the MSB is rotated into the LSB.
format: 1010ci.sssss.ddddd
IP ← IP + 1
if (i=0): treat sssss as register number
if (i=1): treat sssss as immediate
repeat sssss times: if (c=1) { cf ← d31; ddddd ← d30,d29,d28...d0, old cf } else { cf ← d31; ddddd ← d30,d29,d28...d0, d31 }
zf ← ddddd = 0
sf ← ddddd31
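The rotate-left pseudo-code can be written out with plain shifts and masks. This is a sketch only, not the emulator's code (which, as noted earlier, does not implement the rotates yet); the function name and the `through_carry` parameter, standing in for the c bit, are made up for the example.

```python
MASK32 = 0xFFFFFFFF

def rotate_left(value, count, cf=0, through_carry=False):
    """Rotate a 32-bit value left one bit at a time.

    through_carry=False: the MSB re-enters at the LSB.
    through_carry=True:  the old carry enters at the LSB.
    In both cases the MSB that leaves the register ends up in the carry.
    """
    value &= MASK32
    for _ in range(count):
        msb = value >> 31
        incoming = cf if through_carry else msb
        value = ((value << 1) & MASK32) | incoming
        cf = msb
    return value, cf

print(rotate_left(0x80000001, 1))                        # plain rotate
print(rotate_left(0x80000000, 1, through_carry=True))    # rotate through carry
```

The rotate to the right is the mirror image: the LSB leaves the register and the incoming bit is placed at bit 31.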
The bits in the register are rotated to the right. If c is set, the carry is used to rotate through, meaning the MSB is set to the carry and the LSB is shifted into the carry. If c is cleared, the LSB is rotated into the MSB.
format: 1011ci.sssss.ddddd
IP ← IP + 1
if (i=0): treat sssss as register number
if (i=1): treat sssss as immediate
repeat sssss times: if (c=1) { cf ← d0; ddddd ← old cf, d31,d30...d2,d1 } else { cf ← d0; ddddd ← d0,d31,d30...d2, d1 }
zf ← ddddd = 0
sf ← ddddd31
Clears the specified bit in the register.
format: 01100i.sssss.ddddd
IP ← IP + 1
if (i=0): treat sssss as register number
if (i=1): treat sssss as immediate
the bit at position sssss in ddddd is cleared (bitpos counted from 0)
zf ← ddddd = 0
sf ← ddddd31
Sets the specified bit in the register supplied.
format: 01101i.sssss.ddddd
IP ← IP + 1
if (i=0): treat sssss as register number
if (i=1): treat sssss as immediate
the bit at position sssss in ddddd is set (bitpos counted from 0)
zf ← ddddd = 0
sf ← ddddd31
Copy the specified bit to the carry-flag. The instruction comes in two different versions - one where an immediate is specified in the sssss and another where the sssss is a register, whose value is used as index.
format: 01010i.sssss.ddddd
IP ← IP + 1
if (i=0): treat sssss as register number
if (i=1): treat sssss as immediate
cf ← bit number sssss of ddddd
Copy the carry-flag to the specified bit. The instruction comes in two different versions - one where an immediate is specified in the sssss and another where the sssss is a register, whose value is used as index.
format: 01011i.sssss.ddddd
IP ← IP + 1
if (i=0): treat sssss as register number
if (i=1): treat sssss as immediate
bit number sssss of ddddd ← cf
NOT inverts all the bits in rrrrr.
format: 111111.10100.rrrrr
IP ← IP + 1
rrrrr ← !rrrrr (eg. !1001001 = 0110110)
zf ← rrrrr = 0
sf ← rrrrr31
NEG inverts all bits in rrrrr and adds 1.
format: 111111.10101.rrrrr
IP ← IP + 1
rrrrr ← !rrrrr+1 (eg. !1001001+1 = 0110111)
zf ← rrrrr = 0
sf ← rrrrr31
INC adds 1 to the register supplied in rrrrr
format: 111111.10010.rrrrr
IP ← IP + 1
rrrrr ← rrrrr+1
cf ← carry-out-of-addition
of ← cf
zf ← rrrrr = 0
sf ← rrrrr31
DEC subtracts one from the register supplied in rrrrr.
format: 111111.10011.rrrrr
IP ← IP + 1
rrrrr ← rrrrr-1
cf ← borrow-out-of-subtraction
of ← cf
zf ← rrrrr = 0
sf ← rrrrr31
All jump/call instructions come in two flavors: immediate and register. The two versions differ in how they calculate the destination address: immediate uses relative addressing and register uses absolute addressing. A relocatable program must not use absolute addressing, but when working with things such as object-oriented programming and other systems that rely heavily on late linking/dynamic linking, it is necessary to jump to an absolute address in memory regardless of where the method is called from.
Compare the two registers. Comparing two registers updates the flags as if a subtraction was performed. That is, if ddddd-sssss=0, the zero-flag is set, otherwise cleared. If ddddd-sssss<0 (two's complement), the sign flag is set, otherwise cleared.
format: 001110.sssss.ddddd
zf ← (ddddd-sssss) = 0
sf ← (ddddd-sssss)31
Branch to the absolute address specified in r as a subroutine (push return address on stack)
format: 111111.00011.rrrrr
IP ← IP + 1
[SP] ← IP
SP ← SP - 2
IP ← rrrrr
Branch to immediate 32-bit relative address following this instruction as subroutine.
format: 111111.11111.00011
IP ← IP + 1
[SP] ← IP
SP ← SP - 2
IP ← imm32
The CPU pops the return-address off the stack and puts it in IP, then continues execution at the address.
format: 111111.11111.00000
SP ← SP + 2
IP ← [SP]
Causes the CPU to push the flags on the stack, look up the interrupt address and transfer control to that location. The interrupt-table consists of 2-byte entries with the address of the handler. The rrrrr specifies which type of interrupt is created. It is not recommended to use this opcode to transfer control to other methods.
format: 111111.?????.rrrrr
push F
push IP
ip ← [ IT + 2 * rrrrr ]
Used in conjunction with INT to return from an interrupt handler. First, the return-address is popped, then the flags.
format: 111111.11111.00001
pop IP
pop F
Branches to the absolute address specified in rrrrr.
format: 111111.00010.rrrrr
IP ← rrrrr
Branches to immediate 32-bit relative address following this instruction
format: 111111.11111.00010
IP ← imm32
Conditional jumps are a group of instructions that branch only when a given flag condition holds. The instructions, opcodes and their flag equivalents are listed below. All jumps come in two versions: relative immediate addressing and absolute register addressing. The following two formats are used - substitute ooooo with the combination read in the table below:
format: 111111.ooooo.rrrrr
format: 111111.11111.ooooo
Mnemonic | Jump if... | Condition | Opcode |
---|---|---|---|
JE / JZ | equal / zero | ZF=1 | 10000 |
JNE / JNZ | not equal / not zero | ZF=0 | 10001 |
JC | carry | CF=1 | 10010 |
JNC | no carry | CF=0 | 10011 |
JS | signed | SF=1 | 10100 |
JNS | not signed | SF=0 | 10101 |
JA | above | (ZF or CF)=0 | 10110 |
JAE | above or equal | CF=0 | 10111 |
JB | below | CF=1 | 11000 |
JBE | below or equal | (CF or ZF)=1 | 11001 |
JG | greater than (two's cmp) | ((SF xor OF) or ZF)=0 | 11010 |
JGE | greater than or equal to (two's cmp) | (SF xor OF)=0 | 11011 |
JL | less than (two's cmp) | (SF xor OF)=1 | 11100 |
JLE | less than or equal to (two's cmp) | ((SF xor OF) or ZF)=1 | 11101 |
* This table is based entirely on the jump instructions of the Intel 80x86 processor series as described in Microprocessors and Interfacing [1], figure 4-10, p. 76. Only the parity and overflow jumps have been removed, leaving a total of 14 jump instructions.
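The condition column of the table translates directly into predicates on the flags. The sketch below pairs them with a CMP-style flag computation. Note the assumptions: the CMP description only spells out zf and sf, so the carry (borrow) and overflow rules used here are the conventional ones for a subtraction, and all names (`cmp_flags`, `CONDITIONS`, `jump_taken`) are invented for this example.

```python
MASK32 = 0xFFFFFFFF

def cmp_flags(dst, src):
    """Flags after comparing ddddd with sssss, i.e. as if computing dst - src."""
    dst &= MASK32
    src &= MASK32
    result = (dst - src) & MASK32
    return {
        "zf": int(result == 0),
        "cf": int(dst < src),                               # assumed: borrow
        "sf": result >> 31,
        "of": (((dst ^ src) & (dst ^ result)) >> 31) & 1,   # assumed: signed overflow
    }

# One predicate per row of the condition table.
CONDITIONS = {
    "JE":  lambda f: f["zf"] == 1,
    "JNE": lambda f: f["zf"] == 0,
    "JC":  lambda f: f["cf"] == 1,
    "JNC": lambda f: f["cf"] == 0,
    "JS":  lambda f: f["sf"] == 1,
    "JNS": lambda f: f["sf"] == 0,
    "JA":  lambda f: (f["zf"] | f["cf"]) == 0,
    "JAE": lambda f: f["cf"] == 0,
    "JB":  lambda f: f["cf"] == 1,
    "JBE": lambda f: (f["cf"] | f["zf"]) == 1,
    "JG":  lambda f: ((f["sf"] ^ f["of"]) | f["zf"]) == 0,
    "JGE": lambda f: (f["sf"] ^ f["of"]) == 0,
    "JL":  lambda f: (f["sf"] ^ f["of"]) == 1,
    "JLE": lambda f: ((f["sf"] ^ f["of"]) | f["zf"]) == 1,
}

def jump_taken(mnemonic, flags):
    """Return True if the conditional jump would be taken with these flags."""
    return CONDITIONS[mnemonic](flags)

# 0xFFFFFFFF is "above" 1 when unsigned, but "less than" 1 as two's complement.
flags = cmp_flags(0xFFFFFFFF, 1)
print(jump_taken("JA", flags), jump_taken("JL", flags))
```

The example at the end shows why both the above/below and the greater/less families are needed: the same comparison gives opposite answers depending on whether the operands are read as unsigned or as two's complement.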
Causes the CPU to halt, waiting until an interrupt is issued.
format: 111111.11111.11111
All instructions in this group have the format:
oooooo.sssss.ddddd
The table should be read so that the 4 most significant bits are read from the rows, and the columns supply the two least significant bits of the instruction.
↓ 4 msb | 2 lsb → | 00 | 01 | 10 | 11 |
---|---|---|---|---|
0000 | NOP | |||
0001 | ||||
0010 | ADD | ADC | ||
0011 | SUB | SBB | CMP | |
0100 | AND | TEST | OR | XOR |
0101 | BTT | BTT imm | BTR | BTR imm |
0110 | BTC | BTC imm | BTS | BTS imm |
0111 | MUL | IMUL | DIV | IDIV |
1000 | SHR | SHR imm | SAR | SAR imm |
1001 | SHL | SHL imm | ||
1010 | ROR | ROR imm | RCR | RCR imm |
1011 | ROL | ROL imm | RCL | RCL imm |
1100 | MOVE.L | MOVE.L | MOVE.L | MOVE.L |
1101 | MOVE.W | MOVE.W | MOVE.W | MOVE.W |
1110 | MOVE_BO.L | MOVE_BO.L | MOVE_BO.W | MOVE_BO.W |
1111 | reserved |
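Read as a lookup structure, the table maps the upper 4 bits of the opcode to a row and the lower 2 bits to a column. The sketch below transcribes the table as it stands (including its placement of ROR/ROL) into a dictionary; `TWO_REG` and `mnemonic_2reg` are names made up for this example, and `None` marks the empty cells.

```python
# Rows are indexed by the 4 most significant opcode bits,
# columns by the 2 least significant opcode bits.
TWO_REG = {
    0b0000: ("NOP", None, None, None),
    0b0010: ("ADD", "ADC", None, None),
    0b0011: ("SUB", "SBB", "CMP", None),
    0b0100: ("AND", "TEST", "OR", "XOR"),
    0b0101: ("BTT", "BTT imm", "BTR", "BTR imm"),
    0b0110: ("BTC", "BTC imm", "BTS", "BTS imm"),
    0b0111: ("MUL", "IMUL", "DIV", "IDIV"),
    0b1000: ("SHR", "SHR imm", "SAR", "SAR imm"),
    0b1001: ("SHL", "SHL imm", None, None),
    0b1010: ("ROR", "ROR imm", "RCR", "RCR imm"),
    0b1011: ("ROL", "ROL imm", "RCL", "RCL imm"),
    0b1100: ("MOVE.L", "MOVE.L", "MOVE.L", "MOVE.L"),
    0b1101: ("MOVE.W", "MOVE.W", "MOVE.W", "MOVE.W"),
    0b1110: ("MOVE_BO.L", "MOVE_BO.L", "MOVE_BO.W", "MOVE_BO.W"),
}

def mnemonic_2reg(instr):
    """Return the mnemonic of a 2-reg instruction, or None if unassigned."""
    opcode = (instr >> 10) & 0x3F
    row = TWO_REG.get(opcode >> 2)
    if row is None:
        return None
    return row[opcode & 0b11]

print(mnemonic_2reg(0b001110_10010_01101))
```

An emulator could treat a `None` result as the trigger for the OPCODE interrupt.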
All instructions in this group have the format:
111111 ooooo rrrrr
↓ 10 msb | 1 lsb → | 0 | 1 |
---|---|---|
111111 0000 | CLR | SET |
111111 0001 | NOT | NEG |
111111 0010 | INC | DEC |
111111 0011 | PUSH | POP |
111111 0100 | MOVE.I | MOVE.IW |
111111 0101 | ||
111111 0110 | ||
111111 0111 | JMP | CALL |
111111 1000 | JE | JNE |
111111 1001 | JC | JNC |
111111 1010 | JS | JNS |
111111 1011 | JA | JAE |
111111 1100 | JB | JBE |
111111 1101 | JG | JGE |
111111 1110 | JL | JLE |
111111 1111 | Reserved |
The 15 most significant bits are shown in the rows; the least significant bit of the instruction selects the column.
↓ 15 msb | 1 lsb → | 0 | 1 |
---|---|---|
111111 11111 0000 | RET | IRET |
111111 11111 0001 | ||
111111 11111 0010 | ||
111111 11111 0011 | ||
111111 11111 0100 | ||
111111 11111 0101 | ||
111111 11111 0110 | ||
111111 11111 0111 | JMP imm | CALL imm |
111111 11111 1000 | JE imm | JNE imm |
111111 11111 1001 | JC imm | JNC imm |
111111 11111 1010 | JS imm | JNS imm |
111111 11111 1011 | JA imm | JAE imm |
111111 11111 1100 | JB imm | JBE imm |
111111 11111 1101 | JG imm | JGE imm |
111111 11111 1110 | JL imm | JLE imm |
111111 11111 1111 | INT4 (BRKPNT) | HALT |
Interrupts are prioritized by number: 0 is the highest priority and 31 the lowest. A higher-priority interrupt can only interrupt the execution of a lower-priority interrupt if the interrupt flag is set. This does not apply to the non-maskable interrupts (the first 8 interrupts), which can interrupt any lower-priority executing interrupt. This is because it may be necessary to force certain kinds of exceptions through even though an interrupt is already being handled. For example, suppose we are handling a timer interrupt exclusively. The timer interrupt code contains illegal instructions which cannot be executed, so an OPCODE interrupt must be generated. Unfortunately, while we are handling the OPCODE interrupt, an NMI occurs (we are at a nuclear plant and the reactor is just about to blow), so it is probably good to handle the NMI even though we are already handling the OPCODE interrupt. But then a RAMERR occurs. This interrupt is ignored for now, since an NMI really cannot be interrupted, so it has to wait until we start working on the OPCODE interrupt again. The OPCODE handler, on the other hand, cannot execute correctly with a RAMERR pending, so RAMERR is handled first, then OPCODE is completed, and finally the timer interrupt may be completed or abandoned.
The following 16 are currently assigned in the emulator - the first 8 interrupts are unmaskable and will be executed even though the interrupt flag is cleared. The last 8 interrupts are features included in the emulator/CPU, the 3 external pins for interrupts, IEXT0 through IEXT2 and
Interrupts are serviced when the currently executing instruction is completed (usually a couple of clock-ticks later).
The interrupt table was modified on 2000-08-30 to accommodate 4 timers instead of only 3, and the names were changed to timer-0 through timer-3. Apart from that, the numbers were reassigned for better high-speed
No. | Name | Meaning | Reason |
---|---|---|---|
00 | NMI | Non-Maskable interrupt | This interrupt is generated when the external NMI pin is a logical 1. |
01 | RAMERR | RAM-Error | If the CPU is equipped with ECC or parity-enabled RAM, the RAMERR pin can be used to signal an error to the CPU. If the RAMERR pin is a logical 1, this interrupt is generated. On systems not equipped with error-detecting/correcting RAM, the pin should always be zero. |
02 | DBZ | Division by Zero | When a division with zero is attempted. |
03 | OPCODE | Invalid Opcode detected | When an application tries to execute an invalid operation, this interrupt is generated. Among invalid operations are attempts to modify R0 with an arithmetic instruction. |
04 | BRKPNT | Break-Point | When a break-point instruction is encountered in the instruction stream, this interrupt is generated. |
05 | PFAULT | Page-fault | Page not present. This interrupt is generated when the page, on which the requested storage cell is placed, is (according to the page-table) not present in memory. |
06 | reserved | ||
07 | reserved | ||
08 | OVRFLOW | Overflow | If the Interrupt-On-Overflow flag is set, this interrupt is generated when a 2's-complement arithmetic instruction causes an overflow. |
09 | IEXT0 | Pin /IEXT0 | When a logical zero (low) is applied to pin IEXT0, this interrupt is raised. |
0A | TIMER0 | Timer-0 | This interrupt is used by the internal timer/counter circuit in the CPU. This interrupt is coupled with timer-0. |
0B | IEXT1 | Pin /IEXT1 | When a logical zero (low) is applied to pin IEXT1, this interrupt is raised. |
0C | TIMER1 | Timer-1 | This interrupt is used by the internal timer/counter circuit in the CPU. This interrupt is coupled with timer-1. |
0D | IEXT2 | Pin /IEXT2 | When a logical zero (low) is applied to pin IEXT2, this interrupt is raised. |
0E | TIMER2 | Timer-2 | This interrupt is used by the internal timer/counter circuit in the CPU. This interrupt is coupled with timer-2. |
0F | TIMER3 | Timer-3 | This interrupt is used by the internal timer/counter circuit in the CPU. This interrupt is coupled with timer-3. |
10-1F | IEXT | External interrupts | When a logical zero (low level) is applied to an external interrupt, the corresponding interrupt is raised. |
Execution of an interrupt follows these steps:
SP ← SP - 2
[SP] ← F
SP ← SP - 2
[SP] ← IP
IP ← [IT+2*IntNo]
After an interrupt is completed, the following steps are taken
IP ← [SP]
SP ← SP + 2
F ← [SP]
SP ← SP + 2
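The two sequences above can be traced with a very small model of the CPU state. This is a sketch under a simplifying assumption: memory is a dictionary that stores one whole 32-bit value per address, since the text does not say how a 32-bit value is split over two 16-bit cells. The class and method names are hypothetical.

```python
class Cpu:
    """Tiny model of the state touched by interrupt entry and return."""

    def __init__(self):
        self.ip = 0
        self.flags = 0
        self.sp = 0x1000   # value of SP after reset
        self.it = 0        # absolute address of the interrupt table
        self.mem = {}      # address -> 32-bit value (simplified memory)

    def enter_interrupt(self, number):
        # Push flags, push return address, then fetch the handler address.
        self.sp -= 2
        self.mem[self.sp] = self.flags
        self.sp -= 2
        self.mem[self.sp] = self.ip
        self.ip = self.mem[self.it + 2 * number]

    def leave_interrupt(self):
        # Pop in the opposite order: return address first, then flags.
        self.ip = self.mem[self.sp]
        self.sp += 2
        self.flags = self.mem[self.sp]
        self.sp += 2

cpu = Cpu()
cpu.it = 0x2000
cpu.mem[cpu.it + 2 * 3] = 0x4000     # hypothetical handler for OPCODE (03)
cpu.ip, cpu.flags = 0x1234, 0x0100
cpu.enter_interrupt(3)
print(hex(cpu.ip), hex(cpu.sp))
```

Calling `cpu.leave_interrupt()` afterwards restores IP, the flags and SP to the values they had before the interrupt, which is exactly the property the push/pop ordering is meant to guarantee.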
The R1 assembler is a simple assembler supporting the basic functionality required of an assembler. Macros and modules are not supported.
Most syntax is given in Extended Backus-Naur Form. This means that the left-hand side of ::= names the grammatical element being defined, and the right-hand side lists the elements that make it up. Elements enclosed in "[" and "]" are optional; elements enclosed in "{" and "}" are repeated 0 or more times.
The scanner performs the first processing of the input. This consists of tokenizing the input into more manageable tokens consisting of several characters. To handle case-insensitivity properly, all letters are converted to lowercase during the scanner's processing. When the scanner detects an identifier, it uses a look-up table to match the identifier against the reserved words of the assembler; if a match is found, the identifier is converted to the appropriate token type. This behavior will not always be apparent in the following chapters, but the approach is chosen for the consistency of this document.
The output from the scanner is a set of tokens obeying these rules:
letter ::= 'a'..'z'
digit ::= '0'..'9'
hexdigit ::= '0'..'9', 'a'..'f'
alphanumeric ::= <letters>,<digits>
ident ::= <letter> { <alphanumeric> }
const ::= <digit> [ <hex-digits> ]
register ::= "ip" | "bp" | "sp" | "it" | "fl" | "flags" | "r00" | "r01" | "r02" | ... | "r31"
string ::= '"' { chars } '"'
Strings are sequences of characters enclosed in double-quotes. The following escape-sequences are supported:
\b = backspace (8)
\t = horizontal tabulator (9)
\n = line-feed (10)
\r = carriage return (13)
\\ = \
\" = "
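A scanner can resolve these escape sequences with a simple table lookup. The sketch below works on the text between the double quotes; the function name `unescape` and the choice to reject unknown sequences with an error are assumptions of this example, since the text does not say how the assembler reacts to them.

```python
# Escape letter -> character, exactly the six sequences listed above.
ESCAPES = {"b": "\b", "t": "\t", "n": "\n", "r": "\r", "\\": "\\", '"': '"'}

def unescape(body):
    """Replace the supported escape sequences in the body of a string token."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            i += 1
            if i >= len(body) or body[i] not in ESCAPES:
                raise ValueError("unsupported escape sequence")
            out.append(ESCAPES[body[i]])
        else:
            out.append(ch)
        i += 1
    return "".join(out)

print([ord(c) for c in unescape(r"\b\t\n\r")])
```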
Comments, too, are handled in the scanner. When the scanner encounters the "#" character, it issues an end-of-line token.
comment ::= "#" <rest of line>
line ::= <directive> | <label> [ <instruction> ] | <instruction>
The assembler supports expressions (resolving to constants during assembly) with operator priority. This results in the following set of grammar:
E0 ::= <E1> { ("+" | "-") <E1> }
E1 ::= <E2> { "*" <E2> }
E2 ::= <value> | "(" <E0> ")"
value ::= <const> | <ident(type==const)> | "@" <ident(type!=const)>
Note here, that if an identifier is used in a value, it must be a constant or the address of the identifier - this is to ensure that the type of the expression is a value.
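The three-level grammar maps naturally onto a recursive-descent evaluator with one function per rule. The following is a sketch under stated assumptions, not the assembler's actual code: numbers are read as decimal or as 0x-prefixed hexadecimal, underscores are allowed in identifiers, and the `constants` and `addresses` dictionaries are hypothetical stand-ins for the assembler's symbol table.

```python
import re

# Assumed token shapes: 0x-hex, decimal, identifier, or a single operator.
TOKEN = re.compile(r"\s*(0x[0-9a-f]+|\d+|[a-z][a-z0-9_]*|[-+*()@])", re.I)

def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1).lower())   # scanner lowercases everything
        pos = match.end()
    return tokens

def evaluate(text, constants=None, addresses=None):
    """Evaluate an E0 expression to an integer."""
    constants = constants or {}
    addresses = addresses or {}
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def e0():                       # E0 ::= <E1> { ("+" | "-") <E1> }
        value = e1()
        while peek() in ("+", "-"):
            if take() == "+":
                value += e1()
            else:
                value -= e1()
        return value

    def e1():                       # E1 ::= <E2> { "*" <E2> }
        value = e2()
        while peek() == "*":
            take()
            value *= e2()
        return value

    def e2():                       # E2 ::= <value> | "(" <E0> ")"
        token = take()
        if token is None:
            raise SyntaxError("unexpected end of expression")
        if token == "(":
            value = e0()
            if take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token == "@":            # address of a non-constant identifier
            return addresses[take()]
        if token[0].isdigit():
            return int(token, 16) if token.startswith("0x") else int(token)
        return constants[token]     # identifier bound to a constant

    result = e0()
    if peek() is not None:
        raise SyntaxError("unexpected trailing input")
    return result

print(evaluate("(2 + 3) * 4"))
print(evaluate("@myVar + 0x10", addresses={"myvar": 0x2998}))
```

Because multiplication lives one level below addition and subtraction, the operator priority falls out of the call structure without any explicit precedence table.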
constant ::= <ident> "=" <E0>
A constant is an identifier bound to a constant value. Constants should be used for "magic numbers" in the code, so that only one number needs to be changed. The E0 supplied must be resolveable in the first pass; this is to allow the use of constants in labels.
directive ::= "." <dirspec>
dirspec ::= "end" | "assume" <assume_list> | "seg" <ident>
assume_list::= { <reg> ":" <ident> }
Directives are used to control the behavior of the assembler during assembly.
.end is used to indicate the end of the file. All text after .end is ignored.
When the .seg directive is encountered, the assembler creates a new segment of data. You may freely place whatever data you desire in any segment. The .seg must be followed by a unique identifier so that no two segments collide.
.assume is used by the assembler to choose a proper base-register when addressing. The statement is followed by a list of registers and segments that the registers point to - thus all idents in the directive must be segments.
.seg data
be: db 200
.seg code
.assume bp:data
cell: dw 10
move be, r10     # due to assume, this is assembled as "move.b bp:[@be], r10"
move cell, r11   # since cell is in current segment, a relative indexing
                 # is used "move.b ip:[@cell-$], r11"
.end
Labels are used both for jump destinations and for defining storage for the program, therefore the syntax is quite flexible to allow these two purposes
label ::= <ident> "=" <E0 resolvable> | <ident> ":" [ <storagespec> [ "*" <E2 resolvable> | <num_list> ] | <inst> ]
storagespec ::= "db" | "dw"
num_list ::= ( <E0> | <string> ) { "," ( <E0> | <string> ) }
If a label is followed by a storagespec, the identifier will be marked as a memory-location rather than a label and may then be used in move-statements. If the storagespec is followed by a "*", the allocation is repeated the number of times supplied afterwards. If the line contains more, a list of values is expected and parsed. For each value it parses, the counter is increased by the size of the first element.
If the label is not followed by a storagespec, the identifier is marked as a label and the rest of the line is parsed as an instruction.
The offset of a variable (the address assigned) is obtained by prefixing the variable with @, so if you want to move the address of eg. myVar (whose offset is 0x02998) into a register, you may write "move @myVar, R10", which will be assembled the same way as "move 0x02998, R10".
Parsing instructions is probably the most difficult task for the assembler - it requires a lot of work to handle these syntactic rules. All mne_? tokens are processed in the scanner.
movemod ::= "B" | "W"
mne_move ::= "MOVE" [ "." <movemod> ]
mne_bitman ::= "BTT" | "BTR" | "BTS" | "BTC" | "SHL" | "SHR" | "SAR" | "ROL" | "RCL" | "ROR" | "RCR"
mne_jmp ::= "CALL" | "JMP" | "JE" | "JNE" | "JC" | "JNC" | "JS" | "JNS" | "JA" | "JAE" | "JB" | "JBE" | "JG" | "JGE" | "JL" | "JLE"
mne_2reg ::= "ADD" | "SUB" | "AND" | "OR" | "XOR" | "TEST"
mne_1reg ::= "CLR" | "SET" | "PUSH" | "POP" | "INC" | "DEC" | "NOT" | "NEG"
mne_0reg ::= "HALT" | "RET" | "IRET"
op0reg ::= <mne_0reg>
op1reg ::= <mne_1reg> <reg>
op2reg ::= <mne_2reg> <reg> "," <reg>
opbitman ::= <mne_bitman> <reg> "," <reg> | <mne_bitman> <E0> "," <reg>
opjmp ::= <mne_jmp> <reg> | <mne_jmp> <ident(type==label)>
moveoperand::= <const> | "@" <ident> | "(" E0 ")" | "[" <reg> "]" | "[" <E0> "]" | <ident> [ "[" <E0> "]" ] | <reg> | <reg> ":" <moveoperand>
opmove ::= <mne_move> <moveoperand> "," <moveoperand>
inst ::= <opmove> | <opjmp> | <opbitman> | <op2reg> | <op1reg> | <op0reg>
While all other instructions are quite simple to parse, the move-instruction raises the most problems due to its higher grade of flexibility. Therefore we will take a look at the move-instruction, where we handle each of the different cases:
moveoperand::= <const> | "@" <ident> | "(" E0 ")" | "[" <reg> "]" | "[" <E0> "]" | <ident> [ "[" <E0> "]" ] | <reg> | <reg> ":" <moveoperand>
opmove ::= <mne_move> <moveoperand> "," <moveoperand>
<const>: Simply handled - a const may never occur on the rhs in the move.
"@" <ident>: Take the address of the identifier - only possible if the identifier is a label or memory reference.
"(" E0 ")": Handled as a constant - may thus never occur on the rhs of the move.
"[" <reg> "]" and "[" <E0> "]": Whenever something is enclosed in "[" "]" it is considered an offset (a reference to memory). If a base-register is not supplied through the <reg> ":" <moveoperand> syntax, the current assumes will be used to attempt to resolve the base-register to use; this is of course not relevant if a register is enclosed between the brackets, in which case the base-register is always bp.
<reg> ":" <moveoperand>: Base override - used when you wish to define your own base-register for the addressing. When this syntax is used, the base-register of the moveoperand following need not be resolvable.
<ident> "[" <E0> "]": Array indexing - the E0 value will be resolved, multiplied by the element size of ident, and the displacement added to the address of ident to form the complete offset. The base-register needed to address ident must be resolvable unless the above syntax is used.
Of course - the simplicity of the CPU sets a great deal of restrictions on the use of the move-instruction. But in general:
Besides, try it out: the assembler reports a syntax error if it encounters an instruction it does not know how to assemble. You may be able to assemble the instruction manually, in which case you can use a label to resolve your problem.
Examples of assembler code
# Tell the assembler that the data-area starts here.
# The assembler then assumes, that BP points to the start of the data-area,
# thus it always generates move [BP+@varname],or move ,[BP+@varname]
# instructions.
.seg data
# three constants are defined here
pot10_1 = 10
pot10_2 = 100
keyb_io = 0x0A
# a definition of a series of bytes
pot10_3: db 1000
# here two words known as "pot10_6" and "pot10_9"
pot10_6: dw 1000000
pot10_9: dw 1000000000
# and now a complete array of words, first memory location is "pot10"
pot10: dw 10,100,1000,10000,100000,1000000,10000000,100000000,1000000000
# when no initial value is supplied, a zero is assumed
read: db
.seg code
.assume bp:data
start: move.b [keyb_io], R10   # this is allowed since keyb_io is bound to constant value
move.b ascii_a, R11            # the value ascii_a is loaded in R11
add R11, R10                   # the two regs are added
move.b R10, read               # this syntax is allowed since "read" is bound to a memory-loc.
.end                           # the code ends at ".end". The assembler will not try to process anything after ".end".
The R1 emulator features the same set of peripheral devices as found in the CPU, that is, the integrated interrupt controller and the 4 timers/counters timer-0 to timer-3.
The timers are count-up timers, since timers/counters are usually used to raise an interrupt after a given number of events has occurred. Each timer has an event value, which is the count at which an interrupt is desired. When the counter reaches this value, the interrupt pin is set. If the counter is set for auto-reset, 0 is put into the counter, which is then ready for another run.
The 4 timers/counters are accessible at the following absolute addresses:
Address | Contains |
---|---|
0xFFFF0000 | Timer0 event value (write), counter value (read) |
0xFFFF0002 | Timer0 control word |
0xFFFF0004 | Timer1 event value (write), counter value (read) |
0xFFFF0006 | Timer1 control word |
0xFFFF0008 | Timer2 event value (write), counter value (read) |
0xFFFF000A | Timer2 control word |
0xFFFF000C | Timer3 event value (write), counter value (read) |
0xFFFF000E | Timer3 control word |
Should you try to access eg. 0xFFFF0001, an undefined result will be returned - only word-reads on exact addresses are allowed.
The control words have the following format:
00000000.00000000.00000000.000ssize
ss - select source
  00 - internal clock
  01 - external clock (T?CLK)
  10 - select last counters event flag
  11 - reserved
i - interrupt on eventval
  0 - do not generate interrupt when D == eventcount
  1 - generate interrupt when D == eventcount
z - reset on eventval
  0 - do not reset counter when eventvalue is reached
  1 - reset counter when eventvalue is reached
e - counter enabled
  0 - counter not enabled
  1 - counter enabled
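To make the control word concrete, here is a sketch that decodes the five low bits and steps a count-up timer. The field positions follow the ssize pattern (e is bit 0, z bit 1, i bit 2, ss bits 4-3); the names `decode_timer_control` and `Timer`, and the idea of calling `tick()` once per counted event, are assumptions of this example rather than the emulator's design.

```python
def decode_timer_control(word):
    """Split a timer control word into its fields."""
    return {
        "source": (word >> 3) & 0b11,      # ss: 0 internal, 1 external, 2 previous counter
        "interrupt": bool(word & 0b100),   # i: interrupt when the event value is reached
        "auto_reset": bool(word & 0b010),  # z: reset the counter on the event value
        "enabled": bool(word & 0b001),     # e: counter enabled
    }

class Timer:
    """Count-up timer driven by one tick() call per counted event."""

    def __init__(self, event_value, control):
        self.event_value = event_value
        self.control = decode_timer_control(control)
        self.counter = 0

    def tick(self):
        """Count one event; return True if an interrupt should be raised."""
        if not self.control["enabled"]:
            return False
        self.counter += 1
        if self.counter != self.event_value:
            return False
        if self.control["auto_reset"]:
            self.counter = 0
        return self.control["interrupt"]

timer = Timer(event_value=3, control=0b00111)   # internal clock, i=1, z=1, e=1
print([timer.tick() for _ in range(6)])
```

With auto-reset enabled the timer fires on every third event; with z cleared it would fire once and then keep counting past the event value.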
No warranties are associated with the information provided here. All information and concepts are public domain (unless I overlooked someone).
Last updated 2000-10-11 by FlushedSector