Abstract

The R1 is a simplified load/store microprocessor. The CPU is designed for use in embedded applications and features 32 registers, each 32 bits wide, a large set of register-to-register instructions for higher performance, and a simplified memory architecture based on 16-bit bytes.

Index

Abstract
Index
Legend
Architecture

Registers
Flags
Memory

Boot-Up
Instructions

NOP
Memory Access Instructions

MOVE - copy between memory and registers
SET / CLR - Set or clear register
PUSH - Push a register on the stack
POP - Pop a register off the stack

Arithmetic Instructions

Control Transfer Instructions

CMP - Compare two registers
CALL - Call subroutine
RET - Return from sub-routine
INT - software interrupt
IRET - Interrupt return
JMP - Unconditional Jump
J?? - Conditional Jump
HALT - Halt

Instruction map

Group 1 - 2-reg operations
Group 2 - 1-reg operations
Group 3 - no registers in operation

Interrupts
Assembler

Scanner
Comments
Asm lines
Expressions
Constants
Assembler directives
Labels
Instructions
"move" instruction
Sample code

Legend

The following legends are used throughout the text:

This is the opcode format. It describes the bits in the instruction, as in this example:
111111.11010.rrrrr
It means that the 6 most significant bits are all 1's, the next 5 bits are 11010, and the last 5 bits form a number referred to as rrrrr.

This is pseudo-code.
You read this line-for-line, from the top.
Actually, I wanted indents for blocks of code, but HTML does not allow this, so I have used a combination of {} and ,:

start by doing this
if (a=0): { this is done if a is 0, likewise is this }
if (a=1): { this is only done if a is 1, and this too }

but this is always done

Sometimes it is necessary to reference individual bits in a register. Bits are counted from 0 upwards, where bit 0 is the least significant bit and bit 31 is the MSB (the sign bit in two's complement). The following notation is used to address individual bits:

rrrrr₃₁

This is the MSB of the register with number given in rrrrr.
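
As a small illustration of the bit notation, the following Python sketch extracts a single bit from a 32-bit register value. The function name bit is only chosen for this example; it is not part of the R1 tools.

```python
def bit(value: int, n: int) -> int:
    """Return bit n of a 32-bit register value (bit 0 = LSB, bit 31 = MSB)."""
    if not 0 <= n <= 31:
        raise ValueError("bit index must be in 0..31")
    return (value >> n) & 1


# rrrrr₃₁ for a register holding 8000 0000 is 1, i.e. the value is
# negative when read as two's complement.
msb = bit(0x80000000, 31)
```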

Please note that some of the instructions are not supported by the current version of the emulator, and some are very difficult to implement. The ones not implemented are ROR, ROL (very difficult in pure C++) and INT. The ones that will probably not be present in the final version are MUL, IMUL, DIV and IDIV, because they demand large amounts of either time or hardware (single-cycle execution is hard to reach). Unimplemented instructions cause an OPCODE interrupt in the emulator, which gives the most correct emulation.

Architecture

A word is considered to be 32-bit long, whereas a byte is 16-bit.

Registers

The CPU is equipped with 32 registers, each 32 bits long. The registers are known as R0 to R31.

Register name	Purpose
R00 (IP)	Instruction Pointer - points to next instruction to execute
R01 (F)	Flags, see below
R02 (SP)	Stack pointer. The stack grows downwards.
R03 (BP)	Base pointer. Addresses are calculated as offset from this pointer
R04 (IT) *	Absolute address of interrupt table (this is NOT an offset from BP)
R05 *	reserved
R06 (PTB) *	Page Table Base (N/A, for virtual memory)
R07 (PTS) *	Page Table Size (N/A, for virtual memory)
R08-R31	Free-to-use general purpose

* Can only be changed by an application running in supervisor mode.

Flags

Flags:
 ssp00000 00000000 00000vbi 000boscz
 | |                    |||    ||||+- Zero-flag
 | |                    |||    |||+-- Carry/Borrow out of last bit 
 | |                    |||    ||+--- Sign flag
 | |                    |||    |+---- Overflow, last arithmetic action 
 | |                    |||    |      (often equal to carry, but not on div)
 | |                    |||    +----- Byte Arithmetics (N/A in CPU1)
 | |                    ||+---------- Interrupts enabled
 | |                    |+----------- Single-Step interrupt
 | |                    +------------ Overflow interrupt
 | +--------------------------------- Paging enabled (not available in CPU1)
 +----------------------------------- Supervisor level (not available in CPU1)

Zero-flag: The Zero-flag will be set if the result of the last operation was zero. This happens if a subtraction gives zero, a load operation loads zero, etc.
Carry/Borrow: The Carry-flag is set if the last arithmetic operation caused a carry out of the most significant bit.
Sign: The sign flag will be set if the result of the last operation was below zero in two's complement.
Overflow: Set if the last arithmetic operation caused an overflow.
Byte Arithmetics: Sets the CPU to perform byte-wise arithmetic operations (add/sub).
Interrupts enabled: If this flag is zero, interrupts are masked; otherwise interrupts are enabled.
Paging enabled: Indicates that the CPU uses page-tables for addressing physical memory.
Supervisor Level: The protection level of this task. If a task wants access to the paging-flag etc., it must have level 0.
Single-Step Interrupt: When this flag is set, the CPU generates a BreakPoint exception after the execution of each instruction. This is for debugging purposes.
Overflow Interrupt: When set, the CPU automatically generates an Overflow interrupt when an arithmetic operation causes overflow.
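
The bit positions in the flag diagram can be written down as constants. The Python sketch below reads them off the diagram (bit 0 is the rightmost character); the short names such as OVI and SS are invented for this example and are not official mnemonics.

```python
# Bit positions read from the flag diagram (bit 0 = rightmost character).
FLAG_BITS = {
    "ZF": 0,       # zero
    "CF": 1,       # carry/borrow
    "SF": 2,       # sign
    "OF": 3,       # overflow
    "BYTE": 4,     # byte arithmetics
    "IE": 8,       # interrupts enabled
    "SS": 9,       # single-step interrupt
    "OVI": 10,     # overflow interrupt
    "PAGING": 29,  # paging enabled
}


def describe_flags(f: int):
    """Return (names of the set flags, 2-bit supervisor level field)."""
    names = [name for name, pos in FLAG_BITS.items() if (f >> pos) & 1]
    level = (f >> 30) & 0b11  # the two 's' bits, 31..30
    return names, level
```

For example, describe_flags(0b101) reports the zero and sign flags as set, with supervisor field 0.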

Memory

The address-space of the R1 is 32-bit, and each cell is 16 bits long. This implementation has been chosen to minimize the number of instructions needed for interfacing to memory. If the address-space had been based on 8-bit cells, the complexity of the instruction set would have increased by 6 new instructions, and besides, even strings use 16 bits per character. If an implementation needs 8-bit cells, they can be implemented either by reserving a memory region where all addresses return 0 in the 8 MSBs, or by using a combination of bit-manipulation instructions.
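
The second option, keeping two 8-bit values in one 16-bit cell with shifts and masks, could look roughly like this in Python. The function names and the choice of which value goes in the upper half are assumptions made for the example.

```python
def pack_octets(high: int, low: int) -> int:
    """Combine two 8-bit values into one 16-bit memory cell."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


def unpack_octets(cell: int):
    """Split a 16-bit memory cell into its (high, low) 8-bit halves."""
    return (cell >> 8) & 0xFF, cell & 0xFF
```

On the R1 itself the same effect would be obtained with SHL/SHR and AND/OR on a register before storing it.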

The CPU does not feature a separate I/O-bus, but uses memory-mapped I/O. This keeps the number of instructions low and does not add complexity to the system, since the CPU cannot gain the advantages, such as parallel I/O and memory access, found on advanced high-performance systems.

Boot-up

When the RESET pin is pulled low, the CPU is reset. On reset, almost all registers are set to 0. IP is initialized to the value stored at location 0fffffffch, and SP is set to 1000h. Thus, the first thing the CPU does is read 4 bytes from memory at 0fffffffch.

Instructions

All the instructions are designed to fill exactly one 16-bit byte, thus simplifying the internal construction. An instruction byte comes in three different flavors - the 2-reg, 1-reg and no-regs version.

All the instructions use a syntax where sssss and ddddd stand for the 5-bit number of the affected register, thus for example:

001110.10010.01101 - CMP R18, R13

R0 is never allowed as destination register - R0 can only be modified through JMP, CALL, J??. Trying to use R0 as a destination register results in an OPCODE interrupt.

6 bits	5 bits	5 bits
OpCode (oooooo)	Source Register (sssss)	Destination Register (ddddd)
111111	OpCode (ooooo)	Destination Register (ddddd)
111111	11111	OpCode (ooooo)
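
The three formats can be told apart by checking for the all-ones prefixes. The following Python sketch splits a 16-bit instruction into its fields; the function name decode and the dictionary layout are assumptions for illustration, not the emulator's actual code.

```python
def decode(word: int) -> dict:
    """Split a 16-bit instruction byte into its group and fields."""
    top6 = (word >> 10) & 0x3F   # bits 15..10
    mid5 = (word >> 5) & 0x1F    # bits 9..5
    low5 = word & 0x1F           # bits 4..0
    if top6 != 0b111111:
        # 2-reg format: oooooo.sssss.ddddd
        return {"group": 1, "opcode": top6, "src": mid5, "dst": low5}
    if mid5 != 0b11111:
        # 1-reg format: 111111.ooooo.ddddd
        return {"group": 2, "opcode": mid5, "reg": low5}
    # no-reg format: 111111.11111.ooooo
    return {"group": 3, "opcode": low5}


# CMP R18, R13, using the CMP opcode 001110 from the CMP section.
fields = decode(0b001110_10010_01101)
```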

NOP

Instruction is no-operation, a space-filler. Issuing a NOP only takes up processor cycles and results in the increment of the instruction pointer.

Format: 000000.-----.-----

IP ← IP + 1

Memory access instructions

MOVE - copy between memory and registers

Copy an operand between two registers or from/to memory. All the instructions come in both a 16-bit version and a 32-bit version, known as MOVE.W and MOVE.L (or just MOVE). There are three basic instructions: fill a register with an immediate value, load/store a memory location with an immediate offset from a base register, and move between registers or between memory (with offset from BP) and registers. When moving words, it is important to remember that the upper 16 bits are unaffected by the move - they are neither copied nor modified.

Examples:
	MOVE.B R18, R20			byte(R18) → byte(R20)
	MOVE.W R20:[200], R12		word( mem[R20 + 200] ) → word(R12)
	MOVE.B R19, [200]		byte(R19) → byte( mem[BP + 200] )

Register/memory to register/memory

format: 110wmn.sssss.ddddd

m and n select whether to use the registers as offsets (from BP) or as registers. w selects word (1) or byte (0) moves.

IP ← IP + 1
if (m=0, n=0): ddddd ← sssss
if (m=0, n=1): [BP + ddddd] ← sssss
if (m=1, n=0): ddddd ← [BP + sssss]
if (m=1, n=1): [BP + ddddd] ← [BP + sssss]

zf ← value moved = 0
sf ← value moved < 0
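
To make the role of m and n concrete, here is a hedged Python sketch that turns a register/memory MOVE encoding into the transfer it performs, in the notation of the pseudo-code above. The function name describe_move is invented for this example, and the width bit w is returned unchanged rather than interpreted.

```python
def describe_move(word: int):
    """Return (w, transfer) for an instruction of the form 110wmn.sssss.ddddd."""
    if (word >> 13) & 0b111 != 0b110:
        raise ValueError("not a register/memory MOVE")
    w = (word >> 12) & 1
    m = (word >> 11) & 1          # 1: source is memory at [BP + sssss]
    n = (word >> 10) & 1          # 1: destination is memory at [BP + ddddd]
    s = (word >> 5) & 0x1F
    d = word & 0x1F
    src = f"[BP + R{s}]" if m else f"R{s}"
    dst = f"[BP + R{d}]" if n else f"R{d}"
    return w, f"{dst} ← {src}"
```

For instance, 110010.10010.01101 (m=1, n=0) comes out as "R13 ← [BP + R18]".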

Base-offset addressing (MOVE_BO.B, MOVE_BO.W)

format: 1110ws.bbbbb.rrrrr >offset<

Load/store a word/dword in memory with base-offset addressing. Unlike MOVE, MOVE_BO lets you select the base register. The assembler will automatically select BP (or a matching base if an assume has been made) if no base register is given (eg. MOVE [10], R20 is coded as MOVE BP:[10], R20).

IP ← IP + 1
disp ← word( [IP] )
IP ← IP + 4

if (w=0): rrrrr ← byte( [bbbbb+disp] )
if (w=1): rrrrr ← word( [bbbbb+disp] )

zf ← value moved = 0
sf ← value moved < 0

immediate to register (MOVE.IB, MOVE.IW)

format: 111111.1110w.rrrrr >imm16/imm32<

w selects dword (0) or word (1) load

IP ← IP + 1

if (w=0): { rrrrr ← byte( [IP] ), IP ← IP + 1 }
if (w=1): { rrrrr ← word( [IP] ), IP ← IP + 2 }

zf ← value moved = 0
sf ← value moved < 0

SET / CLR - Set or clear register

Sets register to either 0 or FFFF FFFF.

format: 111111.0000b.rrrrr

IP ← IP + 1

if (b=0): { rrrrr ← 0000 0000, zf ← 1, sf ← 0 }
if (b=1): { rrrrr ← FFFF FFFF, zf ← 0, sf ← 1 }

PUSH - Push a register on the stack

The CPU supports a stack. The stack is controlled by the SP register, which points to the next location in the stack to use. SP is not calculated as an offset from BP. This protects the stack even if a procedure modifies BP.

format: 111111.10000.rrrrr

IP ← IP + 1
SP ← SP - 2
[SP] ← rrrrr

POP - Pop a register off the stack

The POP instruction is the counterpart to the PUSH instruction. POP reads a dword from the top of the stack and increments the stack-pointer accordingly.

format: 111111.10001.rrrrr

IP ← IP + 1
rrrrr ← [SP]
SP ← SP + 2

zf ← rrrrr = 0
sf ← rrrrr₃₁

Arithmetic instructions

ADD / ADC - add (with carry)

Add source to destination and place the result in destination. The flags are updated to reflect the result of the addition. This instruction performs the normal binary add-operation, thus it can be used to add two's-complement numbers.

format: 00100c.sssss.ddddd

IP ← IP + 1
if (c=0): ddddd ← ddddd + sssss
if (c=1): ddddd ← ddddd + sssss + cf

cf ← carry-out-of-addition
of ← overflow-flag updated
zf ← ddddd = 0
sf ← ddddd₃₁
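
The flag updates for ADD/ADC can be sketched in Python as follows. The text only says that the overflow flag is "updated"; the sketch assumes the usual two's-complement definition (both operands have the same sign and the result has the other sign), and the function name add32 is invented for the example.

```python
MASK32 = 0xFFFFFFFF


def add32(d: int, s: int, carry_in: int = 0):
    """Return (result, flags) for ddddd + sssssss (+ cf) on 32-bit registers."""
    d &= MASK32
    s &= MASK32
    total = d + s + carry_in
    result = total & MASK32
    flags = {
        "cf": total >> 32,                               # carry out of bit 31
        # Assumed definition: signed overflow.
        "of": ((~(d ^ s) & (d ^ result)) >> 31) & 1,
        "zf": int(result == 0),
        "sf": result >> 31,
    }
    return result, flags


# 64-bit addition of 0000 0000 FFFF FFFF + 1: ADD on the low halves,
# then ADC on the high halves using the carry from the first step.
low, low_flags = add32(0xFFFFFFFF, 1)
high, _ = add32(0, 0, low_flags["cf"])
```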

SUB / SBB - subtract (with borrow)

Subtract source from destination, place result in destination. Update flags to reflect result of subtraction. The SBB is used when handling large integer operations. The subtract-instruction uses two's complement notation to perform the calculation and thus can have overflows.

format: 00110b.sssss.ddddd

IP ← IP + 1
if (b=0): ddddd ← ddddd - sssss
if (b=1): ddddd ← ddddd - sssss - cf

cf ← borrow-out-of-subtraction
of ← overflow-flag updated
zf ← ddddd = 0
sf ← ddddd₃₁

MUL / IMUL - multiply (integer multiply)

Multiply the two integers in sssss and ddddd. Place the 64-bit result in sssss:ddddd (treat the pair as one 64-bit register). The order of the registers is chosen so that if the result is known to fit in 32 bits, ddddd alone contains the result.

format: 01110i.sssss.ddddd

IP ← IP + 1
if (i=0): sssss:ddddd ← ddddd*sssss
if (i=1): sssss:ddddd ← ddddd*sssss (sign corrected)

sf ← cleared if unsigned multiply, set to MSB of sssss if signed mult.
zf ← zero-flag updated, set if sssss:ddddd=0, cleared otherwise
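
How the 64-bit product is split across the register pair can be sketched in Python like this. The function names are invented for the example; the sign handling for IMUL assumes ordinary two's-complement multiplication.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def multiply(d: int, s: int, signed: bool = False):
    """Return (new sssss, new ddddd, zf, sf) for MUL (signed=False) or IMUL."""
    if signed:
        product = to_signed(d) * to_signed(s)
    else:
        product = (d & MASK32) * (s & MASK32)
    product &= (1 << 64) - 1          # keep the 64-bit pattern
    new_s = product >> 32             # upper half goes to sssss
    new_d = product & MASK32          # lower half goes to ddddd
    zf = int(product == 0)
    sf = (new_s >> 31) if signed else 0
    return new_s, new_d, zf, sf
```

Multiplying FFFF FFFF by 2 gives sssss:ddddd = 0000 0001:FFFF FFFE as MUL, but FFFF FFFF:FFFF FFFE (that is, -2) as IMUL.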

DIV / IDIV - division (integer division)

Divide the integer in ddddd by the integer in sssss. The remainder of the division is placed in sssss and the quotient in ddddd. If an overflow occurs, the overflow flag will be set. If a division by zero is attempted, a DIVBYZERO exception will be raised (interrupt 02, DBZ).

format: 01111i.sssss.ddddd

IP ← IP + 1

if (sssss=0): raise EDivByZero

if (i=0): { sssss ← ddddd mod sssss, ddddd ← ddddd div sssss }
if (i=1): { sssss ← ddddd mod sssss, ddddd ← ddddd div sssss } (signed)

sf ← cleared if unsigned division, set to MSB of ddddd if signed division
zf ← zero-flag updated, set if sssss:ddddd=0, cleared otherwise

AND - Logical AND

The AND instruction will perform the logical bitwise AND between the two registers sssss and ddddd. The result will be placed in ddddd.

format: 010000.sssss.ddddd

IP ← IP + 1

ddddd ← ddddd & sssss (eg: 1101 & 1010 = 1000)

zf ← ddddd = 0
sf ← ddddd₃₁

OR - Logical OR

The OR instruction will perform bitwise OR between the two registers sssss and ddddd.

format: 010010.sssss.ddddd

IP ← IP + 1

ddddd ← ddddd | sssss (eg: 0101 | 1001 = 1101)

zf ← ddddd = 0
sf ← ddddd₃₁

XOR - Logical Exclusive-OR

The XOR instruction will perform bitwise XOR between the two registers sssss and ddddd.

format: 010011.sssss.ddddd

IP ← IP + 1

ddddd ← ddddd ^ sssss (eg: 0101 ^ 1001 = 1100)

zf ← ddddd = 0
sf ← ddddd₃₁

SHR - Shift Right

Shifts the ddddd reg by the indicated amount. 0's will be placed in the MSB and the bit shifted out of the register will be placed in the carry-flag. The i selects whether or not to treat sssss as an immediate value instead of a register.

format: 10000i.sssss.ddddd

IP ← IP + 1

if (i=0): treat sssss as register number
if (i=1): treat sssss as immediate

repeat sssss times: { cf ← d₀, d ← 0, d₃₁,d₃₀,d₂₉,d₂₈...d₁ }

zf ← ddddd = 0
sf ← ddddd₃₁

SAR - Shift Arithmetic Right

Unlike SHR, SAR will extend the MSB into the new bit-positions.

format: 10001i.sssss.ddddd

IP ← IP + 1

if (i=0): treat sssss as register number
if (i=1): treat sssss as immediate

repeat sssss times: { cf ← d₀, d ← d₃₁,d₃₁,d₃₀,d₂₉,d₂₈...d₁ }

zf ← ddddd = 0
sf ← ddddd₃₁
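
The difference between SHR and SAR is easiest to see side by side. The Python sketch below follows the bit-by-bit pseudo-code; the function names and the cf parameter (the carry value before the instruction) are conventions chosen for this example.

```python
def shr(d: int, count: int, cf: int = 0):
    """Logical shift right: zeros enter at the MSB. Returns (result, cf)."""
    for _ in range(count):
        cf = d & 1                        # bit shifted out goes to carry
        d >>= 1
    return d, cf


def sar(d: int, count: int, cf: int = 0):
    """Arithmetic shift right: the MSB is copied into the new positions."""
    for _ in range(count):
        cf = d & 1
        d = (d >> 1) | (d & 0x80000000)   # keep the sign bit in place
    return d, cf
```

Shifting 8000 0001 right by one gives 4000 0000 with SHR but C000 0000 with SAR; the carry is 1 in both cases.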

SHL - Shift Left

The bits in the register are shifted to the left. 0's fill the lower bits. The bit shifted out of the register is placed in the carry-flag.

format: 10010i.sssss.ddddd

IP ← IP + 1

if (i=0): treat sssss as register number
if (i=1): treat sssss as immediate

repeat sssss times: { cf ← d₃₁; ddddd ← d₃₀,d₂₉,d₂₈...d₀,0 }

zf ← ddddd = 0
sf ← ddddd₃₁

ROL / RCL - Rotate (through carry) Left

The bits in the register are rotated to the left. If c is set, the carry is used to rotate through, meaning the LSB is set to the carry and the MSB is shifted into the carry. If c is cleared, the MSB is rotated into the LSB.

format: 1010ci.sssss.ddddd

IP ← IP + 1

if (i=0): treat sssss as register number
if (i=1): treat sssss as immediate

repeat sssss times: if (c=1) { cf ← d₃₁; ddddd ← d₃₀,d₂₉,d₂₈...d₀, old cf } else { cf ← d₃₁; ddddd ← d₃₀,d₂₉,d₂₈...d₀, d₃₁ }

zf ← ddddd = 0
sf ← ddddd₃₁

ROR / RCR - Rotate (through carry) Right

The bits in the register are rotated to the right. If c is set, the carry is used to rotate through, meaning the MSB is set to the carry and the LSB is shifted into the carry. If c is cleared, the LSB is rotated into the MSB.

format: 1011ci.sssss.ddddd

IP ← IP + 1

if (i=0): treat sssss as register number
if (i=1): treat sssss as immediate

repeat sssss times: if (c=1) { cf ← d₀; ddddd ← old cf, d₃₁,d₃₀...d₂,d₁ } else { cf ← d₀; ddddd ← d₀,d₃₁,d₃₀...d₂, d₁ }

zf ← ddddd = 0
sf ← ddddd₃₁
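
The rotate pseudo-code for both directions can be followed step by step in a short Python sketch. The names rol and ror, and the through_carry parameter standing in for the c bit, are invented for this example; this is not how the emulator implements the instructions.

```python
MASK32 = 0xFFFFFFFF


def rol(d: int, count: int, cf: int = 0, through_carry: bool = False):
    """Rotate left (ROL) or rotate left through carry (RCL)."""
    for _ in range(count):
        msb = (d >> 31) & 1
        fill = cf if through_carry else msb   # what enters at the LSB
        d = ((d << 1) & MASK32) | fill
        cf = msb                              # MSB always goes to carry
    return d, cf


def ror(d: int, count: int, cf: int = 0, through_carry: bool = False):
    """Rotate right (ROR) or rotate right through carry (RCR)."""
    for _ in range(count):
        lsb = d & 1
        fill = cf if through_carry else lsb   # what enters at the MSB
        d = (d >> 1) | (fill << 31)
        cf = lsb                              # LSB always goes to carry
    return d, cf
```

With through_carry set, the register and the carry flag together behave as one 33-bit ring, so 33 single-bit rotations bring back the original value.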

BTC - Bit Clear

Clears the specified bit in the register.

format: 01100i.sssss.ddddd

IP ← IP + 1

if (i=0): treat sssss as register number
if (i=1): treat sssss as immediate

the bit at position sssss in ddddd is cleared (bitpos counted from 0)

zf ← ddddd = 0
sf ← ddddd₃₁

BTS - Bit Set

Sets the specified bit in the register supplied.

format: 01101i.sssss.ddddd

IP ← IP + 1

if (i=0): treat sssss as register number
if (i=1): treat sssss as immediate

the bit at position sssss in ddddd is set (bitpos counted from 0)

zf ← ddddd = 0
sf ← ddddd₃₁

BTT - Bit Test

Copy the specified bit to the carry-flag. The instruction comes in two different versions - one where an immediate is specified in the sssss and another where the sssss is a register, whose value is used as index.

format: 01010i.sssss.ddddd

IP ← IP + 1

if (i=0): treat sssss as register number
if (i=1): treat sssss as immediate

cf ← ddddd_sssss

BTR - Bit Transfer

Copy the carry-flag to the specified bit. The instruction comes in two different versions - one where an immediate is specified in the sssss and another where the sssss is a register, whose value is used as index.

format: 01011i.sssss.ddddd

IP ← IP + 1

if (i=0): treat sssss as register number
if (i=1): treat sssss as immediate

ddddd_sssss ← cf
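
The four bit instructions (BTC, BTS, BTT, BTR) reduce to simple mask operations. The Python sketch below shows one possible reading; reducing the bit position to the range 0..31 when it comes from a register is an assumption, since the text does not say what happens for larger values.

```python
MASK32 = 0xFFFFFFFF


def btc(d: int, pos: int) -> int:
    """Bit clear: return d with bit pos cleared."""
    return d & ~(1 << (pos & 31)) & MASK32


def bts(d: int, pos: int) -> int:
    """Bit set: return d with bit pos set."""
    return (d | (1 << (pos & 31))) & MASK32


def btt(d: int, pos: int) -> int:
    """Bit test: return the new carry flag, a copy of bit pos."""
    return (d >> (pos & 31)) & 1


def btr(d: int, pos: int, cf: int) -> int:
    """Bit transfer: return d with bit pos replaced by the carry flag."""
    return bts(d, pos) if cf else btc(d, pos)
```

A BTT followed by a BTR on another register copies a single bit from one register to the other by way of the carry flag.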

NOT - Invert all bits

NOT inverts all the bits in rrrrr.

format: 111111.10100.rrrrr

IP ← IP + 1

rrrrr ← !rrrrr (eg. !1001001 = 0110110)

zf ← rrrrr = 0
sf ← rrrrr₃₁

NEG - Negate to two's complement

NEG inverts all bits in rrrrr and adds 1.

format: 111111.10101.rrrrr

IP ← IP + 1

rrrrr ← !rrrrr+1 (eg. !1001001+1 = 0110111)

zf ← rrrrr = 0
sf ← rrrrr₃₁

INC - Increase register

INC adds 1 to the register supplied in rrrrr

format: 111111.10010.rrrrr

IP ← IP + 1

rrrrr ← rrrrr+1

cf ← carry-out-of-addition
of ← cf
zf ← rrrrr = 0
sf ← rrrrr₃₁

DEC - Decrease register

DEC subtracts one from the register supplied in rrrrr.

format: 111111.10011.rrrrr

IP ← IP + 1

rrrrr ← rrrrr-1

cf ← borrow-out-of-subtraction
of ← cf
zf ← rrrrr = 0
sf ← rrrrr₃₁

Control transfer instructions

All jump/call instructions come in two flavors - immediate and register. The two versions differ in the way they calculate the destination address: immediate uses relative addressing and register uses absolute addressing. This is because a relocatable program must not use absolute addressing, but when working with things such as object-oriented programming and other systems using a high degree of late-linking/dynamic-linking, it is necessary to jump to an absolute address in memory regardless of where the method is called from.

CMP - Compare two registers

Compare the two registers. Comparing two registers updates the flags as if a subtraction had been performed. That is, if ddddd-sssss=0, the zero-flag will be set, otherwise cleared. If ddddd-sssss<0 (two's complement), the sign flag will be set, otherwise cleared.

format: 001110.sssss.ddddd

zf ← (ddddd-sssss) = 0
sf ← (ddddd-sssss)₃₁

CALL - Call subroutine

Branch to the absolute address specified in r as a subroutine (push return address on stack)

format: 111111.00011.rrrrr

IP ← IP + 1
[SP] ← IP
SP ← SP - 2

IP ← rrrrr

Branch to immediate 32-bit relative address following this instruction as subroutine.

format: 111111.11111.00011

IP ← IP + 1
[SP] ← IP
SP ← SP - 2

IP ← imm32

RET - Return from sub-routine

The CPU pops the return-address off the stack and puts it in IP, then continues execution at the address.

format: 111111.11111.00000

SP ← SP + 2
IP ← [SP]

*INT - software interrupt

Causes the CPU to push the flags on the stack, look up the interrupt address and transfer control to that location. The interrupt-table consists of 2-byte entries with the address of the handler. The rrrrr specifies which type of interrupt is created. It is not recommended to use this opcode to transfer control to other methods.

format: 111111.?????.rrrrr

push F
push IP
IP ← [ IT + 2 * rrrrr ]

IRET - Interrupt return

Used in conjunction with INT to return from an interrupt handler. First, the return-address is popped, then the flags.

format: 111111.11111.00001

pop IP
pop F

JMP - Unconditional Jump

Branches to the absolute address specified in rrrrr.

format: 111111.00011.rrrrr

IP ← rrrrr

Branches to immediate 32-bit relative address following this instruction

format: 111111.11111.00010

IP ← imm32

J?? - Conditional Jump

Conditional jumps are a group of instructions that transfer control only if a condition holds. The instructions, opcodes and their flag-equivalents are listed below. All jumps come in two versions - relative immediate addressing and absolute register addressing. The following two formats are used - substitute ooooo with the combination read in the table below:

format: 111111.ooooo.rrrrr

format: 111111.11111.ooooo

Mnemonic	Jump if...	Condition	Opcode
JE / JZ	equal / zero	ZF=1	10000
JNE / JNZ	not equal / not zero	ZF=0	10001
JC	carry	CF=1	10010
JNC	no carry	CF=0	10011
JS	signed	SF=1	10100
JNS	not signed	SF=0	10101
JA	above	(ZF or CF)=0	10110
JAE	above or equal	CF=0	10111
JB	below	CF=1	11000
JBE	below or equal	(CF or ZF)=1	11001
JG	greater than (two's cmp)	((SF xor OF) or ZF)=0	11010
JGE	greater than or equal to (two's cmp)	(SF xor OF)=0	11011
JL	less than (two's cmp)	(SF xor OF)=1	11100
JLE	less than or equal to (two's cmp)	((SF xor OF) or ZF)=1	11101

* This table is based entirely on the jump-instructions of the Intel 80x86 processor series as described in Microprocessors and Interfacing [1], figure 4-10, p.76. Only the parity and overflow jumps have been removed, leaving a total of 14 jump instructions.
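
The condition column of the table translates directly into code. The following Python sketch evaluates a jump condition from the four flags; the dictionary of lambdas and the function name should_jump are just one way to write it down.

```python
# One entry per row of the table; flags are given as 0/1 integers.
CONDITIONS = {
    "JE":  lambda f: f["zf"] == 1,
    "JNE": lambda f: f["zf"] == 0,
    "JC":  lambda f: f["cf"] == 1,
    "JNC": lambda f: f["cf"] == 0,
    "JS":  lambda f: f["sf"] == 1,
    "JNS": lambda f: f["sf"] == 0,
    "JA":  lambda f: (f["zf"] | f["cf"]) == 0,
    "JAE": lambda f: f["cf"] == 0,
    "JB":  lambda f: f["cf"] == 1,
    "JBE": lambda f: (f["cf"] | f["zf"]) == 1,
    "JG":  lambda f: ((f["sf"] ^ f["of"]) | f["zf"]) == 0,
    "JGE": lambda f: (f["sf"] ^ f["of"]) == 0,
    "JL":  lambda f: (f["sf"] ^ f["of"]) == 1,
    "JLE": lambda f: ((f["sf"] ^ f["of"]) | f["zf"]) == 1,
}
CONDITIONS["JZ"] = CONDITIONS["JE"]     # alternative mnemonics
CONDITIONS["JNZ"] = CONDITIONS["JNE"]


def should_jump(mnemonic: str, flags: dict) -> bool:
    """Return True if the conditional jump would be taken."""
    return CONDITIONS[mnemonic.upper()](flags)
```

With sf=1 and all other flags 0, JL and JLE are taken while JG and JGE are not; the unsigned JA is still taken because neither carry nor zero is set.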

HALT - Halt

Causes the CPU to halt, waiting until an interrupt is issued.

format: 111111.11111.11111

Instruction Map

Group 1 - 2-reg operations

All instructions in this group have the format:

oooooo.sssss.ddddd

The table should be read so that the 4 most significant bits are read from the rows, and the columns supply the two least significant bits of the instruction.

↓ 4 msb | 2 lsb →	00	01	10	11
0000	NOP
0001
0010	ADD	ADC
0011	SUB	SBB	CMP
0100	AND	TEST	OR	XOR
0101	BTT	BTT imm	BTR	BTR imm
0110	BTC	BTC imm	BTS	BTS imm
0111	MUL	IMUL	DIV	IDIV
1000	SHR	SHR imm	SAR	SAR imm
1001	SHL	SHL imm
1010	ROR	ROR imm	RCR	RCR imm
1011	ROL	ROL imm	RCL	RCL imm
1100	MOVE.L	MOVE.L	MOVE.L	MOVE.L
1101	MOVE.W	MOVE.W	MOVE.W	MOVE.W
1110	MOVE_BO.L	MOVE_BO.L	MOVE_BO.W	MOVE_BO.W
1111				reserved

Group 2 - 1-reg operations:

All instructions in this group have the format:

111111.ooooo.rrrrr

↓ 10 msb | 1 lsb →	0	1
111111 0000	CLR	SET
111111 0001	NOT	NEG
111111 0010	INC	DEC
111111 0011	PUSH	POP
111111 0100	MOVE.I	MOVE.IW
111111 0101
111111 0110
111111 0111	JMP	CALL
111111 1000	JE	JNE
111111 1001	JC	JNC
111111 1010	JS	JNS
111111 1011	JA	JAE
111111 1100	JB	JBE
111111 1101	JG	JGE
111111 1110	JL	JLE
111111 1111		Reserved

Group 3 - no registers

The 15 most significant bits are shown downwards; the least significant bit of the instruction selects the column.

↓ 15 msb | 1 lsb →	0	1
111111 11111 0000	RET	IRET
111111 11111 0001
111111 11111 0010
111111 11111 0011
111111 11111 0100
111111 11111 0101
111111 11111 0110
111111 11111 0111	JMP imm	CALL imm
111111 11111 1000	JE imm	JNE imm
111111 11111 1001	JC imm	JNC imm
111111 11111 1010	JS imm	JNS imm
111111 11111 1011	JA imm	JAE imm
111111 11111 1100	JB imm	JBE imm
111111 11111 1101	JG imm	JGE imm
111111 11111 1110	JL imm	JLE imm
111111 11111 1111	INT4 (BRKPNT)	HALT

Interrupts

Interrupts are treated with priority according to their number - 0 is the highest priority and 31 is the lowest. A higher-priority interrupt can only interrupt the execution of a lower-priority interrupt if the interrupt flag is set. This does not apply to non-maskable interrupts (the first 8 interrupts), which can interrupt any lower-priority executing interrupt. This is because it may be necessary to force certain kinds of exceptions through even though an interrupt is already being handled. For example, suppose we are currently handling a timer-interrupt exclusively. The timer interrupt code contains illegal instructions which cannot be executed, so we must generate an OPCODE interrupt. Unfortunately, while we're handling the OPCODE, an NMI occurs (we are at a nuclear plant and the reactor is just about to blow), so it's probably good to handle the NMI, even though we are already handling the OPCODE. But then a RAMERR occurs - this interrupt is ignored for the moment, since we really cannot interrupt an NMI, so it will have to wait until we start working on the OPCODE again. The OPCODE, on the other hand, cannot execute correctly with a RAMERR pending, so the RAMERR is handled, then the OPCODE is completed, and finally the timer-interrupt may be completed or abandoned.
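
The priority rule can be summarised as a small decision function. The Python sketch below is one reading of the paragraph above; the function name may_preempt and its parameters are invented for the example, and the case where no handler is running is assumed to follow the interrupts-enabled flag.

```python
NMI_COUNT = 8  # interrupts 0..7 are non-maskable


def may_preempt(new: int, current, interrupts_enabled: bool) -> bool:
    """Decide whether interrupt `new` may be serviced right now.

    `current` is the number of the interrupt being handled, or None
    if no handler is running.
    """
    if current is not None and new >= current:
        return False            # never interrupt equal or higher priority
    if new < NMI_COUNT:
        return True             # non-maskable: the flag is ignored
    return interrupts_enabled   # maskable: the interrupt flag decides
```

Replaying the scenario: OPCODE (3) may interrupt the timer handler, NMI (0) may interrupt OPCODE, RAMERR (1) must wait while the NMI runs, and RAMERR may then interrupt OPCODE once the NMI handler has returned.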

The following 16 are currently assigned in the emulator - the first 8 interrupts are unmaskable and will be executed even though the interrupt flag is cleared. The last 8 interrupts are features included in the emulator/CPU: the overflow interrupt, the 3 external pins for interrupts, IEXT0 through IEXT2, and the 4 timers, TIMER0 through TIMER3.

Interrupts are serviced when the currently executing instruction is completed (usually a couple of clock-ticks later).

The interrupt-table was modified on 2000-08-30 to accommodate 4 timers instead of only 3, and the names were changed to timer-0 through timer-3. Apart from that, the numbers were reassigned for better high-speed

No.	Name	Meaning	Reason
00	NMI	Non-Maskable interrupt	This interrupt is generated when the external NMI pin is a logical 1.
01	RAMERR	RAM-Error	If the CPU is equipped with ECC or parity-enabled RAM, the RAMERR pin can be used to signal an error to the CPU. If the RAMERR pin is a logical 1, this interrupt is generated. On systems not equipped with error-detecting/correcting RAM, the pin should always be zero.
02	DBZ	Division by Zero	When a division by zero is attempted.
03	OPCODE	Invalid Opcode detected	When an application tries to execute an invalid operation, this interrupt is generated. Among invalid operations are attempts to modify R0 with an arithmetic instruction.
04	BRKPNT	Break-Point	When a break-point instruction is encountered in the instruction stream, this interrupt is generated.
05	PFAULT	Page-fault	Page not present. This interrupt is generated when the page, on which the requested storage cell is placed, is (according to the page-table) not present in memory.
06	reserved
07	reserved
08	OVRFLOW	Overflow	If the Interrupt-On-Overflow flag is set, this interrupt is generated when a 2's-complement arithmetic instruction causes an overflow.
09	IEXT0	Pin /IEXT0	When a logical zero (low) is applied to pin IEXT0, this interrupt is raised.
0A	TIMER0	Timer-0	This interrupt is used by the internal timer/counter circuit in the CPU. This interrupt is coupled with timer-0.
0B	IEXT1	Pin /IEXT1	When a logical zero (low) is applied to pin IEXT1, this interrupt is raised.
0C	TIMER1	Timer-1	This interrupt is used by the internal timer/counter circuit in the CPU. This interrupt is coupled with timer-1.
0D	IEXT2	Pin /IEXT2	When a logical zero (low) is applied to pin IEXT2, this interrupt is raised.
0E	TIMER2	Timer-2	This interrupt is used by the internal timer/counter circuit in the CPU. This interrupt is coupled with timer-2.
0F	TIMER3	Timer-3	This interrupt is used by the internal timer/counter circuit in the CPU. This interrupt is coupled with timer-3.
10-1F	IEXT	External interrupts	When a logical zero (low level) is applied to an external interrupt, the corresponding interrupt is raised.

Execution of an interrupt follows these steps:

SP ← SP - 2
[SP] ← F
SP ← SP - 2
[SP] ← IP

IP ← [IT+2*IntNo]

After an interrupt is completed, the following steps are taken

IP ← [SP]
SP ← SP + 2
F ← [SP]
SP ← SP + 2
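
The entry and return sequences can be traced with a toy model. In the Python sketch below, the class Cpu and its method names are invented for the example, and storing the low half of a 32-bit value at the lower address is an assumption: the text does not state the order of the two 16-bit cells.

```python
class Cpu:
    """Toy model with just enough state to trace interrupt entry/return."""

    def __init__(self):
        self.mem = {}       # address -> 16-bit cell
        self.ip = 0
        self.f = 0
        self.sp = 0x1000    # value of SP after reset
        self.it = 0         # R04, absolute address of the interrupt table

    def write32(self, addr, value):
        # Assumption: low half at the lower address.
        self.mem[addr] = value & 0xFFFF
        self.mem[addr + 1] = (value >> 16) & 0xFFFF

    def read32(self, addr):
        return self.mem.get(addr, 0) | (self.mem.get(addr + 1, 0) << 16)

    def enter_interrupt(self, number):
        self.sp -= 2
        self.write32(self.sp, self.f)      # push F
        self.sp -= 2
        self.write32(self.sp, self.ip)     # push IP
        self.ip = self.read32(self.it + 2 * number)

    def iret(self):
        self.ip = self.read32(self.sp)     # pop IP
        self.sp += 2
        self.f = self.read32(self.sp)      # pop F
        self.sp += 2
```

After enter_interrupt(n) followed by iret(), IP, F and SP are back at their original values.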

Assembler

The R1-assembler is a simple assembler supporting the basic functionality required of an assembler. Macros and modules are not supported.
Most syntax is given in Extended Backus-Naur Form. This means that the left-hand side of ::= names the class of grammatical element being defined, and the right-hand side lists the elements that make it up. Elements enclosed in "[" and "]" are optional; elements enclosed in "{" and "}" are repeated 0 or more times.

Scanner

The scanner performs the first processing of the input. This consists of tokenizing the input into more manageable tokens consisting of several characters. To properly handle case-insensitivity, all letters are converted to lowercase during the scanner's processing. When the scanner detects an identifier, it uses a look-up table to check it against the reserved words of the assembler; if a match is found, the identifier is converted to the appropriate token type. This behavior will not always be apparent in the following chapters, but the approach is chosen for the consistency of this document.
The output from the scanner is a set of tokens obeying these rules:

letter ::= 'a'..'z'
digit ::= '0'..'9'
hexdigit ::= '0'..'9' | 'a'..'f'
alphanumeric ::= <letter> | <digit>
ident ::= <letter> { <alphanumeric> }
const ::= <digit> { <hexdigit> }
register ::= "ip" | "bp" | "sp" | "it" | "fl" | "flags" | "r00" | "r01" | "r02" | ... | "r31"
string ::= '"' { chars } '"'

Strings are sequences of characters enclosed in double-quotes. The following escape-sequences are supported:

\b = backspace (8)
\t = horizontal tabulator(9)
\n = line-feed (10)
\r = carriage return (13)
\\ = \
\" = "
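
Decoding the escape sequences is a simple table look-up. The Python sketch below takes the text between the double quotes and returns the character codes, one per 16-bit memory cell; the function name decode_string and the decision to reject unknown escapes are assumptions for the example.

```python
# Escape letter -> character code, as listed above.
ESCAPES = {"b": 8, "t": 9, "n": 10, "r": 13, "\\": 92, '"': 34}


def decode_string(body: str):
    """Turn the text between the quotes into a list of character codes."""
    codes = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            i += 1
            if i >= len(body) or body[i] not in ESCAPES:
                raise ValueError("unknown escape sequence")
            codes.append(ESCAPES[body[i]])
        else:
            codes.append(ord(ch))
        i += 1
    return codes
```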

Comments

Comments, too, are handled in the scanner. When the scanner encounters the "#" character, it issues an end-of-line token.

comment ::= "#" <rest of line>
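
A scanner obeying the rules above can be sketched with one regular expression per token class. The Python code below is a minimal illustration, not the real assembler: the token names, the function scan_line and the catch-all punct class are invented, and only registers (not the other reserved words) are looked up.

```python
import re

REGISTERS = {"ip", "bp", "sp", "it", "fl", "flags"} | {
    f"r{n:02d}" for n in range(32)
}

TOKEN_RE = re.compile(r'''
    (?P<comment>\#.*)                 # "#" up to the end of the line
  | (?P<string>"(?:\\.|[^"\\])*")     # double-quoted, with escapes
  | (?P<const>[0-9][0-9a-f]*)         # digit followed by hex digits
  | (?P<ident>[a-z][a-z0-9]*)         # letter followed by alphanumerics
  | (?P<punct>[^\s])                  # any other single character
''', re.VERBOSE | re.IGNORECASE)


def scan_line(line: str):
    """Return the tokens of one source line as (kind, text) pairs."""
    tokens = []
    for match in TOKEN_RE.finditer(line):
        kind, text = match.lastgroup, match.group()
        if kind == "comment":
            break                         # rest of the line is ignored
        if kind != "string":
            text = text.lower()           # case-insensitive outside strings
        if kind == "ident" and text in REGISTERS:
            kind = "register"
        tokens.append((kind, text))
    tokens.append(("eol", ""))            # end-of-line token
    return tokens
```

A "#" inside a string is not treated as a comment, because the string pattern consumes it first.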

Asm lines

line ::= <directive> | <label> [ <instruction> ] | <instruction>

Expressions

The assembler supports expressions (resolving to constants during assembly) with operator priority. This results in the following grammar:

E0    ::= <E1> { ("+" | "-") <E1> }
E1    ::= <E2> { "*" <E2> }
E2    ::= <value> | "(" <E0> ")"
value ::= <const> | <ident(type==const)> | "@" <ident(type!=const)>

Note here, that if an identifier is used in a value, it must be a constant or the address of the identifier - this is to ensure that the type of the expression is a value.
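
The grammar maps directly onto a recursive-descent evaluator with one function per rule. The Python sketch below is only an illustration: the function evaluate and its two look-up tables are invented for the example, constants are assumed to be decimal (the grammar does not say how the base of a constant is determined), and characters outside the grammar are silently skipped by the simplified tokenizer.

```python
import re


def evaluate(text, constants=None, addresses=None):
    """Evaluate an E0 expression; identifiers are looked up in the tables."""
    constants = constants or {}    # ident(type==const) -> value
    addresses = addresses or {}    # ident(type!=const) -> address, for "@"
    tokens = re.findall(r"[0-9][0-9a-f]*|[a-z][a-z0-9]*|[-+*()@]", text.lower())
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        pos += 1
        return tokens[pos - 1]

    def e0():                      # E0 ::= <E1> { ("+" | "-") <E1> }
        value = e1()
        while peek() in ("+", "-"):
            if take() == "+":
                value += e1()
            else:
                value -= e1()
        return value

    def e1():                      # E1 ::= <E2> { "*" <E2> }
        value = e2()
        while peek() == "*":
            take()
            value *= e2()
        return value

    def e2():                      # E2 ::= <value> | "(" <E0> ")"
        tok = take()
        if tok == "(":
            value = e0()
            if take() != ")":
                raise ValueError("missing ')'")
            return value
        if tok == "@":
            return addresses[take()]
        if tok[0].isdigit():
            return int(tok, 10)    # assumption: decimal constants only
        return constants[tok]

    result = e0()
    if pos != len(tokens):
        raise ValueError("unexpected input after expression")
    return result
```

Because E1 is parsed inside E0, "*" binds tighter than "+" and "-", so 2 + 3 * (4 - 1) evaluates to 11.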

Constants

constant ::= <ident> "=" <E0>

A constant is an identifier bound to a constant value. Constants should be used for "magic numbers" in the code, so that only one number needs to be changed. The E0 supplied must be resolvable in the first pass; this is to allow the use of constants in labels.

Assembler directives

directive  ::= "." <dirspec>
dirspec    ::= "end" | "assume" <assume_list> | "seg" <ident>
assume_list::= { <reg> ":" <ident> }

Directives are used to control the behavior of the assembler during assembly.

.end

.end is used to indicate the end of the file. All text after .end is ignored.

.seg

When this directive is encountered, the assembler creates a new segment of data. You may freely place whatever data you desire in any segment. The .seg must be followed by a unique identifier so that no two segments collide.

.assume { <reg> ":" <ident> }

.assume is used by the assembler to choose a proper base-register when addressing. The statement is followed by a list of registers and segments that the registers point to - thus all idents in the directive must be segments.

	.seg data
	be: db 200
	.seg code
	.assume bp:data
	cell: dw 10
	move be, r10   # due to assume, this is assembled as "move.b bp:[@be], r10"
	move cell, r11 # since cell is in current segment, a relative indexing 
	               # is used "move.b ip:[@cell-$], r11"
	.end

Labels

Labels are used both for jump destinations and for defining storage for the program; therefore the syntax is quite flexible to allow for these two purposes.

label ::= <ident> "=" <E0 resolvable> | 
          <ident> ":" [ <storagespec> [ "*" <E2 resolvable> | <num_list> ] | <inst> ]
storagespec ::= "db" | "dw"
num_list    ::= ( <E0> | <string> ) { "," ( <E0> | <string> ) }

If a label is followed by a storagespec, the identifier will be marked as a memory-location rather than a label and may then be used in move-statements. If the storagespec is followed by a "*", the allocation is repeated the number of times supplied afterwards. If the line contains more, a list of values is expected and parsed. For each value it parses, the counter is increased by the size of the first element.
If the label is not followed by a storagespec, the identifier is marked as a label and the rest of the line is parsed as an instruction.
The offset of a variable (the address assigned) is obtained by prefixing the variable with @, so if you want to move the address of eg. myVar (whose offset is 0x02998) into a register, you may write "move @myVar, R10", which will be assembled the same way as "move 0x02998, R10".

Instructions

Parsing instructions is probably the most difficult task for the assembler - it requires a lot of work to handle these syntactic rules. All mne_? are processed in the scanner.

movemod    ::= "B" | "W"
mne_move   ::= "MOVE" [ "." <movemod> ]
mne_bitman ::= "BTT" | "BTR" | "BTS" | "BTC" | "SHL" |
               "SHR" | "SAR" | "ROL" | "RCL" | "ROR" | "RCR"
mne_jmp    ::= "CALL" | "JMP" | "JE" | "JNE" | "JC" | "JNC" |
               "JS"   | "JNS" | "JA" | "JAE" | "JB" | "JBE" |
               "JG"   | "JGE" | "JL" | "JLE"
mne_2reg   ::= "ADD" | "SUB" | "AND" | "OR" | "XOR" | "TEST"
mne_1reg   ::= "CLR" | "SET" | "PUSH" | "POP" | "INC" | "DEC" | "NOT" | "NEG"
mne_0reg   ::= "HALT" | "RET" | "IRET"

op0reg     ::= <mne_0reg>
op1reg     ::= <mne_1reg> <reg>
op2reg     ::= <mne_2reg> <reg> "," <reg>
opbitman   ::= <mne_bitman> <reg> "," <reg> | 
               <mne_bitman> <E0> "," <reg>
opjmp      ::= <mne_jmp> <reg> | <mne_jmp> <ident(type==label)>
moveoperand::= <const> | "@" <ident> | "(" E0 ")"  | "[" <reg> "]" | 
               "[" <E0> "]" | <ident> [ "[" <E0> "]" ] | <reg> | 
               <reg> ":" <moveoperand>
opmove     ::= <mne_move> <moveoperand> "," <moveoperand>
inst       ::= <opmove> | <opjmp> | <opbitman> | 
               <op2reg> | <op1reg> | <op0reg>
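
As an illustration of the three simplest rules (op0reg, op1reg and op2reg), here is a minimal Python sketch of a parser for them. It is not the assembler's implementation: the function name `parse_simple` is hypothetical, and it assumes registers are written R0 to R31, ignoring any named registers such as bp.

```python
import re

MNE_0REG = {"HALT", "RET", "IRET"}
MNE_1REG = {"CLR", "SET", "PUSH", "POP", "INC", "DEC", "NOT", "NEG"}
MNE_2REG = {"ADD", "SUB", "AND", "OR", "XOR", "TEST"}
# Assumed register spelling: R0..R31, case-insensitive.
REG = re.compile(r"R([0-9]|[12][0-9]|3[01])", re.IGNORECASE)


def parse_simple(line):
    """Parse one op0reg/op1reg/op2reg line into (rule, mnemonic, registers)."""
    parts = line.split(None, 1)
    if not parts:
        raise ValueError("empty line")
    mne = parts[0].upper()
    operands = []
    if len(parts) > 1:
        operands = [p.strip().upper() for p in parts[1].split(",")]
    # Find the rule whose mnemonic set contains this mnemonic.
    for rule, group, arity in (("op0reg", MNE_0REG, 0),
                               ("op1reg", MNE_1REG, 1),
                               ("op2reg", MNE_2REG, 2)):
        if mne in group:
            break
    else:
        raise ValueError(f"not a simple register instruction: {mne}")
    # The operand count must match the rule and every operand must be a register.
    if len(operands) != arity or not all(REG.fullmatch(op) for op in operands):
        raise ValueError(f"{mne} expects {arity} register operand(s)")
    return (rule, mne, operands)


print(parse_simple("add R11, R10"))
print(parse_simple("halt"))
```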

"move" instruction

While all other instructions are quite simple to parse, the move-instruction causes the most problems because of its greater flexibility. We will therefore look at the move-instruction and handle each of the different cases in turn:

moveoperand::= <const> | "@" <ident> | "(" E0 ")"  | "[" <reg> "]" | 
               "[" <E0> "]" | <ident> [ "[" <E0> "]" ] | <reg> | 
               <reg> ":" <moveoperand>
opmove     ::= <mne_move> <moveoperand> "," <moveoperand>

<const>

Simply handled - a const may never occur on the rhs of the move.

"@" <ident>

Take the address of the identifier - only possible if the identifier is a label or memory reference.

"(" E0 ")"

Handled as a constant - may thus never occur on rhs of move.

"[" <reg> "]" | "[" <E0> "]"

Whenever something is enclosed in "[" "]" it is considered an offset (a reference to memory). If a base-register is not supplied through the <reg> ":" <moveoperand> syntax, the current assumes will be used to attempt to resolve the base-register to use; this is of course not relevant if a register is enclosed between the brackets, in which case the base-register is always bp.

<reg> ":" <moveoperand>

Base override - used when you wish to define your own base-register for the addressing. When this syntax is used, the base-register of the moveoperand following need not be resolvable.

<ident> [ "[" <E0> "]" ]

Array indexing - the E0 value will be resolved and multiplied by the element size of ident, and the resulting displacement is added to the address of ident to form the complete offset. The base-register needed to address ident must be resolvable unless the above syntax is used.
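
The offset calculation for array indexing amounts to one line of arithmetic, shown here as a small Python sketch. The function name, the example address and the element size of 2 address units for "dw" are assumptions made for illustration only.

```python
def element_offset(ident_address, index, element_size):
    """Offset of ident[index]: address of ident plus index * element size."""
    return ident_address + index * element_size


# Hypothetical example: an array of words placed at address 5, with an
# assumed element size of 2 address units per word.
print(element_offset(5, 3, 2))
```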

Of course, the simplicity of the CPU places a number of restrictions on the use of the move-instruction. In general:

no more than two registers may ever be used
you cannot move from memory to memory unless both operands are register-offsets ([<reg>])
you can never store in a constant - rhs must be either register or offset

Beyond that, try it out: the assembler will report a syntax-error if it encounters an instruction it does not know how to assemble. You may be able to assemble the instruction manually, in which case you can use a label to resolve your problem.
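
The three general restrictions listed above can be expressed as a small check, sketched here in Python. The operand representation (a kind plus the list of registers the operand uses) and the name `check_move` are hypothetical and chosen only to make the rules concrete; the real assembler may apply further restrictions.

```python
MEMORY_KINDS = {"regoffset", "offset"}   # "[<reg>]" and every other memory form


def check_move(src, dst):
    """Return the list of violated rules for "move src, dst" (empty if none).

    Each operand is a pair (kind, registers) where kind is one of
    "const", "reg", "regoffset" or "offset".
    """
    problems = []
    src_kind, src_regs = src
    dst_kind, dst_regs = dst
    # Rule 1: no more than two registers may ever be used.
    if len(src_regs) + len(dst_regs) > 2:
        problems.append("more than two registers")
    # Rule 2: memory to memory only if both operands are register-offsets.
    if src_kind in MEMORY_KINDS and dst_kind in MEMORY_KINDS:
        if not (src_kind == "regoffset" and dst_kind == "regoffset"):
            problems.append("memory-to-memory needs [<reg>] on both sides")
    # Rule 3: the rhs must be a register or an offset, never a constant.
    if dst_kind == "const":
        problems.append("destination is a constant")
    return problems


print(check_move(("const", []), ("reg", ["R10"])))          # allowed
print(check_move(("offset", ["bp"]), ("offset", ["bp"])))   # rejected
```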

Sample code

Examples of assembler code
# Tell the assembler that the data-area starts here.
# The assembler then assumes, that BP points to the start of the data-area,
# thus it always generates move [BP+@varname], <reg> or move <reg>, [BP+@varname]
# instructions. 
.seg data

# three constants are defined here
pot10_1 = 10
pot10_2 = 100
keyb_io = 0x0A

# a definition of a series of bytes
pot10_3: db 1000

# here two words known as "pot10_6" and "pot10_9"
pot10_6: dw 1000000
pot10_9: dw 1000000000

# and now a complete array of words, first memory location is "pot10"
pot10: dw 10,100,1000,10000,100000,1000000,10000000,100000000,1000000000

# when no initial value is supplied, a zero is assumed
read: db

.seg code
.assume bp:data

start:	move.b [keyb_io], R10		# this is allowed since keyb_io is bound to constant value
	move.b ascii_a, R11		# the value ascii_a is loaded in R11
	add R11, R10			# the two regs are added
	move.b R10, read		# this syntax is allowed since "read" is bound to a memory-loc.

.end
# the code ends at ".end". The assembler will not try to process anything after ".end".

The R1-emulator

The R1-emulator features the same set of peripheral devices as found in the CPU, that is, the integrated interrupt-controller and the 4 timers/counters timer-0 to timer-3.

Timer/Counter

The timers/counters are usually used to raise an interrupt after a given number of events has occurred; they are therefore count-up timers. Each timer has an event-value, that is, the value at which an interrupt is desired. When the counter reaches this value, the interrupt-pin is set. If the counter is set for auto-reset, 0 is put into the counter, making it ready for another run.
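
The counting behaviour described above can be modelled with a few lines of Python. This is a minimal sketch, not the emulator's code: the class and attribute names are hypothetical, and what happens after the event-value is passed without auto-reset is not specified here, so the sketch simply keeps counting.

```python
class Timer:
    """Count-up timer that sets its interrupt pin at the event value."""

    def __init__(self, event_value, auto_reset=True):
        self.counter = 0
        self.event_value = event_value
        self.auto_reset = auto_reset
        self.interrupt_pin = False

    def tick(self):
        """Register one event (one clock pulse)."""
        self.counter += 1
        if self.counter == self.event_value:
            self.interrupt_pin = True
            if self.auto_reset:
                self.counter = 0     # ready for another run


timer = Timer(event_value=3)
for _ in range(3):
    timer.tick()
print(timer.interrupt_pin, timer.counter)
```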

The 4 timers/counters are accessible at the following absolute addresses:

Address	Contains
0xFFFF0000	Timer0 event value (write), counter value (read)
0xFFFF0002	Timer0 control word
0xFFFF0004	Timer1 event value (write), counter value (read)
0xFFFF0006	Timer1 control word
0xFFFF0008	Timer2 event value (write), counter value (read)
0xFFFF000A	Timer2 control word
0xFFFF000C	Timer3 event value (write), counter value (read)
0xFFFF000E	Timer3 control word

Should you try to access e.g. 0xFFFF0001, an undefined result will be returned - only word accesses on the exact addresses listed are allowed.
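
The address table maps onto timer numbers in a regular way, which the following Python sketch makes explicit. The function name and the returned labels are hypothetical; the sketch returns None for any address that is not listed in the table, where the text above says the result is undefined.

```python
TIMER_BASE = 0xFFFF0000   # address of the Timer0 event value / counter


def decode_timer_address(address):
    """Map an absolute address to (timer number, register), or None."""
    offset = address - TIMER_BASE
    # Only the eight even offsets 0x0..0xE appear in the table.
    if not 0 <= offset <= 0xE or offset % 2:
        return None
    timer, which = divmod(offset, 4)
    return (timer, "event/counter" if which == 0 else "control")


print(decode_timer_address(0xFFFF0006))
print(decode_timer_address(0xFFFF0001))
```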

The control-words have the following format:

	00000000.00000000.00000000.000ssize
	ss - select source
		00 - internal clock
		01 - external clock (T?CLK)
		10 - select last counters event flag
		11 - reserved
	i - interrupt on eventval
		0 - do not generate interrupt when D == eventcount
		1 - generate interrupt when D == eventcount
	z - reset on eventval
		0 - do not reset counter when eventvalue is reached
		1 - reset counter when eventvalue is reached
	e - counter enabled
		0 - counter not enabled
		1 - counter enabled
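
Reading the layout above from the right, e is bit 0, z is bit 1, i is bit 2 and ss occupies bits 3 and 4. A minimal Python sketch of packing and unpacking a control-word under that reading follows; the function and key names are hypothetical.

```python
def decode_control(word):
    """Split a timer control-word into its fields."""
    return {
        "source": (word >> 3) & 0b11,      # ss: 0 internal, 1 external, 2 last counter
        "interrupt": bool(word & 0b100),   # i: interrupt on event value
        "reset": bool(word & 0b010),       # z: reset on event value
        "enabled": bool(word & 0b001),     # e: counter enabled
    }


def encode_control(source, interrupt, reset, enabled):
    """Build a control-word; source value 3 is reserved and rejected."""
    if not 0 <= source <= 2:
        raise ValueError("source must be 0, 1 or 2 (3 is reserved)")
    return (source << 3) | (int(interrupt) << 2) | (int(reset) << 1) | int(enabled)


word = encode_control(source=1, interrupt=True, reset=True, enabled=True)
print(bin(word), decode_control(word))
```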

¹Douglas V. Hall - Microprocessors and Interfacing, programming and hardware, second edition - McGraw-Hill 1992 - ISBN: 0-07-112636-8.

No warranties are associated with the information provided here. All information and concepts are public domain (unless I overlooked someone).

Last updated 2000-10-11 by FlushedSector
