Contents
- 1. Introduction
- 2. CPU file
- 2.1. components
- 2.1.1. Add
- 2.1.2. ALU
- 2.1.3. ALUControl
- 2.1.4. And
- 2.1.5. Concatenator
- 2.1.6. Constant
- 2.1.7. ControlUnit
- 2.1.8. DataMemory
- 2.1.9. Distributor
- 2.1.10. ExtendedALU
- 2.1.11. Fork
- 2.1.12. ForwardingUnit
- 2.1.13. HazardDetectionUnit
- 2.1.14. InstructionMemory
- 2.1.15. Multiplexer
- 2.1.16. Not
- 2.1.17. Or
- 2.1.18. PC
- 2.1.19. PipelineRegister
- 2.1.20. RegBank
- 2.1.21. ShiftLeft
- 2.1.22. SignExtend
- 2.1.23. Xor
- 2.1.24. ZeroExtend
- 2.2. wires
- 2.3. reg_names
- 2.4. instructions
- 3. Instruction set file
- 3.1. types
- 3.2. instructions
- 3.3. pseudo
- 3.4. control
- 3.5. alu
- 4. Custom Components
1. Introduction
DrMIPS provides several unicycle and pipelined MIPS datapaths. These datapaths are defined in JSON (http://json.org/) files that have the .cpu extension. These CPU files can be modified, and new ones can be created.
The instruction sets used by the datapaths are also defined in JSON files, which have the .set extension. These can also be created and modified.
This manual explains, in some detail, the syntax of both kinds of file. Section 2 explains the syntax of the CPU files, while section 3 explains the syntax of the instruction set files.
2. CPU file
The different versions of the MIPS CPU are defined in CPU files. These can be edited/configured and additional ones can be created. This section explains the syntax of these files.
The CPU files are formatted in JSON. A partial example of a CPU file is shown below:
{
"components": {
"MuxDst": {
"type": "Multiplexer",
"x": 205,
"y": 260,
"size": 5,
"sel": "RegDst",
"out": "Out",
"in": ["0", "1"],
"desc": {
"default": "Custom description in English.",
"pt": "Descrição personalizada em Português."
}
},
...
},
"wires": [
{
"from": "DistInst",
"out": "15-11",
"to": "MuxDst",
"in": "1",
"start": {"x": 185, "y": 270},
"points": [
{"x": 195, "y": 270},
{"x": 195, "y": 282}
],
"end": {"x": 205, "y": 280}
},
...
],
"reg_names": ["zero", "at", "v0", "v1", "a0", "a1", "a2", "a3",
"t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7",
"s0", "s1", "s2", "s3", "s4", "s5", "s6", "s7",
"t8", "t9", "k0", "k1", "gp", "sp", "fp", "ra"],
"instructions": "default.set"
}
The various sections that compose the CPU files are detailed in the following sections.
2.1. components
This section defines all the components of the CPU and their properties. An example of the definition of a component is shown below:
"MuxDst": {
"type": "Multiplexer",
"x": 205,
"y": 260,
"size": 5,
"sel": "RegDst",
"out": "Out",
"in": ["0", "1"],
"desc": {
"default": "Description in English.",
"pt": "Descrição em Português."
}
}
Each component is identified by a unique ID (MuxDst in this example). The properties of the component are defined between curly braces as a JSON object. Many properties are specific to each type of component, but some exist for all components. These are:
- type: the type of the component. The component of the example is a multiplexer. The different types of components are explained next.
- latency: (optional) an integer with the latency of the component in ps. The default latency is 0 ps.
- x: the x-coordinate of the top-left corner of the component in the graphical datapath. The minimum value is 0 and corresponds to the left border of the datapath. There is no maximum value.
- y: the y-coordinate of the top-left corner of the component in the graphical datapath. The minimum value is 0 and corresponds to the top border of the datapath. There is no maximum value.
- desc: (optional) component-specific description, in each language, shown in the tooltip of the component. The value is a JSON object where the description for each language is defined in the form "language_code": "Description.". The language_code identifier is the code of the language, like pt or pt_BR. The special language code default should define the default description in English that is used when the language-specific description is not available.
The following subsections explain the different types of components available and their specific properties. The titles of the subsections are the values that should be written in the type property of the components. These correspond to the names of the classes in the source code. This property is case-sensitive.
2.1.1. Add
An adder that sums the values of the inputs. The specific properties are:
- in1: identifier of the first input.
- in2: identifier of the second input.
- out: identifier of the output.
2.1.2. ALU
The basic ALU. Only one ALU or Extended ALU can be present. The specific properties are:
- in1: identifier of the first input.
- in2: identifier of the second input.
- control: identifier of the control input (that selects the operation to perform).
- out: identifier of the result output.
- zero: identifier of the 1-bit zero output.
2.1.3. ALUControl
The component that controls the ALU. Only one can be present. The specific properties are:
- aluop: identifier of the ALUOp input.
- func: identifier of the func input (from the function field of the instruction).
2.1.4. And
A logical AND gate. The specific properties are the same as those of the Add component.
2.1.5. Concatenator
A "concatenator" that concatenates the values of the two inputs into a single output. The value of the output is the concatenation of the value of the first input (as higher order bits) with the value of the second input (as lower order bits). The size of the output is equal to the sum of the sizes of the inputs. The specific properties are:
- in1: properties of the first input.
- in2: properties of the second input.
- out: identifier of the output.
The properties of both inputs are defined as JSON objects. The properties of these objects are:
- id: identifier of the input.
- size: size of the input (in bits).
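The concatenation rule described above can be illustrated with a short Python sketch. It is not taken from the simulator's source; the function name concatenate and its parameters are hypothetical and only mirror the in1/in2 values and sizes from the text.

```python
def concatenate(in1: int, size1: int, in2: int, size2: int) -> tuple[int, int]:
    """Return (value, size) of in1 (higher-order bits) joined with in2 (lower-order bits)."""
    # Mask each input to its declared width before combining them.
    high = in1 & ((1 << size1) - 1)
    low = in2 & ((1 << size2) - 1)
    # The output size is the sum of the input sizes.
    return (high << size2) | low, size1 + size2


value, size = concatenate(0b1010, 4, 0b11, 2)
print(bin(value), size)  # 0b101011 6
```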
2.1.6. Constant
A component that outputs a constant value. The specific properties are:
- out: identifier of the output.
- val: the constant value.
- size: size of the output (in bits).
2.1.7. ControlUnit
The control unit. The datapath must have one control unit. This component has only one specific property: in, which is the identifier of the input.
2.1.8. DataMemory
The data memory. Only one can be present. The specific properties are:
- size: size of the memory (number of 32-bit memory positions).
- address: identifier of the Address input.
- write_data: identifier of the WriteData input.
- out: identifier of the output.
- mem_read: identifier of the MemRead control input.
- mem_write: identifier of the MemWrite control input.
2.1.9. Distributor
This component distributes the bits of the input through several outputs. The specific properties are:
- in: properties of the input as a JSON object. The properties of the object are:
  - id: identifier of the input.
  - size: size of the input (in bits).
- out: properties of the outputs as a JSON array. Each element of the array defines the properties of an output as a JSON object. The properties of the objects of the array are:
  - msb: the index of the most significant bit from the input.
  - lsb: the index of the least significant bit from the input.
  - id: (optional) identifier of the output. If omitted, the identifier corresponds to "<msb>-<lsb>".
2.1.10. ExtendedALU
An extended ALU. This ALU stores the hi and lo registers and is capable of calculating multiplications and divisions. Only one ALU or Extended ALU can be present. The specific properties are the same as those of the basic ALU.
2.1.11. Fork
This component forks a wire into several other wires with the same size. The specific properties are:
- in: identifier of the input.
- size: size of the input and outputs (in bits).
- out: array with the identifiers of the outputs.
2.1.12. ForwardingUnit
The pipeline forwarding unit. Only one can be present. The specific properties are:
- ex_mem_reg_write: identifier of the EX/MEM.RegWrite input.
- mem_wb_reg_write: identifier of the MEM/WB.RegWrite input.
- ex_mem_rd: identifier of the EX/MEM.Rd input.
- mem_wb_rd: identifier of the MEM/WB.Rd input.
- id_ex_rs: identifier of the ID/EX.Rs input.
- id_ex_rt: identifier of the ID/EX.Rt input.
- fwd_a: identifier of the ForwardA output.
- fwd_b: identifier of the ForwardB output.
2.1.13. HazardDetectionUnit
The pipeline hazard detection unit. Only one can be present. The specific properties are:
- id_ex_mem_read: identifier of the ID/EX.MemRead input.
- id_ex_rt: identifier of the ID/EX.Rt input.
- if_id_rs: identifier of the IF/ID.Rs input.
- if_id_rt: identifier of the IF/ID.Rt input.
- stall: identifier of the output.
2.1.14. InstructionMemory
The instruction memory. The datapath must have one instruction memory. The specific properties are:
- in: identifier of the input.
- out: identifier of the output.
2.1.15. Multiplexer
A multiplexer. The specific properties are:
- size: the size of the inputs and output (in bits).
- sel: identifier of the selector input.
- out: identifier of the output.
- in: array with the identifiers of the inputs.
2.1.16. Not
A logical NOT gate. The specific properties are:
- in: identifier of the input.
- out: identifier of the output.
2.1.17. Or
A logical OR gate. The specific properties are the same as those of the Add component.
2.1.18. PC
The program counter. The datapath must have one program counter. The specific properties are:
- in: identifier of the input.
- out: identifier of the output.
- write: (optional) identifier of the Write control input.
2.1.19. PipelineRegister
A pipeline register that separates two stages of the pipeline. A pipelined datapath must have exactly 4 of these registers (corresponding to a 5-stage pipeline). Additionally, the identifiers of these components must be: IF/ID, ID/EX, EX/MEM and MEM/WB. The specific properties are:
- regs: definition of the registers recorded. The value is a JSON object where each property defines a register: the identifier is the identifier of the register and corresponding input and output, and the value is the size of the register (in bits).
- flush: (optional) identifier of the Flush control input.
- write: (optional) identifier of the Write control input.
2.1.20. RegBank
The register bank. The datapath must have one register bank. The specific properties are:
- num_regs: the number of registers. Must be greater than 1 and a power of 2.
- read_reg1: identifier of the ReadReg1 input.
- read_reg2: identifier of the ReadReg2 input.
- read_data1: identifier of the ReadData1 output.
- read_data2: identifier of the ReadData2 output.
- write_reg: identifier of the WriteReg input.
- write_data: identifier of the WriteData input.
- reg_write: identifier of the RegWrite control input.
- forwarding: (optional) if true, the register bank will use internal forwarding (for pipelined datapaths).
- const_regs: (optional) JSON array that defines the constant registers. Each element can be either the index of the register or a JSON object with the following properties:
  - reg: index of the register.
  - val: the constant value of the register.
2.1.21. ShiftLeft
A logical left shifter. The specific properties are:
- in: properties of the input.
- out: properties of the output.
- amount: number of bits to shift left.
2.1.22. SignExtend
A sign extender. The specific properties are:
- in: properties of the input.
- out: properties of the output.
The properties of both the input and output are defined as JSON objects. The properties of these objects are:
- id: identifier of the input/output.
- size: size of the input/output (in bits).
2.1.23. Xor
A logical XOR gate. The specific properties are the same as those of the Add component.
2.1.24. ZeroExtend
A zero extender. The specific properties are the same as the SignExtend component.
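To make the difference between the two extenders concrete, here is a minimal Python sketch. The function names and the in_size/out_size parameters are hypothetical, standing in for the size properties of the in and out objects; the simulator itself may implement this differently.

```python
def sign_extend(value: int, in_size: int, out_size: int) -> int:
    """Widen value from in_size to out_size bits by replicating the sign bit."""
    value &= (1 << in_size) - 1
    if value & (1 << (in_size - 1)):
        # Sign bit is set: fill the new upper bits with ones.
        value |= ((1 << out_size) - 1) & ~((1 << in_size) - 1)
    return value


def zero_extend(value: int, in_size: int, out_size: int) -> int:
    """Widen value from in_size to out_size bits, filling the new upper bits with zeros."""
    if out_size < in_size:
        raise ValueError("output must be at least as wide as the input")
    return value & ((1 << in_size) - 1)


print(hex(sign_extend(0xFFFC, 16, 32)))  # 0xfffffffc
print(hex(zero_extend(0xFFFC, 16, 32)))  # 0xfffc
```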
2.2. wires
This section defines all the wires that connect the components of the CPU. An example of the definition of a wire is shown below:
{
"from": "DistInst",
"out": "15-11",
"to": "MuxDst",
"in": "1",
"start": {"x": 185, "y": 270},
"points": [
{"x": 195, "y": 270},
{"x": 195, "y": 282}
],
"end": {"x": 205, "y": 280}
}
Each wire connects an output of a component to an input of another component. A wire is defined as a JSON object with several properties. These are:
- from: the ID of the component that the wire connects from (origin).
- out: the ID of the output of the origin component that the wire connects from.
- to: the ID of the component that the wire connects to (destination).
- in: the ID of the input of the destination component that the wire connects to.
- start: (optional) a JSON object that defines the start position of the wire in the graphical datapath, if the default one is unsuitable.
- points: (optional) an array of JSON objects that define the positions of the intermediate points of the wire, if desired.
- end: (optional) a JSON object that defines the end position of the wire in the graphical datapath, if the default one is unsuitable.
The positions used in the start, points and end properties above are JSON objects with two integer properties: x and y.
Each input and output of each component is, by default, "attached" to one of the four sides of the component. The positions of the inputs and outputs in the datapath and, thus, the start and end positions of the connected wires, are calculated automatically but can be overridden by the start and end properties. The inputs and outputs on each side of a component are, by default, ordered alphabetically by their IDs.
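Since every wire refers to components by their IDs, a quick consistency check can catch typos before the file is loaded in the simulator. The Python sketch below is only an illustration and is not part of DrMIPS: it assumes the wires section is a JSON array of wire objects, the helper name dangling_wires is hypothetical, and the position values of DistInst are made up for the example.

```python
import json

# A cut-down CPU file with two components and one wire (illustrative values).
CPU_TEXT = """
{
  "components": {
    "DistInst": {"type": "Distributor", "x": 100, "y": 250,
                 "in": {"id": "In", "size": 32},
                 "out": [{"msb": 15, "lsb": 11}]},
    "MuxDst": {"type": "Multiplexer", "x": 205, "y": 260, "size": 5,
               "sel": "RegDst", "out": "Out", "in": ["0", "1"]}
  },
  "wires": [
    {"from": "DistInst", "out": "15-11", "to": "MuxDst", "in": "1"}
  ]
}
"""


def dangling_wires(cpu: dict) -> list:
    """Return the wires whose origin or destination component is not defined."""
    components = cpu["components"]
    return [w for w in cpu["wires"]
            if w["from"] not in components or w["to"] not in components]


cpu = json.loads(CPU_TEXT)
print(dangling_wires(cpu))  # []
```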
2.3. reg_names
This optional section defines the "friendly" names of the registers (i.e. $zero, $t0, etc.). The value is an array of strings that defines the names of the registers, from register $0 to the last one, without the leading dollar sign. The registers can always be referred to by their indexes ($0, $1, etc.) in the simulator.
2.4. instructions
This section declares the instruction set that the CPU uses. The value is the relative path to the desired instruction set file. These files are explained in the next section.
3. Instruction set file
The instruction sets used by the different versions of the MIPS CPU are defined in instruction set files. These can be edited/configured and additional ones can be created. This section explains the syntax of these files.
Like the CPU files, the instruction set files are formatted in JSON. A partial example of an instruction set file is shown below:
{
"types": {
"R": [
{"id": "op", "size": 6},
{"id": "rs", "size": 5},
{"id": "rt", "size": 5},
{"id": "rd", "size": 5},
{"id": "shamt", "size": 5},
{"id": "func", "size": 6}
],
...
},
"instructions": {
"add": {
"type": "R",
"args": ["reg", "reg", "reg"],
"fields": {
"op": 0,
"rs": "#2",
"rt": "#3",
"rd": "#1",
"shamt": 0,
"func": 32
},
"desc": "$t1 = $t2 + $t3"
},
...
},
"pseudo": {
"move": {
"args": ["reg", "reg"],
"to": ["add #1, #2, $0"],
"desc": "$t1 = $t2"
},
...
},
"control": {
"0": {"RegDst": 1, "RegWrite": 1, "ALUOp": 2, "ALUSrc": 0, "MemToReg": 0},
...
},
"alu": {
"aluop_size": 2,
"func_size": 6,
"control_size": 3,
"control": [
{"aluop": 0, "out": {"control": 2}},
{"aluop": 2, "func": 32, "out": {"control": 2}},
...
],
"operations": {
"2": "add",
...
}
}
}
The various sections that compose the instruction set files are detailed in the following sections.
3.1. types
This section defines the existing types of instructions and their fields. All instructions are 32 bits in size. An example of the definition of an instruction type is shown below:
"R": [
{"id": "op", "size": 6},
{"id": "rs", "size": 5},
{"id": "rt", "size": 5},
{"id": "rd", "size": 5},
{"id": "shamt", "size": 5},
{"id": "func", "size": 6}
]
Each instruction type is identified by its identifier (R in this example). The fields of an instruction are defined in a JSON array. Each individual field is a JSON object and defines the field's identifier and size in bits. The first field in this example is the op field with a size of 6 bits. The first field of the instructions is considered the opcode and must have the same size in all types.
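The two constraints stated above (32 bits per instruction, same opcode size in every type) can be checked mechanically. The following Python sketch is one possible way to do so; check_types is a hypothetical helper, and the "I" type with its imm field is an assumed example that does not come from this manual.

```python
TYPES = {
    "R": [
        {"id": "op", "size": 6}, {"id": "rs", "size": 5}, {"id": "rt", "size": 5},
        {"id": "rd", "size": 5}, {"id": "shamt", "size": 5}, {"id": "func", "size": 6},
    ],
    # Assumed second type, for illustration only.
    "I": [
        {"id": "op", "size": 6}, {"id": "rs", "size": 5},
        {"id": "rt", "size": 5}, {"id": "imm", "size": 16},
    ],
}


def check_types(types: dict) -> None:
    """Raise ValueError if a type is not 32 bits wide or the opcode sizes differ."""
    opcode_sizes = {fields[0]["size"] for fields in types.values()}
    if len(opcode_sizes) > 1:
        raise ValueError("the opcode field must have the same size in all types")
    for name, fields in types.items():
        total = sum(field["size"] for field in fields)
        if total != 32:
            raise ValueError(f"type {name} is {total} bits wide, expected 32")


check_types(TYPES)  # passes silently for the definitions above
```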
3.2. instructions
This section defines the available instructions and how they are encoded. An example of the definition of an instruction is shown below:
"add": {
"type": "R",
"args": ["reg", "reg", "reg"],
"fields": {
"op": 0,
"rs": "#2",
"rt": "#3",
"rd": "#1",
"shamt": 0,
"func": 32
},
"desc": "$t1 = $t2 + $t3"
}
Each instruction is identified by its unique mnemonic (add in this example).
An instruction is defined as a JSON object with several properties. These are:
- type: the type of the instruction.
- args: an array with the types of each argument (may be omitted if the instruction has no arguments). The different types of arguments are:
  - reg: a register.
  - int: an integer value. Can also be a label (allowed because of the la pseudo-instruction).
  - target: a label in the code or direct instruction index for a jump.
  - offset: a label in the code or direct instruction offset for a branch.
  - label: a label in the code or data segment or direct address/index.
  - data: a label in the data segment or direct address plus an offset register for a load or store instruction (an argument like label($t0)). The user may omit the offset register.
- fields: a JSON object that defines the values of the fields (the fields defined in the instruction's type). The value of each field can either be a constant integer or come from an argument (specified as "#1", "#2", etc.). For values that come from a data argument, it is necessary to specify whether they come from the base address or from the offset register. This is done by appending .base or .offset to the reference to the argument (#1, #2, etc.). In this example, op has the constant value 0, rs has the value from the 2nd argument, rt has the value from the 3rd argument, etc.
- desc: (optional) a string that should contain a short symbolic description of what the instruction does for the user to see.
3.3. pseudo
This section defines the available pseudo-instructions. An example of the definition of a pseudo-instruction is shown below:
"move": {
"args": ["reg", "reg"],
"to": ["add #1, #2, $0"],
"desc": "$t1 = $t2"
}
Each pseudo-instruction is identified by its unique mnemonic (move in this example). Each mnemonic can only identify either an instruction or a pseudo-instruction, not both.
A pseudo-instruction is defined as a JSON object with several properties.
These are:
- args: an array with the types of each argument (may be omitted if the pseudo-instruction has no arguments). These types are the same ones that are available for instructions, explained in the previous section.
- to: an array that lists the real instructions that the pseudo-instruction is converted to when assembled. The values of the arguments specified by the user can be referenced by using #1 for the 1st argument, #2 for the 2nd, and so on.
- desc: (optional) a string that should contain a short symbolic description of what the pseudo-instruction does for the user to see.
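The substitution of #1, #2, etc. in the to array can be pictured as a simple textual replacement. The Python sketch below shows one way this could work; the expand function is hypothetical and the simulator's actual assembler may differ.

```python
import re


def expand(templates: list, args: list) -> list:
    """Replace each #n in the templates with the n-th user argument (1-based)."""
    def substitute(match: re.Match) -> str:
        return args[int(match.group(1)) - 1]

    # Matching all digits at once keeps "#10" from being read as "#1" followed by "0".
    return [re.sub(r"#(\d+)", substitute, template) for template in templates]


print(expand(["add #1, #2, $0"], ["$t1", "$t2"]))  # ['add $t1, $t2, $0']
```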
3.4. control
This section defines how the control unit works. More specifically, it defines the values of the outputs (control signals) for each possible value at the input (the opcode). An example of the definition of the values of the control signals for one opcode is shown below:
"0": {
"RegDst": 1,
"RegWrite": 1,
"ALUOp": 2,
"ALUSrc": 0,
"MemToReg": 0
}
The values of the control signals for a specified opcode are defined as a JSON object. The values are in decimal format and the sizes (in bits) of the control signals are determined automatically. Control signals that have the value 0 can be omitted.
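Because signals with the value 0 may be omitted, a lookup has to treat a missing signal as 0. The Python sketch below illustrates that rule; control_signal is a hypothetical helper, and returning 0 for an opcode that has no entry at all is an assumption made for the example, not documented behaviour.

```python
CONTROL = {
    "0": {"RegDst": 1, "RegWrite": 1, "ALUOp": 2, "ALUSrc": 0, "MemToReg": 0},
}


def control_signal(control: dict, opcode: int, signal: str) -> int:
    """Return the value of a control signal for an opcode, defaulting to 0."""
    # Opcodes are JSON object keys, so they are stored as decimal strings.
    return control.get(str(opcode), {}).get(signal, 0)


print(control_signal(CONTROL, 0, "ALUOp"))     # 2
print(control_signal(CONTROL, 0, "MemWrite"))  # 0 (omitted in the definition)
```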
3.5. alu
This section defines how the ALU and ALU control work. A partial example of this section is shown below:
"alu": {
"aluop_size": 2,
"func_size": 6,
"control_size": 3,
"control": [
{"aluop": 0, "out": {"control": 2}},
{"aluop": 2, "func": 32, "out": {"control": 2}},
...
],
"operations": {
"2": "add",
...
}
}
The aluop_size, func_size and control_size properties define the sizes (in bits) of the ALUOp and func inputs of the ALU control and of the control input of the ALU, respectively.
The control subsection defines how the ALU control works (in a JSON array). Each element in the array defines (as a JSON object) the values of the outputs for the specified values of the inputs. The aluop and func properties correspond to the values of the inputs for the correspondence. If the value of the func input is not relevant for the correspondence (don't care, can be any value), the property should be omitted. The out property defines the values of the outputs for the correspondence. The values are specified as a JSON object and the sizes (in bits) of the outputs are determined automatically. All values are in decimal format.
The operations subsection defines (as a JSON object) the correspondence between the values of the control input of the ALU and the arithmetic operation it performs. The available operations are: add, sub, and, or, slt, xor, sll, srl, sra, nor, mult, div, mfhi, mflo. Note that the last four operations require an "extended" ALU in the CPU, instead of a "normal" ALU.
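The two-step lookup described in this section (ALU control inputs to control value, then control value to operation) can be sketched in Python as follows. The sketch uses only the entries from the partial example above; alu_operation is a hypothetical helper, and picking the first matching entry of the control array is an assumption about how ties would be resolved.

```python
ALU = {
    "aluop_size": 2,
    "func_size": 6,
    "control_size": 3,
    "control": [
        {"aluop": 0, "out": {"control": 2}},
        {"aluop": 2, "func": 32, "out": {"control": 2}},
    ],
    "operations": {"2": "add"},
}


def alu_operation(alu: dict, aluop: int, func: int):
    """Return the ALU operation selected by the given inputs, or None if no entry matches."""
    for entry in alu["control"]:
        if entry["aluop"] != aluop:
            continue
        # An omitted func means "don't care": the entry matches any func value.
        if "func" in entry and entry["func"] != func:
            continue
        return alu["operations"].get(str(entry["out"]["control"]))
    return None


print(alu_operation(ALU, 0, 7))   # add (func is ignored when aluop is 0)
print(alu_operation(ALU, 2, 32))  # add
```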
4. Custom Components
Custom components can be created and provided along with the CPU files. These should be compiled Java classes and they should extend the class brunonova.drmips.simulator.Component.
To compile a custom component into a .class file, open a terminal in the folder where the file is and run:
javac -source 1.7 -target 1.7 -cp /path/to/DrMIPSSimulator.jar CustomComponentName.java
The compiled .class files should be placed in the same folder as the CPU file. If they are in a package, they must follow Java's directory hierarchy. That is, if the class is in a package named package.name, the file should be put in /path/to/cpu/package/name/.
It will then be possible to use this custom component in the CPU files in the same way as built-in components. The type of the component in the CPU file corresponds to the name of the Java class, prefixed with the package name if it is in one.
Check the documentation of the Component class in the source code of the simulator for more information.