Memories

A significant part of the hardware design effort revolves around memories. Unlike in Von Neumann models, memories must be explicitly managed. Some of the concerns when designing memories for ASICs/FPGAs:

  • Reads and writes may take a different number of cycles to take effect
  • Reset does not initialize memory contents
  • There may not be data forwarding if a read and a write happen in the same cycle
  • ASIC memories come from memory compilers that require custom setup pins and connections
  • FPGA memories tend to have their own set of constraints too
  • Logic around memories like BIST has to be added before fabrication

This constrains the language: it is difficult for the language to provide a single vector/memory type that handles all these cases. Instead, complex memories are managed by the Pyrope standard library.

The flow directly supports arrays/memories in two ways:

  • Async memories or arrays
  • RTL instantiation

Async memories or arrays

Asynchronous memories, async memories for short, have the same interface as Pyrope tuples. The difference between tuples/arrays and async memories is that async memories preserve their contents across cycles, whereas array contents are cleared at the end of each cycle.

In Pyrope, an async memory takes one cycle to write a value and zero cycles to read. The memory has forwarding by default, so a read sees a write issued in the same cycle, which behaves like a zero-cycle read/write. To a non-hardware programmer, the default memory looks like an array that persists across cycles.
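As a rough mental model of these semantics, the Python sketch below may help. It is not Pyrope and not how the compiler implements memories; the `AsyncMem` class and its method names are hypothetical. Committed state survives across cycles, writes are staged until the end of the cycle, and a staged write is forwarded to reads issued later in the same cycle.

```python
class AsyncMem:
    """Behavioral model: 0-cycle read, 1-cycle write, optional forwarding."""

    def __init__(self, size, init=0, fwd=True):
        self.state = [init] * size   # contents preserved across cycles
        self.pending = {}            # writes issued in the current cycle
        self.fwd = fwd

    def write(self, addr, value):
        self.pending[addr] = value   # takes effect at the end of the cycle

    def read(self, addr):
        if self.fwd and addr in self.pending:
            return self.pending[addr]  # forwarding: see the same-cycle write
        return self.state[addr]

    def tick(self):
        """Advance one clock cycle: commit the staged writes."""
        for addr, value in self.pending.items():
            self.state[addr] = value
        self.pending.clear()


mem = AsyncMem(size=8)
mem.write(3, 42)
same_cycle = mem.read(3)   # 42, forwarded from the pending write
mem.tick()
next_cycle = mem.read(3)   # 42, the value persists across the cycle

no_fwd = AsyncMem(size=8, fwd=False)
no_fwd.write(3, 42)
stale = no_fwd.read(3)     # 0, the write is not visible until the next cycle
```

With `fwd=False` the model behaves more like what a hardware designer would expect from a raw memory array: the read returns the old contents until the clock edge.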

Pyrope async memories behave like what a "traditional software programmer" expects from an array: values are initialized and forwarding is enabled. This is not what a "traditional hardware programmer" expects. In languages like Chisel there is no forwarding or initialization. In Pyrope it is possible to have async memories with different options, but those should use the RTL interface.

Async memories behave like tuples/arrays with one small difference: the state persists between clock cycles. This persistence is achieved with a reg declaration. When a variable is declared with var, the contents are lost at the end of the cycle; when declared with reg, the contents are preserved across cycles.

In most cases, the arrays and async memories can be inferred automatically. The maximum/minimum value on the index effectively sets the size and the default initialization is zero.

reg mem:[] = 0
mem[3]   = something // async memory
var array:[] = _
array[3] = something // array, no persistence across cycles
var index:u7 = _
var index2:u6 = _

array[index] = something
some_result  = array[index2+3]

In the previous example, the largest possible index is 127 (the maximum value of the u7 index), so the compiler infers that the bundle has at most 128 entries.
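The arithmetic behind the inferred bound can be checked with a small Python sketch. The helper name `umax` is hypothetical; the only assumption is that an unsigned `uN` value ranges from 0 to 2^N - 1.

```python
def umax(bits):
    """Largest value of an unsigned integer with the given bit width."""
    return (1 << bits) - 1

# array[index] with index:u7 can reach index 127
max_write_index = umax(7)
# array[index2+3] with index2:u6 can reach index 63 + 3 = 66
max_read_index = umax(6) + 3

# The largest reachable index bounds the inferred array size
max_index = max(max_write_index, max_read_index)
print(max_index)  # 127
```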

There are several constructs to declare arrays or async memories:

reg mem1:[16]i8 = 3   // 16-entry memory of type i8, initialized to 3
reg mem2:[16]i8 = _   // 16-entry memory of type i8, initialized to 0
var mem3:[] = 0sb?        // array infer size and type, 0sb? initialized
var mem4:[13] = 0         // array 13 entries size, initialized to zero

Pyrope allows slicing of bundles and hence arrays.

x1 = array[first..<last]  // from first to last, last not included
x2 = array[first..=last]  // from first to last, last included
x3 = array[first..+size]  // from first to first+size, first+size not included
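For readers more familiar with software slicing, the three range forms map onto Python list slices roughly as follows. This is only an analogy on a plain Python list, with made-up values for `first`, `last`, and `size`.

```python
array = list(range(10))          # [0, 1, ..., 9]
first, last, size = 2, 5, 3

x1 = array[first:last]           # first..<last : last not included
x2 = array[first:last + 1]       # first..=last : last included
x3 = array[first:first + size]   # first..+size : size entries from first

print(x1, x2, x3)  # [2, 3, 4] [2, 3, 4, 5] [2, 3, 4]
```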

Since bundles are multi-dimensional, arrays or async memories are multi-dimensional too.

a[3][4] = 1

var b:[4][8]u8 = 13

assert b[2][7] == 13
assert b[2][10]      // compile error, '10' is out of bound access for 'b[2]'

It is possible to initialize the async memory with an array. The initialization of async memories happens whenever reset is set on the system. A key difference between arrays (no clock) and memories is that an array's initialization value must be comptime, while memories and reg can use a sequence of statements to generate the reset value.

var mem1:[4][8]u5 = 0
var reset_value:[3][8]u5:[comptime] = _ // only used during reset
for i in 0..<3 {
  for j in 0..<8 {
    reset_value[i][j] = j
  }
}
reg mem2 = reset_value   // infer async mem u5[3][8]

// Explicit form equivalent to the declarations above
var mem1 = (
  ,(u5(0), u5(0), u5(0), u5(0), u5(0), u5(0), u5(0), u5(0))
  ,(u5(0), u5(0), u5(0), u5(0), u5(0), u5(0), u5(0), u5(0))
  ,(u5(0), u5(0), u5(0), u5(0), u5(0), u5(0), u5(0), u5(0))
  ,(u5(0), u5(0), u5(0), u5(0), u5(0), u5(0), u5(0), u5(0))
)
reg mem2 = ( 
  ,(u5(0), u5(1), u5(2), u5(3), u5(4), u5(5), u5(6), u5(7))
  ,(u5(0), u5(1), u5(2), u5(3), u5(4), u5(5), u5(6), u5(7))
  ,(u5(0), u5(1), u5(2), u5(3), u5(4), u5(5), u5(6), u5(7))
)

Sync memories

Pyrope asynchronous memories provide the result of the read address and update their contents in the same cycle. This means that traditional SRAM arrays cannot be used directly. Most SRAM arrays flop either the inputs or the outputs (sense amplifiers). This document calls a memory synchronous when it has a flop on either its inputs or its outputs.

There are two ways in Pyrope to instantiate more traditional synchronous memories. Either use async memories with flopped inputs/outputs or do a direct RTL instantiation.

Flop the inputs or outputs

When either the inputs or the outputs of an asynchronous memory access are directly connected to a flop, the flow can recognize the memory as a synchronous memory. A further constraint is that this only works for single-dimension memories. Multi-dimensional memories or memories with partial updates need to use the RTL instantiation.

To illustrate the point of simple single dimensional synchronous memories, this is a typical decode stage from an in-order CPU:

// Flop the inputs: the read addresses are registered
reg rf:[32]i64 = 0sb?   // random initialized

reg a:(addr1:u5, addr2:u5) = (0,0)

data_rs1 = rf[a.addr1]
data_rs2 = rf[a.addr2]

a = (insn[8..=11], insn[0..=4])

// Flop the outputs: the read data is registered
reg rf:[32]i64 = 0sb?

reg a:(data1:i64, data2:i64) = _

data_rs1 = a.data1
data_rs2 = a.data2

a = (rf[insn[8..=11]], rf[insn[0..=4]])
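The timing of the two variants can be sketched in Python, with one loop iteration standing in for one clock cycle. This is an illustrative model only: the `rf` contents and the function names are made up, and the flops are assumed to reset to zero.

```python
rf = [10 * i for i in range(32)]  # stand-in register file contents

def flop_inputs(addrs):
    """Register the address; read the memory with last cycle's address."""
    addr_reg = 0                  # reset value of the address flop
    out = []
    for addr in addrs:            # one iteration per clock cycle
        out.append(rf[addr_reg])  # combinational read from the flopped address
        addr_reg = addr           # the flop captures the new address
    return out

def flop_outputs(addrs):
    """Read the memory combinationally; register the data."""
    data_reg = 0                  # reset value of the data flop
    out = []
    for addr in addrs:
        out.append(data_reg)      # the consumer sees last cycle's read data
        data_reg = rf[addr]       # the flop captures this cycle's read
    return out

addrs = [1, 2, 3]
print(flop_inputs(addrs))   # [0, 10, 20]
print(flop_outputs(addrs))  # [0, 10, 20]
```

In both cases the data for an address shows up one cycle after the address is presented, which is the behavior of a synchronous memory. The first-cycle value depends on the reset values of the flops.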

RTL instantiation

There are several constraints and additional options for synchronous memories that the async memory interface cannot provide: multi-dimension, partial updates, negative edge clock...

Pyrope allows a direct call to LiveHD cells through RTL instantiation, so memories can be created directly.

// A 2rd+1wr memory (RF type)

mem.addr    = (raddr0, raddr1, wraddr)
mem.bits    = 4
mem.size    = 16
mem.clock   = my_clock
mem.din     = (0, 0, din0)
mem.enable  = (1, 1, we0)

mem.fwd     = false
mem.latency = (1, 1, 1)
mem.wensize = 1 // we bit (no write mask)
mem.rdport  = (-1,1,0) // 0 WR, !=0 -> RD

res =#[..] __memory(mem)

q0 = res.0
q1 = res.1

The previous code directly instantiates a memory and passes the configuration.

Multi-cycle memories are pipelined elements. Using them requires the =#[..] assignment, and the same rules as for pipeline flops apply (see pipelining).
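To make the configuration above more concrete, here is a hedged Python sketch of how a 2-read/1-write memory with latency 1 and no forwarding might behave. It is not the LiveHD `__memory` cell: the class name, the `tick` signature, and the zero initial contents are assumptions made for illustration.

```python
class SyncMem2R1W:
    """Sketch of a 2-read/1-write memory with latency 1 and no forwarding."""

    def __init__(self, size=16, bits=4):
        self.mask = (1 << bits) - 1
        self.data = [0] * size      # zero start is a modeling convenience
        self.q = (0, 0)             # read data visible in the current cycle

    def tick(self, raddr0, raddr1, wraddr, din, we):
        """Apply one clock edge and return the new (q0, q1)."""
        # fwd = false: both reads sample the contents before the write lands
        sampled = (self.data[raddr0], self.data[raddr1])
        if we:
            self.data[wraddr] = din & self.mask
        self.q = sampled            # latency 1: data appears after the edge
        return self.q


mem = SyncMem2R1W()
first = mem.tick(raddr0=2, raddr1=2, wraddr=2, din=7, we=1)   # old value
second = mem.tick(raddr0=2, raddr1=3, wraddr=0, din=0, we=0)  # written value
print(first, second)  # (0, 0) (7, 0)
```

Because the read result is only available after the clock edge, code consuming it has to treat it as a pipelined value.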

Multidimensional arrays

Pyrope supports multi-dimensional arrays, and it is possible to slice the array by dimension. The entries are in row-major order.

var d2:[2][2] = ((1,2),(3,4))
assert d2[0][0] == 1 and d2[0][1] == 2 and d2[1][0] == 3 and d2[1][1] == 4

assert d2[0] == (1,2) and d2[1] == (3,4)

The for iterator goes over each entry of the bundle/array. For a matrix, it iterates in row-major order. This allows building a simple function to flatten multi-dimensional arrays.

let flatten = fun(...arr) {
  var res = 0
  for i in arr {
    res ++= i
  }
  return res
}

assert flatten(d2) == (1,2,3,4)
assert flatten((((1),2),3),4) == (1,2,3,4)
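For comparison, a Python analogue of the same idea is sketched below. It is not a translation of the Pyrope function: it recurses explicitly over nested tuples, which visits a matrix in row-major order.

```python
def flatten(arr):
    """Flatten nested tuples depth-first, which is row-major for a matrix."""
    res = ()
    for entry in arr:
        if isinstance(entry, tuple):
            res += flatten(entry)   # descend into the inner dimension
        else:
            res += (entry,)         # leaf entry, append as-is
    return res


d2 = ((1, 2), (3, 4))
print(flatten(d2))                    # (1, 2, 3, 4)
print(flatten(((((1), 2), 3), 4)))    # (1, 2, 3, 4)
```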

Array index

Array indices are unsigned integers by default, but the index can be constrained, for example to a range or to the values of an enum.

var x1:[2]u3 = (0,1)
assert x1[0] == 0 and x1[1] == 1

var X=enum(
  ,t1 = 0 // sequential enum, not one hot enum (explicit assign)
  ,t2
  ,t3
)

var x2:[X]u3 = _
x2[X.t1] = 0
x2[X.t2] = 1
x2[0]              // compile error, only enum index

var x3:[-8..<7]u3 = _  // accept signed values

var x4:[100..<132]u3 = _

assert x4[100] == 0
assert x4[3]       // compile error, out of bounds index
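One way to picture a range-constrained index is as an offset into a zero-based storage array. The Python sketch below uses a hypothetical `RangedArray` class; note that it checks bounds at run time, whereas the Pyrope examples above report the error at compile time.

```python
class RangedArray:
    """Array whose valid indices are lo..<hi (hi not included)."""

    def __init__(self, lo, hi, init=0):
        self.lo, self.hi = lo, hi
        self.data = [init] * (hi - lo)   # zero-based backing storage

    def _slot(self, idx):
        if not (self.lo <= idx < self.hi):
            raise IndexError(f"index {idx} outside {self.lo}..<{self.hi}")
        return idx - self.lo             # shift the index into the storage

    def __getitem__(self, idx):
        return self.data[self._slot(idx)]

    def __setitem__(self, idx, value):
        self.data[self._slot(idx)] = value


x3 = RangedArray(-8, 7)      # signed index range
x3[-8] = 5
x4 = RangedArray(100, 132)   # 32 entries, first valid index is 100
print(x3[-8], x4[100], len(x4.data))  # 5 0 32
```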

Reset and initialization

Like the let and var statements, reg statements require an initialization value. While let/var initialize every cycle, the reg initialization is the value to set during reset.

As in the let/var cases, the reset/initialization value can use the traditional Verilog uninitialized (0sb?) contents. The Pyrope semantics for any bit with a ? value are to respect Verilog arithmetic semantics at compile time, but to randomly generate a zero or a one for each simulation. As a result, assertions can fail with unknowns.

reg r_ver = 0sb?

reg r = _
var v = _

assert v == 0 and r == 0

assert !(r_ver != 0)    // it will randomly fail
assert !(r_ver == 0)    // it will randomly fail
assert !(r_ver != 0sb?) // it will randomly fail
assert !(r_ver == 0sb?) // it will randomly fail

The reset for arrays may take several cycles to take effect, which can lead to unexpected results during the reset period. Memories and registers are randomly initialized before reset during simulation. There is no guarantee of zero initialization before reset.

var arr:[] = (0,1,2,3,4,5,6,7)

always assert arr[0] == 0 and arr[7] == 7  // may FAIL during reset

reg mem:[] = (0,1,2,3,4,5,6,7)

always assert mem[7] == 7                  // may FAIL during reset
always assert mem[7] == 7 unless mem.reset // OK
assert mem[7] == 7                         // OK, not checked during reset
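The Python sketch below illustrates why an unconditional check can fail while reset is in progress. It assumes, purely for illustration, that reset initializes one entry per cycle and that the pre-reset contents are arbitrary; the actual reset sequence depends on the memory implementation.

```python
RESET_VALUE = (0, 1, 2, 3, 4, 5, 6, 7)

def simulate(initial):
    """Return (in_reset, check_passed) for every simulated cycle."""
    mem = list(initial)                      # arbitrary pre-reset contents
    log = []
    for cycle in range(len(RESET_VALUE) + 2):
        in_reset = cycle < len(RESET_VALUE)
        log.append((in_reset, mem[7] == 7))  # what an always-on check sees
        if in_reset:
            mem[cycle] = RESET_VALUE[cycle]  # assumed: one entry per cycle
    return log


log = simulate([5] * 8)
during_reset = [ok for in_reset, ok in log if in_reset]
after_reset = [ok for in_reset, ok in log if not in_reset]
print(any(during_reset), all(after_reset))  # False True
```

Gating the check on the reset signal, as in the `unless mem.reset` form above, corresponds to looking only at the `after_reset` entries.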