Verilog OR of array elements - arrays

I want to OR a parameterized number of 32-bit buses as follows:
out_bus = bus1 | bus2 | bus3 | ... | busN;
I also want to declare the buses as an array (N is a fixed parameter, defined at compile time):
reg [31:0] bus[N-1:0];
The best I can figure how to do this is something like this:
parameter N;
reg [N-1:0] temp;
reg [31:0] out_bus;
reg [31:0] bus[N-1:0];
always @(*) begin
for (j=0; j<32; j=j+1) begin : bits
for (k=0; k < N; k=k+1) begin : bus
temp[k] = bus[k][j];
end
out_bus[j] = |temp;
end
end
This needs to be synthesizable. There's got to be a cleaner/better way, no?

If you were using SystemVerilog, you could replace the entire always block with
assign out_bus = bus.or();

This uses one fewer for loop and one fewer temporary signal:
reg [31:0] out_bus;
reg [31:0] bus[N-1:0];
integer k;
always @(*) begin
out_bus = {32{1'b0}};
for (k=0; k < N; k=k+1) begin
out_bus = out_bus | bus[k];
end
end
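
As a rough sketch, the single-loop version above could be wrapped in a self-contained module. Plain Verilog-2001 cannot pass an unpacked array through a port, so this sketch assumes the N buses arrive flattened into one wide vector. The module name or_reduce, the port name bus_flat, and the default parameter values are all made up for illustration.

```verilog
// Sketch only: or_reduce, bus_flat and the parameter defaults are
// hypothetical names/values, not part of the original answer.
module or_reduce #(
    parameter N     = 4,    // number of buses (assumed default)
    parameter WIDTH = 32    // width of each bus
)(
    input  wire [N*WIDTH-1:0] bus_flat, // bus k occupies bits [k*WIDTH +: WIDTH]
    output reg  [WIDTH-1:0]   out_bus
);
    integer k;

    always @(*) begin
        out_bus = {WIDTH{1'b0}};        // start from all zeros
        for (k = 0; k < N; k = k + 1)
            out_bus = out_bus | bus_flat[k*WIDTH +: WIDTH]; // OR in bus k
    end
endmodule
```

Because the loop bound is a parameter, a synthesis tool can unroll it into a plain OR tree; no storage should be inferred since out_bus is assigned on every pass through the block.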

The following code (SystemVerilog) gave the expected results in Quartus, as verified in the schematic view.
module example #(
parameter WIDTH = 32,
parameter DEPTH = 4
)(
input [DEPTH-1:0][WIDTH-1:0] DataIn,
output reg [WIDTH-1:0] DataOut
);
reg [WIDTH-1:0] ORDatain;
always @(*)
begin
ORDatain = {WIDTH{1'b0}};
for(int index=0; index <DEPTH; index++)
ORDatain = ORDatain | DataIn[index];
end
assign DataOut = ORDatain;
endmodule

Related

Making my output be an 8-bit output from a pre-defined array

I'm trying to build an up-down counter that outputs the array value at the index given by the current count, where the count moves up or down depending on which function is active. My code is below; I have excluded my 1 Hz counter, which has already been proven to work. The error I receive during synthesis is that mem has not been declared, and I am unsure how to fix that. Advice appreciated, thank you.
module reader (
output reg [7:0] out , // Output of the counter
input wire up_down , // up_down control for counter
input wire clk_1Hz , // clock input
input wire reset // reset input
mem
);
//-------------Code Starts Here-------
reg [16:0] i;
reg [7:0][0:16] mem;
initial begin
assign {mem[0],mem[1],mem[2],mem[3],mem[4],mem[5],mem[6],mem[7],mem[8],mem[9],mem[10],mem[11],mem[12],mem[13],mem[14],mem[15],mem[16]}={8'b0000000,8'b00000001,8'b00000100,8'b00001001,8'b00010000,8'b00011001,8'b00100100,8'b00110001,8'b01000000,8'b01010001,8'b01101000,8'b01111001,8'b10010000,8'b10101001,8'b11000100,8'b11100001,8'b11111111};
end
always @(posedge clk_1Hz)
if (reset) begin // active high reset
out <= 8'b0 ;
end else if (up_down) begin
i <= i+1;
out <= mem[i];
end else begin
i <= i-1;
out <=mem[i];
end
endmodule
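Initial blocks are generally not synthesizable (some FPGA tools do accept them for initialization), and mem does not belong in the port list. Try the code below.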
module reader (
output reg [7:0] out , // Output of the counter
input wire up_down , // up_down control for counter
input wire clk_1Hz , // clock input
input wire reset // reset input
// mem //Removed this from port list
);
//-------------Code Starts Here-------
reg [4:0] i; //Address range is 0-16 hence only 5 bits needed
reg [7:0]mem[16:0]; //17-entry array of 8-bit values
always @(*) begin {mem[0],mem[1],mem[2],mem[3],mem[4],mem[5],mem[6],mem[7],mem[8],mem[9],mem[10],mem[11],mem[12],mem[13],mem[14],mem[15],mem[16]}={8'b00000000,8'b00000001,8'b00000100,8'b00001001,8'b00010000,8'b00011001,8'b00100100,8'b00110001,8'b01000000,8'b01010001,8'b01101000,8'b01111001,8'b10010000,8'b10101001,8'b11000100,8'b11100001,8'b11111111};
end
always @(posedge clk_1Hz)
begin
if (reset) begin // active high reset
out <= 8'b0 ;
i <= 5'b0; //Also reset the index so mem[i] is defined after reset
end
else if (up_down)
begin
i <= i+1;
out <= mem[i];
end
else
begin
i <= (i == 0) ? i : i-1; //Added cond to avoid underflow
out <= mem[i];
end
end
endmodule
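
One caveat with the code above: an always @(*) block whose right-hand side contains only constants has nothing in its sensitivity list, so some simulators may never execute it. A possible alternative, sketched here under that assumption, is to express the table as a function with a case statement. The names reader_rom and table_value are hypothetical, and saturating the index at 16 is an added assumption (the original only guards against underflow).

```verilog
// Sketch only: reader_rom and table_value are hypothetical names.
module reader_rom (
    output reg  [7:0] out,      // table value at the current index
    input  wire       up_down,  // 1 = count up, 0 = count down
    input  wire       clk_1Hz,
    input  wire       reset     // synchronous, active high
);
    reg [4:0] i;                // index 0..16

    // Constant table written as a function, same values as mem[] above
    function [7:0] table_value;
        input [4:0] idx;
        begin
            case (idx)
                5'd0:    table_value = 8'b00000000;
                5'd1:    table_value = 8'b00000001;
                5'd2:    table_value = 8'b00000100;
                5'd3:    table_value = 8'b00001001;
                5'd4:    table_value = 8'b00010000;
                5'd5:    table_value = 8'b00011001;
                5'd6:    table_value = 8'b00100100;
                5'd7:    table_value = 8'b00110001;
                5'd8:    table_value = 8'b01000000;
                5'd9:    table_value = 8'b01010001;
                5'd10:   table_value = 8'b01101000;
                5'd11:   table_value = 8'b01111001;
                5'd12:   table_value = 8'b10010000;
                5'd13:   table_value = 8'b10101001;
                5'd14:   table_value = 8'b11000100;
                5'd15:   table_value = 8'b11100001;
                5'd16:   table_value = 8'b11111111;
                default: table_value = 8'b00000000; // unused indices 17..31
            endcase
        end
    endfunction

    always @(posedge clk_1Hz) begin
        if (reset) begin
            i   <= 5'd0;
            out <= 8'b0;
        end else begin
            if (up_down)
                i <= (i == 5'd16) ? i : i + 5'd1; // saturate at top (assumption)
            else
                i <= (i == 5'd0) ? i : i - 5'd1;  // avoid underflow
            out <= table_value(i);                // value at the pre-update index
        end
    end
endmodule
```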

How to preset the register arrays in Verilog?

I am trying to define a register file, 32 bits wide and 32 entries deep, in Verilog. How can I preset all the values to zero, or to any value I want, with or without a for loop?
Here's my code, I tried but failed:
module register_file(rna, rnb, qa, qb);
input [4:0]rna;
input [4:0]rnb;
output [31:0]qa;
output [31:0]qb;
genvar i;
reg [31:0]registers[0:31];
assign registers[0]=32'b0;
registers[1]=32'b0;
registers[2]=32'b0;
registers[3]=32'b0;
endmodule
The usual way to preset register values is with a clocked block and a reset signal. For example:
reg [31:0]registers[0:31];
integer i;
always @(posedge clk) begin
if (reset) begin
for (i = 0; i < 32; i = i + 1)
registers[i] <= 0;
end
else begin
// do some real work with registers here
end
end
In some cases you might want to do the initial setting in an initial block in your testbench:
initial begin
for (i = 0; ...) registers[i]= 0;
end
The above is generally not synthesizable, although some FPGA tools accept initial blocks for power-up values.
There are a few other ways available in SystemVerilog.
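
For illustration, the reset-based approach could be fleshed out into a complete register file along these lines. This is only a sketch: the question's module has no clock, reset, or write port, so clk, reset, we, rn, and d below are hypothetical additions.

```verilog
// Sketch only: clk, reset, we, rn and d are hypothetical ports added
// for illustration; the question's module only has rna, rnb, qa, qb.
module register_file (
    input  wire        clk,
    input  wire        reset,   // synchronous, active high
    input  wire        we,      // write enable
    input  wire [4:0]  rn,      // write address
    input  wire [31:0] d,       // write data
    input  wire [4:0]  rna,     // read address A
    input  wire [4:0]  rnb,     // read address B
    output wire [31:0] qa,
    output wire [31:0] qb
);
    reg [31:0] registers [0:31];
    integer i;

    always @(posedge clk) begin
        if (reset) begin
            // Clear all 32 entries on reset
            for (i = 0; i < 32; i = i + 1)
                registers[i] <= 32'b0;
        end else if (we) begin
            registers[rn] <= d;
        end
    end

    // Combinational (asynchronous) reads
    assign qa = registers[rna];
    assign qb = registers[rnb];
endmodule
```

Resetting every entry at once usually forces the array into flip-flops rather than a RAM block, which is typically acceptable for a 32-entry register file.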

Reading multiple block RAM indexes in one write clock cycle

I have an application where I'm continuously writing to a block ram at a slow clock speed (clk_a) and within this slow clock cycle need to read three indexes from the block ram at a fast clock speed (clk_b) to use these values as operands in a math module, the result being written back to the block ram on the next slow clock. These three indexes are the current address written to at posedge of the slow clock, plus the two immediate neighbouring addresses (addr_a -1 and addr_a +1).
What is an efficient way to synthesize this? My best attempt to date uses a small counter (triplet) running at fast clock rate that increments the addresses but I end up running out of logic as it looks like Yosys does not infer the ram properly. What is a good strategy for this?
Here is what I have:
module myRam2 (
input clk_a,
input clk_b,
input we_a,
input re_a,
input [10:0] addr_a,
input [10:0] addr_b,
input [11:0] din_a,
output [11:0] leftNeighbor,
output [11:0] currentX,
output [11:0] rightNeighbor
);
parameter MEM_INIT_FILE2 = "";
initial
if (MEM_INIT_FILE2 != "")
$readmemh(MEM_INIT_FILE2, ram2);
reg [11:0] ram2 [0:2047];
reg [1:0] triplet = 3;
reg [10:0] old_addr_a;
reg [11:0] temp;
always @(posedge clk_a) begin
ram2[addr_a] <= din_a;
end
always @(posedge clk_b)
if (old_addr_a != addr_a) begin
triplet <= 0;
old_addr_a <= addr_a;
end
else
if(triplet < 3) begin
triplet <= triplet +1;
end
always @(posedge clk_b) begin
temp <= ram2[addr_a + (triplet - 1)];
end
always @(posedge clk_b) begin
case(triplet)
0: leftN <= temp;
1: X <= temp;
2: rightN <= temp;
endcase
end
reg signed [11:0] leftN;
reg signed [11:0] X;
reg signed [11:0] rightN;
assign leftNeighbor = leftN;
assign currentX = X;
assign rightNeighbor = rightN;
endmodule
Regarding efficiency, the following approach should work and removes the need for a faster clock:
module myRam2 (
input wire clk,
input wire we,
input wire re,
input wire [10:0] addr_a,
input wire [10:0] addr_b,
input wire [11:0] din_a,
output reg [11:0] leftNeighbor,
output reg [11:0] currentX,
output reg [11:0] rightNeighbor
);
reg [11:0] ram2 [2047:0] /* synthesis syn_ramstyle = "no_rw_check" */;
always @(posedge clk) begin
if(we) ram2[addr_a] <= din_a;
if(re) {leftNeighbor,currentX,rightNeighbor} <= {ram2[addr_b-1],ram2[addr_b],ram2[addr_b+1]};
end
endmodule
The synthesis attribute has helped me in the past to increase the likelihood of correctly inferred RAM.
EDIT: removed second example suggesting a 1D mapping. It turned out that at least Lattice LSE cannot deal with that approach. However, the first code snippet should work according to Active-HDL and Lattice LSE.

Verilog vector inner product

I am trying to implement a synthesizable verilog module, which produces a vector product of 2 vector/arrays, each containing eight 16-bit unsigned integers. Design Compiler reported error that symbol i must be a constant or parameter. I don't know how to fix it. Here's my code.
module VecMul16bit (a, b, c, clk, rst);
// Two vector inner product, each has 8 elements
// Each element is 16 bits
// So the Output should be at least 2^32*2^3 = 2^35 in order to
// prevent overflow
// Output is 35 bits
input clk;
input rst;
input [127:0] a,b;
output [35:0] c;
reg [15:0] a_cp [0:7];
reg [15:0] b_cp [0:7];
reg [35:0] c_reg;
reg k,c_done;
integer i;
always @ (a)
begin
for (i=0; i<=7; i=i+1) begin
a_cp[i] = a[i*15:i*15+15];
end
end
always @ (b)
begin
for (i=0; i<=7; i=i+1) begin
b_cp[i] = b[i*15:i*15+15];
end
end
assign c = c_reg;
always @(posedge clk or posedge rst)
begin
if (rst) begin
c_reg <= 0;
k <= 0;
c_done <= 0;
end else begin
c_reg <= c_done ? c_reg : (c_reg + a_cp[k]*b_cp[k]);
k <= c_done ? k : k + 1;
c_done <= c_done ? 1 : (k == 7);
end
end
endmodule
As you can see, I'm trying to copy a to a_cp through a loop. Is this the right way to do it?
If yes, how should I define i, and can a constant be used as the step in a for loop?
A part select in Verilog must have constant bounds. So this is not allowed:
a_cp[i] = a[i*15:i*15+15];
Verilog-2001 introduced an indexed part-select syntax in which you specify the starting position and the width of the selected group of bits. So, you need to replace the above line with:
a_cp[i] = a[i*16+:16];
This takes a 16-bit slice of a starting at bit i*16 and counting upwards (towards higher bit numbers). Note the stride of 16 rather than 15: each element is 16 bits wide, so element i starts at bit i*16. You can use -: instead of +:, in which case you count downwards.
Be careful: it is very easy to type :+ instead of +: and :+ is valid syntax and so might not be spotted by your compiler (but could still be a bug). In fact I did exactly that when typing this EDA Playground example, though my typo was caught by the compiler in this case.
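
As a minimal sketch of the indexed part select in context, the following module unpacks a 128-bit vector into eight 16-bit elements, assuming element i starts at bit i*16. The module name unpack8x16 and the sel/elem ports are hypothetical and only there to make the sketch self-contained.

```verilog
// Sketch only: unpack8x16, sel and elem are hypothetical names.
module unpack8x16 (
    input  wire [127:0] a,     // eight packed 16-bit elements
    input  wire [2:0]   sel,   // which element to read back
    output wire [15:0]  elem
);
    reg [15:0] a_cp [0:7];
    integer i;

    always @(*) begin
        for (i = 0; i < 8; i = i + 1)
            a_cp[i] = a[i*16 +: 16];  // 16 bits starting at bit i*16
    end

    assign elem = a_cp[sel];
endmodule
```

With +:, only the width has to be constant; the base expression may depend on the loop variable, which is why the plain integer loop index is acceptable here.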
Actually, what you need for your code to be synthesizable is to declare i as a genvar. Something like this (using macros; put them above your module):
`define PACK_ARRAY_2D2(PK_WIDTH,PK_LEN,PK_DIMS,PK_SRC,PK_DEST,PK_OFFS) ({\
genvar pk_idx; genvar pk_dims; \
generate \
for (pk_idx=0; pk_idx<(PK_LEN); pk_idx=pk_idx+1) \
begin \
for (pk_dims=0; pk_dims<(PK_DIMS); pk_dims=pk_dims+1) \
begin \
assign PK_DEST[(((PK_WIDTH)*(pk_idx+pk_dims+1))-1+((PK_WIDTH)*PK_OFFS*pk_idx)):(((PK_WIDTH)*(pk_idx+pk_dims))+((PK_WIDTH)*PK_OFFS*pk_idx))] = PK_SRC[pk_idx][pk_dims][((PK_WIDTH)-1):0];\
end\
end\
endgenerate\
})
`define UNPACK_ARRAY_2D2(PK_WIDTH,PK_LEN,PK_DIMS,PK_DEST,PK_SRC,PK_OFFS) ({\
genvar unpk_idx; genvar unpk_dims; \
generate \
for (unpk_idx=0; unpk_idx<(PK_LEN); unpk_idx=unpk_idx+1) \
begin \
for (unpk_dims=0; unpk_dims<(PK_DIMS); unpk_dims=unpk_dims+1)\
begin \
assign PK_DEST[unpk_idx][unpk_dims][((PK_WIDTH)-1):0] = PK_SRC[(((PK_WIDTH)*(unpk_idx+unpk_dims+1))-1+((PK_WIDTH)*PK_OFFS*unpk_idx)):(((PK_WIDTH)*(unpk_idx+unpk_dims))+((PK_WIDTH)*PK_OFFS*unpk_idx))];\
end end endgenerate\
})
Here is how to use them (put the macros in pack_unpack.v), with a module that transposes a matrix as an example:
// Macros for Matrix
`include "pack_unpack.v"
module matrix_weight_transpose(
input signed [9*5*32-1:0] weight, // 5 columns, 9 rows, 32 bit data length
output signed [9*5*32-1:0] weight_transposed // 9 columns, 5 rows, 32 bit data length
);
wire [31:0] weight_in [8:0][4:0];
`UNPACK_ARRAY_2D2(32,9,5,weight_in,weight,4)
wire [31:0] weight_out [4:0][8:0];
`PACK_ARRAY_2D2(32,5,9,weight_out,weight_transposed,8)
generate // Computing the transpose
genvar i;
for (i = 0; i < 9; i = i + 1)
begin : columns
genvar j;
for (j = 0; j < 5; j = j + 1)
begin : rows
assign weight_out[j][i] = weight_in[i][j];
end
end
endgenerate
endmodule

Converting a fixed point Matlab code to Verilog

I have fixed-point Matlab code that needs to be converted to Verilog. Below is the Matlab code. yfftshift is 5000x1 and y2shape is 100x50.
rows=100;
colms=50;
r=1;
for m=0:colms-1
for n=0:rows-1
y2shape(n+1,m+1)=yfftshift(r,1);
r=r+1;
end
end
How can I create memories in Verilog and call them inside the for loop?
The easiest way to handle fixed precision in Verilog is to introduce a scale factor and allocate registers large enough to hold the maximum scaled value. For example, if you know that the maximum value of your numbers will be 40, and three digits of precision to the right of the decimal point are enough, a scaling factor of 1000 can be used with 16-bit registers (40 * 1000 = 40000, which fits in 16 unsigned bits). Verilog treats reg values as unsigned by default, so if values can be negative, it's necessary to add "signed" to the declarations. The Verilog could be:
`define NUMBER_ROWS 100
`define NUMBER_COLS 50
`define MAX_ROW (`NUMBER_ROWS-1)
`define MAX_COL (`NUMBER_COLS-1)
module moveMemory();
reg clk;
reg [15:0] y2shape [`MAX_ROW:0][`MAX_COL:0];
reg [15:0] yfftshift [(`NUMBER_ROWS * `NUMBER_COLS) - 1:0];
integer rowNumber, colNumber;
always @(posedge clk)
begin
for (rowNumber = 0; rowNumber < `NUMBER_ROWS; rowNumber = rowNumber + 1)
for (colNumber = 0; colNumber < `NUMBER_COLS; colNumber = colNumber + 1)
y2shape[rowNumber][colNumber] <= yfftshift[rowNumber * `NUMBER_COLS + colNumber];
end
endmodule
This is OK for an FPGA or simulation project, but for full custom work, an SRAM macro would be used to avoid the die area associated with 16,000 registers. For an FPGA implementation, you've probably already paid for the 16K registers, or you may be able to do some extra work to get the synthesizer to map the registers to an on-chip SRAM.
The test bench:
// Testing code
integer loadCount, rowShowNumber, colShowNumber;
initial
begin
// Initialize array with some data
for (loadCount=0; loadCount < (`NUMBER_ROWS * `NUMBER_COLS); loadCount = loadCount + 1)
yfftshift[loadCount] <= loadCount;
clk <= 0;
// Clock the block
#1
clk <= 1;
// Display the results
#1
$display("Y2SHAPE has these values at time ", $time);
for (rowShowNumber = 0; rowShowNumber < `NUMBER_ROWS; rowShowNumber = rowShowNumber + 1)
for (colShowNumber = 0; colShowNumber < `NUMBER_COLS; colShowNumber = colShowNumber + 1)
$display("y2shape[%0d][%0d] is %d ", rowShowNumber, colShowNumber, y2shape[rowShowNumber][colShowNumber]);
end
The simulation results for NUMBER_ROWS=10, NUMBER_COLS=5
Y2SHAPE has these values at time 2
y2shape[0][0] is 0
y2shape[0][1] is 1
y2shape[0][2] is 2
y2shape[0][3] is 3
y2shape[0][4] is 4
.
.
.
y2shape[9][2] is 47
y2shape[9][3] is 48
y2shape[9][4] is 49
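
The scale-factor idea from the start of this answer can be sketched with a small multiplier. This assumes unsigned values no larger than 40.000 stored as value * 1000; the module name scaled_mul and its ports are hypothetical. A product of two scaled values carries the scale factor twice, so one factor has to be divided out again.

```verilog
// Sketch only: scaled_mul and its port names are hypothetical.
// Values are assumed unsigned and at most 40.000, stored as value * SCALE.
module scaled_mul #(
    parameter SCALE = 1000
)(
    input  wire [15:0] a_scaled,  // real value a, stored as a * SCALE
    input  wire [15:0] b_scaled,  // real value b, stored as b * SCALE
    output wire [31:0] p_scaled   // a * b, stored as (a * b) * SCALE
);
    // The raw product is scaled by SCALE * SCALE.
    // Worst case 40000 * 40000 = 1.6e9, which fits in 32 bits.
    wire [31:0] raw = a_scaled * b_scaled;

    // Divide once by SCALE to return to the common scale (truncates).
    assign p_scaled = raw / SCALE;
endmodule
```

For example, 1.5 * 2.0 is represented as 1500 * 2000 = 3000000, and dividing by 1000 gives 3000, that is 3.000. Division by a constant is usually accepted by synthesis tools but can be costly; a power-of-two scale factor would turn it into a shift.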

Resources