Can't store output of SystemVerilog module in 2D array - arrays

I have a SystemVerilog module mFunc described as follows:
module mFunc #(parameter N = 1, parameter W=1)
(input logic signed [W:0] x[N],
output logic signed [W:0] Cx[N]);
always_comb
for(int k=0; k<N; k++)
Cx[k] = N-x[k];
endmodule: mFunc
and a module mFunc2 which calls mFunc:
module mFunc2 #(parameter N = 1, parameter W=1)
(input logic signed [W:0] x[N][N],
output logic signed [W:0] Cx[N][N]);
logic signed [W:0] x_rows[N];
logic signed [W:0] C_rows[N];
mFunc #(.N(N), .W(W)) mFunc_rows(.x(x_rows), .Cx(C_rows));
always_comb begin
for(int k=0; k<N; k++) begin
for(int j=0; j<N; j++)begin
x_rows[j] = x[k][j];
Cx[k][j] = C_rows[j];
end
end
end
endmodule: mFunc2
When I run the simulation, the module behaves as shown in Figure 1: the output (C_rows) of mFunc is not stored properly in Cx, which ends up holding only the last value of C_rows.
Please could anyone help me with this problem?
Figure 1: Statement of the problem
Here is the link of the simulation in EDA Playground
Thank you.

The reason you are seeing only the last row fill the entire matrix is how you've implemented the mFunc2 module. Its combinational block contains a for loop that, for every k, assigns x_rows from x[k] and Cx[k] from C_rows. The whole loop runs in zero simulation time, so x_rows ends up holding x[N-1]. Then the logic in the mFunc module converts that x_rows into C_rows for the x[N-1] row. Finally, your combinational logic in mFunc2 runs again, assigning all rows of Cx to that C_rows.
Thinking about it from a synthesis perspective, you are only ever instantiating a single mFunc module, which means the logic for translating a row from x to Cx only exists for a single row. As you have no clock or other synchronous mechanism, you cannot feed a single row into the module, get its result, and then feed in the next row as you are trying to do with the for loop. To have the design work combinationally, you must create enough resources in the design to complete the task. In this case, that means having N mFunc modules (as there are N rows), one for each row. Otherwise, you can introduce a clock and feed a single row into mFunc every cycle for N cycles.
The purely combinational path is relatively easy, as all you need to do to your mFunc2 module is instantiate more mFunc modules in it. You can do this using a generate loop, or you can make use of an array of instances, which works perfectly for your case:
module mFunc2 #(parameter N = 1, parameter W=1)
(input logic signed [W:0] x[N][N],
output logic signed [W:0] Cx[N][N]);
// Here, we make use of arrays of modules to make N instantiations of mFunc
// Note the [N] next to the mFunc_rows, meaning we are declaring an array
// of mFunc modules, ie N of them. These can be hooked up to the unpacked
// arrays x and Cx directly and will produce the desired output in Cx
mFunc #(.N(N), .W(W)) mFunc_rows[N](.x(x), .Cx(Cx));
endmodule: mFunc2
Quick side note: if you intend for W to be the bit width of the values in x and Cx, you need to declare them as logic signed [W-1:0], not logic signed [W:0], as the latter will produce variables of width W+1. For example, a byte would be [7:0], not [8:0], as it has 8 bits.
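The generate-loop alternative to the instance array can be sketched as follows. This is only a sketch under assumptions: mFunc is repeated from the question so the snippet stands alone, and the block label g_rows and instance name mFunc_row are illustrative names, not part of the original code.

```systemverilog
// mFunc as posted in the question, repeated so the sketch is self-contained
module mFunc #(parameter N = 1, parameter W = 1)
  (input  logic signed [W:0] x[N],
   output logic signed [W:0] Cx[N]);
  always_comb
    for (int k = 0; k < N; k++)
      Cx[k] = N - x[k];
endmodule: mFunc

module mFunc2 #(parameter N = 1, parameter W = 1)
  (input  logic signed [W:0] x[N][N],
   output logic signed [W:0] Cx[N][N]);
  // One mFunc instance per row; g_rows is an assumed label for the generate block
  for (genvar k = 0; k < N; k++) begin : g_rows
    // x[k] and Cx[k] each select one row, i.e. an unpacked array of N elements
    mFunc #(.N(N), .W(W)) mFunc_row (.x(x[k]), .Cx(Cx[k]));
  end
endmodule: mFunc2
```

Both forms should elaborate to N copies of the row logic. The generate loop is more verbose, but it leaves room to connect each row differently if that is ever needed.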

Related

How to assign the contents of two arrays in yasm and calculate the result based on the following equation

I'm a beginner to assembly language. Can you guys help me to guide the steps to complete this assignment?
The equation is : Sigma notation, for i = 0 to N-1 ((-3 +a(i)) +(b(i) -14))
here is the picture
The task in the main section is to explicitly follow the equation and iteratively add -3 to the indexed value in array a, subtract 14 from the indexed value in array b, and finally add these two parts and store the resulting value in memory location result.
Use a maximum of three general purpose registers in this lab. You are NOT allowed to change any values in the memory locations of a and b. Some of the opcodes of use in this lab are:
•mov - moving data from register-register, register-variable etc
•lea - loading effective address of a variable to a register.
•add - adding two values in registers or in variables.
•sub - subtract two values in registers or in variables
Upon completion of the task, zero out all the used registers and return.
I got the segment.data
segment .data
a dw -4, 22, 144 ; array of 3 values
b db -3, -16, 12 ; array of 3 values
result dq 0 ; memory to result
segment .text
global main
main:
Can you guys help me to guide the steps to complete this assignment?
First, I would write out a proper C version of that, so you can see the detail that has to be written out in programming languages, but hidden by the Sigma notation.  It will only be about 3 lines of code in total, and that will add clarity.
Next, fundamentally, the approach is to decompose the problem (i.e. that C code), into smaller piece parts, then solve each piece part, and recompose that into a total solution.
It is good to start by understanding the data first: here, the variables i, N, result, a, b, and possibly a temporary or two if you need to compute an intermediate value.  That's more than 3 items, so they cannot all go in registers.  However, global variables can be accessed without requiring registers (a, b, and result are global variables).
Next write out the code and solve each statement, and solve each expression in each statement.  Once you know where your variables are you can write instructions that work with them.
The idea is to first map the variables of the algorithm into physical storage of the processor: either CPU registers or memory.  Once you have that mapping, you can start to write assembly instructions, but the other way around is awkward (though you may have to iterate if you run out of registers mid way, for example).
For example, the basic for-loop can be translated to a while loop, then to an if-goto-label loop.
// Original for loop
for ( int i = 0; i < N; i++ )
    <loop-body>

// Equivalent while loop
int i = 0;
while ( i < N ) {
    <loop-body>
    i++;
}

// Equivalent if-goto-label loop
int i = 0;
loop1:
    if ( i >= N ) goto endLoop1;
    <loop-body>
    i++;
    goto loop1;
endLoop1:
The above if-goto-label form is pretty close to assembly, and has the control structure of the original for-loop.
Next, break down the loop-body into its individual piece parts, which will include array referencing, addition, subtraction, and summation.  So, figure out each of those and then place them together in context of the formula you're working.
Compose all of that into a solution and you'll have your program.  If you find you run out of registers, given a working limit of 3, you can take a step back and figure out something to put in memory instead.

Concatenation of two arrays with specific range in one array in SystemVerilog

I was trying to store two specific spans of an array inside another array, but I get an error.
What I want to do:
I have [8-1:0]A as module input, and I wanna store :
logic [8-1:0]temp = {A[4:7],A[0:3]};
but, when I simulate my module in test bench, I get an error in modelSim:
error: Range of part-select into 'A' is reversed.
Ways I tried:
Convert logic to wire,
Use assign
I think the idea is problematic.
example :
A = 8'b11000101 -> I want temp to be -> temp=8'b00111010
->explain:
A[0]=1,A[1]=0,A[2]=1,A[3]=0,A[4]=0,A[5]=0,A[6]=1,A[7]=1.
A[4..7]=4'b0011,A[0..3]=4'b1010
`timescale 1ns/1ns
module examp(input [7:0]A,output [7:0]O);
logic [7:0]temp = {A[4:7],A[0:3]};
// I wanna temp be 8'b00111010.
assign O = temp;
endmodule
`timescale 1ns/1ns
module examp_tb();
logic [7:0]aa=8'b11000101;
wire [7:0]ww;
examp MUX_TB(.A(aa),.O(ww));
initial begin
#200 aa=8'b01100111;
#200 $stop;
end
endmodule
Note : In the example above, I have a compile error, but in the main question, I have simulation error.
The streaming operator can be used to reverse a group of bits. So you could do:
logic [8-1:0]temp = { {<<{A[7:4]}} , {<<{A[3:0]}} };
The streaming operator also takes a slice argument, which is used to preserve a grouping of bits before performing the bit reversal. The problem with what you want is that you are trying to reverse the bits within the slice. You can accomplish this by nesting the streaming operators. This approach will be more useful when dealing with larger vectors:
logic [7:0] temp1 = {<<{A}}; // A[0:7]
logic [7:0] temp2 = {<<4{temp1}}; // A[4:7],A[0:3]
or in a single line
logic [7:0] temp = {<<4{ {<<{A}} }};
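To check the nested streaming approach against the value given in the question, here is a minimal self-checking sketch. The module names nibble_reverse and nibble_reverse_tb and the signal reversed are assumptions made for illustration. The sketch uses continuous assignments rather than declaration initializers, on the assumption that the output should follow A whenever it changes.

```systemverilog
module nibble_reverse (input logic [7:0] A, output logic [7:0] O);
  logic [7:0] reversed;
  // Step 1: reverse all 8 bits, giving A[0:7]
  assign reversed = {<<{A}};
  // Step 2: reverse the order of the 4-bit slices, giving A[4:7],A[0:3]
  assign O = {<<4{reversed}};
endmodule

module nibble_reverse_tb;
  logic [7:0] a;
  logic [7:0] o;
  nibble_reverse dut (.A(a), .O(o));
  initial begin
    a = 8'b11000101;  // value from the question
    #1 assert (o == 8'b00111010) else $error("unexpected result %b", o);
    a = 8'b01100111;  // second stimulus from the question's test bench
    #1 assert (o == 8'b01101110) else $error("unexpected result %b", o);
    $finish;
  end
endmodule
```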
One way to swap bits within a nibble is to use a function:
module examp (input [7:0] A, output [7:0] O);
assign O = {swap_nib(A[7:4]), swap_nib(A[3:0])};
function logic [3:0] swap_nib (logic [3:0] in);
swap_nib[3] = in[0];
swap_nib[2] = in[1];
swap_nib[1] = in[2];
swap_nib[0] = in[3];
endfunction
endmodule

Systemverilog localparam array with configurable size

I want to create and define a localparam array in SystemVerilog. The size of the array should be configurable, and the value of each localparam array cell calculated based on its location. Essentially this code:
localparam [7:0] [ADDR_BITS-1:0] ADDR_OFFSET = '{
7*PAGE_SIZE,
6*PAGE_SIZE,
5*PAGE_SIZE,
4*PAGE_SIZE,
3*PAGE_SIZE,
2*PAGE_SIZE,
1*PAGE_SIZE,
0
};
but where the first '7' is replaced with a parameter, and where the parameter initialization is extended to the generic case. So I need a way to loop from 0 to (N-1) and set ADDR_OFFSET(loop) = loop*PAGE_SIZE.
The "obvious" option in SystemVerilog would be generate, but I read that placing a parameter definition inside a generate block generates a new local parameter relative to the hierarchical scope within the generate block (source).
Any suggestions?
For background reference: I need to calculate an actual address based on a base address and a number. The calculation is simple:
real_address = base_address + number*PAGE_SIZE
However, I don't want to have the "*" in my code since I am afraid the synthesis tool will generate a multiplier that it will then try to simplify, since PAGE_SIZE is a constant value. I am guessing that this can lead to more logic than if I do all the calculations when generating the localparam array, since that will certainly not produce any multiplier in logic.
So with the above localparam definition, I perform the desired address calculation like this:
function [ADDR_BITS-1:0] addr_calc;
input [ADDR_BITS-1:0] base_addr;
input [NBITS-1:0] num;
addr_calc = base_addr + ADDR_OFFSET[num];
endfunction
I think perhaps I found a solution. Wouldn't I essentially accomplish the same by not defining a localparam array, but rather performing the address calculation inside a loop? Since systemverilog sees the loop variable as "constant" (when it comes to generating logic) that seems to accomplish the same? Like this (inside the function I wrote above):
for (int loop1 = 0; loop1 < MAXNUM ; loop1++) begin
if (num == loop1) begin
addr_offset = CSP_PAGE_SIZE*loop1;
end
addr_calc = base_addr + addr_offset;
end
You can set your localparam with the return value of a function.
localparam bit [7:0] [ADDR_BITS-1:0] ADDR_OFFSET = ADDR_CALC();
function bit [7:0] [ADDR_BITS-1:0] ADDR_CALC();
for(int ii=0;ii<$size(ADDR_CALC,1); ii++)
ADDR_CALC[ii] = ii * PAGE_SIZE;
endfunction
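The snippet above still hard-codes eight entries, so here is a minimal sketch of how the same idea might be extended to a configurable size. The names addr_offset_example, NUM_PAGES, offset_table_t and calc_offsets, as well as the default parameter values, are assumptions for illustration. It also assumes NUM_PAGES is at least 2, so that the index width from $clog2 is non-zero.

```systemverilog
module addr_offset_example #(
  parameter int NUM_PAGES = 8,    // assumed name for the configurable array size
  parameter int ADDR_BITS = 16,   // assumed default
  parameter int PAGE_SIZE = 256   // assumed default
) (
  input  logic [ADDR_BITS-1:0]         base_addr,
  input  logic [$clog2(NUM_PAGES)-1:0] num,
  output logic [ADDR_BITS-1:0]         real_addr
);
  // Packed table type whose first dimension follows the parameter
  typedef logic [NUM_PAGES-1:0][ADDR_BITS-1:0] offset_table_t;

  // Constant function: the loop and the multiplications are evaluated at
  // elaboration time, so only the resulting constants reach the netlist
  function offset_table_t calc_offsets();
    for (int ii = 0; ii < NUM_PAGES; ii++)
      calc_offsets[ii] = ii * PAGE_SIZE;
  endfunction

  localparam offset_table_t ADDR_OFFSET = calc_offsets();

  // Lookup into the constant table plus one adder; no "*" on signals
  assign real_addr = base_addr + ADDR_OFFSET[num];
endmodule
```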

Verilog Parallel Check and Assignment Across Dissimilar Sized Shift Registers

I'm looking to perform the cross-correlation* operation using an FPGA.
The specific part that I am currently struggling with is the multiplication piece. I want to multiply each 8-bit element of an nx8 shift register that uses excess or offset representation** against an nx1 shift register where I treat 0s as -1 for the purposes of multiplication.
Now if I was doing that for a single element, I might do something like this for the operation:
input [7:0] dataIn;
input refIn;
output [7:0] dataOut;
wire [7:0] dataOut;
wire [7:0] invertedData;
assign invertedData = 8'd0 - dataIn;
assign dataOut <= refIn ? dataIn : invertedData;
What I'm wondering is how do I scale this to 4, 8, n elements?
My first thought was to use a for loop like this:
for(loop=0; loop < n; loop = loop+1)
begin
assign invertedData[loop*8+7:loop*8] = 8'd0 - dataIn[loop*8+7:n*8];
assign dataOut[loop*8+7:loop*8] <= refIn[loop] ? dataIn[loop*8+7:loop*8] : invertedData[loop*8+7:loop*8];
end
This doesn't compile, but that's more or less the idea, and I can't seem to find the right syntax to do what I want.
* https://en.wikipedia.org/wiki/Cross-correlation
** http://www.cs.auckland.ac.nz/~patrice/210-2006/210%20LN04_2.pdf
for(loop=0; loop < n; loop = loop+1)
begin
assign invertedData[n*8+7:n*8] = 8'd0 - dataIn[n*8+7:n*8];
assign dataOut[n*8+7:n*8] <= refIn[n] ? dataIn[n*8+7:n*8] : invertedData[n*8+7:n*8];
end
There's a few issues with this, but I think you can make this work.
You can't have 'assign' statements in a procedural for loop. A procedural for loop is meant to be used inside an always block, so you need to change invertedData/dataOut from wire type to reg type, and remove the assign statements.
You generally can't have variable part-selects, unless you use the indexed part-select operator, whose width is constant (Verilog-2001 support required). That would look like this: dataIn[n*8 +:8], which means: select 8 bits starting from n*8.
I don't know about your algorithm, but it looks like loop/n are backwards in your statement. You should be incrementing n, not loop variable (or else all statements will be operating on the same part-select).
So considering those points I believe this should compile for you:
always @* begin
    for(n=0; n< max_loops ; n=n+1)
    begin
        // blocking assignments throughout, since this is combinational logic
        invertedData[n*8 +:8] = 8'd0 - dataIn[n*8 +:8];
        dataOut[n*8 +:8] = refIn[n] ? dataIn[n*8 +:8] : invertedData[n*8 +:8];
    end
end
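If you would rather keep wire types and assign statements, Verilog-2001 also offers a generate for loop over a genvar, inside which continuous assignments are allowed. A minimal sketch of that variant follows; the module name sign_apply, the parameter N and the block label g_elem are illustrative assumptions, not names from the question.

```systemverilog
module sign_apply #(parameter N = 4) (
  input  wire [N*8-1:0] dataIn,   // N packed 8-bit elements
  input  wire [N-1:0]   refIn,    // one reference bit per element
  output wire [N*8-1:0] dataOut
);
  wire [N*8-1:0] invertedData;
  genvar i;
  generate
    for (i = 0; i < N; i = i + 1) begin : g_elem
      // Indexed part-select: 8 bits starting at bit i*8
      assign invertedData[i*8 +: 8] = 8'd0 - dataIn[i*8 +: 8];
      // A reference bit of 0 is treated as -1, so the negated value is selected
      assign dataOut[i*8 +: 8] = refIn[i] ? dataIn[i*8 +: 8]
                                          : invertedData[i*8 +: 8];
    end
  endgenerate
endmodule
```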

Verilog, logic OR-ing an entire array

Suppose I have an array like this:
parameter n=100;
reg array[0:n-1];
How would one get the logic-OR value of each and every bit in the array?
The resulted circuit must be combinatorial.
This is a follow up question from this one.
(see discussion below the answer)
I don't know if this meets your design requirements, but you might have a much easier time with a hundred bit bus reg [n-1:0] array; than by using an array of 1 bit wires. Verilog does not have the greatest syntax to support arrays. If you had a bus instead you could just do assign result = |array;
If you must use an array, then I might consider first turning it into a bus with a generate loop, and then doing the same:
parameter n=100;
reg array[0:n-1];
wire [n-1:0] dummywire;
genvar i;
generate
for (i = 0; i < n; i = i+1) begin
assign dummywire[i] = array[i];
end
endgenerate
assign result = |dummywire;
I'm not aware of a more elegant way to do this on arrays.
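Another option, if SystemVerilog is available, is to accumulate the OR in a procedural loop inside a combinational block. The following is a minimal sketch under that assumption; the module name or_reduce and the port list are illustrative, and an unpacked array port like this one is not legal in plain Verilog.

```systemverilog
module or_reduce #(parameter n = 100) (
  input  logic array [0:n-1],  // unpacked array of 1-bit elements
  output logic result
);
  always_comb begin
    result = 1'b0;             // start from 0, the identity for OR
    for (int i = 0; i < n; i++)
      result |= array[i];      // fold each element into the running OR
  end
endmodule
```

The loop is unrolled at elaboration, so the result should still be purely combinational logic, as the question requires.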

Resources