I'm trying to implement a loop without using loop instructions in Verilog, so I made a counter module. The simulation went perfectly, but when I tried to implement it on the FPGA I got a lot of errors during mapping, like this one:
ERROR:MapLib:979 - LUT4 symbol
"Inst_Count/Mcompar_GND_1105_o_xcount[7]_LessThan_25_o_lut<0>" (output
signal=Inst_Count/Mcompar_GND_1105_o_xcount[7]_LessThan_25_o_lut<0>) has
input signal "Inst_Count/Madd_x[9]_GND_1105_o_add_0_OUT_cy<0>" which will be
trimmed. See Section 5 of the Map Report File for details about why the input
signal will become undriven.
These errors only occurred when I replaced the loop instruction module with this module, so does anyone know what the problem with this one is?
Thanks for giving this your time :)
module average( input rst , output reg [7:0]
reg [7:0] count;
reg [7:0] prv_count;
reg clk;
initial
begin
count = 8'd0;
end
always @ (posedge rst)
begin
clk = 1'b0;
end
always @ (clk)
begin
prv_count = count ;
count = prv_count + 1'b1;
end
always @ (count)
begin
if (count == 8'd255)
G_count= count;
else
begin
clk = ~clk;
G_count= count;
end
end
endmodule
Oh, this is just plain wrong. I don't really think anybody can help here without giving you a lecture on Verilog, but... some things that are noticeable right away are:
You have an obvious syntax error in your module port list: it is never closed (the closing ")" went missing).
The clock should be an input to your module. Even if you depend on the reset input only and use a register as a "clock", it won't work: the logic is wrong, and you have a combinational loop that must be broken.
Do not use an initial block in code that is meant to be synthesizable.
prv_count is useless.
There is no need to handle the overflow manually (the check for 255): 8'd255 is exactly 8'b11111111, and an 8-bit register wraps to 0 when you add 1'b1.
And there are tons of other things, which raise the obvious question: have you tried reading some books on Verilog, preferably ones covering the synthesizable part of the language? :) Anyhow, what you are trying to do (as far as I can understand) would probably look something like this:
module average(input clk, input rst, output reg [7:0] overflow_count);
reg [7:0] count;
always @(posedge clk or negedge rst) begin
if (~rst) begin
count <= 8'b0;
overflow_count <= 8'b0;
end else begin
count <= (count + 1'b1);
if (count == 8'hFF) // count wraps from 255 to 0 on this edge
overflow_count <= (overflow_count + 1'b1);
end
end
endmodule
Hope it helps. I really suggest you take a look at some good books on HDL.
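The wrap-around claim can be checked without a simulator. Below is a minimal behavioural sketch in Python (a model, not Verilog); the names `step`, `WIDTH` and `MASK` are made up for illustration, and the model assumes an overflow is counted on the clock edge where the counter wraps from 255 to 0.

```python
WIDTH = 8                  # assumed counter width, matching reg [7:0]
MASK = (1 << WIDTH) - 1    # 0xFF

def step(count, overflow_count):
    """One clock edge: return the next (count, overflow_count)."""
    next_count = (count + 1) & MASK          # 255 + 1 wraps to 0 on its own
    if count == MASK:                        # the wrap happens on this edge
        overflow_count = (overflow_count + 1) & MASK
    return next_count, overflow_count

count, overflows = 0, 0    # state right after reset
for _ in range(600):       # 600 edges: the counter wraps at edges 256 and 512
    count, overflows = step(count, overflows)
print(count, overflows)    # prints: 88 2
```

The masking with `MASK` plays the role of the fixed register width: no explicit comparison against 255 is needed to get back to 0.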
Related
I am trying to define a register file, 32 bits wide and 32 entries deep, in Verilog. How can I preset all the values to zero, or to any value I want, with or without a for loop?
Here's my code, I tried but failed:
module register_file(rna, rnb, qa, qb);
input [4:0]rna;
input [4:0]rnb;
output [31:0]qa;
output [31:0]qb;
genvar i;
reg [31:0]registers[0:31];
assign registers[0]=32'b0;
registers[1]=32'b0;
registers[2]=32'b0;
registers[3]=32'b0;
endmodule
The usual way to preset register values is with a clock and a reset signal. For example:
reg [31:0]registers[0:31];
integer i;
always @(posedge clk) begin
if (reset) begin
for (i = 0; i < 32; i = i + 1) // cover all 32 entries, indices 0..31
registers[i] <= 0;
end
else begin
// do some real work with registers here
end
end
In some cases you might want to do the initial setup in your testbench's initial block:
initial begin
for (i = 0; ...) registers[i]= 0;
end
The above is not usually synthesizable.
There are a few other ways available in SystemVerilog.
I have an application where I'm continuously writing to a block ram at a slow clock speed (clk_a) and within this slow clock cycle need to read three indexes from the block ram at a fast clock speed (clk_b) to use these values as operands in a math module, the result being written back to the block ram on the next slow clock. These three indexes are the current address written to at posedge of the slow clock, plus the two immediate neighbouring addresses (addr_a -1 and addr_a +1).
What is an efficient way to synthesize this? My best attempt to date uses a small counter (triplet) running at fast clock rate that increments the addresses but I end up running out of logic as it looks like Yosys does not infer the ram properly. What is a good strategy for this?
here is what I have:
module myRam2 (
input clk_a,
input clk_b,
input we_a,
input re_a,
input [10:0] addr_a,
input [10:0] addr_b,
input [11:0] din_a,
output [11:0] leftNeighbor,
output [11:0] currentX,
output [11:0] rightNeighbor
);
parameter MEM_INIT_FILE2 = "";
initial
if (MEM_INIT_FILE2 != "")
$readmemh(MEM_INIT_FILE2, ram2);
reg [11:0] ram2 [0:2047];
reg [1:0] triplet = 3;
reg [10:0] old_addr_a;
reg [11:0] temp;
always @(posedge clk_a) begin
ram2[addr_a] <= din_a;
end
always @(posedge clk_b)
if (old_addr_a != addr_a) begin
triplet <= 0;
old_addr_a <= addr_a;
end
else
if(triplet < 3) begin
triplet <= triplet +1;
end
always @(posedge clk_b) begin
temp <= ram2[addr_a + (triplet - 1)];
end
always @(posedge clk_b) begin
case(triplet)
0: leftN <= temp;
1: X <= temp;
2: rightN <= temp;
endcase
end
reg signed [11:0] leftN;
reg signed [11:0] X;
reg signed [11:0] rightN;
assign leftNeighbor = leftN;
assign currentX = X;
assign rightNeighbor = rightN;
endmodule
Regarding efficiency, the following approach should work, and it removes the need for a faster clock:
module myRam2 (
input wire clk,
input wire we,
input wire re,
input wire [10:0] addr_a,
input wire [10:0] addr_b,
input wire [11:0] din_a,
output reg [11:0] leftNeighbor,
output reg [11:0] currentX,
output reg [11:0] rightNeighbor
);
reg [11:0] ram2 [2047:0] /* synthesis syn_ramstyle = "no_rw_check" */;
always @(posedge clk) begin
if(we) ram2[addr_a] <= din_a;
if(re) {leftNeighbor,currentX,rightNeighbor} <= {ram2[addr_b-1],ram2[addr_b],ram2[addr_b+1]};
end
endmodule
The synthesis attribute has helped me in the past to increase the likelihood of the RAM being inferred correctly.
EDIT: removed the second example, which suggested a 1D mapping. It turned out that at least Lattice LSE cannot deal with that approach. However, the first code snippet should work according to Active-HDL and Lattice LSE.
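To make the intended read behaviour concrete, here is a small Python reference model of the three-word read. It is only a sketch: `read_neighbours`, `DEPTH` and `ADDR_MASK` are hypothetical names, and the model assumes the neighbour addresses are computed in 11 bits so that they wrap at the ends of the memory. Whether the Verilog above wraps the same way depends on how the index expressions `addr_b-1` and `addr_b+1` are sized by the tool, so treat the wrap as an assumption to check in simulation.

```python
DEPTH = 2048               # matches reg [11:0] ram2 [2047:0]
ADDR_MASK = DEPTH - 1      # 11-bit address space

def read_neighbours(ram, addr_b):
    """Return (left, current, right) as one registered read would capture them."""
    left = ram[(addr_b - 1) & ADDR_MASK]     # address 0 wraps to 2047 (assumed)
    right = ram[(addr_b + 1) & ADDR_MASK]    # address 2047 wraps to 0 (assumed)
    return left, ram[addr_b], right

ram = list(range(DEPTH))   # hypothetical contents: each word holds its own address
print(read_neighbours(ram, 0))      # prints: (2047, 0, 1)
print(read_neighbours(ram, 2047))   # prints: (2046, 2047, 0)
```

Comparing the boundary cases of this model against a simulation of the module is a quick way to see whether the edges of the address range need special handling.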
In C, given an array x[0], x[1], ..., x[127] and a number k in [0, 127), we define the left shift operation as y[n] = x[(n+k)%128], for n = 0, 1, 2, ..., 127.
Now I am trying to implement this in an FPGA. Since there are many operations of this type, I would like to get the result as fast as possible.
I did this as follows,
module LEFT_SHIFT(
input clk,
input rst,
input [31:0] data_in[0:127],
input [6:0] shift,
output reg [31:0] data_out[0:127]
);
integer i;
always @ (posedge clk)
begin
if (rst)
for (i=0;i<128;i++)
data_out[i] <= 32'b0;
else
for (i=0;i<128;i++)
data_out[(i+shift)%128] <= data_in[i];
end
endmodule
Is this code fine in terms of speed, resources and timing? It looks like a RAM, but a RAM doesn't output all of its memory at the same time.
Many thanks,
Jerry
If you replace the Mod operator (%) with a replication of the input data to make the circular shift you could make the task easier for the compiler. I tried this on the synthesis tool from a major ASIC tool vendor and the results were quite different.
if (rst)
for (integer i=0;i<128;i++)
data_out[i] <= 32'b0;
else begin
logic [31:0] tmp [0:255];
for (integer i=0;i<128;i++) begin
// replicate input data
tmp[i] = data_in[i];
tmp[i+128] = data_in[i];
end
for (integer i=0;i<128;i++)
data_out[i] <= tmp[128-shift+i];
end
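The index arithmetic of the two forms can be compared in software before spending time on synthesis. The following Python sketch is an illustration only (`shift_modulo` and `shift_replicated` are hypothetical names); it checks that the replicated-buffer read produces the same result as the modulo write from the question's code for every shift value.

```python
N = 128

def shift_modulo(data_in, shift):
    """Index form of the question's code: data_out[(i + shift) % N] = data_in[i]."""
    data_out = [None] * N
    for i in range(N):
        data_out[(i + shift) % N] = data_in[i]
    return data_out

def shift_replicated(data_in, shift):
    """Index form of the answer: copy the input twice, then read a window of N entries."""
    tmp = data_in + data_in                  # tmp[i] and tmp[i + N] both hold data_in[i]
    return [tmp[N - shift + i] for i in range(N)]

data = list(range(N))
assert all(shift_modulo(data, k) == shift_replicated(data, k) for k in range(N))
print(shift_replicated(data, 3)[:5])         # prints: [125, 126, 127, 0, 1]
```

Both forms compute data_out[n] = data_in[(n - shift) mod 128]. The formula in the question text, y[n] = x[(n+k)%128], rotates in the opposite direction, which in the replicated form would correspond to reading tmp[shift + i].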
That's a huge mux that will consume a lot of logic resources in the FPGA. I've seen things like that crash the tools before. You may want to consider adding more than just one register in there.
As far as speed, resource, and timing goes, it depends on how fast you want it to run and how many free resources you have. It could be fine at low speed on a big FPGA or impossible at higher speeds or small/full FPGA. But there's no need to speculate about resource and timing, just build it and see what happens.
I'm using SystemVerilog for synthesis. I fought with the fact that arrays of interfaces are not really arrays in SystemVerilog and the index has to be a constant value, but got over it using a lot of boilerplate generate for and assign statements to overcome what is really a language limitation (if I can emulate the effect using more code, the language could just do The Right Thing(tm) itself).
For the following pseudo-code, I leave out much of what's there in the real code (modports, tasks, etc) for clarity.
I have an interface:
interface i_triplex();
logic a; // input wire
logic b; // output wire
logic [127:0] data; // output wires
endinterface
And I am passing an array of these interfaces to a module which looks like
module roundrobin_triplex#(
parameter int NUNITS = 8
)
(
input wire clk,
input wire rst,
i_triplex t[NUNITS]
);
always_ff @(posedge clk) begin
if (rst) begin
// insert code to initialize the "data" member
// in all interface instances in the array.
end
else begin
// ...
end
end
endmodule
What would be your preferred way to use all interface instances in the array uniformly, regardless of the value of NUNITS? I have some suggestions, but I am eager to learn what other engineers can come up with.
Suggestion 1:
Use VHDL.
Suggestion 2:
Scrap the interface and do it oldschool Verilog-style, as in
module roundrobin_triplex#(
parameter int NUNITS = 8
)
(
input wire clk,
input wire rst,
// This was once a proud i_triplex array
input wire i_triplex_a[NUNITS],
output logic i_triplex_b[NUNITS],
output logic [127:0] i_triplex_data[NUNITS]
);
always_ff @(posedge clk) begin
if (rst) begin
for (int i = 0; i < NUNITS; i++)
i_triplex_data[i] <= '1;
end
else begin
// ...
end
end
endmodule
Suggestion 3:
Use a struct for input wires and a struct for output wires instead of the interface.
Suggestion 4:
Use a preprocessor-like system that unrolls generate for loops inside processes (what the language should do anyway!), so the resulting code looks like (preprocessed with NUNITS=4):
module roundrobin_triplex#(
parameter int NUNITS = 8
)
(
input wire clk,
input wire rst,
i_triplex t[NUNITS]
);
always_ff @(posedge clk) begin
if (rst) begin
t[0].data <= '1;
t[1].data <= '1;
t[2].data <= '1;
t[3].data <= '1;
end
else begin
// ...
end
end
endmodule
Suggestion 5:
Use the generate for / assign solution:
module roundrobin_triplex#(
parameter int NUNITS = 8
)
(
input wire clk,
input wire rst,
i_triplex t[NUNITS]
);
wire i_triplex_a[NUNITS];
wire i_triplex_b[NUNITS];
logic [127:0] i_triplex_data[NUNITS]; // assigned in always_ff, so it cannot be a wire
generate
genvar i;
// The wires of the interface which are to be driven
// from this module are assigned here.
for (i = 0; i < NUNITS; i++) begin
assign t[i].b = i_triplex_b[i];
assign t[i].data = i_triplex_data[i];
end
endgenerate
always_ff @(posedge clk) begin
if (rst) begin
for (int i = 0; i < NUNITS; i++)
i_triplex_data[i] <= '1;
end
else begin
// ...
end
end
endmodule
Arrays of module or interface instances cannot be treated as regular arrays because parameterization, generate blocks, and defparam statements can make elements of the array instance non-unique. That cannot happen with arrays of variables/wires.
My suggestion would be a modification of your suggestion 2; put arrays of variables/wires inside a single interface instance.
How about suggestion #6: use a parameterized interface:
interface I #(NPORTS = 8);
logic clk;
logic a[NPORTS];
logic b[NPORTS];
logic [127:0] data [NPORTS];
endinterface //
module roundrobin#(NUMPORTS = 8) (I t);
logic [127:0] data[NUMPORTS];
always_ff @(posedge t.clk) begin
data <= t.data;
end
endmodule // roundrobin
Note that you do not need the loop in SystemVerilog. You can use array assignments:
data <= t.data;
and for the sake of convenience you can add functions or statements to the interface itself, i.e.
interface I #(NPORTS = 8);
logic clk;
logic a[NPORTS];
logic b[NPORTS];
logic [127:0] data [NPORTS];
function logic [127:0] getData(int n);
return data[n];
endfunction // getData
endinterface // I
and use
data[i] <= t.getData(i);
Sorry, the above example is probably not very useful, but it might give you an idea.
Suggestion 1: VHDL may be a practical option. However, it seems to be becoming marginal in industry.
Suggestion 2: In my opinion, interfaces are relevant if you intend to reuse them widely and implement verification/protocols in them. If you can unpack your interface like this while keeping your sanity, then I do not see a justification for an interface in the first place.
Suggestion 3: I have never tried to synthesise a struct; it may be a good idea.
Suggestion 4: A simple solution, although quite verbose.
Suggestion 5: I used something similar for one of my projects. However, I wrote a sort of adapter module to hide the assigns.
Actually, when I need something like that, I try to write an atomic module which operates on a fixed number of interfaces. Then, I use for generate structures to generalize it on an interface array.
Maybe I'm missing something, but why not put all your code from roundrobin_triplex into a generate loop (you do not need the extra assigns)?
interface i_triplex();
logic a; // input wire
logic b; // output wire
logic [127:0] data; // output wires
initial $fmonitor(1,data);
endinterface
module roundrobin_triplex#(parameter int NUNITS = 8)
(
input wire clk,
input wire rst,
i_triplex t[NUNITS]
);
genvar i;
for( i=0; i<NUNITS; i++)begin
always_ff @(posedge clk) begin
if (rst) begin
t[i].data <=0;
end
else begin
t[i].data <=t[i].data+1;
end
end
end
endmodule
module top;
parameter P=4;
bit clk,rst;
i_triplex ii[P]();
roundrobin_triplex #(P)uut(clk,rst,ii);
initial begin
rst=1;
repeat(2) @(posedge clk);
rst=0;
repeat(10) @(posedge clk);
$finish;
end
always #5 clk =~clk;
endmodule
I'm (still) working on an NCO and I have problems with the address select block. My teacher wants the samples in a ROM block (already done), but the addressing doesn't seem to work. What I need is a modulo-200 accumulator with a variable step. I adapted this code from a sample where somebody used i as a counter to pick a value from an array of samples, BUT I simply need to copy i to the output port.
Something with the PWM wasn't working: it skipped not ten but about 80 samples, so I decided to check the addressing. I was mighty surprised when I noticed that the address changes INDEPENDENTLY of the clock signal. ( http://i.imgur.com/XL9l8mj.jpg )
Here's the code:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL; --try to use this library as much as possible.
entity adress_select_200 is
port (clk :in std_logic;
step :in integer range 0 to 200;
adress : out integer range 0 to 199
);
end adress_select_200;
architecture Behavioral of adress_select_200 is
signal i : integer range 0 to 399:=0;
begin
process(clk)
begin
--to check the rising edge of the clock signal
if(rising_edge(clk)) then
adress <= i;
i <= i+step;
if ((i + step) > 199) then
i <= (i + step) - 200;
else
i <= i + step;
end if;
end if;
end process;
end Behavioral;
I'm not so great with VHDL, but I suppose the whole loop should ONLY execute on the clk rising edge, right? Meanwhile it's doing that weird sh... in the middle of the cycle, no idea why.
How do I stop that from happening?
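For comparison against the simulation waveform, the behaviour the process describes can be written as a small Python reference model. This is a sketch under assumptions: `accumulate` and `MODULUS` are hypothetical names, the step input is assumed to be stable at every rising edge, and the output is assumed to take the value of i from before the edge (as the signal assignment adress <= i does).

```python
MODULUS = 200

def accumulate(steps, start=0):
    """Return the address seen after each rising edge for a sequence of step values."""
    i = start
    addresses = []
    for step in steps:
        addresses.append(i)        # adress <= i uses the value of i before this edge
        i = i + step
        if i > MODULUS - 1:        # same test as (i + step) > 199
            i -= MODULUS
    return addresses

print(accumulate([10] * 5))        # prints: [0, 10, 20, 30, 40]
print(accumulate([90] * 5))        # prints: [0, 90, 180, 70, 160]
```

If the simulated address matches this sequence at the rising edges but also changes between them, the extra transitions are coming from somewhere other than the accumulator arithmetic itself.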