Difference between RTL and gate-level simulations - Flipflop with async set and async reset
This article is inspired by a paper presented by Clifford E. Cummings and Don Mills at SNUG San Jose 2002: http://www.sunburst-design.com/papers/CummingsSNUG2002SJ_Resets.pdf. The example is taken from section 4.2 of that paper.
Once in a while we receive questions from customers regarding the difference between RTL (behavioral) simulation and gate-level simulation. Many times the issue is traced to a flipflop in the design.
For a flipflop with both asynchronous set and asynchronous reset, the behavioral description is usually written as:
    always @(posedge clk or negedge rst_n or negedge set_n)
        if (!rst_n)      q <= 0;  // asynchronous reset
        else if (!set_n) q <= 1;  // asynchronous set
        else             q <= d;
This description does not behave as one would expect in this sequence:
- rst_n=0 and set_n=0 => q=0 (rst_n has higher priority)
- rst_n=1 and set_n=0 => one would expect q to change to 1 immediately, but according to the Verilog description q does not go to 1 until the next "posedge clk."
The root cause is that the sensitivity list contains only "posedge clk," "negedge rst_n," and "negedge set_n." When rst_n goes from 0 to 1, that rising edge triggers nothing, so the "always" block is not executed and q remains unchanged.
For the following testcase:
    module dff3_aras (q, d, clk, rst_n, set_n);
        output q;
        input  d, clk, rst_n, set_n;
        reg    q;

        always @(posedge clk or negedge rst_n or negedge set_n)
            if (!rst_n)      q <= 0;  // asynchronous reset
            else if (!set_n) q <= 1;  // asynchronous set
            else             q <= d;
    endmodule
RTL simulation result:
     0 d=0 clk=0 rst_n=x set_n=x d=0 q=x
     5 d=0 clk=1 rst_n=x set_n=x d=0 q=0
    10 d=0 clk=0 rst_n=0 set_n=1 d=0 q=0
    15 d=0 clk=1 rst_n=0 set_n=1 d=0 q=0
    20 d=0 clk=0 rst_n=1 set_n=1 d=0 q=0
    25 d=0 clk=1 rst_n=1 set_n=1 d=0 q=0
    30 d=0 clk=0 rst_n=1 set_n=0 d=0 q=1
    35 d=0 clk=1 rst_n=1 set_n=0 d=0 q=1
    40 d=0 clk=0 rst_n=0 set_n=0 d=0 q=0
    45 d=0 clk=1 rst_n=0 set_n=0 d=0 q=0
    50 d=0 clk=0 rst_n=1 set_n=0 d=0 q=0
    55 d=0 clk=1 rst_n=1 set_n=0 d=0 q=1
    60 d=0 clk=0 rst_n=1 set_n=0 d=0 q=1
At time 50, rst_n=1 and set_n=0, but q remains at 0, and changes to 1 at time 55 when there is a rising edge of clk.
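The testbench that produced this trace is not shown here. As a minimal sketch only, a testbench along the following lines should drive the same sequence of rst_n and set_n values. The module name tb_dff3_aras, the instance name dut, the 10-unit clock period, and the $monitor format string are all assumptions chosen to resemble the trace, not the original setup.

```verilog
// Hypothetical testbench sketch; names and timing are assumptions.
module tb_dff3_aras;
  reg  d, clk, rst_n, set_n;
  wire q;

  // Device under test: the flipflop module from the testcase
  dff3_aras dut (.q(q), .d(d), .clk(clk), .rst_n(rst_n), .set_n(set_n));

  // Clock toggles every 5 time units, rising at 5, 15, 25, ...
  initial clk = 0;
  always #5 clk = ~clk;

  // rst_n and set_n are left undriven (x) until time 10.
  // All changes land on falling clock edges to avoid races with posedge clk.
  initial begin
    d = 0;
    #10 rst_n = 0; set_n = 1;  // time 10: reset asserted
    #10 rst_n = 1;             // time 20: reset released
    #10 set_n = 0;             // time 30: set asserted
    #10 rst_n = 0;             // time 40: reset and set both asserted
    #10 rst_n = 1;             // time 50: reset released, set still asserted
    #11 $finish;               // time 61: stop after the time-60 sample prints
  end

  // Print one line whenever any monitored signal changes
  initial
    $monitor("%0d d=%b clk=%b rst_n=%b set_n=%b q=%b",
             $time, d, clk, rst_n, set_n, q);
endmodule
```

The point of interest is the step at time 50: rst_n rises while set_n stays low, which is exactly the transition the sensitivity list of the flipflop does not cover.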
Note that this problem is with RTL simulation; it has nothing to do with synthesis. In fact, the gate-level simulation works as expected thanks to the logic determining the priority between set and reset inputs to the flipflop.
Output netlist from Verific's RTL elaboration:
    //
    // Verific Verilog Description of module dff3_aras
    //

    module dff3_aras (q, d, clk, rst_n, set_n);   // dff3_aras(8)
        output q;      // dff3_aras(9)
        input d;       // dff3_aras(10)
        input clk;     // dff3_aras(10)
        input rst_n;   // dff3_aras(10)
        input set_n;   // dff3_aras(10)

        wire n8, n12, n16;

        not (n8, set_n) ;    // dff3_aras(15)
        not (n12, rst_n) ;   // dff3_aras(16)
        and (n16, n8, rst_n) ;
        VERIFIC_DFFRS i9 (.d(d), .clk(clk), .s(n16), .r(n12), .q(q));   // dff3_aras(16)

    endmodule

    //
    // Verific Verilog Description of PRIMITIVE VERIFIC_DFFRS
    //

    module VERIFIC_DFFRS (d, clk, s, r, q);
        input d;
        input clk;
        input s;
        input r;
        output q;
        reg q ;

        always @(posedge clk or posedge s or posedge r) begin
            if (s)
                q <= 1'b1;
            else if (r)
                q <= 1'b0;
            else
                q <= d;
        end

    endmodule
Gate-level simulation result:
     0 d=0 clk=0 rst_n=x set_n=x d=0 q=x
     5 d=0 clk=1 rst_n=x set_n=x d=0 q=0
    10 d=0 clk=0 rst_n=0 set_n=1 d=0 q=0
    15 d=0 clk=1 rst_n=0 set_n=1 d=0 q=0
    20 d=0 clk=0 rst_n=1 set_n=1 d=0 q=0
    25 d=0 clk=1 rst_n=1 set_n=1 d=0 q=0
    30 d=0 clk=0 rst_n=1 set_n=0 d=0 q=1
    35 d=0 clk=1 rst_n=1 set_n=0 d=0 q=1
    40 d=0 clk=0 rst_n=0 set_n=0 d=0 q=0
    45 d=0 clk=1 rst_n=0 set_n=0 d=0 q=0
    50 d=0 clk=0 rst_n=1 set_n=0 d=0 q=1
    55 d=0 clk=1 rst_n=1 set_n=0 d=0 q=1
    60 d=0 clk=0 rst_n=1 set_n=0 d=0 q=1
At time 50, when rst_n goes to 1, q also goes to 1. This is as expected.
To correct this issue, the simulator needs some help from the designer. The RTL code is modified to force q to 1 when rst_n is 1 and set_n is 0:
    // Good DFF with asynchronous set and reset and
    // self-correcting set-reset assignment
    module dff3_aras (q, d, clk, rst_n, set_n);
        output q;
        input  d, clk, rst_n, set_n;
        reg    q;

        always @(posedge clk or negedge rst_n or negedge set_n)
            if (!rst_n)      q <= 0;  // asynchronous reset
            else if (!set_n) q <= 1;  // asynchronous set
            else             q <= d;

        // synopsys translate_off
        always @(rst_n or set_n)
            if (rst_n && !set_n) force q = 1;
            else                 release q;
        // synopsys translate_on
    endmodule
The "extra" code is ignored by synthesis tools due to the pragmas translate_off/on.
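Some flows prefer a preprocessor guard over comment pragmas. The variant below is a sketch of that alternative, assuming the synthesis tool in use predefines a macro named SYNTHESIS (a convention described in the IEEE 1364.1 synthesis standard, but worth confirming in the tool's documentation). The flipflop description itself is unchanged; only the way the simulation-only block is hidden from synthesis differs.

```verilog
// Sketch: same self-correcting flipflop, guarded by a macro instead of pragmas.
// Assumption: the synthesis tool defines SYNTHESIS; simulators do not.
module dff3_aras (q, d, clk, rst_n, set_n);
  output q;
  input  d, clk, rst_n, set_n;
  reg    q;

  always @(posedge clk or negedge rst_n or negedge set_n)
    if (!rst_n)      q <= 0;  // asynchronous reset
    else if (!set_n) q <= 1;  // asynchronous set
    else             q <= d;

`ifndef SYNTHESIS
  // Simulation-only correction: hold q at 1 while only set_n is asserted
  always @(rst_n or set_n)
    if (rst_n && !set_n) force q = 1;
    else                 release q;
`endif
endmodule
```

If the macro is not defined by the synthesis tool, the guarded block would be seen by synthesis, so the pragma form above remains the safer default when in doubt.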
Now the result of RTL simulation matches that of gate-level simulation:
     0 d=0 clk=0 rst_n=x set_n=x d=0 q=x
     5 d=0 clk=1 rst_n=x set_n=x d=0 q=0
    10 d=0 clk=0 rst_n=0 set_n=1 d=0 q=0
    15 d=0 clk=1 rst_n=0 set_n=1 d=0 q=0
    20 d=0 clk=0 rst_n=1 set_n=1 d=0 q=0
    25 d=0 clk=1 rst_n=1 set_n=1 d=0 q=0
    30 d=0 clk=0 rst_n=1 set_n=0 d=0 q=1
    35 d=0 clk=1 rst_n=1 set_n=0 d=0 q=1
    40 d=0 clk=0 rst_n=0 set_n=0 d=0 q=0
    45 d=0 clk=1 rst_n=0 set_n=0 d=0 q=0
    50 d=0 clk=0 rst_n=1 set_n=0 d=0 q=1
    55 d=0 clk=1 rst_n=1 set_n=0 d=0 q=1
    60 d=0 clk=0 rst_n=1 set_n=0 d=0 q=1
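Rather than comparing traces by eye, the expected priority behavior can also be checked automatically. The module below is a hypothetical simulation-only checker, not part of the original example: the name dff3_aras_check, the message text, and the one-time-unit settling delay are all assumptions. It reports whenever q disagrees with the asynchronous inputs after they have had time to settle.

```verilog
// Hypothetical simulation-only checker; not synthesizable, not from the paper.
module dff3_aras_check (q, rst_n, set_n);
  input q, rst_n, set_n;

  always @(rst_n or set_n or q) begin
    #1;  // assumed settling delay; must be shorter than the stimulus spacing
    // Reset has priority: q must be 0 whenever rst_n is low
    if (rst_n === 1'b0 && q !== 1'b0)
      $display("%0t: q=%b while rst_n is asserted", $time, q);
    // With reset released and set asserted, q must be 1
    if (rst_n === 1'b1 && set_n === 1'b0 && q !== 1'b1)
      $display("%0t: q=%b while only set_n is asserted", $time, q);
  end
endmodule
```

Instantiated next to the flipflop in a testbench, this checker would be expected to flag the original RTL description shortly after rst_n is released at time 50, and to stay silent for both the gate-level netlist and the corrected RTL.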