Anyways, here are some notes and good interview
questions for you to enjoy...
                             System Verilog

Constrained Random Stimulus with weights.
System Verilog Assertions and covergroup/coverpoint Functional Coverage.
Object Oriented - Classes and inheritance (extends/implements).
Improved concurrency - fork with join, join_any, or join_none.
Inter-process communication - mailbox, resource management - semaphores.
Named events and process control mechanisms.

Program block begins execution and separates the test bench from the design.
Virtual Interfaces with actual interfaces dynamically connected at run time.
Clocking blocks to encapsulate synchronous timing details; the default skews
are input #1step and output #0.
modports to group interface specific directional signals.
C like (dynamic) data types - typedef struct union enum, dynamic/assoc arrays
mailboxes, semaphores
DPI, VPI Extensions to interface to programming languages such as C.

:::: SV2012 ::::

base_trans t_base = ext_trans::new(); // typed new constructors
initial rstN <= 0; // Nonblocking assignment to class properties
(*)class Fifo implements Put, Get; // Multiple inheritance for intf classes
constraint sz {soft pktsz inside {32,64};} p.randomize with { pktsz == 128 }
 // randomize with takes precedence rather than a run-time error
constraint c1 { unique {a,b,c}; } ... c2 { unique {data_array}; } // unique
decoder_1 = dec_class #(4)::dec_func(...) // parameterized static func calls
let OK(..., untyped a) = ... // allow untyped let args in any position
for(var type({a}) i; ...) // for loops with var type()
task automatic put_data(input val, ref d[$]) // ref arg with dynamic array
$countbits(data,'x,'z) // like $countones counts bits set to specified value
`begin_keywords 1800-2012 directive for newly reserved words
 such as -- implements interconnect nettype soft
(*)user-def net types with multi-driver logic custom resolution functions
generic net // infers its type from lower-level connections
coverpoint labels are variables that can be used in expressions
coverage expressions can call functions for shared code
bins mod16[] = { [0:255] } with (item%16 == 0) // bin exclusion
assertions and sampling functions can test
 real, dyn arrays-strings/queues, static class properties
global clocking @(..) // Each hierarchy scope can have its own global clock
Sequences can infer a clock in contexts rather than require a property
let r = s1.triggered; // seq methods can be used with a seq expression
A0: assert #0 (!$isunknown(state)) // reactive - deferred immediate assertion
A1: assert final (!$isunknown(state)) // postponed - final assertion
$assertcontrol // fine level of control granularity
checkers can have output arguments
checkers support numerous additional constructs and now support -
 always_comb/latch/ff, blocking/conditional/looping statements
 immediate assertions, task calls, let declarations, continuous assignments
VPI support for 2012 constructs -
 soft constraints, built-in process class
 transition to typespecs added to named events,
 join type property added to the Scope diagram

:::: SV2012 ::::

                             * Data Types *

V95- reg[msb:lsb], integer, real, time, wire (net allows multiple drivers)

Variables can be initialized and have default values, 2/4-state default 0/X.
Class handles are initialized to null and strings to an empty string "".

logic [signed][msb:lsb] is synonymous with reg and can not be driven by
multiple drivers.  Most instances of reg or wire can be replaced with logic.
Note that only net types such as wire support multiple drivers.
Use if($isunknown(port)) to detect X/Z before assigning to 2-state vars.
X and Z values are otherwise silently mapped to 0.
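
A minimal sketch of the X/Z check described above. The module and signal
names (xz_check_sketch, port4, val2) are hypothetical and only illustrate
the idea:

```systemverilog
module xz_check_sketch;
  logic [3:0] port4;   // 4-state source (hypothetical name)
  bit   [3:0] val2;    // 2-state destination (hypothetical name)

  initial begin
    port4 = 4'b1x0z;
    // Report unknown bits before they are lost in the 2-state copy
    if ($isunknown(port4))
      $display("port4 contains X/Z: %b", port4);
    val2 = port4;      // X and Z bits are converted to 0 here
    $display("val2 = %b", val2);
  end
endmodule
```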

2-state: bit [signed][msb:lsb] variable [=initial_value] // default unsigned
8-bit byte, 16-bit shortint, 32-bit int, 64-bit longint // default signed
 Add keyword unsigned to specify as unsigned.

$signed and $unsigned can be used to cast values.
Vector part selects and based literals are unsigned unless cast or explicitly
signed such as 32'sheed.  Implicit decimals such as 123 are signed.

shortreal - Equivalent to 32-bit C float type, real is 64-bits like double.
string variable [=initial_value] supports == != compare icompare (ign case)
Also supports: < <= > >= {concat}
len() getc() putc() substr()
{i,hex,oct,bin,real}toa() ato{i,hex,oct,bin,real}() toupper() tolower()
Returned by the format function $sformatf(...); $psprintf is a non-standard
vendor equivalent and $sformat(str,...) is the task form.

Variables may be declared automatic so a fresh copy is re-initialized every
time a block is entered or static to create a single copy which is only
created and initialized once and retain their values.  Variables in
automatic tasks and functions or instantiated objects default to automatic.

$time/$stime return a 64/32-bit integer time scaled to the invoking module.
$realtime  returns the time in a real format.
$printtimescale displays the time unit and precision.
$timeformat[(units,prec,sfx,width)] specifies how %t formats the time.

Conversion functions:
integer $rtoi(real), real $itor(int)
[31/63:0] $[short]realtobits([short]real), [short]real $bitsto[short]real(bits)

typedef <type [msb:lsb]> name_t; name_t var, ... // avoids anonymous decl

[typedef] enum [type] {IDLE[=val],TEST,START} state_t; // Def int; x/z if 4-state
Declaration: state_t current_state; current_state.name() is symbolic string
Must cast assignments: current_state = state_t'(2)
Can be used directly in an integer expression.
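
A short sketch of the enum rules above, assuming a 2-bit base type; the
module name and the variable nxt are hypothetical:

```systemverilog
module enum_sketch;
  typedef enum logic [1:0] {IDLE = 2'b00, TEST, START} state_t;
  state_t current_state, nxt;

  initial begin
    current_state = state_t'(2);           // static cast of an integer
    $display("%s = %0d", current_state.name(), current_state);
    if (!$cast(current_state, 3))          // dynamic cast: 3 is not a member
      $display("3 is not a legal state_t value");
    current_state = current_state.first(); // first declared member
    nxt = current_state.next();            // member declared after it
    $display("%s is followed by %s", current_state.name(), nxt.name());
    $display("integer use: %0d", current_state + 1);
  end
endmodule
```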

Structures can consist of any types and may also be typedefined and packed.
typedef struct [packed] { ... } mystruct_t;
A packed structure is useful for subdividing a vector into subfields
and can be used as a whole for arithmetic and logical operations.
A packed structure can be accessed as a single vector [n:0].
A structure is treated as 4-state if it contains *any* 4-state fields.
Structures can be used as port nets and contain resolution functions.
Individual fields are accessed as elements: mystruct.field
A structure can be set using an "assignment pattern:"
 extstruct = '{field1:12, intstruct:'{fld1:0, fld2:1}, field2:21}
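
As an illustration of the packed structure notes above, here is a small
sketch. The type instr_t and its fields are invented for the example:

```systemverilog
module struct_sketch;
  typedef struct packed {
    logic [3:0] opcode;   // bits [15:12]
    logic [3:0] dest;     // bits [11:8]
    logic [7:0] imm;      // bits [7:0]
  } instr_t;              // 16 bits in total

  instr_t instr;

  initial begin
    // Assignment pattern with named fields
    instr = '{opcode: 4'hA, dest: 4'h1, imm: 8'h5C};
    $display("whole = %h, opcode = %h", instr, instr.opcode);
    instr = instr + 1;    // arithmetic on the structure as one vector
    $display("imm = %h, low nibble = %h", instr.imm, instr[3:0]);
  end
endmodule
```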

typedef union { type0 var0; type1 var1; } name_u; name_u inst0, inst1;
or union { ... } inst0, inst1; Reference using inst.var0 inst.var1
unions can also be packed.
A tagged qualifier creates a tagged union which stores the write tag type and
only permits access by the same type.

dynamic cast: assert($cast(dest_var,src_expr)), static cast: type'(value|variable)
dynamic-Safe $cast() casting ensures that the code is type/boundary-safe.
Static-Speed type'() casting does not affect simulation performance but
tolerates illegal assignments which can have unexpected results.

Fixed-size arrays have C like dimensions: [0:7] can also be specified as [8]
type arr_nm[size] [=initial_values [default:default_value]]; copy: a=b
 Out-of-bounds writes are ignored, reads return default 0/x values.
 Support for multi-dimensional arrays and bit select indices.
 Initialization x[4] = '{0,1,2,3}; x[2][3] = '{'{0,1,2},'{4,5,6}}
 $size(name,dim) gets size hi-lo+1, $[unpacked_]dimensions(name)
 $left(name,dim)=msb, $right=lsb, $low,$high=min/max of $left/right
 $increment(name,dim) returns 1 if $left>=$right else returns -1
Dimensions before array name are packed and useful for streaming:
 type [msb:lsb][msb:lsb] name[hi:lo][constant]
Packed arrays can be sliced but only contain 2-state and 4-state types.
Arrays are accessed using unpacked dimensions followed by packed:
 name[unpacked1][unpacked2][packed1][packed2]

Dynamic arrays: type array_name[] [=initial_value]; // Err if out-of-bounds
ar_nm = new[size] [(copy-src)]; a = b copies src; var.delete deallocates
Associative: type name[type], Use * for any type.  Size is a.num().
a.delete([index]) Deletes index or entire array
a.exists(index) a.first(idx) a.next(idx) a.prev(idx) a.last(idx)
 Methods return true only when the specified/requested index is valid.
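
A sketch of the dynamic and associative array operations listed above.
The names dyn, score, and key are hypothetical:

```systemverilog
module assoc_sketch;
  int    dyn[];            // dynamic array
  int    score[string];    // associative array indexed by string
  string key;

  initial begin
    dyn = new[4];                       // allocate 4 elements (default 0)
    foreach (dyn[i]) dyn[i] = i * 10;
    dyn = new[8](dyn);                  // resize, copying the old contents

    score["alice"] = 90;
    score["bob"]   = 75;
    if (score.exists("bob")) score.delete("bob");

    // first()/next() write the index into key and return non-zero
    // only while a valid index was found
    if (score.first(key))
      do $display("%s -> %0d", key, score[key]);
      while (score.next(key));
    $display("dyn size %0d, score entries %0d", dyn.size(), score.num());
    dyn.delete();                       // deallocate
  end
endmodule
```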

The string data type is used to represent character arrays and supports many
operators including: {concat} = == != < <= > >=
Elements can be referenced using an index: mystr[3]
A string can be printed by simply using: $display(mystring)

Strings can be formatted using $sformat(mystr,"format %d..",args,...)
which is equivalent to the function mystr = $sformatf(...)
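
The string operations above can be sketched as follows. This is only an
illustration; the variable names are arbitrary:

```systemverilog
module string_sketch;
  string s, t;

  initial begin
    s = "Hello";
    t = {s, " ", "World"};                    // concatenation
    $display("len=%0d char4=%c", t.len(), t.getc(4));
    // Task form: the result is written into the first argument
    $sformat(s, "%s has %0d chars", t.toupper(), t.len());
    // Function form: the result is returned
    t = $sformatf("sub=%s icompare=%0d", t.substr(0, 4),
                  t.icompare("HELLO WORLD"));
    $display(s);
    $display(t);
  end
endmodule
```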

Queues: type name[$[:size]] [={0,1..}], q[i] indices range from 0 to $
q.push_front/back(x), x=q.pop_front/back, q.insert(pos,val) q.delete(pos)
Use q.size() to avoid run time over/under flow errors.
Queues are generally not synthesizable.
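
A small sketch of the queue methods, with the expected contents noted in
comments. The names q and x are arbitrary:

```systemverilog
module queue_sketch;
  int q[$] = {1, 2, 3};
  int x;

  initial begin
    q.push_back(4);         // {1,2,3,4}
    q.push_front(0);        // {0,1,2,3,4}
    q.insert(2, 9);         // {0,1,9,2,3,4}
    q.delete(2);            // {0,1,2,3,4}
    // Guard pops with size() so an empty queue is never popped
    while (q.size() > 0) begin
      x = q.pop_front();
      $display("popped %0d, %0d left", x, q.size());
    end
  end
endmodule
```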

Loop on array items: foreach(name[i[,j]])
Operators == != can compare arrays.
Methods: array.find[_index]() array.find_{first,last}[_index]() with (expr)
unique[_index]() min() max() reverse() sort() rsort() shuffle()
Reduction: sum() product() and() or() xor()
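
A sketch of the locator, ordering, and reduction methods above. Locator
methods return queues; the array d and the result queues are made up for
the example:

```systemverilog
module array_methods_sketch;
  int d[] = '{5, 3, 9, 3, 1};
  int found[$];
  int idx[$];

  initial begin
    found = d.find with (item > 3);               // elements 5 and 9
    idx   = d.find_first_index with (item == 3);  // index 1
    $display("found=%p idx=%p", found, idx);
    found = d.unique();                           // duplicates removed
    $display("unique=%p min=%p", found, d.min());
    $display("sum=%0d", d.sum());                 // 5+3+9+3+1 = 21
    d.sort();                                     // ascending, in place
    $display("sorted=%p", d);
  end
endmodule
```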

ref passes value by reference so changes are immediate.
const marks a constant value.
const ref passes read only reference, useful to pass arrays.
A variable ref can be passed as a port and represents the last value set.
The var keyword explicitly declares a (port) name as a variable.

Real value math functions equivalent to C.
$ln $log10 $exp $sqrt $pow $floor $ceil $sin $cos $tan $asin $acos $atan[2]
$hypot(x,y) $sinh(x) $cosh $tanh $asinh

Ports may be connected directly to variables and support a shorthand syntax:
.sig(sig) becomes .sig, .* connects unconnected ports to equivalent signals
Implicit port connections for module instances simply use: dut u1(.*)

An event data type object provides a means of communicating between and
synchronizing concurrently active processes.

An alias is a System Verilog coding technique to model bi-directional mapping
for ‘inout’ ports or wires in a module. In particular, alias mapping is a
direct connection of one inout port to another. It is a short-circuit of wires.
module tomap (inout wire [2:0] A, B);
alias B = {A[0], A[1], A[2]}; // bit-reversed (endian swap)
endmodule

let - A let declaration defines a template expression (a let body), customized
by its ports. A let construct may be instantiated in other expressions.
Syntax: let name[(formal arguments)] = expression using the formals;
Usage example1(Formal variable passed in declaration):
   let maxab(a,b)=(a>b?a:b); //Definition
   C=maxab(d,e); //Usage

Compilation Scopes.
$unit represents the top level of each compilation unit, and there is nothing
in $root except for the implicitly instantiated module instances.
The only time you need to use $root or $unit is when a local name in the
current scope hides a name in a higher level scope.

                             * Constructs *

Operators: ++ -- += -= *= /= %= &= ^= |= <<= >>= <<<= >>>= (arithmetic)

== != returns x for X/Z operands, === !== support X and Z values.
==? supports X/Z in the rhs as don't care; lhs X/Z still returns X.

Position independent argument passing by formal name: method(.a(x), .b(y))

Block/Statement labels, label: <statement>, begin/end* : label
Not required but recommended and used by some debug tools.

Expressions may use = in parentheses but this is discouraged: if((a=b))

Support for break/continue - Unstructured constructs, avoid!

The C-like do-while construct is supported.

A package is an explicitly named outermost scope level container.
A package groups and encapsulates code to make it easier to integrate and
 also to deal with potential name contention.
A package creates multiple name spaces which are selectively made visible.
A package may not contain any processes: assign/initial/always
Package subroutines default to static unless declared automatic.
A package may import and export other packages.
A package must be defined before it can be referenced.

There are many ways to reference the contents of a package.
Direct reference: MyPkg::MyClass member, Implicit: import MyPkg::*;
 Note that only referenced members are imported by an implicit import.
Explicit import: import MyPkg::MyClass, MyPkg::member;
Export: export OtherPkg::member; export *::*;
Port type: module mymodule(input mypkg::mytype signal);
Do not use wildcard (import pkg::*)  at $unit as it makes everything global.
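
A sketch of the package reference styles above. The package my_pkg and its
contents are hypothetical. Note that explicitly importing an enum type does
not import its labels, so WRITE is referenced directly here:

```systemverilog
package my_pkg;   // hypothetical package name
  typedef enum {READ, WRITE} kind_t;
  function automatic int double_it(int v);
    return 2 * v;
  endfunction
endpackage : my_pkg

module pkg_user_sketch;
  import my_pkg::kind_t;        // explicit import of a single name
  kind_t k;

  initial begin
    k = my_pkg::WRITE;          // direct reference with the scope operator
    $display("%s %0d", k.name(), my_pkg::double_it(21));
  end
endmodule
```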

Components can also be instantiated at $root and connected via bind
bind test_top test t(test_top.device_if.TB);
bind test_top device dut(.reset_n(test_top.device_if.reset_n), ...
             -or-       .reset_n(test_top.device_if.DUT.reset_n), ... );
The foremost usage of bind is for assertions and care must be taken during
other usage so different bind statements do not interfere with each other.

functions can not block and can only call other functions (same as v95).
Declare functions and tasks automatic for reentrancy.
Functions and arguments may be of any built-in or user defined types.
Functions may be declared void or have the result discarded: void'(func())
Tasks and functions do not require a begin/end and may be terminated by
a return but this violates structured programming except at the end.
Arguments may be passed as reference using the key word ref.
Input arguments may have default values making them optional.
Argument mapping can be used in calls: f(.arg(1'b1),.arg2(x)...)
Argument direction and type values are sticky and apply to subsequent args.

$random([seed]) $urandom([seed]) $urandom_range(max[,min=0]),
+ntb_random_seed=x +ntb_random_seed_automatic
Always use $urandom to avoid random stability problems.
System Verilog maintains thread specific random number generators
to provide thread and object stability by using independent random
generators for each thread.
It is preferred to use randomize which supports constraints for more
flexibility and clarity; it returns 1 on success and updates its argument:
 ok = std::randomize(data) with { data >= MIN_VALUE; data <= MAX_VALUE; };
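
A sketch of scope randomization with a checked return value. MIN_VALUE and
MAX_VALUE are given arbitrary values here as an assumption:

```systemverilog
module randomize_sketch;
  localparam int MIN_VALUE = 10;   // assumed bounds for the example
  localparam int MAX_VALUE = 20;
  int data;

  initial begin
    repeat (3) begin
      // std::randomize returns 1 on success and writes data by reference
      if (!std::randomize(data) with { data >= MIN_VALUE; data <= MAX_VALUE; })
        $error("randomize failed");
      else
        $display("data = %0d", data);
    end
    // Unconstrained alternative: first argument is max, second is min
    $display("urandom_range = %0d", $urandom_range(MAX_VALUE, MIN_VALUE));
  end
endmodule
```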

case inside replaces casex/casez and uses the ==? operator,
i.e. X/Z may appear in the branches, not in the expression.

randcase : label
weight: begin <block> end
weight: begin <block> end
endcase  : label
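
A sketch of weighted branch selection with randcase. The weights 6/3/1 and
the hits counters are arbitrary; each branch is chosen with probability
weight/sum of weights:

```systemverilog
module randcase_sketch;
  int hits[3];

  initial begin
    repeat (1000)
      randcase
        6: hits[0]++;   // about 60% of the iterations
        3: hits[1]++;   // about 30%
        1: hits[2]++;   // about 10%
      endcase
    $display("hits: %0d %0d %0d", hits[0], hits[1], hits[2]);
  end
endmodule
```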

randsequence is a grammar-driven stream of random productions and supports
weights, interleaving and other control mechanisms.

interface encapsulates signals, protocol assertions, and driver/monitor tasks.
An interface differs from a struct in that it supports net types.
Interfaces need to be instantiated.
Interfaces like modules can be referenced before they are defined.
Interface definitions should be parameterized.
Use bit for clock/reset, logic for other signals (as specified in uvm).
Use extern to define tasks outside interface.
Interfaces are required as class objects cannot connect directly
to module ports.
Interfaces cannot instantiate modules.
Interface ports are used for signals such as clocks and reset that
are not a part of the interface signal set as this is cleaner than
adding these signals directly to the module ports.
Interfaces used to bundle signals are synthesizable.
Interfaces may contain functions, tasks, and assertions.
A modport groups signals and provides a direction relative to the module.
Use export to define tasks outside definition.
Use generate to create modport expressions.
A clocking block synchronizes signals and modports to a clock.
Use in interface definitions.
Defines input (sense) and output (drive) skews.
Use just modport w/o cb for DUT.
Modules should use interfaces as port types.

interface dev_if #(WIDTH=32)(input bit clock); // arg for @(posedge clock)
logic reset_n, busy_n, frame_in_n, frame_out_n, valid_in_n, valid_out_n;
logic [WIDTH-1:0] data_in, data_out;
modport TB(clocking cb, output reset_n, input clock); // Async reset
modport DUT(...);
clocking cb @(posedge clock);         // Specify clock signal and edge
 default input #1, output #2;        // Sense and drive skews
 output [#del] reset_n, data_in, frame_in_n, valid_in_n; // Relative to tb
 input  [#del] busy_n, data_out, frame_out_n, valid_out_n; //Can be modport
endclocking : cb
endinterface : dev_if

module test_top;
dev_if device_if(SystemClock);
test t(device_if);
device dut(.reset_n(device_if.reset_n), .clock(device_if.clock), ... );
endmodule : test_top

repeat (n) @(interface.clocking_block); // Clock delays

Combinational logic:
always @* is "usually" sufficient to ensure complete sensitivity.
always_comb includes variables referenced indirectly through function calls
and also does extra checks.
A complete assignment ensures that every single variable must be assigned
in every possible path to avoid inferred latches.  This is supported by the
"unique" and "priority" decision modifiers for parallel/priority-encoded logic.

A specify block in a module specifies a delay across a path or the module.

Use a random gap between independent clocks.

$bits(x) returns the number of bits in an argument.
$typename(expr) returns a string that represents the argument type.

Synchronous Drive Statements:
[##num] interface.cb.signal <= [##num] <value|expression>
 Specified clock cycle delays, drive must be non-blocking.

Sampling Synchronous Signals:
varb = vif.cb.signal // no delay, can not sample output signals

@(vif.cb);                              // Sync to cb clock edge
@([posedge|negedge] vif.cb.sig);        // Sync to signal edge
wait(expr);                             // Wait for expr, no delay if true

#(delay) #(rising,falling[,off]) each can be (min,typ,max)

@e Waits for (next) event trigger, -> e; Triggers event
wait(e.triggered) wait for event, no delay if event already triggered
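
A sketch contrasting @e with the triggered property of an event. The event
name done is arbitrary. The third process reaches its wait in the same time
step as the trigger and still proceeds, whichever process runs first:

```systemverilog
module event_sketch;
  event done;

  initial #10 -> done;            // trigger the event at time 10

  initial begin
    @done;                        // blocks until the next trigger
    $display("@%0t @done resumed", $time);
  end

  initial begin
    #10;
    // triggered stays true for the rest of the time step, so this does
    // not miss a trigger that already happened at time 10
    wait (done.triggered);
    $display("@%0t wait(done.triggered) resumed", $time);
  end
endmodule
```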

if($value$plusargs("field=%d",field))   // Read +field=value when defined
$display("@%0t %m: ..msg..",$[real]time); // %m is the location of the call
$timeformat[(units,precision,suffix,min_width)]; // Set format

timescale- 1ns (timeunit) / 1ps (timeprecision), #delay multiplied by timeunit
decimal fractions use timeprecision, #20.0053 rounds to #20.005ns (rnd 1ps)

Modules may declare explicit timeunit and timeprecision values with time units
which apply only to the current and nested modules.

The streaming operator is used to pack and unpack data.
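
A sketch of packing and unpacking with the streaming operators. The array
bytes and the variables word, a, b, c, d are made up for the example:

```systemverilog
module stream_sketch;
  byte unsigned bytes[4] = '{8'hDE, 8'hAD, 8'hBE, 8'hEF};
  bit [31:0] word;
  bit [7:0]  a, b, c, d;

  initial begin
    word = {>>{bytes}};          // pack in array order: 32'hDEADBEEF
    $display("%h", word);
    word = {<<8{bytes}};         // reverse in 8-bit slices: 32'hEFBEADDE
    $display("%h", word);
    {>>{a, b, c, d}} = word;     // unpack: a receives word[31:24]
    $display("%h %h %h %h", a, b, c, d);
  end
endmodule
```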

fork <block> <block> join (all) | join_any | join_none
Use automatic for forks to create thread specific variable copies.
disable threadname or disable fork to disable all in scope.
Disabling a fork violates structured programming principles.
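
A sketch of spawning threads with join_none, using an automatic variable so
each thread keeps its own copy of the loop index. The variable id is
hypothetical:

```systemverilog
module fork_sketch;
  initial begin
    for (int i = 0; i < 3; i++) begin
      automatic int id = i;     // per-iteration copy captured by the thread
      fork
        begin
          #(10 * (id + 1));
          $display("@%0t thread %0d done", $time, id);
        end
      join_none                 // do not wait; continue the loop
    end
    wait fork;                  // block until all spawned threads finish
    $display("@%0t all threads done", $time);
  end
endmodule
```

Without the automatic copy, every thread would read the shared loop
variable after the loop has finished.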

                             *     DPI    *

Simpler to use but has severe restrictions on data types.
Use PLI aval/bval encoding for 4-state and packed data.
Unlike PLI, does not check declarations and types.

System Verilog calling C/C++ functions.
import "DPI-C" [cname =][pure|context] function type name(args);
import "DPI-C" [cname =][context] task name(args);
 cname: maps C name to System Verilog prototype name.
Call Verilog tasks to synchronize with time, cannot sync with events.
Function types: ordinary/generic, pure, and context.
ordinary: Can be void, no delays or task calls allowed (same as Verilog).
pure: returns value, only input args, does not call other functions.
context: type required to call System Verilog subroutines from C. i.e.
 Allows C to invoke DPI/VPI/PLI embedded exported tasks and functions.

C calling System Verilog functions.
export "DPI-C" [cname=] function (or task) name;

chandle allows C to allocate memory and pass to SV, must deallocate in C.

reg/logic[n:0] use svLogicVec32 which contains ctrl and data fields
(legacy "DPI"; "DPI-C" uses svLogicVecVal with bval/aval fields).
Encoding: 0/1 -> c=0 d=0/1, Z c=1 d=0, X c=1 d=1

                             *     OOP    *

Classes encapsulate variables as properties and subroutines as methods which
are referred to as class members.

Modules are static; classes are dynamic objects that are created and destroyed.
Class instances are passed as handles.
Classes can be inherited and extended to add functionality.
Multiple-inheritance consists of extending a class with interface classes:
class ext_c  extends base_c implements interface_class1 ...
Polymorphism - Allows same interface to be used to access general class of
actions, such as determining the method based on the class handle type.
Polymorphism requires that the overridden methods be declared virtual.

   |   [BaseClass]   |
   | - Properties    |
   | - Data Structs  |
   | + Methods       |         -----------------
   | + Func/Tasks    |        | Interface Class | ...
    -----------------          -----------------
           /|\                        /|\
            |               Implements |
    ----------------- ----------------- -------------
   |                 |                 |
[Derived]         [Derived]         [Derived]

Classes may contain static methods and properties which are shared among all
instances and may even be invoked without an instance handle.
These methods may only access static methods and properties of the same class
and cannot be overridden in a derived class.

It is bad practice to redeclare base class properties in an extended class.

Classes can be defined and used anywhere: program module package
program [automatic] test
 class env; ... endclass : env
 initial begin : label .. testcase ... end : label
endprogram : test

A program block does not allow always blocks to simplify the start and
stop flow.

A class definition may also contain parameters with optional default values
that define properties and types:

class env #(type T=int, int p, q=1) extends uvm_env
 declare generators, drivers, checkers, cfg, coverage
 function build - Create components, instantiate ports and exports.
 tasks reset dut, cfg dut, start - starts everything
 stop transactions, then wait for end.

A constructor may be declared as a local or protected method (8.18).
A constructor shall not be declared as a static (see 8.10) or virtual (8.20).
A constructor should not be local as it needs to support

All properties should be declared random and unrestricted: rand int len
Randomization can be disabled in the base class constructor: len.rand_mode(0)
They can also be constrained for most operations: constraint l { len > 0 }

A virtual method is replaced by its equivalent in an extended class which
must have the same signature.

A forward reference to a class is resolved with a typedef: typedef class foo

A typedef can also be used to define a class with specific parameters:
typedef stack #(packet) stack_pkt;

A class may contain only a signature for a method and define it separately
using the scope resolution operator: class_name::method
Any default values in the definition must match the signature.

It is useful to parameterize the type: class my_class #(type Tr=pkt)

A singleton class may not be instantiated and contains only static methods
and properties along with a protected constructor.

A singleton object is instantiated at compile time.
uvm_root is a singleton object.

this/super - handle to the current or parent classes, super=null in parent
Class members are public by default and accessible by external code.
 Properties should be kept public so they can be used by constraints
 in other classes.
protected members can be accessed only by extended classes.
local members can not be accessed directly outside class.
 int x;
 function new(int x);
   this.x = x; // this points to the class variable
 endfunction : new

Up-Casting is OK: base_handle = extended_handle
(Dynamic) Down-Casting requires assert( $cast(extended_handle,base_handle) )
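
A sketch of handle assignment between a base class and an extended class.
The class names base_c and ext_c are hypothetical. Assigning an extended
handle to a base handle is always legal, while the reverse direction needs
$cast and succeeds only when the object really is of the extended type:

```systemverilog
module cast_sketch;
  class base_c;
    virtual function string who(); return "base"; endfunction
  endclass

  class ext_c extends base_c;
    virtual function string who(); return "ext"; endfunction
  endclass

  initial begin
    base_c b;
    ext_c  e, e2;
    e = new();
    b = e;                            // extended -> base: always legal
    $display(b.who());                // virtual dispatch selects ext_c::who
    if ($cast(e2, b))                 // base -> extended: checked at run time
      $display("cast ok: %s", e2.who());
    b = new();                        // now a plain base_c object
    if (!$cast(e2, b))
      $display("a base_c object is not an ext_c");
  end
endmodule
```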

Classes normally only copy handles.
inst2 = inst1; // Copies the handle to the same object
inst2 = new inst1; // shallow copy of inst1 does not copy objects
 A deep copy includes nested objects and requires (layered) custom code
 that chains through the sequence of nested handles.
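
A sketch of the difference between a shallow and a deep copy. The classes
payload_c and pkt_c and the method deep_copy are invented for the example:

```systemverilog
module copy_sketch;
  class payload_c;
    int data;
  endclass

  class pkt_c;
    int       id;
    payload_c p = new();
    // Custom deep copy: also allocates and fills a new nested object
    function pkt_c deep_copy();
      pkt_c c = new();
      c.id     = id;
      c.p.data = p.data;
      return c;
    endfunction
  endclass

  initial begin
    pkt_c a, s, d;
    a = new();
    a.id = 1;
    a.p.data = 10;
    s = new a;              // shallow: s.p and a.p refer to the same object
    d = a.deep_copy();      // deep: d.p is a separate object
    a.p.data = 99;
    $display("shallow sees %0d, deep keeps %0d", s.p.data, d.p.data);
  end
endmodule
```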

pure methods: values returned only via call, no inout or output arguments.
Cannot call other functions or use static or global variables.
For a pure virtual method:
There CAN'T be a definition of the pure virtual method in the base class.
There MUST be a definition of the pure virtual method in the derived class.

instance.randomize([obj1,..]) is a deep function to randomize all rand/randc
or specified objects.  Specified objects do not have to be declared rand.
Constraints support weighted distributions:
 "dist" for distributions: [lo:hi]:= weight(for each), :/ to divide weight
Global Constraint expressions involve random variables from other objects.
Conditional constraints: if(expr) constraint; [else constraint;]
"inside" supports ranges: inside {[lo:hi], [lo2:hi2] ... [loN:hiN]}
"with" allows inline constraints -> randomize with { expr1; ... exprN }
implication -> infers that the rhs applies only when the lhs is true.
 This is also referred to as a bidirectional constraint.
 This is equivalent to using an if(expr) {const..} [else {cons...}]
solve before is used to order the randomization, often for implication.
 It is not required and affects the distribution and performance.
Complex constraint expressions will also degrade performance.
 Best to use the binary nature of expressions to simplify them.
{class,object}.constraint_mode(0/1) turns constraints off and on
object.rand_mode(0) turns the randomization off/on for an object.
Properties should not be bundled in a constraint as that makes it harder to
have fine-grain control.
Use constraints for must-obey properties.
Use constraints for should-obey properties to allow for error injection:
 int len; constraint valid_length { len > 0 }
Array constraint: constraint all { foreach (my_array[i]) my_array[i] > 0; }
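
A sketch that combines several of the constraint forms above in one class.
The class pkt_c, its fields, and the constraint names are hypothetical:

```systemverilog
module constraint_sketch;
  class pkt_c;
    rand bit       is_long;
    rand int       len;
    rand bit [7:0] payload[];
    constraint c_len   { len inside {[1:64]}; }
    constraint c_dist  { is_long dist {0 := 3, 1 := 1}; }   // 3:1 weights
    constraint c_impl  { is_long -> (len > 32); }           // implication
    constraint c_order { solve is_long before len; }
    constraint c_size  { payload.size() == len; }
    constraint c_data  { foreach (payload[i]) payload[i] > 0; }
  endclass

  pkt_c p;

  initial begin
    p = new();
    repeat (3) begin
      if (!p.randomize()) $error("randomize failed");
      $display("is_long=%0d len=%0d", p.is_long, p.len);
    end
    p.c_dist.constraint_mode(0);                // turn one constraint off
    if (!p.randomize() with { len == 8; })      // inline constraint
      $error("randomize failed");
    $display("len=%0d size=%0d", p.len, p.payload.size());
  end
endmodule
```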

A container class is one which instantiates other classes such as the
transaction (packet).

The distribution can be controlled:

Vft Foundation library - info->base->byte array->trn
(Lowest) sv_info_c Static shared fields err/warn ctrs, dbg lvl, log ctrl
 Access methods such as set_debug_level
Base class - instance and name
Byte array - byte data[] + size + cp + cmp + init + print with width format
data extends - array + offset + tag + endian
trn extends data - addr + type (Rd/Wr) + target
cpuIf - addr+bank + rd_data + wr_data + req + rdy + wr + burst + rd/wr tasks
fw_api - mailboxes, poll or wait on mbox then execute task
CPU model - runs forever waiting for request and invoking tasks to do work

                             *Test Benches*

+ Common testbench for all tests.
+ Test specific code kept separate from tb.
Randomize - Device Cfg, Env Cfg, Input data, Protocol Exceptions, Delays,
   Errors and Warnings.

Schedule regions.
Preponed:  Sample signals before any changes (#1step)
Active:    Design simulation (module), including NBA
Observed:  Assertions evaluated after design executes
Reactive:  Testbench activity (program), responds to assignments.
Postponed: Read only phase

Three primary phases.
Build - GEN_CFG: function gen_cfg() generates random dut and env cfg.
       BUILD: func build() allocates and connects components based on cfg.
       RESET_DUT: task reset_dut() resets dut before configuration.
       CFG_DUT: task cfg_dut() configures dut based on random cfg.

Run   - START: task start() starts verif env components to start test.
       WAIT_FOR_END: Test run waits for completion event end_test.
       STOP: task stop() stops verif components for graceful termination.

WrapUp- CLEANUP: task cleanup() performs cleanup operations.
          Let dut drain buffered data, read statistics,
          sweep scoreboard for leftover expected responses.
       REPORT: task report() reports success/failure and closes files.

Layered testbench with transactors.
Test -> Scenario Gen -> (Functional) Agent -> (Command) Driver -signal-> DUT
Test contains constraints to create stimulus.
Directed tests in constrained random environment find unanticipated errors.

   |    |                                               |
   |   \|/                                              |
   | [Generator]         *env*                          |
   |    |                                               |
   |   \|/                                              |
   | [Transactor]--->[Scoreboard]<-----------[Checker]  |
   |   /|\                                      /|\     |
   |    |                                        |      |
   |   \|/                                       |      |
   | [Driver]        [Assertions]            [Monitor]  |
   |   /|\                |                     /|\     |
   |    |                 |                      |      |
        |                 |                      |
        | Interface      \|/   Interface         |

                             *    SVA     *

System Verilog Assertions document and verify the design assumptions as well
as provide functional coverage points.
They frequently require clarifying specifications.
They are ignored by synthesis and can be disabled.
Project planning includes an Assertions Test Plan which includes assignments.
The design team should focus on verifying block functionality and the
DV team should ensure that the design meets the specifications.
Interfaces should contain assertions to detect protocol violations.
"Assertions improve the observability of internal interactions" and help
"pinpoint errors at the point of origin."

Assertions should be used to ensure that:
[Input] ports are connected
Control signals are never X
State values are always legal
Data and parameter values are valid and in range
X-optimism and X-pessimism are properly managed

pass/fail action is optional, a default $error is printed on fail.
Assertions should be labeled so they can be reviewed in the waveform.
Default clocking or always blocks should be used to disable assertions during
reset with $assert tasks such as: $assertoff $assertkill $asserton
These affect all assertions unless called with one or more arguments.
The first argument is hierarchy and the remainder specific properties.
global clock may be declared anywhere: global clocking @(posedge clk); endcl..
$global_clock is explicit designation of global clock:
default clocking @$global_clock; endclocking
Assertions may be affected by signal hierarchy changes in a gate netlist.
Assertion expressions must evaluate to 1, 0/x/z is a failure.

Assertions have severity levels with an *optional* user message:
$info("msg",msg_args); // just print msg
$warning("msg",msg_args); // run time warning
$error("msg",msg_args); // print error msg and continue (default type)
$fatal(finish_0_1_or_2,"msg",msg_args); // terminate sim with $finish
These constructs can actually be used anywhere in the code.

Immediate are same as if/else and check at the current time:
check_4_XZ: assert(!$isunknown(mysignal));
++ Simple syntax, close to code, can check async values, self documenting
-- Difficult to disable, cannot use binding, susceptible to race conditions

Immediate assertions offer a concise and convenient form to test the
success of dynamic casting and constrained randomization.

A sequence is a series of true/false expressions that test for a sequence
of events over multiple cycles.
Sequence expressions may reference and concatenate other sequences using ##.
A sequence can be used in multiple properties.
sequence <name> [(args)];
 ...
endsequence [: name]

Sequence implication terminology and operators:
|->     overlapped implications: evaluation starts immediately
|=> non-overlapped implications: evaluation starts at next clock
|-> ##1 is equivalent to non-overlapped implication

More complex sequences may be built using special operators such as:
[*n[:m]]  consecutive repetition, [*0] is an empty seq with empty resultant
[=]  non-consecutive repetition
[->] goto repetition
##   delay
$    unbounded time

and or intersect throughout within first_match
methods: .ended .triggered .matched
##0 overlap

Sample value functions, cycle definition is optional and seldom needed:
$rose(   expr,cyc_def) // true if the LSB of the expression changed to 1
$fell(   expr,cyc_def) // true if the LSB of the expression changed to 0
$changed(expr,cyc_def) // true if value of the expression changed
$stable( expr,cyc_def) // true if value of the expression did not change

$rose/$fell differ from posedge/negedge event detection in that they compare
sampled values and the final LSB value must be 1 (or 0): a change from 0 to X
does not qualify as $rose, while a change from X or Z to 1 does.

Expression functions return result of evaluated expression:
$sampled(expr,cyc_def) // sampled value at specified clock edge
$onehot(     expr)     // exactly one-bit set
$onehot0(    expr)     // at most one-bit set
$isunknown(  expr)     // value is not 0 or 1
$countones(  expr)     // number of bits set
$isunbounded(expr)     // true if expr is unbounded ($)
$past( )               // previous sample value

$past(x)==x is equivalent to $stable(x)

global clocking past/future functions:
$past_gclk $rose_gclk $fell_gclk $stable_gclk $changed_gclk
$future_gclk $rising_gclk $falling_gclk $steady_gclk $changing_gclk

A simple sequence example.
`define IDLE (state == 2'b00)
`define RDY  (state == 2'b01)
`define BUSY (state == 2'b10)
`define DONE (state == 2'b11)

sequence seq_X;
 @(posedge clk)
 `IDLE       [*1:$] ##1
 `RDY        [*1:$] ##1
 `BUSY       [*1:$] ##1
 `DONE;
endsequence : seq_X

Concurrent assertions test for a sequence of events over multiple cycles
using expressions which may include separately defined sequences.
They also "provide semantics for formal verification" and can be used in
emulators and accelerators.
Their properties contain expressions and sequences within these layers:

|      Assertion directive       |
|  ----------------------------  |
| |   Property specification   | |
| |  ------------------------  | |
| | |        Sequence        | | |
| | |  --------------------  | | |
| | | | Boolean Expression | | | |
| | | |____________________| | | |
| | |________________________| | |
| |____________________________| |

property p_req_ack;
 @(posedge clock) // sample signal
 disable iff (!rst_n) // optional disable condition
 req |-> ##[1:3] ack; // use $rose(req)..$rose(ack) to detect edges
endproperty
check_4_ack: assert property (p_req_ack) else $error("ack failed in %m");
++ cycle based, can use binding, work with simulations and formal
-- difficult, far from code, cannot detect glitches.

Concurrent assertions sample values in a "Preponed event region" - the
assertion always sees the value that existed before the clock edge causes
any changes.
This tolerates models that drive on the edge without a delay.

Invariant -
Sequential -
Eventuality -

property p_request_grant;
@(posedge clock) // specifies cycle for property
request ##1 /*cycles*/ grant ##[1:3] /*range*/ !request && !grant;
endproperty // req followed by grant in 1 cycle, !* follows in 1-3 cycles

Antecedent - The expression before the implication operator.
 The evaluation only continues if the antecedent is true.
Consequent - The expression after the implication operator.
Vacuous success - If the antecedent is false, the property is considered
 vacuously true.  The check is not of interest, so evaluation is aborted.

It is good practice to define concurrent assertions separate from the rtl.
An assertion module can be used to bind assertions to target ports.
+ Assertion updates will not trigger make based synthesis.
+ Properties can be shared by cover.
+ Properties can be bound to different hierarchies.

The bind statement syntax is:
bind target_module module_with_assertions instance_name ( ... );

A cover property is used to verify that events occur.
Many simulators automatically count assertion passes to provide this info.
state_BUSY : cover property ( @(posedge clk) `BUSY);
trans_ALL  : cover property ( seq_X );

Assumptions are used for formal verification and model the design environment:
assume property (p)

Expect property statements are used in test benches.

Restrictions specify formal verification constraints: restrict property (p)

Implicit assertions are a part of some System Verilog language constructs:

Unique Case - An assertion which implies that all possible values for the case
expression are covered by the case items: at simulation run time a match must
be found, and only one match will be found in the case items.
Use when each case item is unique and only one match should occur.
Using a "default" case item removes the testing for non-existent
matches, but the uniqueness test remains.

Priority Case - An assertion which implies that all possible values for the
case expression are in case items and all other testable conditions are don't
cares.
Using a "default" case item will cause priority requirement to be dropped
since all cases are available to be matched.
Use of a "default" also indicates that more than one match in case item is OK.

Generate statements can be used to create multiple instances such as in this
pipeline example.  Note the use of labels to create meaningful paths.
Include uvm_macros.svh to use uvm macros such as info and error in tb.

// Queues can be used to pipeline multiple flows.
// The block and channel labels support a block.channel[flow_id] reference.
`include "uvm_macros.svh"                // Support for uvm macros in tb
 begin : block
 for(genvar flow_id=0; flow_id<NUM_FLOWS; ++flow_id)
   begin : channel

   int unsigned channel_delay;         // Channel specific pipeline delay
   logic [DATA_WIDTH-1:0] data_q[$];        // Pipelined egress port data
   logic [DATA_WIDTH-1:0] data_out[NUM_FLOWS]; // Actual output data

   initial
     begin : CreatePipeLineDelays
     // std::randomize returns a status bit, so it is checked rather than
     // assigned to channel_delay.
     if(!std::randomize(channel_delay) with
             { channel_delay >= MIN_DELAY; channel_delay <= MAX_DELAY; })
       `uvm_error("tb",$sformatf("channel[%0d] randomize failed",flow_id))
     `uvm_info("tb",$sformatf("channel[%0d] delay (range %0d-%0d) %0d",
       flow_id,MIN_DELAY,MAX_DELAY,channel_delay),UVM_LOW) // verbosity assumed

     for(int i=0; i<channel_delay; ++i)
       begin : PrimePipeLine
       data_q.push_back(0);
       end   : PrimePipeLine

     end   : CreatePipeLineDelays

   always @(posedge tx_clk)
     begin : PipelineData
     data_q.push_back(egress_interface.data_out[flow_id]);
     end   : PipelineData

   always @(posedge rx_clk[flow_id])
     begin : DriveDelayedEgressData

     if(data_q.size() != 0)
       begin : DataReady
       data_out[flow_id] = data_q.pop_front(); // Delayed data
       end   : DataReady
     else
       `uvm_warning("tb",$sformatf("[%0d] Empty Fifo",flow_id))

     end   : DriveDelayedEgressData

   end   : channel
 end   : block

                             *  COVERAGE  *

Functional coverage is derived from spec which defines an explicit cov space.
Code coverage is extracted from RTL implementation.
Coverage Types - Line Branch Expression Toggle Functional
Port Toggle coverage is useful to evaluate SOC integration.

covergroup cov1 (ref bit[3:0] sa, da, event port_event) @(port_event)
                                                     ^^ sampling ^^
bins // scalar or vector bins
ignore_bins: ignore
illegal_bins: terminates simulation
cross - simultaneous
option.auto_bin_max  64 default
.goal(#) .at_least(#) .weight(#) .per_instance(0)
$get_coverage() real value representing overall coverage
port_fc.get_coverage() instance coverage

Coverage System Functions.
$set_coverage_db_name $load_coverage_db $get_coverage


modport TB(clocking cb) ->
Use just modport w/o cb for DUT.
X=null deallocates object
this -> parent scope
extern function(signature) -> Defines function prototype.
function void create(ref data) -> ref passes by reference
tran x,y; x=new; y=new x; // shallow copy
All classes have constructor new, implied if one is not provided.
Streaming is packing and unpacking desired properties
A transactor gets upstream tr, processes it and passes it downstream.
rand randc (pre/post randomize(args limit randomization)) -> return 1 for succ
Randomize is deep and invokes randomize for all instantiated classes.
pre/post_randomize functions are called before and after randomization.
 pre_randomize() is useful to set non-random variables.
 post_randomize() is useful to calculate data dependent values such as CRC.
wildcard -
event e1, e2; ->e1 triggers, @e2 waits, event has zero-width pulse.
e1.triggered tells if the event has been triggered in the current time step,
wait(e2.triggered) is like @e2 but also passes if e2 has already triggered.
mailbox m = new(optional size) [try_]get/put/peek,sneak
que[$][={0,1,2}] push/pop_front/back, insert(pos,value), delete(pos)
Use mbox for fifo and queue for stacks.
You can access and modify any part of queue.
$cast(basehandle,ext handle) - assigns ext handle to base handle
$signed and $unsigned are function equivalents of the signed'() and
unsigned'() casts.
virtual methods must use the same signatures.
virtual class - Abstract class can be declared and extended, not instantiated
interface classes - A set of classes containing only pure virtual methods
with a common set of behaviours.
They can be "implemented" as additional base classes for an extended class.
callback - Create at top level test to be called from drv, lowest env pt.
Define empty virtual task in driver which does nothing.
Extend driver class and define task in test.
Parameterized class - class foo #(type T=default_type)
Instantiate foo #(real) bar;

program automatic test(ifx,out rst)
module top;
logic clk, rst
interfaces - declared with clk
test t1(interfaces,rst)
dut  u1(interfaces,clk,rst)
initial begin clk=0; forever #HALF_PERIOD clk = ~clk; end
endmodule : top

                              Verilog 2001

Configuration Blocks - Support version and source location of module sources.
Scalable models - Using generate, genvar, and localparam.
Constant functions - Scale constructs resolved during compile or elaboration.
Indexed part selects - [expr +|-: width], e.g. word[byte_num*8 +: 8]
Power operator ** equivalent to C pow(), returns real if any arg is real.
Signed arithmetic extensions based on data types.
Automatic width extension - 'bz extends to data size not machine width.
Multi-dimensional arrays, arrays of both variable and net data types.
Bit and part selects within arrays, array[20][5][7:4]
Automatic - Makes tasks and functions re-entrant and allows recursion.
Based on C word auto which is rarely used as it is the default.
Automated and comma-separated sensitivity lists - @*, @(a, b) for @(a or b)
Enhanced file I/O: $ferror, $fgetc/s, $fflush, $fread, $fscanf, $fseek,
 $ftell, $rewind, $ungetc
String formatting support: $sformat, $swrite{,b,h,o}, $sscanf
In-line parameter passing by name: fifo #(.WIDTH(32), .DEPTH(64)) fifo32x64;
'95 position - fifo #(32, 64) fifo32x64;
'95 by  name - fifo fifo32x64 (...); defparam fifo32x64.WIDTH = 32, ...
ANSI-style input/output declarations: module cpu(input wire [31:0] opCode,..)
Combined port and data type declaration: output reg [7:0] data;
Variable init in declaration instead of requiring initial: reg valid = 0
Compiler directives: `ifndef `elsif `line (file+line info).
On-event pulse error propagation for pin-to-pin delays.
Improved pulse detection and timing constraint checks.
Enhanced SDF support, extended VCD files, and PLI enhancements.
Attributes: (* directives-for-other-tools *)

                              Utility Tasks

$system("") is equivalent to the C system() function call.

$display prints a formatted string and a newline to stdout.
$monitor prints the string whenever one of the arguments changes.
$strobe displays the string at the end of the current time step.
$write displays the string without a newline.

All these can end in b/h/o for binary/hex/octal variations.
All these can begin with f for file operations with the first arg a fd/mcd.
$monitoron/off can be used to enable and disable the monitor behaviour.
$monitoron triggers an immediate output even if nothing changed.

Similar file management calls:
mcd = $fopen("file") -or- fd = $fopen("file"[,"mode"])
modes: {r,w,a}[+][b] {read,write,append}, + opens the file for update.
fd and mcd are 32-bit descriptors and 0 indicates a failure.
fd STDIN 0x8..0, STDOUT 0x8..1, STDERR 0x8..2
Only mcd values can be orred.
b indicates a binary versus text file and is ignored on some systems.

$fclose(fd or mcd) closes the file(s).

$ferror returns the error code and description of the most recent error.

$swrite is a string formatting variation that supplements $sformat[f].

$stop(n) stops the simulation.
n=0 prints nothing, n=1 prints time & location, n=2 also prints statistics.
$exit waits for all program blocks to finish then calls $finish.
$finish(n) terminates the simulation.

Dump management.
$dumpfile $dump{on,off} $dumpvars $dumpall $dumplimit $dumpflush


|                                                                        |T
|   [test: seq+cfg(tb,env)]..[test1..n]          program: run_test()     |e
|      |                                                                 |s
|  --------------[cfg]-------------------------------------------------  |t
| |    |                               *env*                           | |
| |    | [VirtSeqr]          ------->[CovCol]<--------------           | |
| |    |   |         [Export]------>[Scoreboard]<-----------[Export]   | |
| |    |   |           /|\      [predictor|evaluator]          /|\     | |
| |   \|/ \|/           |                                       |      | |
| |--[Sequence]--[cfg]--------                        ------[Analysis]-| |Con
| |  [Sequencer]        |     |                      |         /|\     | |tai
| |    |                |     |                      |          |      | |ner
| |   \|/               |     |                      |          |      | |
| |  [Driver]        [Monitor]|     [Assertions]     |      [Monitor]  | |Tr
| |    |      active   /|\    |          |           | passive /|\     | |
| |    |      agent     |     |          |           | agent    |      | |Phy
|  --------------------------------------|-----------------------------  |
|      |                |                |                      |        |
|      |        Virtual | Interface      |    Virtual Interface |        |
|      |                |                |                      |        |
|      |                |                |                      |        |
|       --------------------------->[Interface]<----------------         |
|                                       \|/                              |
|      Test bench harness              [DUT]                             |

                                             Stimulus: transient obj
    -----------------     -----------------     ---------------
   |    uvm phase    |-->|   uvm object    |<--|uvm transaction|
    -----------------     -----------------     ---------------
           /|\                   /|\                   /|\
            |                     |                     |
    -----------------             |             ---------------
   |   uvm domain    |            |            | sequence item |
    -----------------             |             ---------------
                                  |                    /|\
                                  |                     |
    -----------------     -----------------     ---------------
   | report handler  |-->| uvm report obj  |   | uvm sequence  |
    -----------------     -----------------     ---------------
    -----------------     -----------------     ---------------
   | uvm subscriber  |-->|  uvm component  |<--|   TLM Ports   |
    -----------------     -----------------     ---------------
      Testbench: permanent obj   /|\
   ----   ---   -----   ---------   ------   -------   ----------
  |test| |env| |agent| |sequencer| |driver| |monitor| |scoreboard|
   ----   ---   -----   ---------   ------   -------   ----------

UVM consists primarily of object and component base class libraries.

data class - sequence and sequence item derived from UVM transaction class
Triggered during run time, create in run_phase, constructor has no parent.
The heart of a sequence class is the body() task.
`uvm_object_utils_begin(packet) // register each object and field
   `uvm_field_int    (src_addr, UVM_ALL_ON) // used for trans methods
   `uvm_field_int    (dst_addr, UVM_ALL_ON) // sprint copy compare ...
`uvm_object_utils_end

`uvm_field_* supports these types:
 int real event object string enum
 array_*  sarray_* queue_*  dynamic/fixed-size arrays and queues of types
 aa_<type>_<index> associative arrays

UVM_ALL_ON contains the flags:
Other supported flags:

component class - test,  environment, agent, sequencer, driver, monitor, sb
"Agents are the building blocks across tests and projects."
Create before run time in the build phase, constructor has name and parent.
`uvm_component_utils(name)    // register each component

UVM extends the uvm_port_base class to provide unidirectional ports, exports,
and implementation exports for connecting components via the TLM interfaces.
TLM Classes: uvm_{export,imp,port,fifo,socket}
Ports - instantiated in components that require, or use, the associated
interface to initiate transaction requests.
Ports require a connection to a remote implementation of interface methods.
Exports - instantiated by components that forward an implementation of the
methods defined in the associated interface.  The implementation is
typically provided by an implementation export in a child component.
Export classes provide the implementation to a remote port.
Imps - instantiated by components that provide an implementation of the
methods defined in the associated interface.
A UVM imp is an export that contains the functional implementation to a
remote port.

Transactions can be pushed or pulled:
push:                 port.put() []-->o consumer implements put
pull: producer implements  get() o<--[] port.get
fifo:                 port.put() []-->o[fifo]o-->[] port.get
analysis/broadcast: port.write() <>-->o subscribers implement write()
                                  -->o write() ...
A UVM environment "encapsulates DUT specific components" and consists of -
top test bench module which instantiates: clock/reset-modules
dut Interfaces with modports and clocking blocks
Testcase - Program which includes tests and contains: initial run_test()
 == The dut is compiled along with these and run with
 == +UVM_TESTNAME=test_name to select a test.
The initial run_test should be in the program not the test bench.
The program should contain:
 import uvm_pkg::*;
 `include "uvm_macros.svh"
The environment should not contain references into the hierarchy which
may change and make the code less portable.
It is better to do the configuration in the env rather than the test case.

The UVM components have a logical hierarchy starting with the test base,
environment, agent, and then the components within the agent.
The build is a top-down phase in this hierarchy.
All other phases, except for final, are bottom-up.

The factory create function is used to instantiate objects so they can be
modified as needed:
 obj_handle = obj_name::type_id::create("obj_handle",this)

The user should not modify built-in methods such as copy and print.
 Only the clone method is meant to be modified by the user.
 The clone method must be $cast as it returns a base type.
 Copy differs from clone in that it does not copy the name field.
   pkt0.copy(pkt1), $cast(pkt1,pkt0.clone)
 Equivalent user defined do_* methods are invoked automatically.
 Implement convert2string instead of print for terse messages.

A port passes data to an export which implements methods to process it.
An analysis port (in a monitor) transfers data to multiple exports.
An imp export implements the interface methods.

Defines are used to handle different UVM environments:
`ifdef UVM_VERSION_1_1

The UVM environment consists of the following nine phases:
 Only the build and final phase are top down so higher level components
 can decide what to build based on a random or test defined configuration.
 Only the run phase takes time as it runs the simulation.

 build - construct and cfg various child components/ports/exports.
 connect - connect the component ports/exports.
 end_of_elaboration - configure the components if required.
 start_of_simulation - print the banners and topology.
 run -  execute test body and fork all threads, only phase task.
   The test is selected using: +UVM_TESTNAME=<testcase>
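   The run phase consists of twelve divisions:
     pre_reset reset post_reset pre_configure configure post_configure
     pre_main main post_main pre_shutdown shutdown post_shutdown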
 extract - gather required information.
 check - check pending requests in scoreboard, read regs/statistics.
 report - report pass/fail status.
 final - simulation is about to end and print final messages

A typical flow is explained next.

The base test class extends uvm_test and contains:
build_phase - creates env and dynamically connects virtual interfaces.
 It can also be used to set the default sequence for the main phase,
 Doulos Easier UVM discourages this approach to start a sequence.
 The uvm_config_db #(type) function is used to get and set properties
 from various points in the hierarchy.
   uvm_config_db #(type)::set(this,"scope","property",value)
 The equivalent get returns a bit value which equals 0 for an error.
The test also selects sequences and configures the dut and environment.
The virtual interfaces can also be connected in the env class.
run_phase - creates packet sequence.
main_phase - starts sequencer (alternate approach), and sets drain time.

Additional tests extend test_base with a build phase containing specific
sequences and constraints as well as other dut and env configurations.

The environment build phase creates the agents and the scoreboard.
The connect phase connects the agent analysis ports to the scoreboard.

Agents encapsulate the driver, monitor, and sequencer for an interface.
Agents are active if they drive transactions, passive otherwise.
Usually input agents are active and output agents are passive.
An active agent's build phase creates the sequencer and driver.
In both active and passive agents the build phase creates a monitor and
instantiates an analysis port.
Instantiating a coverage collector in an agent is discouraged.
The connect phase connects an active driver's seq_item_port to the sequencer's
seq_item_export and connects the monitor to an analysis port.

The sequence body passes a sequence item to the sequencer: `uvm_do(req)
This macro consists of the following steps:
`uvm_create(seq_item); start_item(..) seq_item.randomize() finish_item(..)
The user may choose to do these steps manually for additional control.

The sequencer pulls a transaction from the sequence for the driver.
It is generally used directly from the library:
typedef uvm_sequencer #(packet_type) packet_sequencer;
More complex scenarios require a user defined (virtual) sequencer.

A transaction is written into a port and transferred to an export which
implements a method to process it.

The driver build phase gets the virtual interface handle from the config.
The run phase pulls sequence items from the sequencer and drives them on the
virtual interface.

The monitor build phase instantiates an analysis port and gets a handle to the
virtual interface from the configuration.
The run phase snoops traffic from the interface and writes it to the analysis
port.
The monitor consists of a collector portion which interacts with the bus.

The scoreboard build phase instantiates ingress and egress exports and the
connect phase connects them to the corresponding agent ports.
The scoreboard also contains ingress and egress write functions for in order
or multi-stream comparators.
Finally the scoreboard contains a report phase to generate the comparison
results.

Note that the port and export are instantiated using new, not created.

A (multiple channel) virtual sequencer synchronizes the timing and data
between interfaces "and orchestrates the system level scenario."
i.e. It explicitly manages multiple sequences across multiple agents within
a phase and controls the sequence order.

UVM macros are used for messages and correspond to specific actions:
`uvm_fatal(ID,MSG) -- UVM_DISPLAY | UVM_EXIT
`uvm_error(ID,MSG) -- UVM_DISPLAY | UVM_COUNT
`uvm_warning(ID,MSG) -- UVM_DISPLAY
`uvm_info("ID","MSG",verbosity) -- UVM_DISPLAY
 verbosity is UVM_{NONE,LOW,MEDIUM(default),HIGH,FULL,DEBUG}; the message is
 filtered out when its verbosity is higher than the configured level.

The following actions are supported and can be orred:
 UVM_EXIT - Exit simulation
 UVM_COUNT - Increment error count, exit after +UVM_MAX_QUIT_COUNT=<val>
 UVM_DISPLAY - Display message on console
 UVM_LOG - Capture message in a file
 UVM_CALL_HOOK - Invoke callback method
 UVM_NO_ACTION - Do nothing

Equivalent functions uvm_report_* are discouraged as they affect performance.

uvm switches are used to control the library behaviour:
+uvm_set_verbosity=<comp>,id,<verbosity>,{<phase> or time,<time>}
 component can contain wild cards, id can be _ALL_

uvm_cmdline_processor can be used to manage run-time command-line arguments.
uvm_cmdline_processor cmdargs = uvm_cmdline_processor::get_inst();
 string tool_name = cmdargs.get_tool_name();
 string version = cmdargs.get_tool_version();
 string args[$]; cmdargs.get_args(args); // all args, get_plusargs for +args
 string count; cmdargs.get_arg_value("+pkt_count=",count);
   int count_value = count.atoi();

uvm_object Class Members.
static functions:
get_inst_count get_type (returns uvm_object_wrapper)
reseed print sprint record copy compare [un]pack[_{bytes,ints}]
virtual functions:
set/get_name get_full_name get_inst_id get_object_type get_type_name
clone convert2string

A global module containing uvm_pkg can be used to invoke UVM messaging from
legacy Verilog (2001) code.

A UVM test needs to raise objections while there is pending activity and drop
them when it is done.
When using a default sequence this should be done in the packet_sequence
pre_start and post_start methods.
UVM-1.2 simplifies this by supporting this in the sequence constructor:
 set_automatic_phase_objection(1); // UVM-1.2 only
A drain time is used to give the hardware more time to flush the activity.
 objection = phase.get_objection();
phase.get_objection_count() returns the pending objection count. (1.2)
set_propagate_mode(0) avoids rippling of objections through the hierarchy.

Modifying Constraints, by type and instance.
set_type_override("packet","another_packet_type"); // run-time
The command-line version avoids recompiling the test:
cmd-line: +uvm_set_type_override=<req type>,<override type>[,<replace>]

set_inst_override("env.agt.seqr.*","pkt","pkt_alt"); // run-time
The command-line version avoids recompiling the test:
cmd-line: +uvm_set_inst_override=<req type>,<override type>,<inst path>

The uvm_report_catcher class is extended to catch and throw exceptions.


UVM-1.1 to UVM-1.2 transition.
Global handles uvm_top and factory are deprecated.
 uvm_top.print_topology() => uvm_root::get().print_topology()
 factory.print => uvm_factory::get().print()
 Equivalent UVM-1.2 only implementation:
   uvm_coreservice_t cs = uvm_coreservice_t::get();
   cs.get_root().print_topology(); // Fails in UVM-1.1
Additional cs methods:
 get_root() get/set_factory(), get/set_report_server()
 get/set_default_tr_database() get/set_component_visitor()
set/get_config_* are deprecated.
set_automatic_phase_objection(1) is supported in UVM-1.2.
starting_phase has been removed and replaced with:
 get_starting_phase().raise_objection(this); // in seq.pre_start()
 get_starting_phase().drop_objection(this); // in seq.post_start()

                             * Guidelines *

Based on Doulos Easier UVM Guidelines 2016-06-24

Variables lower case: int length, width
enum upper case: IDLE BUSY
Short local variable names, long global variable names.
Prefix class members (not block/method variables): m_length, m_width
Generic names of single instance components: m_sequencer m_driver m_monitor
Suffix unique instance names: m_teng_env m_pcie_agent
Single component/sequence config: m_config, user config: myconfig_config
Suffix ports without prefix: my_port my_export
Suffix multiple virtual interfaces: pcie_vif, for single interface use vif
Suffix user type definitions: mystruct_t
Suffix user defined package names: my_pkg
Use just enough comments to promote job security.
Use white space, blank lines and indentation, to improve legibility.
Do not use a superfluous virtual when overriding phase methods: build_phase..

Use a consistent file structure and naming convention.
Define classes within packages instead of modules or files.
Use `include to place each class in a separate file.
Provide double inclusion protection: `ifndef SHORT_PKT_SV `define SHORT_PKT_SV
Do not use wildcards at unit scope (outside module/package): import mypkg::*
Include uvm_macros.svh and import uvm_pkg::* anywhere UVM is referenced.
Use one agent per interface with passive monitor and optional sequencer/drvr
Use get_is_active to determine whether an agent is active or passive.
An agent should only instantiate one sequencer, one driver, and one monitor.
Use virtual sequences to co-ordinate the stimulus generation activities.
Checking and functional coverage should be external to any agent, in
subscriber component exports connected to the analysis ports.
A UVM top-level env should be reusable as a sub-env in a larger system.
Use factory overrides and/or cfg database to repurpose components.
A top level module should set configuration parameters that are retrieved by
the test, the test should set parameters retrieved by the env, and the env
should set parameters retrieved by lower-level environments or agents.
To decouple tests from the environment they should not use hierarchical paths.
Represent layered protocols with multiple sequencers with their own
transaction types.

                              UVM Templates

A UVM template generator is a good way to implement a consistent environment.

Accellera and Doulos have free template generators available.

A good template generator should perform the following tasks:


                                   C++

.cpp file suffix.  Same line comments delimited with: //
Standard function library inherited from C and class library
Class library includes STL (Standard Template Library)
Adds types bool and wchar_t to C data types: char int float double void
Object Oriented Programming - Data controls access to code
User defined operations valid for data type.
C++ adds cout << ".." cin >> i from #include <iostream>
Namespace is a declarative region to avoid name contention using namespace std.
OOP Supports -
Encapsulation - Bind code and data to make it safe from outside interference
Inheritance - One object inherits the properties of another
A class is an abstract data type which implements a structure of user data
along with associated functions and operators.
A struct is like a class except that the members are public by default.
A class instead of struct should be used unless all members are public.
A constructor is a function named after the class which is automatically
invoked when an object of the class is created to initialize properties and
perform dynamic storage allocation.  Constructors are frequently overloaded.
A destructor is a function named ~class_name and automatically invoked
at the end of the (scope) block where the object is declared.
Constructors and destructors are optional and are normally not invoked
explicitly.
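
The constructor and destructor rules above can be sketched as follows.  This
is a minimal illustration, not a recommended design: the class Packet, the
function lifetime_demo and the log vector are hypothetical names made up for
the example.  The log records the order in which the calls happen.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical class used only for illustration.
class Packet {
public:
    // Overloaded constructors: one with a default size, one with a size.
    explicit Packet(std::vector<std::string>& log) : Packet(log, 64) {}
    Packet(std::vector<std::string>& log, int size)
        : log_(log), size_(size), data_(new unsigned char[size]()) {
        assert(size > 0);                       // sanity check on the argument
        log_.push_back("ctor " + std::to_string(size_));
    }
    // The destructor runs automatically when the object leaves its scope.
    ~Packet() {
        delete[] data_;                         // release dynamic storage
        log_.push_back("dtor " + std::to_string(size_));
    }
    Packet(const Packet&) = delete;             // the buffer is owned, no copies
    Packet& operator=(const Packet&) = delete;
    int size() const { return size_; }
private:
    std::vector<std::string>& log_;
    int size_;
    unsigned char* data_;
};

// Returns the order in which constructors and destructors ran.
std::vector<std::string> lifetime_demo() {
    std::vector<std::string> log;
    {
        Packet a(log);          // default size constructor
        Packet b(log, 128);     // overloaded constructor
        log.push_back("end of block");
    }   // b is destroyed first, then a (reverse order of construction)
    return log;
}
```

Calling lifetime_demo() from main() returns the log, which can be printed to
see that destruction happens in reverse order of construction.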

Access Privileges allow external access only to public members.
Protected members may be accessed by derived classes and functions explicitly
declared as a friend.
Private members may only be accessed within the class and by a friend.
Classes may be nested but this is considered very bad practice.
static class::object (property or method),
Anonymous Union
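
A short sketch of the access rules, assuming the standard C++ meaning of
public, protected, private and friend.  Base, Derived, peek_private and
access_demo are hypothetical names chosen for the example.

```cpp
#include <cassert>

class Base {
public:
    int pub = 1;                 // accessible from anywhere
    int sum_all() const { return pub + prot + priv; }
protected:
    int prot = 2;                // accessible in Base, derived classes, friends
private:
    int priv = 3;                // accessible in Base and friends only
    friend int peek_private(const Base& b);
};

class Derived : public Base {
public:
    int read_protected() const { return prot; }    // allowed: protected
    // int read_private() const { return priv; }   // would not compile
};

// A friend function may read private members.
int peek_private(const Base& b) { return b.priv; }

// External code sees only the public member directly.
int access_demo() {
    Derived d;
    assert(d.pub == 1);
    return d.read_protected() + peek_private(d);
}
```

Uncommenting read_private() is a quick way to see the compiler reject access
to a private member from a derived class.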

Function prototypes explicitly list the type and number of arguments to
enforce strong type checking.
Overloading refers to the practice of selecting a function implementation
based on the type and number of arguments.
const replaces preprocessor defines for constants with a named, typed literal.
inline requests that the compiler treat a short function as a macro.
void indicates a function without a return value or an empty argument list.
The new keyword allocates storage for the specified number of a type.
The delete keyword frees up the storage used by the specified object.
extern variable[=val] - init value changes declaration to definition.
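
The keywords above can be tried out with a small sketch.  The names
kWordBits, area and sum_first are hypothetical and exist only for this
example.

```cpp
#include <cassert>
#include <cstddef>

const int kWordBits = 32;   // typed constant instead of #define WORD_BITS 32

// Overloads are selected by the type and number of arguments.
inline int    area(int side)      { return side * side; }
inline int    area(int w, int h)  { return w * h; }
inline double area(double radius) { return 3.14159 * radius * radius; }

// new[] allocates storage for n ints and delete[] frees it again.
int sum_first(std::size_t n) {
    assert(n > 0);
    int* buf = new int[n];
    for (std::size_t i = 0; i < n; ++i) buf[i] = static_cast<int>(i) + 1;
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += buf[i];
    delete[] buf;
    return total;
}
```

Calling area(3), area(2, 5) and area(1.0) picks a different implementation
each time, based only on the arguments.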

Default values - Allows optional arguments
Templates - Type independent functions
Assertions -
Terms of interest --
enum, operator,
call by ref <type>&, ::external-scope, default args,
public, private, protected, virtual,
stream.h, formatting, functions - dec/oct/hex/chr-str/from,
catch throw try
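
Several of the terms listed above (templates, default args, call by ref,
assertions, try/throw/catch) fit into one small sketch.  All function names
here (max_of, bump, clamp_percent, checked_divide, divide_report) are
hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Template: one definition works for any type that supports operator<.
template <typename T>
T max_of(T a, T b) { return (a < b) ? b : a; }

// Default argument: step is optional.  Call by reference: value is updated
// in place.
void bump(int& value, int step = 1) { value += step; }

// assert documents an internal invariant and aborts if it is violated.
int clamp_percent(int v) {
    int r = max_of(0, v);          // lower bound
    if (100 < r) r = 100;          // upper bound
    assert(r >= 0 && r <= 100);
    return r;
}

// throw signals an error and the caller handles it with try/catch.
int checked_divide(int num, int den) {
    if (den == 0) throw std::invalid_argument("divide by zero");
    return num / den;
}

std::string divide_report(int num, int den) {
    try {
        return std::to_string(checked_divide(num, den));
    } catch (const std::invalid_argument& e) {
        return std::string("error: ") + e.what();
    }
}
```

An exception is used for the divide case because the caller can recover from
it, while the assert covers a condition that should never occur.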

                                System C

Orders of magnitude faster than HDL, suitable for firmware validation.
A container class SC_MODULE is used to represent modules.
Signals can be complex data structures.
Supports concepts of clocks/time, reactivity/events, and ports.
Fixed-point data types are useful to represent hardware.
Use DPI or PLI to integrate with Verilog, PLI is more flexible.

Class constructor type SC_CTOR, SC_METHOD methods, SC_THREAD threads.
Time is a 64-bit integer but can be made larger.
 Class sc_time and sc_set_time_resolution, units SC_{{F,P,N,U,M}S,SEC}
Event objects are represented using the class sc_event
wait() is used with time, events, and composite events such as (a & b)|c.
An "interface" defines a set of methods (functions) to send and receive data
through the channel: sc_signal_in<T> sc_signal_inout<T> sc_port_base
Communication channels encapsulate how information is transferred between
modules and allow abstract high-level communication.
Pre-defined built-in channels include FIFO, Mutex, and Semaphore and are
generally not synthesizable.
User-defined channels can be modeled as abstract or at a more detailed level
and can be synthesizable.
The built-in sc_fifo<T> channel is typed and has a fixed depth (16 by default).
It provides write/read methods used to communicate between blocks that
operate on different, asynchronous clocks.

                                DDR  SDRAM

DDR3 (double data rate type three synchronous dynamic random-access memory)
is a JEDEC DRAM interface specification.  It transfers data at eight times
the speed of its internal memory arrays, enabling higher peak data rates.

DDR Verification needs to cover these topics (WKK) -
Find the best eye openings for each data bit.
Find the best read and write timing.
Find the best read timing before write timing and the other way around.
What are all the uses of the mode registers.
Check 1/64 cycles for each transfer.
Check every 1/64 level of voltage.
Cross-die skew verification.
Low speed DDR test failure.
Verify intra-byte skew and inter-die skew.
Explain use of DBI and DM.

A DDR3 design needs to be tested for electrical characteristics and:
The state diagram needs to be exhaustively exercised to ensure that all
states and transitions are covered.
Associated with the state diagram are the Command and CKE Truth Tables,
which also need to be exhaustively exercised.
This covers basic functionality, power-up, reset and initialization
procedures, as well as handling of power state dynamics.
Functional tests should verify all alignments, enables, and masks, access
to randomly selected and corner points of supported address ranges,
and should cover all states for the address and data lines using an
approach that ensures that they operate independently.
This includes the burst behaviour such as burst length, type, and order.
Supported read and write access to all mode registers must be verified.
In addition to this the behaviour of all legal and illegal register field
values must be checked to ensure that they behave according to specs.
This should cover all device modes and configurations as well as Self-
Refresh Settings and timing variations such as CAS latency.
This should also exercise DLL enables and Reset, Latency Settings, Write
Leveling and Write Recovery.
The testbench should include protocol checking and exercise the logic
at all supported frequency ranges along with jitter tolerance.

The DDR SDRAM interface makes higher transfer rates possible by more strict
control of the timing of the electrical data and clock signals.
Implementations often have to use schemes such as phase-locked loops and
self-calibration to reach the required timing accuracy.
The interface uses double pumping (transferring data on both the rising and
falling edges of the clock signal) to double data bus bandwidth without a
corresponding increase in clock frequency. One advantage of keeping the clock
frequency down is that it reduces the signal integrity requirements on the
circuit board connecting the memory to the controller.

DDR4 is the successor to DDR3; DDR5 is the most recent DDR specification.

DDR3 SDRAM gives a transfer rate of:
(memory clock rate) x 4 (for bus clock multiplier) x 2 (for data rate)
 x 64 (number of bits transferred) / 8 (number of bits/byte).
Thus with a memory clock frequency of 100 MHz, DDR3 SDRAM gives a
maximum transfer rate of 6400 MB/s.
The DDR3 standard permits DRAM chip capacities of up to 8 gibibits, and up to
4 ranks of 64 bits each for a total maximum of 16 GiB per DDR3 DIMM.
1 gibibit = 2^30 bits.
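
The transfer rate formula above can be written as a small helper. This is a
minimal sketch; the function name ddr3_peak_mb_per_s and its parameters are
illustrative and do not come from any DDR3 specification.

```c
#include <stdint.h>

/* Peak DDR3 transfer rate in MB/s, following the formula in the notes:
 * memory clock (MHz) x 4 (bus clock multiplier) x 2 (data rate)
 * x bus width (bits) / 8 (bits per byte). */
uint64_t ddr3_peak_mb_per_s(uint64_t mem_clk_mhz, uint64_t bus_bits)
{
    const uint64_t bus_clk_multiplier = 4; /* bus clock is 4x the array clock */
    const uint64_t data_rate = 2;          /* data moves on both clock edges */
    return mem_clk_mhz * bus_clk_multiplier * data_rate * bus_bits / 8;
}
```

Calling ddr3_peak_mb_per_s(100, 64) reproduces the 6400 MB/s figure quoted
for a 100 MHz memory clock and a 64-bit bus.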

GDDR3 is a type of DDR SDRAM specialized for graphics processing units
offering less access latency and greater device bandwidths.

A DDR SDRAM with a certain clock frequency achieves twice the bandwidth
of a SDR SDRAM running at the same clock frequency due to double pumping.
DDR2 memory modules are at least twice as fast as DDR memory modules.
DDR3 transfers data at twice the rate of DDR2, eight times the speed
of its internal memory arrays.
DDR4 supports a higher module density and lower voltage requirements coupled
with higher data rate transfer speeds.

The different DDR types have distinct electrical characteristics and are
not compatible.
In addition to bandwidth designations and capacity variants, modules may
support additional characteristics such as:
ECC - Error Correction Code support to detect and correct errors.
Registered or buffered to improve signal integrity at the cost of latency.
Load Reduced (LRDIMM), which buffers both control and data lines.
Low Voltage or Low Power.


                               PCI Express

Peripheral Component Interconnect Express (PCIe) - High speed serial
replacement for PCI.
Topology based on point-to-point serial links for mother board interconnect and
an expansion card interface.
SW compatible to PCI, supports legacy address space (registers) and features.
Per lane, per direction: Gen1 1.25 GHz (2.5 GT/s) -> 250 MB/s,
2007 Gen2 2.5 GHz (5 GT/s) -> 500 MB/s, 2010 Gen3 4 GHz (8 GT/s) -> ~1 GB/s
Dynamic speed/width changes. Completion timeout control.
Duplicate TLPs, Lane Reversal.  Payload is usually <256 bytes.
Point-to-point serial links between PCIe ports.  Lane is a Rx/Tx link pair.
Full duplex, no arbitration.
Routed on main board with a crossbar switch to allow simultaneous transfers.
Channel grouping is multiple lanes bonded to a device for higher bandwidth.
Number of lanes is negotiated during power up or explicitly during operation.
Use highest number supported by both devices.
Data is in 8-bit byte format.  Lane count can be 1, 2, 4, 8, 16, and 32.
Serial bus avoids problem of data bits arriving with timing skews.
Layered protocol - Transaction layer, Data link layer - includes MAC,
Physical layer - Logical and electrical
PCI requests - 3 types of address spaces Cfg/IO/Mem Rd/Wr,
Interrupts - INTx Legacy Interrupts INTA[BCD]#,
 Message Signalled Interrupts MSI MSI-X
PCIe PHY - Electrical and Logical, Logical -> Mac sublayer and PCS
PMA (Physical Media Attachment) includes SerDes and analog circuitry.
Electrical - Lane has 2 unidirectional LVDS or PCML differential pairs (4 wires)
A link is a 1 or more lane connection between two devices.
Data bytes are interleaved across lanes -
Data Striping, requires deskew (sync bytes)
Clocking is embedded in signal.  8b/10b encoding causes 20% overhead.
Gen3 is 128b/130b encoding for lower overhead.
Data Link Layer implements Flow Control, the sequencing of
Transaction Layer Packets (TLPs).
32-bit L(ink)CRC for data protection.
ACK if CRC is good, else NAK.
TLPs that time out waiting for an ACK, or for which a NAK is received, are
retried.
ACK and NAK are communicated via low-level packets known as DLLP.
TLPs in a TC are ordered.
Phy Layer is responsible for Link Training and initialization.
Link Power States L0 (timeout) L1 L2 L3 (off) - some are optional.
Posted requests do not require completion.
Data Link Layer Packets also communicate flow control information between
device transaction layers and for power management functions.
Transaction layer implements split transactions, request and response separated
by time, to allow the link to continue carrying other traffic.
Credit based flow control - Device issues receive buffer credits.  Sender can
only send where there are credits.  The receiver restores credits when done.
Root Complex -
Switch -
Bridge -
End Point -
Validation Methodology -
Functional Testing -
 Reset Test - Clean Data paths and proper start-up initialization.
 Packet Transmission - Different packets can be transmitted across links.
 Error Recovery - Detect and report errors.
 Power Management - Proper negotiation through power save states.
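
The encoding overheads noted in this section (8b/10b costs 20%, 128b/130b
much less) can be turned into per-lane payload bandwidth. This is a sketch
that assumes the commonly quoted raw rates of 2.5, 5 and 8 GT/s for Gen1 to
Gen3; the function name pcie_lane_mb_per_s is illustrative only.

```c
/* Effective payload bandwidth of one PCIe lane in MB/s, one direction.
 * rate_mt_s                 : raw signalling rate in megatransfers/second
 * payload_bits / total_bits : line code ratio, 8/10 or 128/130 */
double pcie_lane_mb_per_s(double rate_mt_s, int payload_bits, int total_bits)
{
    /* Scale the raw bit rate by the line code, then convert bits to bytes. */
    return rate_mt_s * payload_bits / total_bits / 8.0;
}
```

With these assumptions Gen1 (2500, 8, 10) gives 250 MB/s, Gen2 (5000, 8, 10)
gives 500 MB/s, and Gen3 (8000, 128, 130) gives a little under 985 MB/s,
which is usually rounded to 1 GB/s.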

Intel RAS White Paper about PCIe RAS.
Transaction layer has End-to-End CRC, ECRC.
Data Link Layer has L(ink)CRC which can be regenerated by switch or bridge,
 Pkt Sequence Protection, and Pkt Error Detect/Correct.
Phy has reliable 8b/10b Framing.
Data Link Layer is responsible for protocol error detection and correction.
TxDLL automatically retries TLPs.
Uppermost Transaction Layer is responsible for protocol error detection and

                  |<------- Transaction Layer ---->|
           |<-------------- Data Link Layer ---------------->|
:FramingEnd : LCRC : ECRC : Data 0-1024 DW : Header : Seq Num : STP Framing:
|<-------------------------- Physical Layer ------------------------------>|

Embedded Computing Article by Akber Kazmi, PLX Technology.
Quality of service support enables traffic prioritization.
 Traffic Class - Device specific, TC mapped to VC.
 Virtual Channel - Usually only one, sometimes two combine high/low priority.
 Mapping TCs on VCs - To interact with devices that do not support VCs.
 VC arbitration schemes - Table? Round-Robin, Hi/Lo.

Embedded Computing Article Solutions for PCI Express design verification.
Compliance does not guarantee interoperability.
Commercial VIP such as Denali - BFM, Assertions, Compliance suite.
 Test Sequences and coverage metrics,
   Debug Solutions - Transaction Level Debug and analysis at all layers.
   EDA integration and portability.

                             Ethernet 802.3

Ethernet: L1 - Physical Layer, L2 - Data Link Layer, [ L3 - Network (IP) ]
Other:    L4 - Transport (TCP/UDP), L5 - Session, L6 - Presentation,
       L7 - Application (HTTP/SNMP/Socket/PROFINET)

Pre - SOF - MAC DA - MAC SA - [802.1Q tag] - Type/Len - Payload - CRC - IfGap

Preamble: 7 { 10101010 } 0xaa
Start Of Frame: 10101011 0xab
Type/Len: 8100 Vlan, 8801 PSE, 8847 8848 MPLS
Payload: 46-1500 byte range, jumbo frames up to 9000 bytes
CRC: 4 bytes over the frame from MAC DA through Payload (not Preamble/SOF)
Interframe Gap: Minimum 12 octets
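
The 4-byte frame CRC can be sketched as a bitwise CRC-32. This assumes the
usual Ethernet parameters (reflected polynomial 0xEDB88320, initial value and
final XOR of 0xFFFFFFFF) and that buf holds the bytes from MAC DA through the
payload. The function name eth_crc32 is illustrative, and the sketch only
computes the value, not the on-wire bit ordering of the FCS field.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 over len bytes of buf, least significant bit first. */
uint32_t eth_crc32(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;            /* initial value */
    for (size_t i = 0; i < len; ++i) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; ++bit)  /* one shift per message bit */
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    }
    return crc ^ 0xFFFFFFFFu;              /* final XOR */
}
```

A table-driven version is normally used in software; the bitwise form is
closer to how the check is reasoned about in a hardware testbench.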

Repeaters simply repeat the signal.
Bridges filter the traffic, only allowing packets with local destinations.
Routers create separate logical networks and do not pass broadcasts.
Switches have a dedicated segment for each station to allow simultaneous
transfers.
Full duplex means receive and transmit at the same time.
There are no collisions in a switched network.

                             Wireless 802.11


                             Cache Coherency

Memory and Cache Layers.
L1 is the first level cache for each individual processor.
L2 is the second layer cache and shared across the system bus.
L3 is the main Memory.

                                   /  \
                                  / Reg\
SRAM - Fast Expensive Lots of Power/  L1  \     Hit Rate: #Times L1
                                /   L2   \    Hit Time: Time to access L1
DRAM - Slow Cheap Low Power      /   Main   \   Miss Rate: 1 - Hit Rate
Media - Disk, Tape, CD/DVD, ... /  Secondary \  Miss Penalty:Time to replace

A Coherence protocol specifies how caches communicate with processors and
each other so that processors will have a predictable view of memory.
Caches that always provide this predictable view of memory are said to
be coherent. (Steven Farago)

Cache Rule : The miss rate of a direct-mapped cache of size X is about the
same as a 2-way-set-associative cache of size X/2. (Hennessy & Patterson)
Programs favor a portion of their address space at any instant of time. (H&P)
Temporal locality - If an item is referenced, it will tend to be referenced
again soon.
Spatial locality - If an item is referenced, nearby items will tend to be
referenced soon.
Every pair of levels in the memory hierarchy can be thought of as having an
upper and lower level.  Within each level the unit of information that is
present or not is called a block or line.
A successful access at an upper level is a hit, otherwise it is a miss.
Average memory-access time: Hit time + Miss rate * Miss penalty
The Miss rate is the more important metric.
The Miss penalty is further divided into two components:
access time or latency is the time to access the first word of a miss
and is related to the latency of the lower memory.
transfer time is the time it takes to transfer the remainder of the block
and is related to the bandwidth between the lower and upper memories.
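
The average memory-access time formula and the split of the miss penalty can
be written out directly. This is a minimal sketch; the function names and the
choice of clock cycles as the unit are assumptions for illustration.

```c
/* Miss penalty = access time (latency to the first word)
 *              + transfer time (rest of the block). */
double miss_penalty(double access_time, double transfer_time)
{
    return access_time + transfer_time;
}

/* Average memory-access time = Hit time + Miss rate * Miss penalty. */
double avg_access_time(double hit_time, double miss_rate, double penalty)
{
    return hit_time + miss_rate * penalty;
}
```

For example, a 1-cycle hit time, a 5% miss rate and a 20-cycle miss penalty
give an average access time of about 2 cycles.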

The upper portion of the memory address is the block-frame address which
is used to identify the cache block.  The lower portion is the block-offset
which is used to identify an item within the block.

Caches are sometimes divided into instruction-only and data-only caches.
Using separate ports increases the bandwidth between the cache and the CPU.
Caches that can contain either are known as unified or mixed caches.

Classifying Memory Hierarchies.
Block placement - Where a block can be placed in the upper level.
Block identification: How a block is found to be in the upper level.
Block replacement: Which block should be replaced on a miss.
Write Strategy: What happens on a write.

A cache represents the level of the memory hierarchy between the CPU and
main memory.

Placing the Cache Block.
Direct Mapped: Each block has only one place it can appear in the cache.
Fully associative: A block can be placed anywhere in the cache.
Set associative: A block can be placed in a restricted set of places in the
 cache.  A set is a group of two or more blocks in the cache.
 A block is first mapped onto a set and then placed anywhere within the set
 n-way set associative indicates that there are n blocks in a set.
Finding the Cache Block.
A valid bit in a tag is used to indicate whether a cache block is valid.
In a direct mapped cache there is only one place to look.
In a fully associative cache all blocks must be searched.
 As the goal is speed, all tags are searched in parallel.
In a set-associative implementation the set to be searched is selected by
the index field in the address.  All blocks in the set are then searched.
The offset is unnecessary in the comparison because all offsets match.
Once the set is identified, only the tag needs to be checked.
Replacing the Cache Block.
A pseudorandomized method, reproducible for hardware debugging, can be
used to select the replacement block and spread the allocation uniformly.
A Least-recently used (LRU) method can also be used to replace blocks which
are less likely to be reused.  This requires a more complex implementation.
Writing the Cache Block.
In a Write Through implementation the information is written to both the
Cache block and the next lower-level memory.
In a Write Back implementation the information is written only to the cache
block.  The modified block is written to main memory when it is replaced.
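
The LRU replacement described above can be sketched for a single set using
one age counter per way. The 4-way size and all names here are assumptions
for illustration; hardware usually uses a cheaper approximation.

```c
#include <string.h>

#define WAYS 4 /* hypothetical associativity */

typedef struct {
    int      valid[WAYS];
    unsigned tag[WAYS];
    unsigned age[WAYS]; /* accesses since this way was last used */
} lru_set_t;

void lru_init(lru_set_t *s) { memset(s, 0, sizeof *s); }

/* Age every way, then mark one way as the most recently used. */
static void lru_touch(lru_set_t *s, int way)
{
    for (int w = 0; w < WAYS; ++w)
        s->age[w]++;
    s->age[way] = 0;
}

/* Look up a tag in the set.  Returns 1 on a hit and 0 on a miss.  On a miss
 * the tag replaces the first invalid way, or else the oldest (LRU) way. */
int lru_access(lru_set_t *s, unsigned tag)
{
    int victim = 0;
    for (int w = 0; w < WAYS; ++w)
        if (s->valid[w] && s->tag[w] == tag) {
            lru_touch(s, w);
            return 1;
        }
    for (int w = 0; w < WAYS; ++w) {
        if (!s->valid[w]) { victim = w; break; }
        if (s->age[w] > s->age[victim]) victim = w;
    }
    s->valid[victim] = 1;
    s->tag[victim] = tag;
    lru_touch(s, victim);
    return 0;
}
```

Accessing tags 1, 2, 3, 4, 1, 5 in order leaves tag 2 as the block that is
evicted by tag 5, because tag 1 was refreshed by its second access.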

3 portions of an address in a n-way set-associative or direct-mapped cache.
           m-k-n bits k-bits   n-bits
|| State || Tag | Index | Block-Offset ||

Index-width (k) = log2(number of sets)
                = log2(cache_size / (block_size * number_of_ways))
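
The three address fields can be extracted as follows. This is a sketch that
assumes power-of-two sizes and takes the index width as log2 of the number of
sets; the type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t tag;
    uint32_t index;
    uint32_t offset;
} cache_addr_t;

/* log2 of a power of two, i.e. the number of bits needed to select n items. */
unsigned log2_u32(uint32_t n)
{
    unsigned bits = 0;
    while (n > 1) {
        n >>= 1;
        ++bits;
    }
    return bits;
}

/* Split an address into tag, index and block-offset fields. */
cache_addr_t split_address(uint32_t addr, uint32_t cache_bytes,
                           uint32_t block_bytes, uint32_t ways)
{
    uint32_t sets = cache_bytes / (block_bytes * ways);
    unsigned n = log2_u32(block_bytes); /* block-offset bits */
    unsigned k = log2_u32(sets);        /* index bits */
    cache_addr_t f;
    f.offset = addr & (block_bytes - 1);
    f.index  = (addr >> n) & (sets - 1);
    f.tag    = addr >> (n + k);
    return f;
}
```

For a 32 KiB, 4-way cache with 64-byte blocks there are 128 sets, so n = 6
and k = 7, and address 0x12345678 splits into tag 0x91A2, index 0x59 and
offset 0x38.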

The state consists of status bits: Invalid + Modified + Shared + Pending

The Invalid bit is reset when the data is valid for reading.
The Modified bit indicates that the data is dirty and needs to be flushed,
i.e. written out to the next lower level.
The Shared bit indicates that the data is also valid in another cache.
Writes to this data require that the other caches invalidate it unless
it is broadcast.  Valid & !Shared is the Exclusive state.
The Pending bit indicates that there is an active flush (write to memory).
In Multi-processor systems corresponding reads have to be retried.
Combinations of these define additional transient states.
Variations are used for Read-only or Read-Modify-Write accesses.

MESI Protocol: Modified, Exclusive, Shared, Invalid (Steven Farago)
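
A simplified textbook view of the MESI state transitions is sketched below.
The event names and the others_have_copy argument are assumptions made for
illustration; bus transactions, write-backs from the Modified state and
transient states are deliberately left out.

```c
typedef enum { MESI_I, MESI_S, MESI_E, MESI_M } mesi_t;

typedef enum {
    EV_LOCAL_READ,  /* this processor reads the block */
    EV_LOCAL_WRITE, /* this processor writes the block */
    EV_SNOOP_READ,  /* another cache reads the block */
    EV_SNOOP_WRITE  /* another cache writes or invalidates the block */
} mesi_event_t;

/* Next state of one cache line.  others_have_copy only matters when an
 * Invalid line is filled by a local read. */
mesi_t mesi_next(mesi_t state, mesi_event_t ev, int others_have_copy)
{
    switch (ev) {
    case EV_LOCAL_READ:
        if (state == MESI_I)
            return others_have_copy ? MESI_S : MESI_E;
        return state;          /* S, E and M all hit */
    case EV_LOCAL_WRITE:
        return MESI_M;         /* other copies must be invalidated first */
    case EV_SNOOP_READ:
        return (state == MESI_I) ? MESI_I : MESI_S; /* M supplies data first */
    case EV_SNOOP_WRITE:
        return MESI_I;
    }
    return state;
}
```

A reference model like this is one way to predict the expected state in a
coherency checker before comparing against the design.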

In the Berkeley Ownership protocol a block may be in one of four states:
Invalid, UnOwned, OwnPrivate, and OwnShared.
Only one cache may own a block but many caches can have shared copies.
Wood, Gibson, Katz.

An additional Locked special purpose state can be used for Read-Modify-Write.

Sources of Cache Misses.
Compulsory: The first access to a block is not in the cache and fetched.
Capacity: The cache cannot contain all the required blocks and capacity
misses occur due to blocks being discarded and later retrieved.
Conflict: If the block-placement strategy is set associative or direct mapped,
conflict misses also occur because a block can be discarded and later
retrieved if too many blocks map to its set.  These are also called collision
misses.

German Cache Coherency Protocol (by Steve German) and Murphi Models.
The German protocol is a simple cache coherence protocol devised by Steven
German in 2000 as a challenge problem to the formal verification community.
Since then it has become a common example in papers on parameterized
The Murphi tool is an enumerative (explicit state) model checker with its
own language that uses a guard->action syntax executed in an infinite loop.
Murphi has a formal verifier that is based on explicit state enumeration
which can be performed as a depth-first or breadth-first search of the state

The by-now standard method in industry for debugging a cache coherence
protocol is to build a formal model of the protocol at the algorithmic level
and then do an exhaustive reachability analysis of the model for a small
configuration size (typically 3 or 4 nodes) using either explicit-state or
symbolic model checking.  Chou, Mannava, Park, Intel.

Multiprocessor cache coherency requires additional logic.
A directory based implementation is where the information about each block of
physical memory is kept in only one location.
Systems with a shared-memory bus can use snooping where every cache that has
a copy of the data from a block of physical memory also has a copy of the
information about it.  All controllers monitor or snoop bus activity and
react accordingly.
There are two types of snooping protocols:
Write invalidate - The writing processor causes all other copies in other
caches to be invalidated so that it is free to change its local copy.
Write broadcast - The writing processor broadcasts the new data over the bus
so other processors can update their copies similar to a write-through.
Write-broadcast protocols usually allow blocks to be tagged as
shared (broadcast) or private (local).

Most cache-based multiprocessors use write back caches because it reduces
bus traffic and thereby allows more processors on a single bus.
Other variations of these approaches are also used.

Other optimizations based on program behaviour:

Cache layers, generally L1 is local and L2 is shared.

Pipelined writes make it easier to flush data.

Sub-block valid tag bits.

Instruction-Prefetch Buffers take advantage of the normal sequential
execution of instructions by buffering two or more instructions.
A wider memory path helps make this easier to implement.
Prefetch-buffers are also useful to help align variable-sized instructions.
They have the disadvantage of increasing memory traffic by fetching
instructions that are not used such as during branches.
Register Windows consist of register banks that are switched during calls.
This limits the call depths unless a circular buffer is used.
The banks can be overlapped to create a common area to pass parameters.

A significant amount of the Cache Coherency material is quoted directly from
John L. Hennessy & David A. Patterson, [1990]
Computer Architecture: A Quantitative Approach, Morgan Kaufmann Publishers

Verifying the cache Design.

Stimulus: Constrained-random stimulus is required to mimic all possible
scenarios. The ability to define high-level scenarios and also to coordinate
between the multiple agents is necessary for an ACE-based SoC. To achieve
this, we discussed the following tools: configuration properties to reduce the
verification space, constraints to generate a correct data item, cache models
to provide the cache state to the constraints, and a virtual sequence to
coordinate the different masters. (Mirit Fromovich, Tamar Meshulum)

Checks: Each interface requires complete protocol compliance checking to
ensure protocol compliance at the interface level.  In addition, we have seen that
system coherency checks at the interconnect level are necessary, in order
to verify the system coherency. (Mirit Fromovich, Tamar Meshulum)

Coverage: Reduce the huge ACE coverage space by developing and using a
coverage map that matches your design and the accurate specification of all
legal cores. (Mirit Fromovich, Tamar Meshulum)

Requires - (Steven Farago)
Processor Model:  Non-deterministically issues Loads and Stores to cache
Cache Model:  Two parts - initially combined into a single process
 Main Cache - Services processor requests.
 Snooper - Responds to messages from memory controller.
Memory Controller - Services requests from each cache and maintains
coherency among all.

A Cache model is required to determine proper hit/miss/write-back behaviour.
A virtual sequence coordinates test activity across multiple masters.
A virtual sequence connects to sequencers in all the agents.
Use an interconnect monitor to ensure that snooping and speculative fetches
work correctly.

Simple I$ Strategy:
Use randc to fill cache, all misses.
Repeat, all hits.
Use {[!]inside []} constraints to favour index reuse to force tag changes.
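
The expected hit/miss counts for this strategy can be predicted with a small
direct-mapped cache model. This is a sketch with hypothetical sizes and
names, and it sweeps addresses sequentially rather than in randc order, which
does not change the counts.

```c
#include <stdint.h>
#include <string.h>

#define NUM_LINES   8u  /* hypothetical number of cache lines */
#define BLOCK_BYTES 16u /* hypothetical block size in bytes */

typedef struct {
    uint8_t  valid[NUM_LINES];
    uint32_t tag[NUM_LINES];
    unsigned hits;
    unsigned misses;
} dm_cache_t;

void dm_init(dm_cache_t *c) { memset(c, 0, sizeof *c); }

/* Access one address.  Returns 1 on a hit; on a miss the line is refilled. */
int dm_access(dm_cache_t *c, uint32_t addr)
{
    uint32_t block = addr / BLOCK_BYTES;
    uint32_t index = block % NUM_LINES;
    uint32_t tag   = block / NUM_LINES;
    if (c->valid[index] && c->tag[index] == tag) {
        c->hits++;
        return 1;
    }
    c->valid[index] = 1;
    c->tag[index] = tag;
    c->misses++;
    return 0;
}

/* Touch every line once, starting at a block-aligned base address. */
void dm_sweep(dm_cache_t *c, uint32_t base)
{
    for (uint32_t i = 0; i < NUM_LINES; ++i)
        (void)dm_access(c, base + i * BLOCK_BYTES);
}
```

The first dm_sweep from base 0 misses on every line, a second sweep from the
same base hits on every line, and a sweep from base NUM_LINES * BLOCK_BYTES
reuses each index with a new tag and so misses again.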

                             Simple DMA Des

           ----------       Cmd: {preamble,type,parity,pad}
          |   CPU    |    WrReq: {cmd,addr,data,[addr,data,..],EOP,crc}
           ----------     RdReq: {cmd,addr,[end-addr],crc}
              /|\           Ack: {cmd,[data,[data,..]],crc}
               |           Read:  Addr-single read, EA+Addr-burst
     ----------------------            [reset ]
    |                      |         ---->|
   \|/                    \|/       |     |
-----------             ---------    |    \|/
|         |D|           |Ctl: Park|   |  [delay ]
|         |i|           |St Ack TO|   |     |
| Shadow  |r|           | Wr TO   |   |    \|/
|         |t|<--        | Rd TO   |   |<-[! Park]
| Memory  |y|   |       |         |   |     |
|         | |   |       | Rd Addr |   |    \|/
|         | |   |       |RdBurstEA|   |  [!Dirty]-->[Tx Wr]-->[  Tx  ]
-----------    |        ---------    |     |                    |
   /|\        |           /|\       |    \|/                  \|/
    |         |            |        |  [!SglRd]-->[TxRd1]   [TO Ctr]
    |        \|/           |        |     |                    |
    |     ----------       |        |     |                   \|/
    |    |  Master  |      |        |  [!BstRd]-->[TxRdB]   [Tx Pkt]
     --->|          |<-----         |     |                    |<------
         |   FSM    |               |    \|/                  \|/      |
          ----------                |<----------------------[!RxAck]   |
              /|\                   |                          |       |
               |                    |                         \|/      |
              \|/                   |                       [!--TO ]-->
           ----------               |                          |
          |  Slave   |              |                         \|/
          |   FSM    |              |                       [Error ]
          |--------- |              |                          |
          |  Memory  |              |                         \|/
           ----------                <-------------------------

DMA Test Strategy.

test0 / init - prelude to all tests:
Set Park
Write random data to memory.
Release Park
Poll for Ack
Read-back and check Shadow memory.

Verify read/write access to all registers.

Randomly sequence through these events:
 Randomly select - Read/Write, Random length
 Select Random length
 For read length 1, randomly write End Address: weight 5%
 If write -  Select random data, Park: weight 5%
 Wait for Ack

Inject random addr/data errors.
Throttle read data to verify timeout.
Throttle write response to verify timeout.


                                MP Notes  

AND 11 => 1 0x x0 => 0 else x         NAND => AND . NOT
OR  00 => 0 1x x1 => 1 else x         NOR  =>  OR . NOT
NOT  0 => 1     1 => 0 else x
XOR 00 11 => 0 01 10 => 1 else x      XNOR => XOR . NOT

Convert bcd string to integer and reverse it:

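#include <stdio.h>

int main(int argc,char *argv[])
 { // equivalent to atoi
   if(argc < 2)
     {
     fprintf(stderr,"%s Enter numerical value.\n",argv[0]);
     return 1;
     }
   char *d = argv[1];
   int l=0,x=0;

   for(char* c=d; *c++; ++l); // strlen

   for(int i=0;i<l;i++)
     { // accumulate decimal digits, most significant first
     x *= 10;
     x+= d[i]-'0';
     }
   int val = x; // keep the value, x is consumed below
   // now reverse the value
   int rev = 0;
   while(x > 0)
     { // peel off the least significant digit each pass
     rev *= 10;
     rev += x%10;
     x /= 10;
     }
   printf("%s String '%s' len %0d value %0d 0x%x, rev %d\n",
          argv[0],d,l,val,(unsigned)val,rev);
   return 0;
 } // main()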

Define a constraint to limit the number of bits set to four:

constraint four_ones { $countones(val) == 4; }

Alternatively, pick four distinct bit positions (assuming a 32-bit val):

rand bit [4:0] b0, b1, b2, b3;
constraint bp {b0 != b1; b0 != b2; b0 != b3; b1 != b2; b1 != b3; b2 != b3;}
val = (1<<b0) | (1<<b1) | (1<<b2) | (1<<b3);
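
The reason the bit positions must all differ can be checked with a small
model. This is a sketch with illustrative function names: count_ones mirrors
what $countones reports, and bits_from_positions mirrors the OR of shifted
ones.

```c
#include <stdint.h>

/* Count the bits set in v by clearing the lowest set bit each pass. */
unsigned count_ones(uint32_t v)
{
    unsigned n = 0;
    while (v) {
        v &= v - 1;
        ++n;
    }
    return n;
}

/* OR together one bit per listed position (positions are taken modulo 32). */
uint32_t bits_from_positions(const unsigned *pos, unsigned count)
{
    uint32_t v = 0;
    for (unsigned i = 0; i < count; ++i)
        v |= (uint32_t)1 << (pos[i] & 31u);
    return v;
}
```

Positions {0, 5, 9, 31} give four bits set, while {1, 1, 2, 3} give only
three, which is why every pair of positions has to be constrained to differ.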

How do $display, $monitor, $strobe, and $write differ?
$display prints a formatted string and a newline to stdout.
$monitor prints the string whenever one of the arguments changes.
 Equivalent to: always @(*) $display()
$strobe displays the string at the end of the current time step.
$write displays the string without a newline.

Implement a AND/NAND/OR/NOR/XOR/XNOR/NOT/Latch using a Mux.
AND: A ? B  : 0  NAND: A ? B* : 1   OR:  A ? 1 : B  NOR: A ? 0 : B*
XOR: A ? B* : B  XNOR: A ? B  : B*  NOT: A ? 0 : 1  Latch: Clk ? D : Q

Implement a buffer and an inverter using an XOR:
Buffer: A ^ 0  Inverter: A ^ 1
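
The mux identities can be checked exhaustively against the ordinary logic
operators. This is a sketch in which mux(sel, a, b) stands for sel ? a : b
and all values are 0 or 1; the latch takes its previous output q as the
feedback input, which is an assumption of this sketch.

```c
/* 2-to-1 mux: returns a when sel is 1, otherwise b. */
int mux(int sel, int a, int b) { return sel ? a : b; }

int and_mux(int a, int b)  { return mux(a, b, 0); }
int nand_mux(int a, int b) { return mux(a, !b, 1); }
int or_mux(int a, int b)   { return mux(a, 1, b); }
int nor_mux(int a, int b)  { return mux(a, 0, !b); }
int xor_mux(int a, int b)  { return mux(a, !b, b); }
int xnor_mux(int a, int b) { return mux(a, b, !b); }
int not_mux(int a)         { return mux(a, 0, 1); }

/* Transparent latch: passes d while clk is 1, holds q while clk is 0. */
int latch_mux(int clk, int d, int q) { return mux(clk, d, q); }

/* Returns 1 if every mux-based gate matches its truth table. */
int mux_gates_ok(void)
{
    for (int a = 0; a <= 1; ++a)
        for (int b = 0; b <= 1; ++b) {
            if (and_mux(a, b)  != (a & b))  return 0;
            if (nand_mux(a, b) != !(a & b)) return 0;
            if (or_mux(a, b)   != (a | b))  return 0;
            if (nor_mux(a, b)  != !(a | b)) return 0;
            if (xor_mux(a, b)  != (a ^ b))  return 0;
            if (xnor_mux(a, b) != !(a ^ b)) return 0;
        }
    return not_mux(0) == 1 && not_mux(1) == 0;
}
```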

Measure 4-litres from a tap with a 3-litres and 5-litres container pair.
Fill 3L, pour into 5L, repeat, leaving 1L.  Pour out 5L, add 1L and 3L.
 Uses  9L and discards 5L.
Fill 5L, pour into 3L and pour out, transfer remaining 2L into 3L.
 Fill 5L and pour 1L into 3L leaving 4L.
 Uses 10L and discards 3L.
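
Both procedures can be replayed step by step to confirm the water used and
discarded. This is a sketch; the jug and tally types and the function names
are made up for illustration.

```c
typedef struct { int cap; int level; } jug_t;
typedef struct { int used; int discarded; } tally_t;

/* Top up a jug from the tap and count the water drawn. */
void jug_fill(jug_t *j, tally_t *t)
{
    t->used += j->cap - j->level;
    j->level = j->cap;
}

/* Pour a jug away and count the water discarded. */
void jug_empty(jug_t *j, tally_t *t)
{
    t->discarded += j->level;
    j->level = 0;
}

/* Pour from one jug into another until the source is empty or the target full. */
void jug_pour(jug_t *from, jug_t *to)
{
    int room = to->cap - to->level;
    int amount = from->level < room ? from->level : room;
    from->level -= amount;
    to->level += amount;
}

/* First method: start by filling the 3L jug.  Returns the 5L jug level. */
int measure_3l_first(tally_t *t)
{
    jug_t j3 = {3, 0}, j5 = {5, 0};
    jug_fill(&j3, t); jug_pour(&j3, &j5); /* 5L holds 3 */
    jug_fill(&j3, t); jug_pour(&j3, &j5); /* 5L full, 3L holds 1 */
    jug_empty(&j5, t);                    /* pour out the 5L */
    jug_pour(&j3, &j5);                   /* 5L holds 1 */
    jug_fill(&j3, t); jug_pour(&j3, &j5); /* 5L holds 4 */
    return j5.level;
}

/* Second method: start by filling the 5L jug.  Returns the 5L jug level. */
int measure_5l_first(tally_t *t)
{
    jug_t j3 = {3, 0}, j5 = {5, 0};
    jug_fill(&j5, t); jug_pour(&j5, &j3); /* 5L holds 2, 3L full */
    jug_empty(&j3, t);                    /* pour out the 3L */
    jug_pour(&j5, &j3);                   /* 3L holds 2 */
    jug_fill(&j5, t); jug_pour(&j5, &j3); /* 1L tops up the 3L, 5L holds 4 */
    return j5.level;
}
```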

You have 3-full and 3-empty glasses: F F F E E E
How do you get every other glass empty by moving only one glass?
Pour from the second full glass into the second empty glass: F e F E f E

                                NP Notes  
// ==> Practice Question 1:
// You have a 4-to-1 packet mux.
// The 4 inputs coming in with their own queues.
// The scheduler selects one of the four to go out.
// Describe at a high level how would you verify this design,
// i.e. major tb components, corner cases, assertions, etc.
// Write the scoreboard for this module.

// ==> Practice Question 2:
// Describe a technical challenge you encountered and how you overcame it.

// ==> Practice Question 3:
// Describe your chip level tb in a logical fashion (within a few minutes).
// Elaborate on some of the challenges.

// ==> Practice Question 4:
// What steps can you take to reduce the turn-around time on running a sanity
// test if the minimum simulation time after an rtl change is too long?

Develop sub-block test benches which run more efficiently.
Replace stable blocks with behavioural models.
Profile the code to identify potential bottlenecks.

// ==> Practice Question 5:
// Write the Data Class for an old school telephone with four input options:
//  a) 0 - for the operator
//  b) 911 - for emergency
//  c) 7 digit code for local
//  d) 10 digit code for international, that always starts with 1.

// under construction!
class ClassicPhone;
//localparam max_digits = 11;                // 1+3+7
typedef enum {OPERATOR=1, EMERGENCY=3, LOCAL=7, LONG_DISTANCE=11} num_type_t;
rand num_type_t num_type;
rand byte digit[];
constraint legal {
 digit.size() == num_type;                   // enum value is the digit count
 foreach (digit[i]) digit[i] inside {[0:9]};
 num_type == OPERATOR -> digit[0]==0;
 num_type == EMERGENCY -> (digit[0]==9 && digit[1]==1 && digit[2]==1);
 num_type == LOCAL -> digit[0]!=0;
 num_type == LONG_DISTANCE -> digit[0]==1;
}
endclass : ClassicPhone

module top();
ClassicPhone dial;
initial dial = new();
endmodule : top

// ==> Practice Question 6:
// You have a synchronous fifo where the input can be any number of bytes,
// and the output can request any number of bytes.
// Write a scoreboard for this.
//   An example scenario may be something like:
//     Time x  : Input drives 3 bytes
//     Time x+1: Input drives 5 bytes
//     Time x+2: Input drives 2 bytes
//     Time x+3: Output requests 10 bytes
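
One way to model the scoreboard for this question is a byte queue: every
input byte is recorded as expected data, and every output byte is compared
against the oldest expected byte. The sketch below shows that idea in C
rather than SystemVerilog; the depth and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SB_DEPTH 1024u /* hypothetical maximum number of bytes in flight */

typedef struct {
    uint8_t  expected[SB_DEPTH];
    size_t   head;   /* index of the oldest expected byte */
    size_t   count;  /* number of expected bytes pending */
    unsigned errors; /* total mismatches and underflows seen */
} scoreboard_t;

void sb_init(scoreboard_t *sb)
{
    sb->head = 0;
    sb->count = 0;
    sb->errors = 0;
}

/* Record bytes driven into the fifo.  Returns 0 if the model would overflow. */
int sb_push(scoreboard_t *sb, const uint8_t *data, size_t n)
{
    if (n > SB_DEPTH - sb->count)
        return 0;
    for (size_t i = 0; i < n; ++i)
        sb->expected[(sb->head + sb->count + i) % SB_DEPTH] = data[i];
    sb->count += n;
    return 1;
}

/* Compare bytes read from the fifo with the oldest expected bytes.
 * Returns the number of errors found in this call. */
unsigned sb_check(scoreboard_t *sb, const uint8_t *data, size_t n)
{
    unsigned errs = 0;
    for (size_t i = 0; i < n; ++i) {
        if (sb->count == 0) { /* output byte with nothing expected */
            ++errs;
            continue;
        }
        if (data[i] != sb->expected[sb->head])
            ++errs;
        sb->head = (sb->head + 1) % SB_DEPTH;
        sb->count--;
    }
    sb->errors += errs;
    return errs;
}
```

In the example scenario the three input transfers of 3, 5 and 2 bytes are
pushed in turn, and the 10-byte output request is then checked in one call.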

// ==> Practice Question 7:
// How do you go about verifying a given block?
// Describe the basic verification flow, from architecture spec to gate level
// sim/coverage closure.

// ==> Practice Question 8:
// You have a block which is connected to a network on one side and a DDR
// module on the other.  The block receives network packets which provide
// either a read or write command to the DDR.  If it is a write, the data
// will be provided in that packet.  If it is a read, you would send the
// data back in a response packet.
// How would you verify this?  Write code for a data class for this.

// ==> Practice Question 9:
// Write an SV class which acts as a memory manager.  The class has two
// functions: req(int bytes) and free(addr, bytes).  Fill out the code for
// the two functions.  The free may not free the entire block, the free can
// free a subset of a previously granted

// ==> Practice Question 10:
// You have a block with a data_valid, data, and ready signal.
// Write a driver for it.
//--    forever
//--      begin
//--      data_valid <= 0;
//--      mbox.get(tr_data);
//--      while(!ready) @(posedge clk);
//--      data <= tr_data;
//--      data_valid <= 1;
//--      @(posedge clk);
//--      end

`define BUS_WIDTH 8
`define PERIOD 10
`define MAX_TR 100
//`define DEBUG

interface bus_if(input bit clk);
logic ready;
logic valid;
logic [`BUS_WIDTH-1:0] data;
int id;
modport Tx(input clk, ready, output valid, data, id);
endinterface : bus_if

class transaction_c;
logic [`BUS_WIDTH-1:0] data;
int id = 0;
function new(logic [`BUS_WIDTH-1:0] data, int id);
 this.data = data;
 this.id = id;
endfunction : new
endclass : transaction_c

class driver_c;
virtual bus_if.Tx Tx;
mailbox mbox;
function new(mailbox mbox, virtual bus_if.Tx Tx);
 this.mbox = mbox;
 this.Tx = Tx;
endfunction : new
task run();
`ifdef    DEBUG
 $display("@%0t: Running driver...",$time);
`endif // debug
 forever
 begin : DriveData
 transaction_c tr;
 Tx.valid <= 1'b0;
`ifdef    DEBUG
 $display("@%0t: Driver waiting for data...",$time);
`endif // debug
 mbox.get(tr);
`ifdef    DEBUG
 $display("@%0t: Driver waiting for ready...",$time);
`endif // debug
 while(!Tx.ready) @(posedge Tx.clk);
`ifdef    DEBUG
 $display("@%0t: Driver driving data 0x%x",$time,tr.data);
`endif // debug
 Tx.data <= tr.data;
 Tx.id <= tr.id;
 Tx.valid <= 1'b1;
 @(posedge Tx.clk);
 end   : DriveData
endtask : run
endclass : driver_c

module clock(output bit clk);
 initial
 begin : DriveClock
 forever #(`PERIOD/2) clk <= ~clk;
 end   : DriveClock
endmodule : clock

module top();
bit clk;
clock clock(clk);
bus_if bus(clk);
mailbox mbox = new(1);
driver_c driver = new(mbox,bus.Tx);
 initial
 begin : ActualTest
 int num_tr;
 num_tr = $urandom_range(1,`MAX_TR);
 $display("@%0t: Sending %0d transactions...",$time,num_tr);
 bus.ready <= 1'b0;
 fork
   driver.run(); // driver runs in the background for the whole test
 join_none
 for(int i=0; i<num_tr; ++i)
   begin : SendTransaction
   bit [`BUS_WIDTH-1:0] data;
   transaction_c tr;
   int delay;
   data = $urandom() & '1;
   $display("@%0t: [%0d] Put data 0x%x",$time,i,data);
   tr = new(data,i);
   mbox.put(tr);
   delay = $urandom_range(0,16)/4;
   bus.ready <= 1'b0;
   repeat(delay) @(posedge clk);
   bus.ready <= 1'b1;
   end   : SendTransaction
 repeat(10) @(posedge clk);
 bus.ready <= 1'b0;
 $finish;
 end   : ActualTest
final $display("@%0t: Bye",$time);
endmodule : top

// ==> Practice Question 10 variation(a):


                                   Tcl

Syntax: cmd arg1...argn, cmd can be a Tcl command or procedure.
Commands end at newline or ;   \ to escape character or continue line
# is a comment but only at the start of a cmd, continued with \
puts stdout {this is how to add a} ;# comment after a command
To group words: {no variable substitution} or "supports $var expansion"
Grouped arguments and leading " must be space delimited.
A $ followed by a space is treated as a literal $ character.
Output: puts stdout "My message"; puts stderr { my error }
Variables: set max 5; set count $max, expressions: expr $count + 1
Expressions use double float, use tcl_precision to set precision
Use quotations for string expressions: if {$ans == "yes"} (note the space)
set var (without a value) returns the value of the variable: set max
Variable names can be embedded: ab${c}de references the variable $c
To delete variables (they must be defined): unset variable1..variablen
To check if variable is defined: info exists variable (true if defined)
Using cmd output: [cmd args] like shell `cmd args` but may be nested [a [b]]
[] does not group, grouping occurs before substitution.
Procedures: proc name arglist body: proc area {l w} { return [expr {$l * $w}] }
Math Functions: asin(x) pow(a,b) rand() srand(seed) ...
Commands: error: raise an error, eval: evaluate as a command, exit:terminate
for: loop, foreach: iterate list, format: format string, gets: read line
glob: expand pattern to file names, if/elseif/else: conditional
join: concatenate with separator, package: code package, read: read chars
return: return value, scan: parse string based on format
source: evaluate commands in file, split: chop string into elements
switch: multi-way branch, time: measure execution time, while: loop
Script header: #!/usr/local/bin/tclsh #!/usr/local/bin/wish
-or- #!/bin/sh .. exec /path/to/wish "$0" ${1+"$@"}

                               Verif Plan

From "Verification Methodology Manual for System Verilog:"

Define what the design does and should not do.
Enumerate and prioritize the verification requirements.
Specify functional coverage to correlate to the verification requirements.
Correlate descriptions to the design requirements and specifications.
Clarify what is implementation specific.
Identify what functionality will not be verified.

Define verification scopes based on design partitions.
Identify model requirements at various levels of abstraction.
Describe how things will be checked.

Identify the stimulus requirements and methodology.
Support error injection to simulate all possible scenarios.
Define monitors and assertions to verify assumptions and responses.
 Only assertions should be allowed to access internal design signals.
 Clarify which assertions are implementation specific.

Response checking should be at the transaction level.
Response checking should only be as accurate as necessary.
Timing should only be checked at the interfaces.

Detailed description of the test bench architecture and supported
This should include all the required scopes including system-level.

Clarify any dependencies that may exist.

Define and use a consistent directory structure.

Describe how tests will be implemented and define the tests.
Prioritize and start with basic tests and move to complex scenarios.

What to test:
Power-up (cold) reset, (warm) reset.
Proper read/write access to registers and memory.
Supported device configurations, all register config fields and status.
Explicitly defined functionality towards meeting coverage requirements.

Other things:
Select test bench infrastructure: UVM vs SystemC
How to reproduce lab bugs
How to leverage current test cases
Create block and system level test plans

                               MPEG Video

mpeg4 takes more compute power but has better images from same data size.
codec - A codec is a set of COmpression/DECompression instructions.

             =============== =============== ===============
                 Design Verification Interview Questions
             =============== =============== ===============

                             *   Basic    *

Why use a structure instead of a class?
Structures can be packed/unpacked or declared as packed [msb:lsb]
Packed array can be waited on. (?)
Unpacked arrays stored in low word.

What type of issues are identified by gate and sdf simulations?
Clock domain crossing issues corresponding to false or multi-cycle paths.
Gate and sdf simulations can also identify issues with integrated IPs that
have separate timing constraints and analysis.
LEC (logic equivalence checking) and static timing analysis are used instead of -
 Gate simulations with zero-delay to identify synthesis problems.
 Sdf simulations with routing and cell delays to identify timing issues.

What issues are encountered while merging coverage results?
Coverage is instance based by default but can also be module based.
Furthermore design changes modify names and hierarchies.
Switches can be used to map tokens but this is not a clean process.

What are the System Verilog fork types?
fork...join (all), fork...join_any, fork...join_none.
disable thread_label - Disables the named block or task.
disable fork - Terminates all active child processes of the current process.
Note that these constructs are not conducive to structured programming.

When is the verification done and the device is ready for tapeout?
When the top-down schedule says it is, hopefully by then -
The code coverage stops converging at some point and the process
is then reviewed with the design team to define exclusions.
The verification effort achieves 100% functional coverage.
Code coverage also approaches this value after exclusions
based on reviews with the design team.

How do you test unordered transactions?
Unordered transactions need to be tagged for tracking.

Count the number of bits set in an argument in C.
int c = 0;    // Simplest size independent
for(int i=0; i < sizeof(x)*8; ++i)
 if( (x>>i) & 1) // alternative (x & (1<<i))
   ++c;

int c = 0; // More efficient, stop when done
while(x)
 { // also immune to type of x, but x must be unsigned
 if(x & 1)
   ++c;
 x >>= 1;
 }

int c; // Optimal, least passes and type independent
for(c=0; x; x &= x-1)
 ++c; // Each pass clears one bit
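
The fragments above can be wrapped into functions for testing.  This is a
minimal sketch; the names count_bits and count_bits_slow and the choice of a
64-bit unsigned argument are assumptions made for illustration.

```c
#include <stdint.h>

/* Kernighan's method: each pass clears the lowest set bit, so the loop
   runs once per set bit. */
unsigned count_bits(uint64_t x)
{
    unsigned c = 0;
    for (; x; x &= x - 1)
        ++c;
    return c;
}

/* Reference version: test every bit position. */
unsigned count_bits_slow(uint64_t x)
{
    unsigned c = 0;
    for (unsigned i = 0; i < sizeof(x) * 8; ++i)
        if ((x >> i) & 1u)
            ++c;
    return c;
}
```

The slow version is handy as a cross-check when testing the fast one.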

Create a weighted distribution equivalent to randcase.
=> SSS
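
One possible approach, sketched in C: randcase picks a branch with probability
weight/sum, which can be modelled by drawing r in [0,sum) and walking the
cumulative weights.  The function name weighted_pick and passing the draw in
as an argument (so the mapping can be tested deterministically) are
assumptions.

```c
#include <stddef.h>

/* Map a draw r in [0, sum of weights) to a branch index.
   Returns n when r is out of range (or all weights are zero). */
size_t weighted_pick(const unsigned *weights, size_t n, unsigned r)
{
    for (size_t i = 0; i < n; ++i) {
        if (r < weights[i])
            return i;          /* r falls inside branch i's slice */
        r -= weights[i];       /* skip past this slice */
    }
    return n;
}
```

A caller would typically pass something like rand() % sum as r.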

How do you avoid circular dependencies?

What is the logic for a clock divide-by-two?
What is the logic for a clock divide-by-three?
A divide-by-n clock can be implemented using a ctr that rolls over at n-1,
which is clocked at every edge:
 always @(clk) // both edges
   begin : inc_ctr
   if(ctr == n-1)
     begin : gen_pulse
     ctr= 0;
     clkout = ~clkout;
     end   : gen_pulse
   else
     ctr = ctr + 1;
   end   : inc_ctr

What is the logic for a clock multiply-by-two?
A clock multiplier is implemented by measuring the clock period
and dividing it by the rate n such as two.
This is then divided by two to get the half-period for a 50% duty cycle.
A different duty cycle can be managed with a bit more logic as well.

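What is the difference between a flip-flop and latch?
A flip-flop is edge-triggered: it samples the input only on the active
clock edge and is commonly built from two latches (master/slave).
A latch is level-sensitive: its output follows changes to the input for as
long as the enable is asserted.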

What happens when you use at the end of a list?
It returns FALSE indicating an error.

What is the difference between << >> and <<< >>>?
<< and >> are logical whereas <<< and >>> are arithmetic operators.

How do you reference a package item?
package_name::item such as chip_state::READY

What is structured programming?
Code should have a single flow path.
This excludes constructs such as -
 break (except for case), continue, last, next, and the infamous goto.

What is TDM?
Time Division Multiplexing where signals share time slices.

What are the Sonet Layers?
[ Physical ] Photonic -> [ Data Link ] Section -> Line -> Path

Design a Sonet A1A2 0xf628 SOF detector.
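
A simplified software model of such a detector, assuming the bits arrive
serially MSB first: shift each bit into a 16-bit register and compare with
0xF628 (A1=0xF6, A2=0x28).  The names sof_det_t and sof_detect are
hypothetical, and a real framer would add hunt/presync states on top.

```c
#include <stdint.h>

typedef struct { uint16_t shreg; } sof_det_t;

/* Shift in one serial bit; return 1 when the last 16 bits are 0xF628. */
int sof_detect(sof_det_t *d, int bit)
{
    d->shreg = (uint16_t)((d->shreg << 1) | (bit & 1));
    return d->shreg == 0xF628;
}
```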

What is a timescale and how should it be used?
The simulation reference delay and granularity: `timescale 1ns/1ps
 The first value is the time unit that scales #delays.
 The second value is the precision; delays are rounded to this
 granularity so finer timing differences are lost.
Timescale is generally defined in a shared header file so it is consistent
across the design.
Vendor IPs with different timescale requirements need to be managed carefully,
generally by compiling them into object libraries.
Providing the value on the command line is highly discouraged.

Is UVM polymorphic?
UVM classes contain virtual methods which are replaced by extended classes
thus the UVM library is polymorphic.

What is overloading?  How does it work in SV?
Overloading is when a function has multiple signatures with different types.
System Verilog does not support overloading but default values can be used
to some extent to achieve this type of functionality.

Design a mux in gates.
The signals go to 2-input AND gates whose output feeds an OR gate.
The selects provide the other AND input with one INVerted.
 (D0 & ~SEL) | (D1 & SEL)

Swap two values without using a temporary location.

x = x + y       -or-    x = x ^ y;
y = x - y               y = x ^ y;
x = x - y               x = x ^ y;

Write a factorial function:

long fact(long x)
 {
 if(x > 1)
   x *= fact(x-1);
 else
   x = 1;
 return x;
 } // fact

Write a Fibonacci function:

long fib(long x)
 {
 if(x > 1)
   x = fib(x-1) + fib(x-2);
 return x; // fib(0) = 0, fib(1) = 1
 } // fib

From 9 coins find the lighter one by weighing them twice.

Weigh three coins against three others.
 If they are equal, the lighter coin is among the remaining three.
 Otherwise continue with the lighter set.
 Now weigh one against one from that set.  If they are equal, choose the
 remaining one.
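
The two weighings can be checked with a small simulation.  This is a sketch
only; the helper weigh() stands in for the balance and both function names
are made up for illustration.

```c
/* Balance model: compare n coins starting at index a against n coins
   starting at index b.  Returns -1 if the left pan is lighter,
   1 if the right pan is lighter, 0 if they balance. */
static int weigh(const int *w, int a, int b, int n)
{
    int l = 0, r = 0;
    for (int i = 0; i < n; ++i) {
        l += w[a + i];
        r += w[b + i];
    }
    return (l < r) ? -1 : (l > r) ? 1 : 0;
}

/* Find the index of the single lighter coin among nine in two weighings. */
int find_light_coin(const int w[9])
{
    int s = weigh(w, 0, 3, 3);                     /* coins 0-2 vs 3-5 */
    int base = (s < 0) ? 0 : (s > 0) ? 3 : 6;      /* pick the light group */
    s = weigh(w, base, base + 1, 1);               /* one vs one */
    return (s < 0) ? base : (s > 0) ? base + 1 : base + 2;
}
```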

What is the highest number you can get by moving two bars?
>  _   _   _   _
> |_  | | | | |_|
>  _| |_| |_| |_|

_   _   _   _       _       _   _       _       _   _
|_  | | | | |_|     |_  | | | | |_| |   |_  ||| | | |_|
_| |_| |_| |_|      _| | | |_| |_| |    _| ||| |_| |_|

                             * Actual Int *
* +++++:
* *****:

How do you test CDC signals?  Why are they flopped twice?
Crossing into a faster clock domain (Tx pulse is at least 1.5xRx pulse) -
Signals are generally double-flopped when crossing into faster clock
domains to avoid metastability.
The number of flops affects the MTBF but two is generally sufficient.
Each flop adds a delay and has the potential to affect system performance.
Steps such as gray-code encoding the outgoing signals can also help
eliminate CDC related issues.

Crossing into a slower clock domain -
One way to manage signals traveling into a slower clock domain is to
use a closed loop which provides an acknowledgement handshake to ensure
that the transition is not lost.
=>  Another way...

Design and test an entry/exit circuit that counts people.
Lock some entries when almost full and correct for underflow.
=> SSS

As a lead when would you say the design is ready for tapeout?
=> Already covered.

Write an algorithm to compute the sqrt of a bit[31:0] value.
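
One common answer is the bit-by-bit (digit-by-digit) method, which needs only
shifts, adds, and compares and so maps well to hardware.  The C sketch below
is one way to write it; the name isqrt32 is an assumption and the result is
the floor of the square root.

```c
#include <stdint.h>

/* Integer square root of a 32-bit value, one result bit per iteration. */
uint32_t isqrt32(uint32_t x)
{
    uint32_t res = 0;
    uint32_t bit = UINT32_C(1) << 30;  /* highest power of four in range */

    while (bit > x)                    /* start at the largest useful bit */
        bit >>= 2;

    while (bit) {
        if (x >= res + bit) {          /* does this bit belong in the root? */
            x -= res + bit;
            res = (res >> 1) + bit;
        } else {
            res >>= 1;
        }
        bit >>= 2;
    }
    return res;
}
```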

What is the difference between: a = x ? b : c -and- if(x) a=b else a=c ?
The first is a mux and the second is priority encoded logic.
They also behave differently when x is unknown (X): ?: merges b and c bit
by bit, whereas if treats X as false and takes the else branch.

How do you pass a rgb[] dynamic array to C?
How would you write rgb vector data in a file from Verilog?
=> SSS

Print a file in reverse using Perl.
Input format: FirstName LastName Title Pay.  Output format: Last First Pay
=> SSS

Write a Perl script to process input: Test Status-Pass/Fail
Output sorted by count: Test FailCount
=> SSS

How do four people cross a dark bridge in 17 or less minutes?
They take 1, 2, 5, and 10 minutes each to cross and one must walk back
with the torch to get the next one. (AM)
With A=1, B=2, C=5, D=10 minutes, start side on the left:
STEPN: <persons> --direction -- <persons> (time taken for that trip)
STEP1: CD ---------> AB (2)
STEP2: ACD <--------- B (1)
STEP3: A ------------> BCD (10)
STEP4: AB <----------CD (2)
STEP5: ---------> ABCD (2)
Total time = 2 +1 + 10 + 2 +2 = 17

How do you test a Round-Robin Arbiter?
From System Verilog Assertions Handbook, 4th Edition:  http://systemverilog.

import uvm_pkg::*; `include "uvm_macros.svh"
module top;
       bit clk;
       bit[3:0] req, gnt, prev_grnt;  
       default clocking @(posedge clk); endclocking
       initial forever #10 clk=!clk;   

  generate for (genvar i=0; i<=4; i++)
   begin : g_arbiter
     property p_arbiter;
      bit[16:0] v;
       (req[i]==1'b1, v=prev_grnt, v[i+1]=1'b1) ##0 req < v |->
          gnt[i]==1'b1 ##0 $onehot(gnt);
      endproperty : p_arbiter
      ap_arbiter: assert property(@(posedge clk) p_arbiter);
   end : g_arbiter
  endgenerate
  ap_zero_req0: assert property(@(posedge clk) req==0 |-> gnt==0);

initial begin
    repeat(200) begin
      @(posedge clk);   
      if (!randomize(req, gnt, prev_grnt)  with
          { req inside {0, 1, 2, 4}; // dist {1'b1:=1, 1'b0:=3};
            gnt inside {0, 1, 2, 4}; // dist {1'b1:=1, 1'b0:=2};
            prev_grnt inside {0, 1, 2, 4};
          }) `uvm_error("MYERR", "This is a randomize error")
    end
end
endmodule : top

What happens when a CPU interrupt occurs?  Where is the Interrupt Vec Table?
=> SSS

Write an assertion statement for a request that must be followed by a grant
and ignores reset.
property req2gnt;
   @(posedge clock) disable iff(!reset_n) $rose(req) |=> ##1 grant;
endproperty : req2gnt
check_for_grant: assert property(req2gnt);

* +++++++:
* *******:

What would the verification plan for your design block contain?
What are some of the things that you need to verify?
The verification plan defines the DUT, the functionality to be exercised,
and what will not be tested.  It defines the test bench and how the
functionality is tested followed by descriptions of the required tests.

How do you know that you are ready for tapeout?
+Already covered.
Meet functional and code coverage goals.

How do you check out of order transactions in a scoreboard?
How do you make sure that transactions of a given id are in order?
Where are the out-of-order transactions gathered?
How do you make sure nothing got dropped?
The ingress transactions are placed in a queue and the egress transactions
are popped from a mailbox.
Each transaction has an id which is matched with the corresponding response.
The transaction must match the first ingress transaction with the right id
to ensure ordering.
Out of order transactions can be gathered in the driver or a sequencer.
The egress mailbox should be empty after the drain time.
In order to make sure nothing got dropped the ingress queue must be empty.
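
The matching rule described above can be sketched in C.  All names here
(txn_t, scoreboard_t, sb_expect, sb_check, sb_empty) and the fixed depth are
assumptions for illustration; a real scoreboard would live in the test bench
language.

```c
#include <stddef.h>
#include <stdint.h>

#define SB_DEPTH 64

typedef struct { unsigned id; uint32_t data; } txn_t;
typedef struct { txn_t q[SB_DEPTH]; size_t n; } scoreboard_t;

/* Record an ingress (expected) transaction; returns 0 if the queue is full. */
int sb_expect(scoreboard_t *sb, txn_t t)
{
    if (sb->n == SB_DEPTH)
        return 0;
    sb->q[sb->n++] = t;
    return 1;
}

/* Check an egress transaction against the OLDEST pending entry with the
   same id, which enforces ordering within an id while allowing reordering
   across ids.  Returns 1 on match, 0 on data mismatch, -1 if no entry
   with that id is pending. */
int sb_check(scoreboard_t *sb, txn_t t)
{
    for (size_t i = 0; i < sb->n; ++i) {
        if (sb->q[i].id != t.id)
            continue;
        int ok = (sb->q[i].data == t.data);
        for (size_t j = i + 1; j < sb->n; ++j)   /* remove entry i */
            sb->q[j - 1] = sb->q[j];
        sb->n--;
        return ok;
    }
    return -1;
}

/* After the drain time the queue must be empty or something was dropped. */
int sb_empty(const scoreboard_t *sb)
{
    return sb->n == 0;
}
```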

How do you know when your test has ended?
Where in the code do you implement the objections?
When all objections are dropped followed by exhausting the drain time.
Objections are typically raised and dropped in the test's run_phase or in
the top-level sequence rather than in the packet class.

How do you coordinate the initialization and a random test sequence?
You use the corresponding run phases.

How do you handle a DUT deadlock, especially during startup?
Fork a watchdog. (Incomplete answer)

Describe how to write a CPU (AXI) and Memory BFM pair and how read/write
transactions work.  How is the memory model implemented?
How do you manage out-of-order pipelined transactions?
Use an interface containing Addr,Data,RdWr* Enb, and possibly ack.
Drive the address and enable along with the write RdWr* value.
Pipelined requests contain an id which is correlated with the response.
Writes are simply written out to memory, reads come back with an ack.
A small memory model can be a simple array or a dynamic array.
A sparse memory model is an associative array.

* ++++++++:
* ********:

How do you enable an SPI loopback at run time?
Implementation specific but generally requires a write to a cfg reg.

How do you select an output channel from firmware code?
Implementation specific but generally requires a write to a cfg reg.

How do you identify the source of an undefined 'X' signal?
A good implementation should contain assertions to catch the scenario.
Waveform viewers support the ability to trace drivers back to the source.
There are also tools dedicated to help with this effort.

How do you force a signal value that will not stick?
Use a $deposit(var,val) instead of force.
A force overrides procedural assignments and stays in effect until it is
released or replaced by another force.
These constructs add risks and should be used with great care.

How would you test an asynchronous fifo?
You need an ingress src generator and an egress sink UVC pair which
operate independently.
The generator should be able to fill an empty fifo and the sink should be
able to drain all data.
Fifos frequently contain watermarks which also need to be exercised
to ensure proper throttling.

* +++++++:
* *******:

Design a clock with a 70% duty cycle.
`define DUTY_CYCLE (70.0/100.0)     // Test code here
initial forever
 begin : clock_block
 clk= 0;
 #(`CLK_PERIOD * (1 - `DUTY_CYCLE) );
 clk= 1;
 #(`CLK_PERIOD *      `DUTY_CYCLE  );
 end   : clock_block

Search efficiently in a large link list for the contents of a small link list?
typedef struct link_list
 {
 char d[SEGMENT_SIZE];
 struct link_list *next;
 } link_list;

link_list *ll, *sl;
while(ll && ll->next != sl) // pointer match only, stops at the end of the list
 ll = ll->next;
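
A content-based search could look like the sketch below.  It is the naive
O(n*m) scan, not an optimized answer (a KMP-style failure table would avoid
rescanning); the SEGMENT_SIZE value and the name find_sublist are assumptions.

```c
#include <stddef.h>
#include <string.h>

#define SEGMENT_SIZE 4   /* assumed value for illustration */

typedef struct link_list {
    char d[SEGMENT_SIZE];
    struct link_list *next;
} link_list;

/* Return the first node of ll where the node contents of sl appear
   consecutively, or NULL if there is no such position. */
link_list *find_sublist(link_list *ll, const link_list *sl)
{
    for (; ll; ll = ll->next) {
        const link_list *a = ll, *b = sl;
        while (a && b && memcmp(a->d, b->d, SEGMENT_SIZE) == 0) {
            a = a->next;
            b = b->next;
        }
        if (!b)              /* ran off the end of sl: full match */
            return ll;
    }
    return NULL;
}
```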

Draw a state machine diagram and the corresponding design logic to detect the
pattern 'b11011 including repeated occurrences such as in 11011011.

                                 || <--------
                                 ||          |
                                 \/          |
                             ( 0 xxxxx ) -0->|
                                 ||          |
                                 || 1        |
                                 \/          |
                   --------> ( 1 xxxx1 ) -0->|
                  |              ||          |
                  |              || 1        |
                  |              \/          |
                   <------1- ( 2 xxx11 )     |
                   --------> ( ^^^^^^^ )     |
                  |              ||          |
                  |              || 0        |
                  |              \/          |
                  |     ---> ( 3 xx011 ) -0->|
                  |    |         ||          |
                  |    |         || 1        |
                  |    |         \/          |
                  |    |     ( 4 x1011 ) -0->
                  |    |         ||
                  |    |         || 1
                  |    |         \/
                  |     <-0- ( 5 11011 )
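
The diagram can be cross-checked with a table-driven model.  The sketch below
assumes state 5 returns to state 2 on a 1 and to state 3 on a 0, which is how
the diagram reads; the names next_state and count_11011 are hypothetical.

```c
/* Next-state table; the state number is the count of pattern bits
   matched so far.  Column 0 is input 0, column 1 is input 1. */
static const unsigned char next_state[6][2] = {
    {0, 1},  /* 0: nothing matched */
    {0, 2},  /* 1: 1               */
    {3, 2},  /* 2: 11              */
    {0, 4},  /* 3: 110             */
    {0, 5},  /* 4: 1101            */
    {3, 2},  /* 5: 11011 (match)   */
};

/* Count matches in a string of '0'/'1' characters, first bit first.
   Overlapping occurrences are counted. */
int count_11011(const char *bits)
{
    unsigned s = 0;
    int hits = 0;
    for (; *bits; ++bits) {
        s = next_state[s][*bits == '1'];
        if (s == 5)
            ++hits;
    }
    return hits;
}
```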

How deep should the fifo be if Tx is 70xData+10xIdle and Rx is 7xD+1xI.

A 16-deep fifo suffices for this example.

The fifo must prevent the tx from starvation.
During the Tx 10 idles - 1 is idle, then 7 data, and another data.
Therefore 8 data must already be buffered.
The next 6 are data which can flow through completing two frames.
8 of the 10 frames remain and each must have a byte buffered during idle.
That is another 8 bytes for a total of 16.

Sort four 8-bit values (a,b,c,d) by cascading a 2-bit sorter.

Count the number of bits in a 6-bit value using a 2-bit adder with Sum/Carry.

Write a Verilog add/shift routine to compute the product P= A[m:0] * B[n:0]

Code in Verilog and draw a ff with Asynchronous and Synchronous resets.
always @(posedge clk
`ifdef   ASYNC_RST
       or negedge rst_n
`endif // async rst
       )
   begin : flip_flop
   if(!rst_n)
     q <= 0;
   else
     q <= D;
   end   : flip_flop

Generate a 32-bit rand value with 2-bits set and cover *all* values.
+ These are both speculative sketches and have not been simulated.
bit [31:0] x;
randc bit [4:0] b0, b1; // bit positions 0..31
constraint b2 { b0 != b1; }
x = (32'b1 << b0) | (32'b1 << b1); // e.g. in post_randomize()

rand bit [31:0] x;
constraint two_bits_c { $countones(x) == 2; } // 496 legal values
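
To reason about coverage it helps to know the full value set: choosing 2 of
32 bit positions gives 32*31/2 = 496 values.  The C sketch below enumerates
them; the name two_bit_values is an assumption, and shuffling the result
would give a randc-like visiting order.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill out[] with every 32-bit value that has exactly two bits set and
   return how many were written.  out must have room for 496 entries. */
size_t two_bit_values(uint32_t *out)
{
    size_t n = 0;
    for (unsigned hi = 1; hi < 32; ++hi)
        for (unsigned lo = 0; lo < hi; ++lo)
            out[n++] = (UINT32_C(1) << hi) | (UINT32_C(1) << lo);
    return n;
}
```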

Where do you put assertion points?
Assertions that validate protocols should bind to corresponding interfaces
Black box assertions are bound to ports and should be owned by DV.
White box assertions are specific to the design implementation and bind
within the hierarchical design and should be owned by the designer.

What do you do to make a simulation environment conducive to debugging?
A detailed printed message is a proven way to debug anything!
Simulations should contain such messages controlled by debug level
switches such as those supported by `uvm_info.
This should be supplemented with depth controlled hierarchical dumping and
assertions that help isolate the locale.

Generate unique random values in the range 100-100,000.  Ok to use memory.

#define MIN 100
#define MAX 100000
#define SIZE (MAX-MIN+1)   // Count values are +1

int d[SIZE];
int i;
for(i=0; i < SIZE; ++i)
 d[i] = MIN + i;  // Unique values

for(i=0; i < SIZE; ++i)
 { // Shuffle by swapping random locations
 int x = random() % SIZE;
 int tmp = d[i];
 d[i] = d[x];
 d[x] = tmp;
 }

// 17-bit randc based solution, 100,000-100 = 99,900 = 0x1_863C
// -- May not work: tools may limit the randc width (only 8 bits guaranteed)
randc bit [16:0] Unique_val; // Unique random value
constraint Unique_val_c {Unique_val >= 100; Unique_val <= 100000; }
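
Swapping each slot with any random slot gives a slightly biased shuffle; the
Fisher-Yates variant below is unbiased apart from the modulo.  This is a
sketch under assumptions: the name unique_random is made up, and rand() is
used for portability even though its range may be smaller than the array on
some platforms (the values stay unique either way).

```c
#include <stdlib.h>

/* Fill d[0..n) with lo, lo+1, ... and shuffle in place (Fisher-Yates). */
void unique_random(int *d, int n, int lo)
{
    for (int i = 0; i < n; ++i)
        d[i] = lo + i;                 /* unique by construction */
    for (int i = n - 1; i > 0; --i) {
        int j = rand() % (i + 1);      /* pick from the unshuffled part */
        int tmp = d[i];
        d[i] = d[j];
        d[j] = tmp;
    }
}
```

For the range above the call would be unique_random(d, 99901, 100).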

What is bless in Perl?
bless REF[,classname] specifies that the referenced item is now an object
in the current or specified class.
It generally references a hash and occasionally an array.

What will this display for a:
> integer a;
>   initial a = 5;
>   function integer f(input integer a);
>     begin
>     f = a << 1;
>     end
>   endfunction
>   always @(posedge clk)
>     begin
>     a <= f(a);
>     $display("@%0d: a = 0x%X",$time,a);
>     end
>   0x5, 0xA, 0x14, 0x28, 0x50, 0xA0, 0x140 ...

* +++++:
* ******

What does this code do:
> int list[5]; // Declares a five element integer array
> int *listp;  // Declares an integer pointer such as to a list
> listp = &list; // Address of the whole array: type int (*)[5], not int *
> listp = &list[0]; // Assigns the address of the first element
> listp = list; // Same address, the array name decays to &list[0]
=> SSS

Define a class for a CPU model and explain why it should be used.
=> SSS

* ++++++++:
* ********:

What are the steps a driver uses to process a request?

How is a sequence initiated and by what component?
The sequence is initiated by the sequencer either by specifying a default
sequence and allowing the factory to do it or by explicitly invoking
seqr.start in the run phase.

Describe the advantages and disadvantages of pushing or pulling transactions?

What coding guidelines should be used and why?

How do you use interfaces in a UVM test bench?

Declare an associative array.
<element-type> <array-name> [index-type]

Allocate a dynamic array with 10 elements.
dyn_arr = new[10];

How do you safely pass a set of scalars and a hash to a Perl subroutine?
As a hash reference, $scalar1..n, $data_href, $more_scalars...

How do you ensure that a hash is not disturbed by a called Perl subroutine?
Copy it into a local hash so the caller's data is not modified.
my %local_hash = %{$passed_href};  # shallow copy

What is the difference between a Perl hash and array?

How do you determine the required depth for a fifo?

Define a constraint for a sequence to cover the range say 10-10000.
rand int x[];
constraint range { foreach(x[i]) x[i] inside {[10:10000]};
 foreach(x[i]) foreach(x[j]) (j < i) -> x[i] != x[j]; }

What are virtual classes?

What are pure functions?

Write a C program to determine whether a word is a Palindrome,
such as "Noon" or "Mom."
// Test for Palindrome

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

int main(int argc,char *argv[])
 {
 int p = 0;
 if(argc>1 && argv[1][0] != '-')
   {
   int i = 0, j = (int)strlen(argv[1])-1;
   p = 1;
   while(p && j >= i)
     {
     //printf("[%0d] %c <> [%0d] %c\n",i,argv[1][i],j,argv[1][j]);
     if(tolower((unsigned char)argv[1][i++]) != tolower((unsigned char)argv[1][j--]))
       p = 0;
     }
   printf("Is%s a Palindrome: %s\n",p?"":" NOT",argv[1]);
   }
 else
   fprintf(stderr,"usage: %s string - tests for Palindromes.\n",argv[0]);
 return !p;
 } // main()

How do you extract a field from a bit vector using position and width?
field = (vec>>lsb) & ((1<<width)-1)
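
In C the one-liner needs care when width equals the word size, because
shifting a 32-bit value by 32 is undefined.  A minimal sketch (the name
get_field is an assumption, and lsb must be below 32):

```c
#include <stdint.h>

/* Extract width bits of vec starting at bit position lsb. */
uint32_t get_field(uint32_t vec, unsigned lsb, unsigned width)
{
    /* Build the mask without ever shifting by 32 or more. */
    uint32_t mask = (width >= 32) ? UINT32_MAX
                                  : ((UINT32_C(1) << width) - 1);
    return (vec >> lsb) & mask;
}
```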

How can you tell the spin direction of a disk that is half-black and
half-white using two photo sensors and a circuit element?
Place sensors apart (ideally 90-degrees), use one for D-FF clock and
the other for data.
 The output value determines the direction of travel.

How do you divide a clock by two?
Feed the output into the input with an inverter.

Define a UVM environment to test a system with input and output fifos with
independent credits.  What type of errors may occur?

Write code to make sure a transaction request makes it across a bridge.
assert property(@(posedge clk) req |=> ##[0:$] gnt);

What (four) issues can occur with a mesh of network routing nodes?
Starvation Deadlock Livelock Incorrect-destination

How do you test a Round-Robin Arbiter?
Each node checks: assert property(@(posedge clk) req |-> $past(gnt) != gnt)

* +++:
* ****

Write the code for a port driver.
A port driver needs to loop waiting on a mbox for transactions.
Each transaction is driven according to the protocol, generally involving
a request, wait-for-grant, and then driving the data across the bus.

Reverse a large link list.
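typedef struct link_list
 {
 char d[SEGMENT_SIZE];
 struct link_list *prev; // optional
 struct link_list *next;
 } link_list;

link_list *ll, *next;
do { // If the list contains prev, simply swap next and prev.
 next = ll->next;
 ll->next = ll->prev;
 ll->prev = next;
 if(next) ll = next; // advance, ll ends up at the new head
} while(next);

Most cases will not contain prev:
 link_list *base, *next[MAX_NUM_ELEMENTS];
 base = ll; // save
 int i = 0;
 do {
   next[i++] = ll; // remember each node in order
   ll = ll->next;
 } while(ll);
 ll = base; // restore
 for(int j=0; j<i; ++j) // point each node at its predecessor
   next[j]->next = j ? next[j-1] : NULL;
 ll = next[i-1]; // the old tail is the new head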
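
A singly linked list can also be reversed in place with three pointers and no
side array, which matters for a large list.  This is a minimal sketch; the
node type and the name reverse_list are assumptions chosen to keep the
example self-contained.

```c
#include <stddef.h>

typedef struct node {
    int d;
    struct node *next;
} node;

/* Reverse the list in place and return the new head. */
node *reverse_list(node *head)
{
    node *prev = NULL;
    while (head) {
        node *next = head->next;  /* save the rest of the list */
        head->next = prev;        /* point this node backwards */
        prev = head;
        head = next;
    }
    return prev;
}
```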

Write constraints to toggle even/odd lengths.
rand int DataLen;
rand bit polarity; // random +1 or -1
int TransactionID = 0;
constraint DataLen_c { DataLen%2 == 0 ; }
function void post_randomize();
 if(TransactionID % 2)
   begin // promote to odd
   if((polarity && DataLen<DATA_LEN_MAX) || !DataLen)
     DataLen++; // careful not to overflow
   else
     DataLen--; // already made sure this is not 0
   end
 TransactionID++; // assumed: advance so the next length flips parity
endfunction

// Alternate - NOT WORKING solution: randc only guarantees each value once
// per cycle (0,1,1,0 is legal) and randc may not be used in solve...before
randc bit lsb;
rand int length;
constraint even_odd_len
 { length inside {[0:MAX_LEN]}; (length & 1) == lsb; }

Draw a state machine diagram to detect patterns divisible by 5 and 10.
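As every multiple of 10 is also a multiple of 5, the core of the problem is
a divisible-by-5 detector: track the running remainder mod 5 (five states)
as bits arrive MSB first.  Divisible by 10 additionally needs the last bit
to be 0.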
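
The state machine can be modelled by tracking the value seen so far modulo 5.
The sketch below assumes the bits arrive MSB first as a string of '0'/'1'
characters; the name divisible_by_5 is hypothetical.

```c
/* Five-state FSM: the state r is the value of the bits seen so far mod 5.
   Appending a bit b turns value v into 2v + b, so r' = (2r + b) mod 5. */
int divisible_by_5(const char *bits)
{
    unsigned r = 0;
    for (; *bits; ++bits)
        r = (2 * r + (*bits == '1')) % 5;
    return r == 0;
}
```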

* ++++++:
* ******:

What are the different steps of running a System Verilog simulation?
(Excerpt from notes)

What do you do to allow (simple) tests to be written faster?
Use a configuration class and layered sequences so tests can simply consist
of constraints.
Also use templates to take advantage of polymorphism and create type
independent functions and tasks.

What are callbacks?  How are they implemented in a driver?
Dummy virtual functions that can be defined by tests to inject errors.
UVM has a mechanism to specifically support this process.

How does a generator talk to a driver?
It pushes transactions into a mailbox that the driver is pending on.

What is the difference between a heap and a stack?
A LIFO stack returns data in the reverse of the order it was stored whereas
a heap orders the data such as to represent a prioritized queue.

How do you identify and improve simulation performance issues?
Most tools provide code profiling to help identify performance issues.
It is also important to utilize the proper tool switches along with
hierarchical test benches which limit the scope.
Performance is particularly sensitive to the timescale resolution.
Dumps should consist of only the required hierarchies and depths.
A compile-once-run-many flow is also preferred.

What are the advantages and disadvantages of block versus device simulations?
Block level has many advantages as it performs more efficiently, does not
require the availability of other components, is easier to debug, and
provides greater control on the stimulus.
Device simulations are useful to ensure system interconnectivity and
throughput behaviour.

What does an interface consist of?
Signal wires and constructs such as modports and clocking blocks.
Interfaces should not be used to define functions or tasks.
Interfaces should also contain protocol related assertions.
Clock and reset signals in an interface should be defined as type bit.

What is the difference between logic and wire?
logic is a 4-state variable that holds the last assigned value and allows a
single driver, whereas a wire is a net that resolves multiple drivers and
goes to X when they conflict.

How do you test asynchronous signals for different clock domains?
These issues show up in back-annotated sdf simulations.
They should be detected using dedicated tools such as 0in.
Various methods exist to manage them such as gray-coding the signal path.

What are metastability issues with a latch?
When the input is unsettled the output is indeterministic and
may amplify the line noise.

How do you write and implement a verification plan?
=> SSS

How do you know that a device is verified?
How do you assess that the coverage is sufficient?
At what point do you start adding exclusions?
=> Already covered.

What is the difference between case, casex, and casez?
casez treats Z (and ?) bits as don't-care; casex treats both X and Z bits
as don't-care.
They are generally avoided as they make the design more complex and have a
detrimental effect on the performance as well.

Describe the behaviour of always @(a) b = #5 a versus always @(a) #5 b = a
The first captures a, then waits #5 before assigning the captured value to b.
The second waits #5, then samples a and assigns it to b immediately.

   0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
   |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
 a _________|                    |________________     a toggles @3 @10
 b ________________________|                    |_     b = #5a Transport
 b ________________________|                    |_  #5 b = a   Inertial

 However if the toggle happens faster than the gate delay the inertial
 delay swallows the glitch.
 a _________|        |____________________________     a toggles @3 @6
 b ________________________|        |_____________     b = #5a Transport

 b _______________________________________________  #5 b = a   Inertial

+ Models construct variations:
 always @(a or b or ci or tmp) begin tmp<=#12 a+b+ci; {co,sum}<=tmp; end
 -or- assign #12 tmp = a + b + ci; assign {co,sum} = tmp;
 -or- assign #12 {co,sum} = tmp; always @(a or b or ci) tmp <= a +b+ci;

Find the second largest value in an array of numbers.  Test this code.
=> SSS
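
A single-pass sketch in C, assuming "second largest" means the second largest
distinct value (so duplicates of the maximum do not count).  The name
second_largest and the out-parameter style are assumptions.

```c
#include <stddef.h>

/* Store the second largest distinct value of a[0..n) in *out.
   Returns 1 on success, 0 if there are fewer than two distinct values. */
int second_largest(const int *a, size_t n, int *out)
{
    int have1 = 0, have2 = 0, max1 = 0, max2 = 0;
    for (size_t i = 0; i < n; ++i) {
        if (!have1 || a[i] > max1) {
            if (have1) {               /* old maximum becomes runner-up */
                max2 = max1;
                have2 = 1;
            }
            max1 = a[i];
            have1 = 1;
        } else if (a[i] != max1 && (!have2 || a[i] > max2)) {
            max2 = a[i];
            have2 = 1;
        }
    }
    if (have2)
        *out = max2;
    return have2;
}
```

Useful test cases include an empty array, a single element, all elements
equal, negative values, and the maximum appearing more than once.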

* +++++++:
* *******:

What is a singleton?
A class designed to have only one instance, such as the UVM factory,
typically reached through a static handle.

What is randc?
Cyclical random which exhausts all patterns before repeating one.
randc variable size is implementation dependent but must be at least 8-bits.

What are local and protected and should they be used in verification?
local variables cannot be accessed outside the declaring class.
protected variables can be accessed only by the declaring class and its
derived classes.
They do not have a significant role in verification.
A good implementation uses methods rather than accessing properties.

What is an inertial delay and what are related issues?
+Already answered in detail.

What is automatic?  Is it used in a class?
Automatic task variables are allocated per call, which makes the task
reentrant.
Class methods are automatic by default and do not require this key word.

Is x==5 -> y>3 unidirectional or bidir?  When do you use solve before?
This is bidirectional: the solver picks x and y together, so y<=3 also
forces x!=5.  solve x before y does not change the legal solutions, only
their probability distribution.

Can you redefine a constraint with the same name in an extended class?
The redefined constraint overrides the original definition.

What is a modport?
A bundled set of interface signals with device specific direction.

What is a package and why is it used?
A package groups and encapsulates code to make it easier to integrate and
also to deal with potential name contention.

Why do you do gate and sdf simulations?
+Already answered in detail.

What type of forks are in System Verilog?
+Already answered in detail.

What causes X in gate simulations and how do you trace them?
X values can be the result of flops that are not properly reset,
setup and hold violations, and driver contention.
The driver contention will even show up in rtl simulations.
They can be traced using dedicated tools such as most wave form viewers.
Decision statements use: assert(!$isunknown(mysig)) else $error("%m s = X")
A more comprehensive list from Stuart Sutherland:
 • Uninitialized 4-state variables
 • Uninitialized registers and latches
 • Setup or hold timing violations
 • Multi-driver conflicts (bus contention)
 • Unconnected module input ports
 • Out-of-range bit-selects and array indices
 • Low power logic shutdown or power-up
 • Operations with an unknown result
 • Logic gates with unknown output values
 • User-assigned X values in hardware models
 • Testbench X injection

What type of sdf simulations do you simulate?

Is the transition from X to 0 (or 1) an edge?
A negedge shall be detected on the transition from 1 to x, z, or 0,
and from x or z to 0. i.e. from 1 or to 0.
A posedge shall be detected on the transition from 0 to x, z, or 1,
and from x or z to 1. i.e. from 0 or to 1.

What is your worst verification failure?

What keyword do you use to pass by reference?

Can you assign an extended class handle to a base class handle?
Yes but the other way requires an assert($cast(ext,base))

What is a virtual method?  Pure virtual method?
A virtual method is one that can be redefined in an extended class.
A pure virtual method must be defined by the extended class.
Constant expressions require pure functions whose value depends only on
input arguments and have no side effects.

What is an abstract class?
An abstract class is a base class declared with the virtual keyword; it
cannot be instantiated and may declare pure virtual methods, which are
defined in the extended class.
A non-abstract extended class *must* implement all the pure methods.

Does initial execute before always? How do always_comb and always differ?
There is no defined order of execution between initial and always blocks.
An always_comb block runs at time 0 whereas the equivalent always may not.

Which variables are 2-state: bit(vec) byte {short,,long}int [short]real
4-state: reg logic, integer (reg[31:0]), time (reg[63:0])

What is the race condition in this design (did not share design)?
A race condition is a flaw in a system or process that is characterized by
an output that exhibits an unexpected dependence on the relative timing or
ordering of events. The term originates with the idea of two signals racing
each other attempting to influence the output first. (Cummings)

What is polymorphism?
"The reuse of the same code to take on many different behaviours based
on the object on hand." Dave Rich, Mentor

Can you randomize a real number?
No, only scalar variables of type integer, reg, and enumerated.
A union could be used to emulate the behaviour with some hand-holding.
real_number = $bitstoreal({$random,$random});

What is the difference between immediate and concurrent assertions?
Can you use them together in a class?
Immediate assertions are simple boolean expressions whereas concurrent
assertions are temporal whose behaviour is time dependent.

* +++++ :

What is polymorphism, explain the mechanism with and without virtual in
base class?

Which statement will be printed?

>  program testcase();
>      import uvm_pkg::*;
>      `include "uvm_macros.svh" // add for irun
>      class b;
>          virtual task b_t;
>          `uvm_info("b.b_t --", $sformatf("inside Base"), UVM_HIGH);
>          endtask : b_t
>      endclass : b
>      class d extends b;
>          task b_t;
>          `uvm_info("d.b_t --", $sformatf("inside d"), UVM_HIGH);
>          endtask : b_t
>      endclass : d
>      class d1 extends d;
>          task b_t;
>          `uvm_info("d1.b_t --", $sformatf("inside d1"), UVM_HIGH);
>          endtask : b_t
>      endclass : d1
>     initial
>          begin : process // {
>          b b_h;
>          d1 d1_h = new();
>          b_h = d1_h;
>          b_h.b_t();
>          `uvm_info("--", $sformatf("goodbye"), UVM_HIGH);
>          end : process // }
>  endprogram : testcase

"The reuse of the same code to take on many different behaviours based
 on the object on hand." Dave Rich, Mentor

LRM: 8.20 Virtual methods
A method of a class may be identified with the keyword virtual. Virtual
methods are a basic polymorphic construct.
A virtual method shall override a method in all of its base classes,
whereas a non-virtual method shall only override a method in that class
and its descendants.
One way to view this is that there is only one implementation of a virtual
method per class hierarchy, and it is always the one in the latest
derived class.

The base class declares the method virtual so you get:
    UVM_INFO @ 0: reporter [d1.b_t --] inside d1

@A in the code, when A goes from x->0, what transition is it?
Is the transition from X to 0 (or 1) an edge?

This is a negedge.
 From the notes:
 A negedge shall be detected on the transition from 1 to x, z, or 0,
 and from x or z to 0. i.e. from 1 or to 0.

When a bit of a vector variable goes to X, how do you catch it?
>> what may be the underlying operation in $isunknown()?

isunknown = (^vector === 1'bx)
isunknown = !(^vector === 1'b0 || ^vector === 1'b1)
The case equality is required as == and != return X when an operand is X.

If the constraint is overridden in the derived class, when we randomize
the derived object which constraint block will get picked up?

The one in the derived class because randomize is a virtual method.

Explain about the agent and the components you will build inside the agents?

The agent consists of a sequencer, driver, monitor, and configuration.
It may also contain a subscriber component such as a coverage collector.

What is a uvm_subscriber?

A subscriber processes transactions from the egress monitor that are
written into the analysis port.

>> What is it used for? can we implement data check mechanism?

The subscriber pushes the transactions to the scoreboard and coverage
collection components.
The data check mechanism is in the scoreboard, not the subscriber.

>> Do we need to implement analysis export write method ?

The analysis_export is provided by uvm_subscriber, but write() is declared
pure virtual and must be implemented in the extended class.

Explain about phases in UVM and execution order

From the notes:
Only the build() and final are top down, only the run() takes time.
build - construct and cfg various child components/ports/exports.
connect - connect the component ports/exports.
end_of_elaboration - configure the components if required.
start_of_simulation - print the banners and topology.
run -  execute test body and fork all threads, only phase task.
 The test is selected using: +UVM_TESTNAME=<testcase>
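 The run phase runs in parallel with twelve run-time phases:
   {pre_,,post_}reset, {pre_,,post_}configure, {pre_,,post_}main,
   {pre_,,post_}shutdown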
extract - gather required information.
check - check pending requests in scoreboard, read regs/statistics.
report - report pass/fail status.
final - simulation is about to end and print final messages

What is uvm_scoreboard and how do you declare user defined analysis
export methods for input & output monitors?

The scoreboard is used to compare the egress transactions against the
ingress originated traffic.

What is a virtual sequence?
Used for managing concurrent transactions to multiple dut interfaces.
Requires a virtual sequencer which contains no sequence items of its own
and relies on handles to the agent sequencers.

>> If you fork two sequences inside the body of virtual sequence how do
you control their sequences execution (order) ?
>> what are the sequencer arbitration methods available
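   There are six sequencer arbitration modes, selected with
   set_arbitration(): UVM_SEQ_ARB_FIFO (default), UVM_SEQ_ARB_WEIGHTED,
   UVM_SEQ_ARB_RANDOM, UVM_SEQ_ARB_STRICT_FIFO, UVM_SEQ_ARB_STRICT_RANDOM,
   UVM_SEQ_ARB_USER (the names lack the UVM_ prefix before UVM 1.2).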

What are the differences between build & new?
How do you register user defined classes inside factory?

new instantiates an object whose type is then fixed for the duration of the
simulation.
build is a phase which is used to create the objects using the
corresponding factory method and can be overridden for flexibility.

What are the ways to access uvm_config_db inside the sequences?

Not sure if this is the right answer:
(73)How to use set_config_* for setting variables of sequence?
   UVM1.2 deprecates set_config_* and uses:
   uvm_config_db#(TYPE)::get(null, this.get_full_name(), "field", field);

What is p_sequencer and what is the m_sequencer?

m_sequencer is the generic uvm_sequencer_base handle to the sequencer the
sequence is running on; p_sequencer is the same handle cast to the user
sequencer type by `uvm_declare_p_sequencer(sequencer_type).

A virtual sequence extends uvm_sequence and typically uses:
 `uvm_do_on() instantiate agent sequences
 `uvm_do_on_with() supports a constraint as well

Can we override built in sequence_items methods like compare & copy in
the derived classes?

It is not a good idea to override the built in sequence methods like
compare and copy.  Equivalent do_* methods are provided for this.

>> What is the use of do_* methods?
>> ex: do_compare method and how it's being used?

 The do_* methods are intended for the user to override the built-ins.
 do_compare allows the user to define a compare implementation.

* +++++++:
* *******:

Build a complete System Verilog environment around a given DUT.
Explain in detail the advantages of all components to justify the effort.
How much time would this effort require for an already working environment?

What are the challenges with putting your design on an FPGA?

What simulation tool do you prefer and why?

How do you simulate ATPG vectors?  What vcs features are useful for this?

Show how your template generator would build a UVM environment for a DUT with
multiple interfaces.  Diagram all details of a UVM testbench and environment.

What is the difference between TLM 1.0 and 2.0 and why are they used?

How do you configure UVM components?

How do the different UVM components interact?

What would you use DPI for in a UVM environment?

Do you always use DPI instead of PLIs?

How does the driver get transactions from the sequencer?

What languages are best to use for regression environments?

What are the advantages and disadvantages of using Ruby versus Perl?

How do you declare an associative array in System Verilog?

What is the syntax to talk between Verilog and C?
Can this be done completely using DPI?
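
A minimal sketch of the C side of a DPI import, assuming the default mapping
of a SystemVerilog int to a C int.  The function name c_add is hypothetical;
the matching import declaration is shown in the comment.  Compile and link
options are tool specific and are not shown.

```c
#include <assert.h>

/*
 * SystemVerilog side (for reference, not compiled here):
 *
 *   import "DPI-C" function int c_add(input int a, input int b);
 *   initial $display("%0d", c_add(2, 3));
 *
 * For plain int arguments no svdpi.h types are needed on the C side.
 */
int c_add(int a, int b)
{
    return a + b;
}

/* Stand-alone check of the C model before it is linked into a simulation. */
void check_c_add(void)
{
    assert(c_add(2, 3) == 5);
    assert(c_add(-4, 4) == 0);
}
```

Checking the C function on its own first keeps DPI debug separate from
testbench debug.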

Print a perl hash with numerically sorted keys.
for(sort { $a <=> $b } keys %myhash) { print "$_ => $myhash{$_}\n" }

How do you initialize a perl hash?
%h = (John => "Junior", Sam => "Senior")

What is a PCIe BAR?

What happens when a PCIe device is plugged in (startup)?

How does PCIe2 differ from its predecessor?

What is 8b/10b and why is it used?  What is used for this purpose in 10G?

How do you call a C function using a pointer?
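
A short sketch of one possible answer: declare a pointer type that matches
the function signature, assign a function name to it, and call through it.
The names add, sub, binop_t and apply are invented for the example.

```c
#include <assert.h>

int add(int a, int b) { return a + b; }
int sub(int a, int b) { return a - b; }

/* Pointer to any function taking two ints and returning int. */
typedef int (*binop_t)(int, int);

/* Call whichever function the caller passed in. */
int apply(binop_t op, int a, int b)
{
    return op(a, b);                 /* equivalent to (*op)(a, b) */
}

void check_apply(void)
{
    binop_t table[] = { add, sub };  /* pointers can be stored and indexed */
    assert(apply(add, 7, 2) == 9);
    assert(table[1](7, 2) == 5);
}
```

The same mechanism is what callback tables and qsort() comparators rely on.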

* +++++++++:
* *********:

Verify a 2-bit counter with load capability. (MR/DT)

How do you allocate and enlarge a dynamic array? (MR)
data = new[`BIG_SIZE](data); // allocate the new size and copy the old contents

Find the second largest value in an array of numbers.  Test this code. (HK)
// find second highest number in an array

#include <stdio.h>
#include <stdlib.h>

#ifndef   NUM_BITS
#define   NUM_BITS 8
#endif // num bits

#ifndef   DEF_SIZE
#define   DEF_SIZE 10
#endif // def size

int main(int argc,char *argv[])
 {
 int size = DEF_SIZE;
 int mask = (1<<NUM_BITS)-1;
 int status = 0;
 if(argc > 4)
     status = 1; // trigger help
 if(argc > 1)
   {
     size = atoi(argv[1]);
     if(size < 2) // need at least two values
       status = 1;
   }
 if(argc > 2)
   {
       int seed = atoi(argv[2]);
       srandom(seed);
       printf("%s: seed %d\n",argv[0],seed);
   }
 if(argc > 3)
   {
       mask = (1<<atoi(argv[3]))-1;
       printf("%s: mask 0x%x\n",argv[0],mask);
   }
 if(status)
   {
   fprintf(stderr,"usage: %s [size(default %d) [seed [bits(def %0x)]]]\n",
           argv[0],DEF_SIZE,NUM_BITS);
   return status;
   }

   int x, y, delta;
   int* a = malloc(size * sizeof *a); // plain C instead of C++ new
   if(a == NULL)
     return 1;

   for(int i=0; i<size; ++i)
     a[i] = random() & mask;

   int count = size > DEF_SIZE ? DEF_SIZE : size; // print short array
   printf("%s: Array contents:",argv[0]);
   for(int i=0; i<count; ++i)
     printf(" %0d",a[i]);

   x = a[0]; // largest so far
   y = -1;   // second largest so far, -1 = none yet (values are >= 0)
   for(int i=1; i<size; ++i)
     if(a[i]>x)
       {
       y = x;
       x = a[i];
       }
     else if(a[i]>y && x>a[i])
       y = a[i];
   delta = x - y;
   printf("\n\n%s: Array size %d, max %d, 2nd %d, delta %d\n",
          argv[0],size,x,y,delta);
   free(a);
   return 0;
 } // main()
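
The question also asks for the code to be tested.  One way, sketched here
under the assumption that "second largest" means the second largest distinct
value, is to isolate the search in a function and assert on corner cases.
The names second_largest and test_second_largest are made up, and INT_MIN is
used as a "no such value" marker.  A case worth including is the largest
value sitting in the first element, which trips up implementations that seed
both trackers from a[0].

```c
#include <assert.h>
#include <limits.h>

/* Returns the second largest distinct value, or INT_MIN when there is none. */
int second_largest(const int *a, int n)
{
    int max = INT_MIN, second = INT_MIN;
    for (int i = 0; i < n; ++i) {
        if (a[i] > max) {
            second = max;            /* old maximum becomes the runner-up */
            max = a[i];
        } else if (a[i] < max && a[i] > second) {
            second = a[i];
        }
    }
    return second;
}

void test_second_largest(void)
{
    int first_is_max[] = { 5, 3, 4 };
    int duplicates[]   = { 7, 7, 2 };
    int all_equal[]    = { 1, 1, 1 };
    assert(second_largest(first_is_max, 3) == 4);
    assert(second_largest(duplicates, 3) == 2);
    assert(second_largest(all_equal, 3) == INT_MIN);
}
```

An array that legitimately contains INT_MIN cannot be told apart from the
"none" result here; a separate status return would remove that ambiguity.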

What are some of your best and worst verification experiences? (MP)
Dealing with managers!

What are some of the more interesting bugs you have identified? (MP)
All bugs originate with the designer, DV is innocent.
My WEP model disagreed with the specifications and was correct.
An Arm bus bridge that did not handle unaligned transfers correctly.
Issues relating to race conditions in the cache design
and parallel instruction execution.
Most of them tend to be related to some typographical slips which are
reflected as incorrect data transferring across bridges or being returned
from registers.  Another prevalent category is when the design is not
updated to reflect changes in specifications.
Found by team: An instruction is executing in parallel while a decision is
being made as to whether it should be cancelled and then it failed to be
cancelled when the decision is made to cancel it.

How would you detect dead memory cycles such as from a bridge? (MP)
Assertions should verify that active inputs are followed by outputs.
We actually found a bug with this approach.

How do you improve the Scan fault coverage for outputs that feed Analog logic?
You don't ??? (RB)

Write a factorial function and describe how to verify it. (HK)

long fact(long x)
 {
 if(x > 1)
   x *= fact(x-1);
 else
   x = 1;
 return x;
} // fact
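
For the verification half of the question, one possible approach is a
reference model with overflow detection plus directed checks: known values,
the recurrence n! == n * (n-1)!, and out-of-range inputs.  This is a sketch;
fact_checked and verify_fact are hypothetical names and the -1 error return
is an arbitrary choice.

```c
#include <assert.h>
#include <limits.h>

/* Iterative reference model; returns -1 on negative input or overflow. */
long fact_checked(long x)
{
    long r = 1;
    if (x < 0)
        return -1;
    for (long i = 2; i <= x; ++i) {
        if (r > LONG_MAX / i)
            return -1;               /* next multiply would overflow */
        r *= i;
    }
    return r;
}

void verify_fact(void)
{
    assert(fact_checked(0) == 1);            /* boundary values */
    assert(fact_checked(1) == 1);
    assert(fact_checked(5) == 120);          /* known value */
    for (long n = 1; n <= 12; ++n)           /* recurrence n! == n*(n-1)! */
        assert(fact_checked(n) == n * fact_checked(n - 1));
    assert(fact_checked(-3) == -1);          /* illegal input */
    assert(fact_checked(21) == -1);          /* overflows a 64-bit long */
}
```

The loop stops at 12 because 12! is the largest factorial that fits in a
32-bit long, so the check holds on both 32-bit and 64-bit targets.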

* ++++++:
* ******:

Design a counter that counts from 0 to 4 and has a zero pulse.
 => SSS

Write constraints to generate two memory ranges C-D and E-F within A-B.
 The two ranges should not overlap.
 => { C > D, E > F, }
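
One way to read the requirement, assuming each range is written low-high and
addresses are unsigned, is A <= C <= D < E <= F <= B (or the same with the
two ranges swapped).  The sketch below is C rather than SV constraints;
range_t, ranges_ok and gen_ranges are hypothetical names.  The checker is
what a scoreboard or assertion would apply to the randomized result.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { unsigned lo, hi; } range_t;

/* 1 when both ranges are well formed, inside [a,b], and do not overlap. */
int ranges_ok(unsigned a, unsigned b, range_t r1, range_t r2)
{
    int formed   = r1.lo <= r1.hi && r2.lo <= r2.hi;
    int inside   = a <= r1.lo && r1.hi <= b && a <= r2.lo && r2.hi <= b;
    int disjoint = r1.hi < r2.lo || r2.hi < r1.lo;
    return formed && inside && disjoint;
}

static int cmp_unsigned(const void *x, const void *y)
{
    unsigned u = *(const unsigned *)x, v = *(const unsigned *)y;
    return (u > v) - (u < v);
}

/* Draw four points in [a,b], sort them, and use them as C <= D < E <= F.
 * Assumes a < b and that b - a + 1 does not wrap around. */
void gen_ranges(unsigned a, unsigned b, range_t *r1, range_t *r2)
{
    unsigned p[4];
    do {
        for (int i = 0; i < 4; ++i)
            p[i] = a + (unsigned)rand() % (b - a + 1u);
        qsort(p, 4, sizeof p[0], cmp_unsigned);
    } while (p[1] == p[2]);          /* D == E would overlap, redraw */
    r1->lo = p[0]; r1->hi = p[1];
    r2->lo = p[2]; r2->hi = p[3];
}

void check_ranges(void)
{
    range_t touching1 = { 0x10, 0x20 }, touching2 = { 0x20, 0x30 };
    range_t r1, r2;
    assert(!ranges_ok(0x00, 0xff, touching1, touching2)); /* share 0x20 */
    gen_ranges(0x100, 0x1ff, &r1, &r2);
    assert(ranges_ok(0x100, 0x1ff, r1, r2));
}
```

Sorting four random points is just one generation strategy; it does not give
the same distribution an SV constraint solver would.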

* +++++:
* *****:

Shuffle a deck of cards.
#define SIZE 52
for(i=0; i < SIZE; ++i)
 { // Shuffle by swapping random locations
 int x = random() % SIZE;
 int tmp = d[i];
 d[i] = d[x];
 d[x] = tmp;
 }

// More elaborate solution (needs more work).
#define SIZE 52
typedef enum { SPADES, CLUBS, DIAMONDS, HEARTS } suit_t;
suit_t suit;
typedef struct { suit_t s; int value; } card_t;
card_t deck[SIZE];

for(i=0; i < SIZE; ++i)
 { // Shuffle by swapping random locations
 int x = random() % SIZE;
 card_t tmp = deck[i];
 deck[i] = deck[x];
 deck[x] = tmp;
 }
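
Swapping every position with a location chosen from the whole deck does not
make all orderings equally likely.  A common alternative is the Fisher-Yates
shuffle, sketched below with rand() as the random source (the small modulo
bias is ignored here).  The helper names shuffle, is_permutation and
check_shuffle are made up; the permutation check is the kind of sanity test
an interviewer may ask for.

```c
#include <assert.h>
#include <stdlib.h>

#define SIZE 52

/* Fisher-Yates: walk down from the top, swap d[i] with a random d[0..i]. */
void shuffle(int *d, int n)
{
    for (int i = n - 1; i > 0; --i) {
        int j = rand() % (i + 1);    /* 0 <= j <= i */
        int tmp = d[i];
        d[i] = d[j];
        d[j] = tmp;
    }
}

/* 1 when each of 0..n-1 appears exactly once (n must not exceed SIZE). */
int is_permutation(const int *d, int n)
{
    int seen[SIZE] = { 0 };
    if (n > SIZE)
        return 0;
    for (int i = 0; i < n; ++i) {
        if (d[i] < 0 || d[i] >= n || seen[d[i]])
            return 0;
        seen[d[i]] = 1;
    }
    return 1;
}

void check_shuffle(void)
{
    int d[SIZE];
    for (int i = 0; i < SIZE; ++i)
        d[i] = i;                    /* card ids 0..51 */
    shuffle(d, SIZE);
    assert(is_permutation(d, SIZE)); /* no card lost or duplicated */
}
```

The permutation check only proves that no card was lost; checking that the
orderings are uniform needs a statistical test over many runs.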

                             * FromTheNet *

+ The internet address of each source has been provided.
+ As required: Grammar cleaned up, answers provided or improved.
+ Link contains answers.

What is callback ?
What is factory pattern ?
Explain the difference between data types logic and reg and wire
A wire is a net that connects nodes like a physical wire.
It is assigned by a port or assign and read but does not store any value.
A wire goes to X if it has multiple active drivers.
Legacy Verilog requires wires to connect to port signals.
reg is the legacy data storage type and does not necessarily infer a register.
It retains the last driven value and can be a FF, latch, or combinatorial logic.
In legacy Verilog a reg cannot be directly connected to a port.
logic is a new term synonymous with reg and in System Verilog this
construct can be connected to a port just like a net (wire).
What is the need of clocking blocks ?
What are the ways to avoid race condition between testbench and RTL using
Explain Event regions in SV.
What are the types of coverages available in SV ?
What is OOPS?
What is inheritance and polymorphism?
What is the need of virtual interfaces ?
Explain about the virtual task and methods .
What is the use of the abstract class?
What is the difference between mailbox and queue?
What data structure you used to build scoreboard
What are the advantages of linkedlist over the queue ?
How parallel case and full cases problems are avoided in SV
What is the difference between pure function and ordinary function ?
What is the difference between $random and $urandom?
What is scope randomization
List the predefined randomization methods.
What is the difference between always_comb and always@(*)?
What is the use of packages?
What is the use of $cast?
How to call the task which is defined in parent object into derived class ?
What is the difference between rand and randc?
What is $root?
What is $unit?
What are bi-directional constraints?
What is solve...before constraint ?
Without using randomize method or rand,generate an array of unique values?
Explain about pass by ref and pass by value?
What is the difference between bit[7:0] sig_1; and byte sig_2;
What is the difference between program block and module ?
What is final block ?
How to implement always block logic in program block ?
What is the difference between fork/join, fork/join_none, and fork/join_any ?
What is the use of modports ?
Write a clock generator without using always block.
What is forward referencing and how to avoid this problem?
What is circular dependency and how to avoid this problem ?
What is cross coverage ?
Describe the difference between Code Coverage and Functional Coverage. Which is more
important and why do we need them?
How to kill a process in fork/join?
Difference between Associative array and Dynamic array ?
Difference b/w Procedural and Concurrent Assertions?
What are the advantages of System Verilog DPI?
How to randomize dynamic arrays of objects?
What is randsequence and what is its use?
What is bin?
Why always block is not allowed in program block?
Which is best to use to model transaction? Struct or class ?
How is SV more random stable than Verilog?
Difference between assert and expect statements?
expect - An expect statement is very similar to an assert statement,
but it must occur within a procedural block (including initial or
always blocks, tasks and functions), and is used to block the
execution until the property succeeds.

How to add a new process without disturbing the random number generator state ?
What is the need of alias in SV?
What is the need to implement explicitly a copy() method inside a transaction , when
we can simple assign one object to other ?
How different is the implementation of a struct and union in SV.
What is "this"?
What is tagged union ?
What is "scope resolution operator"?
What is the difference between Verilog Parameterized Macros and SystemVerilog
Parameterized Macros?
What is the difference between

>   1.  logic data_1;
>   2.  var logic data_2;
>   3.  wire logic data_3;
>   4.  bit data_4;
>   5.  var bit data_5;

What is the difference between bit and logic?
Write a State machine in SV styles.
What is the difference between $rose and posedge?
What is the advantage of program block over clocking block w.r.t race condition?
How to avoid the race condition between program block ?
What is the difference between assume and assert?
What is coverage driven verification?
What is layered architecture ?
What are the simulation phases in your verification environment?
How to pick a element which is in queue from random index?
What data structure is used to store data in your environment and why ?
What is casting? Explain about the various types of casting available in SV.
How to import all the items declared inside a package ?
Explain how the timescale unit and precision are taken when a module
does not have any timescale declaration in RTL?
What is streaming operator and what is its use?
What are void functions ?
Functions that do not return a value.

How to make sure that a function argument passed as ref is not modified?
Pass it as a const.

What is the use of "extern"?
extern allows an out-of-body declaration using the scope resolution ::.

What is the difference between initial block and final block?

How to check whether a handle represents an actual object?
+ Check if it is set to null.

How to disable multiple threads which are spawned by fork...join?
+Already answered in detail. -- great website which sadly disappeared!
"The greatest of all faults is to be conscious of none"
(40)What are the differences between vmm and ovm/uvm ?
 VMM is an alternate implementation, OVM is the foundation of UVM-1.0EA

(41)What is the advantage of using an UVM agent?
 It encapsulates the components that share an interface.

(42)How multiple set_config_* are resolved at same hierarchy level and at
 different hierarchy levels?
 The configuration set at the highest hierarchy level takes precedence;
 at the same level the last set wins.
   test -> env -> agent -> driver

(43)Is it possible to connect multiple drivers to one sequencer?  How?
 Each driver can do a TLM peek and the last one does a get.
 However the drivers must be coordinated to not peek at the same time.
 A cleaner approach is to create another driver which gets the item
 from the sequencer and passes it to multiple driver instantiations.

(44)What is the difference between factory and callbacks ?
 Factory overrides are static for the duration of the simulation.
 Callbacks are dynamic and are only present after being registered and
 can be subsequently removed or turned off.

(45)Explain the mechanism involved in TLM ports.
 TLM ports communicate between components at the transaction level.

 The 23 TLM-1 ports define which access methods to use.
 Each is a subclass of the uvm_port_base class which in turn is a
 subclass of the uvm_tlm_if_base class; uvm_seq_item_pull_port is a
 subclass of uvm_sqr_if.
 They support a combination of blocking and non-blocking operations.
 A blocking interface task does not return until the transaction has
 been consumed.
 A non-blocking function does not consume simulation time and returns a
 failure status if the component is busy and cannot accept the request.
 There are also master and slave specific ports.
 This is a simplified subset of the ports:

       |   uvm_tlm_if_base #(REQ,RSP)                    |
       | uvm_port_base #( uvm_tlm_if_base #(T,T))        |
       | uvm_[[non]blocking]_{put,get,peek}_port #(T)    |

(46)Why is the TLM fifo used ?
 The TLM fifo is used to store multiple transactions.

 The TLM FIFO provides a standard interface to the module
 which improves portability.
 There are two FIFOs: uvm_tlm_fifo and uvm_tlm_analysis_fifo
 and two channels: uvm_tlm_req_rsp_channel uvm_tlm_transport_channel
 The analysis version is only required when there is an analysis port.

(47)How to add user defined phase?
 The phase class can be extended and new phases can be added using an
 add_phase() function.
 It is also useful to be able to jump phases such as doing another
 reset after the run phase.

(48)What are the ways to get the configuration information inside component?
 uvm_config_db#(type)::get(context, "instance_name", "field", value)

(49)Is it possible to use get_config_obj inside a sequence class?
 A sequence class can use get_config_obj in this manner:
 uvm_config_db#(my_type)::get(null, get_full_name(), "foo", foo);

(50)What is the difference between create and new method?
 Create is a factory method that allows things to be overridden which
 is useful for things like error injection.
 New instantiates the object and cannot be changed later.

(51)What is virtual sequencer? Explain by writing example.
 A virtual sequencer and virtual sequence are required when multiple DUT
 interfaces need to be synchronized.
 A virtual sequence does not contain its own items.

(52)How a sequence is started?
 A sequence can be started explicitly using: seq_obj.start(Sequencer)
 It can also be started implicitly:

(53)Explain end of test mechanism.
 The test ends when all raised objections are dropped.
 Active transactions should always have a corresponding objection.

(54)When/how the stop() method is called?
 User code or assertions can invoke $stop when a condition requires
 further debug.
 The stop method is also invoked when the severity action is set to

(55)What is port/imp/export?
 A component such as the monitor drives the transaction out through
 a port.  The agent then exports the transaction to the implementation
 which models the device behaviour.

(56)Which phase is top down, and which phase is bottom up?
 The build and final phases are top down, all others are bottom up.

(57)In which phase method, super method is required to call.
 What if the super is not called ?
 The super method is generally called for all but the run phase.
 If the super is not called the parent class properties will not be

(58)Explain the different phases and their purpose.
 Only the build() and final are top down, only the run() takes time.
 build - construct and cfg various child components/ports/exports.
 connect - connect the component ports/exports.
 end_of_elaboration - configure the components if required.
 start_of_simulation - print the banners and topology.
 run -  execute test body and fork all threads, only phase task.
   The test is selected using: +UVM_TESTNAME=<testcase>
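   The run phase runs in parallel with twelve run-time phases:
     {pre_,,post_}reset, {pre_,,post_}configure, {pre_,,post_}main,
     {pre_,,post_}shutdown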
 extract - gather required information.
 check - check pending requests in scoreboard, read regs/statistics.
 report - report pass/fail status.
 final - simulation is about to end and print final messages

(59)How to use factory override a sequence?
 By name or by type.
 A factory can be used to override a single instance using
 set_inst_override or all instances of a type using the static method
 orig_class::type_id::set_type_override(new_class::get_type()).

(60)Explain how scoreboard is implemented.
 A scoreboard is implemented using two analysis exports, along with
 a queue and an associative array to hold the transactions.
 An SV scoreboard is generally implemented using queues but can
 also be implemented using mailboxes for the egress traffic.

(61)What is the use of subscriber?
 A subscriber has a built in analysis_export and is typically used with
 a monitor which passes the transaction using a write function.  It is
 used to collect functional coverage or check the trans validity.

(62)Explain about run_test().
 run_test() runs a specified test, default test, or one named using
 +UVM_TESTNAME which also defines UVM_TESTNAME to the test name:
   if(UVM_TESTNAME == long_test) num_iterations *= 10;
 The uvm_root singleton object uses the factory to construct a
 component of that class with name "uvm_test_top."

(63)How interface is passed to component.
 An interface is fetched and set using a uvm_config_db get/set pair.
 SV interfaces handles are passed to a constructor which assigns
 the value to its virtual interface.

(64)What is the difference b/w active agent and passive agent?
 An active agent drives a bus whereas a passive agent only acts as a
 monitor.  Ideally agents should have a switch that allows the driver
 to be disabled so that only the monitor runs.

(65)Explain how an interrupt sequence is implemented.
 Interrupts are device specific and can be managed as a transaction.
 Interrupt sequences are implemented using grab(), ungrab() methods.

(66)Explain how layered sequencers are implemented.
 Each layered sequence is defined as a separate class with the next
 layer sequence contained as a payload.
 For example the PCIE physical layer contains the data link layer as
 its payload which in turn contains the (PCIe) transaction layer.

(67)How to change the verbosity level of log messages from command line ?

(68)How to set verbosity level to one particular component?

(69)How to fill the gap b/w different objections?
 The recipient must drop the objection first.

(70)What is the use of uvm_event?
 The uvm_event is part of an event pool that provides a synchronization
 mechanism that enhances the equivalent System Verilog functionality.
 It supports two distinct operational modes:
   edge sensitive & level sensitive

(71)How to connect multiple sequencers to one driver ?
??      Using a virtual sequencer.

(72)What is returned by get_type_name()?
 The get_type_name() call returns a string value of the object type.

(73)How to use set_config_* for setting variables of sequence?
 UVM1.2 deprecates set_config_* and uses:
 uvm_config_db#(TYPE)::get(null, this.get_full_name(), "field", field);

(74)For debugging purpose, how to print the current component phase?
 The UVM phase is automatically printed by the `uvm message macros.

(75)What is the disadvantage if sequence is registered to sequencer using
 utility macros? If sequence is not registered with sequencer, then how
 to invoke the sequence execution ?
 It cannot be used by other sequencers and must be managed explicitly.

(76)What is the use of uvm_create_random_seed?

(77)What is the difference b/w starting a sequence using default_sequence
 and sequence.start() method?
 The default sequence is only used when nothing else needs to be run.

(78)I don't want to register a sequence to sequencer.
??      What is the alternate macro to uvm_sequence_utils() macro ?
 The sequence can be instantiated followed by the call

(79)Explain how UVM callbacks work?
 UVM callbacks are implemented using virtual pre/post methods which
 the user can define.

(80)What is the difference between m_sequencer and p_sequencer?
 m_sequencer is the generic uvm_sequencer_base handle to the sequencer the
 sequence is running on; p_sequencer is the same handle cast to the user
 sequencer type by `uvm_declare_p_sequencer(sequencer_type).

 A virtual sequence extends uvm_sequence and typically uses:
   `uvm_do_on() instantiate agent sequences
   `uvm_do_on_with() supports a constraint as well

SystemVerilog Interview Questions 7

1)  Difference between Associative array and Dynamic array ?

 Dynamic arrays are useful for dealing with contiguous collections of variables whose
number changes dynamically.
 e.g.            int array[];
 When the size of the collection is unknown or the data space is sparse, an
associative array is a better option. An associative array is indexed by a key of the
declared type, such as a string holding a transaction name.
e.g.            int array[string];

2)  What are the advantages of SystemVerilog DPI?

SystemVerilog introduces a new foreign language interface called the Direct
Programming Interface (DPI). The DPI provides a very simple, straightforward, and
efficient way to connect SystemVerilog and foreign language code unlike PLI or VPI.

3)  What is bin?

A coverage-point bin associates a name and a count with a set of values or a sequence
of value transitions. If the bin designates a set of values, the count is incremented
every time the coverage point matches one of the values in the set. If the bin
designates a sequence of value transitions, the count is incremented every time the
coverage point matches the entire sequence of value transitions.

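program main;
 bit [0:2] y;
 bit [0:2] values[$]= '{3,5,6};

 covergroup cg;
   cover_point_y : coverpoint y
                   { option.auto_bin_max = 4 ; }
 endgroup

 cg cg_inst = new();
 initial
   foreach(values[i])
     begin
       y = values[i];
       cg_inst.sample(); // update the bin counts for the current y
     end
endprogram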


4) What are void functions ?
A void function does not have a return value.

5) What is coverage driven verification?

Coverage Driven Verification is a result oriented approach to functional verification.
The manager and verification team define functional coverage points, and then work
out the details of the process.
Used effectively coverage driven verification focuses the Verification team on
measurable progress toward an agreed upon comprehensive goal.

6) Explain about pass by ref and pass by value?
Pass by value is the default method through which arguments are passed into functions
and tasks. Each subroutine retains a local copy of the argument. If the arguments are
changed within the subroutine declaration, the changes do not affect the caller.

In pass by reference, functions and tasks directly access the specified variables
passed as arguments. It is like passing a pointer to the variable.

task automatic pass(int i);  // task automatic pass(ref int i); passes by reference
  i = 1;
  $display(" i is changed to %0d at %0t", i, $time);
  i = 2;
  $display(" i is changed to %0d at %0t", i, $time);
endtask

7) What is the difference between program block and module ?

The module is the basic building block in Verilog which works well for Design.
However, for the testbench, a lot of effort is spent getting the environment properly
initialized and synchronized, avoiding races between the design and the testbench,
automating the generation of input stimuli, and reusing existing models and other

System Verilog adds a new type of block called program block. The program construct
serves as a clear separator between design and testbench, and, more importantly, it
specifies specialized execution semantics in the Reactive region for all elements
declared within the program. Together with clocking blocks, the program construct
provides for race-free interaction between the design and the testbench, and enables
cycle and transaction level abstractions.

8) Describe the difference between Code Coverage and Functional Coverage Which is more
important and Why we need them.

Code Coverage indicates how much of the RTL has been exercised. The Functional
Coverage indicates which features or functions have been executed. Both of them are
very important.  Code Coverage alone may not reflect the real feature
coverage. On the other hand, the functional coverage may miss some unused RTL coverage.

Below are the most frequently asked UVM Interview Questions,

What is uvm_transaction, uvm_seq_item, uvm_object, uvm_component?
What is the advantage of  `uvm_component_utils() and `uvm_object_utils() ?
What is the difference between `uvm_do and `uvm_rand_send?

What is the difference between uvm_transaction and uvm_seq_item?

uvm_transaction is the parent class for uvm_seq_item.
uvm_transaction usage is discouraged and will likely be deprecated.

What is the difference between uvm_virtual_sequencer and uvm_sequencer ?
A uvm_virtual_sequencer does not have a one-to-one corresponding sequence.

What are the benefits of using UVM?
What is super keyword? What is the need of calling super.build() and super.connect()?
Is UVM independent of System Verilog ?

Can we have user defined phase in UVM?
User defined phases can be added using add_phase.

What is p_sequencer ?
p_sequencer is the local user defined handle.

What is a UVM RAL model ? why it is required ?
What is the difference between new() and create?
What is analysis port?
What is TLM FIFO?
How sequence starts?
What is the difference between UVM RAL model backdoor write/read and front door
write/read ?
What is objection?
What is the advantage of `uvm_pre_body and `uvm_post_body ?
What is the difference between Active mode and Passive mode?

What is the difference between copy and clone?
Unlike clone the copy method does not copy the name field.

What is UVM factory?

What are the different types of sequencers?  Explain them.

??  Sequencers can be push or pull, request or request & response.

What are the different phases of uvm_component? Explain each?
How set_config_* works?
What are the advantages of UVM RAL model ?

What is the difference between set_config_* and uvm_config_db ?
set_config has been deprecated.

What  are the different  override types?
What is virtual sequence and virtual sequencer?
Explain end of simulation in UVM?

How to declare multiple imports?

Multiple imports can be declared using the `uvm_*_imp_decl(SFX) macros, for example
`uvm_analysis_imp_decl(_name), which creates a uvm_analysis_imp_name class.

What is symbolic representation of port, export and analysis port?
What is the difference in usage of $finish and global stop request in UVM?
Why we need to register class with UVM factory?
Can we use set_config and get_config in a sequence ?

What is uvm_heartbeat ?

The uvm_heartbeat uses UVM's objection mechanism so that environments can
ensure that their descendants are alive.

How to access DUT signal in uvm_component/uvm_object ?

Below are the most frequently asked System Verilog Interview Questions,

What is the difference between initial and final block of System Verilog?
Explain simulation phases of System Verilog verification?
What is the Difference between System Verilog packed and unpacked array?
What is "this " keyword in System Verilog?
What is alias in System Verilog ?
randomized in System Verilog test bench?
In System Verilog, which array type is preferred for memory declaration and why?
How do you avoid race conditions between the DUT and the test bench?
Use the program block.

What are the advantages of System Verilog program block?
What is the difference between logic and bit in System Verilog ?
What is the difference between data type logic and wire?
What is virtual interface?
What is abstract class?
What is the difference between $random and $urandom?
What is expect statements in assertions ?
What is DPI ?
What is the difference between == and === ?
What are system tasks ?
What is System Verilog assertion binding and advantages of it ?
What are parametrised classes ?
How to generate array without randomisation ?
What is the difference between always_comb and always @(*) ?
What is the difference between overriding and overloading ?
Explain the difference between deep copy and shallow copy?

Below are the most frequently asked SystemC interview questions,

What is SystemC SC_HAS_PROCESS?
What is the difference between SystemC sc_int and sc_bigint ?
What is the difference between SystemC SC_METHOD and SC_THREAD?
What are the different types of sensitivity in SystemC?
Why is SC_METHOD preferable over SC_THREAD?
What is context switching in SC_THREAD?
Explain the SystemC simulation kernel?
What is virtual prototyping?
What is end of elaboration and before end of elaboration?
What is SC_ZERO_TIME?
What is the difference between sc_port and sc_export?
What are the features of TLM 2.0?
What are the differences between TLM 1.0 and TLM 2.0?
What are the different transport interfaces in TLM 2.0?
Difference between method and thread?
Which one would you choose for implementation?
What is SC_ZERO_TIME in SystemC and what is its use?
What is the use of dont_initialize ?
Explain SystemC next_trigger and wait difference?
Why are utility sockets under the non-interoperable layer?
Explain Debug and Direct memory interface?
What is temporal decoupling ?
Explain mutex and semaphore?

** Pending **


* +++++:
* *****:

1) a 4 to 1 packet demux.  4 inputs, coming in each with their own queue, scheduler
selects one of the four to go out.  Describe at a high level, how would you verify
(major tb components, corner cases, assertions, etc).  Write the scoreboard for this

2) What was a technical challenge you encountered and how did you overcome it

3) Describe my AQ chip level tb.  The challenge with this question was that I was
given like 10 minutes to describe it.  The key to answering this properly is to have
an answer that describes it in logical fashion, while hitting all the challenging

4) Imagine you joined a team w/ only design engineers and you are the only verif
engineer.  They run a sanity test before checking in any rtl code, but the sanity test
takes an hour.  How would you fix that?

5) Imagine you have an old school telephone.  It takes one of 4 options as an input.
 a) 0 - for the operator
 b) 911 - for emergency
 c) 7 digit code for local
 d) 10 digit code for international, that always starts with 1.

Write the data class for this scenario.
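
A minimal sketch of such a data class, written in Python as a language-neutral
reference model rather than as the expected SystemVerilog answer. The names CallKind,
PhoneCall, is_valid and randomize are hypothetical. One assumption is added that the
question does not state: a local number may not start with 0, 1 or 911, so that it
cannot be confused with the other three options while dialing. In SystemVerilog the
same rules would typically become constraints on a rand digit queue.

```python
import random
from dataclasses import dataclass
from enum import Enum


class CallKind(Enum):
    OPERATOR = "operator"
    EMERGENCY = "emergency"
    LOCAL = "local"
    INTERNATIONAL = "international"


@dataclass
class PhoneCall:
    kind: CallKind
    digits: str

    def is_valid(self):
        """Check the dialed digits against the rule for this call kind."""
        d = self.digits
        if not (d.isascii() and d.isdigit()):
            return False
        if self.kind is CallKind.OPERATOR:
            return d == "0"
        if self.kind is CallKind.EMERGENCY:
            return d == "911"
        if self.kind is CallKind.LOCAL:
            # Assumption: local numbers avoid the 0, 1 and 911 prefixes.
            return len(d) == 7 and d[0] not in "01" and not d.startswith("911")
        return len(d) == 10 and d[0] == "1"

    @classmethod
    def randomize(cls, rng, weights=(1, 1, 4, 4)):
        """Pick a weighted call kind, then legal digits for that kind."""
        kind = rng.choices(list(CallKind), weights=weights)[0]
        if kind is CallKind.OPERATOR:
            digits = "0"
        elif kind is CallKind.EMERGENCY:
            digits = "911"
        elif kind is CallKind.LOCAL:
            while True:
                digits = rng.choice("23456789") + "".join(
                    rng.choice("0123456789") for _ in range(6))
                if not digits.startswith("911"):
                    break
        else:
            digits = "1" + "".join(rng.choice("0123456789") for _ in range(9))
        return cls(kind, digits)
```

Passing a seeded random.Random instance as rng keeps the stimulus reproducible, and
constructing PhoneCall directly with illegal digits gives the error-injection cases.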

6) You have a synchronous fifo, where the input can come in any number of bytes,
while the output can request any number of bytes.  Write a scoreboard for this.
  Ex: let's say the following scenario is plausible
            Time x:   Input drives 3 bytes
            Time x+1 : Input drives in 5 bytes
            Time x+2: Input drives in 2 bytes
            Time x+3: Output requests 10 bytes
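
One possible shape for this scoreboard, sketched in Python as a reference model (the
class and method names are hypothetical). The idea is to ignore beat boundaries and
compare at byte granularity: every input beat appends its bytes to one expected queue,
and every output beat pops and compares as many bytes as it delivers. In a
SystemVerilog testbench the deque would be a byte queue and the two methods would be
the write() callbacks of the input and output monitors.

```python
from collections import deque


class ByteFifoScoreboard:
    """Byte-granular reference model for a FIFO with variable-width beats."""

    def __init__(self):
        self.expected = deque()   # bytes written but not yet read, oldest first
        self.errors = []

    def write_input(self, data):
        """Input monitor callback: bytes accepted by the FIFO in one cycle."""
        self.expected.extend(data)

    def write_output(self, data):
        """Output monitor callback: returns True if every byte matched."""
        errors_before = len(self.errors)
        for actual in data:
            if not self.expected:
                self.errors.append(f"unexpected byte 0x{actual:02x} (model empty)")
                continue
            want = self.expected.popleft()
            if want != actual:
                self.errors.append(f"expected 0x{want:02x}, got 0x{actual:02x}")
        return len(self.errors) == errors_before

    def pending(self):
        """Bytes still in flight; should be 0 at end of test."""
        return len(self.expected)
```

For the scenario above, three write_input calls of 3, 5 and 2 bytes followed by one
write_output of 10 bytes should leave pending() at 0 with no errors recorded.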

7) Let's say you are given a block to verify. How do you go about doing this?
       This was basically to see if you understand the basic verification flow, from
architecture spec to gate level sim/coverage closure.

8) Let's say you have a block which is connected to a network on one side and a ddr
module on the other.  Let's say that the block receives network packets which provide
either a read or write command to the ddr.  If it is a write, the data will be
provided in that packet.  If it is a read, you would send the data back on a response
packet.  How would you verify this?  Write code for a data class for this.

9) Write a sv class which acts as a memory manager.  The class has two functions a req
(int bytes), free(addr, bytes).  Fill out the code for the two functions.  The free
may not free the entire block, the free can free a subset of a previously granted
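
A sketch of one way to answer this, in Python as a reference model of the algorithm.
Only the req and free names come from the question. The sorted free list, the
first-fit policy, the coalescing of neighbours and the double-free check are
assumptions chosen for illustration. Tracking free space rather than granted blocks is
what makes a partial free cheap: any sub-range of a granted block is just a new free
interval that may merge with its neighbours.

```python
import bisect


class MemoryManager:
    """First-fit allocator over [base, base + size) with partial frees."""

    def __init__(self, base, size):
        self.base = base
        self.limit = base + size
        # Sorted list of (start, size) free intervals that never touch or overlap.
        self.free_list = [(base, size)]

    def req(self, n_bytes):
        """Return the start address of a granted block, or None if nothing fits."""
        if n_bytes <= 0:
            return None
        for i, (start, size) in enumerate(self.free_list):
            if size >= n_bytes:
                if size == n_bytes:
                    del self.free_list[i]
                else:
                    self.free_list[i] = (start + n_bytes, size - n_bytes)
                return start
        return None

    def free(self, addr, n_bytes):
        """Release [addr, addr + n_bytes); may be a subset of a granted block."""
        end = addr + n_bytes
        if n_bytes <= 0 or addr < self.base or end > self.limit:
            raise ValueError("range outside managed memory")
        i = bisect.bisect_left(self.free_list, (addr, 0))
        prev_end = sum(self.free_list[i - 1]) if i > 0 else None
        # Reject double frees: the range must not overlap a free neighbour.
        if prev_end is not None and prev_end > addr:
            raise ValueError("range already free")
        if i < len(self.free_list) and self.free_list[i][0] < end:
            raise ValueError("range already free")
        # Merge with the previous and next free intervals when they touch.
        if prev_end == addr:
            i -= 1
            addr = self.free_list[i][0]
            del self.free_list[i]
        if i < len(self.free_list) and self.free_list[i][0] == end:
            end = sum(self.free_list[i])
            del self.free_list[i]
        self.free_list.insert(i, (addr, end - addr))
```

The model does not remember who owns a range, so it cannot detect a free that spans
two different grants; a fuller answer would also keep a table of granted blocks.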

10) You have a block w/ a data_valid, data, and ready signal.  Write a driver for it.

//Instantiating the memory
uvm_mem   xyz_mem;
xyz_mem = new("xyz_mem_name", xyz_mem_size, 8, "RW", UVM_NO_COVERAGE);

// Getting a random memory region
uvm_mem_region tmp_mem_reg;
bit[63:0] addr;

tmp_mem_reg = xyz_mem.mam.request_region( region_size);
addr = tmp_mem_reg.get_start_offset() + xyz_mem_base_addr;

// Releasing a region
xyz_mem.mam.release_region(tmp_mem_reg);

* +++++++++++:
* ********** :
1) How will you write a coverpoint for instruction set of a CPU? How will you make
sure that a hit in the bin is a genuine hit?

2) Write a cover point for sequence of instructions. It is a single bin which covers a
sequence of instructions. For example RD->WR->ADD.
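
In System Verilog this is a transition bin, written as
bins rd_wr_add = (RD => WR => ADD); inside the coverpoint. The Python sketch below
only models what such a bin counts: the sequence must appear on consecutive sample
points. The TransitionBin name is hypothetical, and the sketch assumes sample() is
called only for instructions that were actually issued, which is the usual way to keep
a hit genuine.

```python
from collections import deque


class TransitionBin:
    """Counts how often `sequence` appears on consecutive sample points."""

    def __init__(self, sequence):
        self.sequence = tuple(sequence)
        # Sliding window holding the most recent len(sequence) samples.
        self.window = deque(maxlen=len(self.sequence))
        self.hits = 0

    def sample(self, opcode):
        """Record one sampled opcode and return the running hit count."""
        self.window.append(opcode)
        if tuple(self.window) == self.sequence:
            self.hits += 1
        return self.hits
```

Sampling RD, WR, RD, WR, ADD gives one hit, because the first RD, WR pair is broken by
the second RD before ADD arrives.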

3) At which level of the UVM hierarchy should you collect coverage?

4) Write a transaction class for a dummy CPU?

5) What is the driver driving on the bus when there is a wait in the sequence? For
example, there is a #DELAY in the sequence before sending a new transaction to the driver.

6) Write a C program that finds the index of a duplicate in an array and returns the
duplicate value
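
The logic can be sketched in a few lines. This version is in Python rather than C and
the function name is hypothetical. It reads the question as "return the first element
that repeats an earlier one, together with its index". A C answer would follow the same
single pass, with the set replaced by a hash table, a bitmap for small value ranges, or
a sort if the array may be reordered.

```python
def first_duplicate(values):
    """Return (index, value) of the first element that repeats an earlier one.

    Returns None when all elements are distinct.
    """
    seen = set()
    for index, value in enumerate(values):
        if value in seen:
            return index, value
        seen.add(value)
    return None
```

The pass is O(n) in time and O(n) in extra space.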

Asic Guru Review -
Event Regions: Active - Assigns values, Reactive - Responds to assignments.
Coverage Types - Line Branch Expression Functional
Abstract Class
Diff between mbox and queue?
 You can access and modify any part of queue.
$random([seed]) $urandom([seed]) $urandom_range(min,max), +ntb_random_seed=x
Solve before constraints randsequence
Final block
How do you avoid circular dependencies ?
Diff between Code and Functional coverage.
Alias in SV.
Tagged Union.
What are the simulation phases.
How to disable threads.
Streaming Operator.

Other questions:
VMM/UVM phases:
Reactive region
Polymorphism - Replace virtual methods in derived/extended class.
Random Stability - Random generators are independent

Allocate a memory region:
