
[Synthesis] nangate45/swerv_wrapper QoR fluctuation due to Yosys abc_new nondeterminism #4056

@jhkim-pii

Description

Describe the bug

On a fresh ORFS checkout, repeated synthesis-only runs of nangate45/swerv_wrapper can produce different 1_2_yosys.v netlists, even when the RTL and flow settings are unchanged.

This can create a false early QoR delta in downstream OpenROAD stages.

Observed symptom:

  • Re-running synthesis in the same checkout can change 1_2_yosys.v.
  • That can change 1_synth.odb and early downstream artifacts.
  • The resulting QoR delta is a false positive caused by synthesis nondeterminism.

Example CLI to show run-to-run hashes:

sha256sum results/nangate45/swerv_wrapper/repeat*/1_2_yosys.v

Example hashes observed from repeated runs on the same canonical RTLIL input:

2c4cd6c367f1fcdd1a435a546d3678f31d3846f06baea926e746ca06d11e8090
3cd1746d2f44a8b83b5a24b7a4c95f680ba62df3ca442e3170d542780f626936
ce255a889b890faadccd788b0e655c2def35d0c838bdd578a335fc09fbbb92c4

First visible divergence in logs is the abc_new module visitation order. Example excerpt from two repeated runs:

Run 1
295: 20.5. Mapping module 'ALU_33_0_33_0_33_unused_CO_X_HAN_CARLSON'.
306: 20.8. Mapping module 'ALU_64_0_64_0_64_HAN_CARLSON'.
315: 20.11. Mapping module 'ALU_19_0_1_0_19_unused_CO_X_HAN_CARLSON'.
324: 20.14. Mapping module 'ALU_32_0_1_0_32_unused_CO_X_HAN_CARLSON'.
333: 20.17. Mapping module 'ALU_32_0_32_0_33_unused_CO_X_HAN_CARLSON'.
342: 20.20. Mapping module 'ALU_20_0_1_0_20_unused_CO_X_HAN_CARLSON'.

Run 2
295: 20.5. Mapping module 'ALU_32_0_1_0_32_unused_CO_X_HAN_CARLSON'.
306: 20.8. Mapping module 'ALU_32_0_32_0_33_unused_CO_X_HAN_CARLSON'.
315: 20.11. Mapping module 'ALU_64_0_64_0_64_HAN_CARLSON'.
324: 20.14. Mapping module 'ALU_32_0_32_0_32_unused_CO_X_HAN_CARLSON'.
333: 20.17. Mapping module 'ALU_20_0_1_0_20_unused_CO_X_HAN_CARLSON'.
342: 20.20. Mapping module 'ALU_33_0_33_0_33_unused_CO_X_HAN_CARLSON'.

Expected Behavior

  • Re-running synthesis with the same RTL and the same settings should produce the same 1_2_yosys.v.
  • OpenROAD QoR comparisons should not be confounded by nondeterministic Yosys output.

Yosys/ABC standalone test case

https://drive.google.com/file/d/1R3TRw4duoD6hwauSTt6WyhBsIrcUojeR/view?usp=drive_link

./run.sh -n 10           # 10 runs in parallel
./run.sh -n 10 --yosys <yosys_exec_path> --abc <abc_exec_path>    # Use the given executables

Reproduce From A Fresh ORFS Clone

Clone and build:

git clone --recursive https://github.com/The-OpenROAD-Project/OpenROAD-flow-scripts.git
cd OpenROAD-flow-scripts
./build_openroad.sh --local
source ./env.sh

Run synthesis twice in the same checkout:

cd flow

make -j1 NUM_CORES=16 \
  FLOW_VARIANT=run1 \
  DESIGN_CONFIG=./designs/nangate45/swerv_wrapper/config.mk \
  clean_synth \
  results/nangate45/swerv_wrapper/run1/1_2_yosys.v

make -j1 NUM_CORES=16 \
  FLOW_VARIANT=run2 \
  DESIGN_CONFIG=./designs/nangate45/swerv_wrapper/config.mk \
  clean_synth \
  results/nangate45/swerv_wrapper/run2/1_2_yosys.v

sha256sum \
  results/nangate45/swerv_wrapper/run1/1_2_yosys.v \
  results/nangate45/swerv_wrapper/run2/1_2_yosys.v

If the first two runs happen to match, repeat the run 5-10 times:

for i in $(seq -w 1 10); do
  make -j1 NUM_CORES=16 \
    FLOW_VARIANT=repeat${i} \
    DESIGN_CONFIG=./designs/nangate45/swerv_wrapper/config.mk \
    clean_synth \
    results/nangate45/swerv_wrapper/repeat${i}/1_2_yosys.v
done

sha256sum results/nangate45/swerv_wrapper/repeat*/1_2_yosys.v

If multiple hashes appear, the issue is reproduced.

Failure Flow

flowchart TD
    A[Same RTL] --> B[Repeat synth]
    B --> C{abc_new / ABC9}
    C -->|unstable<br/>order| D[Different<br/>1_2_yosys.v]
    D --> E[Different early<br/>ORFS artifacts]
    E --> F[False QoR delta]

    C -->|stable<br/>order| G[Same<br/>1_2_yosys.v]
    G --> H[Stable QoR<br/>comparison]

Why This Happens

Immediate cause

abc_new depends on module iteration order.

Relevant call site:

// passes/techmap/abc_new.cc
std::vector<Module*> order_modules(Design *design, std::vector<Module *> modules);
...
selected_modules = order_modules(active_design,
                                 active_design->selected_whole_modules_warn());

selected_whole_modules_warn() is backed by Design::selected_modules(), which iterates modules_ directly.

Raw member type:

// kernel/rtlil.h
dict<RTLIL::IdString, RTLIL::Module*> modules_;

Relevant function:

// kernel/rtlil.cc
std::vector<RTLIL::Module*> RTLIL::Design::selected_modules(
    RTLIL::SelectPartials partials,
    RTLIL::SelectBoxes boxes
) const
{
    ...
    for (auto &it : modules_)
        result.push_back(it.second);
    ...
}

So abc_new / ABC9 can visit modules in different orders across runs, which perturbs internal naming and ultimately changes the write_verilog output.

About dict<>

The relevant container is:

// kernel/rtlil.h
dict<RTLIL::IdString, RTLIL::Module*> modules_;

Template declaration and storage:

// kernel/hashlib.h
template<typename K, typename T, typename OPS = hash_ops<K>> class dict;

template<typename K, typename T, typename OPS>
class dict {
    ...
    std::vector<int> hashtable;
    std::vector<entry_t> entries;
    ...
};

iterator is a nested class inside dict<K, T, OPS>. So the iterator used by:

for (auto &it : modules_)

is dict<RTLIL::IdString, RTLIL::Module*>::iterator, and it walks the internal entries array of that specific dict instance.

Relevant nested iterator:

// kernel/hashlib.h
template<typename K, typename T, typename OPS>
class dict {
    ...
    class iterator
    {
        ...
        iterator operator++() { index--; return *this; }
        std::pair<K, T> &operator*() { return ptr->entries[index].udata; }
        ...
    };
};

Insertion path:

// kernel/hashlib.h
entries.emplace_back(...);
hashtable[hash] = entries.size() - 1;

Hypothesis

Important clarification:

  • dict<> itself appears deterministic.
  • The issue is not that dict<> iteration is random.
  • The issue is that dict<> iteration is not canonical / key-sorted.
  • Iteration follows internal entries order, which depends on insertion / erase history.

So the most likely nondeterminism mechanism is:

upstream pass order or object creation order changes
-> modules_ insertion history changes
-> dict iteration order changes
-> selected_modules() returns modules in a different order
-> abc_new sees a different module visitation order
-> 1_2_yosys.v changes

This is an inference from the observed behavior:

  • Same input.
  • Same binary.
  • Different abc_new module visitation order.
  • Stability restored when Design::sort() is forced before abc_new.

Relationship summary:

Design::modules_
  = dict<IdString, Module*>
  -> stored in dict::entries
  -> iterated by dict::iterator
  -> consumed by Design::selected_modules()
  -> consumed by abc_new

Workarounds

ORFS local workaround

Insert sort immediately before abc_new in flow/scripts/synth.tcl:

sort
log_cmd abc_new {*}$abc_args

Yosys workaround

Insert active_design->sort() at the start of the abc_new map stage:

// passes/techmap/abc_new.cc
if (!help_mode) {
    active_design->sort();
    selected_modules = order_modules(active_design,
                                     active_design->selected_whole_modules_warn());
}

The sort() call itself is defined here:

// kernel/rtlil.cc
void RTLIL::Design::sort()
{
    scratchpad.sort();
    modules_.sort(sort_by_id_str());
    for (auto &it : modules_)
        it.second->sort();
}

Both mitigations were verified to stabilize repeated synthesis runs.
