Bugs? Not in my schedule please.

I have rescued a few designs during my 25+ years at the helm of Aspen Logic's FPGA/ASIC contracting business. The failures had many causes, but three areas need consistent focus:

  1. Designers give little up-front thought to logic reset, which leads to insidious, hard-to-find bugs
  2. Management typically assigns a cost of $0 to bugs because FPGA devices are re-programmable, so no time is allocated in the schedule for squishing them
  3. Nobody keeps a road map for features and fixes, which makes decision making on large design efforts difficult

Reset Logic

#1 just requires an up-front process that recognizes the significance of this rather boring chore, followed up by design review coverage. Pay special attention to 3rd-party intellectual property cores and ask your vendor how their block is reset. If it contains a mixture of asynchronous and synchronous resets along with multiple clock domains -- run for the hills, pick a new vendor, or demand to see a detailed explanation of how those are all implemented!

Zap Zap Zap (Go the Bugs)

Most of this post covers #2, as it requires zapping project management to wake them up. You like the idea of zapping your boss, task-master, Voldemort project manager (PM), but not necessarily getting fired, am I right?

A quick way to zap is to insist on adding line items to your next project plan for fixing bugs and then assign outrageous $$$ figures to them. For example, tell the PM to budget for removal of 50 simple bugs, 20 hard bugs and 5 wicked bugs. (Scale the numbers with the complexity of the project.) Make each wicked bug take 40-80 days to solve, each hard bug 10-15 days and each simple bug 1-2 days. Explain that wicked bugs occur because of requirement creep impacting the implementation with no way to regression test new feature code, poor or non-existent design reviews, and schedule-driven code creation with little or no up-front design effort. Hard bugs occur when engineers do not get time for writing block/module/unit test benches. Simple bugs occur because engineers race to meet artificial deadlines, miss simple details and make bonehead mistakes. (This is just a concrete way of starting a discussion -- bugs are a result of many things including the items mentioned.)
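If you want to walk into the PM's office with totals in hand, the arithmetic is easy to script. Here is a rough Python sketch using the example counts and durations from the paragraph above. The names (BugClass, BUG_BUDGET, total_effort) and the scale parameter are my own inventions for illustration, and treating "scale with complexity" as a simple linear multiplier is an assumption, not a rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BugClass:
    """One line item in the bug budget (hypothetical structure)."""
    name: str
    count: int      # how many bugs of this class to budget for
    min_days: int   # optimistic days to fix one bug
    max_days: int   # pessimistic days to fix one bug

    def effort_range(self):
        """Return (low, high) total days for this class of bug."""
        return self.count * self.min_days, self.count * self.max_days


# Example numbers taken straight from the text above.
BUG_BUDGET = [
    BugClass("simple", 50, 1, 2),
    BugClass("hard", 20, 10, 15),
    BugClass("wicked", 5, 40, 80),
]


def total_effort(classes, scale=1.0):
    """Sum the (low, high) effort over all classes.

    scale is an assumed linear multiplier for project complexity.
    """
    low = sum(c.effort_range()[0] for c in classes) * scale
    high = sum(c.effort_range()[1] for c in classes) * scale
    return low, high


for bug_class in BUG_BUDGET:
    low, high = bug_class.effort_range()
    print(f"{bug_class.name:>6}: {low}-{high} days")

low, high = total_effort(BUG_BUDGET)
print(f" total: {low:.0f}-{high:.0f} days")
```

With the example numbers this works out to 50-100 days for simple bugs, 200-300 for hard bugs and 200-400 for wicked bugs, or 450-800 days in total. Those are days of effort, so how they land on the calendar depends on how many engineers share the load. Either way, it is a long way from $0.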

Note that every project manager on the planet prohibits bug removal as one big line item in their schedules because the duration is impossible to predict and therefore their delivery date becomes impossible to predict. Writing HDL code delivers something concrete they can measure so schedules are always built around those tasks with a fixation bordering on mania.

Many could not write a line of code to save their mothers but they know, conveniently, it will only take you 11.5 days to code (meaning design, write HDL, test, debug, document) block #1! After checking off the coding tasks, everybody is goosed to deliver a bug-free design through long hours and overtime, which unfortunately just makes for tired engineers and more bugs. PMs always assume free overtime effort will cure any schedule deficiency and frequently state, "By the way, I thought it was free to re-program FPGAs. Isn't that why we bought them?" :-)

Do not fall into the gaping maw of this trap. Insist on predicting bug killing time up front, and separately from design; then you can have a fist fight discussion about how to spend time and money effectively to reduce (but not entirely eliminate) it. Negotiate the time needed to fix bugs, but not the number of them, by insisting that design, verification and review will reduce the severity of each class of defect. Educate them that controllability and observability are key to defect diagnosis, and that the proper tools (hardware and/or software) and resources (50% unused LUTs/memory/routing for internal FPGA logic analyzers) are critical, so they must be considered early in the budgeting process.

Road Map (to salvation)

Lastly, from now on, I am insisting that customers prepare configuration management plans up front that contain a malleable road map document which identifies the build versions that will contain new features and bug fixes. That keeps the pressure and the creep down as people can at least sit in a room and make rational decisions about the costs and priorities based on a written, high-level plan that evolves as the project progresses. Try these ideas on your next project and let me know how they go.