Assessing the project health of an electronic system design effort involves analyzing a complex critical path that a global team needs to execute.
Time Capsule: Project Health for Electronic Systems Design (2009)
This is an old essay on assessing the health of electronic system design projects I originally wrote for Achilles Test back in 2009. I have recovered it as a “time capsule” post because the trends we observed in 2009
Trends in Project Health
Most organizations use project health to manage the trade-offs among development time, cost, and quality. Various measurements are used to assess the performance or effectiveness of a project team; without them, any assessment is a subjective guess. However, traditional methods such as measuring test coverage or design stability are falling short because of fundamental changes in electronic design. Here are some of the top trends in design and their impact on tracking project health.
1. It’s an intelligent mesh, not just a flow.
There is no longer a sequential design flow in electronic design: there is architectural, layout, and verification exploration. Layout and verification often start before architecture and implementation details are finished. IP blocks further parallelize, loop, and interconnect the project flow. With mesh development, the next and extremely difficult question is resource allocation. How do you apply people, machines, licenses, and test runs? Which stage needs more resources? What will the resource impact be on development time, cost, and quality?
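One way to make the "which stage needs more resources" question concrete is to compute the critical path through the job mesh. The following is a minimal sketch, assuming the mesh can be approximated as a dependency graph of jobs with estimated durations. The job names, hours, and the `critical_path` helper are hypothetical and for illustration only; they do not come from the original essay or the Achilles Test tooling.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path(durations, deps):
    """Return (total_hours, path) for the longest chain of dependent jobs.

    durations: job name -> estimated hours (hypothetical estimates)
    deps:      job name -> set of jobs that must finish before it starts
    """
    graph = {job: deps.get(job, set()) for job in durations}
    finish = {}  # earliest finish time of each job
    prev = {}    # predecessor on the longest path into each job
    for job in TopologicalSorter(graph).static_order():
        start, best = 0, None
        for dep in graph[job]:
            if finish[dep] > start:
                start, best = finish[dep], dep
        finish[job] = start + durations[job]
        prev[job] = best
    # Walk back from the job that finishes last.
    node = max(finish, key=finish.get)
    total, path = finish[node], []
    while node is not None:
        path.append(node)
        node = prev[node]
    return total, path[::-1]


# Hypothetical project: verification and RTL both start once architecture is done.
durations = {"architecture": 40, "rtl": 80, "verification": 120,
             "layout": 60, "signoff": 20}
deps = {"rtl": {"architecture"}, "verification": {"architecture"},
        "layout": {"rtl"}, "signoff": {"verification", "layout"}}

total, path = critical_path(durations, deps)
print(total, " -> ".join(path))
```

In this sketch, jobs on the returned path are the ones where extra people, machines, or licenses shorten the schedule; adding resources to jobs off the path only increases cost.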
2. Global teams
Expect more “follow the sun” design work. For distributed engineering groups, however, it can be difficult to manage and track progress. Global teams must share data, status, and issues. Project teams can no longer rely on self-reporting to get a picture of where things stand. Teams must find ways to automate sharing status, escalating issues, and brainstorming on them.
3. Multiple dimensions of analysis
Engineering is managing constraints and trade-offs. A workable plan must balance cost, performance, schedule and quality to develop a useful design. Whole teams are dedicated to power, performance, fault, routing and timing. Each group needs to communicate and escalate optimization and trade-off decisions.
4. Runtime jobs growing faster than transistor count
Sure, transistor counts are growing, but the number of analysis jobs is exploding. Teams routinely run many hundreds of builds and automated tests. Expect to see this trend continue. How are you going to manage it? Are you getting the best use of software licenses and CPU resources? Are you spending more on licenses than on hardware?
5. Summarizing and Analyzing More Testing
As system size grows, manual testing typically cannot keep up, so everyone is turning to test-data generation. But test generation, regardless of the framework used, still requires sorting and analyzing the test results. Developers must wade through an ocean of data looking for warnings and errors. If you don’t automate the data collection and analysis, you will spend all your time grepping log files. If you add more CPU resources and run more jobs, can you still analyze the output by hand? Past a certain point you must automate the analysis of the results as well, or further test generation is pointless.
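As an illustration of replacing manual grepping with automated collection, here is a minimal sketch that scans a directory of log files and rolls up warnings and errors. It assumes a simple log convention in which the keyword `ERROR` or `WARNING` appears on each relevant line; the directory name, file pattern, and helper names are hypothetical, and real tool logs will need their own patterns.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed convention: a severity keyword appears somewhere on the line.
SEVERITY = re.compile(r"\b(ERROR|WARNING)\b")


def summarize_lines(lines):
    """Count severities and group messages that differ only in numbers."""
    counts = Counter()
    messages = Counter()
    for line in lines:
        match = SEVERITY.search(line)
        if match is None:
            continue
        level = match.group(1)
        text = line[match.end():].strip(" :\r\n")
        text = re.sub(r"\d+", "N", text)  # collapse numeric details
        counts[level] += 1
        messages[(level, text)] += 1
    return counts, messages


def summarize_logs(log_dir, pattern="*.log"):
    """Summarize every matching log file under log_dir (searched recursively)."""
    per_file = {}
    overall = Counter()
    for path in sorted(Path(log_dir).rglob(pattern)):
        with path.open(errors="replace") as handle:
            counts, messages = summarize_lines(handle)
        per_file[str(path)] = counts
        overall.update(messages)
    return per_file, overall


sample = [
    "INFO: build started",
    "WARNING: clock skew 12 ps on net 7",
    "WARNING: clock skew 15 ps on net 9",
    "ERROR: assertion failed in test_fifo",
]
counts, messages = summarize_lines(sample)
print(counts)
print(messages.most_common(1))
```

Calling `summarize_logs("regression_logs")` (a hypothetical directory) returns per-file severity counts plus an overall tally of normalized messages, so the most frequent failure signatures surface first instead of being buried in individual logs.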
SKMurphy Take
We put this image on a poster for the ICCAD conference in 2008 and it worked like flypaper. Engineers would be walking along and just stop to study it for five minutes. We did a Birds of a Feather on “Managing Project Health” at the 2009 Design Automation Conference and had a wide-ranging discussion of some key aspects of the challenge.
- data stability in a distributed computing environment
- provisioning: dynamic software license management layered on cloud CPU request and allocation.
- flow definition and management: what jobs need to be run next, where is the critical path
- project level status and semantics
- Job/test status (See “Visualizing Project Health” [PDF])
- Failed -> Investigating -> Understood
- Bug Filed
- Test Fixed
- Passed
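The job/test statuses listed above can be treated as a small state machine so that status updates are checked and rolled up automatically rather than self-reported. This is a minimal sketch: the status names come from the list, but the allowed transitions between them (for example, that a filed bug or a fixed test leads to either Passed or Failed on the next run) are assumptions made for illustration, not something the original discussion specified.

```python
from collections import Counter
from enum import Enum


class Status(Enum):
    FAILED = "Failed"
    INVESTIGATING = "Investigating"
    UNDERSTOOD = "Understood"
    BUG_FILED = "Bug Filed"
    TEST_FIXED = "Test Fixed"
    PASSED = "Passed"


# Assumed transitions; adjust to match your own triage process.
ALLOWED = {
    Status.FAILED: {Status.INVESTIGATING},
    Status.INVESTIGATING: {Status.UNDERSTOOD},
    Status.UNDERSTOOD: {Status.BUG_FILED, Status.TEST_FIXED},
    Status.BUG_FILED: {Status.PASSED, Status.FAILED},
    Status.TEST_FIXED: {Status.PASSED, Status.FAILED},
    Status.PASSED: {Status.FAILED},
}


def advance(current, new):
    """Return the new status, or raise if the transition is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {new.value}")
    return new


def rollup(statuses):
    """Count how many jobs or tests are in each status."""
    return Counter(status.value for status in statuses)


status = Status.FAILED
for step in (Status.INVESTIGATING, Status.UNDERSTOOD, Status.TEST_FIXED, Status.PASSED):
    status = advance(status, step)
print(status.value)
print(rollup([Status.PASSED, Status.PASSED, Status.INVESTIGATING]))
```

A roll-up like this gives a project-level view in which a failure that is understood and has a bug filed is distinguished from one that nobody has looked at yet.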
Related Blog Posts
Achilles Test related
- Managing Project Health Birds of a Feather at DAC 2009
- Project Health BoF at DAC2009 Recap
- Chris Kappler, Achilles Test “Visualizing Project Health” [PDF]
- Conference Testimonial from Achilles Test Systems
More general
- Moore’s Law Enables New Uses For Old Algorithms
- Hadoop Summit 2009 Quick Impressions
- Address a Problem an Industry has Promoted by Satisfying a Basic Need
- IEEE/NATEA Event on Cloud Computing July 19 2008 at Stanford
- A Picture Is Worth a Thousand CPU Hours
Image: “Weekly Regression Success Probability” (c) Achilles Test 2008, used with permission