EECS 579: Digital System Testing
Lecture 1: Course Introduction and Overview
John P. Hayes, University of Michigan, EECS 579, Fall 2001

EECS 579 Course Goals
To learn about:
  - The role of testing in digital systems
  - The various types of faults expected and how to model them
  - Testing methods and how to compute tests for manufacturing and field testing
  - Design methods to improve testability
  - Built-in self-test (BIST) methods
  - Relation to the design verification problem
To gain project experience in one of the following:
  - Research
  - Using/building CAD tools for testing
  - Testing VLSI chips

Course Organization
  Class Schedule: Tuesday and Thursday 9: :30 am, EWRE Building, Room 153
  Instructor's Office Hours: TBA
  Location: EECS Building, Room 2114e
  Contacting the Instructor: See him in person during the above office hours, send e-mail to jhayes@eecs.umich.edu, or telephone 763-0386
  Prerequisites: A course in logic design such as EECS 270; basic architecture and programming (C/C++)
  Text (required): Essentials of Electronic Testing by M. Bushnell & V. Agrawal, Kluwer, 2000. Additional books will be placed on reserve in the Media Union Library.
  Lecture notes and other material will be posted on the class home page http://www.eecs.umich.edu/courses/eecs579/

Tentative Course Plan
   1. Introduction                     Chap. 1-3
   2. Fault modeling                   Chap. 4
   3. Combinational circuit testing    Chap. 7
   4. Sequential circuit testing       Chap. 8
   5. System testing                   Chap. 9
      Midterm Exam
   6. Design for testability           Chap. 6, 14
   7. Built-in self-testing            Chap. 15
   8. Fault simulation                 Chap. 5
   9. System-on-a-chip (SOC) issues    Chap. 18
  10. Other topics                     TBA
      Project Presentations
      Final Exam

Course Assignments
Grades
  1. Midterm exam
  2. Homework assignments (about six)
  3. Term project/paper
  4. Final exam
Term Project
You will be allowed to propose a project from one of the following:
  A. Programming a test generation or simulation algorithm
  B. In-depth literature survey of an advanced topic
  C. Individual research into some special topic or problem
  D. Experimental testing of VLSI chips from 427/627
  E. Experiments with commercial CAD hardware or software
Note: The class will be conducted in accordance with the College of Engineering's Honor Code.

What is Testing?
[Figure: block diagram of the test process. The automatic test equipment (ATE) contains a stimulus signal generator and a response comparator. Test patterns T are applied to the unit under test (UUT), which may contain a fault F. The test responses R are compared with the reference (expected) responses R'. Pass: R = R'. Fail: R ≠ R'. Related topics labeled in the figure: fault modeling, test generation problem, design for testability, test application.]
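
The pass/fail comparison in the "What is Testing?" slide can be sketched in a few lines of Python. This is only an illustrative sketch, not material from the lecture: the two-input AND gate, the function names, and the particular fault (input a behaving as if stuck at 1) are all assumptions chosen to keep the example small.

```python
from itertools import product

def reference_and(a: int, b: int) -> int:
    """Fault-free model used to produce the expected responses R'."""
    return a & b

def faulty_and(a: int, b: int) -> int:
    """Hypothetical UUT with a fault F: input a behaves as if stuck at 1."""
    return 1 & b

def apply_tests(uut, patterns):
    """Test application: drive each pattern in T into the UUT and collect R."""
    return [uut(*p) for p in patterns]

def compare(responses, expected):
    """Response comparator: pass only if R equals R' for every pattern."""
    return responses == expected

# Test patterns T (exhaustive for a 2-input circuit; an assumption for brevity).
T = list(product((0, 1), repeat=2))

# Reference (expected) responses R', taken from the fault-free model.
R_expected = apply_tests(reference_and, T)

good_pass = compare(apply_tests(reference_and, T), R_expected)
faulty_pass = compare(apply_tests(faulty_and, T), R_expected)

# Patterns whose response differs from R', i.e. the ones that detect the fault.
failing = [p for p, r, e in zip(T, apply_tests(faulty_and, T), R_expected) if r != e]

print("fault-free UUT passes:", good_pass)
print("faulty UUT passes:", faulty_pass)
print("detecting patterns:", failing)
```

In this sketch only one of the four patterns exposes the assumed fault, which hints at why choosing the test patterns T matters.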

Why Do Systems Fail?
  - Human design errors
  - Manufacturing defects
      - IC processing and packaging
      - Subsystem assembly and wiring
  - Installation errors
  - Operational (field) failures
      - Environment: temperature, humidity, vibration
      - Power supply
      - Interference: ESD, EMI, RFI, radioactivity
      - Wear and tear: friction, corrosion, electromigration
      - Human operator errors

Horror Story 1: World War III (almost)

Horror Story 2: Three-Mile Island
How to make a nuclear reactor safe:
  - Remove fuel rods from reactor core (normal)
  - Low-pressure cooling system 1 (normal)
  - Low-pressure cooling system 2 (normal backup)
  - High-pressure cooling system 3 (emergency)
  - Blow pressure release plugs and flood containment building (extreme emergency)
  - Meltdown
Why the accident happened:
  - Minor hardware faults occurred while cooling system 1 was shut down for routine maintenance
  - Major design errors
  - Extreme operator errors

Horror Story 3: Therac-25
One of the best-documented computer accidents
  - Therac-25 was a radiation machine for cancer therapy
  - It generated X-rays of programmable intensity and duration
  - It caused a mysterious and deadly series of accidents in the mid 1980s
      - Operator set correct therapeutic dose levels
      - Some patients received high and deadly doses of radiation
      - Initial attempts to reproduce the accident conditions failed
Why the accident happened:
  - Primary reason: Faulty software in the form of a badly designed interface timing loop
  - Secondary reasons:
      - Absence of hardware interlocks
      - Reuse of old, undocumented (assembly language) code

Horror Story 4: Floppy Drive Chip
  - In the late 1980s NEC warned of a design bug in one of its floppy-disk controller chips that could cause data loss under certain conditions
  - The problem was due to a flaw in the controller code; a software fix was provided
  - Toshiba and other computer manufacturers continued to use the flawed chip in laptops
  - In 1999 Toshiba was the target of a class action suit over this bug
  - Toshiba (without admitting guilt) agreed to pay each owner of an affected laptop $2 to $443, depending on purchase date
  - Estimated overall cost of the proposed remedy: over $2 billion
  - No actual data loss due to this bug was ever reported!

Horror Story 5: Printer Controller
  - Around 1980, a printer company (Centronics) introduced a new line of under-$00 microprocessor-controlled printers
  - Many of the printers mysteriously shut down during normal operation. On being restarted, they worked perfectly again for a while
  - The company spent many weeks trying to diagnose the problem
  - They finally figured out that test pins on the Intel microcontroller chip were the cause of the problem. The pins acted as antennas and could pick up interference (EMI), causing the chip to enter a test mode of operation
  - Intel claimed that the microcontroller chip was improperly designed into its board by Centronics
  - Centronics eventually went out of business

Why is Testing Important?
(Why do we need a class in testing?)
  - Faults cannot be eliminated entirely
  - Safety and reliability
      - It's usually not OK to sell faulty products
      - Digital systems are the brains of embedded systems
      - In many applications, undetected failures are dangerous
  - Testing is inherently a hard problem
      - Good progress has been made, but systems keep getting more complex
  - Testing is very expensive
      - ATE for IC production costs millions of dollars
      - Test development affects time to market
      - Adding circuits to improve testability can be costly

Why Testing is Hard
[Figure: number of transistors per IC versus year, 1960 to 2000. Labeled points: first commercial integrated circuit (a flip-flop), 1K-bit DRAM, first (four-bit) microprocessor, 1M-bit DRAM, million-transistor 32-bit microprocessor, 1G-bit DRAM.]
  - IC technology is a moving target
  - Clock rates and power consumption are soaring too

Why Testing is Hard: SOCs
SOCs incorporate multiple complex devices and/or technologies on a single IC:
  - Processors
  - Memories
  - Communication circuits
  - Application-specific circuits
  - In the future:
      - FPGAs
      - MEMS

Testing Costs
  - Manufacturing test equipment
      - Capital cost of automatic test equipment (ATE)
      - Operating cost of test facility
  - Test software development
      - Automatic test pattern generation (ATPG) code
      - Fault simulation and other debugging code
  - Design for testability (DFT)
      - Chip area overhead (implying yield loss)
      - Performance overhead

Testing Costs: ATE
Example of Cost Estimation
  1.0 GHz, 1,000-pin production IC tester
  - Purchase price: $1.0M + 1,000 x $3,000 = $4.0M
  - Annual operating cost: Depreciation (4-year) + Maintenance + Operation = $1.0M + $0.1M + $0.4M = $1.5M/year
  - Test cost (assuming continuous use): $1.5M / (365 x 24 x 3,600) ≈ 5 cents/sec

Automatic Test Equipment
[Figure: Advantest T6682]
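
The ATE cost estimate from the "Testing Costs: ATE" slide can be reproduced with a short calculation. This is a sketch under stated assumptions: the variable names are illustrative, and the depreciation figure is assumed to be the purchase price spread evenly over the four years, which matches the $1.0M on the slide.

```python
# Purchase price: base price plus a per-pin cost (figures from the slide).
BASE_PRICE = 1_000_000        # dollars
PINS = 1_000
PRICE_PER_PIN = 3_000         # dollars per pin
purchase_price = BASE_PRICE + PINS * PRICE_PER_PIN

# Annual operating cost. Straight-line depreciation over 4 years is an
# assumption; it reproduces the $1.0M/year shown on the slide.
DEPRECIATION_YEARS = 4
depreciation = purchase_price / DEPRECIATION_YEARS
maintenance = 100_000         # dollars per year
operation = 400_000           # dollars per year
annual_cost = depreciation + maintenance + operation

# Test cost per second, assuming the tester runs continuously all year.
seconds_per_year = 365 * 24 * 3_600
cents_per_second = 100 * annual_cost / seconds_per_year

print(f"purchase price: ${purchase_price:,.0f}")
print(f"annual cost:    ${annual_cost:,.0f}")
print(f"test cost:      {cents_per_second:.2f} cents/sec")
```

The unrounded result is a little under 5 cents per second. Any idle time on the tester raises the effective cost per second of actual testing.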

Testing Costs: DFT
Intel Pentium Microprocessor
Data from Keynote Address, International Test Conference 1995
Cost impact of BIST logic that increases area by 1% or 15%

                          Nominal Pentium die   1% die size increase   15% die size increase
  Wafer cost              $1,460                $1,460                 $1,460
  Die size                160.2 mm²             161.8 mm²              184.2 mm²
  Die cost                $84.06                $85.33                 $2.55
  Added annual cost                             $63.5M                 $961M
  Dies required/week      1M                    1M                     1M
  Chips fabricated/week   498.1K                482.9K                 337.5K
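
As a quick sanity check on the die-size figures in the Pentium data above, the percentage area increase can be computed directly from the three die sizes. This is only an arithmetic sketch; the variable names are illustrative and nothing here models cost or yield.

```python
# Die sizes in mm^2, taken from the Pentium data above.
nominal_mm2 = 160.2
variants_mm2 = {
    "1% increase": 161.8,
    "15% increase": 184.2,
}

# Percentage area increase of each variant relative to the nominal die.
increase_pct = {
    name: 100 * (area / nominal_mm2 - 1)
    for name, area in variants_mm2.items()
}

for name, pct in increase_pct.items():
    print(f"{name}: {pct:.2f}% larger than nominal")
```

Both values come out within a few hundredths of a percent of the stated 1% and 15%.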