### THE UNIVERSITY OF TEXAS AT DALLAS

ERIK JONSSON SCHOOL OF ENGINEERING AND COMPUTER SCIENCE

# TEXAS ANALOG CENTER OF EXCELLENCE

## **ANNUAL REPORT** 2023 – 2024



Semiconductor Research Corporation



### **TXACE MISSION**

The Texas Analog Center of Excellence seeks to create fundamental analog, mixed signal and RF design innovations in integrated circuits and systems that improve energy efficiency, health care, and public safety and security.

### **TXACE THRUSTS**

Safety, Security and Health Care
 Energy Efficiency
 Fundamental Analog Circuits

### TxACE 2023–2024 ANNUAL REPORT

The Texas Analog Center of Excellence (TxACE), located at the University of Texas at Dallas, is the largest analog research center based in an academic institution. Analog and mixed signal integrated circuits engineering is both a major opportunity and a major challenge. Analog circuits are critical components of the majority of products for the \$550+ billion per year integrated circuits industry, providing sensing, actuation, communication, power management and other functions. Digital integrated circuits such as microprocessors, logic circuits and memories are now integrating analog functions such as input/output circuits, phase locked loops, temperature sensors and power management circuits. It is also common to find microcontrollers with multiple analog-to-digital and digital-to-analog converters. These circuits affect almost all aspects of modern life: safety, security, health care, transportation, energy, entertainment and others.

Creation of advanced analog and mixed signal circuits and systems depends on the availability of engineering talent for analog research and development. TxACE was established to help translate the opportunity into economic benefits by overcoming the challenge and meeting the need through a collaboration of the state of Texas, Texas Instruments, the Semiconductor Research Corporation, the University of Texas System, and The University of Texas at Dallas.

The research tasks are organized into three research thrust areas: Safety, Security and Health Care, Energy Efficiency and Fundamental Analog. The scope of investigation extends from circuits operating at dc through terahertz, data converters that sample at a few samples/sec to tens of gigasamples/sec, AC-to-DC and DC-to-AC converters working at $\mu$W to watts, energy harvesting circuits, sensors and many more. Significant improvements to existing mixed signal systems and new applications have been made and continue to be anticipated. Students who have been exposed to hands-on innovative research are forming the leading edge of analog talent flow into the industry. Close collaboration with and responsiveness to industry needs provide focus to the educational experience.

#### DIRECTOR'S MESSAGE



**Courtesy of Jason Janik** 

The Texas Analog Center of Excellence (TxACE) is leading analog research and education. Over the past year, TxACE researchers published 25 journal papers and 68 conference papers, and made 4 invited presentations. We also filed 5 patent applications and 2 invention disclosures. Thirty-six Ph.D. and 8 M.S. students completed their degree programs.

Last year, the Center funded 91 research tasks led by 76 principal investigators at 28 institutions, including three international universities in Korea, Taiwan, and Canada. The Center supported 238 graduate and undergraduate students.

The Center is making steady progress toward developing technologies that will enable prediction of time to failure and, more importantly, continues to make an impact on the industry and our way of life through its research accomplishments. There are always too many accomplishments to list in full. A partial list includes demonstrations of (1) a voltage reference achieving a temperature coefficient of 176 ppm/°C and a reduced current variation of 2.75%/°C while consuming 4.6 nW, (2) a PLL using a ring oscillator with a wide frequency range of 7 to 14 GHz, achieving an integrated jitter of 70 fs RMS, (3) a 24V-to-1V DC-DC converter with an on-line remaining useful life estimator for capacitors that allows an efficiency improvement of 35% over the baseline design, (4) use of nonlinear and nonordered control strategies in a SIMO DC-DC converter to achieve a load transient of 2 A/ns and a peak efficiency of 96.1%, (5) a data-driven analog circuit synthesizer with automatic topology selection and sizing that can generate OpAmp designs within minutes while achieving quality approaching that of experienced designers, and (6) an active discharge system for safety that reduces the DC link capacitor discharge time in electric vehicles during an emergency from over 10 sec. to 1 sec. by employing the main inverter switches, which also eliminates separate discharge components and lowers cost.

It was gratifying to see many of our current and former PIs at the NSTC AI Driven RFIC Design proposer's day workshop. I look forward to seeing our ongoing work on AI-assisted circuit design, and our PIs, contribute to this national research initiative.

The TxACE laboratory is continuing to help advance integrated circuit research by making its instruments and expertise available to researchers and industrial partners all over the world. Lastly, I would like to thank the students, PIs and staff for their efforts, and I look forward to another year of working with the TxACE team to make our way of life better, safer, healthier and more energy efficient through our research, education and innovation.

Kenneth K. O, Director, TxACE; Texas Instruments Distinguished University Chair Professor, The University of Texas at Dallas

#### **BACKGROUND & VISION**

The \$600+ billion per year semiconductor industry is evolving into an analog/digital mixed signal industry. Analog circuits are providing or supporting critical functions such as sensing, actuation, communication, power management and others. These circuits affect almost all aspects of modern life, including safety, security, health care, transportation, energy, and entertainment. To lead this change, in particular to lead analog and mixed signal technology education, research, commercialization, manufacturing, and job creation, the Texas Analog Center of Excellence was announced by Texas Governor Rick Perry in October 2008 as a collaboration of the Semiconductor Research Corporation, the state of Texas through its Texas Emerging Technology Fund, Texas Instruments Inc., the University of Texas System and the University of Texas at Dallas. The Center seeks to accomplish these objectives by creating fundamental analog, mixed signal and RF design innovations in integrated circuits and systems that improve energy efficiency, health care, and public safety and security, as well as by improving the research and educational infrastructure.



Figure 1. TxACE organization relative to the sponsoring collaboration.

#### **CENTER ORGANIZATION**

The Texas Analog Center of Excellence is guided by agreements established with the Center sponsors. Members of the industrial advisory boards identify the research needs and select research tasks in consultation with the Center leadership. Figure 1 diagrams the relationship of TxACE to the members of the sponsoring collaboration.

The internal organization of the Center is structured to flexibly perform the research mission while fully embracing the educational missions of the Universities.

Figure 2 shows the center management structure. The TxACE Director is Professor Kenneth O. The research is arranged into three thrusts that comply with the center mission: Safety, Security and Health Care, Energy Efficiency and Fundamental Analog Research. The third thrust consists of vital research that cuts across the first two research thrusts. The thrust leaders are Prof. Yiorgos Makris of the University of Texas at Dallas for safety, security and health care, and Prof. Ali Niknejad of the University of California, Berkeley for energy efficiency. The leader for fundamental analog is Prof. Pavan Hanumolu of the University of Illinois, Urbana-Champaign. The thrust leaders, along with Professor Dongsheng Ma of The University of Texas at Dallas, form the executive committee. The committee, along with the director, forms the leadership team that works to improve research productivity by increasing collaboration, better leveraging the diverse capabilities of the principal investigators of the Center, and lowering research barriers. The leadership team also identifies new research opportunities for consideration by the Industrial Advisory Boards.



Figure 2. TxACE organization for management of research

#### SAFETY, SECURITY & HEALTH CARE

(Thrust leader: Yiorgos Makris, University of Texas at Dallas)

The efforts in the Safety, Security, and Health Care thrust focus on improving safety by mitigating various reliability threats in analog/RF devices, as well as by developing effective machine learning-based design, verification and self-test solutions. Particular emphasis has been placed on characterizing circuit aging, on in-field detection and localization of both hard and soft analog faults, and on low-cost design, test and calibration of RF MIMO systems. Machine learning-assisted solutions for improving testability, statistically characterizing the effectiveness of test suites through analog test metrics, and performing outlier detection for analog security are also being developed. Additionally, this thrust investigates submillimeter electromagnetic wave radar imaging, steerable focal plane arrays and PLMs for lidar and holographic displays, as well as sensors for biomedical applications and for cross-modal human sensing in complex environments. This thrust also investigates methods for motor health monitoring, laser systems for creating single-event effects that can be used to study radiation tolerance, as well as efficient temperature sensors for thermal performance characterization in power ICs.



Figure 3. (Top left) Solid-state, wide FOV and long-range lidar (Y. Takashima, University of Arizona). (Top center) Die photo of a 410-GHz imaging concurrent transceiver pixel (K. K. O, University of Texas at Dallas). (Top right) Cross-sectional STEM images of an E-mode P-GaN HEMT device (M. Kim, University of Texas at Dallas). (Middle center) Temperature variations on a SiC power module after 200,000 discharge cycles (B. Akin, University of Texas at Dallas). (Middle right) Layout of a 3-segment interpolation string DAC (D. Chen, Iowa State University). (Bottom left) In-vehicle mm-Wave radar sensing via deep learning algorithms (M. Torlak, University of Texas at Dallas). (Bottom center) Hardware test vehicle for ML-assisted scalable DFT and BIST of AMS Systems (A. Chatterjee, Georgia Tech). (Bottom right) Pulse I-V ESD response of a 65-nm NMOS transistor (E. Rosenbaum, University of Illinois, Urbana-Champaign).





#### ENERGY EFFICIENCY

(Thrust leader: Ali Niknejad, University of California, Berkeley)

The TxACE Energy Efficiency thrust encompasses cross-cutting research on energy efficiency in electronic systems, spanning from advanced power management to the emerging fields of low-power machine learning/AI for edge computing and applications to IoT sensor nodes. The power management research forms the foundation of the center and tackles important efficiency issues in complex system applications, for example digital multi-core systems that use single-inductor multiple-output (SIMO) DC-DC converters. It addresses modeling, simulation, and optimization of performance (transient response, EMI, security) using non-linear computational control, mixed-signal techniques, machine learning and AI, adaptive algorithms, and design automation. This thrust also investigates non-conventional hybrid architectures and integration strategies for applications. Many of the solutions employ mixed-signal techniques, exploiting advanced CMOS digital nodes alongside GaN power devices, and utilize novel scaling-friendly analog architectures to improve the control and expand the flexibility of the overall system.

Figure 4. The research tasks include multi-stage DC-DC converters pushing the boundaries of efficiency in high conversion ratio (24V-to-1V) applications (Top left, H. Lee, UT Dallas) and 3D packaged vertically integrated two-stage series DC-DC and switched-capacitor converters (Top right, H.-P. Le, UCSD). Increasingly the focus of the research is at the intersection of power, energy, and AI/ML applications. Edge processing requires orders-of-magnitude reduction in inference AI power/energy consumption (Middle left, Bottom left and middle, M. Seok, Columbia University). This thrust also includes efforts on Switching-Mode Active EMI filtering using GaN transistors (Middle right, A. Hanson, UT Austin) and techniques to improve the transient response to higher loads for SIMO converters (Bottom right, C. Huang, Iowa State U.).

#### FUNDAMENTAL ANALOG CIRCUITS RESEARCH

(Thrust leader: Pavan Hanumolu, U. of Illinois Urbana-Champaign)

The research in this thrust focuses on cross-cutting areas in analog and mixed-signal circuits, which impact all TxACE application areas (Energy Efficiency, Public Safety, Security, and Health Care). The research includes the design of various analog-to-digital converters, communication links, I/O circuits, noise reduction techniques, temperature sensors, new amplifier topologies suitable for use in nano-scale CMOS, development of CAD tools for AI-assisted design and layout generation, and testing of integrated circuits.



Figure 5. (Top left) CT ΔΣ Modulator (N. Maghari, University of Florida). (Top middle left) Low jitter ring PLL (P. Hanumolu, University of Illinois Urbana-Champaign). (Top middle right) 1024 QAM transmitter (H. Wang, National Taiwan University). (Top right) Subthreshold voltage reference (D. Sylvester, University of Michigan). (Bottom left) Reservoir computer with output ANN layer (A. Sanyal, Arizona State University). (Bottom middle) CNN for PA design (M. Chen, University of Southern California). (Bottom right) ANN MOS transistor model (R. Rohrer, Carnegie Mellon University).

#### TXACE ANALOG RESEARCH FACILITY

The centralized group of laboratories of the Texas Analog Center of Excellence dedicated to analog engineering research and training occupies a ~8000-ft<sup>2</sup> area on the 3rd floor of the Engineering and Computer Science North building (Figure 6). The facility includes RF and THz, Integrated System Design, Embedded Signal Processing, and Analog & Mixed Signal laboratories as well as a CAD/Design laboratory structured to promote collaborative research. The unique instrumentation capability includes network analyses and linearity measurements up to 325 GHz, spectrum analysis up to 120 THz, and cryo-measurements down to 2 K. The Center also added a pulsed multiple harmonic load and source pull measurement setup (up to 60 GHz for the third harmonic) and a 325-GHz antenna measurement setup. The close proximity of researchers in an open layout enables natural interaction and compels sharing of knowledge and instrumentation among the students and faculty. The TxACE analog research facility is one of the best equipped electronics laboratories. The laboratory is available for use by TxACE researchers and industrial partners all over the world.



Figure 6. TxACE Analog Research Facility

#### **RESEARCH PROJECTS AND INVESTIGATORS**

The Texas Analog Center of Excellence (TxACE) is the largest university analog technology center in the world. Table 1 lists the current principal investigators of the 91 tasks from 28 academic institutions funded by TxACE. Three universities (Texas A&M, UT Austin, UT Dallas) are from the state of Texas, and 25 are from outside of Texas. Three of these (Seoul National University, National Taiwan University, and University of Toronto) are from outside of the US (Figure 7). Of the 76 investigators, 25 are from Texas. During the past year, the Center supported 189 Ph.D., 19 M.S., and 30 B.S. students, and 36 Ph.D. and 8 M.S. degrees were awarded to TxACE students.

| Investigator  | Institution      | Investigator    | Institution      | Investigator         | Institution      |
|---------------|------------------|-----------------|------------------|----------------------|------------------|
| B. Akin       | UT Dallas        | D. Heo          | Washington State | K. K. O              | UT Dallas        |
| N. Al-Dhahir  | UT Dallas        | S. Hoyos        | Texas A&M        | S. Ozev              | Arizona State    |
| D. Allstot    | Oregon State     | C. Huang        | Iowa State       | S. Palermo           | Texas A&M        |
| K. Basu       | UT Dallas        | T. Huang        | National Taiwan  | D. Pan               | UT Austin        |
| R. Baumann    | UT Dallas        | Y. Jia          | UT Austin        | P. Pande             | Washington State |
| D. Blaauw     | U Michigan       | Y. Kaneda       | U Arizona        | M. Quevedo-Lopez     | UT Dallas        |
| N. Cao        | U Notre Dame     | C. Kim          | U Minnesota      | G. Rincón-Mora       | Georgia Tech     |
| F. Chang      | UCLA             | M. Kim          | UT Dallas        | R. Rohrer            | CMU              |
| A. Chatterjee | Georgia Tech     | J. Kulkarni     | UT Austin        | E. Rosenbaum         | UIUC             |
| D. Chen       | Iowa State       | H. Le           | UCSD             | A. Sanyal            | Arizona State    |
| S. Chen       | USC              | H. Lee          | UT Dallas        | S. Sapatnekar        | U Minnesota      |
| Y. Chiu       | UT Dallas        | M. Lee          | UT Dallas        | V. Sathe             | Georgia Tech     |
| W. Choi       | Seoul National   | T. Levi         | USC              | M. Seok              | Columbia         |
| J. Doppa      | Washington State | K. Lin          | National Taiwan  | S. Shah              | U Maryland       |
| B. Floyd      | NC State         | J. Liu          | UT Dallas        | H. Shichijo          | UT Dallas        |
| M. Flynn      | U Michigan       | H. Lu           | UT Dallas        | D. Sylvester         | U Michigan       |
| D. Forte      | U Florida        | D. Ma           | UT Dallas        | Y. Takashima         | U Arizona        |
| I. Galton     | UCSD             | N. Maghari      | U Florida        | M. Torlak            | UT Dallas        |
| R. Geiger     | Iowa State       | I. Mahbub       | UT Dallas        | G. Trichopoulos      | Arizona State    |
| M. Ghassemi   | UT Dallas        | Y. Makris       | UT Dallas        | H. Wang              | National Taiwan  |
| J. Gu         | Northwestern     | D. Manocha      | U Maryland       | D. Wentzloff         | U Michigan       |
| S. Gupta      | USC              | P. Mercier      | UCSD             | D. Woodard           | U Florida        |
| A. Hanson     | UT Austin        | S. Mukhopadhyay | Georgia Tech     | R. Yazicigil Kirby   | Boston           |
| P. Hanumolu   | UIUC             | B. Murmann      | Stanford         | D. Yeung             | U Maryland       |
| R. Harjani    | U Minnesota      | F. Najm         | U Toronto        |                      |                  |
| R. Henderson  | UT Dallas        | A. Niknejad     | UC Berkeley      |                      |                  |

#### Table 1. Principal Investigators (May 2023 through April 2024)



Figure 7. Member Institutions of Texas Analog Center of Excellence

#### SUMMARY OF RESEARCH PROJECTS

The 91 research projects funded through TxACE during 2023-2024 are listed in Table 2 below by the Semiconductor Research Corporation task identification number.

### Table 2: Funded research projects at TxACE by SRC task identification number (FA: Fundamental Analog, EE: Energy Efficiency, SS: Safety, Security and Health Care)

|    | Task     | Thrust | Title                                                                                                                  | Task Leader                                       | Institution          |
|----|----------|--------|------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------|----------------------|
| 1  | 2810.053 | FA     | TI PLM to Advanced Lidar and Display<br>Systems                                                                        | Takashima, Yuzuru                                 | Univ. of<br>Arizona  |
| 2  | 2810.054 | SS     | Reconfigurable AC Power Cycling Setup and<br>Plug-in Condition Monitoring Tools for High<br>Power IGBT and SiC Modules | Akin, Bilal                                       | UT Dallas            |
| 3  | 2810.055 | SS/EE  | EMI-Regulated Secure Automotive Power ICs                                                                              | Ma, D. Brian                                      | UT Dallas            |
| 4  | 2810.056 | FA     | Millimeter Wave Packaging Research -<br>Antenna in Package                                                             | Henderson, Rashaunda<br>Lee, Mark<br>Lu, Hongbing | UT Dallas            |
| 5  | 2810.057 | SS     | Reliability Study of E-mode GaN HEMT<br>Devices by AC TDDB and High Resolution TEM                                     | Kim, Moon<br>Shichijo, Hisashi                    | UT Dallas            |
| 6  | 2810.058 | SS/FA  | Machine Learning-Based Overkill/Underkill<br>Reduction in Analog/RF IC Testing                                         | Makris, Yiorgos                                   | UT Dallas            |
| 7  | 2810.059 | SS/EE  | Ultra-Low-Power Robust SAR ADC for PMCW<br>Automotive RADAR                                                            | Chiu, Yun                                         | UT Dallas            |
| 8  | 2810.060 | FA     | Intelligent, Learning ADCs for the Post Figure-<br>of-Merit World                                                      | Flynn, Michael                                    | Univ. of<br>Michigan |
| 9  | 2810.061 | EE     | Two-Stage Vertical Power Delivery and<br>Management for Efficient High-Performance<br>Computing                        | Le, Hanh-Phuc<br>Mercier, Patrick                 | UC San Diego         |
| 10 | 2810.062 | FA     | Multi-Carrier DAC-Based Transmitter<br>Architectures for 100+Gb/s Serial Links                                         | Palermo, Samuel<br>Hoyos, Sebastian               | Texas A&M            |
| 11 | 2810.063 | FA     | Analog and Digital Assist Techniques to<br>Improve Mixed-Signal Performance                                            | Sylvester, Dennis<br>Blaauw, David                | Univ. of<br>Michigan |
| 12 | 2810.064                  | SS     | Characterization and Tolerance of Ageing in<br>Integrated Voltage Regulators                                                                                      | Mukhopadhyay, Saibal                                    | Georgia Tech              |
| 13 | 2810.065                  | EE/SS  | Power-Efficient and Reliable 48-V DC-DC<br>Converter with Direct Signal-to-Feature<br>Extraction and DNN-Assisted Multi-Input<br>Multiple-Output Feedback Control | Seok, Mingoo                                            | Columbia                  |
| 14 | 2810.066                  | SS     | Demonstrably Generalizable Compact Models<br>of ESD Devices                                                                                                       | Rosenbaum, Elyse                                        | UIUC                      |
| 15 | 2810.067                  | EE     | Highly Efficient Extreme-Conversion-Ratio<br>Buck Hybrid Converters                                                                                               | Pande, Partha<br>Heo, Deukhyoun<br>Doppa, Janardhan Rao | Washington<br>State       |
| 16 | 2810.068                  | EE     | Active EMI Filtering with Switch-Mode<br>Amplifier for High Efficiency                                                                                            | Hanson, Alex                                            | UT Austin                 |
| 17 | 2810.070                  | SS     | Early and Late Life Failure Prediction Methods<br>for Analog and Mixed-Signal Circuits                                                                            | Kim, Chris                                              | Univ. of<br>Minnesota     |
| 18 | 2810.071                  | FA     | Accurate Compact Temperature Sensors for<br>Thermal Management of High Performance<br>Computing Platforms                                                         | Geiger, Randall<br>Chen, Degang                         | Iowa State                |
| 19 | 2810.072<br>&<br>2810.073 | EE     | AI/ML Edge Hardware for Ultra-reliable<br>Wireless Networks                                                                                                       | Allstot, David<br>Makris, Yiorgos                       | Oregon State<br>UT Dallas |
| 20 | 2810.074                  | SS     | Thermal Performance Characterization and<br>Degradation Monitoring of LDMOS based<br>Integrated Power IC with On-Die<br>Temperature Sensors                       | Akin, Bilal                                             | UT Dallas                 |
| 21 | 2810.075                  | EE     | Hybrid Step-Down DC-DC Converters with<br>Large Conversion Ratios for 48V Automotive<br>Applications                                                              | Lee, Hoi<br>Liu, Jin                                    | UT Dallas                 |
| 22 | 2810.076                  | FA     | High Precision Positioning Techniques Based<br>on Multiple Technologies and Frequency<br>Bands                                                                    | Al-Dhahir, Naofal<br>Torlak, Murat                      | UT Dallas                 |
| 23 | 2810.077                  | SS     | Increasing Lifetime of Nano-Scale CMOS<br>Circuits                                                                                                                | O, Kenneth                                              | UT Dallas                 |
| 24 | 2810.078                  | EE     | Programmable Mixed-Signal Accelerator for DNNs with Depthwise Separable Convolution Layers                                                                        | Murmann, Boris                                          | Stanford                  |
| 25 | 2810.079                  | EE     | High-Power-Density In-Package SIMO<br>Converters for Next-Generation<br>Microprocessors                                                                           | Huang, Cheng                                            | Iowa State                |
| 26 | 2810.080                  | EE     | Efficient and High-Density Fully In-Package<br>GaN-Based High-Ratio DC-DC Converters                                                                      | Huang, Cheng                                                                                                                      | Iowa State               |
| 27 | 2810.081                  | FA     | Development of 70-95 GHz Terabit<br>Beamformer                                                                                                            | Wang, Huei<br>Huang, Tian-Wei<br>Lin, Kun-You                                                                                     | National<br>Taiwan Univ. |
| 28 | 2810.082                  | FA     | Adaptive Digital Cancellation of Dynamic<br>Error from Clock Skew, Component<br>Mismatches, and ISI in High-Resolution RF<br>DACs                         |                                                                                                                                   |                          |
| 29 | 2810.083                  | FA     | Automated Layout Of Analog Arrays in<br>Advanced Technology Nodes                                                                                         | Sapatnekar, Sachin<br>Harjani, Ramesh                                                                                             | Univ. of<br>Minnesota    |
| 30 | 2810.084                  | SS     | Soft and hard analog fault detection,<br>injection, coverage, diagnosis, and<br>localization strategies suitable for production<br>test and in-field test | Chen, Degang                                                                                                                      | Iowa State               |
| 31 | 2810.085<br>&<br>2810.093 | FA     | Applications of Circuit Transient Sensitivity<br>Simulation to Semiconductor Circuit Analysis<br>and Design                                               | Rohrer, Ronald                                                                                                                    | Carnegie<br>Mellon       |
| 32 | 2810.086                  | SS     | Machine Learning-based Functional Safety<br>Improvement of AMS components in<br>Automotive SoCs                                                           | Basu, Kanad                                                                                                                       | UT Dallas                |
| 33 | 2810.087                  | EE     | Grid Optimization and Silicon Validation for<br>Chip Robustness                                                                                           | Najm, Farid                                                                                                                       | Univ. of<br>Toronto      |
| 34 | 2810.088                  | EE     | Grid Optimization and Silicon Validation for<br>Chip Robustness                                                                                           | Kim, Chris                                                                                                                        | Univ. of<br>Minnesota    |
| 35 | 2810.089                  | SS     | Techniques for Low-cost Design, Test, and<br>Calibration of RF MIMO Systems                                                                               | Ozev, Sule<br>Trichopoulos, Georgios                                                                                              | Arizona State            |
| 36 | 2810.090                  | SS     | Motor Health Monitoring                                                                                                                                   | Akin, Bilal                                                                                                                       | UT Dallas                |
| 37 | 2810.091                  | SS     | Development of Two-Photon Absorption<br>Laser System for Creating Single Event Effects                                                                    | Baumann, Robert<br>Quevedo-Lopez,<br>Manuel                                                                                       | UT Dallas                |
| 38 | 2810.092                  | EE     | Battery-Charging CMOS Voltage Regulator for<br>Resistive Low-Voltage DC Sources                                                                           | Rincón-Mora, Gabriel                                                                                                              | Georgia Tech             |
| 39 | 3160.002                  | EE     | tinyASR: Self-Supervised, Sub-10μW<br>Automatic Speech Recognition Hardware for<br>IoT Devices                                                            | Seok, Mingoo                                                                                                                      | Columbia                 |
| 40 | 3160.003 | SS     | Techniques for Online Ageing Detection and<br>In-field Characterization of Aging Phenomena                 | Chen, Degang                                   | Iowa State            |
| 41 | 3160.004 | SS     | Inductive Fault Analysis for Determining<br>Statistical Analog Test Metrics                                | Ozev, Sule                                     | Arizona State         |
| 42 | 3160.005 | SS     | ML-Assisted Scalable DfT and BIST of AMS<br>Systems                                                        | Chatterjee, Abhijit                            | Georgia Tech          |
| 43 | 3160.006 | FA     | Machine-Learning Based Analog Mixed-signal<br>Design Tool                                                  | Chen, Shuo-Wei<br>Gupta, Sandeep<br>Levi, Tony | USC                   |
| 44 | 3160.007 | FA     | AI-Assisted and Layout-Aware Analog<br>Synthesis and Optimization with Design Intent                       | Pan, David<br>Jia, Yaoyao                      | UT Austin             |
| 45 | 3160.008 | FA     | High-Speed DAC with High Output Power and<br>Linearity                                                     | Chen, Shuo-Wei                                 | USC                   |
| 46 | 3160.009 | FA     | 100+GS/s Time-Domain Analog-to-Digital<br>Converters                                                       | Palermo, Samuel<br>Hoyos, Sebastian            | Texas A&M             |
| 47 | 3160.010 | FA     | Design Automation of Low Phase Noise PLL                                                                   | Chen, Shuo-Wei<br>Gupta, Sandeep<br>Levi, Tony | USC                   |
| 48 | 3160.011 | SS     | G-Band CMOS mm-Wave Imager and Sensor<br>for Biomedical Applications                                       | Niknejad, Ali                                  | UC Berkeley           |
| 49 | 3160.012 | SS     | Small-area Low-power DAC Designs with In-<br>field Digital Calibration Ensuring Lifetime High<br>Linearity | Chen, Degang                                   | Iowa State            |
| 50 | 3160.013 | EE     | Energy-Efficient Circuits and Architectures for<br>Cryogenic Operation                                     | Kim, Chris                                     | Univ. of<br>Minnesota |
| 51 | 3160.015 | EE     | ULP Receivers                                                                                              | Wentzloff, David                               | Univ. of<br>Michigan  |
| 52 | 3160.016 | EE     | MODO: Hybrid SIMO-DLDO DC-DC Converter<br>for Multi-Core Microprocessors and System-<br>on-Chips           | Seok, Mingoo                                   | Columbia              |
| 53 | 3160.017 | EE/FA  | Multi-phase Sub-100fs Jitter Ring-oscillator-<br>based Clock Multipliers for Beyond 100Gb/s<br>Links       | Hanumolu, Pavan                                | UIUC                  |
| 54 | 3160.018 | FA     | Pseudo-Static Storage Circuits for Extreme<br>Low Voltage Cryo-CMOS Applications                                                                 | Kulkarni, Jaydeep                               | UT Austin               |
| 55 | 3160.019 | FA     | Mixed-Domain High-Performance CT-ΔΣ ADCs                                                                                                         | Maghari, Nima                                   | Univ. of<br>Florida     |
| 56 | 3160.020 | SS     | Transient Reliability and Condition<br>Monitoring of GaN HEMTs                                                                                   | Akin, Bilal                                     | UT Dallas               |
| 57 | 3160.021 | FA/EE  | Automated Generation of Comprehensive<br>Voltage/Frequency Domains -<br>Logic+PLL+Voltage Regulation                                             | Sathe, Visvesh                                  | Georgia Tech            |
| 58 | 3160.022 | EE     | Domain-Voltage Regulator Co-design for<br>Enhanced SoC Energy Efficiency                                                                         | Sathe, Visvesh                                  | Georgia Tech            |
| 59 | 3160.024 | EE     | On-the-Go Battery Charging/Battery<br>Monitoring SIMIMO Voltage Regulator                                                                        | Rincón-Mora, Gabriel                            | Georgia Tech            |
| 60 | 3160.025 | EE     | DRIVR: A Digital, Re-configurable, Unified<br>Clock-Power (UNICAP) Fabric                                                                        | Sathe, Visvesh                                  | Georgia Tech            |
| 61 | 3160.026 | EE     | Computationally Controlled Integrated<br>Voltage Regulators                                                                                      | Sathe, Visvesh                                  | Georgia Tech            |
| 62 | 3160.027 | SS/EE  | Information-Centric Secure Conversion<br>Interfaces for Energy-Efficient Wireless<br>Systems                                                     | Yazicigil Kirby, Rabia                          | Boston Univ.            |
| 63 | 3160.028 | SS     | 0.5-degree Angular Resolution Submillimeter<br>Electromagnetic Wave Radar Imaging using a<br>9-cm Diameter Electronically Steerable<br>Reflector | O, Kenneth<br>Quevedo-Lopez,<br>Manuel          | UT Dallas               |
| 64 | 3160.029 | SS     | Arrayed Texas Instruments DMD and PLM for<br>Advanced Solid-state Lidar and Holographic<br>Display                                               | Takashima, Yuzuru<br>Kaneda, Yushi              | Univ. of<br>Arizona     |
| 65 | 3160.030 | SS/EE  | Study of Active Discharge and Efficient Driver<br>Architecture for High Power Systems                                                            | Akin, Bilal                                     | UT Dallas               |
| 66 | 3160.031 | SS     | Steerable focal plane arrays for high resolution submillimeter electromagnetic wave radar imaging                                                | Choi, Wooyeol                                   | Seoul National<br>Univ. |
| 67 | 3160.032 | FA     | Hardware-Algorithm codesign for on-chip<br>learning at the edge                                                                                  | Shah, Sahil<br>Manocha, Dinesh<br>Yeung, Donald | Univ. of<br>Maryland    |
| 68 | 3160.033 | FA     | On-chip hyperdimensional computing using mixed-signal circuits for Edge AI                                          | Sanyal, Arindam                  | Arizona State           |
| 69 | 3160.034 | FA/EE  | Energy-Efficient Pipelined GS/s+ 12-bit<br>TDC/ADC with Time-Domain RNS Encoding                                    | Chiu, Yun                        | UT Dallas               |
| 70 | 3160.035 | SS     | Computational Analog Security Hardware for<br>Neuromorphic Anomaly Detection                                        | Cao, Ningyuan                    | Univ. of Notre<br>Dame  |
| 71 | 3160.036 | SS     | Electromigration Lifetime Characterization under Realistic Chip Operating Conditions                                | Kim, Chris                       | Univ. of<br>Minnesota   |
| 72 | 3160.037 | FA     | Causal AI for Interpretable and Robust AMS<br>Topology Synthesis and Optimization                                   | Forte, Domenic<br>Woodard, Damon | Univ. of<br>Florida     |
| 73 | 3160.038 | EE     | VERTICAL: Multi-Core Voltage-Stacked<br>Microprocessor with a Dynamic Load<br>Shuffling and a SIMO Converter        | Seok, Mingoo                     | Columbia                |
| 74 | 3160.039 | EE     | Proactive Power and Clock Management for<br>System-on-Chip                                                          | Gu, Jie                          | Northwestern<br>Univ.   |
| 75 | 3160.040 | FA     | Low-Area and Wideband Fractional-N PLL<br>Design Utilizing Acoustic Resonators and<br>Fractional Noise Cancellation | Mercier, Patrick                 | UC San Diego            |
| 76 | 3160.041 | FA     | Scalable Antenna-in-Package Approaches<br>Supporting Heterogeneous Integrated Arrays                                | Floyd, Brian                     | North Carolina<br>State |
| 77 | 3160.042 | EE     | High Step-Down SISO and SI2MO Hybrid<br>Converters for 48/120V Inputs and Beyond                                    | Pande, Partha<br>Heo, Deuk       | Washington<br>State     |
| 78 | 3160.043 | FA     | Next-Generation Interleaved Noise-Shaping<br>SAR ADCs                                                               | Flynn, Michael                   | Univ. of<br>Michigan    |
| 79 | 3160.044 | FA     | Direct-Carrier Modulated Transmitter for<br>Extremely High Data Rate Millimeter-Wave<br>Radios                      | Chang, Frank Mau-<br>Chung       | UC Los<br>Angeles       |
| 80 | 3160.045 | FA/EE  | Design Techniques for Dense Energy-Efficient<br>100+Gb/s/Wire Die-to-Die Interconnect<br>Transceivers               | Palermo, Samuel                  | Texas A&M               |
| 81 | 3160.046 | SS     | CDM Reliability Prediction: From the Test<br>Circuit to the IC Product                                              | Rosenbaum, Elyse                 | UIUC                    |
| 82 | 3160.047           | FA     | Highly Efficient 100-dB+ Pipelined SAR ADC with kT/C Noise Cancellation                                                               | Chiu, Yun                          | UT Dallas   |
| 83 | 3160.048           | SS     | Data-Driven Framework for Cross-Modal<br>Wireless Human Sensing in Complex<br>Environments                                            | Torlak, Murat<br>Al-Dhahir, Naofal | UT Dallas   |
| 84 | 3160.049           | SS     | In-situ Electrical Biasing TEM/STEM Study of<br>E-mode GaN HEMT Devices for Reliability                                               | Kim, Moon<br>Shichijo, Hisashi     | UT Dallas   |
| 85 | 3160.050           | EE     | Multiphase Hybrid Step-Down DC-DC<br>Converter with High Current Density for<br>Large-Conversion-Ratio 48V Automotive<br>Applications | Lee, Hoi<br>Liu, Jin               | UT Dallas   |
| 86 | 3160.051           | EE     | Gate Driving Techniques and Circuits for Wide<br>Bandgap Bidirectional Power Switches                                                 | Ma, D. Brian                       | UT Dallas   |
| 87 | 3160.052           | EE     | Advanced Self-Commissioning Modules and<br>Type-5 PWM Modulation Schemes for EV<br>Traction Drives & Chargers                         | Akin, Bilal                        | UT Dallas   |
| 88 | 3160.053           | FA     | Precision Capacitively-Coupled<br>Programmable-Gain Amplifiers                                                                        | Geiger, Randall                    | Iowa State  |
| 89 | TCI 2023<br>Task 1 | FA     | Nanodielectric Fluids Using a Multi-<br>Nanoparticle System for Two-Phase Heat<br>Transfer in 3D Heterogeneous Microsystems           | Ghassemi, Mona                     | UT Dallas   |
| 90 | TCI 2023<br>Task 2 | SS     | Resilient Intelligent Secured Electromagnetic<br>Spectrum Sensing Systems (RISES3) for the<br>Next-Generation of Electronic Warfare   | Mahbub, Ifana                      | UT Dallas   |
| 91 | TCI 2023<br>Task 3 | SS     | Germanium Telluride Chalcogenide Switches<br>for RF Applications                                                                      | Quevedo-Lopez,<br>Manuel           | UT Dallas   |

#### ACCOMPLISHMENTS

In the past year, TxACE has made significant research progress. Table 3 summarizes the number of publications and inventions resulting from TxACE research from May 2023 through April 2024, while Table 4 lists the major research accomplishments for the Center during the same period. TxACE researchers published 68 conference papers and 25 journal papers and gave 4 invited presentations. The team also made 2 invention disclosures and filed 5 patent applications. The list of publications is included as Appendix I. Following the tabulation, brief summaries of each project are provided.

#### Table 3. TxACE number of publications (May 2023 through April 2024)

| Conference Papers | Journal Papers | Invited Presentations | Invention Disclosures | Patents Filed |
|-------------------|----------------|-----------------------|-----------------------|---------------|
| 68                | 25             | 4                     | 2                     | 5             |

#### Table 4. Major TxACE Research Accomplishments (May 2023 through April 2024)

| Category                            | Accomplishment                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
|-------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Fundamental<br>Analog<br>(Circuits) | A type-III supply-regulated phase-locked loop (PLL) showcases the potential of ring oscillators for ultra-low noise applications. It utilizes a high-gain sampling phase detector to suppress in-band phase noise, while a low-noise multiphase oscillator reduces out-of-band noise. Fabricated in a 16-nm FinFET process, the PLL operates over a wide frequency range of 7 to 14 GHz, achieving a low integrated jitter of 70 fs RMS and exceptional supply noise rejection exceeding 30dB. (3160.017, P. Hanumolu, University of Illinois, Urbana-Champaign) |
| Fundamental<br>Analog<br>(Circuits) | A nanowatt subthreshold voltage reference minimizes temperature-induced current variation through a clock reference for adaptive duty-cycled operation and offers output voltage programmability via an integrated programmable DC-DC converter. Fabricated in 0.18-µm CMOS, it achieves a temperature coefficient of 176ppm/°C while consuming 4.6nW, reduces current variation to 2.75%/°C (a 400× improvement), and features 64-step output voltage programmability with 1.2-mV resolution. (2810.063, D. Sylvester, University of Michigan)                  |
| Fundamental<br>Analog<br>(CAD)      | A data-driven analog circuit synthesizer with automatic topology selection and sizing is demonstrated. An adaptive topology dataset is utilized, which can later be enhanced with synthetic data generated using a generative machine learning technique. Experiments involving over 360 OpAmp topologies and over 540K data points demonstrate the capability to generate designs within minutes while achieving quality comparable to that of experienced designers. (3160.007, D. Pan, University of Texas, Austin)                                           |
| Energy<br>Efficiency<br>(Circuits)  | Active EMI filtering with feedback is used to reduce the volume burden of additional passive filtering by a factor of 20. This approach uses a switch-mode amplifier operating at 30+ MHz with GaN devices with a fractional-order filter to achieve high loop gain over a limited bandwidth. Measurements demonstrated 40-60 dB of current attenuation at the first several harmonics, which relaxes the filtering requirement, while incurring a 0.4% efficiency penalty on a 120-W boost converter, even with 66% ripple ratio. (2810.068, A. Hanson, UT Austin)  |
| Energy<br>Efficiency<br>(Circuits)  | Most traditional SIMO (single inductor multiple output) converters operate with a fixed time-multiplexing order to handle the multiple outputs with linear PWM control, which limits their response to large, fast load transients. Non-linear and non-ordered control strategies that overcome this limitation are demonstrated. A prototype achieving 96.1% efficiency, a transient speed of 2.1 A/ns, and a maximum current capacity of 2.2 A is demonstrated. (2810.079, C. Huang, Iowa State University) |
| Energy<br>Efficiency<br>(Circuits)                   | Power management circuitry such as DC-DC converters is often designed for rarely occurring worst-case scenarios, which increases its cost. Instead, the circuits can be monitored in real time and the circuit parameters can be adjusted to enable reliable operation. A 24V-to-1V DC-DC converter incorporating this concept was fabricated. The converter achieves a peak power efficiency of 93.89% at 405-mA load current, an efficiency improvement of up to 34.74% compared to the baseline design. (2810.065, M. Seok, Columbia University) |
| Safety,<br>Security and<br>Health Care<br>(Systems)  | A holographic point display using a TI Phase Light Modulator (PLM) was developed<br>for producing Computer Generated Holograms in applications such as Head Up<br>Display. Compared to point source multiplexing (PSM) for generating images, the<br>Gerchberg-Saxton (GS) algorithm is 6.7x faster in generating 2D images while PSM<br>is 46x faster for 3D images. Furthermore, tiling of Digital Micromirror Devices with<br>PLMs is an effective way of increasing Field of View and improving resolution of<br>lidar images. (3160.029, Y. Takashima, U of Arizona) |
| Safety,<br>Security and<br>Health Care<br>(Systems)  | A swift and lower-cost active discharge circuit system for efficient, safe capacitor discharge in electric vehicles during emergencies is demonstrated. It significantly reduces DC link capacitor discharge time from over 10 sec. to just 1 sec. by employing the main inverter switches, which removes the need for separate discharge components and lowers cost. This system integrates an adjustable gate driver to modulate the switch's gate-source voltage, enabling constant power operation during discharge. (3160.030, B. Akin, UT Dallas) |
| Safety,<br>Security and<br>Health Care<br>(Circuits) | A 19-GHz voltage-controlled oscillator (VCO) implemented in a 22-nm FDSOI CMOS process, which uses an addressable array of cross-coupled near-minimum-size NMOS transistor pairs and post-fabrication selection, demonstrates phase noise (PN) that is more than 5 dB lower than that of CMOS VCOs operating at 15-20 GHz in the literature. The VCO configured with the combination with the lowest PN was stressed for 5 hours, which degrades PN by ~2-3 dB. Even after three stress and post-stress selection cycles, the PN performance is almost fully recovered. (2810.077, K. K. O, UT Dallas) |

### Safety, Security and Health Care Thrust



| Category                                             | Accomplishment                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
|------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Safety,<br>Security and<br>Health Care<br>(Systems)  | A holographic point display using a TI Phase Light Modulator (PLM) was developed<br>and methods for producing Computer Generated Holograms for applications such<br>as Head Up Display were benchmarked. Compared to point source multiplexing<br>(PSM) for generating images, the Gerchberg-Saxton (GS) algorithm is 6.7x faster<br>(1.47s vs. 9.86s) in generating 2D images while PSM is 46x faster (9.57s vs. 446s)<br>for 3D images. Furthermore, tiling of Digital Micromirror Devices with PLMs was<br>shown to be an effective way of increasing Field of View and improving resolution<br>of lidar images. (3160.029, Y. Takashima, U of Arizona) |
| Safety,<br>Security and<br>Health Care<br>(Systems)  | A swift and lower-cost active discharge circuit system for efficient, safe capacitor discharge in electric vehicles during emergencies is demonstrated. It significantly reduces DC link capacitor discharge time from over 10 sec. to just 1 sec. by employing the main inverter switches, which removes the need for separate discharge components and lowers cost. This system integrates an adjustable gate driver to modulate the switch's gate-source voltage, enabling constant power operation during discharge. Through frequency modulation, thermal runaway is effectively prevented. (3160.030, B. Akin, UT Dallas) |
| Safety,<br>Security and<br>Health Care<br>(Circuits) | A 19-GHz voltage-controlled oscillator (VCO) implemented in a 22-nm FDSOI CMOS process, which uses an addressable array of cross-coupled near-minimum-size NMOS transistor pairs and post-fabrication selection, demonstrates phase noise (PN) of -117 dBc/Hz at 1-MHz offset frequency while dissipating 8 mW of DC power. The PN is more than 5 dB lower than that of CMOS VCOs operating at 15-20 GHz in the literature. The VCO configured using the combination (24 pairs switched on) with the lowest PN was stressed for 5 hours at V<sub>DDOSC</sub> = 1.9 V and a bias current of 27 mA. The stress degrades PN by ~2-3 dB. Even after three stress and post-stress selection cycles, the PN performance is almost fully recovered. (2810.077, K. K. O, UT Dallas) |







#### TASK 2810.054, RECONFIGURABLE AC POWER CYCLING SETUP AND PLUG-IN CONDITION MONITORING TOOLS FOR HIGH POWER IGBT AND SIC MODULES

BILAL AKIN, UNIVERSITY OF TEXAS AT DALLAS, BILAL.AKIN@UTDALLAS.EDU

#### SIGNIFICANCE AND OBJECTIVES

Concerns about long-term reliability hinder the widespread adoption of SiC MOSFETs. This research examines SiC MOSFET performance over time, using AC power cycling to simulate aging. It identifies key indicators for Si IGBT and SiC MOSFET modules and develops circuits for online condition monitoring, lifetime models, and tools for estimating remaining useful life.

#### **TECHNICAL APPROACH**

Using the TI UCC5870 gate driver IC, a smart gate driver board with condition monitoring (CM) circuits is designed. These CM circuits cover all primary aging mechanisms. Devices from different vendors are aged using AC power cycling, and aging precursor shift patterns are captured with the CM circuits. Periodically, testing is paused to examine devices with a curve tracer and a scanning acoustic microscope (SAM). At the end of their lifespan, various failure analyses are conducted. A MATLAB toolbox for estimating the remaining useful lifetime has been developed using the gathered aging data. This comprehensive approach enhances understanding and prediction of device longevity.

#### SUMMARY OF RESULTS

In this research, the AC power cycling test setup was subjected to full power testing. The power modules and discrete devices were aged under controlled conditions, and their parameter changes were monitored using specialized condition monitoring circuits. As shown in Figure 2, the monitoring circuits recorded online current and voltage data for on-resistance monitoring.



Figure 1. AC power cycling test setup.

Three main precursors for SiC MOSFETs have been identified. The on-resistance shift over aging is presented in Fig. 3 as an example. Finally, a MATLAB toolbox was developed and trained using real data from the aging test setup to estimate the remaining useful lifetime of the device.
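As a rough illustration of how an on-resistance precursor could feed a remaining-useful-life estimate, the Python sketch below derives R<sub>ds,on</sub> from paired on-state voltage and current samples, fits a straight line to its drift over power cycles, and extrapolates to a failure threshold. The linear drift model, the 20% rise criterion, the function names, and the sample values are all illustrative assumptions. They are not taken from the project's MATLAB toolbox or its measured data.

```python
def on_resistance(v_ds_on, i_d):
    """Return R_ds,on samples (ohms) from paired on-state voltage/current readings."""
    # Skip samples with no conduction current to avoid dividing by zero.
    return [v / i for v, i in zip(v_ds_on, i_d) if i > 0]


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def remaining_cycles(cycles, r_on, failure_ratio=1.2):
    """Extrapolate the fitted drift to failure_ratio * initial R_ds,on.

    Returns the estimated number of cycles left, or None if no upward
    drift is observed (the precursor gives no end-of-life estimate yet).
    """
    slope, intercept = fit_line(cycles, r_on)
    if slope <= 0:
        return None
    r_fail = failure_ratio * r_on[0]  # assumed failure criterion (20% rise)
    cycles_at_failure = (r_fail - intercept) / slope
    return max(0.0, cycles_at_failure - cycles[-1])


# Hypothetical aging record: on-resistance sampled every 10,000 cycles.
cycles = [0, 10_000, 20_000, 30_000]
r_on = [0.080, 0.082, 0.084, 0.086]  # ohms, made-up values
estimate = remaining_cycles(cycles, r_on)
print(f"Estimated remaining cycles: {estimate:.0f}")
```

A real precursor trend is unlikely to be linear over the whole device life, so a fitted model of this kind would normally be re-estimated as new monitoring data arrive.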



Figure 2. Online R<sub>ds,on</sub> measurement.



Figure 3. Precursor shift over aging.

**Keywords:** SiC MOSFETs, AC power cycling, performance degradation, condition monitoring, reliability

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

#### MAJOR PAPERS/PATENTS

[1] Patent: "Switching Transient Based Junction Temperature Estimation of SiC MOSFETs with Aging Compensation," 2023.

[2] "AC Power Cycling Test Setup and Condition Monitoring Tools for SiC-Based Traction Inverters," IEEE Transactions on Vehicular Technology.

[3] "Gate-Oxide Degradation Monitoring of SiC MOSFETs Based on Transfer Characteristics with Temperature Compensation," IEEE Transactions on Transportation Electrification, 2023.

[4] "Reliability Evaluation of SiC MOSFETs Under Realistic Power Cycling Tests," in IEEE Power Electronics Magazine, Oct. 2023.

#### SIGNIFICANCE AND OBJECTIVES

The escalating levels of EMI in automotive power ICs pose significant reliability and security risks. This work introduces a security-aware power IC architecture that enhances resistance to power side-channel attacks by decoupling power traces from load variations. It not only improves side-channel resistance but also effectively suppresses EMI.

#### TECHNICAL APPROACH

To decouple the power traces from load activities with minimal power overhead, a power masking stage is introduced in parallel to a buck power stage. This masking stage draws randomized input currents that are temporarily stored and subsequently delivered to the output after a random delay. This randomization of power masking and charge recycling eliminates the correlation between the input power trace and load activities. When combined with a hysteretic controller, the randomized switching frequency further suppresses the EMI level in the on-chip power IC.

#### SUMMARY OF RESULTS



Figure 1. Block diagram of the proposed encrypted power IC.

Fig. 1 shows the block diagram of the proposed encrypted on-chip power supply IC. The recycled masking stage consists of a power injection path with $M_{PI}$, $C_{CS}$, and $M_{CR}$ in parallel with a half-bridge buck converter. The encryption interface circuit generates a random ON-time ($T_{ON}$), which turns on $M_{PI}$ at randomized times, drawing random input current ($I_{PI}$) traces from $V_{IN}$ and charging $C_{CS}$.

Once $C_{CS}$ is sufficiently charged by $I_{PI}$, the charge recycling phase is activated. During this phase, $M_{PI}$ and $M_{H}$ are turned off to disconnect the main power source, and $M_{CR}$ is activated to use $C_{CS}$ as a temporary power source. Consequently, the input power trace is no longer correlated to the load activity, preventing side-channel attacks. By recycling the charges initially drawn for randomized power injection, the proposed architecture significantly reduces the power overhead compared to existing side-channel resistant designs.



Figure 2. Measured I<sub>IN</sub> and I<sub>CORE</sub> without proposed techniques.



Figure 3. Measured I<sub>IN</sub> and I<sub>CORE</sub> with proposed techniques.

As observed from Fig. 2, without the proposed techniques, the input current ($I_{IN}$) is directly proportional to the load current ($I_{CORE}$), making it susceptible to side-channel attacks. In contrast, Fig. 3 demonstrates that the proposed techniques effectively randomize both the amplitude and the timing of $I_{IN}$ in response to variations in $I_{CORE}$. This removes the correlation between $I_{IN}$ and $I_{CORE}$, effectively preventing side-channel attacks. Moreover, this is achieved with a minimal power overhead of less than 4.9%.
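One way to put a number on the correlation argument above is the Pearson correlation coefficient between the load and input current traces. The sketch below is illustrative only: the traces are synthetic, the 0.3 scaling factor and the uniform random bursts are assumptions, and the toy masked trace does not model charge recycling or energy balance.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length traces."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical load current: a core toggling between light and heavy activity.
i_core = [0.2 if (n // 50) % 2 == 0 else 1.0 for n in range(2000)]

# Unprotected converter (toy model): input current tracks the load.
i_in_plain = [0.3 * i for i in i_core]

# Masked converter (toy model): input current is drawn in random-amplitude
# bursts that are independent of the load activity.
i_in_masked = [rng.uniform(0.0, 1.0) for _ in i_core]

r_plain = pearson(i_core, i_in_plain)
r_masked = pearson(i_core, i_in_masked)
print(f"unprotected: r = {r_plain:.3f}")
print(f"masked:      r = {r_masked:.3f}")
```

A coefficient near 1 means an attacker can read load activity directly from the supply current, while a coefficient near 0 means the two traces carry little linear information about each other.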

Keywords: Side-channel resistance, EMI noise, IC security

#### INDUSTRY INTERACTIONS

IBM, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] K. Wei, J. W. Kwak and D. B. Ma, "An encrypted on-chip power supply with random parallel power injection and charge recycling against power/EM side-channel attacks," IEEE Transactions on Power Electronics, vol. 38, no. 1, pp. 500-509, Jan. 2023.

#### TASK 2810.057, RELIABILITY STUDY OF E-MODE GAN HEMT DEVICES BY AC TDDB AND HIGH RESOLUTION TEM

MOON KIM, UNIVERSITY OF TEXAS AT DALLAS, MOONKIM@UTDALLAS.EDU

HISASHI SHICHIJO, UNIVERSITY OF TEXAS AT DALLAS

#### SIGNIFICANCE AND OBJECTIVES

The p-GaN gate of E-Mode GaN HEMTs acts as a switch for 2-DEG channel formation in AlGaN/GaN channels. Thus, gate reliability is critical to expanding their use in power devices. In this project, we have successfully characterized X-GaN HEMT devices both electrically and physically to better understand their failure mechanism.

#### **TECHNICAL APPROACH**

Two types of E-mode p-GaN HEMTs, i.e., p-GaN and X-GaN configurations, were studied for their electrical and physical characteristics, including AC TDDB (Time Dependent Dielectric Breakdown) in addition to the DC TDDB, high-resolution transmission electron microscopy (HRTEM), and in-situ electrical-biasing TEM.

#### SUMMARY OF RESULTS

In the case of p-GaN HEMTs, the sign of device degradation during constant voltage (CV) stress is increased gate current in TDDB. The forward gate bias sweep shows an increase in gate current flowing through the device. In CV stress, the damage initiation to failure is rapid, and the metal/p-GaN contact of the gate has been completely damaged. Because of the high current flow, we also observe collateral damage to the device, which is an after-effect of gate failure. A multistage failure or soft breakdown was observed in constant current (CC) stress compared to the single-stage failure or hard breakdown in CV stress.

Gate-injection E-mode GaN HEMTs with a hybrid drain configuration, i.e., X-GaN HEMTs, show considerably stable operation under stress, and the time to failure is approximately dependent on the operating voltage. When these devices were subjected to constant voltage (CV) stress and constant current (CC) stress to induce time-dependent breakdown, the time to failure depended strongly on the stress current in CC stress, in contrast to the weak voltage dependence observed in CV stress.

E-mode GaN HEMTs with a p-GaN gate were used to prepare in-situ electrical biasing samples utilizing SEM and FIB techniques. In the gate-drain biasing setup, we observed the physical changes and defect formation concentrated around the device's active region, which is responsible for the gate breakdown, as shown in Fig. 1. However, the initiation to final damage was sudden and could not be captured in real-time.



Figure 1. STEM images of the degraded p-GaN/AlGaN/GaN heterostructure after in-situ electrical biasing show nanocracks, elemental diffusion, and roughened interfaces.

To monitor gate breakdown in real-time, we prepared similar samples with significantly reduced surface leakage current and conducted in-situ stress measurements. The samples, prepared using focused ion beam (FIB) techniques, were treated to limit the current flow. Combining isolation cuts with plasma and oxygen treatments reduced the surface electrical leakage current by up to six orders of magnitude, as shown in Fig. 2.



Figure 2. I-V characteristics of (a) as prepared in-situ electrically biased E-mode GaN HEMT with reduced surface leakage current and (b) in-situ electrically biased sample, showing breakdown.

We have successfully analyzed the device degradation/failure characteristics and reduced surface leakage for in-situ samples by six orders of magnitude. However, further refinement is necessary to decrease the leakage even more to be compatible with ex-situ electrical testing. To achieve this, we are pursuing more precise control in in-situ electrical biasing experiments, including the development of a new 4-probe gate-source-drain biasing.

**Keywords:** E-mode GaN HEMT device, Reliability, in-situ electrical biasing STEM

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

#### MAJOR PAPERS/PATENTS

[1] A. Mehta, S. Shichijo, J. Joh, C. Suh, and M.J. Kim, "Degradation and failure mechanism of p-GaN gate e-mode GaN HEMTs," ECS Trans. 112, 9-20 (2023).

Analog/RF devices are prone to process variability, which impacts device performance and yield. To optimize Analog/RF IC testing for yield, our solution aims to minimize yield loss (Overkill) and test escapes (Underkill) by leveraging machine learning models.

#### **TECHNICAL APPROACH**

To minimize Overkill, our three-step approach includes: (1) predicting auxiliary test values via multivariate regression models, (2) clustering these predictions with actual outcomes, and (3) identifying recoverable devices using a proximity-based metric. For Underkill, we utilize unsupervised GMM (Gaussian Mixture Model) clustering on measurements from multiple insertions to isolate devices likely to fail on-site and employ adaptive multivariate outlier detection for identifying potential customer returns. Fault IDs are determined through a multi-class neural network to eliminate the need for extensive failure analysis.

#### SUMMARY OF RESULTS

In this section, we will summarize the Overkill and Underkill reduction work that was previously explored and present the new results on Underkill reduction extension to classify the fault-Id of customer returns.

In our efforts to reduce Overkill, we performed our experiments on an industrial dataset from Texas Instruments that consisted of 66 specification tests and 241 auxiliary tests performed on 92,022 devices. Of these devices, we focus on the 8,840 (9.6%) devices that pass the specification tests but fail the auxiliary tests. Using the two-class classifier in addition to our regression and clustering, we recovered 81.6% (highlighted in green) of the devices in our focus group, as observed in Table 1.

Table 1. Device Classification using a Two-class Classifier

| Auxiliary Tests | Specification Tests: Pass | Specification Tests: Fail |
|-----------------|---------------------------|---------------------------|
| Pass            | 80261 **+ 7217**          | 1623                      |
| Fail            | 726                       | 2195                      |
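The percentages quoted for the recovery experiment can be cross-checked from the device counts given in the text and in Table 1. The short calculation below uses only those published counts; the variable names are ours.

```python
total_devices = 92_022
focus_group = 8_840   # pass the specification tests, fail the auxiliary tests
recovered = 7_217     # the "+ 7217" entry of Table 1

focus_share = focus_group / total_devices
recovery_rate = recovered / focus_group
table_total = 80_261 + 7_217 + 1_623 + 726 + 2_195

print(f"focus group share: {focus_share:.1%}")
print(f"recovery rate:     {recovery_rate:.1%}")
print(f"table total:       {table_total}")
```

The five table entries sum to the full population of 92,022 devices, and the two ratios round to the 9.6% and 81.6% figures stated above.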

In our efforts to reduce Underkill, we proposed a three-step approach: feature selection, clustering using GMM, and adaptive outlier detection. We performed our experiments on an industrial dataset from Texas Instruments consisting of devices from 19 wafers with a recorded customer return on each wafer. Upon applying our proposed methodology, we achieved coverage (correctly identified customer returns) of 89% to 100%. Additionally, the outlier detection model incurs an additional yield loss that falls from 3.48% to 1.8% as the training set grows from 10 wafers to 18 wafers.
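The report does not state which outlier metric is used. As one common choice for multivariate outlier screening, the sketch below uses the Mahalanobis distance on two test measurements; it is a stand-in and not the authors' model. The measurement values, the threshold of 3.0, and all function names are hypothetical. Refitting `center` and `inv` as more wafers are added to the training set is one possible reading of "adaptive".

```python
from statistics import mean


def fit_2d(points):
    """Return the mean vector and inverse covariance of 2-D measurements."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    n = len(points) - 1
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    det = sxx * syy - sxy * sxy
    inv = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inv


def mahalanobis(point, center, inv):
    """Distance of a point from the population, scaled by its covariance."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    d2 = (dx * (inv[0][0] * dx + inv[0][1] * dy)
          + dy * (inv[1][0] * dx + inv[1][1] * dy))
    return d2 ** 0.5


# Hypothetical, strongly correlated measurements of two tests on passing devices.
population = [(1.00, 2.01), (1.10, 2.19), (0.90, 1.82), (1.05, 2.08),
              (0.95, 1.91), (1.02, 2.05), (0.98, 1.95), (1.08, 2.17)]
center, inv = fit_2d(population)

THRESHOLD = 3.0           # assumed cut-off, not taken from the report
typical = (1.03, 2.06)    # follows the population trend
suspect = (1.05, 1.85)    # each value is in range, but the pair breaks the trend

for name, device in (("typical", typical), ("suspect", suspect)):
    d = mahalanobis(device, center, inv)
    print(f"{name}: distance = {d:.2f}, outlier = {d > THRESHOLD}")
```

The suspect device passes each test individually, which is why a multivariate metric is needed to flag it: only the combination of the two measurements is unusual.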

Table 2. Fault-Id Classification using Multi-Class Classifier

| Actual vs<br>Predicted | Fault-Id 1 | Fault-Id 2 | Fault-Id 3 |
|------------------------|------------|-------|------------|
| Fault-Id 1             | 81.5%      | 18.5% | 0%         |
|                        | 100%       | 0%    | 0%         |
|                        | 32.5%      | 67.5% | 0%         |
|                        | 24.5%      | 75.5% | 0%         |
|                        | 0%         | 7.5%  | 92.5%      |

Finally, to classify the fault-Id of customer returns, we performed our experiment on an industrial dataset from Texas Instruments that consisted of 19 customer returns categorized into 3 fault-Ids. The multi-class classifier model is a feedforward neural network with three hidden layers, each using ReLU (Rectified Linear Unit) activation and dropout layers to prevent overfitting, and an output layer with SoftMax activation for 3-class classification.

To train and test the model we used an 80-20 split of devices belonging to each fault-Id and the results of classification are recorded in Table 2. We can classify the correct fault-Id of customer returns with 100% accuracy if our predictions are subjected to majority vote.
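The majority-vote step can be illustrated with the percentages in Table 2. The sketch assumes that each table row gives the share of individual predictions assigned to Fault-Id 1, 2, and 3 (in column order) for one group of test devices; that reading of the table, and the function name, are our assumptions.

```python
# Rows of Table 2: share (%) of predictions assigned to Fault-Id 1, 2 and 3.
table2_rows = [
    (81.5, 18.5, 0.0),
    (100.0, 0.0, 0.0),
    (32.5, 67.5, 0.0),
    (24.5, 75.5, 0.0),
    (0.0, 7.5, 92.5),
]


def majority_vote(shares):
    """Return the 1-based fault-Id that receives the largest share of votes."""
    return max(range(len(shares)), key=lambda k: shares[k]) + 1


voted = [majority_vote(row) for row in table2_rows]
print(voted)  # [1, 1, 2, 2, 3]
```

Under this reading, every row is dominated by a single class, so the vote returns one unambiguous fault-Id per row even though individual predictions are sometimes wrong.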

**Keywords:** Yield recovery, machine learning, adaptive testing, failure analysis

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

#### MAJOR PAPERS/PATENTS

[1] D. Neethirajan, et al., "Machine Learning-Based Overkill Reduction through Inter-Test Correlation," IEEE VLSI Test Symposium (VTS), 2022.

[2] V. A. Niranjan, et al., "Machine Learning-Based Adaptive Outlier Detection for Underkill Reduction in Analog/RF IC Testing," IEEE VLSI Test Symposium (VTS), 2023.

### TASK 2810.059, ULTRA-LOW-POWER ROBUST SAR ADC FOR PMCW AUTOMOTIVE RADAR

YUN CHIU, UNIVERSITY OF TEXAS AT DALLAS, CHIU.YUN@UTDALLAS.EDU

#### SIGNIFICANCE AND OBJECTIVES

Low-power, high sample-rate ADCs targeting reliable operation in harsh environments such as automotive radars are needed. Most prior works on SAR ADCs were focused on the core SAR design, and little attention has been paid to the peripheral circuits, namely, the input and reference buffers, which often consume more power.

#### **TECHNICAL APPROACH**

A soft summing-node (SSN) technique, termed the "elastic" S/H structure, that alleviates power-hungry ADC input drivers to achieve the goal of an overall ultra-low power consumption is pursued in this project. A schematic of the proposed two-step SAR ADC with SSN sampling is shown in Fig. 1. The summing node switch  $M_2$  is sized down compared to its conventional counterpart, which improves parasitics but introduces swing on the SSN, node 2. The capacitor  $C_x$  captures any swing on node 2 so it is not seen by the first stage.

#### SUMMARY OF RESULTS



Figure 1. Two-step SAR ADC with SSN technique.

A 12-b, 300MS/s two-step SSN SAR ADC prototype was taped out in a 22-nm CMOS process in January 2024. The completed layout and post-layout simulation results are reported. The simulated linearity of the first stage including parasitic resistance was 85dB as shown in Fig. 2. The simulated post-layout SNDR of the ADC including quantization noise but not thermal noise was 75dB (Fig. 3). Due to the long simulation times of post-layout simulation, a sine fit was used to evaluate the performance of the ADC given the limited number of simulated samples. The first stage alone could be simulated with parasitic capacitors and resistors extracted, while the overall ADC including clocking and bypass was simulated with parasitic capacitors only.
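When only a short record is available, SNDR can be estimated by fitting a sine wave of known frequency to the samples and treating the fit residual as noise plus distortion. The sketch below shows a three-parameter least-squares sine fit on a synthetic record; the record length, tone frequency, and ideal 12-bit quantizer are assumptions for illustration and are not the simulation data behind Fig. 3.

```python
import math


def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][3] / aug[i][i] for i in range(3)]


def sine_fit_sndr(samples, cycles_per_sample):
    """Fit A*cos + B*sin + C at a known frequency; return (amplitude, SNDR in dB)."""
    w = 2 * math.pi * cycles_per_sample
    basis = [(math.cos(w * n), math.sin(w * n), 1.0) for n in range(len(samples))]
    # Normal equations for the three unknowns (cosine, sine, offset).
    gram = [[sum(row[i] * row[j] for row in basis) for j in range(3)]
            for i in range(3)]
    proj = [sum(row[i] * y for row, y in zip(basis, samples)) for i in range(3)]
    a_cos, a_sin, offset = solve3(gram, proj)
    residual = [y - (a_cos * row[0] + a_sin * row[1] + offset)
                for row, y in zip(basis, samples)]
    signal_power = (a_cos ** 2 + a_sin ** 2) / 2
    error_power = sum(r * r for r in residual) / len(residual)
    return math.hypot(a_cos, a_sin), 10 * math.log10(signal_power / error_power)


# Hypothetical record: 256 samples of an ideal 12-bit quantized sine wave.
N_SAMPLES, TONE_BIN, HALF_SCALE = 256, 37, 2 ** 11
tone = TONE_BIN / N_SAMPLES
codes = [round(0.99 * HALF_SCALE * math.sin(2 * math.pi * tone * n))
         for n in range(N_SAMPLES)]

amplitude, sndr_db = sine_fit_sndr(codes, tone)
print(f"fitted amplitude = {amplitude:.2f} LSB, SNDR = {sndr_db:.1f} dB")
```

For an ideal 12-bit quantizer the result should land near the textbook 6.02N + 1.76 dB value of about 74 dB. Unlike an FFT, the fit does not need a long record, which is the motivation when post-layout simulation time limits the number of samples.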



Figure 2. SSN + residue amplifier post-layout linearity.



Figure 3. Post-layout simulation of prototype ADC chip.

The finished layout of the prototype is shown in Fig. 4.



Figure 4. Layout screenshot of prototype ADC chip.

**Keywords:** soft summing node (SSN), summing-node swing, summing-node distortion, flash TDC, background calibration

#### INDUSTRY INTERACTIONS

NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

### TASK 2810.064, CHARACTERIZATION AND TOLERANCE OF AGEING IN INTEGRATED VOLTAGE REGULATORS

SAIBAL MUKHOPADHYAY, GEORGIA INSTITUTE OF TECHNOLOGY, SAIBAL@ECE.GATECH.EDU

#### SIGNIFICANCE AND OBJECTIVES

The project will develop circuit techniques and design methodologies to model, characterize, and tolerate ageing in integrated voltage regulators, including on-chip inductive buck and digital low dropout regulators, used in modern SoCs.

#### TECHNICAL APPROACH

This project has analyzed the effects of ageing in IVRs, designed a test circuit to efficiently characterize ageing in IVRs, and estimated the ageing-induced design margin for IVR. We will focus on high-frequency inductive regulators.

#### SUMMARY OF RESULTS

We have developed a design framework including on-chip stress and measurement circuits to characterize NBT/HCI-induced aging of various components of an IVR (Fig. 1). A digitally controlled, 125-MHz double-sampled IVR with the proposed framework is designed and fabricated in 65-nm CMOS (Fig. 2). On-chip stress and measurement circuits enable aging analysis of individual components without degrading other components. Stress and measurements are conducted to characterize the aging degradation of each component's functionality. In the meantime, measurements are performed to characterize the effect of aging of each component (independently and simultaneously) on the transient response, regulation properties, and efficiency of IVRs. Measurements show that aging effects increase power stage resistance, extend DPWM's duty cycle, increase ADC's transition levels, and reduce PID's critical frequency, leading to marginally longer settling time, deteriorated voltage regulation, and reduced efficiency. Ageing-induced frequency reduction of PID has the strongest effect on IVR's transient response (Fig. 3). Power stage degradation leads to the largest efficiency reduction (Fig. 3).



Figure 1. Overall architecture and die-photo (65-nm CMOS) of a digitally controlled IVR.



Figure 2. Chip die photo and specifications. IVR uses a 68-nH discrete inductor and a 44-nF discrete capacitor.



Figure 3. Measurement results: Setup for accelerated ageing stress (@ 2V) (top, left), impacts of aging of IVR components on IVR efficiency (top, right), and settling time for load transient (bottom).

**Keywords:** Integrated voltage regulator, ageing

#### INDUSTRY INTERACTIONS

IBM, Intel, NXP

#### MAJOR PAPERS/PATENTS

[1] S. Zhang, et al., "Analysis of the Effect of Hot Carrier Injection in An Integrated...," IEEE/ACM ISLPED 2022.

[2] S. Zhang, et al., "Measurement of Aging Effect in a Digitally Controlled Inductive...," IEEE International Reliability Physics Symposium (IRPS), 2024.

[3] W. Wang, et al., "Measurement of Aging Effect of an Analog Computing-In-Memory...," IEEE International Reliability Physics Symposium (IRPS), 2024.

#### TASK 2810.065, POWER-EFFICIENT AND RELIABLE 48-V DC-DC CONVERTER WITH DIRECT SIGNAL-TO-FEATURE EXTRACTION AND DNN-ASSISTED MULTI-INPUT MULTIPLE-OUTPUT FEEDBACK CONTROL

MINGOO SEOK, COLUMBIA UNIVERSITY, MGSEOK@EE.COLUMBIA.EDU

#### SIGNIFICANCE AND OBJECTIVES

The goal for the second phase of this project is to extract critical features of the GaN-based DC-DC converter and then perform health monitoring, stress control, and efficiency tracking functions.

#### **TECHNICAL APPROACH**

We designed a 40-to-3V wide input range GaN-based DC-DC converter featuring three advanced techniques. First, we proposed a health monitor sensing the on-state resistance ($R_{on}$) and threshold voltage ($V_{th}$) degradation of the GaN device. The monitor estimates the remaining useful life (RUL) and detects the catastrophic failure of GaN. Second, we designed stress control circuits that relax the thermal stress and the voltage stress on the gate and drain terminals. Third, we proposed efficiency tracking for the GaN-based DC-DC converter by gate drive voltage ($V_{drv}$) modulation.

#### SUMMARY OF RESULTS

From May 1, 2023, to April 30, 2024, we designed the second prototype chip, a GaN-based DC-DC converter. Using this chip, we will verify the health monitoring, stress control, and efficiency tracking functions.



Figure 1. Architecture of the proposed converter. It consists of one digital  $V_{out}$  regulation loop, one health monitoring and stress control loop, and one efficiency tracking loop.

The battery for electric vehicles has a large dynamic fluctuation range between 40V and 3V. To tolerate the wide fluctuation of $V_{in}$, we employ GaN devices in the power stage of the DC-DC converter for their high voltage tolerance and low $R_{on}$. However, the GaN devices suffer from reliability issues like $R_{on}$ and $V_{th}$ shifts. To address these and further improve the converter efficiency, we introduced three novel features to the GaN-based converter.

Fig. 1 shows the architecture of the proposed 40V-to-3V wide input GaN-based DC-DC converter for automotive applications. It has three control loops. The first is the voltage-mode digital $V_{out}$ regulation loop. The second is the health monitoring and stress control loop. The third is the efficiency tracking loop. The voltage-mode digital $V_{out}$ regulation loop comprises a flash ADC, a digital PID controller, and a digital PWM generator. The health monitoring and stress control loop includes a $V_{ds}$ sampler, an $i_L$ amplifier, a $V_{th}$ trigger, a PTAT temperature sensor, a TDC, three ADCs, an $R_{on}$ calculation block, a junction temperature ($T_j$) estimation block, a temperature calibration block, a health monitoring block, and a stress control block. This loop first senses the $V_{th}$, $V_{ds}$, $i_L$, and ambient temperature ($T_a$) of the converter. Based on the sensed $T_a$, $V_{ds}$, and $i_L$, it calculates $R_{on}$ and estimates $T_j$, then calibrates $R_{on}$ and $V_{th}$ to $T_j$ to acquire junction-temperature-independent $R_{on}$ and $V_{th}$ values. The health monitoring block takes $R_{on}$ and $V_{th}$ to estimate RUL and detect the catastrophic failure of GaN. The efficiency tracking loop has an efficiency tracking controller and a charge pump. It tunes $V_{drv}$ under different load conditions to track the maximum efficiency point.
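The sense-calculate-calibrate chain of the health monitoring loop can be sketched numerically. Only the relation $R_{on} = V_{ds}/i_L$ follows directly from the quantities named above. The self-heating model for $T_j$, the thermal resistance `r_th`, the linear temperature coefficient `alpha`, the reference temperature `t_ref`, and all numeric values are hypothetical placeholders, since the report does not specify the estimation or calibration functions.

```python
def estimate_ron(v_ds, i_l):
    """On-resistance from the sampled drain-source voltage and inductor current."""
    return v_ds / i_l


def estimate_tj(t_a, v_ds, i_l, r_th):
    """Junction temperature as ambient plus conduction self-heating (assumed model)."""
    return t_a + r_th * v_ds * i_l


def normalize_ron(r_on, t_j, t_ref=25.0, alpha=0.01):
    """Refer R_on to t_ref using an assumed linear temperature coefficient (1/degC)."""
    return r_on / (1.0 + alpha * (t_j - t_ref))


# Hypothetical sensed values.
v_ds, i_l, t_a = 0.10, 2.0, 25.0     # volts, amperes, degC
r_th = 50.0                          # degC per watt, assumed

r_on = estimate_ron(v_ds, i_l)
t_j = estimate_tj(t_a, v_ds, i_l, r_th)
r_on_ref = normalize_ron(r_on, t_j)
print(f"R_on = {r_on * 1e3:.1f} mOhm at T_j = {t_j:.1f} degC")
print(f"R_on referred to 25 degC = {r_on_ref * 1e3:.2f} mOhm")
```

Removing the temperature dependence in this way lets a drift in the referred value be attributed to device degradation instead of operating conditions, which is the purpose of the calibration step described above.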

**Keywords:** RUL estimation, failure recognition, efficiency tracking, ringing mitigation

#### INDUSTRY INTERACTIONS

Texas Instruments, IBM, Intel

#### MAJOR PAPERS/PATENTS

[1] Z. Wang, et al., "93.89% Peak Efficiency 24V-to-1V DC-DC Converter with Fast In-Situ Efficiency Tracking and Power-FET Code Roaming," 2023 ESSCIRC, Lisbon.

[2] M. Li, et al., "16.6 PACTOR: A Variation-Tolerant Probing-Attack Detector for a 2.5Gb/s×4-Channel Chip-to-Chip Interface in 28nm CMOS," 2024 ISSCC, San Francisco.

[3] Z. Wang, et al., "A Ten-Level Series-Capacitor 24-to-1-V DC-DC Converter With Fast In Situ Efficiency Tracking, Power-FET Code Roaming, and Switch Node Power Rail," in IEEE Journal of Solid-State Circuits.

### TASK 2810.066, DEMONSTRABLY GENERALIZABLE COMPACT MODELS OF ESD DEVICES

ELYSE ROSENBAUM, UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN, ELYSE@ILLINOIS.EDU

#### SIGNIFICANCE AND OBJECTIVES

We developed compact models of ESD protection devices and verified that those "ESD models" correctly represent the devices' response to arbitrary stimuli. The compact models allow designers to use simulation to create protection circuits that can protect I/O pins in advanced nodes without compromising signal integrity.

#### **TECHNICAL APPROACH**

We developed non-quasi-static models of N-well and P-well diodes that are accurate under both normal operating and ESD conditions. MOSFETs are used in both functional circuits and on-chip protection networks. We proposed an ESD wrapper model that augments the PDK MOSFET model. Transient models of ESD devices are validated using waveforms that are different from those used for parameter extraction. Interconnect models suitable for CDM-ESD simulations were derived and the resultant circuit simulations were validated by test circuit measurements.

#### SUMMARY OF RESULTS

A MOSFET ESD model was created by augmenting the PDK model with extra elements to account for the unique high-current behaviors. The extra elements are named the "wrapper." To ensure that the accuracy of the PDK model is maintained, ideally, none of its parameters should be modified. Previously, we reported that the BSIM4 MOSFET model included in most PDKs for planar bulk CMOS technologies greatly overpredicts the gate current at voltages that exceed the rated supply voltage. The error prevents one from simulating the ESD response of the rail clamp circuit of Fig. 1. We are not able to mitigate that error with the wrapper and thus we manually set the Igbmod parameter to 0. Results are shown in Fig. 2.



Figure 1. Active rail clamp circuit. For a positive discharge to an IO pin, the "up" diode will direct the ESD current to the  $V_{DD}$  rail, from where it flows to  $V_{SS}$  through the clamp.

During simulation, it is important to detect if the ESD-induced gate voltage ($V_{GS}$) exceeds the oxide breakdown voltage ($BV_{ox}$). $BV_{ox}$ is a function of the stress duration and the MOSFET active area ($W \times L$). Previously, $BV_{ox}$ had not been quantified on a CDM time scale, motivating the development of an on-chip single-shot pulse generator (Fig. 3). An analysis of the $BV_{ox}$ data may be found in the report for Task 3160.046.



Figure 2. Simulated response of the circuit of Fig. 1 in 65-nm CMOS. The results are shown in green.



gate oxide breakdown voltage on hundreds of ps time scale.

**Keywords:** ESD, CDM, compact models, circuit simulation

#### INDUSTRY INTERACTIONS

AMD, Intel, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] S. Huang, et al., "Physics-based compact model of N-Well ESD diodes," in 2023 EOS/ESD Symposium Proceedings.

[2] M. Drallmeier, et al., "Distributed protection for high-speed...," in 2023 EOS/ESD Symposium Proceedings.

[3] E. Rosenbaum, et al., "Compact models for simulation of on-chip ESD protection networks," *IEEE Trans. Electron Devices*, vol. 71, no. 1, pp. 151-166, 2024.

We developed a fully synthesizable reliability sensor to monitor device degradation due to aging. We modified the previous 65-nm design to address the issues found in the test chip data and implemented the modified synthesized version in 28-nm CMOS and 12-nm FinFET, which provide markedly improved aging statistics.

#### **TECHNICAL APPROACH**

We made improvements to our Verilog design to make the design more robust and to get cleaner data. An additional counter prevents the beat frequency (BF) counter from rolling over and keeps its maximum count value. The measurement terminating signal is synchronized with the counters and trigger signals, instead of a combinational signal pulse. The design uses the AC stress clock to remove glitches in the ring oscillators during startup. The design is fully synthesizable, and the layout was generated with an automatic place and route tool without any manual routing.

#### SUMMARY OF RESULTS

The odometer has calibration-free operation and ring oscillator start-up glitch removal. We also added a special type of odometer, with a stacked inverter chain as the reference ring oscillator and a 2x standard cell inverter chain as the stressed ring oscillator. The design includes ring oscillators with 3 different threshold voltage (V<sub>th</sub>) flavors (RVT, LVT, and HVT) and 3 different standard cells (INV, NAND, and NOR). Each ring oscillator has 101 stages (stress) or 103 stages (reference) of the specific standard cell. We have 6 odometers for each V<sub>th</sub>, and 3 voltage sensor odometers in each of the two power domains. The chip contains a total of 42 odometers, each of which can be individually controlled through an AXI interface.

We collected measurement data from all the odometers after stressing them for almost two days. We stressed half of the odometers with DC stress and the other half with AC stress. We measured BF count values right before the stressing to calculate the % frequency degradation. In Fig. 1, the frequency shift is plotted against the stress time; each plot shows the variation in the frequency shift data among 6 odometers. The odometers are stressed with 1.2V (+33% of nominal 0.9V) for 28nm and 1.067V (+33% of nominal 0.8V) for 12nm at 25°C and 100°C. The data show higher DC stress degradation at high temperature than at room temperature due to increased BTI, which is expected. We can see higher degradation in the DC aging data than in the AC aging data, which shows higher BTI aging. We can see higher degradation in HVT devices, as high threshold voltage devices are more sensitive to the frequency shift. Overall degradation for the 12-nm FinFET odometers is less than that for those fabricated in 28-nm CMOS. We however notice more variation in the data of the 12-nm FinFET odometers than in that of the odometers fabricated in 28-nm CMOS.
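The conversion from beat-frequency (BF) counts to % frequency degradation can be sketched as follows. The sketch assumes that the counter accumulates reference-oscillator periods over one beat period, so that $N = f_{ref}/|f_{stress} - f_{ref}|$, that oscillator frequency scales inversely with stage count, and that the stressed oscillator stays faster than the reference. The counting scheme, the 1% slowdown, and the function names are assumptions and are not taken from the report.

```python
def beat_count(f_ref, f_stress):
    """Reference-oscillator periods per beat period (assumed counting scheme)."""
    return f_ref / abs(f_stress - f_ref)


def degradation_from_counts(n_fresh, n_aged):
    """Fractional slowdown of the stressed oscillator, assuming it stays faster
    than the reference so that f_stress / f_ref = 1 + 1/N."""
    return 1.0 - (1.0 + 1.0 / n_aged) / (1.0 + 1.0 / n_fresh)


f_ref = 1.0                      # normalized 103-stage reference oscillator
f_fresh = f_ref * 103 / 101      # 101-stage stressed oscillator runs ~2% faster
f_aged = f_fresh * (1 - 0.01)    # hypothetical 1% aging-induced slowdown

n_fresh = beat_count(f_ref, f_fresh)
n_aged = beat_count(f_ref, f_aged)
print(f"fresh count {n_fresh:.1f}, aged count {n_aged:.1f}")
print(f"recovered degradation {degradation_from_counts(n_fresh, n_aged):.2%}")
```

Under these assumptions a 1% frequency shift roughly doubles the count, which illustrates why a beat-frequency measurement can resolve small aging-induced shifts and why a roll-over guard on the counter is useful.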



Figure 1. Frequency shift vs stress time for (top) 28-nm planar transistor 1.2V stress (+33% of nominal 0.9V  $V_{DD}$ ) and (bottom) 12-nm FinFET 1.067V stress (+33% of nominal 0.8V  $V_{DD}$ ) for both DC and AC stress.

**Keywords:** Silicon odometer, synthesizable, Verilog, automatic place and route, 28-nm CMOS, 12-nm FinFET

#### INDUSTRY INTERACTIONS

Intel, NXP, TI

#### MAJOR PAPERS/PATENTS

#### TASK 2810.074, THERMAL PERFORMANCE CHARACTERIZATION AND DEGRADATION MONITORING OF LDMOS BASED INTEGRATED POWER IC WITH ON-DIE TEMPERATURE SENSORS

BILAL AKIN, UNIVERSITY OF TEXAS AT DALLAS, BILAL.AKIN@UTDALLAS.EDU

#### SIGNIFICANCE AND OBJECTIVES

The Laterally Diffused MOSFET (LDMOS) device is widely used in power integration applications. With the remarkable increase in current density and switching speed, the stress caused by voltage overshoot becomes more pronounced. Consequently, it is essential to examine the robustness of LDMOS devices under dynamic reliability testing.

#### **TECHNICAL APPROACH**

To accelerate testing and achieve reliable results, a large-scale test setup is used, allowing stress to be simultaneously applied to multiple devices. The design of customized isolated LDMOS devices focuses on examining the effect of the isolation well connection and width on the device resilience during dynamic reliability testing and finding the most reliable precursor for condition monitoring of the device.

#### SUMMARY OF RESULTS

A dedicated circuit is designed to accurately measure the $I_{dss}$ of LDMOS devices for condition monitoring. The circuit is based on a transimpedance amplifier structure with an ultra-low input current. The circuit diagram of the $I_{dss}$ measurement setup and the corresponding experimental board are illustrated in Fig. 1. To measure the $I_{dss}$, both switches on the high and low sides should be turned off. To achieve precise measurements, the circuit utilizes the LMC6041 operational amplifier. With an input leakage current of only 2fA, this amplifier is well suited to accurately measuring the $I_{dss}$ in the nano-ampere current range. To obtain more sensible data on $I_{dss}$, the circuit incorporates a feedback resistor ($R_2$) with a value of $10M\Omega$. Furthermore, the appropriate value for $R_2$ ensures that the $I_{dss}$ from the DUT flows exclusively through $R_2$, eliminating any leakage current through the lower switch. On the other hand, the parasitic elements on the PCB, which are modeled by an input capacitor ($C_{in}$), can affect the circuit's performance. Therefore, to compensate for the impact of $C_{in}$ when using a large feedback resistor, a feedback capacitor ($C_f$) is added to the circuit such that

$$\frac{1}{2\pi R_1 C_{in}} \geqslant \frac{1}{2\pi R_2 C_f} \rightarrow R_1 C_{in} \leqslant R_2 C_f \quad . \tag{1}$$
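As a numerical illustration of Eq. (1) and of the transimpedance relation, the sketch below computes the smallest $C_f$ that satisfies the inequality and converts an output voltage into a leakage current. Only $R_2 = 10\,M\Omega$ comes from the text. The values of $R_1$ and $C_{in}$ and the function names are hypothetical, and the current conversion assumes an ideal transimpedance stage with $|V_{out}| = I_{dss} R_2$.

```python
R2 = 10e6        # feedback resistor from the text, ohms
R1 = 1e3         # hypothetical value for R1 in Eq. (1), ohms
C_IN = 20e-12    # hypothetical PCB parasitic input capacitance, farads


def min_feedback_cap(r1, c_in, r2):
    """Smallest C_f that satisfies R1*C_in <= R2*C_f (Eq. 1)."""
    return r1 * c_in / r2


def idss_from_vout(v_out, r2=R2):
    """Leakage current magnitude implied by the amplifier output voltage."""
    return abs(v_out) / r2


c_f_min = min_feedback_cap(R1, C_IN, R2)
print(f"minimum C_f = {c_f_min * 1e15:.1f} fF")
print(f"10 mV at the output corresponds to {idss_from_vout(0.010) * 1e9:.1f} nA")
```

With a 10 MΩ feedback resistor, each nano-ampere of leakage produces 10 mV at the output, which is why a large $R_2$ makes nano-ampere currents practical to read.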

By iteratively testing different values of $C_f$, the pulse response of the circuit is experimentally optimized. Fig. 2 illustrates the results of $I_{dss}$ measurement from the proposed circuit and the curve tracer at different $V_{ds}$. The bus voltage is changed during testing to show the response of the circuit as $I_{dss}$ increases. As can be seen, the circuit accurately measures the $I_{dss}$, with an error of less than 2% compared with that measured with the curve tracer at the rated voltage of the device ($V_{ds} = 20V$). Fig. 2 also illustrates the double pulse tester (DPT) results of the DUT using the board shown in Fig. 1.



Figure 1. Condition monitoring setup of LDMOS devices: circuit diagram and experimental board.



Figure 2. $I_{dss}$ measurement results from the proposed circuit and curve tracer, and DPT results from the setup shown in Fig. 1.

**Keywords:** LDMOS, reliability, dynamic reliability test, condition monitoring, drain leakage current

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

#### MAJOR PAPERS/PATENTS

[1] R. Sajadi, C. Xu, B. T. Vankayalapati, M. Farhadi and B. Akin, "Reliability Evaluation of Isolated LDMOS Devices and Condition Monitoring Solution," in IEEE Transactions on Components, Packaging and Manufacturing Technology, vol. 14, no. 5, pp. 841-850, May 2024.

This research seeks to increase lifetime and enable circuit operation closer to the reliability limit to improve performance. To reduce complexity and cost, approaches to estimate noise degradation using surrogate sensors will be investigated. This research may provide a step toward a framework for predicting time to failure.

#### **TECHNICAL APPROACH**

Noting that the aging of nano-scale transistors is highly variable and that noise is one of the parameters most sensitive to transistor aging, this research will investigate the feasibility of increasing the lifetime of circuits by monitoring noise performance degradation and intelligently reconfiguring the circuit. More specifically, PLLs and downconverters will use arrays of near-minimum-size transistors from which a subset can be selected after fabrication to reduce noise. The feasibility of recovering circuit noise performance by replacing transistors whose noise has increased due to aging with fresh, lower-noise transistors will be evaluated.

#### SUMMARY OF RESULTS

This research effort, involving collaboration with Prof. Y. Makris of UT Dallas and Prof. C. Kim of U. of Minnesota, is experimentally evaluating the feasibility of increasing lifetime by replacing the degraded devices with low noise devices that are not aged. A PLL employing a VCO using an array of individually selectable near minimum sized NMOS cross coupled pairs is fabricated in 65-nm CMOS. Arbitrary combinations of 64 pairs can be selected for stressing and for VCO operation. The PLL in Figs. 1 and 2(left) includes an on-chip phase noise measurement circuit that can be used for the selection of combinations of pairs with low phase noise after fabrication and after stress.



Figure 1. PLL with an on-chip phase noise measurement circuit that will be used for initial stress and heal experiments to investigate the feasibility of improving lifetime.

The phase noise of the PLL was measured using the on-chip phase noise measurement circuit along with an external ADC, and using a Keysight E5052B Signal Source Analyzer, at 500-kHz, 1-MHz, and 2-MHz offsets from its carrier near 4 GHz.

The deviations between the two techniques are less than ~1.2 dB. The lowest phase noise measured is -128 dBc/Hz at a 2-MHz offset.

Having redundant elements in a circuit enables the replacement of transistor pairs as they degrade with the ones with smaller degradation to increase the circuit's lifetime. A VCO of a PLL with 32 out of 64 pairs switched on was stressed at  $V_{DD}$ =2.6 V and a bias current of 7.7 mA for 1 hour. This is followed by phase noise measurement and intelligent search steps to identify a new combination with the lowest phase noise. The stress degrades the phase noise by ~3 dB and the subsequent search identified a combination with phase noise almost the same as before the stress. Fig. 2(right) shows the phase noise degradation after stress as well as recovery indicating that the proposed PLL design technique should enable an increase in the lifetime.



Figure 2. (left) Die photograph of PLL, (right) phase noise degradation of VCO versus stress cycles at $V_{DD}$ = 2.6 V for one hour. Re-selection after the 1<sup>st</sup> stress cycle restores the phase noise close to that before the 1<sup>st</sup> stress.

**Keywords:** PLL, downconverter, noise measurements, post-aging selection, lifetime

#### INDUSTRY INTERACTIONS

Intel, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] F. Jalalibidgoli, et al., "Post-Fabrication Performance and Longevity Improvement of a Phase Locked Loop with an On-chip Phase Noise Measurement Circuit," Accepted to IEEE Transactions on Microwave Theory and Techniques.

[2] F. Jalalibidgoli, et al., "19-GHz VCO with Phase Noise of -117 dBc/Hz at 1-MHz Offset Using an Array of Near Minimum Size Transistors and Intelligent Post Fabrication Selection," Accepted to 2024 IEEE IMS, Washington DC.

#### TASK 2810.084, SOFT AND HARD ANALOG FAULT DETECTION, INJECTION, COVERAGE, DIAGNOSIS, AND LOCALIZATION STRATEGIES SUITABLE FOR PRODUCTION TEST AND IN-FIELD TEST DEGANG CHEN, IOWA STATE UNIVERSITY, DJCHEN@IASTATE.EDU

#### SIGNIFICANCE AND OBJECTIVES

Mission-critical applications demand extreme reliability, mandating high electronic fault coverage. Analog circuits account for a tiny portion of transistors but a major portion of failures. This project develops cost-effective strategies for detecting soft and hard analog faults for both production and in-field testing, targeting significantly enhanced reliability and robustness.

#### **TECHNICAL APPROACH**

We will work with industry liaisons to identify the most relevant AMS-IP blocks as research vehicles. Digital-like detectors and injectors will be inserted to boost structural observability and controllability so that high defect coverage becomes possible from the digital detectors, making specification simulation unnecessary. Instead, a DC parametric sweep is used for coverage evaluation, dramatically reducing coverage simulation time and allowing all single-defect injections to be evaluated for accurate coverage. Soft defects are modeled as device parametric changes or as opens/shorts in sub-unit components. Detector output codes and injector input codes will be analyzed to determine the defect location and type.

#### SUMMARY OF RESULTS

During the last year, we developed a time-efficient analog and mixed-signal defect simulation framework. We tested it using multiple AMS circuits such as Op amps, SAR ADCs, and FVF LDOs. We also tested it across multiple defect detection methods such as IOI, OTM, etc. See [1] for details. We submitted a long report to SRC on the design and simulation of multiple analog circuits with targeted soft and hard analog fault injection to verify the proposed fault detection method and its small impact on analog performance. We submitted another report to SRC on methods and algorithms for analyzing detection digital output data to achieve fault diagnosis and fault localization together with evidence illustrating algorithm efficacy. We developed a method for high-resolution comparator defect detection, achieving 100% defect coverage. Working with TI, we started developing a method for defect detection in phase-locked loops with minimal intrusion and ultra-fast fault coverage simulation.

Fig. 1 illustrates an example structure of a high-resolution comparator. One or more preamplifier stages, as shown in Fig. 1(a), can be used to increase the comparator gain. A standard latch stage, Fig. 1(b), is used as the output stage. The comparator is used in a SAR ADC. We control the C-DAC of the ADC to generate either +1 LSB or -1 LSB as the input to the comparator. The SAR process generates conversion codes, which are compared against +/-1 for defect detection. We achieved 100% coverage with a short fault coverage simulation time.



Figure 1. Example structure of a high-resolution comparator with one or more preamplifier stages (a) and an output latch stage (b) as used in a SAR ADC.

**Keywords:** Hard and soft defect detection, analog defect coverage, defect diagnosis and localization, in-field test

#### INDUSTRY INTERACTIONS

IBM, Intel, NXP, Richtek, Siemens, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] M. Saikiran, et al., "Low-cost defect simulation framework for analog and mixed-signal (AMS) circuits with enhanced time-efficiency," J. Analog Int Circ Sig Proc.

[2] M. Sekyere, et al., "A Power Supply Rejection Based Approach for Robust Defect Detection," 2023 IEEE East-West Design & Test Symposium, Batumi, Georgia.

[3] M. Saikiran, et al., "Digital Assisted Defect Detection Methods for AMS Circuits: An Overview," 2023 IEEE East-West Design & Test Symposium, Batumi, Georgia.

[4] M. Saikiran, et al., "Graph Theory Based Defect Simulation Framework for AMS Circuits with Improved Time-Efficiency," 2023 IEEE East-West Design & Test Symposium, Batumi, Georgia.

In this project, we propose a data-driven anomaly detection framework for AMS Functional Safety (FuSa) in automotive systems. Over the past year, the unsupervised learning framework was enhanced by incorporating novel feature and signal selection algorithms, an explainable AI technique to improve model transparency, and analysis at higher abstraction levels.

#### **TECHNICAL APPROACH**

In this research, we propose a novel anomaly detection strategy that involves: (1) a genetic algorithm-based feature selection approach, (2) a novel signal selection algorithm that ascertains the best intermediate circuit signal for furnishing enhanced anomaly detection accuracy, while reducing the associated detection latency, and (3) an explainable AI (XAI)-based framework that boosts user interpretability and transparency of the anomaly detection framework. Next, we adopt a systematic methodology for anomaly abstraction using OpAmp circuits. We construct k-stage non-inverting amplifiers and introduce anomalous variations in various locations. This in-depth examination supports the development of effective anomaly detection and mitigation strategies.

#### SUMMARY OF RESULTS

The overview of the proposed FuSa violation detection framework is illustrated in Fig. 1. We demonstrate a proof of concept (PoC) with two case studies to evaluate the proposed feature and signal selection frameworks: (1) a bandgap voltage reference ($V_{Ref}$) circuit and (2) an operational amplifier (OpAmp) circuit [1]. The results show that the proposed feature selection approach improves anomaly detection performance by up to 7.2% over the baseline technique for $V_{Ref}$ and 6.5% over the baseline for OpAmp circuits, supporting its effectiveness. Furthermore, our signal selection approaches provided a 2.3x reduction in detection latency. This is particularly important because FuSa anomalies must be detected early, before they lead to a failure. The findings of our XAI-based approach, using an Explainable Boosting Machine, are beneficial in classifying anomalous behavior and can be used as feedback during the circuit design phase [1].

The anomaly abstraction experiments were performed on three amplifier configurations: single-stage, dual-stage, and tri-stage amplifiers, each built from a base operational amplifier circuit. Anomalies were introduced in a k-stage amplifier by simulating anomalous operation of the base OpAmp symbol at various locations within the amplifier. The anomaly detection algorithms (GMM, K-means, BIRCH, spectral) were evaluated based on their anomaly detection accuracy across experiments at the different abstractions. Notably, the GMM (Gaussian Mixture Model) algorithm consistently demonstrated superior detection performance across all three configurations, achieving 80% accuracy in the single-stage amplifier, 90.8% in the dual-stage amplifier, and up to 100% in the tri-stage amplifier. Similarly, the k-means clustering algorithm exhibited competitive performance, with detection accuracies of 78.75%, 87.5%, and 96.2% in the respective configurations. BIRCH and Spectral Clustering deliver slightly lower performance. These findings support the proposed solution's efficacy in enhancing the functional safety of automotive AMS circuits.

We have identified the following action items as our deliverables for this year: (1) perform abstraction from component to SoC level and (2) generate standardized datasets for FuSa research.



Figure 1. Illustrative example of the proposed framework in a real-world scenario.

**Keywords:** Functional Safety, AMS Circuits, Feature Selection, Signal Selection, Explainable AI

#### INDUSTRY INTERACTIONS

Intel, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] A. Arunachalam, et al., "Enhanced ML-based Approach for Functional Safety Improvement in Automotive AMS Circuits," 2023 IEEE ITC, Anaheim, CA.

#### TASK 2810.089, TECHNIQUES FOR LOW-COST DESIGN, TEST, AND CALIBRATION OF RF MIMO SYSTEMS SULE OZEV, ARIZONA STATE UNIVERSITY, SULE.OZEV@ASU.EDU GEORGE TRICHOPOULOS, ARIZONA STATE UNIVERSITY

#### SIGNIFICANCE AND OBJECTIVES

This project aims to lower the overall production cost of RF MIMO systems by developing techniques for antenna/radome design, design and judicious insertion of built-in self-test (BIST) and calibration components, and system-level test development without requiring far-field testing.

#### **TECHNICAL APPROACH**

The proposed approach includes two optimization phases. In Phase 1, we will develop a system-level model that includes imperfections of the RF front-end as well as the antenna and radome. This model will be used to determine what information can be extracted by using only mission-mode signals and what information is needed to achieve the calibration levels that satisfy system-level requirements. Mismatches between antenna elements including RX and TX stem from imperfections in hardware elements due to process variations in path lengths, metal thickness, and mismatches in active components. These are calibrated post-production using a limited number of measurements.

#### SUMMARY OF RESULTS

We demonstrated and analyzed the proposed technique in hardware, using the cascaded mm-Wave radar evaluation board from Texas Instruments. Since errors are higher at fine range resolution, we used the finest range resolution setting (6 cm) for all experiments. One set of experiments was conducted in an anechoic chamber, which is a controlled environment. Several experiments were conducted in parks and other public areas, which serve as uncontrolled environments. Fig. 1 shows the experimental setups in the anechoic chamber and in one of the uncontrolled environment settings. Corner reflectors are placed in fixed locations, whereas the radar board is rotated to emulate sweeping the angle of the object.

Angle of arrival estimation errors for the experiments with and without correction are shown in Fig. 2. For the correction, the model parameters are calculated from software simulations. The results show that modeling angle of arrival estimation error via software simulation is representative of hardware measurements. In both controlled and uncontrolled environments, the angle of arrival estimation accuracy increases significantly with correction. Without correction, the angle of arrival estimation errors have a common shape in which the error increases at higher angles. With polynomial correction, a 3-4x reduction in the RMS error of angle of arrival estimation (RMS error taken over angles [-73°, 73°] with 1° steps) can be observed.
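The idea of a simulation-derived polynomial correction can be illustrated with a minimal sketch. Everything numeric here is an assumption made for illustration: the cubic error shape, the coefficient values, the polynomial degree, and the helper names `polyfit`, `polyval`, and `distorted` are hypothetical and are not taken from the project. The sketch fits the error polynomial on "simulated" data and applies it to "hardware" data whose (made-up) error differs slightly.

```python
import math

def polyfit(xs, ys, degree):
    """Least-squares polynomial fit (c[0] + c[1]*x + ...) via normal equations."""
    n = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = s / a[i][i]
    return coeffs

def polyval(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

def distorted(true_deg, a1, a3):
    """Hypothetical estimator whose error grows at higher angles."""
    x = true_deg / 90.0  # normalized angle keeps the fit well conditioned
    return true_deg + a1 * x + a3 * x ** 3

true_angles = list(range(-73, 74))  # [-73 deg, 73 deg] in 1 deg steps

# "Simulation": model the error as a polynomial in the estimated angle.
sim_est = [distorted(t, 0.5, 2.0) for t in true_angles]
sim_err = [e - t for e, t in zip(sim_est, true_angles)]
coeffs = polyfit([e / 90.0 for e in sim_est], sim_err, degree=3)

# "Hardware": made-up coefficients that differ slightly from the simulation.
hw_est = [distorted(t, 0.55, 1.9) for t in true_angles]
err_before = [e - t for e, t in zip(hw_est, true_angles)]
err_after = [e - polyval(coeffs, e / 90.0) - t for e, t in zip(hw_est, true_angles)]

rms_before = rms(err_before)
rms_after = rms(err_after)
print(f"RMS error before: {rms_before:.3f} deg, after: {rms_after:.3f} deg")
```

The improvement printed by this toy model is not comparable to the measured 3-4x figure; it only shows the mechanics of fitting on simulated data and correcting measured estimates.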





Figure 1. Experimental setup





**Keywords:** mm-wave, radar, 5G, BIST

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

#### MAJOR PAPERS/PATENTS

[1] F. C. Ataman, C. K. Y.B., S. Rao and S. Ozev, "Improving Angle of Arrival Estimation Accuracy for mm-Wave Radars," 2023 IEEE International Test Conference (ITC), Anaheim, CA, USA, 2023, pp. 30-36.

Bearing fault detection in fan motors via current signal analysis ensures operational continuity, reduces maintenance costs, enhances safety, and extends motor life. Objectives include early fault diagnosis, predictive maintenance, optimized operations, and improved system reliability, efficiency, and cost-effectiveness.

#### **TECHNICAL APPROACH**

This study uses motor current analysis to identify bearing faults in cooling fan motors, aiming to improve system reliability and prevent costly shutdowns or safety risks. It circumvents issues with traditional vibration-based diagnostics, which can cause damage and require sensor installation. Instead, we employ Motor Current Signature Analysis (MCSA), a noninvasive method, to detect these faults early, focusing particularly on mechanical defects.
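As a rough illustration of the frequency-domain side of MCSA, the sketch below uses the textbook relations for an outer-race bearing defect: the characteristic vibration frequency and the stator-current sidebands at |f_supply ± m·f_fault|. It is not the project's algorithm. The bearing geometry, shaft speed, sample rate, and signal amplitudes are all hypothetical values, and the function names are invented for this example.

```python
import math

def bpfo(n_balls, shaft_hz, ball_d, pitch_d, contact_angle_rad=0.0):
    """Textbook ball-pass frequency of the outer race (Hz)."""
    return 0.5 * n_balls * shaft_hz * (1.0 - (ball_d / pitch_d) * math.cos(contact_angle_rad))

def sideband_frequencies(supply_hz, fault_hz, max_order=2):
    """Current-spectrum frequencies |f_supply +/- m * f_fault| for m = 1..max_order."""
    freqs = set()
    for m in range(1, max_order + 1):
        freqs.add(abs(supply_hz + m * fault_hz))
        freqs.add(abs(supply_hz - m * fault_hz))
    return sorted(freqs)

def tone_amplitude(samples, sample_rate, freq_hz):
    """Amplitude of one spectral component via a single-frequency DFT."""
    n = len(samples)
    re = sum(x * math.cos(2 * math.pi * freq_hz * k / sample_rate) for k, x in enumerate(samples))
    im = sum(x * math.sin(2 * math.pi * freq_hz * k / sample_rate) for k, x in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

# Hypothetical motor and bearing parameters (assumptions for illustration only).
SUPPLY_HZ = 60.0
fault_hz = bpfo(n_balls=8, shaft_hz=25.0, ball_d=2.0, pitch_d=8.0)  # 75 Hz
candidates = sideband_frequencies(SUPPLY_HZ, fault_hz)              # 15, 90, 135, 210 Hz

# Synthetic one-second current record: fundamental plus a small 135 Hz sideband.
SAMPLE_RATE = 2000
current = [
    math.sin(2 * math.pi * SUPPLY_HZ * k / SAMPLE_RATE)
    + 0.02 * math.sin(2 * math.pi * 135.0 * k / SAMPLE_RATE)
    for k in range(SAMPLE_RATE)
]

for f in candidates:
    print(f"{f:6.1f} Hz: amplitude {tone_amplitude(current, SAMPLE_RATE, f):.4f}")
```

A practical detector would compare these sideband amplitudes against a healthy-motor baseline rather than against a fixed number.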

#### SUMMARY OF RESULTS

This project, initiated in 2022, has its progress detailed in Table 1. Tasks 1, 2, and 3 are completed on schedule, and we are currently focusing on Task 4.

**Task 1 & 2:** Modular test benches were created to analyze fan motor parameters, replicating common bearing faults for comparison with healthy counterparts. This effort produced a comprehensive training database and informed the development of detection algorithms. Time-domain, frequency-domain, and machine-learning algorithms were incorporated into two diagnostic frameworks: a low-computation flowchart and a complex machine-learning model. Both were evaluated and documented.

**Task 3:** We optimized the proposed algorithms (Fig. 1) to reduce external disturbances and false alarms, making them suitable for all small fan motors. The algorithms were implemented on MCUs and tested under various conditions. A detailed report on testing, optimization, and applicability has been provided.

**Task 4:** Currently in progress, we are developing a Remaining Useful Lifetime (RUL) deep learning structure (Fig. 2) to estimate RUL based on the motor current signal. The algorithm is now being tested under different conditions to evaluate its performance.



Figure 1. Proposed algorithm for EDGE Implementation.



Figure 2. Proposed deep learning for bearing RUL estimation.

Table 1. Timeline

| Timeline  | Year 1-2 | Year 3 | Year 4 |
|-----------|----------|--------|--------|
| Task 1, 2 |          |        |        |
| Task 3    |          |        |        |
| Task 4    |          |        |        |

**Keywords:** Fan motors, bearing fault, time domain analysis, machine learning, deep learning

#### INDUSTRY INTERACTIONS

DRV Team, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] C. Li, M. Afshar, and B. Akin, "Fault Detection in Small Fan Motors Using MCSA," 2023 IEEE International Electric Machines & Drives Conference (IEMDC), San Francisco, CA, USA, 2023, pp. 1-7.

[2] M. Afshar, C. Li and B. Akin, "Real-Time Current-Based Distributed Bearing Faults Detection in Small Cooling Fan Motors," in IEEE Transactions on Industry Applications, vol. 60, no. 2, pp. 3188-3199, March-April 2024.

#### TASK 2810.091, DEVELOPMENT OF TWO-PHOTON ABSORPTION LASER SYSTEM FOR CREATING SINGLE EVENT EFFECTS ROBERT BAUMANN, UNIVERSITY OF TEXAS AT DALLAS, ROBERT.BAUMANN@UTDALLAS.EDU MANUEL QUEVEDO-LOPEZ, UNIVERSITY OF TEXAS AT DALLAS

#### SIGNIFICANCE AND OBJECTIVES

The objectives are to create a two-photon-absorption laser pulse system that enables characterization of integrated circuit heavy-ion susceptibility without the need for a cyclotron, to optimize the system to mimic the transient charge disturbance produced by the passage of a heavy ion, and to create a sensitivity mapping capability for IC design debugging.

#### **TECHNICAL APPROACH**

A two-photon-absorption (TPA) laser for a single-event effects (SEE) characterization system with customized machine-vision and control software will be constructed using readily available off-the-shelf lasers, optics, and motion control hardware. Axicon optics will be used to transform the ellipsoid Gaussian beam profile into a long cylindrical Bessel beam profile to better emulate heavy-ion charge tracks. The team will work with TI on designing/obtaining process technology monitors (diodes with deepest implant/diffusion structures) and will correlate TPA laser pulse energy to equivalent heavy-ion linear energy transfer (LET) values. This should allow the correlation of the TPA laser to heavy ions.

#### SUMMARY OF RESULTS

The work of designing, assembling, and optimizing the optics and mechanics is ongoing. For silicon devices we plan to operate at 1260 nm, so that single photons cannot generate electron-hole pairs and the laser beam can propagate through the silicon without significant attenuation (the laser pulse is injected through the backside of the IC). The wavelength was also selected to align with the wavelength used by the US Naval Research Labs (primary developers of the TPA technique for capturing SEE responses) so that we can correlate our results with theirs. At the focus of the beam, the photon intensity is high enough that a large amount of TPA occurs, creating significant charge generation in the active device layers of the silicon IC. To mimic heavy-ion events, the pulse shape and energy must be controlled. This is achieved with in-line pulse metrology. A special prismatic lens (axicon) converts the Gaussian beam focal ellipsoid to a Bessel beam, which is a long cylinder along the primary axis. This is crucial in mimicking the charge geometry produced by a heavy ion, which is also a long cylinder. Dedicated process test structures from Texas Instruments will be characterized at the TAMU Cyclotron with actual heavy ions of various linear energy transfer (LET) values as a reference response of the silicon. These results will be correlated with TPA laser pulse injections spanning a range of injected pulse energies.
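A short calculation shows why the 1260-nm wavelength suits two-photon absorption in silicon. The sketch assumes a room-temperature silicon bandgap of about 1.12 eV; the constant and function names are chosen for this example only.

```python
PLANCK_EV_S = 4.135667696e-15     # Planck constant in eV*s
SPEED_OF_LIGHT_M_S = 299_792_458.0
SI_BANDGAP_EV = 1.12              # assumed silicon bandgap near 300 K

def photon_energy_ev(wavelength_nm):
    """Photon energy E = h*c / lambda, in electron-volts."""
    return PLANCK_EV_S * SPEED_OF_LIGHT_M_S / (wavelength_nm * 1e-9)

energy = photon_energy_ev(1260.0)
single_photon_absorbed = energy >= SI_BANDGAP_EV
two_photon_absorbed = 2.0 * energy >= SI_BANDGAP_EV

print(f"Photon energy at 1260 nm: {energy:.3f} eV")
print(f"One photon bridges the bandgap: {single_photon_absorbed}")
print(f"Two photons bridge the bandgap: {two_photon_absorbed}")
```

A single photon falls short of the bandgap, so the beam passes through the substrate largely unattenuated, while two photons absorbed together at the high-intensity focus exceed it and generate carriers.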

The basic TPA system has been assembled and is operational. Optics optimization has been achieved using COMSOL optical simulations. The main activity for the remainder of 2024 will be to complete the NIR and VIS imaging optical systems to allow visualization of VLSI target features on the IC during TPA injection. Currently, we are focused on using well-characterized PIN diode structures to capture the charge transients from both heavy ion events (performed at TAMU cyclotron) and TPA laser pulses. This is a physical correlation and will be aided by TCAD modeling. In 2025 we plan to focus on rapid alignment (to specific circuit structures) and throughput enhancement to enable faster evaluations. A simplified block diagram of the system is shown below.



Figure 1. Simplified block diagram of the TPA laser injection system for SEE Characterization.

**Keywords:** Harsh environments, Radiation effects, Single-event effects, Two-photon absorption laser, IC reliability

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

Safety-critical, high-volume, long-lifetime applications such as automotive have propelled interest in online ageing monitoring. Conventional methods that rely on sufficient design margins lead to large area and performance overhead, require complex tradeoffs for PT robustness, and are difficult to migrate across technologies. On-chip ageing monitoring can mitigate these problems and extend IC lifetime.

#### **TECHNICAL APPROACH**

Ageing is a random process and any single ageing monitor inevitably produces incomplete chip-health assessment. Different ageing mechanisms affect different circuit parameters differently. We propose to develop integrated ageing sensors measuring device parameters directly degraded by aging in its operating environment after deployment. The proposed ageing sensors will measure a large population of devices under test, thus enabling the study of the random nature of ageing. They also provide a means for tracking the evolution of aging effects' probability distributions over different use/stress conditions. Quantitative correlation of aging sensor measurements to the product's event log offers useful lifecycle information.

#### SUMMARY OF RESULTS

During the year, we investigated the physics, modeling, and EDA simulation of various ageing mechanisms. To address TI and NXP's needs, we extended our first design of a TDDB sensor to the cases of high-side MOS gate cap and/or vertical cap SILC (stress induced leakage current) detection. The new design allows monitoring of both NMOS and PMOS gate TDDB, as well as the TDDB of regular caps and vertical caps. In particular, it applies to the TDDB of caps in power circuits, which are stressed at higher voltages. The new design also consumes significantly less area and offers a large dynamic range with a linear transfer curve in log current vs digital codes. The TDDB sensor design was further extended to monitor an array of DUT SILC currents to account for the statistical nature of ageing. We have successfully taped out a design in the TSMC 180-nm process through MUSE. We are currently taking measurements, creating various stressing and ageing conditions, and evaluating the TDDB sensor. See Fig. 1 and its caption for some details.

For BTI ageing, we have developed two versions of the direct BTI effect to digital converters and shown how they can be used for online BTI monitoring in analog ICs. In addition, we have started working on an HCI ageing monitor design based on rebalancing a latch structure in which a DUT transistor suffers from HCI ageing.



Figure 1. Top-down left-right subplots show the top-level schematic of a TDDB sensor for fast measurement of SILC, MSB details for generating  $V_{comp}$ , LSB details for fine detection of SILC, SILC sensor's transfer curve showing large dynamic range and log-linear relationship, layout of the SILC sensor, and layout of the sensor with DUT array, digital control and pads.

**Keywords:** TDDB, NBTI/PBTI, HCI, online ageing monitor, random ageing

#### INDUSTRY INTERACTIONS

Texas Instruments, IBM, NXP, Intel, Mentor-Siemens

#### MAJOR PAPERS/PATENTS

[1] E. Darko, et al., "On-Chip Monitoring of TDDB ...," 2023 Int Integrated Reliability Workshop, S. Lake Tahoe, CA.

[2] K. Bhatheja, et al., "A BIST Approach to Approximate Co-Testing of Data Converters," IEEE Design & Test (2024).

[3] D. Adjei, et al., "An Improved Single-Temperature Trim Technique for Bandgaps," 2023 IEEE MWCAS, Phoenix, AZ.

[4] D. Adjei, et al., "A Resistor-less Precision Curvature-Compensated Bandgap Voltage Reference Based on the V...," 2023 IEEE ISCAS, Monterey, CA.

The objective of fault simulation is to estimate the fault coverage of a given test input. Existing fault simulation tools inject faults and analyze fault responses at the transistor level. However, extending fault simulation to large circuits can be difficult because of the long run times of such detailed simulations.

#### **TECHNICAL APPROACH**

This work aims to bridge the gap between transistor-level fault simulations, where the fault models are more accurate, and behavioral simulations, which are faster and feasible for larger circuits. The circuit is partitioned into its building blocks, as guided by its behavioral model. Transistor-level fault simulations are conducted on the smaller building blocks. For each building block, fault behavior is captured through additional circuit components and functional transformations that are added to the behavioral model.

#### SUMMARY OF RESULTS

We apply our hierarchical fault modeling approach to two experimental circuits, a flash ADC and a PLL. We compare the accuracy of the fault model at the system level (using MATLAB) as well as the simulation time for both circuits. To enable comparisons within a reasonable time, transistor-level fault simulations are compared between the MATLAB and SPICE simulations for simpler circuits. For the ADC, we use a 3-bit architecture to simulate fully at the transistor level. For the PLL, we use a low divide ratio (N=8) to enable transistor-level simulations for faults that are not catastrophic. Both circuits are implemented in a 65-nm CMOS TSMC process.

At the comparator level, fault response similarities are evaluated as Dynamic Time Warping or Euclidean Distance metrics. The thresholds are adjusted per fault based on the difference between faulty and fault-free responses. For the 3-bit ADC, there are 672 faults to simulate. The ADC is simulated for its static parameters, including DNL, INL, gain error, and offset error. Hierarchical fault simulations and full transistor-level fault simulations are in agreement with the maximum error for hierarchical fault simulation being less than 1% of 1 LSB. As an example, Fig. 1 shows the simulation results for a randomly selected fault in Comparator 3.
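The two similarity metrics named above can be sketched as follows. This is a generic textbook implementation, not the project's tool flow. The waveforms, the threshold value, and the function names are hypothetical and serve only to show how a time-shifted response can match under Dynamic Time Warping while differing under a sample-by-sample Euclidean comparison.

```python
import math

def dtw_distance(a, b):
    """Classic dynamic-time-warping distance with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, or match
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

def euclidean_distance(a, b):
    """Sample-by-sample Euclidean distance between equal-length responses."""
    if len(a) != len(b):
        raise ValueError("responses must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def is_similar(distance, threshold):
    """Per-fault threshold test; the threshold value is chosen by the user."""
    return distance <= threshold

# Hypothetical responses: the second is the first delayed by one sample.
template = [0.0, 1.0, 2.0, 1.0, 0.0]
delayed = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]

dtw = dtw_distance(template, delayed)
euc = euclidean_distance(template, delayed[:5])
print(f"DTW distance: {dtw}, Euclidean distance: {euc}")
print("DTW match:", is_similar(dtw, threshold=0.5))
```

Dynamic Time Warping tolerates timing shifts between two responses, whereas the Euclidean metric penalizes them, which is one reason to keep both available when thresholds are set per fault.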

Table 1 compares the computation times between the hierarchical and full transistor-level simulations for three circuits: a 3-bit flash ADC, an 8-bit flash ADC, and a PLL. Computation times for the hierarchical simulations (HL) include transistor-level simulations for each block, model fitting time, and behavioral simulations, whereas computation time for full transistor-level simulations (SP) includes only the SPICE simulation time. The time savings for the hierarchical simulations increase with the increasing complexity of the circuits.



Figure 1. MATLAB and Spice simulation results for a randomly selected fault in Comparator 3.

Table 1. Fault simulation time comparison.

|              | 3-b ADC | 8-b ADC | PLL  |
|--------------|---------|---------|------|
| Spice        | 14s     | 30min   | 17hr |
| Hierarchical | 10s     | 15min   | 4min |
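The speed-up implied by each column of the table can be worked out directly. The short script below only converts the reported times to seconds and divides them; the helper name `to_seconds` is introduced for this example.

```python
UNIT_SECONDS = {"s": 1.0, "min": 60.0, "hr": 3600.0}

def to_seconds(text):
    """Convert strings such as '14s', '30min', or '17hr' to seconds."""
    for unit in ("min", "hr", "s"):
        if text.endswith(unit):
            return float(text[: -len(unit)]) * UNIT_SECONDS[unit]
    raise ValueError(f"unknown time format: {text}")

# (SPICE time, hierarchical time) as reported in the table
reported = {
    "3-b ADC": ("14s", "10s"),
    "8-b ADC": ("30min", "15min"),
    "PLL": ("17hr", "4min"),
}

speedup = {
    name: to_seconds(spice) / to_seconds(hier)
    for name, (spice, hier) in reported.items()
}

for name, factor in speedup.items():
    print(f"{name}: {factor:.1f}x faster with hierarchical simulation")
```

The ratios grow from 1.4x for the 3-bit ADC to 2x for the 8-bit ADC and 255x for the PLL, consistent with the statement that savings increase with circuit complexity.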

**Keywords:** mixed-signal, fault simulations, hierarchical modeling, HL simulations

#### INDUSTRY INTERACTIONS

Intel, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] Tolga Aksoy, Nikhil Sagar Modala, Lakshmanan Balasubramanian, Rubin Parekhji, Sule Ozev, "Hierarchical Fault Simulation for Mixed-Signal Circuits Using Template Based Fault Response Modeling," IEEE ETS, 2024.

The research will develop efficient machine learning-assisted design-for-testability and built-in self-test mechanisms for AMS circuits and systems that are scalable across diverse circuit types and device specifications and can be implemented without the need to incorporate oscilloscopes and other complex test instruments on-chip.

#### TECHNICAL APPROACH

The DfT/BiST approach can be formulated in four key steps: (a) test setup and calibration, (b) application of optimized tests to the DUT using on-chip hardware resources, (c) acquisition of the DUT test response using built-in sensors that are optimized for testing accuracy and (d) on-chip ML-assisted test response analyzers for DUT characterization. In (b), the test stimulus is optimized to maximize the accuracy with which good vs. bad devices can be classified under multi-parametric process variations. In (c), the response of the DUT is acquired using a *minimal number of strategically placed built-in sensors and digitization circuitry*.

#### SUMMARY OF RESULTS

Prevalent specification-based AMS testing techniques require the use of complex test circuits or regressors that are difficult to implement on-chip. We overcome these limitations in our proposed approach, Outlier Oriented Alternative Test and Tuning (OATT) which maximizes the number and magnitude of the statistical principal components (PCA) of the time-domain DUT test response vectors across diverse manufacturing process corners. This allows the construction of a multi-dimensional Gaussian probability density model that characterizes the distribution of DUT responses in the principal components domain. Outliers of this probability density model are classified as defective devices using calibrated confidence ellipses, implicitly detecting devices with parametric as well as hard defects. The embedded DUT response is acquired using coherent undersampling and does not require explicit signal reconstruction. Post-manufacture tuning is performed by minimizing the statistical distance of the DUT response in the PCA domain from the nominal Gaussian model using multi-arm bandit reinforcement learning. Simulation results demonstrate the viability and promise of the proposed approach.

Fig. 1 gives an overview of the test generation algorithm. The stimulus maximizes the strengths of the principal components of the DUT responses across diverse process corners using a genetic optimization algorithm. During testing, the response of the DUT is mapped to the transformed PCA space for pass-fail analysis. If the mapped response falls outside the pdf of nominal DUT responses by more than a threshold, the DUT is classified as bad. Otherwise, the DUT is classified as good.
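The pass-fail decision in the principal-components domain can be sketched with a two-dimensional Gaussian model and a confidence ellipse. This is a simplified illustration, not the OATT implementation: it assumes the DUT responses have already been projected onto two principal components, and the nominal data points, the 99% confidence level, and all function names are hypothetical.

```python
import math

def fit_gaussian_2d(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

def ellipse_threshold(confidence):
    """For a 2-D Gaussian, P(d^2 <= t) = 1 - exp(-t / 2), so t = -2 ln(1 - p)."""
    return -2.0 * math.log(1.0 - confidence)

def is_outlier(point, mean, cov, confidence=0.99):
    return mahalanobis_sq(point, mean, cov) > ellipse_threshold(confidence)

# Hypothetical nominal responses in the space of two principal components.
nominal = [(math.cos(k), 0.5 * math.sin(k)) for k in range(50)]
mean, cov = fit_gaussian_2d(nominal)

good_device = (0.1, 0.0)   # made-up response close to the nominal cluster
bad_device = (5.0, 5.0)    # made-up response far from the nominal cluster
print("good device flagged:", is_outlier(good_device, mean, cov))
print("bad device flagged:", is_outlier(bad_device, mean, cov))
```

With more than two retained components the same test applies, using the full covariance matrix and the chi-square quantile for the corresponding number of degrees of freedom.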



Figure 1. PCA-based test generation algorithm.

Before test optimization

|               | Good (Classify) | Bad (Classify) |
|---------------|-----------------|----------------|
| Good (Actual) | 131 (32.75%)    | 69 (17.25%)    |
| Bad (Actual)  | 103 (25.75%)    | 97 (24.25%)    |

After test optimization

|               | Good (Classify) | Bad (Classify) |
|---------------|-----------------|----------------|
| Good (Actual) | 184 (46%)       | 16 (4%)        |
| Bad (Actual)  | 3 (0.75%)       | 197 (49.25%)   |

The classification results for 200 good and 200 bad LNA devices before and after test optimization are shown above, demonstrating the efficacy of the OATT technique.

**Keywords:** Test, Analog, Tuning, Outlier, Defects

#### INDUSTRY INTERACTIONS

Intel, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] S. Komarraju, A. Tammana, C. Amarnath, and A. Chatterjee, "OATT: Outlier Oriented Alternative Testing and Post-Manufacture Tuning of Mixed-Signal/RF Circuits and Systems," 2023 IEEE International Test Conference.

[2] S. Komarraju, M. Mejri, A. Tammana, G. Dharmaraj, C. Amarnath and A. Chatterjee, "AMS Test Stimulus Generation and Response Analysis Using Hyperdimensional Clustering: Minimizing Misclassification Rate," 2024 IEEE European Test Symposium.

ALI NIKNEJAD, UNIVERSITY OF CALIFORNIA AT BERKELEY, NIKNEJAD@BERKELEY.EDU

#### SIGNIFICANCE AND OBJECTIVES

A fully integrated sub-THz two-dimensional imager will enable real-time studying of biological samples, such as mammalian cells, in a label-free fashion. The noninvasiveness and the complete integration of all the necessary components for monitoring live biological samples make this platform particularly suitable for testing the efficacy of drugs and stimuli.

#### **TECHNICAL APPROACH**

To achieve the requisite resolution for imaging mammalian cells, which are a few to tens of microns in diameter, the imaging pixel dimensions and the array compactness are of paramount importance. Additionally, the high-sensitivity requirements for detecting the type or state of the cells within a heterogeneous sample point toward high-frequency resonance-based sensing structures. To this end, a dense array of 200-GHz Split-Ring Resonators (SRR) was designed to serve as the imaging area in our proposed platform. These SRR pixels are excited using an on-chip source, and each pixel is individually measured using a phase detector.

#### SUMMARY OF RESULTS

A 20x10 array was designed in bulk CMOS 28-nm technology. The top-level block diagram of this imager is illustrated in Figure 1. Each row has its own 200-GHz source, the output of which is split into two paths, LO and RF, through a pair of coupled transmission lines. The LO signal is routed directly to the phase detector, while the RF signal travels through the sensing pixels before reaching the RF port of the phase detector. The differential outputs of the phase detectors go to a 10:1 multiplexer, which allows the use of a single amplifier for processing the baseband signal. Moreover, periodic switching of the resonators between a sensing mode and a reference mode allows for the implementation of correlated double-sampling signal processing, which drastically suppresses the low-frequency noise of the phase detector and subsequent stages in addition to other environmental drifts.
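The effect of correlated double sampling can be illustrated with a toy model. The sketch assumes a linear phase-detector response near quadrature, a reference mode that contributes no sample-induced phase shift, and a slowly drifting offset; the gain, offset, drift rate, and sampling interval are hypothetical numbers, and the function names are invented for this example.

```python
def detector_output(phase_shift_rad, time_s, gain_v_per_rad, offset_v, drift_v_per_s):
    """Toy phase-detector model: linear gain plus a slowly drifting offset."""
    return gain_v_per_rad * phase_shift_rad + offset_v + drift_v_per_s * time_s

def correlated_double_sample(sense_v, reference_v):
    """Subtract the reference-mode sample from the sensing-mode sample."""
    return sense_v - reference_v

# Hypothetical values chosen only to show the cancellation.
GAIN = 0.2          # V/rad
OFFSET = 0.050      # V, static DC offset
DRIFT = 0.001       # V/s, slow environmental drift
PHASE = 0.01        # rad, phase shift caused by the sample
SWITCH_PERIOD = 1e-3  # s between the sensing and reference samples

sense = detector_output(PHASE, 0.0, GAIN, OFFSET, DRIFT)
reference = detector_output(0.0, SWITCH_PERIOD, GAIN, OFFSET, DRIFT)

raw_error = sense - GAIN * PHASE
cds_output = correlated_double_sample(sense, reference)
cds_error = cds_output - GAIN * PHASE

print(f"raw output error: {raw_error * 1e3:.3f} mV")
print(f"CDS output error: {cds_error * 1e6:.3f} uV")
```

The static offset cancels exactly, and the drift contributes only the change accumulated between the two samples, which is why faster switching gives stronger suppression of slow disturbances.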

Phase detection requires the LO and RF signals to be in quadrature to avoid DC offsets and saturation of the baseband. The quadrature relationship is achieved through careful EM modeling of all the passive components and the sensing resonators.
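The role of quadrature and of correlated double sampling can be illustrated with a minimal sketch. It assumes an idealized multiplying phase detector whose baseband output is proportional to the cosine of the LO/RF phase difference; the gain, the 0.3 offset, and the 1-degree sample-induced phase shift are hypothetical values chosen only for illustration and are not taken from the measurements.

```python
import math

def phase_detector(delta_phi_rad, gain=1.0, offset=0.0):
    """Idealized phase detector: output = gain * cos(phase difference) + offset."""
    return gain * math.cos(delta_phi_rad) + offset

def cds_readout(pixel_shift_rad, static_phase_rad, offset):
    """Correlated double sampling: sensing-mode sample minus reference-mode sample.

    The static offset appears in both samples and cancels in the difference.
    """
    sensing = phase_detector(static_phase_rad + pixel_shift_rad, offset=offset)
    reference = phase_detector(static_phase_rad, offset=offset)
    return sensing - reference

shift = math.radians(1.0)                          # hypothetical pixel phase shift
quad = cds_readout(shift, math.pi / 2, offset=0.3)  # LO and RF in quadrature
in_phase = cds_readout(shift, 0.0, offset=0.3)      # LO and RF in phase
dc_quad = phase_detector(math.pi / 2)               # DC level at quadrature
dc_in_phase = phase_detector(0.0)                   # DC level when in phase

print(f"CDS output at quadrature: {quad:.6f}")
print(f"CDS output in phase:      {in_phase:.6f}")
```

In this toy model the quadrature operating point gives a near-zero DC level and a response that is linear in small phase shifts, while the in-phase operating point sits at the maximum DC level and responds only to second order.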

The initial measurements of our first prototype demonstrate the full functionality of the imager. The resonance response of each pixel was measured. The functionality of the correlated double sampling scheme was verified and proved to be crucial in removing the DC offset of the pixels. Pixel gain errors due to process variations are corrected by adjusting the quality factors of the SRRs. Figure 2 depicts the image taken of a sample, a small piece of PDMS (polydimethylsiloxane), which confirms the functionality of the imager. Going forward, we plan to make measurements with live tissue samples, which require temperature control and long-term stability.



Figure 1. The imager top-level diagram for a 20x10 array.



Figure 2. The image of a PDMS sample.

**Keywords:** subTHz, permittivity sensor, near-field sensor, biosensor, imager

#### INDUSTRY INTERACTIONS

IBM, NXP, Texas Instruments

The semiconductor industry has entered the age of AI and IoT, with forecasts soon exceeding 25 billion such devices. Software-based AI systems are speed-limited and power-hungry, while analog hardware accelerators offer great potential. IoT and AI hardware accelerators all require many embedded, reliable data converters, creating a strong need for ultra-small, low-power, self-healing DACs.

#### **TECHNICAL APPROACH**

Segmented DAC architectures that are intrinsically suitable for ultra-small area design with low power consumption are being investigated. Redundancy that ensures, with sufficient confidence and yield, that no un-calibratable errors exist is combined with practical low-cost BIST methods for accurate identification of DAC mismatch errors and other nonlinearities. Low-cost on-chip calibration will be used to reduce all recoverable DAC errors to noise levels. MC and PVT studies will be used to evaluate yield and robustness. A fabricated test chip will be measured to demonstrate the performance density potential of the proposed concepts.

#### SUMMARY OF RESULTS

During the last year, several new segmented DAC architectures using novel redundancy to avoid non-recoverable errors were developed. Practical DAC BIST algorithms and on-chip calibration strategies that achieve high linearity while using ultra-small areas for match-critical devices were also developed. Three different ultra-small area DACs were taped out: a sub-radix segmented voltage-mode R-2R DAC, a three-segment resistor string DAC with redundancy, and a MOSFET-only R-2R DAC with redundancy. At the request of SRC liaisons, a sub-radix capacitive DAC structure with redundancy was also developed. An ADC-DAC co-test algorithm was developed that accurately estimates all bit weights, including the redundant bit. The test results are then used to provide digital calibration of the full DAC, achieving the desired linearity. Extensive Monte Carlo simulations show that with only 7-bit analog matching, the proposed DACs can avoid unrecoverable errors with high yields while providing post-calibration linearity at the 14-bit level.
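The way redundancy makes mismatch errors recoverable can be sketched as follows. This is a minimal illustration, not the project's calibration algorithm: it assumes a sub-radix DAC with a hypothetical radix of 1.8, a hypothetical 1% weight mismatch, and bit weights that are already known (for example from a BIST step). Because every weight is no larger than the sum of all smaller weights plus the smallest weight, the transfer curve has no gaps, and a greedy digital search lands within one smallest step of any target.

```python
import random

def make_weights(n_bits=14, radix=1.8, mismatch=0.01, seed=0):
    """Hypothetical sub-radix bit weights (LSB first) with random mismatch."""
    rng = random.Random(seed)
    return [radix ** k * (1 + rng.uniform(-mismatch, mismatch)) for k in range(n_bits)]

def has_no_gaps(weights):
    """Redundancy condition: each weight <= sum of smaller weights + smallest weight."""
    total = 0.0
    for k, w in enumerate(weights):
        if k > 0 and w > total + weights[0]:
            return False
        total += w
    return True

def calibrated_code(target, weights):
    """Greedy MSB-first search with the measured weights; returns (bits, residual)."""
    bits = [0] * len(weights)
    residual = target
    for k in reversed(range(len(weights))):
        if residual >= weights[k]:
            bits[k] = 1
            residual -= weights[k]
    return bits, residual

weights = make_weights()
full_scale = sum(weights)
worst = max(calibrated_code(full_scale * i / 4096, weights)[1] for i in range(4096))
print(f"gap-free: {has_no_gaps(weights)}, worst residual / smallest weight: {worst / weights[0]:.3f}")
```

A plain radix-2 design with the same mismatch can violate the gap-free condition, in which case some output levels cannot be reached by any code and no digital calibration can recover them.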

Figure 1 shows the schematic of a voltage-mode R-2R DAC with the proposed redundancy bit in blue. Resistors are realized with minimum-sized library resistors with 7-bit matching. Figure 2 (left) shows the DAC transfer curve before calibration, with nonlinearities visible to the naked eye, while Figure 2 (lower right) shows 200 MC runs in which the post-calibration INL is always less than 1 LSB. Figure 3 shows the layout of the fabricated 14-bit DAC with an ultra-small area of $96\,\mu m \times 167\,\mu m$.







Figure 2. MC simulation shows <1 LSB INL after calibration



Figure 3. Layout of a fabricated 14-bit DAC with ultra-small area

**Keywords:** ultra-small area and low power DAC, performance density, BIST, on-chip self-calibration

#### INDUSTRY INTERACTIONS

NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

 K Bhatheja, et al, "A BIST Approach to Approximate Co-Testing...," IEEE Design & Test (2024)
 M Sekyere, et al, "Ultra-Small Area, Highly Linear Sub-Radix R-2R ...," 2023 IEEE MWCAS, Phoenix, AZ
 I Bruce, et al, "Small Area, High Accuracy Sub-Radix Resistive...," 2023 IEEE MWCAS, Phoenix, AZ.

GaN power semiconductors are used in high-density, high-efficiency applications. Given the JEDEC JEP 180 standard, further investigation is needed on the device's repetitive transient withstand capability. Additionally, developing online and in-situ condition monitoring techniques for early warning of device degradation is essential.

#### **TECHNICAL APPROACH**

The research began with the development of a scalable dynamic high-temperature operating life (DHTOL) test setup for evaluating the transient reliability of GaN product test vehicles. This setup uses localized heating to ensure high-temperature conditions for GaN devices, preventing unrelated failures. It includes onboard dynamic resistance characterization circuits per JEP 173 standards. The test bench will be used in subsequent stages to characterize various commercially available GaN devices.

#### SUMMARY OF RESULTS

A prototype of the highly scalable DHTOL test bench, targeting a product-level power rating of 70-150 watts, has been developed. The architecture of the test bench adopts a motherboard and daughter card arrangement. The motherboard injects power into each daughter card, which forms the product-level test vehicle.

The product test vehicles developed are based on a Quasi-Resonant (QR) flyback converter (Fig. 1). This topology was chosen not only because it is one of the most widely preferred in the targeted power range, but also because it allows for leveraging the unique transient withstand capability of GaN HEMTs. In each daughter card, the power is made to recirculate within itself between the input capacitor and the flyback inductor. This minimizes power wastage and eliminates the need for large load dump systems, thus achieving scalability in terms of size, efficiency and cost.



Figure 1. Circuit schematic of the test vehicle.

The on-state voltage and current samples of the DUT are acquired using discrete ADCs to estimate the dynamic on-state resistance. The captured digital samples are converted back to analog values through separate DACs, allowing easy acquisition with an external multimeter throughout the test duration. In addition, the daughter cards incorporate resistive heaters, which are used for localized closed-loop heating of the DUTs. This allows precise and controlled temperature regulation as required by the test.
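A minimal sketch of how a dynamic on-state resistance could be estimated from paired voltage and current samples is shown below. The estimator (a least-squares line through the origin), the `min_current` gating threshold, and the sample values are assumptions for illustration; the report does not specify the processing used on the test bench.

```python
def dynamic_rdson(v_samples, i_samples, min_current=0.1):
    """Estimate on-state resistance as the least-squares slope of v versus i.

    Samples with current below min_current (hypothetical threshold, in amperes)
    are ignored so that off-state or near-zero-current points do not dominate.
    """
    pairs = [(v, i) for v, i in zip(v_samples, i_samples) if i >= min_current]
    if not pairs:
        raise ValueError("no on-state samples above the current threshold")
    numerator = sum(v * i for v, i in pairs)
    denominator = sum(i * i for _, i in pairs)
    return numerator / denominator

# Hypothetical on-state samples consistent with a 150-milliohm device.
currents = [0.5, 1.0, 1.5, 2.0]
voltages = [0.075, 0.150, 0.225, 0.300]
r_on = dynamic_rdson(voltages, currents)
print(f"estimated dynamic R_on: {r_on * 1e3:.1f} mOhm")
```

Tracking this estimate over the test duration is one way to observe a drift in dynamic resistance as an early indicator of degradation.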

A test bench housing five separate test vehicles has been developed (Fig. 2). The test bench is currently testing GaN HEMTs from multiple vendors under different transient conditions, each for 1000 hours.



Figure 2. Dynamic high-temperature operating life test setup.

**Keywords:** GaN HEMT, reliability, DHTOL, dynamic resistance, device degradation

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

The research goal of this project is to create a holistic approach to provide physical-layer security and spectral efficiency for energy-constrained wireless communication technologies. We propose using information-centric algorithms in the design of secure RF systems for achieving high spectrum utilization with low energy.

#### **TECHNICAL APPROACH**

We introduce a novel constellation projection scheme in which we transmit symbol projections along distinct basis vectors via separate antennas. By randomizing the selection of basis vectors, we induce scrambling in all directions except the broadside, while ensuring the intended receiver perceives an unaltered constellation. Furthermore, we are designing a low-power modulo sampling technique to achieve a wideband spectrum sensing with a high instantaneous dynamic range (DR).

#### SUMMARY OF RESULTS

In our approach to physical-layer wireless security, we utilize orthogonal projections of symbols distributed across various antenna elements. In a static AWGN channel scenario with steering vectors, these projections undergo diverse phase shifts at Eve's locations, resulting in a distorted constellation pattern. However, in Bob's position, these phase shifts align, contributing constructively to reconstructing the original symbol. To bolster security, we implement randomized changes to the projection basis vectors over time, as depicted in Fig. 1, adding additional protection against eavesdropping.
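The projection idea can be sketched with a noise-free, two-antenna toy model. The half-wavelength antenna spacing, the choice of an orthonormal basis rotated by a random angle `theta`, and the function names are assumptions made for this illustration; the actual array size and basis selection rule are not given in the text.

```python
import cmath
import math
import random

def project(symbol, theta):
    """Split a complex symbol into components along e^{j*theta} and j*e^{j*theta}."""
    u1 = cmath.exp(1j * theta)
    u2 = 1j * u1
    c1 = (symbol * u1.conjugate()).real   # coordinate along u1
    c2 = (symbol * u2.conjugate()).real   # coordinate along u2
    return c1 * u1, c2 * u2               # the two parts sum to the symbol

def received(symbol, theta, angle_deg, spacing_wavelengths=0.5):
    """Noise-free sum observed at angle_deg from broadside (two antennas)."""
    p1, p2 = project(symbol, theta)
    psi = 2 * math.pi * spacing_wavelengths * math.sin(math.radians(angle_deg))
    return p1 + p2 * cmath.exp(1j * psi)

rng = random.Random(1)
qam16 = [complex(i, q) for i in (-3, -1, 1, 3) for q in (-3, -1, 1, 3)]

# Bob at broadside: the projections add back to the original symbol for any basis.
broadside_error = max(abs(received(s, rng.uniform(0, 2 * math.pi), 0.0) - s) for s in qam16)

# Eve at 30 degrees: the same symbol maps to different points as the basis changes.
eve_a = received(3 + 1j, 0.3, 30.0)
eve_b = received(3 + 1j, 1.1, 30.0)
print(broadside_error, eve_a, eve_b)
```

In this model the receiver at broadside needs no knowledge of the basis, which is consistent with the claim that Bob incurs no additional processing overhead.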



Figure 1. Alteration of basis vectors for the 16-QAM constellation projection, introducing scrambling for security.



Figure 2. BER vs. Angle of Arrival for the 16-QAM constellation projection with change of basis (Bob and Eve have 10-dB SNR).

We performed Monte Carlo simulations to assess the bit error rate (BER). As illustrated in Fig. 2, the BER significantly degrades outside the broadside, primarily due to scrambling, effectively fortifying security against potential eavesdroppers in that area. Notably, our method ensures secrecy without the need for complex computational algorithms, and Bob does not incur any additional processing overhead.

Additionally, our sampler integrates modulo folding within a feedback loop while embedding a built-in anti-aliasing filter (Fig. 3), minimizing circuit complexity and power usage. Folding is initiated by comparator-detected threshold crossings through negative feedback. This low-power modulo sampler enhances the instantaneous DR.
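The benefit of modulo folding for dynamic range can be illustrated in software. The sketch below folds a signal into a limited range and then recovers it from the folded samples; the recovery rule (accumulating folded first differences) is a common textbook approach and is an assumption here, valid only when consecutive true samples differ by less than the threshold `lam`. It does not model the feedback loop or the anti-aliasing filter of the circuit.

```python
import math

def fold(x, lam):
    """Centered modulo: map x into the interval [-lam, lam)."""
    return (x + lam) % (2 * lam) - lam

def unfold(folded, lam):
    """Recover the signal, assuming |x[n] - x[n-1]| < lam and |x[0]| < lam."""
    out = [folded[0]]
    for prev, cur in zip(folded, folded[1:]):
        out.append(out[-1] + fold(cur - prev, lam))
    return out

lam = 1.0                                                        # folding threshold
signal = [5.0 * lam * math.sin(2 * math.pi * n / 200) for n in range(400)]
folded = [fold(x, lam) for x in signal]                          # stays within +/- lam
recovered = unfold(folded, lam)
max_error = max(abs(a - b) for a, b in zip(signal, recovered))
print(f"peak input: {max(signal):.2f}, peak folded: {max(map(abs, folded)):.2f}, error: {max_error:.2e}")
```

Here an input five times larger than the folding threshold is captured by a quantizer that only needs to cover the folded range.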



Figure 3. Modulo folding with an embedded anti-aliasing filter.

**Keywords:** physical-layer security, phased array, modulo sampling, wideband spectrum sensing, low power

#### INDUSTRY INTERACTIONS

MediaTek

# TASK 3160.028, 0.50° ANGULAR RESOLUTION SUBMILLIMETER ELECTROMAGNETIC WAVE RADAR IMAGING USING A 9-CM DIAMETER ELECTRONICALLY STEERABLE REFLECTOR KENNETH K. O, UNIVERSITY OF TEXAS AT DALLAS, K.K.O@UTDALLAS.EDU

#### SIGNIFICANCE AND OBJECTIVES

This research seeks to improve the bandwidth, noise figure, and output power of 410-GHz concurrent transmitter/receiver pixels, to demonstrate a Cassegrain reflector with a 9-cm diameter that can be electronically steered, and to work with the task for the focal plane array (FPA) of the pixels to implement an imaging system.

#### **TECHNICAL APPROACH**

To increase the range and the frame rate, the latter of which limits how far the noise bandwidth can be reduced, this task will investigate approaches to increase the transmitted output power and decrease the noise figure of individual pixels, as well as simultaneous activation of multiple pixels. To increase the bandwidth, the use of an E-patch antenna will be investigated, and the pixels will be frequency synchronized using a broadband injection locking technique. To increase the field of view (FoV) to +/- 45°, wafer-scale fabrication of a 9-cm diameter electronically steerable Cassegrain reflector employing reflect arrays will be investigated.

#### SUMMARY OF RESULTS

Fig. 1 (top) shows a conceptual diagram of imagers using a focal plane array (FPA) of $\lambda/2 \times \lambda/2$ concurrent transceiver pixels. A prototype using a 430-GHz 1x3 array of $\lambda/2 \times \lambda/2$ concurrent transceiver pixels and a 6-cm diameter Cassegrain reflector is shown in Fig. 1 (bottom). The pixels, each including an on-chip antenna, transmitter, and receiver, are fabricated in a 65-nm CMOS process and were used to image ~6 cm x 6 cm objects located 3 m away through fog with an angular resolution of ~0.7° [1]. Building on this demonstration, this task, working with the task of Prof. W. Choi of Seoul National University, seeks to demonstrate a 410-GHz system capable of imaging objects 30 to 50 m away with 0.5° angular resolution. The imager prototype will support a field of view (FoV) of 90° x 90°, a frame rate of 32 frames/s, and a depth resolution of 1.5 cm (equivalently, a bandwidth of 10 GHz), and will utilize a focal plane array consisting of a 3x3 array of 8x5-pixel integrated circuits. To increase the range, two approaches will be investigated: lowering the receiver noise figure by avoiding the use of diode-connected transistors directly connected to the oscillator core for coherent detection, and increasing the output power by including and optimizing embedding networks. To increase the bandwidth, an E-patch antenna will be utilized, and the pixels will be frequency synchronized using a broadband injection locking technique that is being researched by the task of Prof. Choi. The FPA will support only +/- 5° FoV. To increase the FoV to +/- 45°, a wafer-scale realization of a 9-cm diameter electronically steerable Cassegrain reflector using reflect arrays will be investigated.
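The quoted depth and angular resolution figures can be cross-checked with two standard first-order relations: range resolution $c/(2B)$ and a diffraction-limited beamwidth of roughly $\lambda/D$. The $\lambda/D$ estimate is a rule of thumb adopted here as an assumption; the exact value depends on the aperture illumination, which the text does not specify.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def range_resolution_m(bandwidth_hz):
    """Radar depth (range) resolution for a given swept bandwidth."""
    return C / (2.0 * bandwidth_hz)

def diffraction_angle_deg(freq_hz, aperture_m):
    """Rule-of-thumb angular resolution, lambda / D, in degrees."""
    wavelength = C / freq_hz
    return math.degrees(wavelength / aperture_m)

depth = range_resolution_m(10e9)                  # 10-GHz bandwidth
theta_target = diffraction_angle_deg(410e9, 0.09)  # 410 GHz, 9-cm reflector
theta_proto = diffraction_angle_deg(430e9, 0.06)   # 430 GHz, 6-cm reflector

print(f"depth resolution: {depth * 100:.2f} cm")
print(f"9-cm reflector at 410 GHz: {theta_target:.2f} deg")
print(f"6-cm reflector at 430 GHz: {theta_proto:.2f} deg")
```

Under these assumptions, the estimates come out close to the 1.5-cm depth resolution, the 0.5° target, and the ~0.7° prototype result stated above.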



Figure 1. (top) Conceptual diagram of imagers using an FPA of $\lambda/2 \times \lambda/2$ concurrent transceiver pixels, (bottom) prototype using a 430-GHz 1x3 array of $\lambda/2 \times \lambda/2$ concurrent transceiver pixels and a 6-cm diameter Cassegrain reflector.

**Keywords:** sub-mm wave, imaging, electronically steerable reflect array, CMOS.

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

#### MAJOR PAPERS/PATENTS

[1] Y. Zhu et al., "430-GHz CMOS Concurrent Transceiver Pixel Array for High Angular Resolution Reflection-Mode Active Imaging," 2022 IEEE International Solid-State Circuits Conference, February 2022.

# TASK 3160.029, ARRAYED TEXAS INSTRUMENTS DMD AND PLM FOR ADVANCED SOLID-STATE LIDAR AND HOLOGRAPHIC DISPLAY YUZURU TAKASHIMA, UNIVERSITY OF ARIZONA, YTAKASHIMA@OPTICS.ARIZONA.EDU YUSHI KANEDA, UNIVERSITY OF ARIZONA

#### SIGNIFICANCE AND OBJECTIVES

The Texas Instruments Phase Light Modulator (TI-PLM), which controls the phase of laser light, enables solid-state implementations of lidar and holographic 3D displays. The algorithm to display 3-dimensional point clouds and faceted objects has been completed for a holographic display. Simulation shows that an arrayed PLM improves the resolution of lidar images.

#### **TECHNICAL APPROACH**

**PLM for holographic display:** TI-PLM displays 2D and 3D images by CGH (Computer Generated Hologram). For applications such as automotive HUD (Head-Up Display), the computation time of CGH is critical, since the contents are updated within tens of milliseconds to sustain a high frame rate. We benchmark CGH calculation methods for 2D and 3D objects and display the results using TI-PLM.

**PLM for Lidar:** Tiling DMDs and PLMs is a cost-effective way to increase the resolution of scanning lidar as well as holographic displays. We benchmark the effect of tiling on the resolution of the lidar system based on CGH calculation and reconstruction.

#### SUMMARY OF RESULTS

**PLM for holographic display:** Through the SRC/TxACE-funded Task 2810.053, "TI PLM to Advanced Lidar and Display Systems," we demonstrated holographic point display with TI-PLM and extended it to 3D display using the Gerchberg-Saxton (GS) algorithm, multiple point clouds, and faceted 3D image generation, depicted in Figs. 1(a), (b), and (c), respectively. Figs. 2(b) and 2(c) show a 2-dimensional image of "U of A" generated by the GS algorithm and by point source multiplexing (PSM), respectively. Table 1 tabulates the computation time of CGH to display 2D and 3D images. While the image qualities are comparable, the GS algorithm outperforms PSM for 2D images. In contrast, PSM is substantially faster for 3D scenes made of many points, since the GS algorithm needs to be repeated to generate many 2D slices of a 3D object.
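For readers unfamiliar with the GS algorithm, the following is a minimal 1D toy version for a phase-only modulator. It is a sketch under stated assumptions, not the benchmarked implementation: it uses a naive discrete Fourier transform so that it needs only the standard library, a hypothetical 32-element hologram, and a target of two bright spots. Each iteration enforces unit amplitude in the hologram plane and the target amplitude in the image plane.

```python
import cmath
import math
import random

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for a small toy example)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n) for m in range(n))
        out.append(acc / n if inverse else acc)
    return out

def gerchberg_saxton(target_amp, iterations=20, seed=0):
    """Return (hologram phases, image-plane amplitude error per iteration)."""
    rng = random.Random(seed)
    n = len(target_amp)
    # Scale the target so its energy matches that of a unit-amplitude hologram.
    energy = sum(a * a for a in target_amp)
    target = [a * n / math.sqrt(energy) for a in target_amp]
    hologram = [cmath.exp(2j * math.pi * rng.random()) for _ in range(n)]
    errors = []
    for _ in range(iterations):
        image = dft(hologram)
        errors.append(math.sqrt(sum((abs(g) - a) ** 2 for g, a in zip(image, target))))
        # Image-plane constraint: keep the phase, impose the target amplitude.
        constrained = [a * cmath.exp(1j * cmath.phase(g)) for g, a in zip(image, target)]
        back = dft(constrained, inverse=True)
        # Hologram-plane constraint: phase-only modulation (unit amplitude).
        hologram = [cmath.exp(1j * cmath.phase(b)) for b in back]
    return [cmath.phase(h) for h in hologram], errors

target_amp = [0.0] * 32
target_amp[5] = target_amp[20] = 1.0      # two bright spots (hypothetical target)
phases, errors = gerchberg_saxton(target_amp)
print(f"error: first iteration {errors[0]:.2f}, last iteration {errors[-1]:.2f}")
```

Since each 2D hologram needs its own run of such an iterative loop, repeating it for hundreds of slices is consistent with the long 3D computation time reported for the GS algorithm in Table 1.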

**Arrayed PLM for lidar and display:** Tiling PLMs and DMDs increases the field of view (FOV) and the extent of the viewing zone of holographic displays. In addition, the larger area increases the measurable range of the lidar system. The feasibility of image generation from an arrayed PLM is confirmed in simulation (Fig. 3) and experimentally using a segmented PLM that emulates an arrayed PLM (Figs. 3(b) and (c)).



Figure 1. (a) 3D wire grid object (inset) displayed by TI-PLM through an AR image guide. The 3D image is superimposed on a see-through image. 3D images displayed as (b) point cloud, and (c) facet of a slice of a 3D object.



Figure 2. (a) 2D source of image. Displayed images with TI-PLM using the (b) GS algorithm, and (c) Point Source Multiplexing.

Table 1. Benchmarking of computational speed.

|          | GS Algorithm       | PSM                 |
|----------|--------------------|---------------------|
| 2D Image | 1.47s              | 9.86s (1763 points) |
| 3D Image | 7min26s (400 CGHs) | 9.57s (1600 points) |

Figure 3. (a) 2D source of an image, (b) area of PLM to display CGH, and (c) displayed images with TI-PLM.

Keywords: TI-PLM, holographic, 3D displays, lidar, HUD

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

#### MAJOR PAPERS/PATENTS

 R. Shrestha et al., "Digital Phase Conjugation by Texas Instruments Phase Light Modulator for Near-to-Eye Display," SPIE Photonics West Paper 12900-26 (2024).
 Y. Zhang et al., "Real-time 3D Objects Generation by MEMS Phase Light Modulator based on Camera Input for ADAS Applications," accepted for ODF 2024.

High-voltage (HV) systems in electric vehicles (EV) pose risks during emergencies, requiring efficient capacitor discharge for safety. This project aims to develop a swift, cost-effective electronic system to mitigate these hazards, enhancing safety for technicians and emergency responders during accidents, repairs, and other emergencies.

#### **TECHNICAL APPROACH**

The proposed method introduces a new active discharge electronic circuit system. It significantly reduces DC link capacitor discharge time from over 10 seconds to just 1 second by employing the main inverter switches. This system integrates an adjustable gate driver to modulate the switch's gate-source voltage, enabling constant power operation during discharge. Through frequency modulation, thermal runaway is effectively prevented. Consequently, the system achieves the targeted 1-second discharge time, demonstrating its efficiency in managing capacitor discharge within a shorter timeframe.
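The energy and power involved in a constant-power discharge can be estimated with a short calculation. The DC link capacitance (500 µF) and the 60-V end voltage are hypothetical values assumed for illustration only; neither is given in the text. Under constant power $P$, the capacitor voltage follows $V(t) = \sqrt{V_0^2 - 2Pt/C}$.

```python
import math

def stored_energy_j(capacitance_f, voltage_v):
    """Energy stored in a capacitor, E = C * V^2 / 2."""
    return 0.5 * capacitance_f * voltage_v ** 2

def required_power_w(capacitance_f, v_start, v_end, duration_s):
    """Constant power needed to go from v_start to v_end in duration_s."""
    delta = stored_energy_j(capacitance_f, v_start) - stored_energy_j(capacitance_f, v_end)
    return delta / duration_s

def voltage_at(t_s, capacitance_f, v_start, power_w):
    """Capacitor voltage after t_s seconds of constant-power discharge."""
    remaining = v_start ** 2 - 2.0 * power_w * t_s / capacitance_f
    return math.sqrt(max(remaining, 0.0))

C_LINK = 500e-6   # hypothetical DC link capacitance in farads
V_END = 60.0      # hypothetical end-of-discharge voltage in volts

energy = stored_energy_j(C_LINK, 1000.0)
power = required_power_w(C_LINK, 1000.0, V_END, 1.0)
v_half = voltage_at(0.5, C_LINK, 1000.0, power)
v_end = voltage_at(1.0, C_LINK, 1000.0, power)
print(f"stored energy: {energy:.1f} J, constant power: {power:.1f} W")
print(f"voltage at 0.5 s: {v_half:.1f} V, at 1.0 s: {v_end:.1f} V")
```

With constant power, most of the voltage drop occurs late in the interval, which is why the dissipation in the switch stays level instead of peaking at the start as it would with a fixed discharge resistor.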

#### SUMMARY OF RESULTS

Table 1 provides a comprehensive summary of test results conducted under specific conditions. The data underscores the critical role of gate-source voltage in achieving successful discharge operations. Initial experimentation involved discharging a 1000-V DC link capacitor using various TO-247 package MOSFETs and a SiC Power Module. Among the tested vendors, five managed a 1-second discharge at this voltage with a 2µs pulse width. However, one vendor necessitated a 6V V<sub>GS</sub>, indicating limitations at lower voltages due to inherent high threshold voltage. Nevertheless, all devices exhibited successful 1-second discharges with proper V<sub>GS</sub> voltage under constant power operation.



Figure 1. The architecture for active discharge. The inverter's switches serve to efficiently discharge high-voltage DC link energy, offering cost and space savings while decreasing the discharge time to just 1 second.

Once the concept was validated, assessing the operational reliability became imperative. The reliability test involved discharging a capacitor for a 900-V DC link onto the devices under test (DUTs) once their case temperature reached around 80°C on a hot plate. Following each 1-second discharge period, a 4.5-second interval was implemented to prevent unchecked temperature escalation within the devices. This interval ensured the DUTs could safely charge and discharge within acceptable temperature thresholds. Reducing this interval causes the device's case temperature to rise during successive cycles, potentially leading to thermal runaway and device damage. Therefore, the interval is necessary for adequate cooling of the DUTs. Notably, the inverters of the first two vendors in the table exhibited unreliable traits for a 1-second discharge time at V<sub>GS</sub>=6V, marked by an increase of over 50% in on-resistance and significant alterations in body diode forward voltage. When the test was repeated with a 5-V V<sub>GS</sub> at a 1-second discharge time, with only the V<sub>GS</sub> value changed, these vendors demonstrated robust characteristics without parameter shifts. Vendor 3 and the power module displayed reliable traits under a 6-V V<sub>GS</sub> at a 1-second discharge time. Overall, active discharge operation under proper gate voltages ensures reliable performance. These switches can effectively be used for this operation, leading to significant cost and space savings.

| Vendor | V <sub>DC</sub><br>Link | $V_{GS}$ | Discharge<br>Time | Pulse<br>Width |
|--------|-------------------------|----------|-------------------|----------------|
| 1      | 1000V                   | 5V       | √1s               | 2µs            |
| 2      | 1000V                   | 5V       | √1s               | 2µs            |
| 3      | 1000V                   | 6V       | √1s               | 2µs            |
| 4      | 1000V                   | 5V       | √1s               | 2µs            |
| 5      | 1000V                   | 5V       | √1s               | 2µs            |
| Module | 1000V                   | 5V       | √1s               | 2µs            |

**Keywords:** Active Discharge, Constant Power, Electric Vehicle, High Voltage DC Link Cap, Frequency Modulation

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

# TASK 3160.031, STEERABLE FOCAL PLANE ARRAYS FOR HIGH RESOLUTION SUBMILLIMETER ELECTROMAGNETIC WAVE RADAR IMAGING

WOOYEOL CHOI, SEOUL NATIONAL UNIVERSITY, CHOI0010@SNU.AC.KR

#### SIGNIFICANCE AND OBJECTIVES

Sub-millimeter-wave concurrent transceiver pixels (CTPs) enable focal plane radar arrays. For higher range resolution and wider field of view, wideband and scalable array architectures are essential. We are investigating architectures and design frameworks for large-scale arrays, supporting ~10-GHz bandwidth and IF quadrature demodulation using 410-GHz CTPs and electronically steerable reflectors.

#### TECHNICAL APPROACH

For synchronization of concurrent transceiver pixels (CTPs) in large-scale arrays, wideband and efficient injection signal distribution methods are investigated. First, we modeled the loss and efficiency of passive and active elements in distribution networks, such as transmission lines, amplifiers, frequency multipliers, and coupled oscillators. Using the models, a simulation study was performed to find the optimum arrangement of each element to deliver the required injection signal to the CTPs according to the size of the array on the chip and in the package.

#### SUMMARY OF RESULTS

We performed the analysis for the case where 16x16 chips, each with 2x2 concurrent transceiver pixels, are integrated to form a 1024-element array. Each pixel generates and down-converts a 410-GHz signal using a 205-GHz VCO that doubles as the transmitter and as the LO that drives the mixer gates. A 205-GHz injection signal must be distributed to each pixel individually.

<u>On-chip coupling network:</u> The injection power needed for the on-chip 2x2 pixel array is largely determined by the loss of the distribution transmission line. Preliminary analyses suggest that the standing wave oscillator (SWO) architecture yields the lowest power consumption, as the loss of the distribution transmission line is countered by 2x2 distributed small negative-g<sub>m</sub> cells rather than a single high-power amplifier. As a result, an SWO at 102.5 GHz is implemented. The 205-GHz second harmonic of the SWO is extracted at the common-mode node of the negative-g<sub>m</sub> cells.

<u>Chip-by-chip distribution network:</u> The H-tree structure is employed for the distribution network among chips. The conventional H-tree architecture can be improved by including an amplified frequency doubler, since the loss of the distribution transmission lines is frequency dependent. The optimum position of the doubler can be obtained from the frequency-dependent loss of the transmission line and the efficiency of the amplified frequency multiplier for the output power required to drive the chips. Based on this, the total power consumption is estimated and plotted versus the number of doublers in Fig. 2. The number of doublers is inversely proportional to the number of chips that a doubler drives. An attenuation coefficient of 0.5 dB/mm at 50 GHz and a power of 0 dBm delivered to each pixel are assumed. The optimum number of doublers is calculated to be 4 for a 1024-pixel array, resulting in the overall architecture in Fig. 1.

The proposed method will be applied to define the optimum on-chip SWO and inter-chip H-tree structures for power efficiency and bandwidth by incorporating actual pixel measurement data and more accurate models of the distribution network components.



Figure 1. Suggested array architecture for 1024 elements.



Figure 2. Power consumption vs. the number of frequency doublers.

**Keywords:** imaging radar array, concurrent transceiver pixel, standing wave oscillator, signal distribution

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

The proposal aims to develop Computational Analog Security Hardware (CASH) for on-chip learnable anomaly detection. It targets improved hardware security through efficient, minimal-overhead hyperdimensional computing (HDC) for side-channel attack (SCA) detection. Key tasks include algorithm and hardware baseline establishment, mixed-signal near-entropy computation, and conversion-free HDC implementation.

#### **TECHNICAL APPROACH**

The technical approach integrates computational and security hardware, leveraging physically unclonable function (PUF) circuits for efficient hyperdimensional computing (HDC) encoding with minimal overhead. Key tasks involve establishing a digital baseline algorithm and hardware, exploring mixed-signal near-entropy computation, and developing conversion-free HDC. Innovations include few-shot learning, entropy generation, and on-chip anomaly detection. The project uses SPICE simulations and 65-nm CMOS for evaluation. It aims to achieve high performance with low energy and area overhead, enhancing hardware security.

#### SUMMARY OF RESULTS

We have set up the entire simulation flow, proposed an architecture, generated a dataset (Task 1), and built a digital baseline (Task 2) for benchmarking and comparison. These efforts are documented in a paper we plan to submit later this year. In this paper, we will introduce a compute-in-PUF (CIPUF) architecture (Fig. 1) as an illustrative case of the general CASH primitive.



Figure 1. Overview of proposed CIPUF architecture for side-channel power profile encoding.

Using IBM Power Grid and SPICE simulations, we combined an RLC network netlist with a synthesized AES module in 65-nm CMOS to generate data. Simulations in Cadence Virtuoso recorded power traces with and without sensing resistors to emulate SCA probing. We generated 1463 unique power profiles per workload, each with 256  $V_{DD}$  samples. Attack and non-attack scenarios were created by activating or deactivating a 1- $\Omega$  SCA resistor. HDC encoders processed these traces into feature vectors for model training and inference. Our digital baseline SCA detection system includes a preamplifier and ADC for quantizing voltage profiles, an SRAM bank for storing HDC hypervectors, and a custom digital processor for permutation, nonlinear functions, and similarity checks. It also incorporates a PUF key generator for cryptographic keys, an AES encryption engine for data security, and a microcontroller to emulate workloads. Design Compiler and SPICE simulations are used to evaluate the performance, energy consumption, latency, and area efficiency. IBM Power Grid Benchmarks were used for early data collection, with post-layout evaluations conducted using the 65-nm CMOS technology.
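The encoding and similarity steps of an HDC-based detector can be sketched in a few lines. This is a toy software model under stated assumptions, not the project's baseline or the CIPUF hardware: the hypervector dimension, the 16-level quantizer, the synthetic supply traces (64 samples instead of 256), the 1-Ω grid resistance, and the noise level are all hypothetical. Each sample's level hypervector is bound to its position by a cyclic shift (permutation), the results are bundled by summation, and a query is assigned to the class prototype with the larger dot-product similarity.

```python
import math
import random

DIM = 512                   # hypervector dimension (assumed)
LEVELS = 16                 # voltage quantization levels (assumed)
V_MIN, V_MAX = 0.88, 1.00   # quantizer range in volts (assumed)

rng = random.Random(0)
LEVEL_HV = [[rng.choice((-1, 1)) for _ in range(DIM)] for _ in range(LEVELS)]

def quantize(v):
    """Map a voltage to a level index, clamped to the quantizer range."""
    idx = int((v - V_MIN) / (V_MAX - V_MIN) * LEVELS)
    return min(max(idx, 0), LEVELS - 1)

def encode(trace):
    """Permute each level hypervector by its sample position, then bundle by summation."""
    acc = [0] * DIM
    for pos, v in enumerate(trace):
        hv = LEVEL_HV[quantize(v)]
        shift = pos % DIM
        rotated = hv[-shift:] + hv[:-shift]   # cyclic shift; shift 0 leaves hv unchanged
        acc = [a + r for a, r in zip(acc, rotated)]
    return acc

def similarity(a, b):
    return sum(x * y for x, y in zip(a, b))

def make_trace(attack, n=64):
    """Synthetic supply trace: the extra 1-ohm probe resistor doubles the IR drop (toy model)."""
    r_total = 2.0 if attack else 1.0
    return [1.0 - r_total * (0.04 + 0.01 * math.sin(2 * math.pi * k / 16)) + rng.gauss(0, 0.002)
            for k in range(n)]

def bundle(traces):
    total = [0] * DIM
    for t in traces:
        total = [a + b for a, b in zip(total, encode(t))]
    return total

# Few-shot training: four traces per class form each prototype.
prototypes = {label: bundle([make_trace(label) for _ in range(4)]) for label in (False, True)}

def classify(trace):
    hv = encode(trace)
    return max(prototypes, key=lambda label: similarity(hv, prototypes[label]))

tests = [(label, make_trace(label)) for label in (False, True) for _ in range(4)]
accuracy = sum(classify(t) == label for label, t in tests) / len(tests)
print(f"toy detection accuracy: {accuracy:.2f}")
```

The toy accuracy says nothing about the reported 96% figure; the sketch only shows why the encoder, which touches every sample of every trace, is the natural block to optimize.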

Simulation results demonstrate CIPUF's feasibility, achieving 96% side-channel detection accuracy, 4.15× area savings, and 12.8× energy savings compared to baseline designs. In addition, we have fabricated a chip this year to verify the proposed CIPUF architecture (Tasks 3-5).

| Energy (nJ/query) | Baseline | Baseline share | PUF-HDC | PUF-HDC share | Saving |
|------------------|----------|-------|---------|--------|--------|
| ADC              | 4.5      | 0.6%  | 4.5     | 7.7%   | 0      |
| Similarity       | 28.4     | 3.5%  | 28.4    | 48.9%  | 0      |
| Encoder          | 745.9    | 92.9% | 24.2    | 41.907 | 21.9   |
| PUF              | 24.3     | 3.0%  | 24.5    | 41.0%  | 31.08  |
| TDC              | 0.0      | 0.0%  | 0.9     | 1.6%   | -0.02× |
| Total            | Total 80 |       |         | 58     | 12.8×  |

Table 1. SCA detection energy breakdown and comparison with digital baseline.

**Keywords:** hardware security, side-channel attack, machine learning, analog / mixed-signal, circuit design

#### INDUSTRY INTERACTIONS

IBM, Intel, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] J. Liu et al., "CIPUF: Towards On-chip Learnable Anomaly Detection with Compute-in-PUF Architecture," (tentative submission) 2024 ISLPED, August 2024, California, USA.

# TASK 3160.036, ELECTROMIGRATION LIFETIME CHARACTERIZATION UNDER REALISTIC CHIP OPERATING CONDITIONS CHRIS KIM, UNIVERSITY OF MINNESOTA, CHRISKIM@UMN.EDU

#### SIGNIFICANCE AND OBJECTIVES

Segmenting metal lines in a power grid is a common technique for mitigating electromigration (EM). However, experimental results for a fully segmented power grid have not yet been reported. We have designed a 28-nm CMOS EM test chip to collect data from power grids with various segmentation lengths.

#### **TECHNICAL APPROACH**

EM test chips are designed to collect IR drop trends under EM stress and analyze time-to-failure (TTF) statistics for various grid segmentation lengths. Experiments will be done with automated multi-threaded test software that our group has developed to perform accurate and stable testing on four power grid structures under multiple temperatures and current stress conditions.
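One simple way to turn a monitored resistance trend into a time-to-failure value is a threshold criterion, sketched below. The 10% resistance-increase criterion, the linear interpolation between readings, and the sample data are assumptions for illustration; the failure criterion used in the planned experiments is not stated here.

```python
import statistics

def time_to_failure(times_h, resistance_ohm, rel_increase=0.10):
    """Return the interpolated time at which resistance first exceeds
    its initial value by rel_increase, or None if it never does."""
    limit = resistance_ohm[0] * (1.0 + rel_increase)
    points = list(zip(times_h, resistance_ohm))
    for (t0, r0), (t1, r1) in zip(points, points[1:]):
        if r0 < limit <= r1:
            return t0 + (t1 - t0) * (limit - r0) / (r1 - r0)
    return None

# Hypothetical stress data: hours versus measured grid resistance in ohms.
hours = [0, 10, 20, 30]
traces = [
    [1.00, 1.02, 1.08, 1.16],   # crosses +10% between 20 h and 30 h
    [1.00, 1.05, 1.15, 1.30],   # crosses +10% between 10 h and 20 h
    [1.00, 1.01, 1.02, 1.03],   # never fails within the test window
]
ttfs = [time_to_failure(hours, r) for r in traces]
failed = [t for t in ttfs if t is not None]
print(f"TTF values: {ttfs}, median of failed DUTs: {statistics.median(failed):.2f} h")
```

Samples that survive the test window are returned as `None` here; a full statistical analysis would treat them as censored data instead of dropping them.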

#### SUMMARY OF RESULTS

A chip was designed to test power grid segmentation, building on our previous test chip design. The chip consists of four DUTs (devices under test): an unsegmented grid, a 4- $\mu$ m segmented grid, an 8- $\mu$ m segmented grid, and a 16- $\mu$ m segmented grid. Each DUT has a dedicated heater and temperature sensor to heat the power grid to a constant temperature for accelerating EM. In each power grid, 128 voltage taps are distributed for monitoring IR drop and grid resistance. The top-level chip design is shown in Fig. 1.

Power grid segmentation was accomplished using a zig-zag pattern. The zig-zag segmentation consists of metal segments connected by shorter jumpers on the adjacent layer, as shown in Fig. 2. This segmentation pattern was chosen because it allows for consistent voltage tap and via placement between DUTs.

The planned experiments look to collect data on the effects of grid segmentation on EM lifetime and recovery. This will be accomplished through measurements of TTF, grid recovery, max grid current for EM immunity, and stress/recovery cycling.



Figure 1. New 28-nm CMOS power grid EM chip, consisting of four test structures with various segmented and unsegmented configurations.



Figure 2. Segmented power grid DUT using a zig-zag segmentation pattern for layout uniformity.

**Keywords:** Power grid, IR noise, electromigration lifetime, silicon validation, physical design

#### INDUSTRY INTERACTIONS

AMD, Intel, Siemens EDA, Texas Instruments

# TASK 3160.046, CDM RELIABILITY PREDICTION: FROM THE TEST CIRCUIT TO THE IC PRODUCT ELYSE ROSENBAUM, UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN, ELYSE@ILLINOIS.EDU

#### SIGNIFICANCE AND OBJECTIVES

The objectives of this project are to (1) design a technology-portable victim circuit that can be used to evaluate the capability of simulation to predict the ESD failure level, and (2) confirm that circuit simulation can correctly predict the CDM (Charged Device Model) failure level of a packaged IC.

#### **TECHNICAL APPROACH**

CDM simulations will be performed using compact models that are optimized for ESD-like high-current conditions. The accuracy of a CDM simulation will be evaluated by hardware measurements. Toward that end, we will develop an on-chip probe that allows us to capture ESD waveforms inside an IC. The predicted failure level will be trustworthy because sub-nanosecond measurements of the gate dielectric breakdown voltage will be performed using an on-chip pulse generator.

#### SUMMARY OF RESULTS

As expected, the gate oxide breakdown voltage ($BV_{ox}$) measured on an ESD time scale is a decreasing function of the MOSFET active area. Also, $BV_{ox}$ is a (mostly) decreasing function of stress duration. However, as shown in Fig. 1, the breakdown voltage is insensitive to pulse width in the range of 1 ns to 10 ns, at least for planar transistors with polysilicon gates. TCAD simulation provides a possible explanation for that surprising observation. Positive pulses were applied to the gate of the NMOS transistors, which caused electrons to tunnel from the channel to the gate. Those electrons have excess kinetic energy when they arrive at the gate (the anode), and one energy transfer mechanism is electron-hole pair generation. As shown in Fig. 2, it takes about 10 ns for the gate to invert; on a shorter time scale, the polysilicon gate is biased in deep depletion. The electric field across that depletion layer will accelerate the generated holes back toward the oxide. Anode hole injection is a known dielectric wear-out mechanism. The large electric field in the anode during a pulse's first nanosecond causes the oxide to incur more damage during that interval than in any later 1-ns interval. We hypothesize that for a 10-ns long pulse, more damage takes place in the first nanosecond than in the remaining 9 ns. As a result, $V_{BD}$ ($V_{63}$) has similar values for 1-ns and 10-ns pulses. For longer pulse widths, such as 100 ns, the extended stress time allows the cumulative damage to be greater, and the value of $V_{63}$ drops as expected.

When a PDK MOSFET model is combined with an "ESD wrapper," as described in the report for Task 2810.066, the resultant model is accurate under both normal operating and ESD conditions. Example results may be found in [2].



Figure 1. For each pulse width and area (number of fingers × W of 5 $\mu$m × L of 60 nm), ramp voltage breakdown measurements are performed on 40 test structures fabricated in 65-nm CMOS. The 63<sup>rd</sup> percentile breakdown voltage is plotted here. Outside the range 1-10 ns, the slope of the curves is consistent with prior works that study oxide breakdown on a longer time scale.
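The 63<sup>rd</sup> percentile is a natural summary because, for Weibull-distributed breakdown, the cumulative failure fraction equals 1 - 1/e (about 63.2%) at the scale parameter. The sketch below shows one common way to extract that value from a set of 40 ramp-breakdown measurements, using a least-squares line on a Weibull plot with median-rank plotting positions. The fitting method is an assumption (the report does not state how $V_{63}$ was extracted), and the sample data are synthetic.

```python
import math
import random


def weibull_v63(breakdown_voltages):
    """Estimate the 63.2% breakdown voltage (Weibull scale) and the shape.

    Fits a line to the Weibull plot ln(-ln(1 - F)) versus ln(V) by ordinary
    least squares, using Bernard's median-rank approximation for F.
    """
    v_sorted = sorted(breakdown_voltages)
    n = len(v_sorted)
    xs, ys = [], []
    for i, v in enumerate(v_sorted, start=1):
        f = (i - 0.3) / (n + 0.4)  # median rank of the i-th failure
        xs.append(math.log(v))
        ys.append(math.log(-math.log(1.0 - f)))
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    shape = sxy / sxx                    # Weibull slope
    intercept = y_mean - shape * x_mean
    v63 = math.exp(-intercept / shape)   # the fitted line crosses y = 0 at F = 1 - 1/e
    return v63, shape


# Synthetic data only: 40 samples drawn from a Weibull distribution with an
# arbitrary scale of 5.0 V and shape of 20. These are not measured values.
rng = random.Random(0)
samples = [rng.weibullvariate(5.0, 20.0) for _ in range(40)]
v63_est, shape_est = weibull_v63(samples)
print(f"V63 = {v63_est:.2f} V, Weibull shape = {shape_est:.1f}")
```

Repeating this extraction per pulse width and per device area would give one point per curve in a plot like Fig. 1.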



Figure 2. Band diagram at 1 ns (left) and 10 ns (right) for $V_{GS}$ = 4.5 V. Uncalibrated TCAD simulation.

**Keywords:** time-dependent dielectric breakdown, ESD, charged device model

#### INDUSTRY INTERACTIONS

AMD, Intel, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] M. Drallmeier, et al., "On-chip single-shot pulse generator for TDDB...," in 2024 IEEE IRPS.

[2] Y. Zhou, et al., "Accuracy preserving extensions to a PDK MOSFET...," to appear 2024 EOS/ESD Symp.

# TASK 3160.048, DATA-DRIVEN FRAMEWORK FOR CROSS-MODAL WIRELESS HUMAN SENSING IN COMPLEX ENVIRONMENTS MURAT TORLAK, UNIVERSITY OF TEXAS AT DALLAS, TORLAK@UTDALLAS.EDU NAOFAL AL-DHAHIR, UNIVERSITY OF TEXAS AT DALLAS

#### SIGNIFICANCE AND OBJECTIVES

In-vehicle occupant sensing seeks to perform three primary tasks: detect one or more occupants in a vehicle, localize a detected occupant to a seat, and classify the detected occupant. This research applies a deep-learning method for high-accuracy localization of occupants and classification of occupants as baby, child, or adult.

#### **TECHNICAL APPROACH**

To overcome the limitations of model-based in-cabin sensing approaches that rely on complex rules and thresholds, we develop a robust CNN-based model that learns features from the data. Our proposed architecture (Fig. 1) ensures the robustness of our trained model. To emulate the realistic scenario of new, unseen passengers in the vehicle, we evaluate cases in which a group of participants is excluded from training and validation. Specifically, we select six groups comprising 20 passengers in total and leave out one group at a time during training. We then test each group using the model trained without it.
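The hold-out protocol described above can be sketched as a leave-one-group-out split. The group sizes and participant IDs below are hypothetical (the text only states that six groups cover 20 participants in total), and the training and evaluation steps are left as a comment.

```python
def leave_one_group_out(groups):
    """Yield (held_out_name, train_ids, test_ids) with one group held out per fold."""
    for held_out in groups:
        test_ids = list(groups[held_out])
        train_ids = [pid for name, members in groups.items()
                     if name != held_out for pid in members]
        yield held_out, train_ids, test_ids


# ASSUMPTION: the group sizes below are hypothetical; only the totals
# (six groups, 20 participants) come from the text.
sizes = [4, 4, 3, 3, 3, 3]
participant_ids = iter(range(20))
groups = {f"group_{g}": [next(participant_ids) for _ in range(size)]
          for g, size in enumerate(sizes)}

folds = list(leave_one_group_out(groups))
for held_out, train_ids, test_ids in folds:
    # A model would be trained on train_ids and evaluated only on test_ids.
    assert not set(train_ids) & set(test_ids)
    print(held_out, "train:", len(train_ids), "test:", len(test_ids))
```

Splitting by participant rather than by frame keeps every frame of a held-out person out of training, which is what makes the reported accuracies representative of unseen passengers.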

#### SUMMARY OF RESULTS

Occupant seat localization results are plotted as confusion matrices in Fig. 2 for 45-frame and 140-frame decisions. We observe an average accuracy of 94.3% after 45 frames and 95.5% after 140 frames. Furthermore, we compare the accuracy across observation time window sizes in Fig. 3. The proposed approach is robust to the observation time window, obtaining 92.9% accuracy after only 10 of the 140 frames of observation time.

Beyond occupant localization, we evaluate our proposed model for occupant classification. Confusion matrix results for 45-frame and 140-frame decisions are plotted in Fig. 4. From the confusion matrices, we observe average accuracies on new participants of 80.4% and 85.1% for 45-frame and 140-frame observation time windows, respectively.
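For readers reproducing such figures, the sketch below shows two common ways of turning a confusion matrix into a single accuracy number. Which definition underlies the reported averages is not stated, so both are shown, and the counts are hypothetical rather than the values behind Fig. 4.

```python
def overall_accuracy(cm):
    """Fraction of all decisions that fall on the diagonal of confusion matrix cm."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def mean_per_class_accuracy(cm):
    """Average of per-class recall; each row of cm is one true class."""
    recalls = [cm[i][i] / sum(cm[i]) for i in range(len(cm))]
    return sum(recalls) / len(recalls)


# Hypothetical counts (rows: true class, columns: predicted class) for the
# three occupant classes. These are NOT the values behind Fig. 4.
labels = ["baby", "child", "adult"]
cm = [
    [90, 8, 2],
    [10, 70, 20],
    [0, 10, 190],
]
acc_overall = overall_accuracy(cm)
acc_per_class = mean_per_class_accuracy(cm)
print(f"overall accuracy:        {acc_overall:.3f}")
print(f"mean per-class accuracy: {acc_per_class:.3f}")
```

The two numbers differ whenever the classes are imbalanced, so stating which one is reported matters when comparing against other occupant-classification results.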



Figure 1. Proposed network architecture.



Figure 2. Occupant seat localization confusion matrices using our proposed CNN-based model. (a) 45-frame observation time window. (b) 140-frame observation time window.



Figure 3. Occupant seat localization average accuracy over time for both a model-based point cloud thresholding and our proposed CNN-based model.



Figure 4. Occupant seat classification confusion matrices using our proposed CNN-based model. (a) 45-frame observation time window. (b) 140-frame observation time window.

**Keywords:** Deep learning, mmWave radar, radar sensing

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

#### MAJOR PAPERS/PATENTS

[1] J. P. Van Marter, A. V. Mani, A. G. Dabak, S. Rao, M. Torlak, "CNN-based In-Vehicle Occupant Sensing using Millimeter-Wave Radar," in IEEE Radar Conf., May 2024, Denver, Colorado.

# TASK 3160.049, IN-SITU ELECTRICAL BIASING TEM/STEM STUDY OF E-MODE GAN HEMT DEVICES FOR RELIABILITY MOON KIM, UNIVERSITY OF TEXAS AT DALLAS, MOONKIM@UTDALLAS.EDU HISASHI SHICHIJO, UNIVERSITY OF TEXAS AT DALLAS

#### SIGNIFICANCE AND OBJECTIVES

AlGaN/GaN High Electron Mobility Transistors (HEMTs), with higher electron mobility, wider bandgap, and superior thermal conductivity, surpass silicon for high-voltage switching. Enhancing their reliability requires addressing challenges such as gate stress and degradation, which in turn calls for real-time monitoring and analysis of heterostructure integrity under electrical and thermal bias to understand failure mechanisms.

#### **TECHNICAL APPROACH**

Our initial in-situ electrical biasing tests were conducted with e-mode p-GaN gate HEMTs. However, the leakage current measured for the in-situ samples was significantly higher than that expected from the transistors. Therefore, we propose employing various methods, including shallow isolation cuts, surface treatments, and insulator deposition, to reduce the leakage current of the in-situ samples to levels consistent with those of the transistors. A four-probe in-situ TEM electrical biasing study, along with comprehensive TEM characterization of the devices failed by in-situ electrical biasing, will be conducted and correlated with their electrical characteristics.

#### SUMMARY OF RESULTS

We focused on identifying GaN HEMT devices with dimensions suitable for our in-situ electrical biasing chips during the initial phase. Fig. 1 illustrates the critical dimensions of the FIB-optimized in-situ 4-probe electrical biasing E-chip that will be utilized in our project. The gate structure must be less than 20 micrometers to fit within the E-chip.

We evaluated several different GaN HEMT devices to meet this requirement and selected the one with appropriately sized gate structures by cross-sectional SEM imaging. Fig. 2 depicts one of the chosen GaN HEMTs, highlighting the gate structure. The gate, source, and drain dimensions are within the acceptable range for the E-chip. Notably, electrical contacts can be easily connected to the 4-probe pads of the E-chip shown in Fig. 1. Additionally, we have identified two other devices with similar dimensions, which will also be used for in-situ electrical testing.

Individual transistors will be connected to the 4-probe electrical biasing E-chip via FIB, allowing us to measure surface leakage current. We will apply appropriate treatments to reduce the surface leakage current to levels consistent with those of the transistors. Additionally, we will conduct comprehensive structural and chemical analysis of the transistor structure down to the atomic scale using high-resolution TEM/STEM.



Figure 1. FIB-optimized in-situ 4-probe electrical biasing E-chip, showing critical dimensions for mounting a single GaN HEMT transistor.



Figure 2. SEM images of the cross-sectioned e-mode GaN HEMT device showing the overall dimensions and transistor structure, with critical features indicated by arrows.



Figure 3. Cross-sectional STEM ABF image of the e-mode GaN HEMT device selected, showing the overall dimensions and transistor structure.

**Keywords:** E-mode GaN HEMT device, Reliability, in-situ electrical biasing STEM

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

#### MAJOR PAPERS/PATENTS

[1] A. Mehta, S. Shichijo, and M.J. Kim, "In-situ S/TEM DC biasing of p-GaN/AlGaN/GaN heterostructure for E-mode GaN HEMT devices," *Engr. Res. Express* **6**, 015324 (2024).

# TCI 2023 TASK 2, RESILIENT INTELLIGENT SECURED ELECTROMAGNETIC SPECTRUM SENSING SYSTEMS (RISES<sup>3</sup>) FOR THE NEXT-GENERATION OF ELECTRONIC WARFARE IFANA MAHBUB, UNIVERSITY OF TEXAS AT DALLAS, IFANA.MAHBUB@UTDALLAS.EDU RASHAUNDA HENDERSON, UNIVERSITY OF TEXAS AT DALLAS

#### SIGNIFICANCE AND OBJECTIVES

We propose a data-driven Reconfigurable Intelligent Surface (RIS) for automatic transmission/reflection and anomaly detection in next-generation electronic warfare. Key advancements include a wideband reconfigurable boosting transmit-reflectarray (TRA), explainable machine-based design, and unsupervised learning for detecting adversarial signals, achieving 94.88% classification and 98.66% anomaly detection accuracy in X-band.

#### **TECHNICAL APPROACH**

The high data rates in future communication systems pose significant challenges, since the increased bandwidth reduces the relative frequency distance between the in-band wanted signal and out-of-band interference. In this work, we devise an anomaly detection framework based on a novel unsupervised ML-based ensemble methodology that delivers precise performance irrespective of model-specific inaccuracies, thereby enabling the RIS-guided TRA to manipulate signals and mitigate malicious attacks. The interpretability of the proposed method helps identify the key design variables influencing TRA performance, which can aid in designing next-generation TRAs that reflect interfering signals and thus protect sensitive data.

#### SUMMARY OF RESULTS

The overview of the proposed ML-inspired RIS-based RF interference mitigation scheme is illustrated in Fig. 1. In this work, we have evaluated our solution using an RIS comprising numerous inexpensive units that are expected to enhance wireless benchmarks, including security. Additionally, we consider an explainable boosting machine (EBM)-based ML model to comprehend the TRA's behavior, which significantly expedites the electromagnetic (EM)-driven simulation process. This interpretable methodology facilitates the identification of pivotal design parameters for designing an efficient TRA. Finally, an unsupervised ML approach integrating feature selection, model selection, and an ensemble sequential model consultation strategy is proposed to deliver timely and robust detection of anomalous signals, thereby enabling the active RIS to manipulate signal transmission or reflection to improve system security. From our analysis, we can infer that the proposed ML-inspired TRA provides at least 80% transmissivity/reflectivity while maintaining up to 94.88% classification accuracy in forecasting transmittance and reflectance using an 8-dimensional feature space and the EBM algorithm. These findings will be submitted as a manuscript to IEEE IMS 2024. Additionally, the unsupervised ML technique achieves an accuracy of up to 98.66% in detecting anomalies, as demonstrated through evaluations against established benchmarks and simulated scenarios. These findings have been accepted at IEEE ISVLSI 2024.

We also introduce a cost-effective EBM to predict features of active metasurface (AMS) arrays and electromagnetic attributes including incident angles. The model achieves a mean square error of 0.014 in the forward prediction of transmittance/reflectance for a TRA and over 96% accuracy in inverse feature prediction. Feature importance-based preprocessing and hyperparameter tuning enhance accuracy and efficiency. An explainability analysis identifies key features, expediting the "Inverse Feature Prediction" process and improving convergence by focusing on critical attributes. These results have been accepted for presentation at IEEE RWW 2025.



Figure 1. Overview of the proposed ML-guided RIS-based RF interference cancellation

**Keywords:** Reconfigurable Intelligent Surface (RIS), Transmit Reflectarray (TRA), Machine Learning (ML), Malicious Signal, Explainable Boosting Machine (EBM).

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

#### MAJOR PAPERS/PATENTS

[1] S. Reza et al., "Machine Learning Intervened RIS-based RF Interference...," accepted at the 2024 IEEE ISVLSI.

[2] S. Reza et al., "Efficient Explainable Boosting Machine for RF...," 2025 IEEE Radio & Wireless Symposium.

# TCI 2023 TASK 3, GERMANIUM TELLURIDE CHALCOGENIDE SWITCHES FOR RF APPLICATIONS

MANUEL QUEVEDO-LOPEZ, UNIVERSITY OF TEXAS AT DALLAS, MQUEVEDO@UTDALLAS.EDU KENNETH K. O, UNIVERSITY OF TEXAS AT DALLAS

#### SIGNIFICANCE AND OBJECTIVES

This project aims to develop high-quality Germanium Telluride (GeTe) chalcogenide films, whose excellent phase-change properties can improve the insertion loss, isolation, and power handling of RF switches and reduce the power consumed in turning the switches on and off.

#### **TECHNICAL APPROACH**

This project employs Pulsed Electron Deposition (PED) to synthesize high-quality Germanium Telluride (GeTe) thin films for RF switches. The approach includes optimizing PED parameters, integrating GeTe into switch structures, and performing comprehensive electrical and structural characterizations. By refining fabrication processes and leveraging theoretical modeling, we aim to lower the power consumed in turning RF switches on and off while maintaining or enhancing the insertion loss, isolation, and power handling capabilities of GeTe-based RF switches.

#### SUMMARY OF RESULTS

The PED growth method is a novel technique for depositing GeTe thin films. The quality of the deposited films depends strongly on many factors, chief among them the electron beam generation voltage and the background gas pressure. X-ray photoelectron spectroscopy (XPS) was carried out to study the impact of the deposition parameters.

Table 1. Composition of film and oxygen content obtained by X-ray photoelectron spectroscopy analysis.

| Voltage (kV) | Composition                     | Oxygen |
|--------------|---------------------------------|--------|
| 11           | Ge<sub>43</sub>Te<sub>54</sub>  | 3%     |
| 13           | Ge<sub>47</sub>Te<sub>49</sub>  | 4%     |
| 15           | Ge<sub>51</sub>Te<sub>43</sub>  | 6%     |

XPS analysis indicates a relatively small amount of oxygen at low voltages. Table 1 illustrates the variation in atomic concentration as a function of the electron beam generation voltage. These findings imply that the amount of material ablated during deposition is highly dependent on the voltage, with 13 kV emerging as a promising value to achieve stoichiometric GeTe films. Attaining a composition close to the stoichiometric value is crucial, as it directly influences the electrical properties and overall performance of the device. The XRD analysis revealed no diffraction peaks in the amorphous state but showed a rhombohedral crystalline structure after annealing at 220-260°C (Fig. 1(a)). In the low-voltage sample, the crystalline states correspond to rhombohedral GeTe and orthorhombic TeO<sub>2</sub>. Experiments indicate that the oxide compound forms at temperatures above 250°C, likely due to the decomposition of GeTe or the presence of Te-rich regions in the film. This formation is attributed to the experimental conditions during the annealing process. Moreover, GeTe films undergo natural oxidation when exposed to the atmosphere.
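The stoichiometry argument can be checked directly from Table 1 by computing the Ge:Te atomic ratio at each voltage. The short sketch below uses only the tabulated atomic percentages; the dictionary layout and function name are illustrative choices.

```python
# Atomic percentages from Table 1 (XPS), keyed by e-beam generation voltage in kV.
table1 = {
    11: {"Ge": 43, "Te": 54, "O": 3},
    13: {"Ge": 47, "Te": 49, "O": 4},
    15: {"Ge": 51, "Te": 43, "O": 6},
}


def ge_te_ratio(entry):
    """Ge:Te atomic ratio; stoichiometric GeTe corresponds to 1.0."""
    return entry["Ge"] / entry["Te"]


ratios = {kv: ge_te_ratio(entry) for kv, entry in table1.items()}
closest_kv = min(ratios, key=lambda kv: abs(ratios[kv] - 1.0))

for kv, ratio in ratios.items():
    print(f"{kv} kV: Ge/Te = {ratio:.2f}")
print("closest to 1:1 at", closest_kv, "kV")
```

The films move from Te-rich at 11 kV to Ge-rich at 15 kV, with the 13 kV film closest to the 1:1 ratio, which matches the conclusion drawn in the text.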



Figure 1. (a) XRD patterns of GeTe as-deposited and after annealing treatment. (b) Sheet resistance variation with temperature.

Fig. 1(b) shows the variation of sheet resistance with increasing temperature. The off-to-on resistance ratio of the films was found to be approximately 10<sup>6</sup>, which is a good indicator of the suitability of the films for RF applications. The results indicate that GeTe films deposited using a 13 kV e-beam generation voltage are the most promising for switching applications due to their favorable electrical properties.

**Keywords:** GeTe, PED, Switches for RF

#### INDUSTRY INTERACTIONS

# **Energy Efficiency Thrust**



| Category                           | Accomplishment                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
|------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Energy<br>Efficiency<br>(Circuits) | Active EMI filtering with feedback is used to reduce the volume burden of additional passive filtering by a factor of 20. This approach uses a switch-mode amplifier operating at 30+ MHz with GaN devices with a fractional-order filter to achieve high loop gain over a limited bandwidth. Experimental results with this approach demonstrated 40-60 dB of current attenuation at the first several harmonics, which reduces the volume burden of additional passive filtering. This circuit incurred a 0.4% efficiency penalty on a 120W prototype boost converter, even with 66% ripple ratio. (2810.068, A. Hanson, UT Austin)                                                                                                                                                                                         |
| Energy<br>Efficiency<br>(Circuits) | Most traditional SIMO (single inductor multiple output) converters handle their multiple outputs with a fixed time-multiplexing order and linear PWM control, which limits their response to large, fast load transients. Two non-linear, non-ordered control strategies that overcome this limitation are demonstrated. The first chip prototype demonstrates a total load transient of 2A/ns, which is significantly larger and faster than that of traditional SIMO converters. The peak efficiency of 96.1% is also higher than the state-of-the-art. The second prototype achieves 96.1% efficiency, a transient speed of 2.1A/ns, and a maximum current capacity of 2.2A. (2810.079, C. Huang, Iowa State University) |
| Energy<br>Efficiency<br>(Circuits) | Power management circuits such as DC-DC converters are often designed for rarely occurring worst-case scenarios, which increases their cost. Instead, in this work the circuits are monitored in real time and the circuit parameters are adjusted to enable reliable operation. For example, the system periodically monitors three converter operational parameters: ambient temperature, switching frequency, and load current. Using these, an online algorithm estimates the remaining useful life of each capacitor and of the converter in real time. A 24V-to-1V DC-DC converter incorporating these concepts was fabricated. The converter achieves a peak power efficiency of 93.89% at 405-mA load current, an efficiency improvement of up to 34.74% compared to the baseline design. (2810.065, M. Seok, Columbia University) |







The escalating levels of EMI in automotive power ICs pose significant reliability and security risks. This work introduces a security-aware power IC architecture that enhances resistance to power side-channel attacks by decoupling power traces from load variations. It not only improves side-channel resistance but also effectively suppresses EMI.

#### TECHNICAL APPROACH

To decouple the power traces from load activities with minimal power overhead, a power masking stage is introduced in parallel to a buck power stage. This masking stage draws randomized input currents that are temporarily stored and subsequently delivered to the output after a random delay. This randomization of power masking and charge recycling eliminates the correlation between the input power trace and load activities. When combined with a hysteretic controller, the randomized switching frequency further suppresses the EMI level in the on-chip power IC.

#### SUMMARY OF RESULTS



Figure 1. Block diagram of the proposed encrypted power IC.

Fig. 1 shows the block diagram of the proposed encrypted on-chip power supply IC. The recycled masking stage consists of a power injection path with $M_{PI}$, $C_{CS}$, and $M_{CR}$ in parallel with a half-bridge buck converter. The encryption interface circuit generates a random ON-time ($T_{ON}$), which turns on $M_{PI}$ at randomized timing, drawing random input current ($I_{PI}$) traces from $V_{IN}$ to charge $C_{CS}$.

Once $C_{CS}$ is sufficiently charged by $I_{PI}$, the charge recycling phase is activated. During this phase, $M_{PI}$ and $M_{H}$ are turned off to disconnect the main power source, and $M_{CR}$ is activated to use $C_{CS}$ as a temporary power source. Consequently, the input power trace is no longer correlated to the load activity, preventing side-channel attacks. By recycling the charges initially drawn for randomized power injection, the proposed architecture significantly reduces the power overhead compared to existing side-channel resistant designs.



Figure 2. Measured I<sub>IN</sub> and I<sub>CORE</sub> without proposed techniques.



Figure 3. Measured I<sub>IN</sub> and I<sub>CORE</sub> with proposed techniques.

As observed from Fig. 2, without the proposed techniques, the input current ($I_{IN}$) is directly proportional to the load current ($I_{CORE}$), making the design susceptible to side-channel attacks. In contrast, Fig. 3 demonstrates that the proposed techniques effectively randomize both the amplitude and the timing of $I_{IN}$ in response to variations in $I_{CORE}$. Consequently, this removes any correlation between $I_{IN}$ and $I_{CORE}$, effectively preventing side-channel attacks. Moreover, this is achieved with a minimal power overhead of less than 4.9%.
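One way to quantify the decorrelation is the Pearson correlation between the load and input current traces, the kind of statistic that correlation-based power analysis relies on. The sketch below is a behavioral toy, not a model of the circuit: the load levels, the conversion ratio, and the uniformly random charge packets are all assumptions chosen only to show how the metric behaves.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(1)
N_CYCLES = 2000
CONVERSION_RATIO = 0.25  # hypothetical V_OUT / V_IN of an ideal buck stage

# Toy load current per switching cycle, in arbitrary units.
i_core = [rng.choice([0.2, 0.5, 1.0]) for _ in range(N_CYCLES)]

# Without masking: the average input current simply tracks the load.
i_in_plain = [CONVERSION_RATIO * i for i in i_core]

# With masking (toy model): each cycle draws a random charge packet whose size
# does not depend on the present load; a storage capacitor is assumed to
# absorb the difference. Only the long-run average is matched.
mean_in = sum(i_in_plain) / N_CYCLES
i_in_masked = [rng.uniform(0.0, 2.0 * mean_in) for _ in range(N_CYCLES)]

corr_plain = pearson(i_core, i_in_plain)
corr_masked = pearson(i_core, i_in_masked)
print(f"correlation without masking: {corr_plain:.3f}")
print(f"correlation with masking:    {corr_masked:.3f}")
```

Applying the same statistic to measured traces such as those in Figs. 2 and 3 would give a numerical counterpart to the visual comparison.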

**Keywords:** Side-channel resistance, EMI noise, IC security

#### INDUSTRY INTERACTIONS

IBM, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] K. Wei, J. W. Kwak and D. B. Ma, "An encrypted on-chip power supply with random parallel power injection and charge recycling against power/EM side-channel attacks," IEEE Transactions on Power Electronics, vol. 38, no. 1, pp. 500-509, Jan. 2023.

# TASK 2810.059, ULTRA-LOW-POWER ROBUST SAR ADC FOR PMCW AUTOMOTIVE RADAR

YUN CHIU, UNIVERSITY OF TEXAS AT DALLAS, CHIU.YUN@UTDALLAS.EDU

#### SIGNIFICANCE AND OBJECTIVES

Low-power, high sample-rate ADCs targeting reliable operation in harsh environments such as automotive radars are needed. Most prior works on SAR ADCs were focused on the core SAR design, and little attention has been paid to the peripheral circuits, namely, the input and reference buffers, which often consume more power.

#### **TECHNICAL APPROACH**

This project pursues a soft summing-node (SSN) technique, termed the "elastic" S/H structure, that alleviates the need for power-hungry ADC input drivers in order to achieve ultra-low overall power consumption. A schematic of the proposed two-step SAR ADC with SSN sampling is shown in Fig. 1. The summing node switch $M_2$ is sized down compared to its conventional counterpart, which reduces parasitics but introduces swing on the SSN, node 2. The capacitor $C_x$ captures any swing on node 2 so that it is not seen by the first stage.

#### SUMMARY OF RESULTS



Figure 1. Two-step SAR ADC with SSN technique.

A 12-b, 300MS/s two-step SSN SAR ADC prototype was taped out in a 22-nm CMOS process in January 2024. The completed layout and post-layout simulation results are reported. The simulated linearity of the first stage, including parasitic resistance, was 85 dB, as shown in Fig. 2. The simulated post-layout SNDR of the ADC, including quantization noise but not thermal noise, was 75 dB (Fig. 3). Because post-layout simulations run for a long time, a sine fit was used to evaluate the ADC performance from the limited number of simulated samples. The first stage alone could be simulated with parasitic capacitors and resistors extracted, while the overall ADC, including clocking and bypass, was simulated with parasitic capacitors only.
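A sine fit can estimate SNDR from a short, non-coherent record where an FFT would suffer from leakage. The sketch below uses a three-parameter least-squares fit at a known input frequency and treats the fit residual as noise plus distortion. This is an assumed variant, since the report does not say which fit was used, and the test signal is an ideal 12-bit quantizer with arbitrary record length, frequency, and amplitude.

```python
import math


def solve_3x3(m, v):
    """Solve m @ x = v for a 3x3 system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (a[r][3] - sum(a[r][c] * x[c] for c in range(r + 1, 3))) / a[r][r]
    return x


def sine_fit_sndr_db(samples, cycles_per_sample):
    """Three-parameter sine fit at a known frequency; returns SNDR in dB."""
    w = 2.0 * math.pi * cycles_per_sample
    basis = [(math.cos(w * n), math.sin(w * n), 1.0) for n in range(len(samples))]
    # Normal equations (B^T B) p = B^T y for p = (cosine amplitude, sine amplitude, offset).
    btb = [[sum(b[i] * b[j] for b in basis) for j in range(3)] for i in range(3)]
    bty = [sum(b[i] * y for b, y in zip(basis, samples)) for i in range(3)]
    a_cos, a_sin, offset = solve_3x3(btb, bty)
    residual = [y - (a_cos * b[0] + a_sin * b[1] + offset)
                for b, y in zip(basis, samples)]
    signal_power = (a_cos ** 2 + a_sin ** 2) / 2.0
    noise_power = sum(r * r for r in residual) / len(residual)
    return 10.0 * math.log10(signal_power / noise_power)


# Synthetic check: an ideal 12-bit quantizer, 512 samples, non-coherent tone.
# ASSUMPTION: record length, tone frequency, phase, and amplitude are arbitrary.
N_BITS, N_SAMPLES, F_IN = 12, 512, 0.1234567
lsb = 2.0 / (2 ** N_BITS)
ideal = [0.99 * math.sin(2.0 * math.pi * F_IN * n + 0.3) for n in range(N_SAMPLES)]
quantized = [round(v / lsb) * lsb for v in ideal]
sndr_db = sine_fit_sndr_db(quantized, F_IN)
print(f"SNDR from sine fit: {sndr_db:.1f} dB")
```

If the input frequency is not known exactly, a four-parameter fit that also refines the frequency would be the usual extension.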



Figure 2. SSN + residue amplifier post-layout linearity.



Figure 3. Post-layout simulation of prototype ADC chip.

The finished layout of the prototype is shown in Fig. 4.



Figure 4. Layout screenshot of prototype ADC chip.

**Keywords:** soft summing node (SSN), summing-node swing, summing-node distortion, flash TDC, background calibration

#### INDUSTRY INTERACTIONS

NXP, Texas Instruments

# TASK 2810.061, TWO-STAGE VERTICAL POWER DELIVERY AND MANAGEMENT FOR EFFICIENT HIGH-PERFORMANCE COMPUTING HANH-PHUC LE, UNIVERSITY OF CALIFORNIA AT SAN DIEGO, HANHPHUC@UCSD.EDU PATRICK MERCIER, UNIVERSITY OF CALIFORNIA AT SAN DIEGO

#### SIGNIFICANCE AND OBJECTIVES

This project seeks to significantly improve the efficiency of power delivery from high-voltage busses to scaled-CMOS-compatible voltages (<1V) in a vertically and heterogeneously integrated architecture leveraging hybrid and switched-capacitor dc-dc converters. The architecture can reduce thermal dissipation and power pins by at least 2X.

#### **TECHNICAL APPROACH**

The ambitious approach utilizes a 2-stage vertical PDM architecture with an optimal tapered current distribution, combining a 4V-to-1V switched-capacitor voltage regulator (SCVR) stage located within the package substrate, underneath the processing die, with a 20V/48V-to-4V hybrid voltage regulator module (HVRM) stage on the PCB. The SCVR is co-packaged with deep-trench capacitors, and both the SCVR and the integrated capacitor interposer dies can be thinned so that they fit within the C4 bump height. The vertical power tree architecture enables a ~2X reduction in package PDM pins with a 4X interconnect loss reduction, resulting in a ~1.5X increase in available data IO pins.

#### SUMMARY OF RESULTS



Figure 1. Two-stage vertical power delivery architecture with 48V-to-1V conversion, implemented in 65-nm (SCVR) and 180-nm (HVRM) processes.

The Gen 1 designs of the two converter chips for the proposed system were completed and measured in summer 2022 (Figs. 1-3). The SCVR implemented in 65-nm CMOS and the HVRM in a 180-nm process will be assembled using flip-chip bumping to reduce parasitic resistances and inductances. The HVRM (SCVR) design achieved 94.2% (92.5%) efficiency (Fig. 3). In comparison to state-of-the-art 2-stage works with similar voltage conversion ratios, this work achieves ~10% higher peak system efficiency, equivalent to ~44% maximum loss reduction, and demonstrates scalability for future heterogeneously integrated systems that is not achievable with a single-stage architecture.
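The relation between the efficiency gain and the loss reduction can be sanity-checked with a few lines of arithmetic. The sketch below assumes that the peak system efficiency is the product of the two reported stage efficiencies and that "~10% higher" means 10 percentage points. Both are assumptions, since the report does not state how the system figure was obtained.

```python
ETA_HVRM = 0.942  # reported Gen 1 HVRM efficiency
ETA_SCVR = 0.925  # reported Gen 1 SCVR efficiency

# ASSUMPTION: peak system efficiency is approximated by the product of the two
# stage efficiencies, ignoring interconnect loss and differing peak load points.
eta_system = ETA_HVRM * ETA_SCVR

# ASSUMPTION: "~10% higher" is read as 10 percentage points.
eta_baseline = eta_system - 0.10

loss_system = 1.0 - eta_system
loss_baseline = 1.0 - eta_baseline
loss_reduction = (loss_baseline - loss_system) / loss_baseline

print(f"system efficiency ~ {eta_system:.1%}")
print(f"loss reduction    ~ {loss_reduction:.1%}")
```

Under these assumptions the computed loss reduction lands close to the reported ~44%, which suggests the two quoted figures are mutually consistent.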

The Gen 2 designs of the two converters (Fig. 4) have several significant upgrades, including increasing the input to 48V for the HVRM and adding the 3:1 mode for fast transient response and a smooth non-linear/linear regulation mode transition in the SCVR. The SCVR chip was implemented and measured to achieve all intended operations. The HVRM was taped out in June and will be tested in Sep. 2024.



Figure 2. Gen 1 measured transient response with 1HVRM + 3SCVRs.



Figure 3. Gen 1 efficiency vs. load current for 1 HVRM + 6 SCVRs.



Figure 4. Top-level layout of the two Gen 2 converter designs.

**Keywords:** vertical power delivery, vertical power tree, DC-DC converter, switched-capacitor, hybrid converter

#### INDUSTRY INTERACTIONS

AMD, IBM, Intel, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] C. Hardy, et al., "11.1 A Scalable Heterogeneous Integrated Two-Stage Vertical Power-Delivery...," ISSCC 2023, pp. 182-184.

# TASK 2810.065, POWER-EFFICIENT AND RELIABLE 48-V DC-DC CONVERTER WITH DIRECT SIGNAL-TO-FEATURE EXTRACTION AND DNN-ASSISTED MULTI-INPUT MULTIPLE-OUTPUT FEEDBACK CONTROL MINGOO SEOK, COLUMBIA UNIVERSITY, MGSEOK@EE.COLUMBIA.EDU

#### SIGNIFICANCE AND OBJECTIVES

The goal for the second phase of this project is to extract critical features of the GaN-based DC-DC converter and then perform health monitoring, stress control, and efficiency tracking functions.

#### **TECHNICAL APPROACH**

We designed a 40-to-3V wide input range GaN-based DC-DC converter featuring three advanced techniques. First, we proposed a health monitor that senses the on-resistance ($R_{on}$) and threshold voltage ($V_{th}$) degradation of the GaN device. The monitor estimates the remaining useful life (RUL) and detects catastrophic failure of the GaN device. Second, we designed stress control circuits that relax the thermal stress and the voltage stress on the gate and drain terminals. Third, we proposed an efficiency tracking scheme for the GaN-based DC-DC converter based on gate drive voltage ($V_{drv}$) modulation.

#### SUMMARY OF RESULTS

From May 1, 2023, to April 30, 2024, we designed the second prototype chip, a GaN-based DC-DC converter. Using this chip, we will verify the health monitoring, stress control, and efficiency tracking functions.



Figure 1. Architecture of the proposed converter. It consists of one digital  $V_{out}$  regulation loop, one health monitoring and stress control loop, and one efficiency tracking loop.

Electric-vehicle batteries have a large dynamic fluctuation range, between 40V and 3V. To tolerate the wide fluctuation of $V_{in}$, we employ GaN devices in the power stage of the DC-DC converter for their high voltage tolerance and low $R_{on}$. However, GaN devices suffer from reliability issues such as $R_{on}$ and $V_{th}$ shifts. To address these and further improve the converter efficiency, we introduced three novel features to the GaN-based converter.

Fig. 1 shows the architecture of the proposed 40V-to-3V wide-input GaN-based DC-DC converter for automotive applications. It has three control loops: a voltage-mode digital $V_{out}$ regulation loop, a health monitoring and stress control loop, and an efficiency tracking loop. The voltage-mode digital $V_{out}$ regulation loop comprises a flash ADC, a digital PID controller, and a digital PWM generator. The health monitoring and stress control loop includes a $V_{ds}$ sampler, an $i_L$ amplifier, a $V_{th}$ trigger, a PTAT temperature sensor, a TDC, three ADCs, an $R_{on}$ calculation block, a junction temperature ( $T_j$ ) estimation block, a temperature calibration block, a health monitoring block, and a stress control block. This loop first senses $V_{th}$, $V_{ds}$, $i_L$, and the ambient temperature ( $T_a$ ) of the converter. Based on the sensed $T_a$, $V_{ds}$, and $i_L$, it calculates $R_{on}$ and estimates $T_j$, then calibrates $R_{on}$ and $V_{th}$ against $T_j$ to obtain junction-temperature-independent $R_{on}$ and $V_{th}$ values. The health monitoring block uses these $R_{on}$ and $V_{th}$ values to estimate the RUL and detect catastrophic failure of the GaN device. The efficiency tracking loop consists of an efficiency tracking controller and a charge pump. It tunes $V_{drv}$ under different load conditions to track the maximum efficiency point.
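The sense-calculate-calibrate chain of the health monitoring loop can be illustrated with a minimal Python sketch. It is not the on-chip implementation: the thermal resistance `R_TH_JA`, the temperature coefficient `ALPHA`, the reference temperature `T_REF`, the conduction-loss-only power estimate, and all function names are assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical constants, chosen only for illustration (not from the report).
R_TH_JA = 40.0   # junction-to-ambient thermal resistance, K/W
ALPHA = 0.008    # fractional R_on increase per kelvin above T_REF
T_REF = 25.0     # reference temperature, degC


@dataclass
class Sample:
    v_ds: float  # sensed on-state drain-source voltage, V
    i_l: float   # sensed inductor current, A
    t_a: float   # sensed ambient temperature, degC


def r_on(s: Sample) -> float:
    """On-resistance from the sensed on-state voltage and current."""
    return s.v_ds / s.i_l


def junction_temp(s: Sample) -> float:
    """Junction temperature estimate, assuming conduction loss only."""
    p_loss = s.v_ds * s.i_l
    return s.t_a + R_TH_JA * p_loss


def r_on_normalized(s: Sample) -> float:
    """R_on referred back to T_REF so that drift reflects ageing, not heating."""
    t_j = junction_temp(s)
    return r_on(s) / (1.0 + ALPHA * (t_j - T_REF))


sample = Sample(v_ds=0.05, i_l=1.0, t_a=25.0)
print(r_on(sample), junction_temp(sample), r_on_normalized(sample))
```

Under these assumptions, a health indicator could be formed by comparing the normalized value against the value recorded for a fresh device; the report does not specify how the RUL estimate itself is computed.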

**Keywords:** RUL estimation, failure recognition, efficiency tracking, ringing mitigation

#### INDUSTRY INTERACTIONS

Texas Instruments, IBM, Intel

#### MAJOR PAPERS/PATENTS

[1] Z. Wang, et al., "93.89% Peak Efficiency 24V-to-1V DC-DC Converter with Fast In-Situ Efficiency Tracking and Power-FET Code Roaming," 2023 ESSCIRC, Lisbon.

[2] M. Li, et al., "16.6 PACTOR: A Variation-Tolerant Probing-Attack Detector for a 2.5Gb/s×4-Channel Chip-to-Chip Interface in 28nm CMOS," 2024 ISSCC, San Francisco.

[3] Z. Wang, et al., "A Ten-Level Series-Capacitor 24-to-1-V DC-DC Converter With Fast In Situ Efficiency Tracking, Power-FET Code Roaming, and Switch Node Power Rail," in IEEE Journal of Solid-State Circuits.

# TASK 2810.067, HIGHLY EFFICIENT EXTREME-CONVERSION-RATIO BUCK HYBRID CONVERTERS PARTHA PANDE, WASHINGTON STATE UNIVERSITY, PANDE@WSU.EDU

DEUKHYOUN HEO AND JANA DOPPA, WASHINGTON STATE UNIVERSITY

# SIGNIFICANCE AND OBJECTIVES

This project aims to create a high-performing and energy-efficient voltage regulator with an extremely high voltage conversion ratio (VCR). Additionally, we intend to introduce a novel optimization framework leveraging machine learning (ML) technologies.

# **TECHNICAL APPROACH**

This research has three main objectives: (1) create a new, highly efficient high-conversion-ratio (HCR) single-input single-output (SISO) hybrid buck converter, (2) design a single-input multi-output (SIMO) HCR hybrid converter capable of producing a full range of output voltages while maintaining high efficiency, high power density, low cross-regulation, and fast response, and (3) create a machine learning (ML) framework for optimizing the circuit parameters of SISO and SIMO HCR hybrid converters.

# SUMMARY OF RESULTS

Our study focuses on a SIMO HCR converter employing the Inductor-First Hybrid (IFH) topology. In line with our previous year's development of a single-output IFH converter, the proposed SIMO architecture comprises a power inductor and switched-capacitor power stages (SCPSs). This setup achieves a high VCR of 32 while maintaining high efficiency.

We have investigated a method to improve the response time to load transients. With the adoption of an inductor-first structure, the required inductor current transition time is minimized compared to other hybrid structures. By applying a dynamic switching frequency ( $f_{sw}$ ) during the transitional phase, we have shortened the recovery time of the flying capacitor voltage. Additionally, our "Reversed-D Power Loss Reduction" technique reconfigures the power stage under high-duty-cycle conditions to reduce the hard-discharge power loss and the voltage drop across the power switches.

The HCR converter has been fabricated in a 180-nm CMOS process, with an area of $2\times2.5$ mm<sup>2</sup> including test pads, as shown in Fig. 1. We employed a 1.2-nH off-chip inductor with an f<sub>SW</sub> of 2.5MHz. The measured steady-state node voltage (V<sub>01</sub>) and load response from 0 to 0.2A are shown in Fig. 2. With the proposed dynamic f<sub>SW</sub> controller, reduced undershoot and overshoot voltages of -25mV and 23mV, with a 0.1% settling time of 11 $\mu s$ on both sides, are observed, which corresponds to a 45% response time improvement.



Figure 1. Die photo of the SIMO HCR converter.



Figure 2. Measured steady-state node voltages and load response.

**Keywords:** extremely high conversion ratio, buck converter, hybrid topology, efficiency enhancement

# INDUSTRY INTERACTIONS

Intel, NXP, Texas Instruments

# MAJOR PAPERS/PATENTS

[1] Z. Zhou, et al., "A Fully Integrated Analog-Assisted Digital Low-Dropout Regulator with Inverter-Based Fast Response...," Submitted to IEEE T. of Power Electronics.

[2] Z. Zhou, et al., "A Battery/USB Input Sub-1V Output Reconfigurable Hybrid High-Step-Down Converter with Reduced...," Submitted to TCAS1 after Major Revision.

[3] Z. Zhou, et al., "A Multi-Output Reconfigurable Hybrid Buck Converter with Fast Response and Reversed Duty Cycle Control for Enhanced Efficiency," Will be submitted to IEEE TCAS1.

# TASK 2810.068, ACTIVE EMI FILTERING WITH SWITCH-MODE AMPLIFIER FOR HIGH EFFICIENCY ALEX HANSON, UNIVERSITY OF TEXAS AT AUSTIN, AJHANSON@UTEXAS.EDU

#### SIGNIFICANCE AND OBJECTIVES

This project developed active EMI filtering methods based on switch-mode circuits to reduce the size and weight of conducted EMI filtering compared to passive filters and improve efficiency compared to linear-mode active filters.

#### **TECHNICAL APPROACH**

Two distinct approaches were taken. The first replaced the linear-mode amplifier in conventional AEF schemes with a class-D switch-mode amplifier. Because this approach relies on feedback, we call it the feedback approach. The second created an auxiliary circuit that draws a current equal and opposite to that of the power converter in question, cancelling the conducted EMI current. Because this circuit's switching is synchronous with the power converter, we call it the synchronous approach.

#### SUMMARY OF RESULTS

The feedback approach has the following novel features: (1) a switch-mode amplifier (2) operating at 30+ MHz with GaN devices and (3) a fractional-order filter that achieves high loop gain over a limited bandwidth. The synchronous approach bears a certain resemblance to current-steering approaches but achieves true volume reduction because the auxiliary circuit is rated for a much lower voltage than the main circuit.

Experimental results with the feedback approach demonstrated 40-60 dB of current attenuation at the first several harmonics, which reduces the volume burden of additional passive filtering by a factor of ~20. This circuit incurred a 0.4% efficiency penalty on a 120W prototype boost converter, even with a 66% ripple ratio.

Experimental results with the synchronous approach demonstrated 15-30 dB of current attenuation at the first several harmonics, which reduces the volume burden on additional passive filtering by a factor of ~5. The auxiliary circuit incurred a <0.1% efficiency penalty for a 320W prototype boost converter, even with a 75% ripple ratio.

Both methods are highly effective. The feedback approach is more general and achieves higher attenuation, while the synchronous approach has low engineering and production costs, is easily integrable, and applies to the most common grid-interface converters (boost).
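To put the attenuation figures above in linear terms, the short Python sketch below converts decibels of current attenuation into amplitude ratios and estimates the power associated with each efficiency penalty. The function names are illustrative, and the power estimate assumes the penalty is expressed as a fraction of the converter's rated power, which the report does not state explicitly.

```python
def db_to_current_ratio(att_db: float) -> float:
    """Amplitude (current) ratio for an attenuation in dB: 10^(dB/20)."""
    return 10.0 ** (att_db / 20.0)


def penalty_watts(rated_power_w: float, penalty_percent: float) -> float:
    """Approximate extra loss, assuming the penalty is relative to rated power."""
    return rated_power_w * penalty_percent / 100.0


for name, lo_db, hi_db in [("feedback", 40.0, 60.0), ("synchronous", 15.0, 30.0)]:
    lo = db_to_current_ratio(lo_db)
    hi = db_to_current_ratio(hi_db)
    print(f"{name}: harmonic current reduced by {lo:.1f}x to {hi:.1f}x")

print(f"feedback, 120 W prototype: about {penalty_watts(120.0, 0.4):.2f} W")
print(f"synchronous, 320 W prototype: below {penalty_watts(320.0, 0.1):.2f} W")
```

In these terms, 40-60 dB corresponds to a 100x to 1000x reduction in harmonic current, and 15-30 dB to roughly 5.6x to 32x.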



Figure 1. The feedback approach closely resembles conventional active EMI filter schemes with the linear amplifier replaced with a switch-mode class D amplifier operating at 30+ MHz switching frequency and 2 MHz control bandwidth.



Figure 2. The synchronous approach in which an auxiliary circuit switches synchronously with the power converter to generate equal-and-opposite ripple current at the input.

**Keywords:** electromagnetic interference, conducted EMI, power converter, power factor correction

#### INDUSTRY INTERACTIONS

Texas Instruments

# MAJOR PAPERS/PATENTS

[1] D. T. Nguyen, E. Macias and A. J. Hanson, "Active EMI Filter with Switch-Mode Amplifier for High Efficiency," 2022 IEEE Applied Power Electronics Conference and Exposition (APEC), Houston, TX, USA, 2022, pp. 443-450, doi: 10.1109/APEC43599.2022.9773582.

[2] D. T. Nguyen, C. Deng, E. Macias and A. J. Hanson, "Synchronously Switched Active EMI Filter," 2022 IEEE Energy Conversion Congress and Exposition (ECCE), Detroit, MI, USA, 2022, pp. 1-8, doi: 10.1109/ECCE50734.2022.9948006.

[3] A. J. Hanson and D. T. Nguyen, "Synchronous Switch-Mode Active Electromagnetic Interference Cancellation Circuit and Method," UT Provisional Patent #63307565, filed 7 Feb 2022.

# TASK 2810.072 / 2810.073, AI/ML EDGE HARDWARE FOR ULTRA-RELIABLE WIRELESS NETWORKS DAVID ALLSTOT, OREGON STATE UNIVERSITY, ALLSTOTD@OREGONSTATE.EDU YIORGOS MAKRIS, UNIVERSITY OF TEXAS AT DALLAS

# SIGNIFICANCE AND OBJECTIVES

The overall objective of this project is to develop area- and power-efficient on-chip real-time digital predistortion techniques for state-of-the-art RF transmitters using energy-efficient switched-capacitor power amplifiers.

# TECHNICAL APPROACH

These goals are being addressed using a real-valued time-delay neural network that comprises a fully connected input layer, hidden layers, and an output layer. The hidden part comprises 4 fully connected layers (40 neurons) with activation functions, giving 6 layers in total. Good results have been achieved using a non-quantized NN in which AM-AM and AM-PM are trained together. Simulations of the DPD are applied to the measured results of the power amplifier in Fig. 1.
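The layer structure described above can be sketched as a forward pass in plain Python. This is an illustration under stated assumptions, not the project's network: the memory depth, the width of 10 neurons per layer (the report's "40 neurons" is read here as 4 hidden layers of 10), the tanh activation, the random weights, and all function names are hypothetical.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    """Affine map y = W x + b for one layer."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def build_tdnn(memory_depth=3, width=10, n_hidden=4, seed=0):
    """Input layer + n_hidden hidden layers + output layer (6 layers by default)."""
    rng = random.Random(seed)
    n_in = 2 * (memory_depth + 1)  # I and Q of the current and delayed samples
    sizes = [n_in] + [width] * (1 + n_hidden) + [2]
    return [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]


def tdnn_forward(layers, iq_window):
    """Map a window of real-valued (I, Q) samples to one predistorted (I, Q) pair."""
    x = [v for sample in iq_window for v in sample]
    for layer in layers[:-1]:
        x = [math.tanh(v) for v in dense(x, layer)]
    i_out, q_out = dense(x, layers[-1])  # linear output layer
    return i_out, q_out


layers = build_tdnn()
window = [(0.5, -0.1), (0.4, 0.0), (0.3, 0.1), (0.2, 0.2)]  # newest sample first
print(len(layers), tdnn_forward(layers, window))
```

Feeding I and Q as separate real inputs and producing both at the output is one way to handle amplitude and phase distortion jointly; training is omitted here.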

#### SUMMARY OF RESULTS

A novel 8-core SCPA has been designed, implemented, and tested with and without DPD as indicated below. The preliminary measured results in Fig. 4 are encouraging.



Figure 1. Architecture/circuits of the 8-way power combiner.



Figure 2. Layout of the 8-way power combiner.



Figure 3. Test flow for dynamic measurements plus DPD NN.



Figure 4. Measured (a) spectrum, (b) constellation, (c) EVM and (d) ACLR of the modulated signals before and after DPD.

Future work will consider quantized neural networks. We are also considering Adjoint Network techniques for this project in collaboration with Prof. Rohrer.

**Keywords:** Switched-capacitor power amplifier, Class-G power amplifier, transformer power combiner, backoff efficiency, digital pre-distortion techniques

#### INDUSTRY INTERACTIONS

Intel, Qualcomm, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] B. Qiao, et al., "An eight-core class-G switched-capacitor power amplifier with eight power backoff efficiency peaks," IEEE RFIC Conf., 2022, pp. 1-4.

[2] N. Najim, et al., "Machine learning techniques for digital pre-distortion in CMOS switched-capacitor power amplifiers," SRC Techcon, 2022.

# TASK 2810.075, HYBRID STEP-DOWN DC-DC CONVERTER WITH LARGE CONVERSION RATIOS FOR 48V AUTOMOTIVE APPLICATIONS HOI LEE, UNIVERSITY OF TEXAS AT DALLAS, HOILEE@UTDALLAS.EDU JIN LIU, UNIVERSITY OF TEXAS AT DALLAS

#### SIGNIFICANCE AND OBJECTIVES

This research aims to develop innovative capacitor-assisted hybrid DC-DC converters to provide high power efficiency under large input-to-output voltage conversions in 48-V automotive applications. A systematic approach will also be developed to realize hybrid converters with a minimal number of low-voltage power FETs and passive components to improve the converter power density.

#### **TECHNICAL APPROACH**

We investigate both flying-capacitor multi-level and switched-capacitor-assisted converter topologies to evaluate their operational flexibility under different conditions, their requirements for voltage balancing and pre-charging of the flying capacitors, their capability to provide high power density, and their power losses. We started with our recently reported converter topology: a dual-path hybrid Dickson converter that can lower the required inductor current for a given output current. This not only decreases the inductor conduction loss but also improves the capability of delivering high output current for better power density.

#### SUMMARY OF RESULTS

The proposed 7:1 dual-path hybrid Dickson (DPHD) converter is shown in Fig. 1 below. It consists of one inductor, 10 power switches, and 6 flying capacitors. None of the flying capacitors needs to be balanced in the steady state or pre-charged at start-up. Fig. 1 also shows that the SC network shares part of the output current, thereby reducing the required inductor current and its conduction loss. The efficiency is thus improved.



Figure 1. Architecture of the proposed 7:1 dual-path hybrid Dickson converter topology [1].



Figure 2. 7:1 DPHD converter prototype.



Figure 3. Measured power efficiency of 7:1 converter.

This converter topology was validated by a prototype that converts a 36V – 65V input to an output voltage of 1 – 2V [1]. Ten 25-V MOSFETs were used as power switches in the DPHD converter prototype. The converter operates at 250kHz and delivers a load current I<sub>O</sub> of up to 32A. Because the proposed DPHD structure shares I<sub>O</sub> between the inductor and the SC network, the inductor current is reduced by 17% compared to a conventional buck converter or other single-inductor hybrid converters. Fig. 3 presents the simulated and measured power efficiency of the DPHD converter at output voltages of 1V and 2V with a V<sub>IN</sub> of 48V. Peak power efficiencies of 92.7% and 90.6% are achieved at input-to-output conversion ratios of 24 and 48, respectively. Compared to other state-of-the-art high-ratio non-isolated step-down converters, this DPHD converter supports the highest I<sub>O</sub>, provides the highest power density of 481W/in<sup>3</sup>, and achieves competitive peak power efficiencies.
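The arithmetic behind the conversion ratios and the benefit of a lower inductor current can be checked with a short Python sketch. The loss estimate is a first-order assumption only: it takes the inductor's DC resistance as unchanged and ignores current ripple, so it is not a figure reported by the project.

```python
def conversion_ratio(v_in: float, v_out: float) -> float:
    """Input-to-output voltage conversion ratio."""
    return v_in / v_out


def relative_conduction_loss(current_reduction: float) -> float:
    """I^2*R loss relative to baseline when the current drops by the given fraction.

    Assumes the same DC resistance and neglects ripple (illustrative only).
    """
    return (1.0 - current_reduction) ** 2


print(conversion_ratio(48.0, 2.0))  # ratio for the 2 V output
print(conversion_ratio(48.0, 1.0))  # ratio for the 1 V output
remaining = relative_conduction_loss(0.17)
print(f"inductor DC conduction loss falls to about {remaining:.2f} of baseline")
```

Under these assumptions, a 17% lower inductor current leaves about 69% of the original inductor DC conduction loss, a reduction of roughly 31%.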

**Keywords:** DC-DC converter, dual-path hybrid Dickson converter, high-conversion-ratio step-down converter, hybrid converter, non-isolated converter

#### INDUSTRY INTERACTIONS

Intel, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] C. Chen, et al., "Dual-path Dickson converter for high-ratio conversions in PoL Applications," *IEEE Transactions on Industry Applications*, vol. 59, no. 6, pp. 6914 – 6925, Dec. 2023.

# TASK 2810.078, PROGRAMMABLE MIXED-SIGNAL ACCELERATOR FOR DNNS WITH DEPTHWISE SEPARABLE CONVOLUTION LAYERS BORIS MURMANN, STANFORD UNIVERSITY, MURMANN@STANFORD.EDU

# SIGNIFICANCE AND OBJECTIVES

Deep neural networks (DNNs) require massively parallel and energy-efficient multiply-accumulate (MAC) circuitry. In-memory computing (IMC) has shown potential. This work has looked for a middle ground between IMC and standard digital processing by investigating both mixed-signal compute arrays and fully digital approaches. Most recently, we have developed a digital IMC processor that is superior to the traditional analog approach.

# **TECHNICAL APPROACH**

Our hardware is built to efficiently run bottleneck layers, which are used in modern DNNs such as MobileNetV2 (MBNetV2) that target tinyML applications. The key ingredient of our approach is a processing element (PE) array that intersperses partial product accumulation circuitry with local SRAM kernel storage and digital multipliers. The density of these circuits allows us to fully unroll and pipeline the operations of a bottleneck layer to (1) reduce the activation memory needed, (2) eliminate accumulation buffers, and (3) eliminate repeated weight and activation accesses (see Fig. 1).

# SUMMARY OF RESULTS

Quantization experiments on a small MBNetV2 (<60kB) for several tinyML applications have shown that 8-bit quantization results in better accuracy than 4-bit quantization for the same network size, motivating the use of digital computing in the final stage of this project. We have thus designed a kernel for a fully digital IMC approach that uses the same type of local memories as the MS approach but uses 8-bit bit-serial multiplications with a digital adder tree and accumulator.

Using the IMC kernel, we developed an end-to-end optimized processor for low-energy operation with tinyML inference workloads. We taped out and measured a prototype in 28-nm CMOS (see Fig. 2) [3]. It achieves 4.6  $\mu$ J-per-inference (with 91.6% accuracy) on the CIFAR-10 benchmark, as well as commensurate energy savings on all standard tinyML application benchmarks.

As shown in the figure, each compute memory slice contains custom latch array (CLA) memory and bit-serial arithmetic that performs 8b×8b multiplications over 8 clock cycles. The chip runs at a clock frequency of 10 MHz and achieves a throughput of 400 frames per second. The merits of using CLA-based memory are discussed in [2].
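A bit-serial multiplier of the kind mentioned above processes one bit of an operand per clock cycle and accumulates shifted partial products. The Python sketch below illustrates the idea for unsigned operands; which operand is serialized, the LSB-first order, and the unsigned format are assumptions, since the report does not give these details. The derived cycle count and average power simply combine the quoted clock frequency, throughput, and energy per inference, assuming continuous operation at that operating point.

```python
def bit_serial_multiply(activation: int, weight: int, bits: int = 8) -> int:
    """Unsigned multiply that consumes one activation bit per cycle, LSB first."""
    if not (0 <= activation < (1 << bits) and 0 <= weight < (1 << bits)):
        raise ValueError("operands must fit in the given bit width")
    acc = 0
    for cycle in range(bits):          # one loop iteration models one clock cycle
        bit = (activation >> cycle) & 1
        if bit:
            acc += weight << cycle     # add the shifted partial product
    return acc


CLOCK_HZ = 10e6                  # clock frequency quoted in the text
FRAMES_PER_S = 400               # throughput quoted in the text
ENERGY_PER_INFERENCE_J = 4.6e-6  # CIFAR-10 energy quoted in the text

cycles_per_frame = CLOCK_HZ / FRAMES_PER_S
avg_power_w = ENERGY_PER_INFERENCE_J * FRAMES_PER_S

print(bit_serial_multiply(200, 123))
print(cycles_per_frame, avg_power_w)
```

With these numbers, each frame has a budget of 25,000 clock cycles, and running continuously at 400 frames per second would correspond to an average power of about 1.84 mW.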



Figure 1. Pipelined machine learning processor architecture for bottleneck layers.



Figure 2. Prototype IC in 28-nm CMOS (4.09 mm<sup>2</sup> core area).

**Keywords:** Deep neural networks, hardware accelerators, in-memory computing, mixed-signal integrated circuits

# INDUSTRY INTERACTIONS

IBM, NXP, Qualcomm, Samsung, Texas Instruments

# MAJOR PAPERS/PATENTS

[1] W.-H. Yu, et al., "A 4-bit Mixed-Signal MAC Array with Swing Enhancement and Local Kernel Memory," MWSCAS, Aug. 2021.

[2] M. Giordano, et al., "TinyForge: A Design Space Exploration to Advance Energy and Silicon Area Trade-offs in tinyML Compute Architectures with Custom Latch Arrays," ASPLOS, Apr. 2024.

[3] R. Doshi, et al., "Medusa: A 0.83/4.6 μJ/Frame 86/91.6%-CIFAR-10 tinyML Processor with Pipelined Pixel Streaming of Bottleneck Layers in 28nm CMOS," VLSI Circuit Symposium, Jun. 2024.

SIMO allows multiple rails to share the same inductor, thus reducing the use of bulky and expensive inductors. However, it suffers from significant limitations: larger ripples, smaller power capacity, lower efficiency, and limited capability to handle fast/large current transients. This project aims to address these limitations.

# **TECHNICAL APPROACH**

In this project, two Ph.D. students are involved in designing SIMO converters with different innovation emphases. Most traditional SIMO converters operate with different fixed orders (time multiplexing or ordered power distributive control) to handle the multiple outputs with linear PWM control. However, the ordered operation creates limitations in responding to a large and fast load transient. Both designs eliminate the ordered operations with new non-linear control, such that everything is truly demand-based for maximum transient performance.

# SUMMARY OF RESULTS

The first design is a "buffet-like" non-ordered non-linear control (Fig. 1), which separates the inductor charge/discharge control from the output rotation, with higher priority given to ensuring that the inductor current is slightly more than needed. A higher sampling frequency is assigned to the inductor charge/discharge control to ensure a higher resolution for better efficiency and transient response. To further improve the load transient performance, hybrid digital LDOs are designed in parallel with each output to provide a second path that breaks the large-signal LC limitation.

The second design employs a "state-based" non-ordered non-linear control for a 3-level SIMO converter with a higher conversion ratio. This "state-based" design methodology applies not only to the 3-level power stage but, in theory, to any hybrid capacitive and inductive topology to achieve capacitor balancing and output regulation.

Both designs have been fabricated in TSMC 180-nm CMOS and measured. The buffet-like design, as shown in Fig. 2, can handle a combined load transient of 2.1A/ns, which is significantly larger and faster than the ~300mA current steps of traditional SIMO converter designs. The peak efficiency of 96.1% is also higher than the state-of-the-art [1]. In the state-based design, both the flying capacitor and the power inductor were integrated on the chip, minimizing the parasitics to ensure better efficiency and reliability. The chip achieves a higher voltage conversion ratio with 1A/1.5ns load transient handling capability [2].




Figure 1. Proposed buffet-like non-linear control.







Figure 3. Chip photos of both designs.

**Keywords:** single-inductor multiple-output (SIMO), nonlinear control, digital LDO, load transient, 3-level

# INDUSTRY INTERACTIONS

IBM, NXP, Texas Instruments

# MAJOR PAPERS/PATENTS

[1] L. Zhao, et al., "A 96.1% Efficiency Single-Inductor Multiple-Output (SIMO) Buck Converter...", Accepted in *IEEE Journal of Solid-State Circuits (JSSC)*, Early Access.

[2] J. Tang, et al., "A Monolithic Non-Linearly Controlled 3-Level Single-Inductor Multiple-Output Buck Converter with 1A/1.5ns Transient Handling Capability...", in *IEEE Custom Integrated Circuits Conference (CICC)*, Apr. 2024.

# TASK 2810.080, EFFICIENT AND HIGH-DENSITY FULLY IN-PACKAGE GAN-BASED HIGH-RATIO DC-DC CONVERTERS CHENG HUANG, IOWA STATE UNIVERSITY, CHENGH@IASTATE.EDU

#### SIGNIFICANCE AND OBJECTIVES

Three-level and double step-down (DSD) converters are two of the most popular state-of-the-art topologies for high-ratio step-down conversion. The current objective is to design new and better topologies in terms of efficiency and power density.

#### **TECHNICAL APPROACH**

The first effort mainly focuses on chip-level 48-V converter design. A 4-phase switched-capacitor (4PSC) topology [1], shown in Fig. 1, and an 8-phase switched-capacitor (8PSC) topology, shown in Fig. 2 with a chip fabricated for measurement, have been developed. The second effort mainly focuses on board-level design, with resonant and non-resonant topologies. The resonant switched tank stages and a buck stage are combined into one single stage to reuse power transistors, reducing the number of power transistors from 12 to 8 compared to the regulated switched-tank solution from Google. This improves power density and achieves voltage regulation.

#### SUMMARY OF RESULTS

The proposed 4PSC converter uses lower-rated (12V) devices to handle the higher-current stages near the output for better efficiency. In simulation [1], it showed efficiency similar to that of a DSD converter but with one fewer inductor, and much higher efficiency than a 3-level converter with the same number of inductors. The new 8PSC topology further extends this approach to use 6-V devices at the output for a ~7% efficiency improvement over the 4PSC version, which is also much better than DSD. Simulations showed ~92% efficiency for 48V-to-1V conversion. Non-linear closed-loop voltage regulation and flying capacitor balancing are also achieved. A chip has been fabricated and is currently under measurement and debugging. The chip photo is shown in Fig. 2.

The 48-V-to-1–3.3-V direct-conversion switched tank converter (Fig. 3) achieved fast closed-loop regulation, 94.7% peak efficiency, 50A maximum current capacity, and 595W/in<sup>3</sup> power density. Measured power efficiency and load-transient response are shown in Fig. 4. Only a 40 mV droop was observed with a 1-20 A load step. A new version with a higher power density has been designed and is being measured.



Figure 1. Proposed 4PSC in [1].



Figure 2. Chip prototype for the new 8PSC.



Figure 3. Direct conversion switched tank converter.



Figure 4. Measured efficiency and load transient response.

**Keywords:** high-ratio, hybrid, 48V, DC-DC converter, point-of-load converter

#### INDUSTRY INTERACTIONS

IBM, Intel, MediaTek, Richtek, Samsung, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] M. R. Khan, et al., "A Single Inductor 4-Phase Hybrid Switched-Capacitor...," in IEEE ISCAS, 2023.

[2] M. R. Khan, et al., "Analytical Comparison of 2-Phase 3-Level...," in IEEE ECCE, 2022.

[3] S. Y. Sim, et al., "A 94.7% Efficiency Direct-Step-Down Switched-Tank-Based...," in IEEE APEC, Feb. 2024.

[4] M. R. Khan, et al., "Inverting Single-Inductor Multiple-Output DC-DC Converters," US Patent, filed in 2023.

# TASK 2810.087, GRID OPTIMIZATION AND SILICON VALIDATION FOR CHIP ROBUSTNESS

FARID N. NAJM, UNIVERSITY OF TORONTO, F.NAJM@UTORONTO.CA CHRIS H. KIM, UNIVERSITY OF MINNESOTA

#### SIGNIFICANCE AND OBJECTIVES

On-chip power grid design has a direct impact on the supply voltage and in turn, on circuit timing and functionality. As the chip ages, even a well-designed grid degrades over time and causes voltage drop violations. We develop methods for grid optimization and design that guarantee grid robustness under electromigration.

#### **TECHNICAL APPROACH**

Our previous work on grid optimization [1] provided a method for grid fixing by varying the widths of metal lines, to be used once an electromigration (EM)-induced voltage drop violation has been found. In this work, we will build on [1] and address its lack of a void model and its main runtime bottleneck, namely that the objective function computation is slow. We do this by including a realistic physical void model and introducing a novel model reduction in the lines and the trees to speed up the simulation and tackle large grids.

#### SUMMARY OF RESULTS

First, some background: Metal lines fail when voids form in them, due to high tensile stress in the metal, arising from current density-induced atomic movement, called electromigration. In previous work (2016-18), we developed a simulator (called EKM) for the stress over time, which predicts when voids will appear and so the time to failure (TTF). This simulation engine, which lacks a true void growth model, was used to compute the objective function, i.e., the TTF [1]. In 2020, we developed equivalent circuit models using RC metal lines that serve as a proxy for the time-evolution of stress in the lines, leading to electromigration, so that regular circuit simulation can be used to predict the TTF. In 2021, we developed a reduced model for these RC lines, intending to reduce the simulation runtime.

Since then, we have made three contributions that improve the runtime of both the simulation engine and the overall optimization engine. **First**, note that the optimizer is a sequence of Linear Programs (LPs) that are applied at certain time points. This year, we chose the target lifetime as the time at which the LPs are run. This is potentially risky but turned out to be an excellent heuristic: very few parts of the grid need to be modified, in contrast to last year's result, in which most of the grid needed to be modified, and the runtime is also better. **Second,** we have developed a numerical approach that significantly improves the efficiency of simulation during the void growth phase. If $k$ is the number of voids (small) and $n$ is the number of grid branches (huge), the effort required to update the grid voltages at every time point is now $O(k^3 + kn)$, much better than the $O(n^2)$ complexity of last year's engine. With these two contributions implemented, the results are shown in Table 1.

| Grid Size | #LPs | Area Inc. | Runtime, Old (hrs) | Runtime, New (hrs) | Speed-up |
|-----------|------|-----------|--------------------|--------------------|----------|
| 12k       | 2     | 0.0006%    | 0.03          | 0.02  | 1.5      |
| 62k       | 1     | 0.50%      | 0.63          | 0.07  | 8.6      |
| 37k       | 2     | 0.15%      | 0.1           | 0.03  | 3.3      |
| 146k      | 4     | 1.04%      | 2.92          | 0.81  | 3.6      |
| 1.2 M     | 1     | 0.03%      | 9.78          | 2.83  | 3.5      |
| 1.6 M     | 6     | 0.74%      | 9.23          | 11.28 | 0.8      |
| 1.7 M     | 1     | 0.01%      | 4.53          | 3.14  | 1.4      |

Table 1. Runtime improvements in optimization.
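The report does not say how the $O(k^3 + kn)$ voltage update is obtained. One standard route to a cost of this form is a low-rank (Woodbury) update: if only $k$ branch conductances change, the new node voltages follow from the baseline solution, $k$ precomputed solution vectors, and a small $k \times k$ solve. The pure-Python sketch below illustrates this on a toy resistor chain; the grid, the dense solver, and all names are assumptions for illustration and are not the authors' engine.

```python
def solve(a, b):
    """Dense Gaussian elimination with partial pivoting (for small systems)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][j] * x[j] for j in range(r + 1, n))) / m[r][r]
    return x


def stamp(g, a, b, dg):
    """Add a conductance dg between nodes a and b of matrix g."""
    g[a][a] += dg
    g[b][b] += dg
    g[a][b] -= dg
    g[b][a] -= dg


def chain_grid(n, g_branch=1.0, g_tie=0.1):
    """Toy n-node resistor chain with a small tie conductance at every node."""
    g = [[0.0] * n for _ in range(n)]
    for a in range(n):
        g[a][a] += g_tie
    for a in range(n - 1):
        stamp(g, a, a + 1, g_branch)
    return g


def precompute(g, currents, branches):
    """One-time work: baseline voltages v0 and z_j = G^-1 u_j per voided branch."""
    n = len(g)
    v0 = solve(g, currents)
    z = []
    for a, b in branches:
        u = [0.0] * n
        u[a], u[b] = 1.0, -1.0
        z.append(solve(g, u))
    return v0, z


def update(v0, z, branches, dg):
    """Voltages after changing k branch conductances by dg (all nonzero)."""
    k = len(branches)
    # Small k x k system: (diag(1/dg) + U^T G^-1 U) y = U^T v0
    s = [[z[j][a] - z[j][b] + (1.0 / dg[i] if i == j else 0.0) for j in range(k)]
         for i, (a, b) in enumerate(branches)]
    rhs = [v0[a] - v0[b] for a, b in branches]
    y = solve(s, rhs)                                  # O(k^3)
    return [v0[r] - sum(z[j][r] * y[j] for j in range(k))
            for r in range(len(v0))]                   # O(kn)


n = 8
grid = chain_grid(n)
currents = [0.1 * (r + 1) for r in range(n)]
voided = [(2, 3), (5, 6)]
v0, z = precompute(grid, currents, voided)
v_fast = update(v0, z, voided, dg=[-0.5, -0.9])
print(v_fast)
```

The precomputation is done once; at each later time point only `update` runs, so its cost depends on the number of voids rather than on a fresh solve of the full grid.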

**Third**, our contribution in the 2<sup>nd</sup> half of the year builds on and extends work that was published in ICCAD-23, leading to a reduced model for whole interconnect trees, rather than per-line models. That previous work did not include the void growth phase, but we have found a way to include it, using our equivalent circuit model. The simulation runtimes are significantly reduced (Table 2).

Table 2. Simulation runtime speed-ups (reduced model).

| Grid Size | CPU Time (hr:min:sec), PIRC20 | CPU Time (hr:min:sec), PACTN5 | CPU Time (hr:min:sec), New Model | Speed-up Relative to PIRC20 | Speed-up Relative to PACTN5 |
|-----------|----------|----------|----------|--------|--------|
| 37k       | 00:01:05 | 00:00:25 | 00:00:04 | 16 X   | 6 X    |
| 146k      | 00:17:00 | 00:08:00 | 00:01:50 | 9 X    | 4 X    |
| 560k      | 00:19:00 | 00:06:45 | 00:00:53 | 22 X   | 8 X    |
| 1.2M      | 01:28:00 | 00:45:00 | 00:01:37 | 54 X   | 28 X   |
| 1.6M      | 01:38:00 | 00:13:30 | 00:02:31 | 15 X   | 5 X    |
| 1.7M      | 01:01:00 | 00:23:20 | 00:01:14 | 50 X   | 19 X   |
| 2.6M      | 02:27:00 | 00:51:00 | 00:03:23 | 43 X   | 15 X   |
| 3.2M      | 02:00:00 | 00:40:00 | 00:11:00 | 11 X   | 4 X    |

**Keywords:** integrated circuits, electromigration, stress, reliability, optimization

#### INDUSTRY INTERACTIONS

Intel, NXP, Siemens EDA, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] Z. Moudallal, V. Sukharev & F.N. Najm, "Power grid fixing for EM-induced voltage failures," ICCAD, Nov 2019.

# TASK 2810.088, GRID OPTIMIZATION AND SILICON VALIDATION FOR CHIP ROBUSTNESS CHRIS KIM, UNIVERSITY OF MINNESOTA, CHRISKIM@UMN.EDU

FARID NAJM, UNIVERSITY OF TORONTO

# SIGNIFICANCE AND OBJECTIVES

Electromigration in a power grid is a significant reliability concern. However, characterizing the power grid EM behavior and collecting IR drop shifts as real silicon data is non-trivial. We analyzed early 28-nm CMOS EM silicon data with a circuit-based test vehicle to overcome the issue.

# **TECHNICAL APPROACH**

EM test chips are designed to collect IR drop trends under EM stress and analyze time-to-failure (TTF) statistics under different conditions. Experiments are done with multi-threaded automated test software that our group has developed to perform accurate and stable testing on four power grid structures under multiple temperatures and current stress conditions.

#### SUMMARY OF RESULTS



Figure 1. SEM image of test chip showing an EM-induced void as well as other non-EM damage. (courtesy: S. Moreau @ CEA-Leti)

Fig. 1 and Fig. 2 show SEM cross-sectional images of the EM test chip after stressing. EM-related voids and extrusion were observed, as well as other non-EM damage. The majority of the damage consists of voids that formed in between the metal routing. Extensive melting and vaporization of the layers surrounding the heater suggest that the chip is unable to withstand the high heater temperatures used. Lower heater temperatures need to be used in future testing to prevent non-EM-related damage. Cycling tests were also performed to better understand other mechanisms that can influence EM. Fig. 3 shows the basic results of temperature and current cycling compared to the baseline. Temperature cycling attempts to further stress the power grid through repeated thermal expansion and contraction, thereby increasing void nucleation. Current cycling tests were done to determine whether the power grid can recover when the stress is temporarily removed. Initial results show that temperature cycling has no significant effect on TTF, whereas current cycling can significantly increase the TTF. Additional data is needed to confirm these results.



Figure 2. SEM image showing melted and vaporized material on the heater. (courtesy: S. Moreau @ CEA-Leti)



Figure 3. Power grid resistance progression for temperature and current cycling.

**Keywords:** Power grid, IR noise, electromigration lifetime, silicon validation, physical design

#### INDUSTRY INTERACTIONS

Intel, Siemens EDA, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] Y. Yi, A. Kteyan, A. Volkov, S. Moreau, V. Sukharev, and C.H. Kim, "Electromigration Test Chip Experiments From Realistic Power Grid Structures: Failure Trend Comparison and Statistical Analysis," International Reliability Physics Symposium (IRPS), 2024.

[2] A. Kteyan, V. Sukharev, A. Volkov, J. Choy, F. Najm, Y. Yi, C.H. Kim, and S. Moreau, "Electromigration Assessment in Power Grids with Account of Redundancy and Non-Uniform Temperature Distribution," International Symposium on Physical Design (ISPD), 2023.

# TASK 2810.092, BATTERY-CHARGING CMOS VOLTAGE REGULATOR FOR RESISTIVE DC SOURCES GABRIEL A. RINCÓN-MORA, GEORGIA INSTITUTE OF TECHNOLOGY, RINCON-MORA@GATECH.EDU

#### SIGNIFICANCE AND OBJECTIVES

Small dc sources are resistive, so the power they supply is limited. The aims of this research are to explore how the maximum power point (MPP) shifts, how this MPP can be tracked, and how a CMOS voltage regulator can track this MPP, supply a load, and recharge a battery.

#### **TECHNICAL APPROACH**

The first objective is to determine how power losses in the system shift the MPP. Understanding this will reveal what should and should not be done when tracking the MPP. Designing and integrating a low-power controller with a power-efficient stage that switches an off-chip inductor are next. The system must track the MPP, regulate the output, draw battery assistance when input power is deficient, and recharge the battery when excess input power is available. The CMOS system should incorporate amplifiers, comparators, references, and other circuit blocks that require low power to operate and short (duty-cycled) periods to perform their functions.

#### SUMMARY OF RESULTS

Small thermoelectric generators (TEGs) and glucose fuel cells (FCs) are slow and resistive. They deliver the most power when the loaded TEG voltage (input $v_{IN}$ of the harvester) is half the unloaded TEG voltage (effective source voltage $v_S$). Output power peaks at a lower $v_{IN}$ because the charger itself loses power. But source resistance $R_S$ is so high that this charger loss can be negligible. As a result, the system delivers the most power when $v_{IN}$ nears $0.5v_S$.
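As a quick check of the half-voltage condition, the sketch below sweeps $v_{IN}$ for an ideal resistive source and locates the input voltage that draws the most power. It ignores charger losses, and the values of `V_S` and `R_S` are hypothetical placeholders rather than measured TEG parameters.

```python
def input_power(v_in, v_s, r_s):
    """Power drawn from a source v_s with series resistance r_s
    when the harvester holds its input at v_in (charger loss ignored)."""
    return v_in * (v_s - v_in) / r_s


def sweep_mpp(v_s, r_s, steps=1000):
    """Brute-force sweep of v_in from 0 to v_s; returns (v_in, power) at the peak."""
    best_v, best_p = 0.0, 0.0
    for k in range(steps + 1):
        v_in = v_s * k / steps
        p = input_power(v_in, v_s, r_s)
        if p > best_p:
            best_v, best_p = v_in, p
    return best_v, best_p


V_S = 0.2      # hypothetical unloaded source voltage, in volts
R_S = 1000.0   # hypothetical source resistance, in ohms

v_mpp, p_mpp = sweep_mpp(V_S, R_S)
print(f"MPP at v_IN = {v_mpp:.3f} V, P = {p_mpp * 1e6:.1f} uW")
```

For this lossless model the peak sits at $0.5v_S$ with power $v_S^2/(4R_S)$; any charger loss would pull the output-power peak toward a lower $v_{IN}$, as the text notes.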

Disconnecting the system from the TEG to sense $v_S$ sacrifices the power that $v_S$ can deliver across this sensing period. But since $v_S$ changes very slowly, the sensing frequency can be very low. The average power lost between sensing events can therefore be negligible.

This way, the charger in Fig. 1 can supply the battery  $v_B$  without losing much power to sensing events or the charger.  $C_S$  and  $C_H$  sample-and-hold  $0.5v_S$  and the maximum power-point (MPP) comparator  $CP_{MPP}$  keeps  $v_{IN}$  near this  $0.5v_S$ . The charge pump CP charges a temporary supply  $v_T$  that is used to switch the inductor  $L_X$  that draws and transfers  $v_{IN}$  power to  $v_B$ . When  $v_B$  is sufficiently high,  $v_B$  takes over for  $v_T$  until  $v_B$  is fully charged.

Threshold circuit  $v_{TH}$  senses when  $v_T$  first reaches 1.8 V in Fig. 2 to indicate the system is awake. Past this point, when starting,  $v_T$  supplies the energy that switches  $L_X$ . This way,  $L_X$  supplies alternating energy packets to  $v_B$  and  $v_T$ .



Figure 1. Maximum-power-point-tracking CMOS charger.



Figure 2. Wake, start, and static waveforms.

$C_T$ discharges and charges this way until $v_B$ reaches the system's headroom $v_{HR}$, which is when $v_B$ takes over for $v_T$. So $v_B$ supplies the energy needed to switch $L_X$. This startup process continues until $v_B$ can supply a load in static mode.

**Keywords:** MPP tracker, charger, regulator, harvester

#### INDUSTRY INTERACTIONS

Intel, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] X. Li and G.A. Rincón-Mora, "Maximum Power-Point Theory for Thermoelectric Harvesters," MWSCAS '23.

In this project, we developed a real-time and energy-efficient ASR accelerator for highly accurate on-device AI applications. Our preliminary studies found that the transformer model performs best but requires a massive number of external memory accesses (EMA). Our objective is a substantial reduction of EMA by adopting hardware-friendly algorithmic strategies.

#### **TECHNICAL APPROACH**

Conceptually, the EMA depends on the total number of parameters, data precision, compression ratio, and reusability. We first replaced all the weights with one large set of layer-wise shared parameters and tiny distinct parameters so that the total number of parameters could be substantially reduced while maintaining accuracy. We then applied different quantization and encoding schemes to the other data types to compress the overall data size. Finally, we devised a batch-processible datapath that supports different numbers of inputs per batch for various input lengths. It offers greater data reusability when the input length is short.

#### SUMMARY OF RESULTS

The overall parameter count was reduced by replacing all the weights with layer-wise shared parameters (denoted as D) and tiny distinct parameters (denoted as Z). Every matrix multiplication between input (denoted as X) and weight in the vanilla transformer is replaced with two sequential matrix multiplications, XDZ, involving three matrices. X and D are dense, but Z is highly sparse, with a sparsity of around 0.85. All the layers share D, so it is used again for the computations in the following layer. To avoid loading D multiple times, we placed a dedicated buffer for D, whose contents do not change during execution. As a result, the effective EMA is dominated by the runtime data transfer of Z, and it could be reduced by 13x compared to the vanilla transformer.
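The XDZ factorization can be illustrated with a small pure-Python sketch. The matrix sizes and values below are hypothetical toy numbers chosen for readability (they do not reflect the accelerator's dimensions or its roughly 0.85 sparsity); the point is only that a dense shared D stays resident while each layer fetches just the nonzero entries of its own Z.

```python
def matmul(a, b):
    """Plain dense matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def to_sparse(z):
    """Store a sparse matrix as (row, col, value) tuples for its nonzeros."""
    return [(i, j, v) for i, row in enumerate(z)
            for j, v in enumerate(row) if v != 0]


def dense_times_sparse(a, z_sparse, n_cols):
    """Multiply a dense matrix by a sparse matrix given as (row, col, value)."""
    out = [[0] * n_cols for _ in a]
    for i, j, v in z_sparse:
        for r, row in enumerate(a):
            out[r][j] += row[i] * v
    return out


X = [[1, 2], [3, 4]]                    # input activations (toy values)
D = [[1, 0, 2, 1], [0, 1, 1, 3]]        # shared dense matrix, kept on chip
Z = [[1, 0], [0, 0], [0, 2], [0, 0]]    # layer-specific sparse matrix

Z_sparse = to_sparse(Z)                 # only these entries are fetched per layer
XD = matmul(X, D)                       # first multiplication uses the resident D
Y = dense_times_sparse(XD, Z_sparse, n_cols=2)

print(Y)
print(f"values fetched for this layer: {len(Z_sparse)} of {len(Z) * len(Z[0])}")
```

The result matches multiplying X by the equivalent dense weight DZ, while the per-layer traffic is limited to the nonzeros of Z.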



Figure 1. Algorithmic co-optimization to reduce the overall number of parameters.

To further reduce the size of the overall parameters, we applied data-specific compression. A non-uniform quantization based on k-means clustering was used for the compression of D. We found no accuracy degradation up to the 4-bit quantization of D. Next, the sparse matrix Z is stored as an index-value tuple, where the index must be able to represent any integer ranging from 0 to 255, requiring 8-bit precision. Inspired by delta encoding, the original index sequence was encoded to have smaller values, up to 5-bit on average. The total parameter size was finally reduced by 6.5x compared to the 32-bit baseline.
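A minimal sketch of the delta-encoding idea follows, using the example index sequence shown in Fig. 2. It assumes the indices of the nonzero entries are stored in ascending order so that every difference is non-negative; the function names are illustrative and not taken from the accelerator's implementation.

```python
def delta_encode(indices):
    """Keep the first index, then store each index as the gap from its predecessor."""
    if not indices:
        return []
    return [indices[0]] + [b - a for a, b in zip(indices, indices[1:])]


def delta_decode(deltas):
    """Rebuild the original indices with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def bits_needed(values):
    """Smallest unsigned bit width that can hold every value in the list."""
    return max(1, max(v.bit_length() for v in values))


original = [5, 12, 18, 26, 32]
encoded = delta_encode(original)

print(encoded)
print(bits_needed(original), "bits ->", bits_needed(encoded), "bits")
```

For this short sequence the encoded values fit in fewer bits than the raw indices, which is the effect the index compression relies on.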

Assuming a batch of 2 inputs can be processed simultaneously on a chip, the same parameter can be reused two times, and the effective EMA is reduced accordingly. The only difference between the two types of batch processing, which depend on the input length, is the datapath in the attention layer. By making the accelerator support both datapaths, we achieved a 1.6x EMA reduction over the whole inference of the LibriSpeech dataset.

[Figure 2 panels: (left) non-uniform quantization of the original data without accuracy drop; (right) delta encoding, in which the original high-precision index sequence 5, 12, 18, 26, 32 (differences +7, +6, +8, +6) is stored as the low-precision encoded sequence 5, 7, 6, 8, 6.]

Figure 2. Data-specific compression: non-uniform quantization for D and delta encoding for index of Z.



Figure 3. Two different datapaths for a single normal input sentence (top) and two short input sentences (bottom).

**Keywords:** automatic speech recognition, transformer, algorithmic co-optimization, data-specific compression, batch processing

#### INDUSTRY INTERACTIONS

Intel, NXP

#### MAJOR PAPERS/PATENTS

[1] S. Moon et al., "A sub-uJ/token transformer-based speech-to-text accelerator ...," to be submitted to ISSCC 2025.

Cryogenic computing provides a better subthreshold slope, higher on current, and lower interconnect resistance, all without the baggage of dimension scaling, making it a promising approach for improving CPU performance in the post-Moore's law era. This project aims to address technological barriers to adopting cryogenic (e.g., 77K) CPU operation.

#### **TECHNICAL APPROACH**

A RISC-V chip was taped out in 28-nm technology in December 2023. An open-source RISC-V RTL is used for the logic. Figure 1 (Top) shows the GDS illustration of the chip, together with the technical summary. Each chip implements two identical cores to save area. Each core can run independently with its own instruction cache (\$I) and data cache (\$D). Each core includes 18 metal-based temperature sensors for local temperature characterization, and 84 high power-density ring oscillators are placed across the entire RISC-V region to mimic high-performance CPU power consumption.

#### SUMMARY OF RESULTS

The communication between the RISC-V core and the host computer is handled by the AXI controller. A programmable digitally controlled oscillator (DCO) with a 10-ps tuning resolution (according to schematic simulation) is placed outside the core area. The DCO can be programmed through the AXI controller and provides the clock signal for both core #1 and core #2. Figure 1 (Middle) shows more details about the temperature sensors and the high power-density ring oscillators. The temperature sensors are four-terminal measurement structures, and analog switches select each of them individually. The ring-oscillator-based structure, a 5-stage design, is proposed to mimic real CPU self-heating behavior. Each high power-density unit can be turned on or off individually, and the units can be switched on simultaneously to achieve a high power density owing to their small form factor and high switching activity. The programmable DCO is implemented purely from the standard cell library provided by the foundry, making it compatible with the synthesis and APR flow; its functionality can therefore be verified easily using industry-standard verification tools such as VCS and PrimeTime.

Figure 1 (Bottom) shows the testing system we have built. It consists of a motherboard, a ribbon cable connector, a Raspberry Pi for control signals, a chamber for conventional temperature range (-20°C to 120°C) testing, and a liquid nitrogen dewar for cryogenic testing. The motherboard collects all the different signals (digital, analog, power) and feeds them to the RISC-V chip through two 40-pin ribbon cables. The Raspberry Pi is mounted directly onto the motherboard and can be programmed remotely to control all testing equipment using Python.








Figure 1. (Top) GDS illustration and technical summary of the 28-nm testing vehicle. (Middle) Detailed implementation of custom components (temperature sensor, high power-density unit and DCO) supporting the RISC-V core. (Bottom) Setup for room-temperature testing; cryogenic testing will be performed in the next step with the RISC-V chip inside LN2.

Keywords: Cryogenic, ring-oscillator array, CPU performance improvement, on-chip heater, self-heating effect

#### INDUSTRY INTERACTIONS

Intel, Texas Instruments

There is a large gap between the data rate, interference rejection, and sensitivity of ultra-low power (ULP) transceivers, and the radio requirements to meet today's wireless standards. This project will demonstrate new ULP transceivers that support higher order modulation, and higher data rates at ULP levels.

#### **TECHNICAL APPROACH**

New ULP transceiver architectures and signaling that achieve higher modulation indexes are being investigated, focusing initially on QAM and OFDM signaling because they are common in wireless standards. Energy-detection receivers are prioritized for their low power consumption. After an architecture and signal are selected and simulated, a test chip will be fabricated for verification. The use of the first-year work on phase noise cancellation in ULP transmitter and receiver links is being evaluated. This year, two-tone PSK signaling is being investigated for its improved spectral efficiency and higher data rate.

#### SUMMARY OF RESULTS

The research effort is subdivided into two phases, as shown in Fig. 1: development of 1) an efficient two-tone PSK transmitter, and 2) a ULP ED-first PSK receiver. It was observed that the phase noise (PN) present on the transmitter can be canceled at the receiver with two-tone modulation. This relaxes the PN requirements of the transmitter, saving total power. Because PN is canceled at the receiver, the resulting higher transmitter PN does not compromise using PSK to achieve higher-order modulation.



Figure 1. Preliminary architecture for Two-Tone PSK transmitter (phase 1) and receiver (phase 2).

Fig. 2 shows a plot of the efficiency of transmitters in the literature vs. output power. The efficiency is reduced as the output power is lowered. Below 0-dBm output power, none of these results achieve an efficiency > 32%. Improvements can be made by utilizing novel signaling that is compatible with high-efficiency architectures.



Figure 2. Efficiency of state-of-the-art ULP transmitters in the literature versus output power.

Envelope modulation is included to encode a two-tone PSK signal. In doing so, a switch-mode power amplifier is used to improve the power amplifier efficiency. Using two-tone modulation allows the receiver to downconvert the signal with ULP using a squaring circuit via intermodulation. The two-tone modulation sends the signal over the air such that it can be referenced to itself, eliminating the need for a local oscillator to be present at the receiver. It also enables near-coherent ULP downconversion, thereby canceling the phase noise common to the tones while maintaining the PSK information.
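The self-referencing idea can be checked numerically with a short sketch: two tones carry the same phase noise, the receiver squares their sum, and the PSK phase is read from the difference-frequency term. All numbers here (tone frequencies, sample rate, and the sinusoidal phase-wander model) are hypothetical and in arbitrary units; this is not the project's MATLAB model.

```python
import cmath
import math

FS = 2000.0            # sample rate (arbitrary units, assumed)
F1, F2 = 100.0, 110.0  # the two transmitted tones (assumed)
N = 2000               # one second of samples = 10 periods of F2 - F1


def common_phase_noise(t):
    """Hypothetical slow phase wander shared by both tones (radians)."""
    return 0.8 * math.sin(2 * math.pi * 3.0 * t)


def recover_phase(theta):
    """Square the two-tone signal and read the phase at the difference frequency."""
    acc = 0j
    for n in range(N):
        t = n / FS
        pn = common_phase_noise(t)
        s = (math.cos(2 * math.pi * F1 * t + pn)
             + math.cos(2 * math.pi * F2 * t + pn + theta))
        squared = s * s  # squaring creates a term at F2 - F1 with phase theta
        acc += squared * cmath.exp(-2j * math.pi * (F2 - F1) * t)
    return cmath.phase(acc)


qpsk_phases = [math.pi / 4, 3 * math.pi / 4, -3 * math.pi / 4, -math.pi / 4]
recovered = [recover_phase(th) for th in qpsk_phases]
for sent, got in zip(qpsk_phases, recovered):
    print(f"sent {sent:+.3f} rad, recovered {got:+.3f} rad")
```

Because the wander appears identically on both tones, it drops out of the difference term, so the recovered phase tracks the transmitted PSK phase without a local oscillator at the receiver.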

A system-level MATLAB simulation of the proposed transmitter-to-receiver phase noise cancellation was performed. It also verified that the PSK information is preserved through the link. Lastly, a two-tone PSK transmitter with a switch-mode PA was designed using the TSMC 65GP process. The transmitter is optimized for 4-dBm output power. At -2-dBm output power, the simulated transmitter efficiency is roughly 40%.

Keywords: ULP, Transmitter, Receiver, PSK, Two-Tone

#### INDUSTRY INTERACTIONS

Intel, Mediatek, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] A. Gupta, T. J. Odelberg and D. D. Wentzloff, "Low-Power Heterodyne Receiver Architectures: Review, Theory, and Examples," in IEEE Open Journal of the Solid-State Circuits Society, vol. 3, pp. 225-238, 2023.

## TASK 3160.016, MODO: HYBRID SIMO-DLDO DC-DC CONVERTER FOR MULTI-CORE MICROPROCESSORS AND SYSTEM-ON-CHIPS MINGOO SEOK, COLUMBIA UNIVERSITY, MGSEOK@EE.COLUMBIA.EDU

#### SIGNIFICANCE AND OBJECTIVES

Due to their poor transient performance, SIMO DC-DC converters have not been widely used in multi-core processors. This project aims to develop a hybrid SIMO-DLDO (MODO) power management architecture by proposing a MODO converter that achieves a high PCE and implementing event-driven control to improve dynamic load regulation performance.

#### **TECHNICAL APPROACH**

Digital LDO's fast transient response will be incorporated into DC-DC converters. Conventional DC-DC converters' dynamic regulation performance is limited due to the continuity of inductor current. With an assisting DLDO, this limitation can be overcome. When load changes occur, the DLDO will supply the load current and cut off the BUCK converters' loop by setting the PWM duty cycle to 1; then it will take over the regulation of the output node and slowly close its power switches. When the inductor current is larger than the load current, DLDO will be disabled.
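The hand-off described above can be sketched as a two-state machine for the load-increase case. This is only an illustrative reading of the paragraph, not the taped-out FSM: the state names, the `load_step_detected` flag, and the output dictionary are all hypothetical.

```python
from enum import Enum


class State(Enum):
    BUCK_ONLY = "buck_only"      # buck loop regulates, DLDO disabled
    DLDO_ASSIST = "dldo_assist"  # DLDO supplies the load, PWM duty forced to 1


def next_state(state, load_step_detected, i_inductor, i_load):
    """Advance the assist state machine by one control step."""
    if state is State.BUCK_ONLY and load_step_detected:
        return State.DLDO_ASSIST
    if state is State.DLDO_ASSIST and i_inductor > i_load:
        # The inductor current has caught up with the load: release the DLDO.
        return State.BUCK_ONLY
    return state


def outputs(state):
    """Control signals implied by each state."""
    assisting = state is State.DLDO_ASSIST
    return {"dldo_enabled": assisting, "pwm_duty_forced_to_1": assisting}


# Hypothetical trace: a load step to 1.2 A while the inductor current ramps up.
state = State.BUCK_ONLY
trace = []
for step, i_l in enumerate([0.1, 0.5, 0.9, 1.3, 1.3]):
    state = next_state(state, load_step_detected=(step == 0),
                       i_inductor=i_l, i_load=1.2)
    trace.append(state)
print([s.value for s in trace])
```

The sketch covers only a load increase; the load-decrease path is omitted because the paragraph does not detail it.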

#### SUMMARY OF RESULTS

In the last year, the DLDO model was revised and submitted to TVLSI. A digital LDO featuring load-dependent feedback and feedforward control was also measured. It shows a wide dynamic load range (2,478X) and fast transient performance (1.94-pC FoM) (Fig. 1). The results have been submitted to ESSERC 2024.



Figure 1. Dynamic load regulation performance.

Lastly, a DLDO-assisted BUCK converter (DABUCK) was taped out for fabrication. Fig. 2 shows the architecture of the DABUCK. It includes a conventional voltage-control BUCK with type-III compensation, a DLDO in parallel with the BUCK converter, a TDC for feedforward control and control logic, and an FSM that generates signals to control the PWM and LDO state.



Figure 2. Architecture of the DABUCK

Fig. 3 shows the assisting DLDO circuit. It consists of a PMOS array for an undershoot event and an NMOS array for an overshoot event. Each of them applies integral and feedforward control.





The post-layout simulation shows a 72mV/70mV droop/overshoot voltage upon a 1.1A/1ns load increase/decrease. The peak power efficiency is 94.4%.

**Keywords:** BUCK, DLDO, feedforward control, Integral control, dynamic load regulation

#### INDUSTRY INTERACTIONS

IBM, Intel, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] Y. Xu et al., "Model-Based Study on the Limit of the Dynamic Load Regulation Performance of a Digital Low Dropout Regulator," submitted to TVLSI 2024.

[2] Y. Xu et al., "A 28-nm, 1.94-pC FoM, 2478X Load Range Digital LDO Featuring Load-Dependent Feedback and Feedforward Control," submitted to ESSERC 2024.

## TASK 3160.017, MULTI-PHASE SUB-100FS JITTER RING-OSCILLATOR-BASED CLOCK MULTIPLIERS FOR BEYOND 100GB/S LINKS PAVAN HANUMOLU, UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN, HANUMOLU@ILLINOIS.EDU

#### SIGNIFICANCE AND OBJECTIVES

Sub-rate serial link transceivers are increasingly favored for overcoming bandwidth constraints. Yet, this method requires routing a high-frequency clock signal surpassing 14 GHz. We aim to develop a frequency multiplier and multi-phase generator capable of minimizing jitter to less than 100 fs r.m.s.

#### **TECHNICAL APPROACH**

We introduce methodologies to improve the phase noise performance and enhance supply noise immunity in ring-oscillator-based PLLs operating at frequencies above 10 GHz. Central to our approach are key design techniques such as employing a sampling phase detector to minimize in-band phase noise and adopting a low-noise multiphase ring oscillator (RO) design to mitigate out-of-band noise. Additionally, we implement a type-III supply-regulated architecture to broaden the frequency range and alleviate sensitivity to process, voltage, and temperature variations.

#### SUMMARY OF RESULTS

Figure 1 illustrates the proposed PLL architecture, comprising a sampling phase detector (SPD), two integrators, a voltage-controlled ring oscillator (VCRO) with frequency tuning via supply voltage and varactor adjustments, and an integer-N divider. The SPD is constructed using a slope generator and a track-and-hold circuit to mitigate reference spurs and minimize undesired effects like clock feedthrough. This is achieved by strategically placing the PMOS tracking switch ($M_1$) within the slope generator. Upon the positive edge of the reference clock (REF), the sampling capacitor ($C_s$) is charged through resistor $R_s$. Subsequently, upon the positive edge of the feedback clock (FB), the exponentially rising voltage is sampled onto $C_s$, capturing the phase difference between the REF and FB clocks. The $R_s$ (60 $\Omega$) and $C_s$ (1.2 pF) values were selected to ensure high SPD gain and reduced kT/C noise, limiting the SPD's noise contribution to -145 dBc/Hz. The voltage across $C_H$ corresponds to the proportional control voltage ($V_P$), while the integral control voltage ($V_1$) is established by the $G_{M1}$-$C_1$ integrator. This integrator also ensures that $V_P$ equals $V_{REF1}$ when the PLL is locked. An additional integrator ($G_{M2}$-$C_2$) and an NMOS-based regulator have been integrated into the PLL to mitigate supply and temperature sensitivity.
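For a sense of scale, the sketch below computes the time constant and the sampled kT/C noise implied by the stated 60 Ω and 1.2 pF values. It assumes a simple first-order RC charging model, a temperature of 300 K, and a normalized final voltage; none of these assumptions come from the report.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K
T = 300.0           # assumed temperature in kelvin

R_S = 60.0          # slope-generator resistor from the text, ohms
C_S = 1.2e-12       # sampling capacitor from the text, farads

tau = R_S * C_S                        # RC time constant of the slope
v_noise_rms = math.sqrt(K_B * T / C_S)  # sampled kT/C noise voltage


def slope_voltage(t, v_final=1.0):
    """First-order RC charging curve, normalized to v_final (an assumption)."""
    return v_final * (1.0 - math.exp(-t / tau))


print(f"tau = {tau * 1e12:.0f} ps")
print(f"kT/C noise = {v_noise_rms * 1e6:.1f} uV rms")
print(f"slope after one tau = {slope_voltage(tau):.3f} of final value")
```

Under these assumptions the slope has a 72-ps time constant and the sampled noise is a few tens of microvolts rms, which illustrates why a larger $C_s$ lowers the SPD noise floor at the cost of a slower slope.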



Figure 1. Proposed type-III PLL architecture.

A prototype PLL was fabricated using a 22-nm FinFET technology and housed in a plastic QFN package. The PLL was locked to a REF clock (812.5MHz) produced by an Analog Devices evaluation board (ADF4377). Figure 2 showcases the PLL phase noise at 13GHz output frequency, depicted in two operational modes: type-II and type-III. Only a marginal increase of jitter was observed (67.5fs in the type-II mode vs. 69.3fs in the type-III mode) across an integration range from 10kHz to 100MHz. Of the total 69.3fs of jitter, 52fs can be attributed to the PLL, while the remaining portion arises from the reference clock path.
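The split between PLL and reference-path jitter can be estimated with a root-sum-square decomposition. The sketch below assumes the two contributions are uncorrelated, which the report does not state explicitly, so the result is an estimate rather than a reported measurement.

```python
import math


def rss_remainder(total_fs, part_fs):
    """Remaining rms jitter if total and part combine as uncorrelated sources."""
    return math.sqrt(total_fs ** 2 - part_fs ** 2)


TOTAL_TYPE3_FS = 69.3  # measured total jitter in type-III mode (from the text)
PLL_ONLY_FS = 52.0     # portion attributed to the PLL (from the text)

ref_path_fs = rss_remainder(TOTAL_TYPE3_FS, PLL_ONLY_FS)
print(f"estimated reference-path jitter: {ref_path_fs:.1f} fs rms")
```

Under this assumption the reference clock path would account for roughly 46 fs rms of the total.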



Figure 2. Measured phase noise plots.

Keywords: ring PLL, low jitter, type-III response

#### INDUSTRY INTERACTIONS

Intel, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] M. Khalil et al., "A 69.3fs ring-based sampling-PLL achieving 6.8GHz – 14GHz and -54.4dBc spurs under 50mV...," 2023 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2023.

Synthesis, Auto-Place, and Route (SAPR) dominates the modern SoC construction methodology. However, energy-efficient SoCs require automation of integrated Voltage Regulation (VR) to produce fine-grained voltage and clock domains through FLL/PLLs. This effort explores and devises an all-digital domain compiler to generate clock and voltage-regulated domains.

#### **TECHNICAL APPROACH**

The effort is organized into two thrusts. **Thrust 1:** We will build upon our $V_{dd}$-droop-tolerant and fast-response UniCaP-2 construction (Fig. 1(b)) to explore and develop a framework that automates the construction of robust, larger, all-digital domains. User-provided constraints (Fig. 1(a)) are used to develop a unified system. **Thrust 2:** Autonomous, all-digital run-time VR loop-gain tuning will be used to ensure optimal transient response across PVT conditions, thereby overcoming the problem of poor performance due to margining for worst-case PVT conditions. In the context of UniCaP, improved VR response minimizes performance loss from FIFO saturation and margins due to memory $V_{min}$ constraints.

#### SUMMARY OF RESULTS



Figure 1. (a) Overview of proposed Domain Compiler, (b) simplified schematic of the proposed architecture consisting of integrated LDO/PLL modules in addition to the load domain.

Our effort in Year 2 builds upon our Year 1 work to develop an autonomous gain tracker. We further developed the current sensor used to track $I_{LSB}$, the current of one unit header under the PVT conditions it is subjected to. The design has been taped out and is awaiting silicon test. We also developed two necessary design-time optimization flows: (1) extracting critical paths from a digital design, producing their equivalent SPICE netlists, evaluating them under all anticipated PVT corners, and using the results to provision the TRO module (Fig. 1); and (2) evaluating the stable values of $K_i$ and $K_p$ needed for a PI-controlled LDO across the PVT operating conditions anticipated by the user, and arriving at a convex optimization formulation that solves a regression problem for a polynomial in $I_{LSB}$ and $V_{dd}$, which can be used to tune $K_i$ and $K_p$ at run time to maintain optimal compensation for the system (Fig. 2).
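The regression step in flow (2) can be sketched as an ordinary least-squares fit of a gain setting against $I_{LSB}$ and $V_{dd}$. The first-order polynomial, the synthetic sample points, and the function names below are all assumptions made for illustration; the report does not give the polynomial order or any fitted coefficients.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_gain_model(samples):
    """Least-squares fit of gain ~ c0 + c1 * i_lsb + c2 * v_dd (normal equations)."""
    rows = [[1.0, i, v] for i, v, _ in samples]
    y = [g for _, _, g in samples]
    ata = [[sum(r[p] * r[q] for r in rows) for q in range(3)] for p in range(3)]
    aty = [sum(r[p] * t for r, t in zip(rows, y)) for p in range(3)]
    return solve_linear(ata, aty)


def predict_gain(coeffs, i_lsb, v_dd):
    """Run-time evaluation of the fitted polynomial."""
    return coeffs[0] + coeffs[1] * i_lsb + coeffs[2] * v_dd


# Synthetic design-time samples (arbitrary units), generated from a known model.
samples = [(i, v, 2.0 + 3.0 * i - 1.5 * v)
           for i in (1.0, 1.5, 2.0) for v in (0.8, 0.9, 1.0)]
coeffs = fit_gain_model(samples)
print([round(c, 6) for c in coeffs])
print(round(predict_gain(coeffs, 1.2, 0.85), 6))
```

At run time only the polynomial evaluation in `predict_gain` would be needed, which is why a low-order fit is attractive for an all-digital tuning loop.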



Figure 2. Simulated impact of  $K_i/K_p$  on LDO response time and phase margin. A data-driven approach enables determining regions of valid  $K_i$  and  $K_p$  settings and maximizes LDO performance under given stability constraints while guard banding for error in sense mechanisms for  $I_{LSB}$  or  $V_{dd}$ .

Fig. 3 outlines the effectiveness of the proposed approach. We look to tape out our first generation of domain-compiled designs in 65-nm CMOS, with a second design to follow in 16-nm CMOS early next year.



Figure 3. Simulation results of the deviation (in mV) of voltage droop achieved using polynomial runtime regression to adjust  $K_i$  and  $K_p$  vs optimal settings. Less than 2mV of droop degradation is observed over the ideal, per-PVT  $K_i/K_p$  setting.

Keywords: Model-predictive control, Voltage Regulation

#### INDUSTRY INTERACTIONS

AMD, IBM, Intel, NXP, Texas Instruments

Improving voltage regulator (VR) transient droop response continues to enhance SoC energy efficiency but is limited to acting in response to a  $V_{dd}$  droop after a load current ( $i_{load}$ ) transient. This project explores *data feedforward* to improve VR transient response, responding alongside a load surge, rather than doing so reactively.

#### **TECHNICAL APPROACH**

This task comprises two thrusts: **Thrust 1**, stable $V_{dd}$ control with data-driven digital $i_{load}$ sensing to provide data feed-forward (Fig. 1 (Left)), and frequency for improved LDO transient response (Fig. 1 (Right)); and **Thrust 2**, $i_{load}$ estimation and prediction based on digital state. We will leverage this and related efforts to evaluate tradeoffs in model complexity, precision, latency, energy dissipation, and $i_{load}$ sensing accuracy, properties that will determine the overall effectiveness of the combined VR-load system. Special attention will be paid to analyzing system stability.



Figure 1. The proposed VR-domain co-design effort uses  $i_{load}$  estimation obtained from the load to not only provide a feed-forward signal for LDO operation but also allow cycle skipping to pre-emptively reduce load current to become in line with current delivery capabilities. Simulated performance of the proposed architecture, allowing the feed-forward LDO to act early to minimize voltage droop.

This effort is currently in its second year. Our most important finding during the prior year has been the realization that the use of state variables in the load will not assist LDO design in a meaningful manner. The main reason is that processors operate at frequencies much higher than even high-speed LDOs, which themselves sample $V_{dd}$ at frequencies much higher than their dominant output pole. As a result, the current draw resulting from any operating cycle (or a few) is readily filtered by decoupling capacitance. To gain full benefit from a feed-forward architecture, we would need to estimate current and deliver the load estimate to the LDO within a single cycle and have the LDO use the information before the next data sample. We have seen some benefit of load estimation to the operation of the LDO, specifically as an alternative current-sensing mechanism to the notoriously noisy D-component of PID control. However, we do not feel the benefits outweigh the costs associated with implementing a load estimation engine. We are instead exploring an approach that relies on load estimation to overcome the transient-response/efficiency limitations of buck converters by implementing load-current demand control. The idea is to proactively throttle a load domain to avoid adaptive clocking events.



Figure 2. Block diagram of the feed-forward LDO that we implemented and simulated extensively. Simulation shows effectively no droop, but instead a slight surge owing to a current overestimation by the  $I_{load}$  sensor. Note that this simulation assumes 1-cycle latency in the LDO.

Keywords: Feed-forward, Current-estimation

#### INDUSTRY INTERACTIONS

AMD, IBM, Intel, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] Z. Xie, X. Xu, M. Walker, J. Knebel, K. Palaniswamy, N. Hebert, J. Hu, H. Yang, Y. Chen, and S. Das, "APOLLO: An Automated Power Modeling Framework for Runtime Power Introspection in High-Volume Commercial Microprocessors," in MICRO-54 Oct. 2021, pp. 1–14.

Charging and energizing small devices from portable USB sources is pervasive nowadays. Volume, power efficiency, and response time are critical in this space. The objective of this research is to develop a compact, efficient, single-inductor voltage regulator that charges and monitors a battery while supplying uninterrupted power to the load.

#### **TECHNICAL APPROACH**

The first phase is to design a CMOS power supply that switches one inductor so it charges a battery while supplying the load. The system is compact and efficient because one inductor transfers all the power. The system is also fast because the inductor always drains into the load, even while charging the battery. The second phase is to develop the CMOS controller that switches the inductor. This controller should stabilize the feedback loop so the response time of the system is short. The last phase is to explore electronic markers that reflect the health and state of the battery.

#### SUMMARY OF RESULTS

The switched-inductor power supply proposed in Fig. 1 is faster than the state-of-the-art (SoA) because L<sub>X</sub> always drains into the output v<sub>O</sub>, even when charging the battery v<sub>B</sub>. When supplied from the input v<sub>IN</sub>, L<sub>X</sub> energizes and drains into v<sub>O</sub> with M<sub>XO</sub> or through v<sub>B</sub> with M<sub>BO</sub> in buck fashion. Without v<sub>IN</sub>, L<sub>X</sub> energizes and drains into v<sub>O</sub> with M<sub>IO</sub> from v<sub>B</sub>, also like a buck.



Figure 1. Battery-charging CMOS voltage regulator.

One of the challenges with this design is the floating nature of v<sub>B</sub>. When  $M_{XG}$  connects  $v_{SWX}$  to ground, for example,  $v_{SWB}$  swings below ground. And when  $M_{BO}$  connects to  $v_O$ ,  $v_{SWX}$  rises above  $v_O$ . Minimum- and maximum-supply blocks are therefore necessary to select and generate the lowest and highest supplies that  $M_{IO}$ ,  $M_{XO}$ , and  $M_{BG}$  need to turn off.

But since $L_X$ always drains into $v_O$, this switcher excludes the right-half-plane zero $z_{RHP}$ normally present in boost-derived supplies. Without $z_{RHP}$, a type-III voltage-mode controller can stabilize the feedback loop that regulates $v_O$. This way, the unity-gain frequency $f_{0dB}$ of the loop can be higher than the LC double pole $p_{LC}$ that $L_X$ and the output capacitor $C_O$ set.

For this, $A_E$ in Fig. 2 adds the pole $p_1$ that reduces the loop gain $A_{LG}$ towards $f_{0dB}$ and the two zeros $z_1$ and $z_2$ that recover the phase lost to $p_{LC}$. The pulse-width modulator $CP_{PWM}$ converts $A_E$'s output $v_{EO}$ into the duty-cycled command $d_E$' that switches $L_X$ in Fig. 1. Since $CP_{PWM}$ samples $v_{EO}$ once every clock cycle (at $f_{CLK}$), $CP_{PWM}$ can phase-shift $v_{EO}$'s translation to $d_E$' near $f_{CLK}$. $p_1$, $z_1$, $z_2$, and $p_{LC}$ must therefore reduce $A_{LG}$ to an $f_{0dB}$ that is below $f_{CLK}$.



Figure 2. Type-III voltage-mode PWM voltage regulator.

The CMOS switcher in Fig. 1 has been fabricated and will be tested later in 2024. A non-provisional patent for this structure was filed late in 2023. A paper on MOSFET selection and cross-conduction of multiple-I/O power supplies was accepted for publication at the IEEE MWSCAS in 2024. The controller in Fig. 2 is being designed in 2024 so it can be implemented in 2025.

**Keywords:** Voltage regulator, charger, switched inductor, multiple inputs, multiple outputs

#### INDUSTRY INTERACTIONS

Intel, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] L. Cui, Q. Zhi, and G.A. Rincón-Mora, "Compact, Accurate, and Efficient Battery-Charging CMOS Voltage Regulator," non-provisional patent submitted.

[2] L. Cui and G.A. Rincón-Mora, "Switched-Inductor Multiple-I/O Power Supplies: MOSFET Selection and Cross Conduction," IEEE MWSCAS 2024, accepted.

VISVESH SATHE, GEORGIA INSTITUTE OF TECHNOLOGY, SATHE@GATECH.EDU

#### SIGNIFICANCE AND OBJECTIVES

Integrated Voltage Regulation (IVR) remains a critical technology for driving sustained efficiency in SoCs, offering buck converter efficiencies without additional bulky components. The objective of this effort is to devise a domain-scalable run-time programmable IVR fabric that drives and leverages advances in adaptive clocking and SIMO design (Fig. 1).

#### **TECHNICAL APPROACH**

The design effort is organized into four thrusts: (1) analyzing the effectiveness of UniCaP in designs with a large insertion delay; (2) demonstrating domain-scalable SIMO implementation, critical to providing the flexibility required for a dynamically programmable IVR fabric; (3) investigating optimal design-time allocation of cross-bar switches which connect modules to domains based on SoC usage profiles; and (4) designing a tileable buck architecture that can be configured at run-time as a single buck, a multi-phase buck, or a SIMO converter. The goal of the effort is to implement and demonstrate a test chip incorporating all four thrusts.

#### SUMMARY OF RESULTS



Figure 1. Proposed DRIVR system consisting of multiple buck tiles and domains connected through a partial power crossbar. We have prototyped a 2-tile, 8-domain version of DRIVR. Domains include a RISC-V processor, CORDIC, FIR, and FFT accelerators, and dummy domains containing synthetic loads to emulate a multi-domain system.

All four goals of our proposed effort were accomplished within the extended timeframe of an additional 6 months granted by TxACE. We have designed the DRIVR test chip in 65-nm CMOS. The chip has been fabricated and is being packaged for testing. A layout photograph of the test chip is shown in Fig. 2. The DRIVR chip features (1) 8 V<sub>dd</sub> domains, each of which can be configured to connect to none, one, or both of the provided buck-converter-driven power rails; (2) the ability to adaptively "hot-swap" domains from one rail to the other; (3) operation of each domain in either header mode or digitally regulated LDO mode; (4) autonomous control of the buck rail voltages so that each settles to the minimum voltage required by the domains it is driving; and (5) a highly digital load tracker that detects excess current draw from load domains and throttles them to ensure load compliance.
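The rail-voltage behavior in feature (4) can be illustrated with a short sketch. It is a simplified model under stated assumptions, not the chip's control logic: the domain names, voltage requirements, and the function `rail_targets` are hypothetical, each domain is attached to at most one rail here (the chip also allows both), and the minimum rail voltage that satisfies every attached domain is taken to be the highest requirement among them.

```python
from typing import Dict, List, Optional


def rail_targets(required_vdd: Dict[str, float],
                 rail_of: Dict[str, Optional[int]],
                 n_rails: int = 2) -> List[Optional[float]]:
    """Lowest voltage each rail can settle to while serving its attached domains.

    A rail with no attached domain has no requirement and is reported as None.
    """
    targets: List[Optional[float]] = [None] * n_rails
    for domain, rail in rail_of.items():
        if rail is None:          # domain is disconnected from both rails
            continue
        need = required_vdd[domain]
        if targets[rail] is None or need > targets[rail]:
            targets[rail] = need
    return targets


# Hypothetical domains and requirements (volts).
required = {"riscv": 0.90, "fft": 0.70, "fir": 0.75}
attached = {"riscv": 0, "fft": 1, "fir": 1}
print(rail_targets(required, attached))   # rail 1 must hold 0.75 V for "fir"

attached["fir"] = 0                       # hot-swap "fir" onto rail 0
print(rail_targets(required, attached))   # rail 1 can now drop to 0.70 V
```

In this toy model, moving a demanding domain off a rail lets that rail settle lower, which is the motivation for combining hot-swapping with autonomous rail control.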



Figure 2. Layout of the DRIVR test-chip.

Post-layout simulations have been carried out to validate the functionality of each of the key features of DRIVR as listed above. We anticipate silicon testing to complete by September.

**Keywords:** UniCaP, SIMO, Configurable Voltage Regulation

#### INDUSTRY INTERACTIONS

ARM, Intel, NXP

#### MAJOR PAPERS/PATENTS

[1] Huang, C-H et al., "A Single-Inductor 4-Output SoC with Dynamic Droop Allocation and Adaptive Clocking for Enhanced Performance and Energy Efficiency in 65nm CMOS," ISSCC 2021.

[2] Sun, X. et al., "UniCaP-2: Phase-Locked Adaptive Clocking with Rapid Clock Cycle Recovery in 65nm CMOS," VLSI Symposium 2020.

The enhanced spatio-temporal control afforded by Integrated Voltage Regulation (IVR) in SoCs is critical to achieving efficiency, provided the regulators can maintain or improve voltage droop despite the reduced available decap. This effort examines exploiting computation to build *systems* (load and voltage regulator) that are more energy efficient.

#### **TECHNICAL APPROACH**

The design effort is organized into two thrusts. The first uses "computational control" to achieve a time-optimal transient response to the random switching load-current profiles typical of SoCs. This work was demonstrated in [1] through a computationally controlled LDO. The second uses runtime computing to perform control and optimization that minimizes *total system* energy, not only across *all* domains and the voltage regulators and converters combined, but also over time in duty-cycled systems, across Active, Sleep, and Wake events combined. Finally, we explore realizing minimum total energy while providing application guarantees across different program execution flows.





Figure 1. Multi-domain, total energy minimization system across domains+VR, and across duty-cycling episodes.

Prior work on minimizing the energy of SoCs has focused on a single domain, whereas modern SoCs rely on multiple voltage domains to realize energy efficiency. We designed a SIMO-based system that leverages our prior work on regenerative braking [2] and combines it with time-based energy measurement to perform run-time optimization of the overall system, with the aim of minimizing total energy draw from the supply.

Fig. 2 shows simulation waveforms obtained from the proposed system implementation, demonstrating the need for runtime multi- $V_{dd}$  optimization for a system that sequentially executes computations on a processor, and then on an accelerator. The system guides  $V_{dd}$ 's toward minimum energy while guaranteeing total execution time. Thus, the system minimizes total power while guaranteeing application performance. We are expecting the test chip of this design to arrive in a month and are anticipating sharing our findings in the Annual Review.
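The trade-off behind multi-$V_{dd}$ optimization under an execution-time guarantee can be sketched offline. This is only an illustration under assumed models: the alpha-power-law delay, the energy expression, and every parameter value (`V_T`, `K_F`, `C_EFF`, `I_LEAK`, the voltage grid, the cycle counts, and the deadline) are hypothetical, and an exhaustive search stands in for the run-time optimizer, whose algorithm is not described here.

```python
from itertools import product

# Hypothetical model parameters, chosen only for illustration.
V_T = 0.3          # threshold voltage (V)
K_F = 1e9          # frequency scale factor (Hz/V)
C_EFF = 1e-10      # effective switched capacitance per cycle (F)
I_LEAK = 5e-3      # leakage current (A)
V_GRID = [round(0.5 + 0.05 * i, 2) for i in range(13)]  # 0.50 V ... 1.10 V


def run_time(v, cycles):
    """Execution time from an alpha-power-law clock frequency (alpha = 2)."""
    freq = K_F * (v - V_T) ** 2 / v
    return cycles / freq


def energy(v, cycles):
    """Dynamic plus leakage energy of one phase executed at supply v."""
    return C_EFF * v * v * cycles + I_LEAK * v * run_time(v, cycles)


def best_vdds(cpu_cycles, acc_cycles, deadline):
    """Exhaustively pick (v_cpu, v_acc) with minimum energy that meets the deadline.

    Returns (energy, v_cpu, v_acc, total_time), or None if nothing is feasible.
    """
    best = None
    for v_cpu, v_acc in product(V_GRID, repeat=2):
        t = run_time(v_cpu, cpu_cycles) + run_time(v_acc, acc_cycles)
        if t > deadline:
            continue
        e = energy(v_cpu, cpu_cycles) + energy(v_acc, acc_cycles)
        if best is None or e < best[0]:
            best = (e, v_cpu, v_acc, t)
    return best


# Processor phase followed by accelerator phase, with a 10-ms budget.
print(best_vdds(1e6, 2e6, 10e-3))
```

The search treats the two sequential phases jointly, so time saved in one phase can be spent lowering the supply of the other, which a single-domain optimizer cannot do.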



Figure 2. Simulation waveforms showing the trajectory of the optimizer as it transitions multiple domain  $V_{dd}$ 's at runtime to minimize total energy draw from the system across Sleep, Wake, Run episodes in a manner that continues to provide guarantees on overall performance (beyond fclk)

**Keywords:** Model-predictive control, Voltage Regulation

#### INDUSTRY INTERACTIONS

ARM, Intel, NXP

#### MAJOR PAPERS/PATENTS

[1] Sun, Xun, et al. "14.5 A 0.6-to-1.1 V Computationally Regulated Digital LDO with 2.79-Cycle Mean Settling Time and Autonomous Runtime Gain Tracking in 65nm CMOS." 2019 ISSCC.

[2] Huang, C-H et al., "Energy Minimization of Duty-Cycled Systems Through Optimal Stored-Energy Recycling from Idle Domains," ISSCC 2022.

The research goal of this project is to create a holistic approach to provide physical-layer security and spectral efficiency for energy-constrained wireless communication technologies. We propose using information-centric algorithms in the design of secure RF systems for achieving high spectrum utilization with low energy.

#### **TECHNICAL APPROACH**

We introduce a novel constellation projection scheme in which we transmit symbol projections along distinct basis vectors via separate antennas. By randomizing the selection of basis vectors, we induce scrambling in all directions except the broadside, while ensuring the intended receiver perceives an unaltered constellation. Furthermore, we are designing a low-power modulo sampling technique to achieve a wideband spectrum sensing with a high instantaneous dynamic range (DR).

#### SUMMARY OF RESULTS

In our approach to physical-layer wireless security, we utilize orthogonal projections of symbols distributed across various antenna elements. In a static AWGN channel scenario with steering vectors, these projections undergo diverse phase shifts at Eve's locations, resulting in a distorted constellation pattern. However, in Bob's position, these phase shifts align, contributing constructively to reconstructing the original symbol. To bolster security, we implement randomized changes to the projection basis vectors over time, as depicted in Fig. 1, adding additional protection against eavesdropping.
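The alignment at Bob and the distortion at Eve can be illustrated with a two-antenna toy model. It is a sketch under assumptions that are not taken from the project: two antennas spaced half a wavelength apart, a noise-free line-of-sight channel, and an orthonormal basis parameterized by a rotation angle `theta`. The function names are hypothetical.

```python
import numpy as np


def transmit_projections(symbol, theta):
    """Split a complex symbol into its projections on a basis rotated by theta."""
    s = np.array([symbol.real, symbol.imag])
    u1 = np.array([np.cos(theta), np.sin(theta)])
    u2 = np.array([-np.sin(theta), np.cos(theta)])
    p1 = np.dot(s, u1) * u1   # component sent by antenna 1
    p2 = np.dot(s, u2) * u2   # component sent by antenna 2
    return complex(p1[0], p1[1]), complex(p2[0], p2[1])


def received(symbol, theta, angle_deg):
    """Noise-free sum observed at angle_deg (0 degrees is broadside)."""
    p1, p2 = transmit_projections(symbol, theta)
    phase = np.pi * np.sin(np.radians(angle_deg))  # half-wavelength spacing
    return p1 + p2 * np.exp(1j * phase)


symbol = 3 + 1j  # one 16-QAM point
for theta in (0.7, 2.0):
    print(theta, received(symbol, theta, 0.0), received(symbol, theta, 30.0))
```

At broadside the two projections add back to the original symbol for any `theta`, whereas off broadside the result depends on `theta`, so changing the basis over time scrambles what an off-axis receiver observes.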



Figure 1. Alteration of basis vectors for the 16-QAM constellation projection, introducing scrambling for security.



Figure 2. BER vs. Angle of Arrival for the 16-QAM constellation projection with change of basis (Bob and Eve have 10-dB SNR).

We performed Monte Carlo simulations to assess the bit error rate (BER). As illustrated in Fig. 2, the BER significantly degrades outside the broadside, primarily due to scrambling, effectively fortifying security against potential eavesdroppers in that area. Notably, our method ensures secrecy without the need for complex computational algorithms, and Bob does not incur any additional processing overhead.

Additionally, our sampler integrates modulo folding within a feedback loop while embedding a built-in anti-aliasing filter (Fig. 3), minimizing circuit complexity and power usage. Folding is initiated by comparator-detected threshold crossings through negative feedback. This low-power modulo sampler enhances instantaneous DR.
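The reason folding extends instantaneous DR can be seen in an idealized numerical model. This sketch is not the circuit described above: it replaces the comparator-driven feedback loop with an ideal modulo operation, and it assumes the input is sampled fast enough that consecutive samples differ by less than the folding threshold `lam`. Both function names are hypothetical.

```python
import numpy as np


def fold(x, lam):
    """Wrap x into the interval [-lam, lam)."""
    return np.mod(x + lam, 2.0 * lam) - lam


def unfold(y, lam):
    """Recover the input from folded samples.

    Assumes |x[n] - x[n-1]| < lam for all n and |x[0]| < lam.
    """
    steps = fold(np.diff(y), lam)  # true sample-to-sample differences
    return np.concatenate(([y[0]], y[0] + np.cumsum(steps)))


n = np.arange(400)
x = 5.0 * np.sin(2.0 * np.pi * n / 200.0)  # amplitude 5x the folding threshold
y = fold(x, 1.0)                           # what a range-limited quantizer sees
x_hat = unfold(y, 1.0)
print(np.max(np.abs(y)), np.max(np.abs(x_hat - x)))
```

The folded samples stay within the threshold even though the input swings five times beyond it, and the input is recovered from the folded differences.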



Figure 3. Modulo folding with an embedded anti-aliasing filter.

**Keywords:** physical-layer security, phased array, modulo sampling, wideband spectrum sensing, low power

#### INDUSTRY INTERACTIONS

MediaTek

High-voltage (HV) systems in electric vehicles (EV) pose risks during emergencies, requiring efficient capacitor discharge for safety. This project aims to develop a swift, cost-effective electronic system to mitigate these hazards, enhancing safety for technicians and emergency responders during accidents, repairs, and other emergencies.

#### **TECHNICAL APPROACH**

The proposed method introduces a new active discharge electronic circuit system. It significantly reduces DC link capacitor discharge time from over 10 seconds to just 1 second by employing the main inverter switches. This system integrates an adjustable gate driver to modulate the switch's gate-source voltage, enabling constant power operation during discharge. Through frequency modulation, thermal runaway is effectively prevented. Consequently, the system achieves the targeted 1-second discharge time, demonstrating its efficiency in managing capacitor discharge within a shorter timeframe.
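The power level implied by a constant-power discharge follows from the energy stored in the DC link capacitor. The sketch below uses assumed numbers only: the 500-µF capacitance and the 60-V end voltage are hypothetical choices for illustration, not values from this project.

```python
import math


def constant_power(c_link, v_start, v_safe, t_discharge):
    """Power (W) that takes the capacitor from v_start to v_safe in t_discharge."""
    energy = 0.5 * c_link * (v_start ** 2 - v_safe ** 2)
    return energy / t_discharge


def voltage_at(t, c_link, v_start, power):
    """Capacitor voltage after t seconds of constant-power discharge."""
    v_sq = v_start ** 2 - 2.0 * power * t / c_link
    return math.sqrt(max(v_sq, 0.0))


C_LINK = 500e-6   # assumed DC link capacitance (F)
V_SAFE = 60.0     # assumed end-of-discharge voltage (V)

p = constant_power(C_LINK, 1000.0, V_SAFE, 1.0)
print(f"P = {p:.1f} W")
for t in (0.0, 0.5, 1.0):
    print(t, round(voltage_at(t, C_LINK, 1000.0, p), 1))
```

Under these assumptions the voltage is still roughly 708 V halfway through the 1-second window, because most of the stored energy sits at the top of the voltage range; the switches therefore dissipate the same power at high and low link voltage, which is why the gate drive must adapt during discharge.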

#### SUMMARY OF RESULTS

Table 1 provides a comprehensive summary of test results conducted under specific conditions. The data underscores the critical role of gate-source voltage in achieving successful discharge operations. Initial experimentation involved discharging a 1000-V DC link capacitor using various TO-247 package MOSFETs and a SiC Power Module. Among the tested vendors, five managed a 1-second discharge at this voltage with a 2µs pulse width. However, one vendor necessitated a 6V V<sub>GS</sub>, indicating limitations at lower voltages due to inherent high threshold voltage. Nevertheless, all devices exhibited successful 1-second discharges with proper V<sub>GS</sub> voltage under constant power operation.



Figure 1. The architecture for active discharge. The inverter's switches serve to efficiently discharge high-voltage DC link energy, offering cost and space savings while decreasing the discharge time to just 1 second.

Once the concept was validated, assessing the operational reliability became imperative. The reliability test involved discharging a capacitor for a 900-V DC link onto the devices under test (DUTs) once their case temperature reached around 80°C on a hot plate. Following each 1-second discharge period, a 4.5-second interval was implemented to prevent unchecked temperature escalation within the devices. This interval ensured the DUTs could safely charge and discharge within acceptable temperature thresholds. Reducing this interval causes the device's case temperature to rise during successive cycles, potentially leading to thermal runaway and device damage. Therefore, the interval is necessary for adequate cooling of the DUTs. Notably, the inverters of the first two vendors in the table exhibited unreliable traits for a 1-second discharge time at V<sub>GS</sub>=6V, marked by an increase of over 50% in on-resistance and significant alterations in body diode forward voltage. Upon repetition of the test with 5-V V<sub>GS</sub> at 1-second discharge time, only the V<sub>GS</sub> value changed, and these vendors demonstrated robust characteristics without parameter shifts. Vendor 3 and the power module displayed reliable traits under 6-V V<sub>GS</sub> at 1-second discharge. Overall, active discharge operation under proper gate voltages ensures reliable performance. These switches can effectively be used for this operation, leading to significant cost and space savings.

| Vendor | V<sub>DC</sub> Link | V<sub>GS</sub> | Discharge Time | Pulse Width |
|--------|---------------------|----------------|----------------|-------------|
| 1      | 1000V               | 5V             | ✓ 1s           | 2µs         |
| 2      | 1000V               | 5V             | ✓ 1s           | 2µs         |
| 3      | 1000V               | 6V             | ✓ 1s           | 2µs         |
| 4      | 1000V               | 5V             | ✓ 1s           | 2µs         |
| 5      | 1000V               | 5V             | ✓ 1s           | 2µs         |
| Module | 1000V               | 5V             | ✓ 1s           | 2µs         |

**Keywords:** Active Discharge, Constant Power, Electric Vehicle, High Voltage DC Link Cap, Frequency Modulation

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

The key goal of this project is to demonstrate high-speed, high-resolution time-domain converter architectures in fine CMOS nodes with minimum voltage-domain assistance or native, linear time-domain S/H and amplifiers (which do not exist). Silicon prototypes and experimental results will be demonstrated and reported.

#### **TECHNICAL APPROACH**

Technology scaling presents a unique opportunity for time-domain (TD) analog circuits over their conventional voltage-domain counterparts. While the latter struggles with a dwindling supply voltage, the former gains (TD) resolution and accuracy with each finer process node. An enabling technique of time-domain RNS (residue number system) encoding is proposed to achieve a 4-bit architectural complexity, thus ensuring high efficiency, for the 8-bit first stage in a 12-bit two-step pipelined TDC. The large leading stage resolution greatly relaxes the interstage residue accuracy requirements, making it possible to achieve a 12-bit resolution using exclusively native time-domain circuits that exhibit superior technology scalability.
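How residue encoding keeps each sub-conversion small can be shown with integer arithmetic. This is a generic residue-number-system sketch, not the project's time-domain implementation: the moduli pair (15, 16) is a hypothetical choice made only because both residues fit in 4 bits and their product, 240, is close to an 8-bit range.

```python
from math import gcd

# Hypothetical coprime moduli; the actual moduli are not stated in this report.
MODULI = (15, 16)


def to_residues(x, moduli=MODULI):
    """Encode x as its remainders with respect to each modulus."""
    return tuple(x % m for m in moduli)


def from_residues(residues, moduli=MODULI):
    """Rebuild x from its residues with the Chinese remainder theorem."""
    total = 1
    for m in moduli:
        total *= m
    x = 0
    for r, m in zip(residues, moduli):
        partial = total // m
        # pow(partial, -1, m) is the modular inverse (Python 3.8+).
        x += r * partial * pow(partial, -1, m)
    return x % total


assert gcd(*MODULI) == 1
print(to_residues(200), from_residues(to_residues(200)))
```

Each residue takes at most 16 values, so two converters of roughly 4-bit complexity jointly resolve 240 distinct codes in this example.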

#### SUMMARY OF RESULTS

**Keywords:** time-to-digital converter (TDC), residue number system (RNS) encoding, time-domain circuits

#### INDUSTRY INTERACTIONS

NXP

## TASK 3160.038, VERTICAL: MULTI-CORE VOLTAGE-STACKED MICROPROCESSOR WITH A DYNAMIC LOAD SHUFFLING AND A SIMO CONVERTER

MINGOO SEOK, COLUMBIA UNIVERSITY, MGSEOK@EE.COLUMBIA.EDU

#### SIGNIFICANCE AND OBJECTIVES

Vertically stacking voltage domains has been proposed to improve efficiency via current recycling. However, its efficiency degrades if stacked loads have a mismatch in current draws. We aim to create a new voltage-stacked architecture that can achieve higher efficiency than the conventional architecture, even if the stacked domains exhibit a mismatch.

#### **TECHNICAL APPROACH**

In this period, we have investigated the power supply rejection ratio (PSRR) of digital low dropout regulators (DLDOs). We developed time-domain analytical models for the PSRR of DLDOs, using step and ramp noise injections to observe the output voltage response. We performed SPICE simulations in 28-nm CMOS for model validation. We explored the impact of loop parameters on PSRR, such as clock frequency, unit current, and output capacitance. We considered an integral-control DLDO under various noise conditions and formulated PSRR parameters. These models provide insights for optimizing DLDO design and enhancing system stability and robustness against supply noise.

#### SUMMARY OF RESULTS

**PSRR Modeling:** The work formulates models for five key parameters:  $T_{inf}$ ,  $T_{lim}$ , PSRR<sub>0</sub>, PSRR<sub>1</sub>, and PSRR<sub>2</sub>. These models accurately predict PSRR behavior under varying conditions of input noise.

**Simulation Findings:** Validation through SPICE simulations shows that the models closely match the simulation results with minimal error. PSRR improves with increased ramp time until a limit beyond which the inherent output voltage ripple dominates.

**Design Insights:** Increasing the clock frequency and power transistor unit current significantly enhances PSRR. The output capacitance has a limited impact, being inversely proportional to the square of the capacitance. Optimal PSRR design involves balancing these parameters to ensure system robustness against power supply noise.

**Practical Applications:** The presented models and methodologies provide practical guidelines for achieving optimal DLDO design. The results highlight the importance of PSRR in enhancing the stability and integrity of systems using DLDOs. In conclusion, this work offers significant contributions to the understanding and improvement of PSRR in DLDOs, providing essential tools and insights for circuit designers to enhance the performance and stability of integrated systems.



Figure 1. (a) Input voltage with ramp noises, and (b) simulated PSRR under  $T_{\text{ramp}}$  variation.



Figure 2. PSRR improvement by (a) unit current, and (b) clock period, predicted using the models.

**Keywords:** Digital low dropout regulator, power supply rejection ratio, power supply noise, dynamic load regulation, integral control

#### INDUSTRY INTERACTIONS

IBM, Intel, NXP

#### MAJOR PAPERS/PATENTS

[1] K. Baek, et al., "Model-Based Study on the Power Supply Rejection of a Digital Low Dropout Regulator," Submitted to ICCAD, 2024.

## TASK 3160.039, PROACTIVE POWER AND CLOCK MANAGEMENT FOR SYSTEM-ON-CHIP

JIE GU, NORTHWESTERN UNIVERSITY, JGU@NORTHWESTERN.EDU

#### SIGNIFICANCE AND OBJECTIVES

Emerging computing methods offer new opportunities to solve the grand challenges of modern ICs. This project will develop advanced computing methods and associated circuit solutions to enable proactive power and clock management for overcoming the ever-increasing challenges of maintaining power and timing integrity.

#### **TECHNICAL APPROACH**

Compared with conventional solutions, this project will develop advanced machine learning (ML) techniques to proactively predict chip power consumption and upcoming supply droops so that reaction can be taken ahead of time before the real events, overcoming the slow response of conventional approaches. During this reporting period, we have developed a simulation platform that allows the generation of chip power consumption and supply droops over a large number of run cycles for the training of ML models. We also have evaluated several popular ML models for power predictions.

#### SUMMARY OF RESULTS

In this period of the project (from January to May 2024), we have focused on two tasks: (1) developing a simulation platform for data generation of chip power and voltage droop over a large number of run cycles and (2) evaluation of ML models for prediction of supply droops. These efforts are built upon our prior work on a proof of concept for proactive power management. During this reporting period, we aim to develop a better integrated and automated simulation bench and perform a study on machine learning capabilities for power and droop predictions. Below are the detailed developments and results.

First, we improved our previous simulation platform, which only allowed a few thousand run cycles. In the new simulation platform, we can run benchmark programs at 100k cycles so that entire benchmark programs can be evaluated without being broken into small pieces for execution. The testbed, a RISC-V CPU, was also improved so that the cache can contain entire programs for smoother execution. Fig. 1 shows the simulation platform and an example of simulated and predicted supply droop with testbench programs.

Second, we used the obtained power and supply voltage data (150k run-cycle data from 14 benchmark programs) to train a few typical ML models, e.g., neural network (MLP), linear regression, and LSTM, and evaluated the accuracy of each model. Table 1 shows the accuracy of each model on the benchmark run. The accuracy is reported for the prediction of three-level current/voltage values. Many variables are varied in the study, as shown in the table, including the number of run cycles used for prediction, the number of neural network layers, etc. This development overcomes our prior limitations of fewer benchmarks and run cycles. Interestingly, LSTM performs the best. A complete report will be provided in the next reporting period.
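One way to score a three-level prediction is to quantize both the actual and the predicted waveform into three bins and count matching cycles. The sketch below shows only that scoring step, under assumptions: the bin thresholds and the sample voltages are made up for illustration, and the function name is hypothetical. It does not reproduce the ML models or the accuracy figures reported here.

```python
import numpy as np


def three_level_accuracy(actual, predicted, thresholds):
    """Fraction of cycles whose predicted level (0, 1, or 2) matches the actual level.

    thresholds holds the two increasing bin edges that separate the three levels.
    """
    actual_level = np.digitize(actual, thresholds)
    predicted_level = np.digitize(predicted, thresholds)
    return float(np.mean(actual_level == predicted_level))


# Hypothetical per-cycle supply voltages (V) and bin edges.
actual = np.array([0.95, 0.90, 0.85, 0.99])
predicted = np.array([0.96, 0.91, 0.89, 0.80])
print(three_level_accuracy(actual, predicted, [0.88, 0.93]))  # two of four cycles match
```

Scoring on levels rather than raw values reflects that a mitigation circuit mainly needs to know which severity band the next droop falls into.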



Figure 1. The overall CPU power analysis flow and an example of the supply droop corresponding to CPU activity.

Table 1. Voltage prediction and current prediction results.

Current prediction:

| Model    | LR     | 1-layer NN | MLP (2-layer) | MLP (3-layer) | MLP (4-layer) | 1-layer LSTM | 2-layer LSTM |
|----------|--------|------------|---------------|---------------|---------------|--------------|--------------|
| Accuracy | 69.90% | 67.20%     | 74.20%        | 78.60%        | 78.10%        | 79.80%       | 82.10%       |

Voltage prediction:

| Model    | LR    | 1-layer NN | MLP    | LSTM   |
|----------|-------|------------|--------|--------|
| Accuracy | 62.3% | 63.20%     | 74.90% | 79.10% |

**Keywords:** proactive power management, machine learning, supply droop mitigation, RISC-V CPU

#### INDUSTRY INTERACTIONS

IBM, Intel, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] X. Chen, et al., Submitted to IEEE J. of Solid-State Circuits, 2024.

## TASK 3160.042, HIGH STEP-DOWN SISO AND SI2MO HYBRID CONVERTERS FOR 48/120V INPUTS AND BEYOND

PARTHA PANDE, WASHINGTON STATE UNIVERSITY, PANDE@WSU.EDU
DEUK HEO, WASHINGTON STATE UNIVERSITY

#### SIGNIFICANCE AND OBJECTIVES

The objective of this project is to design a high step-down hybrid structure capable of generating a sub-1V output from a 48-V input. Our focus is on achieving high performance while maximizing power density, ensuring efficiency and compactness in the overall design.

#### **TECHNICAL APPROACH**

This research aims to develop (1) a novel single-input single-output (SISO) high step-down hybrid architecture characterized by a reduced footprint using fewer passive components, high efficiency, and high power density, and (2) a single-inductor independent multi-output (SI2MO) high step-down hybrid converter architecture with a fast response.

#### SUMMARY OF RESULTS

We have investigated a novel SISO high step-down hybrid buck converter comprising two off-chip inductors and five off-chip flying capacitors to demonstrate a Dual-Inductor DC-DC converter. The proposed power stage effectively generates a sub-1V output from a 48-V input and lowers the power switching voltage rating.

The block diagram of the proposed high step-down SISO converter is presented in Fig. 1(a). Here, the power stage includes two inductors and two switched-capacitor power stages (SCPS). The SCPS achieves a steady state using a relaxed duty cycle ratio PWM control method, leveraging the benefits of both the three-level and Dickson topologies for a high voltage conversion ratio (VCR). A major advancement from our work is the development of the power stage segmented into two sub-blocks: the high voltage (HV) block and the low voltage (LV) block. First, the HV block reduces the input voltage  $(V_{IN})$  by half using a three-level buck converter. The reduced V<sub>IN</sub> then serves as the input for the Dickson topology, allowing power switches to operate at lower stress voltages, as implemented in the LV block. This ensures that the duty cycle is always maintained within a reasonable range. In the final phase, two dual-path inductors generate a minimal current passing through the SCPSs during high VCR scenarios. The two inductors must provide all the output current, resulting in minimal inductor DC resistance loss, while the flying capacitors in that block are regulated to a low voltage close to Vout, which is sub-1V.

The proposed high step-down SISO converter is planned to be implemented in a 180-nm HV Bipolar-CMOS-DMOS (BCD) process. In simulation, the converter achieves an efficiency of 90% at a load current of 1A with an output voltage of 970mV and an input of 48V; Fig. 1(b) shows the output voltage transient response. In the next phase, the entire circuit, including the controller, will be implemented for tape-out.



Figure 1. (a) Proposed hybrid high voltage, low voltage structure, and (b) output voltage transient simulation.

**Keywords:** High step-down conversion ratio, buck converter, hybrid topology, single-inductor-single-output

#### INDUSTRY INTERACTIONS

Intel, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] Z. Zhou et al., "A Battery/USB Input Sub-1V Output Reconfigurable Hybrid High-Step-Down Converter with Reduced Inductor Current in Nominal CMOS," Submitted to TCAS1 after Major Revision.

[2] Z. Zhou et al., "A Multi-Output Reconfigurable Hybrid Buck Converter with Fast Response and Reversed Duty Cycle Control for Enhanced Efficiency," Will be submitted to IEEE TCAS1.

Large die yield concerns and the rise of domain-specific accelerators have motivated partitioning compute modules into multiple chiplets on interposers with high-density interconnects. The die-to-die interconnect design techniques in this proposal aim to significantly improve efficiency at high per-pin data rates, which is necessary for the continued scaling of future systems.

#### **TECHNICAL APPROACH**

A new dense, energy-efficient die-to-die interconnect transceiver architecture based on simultaneous bidirectional (SBD) signaling is in development. The transceiver front-ends utilize a novel inverter-based voltage-mode driver with a replica driver hybrid, which efficiently separates the outbound and inbound signals, and merges low-complexity echo cancellation with high-pass-filter-based near-end and far-end crosstalk (NEXT and FEXT) cancellation circuitry. Low-overhead 10b11b spatial encoding is employed to dramatically reduce supply-noise-induced crosstalk. Finally, a forwarded-clock architecture allows for low-complexity receive-side deskew and high-frequency correlated jitter tracking.

#### SUMMARY OF RESULTS

Fig. 1 shows the proposed die-to-die interconnect transceiver architecture that consists of 24 single-ended wires, with 22 SBD data transceivers that transmit 20 effective bidirectional data streams between dies and 2 unidirectional forwarded-clock channels. Two groups of low-overhead 10b11b spatial encoding are employed in the 22 data transceivers to dramatically reduce supply-noise-induced crosstalk.

Efficient SBD techniques are necessary to generate a replica of the outbound signal for subtraction from the total signal present at the transceiver interface, so that only the inbound signal is extracted. The proposed front-end in Fig. 2 introduces an additional echo cancellation segment driven by the outbound data sequence that is capacitively coupled in the replica stage to provide a positive high-pass-shaped echo cancellation signal. Properly adjusting the echo cancellation drive strength and coupling capacitor value with the SSLMS adaptation engine provides improved eye diagram margins. Increased NEXT and FEXT are observed as the interposer signal-to-signal pitch is decreased to improve the transceiver edge density. The proposed front-end also employs high-pass filter-based cancellation of the NEXT and FEXT occurring on a given wire from its 6 surrounding interposer channels. These techniques are under investigation to achieve the target performance of a die-to-die transceiver architecture operating at 107Gb/s/wire, considering the spatial encoding and forwarded-clock overhead, with an energy efficiency of 0.2pJ/b and an edge density of 32Tb/s/mm when integrated with a high-density interposer.
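The 107Gb/s/wire target can be related to the wire counts given for the architecture. The arithmetic below is one reading that reproduces the figure, not a statement of how the authors derived it: it assumes the 128Gb/s SBD rate quoted in the Fig. 2 caption is the raw per-wire rate, and it spreads the payload of the 22 encoded data wires over all 24 wires, clocks included. The function name is hypothetical.

```python
def effective_rate_per_wire(raw_gbps, data_wires, coded_bits, payload_bits, clock_wires):
    """Payload throughput averaged over every wire in the bundle, clocks included."""
    groups = data_wires // coded_bits          # number of 10b11b groups
    effective_streams = groups * payload_bits  # payload streams carried by the data wires
    total_wires = data_wires + clock_wires
    return raw_gbps * effective_streams / total_wires


rate = effective_rate_per_wire(
    raw_gbps=128.0,   # assumed raw SBD rate per wire (Fig. 2 caption)
    data_wires=22,
    coded_bits=11,
    payload_bits=10,
    clock_wires=2,
)
print(f"{rate:.1f} Gb/s/wire")  # about 106.7, which rounds to 107
```

Under this reading, the encoding and clock overhead cost 4 of every 24 wires, or about one sixth of the raw bundle throughput.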



Figure 1. Die-to-die interconnect transceiver architecture.



Figure 2. (a) SBD transceiver front-end with an inverter-based driver, replica driver hybrid, echo, and NEXT/FEXT cancellation. 128Gb/s SBD eye diagrams: (b) with echo cancellation, (c) without, and (d) with NEXT and FEXT cancellation.

**Keywords:** Chiplet, die-to-die interconnects, echo cancellation, interposer, simultaneous bidirectional signaling

#### INDUSTRY INTERACTIONS

Intel, MediaTek

## TASK 3160.050, MULTIPHASE HYBRID STEP-DOWN DC-DC CONVERTER WITH HIGH CURRENT DENSITY FOR LARGE-CONVERSION-RATIO 48V AUTOMOTIVE APPLICATIONS

HOI LEE, UNIVERSITY OF TEXAS AT DALLAS, HOILEE@UTDALLAS.EDU
JIN LIU, UNIVERSITY OF TEXAS AT DALLAS

#### SIGNIFICANCE AND OBJECTIVES

This research aims to develop innovative multiphase hybrid DC-DC converters to provide a high current density and a high-power efficiency for large input-to-output voltage conversions. A systematic approach will be explored to realize multiphase hybrid converters with a minimal number of power FETs and passive components while significantly increasing the output current handling capability.

#### **TECHNICAL APPROACH**

We will investigate multiphase switched-capacitor-assisted converter topologies to evaluate operation flexibility in different conditions, the requirements of voltage balancing and pre-charging of flying capacitors, the capability of providing high power density, and different power losses. We started to build a multiphase ladder-based capacitor-assisted hybrid converter that can significantly lower the inductor conduction loss at high output current for the multiphase architecture. The single-phase front-end SC network further reduces the required number of power FETs, which reduces the switch conduction loss.

#### SUMMARY OF RESULTS

The proposed 3:1 capacitor-assisted multiphase hybrid (CAMH) converter is shown in Fig. 1. It consists of 4 inductors, 15 power switches, and 4 flying capacitors. Compared with the buck converter, the switch on-time can be increased by 6 times, thereby improving the converter power efficiency for high input-to-output conversions. None of the flying capacitors in the converter requires voltage balancing in the steady state or pre-charging at start-up.



Figure 1. Structure of the proposed 3:1 capacitor-assisted multiphase hybrid (CAMH) converter topology, with $V_{OUT}/V_{IN} = D/6$.



Figure 2. Simulated power efficiency.

This 3:1 CAMH converter with on-chip power transistors and gate drivers was designed using the TSMC 180-nm 70-V BCD process to support an input voltage of 24V – 60V and to generate an output voltage of 1 – 5V with a maximum output current  $I_L$  of 50A. In Fig. 2, the simulated peak power efficiency of the converter achieves 95% for 48V-to-1.8V at 500kHz and the full-load power efficiency is 86.5%. When the switching frequency becomes 300kHz, the peak power efficiency can reach over 96%. By summing the volumes of all components in the power stage, the peak current density (the maximum output current/volume of all power-stage components) achieves 712.5A/in<sup>3</sup>. This peak current density in the proposed converter is 16 times and 6.3 times higher than the prior art with on-chip power FETs and discrete eGaN FETs, respectively.
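The benefit of the $D/6$ conversion ratio and the size implied by the current density can be worked out from the numbers above. The sketch assumes ideal, lossless conversion ratios ($V_{OUT}/V_{IN} = D$ for a buck, $D/6$ for the CAMH converter) and takes the 50-A maximum current as the numerator of the peak current density; the function names are hypothetical.

```python
def duty_cycle(v_in, v_out, ratio_divisor):
    """Duty cycle D of a converter whose ideal gain is V_OUT / V_IN = D / ratio_divisor."""
    return ratio_divisor * v_out / v_in


def on_time(duty, f_sw):
    """Switch on-time (s) per period at switching frequency f_sw (Hz)."""
    return duty / f_sw


d_buck = duty_cycle(48.0, 1.8, 1)   # conventional buck: V_OUT / V_IN = D
d_camh = duty_cycle(48.0, 1.8, 6)   # CAMH: V_OUT / V_IN = D / 6
print(f"buck: D = {d_buck:.4f}, on-time = {on_time(d_buck, 500e3) * 1e9:.0f} ns")
print(f"CAMH: D = {d_camh:.4f}, on-time = {on_time(d_camh, 500e3) * 1e9:.0f} ns")

# Power-stage volume implied by 50 A at 712.5 A/in^3.
volume_in3 = 50.0 / 712.5
print(f"power-stage volume = {volume_in3:.4f} in^3")
```

For 48V-to-1.8V at 500kHz, the ideal buck on-time is 75 ns, while the CAMH on-time is 450 ns, which eases gate-driver timing and reduces the share of each cycle spent in switching transitions.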

**Keywords:** DC-DC converter, capacitor-assisted multiphase hybrid converter, high-conversion-ratio step-down converter, hybrid DC-DC converter, non-isolated converter

#### INDUSTRY INTERACTIONS

NXP, Texas Instruments

This project seeks to develop the first-generation IC gate driver for wide bandgap **bidi**rectional **s**witches (**BDS**s). The gate driver promises heightened BDS circuit performance on efficiency, reliability, cost, weight, and size, and could serve as an enabling solution universal to existing monolithic WBG BDS's.

#### **TECHNICAL APPROACH**

This project will investigate gate driving techniques and circuits to enable high-performance and high-reliability operation of the WBG BDSs. First, we will conduct a thorough study of the state-of-the-art BDS device structure and operation, and develop an accurate device model for computer-based simulations. Second, we will investigate innovative dynamic power rail bootstrapping (BST) techniques, which ensure effective BST rail charging. Third, we will replace separate gate drivers with a single coordinated one.

#### SUMMARY OF RESULTS

Since the project start on Jan. 1, 2024, we have conducted a comprehensive study on the operation and characteristics of state-of-the-art bidirectional GaN power FETs. We also investigated critical challenges and issues in driving these devices.



Figure 1. Cross-section of a classic bi-GaN power FET, T. Morita et al., "650 V 3.1 m$\Omega\cdot$cm<sup>2</sup> GaN-based monolithic bidirectional switch using normally-off gate injection transistor," IEDM, 2007.

Fig. 1 shows the cross section of a monolithic bidirectional switch, implemented by a double-gate GaN FET with high breakdown voltage and low specific on-state resistance. Such a double-gate common-drain structure enables bidirectional conduction without significantly increasing active area because the drift region is shared by two FETs. Also, the combination of low conduction and switching losses leads to high efficiency, making it a highly effective solution for high-power-density applications.



Figure 2. Block diagram of EiceDRIVER 2EDi (Infineon).

As a new type of power device with complex switching scenarios, BDSs pose critical gate-driving challenges. Among the very limited existing solutions, many require galvanic isolation. Although this improves device and signal reliability, it makes many gate driving techniques, such as switching slew rate, deadtime, and EMI control, difficult to implement. To mitigate this challenge, we have developed a drain-centered bootstrap (BST) technique (Fig. 3). Because BDS topologies can be highly diverse and complex, this technique provides a BST solution that potentially works for all BDS topologies. In addition, it can be implemented fully on-chip without the need for power diodes and magnetics.



Figure 3. Proposed drain-centered BST circuit.

**Keywords:** Bidirectional switch (BDS), gate driving techniques and circuits

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

This project aims to enhance EV traction control by developing a self-commissioning library and advanced PWM modulation schemes. Objectives include accurate parameter estimation, maximizing torque production using derived parameters, and leveraging Type-5 ePWM (enhanced pulse width modulator) in a 10kW power converter to demonstrate its advantages over traditional PWM methods.

#### TECHNICAL APPROACH

Synchronous Motors (SMs) are widely used in traction applications due to their high power density, dynamic performance, and efficiency. Accurate knowledge of their electrical and magnetic attributes is crucial for optimal control, yet challenging due to manufacturing tolerances and dynamic variations. To address this, a highly automated self-commissioning procedure is proposed for precise motor parameter identification, essential for initial powertrain testing and ongoing operation. This procedure, which includes periodic recalibration to account for system changes, will enhance control precision and system health monitoring. Using these parameters, advanced control strategies will be developed to optimize SM performance.

#### SUMMARY OF RESULTS

This project was initiated in 2024. We are currently working on Tasks 1 and 3 in parallel.

**Task 1:** Development of self-commissioning modules, testing these modules on synchronous motors, and comprehensive documentation to elucidate the underlying theory and procedure steps for end users.

The self-commissioning procedure involves steps designed for any SM and requires minimal input data. Despite module complexity, a user-friendly code and documentation structure ensures ease of use. Detailed information on code, procedures, and theory is provided for seamless integration with the TI platform, ensuring effective module utilization.

**Task 2:** Development of advanced control techniques based on the parameters obtained through self-commissioning to maximize torque production.

Parameters from the procedure will enhance advanced motor control techniques. Inductance mapping is vital for optimizing torque and current safety, especially in high-torque, low-speed scenarios. These techniques adapt to non-linearities by selecting efficient current vectors, thus improving performance and efficiency.

**Task 3:** Investigation of practical applications for Type-5 ePWM features in the context of power converter topologies used for electric vehicle onboard chargers, HV-LV DC-DC, and inverters.

The goal is to demonstrate the advantages of Type-5 ePWM through simulation studies. This task explores dual active bridge, multilevel, and stacked half-bridge topologies for EV systems, focusing on advanced Type-5 PWM features to enhance efficiency, power density, and cost. Comprehensive simulations and controller implementations will demonstrate reduced CPU utilization, control complexity, and losses.

**Task 4:** Applications of Type-5 ePWM with Dual Active Bridge (DAB) converters for advanced motor drive applications. We plan to implement a 10 kW 3-Level DAB inverter, popular among EV manufacturers, while remaining open to alternative topologies based on simulations. The study aims to identify modulation schemes and control methodologies using Type-5 ePWM features to enhance motor drive efficiency.



Figure 1. Self-commissioning steps.

**Keywords:** Self commissioning, vector control, parameter estimation

#### INDUSTRY INTERACTIONS

Texas Instruments, DRV Team

## **Fundamental Analog Thrust**



| Category                            | Accomplishment                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
|-------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Fundamental<br>Analog<br>(Circuits) | A type-III supply-regulated phase-locked loop (PLL) showcases the potential of ring oscillators for ultra-low noise applications. It utilizes a high-gain sampling phase detector to suppress in-band phase noise, while a low-noise multiphase oscillator reduces out-of-band noise. Fabricated in a 16-nm FinFET process, the prototype PLL operates over a wide frequency range of 7 to 14 GHz, achieving a low integrated                                                                                                                                                                                          |
|                                     | (3160.017, P. Hanumolu, University of Illinois, Urbana-Champaign)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| Fundamental<br>Analog<br>(Circuits) | A nanowatt subthreshold voltage reference minimizes temperature-induced current variation through a clock reference for adaptive duty-cycled operation and offers output voltage programmability via an integrated programmable DC-DC converter. Fabricated in a 0.18-µm CMOS process, it achieves a temperature coefficient of 176ppm/°C while consuming 4.6nW, reduces current variation to 2.75%/°C (a 400× improvement), and features 64-step output voltage programmability with 1.2-mV resolution. (2810.063, D. Sylvester, University of Michigan)                                                              |
| Fundamental<br>Analog<br>(CAD)      | A data-driven analog circuit synthesizer with automatic topology selection and sizing is demonstrated. An adaptive topology dataset is utilized, which can later be enhanced with synthetic data generated using variational autoencoders (VAE), a generative machine learning technique. This improves the predictive capabilities. Experiments involving over 360 OpAmp topologies and over 540K data points demonstrate the capability to generate designs within minutes while achieving quality comparable to that of experienced designers. (3160.007, D. Pan, University of Texas, Austin) |



The Texas Instruments Phase Light Modulator (TI-PLM) flexibly controls the phase of laser light in a 2-dimensional manner, which enables solid-state implementations of lidar and holographic 3D display applications. In this work, the TI-PLM is incorporated into a lidar system and into a subsystem that actively corrects the optical aberration of an AR image guide.

#### TECHNICAL APPROACH

**PLM for Lidar:** Solid-state fine field of view (FOV) steering capability is added to the Digital Micromirror Device (DMD) lidar system. Fine steering of the beam is performed by the PLM and coarse steering by the DMD. The DMD-PLM hybrid lidar system increases the resolution of the DMD lidar system threefold (to 0.054 deg) with a total FOV of 37 degrees.

**PLM for AR display:** TI-PLM displays tightly focused points over an aberrated AR (Augmented Reality) image combiner. A camera-in-the-loop (CITL) pipeline that consists of a digital camera, aberration detection, and correction of optical aberration demonstrates a first PLM-based phase conjugation over an aberrated AR combiner.

#### SUMMARY OF RESULTS

**PLM for Lidar**: We have confirmed that the TI-PLM can be used to steer lidar images. The optical system, depicted in Fig. 1(a), consists of a 905-nm 10-ns pulsed laser, a 0.67-inch TI-PLM (Fig. 1(b)), an F/1.2, f=50-mm focusing lens, and a 32x32-pixel MPPC (Multi-Pixel Photon Counter) detector module. The MPPC detector module triggers the 905-nm pulsed laser that illuminates the object "U" placed 50 cm away from the laser. The returning signal is first diffracted by the TI-PLM, on which a linear grating pattern is displayed. We used grating periods of 2p, 4p, 8p, and 16p, where p = 10.8 µm is the pixel pitch of the TI-PLM. Figs. 1(c), (d), and (e) show lidar images fine-steered by the TI-PLM. This shows that the infrared lidar "image" is steerable by the TI-PLM. The brighter 0<sup>th</sup>-order lidar image is rejected by optical filtering.
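
The fine-steering angles follow from the grating equation. As a rough sketch, assuming normal incidence and first-order diffraction (neither is stated above), the steering angle for a grating period $\Lambda$ is $\theta = \arcsin(\lambda/\Lambda)$; the function name below is illustrative.

```python
import math

WAVELENGTH_UM = 0.905   # 905-nm pulsed laser
PIXEL_PITCH_UM = 10.8   # TI-PLM pixel pitch p


def first_order_angle_deg(period_pixels: int) -> float:
    """First-order diffraction angle for a grating period of period_pixels * p.

    Assumes normal incidence: sin(theta) = wavelength / period.
    """
    period_um = period_pixels * PIXEL_PITCH_UM
    return math.degrees(math.asin(WAVELENGTH_UM / period_um))


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(f"period = {n}p -> theta = {first_order_angle_deg(n):.3f} deg")
```

Under these assumptions the angle shrinks roughly in proportion to the grating period, so longer periods give finer steering steps.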

**PLM for AR display:** We have confirmed that the TI-PLM displays a tightly focused image through an aberrated AR image guide combiner while varying the distance of the virtual point image. Fig. 2(a) shows an optical setup to display depth-varying point images while correcting the optical aberration of the AR image guide combiner. For the aberration measurement, off-axis holography and single-shot phase retrieval were employed. The measured aberration of the AR image guide is subtracted from the converging spherical wavefront. Figs. 2(b) and (c) show virtual point images displayed behind and in front of the AR image guide. As these figures show, TI-PLM enables manipulation of the focused spot in a 3-dimensional manner.



Figure 1. (a), (b) Optical setup of the lidar system with TI-PLM. (c), (d), (e) Steered TOF lidar images.



Figure 2. (a), (b) The optical setup of the holographic point display with TI-PLM. (c), (d) Tightly focused point through the aberrated AR combiner, demonstrated at different depths.

**Keywords:** PLM, Lidar, Image steering, HUD, Holographic

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

#### MAJOR PAPERS/PATENTS

[1] G. Nero, et al., "Two-dimensional solid-state diffractive beam steering by digital micromirror devices," SPIE Photonics West Paper 12900-22 (2024).

[2] R. Shrestha, et al., "Digital Phase Conjugation by Texas Instruments Phase Light Modulator for Near-to-Eye Display," SPIE Photonics West Paper 12900-26 (2024).

## TASK 2810.056, MILLIMETER WAVE PACKAGING RESEARCH – ANTENNA IN PACKAGE

RASHAUNDA HENDERSON, UNIVERSITY OF TEXAS AT DALLAS, RMH072000@UTDALLAS.EDU

HONGBING LU AND MARK LEE, UT DALLAS

#### SIGNIFICANCE AND OBJECTIVES

Antenna-in-package (AiP) is a key enabler for millimeter-wave front-end modules because it reduces losses and provides compactness. We focused on the performance of slot bowtie (SBT) and E-shaped patch (e-patch) antennas integrated into QFNs. Designs centered in the WR5 band have been characterized for -10-dB bandwidth (BW), peak realized gain, and efficiency.

#### **TECHNICAL APPROACH**

Broadband free-space measurements of frequency-dependent dielectric material properties are used in simulation to improve design accuracy of the AiPs. After obtaining vendor-supplied package substrates and mold compounds, relative permittivity and loss tangent were determined from 90-325 GHz at room temperature and 150 °C. Mechanical modeling and characterization of similar package configurations have been used in this study to understand failure implications.

#### SUMMARY OF RESULTS

Table 1 shows the footprint dimensions, -10-dB BW, and peak realized gain of the slot bowtie (SBT) and E-shaped patch (e-patch) as simulated in HFSS where the antennas are placed in free space (standalone configuration). The antennas were selected because they have -10-dB bandwidths on the order of 15-30%.

The first flip-chip enhanced QFN (FCeQFN) package test vehicle (TV1) was designed to support the slot bowtie antenna dimensions with 2-metal and 2-dielectric layers. To minimize radiation pattern interference, the antennas can be fed with a rectangular waveguide from the backside and require a package transition to excite a coplanar waveguide feed for the slot bowtie. A microstrip patch internal to the package is used to couple energy from the waveguide to a conductor-backed coplanar waveguide.

The e-patch was integrated into TV2, which contains two transition approaches with 3-metal and 2-dielectric layers to account for the thin dielectric needed to form the microstrip substrate. Two antennas centered at 127 GHz and 180 GHz can be fed with a ground-signal-ground (GSG) probe, while waveguide feeding is utilized for the 180-GHz antenna.

Fig. 1 shows photographs of the fabricated TVs with the following: (a) the top side of the bare TV, (b) completely molded (h=300 $\mu$m) TVs where no elements can be seen, (c) one TV that is laser etched for a standard GSG probe to feed a molded antenna where the planar feed lines are exposed, and (d) the backside of the TV that includes metal for assembly onto a PCB. The waveguide-fed SBT has demonstrated a -10-dB BW, gain, and efficiency ($\eta$) of 18 GHz, 6.03 dB, and 36%, respectively. Table 2 shows the simulated -10-dB BW of the package transitions used to transfer the backside feed signal through the package to the antennas and indicates how important these transitions are.

Table 1. Simulated performance of standalone antennas

| Antenna<br>type | X<br>(mm) | Y<br>(mm) | Z<br>(mm) | -10-dB<br>BW (GHz) | Peak gain<br>(dB) | η<br>(%) |
|-----------------|-----------|-----------|-----------|--------------------|-------------------|----------|
| SBT             | 1         | 1.5       | 0.475     | 72                 | 9.5               | 79       |
| e-patch         | 1.4       | 1.34      | 0.5       | 47                 | 6.8               | 50       |



Figure 1. QFN test vehicles using slot bowtie antennas (left) and e-patch antennas (right). Bare, laser-etched, fully molded, and backside metalized samples are shown.

Table 2. Simulated performance of package transitions

| Specifications           | App. 1<br>(via) | App. 2<br>(SIW) | App. 3<br>(coupling<br>antenna) | App. 1 (TV1)<br>(coupling<br>antenna) |
|--------------------------|-----------------|-----------------|---------------------------------|---------------------------------------|
| -10-dB BW (GHz)          | 55.4            | 36              | 43                              | 18                                    |
| min S<sub>21</sub> (dB)  | 1.81            | 2.06            | 1.48                            | 1.7                                   |

**Keywords:** e-patch, slot bowtie, QFN, radiation pattern

#### INDUSTRY INTERACTIONS

Intel, MediaTek, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] A. Jogalekar, et al., "A Novel Approach to Measure and Characterize Radiation Patterns of Antenna-in-Package," IEEE 73rd Electronic Components and Technology Conference (ECTC), June 2023, Orlando, FL.

[2] O. Medina, et al., "Substrate Temperature Effects on the Performance of mm-wave Antenna-in-Package," 2023 IEEE International Symposium on Antennas & Propagation & USNC-URSI Radio Science Meeting, July 2023.

Analog/RF devices are prone to process variability, which degrades both device performance and yield. To optimize Analog/RF IC testing for yield, our solution leverages machine learning models to minimize yield loss (Overkill) and test escapes (Underkill).

#### **TECHNICAL APPROACH**

To minimize Overkill, our three-step approach includes: (1) predicting auxiliary test values via multivariate regression models, (2) clustering these predictions with actual outcomes, and (3) identifying recoverable devices using a proximity-based metric. For Underkill, we utilize unsupervised GMM (Gaussian Mixture Model) clustering on measurements from multiple insertions to isolate devices likely to fail on-site and employ adaptive multivariate outlier detection for identifying potential customer returns. Fault IDs are determined through a multi-class neural network to eliminate the need for extensive failure analysis.

#### SUMMARY OF RESULTS

In this section, we summarize the previously explored Overkill and Underkill reduction work and present new results that extend the Underkill reduction to classify the fault-Id of customer returns.

In our efforts to reduce Overkill, we performed our experiments on an industrial dataset from Texas Instruments that consisted of 66 specification tests and 241 auxiliary tests performed on 92,022 devices. Of these devices, we focus on the 8,840 (9.6%) devices that pass the specification tests but fail the auxiliary tests. Using the two-class classifier in addition to our regression and clustering, we recovered 81.6% of the devices in our focus group (the highlighted entry in Table 1).

Table 1. Device Classification using a Two-class Classifier

| Auxiliary Tests | Specification Tests: Pass | Specification Tests: Fail |
|-----------------|---------------------------|---------------------------|
| Pass            | 80261 <b>+ 7217</b>       | 1623                      |
| Fail            | 726                       | 2195                      |
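
The percentages quoted above can be reproduced from the Table 1 entries. The short check below assumes that the 1,623 entry counts the focus-group devices that were not recovered, an interpretation suggested by 7,217 + 1,623 = 8,840 rather than stated in the text; the variable names are illustrative.

```python
# Counts taken from Table 1.
baseline_pass = 80261     # first term of the Pass/Pass cell
recovered = 7217          # highlighted term: devices recovered by the classifier
not_recovered = 1623      # assumed: focus-group devices left unrecovered
other_cells = 726 + 2195  # remaining two cells of the table

total_devices = baseline_pass + recovered + not_recovered + other_cells
focus_group = recovered + not_recovered

focus_share = 100.0 * focus_group / total_devices
recovery_rate = 100.0 * recovered / focus_group

if __name__ == "__main__":
    print(f"total devices: {total_devices}")
    print(f"focus group:   {focus_group} ({focus_share:.1f}%)")
    print(f"recovered:     {recovery_rate:.1f}% of the focus group")
```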

In our efforts to reduce Underkill, we proposed a three-step approach: feature selection, clustering using GMM, and adaptive outlier detection. We performed our experiments on an industrial dataset from Texas Instruments consisting of devices from 19 wafers with a recorded customer return on each wafer. Applying our proposed methodology, we achieved coverage (correctly identified customer returns) of 89% to 100%. The outlier detection model incurs an additional yield loss that falls from 3.48% to 1.8% as the training set grows from 10 wafers to 18 wafers.

Table 2. Fault-Id Classification using Multi-Class Classifier

| Actual vs<br>Predicted | Fault-Id 1 | Fault-Id 2 | Fault-Id 3 |
|------------------------|------------|-------|------------|
| Fault-Id 1             | 81.5%      | 18.5% | 0%         |
|                        | 100%       | 0%    | 0%         |
|                        | 32.5%      | 67.5% | 0%         |
|                        | 24.5%      | 75.5% | 0%         |
|                        | 0%         | 7.5%  | 92.5%      |

Finally, in our efforts to classify the fault-Id of customer returns, we performed our experiment on an industrial dataset from Texas Instruments that consisted of 19 customer returns categorized into three fault-Ids. The multi-class classifier model is a feedforward neural network with three hidden layers, each using ReLU (Rectified Linear Unit) activation and dropout layers to prevent overfitting, and an output layer with SoftMax activation for 3-class classification.

To train and test the model we used an 80-20 split of devices belonging to each fault-Id and the results of classification are recorded in Table 2. We can classify the correct fault-Id of customer returns with 100% accuracy if our predictions are subjected to majority vote.
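
A majority vote of this kind can be sketched as follows. The sketch treats each row of Table 2 as the share of predictions assigned to each fault-Id for one group of test devices and picks the most frequent one. The helper name, this reading of the table, and the labeling of the middle column as fault-Id 2 are assumptions, not the authors' implementation.

```python
from typing import Dict, List

# Prediction shares (%) per row of Table 2, keyed by predicted fault-Id.
# The middle column is assumed to correspond to fault-Id 2.
TABLE_2_ROWS: List[Dict[int, float]] = [
    {1: 81.5, 2: 18.5, 3: 0.0},
    {1: 100.0, 2: 0.0, 3: 0.0},
    {1: 32.5, 2: 67.5, 3: 0.0},
    {1: 24.5, 2: 75.5, 3: 0.0},
    {1: 0.0, 2: 7.5, 3: 92.5},
]


def majority_vote(shares: Dict[int, float]) -> int:
    """Return the fault-Id that received the largest share of predictions."""
    return max(shares, key=lambda fault_id: shares[fault_id])


if __name__ == "__main__":
    for row in TABLE_2_ROWS:
        print(row, "-> fault-Id", majority_vote(row))
```

Because the vote keeps only the dominant class in each row, minority mispredictions such as the 18.5% in the first row do not affect the final label.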

**Keywords:** Yield recovery, machine learning, adaptive testing, failure analysis

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

#### MAJOR PAPERS/PATENTS

[1] D. Neethirajan, et al., "Machine Learning-Based Overkill Reduction through Inter-Test Correlation," IEEE VLSI Test Symposium (VTS), 2022.

[2] V. A. Niranjan, et al., "Machine Learning-Based Adaptive Outlier Detection for Underkill Reduction in Analog/RF IC Testing," IEEE VLSI Test Symposium (VTS), 2023.

There has been tremendous progress in ADC performance. The research community has focused on conversion capability, including efficiency and, in particular, the energy Figure of Merit (FoM). This research will redefine ADCs as information extraction tools, dramatically increasing their capability and utility.

#### **TECHNICAL APPROACH**

Existing sensor interfaces create too much data and provide too little information. Machine-learning approaches, such as feature extraction and classification, can overcome bandwidth and power limitations; however, traditional machine-learning methods are expensive. We propose a new class of intelligent and aware ADCs that directly extract information. We also use neural networks to improve ADC performance.

#### SUMMARY OF RESULTS

This work introduces a speech recognition front-end system that solves the problems of conventional adaptive beamforming (ABF) with (1) low digital signal processing (DSP) power consumption (3x lower than state-of-the-art ABF) thanks to an innovative greedy blocking matrix (GBM) employing simple calculations, (2) automatic direction-of-arrival (DOA) error compensation with a direction-tracking delay-and-sum beamformer aided by the GBM, (3) a multi-mode hybrid analog-to-digital converter (ADC) that adapts to signal conditions, and (4) multi-mode beamforming that takes advantage of a high signal-to-noise ratio (SNR) to reduce total power by 54%. A prototype fabricated in a 40-nm CMOS process occupies 0.94 mm<sup>2</sup> while consuming 157 µW and 72 µW in high-power and low-power modes, respectively. The proposed system improves speech recognition accuracy from 54% to 83% in noisy conditions.

Fig. 1 shows a simplified block diagram of a beamforming automatic speech recognition (ASR) system. ADCs digitize the analog signals from microphones. Next, a beamformer generates a noise-suppressed speech signal. Finally, a feature extractor facilitates efficient DNN speech recognition. Speech processing presents challenges to both ADCs and digital beamforming in an ASR front end. First, the SNR of the audio front-end ADC should be high (>80 dB) for high-quality audio processing and speech recognition. Second, a low-power on-chip speech recognition DNN classifier consumes a few hundred $\mu$W, which makes the ADC and beamformer power consumption a significant portion of the total ASR power consumption.
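
For orientation, the delay-and-sum operation at the core of such a beamformer can be sketched as below. This is a minimal illustration with integer sample delays and hypothetical names; it does not include the greedy blocking matrix or the direction tracking described above.

```python
from typing import List, Sequence


def delay_and_sum(channels: Sequence[Sequence[float]],
                  delays: Sequence[int]) -> List[float]:
    """Align each microphone channel by its integer sample delay, then average.

    delays[c] is the number of samples by which channel c lags the reference,
    so sample n of the output uses channels[c][n + delays[c]].
    """
    if len(channels) != len(delays):
        raise ValueError("one delay is required per channel")
    if any(d < 0 for d in delays):
        raise ValueError("delays must be non-negative")
    # Only indices that exist in every delayed channel can be combined.
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    if usable <= 0:
        return []
    n_ch = len(channels)
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / n_ch
        for n in range(usable)
    ]


if __name__ == "__main__":
    source = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
    mic0 = source            # reference microphone
    mic1 = [0.0] + source    # same signal arriving one sample later
    print(delay_and_sum([mic0, mic1], [0, 1]))
```

When the delays match the true arrival offsets, the speech components add coherently while uncorrelated noise averages down, which is the basic noise-suppression mechanism a beamformer relies on.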



Figure 1. Multi-mode automatic speech frontend.

Time-interleaving of SAR ADCs is an essential technique for high-speed analog-to-digital conversion. Traditional interleaved SAR ADCs require a large die area and have limited conversion speed due to the overhead of multiple switched-capacitor DACs. Additionally, difficulties with matching between interleaved ADC channels limit performance. We tackle these challenges with (1) a time-interleaved charge-injected cell (CIC) SAR ADC, which benefits from hardware sharing of CIC cells for a small area, and (2) a hardware-friendly neural-network calibration scheme (Fig. 2). Neural networks can effectively learn nonlinear functions between two sets of data. In this way, we can calibrate multiple error sources present in the ADC without having to first characterize them explicitly.



Figure 2. 6GS/s ADC with neural network calibration.

**Keywords:** ADC, audio, neural network, calibration

#### INDUSTRY INTERACTIONS

Analog Devices, Intel, MediaTek, NXP

#### MAJOR PAPERS/PATENTS

[1] T. Kang, S. Lee, S. Song, M. R. Haghighat and M. P. Flynn, "A Multimode 157 $\mu$W 4-Channel 80 dBA-SNDR Speech Recognition Frontend With Direction-of-Arrival Correction Adaptive Beamformer," in IEEE Journal of Solid-State Circuits, doi: 10.1109/JSSC.2023.3327967.

[2] E. Ware, J. Correll, S. Lee, and M. P. Flynn, "6GS/s 8-channel CIC SAR TI-ADC with Neural Network Calibration," IEEE European Solid-State Circuits Conference (ESSCIRC), September 2022.

## TASK 2810.062, MULTI-CARRIER DAC-BASED TRANSMITTER ARCHITECTURES FOR 100+GB/S SERIAL LINKS

SAMUEL PALERMO, TEXAS A&M UNIVERSITY, SPALERMO@TAMU.EDU SEBASTIAN HOYOS, TEXAS A&M UNIVERSITY

#### SIGNIFICANCE AND OBJECTIVES

Clock jitter places fundamental performance limitations on common wireline transmitters, necessitating clock generation and distribution circuitry that achieve rms jitter of a few hundred femtoseconds. The DAC-based transmitter design techniques in this project aim to significantly improve jitter robustness and reduce system equalization complexity.

#### **TECHNICAL APPROACH**

A new multi-carrier DAC-based transmitter architecture is in development that is capable of providing jitter robustness for baseband and coherent multi-tone modulation applications. The transmitter utilizes novel techniques to improve the wireline polar transmitter speed and efficiency, including a high-speed injection-locked oscillator-based digital phase modulator and DAC-based FIR filtering in the segmented output driver. Efficient digital FIR filtering and linearization techniques, including a look-up table equalizer and an output stage pre-distortion DAC, are also in development.

#### SUMMARY OF RESULTS



Figure 1. Multi-carrier DAC-based transmitter.

Fig. 1 shows the proposed first-generation 50Gb/s multi-carrier TX that leverages carrier orthogonality to allow band overlap, with three 5GS/s bands of BB PAM4 and MB and HB 16-state complex modulation on respective 5 and 10GHz carriers [1, 2]. The 312.5MHz DSP generates per-band 16 parallel 7-b amplitude plus 2-b predistortion codes and 16 parallel MB and HB 7-b phase codes that then pass through 16:2 MUXes before the final 2:1 serialization in the parallel DAC-based output stages. This results in three independent 5GS/s signals that are then current-mode combined at the driver outputs to form the multicarrier signal. The BB segment utilizes a conventional CML-based DAC driver, while MB and HB segments use polar CML DAC drivers to combine the symbol amplitude and phase values. The TX output network employs a $\pi$-coil network for bandwidth extension and bleeder circuits to maintain proper output common-mode level.
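
The aggregate data rate follows directly from this band plan. The check below assumes 2 bits per PAM4 symbol and 4 bits per 16-state symbol (the base-2 logarithm of the number of states); the variable names are illustrative.

```python
import math

SYMBOL_RATE_GSPS = 5.0  # each band runs at 5 GS/s
# Number of symbol states per band: PAM4 baseband, 16-state MB and HB.
STATES_PER_BAND = {"BB": 4, "MB": 16, "HB": 16}

bits_per_symbol = {band: math.log2(n) for band, n in STATES_PER_BAND.items()}
band_rate_gbps = {band: SYMBOL_RATE_GSPS * b for band, b in bits_per_symbol.items()}
total_rate_gbps = sum(band_rate_gbps.values())

# The 312.5-MHz DSP produces 16 parallel codes per band per clock cycle.
dsp_symbol_rate_gsps = 0.3125 * 16

if __name__ == "__main__":
    for band, rate in band_rate_gbps.items():
        print(f"{band}: {rate:.0f} Gb/s")
    print(f"total: {total_rate_gbps:.0f} Gb/s")
    print(f"DSP-side symbol rate: {dsp_symbol_rate_gsps:.1f} GS/s")
```

The 16-way parallelism at 312.5 MHz matches the 5-GS/s symbol rate per band, so the 16:2 and 2:1 serialization stages change only the clock rate, not the throughput.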

The proposed multicarrier TX was fabricated in a 22-nm FinFET process and occupies 0.18 mm<sup>2</sup>. Operation at 50Gb/s is achieved over a channel with 21.5dB loss at 12.5GHz by testing with a multi-carrier receiver developed in SRC Task 2810.013 (Fig. 2). A second-generation 128Gb/s multi-carrier TX was taped out in a 16-nm FinFET process in Fall 2023.



Figure 2. 50Gb/s measurements: (a) BB, (b) MB, and (c) HB.

**Keywords:** Digital-to-analog converter, frequency-interleaving, jitter, transmitter, serial link

#### INDUSTRY INTERACTIONS

Intel, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] I.-M. Yi, et al., VLSI Sym., June 2023, Kyoto, Japan.

[2] I.-M. Yi, et al., accepted in IEEE JSSC.

## TASK 2810.063, ANALOG AND DIGITAL ASSIST TECHNIQUES TO IMPROVE MIXED-SIGNAL PERFORMANCE

DENNIS SYLVESTER, UNIVERSITY OF MICHIGAN, DMCS@UMICH.EDU DAVID BLAAUW, UNIVERSITY OF MICHIGAN

#### SIGNIFICANCE AND OBJECTIVES

Subthreshold voltage references operate at nW levels while maintaining low TC and PSRR, but suffer from worse process variation and larger current/power variation across temperatures. We propose a duty-cycled subthreshold voltage reference with programmable output voltage for ultra-low power IoT applications.

#### **TECHNICAL APPROACH**

The reference voltage 'seed' is initially generated by a 2-transistor (2T) voltage reference, but we power-gate the 2T and make its duty cycle inversely proportional to temperature. As a result, power variation is significantly reduced for the proposed subthreshold reference. Moreover, we propose a programmable voltage converter that takes the voltage seed to generate a higher reference voltage while maintaining a low TC. This significantly improves the output voltage range of the 2T reference and can be used to compensate for process variation.

#### SUMMARY OF RESULTS

The proposed voltage reference is fabricated in 0.18-µm CMOS and occupies 0.026 mm<sup>2</sup> (including on-chip flying capacitors and a 1-pF output capacitor). Across 15 dies, the average value of V<sub>ref</sub> for the proposed circuit is 1.086 V, ~2.7X that of the conventional 2T reference, while the standard deviation of the proposed circuit increases proportionally by 2.9X, indicating that the voltage programmer does not introduce added variation across chip samples. The circuit has a 461ppm/°C TC (using just one clock frequency for all temperatures), while the conventional 2T reference has a 106ppm/°C TC at the same V<sub>dd</sub>. Despite the higher TC, the proposed circuit limits the range of current consumption across -20 °C to 100 °C to 1.5-5 nA, which is a ~400X reduction in current spread compared to the conventional 2T reference. At 25 °C, the proposed voltage reference consumes 4.6 nW including timing sequence generation. Better temperature performance can be obtained using multiple clock frequencies across operating temperatures. We did not implement this in a closed-loop fashion on this test chip, but it is a possible extension. The measurements in the comparison table show the gain possible using this technique (in an open-loop manner).
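
The programmability and current-range figures can be cross-checked with simple arithmetic. The sketch below assumes that the 0.11% step listed in the comparison table is relative to the 1.086-V average output, which the text does not state explicitly; the variable names are illustrative.

```python
V_REF_MEAN_V = 1.086        # average V_ref across 15 dies
STEP_FRACTION = 0.11 / 100  # 0.11% per step (assumed relative to V_REF_MEAN_V)
NUM_STEPS = 64              # number of programmable output codes

# Absolute size of one programming step, in millivolts.
step_mv = V_REF_MEAN_V * STEP_FRACTION * 1e3
# Distance between the first and last code, in millivolts.
span_mv = step_mv * (NUM_STEPS - 1)

# Reported current consumption range over -20 °C to 100 °C.
i_min_na, i_max_na = 1.5, 5.0
current_ratio = i_max_na / i_min_na

if __name__ == "__main__":
    print(f"step size:    {step_mv:.2f} mV")
    print(f"tuning span:  {span_mv:.1f} mV over {NUM_STEPS} codes")
    print(f"max/min current over temperature: {current_ratio:.2f}x")
```

Under this assumption the step comes out at roughly 1.2 mV, which agrees with the 1.2-mV resolution quoted for this task in the thrust accomplishments table.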



Figure 1. (a) 2T voltage reference, (b) current variation of 2T reference across temp., (c) proposed subthreshold voltage reference.

|                         | This<br>work              | ESSCIRC<br>'18 | TCAS-<br>II '18 | ISCAS<br>'21        |
|-------------------------|---------------------------|----------------|-----------------|---------------------|
| Process (nm)            | 180                       | 180            | 130             | 180                 |
| Area (mm <sup>2</sup> ) | 0.026                     | 0.0012         | 0.003           | 0.002               |
| Chip Samples            | 15                        | 27             | 45              | -                   |
| Vdd (V)                 | 2 - 4                     | 0.5 - 2.5      | 1.1 - 2.4       | 1.0                 |
| Power (nW)              | 4.6                       | 0.65           | 27.5            | 1.35                |
| σ/μ (%)                 | 0.41                      | 0.30           | 5               | 2                   |
| Temp.<br>Range (°C)     | -20 -<br>100 <sup>b</sup> | -40 - 125      | -40 - 85        | -10 - 110           |
| TC (ppm/°C)             | 176 <sup>b</sup>          | 152.8          | 100             | 400                 |
| LS (%/V)                | 2.2                       | 0.031          | 2               | 0.7 - 2             |
| PSRR (dB)<br>@ 100Hz    | -38                       | -61.5          | N/R             | N/R                 |
| Programmable<br>Output  | Yes<br>64 ×<br>0.11%      | No             | No              | Yes<br>16 ×<br>5.2% |

Table 1. Comparison table, <sup>b</sup>4 clock frequencies are used.

**Keywords:** Subthreshold voltage reference, Ultra-low power circuit, Temperature coefficient, PSRR, Voltage programmability

#### INDUSTRY INTERACTIONS

#### NXP

#### MAJOR PAPERS/PATENTS

[1] Y. Peng, et al., "A 4.6nW subthreshold voltage reference with 400X current...," European Solid-State Circuits Conference (ESSCIRC), 2023, Lisbon, Portugal.

## TASK 2810.071, ACCURATE COMPACT TEMPERATURE SENSORS FOR THERMAL MANAGEMENT OF HIGH PERFORMANCE COMPUTING PLATFORMS

RANDALL GEIGER, IOWA STATE UNIVERSITY, RLGEIGER@IASTATE.EDU DEGANG CHEN, IOWA STATE UNIVERSITY

#### SIGNIFICANCE AND OBJECTIVES

The objective is to develop a strategy for designing compact densely distributed temperature sensors for real-time power-thermal management with the accuracy needed for reliably managing failure mechanisms inherent in silicon. Significance is in providing sensor output as a key input into a robust power/thermal management controller.

#### **TECHNICAL APPROACH**

Compact temperature sensors that can be widely dispersed at critical locations throughout an integrated circuit will be designed. Tentatively these sensors will be a single small MOS transistor or pairs of MOS transistors where temperature is embedded in the I-V characteristic of these devices. Located at a less-critical location where area requirements are relaxed will be a Temperature Management Controller (TMC) that extracts temperature from an array of temperature sensors. The interrelationship between the temperature of the TMC and the temperature at remote temperature sensor locations will be managed with an appropriate calibration algorithm.

#### SUMMARY OF RESULTS

The target performance of the compact temperature sensors is absolute accuracy of ±100mK over the critical temperature window from 75°C to 95°C with accuracy relaxed to 3°C at temperatures below 50°C. This should provide the temperature accuracy needed for managing the variation in the thermal-restricted mean time to failure (MTTF) of an integrated circuit to approximately 10% of a target MTTF value.

One of three different temperature sensors designed in this project is shown in Fig. 1. It was fabricated in a 180nm CMOS process. In this circuit, each DUT had 36 temperature sensors spread across the die. Measurements of 252 temperature sensors across 7 DUTs were used to train a nonlinear model. After training, a two-point calibration at 80°C and 90°C was used to correct for gain and offset for each sensor. Measured results, obtained using a Fluke 7103 oil bath to control temperature, are shown in Fig. 2. The measured temperature error over the critical 75°C to 95°C range for the 252 temperature sensors was less than ±70mK. Over the -20°C to 100°C range the measured error was just over ±1°C.
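The two-point gain and offset correction mentioned above can be illustrated with a minimal sketch. It assumes a purely linear residual error after the nonlinear model; the function names and the example sensor error (2% gain error, -0.5°C offset) are hypothetical and are not taken from the project.

```python
def two_point_calibration(raw_lo, raw_hi, t_lo=80.0, t_hi=90.0):
    """Return (gain, offset) that map raw readings onto the two reference temperatures."""
    if raw_hi == raw_lo:
        raise ValueError("calibration readings must differ")
    gain = (t_hi - t_lo) / (raw_hi - raw_lo)
    offset = t_lo - gain * raw_lo
    return gain, offset


def apply_calibration(raw, gain, offset):
    """Correct a raw reading with the per-sensor gain and offset."""
    return gain * raw + offset


def uncalibrated(true_temp):
    """Hypothetical sensor estimate with a linear gain and offset error."""
    return 1.02 * true_temp - 0.5


# Calibrate at the two reference points, then check a point inside the window.
gain, offset = two_point_calibration(uncalibrated(80.0), uncalibrated(90.0))
corrected_85 = apply_calibration(uncalibrated(85.0), gain, offset)
print(f"corrected reading at 85 C: {corrected_85:.3f}")
```

With a purely linear error the correction is exact; any residual nonlinearity between and beyond the two calibration points would remain as error.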

Both theoretical and experimental results demonstrate that these compact temperature sensors offer potential for use in multi-site on-chip power-thermal management applications and, more importantly, suggest system designers can enhance system reliability by establishing tighter tolerances on temperature accuracy relative to what is typically reported.



Figure 1. Compact Two-Transistor temperature sensor.



Figure 2. Measured results for 252 samples.

**Keywords:** temperature sensor, thermal mask, power/thermal management, reliability

#### INDUSTRY INTERACTIONS

**Texas Instruments** 

#### MAJOR PAPERS/PATENTS

[1] B. Gadogbe, et al., "Very Compact Temperature Sensor for Power/Thermal Management," 2023 IEEE 66th MWSCAS, Tempe, AZ, USA, 2023, pp. 142-146.

[2] R. Yang, et al., "A Compact and Accurate MOS-based Temperature Sensor for Thermal Management," 2023 IEEE 66th MWSCAS, Tempe, AZ, USA, 2023, pp. 594-598.

## TASK 2810.076, HIGH PRECISION POSITIONING TECHNIQUES BASED ON MULTIPLE TECHNOLOGIES AND FREQUENCY BANDS

NAOFAL AL-DHAHIR, UNIVERSITY OF TEXAS AT DALLAS, ALDHAHIR@UTDALLAS.EDU MURAT TORLAK, UNIVERSITY OF TEXAS AT DALLAS

#### SIGNIFICANCE AND OBJECTIVES

Channel State Information (CSI)-based WiFi ranging achieves performance gains over received signal strength information (RSSI)-based and time-stamp-based ranging, especially in multipath environments. However, there is a growing need to move from model-based to data-driven ranging when an accurate mathematical model is not available to relate Wi-Fi CSI to range/location information.

#### **TECHNICAL APPROACH**

Data-driven methods (Support Vector Machines (SVM), Convolutional Neural Networks (CNN), and Deep Neural Networks (DNN)) make range/location estimators robust against changes in the environment, signal, and system parameters. We propose a deep-learning-based localization model inspired by a human pose estimation model that takes as input two-dimensional (2D) Inverse Fast Fourier Transform (IFFT) estimations of Time-of-Flight (ToF) and Angle-of-Arrival (AoA) from two-way and one-way WiFi CSI and outputs a location heatmap in x and y coordinates. We compute the Root Mean Square Error (RMSE) between the target and estimated heatmaps of line-of-sight (LOS) locations in (x, y) coordinates as the loss in each stage.

#### SUMMARY OF RESULTS

Fig. 1 depicts the structure of our proposed neural network for WiFi localization. The blocks denoting "C" are the convolutional layers, "MP" are the max-pooling layers reducing the input size by 2, and "AP" is the average pooling layer. We sequentially deployed 4 stages to estimate the location from the 2D IFFT. The loss function considers the intermediate losses from each stage, thus mitigating the effect of the gradient vanishing problem. The numbers in each block denote the number of filters or kernels used in that layer. If the loss function from stage *i* is  $L_i$ , then the overall loss function of the network is the sum of the losses of the 4 stages.
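The summed multi-stage loss described above, $L = \sum_{i=1}^{4} L_i$ with each $L_i$ an RMSE between the target heatmap and a stage's estimate, can be sketched as follows. This is a minimal illustration in plain Python; the tiny 2×2 heatmaps and the function names are hypothetical, and an actual training setup would compute the same quantity inside a deep-learning framework.

```python
import math


def rmse(target, estimate):
    """Root mean square error between two equally sized 2-D heatmaps."""
    squared = [
        (t - e) ** 2
        for target_row, estimate_row in zip(target, estimate)
        for t, e in zip(target_row, estimate_row)
    ]
    return math.sqrt(sum(squared) / len(squared))


def total_loss(target, stage_outputs):
    """Overall loss: sum of the per-stage RMSE losses L_i."""
    return sum(rmse(target, output) for output in stage_outputs)


# Hypothetical target heatmap and four stage estimates that get progressively closer.
target = [[0.0, 1.0], [0.0, 0.0]]
stage_outputs = [
    [[0.5, 0.5], [0.5, 0.5]],
    [[0.0, 0.5], [0.0, 0.0]],
    [[0.0, 0.8], [0.0, 0.0]],
    [[0.0, 1.0], [0.0, 0.0]],
]
loss = total_loss(target, stage_outputs)
print(f"total loss over 4 stages: {loss:.3f}")
```

Because every stage contributes its own term, early stages receive a direct error signal rather than relying only on gradients propagated back from the final stage.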

Fig. 2 depicts the empirical cumulative distribution function (ECDF) performance of the investigated deep-neural-network-based location estimator compared to 2D IFFT and the model-based super-resolution MUSIC estimator. The model is tested on WiFi testbed data. As seen from the plots, the 90th percentile and median error performance of WiFi 2D localization are improved by up to 50% and 53%, respectively, by using our proposed CNN-based location estimation method compared to model-based schemes in a practical multipath-rich environment.



Figure 1. Proposed CNN-based WiFi localization deep network.





**Keywords:** Centimeter-level, Internet of Things (IoT), localization, ranging, WiFi

#### INDUSTRY INTERACTIONS

MediaTek, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] J. P. Van Marter, et al., "A Multichannel Approach and Testbed for Centimeter-Level WiFi Ranging," in *IEEE Journal of Indoor and Seamless Positioning and Navigation*, Feb. 2024.

[2] S. Helwa, et al., "Bridging the Performance Gap Between Two-Way and One-Way CSI-Based 5 GHz Wi-Fi Ranging," in *IEEE Access*, Jun. 2023.

[3] S. N. Shoudha, et al., "WiFi 5GHz CSI-based Single-AP Localization with Centimeter-Level Median Error," in *IEEE Access*, Oct. 2023.

We aimed to integrate our previously designed Phase Shifter and Power Amplifier-Low Noise Amplifier (PALNA) for 70-95 GHz terabit beamformers from the original bond wire based packaging to a flip-chip packaging using an Integrated Passive Device (IPD) technology.

#### **TECHNICAL APPROACH**

To integrate a 2-element W-band phased-array TX/RX system, this project will utilize flip-chip integrated passive device (IPD) technology. The initial designs of the Phase Shifter and PALNA included up to 24 DC pads, creating significant integration challenges. By applying insights from previous studies, we aim to address these issues using flip-chip design, which integrates DC power supplies to minimize the need for external power lines. Additionally, this approach offers the advantage of lower loss, enhancing overall system performance.

#### SUMMARY OF RESULTS

Integrated Passive Device (IPD) technology was chosen to package the sub-circuits due to its high reliability at high frequencies. Additionally, IPD enables integration of passive and active components in a compact area, which is crucial for maintaining performance in complex systems. Despite potential challenges such as thermal management and precise alignment in high-frequency applications, the advantages of IPD in enhancing circuit performance make it a better choice. In this work, we designed a Wilkinson power divider and a Quasi Yagi-Uda antenna, integrating them with a two-way bi-directional phase shifter and a bi-directional PALNA.



Figure 1. Architecture and layout for the 2-element W-band phased-array Tx/Rx system.

The Wilkinson power divider implemented on the IPD platform achieves 0.25-dB insertion loss and 20-dB isolation at 70 to 95 GHz. These results indicate its efficiency and effectiveness in minimizing signal loss and interference. Furthermore, the Quasi Yagi-Uda antenna demonstrates a gain of 3 dBi and a return loss of 10 dB at 78 to 85 GHz, ensuring reliable signal reception and transmission capabilities. The complete IPD circuit layout, as depicted in Fig. 1, occupies a compact chip area of 21.6 mm<sup>2</sup>. This part has been taped out, and we plan to conduct measurements in July to validate the design and address any unforeseen integration challenges.



Figure 2. Simulated gain and phase LNA and phase shifter on IPD. (a),(b) LNA mode, (c),(d) PA mode.



Figure 3. Wilkinson power divider simulation result. (a) insertion loss, and (b) isolation.



Figure 4. The simulated pattern of the Quasi Yagi-Uda antenna.

**Keywords:** W-band, phased-array, IPD, 65-nm CMOS

#### INDUSTRY INTERACTIONS

MediaTek

#### MAJOR PAPERS/PATENTS

[1] Wen-Jie Lin, Jeng-Han Tsai, Jen-Hao Cheng, Wei-Heng Lin, Tung-Tsen Chiang and Tian-Wei Huang, "A 67-86 GHz Spectrum-Efficient CMOS Transmitter Supporting 1024-QAM with A Process-Variation-Tolerant Design," IEEE Access, vol. 8, pp. 74458-74471, Mar. 2020.

## TASK 2810.082, ADAPTIVE DIGITAL CANCELLATION OF DYNAMIC ERROR FROM CLOCK SKEW, COMPONENT MISMATCHES, AND ISI IN HIGH-RESOLUTION RF DACS

IAN GALTON, UNIVERSITY OF CALIFORNIA AT SAN DIEGO, GALTON@UCSD.EDU

#### SIGNIFICANCE AND OBJECTIVES

The project is developing digital calibration techniques that adaptively measure and cancel both static and dynamic errors from clock skew, component mismatches, and ISI in current-steering RF DACs. It will provide experimental validation via two 22-nm CMOS DAC ICs with performance beyond the current state-of-the-art.

#### **TECHNICAL APPROACH**

Part 1 of the project is developing a current-steering 3-GHz DAC IC with a target worst-case Nyquist-band SNDR of 72 dB enabled by a recently developed subsampling mismatch-noise cancellation (MNC) technique. RZ signaling is used to prevent ISI from limiting performance. Part 2 has developed a subsampling ISI cancellation (ISIC) technique, and Part 3 will develop a second-generation version of the Part 1 DAC IC which includes both the subsampling MNC and ISIC techniques. The ISIC technique eliminates the need for RZ signaling, which will enable a doubling of the sample rate to 6 GHz without degrading the DAC's performance.

#### SUMMARY OF RESULTS

The target specifications are extremely challenging. We have finished the Part 1 IC's system-level design and register-transfer-level digital design, and we have completed the design and layout of the most critical analog blocks. We are currently working on the synthesis and automatic place-and-route realization of the on-chip digital calibration engine, the final design and layout of the remaining analog blocks, and high-level layout of the IC.

We completed the Part 2 research ahead of schedule, and the results achieved are better than initially anticipated. Specifically, we have developed an enhanced version of the originally-anticipated ISIC technique that adaptively measures and accurately cancels error from ISI over a high-resolution DAC's first Nyquist band, thereby circumventing the need for return-to-zero (RZ) pulse shaping. It is an extension of the MNC technique. While the MNC technique suppresses both static and dynamic error over the DAC's first Nyquist band from component mismatches and clock skew, it does not mitigate ISI. The ISIC technique complements the MNC technique in that it mitigates ISI. It can be implemented by itself or together with the MNC technique, and, like the MNC technique, it can be operated in both foreground and background calibration modes. When implemented together, the ISIC and MNC techniques can operate simultaneously and share the same analog circuitry without interfering with each other.

The Part 2 research results are better than initially anticipated because the proposed ISIC technique has a different form than initially envisioned and the new form offers two unexpected benefits. One benefit is that the ISIC technique can be run simultaneously and share circuitry with the MNC technique as mentioned above. The other benefit is that the ISIC technique's convergence rate is significantly higher than we initially thought possible. It is on par with that of the MNC technique. These benefits are convenient given that the two techniques will be implemented together in the Part 3 IC. We have published a paper that presents a rigorous theoretical analysis of the ISIC technique and demonstrates the technique's performance in conjunction with the MNC technique via simulation results [1], and we have submitted a provisional patent application [2].

We have also invented a fast circuit-level simulation technique enabled by an extended ISI analysis. The technique quantifies nonlinearity in high-performance current-steering DACs with an order-of-magnitude reduction in simulation times.

**Keywords:** DAC, ISI, mismatch-cancellation, ISI-cancellation, digital calibration

#### INDUSTRY INTERACTIONS

MediaTek, NXP

#### MAJOR PAPERS/PATENTS

[1] S. Kim, et al., "Adaptive Cancellation of Inter-Symbol Interference in High-Speed Continuous-Time DACs," *IEEE TCAS-I*, vol. 70, no. 11, pp. 4309-4322, Nov. 2023.
[2] I. Galton, "Adaptive Cancellation of Inter-Symbol Interference in High-Speed Continuous-Time DACs," U.S. Patent Provisional Application 63/584,540, Sept. 22, 2023.

## TASK 2810.083, AUTOMATED LAYOUT OF ANALOG ARRAYS IN ADVANCED TECHNOLOGY NODES

SACHIN S. SAPATNEKAR, UNIVERSITY OF MINNESOTA, SACHIN@UMN.EDU RAMESH HARJANI, UNIVERSITY OF MINNESOTA

#### SIGNIFICANCE AND OBJECTIVES

High power densities of FinFETs in power amplifiers (PAs) cause self-heating (SH), degrading performance. This study investigates SH effects in the large FinFET arrays in PAs and quantifies their performance impact. A machine learning model for rapid and accurate thermal analysis is used to guide the layout of these transistor arrays.

#### **TECHNICAL APPROACH**

We create array layouts of active transistors in the PA, interspersing dummy transistors to limit thermal issues. We find PA performance metrics (efficiency, PAE, gain) under a thermal map, and insert dummies (Fig. 1) to generate Pareto-optimal tradeoffs between performance and area. To overcome the critical bottleneck of thermal simulation, we create an ML framework for fine-grained transistor-level thermal analysis. We use an encoder-decoder generative (EDGe) network, which has been successful with image-related problems with 2-D spatially distributed data (the U-Net framework for static analysis, and a U-Net+ConvLSTM for transient analysis), and tailor it to work at fine granularity.

#### SUMMARY OF RESULTS

We apply our approach to designs using four different PA classes: Class A, Class AB, Class C, and Class E. For each, we generate eight different layouts of their corresponding transistor arrays, where layout $\mathcal{A}_{ij}$ places j dummy transistors for every i active transistors. Using these eight combinations and the per-fin current for each PA from the schematic simulation, we determine the spatial temperature maps using our U-Net-based static thermal analysis method. Fig. 2 plots the maximum temperature vs. area for the eight different arrangements for the four PAs. As the number of dummy transistors is increased, the temperature reduces, with an area cost. Clumping together more transistors, even with dummies in between (e.g., $\mathcal{A}_{32}$), may not provide large thermal savings over $\mathcal{A}_{10}$, which uses no dummies. The temperature effect is more dominant for the Class AB and Class E PAs, which carry higher per-fin currents.
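The $\mathcal{A}_{ij}$ naming convention can be made concrete with a short sketch that builds a one-dimensional row of active (`A`) and dummy (`D`) devices. This is only an illustration: the function name is hypothetical, and it assumes dummies are placed between groups of active devices and not at the row ends, which the report does not specify.

```python
def array_pattern(n_active, i, j):
    """Build a row with j dummy devices ('D') after every i active devices ('A').

    Assumption for illustration: dummies appear only between active groups.
    """
    if i < 1 or j < 0:
        raise ValueError("need i >= 1 and j >= 0")
    row = []
    placed = 0
    while placed < n_active:
        group = min(i, n_active - placed)
        row.extend("A" * group)
        placed += group
        if placed < n_active:
            row.extend("D" * j)
    return "".join(row)


def area_overhead(pattern):
    """Total devices divided by active devices, a crude proxy for area cost."""
    return len(pattern) / pattern.count("A")


for i, j in [(1, 0), (1, 1), (3, 2)]:
    pattern = array_pattern(6, i, j)
    print(f"A_{i}{j}: {pattern}  overhead = {area_overhead(pattern):.2f}")
```

The overhead ratio makes the area side of the temperature-versus-area tradeoff explicit for each arrangement.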

In transient analysis, counter-intuitively, PAs show lower performance variations at higher frequencies, as seen in measured data from NXP mentors. We show that the time constant does not allow the temperature to reach steady state at 80MHz, as it does at 20MHz. Thus, at 80MHz, we see lower temperature-induced variability, translating to better error vector magnitude (EVM) at the system level.
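One way to see why a thermal time constant suppresses temperature variation at higher frequencies is a first-order low-pass model, in which the temperature ripple caused by a sinusoidal power variation at frequency $f$ is scaled by $1/\sqrt{1 + (2\pi f \tau)^2}$. The sketch below evaluates this factor at 20 MHz and 80 MHz. The single-pole model and the value of the time constant `TAU_S` are assumptions for illustration only and are not taken from the project's measurements or simulations.

```python
import math


def ripple_factor(freq_hz, tau_s):
    """Attenuation of temperature ripple for a first-order thermal low-pass."""
    return 1.0 / math.sqrt(1.0 + (2.0 * math.pi * freq_hz * tau_s) ** 2)


TAU_S = 10e-9  # hypothetical thermal time constant, not a measured value

for freq_hz in (20e6, 80e6):
    factor = ripple_factor(freq_hz, TAU_S)
    print(f"{freq_hz / 1e6:.0f} MHz: ripple factor = {factor:.2f}")
```

Under this assumed model the ripple factor is smaller at 80 MHz than at 20 MHz, which is consistent with the lower temperature-induced variability reported at the higher frequency.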



Figure 1. Layout and corresponding thermal profile with four fin and four finger counts active devices, (a) without any dummy, (b) with dummy devices k among active devices A.



Figure 2. Temperature rise vs. area for different classes of amplifiers at 66 MHz for different active and dummy arrangements.

**Keywords:** Power amplifiers, thermal analysis, performance variability, machine learning, FinFET devices

#### INDUSTRY INTERACTIONS

Intel, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] N. Karmokar, et al., submitted for publication.

[2] M. Madhusudan, et al., IEEE ESSDRC, 2023.

[3] N. Karmokar, et al., TECHCON, 2023.

## TASK 2810.085 / 2810.093, APPLICATIONS OF CIRCUIT TRANSIENT SENSITIVITY SIMULATION TO SEMICONDUCTOR CIRCUIT ANALYSIS AND DESIGN

#### RONALD ROHRER, CARNEGIE MELLON UNIVERSITY, RONROHRER@CMU.EDU

#### SIGNIFICANCE AND OBJECTIVES

This project seeks to demonstrate techniques that reduce the number of circuit simulations and can replace the computation-intensive Monte Carlo simulations for yield estimation and enhancement, fault analysis, and design centering.

#### **TECHNICAL APPROACH**

Transient adjoint sensitivity computes response gradients to design or process parameters with one simulation and a modest amount of overhead. Artificial Neural Network (ANN) models are used to extract Jacobians of proprietary device models from commercial simulators to obtain those gradients.

#### SUMMARY OF RESULTS

The architecture of the extracted ANN device model is shown in Fig. 1. As shown in Fig. 2, it accurately reproduces the device characteristics: each dot represents a simulated data point, and the solid line represents the current predicted by the ANN model given the corresponding parameter inputs. Via backpropagation, it also provides the requisite gradients with respect to design and process parameters.



Figure 1. Architecture of ANN model for an MOS transistor.



Figure 2. Comparison of ANN model characteristics with BSIM device model characteristics.

**Keywords:** Adjoint Sensitivity, Transient Analysis, Artificial Neural Network Model

#### INDUSTRY INTERACTIONS

NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] Jiahua Li, Danyal Ahsanullah, Zhengqi Gao & Ron Rohrer, "Circuit Theory of Transient Adjoint Sensitivity," IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems, Volume 42, Number 7, Pages 2303-2316, May 2023.

## TASK 3160.006, MACHINE-LEARNING BASED ANALOG MIXED-SIGNAL DESIGN TOOL

MIKE SHUO-WEI CHEN, UNIVERSITY OF SOUTHERN CALIFORNIA, SWCHEN@USC.EDU SANDEEP GUPTA AND TONY LEVI, UNIVERSITY OF SOUTHERN CALIFORNIA

#### SIGNIFICANCE AND OBJECTIVES

Analog mixed-signal (AMS) modules are typically human-designed. This custom-design process is expensive, leading to long time-to-market and suboptimal designs. This task aims to develop a low-cost AMS circuit modeling, sizing, and layout flow to achieve complete design automation from specification to GDS while achieving state-of-the-art performance.

#### **TECHNICAL APPROACH**

The whole AMS design automation flow can be divided into four main steps including dataset generation, circuit modeling, sizing, and layout. In the past year, we mainly focused on developing algorithms for circuit modeling and layout automation. For circuit modeling, we developed a GNN-based modeling and transfer learning approach to utilize the circuit topology information and reduce the required training samples. For the layout automation, we built a template-based layout generator for a quick and reliable layout of both active and passive components.





Figure 1. VTC schematic and optimization loop.

For the circuit modeling, we use multiple circuits to demonstrate the effectiveness and efficiency of the GNN-based modeling approach. In the first example, we model a voltage-to-time converter (VTC) that is widely used in emerging time-domain ADCs and utilize the model to optimize the VTC performance.

For a given set of design parameters, the GNN provides the predicted circuit performance, and Stochastic Gradient Descent is utilized to update the design parameters to achieve better performance. While the GNN model is trained with a limited dataset (SFDR<57dB, and SNDR<46dB), the final optimization results (verified with SPICE simulation) can achieve a 62-dB SFDR and 48.5dB SNDR with similar power consumption.
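The optimization loop described above, in which a model predicts performance and a gradient step updates the design parameters, can be sketched in a few lines. This is a simplified stand-in and not the project's flow: a toy analytic function `surrogate_metric` replaces the trained GNN, plain gradient ascent with finite-difference gradients replaces stochastic gradient descent with backpropagated gradients, and all names and numbers are hypothetical.

```python
def surrogate_metric(params):
    """Hypothetical stand-in for the trained model: predicted performance metric."""
    width, bias = params
    return 50.0 - 4.0 * (width - 1.5) ** 2 - 2.0 * (bias - 0.8) ** 2


def numerical_gradient(func, params, eps=1e-6):
    """Central finite-difference gradient (a real flow would use backpropagation)."""
    grad = []
    for k in range(len(params)):
        up = list(params)
        down = list(params)
        up[k] += eps
        down[k] -= eps
        grad.append((func(up) - func(down)) / (2.0 * eps))
    return grad


def optimize(func, start, learning_rate=0.05, steps=200):
    """Gradient ascent on the predicted metric with respect to the design parameters."""
    params = list(start)
    for _ in range(steps):
        grad = numerical_gradient(func, params)
        params = [p + learning_rate * g for p, g in zip(params, grad)]
    return params


start = [1.0, 0.5]
best = optimize(surrogate_metric, start)
print(f"start metric: {surrogate_metric(start):.2f}")
print(f"optimized metric: {surrogate_metric(best):.2f} at {best}")
```

As in the report, any parameters found this way would still need to be verified with SPICE simulation, since the surrogate is only an approximation of the circuit.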

With the ability to extrapolate the performance of circuit modules, we then focused on layout automation to make the flow end-to-end. The target is to generate the layout for a given circuit topology with any given design parameters. To achieve that, we focused on the template-based approach. In this flow, a manual layout example needs to be provided as the template for relative device placement, and the tool can adjust the I/O pin location, device location, and routing based on the given design parameters. A tree structure layout approach is utilized to guarantee a uniform layout style across different layout samples. The detailed layout flow is shown in Fig. 2.





To close the loop and incorporate the layout parasitic extraction (LPE) information into the circuit module optimization, we need to include the LPE in the GNN model. This is achieved by GNN transfer learning, as shown in Fig. 3. The trained schematic-level GNN is fixed, and two extra trainable linear layers are added at the input and the output of the model. Training the extra layers requires only a few samples to achieve high accuracy, as verified by the delta-sigma DAC example.



Figure 3. GNN transfer learning approach, circuit example (Delta-Sigma DAC), and result.

**Keywords:** Circuit modeling, GNN, transfer learning, layout automation, template-based layout generation

#### INDUSTRY INTERACTIONS

IBM, NXP

## TASK 3160.007, AI-ASSISTED AND LAYOUT-AWARE ANALOG SYNTHESIS AND OPTIMIZATION WITH DESIGN INTENT

DAVID PAN, THE UNIVERSITY OF TEXAS AT AUSTIN, DPAN@ECE.UTEXAS.EDU YAOYAO JIA, THE UNIVERSITY OF TEXAS AT AUSTIN

#### SIGNIFICANCE AND OBJECTIVES

In this report, we present two of our recent works: topology selection and transistor placement optimization. Topology selection is the first and most important step in analog design automation flow, as final performance can be limited by how good a topology is. On the layout side, we will discuss a framework for transistor placement optimization, considering important layout constraints.

#### **TECHNICAL APPROACH**

Our work [1] leverages variational autoencoders, a generative ML technique, to develop a data-driven framework for front-end analog design. On the backend side, we show that when higher-order effects in spatial variation are dominant, a non-common centroid (non-CC) placement can outperform a CC placement [2]. Our placement optimization framework handles important layout constraints in a unified manner, resulting in better-quality results compared to the state-of-the-art [2].

#### SUMMARY OF RESULTS

Our data-driven strategy [1] quickly identifies appropriately sized SPICE schematic netlists for OTAs, covering both topology selection and sizing, for unknown user specifications in minutes. The most challenging specifications extended the processing time to about 30 minutes at most. In nearly all feasible scenarios, we achieved a figure of merit (FoM) [1] of zero, corresponding to satisfying all design specifications.

**Table 1. Experimental Results** [Poddar+, DATE'24] [1]

| Test<br>case | Designer<br>Expectation                                                                 | Binary Search-<br>Based<br>Prediction        | VAE-Based<br>Prediction                                                            | Final Predicted<br>Sized Topology                  | Our<br>Method:<br>FOM<br>@ 500<br>samples | Our<br>Method:<br>Run-<br>time<br>(mins) | DNN-<br>Opt:<br>FOM<br>@ 500<br>samples | DNN-<br>Opt:<br>Run-<br>time<br>(mins) |
|--------------|-----------------------------------------------------------------------------------------|----------------------------------------------|------------------------------------------------------------------------------------|----------------------------------------------------|-------------------------------------------|------------------------------------------|----------------------------------------|----------------------------------------|
| 1            | Two-stage OTA<br>with cascode<br>load                                                   | Two-stage OTA<br>with cascode<br>load (NMOS) | Two-stage OTA<br>with cascode<br>load (NMOS)                                       | Two-stage OTA<br>with cascode<br>load (NMOS)       | 0                                         | 10.95                                    | 7.03                                   | 73                                     |
| 2            | Standard<br>Two-stage OTA                                                               | Failed                                       | Standard<br>Two-stage OTA<br>(PMOS)                                                | Standard<br>Two-stage OTA<br>(PMOS)                | 0                                         | 6.875                                    | 6.05                                   | 63                                     |
| 3            | Single-stage<br>inverter-based<br>OTA with cascode                                      | Failed                                       | Single-stage<br>inverter-based<br>OTA with cascode                                 | Single-stage<br>inverter-based<br>OTA with cascode | 0                                         | 6.593                                    | 1.66                                   | 48.7                                   |
| 4            | Two-stage OTA<br>with cascode<br>load, Single-<br>stage OTA with<br>folded-cascode load | Two-stage OTA<br>with cascode<br>load (NMOS) | Single-stage OTA<br>with folded cascode<br>load and cascode tail<br>current (NMOS) | Two-stage OTA<br>with cascode<br>load (NMOS)       | 0                                         | 10.7                                     | 7.03                                   | 73                                     |
| 5            | Standard<br>Two-stage OTA                                                               | Failed                                       | Standard<br>Two-stage OTA<br>(NMOS)                                                | Standard<br>Two-stage OTA<br>(NMOS)                | 0                                         | 10.959                                   | 6.5                                    | 67                                     |
| 6            | Single-stage<br>inverter-based<br>OTA with cascode                                      | Failed                                       | Single-stage<br>inverter-based<br>OTA with cascode                                 | Single-stage<br>inverter-based<br>OTA with cascode | 0.15                                      | 10.9                                     | 3.04                                   | 48.5                                   |

TABLE III: Comparative Analysis of the test results: Designer Expectations v/s Our Method v/s DNN-Opt

Being data-centric, our approach is intrinsically faster than traditional optimization techniques and established topology selection methods such as MOJITO, which require days of computational effort. Additionally, our established simulation repository enables intelligent initialization of sizing tools, facilitating quicker convergence. This capability allows our approach to surpass traditional sizing tools that start from scratch, achieving a FoM of zero much more rapidly. Tested against a state-of-the-art sizing tool, DNN-Opt, our method reaches the optimal figure of merit for a given topology faster for unknown user specifications. Topology selection results are presented in Table 1.

For the transistor placement optimization, we start with an initial placement of unit cells. Subsequently, it is fine-tuned using a simulated annealing-based method that involves random selection and swapping of unit cells. At each iteration, the placement is evaluated with models for routing cost (RC), mismatch in the length of diffusion (MILD), and mismatch in spatial variation (MV) to guide the process. We can handle transistor placement of varied configurations, such as a current mirror (CM), differential input pair cascode (DIPC), and differential load pair cascode (DLPC). Since the centroid matching requirement is relaxed, non-CC placement can ensure a diffusion-break-free layout. Also, since layout constraints are handled in a unified manner, we get better results than the state-of-the-art (SOTA). Our formulation allows users better control over the objectives. Table 2 compares SOTA and our simulated annealing (SA) approach [2].

Table 2. Experimental Results [MAJI+, DATE'24] [2]

| Test                     | Algo. | MV  | MILD | RC (Model) | RC (Router) | DB |
|--------------------------|-------|-----|------|------------|-------------|----|
| CM:1 [2,2,4,8,8], K=1.3  | SOTA  | 208 | 0.58 | 75         | 88          | 0  |
| CM:1 [2,2,4,8,8], K=1.3  | SA    | 174 | 0.44 | 65         | 67          | 0  |
| CM:2 [2,2,4,10], K=2     | SOTA  | 381 | 0.46 | 55         | 61          | 0  |
| CM:2 [2,2,4,10], K=2     | SA    | 280 | 0.40 | 47         | 54          | 0  |
| CM:3 [2,2,4,8], K=1.3    | SOTA  | 77  | 0.36 | 46         | 53          | 0  |
| CM:3 [2,2,4,8], K=1.3    | SA    | 18  | 0.31 | 44         | 50          | 0  |
| CM:4 [4,4,8,8], K=1.3    | SOTA  | 425 | 0.26 | 76         | 82          | 0  |
| CM:4 [4,4,8,8], K=1.3    | SA    | 31  | 0.13 | 64         | 69          | 0  |
| CM:5 [4,4,4,10,10], K=2  | SOTA  | 738 | 0.33 | 108        | 113         | 0  |
| CM:5 [4,4,4,10,10], K=2  | SA    | 251 | 0.03 | 86         | 91          | 0  |
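The swap-based simulated annealing refinement described in this task can be sketched as follows. The sketch is a generic illustration, not the published framework: the one-dimensional placement, the toy `cost` function (a weighted sum of a mismatch proxy and a routing proxy standing in for the MV, MILD, and RC models), the weights, and the cooling schedule are all hypothetical.

```python
import math
import random


def cost(placement, w_mismatch=1.0, w_routing=0.1):
    """Toy weighted cost standing in for the MV/MILD/RC models."""
    positions = {}
    for index, device in enumerate(placement):
        positions.setdefault(device, []).append(index)
    means = [sum(p) / len(p) for p in positions.values()]
    mismatch = max(means) - min(means)                      # spatial mismatch proxy
    routing = sum(max(p) - min(p) for p in positions.values())  # wire span proxy
    return w_mismatch * mismatch + w_routing * routing


def anneal(placement, steps=2000, t_start=1.0, t_end=1e-3, seed=0):
    """Refine a placement by randomly swapping unit cells under a cooling schedule."""
    rng = random.Random(seed)
    current = list(placement)
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    for step in range(steps):
        temperature = t_start * (t_end / t_start) ** (step / (steps - 1))
        a, b = rng.sample(range(len(current)), 2)
        current[a], current[b] = current[b], current[a]
        new_cost = cost(current)
        accept = new_cost <= current_cost or rng.random() < math.exp(
            (current_cost - new_cost) / temperature
        )
        if accept:
            current_cost = new_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        else:
            current[a], current[b] = current[b], current[a]  # undo the swap
    return best, best_cost


initial = ["M1"] * 4 + ["M2"] * 4  # hypothetical unit cells of two matched devices
best, best_cost = anneal(initial)
print("initial cost:", cost(initial))
print("best placement:", best, "cost:", best_cost)
```

Exposing the weights of the cost terms is one simple way to give the user control over the objectives, in the spirit of the unified formulation described above.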

**Keywords:** Topology selection, transistor sizing, common centroid, spatial variation, placement and routing

#### INDUSTRY INTERACTIONS

Analog Devices, IBM, Intel, Samsung, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] S. Poddar, et al., "A Data-Driven Analog Circuit Synthesizer with...," June 10, 2024. (SRC CADT Annual Review Student Poster Competition 1st Place)

[2] S. Maji, et al., "Analog Transistor Placement Optimization...," June 10, 2024.

## TASK 3160.008, HIGH-SPEED DAC WITH HIGH OUTPUT POWER AND LINEARITY

MIKE SHUO-WEI CHEN, UNIVERSITY OF SOUTHERN CALIFORNIA, SWCHEN@USC.EDU

#### SIGNIFICANCE AND OBJECTIVES

Designing a high-speed power DAC for mm-Wave applications is crucial for next-generation wireless communication. This project aims to develop a DAC operating at mm-Wave frequencies with high resolution and linearity, enabling high data rates and low latency. The power DAC will ensure efficient long-distance signal transmission for widespread 5G deployment.

#### **TECHNICAL APPROACH**

This work proposes a multiphase subharmonic switching (SHS) direct digital to mm-Wave power DAC for wideband signal transmission with improved EVM and PAE. The architecture uses SHS power back-off efficiency enhancement and a multi-phase modulation scheme with 8 LO phases at the carrier frequency and 16 at 1/3<sup>rd</sup>-sub-harmonic (SH). A hybrid DAC with delta-sigma modulation compresses input data, reducing DAC non-linearity and improving in-band EVM. A vector cancellation-aided notch filtering technique suppresses SHS-induced spurs by controlling sub-harmonic LO phase differences between DAC channels in power back-off mode, while maintaining signal information at carrier frequency, relaxing output matching network requirements.

#### SUMMARY OF RESULTS



Figure 1. Proposed Multi-phase IQ SHS Transmitter.

The proposed wideband mm-Wave multiphase IQ SHS transmitter prototype (Fig. 1) is being fabricated in 65-nm CMOS. Fig. 3 shows the post-layout simulations of the small-signal S-parameters of the output matching network and power combiner from DC to 50 GHz. The insertion loss is around 3 dB at 25 GHz ($f_c$), with a notch of -34 dBc at $f_c/3$. The power DAC achieves 34% PAE with 20.2-dBm P<sub>sat</sub> at 25 GHz. The efficiency at -9.5-dB power back-off (PBO) is 20%. Fig. 3 also shows the transient DAC output waveforms and their corresponding frequency-domain spectra for both peak and PBO modes, with 45 dBc of SH cancellation thanks to the notch filter and the IQ cancellation scheme (Fig. 2).



Figure 2. Proposed IQ imbalanced SH LO divider scheme.



Figure 3. (a) EM simulations of power combiner and notch filter at f<sub>c</sub>/3, (b) post-layout power-DAC output waveform, (c) DAC efficiency versus output power, (d) SH cancellation spectra.

**Keywords:** Sub-harmonic switching, matching network, power combiner, hybrid DAC, noise shaping

#### INDUSTRY INTERACTIONS

AMD, NXP, MediaTek

## TASK 3160.009, 100+GS/S TIME-DOMAIN ANALOG-TO-DIGITAL CONVERTERS

SAMUEL PALERMO, TEXAS A&M UNIVERSITY, SPALERMO@TAMU.EDU SEBASTIAN HOYOS, TEXAS A&M UNIVERSITY

#### SIGNIFICANCE AND OBJECTIVES

Conventional successive approximation register (SAR) ADCs require a high interleaving factor due to their limited conversion speed, which inevitably increases implementation complexity and front-end loading. The time-domain (TD) ADC design techniques in this project aim to significantly improve efficiency at high sampling rates relative to their SAR counterparts.

#### **TECHNICAL APPROACH**

A new low-power time-interleaved TD-ADC with advances in the interleaver, unit ADC, and calibration techniques is in development. The ADC uses a TD interleaver based on a voltage-to-time converter (VTC) that can efficiently achieve high sample rates, together with novel techniques to improve unit ADC speed and efficiency, including a coarse time-to-digital converter (TDC), a time residue generation block, a time amplifier (TA), and a fine TDC. Efficient TD-ADC calibration techniques for VTC gain and TDC time resolution mismatch, TA gain error, and time-interleaving errors are also in development.

#### SUMMARY OF RESULTS

Figure 1 shows the proposed 112GS/s 7-bit time-interleaved ADC, which leverages a time-domain interleaver to improve efficiency at high sample rates. This TD-ADC architecture has 16-way interleaved Rank 1 T/H's that are partitioned into four groups driven by four parallel input buffers. These T/H's are clocked by $f_s/8$ $\Phi_{T/H}$ pulses with a 25% duty cycle to avoid sampling crosstalk between the T/H channels in a group. The T/H sampled input voltages are delivered to two parallel VTCs that operate at $f_s/32$ and generate two clock-like full-swing pulses with a voltage-dependent time difference. These pulses are then buffered by inverters to drive a unit TDC that performs 7-bit quantization at the $f_s/32$ rate, which is 3.5GS/s for the complete 32-way interleaved ADC sample rate of 112GS/s. The proposed time-domain interleaver provides significant improvements in energy efficiency and bandwidth mismatch robustness because it can leverage inverter-based buffering.

Figure 2(a) shows the proposed 3.5GS/s 7-bit unit TD-ADC, which consists of a voltage-to-time converter (VTC), a 7-bit two-stage time-to-digital converter (TDC), and an encoder. The 7-bit TDC has an input 1-bit time folder, a 3-bit coarse TDC, a time residue amplifier (TA), and a 3-bit fine TDC. A TA gain of 8 allows the 3-bit fine TDC to use the same architecture as the initial coarse TDC. This unit ADC is 16X time-interleaved to implement a 56GS/s ADC that has simulated performance of 39.2dB SNDR and 37.1fJ/step in a 16-nm FinFET process. This 56GS/s ADC is planned for tape-out in Fall 2024.
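The way the folder, coarse TDC, TA and fine TDC combine into a 7-bit code can be illustrated with an idealized behavioral model. This is a sketch under stated assumptions, not the circuit itself: the input is a normalized time difference in [-1, 1], quantization is ideal, and the sign-magnitude output format is an illustrative choice. It also checks the rate arithmetic quoted in the text.

```python
from fractions import Fraction

COARSE_BITS = 3
FINE_BITS = 3
TA_GAIN = 8  # equals 2**COARSE_BITS, so the amplified residue spans the full scale again

def quantize(t, bits):
    """Ideal TDC: map a normalized time t in [0, 1] to an integer code (clamped)."""
    levels = 2 ** bits
    return min(levels - 1, int(t * levels))

def unit_tdc(t):
    """Convert a normalized time difference t in [-1, 1] to a 7-bit code."""
    t = Fraction(t)
    sign = 1 if t >= 0 else 0                        # 1-bit time folder
    magnitude = abs(t)
    coarse = quantize(magnitude, COARSE_BITS)        # 3-bit coarse TDC
    residue = magnitude - Fraction(coarse, 2 ** COARSE_BITS)
    amplified = residue * TA_GAIN                    # time residue amplifier
    fine = quantize(amplified, FINE_BITS)            # 3-bit fine TDC, same structure
    return (sign << 6) | (coarse << FINE_BITS) | fine

# Rate arithmetic from the text: 112 GS/s over 32 ways, and 16 unit ADCs.
unit_rate_gsps = Fraction(112, 32)
prototype_rate_gsps = 16 * unit_rate_gsps

print(float(unit_rate_gsps), float(prototype_rate_gsps))
print(format(unit_tdc(Fraction(37, 128)), "07b"))
```

Because the TA gain equals the number of coarse levels, the fine stage sees the same input range as the coarse stage, which is why the same 3-bit architecture can be reused.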



Figure 1. TD-ADC with time-domain interleaver.



Figure 2. (a) 3.5GS/s unit time-domain ADC that is 16-way interleaved to realize a 56GS/s ADC. (b) Simulated 56GS/s ADC output spectrum.

**Keywords:** Analog-to-digital converter, serial link, time-domain circuits, time-to-digital converter, voltage-to-time converter

#### INDUSTRY INTERACTIONS

Intel, MediaTek, NXP, Texas Instruments

A synthesizable PLL is a crucial building block for any modern communication system and digital SoC. Low-phase-noise PLLs rely on inductors, which do not shrink with technology scaling and require significant design time. Using a ring-based oscillator mitigates these issues but increases phase noise.

#### **TECHNICAL APPROACH**

Multiplying delay-locked loops (MDLLs) are synthesizable and support low phase noise by leveraging precise control over phase and frequency. Their architecture ensures robust performance across PVT. To lower the phase noise, a "multiple injection path per one reference cycle" technique is proposed that reduces the phase noise by $20\log_{10}(N)$ dB and extends the PLL BW by a factor of N, where N is the number of injections per reference cycle. The multiple phases for the injection are generated by an oversampling DLL, which can produce multiple phases with low offset and more noise shaping than a regular DLL.

#### SUMMARY OF RESULTS

The proposed architecture for the MDLL is shown in Figure 1. The performance improvement from the multiple injection concept has been verified through the time-based behavioral model developed in the first phase of the project. In Figure 2, the impact of the multiple injections is plotted. Excluding DTC noise, the expected phase noise is reduced by 20dB and the BW is extended by 10X for ten injections per one reference cycle.
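The quoted figures follow directly from the scaling stated in the technical approach. A small check, where the function name is hypothetical and the model is the ideal scaling only (DTC noise excluded, as in the text):

```python
import math

def multi_injection_benefit(n_injections):
    """Ideal phase-noise reduction (dB) and bandwidth extension for N injections."""
    if n_injections < 1:
        raise ValueError("need at least one injection per reference cycle")
    reduction_db = 20.0 * math.log10(n_injections)
    bandwidth_factor = n_injections
    return reduction_db, bandwidth_factor

reduction_db, bandwidth_factor = multi_injection_benefit(10)
print(f"{reduction_db:.1f} dB lower phase noise, {bandwidth_factor}x bandwidth")
```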

The oversampling DLL generates the multiple phases required for the multiple injections. The input sine wave has a low slope near its peaks, so phases taken near the peaks suffer more jitter. Non-uniform phases are therefore taken near the zero crossings for the injection process; these phases have low jitter because the slope is high. The input waveform is not limited to a sine wave; it can also be a sawtooth or square wave. If the input is a square wave, an integrator can be added in series to convert it into a sawtooth waveform. The LUT is adjusted accordingly to accommodate that change and trigger the DAC based on the expected input waveform.

The design automation is divided into two phases: ML/AI for the analog blocks and analog layout generator scripts. The MDLL design is divided into a synthesizable portion that uses regular PnR and a second portion that uses machine-learning-based design along with an analog layout generator. The analog layout scripts were developed and exercised on a couple of test cases that showed promising results. The scripts produced layouts that are LVS/DRC clean, with parasitic values comparable to those of layouts created by designers. This will allow efficient exploration of the design space with reduced iteration time.
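The slope argument for taking injection phases near the zero crossings can be made concrete with the usual first-order relation, jitter ≈ voltage noise / slope. The sketch below assumes a sine input A·sin(2πft) and hypothetical values for amplitude, frequency and noise; none of these numbers come from the report.

```python
import math

def edge_jitter(phase_deg, v_noise_rms, amplitude, freq_hz):
    """First-order rms jitter when slicing A*sin(2*pi*f*t) at the given phase."""
    slope = amplitude * 2.0 * math.pi * freq_hz * math.cos(math.radians(phase_deg))
    return v_noise_rms / abs(slope)

# Hypothetical operating point: 0.5 V amplitude, 100 MHz input, 1 mV rms noise.
params = dict(v_noise_rms=1e-3, amplitude=0.5, freq_hz=100e6)
jitter_at_0 = edge_jitter(0.0, **params)    # zero crossing, steepest slope
jitter_at_60 = edge_jitter(60.0, **params)  # slope halves, jitter doubles
jitter_at_80 = edge_jitter(80.0, **params)  # close to the peak

for phase, jitter in [(0, jitter_at_0), (60, jitter_at_60), (80, jitter_at_80)]:
    print(f"{phase:2d} deg: {jitter * 1e12:.2f} ps rms")
```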



Figure 1. Proposed MDLL for low phase noise.



Figure 2. Multi-injection MDLL impact on phase noise and BW.

**Keywords:** PLL, Clock, Layout Automation, Machine Learning, AI

#### INDUSTRY INTERACTIONS

IBM, Intel, MediaTek, NXP

# TASK 3160.017, MULTI-PHASE SUB-100FS JITTER RING-OSCILLATOR-BASED CLOCK MULTIPLIERS FOR BEYOND 100GB/S LINKS

PAVAN HANUMOLU, UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN, HANUMOLU@ILLINOIS.EDU

#### SIGNIFICANCE AND OBJECTIVES

Sub-rate serial link transceivers are increasingly favored for overcoming bandwidth constraints. Yet, this method requires routing a high-frequency clock signal surpassing 14 GHz. We aim to develop a frequency multiplier and multi-phase generator capable of minimizing jitter to less than 100 fs r.m.s.

#### **TECHNICAL APPROACH**

We introduce methodologies to improve the phase noise performance and enhance the supply noise immunity of ring-oscillator-based PLLs operating at frequencies above 10 GHz. Central to our approach are key design techniques such as employing a sampling phase detector to minimize in-band phase noise and adopting a low-noise multiphase ring oscillator (RO) design to mitigate out-of-band noise. Additionally, we implement a type-III supply-regulated architecture to broaden the frequency range and reduce sensitivity to process, supply-voltage, and temperature variations.

#### SUMMARY OF RESULTS

Figure 1 illustrates the proposed PLL architecture, comprising a sampling phase detector (SPD), two integrators, a voltage-controlled ring oscillator (VCRO) with frequency tuning via supply voltage and varactor adjustments, and an integer-N divider. The SPD is constructed using a slope generator and a track-and-hold circuit to mitigate reference spurs and minimize undesired effects such as clock feedthrough. This is achieved by strategically placing the PMOS tracking switch ($M_1$) within the slope generator. On the positive edge of the reference clock (REF), the sampling capacitor ($C_s$) is charged through resistor $R_s$. Subsequently, on the positive edge of the feedback clock (FB), the exponentially rising voltage is sampled onto $C_s$, capturing the phase difference between the REF and FB clocks. The $R_s$ (60$\Omega$) and $C_s$ (1.2pF) values were selected to ensure high SPD gain and reduced kT/C noise, limiting the SPD's noise contribution to -145dBc/Hz. The voltage across $C_H$ corresponds to the proportional control voltage ($V_P$), while the integral control voltage ($V_I$) is established by the $G_{M1}$-$C_1$ integrator. This integrator also ensures that $V_P$ equals $V_{REF1}$ when the PLL is locked. An additional integrator ($G_{M2}$-$C_2$) and an NMOS-based regulator have been integrated into the PLL to mitigate supply and temperature sensitivity.
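As a rough check on the component values quoted above, the slope generator's time constant and the sampled kT/C noise can be computed directly. This is a back-of-the-envelope sketch: the 300 K temperature and the 0.8 V charging supply are assumptions, and the first-order RC step response is an idealization of the actual slope generator.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K

def rc_time_constant(r_ohm, c_farad):
    """Time constant of the R_s-C_s charging path."""
    return r_ohm * c_farad

def ktc_noise_vrms(c_farad, temp_k=300.0):
    """rms voltage noise sampled onto a capacitor (temperature is an assumption)."""
    return math.sqrt(BOLTZMANN * temp_k / c_farad)

def initial_slope(v_supply, r_ohm, c_farad):
    """Initial slope of the exponential ramp, V/s (supply value is an assumption)."""
    return v_supply / rc_time_constant(r_ohm, c_farad)

R_S = 60.0      # ohms, from the text
C_S = 1.2e-12   # farads, from the text

tau = rc_time_constant(R_S, C_S)
noise = ktc_noise_vrms(C_S)
slope = initial_slope(0.8, R_S, C_S)

print(f"tau = {tau * 1e12:.0f} ps")
print(f"kT/C noise = {noise * 1e6:.1f} uV rms at 300 K")
print(f"initial ramp slope = {slope * 1e-9:.1f} mV/ps")
```

A small time constant gives a steep ramp (high SPD gain), while a larger capacitor lowers the kT/C noise, which is the trade-off the chosen values balance.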



Figure 1. Proposed type-III PLL architecture.

A prototype PLL was fabricated using a 22-nm FinFET technology and housed in a plastic QFN package. The PLL was locked to a REF clock (812.5MHz) produced by an Analog Devices evaluation board (ADF4377). Figure 2 showcases the PLL phase noise at 13GHz output frequency, depicted in two operational modes: type-II and type-III. Only a marginal increase of jitter was observed (67.5fs in the type-II mode vs. 69.3fs in the type-III mode) across an integration range from 10kHz to 100MHz. Of the total 69.3fs of jitter, 52fs can be attributed to the PLL, while the remaining portion arises from the reference clock path.
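The jitter split quoted above is consistent with uncorrelated contributions adding in a root-sum-square sense. The sketch below assumes that independence (an assumption, not a statement from the report) and also derives the divide ratio implied by the 13GHz output and 812.5MHz reference.

```python
import math

def residual_rms(total_rms, known_part_rms):
    """Remaining rms contribution, assuming uncorrelated sources add in power."""
    return math.sqrt(total_rms ** 2 - known_part_rms ** 2)

TOTAL_TYPE3_FS = 69.3   # measured, type-III mode
TOTAL_TYPE2_FS = 67.5   # measured, type-II mode
PLL_ONLY_FS = 52.0      # portion attributed to the PLL

reference_path_fs = residual_rms(TOTAL_TYPE3_FS, PLL_ONLY_FS)
type3_penalty_fs = residual_rms(TOTAL_TYPE3_FS, TOTAL_TYPE2_FS)
divide_ratio = 13e9 / 812.5e6

print(f"reference path: {reference_path_fs:.1f} fs rms")
print(f"type-III penalty (rss): {type3_penalty_fs:.1f} fs rms")
print(f"integer-N divide ratio: {divide_ratio:.0f}")
```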



Figure 2. Measured phase noise plots.

**Keywords:** ring PLL, low jitter, type-III response

#### INDUSTRY INTERACTIONS

Intel, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] M. Khalil et al., "A 69.3fs ring-based sampling-PLL achieving 6.8GHz – 14GHz and -54.4dBc spurs under 50mV...," 2023 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2023.

We aim to implement a silicon prototype to fully characterize device behavior in a cryogenic environment (77 K) in a FinFET process. The characterization results will feed cryogenic model development and a second silicon prototype, which targets fully functional memory macros and an SoC.

#### **TECHNICAL APPROACH**

As a step toward enabling the design of larger systems-on-chip (SoCs) for cryogenic applications, we will first calibrate the current device model at cryogenic temperatures. Our objective is to gain a comprehensive understanding of the behavior of various components under these conditions. Specifically, we aim to study the cryogenic performance of 3T-eDRAM, pseudo flip-flops, dynamic logic components, and single FinFETs of various dimensions within this silicon prototype. Based on the measurement results, we will develop a cryogenic-specific model. This will enable more accurate predictions and optimizations for future cryogenic SoC designs, ensuring their reliability and efficiency in extremely low-temperature environments.

#### SUMMARY OF RESULTS

Cryogenic logic technology has emerged as a promising solution for power-efficient high-performance computing (HPC) thanks to steeper subthreshold swing, extremely low leakage, enhanced mobility, low interconnect resistance, and improved reliability. In prior reports, we demonstrated extremely low-voltage (sub-200mV) DRAM with UTBB-SOI FETs and multilevel-cell functionality, and discussed the potential of thyristor-based capacitor-less DRAM as a low-voltage, dense, and fast alternative for cryogenic embedded DRAM applications. While we have used existing foundry models to develop the silicon and circuits, these models are only valid for temperatures above -40°C, whereas this silicon prototype is intended to be tested at 77K. The first silicon prototype was successfully taped out in February 2024 and is awaiting completion of fabrication. It includes various circuits and test arrays in four categories: (a) 3T-eDRAM memory array, (b) pseudo flip-flop and dynamic logic components for memory macros, (c) ring oscillator, and (d) 1T device array.

(a) 3T-eDRAM memory array: We implemented a 16x16 memory array with the baseline memory components for testing purposes. The necessary periphery components include wordline drivers with level-shifting functionality, Schmitt-trigger-based multilevel readout circuits, and a direct access mode with analog ports to measure 3T-eDRAM memory characteristics along the bit-lines.

(b) Pseudo flip-flop and dynamic logic components: We designed chains of 4 to 8 serially connected pseudo flip-flops with scan functionality, together with dynamic logic gates intended for the memory macro. The pseudo flip-flop consists of non-closed-loop master-slave charge-based storage components and multiplexers for scan-chain mode and refresh operation. The dynamic logic circuits developed for memory macros include a bit-line keeper, bit-line precharge logic gates, and bit-line discharge logic gates.

(c) Ring oscillator: A ring oscillator is included to measure the oscillation behavior in different voltage domains at cryogenic temperature.

(d) 1T device array: We implemented multiple arrays of low-threshold-voltage FinFETs with different device dimensions. The measurement paths connect to each device's gate, drain, and source terminals so that the drain, gate, and source currents can be characterized under different bias voltages.



Figure 1. Overall silicon prototype layout includes (a) 3T-Memory Array, (b) Pseudo Flip-Flop and Dynamic Logic components, (c) Ring Oscillator and (d) 1T device array.

**Keywords:** Cryogenic Memory, 3T-eDRAM, Pseudo CMOS logic, data retention

#### INDUSTRY INTERACTIONS

IBM, Intel

#### MAJOR PAPERS/PATENTS

[1] Saikat Chakraborty and Jaydeep P. Kulkarni, "Analyzing the Dynamics of Store Mechanism and Data Retention through Transient Simulations...," 2024 Device Research Conference (DRC), June 2024.

The objective is to develop circuit- and system-level solutions to improve the overall performance of ADCs in scaled CMOS nodes. We focus our efforts on compact, nanowatt-power CT-$\Delta\Sigma$ analog-to-digital converters (ADCs) using direct VCO chopping for sensor interfaces and on CT-$\Delta\Sigma$ ADCs using time/frequency/phase.

#### **TECHNICAL APPROACH**

The use of time/frequency has shown many advantages over voltage-based circuits, especially in nanometer CMOS nodes. In most recent works, time/phase or frequency has been used as an integral part of ADCs, often in the backend quantizer of $\Delta\Sigma$ ADCs. However, these techniques can be extended to provide various other benefits. Our approach is to (1) explore the direct benefits of time/frequency quantization in $\Delta\Sigma$ ADCs and (2) leverage architectural innovation to take advantage of the time/frequency domain.

#### SUMMARY OF RESULTS

This work presents a new multi-loop CT-$\Delta\Sigma$ modulator structure with a VCO-based quantizer (VCOQ) that is robust to variations in its coefficients and in opamp gain. A recent work called correlated dual loop (CDL) sturdy multi-stage noise shaping (SMASH) CT-$\Delta\Sigma$ removes the explicit quantization error extraction and the related nonlinearity issue of conventional SMASH structures. However, the limited design flexibility and robustness of the system restrict the application of the CDL. The proposed structure is built on the core of the CDL SMASH and addresses its limitations by optimizing the noise transfer function (NTF) and introducing a distributed signal feed-in approach. Furthermore, the signal distribution also constrains the input swing of the quantizers in the proposed structure. As a result, the VCOQ, which is often limited by nonlinearity, can be incorporated into this design without any extra calibration.

Fig. 1(a) illustrates the basic topology and NTF of the proposed Mixed-Order Correlated Dual-loop Sturdy MASH (MDL-SMASH) CT-$\Delta\Sigma$ modulator structure. A signal distribution approach and VCOQ are introduced in the MDL as shown in Fig. 1(b). Thanks to the virtual ground effect at the input of H<sub>2</sub>, the output of H<sub>1</sub> tracks the input signal fed to the second stage by an amplifier (k<sub>sf</sub>). As a result, the signal amplitude can be distributed between multiple stages, which boosts the dynamic range and linearity by allowing low-swing operation of the loop filters and constraining the input range of the VCOQ. A prototype CT-$\Delta\Sigma$ modulator has been fabricated in a 65-nm CMOS process. The measured output spectrum and specifications are shown in Fig. 2.



Figure 1. (a) MDL basic topology with $K_{sf}$=1, (b) Signal distribution and VCOQ in proposed MDL SMASH with $K_{sf}$=0.2.



Figure 2. Measured output spectrum.

**Keywords:** Noise-Shaping, VCO-Based, Multi-loop, Delta-Sigma, Analog-to-Digital Converter

#### INDUSTRY INTERACTIONS

MediaTek, NXP, Texas Instruments

#### MAJOR PAPERS/PATENTS

[1] X. Xu et al., "Mixed-Order Correlated Dual-loop Sturdy MASH CT $\Delta\Sigma$ Modulator with Distributed Signal Feed-in and VCO based Quantizer," (CICC) 2023.

[2] X. Xu et al., JSSC 2024.

Synthesis, auto-place, and route (SAPR) dominates modern SoC construction methodology. However, energy-efficient SoCs require automated integration of voltage regulation (VR) and FLL/PLL-based clocking to produce fine-grained voltage and clock domains. This effort explores and devises an all-digital domain compiler to generate clock- and voltage-regulated domains.

#### **TECHNICAL APPROACH**

The effort is organized into two thrusts. **Thrust 1:** We will build upon our $V_{dd}$-droop-tolerant and fast-response UniCaP-2 construction (Fig. 1(b)) to explore and develop a framework that automates the construction of robust, larger, all-digital domains. User-provided constraints (Fig. 1(a)) are used to develop a unified system. **Thrust 2:** Autonomous, all-digital run-time VR loop-gain tuning will be used to ensure optimal transient response across PVT conditions, thereby overcoming the poor performance that results from margining for worst-case PVT conditions. In the context of UniCaP, improved VR response minimizes the performance loss from FIFO saturation and the margins due to memory V<sub>min</sub> constraints.

#### SUMMARY OF RESULTS



Figure 1. (a) Overview of proposed Domain Compiler, (b) simplified schematic of the proposed architecture consisting of integrated LDO/PLL modules in addition to the load domain.

The focus of our effort in Year 2 builds upon our Year 1 effort to develop an autonomous gain tracker. We further developed the current sensor used to track $I_{LSB}$, the current of one unit header under the PVT conditions to which it is subjected. The design has been taped out and is awaiting silicon test. We also developed two necessary design-time optimization flows: (1) extracting critical paths from a digital design, producing their equivalent SPICE netlists, evaluating them under all anticipated PVT corners, and using the results to provision the TRO module (Fig. 1); and (2) evaluating the stable values of $K_i$ and $K_p$ needed for a PI-controlled LDO across the PVT conditions anticipated by the user, and arriving at a convex optimization formulation that solves a regression problem for a polynomial in $I_{LSB}$ and $V_{dd}$, which can be used to tune $K_i$ and $K_p$ at run time to maintain optimal compensation for the system (Fig. 2).
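The second flow, fitting a polynomial in $I_{LSB}$ and $V_{dd}$ that returns a controller gain at run time, can be sketched with ordinary least squares, which is one convex formulation of such a regression. Everything specific here is hypothetical: the bilinear feature set, the sample grid, and the synthetic "stable gain" values stand in for data that would come from design-time simulation.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def features(i_lsb, vdd):
    """Bilinear polynomial terms (the polynomial order is an assumption)."""
    return [1.0, i_lsb, vdd, i_lsb * vdd]

def fit_gain_polynomial(samples):
    """Least-squares fit; samples is a list of (i_lsb, vdd, gain) tuples."""
    rows = [features(i, v) for i, v, _ in samples]
    targets = [g for _, _, g in samples]
    k = len(rows[0])
    ata = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    atb = [sum(r[p] * t for r, t in zip(rows, targets)) for p in range(k)]
    return solve_linear(ata, atb)

def predict_gain(coeffs, i_lsb, vdd):
    """Run-time evaluation: a handful of multiply-accumulates."""
    return sum(c * f for c, f in zip(coeffs, features(i_lsb, vdd)))

# Synthetic design-time data: (I_LSB in mA, Vdd in V, stable K_p found offline).
samples = [(i, v, 0.5 + 0.2 * i - 0.3 * v + 0.1 * i * v)
           for i in (0.5, 1.0, 1.5, 2.0) for v in (0.7, 0.8, 0.9, 1.0)]
coeffs = fit_gain_polynomial(samples)
kp_runtime = predict_gain(coeffs, 1.2, 0.85)
print(f"K_p at I_LSB=1.2 mA, Vdd=0.85 V: {kp_runtime:.4f}")
```

The appeal of a low-order polynomial is that the run-time evaluation is cheap enough for an all-digital loop, while the fitting cost is paid once at design time.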



Figure 2. Simulated impact of  $K_i/K_p$  on LDO response time and phase margin. A data-driven approach enables determining regions of valid  $K_i$  and  $K_p$  settings and maximizes LDO performance under given stability constraints while guard banding for error in sense mechanisms for  $I_{LSB}$  or  $V_{dd}$ .

Fig. 3 shows the effectiveness of the proposed approach. We plan to tape out our first generation of domain-compiled designs in 65-nm CMOS, with a second design to follow in 16-nm CMOS early next year.



Figure 3. Simulation results of the deviation (in mV) of voltage droop achieved using polynomial runtime regression to adjust  $K_i$  and  $K_p$  vs optimal settings. Less than 2mV of droop degradation is observed over the ideal, per-PVT  $K_i/K_p$  setting.

**Keywords:** Model-predictive control, Voltage Regulation

#### INDUSTRY INTERACTIONS

AMD, IBM, Intel, NXP, Texas Instruments

Our goal was to characterize analog non-volatile devices in the SkyWater 130-nm process. An open-source PMOS floating-gate transistor is characterized, and its practical capabilities are evaluated in terms of programming techniques, charge retention, and potential integration in analog computing, especially in neural network and machine learning applications.

#### TECHNICAL APPROACH

We utilized hot-electron injection and Fowler-Nordheim tunneling for programming the floating-gate transistors, enhancing understanding of their resolution and charge retention capabilities. The methodologies include detailed measurements of threshold voltage changes and the effect of programming mechanisms on device performance, aimed at improving device reliability and efficacy in computational applications.

#### SUMMARY OF RESULTS



Figure 1. Floating-gate transistor programming. Fowler-Nordheim tunneling and hot-electron injection enable programming of the threshold voltage.

The PMOS-based floating-gate transistors demonstrated significant advancements in programming precision and reliability. Specifically, the study achieved a 9-bit resolution for programming through hot-electron injection, where an average threshold voltage increase of 7.46 mV was consistently observed across multiple devices, without exceeding a 9.77 mV threshold for 9-bit accuracy. This high degree of precision is essential for applications requiring fine control over transistor behavior, such as in analog computing and neuromorphic systems.

In contrast, Fowler-Nordheim tunneling provided a 7-bit resolution with an average threshold voltage decrease of 17.2 mV. While the resolution is lower than that achieved by hot-electron injection, it remains valuable for initial global programming of devices, where finer control is less critical. The probabilistic nature of this tunneling process also introduces greater variability, which was characterized in this study, providing essential data for systems where this programming method might be employed.

Moreover, the charge retention tests underscored the transistors' long-term stability. Over a seven-day test period, the daily change in threshold voltage was minimal, projecting a drift of only 4.42 mV over a 10-year period. This stability is promising for the deployment of these transistors in environments where long-term reliability is crucial, such as in embedded systems and permanent memory storage.
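The resolution and drift figures above can be related with simple arithmetic. Note the assumptions: the report does not state the programming range, and the 5 V full scale used below is inferred only because 5 V / 2^9 ≈ 9.77 mV matches the quoted 9-bit threshold; the per-day drift assumes the 10-year projection is a linear extrapolation.

```python
def lsb_mv(full_scale_v, bits):
    """Size of one programming step, in mV, for a given range and resolution."""
    return full_scale_v * 1000.0 / (2 ** bits)

FULL_SCALE_V = 5.0  # assumption, inferred from the 9.77 mV figure

hei_lsb_mv = lsb_mv(FULL_SCALE_V, 9)   # hot-electron injection, 9-bit
fn_lsb_mv = lsb_mv(FULL_SCALE_V, 7)    # Fowler-Nordheim tunneling, 7-bit

HEI_STEP_MV = 7.46   # measured average increase per injection step
FN_STEP_MV = 17.2    # measured average decrease per tunneling step

# Implied average drift per day if the 10-year projection is linear.
drift_uv_per_day = 4.42e3 / (10 * 365)

print(f"9-bit LSB: {hei_lsb_mv:.2f} mV (measured step {HEI_STEP_MV} mV)")
print(f"7-bit LSB: {fn_lsb_mv:.2f} mV (measured step {FN_STEP_MV} mV)")
print(f"implied drift: {drift_uv_per_day:.2f} uV/day")
```

Under these assumptions each measured step stays below one LSB of its stated resolution, and the projected 10-year drift is less than half of a 9-bit LSB.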

**Keywords:** Floating-Gate Transistors, Analog synapses

#### INDUSTRY INTERACTIONS

#### MAJOR PAPERS/PATENTS

[1] Matt Chen, Charana Sonnadara, and Sahil Shah, "Open-source Floating-Gate Cell for Analog Synapses," IET Electronics Letters (under review).

This project seeks to develop an on-chip hyperdimensional computer (HDC) by combining hardware-software co-design with mixed-signal circuit design techniques to address the memory and energy efficiency limitations of the current state of the art, and to improve performance (accuracy) by incorporating shallow neural networks as part of the HDC encoder.

#### **TECHNICAL APPROACH**

We propose to use the reservoir-computing (RC) paradigm for on-chip HDC. The RC input layer performs random projections of the input data, which is analogous to mapping the base symbols to random hyper-vectors (HVs). The reservoir layer performs permutations on the HVs through recurrent interconnections. Finally, the outputs of the reservoir neurons are passed through a shallow nonlinear neural network to increase the separation (distance) between the HVs corresponding to different symbols. The neural network outputs are subsequently digitized and stored as prototype HVs in an associative memory for comparison against query HVs during inference. All the HDC circuits will be designed using analog/mixed-signal techniques.
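The processing chain described above (random projection, recurrent permutation, nonlinearity, digitization, associative lookup) can be sketched in software as follows. This is a simplified illustration, not the project's model: a ring-shaped recurrence stands in for the reservoir interconnect, a `tanh` followed by a 1-bit threshold replaces the shallow neural network, and the weights, feedback factor and toy input sequences are all hypothetical.

```python
import math
import random

N_NEURONS = 88  # matches the neuron count of the reported software model

def make_input_weights(n_inputs, n_neurons=N_NEURONS, seed=0):
    """Random input projection: one weight row per reservoir neuron."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
            for _ in range(n_neurons)]

def encode(sequence, w_in, feedback=0.5):
    """Map a sequence of input vectors to a binary hypervector."""
    state = [0.0] * len(w_in)
    for x in sequence:
        rotated = state[-1:] + state[:-1]  # ring recurrence acts as a permutation
        state = [math.tanh(feedback * r + sum(w * xi for w, xi in zip(row, x)))
                 for r, row in zip(rotated, w_in)]
    return [1 if s >= 0 else 0 for s in state]  # 1-bit digitization

def hamming(a, b):
    """Number of positions in which two hypervectors differ."""
    return sum(x != y for x, y in zip(a, b))

def classify(query_hv, prototypes):
    """Associative-memory lookup: label of the nearest stored prototype."""
    return min(prototypes, key=lambda label: hamming(query_hv, prototypes[label]))

w_in = make_input_weights(n_inputs=3)
training = {
    "walking": [[1.0, 0.0, 0.2], [0.8, 0.1, 0.3]],
    "sitting": [[-1.0, 0.0, -0.2], [-0.8, -0.1, -0.3]],
}
prototypes = {label: encode(seq, w_in) for label, seq in training.items()}

query_hv = encode([[0.9, 0.05, 0.2], [0.8, 0.1, 0.25]], w_in)
print(classify(query_hv, prototypes))
```

Inference reduces to a Hamming-distance comparison against stored prototypes, which is the operation an associative memory implements efficiently in hardware.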

#### SUMMARY OF RESULTS

Fig. 1 shows the architecture of the proposed HDC employing a reservoir-computing paradigm. Within this reporting period, we have developed a software model for the proposed HDC. The HDC performance is evaluated on the UCI HAR human activity recognition dataset, which includes 6 classes: walking, walking upstairs, walking downstairs, sitting, standing, and laying. The dataset contains accelerometer and gyroscope data from the watches of 30 participants going through different daily activities.



Figure 1. Architecture of the proposed HDC employing a reservoir-computing paradigm.

The dataset has 7352 training and 2947 test samples. Table 1 reports the confusion matrix on the test samples.

The HDC model uses 88 reservoir neurons and achieves 91.7% classification accuracy. As shown in the confusion matrix, the diagonal elements indicate the correctly classified samples, while the off-diagonal elements indicate samples that were wrongly attributed to a different class.

Table 1. Confusion matrix on UCI HAR dataset



The tasks for the next reporting period are (1) fine-tuning the HDC model and testing it on other applications such as speech recognition and phonocardiogram analysis, and (2) designing the circuit implementation of the reservoir layer and taping out a test chip.

**Keywords:** hyper-dimensional computing, reservoir computing, mixed-signal classifier, in-memory computing

#### INDUSTRY INTERACTIONS

NXP

The key goal of this project is to demonstrate high-speed, high-resolution time-domain converter architectures in fine CMOS nodes with minimal reliance on voltage-domain assistance or on native, linear time-domain S/H and amplifiers (which do not exist). Silicon prototypes and experimental results will be demonstrated and reported.

#### **TECHNICAL APPROACH**

Technology scaling presents a unique opportunity for time-domain (TD) analog circuits over their conventional voltage-domain counterparts. While the latter struggles with a dwindling supply voltage, the former gains (TD) resolution and accuracy with each finer process node. An enabling technique of time-domain RNS (residue number system) encoding is proposed to achieve a 4-bit architectural complexity, thus ensuring high efficiency, for the 8-bit first stage in a 12-bit two-step pipelined TDC. The large leading stage resolution greatly relaxes the interstage residue accuracy requirements, making it possible to achieve a 12-bit resolution using exclusively native time-domain circuits that exhibit superior technology scalability.
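The idea behind RNS encoding, representing a wide value by its remainders with respect to small coprime moduli so that each channel only resolves a few levels, can be illustrated with the Chinese remainder theorem. The moduli below are hypothetical (the report does not state which are used); 16 and 17 are chosen only because they are coprime and their product, 272, covers an 8-bit range with roughly 4-bit channels.

```python
from math import gcd, prod

MODULI = (16, 17)  # hypothetical coprime moduli covering 0..255

def to_residues(value, moduli=MODULI):
    """Encode a value as its remainders with respect to each modulus."""
    return tuple(value % m for m in moduli)

def from_residues(residues, moduli=MODULI):
    """Reconstruct the value with the Chinese remainder theorem."""
    total = prod(moduli)
    result = 0
    for r, m in zip(residues, moduli):
        partial = total // m
        result += r * partial * pow(partial, -1, m)  # modular inverse (Python 3.8+)
    return result % total

assert gcd(*MODULI) == 1, "moduli must be coprime"

code = 200
residues = to_residues(code)
print(residues, from_residues(residues))
```

Each residue channel distinguishes at most 17 levels instead of 256, which is the sense in which an 8-bit stage can be built with about 4-bit architectural complexity.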

#### SUMMARY OF RESULTS

**Keywords:** time-to-digital converter (TDC), residue number system (RNS) encoding, time-domain circuits

#### INDUSTRY INTERACTIONS

NXP

## TASK 3160.037, CAUSAL AI FOR INTERPRETABLE AND ROBUST AMS TOPOLOGY SYNTHESIS AND OPTIMIZATION

DOMENIC FORTE, UNIVERSITY OF FLORIDA, DFORTE@ECE.UFL.EDU DAMON WOODARD, UNIVERSITY OF FLORIDA

#### SIGNIFICANCE AND OBJECTIVES

We seek to develop an artificial intelligence-based tool for designing AMS circuits. It will generate circuits that meet design requirements, provide human-interpretable explanations for topology selection and optimization, and highlight trade-offs and unmet specifications. This approach streamlines AMS circuit design, ensuring transparency and efficiency.

#### **TECHNICAL APPROACH**

Our approach captures key aspects of AMS design by building on existing work: grammar-based tree structures (GTS) represent circuit structure, and graph pair decision diagrams (GPDDs) represent behavior. We then employ Causal AI to incorporate domain knowledge, creating a comprehensive understanding of the design. Multi-agent optimization partitions complex AMS designs into sub-blocks, with each agent optimizing its sub-block reward while collaborating to meet the overall specifications. Based on the optimization outcomes, recommended revisions or parameters are generated to create candidate designs. State-of-the-art AI/ML tools are used for topology generation to keep the design process efficient and accurate.

#### SUMMARY OF RESULTS

The first task is to set up infrastructure for AMS circuit optimization and Causal AI. We began by selecting a viable representation of an AMS schematic, which must provide all essential information required by both the GTS and the GPDD. The key features of a suitable schematic/netlist include transistor parameters, pin/net connections, and AC behavior. We rely on the Spectre simulation tool included in Cadence Virtuoso to create the required netlist. The GTS and GPDD are two of the most vital design representations in our work. While the GTS creates a set of production rules that describe how to construct graph structures, including circuit topologies, the GPDD is a precise representation of a circuit's small-signal model, and of the resultant symbolic transfer function, for behavioral analysis.

We wrote a Python script to (1) parse the netlist into a graph and (2) create a small-signal version based on traditional rules embedded in the code. The netlist from Fig. 1 is now used to create the GPDD. This was initially facilitated by an existing script written in C++. However, recognizing Python as a more suitable framework for future AI tasks, we converted the existing code to Python.
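A netlist-to-graph step of the kind described above might look like the following for the sample RC circuit. This is an illustrative sketch only, not the project's script: the SPICE-like netlist syntax, the element names and values, and the helper functions are hypothetical, the graph is a plain dictionary rather than a GPDD, and the transfer function is the textbook first-order RC low-pass result rather than one derived symbolically.

```python
import math

SUFFIXES = {"k": 1e3, "m": 1e-3, "u": 1e-6, "n": 1e-9, "p": 1e-12}

def parse_value(token):
    """Convert strings such as '1k' or '1n' to floats."""
    token = token.lower()
    if token[-1] in SUFFIXES:
        return float(token[:-1]) * SUFFIXES[token[-1]]
    return float(token)

def parse_netlist(text):
    """Return {node: [(neighbor, element_name, value), ...]} for two-terminal elements."""
    graph = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("*"):
            continue  # skip blank lines and comments
        name, node_a, node_b = fields[0], fields[1], fields[2]
        value = parse_value(fields[3])
        graph.setdefault(node_a, []).append((node_b, name, value))
        graph.setdefault(node_b, []).append((node_a, name, value))
    return graph

def rc_lowpass_response(r, c, freq_hz):
    """Evaluate H(s) = 1 / (1 + s*R*C) at s = j*2*pi*f."""
    return 1.0 / (1.0 + 1j * 2.0 * math.pi * freq_hz * r * c)

NETLIST = """
* sample RC low-pass, hypothetical SPICE-like syntax
R1 in out 1k
C1 out 0 1n
"""

graph = parse_netlist(NETLIST)
r = next(v for _, name, v in graph["out"] if name == "R1")
c = next(v for _, name, v in graph["out"] if name == "C1")
corner_hz = 1.0 / (2.0 * math.pi * r * c)
gain_at_corner = abs(rc_lowpass_response(r, c, corner_hz))
print(f"corner = {corner_hz / 1e3:.1f} kHz, |H| at corner = {gain_at_corner:.3f}")
```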



Figure 1. Current Schematic-to-Transfer Function pipeline. The sample RC circuit is used to create a left and right graph. These are combined to form a single GPDD, from which a transfer function is created.

We incorporated the NetworkX graph package for visualizing the complete GPDD. The final GPDD Graph is pictured in Fig. 1. This structure is useful not only for the behavior but also as a potential input for our learning models. The output symbolic transfer function generated from the sample RC circuit is also shown in Fig. 1.

Our progress can be actively monitored using the link here: https://github.com/koblahdavid/Causal-AI-AMS.

**Keywords:** Human comprehensible analog automation, Causal AI, Multi-agent optimization, Graph pair decision diagram, Grammar-based tree structure

#### INDUSTRY INTERACTIONS

AMD, IBM, Intel, MediaTek, NXP, Siemens, Qualcomm, GlobalFoundries, Samsung, SK Hynix, Texas Instruments

# TASK 3160.040, LOW-AREA AND WIDEBAND FRACTIONAL-N PLL DESIGN UTILIZING ACOUSTIC RESONATORS AND FRACTIONAL NOISE CANCELLATION

#### PATRICK MERCIER, UNIVERSITY OF CALIFORNIA AT SAN DIEGO, PMERCIER@UCSD.EDU

#### SIGNIFICANCE AND OBJECTIVES

Low-jitter, area-efficient frequency synthesizers are crucial for modern communication protocols, including 5G/6G, Wi-Fi 7, and more. Ring-oscillator-based digital PLLs offer a small, technology-scalable area and inherent multi-phase generation, but at the cost of degraded phase noise. This work aims to suppress that noise by utilizing high-frequency FBAR references with DSM noise cancellation.

#### **TECHNICAL APPROACH**

An FBAR-based high-frequency/low-noise/low-power reference and various DSM cancellation techniques will be used to push the loop bandwidth beyond 120 MHz, thereby significantly reducing the noise of the RO and maximizing the benefits of the FBAR. The proposed objectives will be pursued as part of three tasks: (1) construction of a comprehensive Verilog-AMS simulation platform for swift performance evaluation of various PLL topologies, (2) implementation of a TDC-based DSM noise canceling PLL as a proof of concept, and (3) exploration of a switched-capacitor FDC-based fractional-N PLL for next-generation systems, in which the DSM noise is low-pass filtered before entering the loop.

#### SUMMARY OF RESULTS

Conventional periodic steady-state (PSS) analyses for fractional-N PLLs are challenging due to the large periodicity of the system. To assess PLL performance quickly, a workflow capable of accurately modeling the noise of oscillators and logic gates within Verilog-AMS is implemented. The initial phase involves characterizing the oscillator phase noise via PSS+PNOISE simulation in Spectre RF, followed by the extraction of three parameters $\sigma_{\Delta t}, \sigma_{\Delta T}, \sigma_{\Delta T, 1/f}$ from the phase noise plot. These parameters are subsequently fed into time-domain FIR and IIR filter banks to generate the corresponding regions of phase noise slope. An initial verification result of the oscillator simulation model, displayed in Fig. 1, shows that the transient simulation from the preliminary framework aligns well with the PSS+PNOISE simulation. With the developed model, the time required for a 1-ms transient simulation can be reduced by nearly 1000x, from several days to 5 minutes.
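As a rough illustration of how a jitter parameter can be read off a phase noise plot, the sketch below uses the standard white-noise relation for a free-running oscillator, $\mathcal{L}(\Delta f) = \sigma^2 f_0^3 / \Delta f^2$, to convert one point in the $1/f^2$ region into an rms period jitter. Treating this value as $\sigma_{\Delta T}$ is an assumption made here for illustration; the function names and the example numbers are hypothetical and are not taken from the reported framework.

```python
import math


def period_jitter_from_phase_noise(l_dbc_hz, offset_hz, f0_hz):
    """RMS period jitter (s) implied by one point in the 1/f^2 region.

    Assumes white, accumulating period jitter: L(df) = sigma^2 * f0^3 / df^2.
    """
    l_linear = 10 ** (l_dbc_hz / 10)
    return math.sqrt(l_linear * offset_hz ** 2 / f0_hz ** 3)


def phase_noise_from_period_jitter(sigma_s, offset_hz, f0_hz):
    """Inverse relation: phase noise in dBc/Hz at a given offset."""
    return 10 * math.log10(sigma_s ** 2 * f0_hz ** 3 / offset_hz ** 2)


# Hypothetical example point: -90 dBc/Hz at 1-MHz offset from a 5.5-GHz carrier.
sigma = period_jitter_from_phase_noise(-90.0, 1e6, 5.5e9)
```

Under this relation the phase noise falls by 20 dB per decade of offset, so a single extracted value fixes the whole $1/f^2$ region; flicker-dominated and flat regions would need separate parameters.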

Using the developed framework, a hybrid PLL with DTC-based DSM noise cancellation will be implemented. The PLL embodies a hybrid, digitally intensive, and wideband ring-based structure, utilizing an analog RC filter in the proportional path and a digital integral path. The 5-6 GHz output from the 5-stage ring oscillator passes through an edge selection block, consequently easing the dynamic range necessary for subsequent circuits and reducing the INL requirements of the DTC.

Moreover, a switched-capacitor frequency-to-current converter (F2I) translates the frequency of the reference and DCO into a proportional current. The divider is embedded within the F2I through ratioed capacitors. The fractional error is generated via another F2I, driven by a DSM clocked by the reference. This fractional path is heavily low-pass filtered, thereby eliminating the high-pass-shaped DSM noise. This arrangement decouples the bandwidth of the signal path containing the frequency error information from the bandwidth of the fractional frequency generated by the DSM, facilitating a high main-loop bandwidth for rapid correction of the DCO's phase error, while a smaller bandwidth on the DSM fractional path filters the noise.



Figure 1. Initial validation of the rapid simulation framework.

**Keywords:** Phase-locked loop (PLL), digital-to-time converter (DTC), ring oscillator, phase noise, bandwidth

#### INDUSTRY INTERACTIONS

#### MAJOR PAPERS/PATENTS

[1] H. Lu and P. P. Mercier, "Linear Periodically Time-Variant Digital PLL Phase Noise Modeling Using Conversion Matrices and Uncorrelated Upsampling," in IEEE Transactions on Circuits and Systems I: Regular Papers. (under review)

This project will demonstrate optimum techniques for Antenna-in-Package (AiP) solutions for RF and millimeter-wave (mmWave) phased arrays using 3D Heterogeneous Integration (3DHI) and operating between 20 and 170 GHz. This will include fully documented approaches to antennas, interconnects, transitions, chips, and passive networks, followed by reference phased-array AiP designs.

#### TECHNICAL APPROACH

We will study 3DHI for RF/mmWave phased-array applications. Our approach employs an AiP made using photosensitive glass and 3DHI of Gallium-Nitride (GaN) front-end ICs and silicon-on-insulator (SOI) beamforming ICs. We will investigate both approaches and trade-offs of array architecture, chip(let) architecture, thermal management, packaging substrate design, manufacturing and assembly of the AiP, and overall AiP performance. We will study these across 20-170 GHz, with prototypes planned at 20, 44, 60, and/or 140 GHz, leveraging existing beamformers at NC State. Our goal is to create scalable, testable, and transferrable designs that can be used by the SRC member community.

#### SUMMARY OF RESULTS

This report summarizes one semester of research activity, as the project started in January 2024. The research project is organized into five tasks, as follows: (1) AiP system evaluation, (2) package component design in glass, (3) chip design (both beamformers in SOI and frontends in GaN), (4) thermal management of the AiP, and (5) full AiP design and prototyping.

Our team made progress in both Tasks 1 and 3 during the reporting period. For Task 1, we aim to evaluate how a 3DHI solution can alter the system architecture, benefit the link budget, reduce power consumption, reduce area, etc. We focused on 60-GHz system evaluation for this first period, leveraging both prior and ongoing work. We are developing 60-GHz full-duplex phased-array beamformers in GlobalFoundries 45nm RFFE technology with partial funding from the NSF RINGS project. These 60-GHz phased arrays will then serve as candidate beamformers for a 3DHI system. We completed a link-budget analysis for the 60-GHz system, comparing a 3DHI system with GaN frontends with an all-silicon solution. Switching to a GaN frontend could reduce the power consumption by 3X for the same range or increase the range by 2.3X for the same power. This work is ongoing.

For Task 3, our focus is on the design and characterization of candidate integrated circuits to be used within a 3DHI system. This includes both circuits that are already designed and those currently being designed. In the past period, we focused first on the characterization of an existing Q-band transmit element that was recently received and then second on the ongoing design of 60-GHz beamforming circuits in 45-nm RFFE technology.

First, we completed the characterization of a 39-44 GHz transmitter (TX) beamformer circuit designed in a parallel project focused on SATCOM array development and funded by the US Army. This and a 20-GHz receiver (RX) beamformer circuit are two more candidate beamformer circuits that may be used within 3DHI prototypes. Our characterizations of both beamformer circuits show that the ICs work well, matching simulation. The RX provides a 2.2-dB noise figure at 20 GHz [1], whereas the TX provides an 18-dBm output 1-dB compression point (oP<sub>1dB</sub>) at 43 GHz [2]. Both circuits exhibit near-perfect gain and phase control with 6-bit resolution.

The 60-GHz phased array is designed for a full-duplex system that can potentially be used in 6G networks. It includes a transmit/receive (T/R) switch, low-noise amplifier, power amplifier, phase shifter, and variable-gain amplifier circuits. All circuits have been designed and laid out individually. Each circuit is achieving performance comparable to state-of-the-art performance for 60-GHz arrays. The remaining work is integrating all circuits into a beamformer, ensuring the layout is compatible with 3DHI. These designs will be sent for fabrication later in 2024.

**Keywords:** heterogeneous, integration, packaging, 3D, phased arrays

#### INDUSTRY INTERACTIONS

3D Glass Solutions, Analog Devices Inc.

#### MAJOR PAPERS/PATENTS

[1] Y. Chang and B. A. Floyd, "Reduction of phase and gain control dependencies within a 20 GHz beamforming receiver IC," *IEEE Access*, vol. 11, no. 5, pp. 68066-68078, May 2023.

[2] T. Ren, Y. Chang, and B. A. Floyd, "A Q-band phased-array transmit beamformer in 45nm CMOS SOI for SATCOM," *IEEE BiCMOS Compound Semi. ICs Tech. Symp. (BCICTS)*, Oct. 2024, pp. 1-4.

#### MICHAEL FLYNN, UNIVERSITY OF MICHIGAN, MPFLYNN@UMICH.EDU

#### SIGNIFICANCE AND OBJECTIVES

The noise-shaping (NS) SAR ADC has revolutionized analog-to-digital conversion by combining the efficiency of SAR with the resolution of delta-sigma. NS SARs deliver record efficiency. Architectural innovations have pushed the NS SAR resolution to the audio range. NS SAR ADCs are also appealing as noise-shaping quantizers in continuous-time DSMs.

#### **TECHNICAL APPROACH**

We will investigate new interleaving architectures that increase the bandwidth and resolution of Time-Interleaved Noise-Shaping SAR ADCs by raising the NTF (Noise Transfer Function) order while reducing the hardware and timing complexity. Our proof-of-concept research indicates that a much lower oversampling ratio (OSR) of 4-6x becomes possible, which helps us target ambitious bandwidth (400 MHz) and resolution (60+ dB) specifications.
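The interplay between NTF order and OSR can be estimated with the textbook quantization-noise expression for an ideal $L$th-order NTF of the form $(1 - z^{-1})^L$. The sketch below is only an idealized estimate under that assumption: it ignores thermal noise, mismatch, and nonlinearity, and the approximation becomes coarse at the low OSR values targeted here. The function name and the example quantizer resolution are hypothetical.

```python
import math


def ideal_ns_sqnr_db(n_bits, order, osr):
    """Ideal peak SQNR (dB) for an n_bits quantizer with pure
    differentiator noise shaping of the given order.

    SQNR = 6.02*N + 1.76 + (2L+1)*10*log10(OSR) - 10*log10(pi^(2L) / (2L+1))
    """
    base = 6.02 * n_bits + 1.76
    shaping_gain = (2 * order + 1) * 10 * math.log10(osr)
    shaping_penalty = 10 * math.log10(math.pi ** (2 * order) / (2 * order + 1))
    return base + shaping_gain - shaping_penalty


# Hypothetical sweep: 8-bit quantizer at OSR = 4 for several NTF orders.
sweep = {order: ideal_ns_sqnr_db(8, order, 4) for order in (0, 2, 4)}
```

Even at an OSR of 4, the estimate rises with NTF order, which is the motivation for pursuing higher-order shaping rather than a higher OSR.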

#### SUMMARY OF RESULTS

Although there has been tremendous progress in improving NS SAR resolution and efficiency, oversampling fundamentally limits the conversion bandwidth. Interleaving is an obvious choice, and work by this investigator and others has shown that time interleaving (TI) of NS SAR ADCs is practical and advantageous [1]. The TI input load is constant, and the inter-channel delays in a TI NS SAR facilitate noise shaping. Nevertheless, state-of-the-art TI NS SAR ADCs are limited to around 100-MHz bandwidth, which falls far short of the bandwidth needed for emerging communications and sensing (e.g., radar) applications.

TI NS SAR ADCs widen the bandwidth with minimum hardware overhead since they harness the inherent time delay between channels to eliminate the need for multicycle switched-capacitor circuits for residue feedback (Figs. 1 and 2). However, a challenge is that relying on the natural inter-channel delay means that a higher-order NTF is only possible with a large number of channels.

For example, a 2<sup>nd</sup>-order lowpass TI NS SAR requires three channels, while a 4<sup>th</sup>-order bandpass TI NS SAR requires five channels. A high number of channels may result in more severe channel mismatch and, therefore, more interleaving artifacts. Further challenges are noise and non-linearity, which limit the resolution gained by an increased NTF order. These factors prevent TI NS from delivering sub-GHz-BW with necessary resolution. Besides optimizing the residue feedback architecture, the resolution and speed of each sub-ADC are also critical for our targeted specifications. We will improve the unit SAR architecture to improve performance and efficiency.

Conventionally, a SAR ADC trades conversion speed for resolution. We plan to break this tradeoff using techniques that speed up quantization and preserve precision. Our core idea is to use noise shaping to suppress quantizer nonidealities, such as distortion or memory effects. Noise shaping allows us to use speed-up techniques that are impractical in Nyquist ADCs because they contaminate the in-band spectrum. For example, a continuous-time (CT) version of NS SAR ADC has been made possible by using noise-shaping to suppress the dynamic tracking error due to sampling switch removal.



Figure 1. Timing framework of TI NS SAR.



Figure 2. Prior work reported in [1].

**Keywords:** NS SAR, noise shaping, interleaving

#### INDUSTRY INTERACTIONS

IBM, MediaTek, NXP, Qualcomm, Samsung

#### MAJOR PAPERS/PATENTS

[1] L. Jie, H. Chen, B. Zheng, and M. P. Flynn, "A 100MHz-BW 68dB-SNDR Tuning-Free Hybrid-Loop DSM with an Interleaved Bandpass Noise-Shaping SAR Quantizer," IEEE International Solid-State Circuits Conference, February 2021.

# TASK 3160.044, DIRECT-CARRIER MODULATED TRANSMITTER FOR EXTREMELY HIGH DATA RATE MILLIMETER-WAVE RADIOS

#### MAU-CHUNG FRANK CHANG, UNIVERSITY OF CALIFORNIA AT LOS ANGELES, MFCHANG@EE.UCLA.EDU

#### SIGNIFICANCE AND OBJECTIVES

High-order (>256 QAM) modulation in conventional transmitter (TX) architectures is limited by nonlinearity from the baseband-to-carrier conversion, in particular by constrained headroom, I/Q imbalance, memory effects, phase noise, and device PVT variations. We propose to use the patented DiCAD (Digitally Controlled Artificial Dielectric) [1-2] to design digitally controlled phase modulators with true-time-delay transmission lines (TLs).

#### **TECHNICAL APPROACH**

The proposed digital TX takes advantage of the nanometric CMOS lithography to implement digitally controlled phase modulators with embedded DiCAD TLs. The carrier is initially phase-shifted by  $\varphi$  before splitting into two branches, then each phase-shifted by the same angle  $\theta$ , but in opposite rotations. Since these two outphasing vectors have a constant envelope, they can be combined directly without backing off the output power.

#### SUMMARY OF RESULTS

Fig. 1 shows such a TX architecture. The modulator uses three sets of switched DiCAD TLs (BBPM phase-shifts the carrier by $\varphi$ for initial phase modulation, and BBAM+/BBAM- phase-shift by $\pm\theta$ for overall amplitude modulation) to produce out-phased phasors. Each set consists of both coarse and fine phase modulators.



Figure 1. Architecture of the proposed direct-carrier digital outphasing transmitter and its modulation principle.

DiCAD TLs with switched capacitors based on the CMOS backend are used to implement the true-time-delay digital phase modulator. The thin backend metals can enhance TL unit capacitance and reduce its inductance, resulting in a low characteristic impedance. We implement solenoid-like on-chip TL structures as shown in Fig. 2 to further boost its unit inductance by over 20x, increasing the characteristic impedance from  $16\Omega$  to  $70\Omega$ , while compacting the phase modulator.
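The reported impedance change is roughly what the lossless-line relation $Z_0 = \sqrt{L/C}$ predicts. The short check below assumes, for illustration only, that the unit capacitance is unchanged by the solenoid-like structure and that the inductance boost is exactly 20x; the function name is hypothetical.

```python
import math


def scaled_impedance(z0_ohm, l_scale, c_scale=1.0):
    """Characteristic impedance after scaling unit inductance and capacitance.

    Uses the lossless transmission-line relation Z0 = sqrt(L / C).
    """
    return z0_ohm * math.sqrt(l_scale / c_scale)


# 16-ohm line with a 20x inductance boost and unchanged capacitance (assumption).
z0_boosted = scaled_impedance(16.0, 20.0)
```

Under these assumptions the result is close to 72 Ω, in line with the reported 70 Ω; a small accompanying change in unit capacitance would account for the difference.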



Figure 2. Digital phase modulator with loss compensation.

The digital control of each MOSFET switch introduces a code-dependent loss due to the on-resistance of the device channel. To alleviate this, a Negative-R (NR) loss compensation mechanism is incorporated in the coarse phase modulator to implement a 2-dB gain variation across control codes (Fig. 2).

The direct-carrier digital out-phasing transmitter takes advantage of the LINC (linear amplification with nonlinear components) concept and addresses the distortion issue by utilizing the abundant sampling space provided by the coarse and fine controls of the DiCAD TL phase modulators. As shown in Fig. 3, the gain provided by the saturation amplifiers aims to minimize the modulator's PM-AM distortion. The series combining provides a sign bit via the $\sin\theta$ operation, allowing the out-phased input phasors to create both positive and negative symbols, so the digital modulator is only required to cover 180° to achieve high-order modulation (>256 QAM) for mm-Wave transmitters.
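The outphasing principle can be checked numerically. The sketch below assumes unit-amplitude phasors and models the series combiner as an ideal difference of the two branches, which yields $2j\sin\theta\,e^{j\varphi}$; both the idealization and the function names are assumptions for illustration, not a description of the implemented circuit.

```python
import cmath
import math


def outphased_pair(phi, theta):
    """Two constant-envelope phasors rotated by +theta and -theta around phi."""
    s_plus = cmath.exp(1j * (phi + theta))
    s_minus = cmath.exp(1j * (phi - theta))
    return s_plus, s_minus


def series_combine(phi, theta):
    """Ideal difference combiner: equals 2j * sin(theta) * exp(j * phi)."""
    s_plus, s_minus = outphased_pair(phi, theta)
    return s_plus - s_minus


# The envelope follows |2 sin(theta)|, and flipping the sign of theta
# flips the sign of the output symbol.
symbol_pos = series_combine(0.3, 0.5)
symbol_neg = series_combine(0.3, -0.5)
```

Because the sign of $\theta$ alone selects between a symbol and its negative, each modulator only needs to span half of the full 360° range.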



Figure 3. Series combined power amplifier to generate the final symbol.

**Keywords:** QAM, Transmitter, Out-phasing, Direct-carrier, Transmission Line

#### INDUSTRY INTERACTIONS

NXP

#### MAJOR PAPERS/PATENTS

[1] M.-C. F. Chang, D. Huang, and W. Hant, "Tunable Artificial Dielectrics," US Patent 8164401 B2, Apr 24, 2012.

[2] M.-C. F. Chang, D. Huang, and W. Hant, "Tunable Artificial Dielectrics," US Patent 7852176 B2, Dec 14, 2010.

Large die yield concerns and the rise of domain-specific accelerators have motivated partitioning compute modules into multiple chiplets on interposers with high-density interconnects. The die-to-die interconnect design techniques in this proposal aim to significantly improve efficiency at high per-pin data rates, which is necessary for the continued scaling of future systems.

#### **TECHNICAL APPROACH**

A new dense, energy-efficient die-to-die interconnect transceiver architecture based on simultaneous bidirectional (SBD) signaling is in development. The transceiver front-ends utilize a novel inverter-based voltage-mode driver with a replica driver hybrid, which efficiently separates the outbound and inbound signals and merges low-complexity echo cancellation with high-pass-filter-based near-end and far-end crosstalk (NEXT and FEXT) cancellation circuitry. Low-overhead 10b11b spatial encoding is employed to dramatically reduce supply-noise-induced crosstalk. Finally, a forwarded-clock architecture allows for low-complexity receive-side deskew and high-frequency correlated jitter tracking.

#### SUMMARY OF RESULTS

Fig. 1 shows the proposed die-to-die interconnect transceiver architecture, which consists of 24 single-ended wires: 22 SBD data transceivers that transmit 20 effective bidirectional data streams between dies, and 2 unidirectional forwarded-clock channels. Two groups of low-overhead 10b11b spatial encoding are employed in the 22 data transceivers to dramatically reduce supply-noise-induced crosstalk.

Efficient SBD techniques are necessary to generate a replica of the outbound signal, which is subtracted from the total signal present at the transceiver interface so that only the inbound signal is extracted. The proposed front-end in Fig. 2 introduces an additional echo cancellation segment driven by the outbound data sequence that is capacitively coupled in the replica stage to provide a positive high-pass-shaped echo cancellation signal. Properly adjusting the echo cancellation drive strength and coupling capacitor value with the SSLMS adaptation engine provides improved eye diagram margins. Increased NEXT and FEXT are observed as the interposer signal-to-signal pitch is decreased to improve the transceiver edge density. The proposed front-end therefore also employs high-pass-filter-based cancellation of the NEXT and FEXT coupled onto a given wire from its 6 surrounding interposer channels. These techniques are under investigation to achieve the target performance of a die-to-die transceiver architecture operating at 107 Gb/s/wire, considering the spatial encoding and forwarded-clock overhead, with an energy efficiency of 0.2 pJ/b and an edge density of 32 Tb/s/mm when integrated with a high-density interposer.
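The 107 Gb/s/wire figure is consistent with simple overhead accounting. The check below assumes that the 128 Gb/s quoted in the Fig. 2 caption is the raw per-wire rate and that overhead is spread evenly across all 24 wires; the function names are hypothetical.

```python
def encoded_streams(groups, data_bits=10, coded_wires=11):
    """Effective data streams and physical data wires for 10b11b spatial encoding."""
    return groups * data_bits, groups * coded_wires


def effective_rate_per_wire(raw_gbps, total_wires, effective_streams):
    """Effective data rate per physical wire after encoding and clock overhead."""
    return raw_gbps * effective_streams / total_wires


streams, data_wires = encoded_streams(groups=2)    # 20 streams on 22 data wires
total_wires = data_wires + 2                       # plus 2 forwarded-clock wires
rate = effective_rate_per_wire(128.0, total_wires, streams)
```

With these assumptions, 20 effective streams over 24 wires at 128 Gb/s give about 106.7 Gb/s/wire, which rounds to the stated 107 Gb/s/wire.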



Figure 1. Die-to-die interconnect transceiver architecture.



Figure 2. (a) SBD transceiver front-end with an inverter-based driver, replica driver hybrid, echo, and NEXT/FEXT cancellation. 128Gb/s SBD eye diagrams: (b) with echo cancellation, (c) without, and (d) with NEXT and FEXT cancellation.

**Keywords:** Chiplet, die-to-die interconnects, echo cancellation, interposer, simultaneous bidirectional signaling

#### INDUSTRY INTERACTIONS

Intel, MediaTek

# TASK 3160.047, HIGHLY EFFICIENT 100-DB+ PIPELINED SAR ADC WITH KT/C NOISE CANCELLATION

#### YUN CHIU, UNIVERSITY OF TEXAS AT DALLAS, CHIU.YUN@UTDALLAS.EDU

#### SIGNIFICANCE AND OBJECTIVES

Past noise-cancelling structures are limited in their performance by the linearity of the noise-cancelling amplifier. By digitally predicting the swing that the amplifier would see and applying the opposite signal, the summing node swing could be minimized.

#### **TECHNICAL APPROACH**

The accuracy of a digital FIR prediction filter was evaluated in MATLAB behavioral simulation. The predictor tested has fixed coefficients and is equivalent to fitting a cubic polynomial to the data and extrapolating.

#### SUMMARY OF RESULTS

A diagram of the first stage SAR ADC model is shown in Fig. 1 below.





Given the positions of the clock edges  $\Phi_{1e}$  and  $\Phi_{1x}$ , the fixed coefficients needed to predict signal movement from  $\Phi_{1e}$  to  $\Phi_{1x}$  ( $\Delta V$ ) can be computed. Additionally, the same FIR taps that are used to predict the amplifier swing can be used to predict  $D_o$ , the quantized signal, just with a different choice of coefficient.
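Fixed coefficients of this kind can be derived with Lagrange extrapolation. The sketch below assumes four uniformly spaced past samples and a prediction point expressed in units of the sample period; the function names and the example delay are hypothetical and are not the taps used in the reported MATLAB model.

```python
from fractions import Fraction


def extrapolation_taps(delay, order=3):
    """FIR taps that evaluate the polynomial through the last order+1 samples.

    Samples are taken at t = 0, -1, ..., -order (newest first), and the
    polynomial is evaluated at t = delay. Taps are exact rational numbers.
    """
    times = [Fraction(-k) for k in range(order + 1)]
    d = Fraction(delay)
    taps = []
    for i, t_i in enumerate(times):
        weight = Fraction(1)
        for j, t_j in enumerate(times):
            if j != i:
                weight *= (d - t_j) / (t_i - t_j)
        taps.append(weight)
    return taps


def predict(samples_newest_first, taps):
    """Apply the taps to the stored samples (newest sample first)."""
    return sum(w * x for w, x in zip(taps, samples_newest_first))


# One-sample-ahead cubic extrapolation.
taps_one_ahead = extrapolation_taps(1)
```

For a one-period look-ahead the taps are 4, -6, 4, -1. Subtracting 1 from the first tap gives coefficients that predict the signal movement rather than the signal value, and a different `delay` gives the coefficients for a different target instant using the same stored samples.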

With a 10-b first-stage resolution, the prediction error was calculated for a full-scale input at different frequencies. For inputs up to 10% of Nyquist, a swing of about -60 dBFS was seen at the amplifier input, and the signal prediction provided about 8 bits of accuracy.



Figure 2. RMS error of digital prediction filter.

In transient behavioral simulation, after a few cycles in which accurate samples are being loaded into the FIR filter taps, the expected prediction accuracy is observed. Some redundant bit cycles are employed to ensure the first-stage code is accurate to 10 bits.



Figure 3. Behavior simulation of instantaneous error of digital prediction filter.

In the following year, the plan is to design and tape out a prototype chip in a 22-nm CMOS process.

**Keywords:** noise cancellation, kT/C, input swing, FIR filter, digital prediction

#### INDUSTRY INTERACTIONS

Texas Instruments

Accurate, low-power, low-noise, fully-differential, and low-distortion programmable gain amplifiers (PGAs) with a wide common-mode input range are a critical interface to fully-differential data converters. The objective is to develop new methods of designing programmable chopper-stabilized amplifiers with noise and power performance that exceeds that of standard resistive-feedback instrumentation amplifiers.

#### **TECHNICAL APPROACH**

A comprehensive assessment will be made of capacitively-coupled programmable gain amplifiers for signal conditioning that support rail-to-rail input signals with rail-to-rail common-mode inputs. Practical strategies for setting the common-mode input voltage of the operational amplifier with low noise and fast settling will be assessed. To mitigate 1/f noise and offset of the amplifier, a chopping scheme will be used. The concept of ping-pong pre-charged chopper capacitors with positive feedback will be investigated as an alternative to pre-charged buffers to boost the input impedance of the PGA.

#### SUMMARY OF RESULTS

The conventional approach for providing a front-end interface to differential-input analog-to-digital converters that support inputs with a large common-mode input range is the resistor-based PGA, often termed an instrumentation amplifier, shown in Fig. 1. The gain of this amplifier can be programmed with a single resistor RF to accommodate a wide range of differential input voltages. But this approach presents some challenges: the amplifiers require a large common-mode input range; good compact linear resistors require specialized processing steps; the resistors and amplifiers introduce thermal noise, thereby degrading the effective signal-to-noise ratio of the ADC; the power dissipation in the interface circuit can be large; and the amplifiers are plagued by offset voltage concerns and contribute 1/f noise.

To address some of the key issues associated with the resistor-based PGA, a capacitor-based approach has been reported. A basic implementation of the capacitor-based approach is shown in Fig. 2. With this structure, the gain is determined by capacitor ratios. This approach overcomes several of the limitations of the resistor-based approach. A low-power transconductance-type amplifier can be used, and chopper stabilization will mitigate offset voltage and 1/f noise in the amplifiers. Highly-linear capacitors are available in standard processes, the capacitors used to set the gain do not introduce noise, and with appropriate biasing, the amplifier does not require a large common-mode input range. But this capacitor-based chopper-stabilized PGA is not without challenges. The chopping operation introduces thermal noise, two capacitors must be adjusted to change the gain of the PGA, biasing of the amplifier in the presence of large common-mode input signals must be addressed, and an antialiasing filter is required.



Figure 1. Resistor-based PGA for front-end interface to a fully differential ADC.



Figure 2. Basic capacitor-based PGA approach.

In this work, a quantitative assessment of the tradeoffs between the resistor-based PGA and the capacitor-based PGA will be made. Major emphasis will be placed on the design of the capacitor-based PGA. In particular, tradeoffs between noise and power dissipation in the capacitor-based PGA design will be addressed.

**Keywords:** Capacitively-Coupled Programmable Gain Amplifier, Chopper-Stabilized, Capacitor Feedback

#### INDUSTRY INTERACTIONS

Texas Instruments

# TCI 2023 TASK 1, NANODIELECTRIC FLUIDS USING A MULTI-NANOPARTICLE SYSTEM FOR TWO-PHASE HEAT TRANSFER IN 3D HETEROGENEOUS MICROSYSTEMS

MONA GHASSEMI, UNIVERSITY OF TEXAS AT DALLAS, MONA.GHASSEMI@UTDALLAS.EDU RASHAUNDA HENDERSON, UNIVERSITY OF TEXAS AT DALLAS

#### SIGNIFICANCE AND OBJECTIVES

As stacking chips becomes increasingly important in computing, challenges remain in dissipating heat from internal processing components. This project aims to develop and study nanodielectric fluids (NDFs) for the first time as the next generation of transformative, innovative approaches to thermal management in microsystems.

#### **TECHNICAL APPROACH**

Through comprehensive literature reviews on nanodielectric fluids, we showed that a remarkable enhancement in both dielectric strength and thermal management cannot be achieved with a single type of nanoparticle (NP). The transformative idea of this project is to create a hybrid nanofluid that includes two types of NPs. Nanodiamonds (ND) are far superior to other NPs in thermal management. However, their dielectric strength enhancement is insufficient. On the dielectric strength side, our choice is Fe<sub>3</sub>O<sub>4</sub>, which leads to a 23% increase in dielectric strength. Hence, the targeted multi-NP system is formed from nanodiamonds and Fe<sub>3</sub>O<sub>4</sub>.

#### SUMMARY OF RESULTS

First, we conducted three comprehensive and critical literature reviews [1-3] that concentrate on the dielectric and thermal performance enhancement of NDFs. Several factors, such as NDF stability, NP concentration, size, and shape, NDF viscosity, choice and concentration of surfactant, ultrasonication time, and cost, should be considered while synthesizing an NDF. Even though several stability enhancement methods exist, the performance of NDFs under actual load conditions is still a concern. Although many studies have been conducted using various base fluids and nanoparticles, commercial NDFs have yet to be realized. Our review papers critically review investigations done on NDFs, identify technical gaps, and highlight future research needs. As a result, they can be valuable sources for further research.

Most studies on NDFs focus on transformer oils, with limited research on their use in 3D heterogeneous integration (3DHI) microsystems. Hydrofluoroethers (HFEs), which are being investigated as dielectric coolants in microsystems, are colorless, odorless, low-surface-tension, low-viscosity solvents with a dielectric constant of around 7 and very low electrical conductivity [2]. ND-Fe<sub>3</sub>O<sub>4</sub> hybrid nanoparticles have high thermal conductivity and magnetic properties, making them interesting for NDFs. However, they have mostly been tested in nondielectric host fluids [2]. In this project, we are examining ND-Fe<sub>3</sub>O<sub>4</sub> hybrid nanoparticles in a dielectric host fluid. Adding nanoparticles to a fluid generally increases its viscosity, making it more difficult for the fluid to flow freely. However, as shown in Fig. 1(a), the viscosity increase up to 1 wt.% is negligible. On the other hand, a thermal conductivity enhancement of 14.5% can be achieved at merely 0.12 wt.% of ND (Fig. 1(b)). This shows that the effect of NP loading on the viscosity is insignificant at the optimum concentrations of the NPs in the base fluid.



Figure 1. (a) Viscosity vs NP concentration, (b) thermal conductivity enhancement vs NP concentration.

**Keywords:** nanodielectric fluids, dielectric strength, thermal conductivity, multi-nanoparticle system, 3D heterogeneous integration of microsystems

#### INDUSTRY INTERACTIONS

#### MAJOR PAPERS/PATENTS

[1] S. P. Kalakonda, et al., "Nanodielectric fluids for power transformer cooling...," *IEEE DCAS*, April 2024.

[2] S. Kalakonda, et al., "Liquid nanodielectrics for heat transfer in 3D heterogeneous...," *IEEE IPMHVC*, May 2024.

[3] S. P. Kalakonda, et al., "Nanodielectric fluids for power transformer insulation challenges...," *IEEE Trans. Dielectr. Electr. Insul.*, under review.

#### **APPENDIX I PUBLICATIONS OF TXACE RESEARCHERS**

# **Conference Publications**

- [1] Jogalekar, A., Medina, O., Iyer, M., Blanchard, A., Henderson, R., Murugan, R., Bakshi, H., Ali, H. (2023). A Novel Approach to Measure and Characterize Radiation Patterns of Antenna-in-Package. 2023 IEEE 73rd Electronic Components and Technology Conference (ECTC), Orlando, FL, USA, pp. 498-503, IEEE.
- [2] Jogalekar, A., Medina, O., Iyer, M., Henderson, R. (2023). Characterization of WR5 Band Based Antenna-in-Package Solution Using Flip-Chip Enhanced QFN. SRC Techcon, Austin, TX, SRC.
- [3] Medina, O., Jogalekar, A., Mcgarry, M., Nambiar, K., Lu, H., Lee, M., Henderson, R. (2023). Substrate Temperature Effects on the Performance of mm-wave Antenna-in-Package. SRC Techcon, Austin, TX, SRC.
- [4] Medina, O., Jogalekar, A., Mcgarry, M., Nambiar, K., Lu, H., Lee, M., Henderson, R. (2023). Temperature-Based Performance of Millimeter-Wave Antenna-In-Package. 2023 IEEE International Symposium on Antennas and Propagation and USNC-URSI Radio Science Meeting (USNC-URSI), Portland, OR, USA, pp. 1907-1908, IEEE.
- [5] Nambiar, V., Ren, Y., Jogalekar, A., Medina, O., Lu, H., Henderson, R., Iyer, M. (2023). Linear Viscoelastic Characterization of Epoxy Molding Compound and Ajinomoto Build Up Film Through Nanoindentation for Antenna in Package (AiP) Applications. SRC Techcon, Austin, TX, SRC.
- [6] Song, S., Kang, T., Lee, S., and Flynn, M. (2023). A 150-MS/s Fully Dynamic SAR-Assisted Pipeline ADC Using a Floating Ring Amplifier and Gain-Enhancing Miller Negative-C. 2023 IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits), Kyoto, Japan, pp. 1-2, IEEE.
- [7] Le, H., Hardy, C., Pham, H., Jatlaoui, M., Voiron, F., Chen, P., Jha, S., Mercier, P. (2023). Vertical Power Delivery and Heterogeneous Integration for High-Performance Computing. 2023 IEEE BiCMOS and Compound Semiconductor Integrated Circuits and Technology Symposium (BCICTS), Monterey, CA, USA, pp. 32-35, IEEE.
- [8] Yi, I., Kaile, S., Zhu, Y., Gomez Diaz, J., Hoyos, S., and Palermo, S. (2023). A 50Gb/s DAC-Based Multicarrier Polar Transmitter in 22nm FinFET. 2023 IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits), Kyoto, Japan, pp. 1-2, IEEE.
- [9] Peng, Y., Bhat, A., Wadhwa, S., Blaauw, D., and Sylvester, D. (2023). A 4.6nW subthreshold voltage reference with 400× current variation reduction and 64-step 0.11% output voltage programmability. ESSCIRC 2023- IEEE 49th European Solid State Circuits Conference (ESSCIRC), Lisbon, Portugal, pp. 89-92, IEEE.
- [10] Wang, W., Zhang, S., Sharma, S., Lee, M., and Mukhopadhyay, S. (2024). Measurement of Aging Effect of an Analog Computing-In-Memory Macro in 28nm CMOS. 2024 IEEE International Reliability Physics Symposium (IRPS), Grapevine, TX, USA, pp. 1-4, IEEE.
- [11] Zhang, S., Rahman, N., Wang, W., Kidambi, N., Tokunaga, C., and Mukhopadhyay, S. (2024). Measurement of Aging Effect in a Digitally Controlled Inductive Voltage Regulator in 65nm CMOS. 2024 IEEE International Reliability Physics Symposium (IRPS), Grapevine, TX, USA, pp. 1-6, IEEE.
- [12] Li, M., Wang, Z., Mathew, S., De, V., Seok, M. (2024). 16.6 PACTOR: A Variation-Tolerant Probing-Attack Detector for a 2.5 Gb/s × 4-Channel Chip-to-Chip Interface in 28nm CMOS. 2024 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, pp. 306-308, IEEE.

- [13] Wang, Z., Li, M., Kim, S., Desai, N., Krishnamurthy, R., Lazaro, O., Blanco, A., Zhang, X., Seok, M. (2023). 93.89% Peak Efficiency 24V-to-1V DC-DC Converter with Fast In-Situ Efficiency Tracking and Power-FET Code Roaming. ESSCIRC 2023 - IEEE 49th European Solid State Circuits Conference (ESSCIRC), Lisbon, Portugal, pp. 437-440, IEEE.
- [14] Drallmeier, M., and Rosenbaum, E. (2023). Distributed protection for high-speed wireline receivers. 2023 45th Annual EOS/ESD Symposium (EOS/ESD), Riverside, CA, pp. 1-9, ESDA.
- [15] Huang, S., and Rosenbaum, E. (2023). Physics-based compact model of N-Well ESD diodes. 2023 45th Annual EOS/ESD Symposium (EOS/ESD), Riverside, CA, pp. 1-6, ESDA.
- [16] Amankrah, E., Evenezer, P., Chen, D., Geiger, R. (2024). Precise On-Chip Temperature Control for Test: Constant Power Microheater. 2024 GOMACTech, Charleston, SC, USA. US Government Export Controlled.
- [17] Banahene, K., and Geiger, R. (2024). Digitally Program Current Mirrors Utilizing Cascode Transistors. 2024 GOMACTech, Charleston, SC, USA. US Government Export Controlled.
- [18] Tutuani, P., and Geiger, R. (2023). High Resolution Linear Time to Digital Converter Using Pulse Shrinking Rings. NAECON 2023 - IEEE National Aerospace and Electronics Conference, Dayton, OH, USA, pp. 168-173, IEEE.
- [19] Tutuani, P., and Geiger, R. (2024). Compact Linear Time to Digital Conversion Using Pulse Shrinking Rings. 2024 GOMACTech, Charleston, SC, USA. US Government Controlled.
- [20] Yang, R., Gadogbe, B., Chen, D., and Geiger, R. (2023). A Compact and Accurate MOS-based Temp Sensor for Thermal Management. 2023 IEEE 66th International Midwest Symposium on Circuits and Systems (MWSCAS), Tempe, AZ, USA, pp. 594-598, IEEE.
- [21] Najim, N., Kayyil, A., Allstot, D., and Paramesh, J. (2023). Machine learning techniques for digital pre-distortion in CMOS switched-capacitor power amplifiers. SRC Techcon, Austin, TX, SRC.
- [22] Giordano, M., Doshi, R., Lu, Q., Murmann, B. (2024). TinyForge: A Design Space Exploration to Advance Energy and Silicon Area Trade-offs in tinyML Compute Architectures with Custom Latch Arrays. 2024 ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), San Diego, CA, pp. 1033–1047, ACM.
- [23] Tang, J., Jiang, J., Zhao, L., Zhang, X., Wei, K., Huang, C. (2024). A Monolithic 3-Level Single-Inductor Multiple-Output Buck Converter with State-Based Non-Linear Control Capable of Handling 1A/1.5ns Transient with On-Die LC. 2024 IEEE Custom Integrated Circuits Conference (CICC), Denver, CO, pp. 1-2, IEEE.
- [24] Khan, M., Wei, K., Zhang, X., Huang, C. (2023). A Single-Inductor 4-Phase Hybrid Switched-Capacitor Topology for Integrated 48V-to-1V DC-DC Converters. 2023 IEEE International Symposium on Circuits and Systems (ISCAS), Monterey, CA, pp. 1-5, IEEE.
- [25] Sim, S., Zhang, X., Jiang, J., Wei, K., Huang, C. (2024). A 94.7% Efficiency Direct-Step-Down Switched-Tank-Based 48V to 1V-3.3 V Hybrid Converter with Constant-Resonant-Time Closed-Loop Control. 2024 IEEE Applied Power Electronics Conference and Exposition (APEC), Long Beach, CA, pp. 1344-1350, IEEE.
- [26] Batabyal, A., Khyalia, S., Zele, R., Wang, H. (2023). Broadband CMOS Power Amplifier Using Novel Current Mode Combiner for Ka-Band Applications. 2023 IEEE International Symposium on Radio-Frequency Integration Technology (RFIT), Cairns, Australia, pp. 38-40, IEEE.
- [27] Cheng, Y., Chiong, C., Wang, Y., Wang, H. (2023). A 1.4-mW Ka-band Low Noise Amplifier Using Self-Resonant Transformer Matching in 90-nm CMOS Process. 2023 IEEE International Symposium on Radio-Frequency Integration Technology (RFIT), Cairns, Australia, pp. 14-16, IEEE.
- [28] Chien, C., Wang, Y., Ng, Y., Huang, T., Chiong, C., Wang, H. (2023). A D-Band Frequency Doubler with Gm-Boosting Technique in 28-nm CMOS. 2023 18th European Microwave Integrated Circuits Conference (EuMIC), Berlin, Germany, pp. 201-204, IEEE.

- [29] Huang, L., Chiong, C., Wang, Y., Wang, H., Huang, T., Chien, C. (2023). A D-band Low-Noise Amplifier in 28-nm CMOS Technology for Radio Astronomy Applications. 2023 18th European Microwave Integrated Circuits Conference (EuMIC), Berlin, Germany, pp. 369-372, IEEE.
- [30] Huang, L., Wang, Y., Wang, H. (2023). Design of a Compact Q-band Low Noise Amplifier in 0.15μm GaAs pHEMT Process. 2023 Asia-Pacific Microwave Conference (APMC), Taipei, Taiwan, pp. 16-18, IEEE.
- [31] Ma, W., Chiong, C., Wang, Y., Wang, H. (2023). A High LO-to-RF Isolation E-band Mixer with 30 GHz Instantaneous IF Bandwidth in 90nm CMOS. 2023 IEEE/MTT-S International Microwave Symposium - IMS 2023, San Diego, CA, USA, pp. 139-142, IEEE.
- [32] Ng, Y., Wang, Y., Wang, H. (2024). A W-Band Amplifier in FinFET Technology. 2024 IEEE 24th Topical Meeting on Silicon Monolithic Integrated Circuits in RF Systems (SiRF), San Antonio, TX, pp. 5-8, IEEE.
- [33] Karmokar, N., Harjani, R., Madhusudan, M., and Sapatnekar, S. (2023). Constructive Common-Centroid Placement and Routing for Binary-Weighted Capacitor Arrays. SRC Techcon, Austin, TX, SRC.
- [34] Madhusudan, M., Poojary, J., Sharma, A., Ramprasath, S., Kunal, K., Sapatnekar, S., and Harjani, R. (2023). Understanding Distance-Dependent Variations for Analog Circuits in a FinFET Technology. ESSDERC 2023 - IEEE 53rd European Solid-State Device Research Conference (ESSDERC), Lisbon, Portugal, pp. 69-72, IEEE.
- [35] Saikiran, M., Sekyere, M., Ganji, M., and Chen, D. (2023). Digital Assisted Defect Detection Methods for Analog and Mixed Signal Circuits: An Overview. 2023 IEEE East-West Design & Test Symposium (EWDTS), Batumi, Georgia, pp. 1-5, IEEE.
- [36] Saikiran, M., Sekyere, M., Ganji, M., and Chen, D. (2023). Graph Theory Based Defect Simulation Framework for Analog and Mixed Signal (AMS) Circuits with Improved Time-Efficiency. 2023 IEEE East-West Design & Test Symposium (EWDTS), Batumi, Georgia, pp. 1-6, IEEE.
- [37] Sekyere, M., Saikiran, M., and Chen, D. (2023). A Power Supply Rejection Based Approach for Robust Defect Detection in Operational Amplifiers. 2023 IEEE East-West Design & Test Symposium (EWDTS), Batumi, Georgia, pp. 1-6, IEEE.
- [38] Arunachalam, A., Das, S., Rajan, M., Su, F., Jin, X., Banerjee, S., Raha, A., Natarajan, S., Basu, K. (2023). Enhanced ML-based Approach for Functional Safety Improvement in Automotive AMS Circuits. 2023 IEEE International Test Conference (ITC), Anaheim, CA, USA, pp. 266-275, IEEE.
- [39] Yi, Y., Kteyan, A., Volkov, A., Moreau, S., Sukharev, V., and Kim, C. (2024). Electromigration Test Chip Experiments From Realistic Power Grid Structures: Failure Trend Comparison and Statistical Analysis. 2024 IEEE International Reliability Physics Symposium (IRPS), Grapevine, TX, USA, pp. 01-06, IEEE.
- [40] Ataman, F., Aladsani, M., Trichopoulos, G., YB, C., and Ozev, S. (2023). Mismatch measurement for MIMO mm-wave radars via simple power monitors. 2023 IEEE European Test Symposium (ETS), Venice, Italy, pp. 1-6, IEEE.
- [41] Ataman, F., Avci, M., YB, C., and Ozev, S. (2023). Global tuning for system performance optimization of RF MIMO radars. 2023 IEEE European Test Symposium (ETS), Venice, Italy, pp. 1-4, IEEE.
- [42] Ataman, F., YB, C., Rao, S., and Ozev, S. (2023). Improving Angle of Arrival Estimation Accuracy for mm-Wave Radars. 2023 IEEE International Test Conference (ITC), Anaheim, CA, USA, pp. 30-36, IEEE.
- [43] Afshar, M., Heydarzadeh, M., and Akin, B. (2023). Multi Sensory Distributed Bearing Fault Classification using Wavelet Scattering Transform. 2023 IEEE Energy Conversion Congress and Exposition (ECCE), Nashville, TN, USA, pp. 3077-3084.

- [44] Rajabioun, R., Afshar, M., and Akin, B. (2023). Deep Learning-Based Bearing Fault Classification Using Stray Magnetic Flux Signal. 2023 IEEE Energy Conversion Congress and Exposition (ECCE), Nashville, TN, USA, pp. 4043-4048.
- [45] Li, X., and Rincon-Mora, G. (2023). Maximum Power-Point Theory for Thermoelectric Harvesters. 2023 IEEE 66th International Midwest Symposium on Circuits and Systems (MWSCAS), Tempe, AZ, USA, pp. 409-413, IEEE.
- [46] Zhang, B., Moon, S., Seok, M. (2024). A 1-TFLOPS/W, 28-nm Deep Neural Network Accelerator featuring Online Compression and Decompression and BF16 Digital In-Memory-Computing Hardware. 2024 IEEE Custom Integrated Circuits Conference (CICC), Denver, CO, USA, pp. 1-2, IEEE.
- [47] Adjei, D., Gadogbe, B., Chen, D., and Geiger, R. (2023). A Resistorless Precision Curvature-Compensated Bandgap Voltage Reference Based on the VGO Extraction Technique. 2023 IEEE International Symposium on Circuits and Systems (ISCAS), Monterey, CA, USA, pp. 1-5, IEEE.
- [48] Adjei, D., Gadogbe, B., Chen, D., Geiger, R., Chaganti, S., Desai, D., Doorenbos, J., and Todsen, J. (2023). An Improved Single-Temperature Trim Technique for 1st Order-Compensated Bandgap References. 2023 IEEE 66th International Midwest Symposium on Circuits and Systems (MWSCAS), Tempe, AZ, USA, pp. 59-63, IEEE.
- [49] Darko, E., Bhatheja, K., Adjei, D., Strong, M., and Chen, D. (2023). On-Chip Monitoring of Time-Dependent Dielectric Breakdown (TDDB) using a Novel Leakage Current Sensor with Digital Output. 2023 IEEE International Integrated Reliability Workshop (IIRW), South Lake Tahoe, California, USA, pp. 1-6, IEEE.
- [50] Gadogbe, B., Adjei, D., Banahene, K., Geiger, R., and Chen, D. (2023). Sub-ppm/°C High Performance Voltage Reference. 2023 IEEE International Symposium on Circuits and Systems (ISCAS), Monterey, CA, USA, pp. 1-4, IEEE.
- [51] Komarraju, S., Tammana, A., Amarnath, C., and Chatterjee, A. (2023). OATT: Outlier Oriented Alternative Testing and Post-Manufacture Tuning of Mixed-Signal/RF Circuits and Systems. 2023 IEEE International Test Conference (ITC), Anaheim, CA, USA, pp. 37-46, IEEE.
- [52] Budak, A., Zhu, K., and Pan, D. (2023). Practical Layout-Aware Analog/Mixed-Signal Design Automation with Bayesian Neural Networks. 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), San Francisco, CA, USA, pp. 1-8, IEEE.
- [53] Maji, S., Budak, A., Poddar, S., and Pan, D. (2024). Toward End-to-End Analog Design Automation with ML and Data-Driven Approaches. 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC), Incheon, South Korea, pp. 657-664, IEEE.
- [54] Maji, S., Lee, S., and Pan, D. (2024). Analog Transistor Placement Optimization Considering Non-Linear Spatial Variation. 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), Valencia, Spain, pp. 1-6, IEEE.
- [55] Poddar, S., Budak, A., Zhao, L., Hsu, C., Maji, S., Zhu, K., Jia, Y., and Pan, D. (2024). A Data-Driven Analog Circuit Synthesizer with Automatic Topology Selection and Sizing. 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), Valencia, Spain, pp. 1-6, IEEE.
- [56] Sekyere, M., Darko, E., Bruce, I., Odion, E., Bhatheja, K., and Chen, D. (2023). Ultra-Small Area, Highly Linear Sub-Radix R-2R Digital-To-Analog Converters with Novel Calibration Algorithm. 2023 IEEE 66th International Midwest Symposium on Circuits and Systems (MWSCAS), Tempe, AZ, USA, pp. 604-608, IEEE.
- [57] Khalil, M., et al. (2024). A 69.3fs Ring-Based Sampling-PLL Achieving 6.8GHz-14GHz and –54.4dBc Spurs Under 50mV Supply Noise. 2024 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, pp. 138-140, IEEE.

- [58] Xu, X., Parl, B., Guzman, M., Chen, Y., Han, C., Maghari, N. (2023). Mixed-order Correlated Dual Loop Sturdy MASH CT-ΔΣ Modulator with Distributed Signal Feed-in and VCO Quantizer. 2023 IEEE Custom Integrated Circuits Conference (CICC), San Antonio, TX, USA, pp. 1-2, IEEE.
- [59] Zirtiloglu, T., Crary, P., Tasci, E., Eldar, Y., Shlezinger, N., Yazicigil, R. T. (2023). Task-Specific Low-Power Beamforming MIMO Receiver Using 2-Bit Analog-to-Digital Converters. 2023 IEEE Asian Solid-State Circuits Conference (A-SSCC), Haikou, China, pp. 1-3, IEEE.
- [60] Chen, X., Feng, J., Shoukry, A., Zhang, X., Magod, R., Desai, N., Gu, J. (2023). Proactive Power Regulation with Real-time Prediction and Fast Response Guardband for Fine-grained Dynamic Voltage Droop Mitigation on Digital SoCs. 2023 IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits), Kyoto, Japan, pp. 1-2, IEEE.
- [61] Drallmeier, M., Zhou, Y., and Rosenbaum, E. (2024). On-chip single-shot pulse generator for TDDB characterization on sub-nanosecond timescale. 2024 IEEE International Reliability Physics Symposium (IRPS), Grapevine, TX, USA, pp. 8C.4-1-8C.4-10, IEEE.
- [62] Bruce, I., Sekyere, M., Darko, E., Odion, E., Bhatheja, K., and Chen, D. (2023). Small Area, High Accuracy Sub-Radix Resistive Current Mode Digital-To-Analog Converter with Novel Calibration Algorithm. 2023 IEEE 66th International Midwest Symposium on Circuits and Systems (MWSCAS), Tempe, AZ, USA, pp. 477-481, IEEE.
- [63] Kalakonda, S. P., Ghassemi, M. (2024). Nanodielectric fluids for power transformer cooling and insulation: A review. 2024 IEEE 17th Dallas Circuits and Systems Conference (DCAS), Richardson, TX, USA, pp. 1-6, IEEE.
- [64] Zhang, Y., Shrestha, R., Varghese, E.R., Bass, J., Lee, T. L., Chan, J.C., Deng, X., Kaneda, Y., Willomitzer, F., and Takashima, Y. (2024). Digital Phase Conjugation by Texas Instruments Phase Light Modulator for Near-to-Eye Display. SPIE Photonics West, San Francisco, CA, USA, Proc. SPIE 12900, SPIE.
- [65] Nero, G., Chan, J., Deng, X., Lee, T., Pei, Y., and Takashima, Y. (2024). Two-dimensional solid-state diffractive beam steering by digital micromirror devices. SPIE ARVRMR, San Francisco, CA, USA, Proc. SPIE 12900, SPIE.
- [66] Pei, Y., Nero, G., Zhang, T., Chan, J., Deng, X., Lee, T., Liu, P., and Takashima, Y. (2024). Two-dimensional multi-domain field-of-view expansion by MEMS SLM for near to eye display. SPIE ARVRMR, San Francisco, CA, USA, Proc. SPIE 12913, SPIE.
- [67] Zhang, T., Kaneda, Y., and Takashima, Y. (2024). FOV expansion by polychromatic light source for holographic Near-to-Eye Display. SPIE ARVRMR, San Francisco, CA, USA, Proc. SPIE 12913, SPIE.
- [68] Lee, T., Liu, P., Pei, Y., and Takashima, Y. (2023). Multi-Domain Multiplexing for Large Field of View in Near-to-Eye Displays. SPIE Optical Engineering + Applications, San Diego, CA, USA, Proc. SPIE PC12684, SPIE.

# **Journal Publications**

- [1] Bhatheja, K., Chaganti, S., Leisinger, J., Darko, E., Bruce, I., and Chen, D. (2024). A BIST Approach to Approximate Co-Testing of Embedded Data Converters. *IEEE Design & Test*, vol. 41, no. 3, pp. 21-28.
- [2] Bruce, I., Farayola, P., Chaganti, S., Sheikh, A., Ravi, S., and Chen, D. (2023). A Weighted-Bin Difference Method for Issue Site Identification in Analog and Mixed-Signal Multi-Site Testing. *Journal of Electronic Testing*, vol. 39, pp. 57-69.

- [3] Chakraborty, S., and Kulkarni, J. (2024). Comprehensive TCAD-based Retention Study of Thyristor RAM (TRAM) for Low-Power and High-Speed Cryogenic Memory Applications. *IEEE Transactions on Electron Devices*, vol. 71, no. 2, pp. 1031-1039.
- [4] Chen, C., Liu, J., Lee, H. (2023). Dual-path hybrid Dickson converter for high-ratio conversions in point-of-load applications. *IEEE Transactions on Industry Applications*, vol. 59, no. 6, pp. 6914-6925.
- [5] Farayola, P. O., Oko-Odion, E., Chaganti, S. K., Sheikh, A., Ravi, S., and Chen, D. (2023). Site-to-Site Variation in Analog Multisite Testing: A Survey on Its Detection and Correction. *IEEE Design & Test*, vol. 40, no. 5, pp. 52-61.
- [6] Farhadi, M., Vankayalapati, B., and Akin, B. (2023). Reliability Evaluation of SiC MOSFETs Under Realistic Power Cycling Tests. *IEEE Power Electronics Magazine*, vol. 10, no. 2, pp. 49-56.
- [7] Farhadi, M., Vankayalapati, B., Sajadi, R., and Akin, B. (2023). AC Power Cycling Test Setup and Condition Monitoring Tools for SiC-Based Traction Inverters. *IEEE Transactions on Vehicular Technology*, vol. 72, no. 10, pp. 12728-12743.
- [8] Gupta, A., Odelberg, T., and Wentzloff, D. (2023). Low-Power Heterodyne Receiver Architectures: Review, Theory, and Examples. *IEEE Open Journal of the Solid-State Circuits Society*, vol. 3, pp. 225-238.
- [9] Helwa, S., Van Marter, J. P., Shoudha, S. N., Ben-Shachar, M., Alpert, Y., Dabak, A. G., Torlak, M., and Al-Dhahir, N. (2023). Bridging the Performance Gap Between Two-Way and One-Way CSI-Based 5 GHz WiFi Ranging. *IEEE Access*, vol. 11, pp. 70023-70039.
- [10] Huang, W., Wang, Y., Liu, S., Chiong, C., and Wang, H. (2023). A 30–50-GHz Ultralow-Power Low-Noise Amplifier With Second-Stage Current-Reuse for Radio Astronomical Receivers in 90-nm CMOS Process. *IEEE Microwave and Wireless Technology Letters*, vol. 33, no. 5, pp. 555-558.
- [11] Karmokar, N., Sharma, A., Poojary, J., Madhusudan, M., Harjani, R., and Sapatnekar, S. (2023). Constructive Placement and Routing for Common-Centroid Capacitor Arrays in Binary-Weighted and Split DACs. *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 42, no. 9, pp. 2782-2795.
- [12] Kim, S., and Galton, I. (2023). Adaptive Cancellation of Inter-Symbol Interference in High-Speed Continuous-Time DACs. *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 70, no. 11, pp. 4309-4322.
- [13] Li, J., Ahsanullah, D., Gao, Z., and Rohrer, R. (2023). Circuit Theory of Time Domain Adjoint Sensitivity. *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 42, no. 7, pp. 2303-2316.
- [14] Mehta, A., Shichijo, S., Joh, J., Suh, C., and Kim, M.J. (2023). Degradation and Failure Mechanism of p-GaN Gate E-Mode GaN HEMTs. *The Electrochemical Society (ECS) Transactions*, vol. 112, no. 2, pp. 9-20.
- [15] Mehta, A., Zhu, X., Shichijo, S., and Kim, M.J. (2024). In-situ S/TEM DC biasing of p-GaN/AlGaN/GaN heterostructure for E-mode GaN HEMT devices. *Engineering Research Express*, vol. 6, no. 1, pp. 015324.
- [16] Rosenbaum, E., Huang, S., Drallmeier, M., and Zhou, Y. (2024). Compact models for simulation of on-chip ESD protection networks. *IEEE Transactions on Electron Devices*, vol. 71, no. 1, pp. 151-166.
- [17] Saikiran, M., Sekyere, M., Ganji, M., Yang, R., and Chen, D. (2023). Low-cost defect simulation framework for analog and mixed signal (AMS) circuits with enhanced time-efficiency. *Analog Integrated Circuits and Signal Processing*, vol. 117, pp. 73-94.

- [18] Sajadi, R., Xu, C., Vankayalapati, B. T., Farhadi, M., and Akin, B. (2024). Reliability Evaluation of Isolated LDMOS Devices and Condition Monitoring Solution. *IEEE Transactions on Components, Packaging and Manufacturing Technology*, vol. 14, no. 5, pp. 841-850.
- [19] Shoudha, S. N., Helwa, S., Van Marter, J. P., Torlak, M., and Al-Dhahir, N. (2023). WiFi 5GHz CSI-Based Single-AP Localization With Centimeter-Level Median Error. *IEEE Access*, vol. 11, pp. 112470-112482.
- [20] Taewook, K., Lee, S., Song, S., Haghigat, M., and Flynn, M. (2023). A Multimode 157 μW 4-Channel 80 dBA-SNDR Speech Recognition Frontend With Direction-of-Arrival Correction Adaptive Beamformer. *IEEE Journal of Solid-State Circuits*, vol. 59, no. 6, pp. 1794-1808.
- [21] Van Marter, J. P., Ben-Shachar, M., Alpert, Y., Dabak, A. G., Al-Dhahir, N., and Torlak, M. (2024). A Multichannel Approach and Testbed for Centimeter-Level WiFi Ranging. *IEEE Journal of Indoor and Seamless Positioning and Navigation*, vol. 2, pp. 76-91.
- [22] Wang, Z., Li, M., Kim, S., Desai, N., Krishnamurthy, R., Zhang, X., and Seok, M. (2024). A Ten-Level Series-Capacitor 24-to-1-V DC–DC Converter With Fast In Situ Efficiency Tracking, Power-FET Code Roaming, and Switch Node Power Rail. *IEEE Journal of Solid-State Circuits*, vol. 59, no. 7, pp. 2029-2041.
- [23] Afshar, M., Li, C., and Akin, B. (2024). Real-Time Current-Based Distributed Bearing Faults Detection in Small Cooling Fan Motors. *IEEE Transactions on Industry Applications*, vol. 60, no. 2, pp. 3188-3199.
- [24] Rajabioun, R., Afshar, M., Atan, Ö., Mete, M., and Akin, B. (2024). Classification of Distributed Bearing Faults Using a Novel Sensory Board and Deep Learning Networks With Hybrid Inputs. *IEEE Transactions on Energy Conversion*, vol. 39, no. 2, pp. 963-973.
- [25] Rajabioun, R., Afshar, M., Mete, M., Atan, Ö., and Akin, B. (2024). Distributed Bearing Fault Classification of Induction Motors Using 2-D Deep Learning Model. *IEEE Journal of Emerging and Selected Topics in Industrial Electronics*, vol. 5, no. 1, pp. 115-125.

# **Invited Presentations**

- [1] Takashima, Y. (2023, June 26-29). Solid-state lidar and all-day-wearable AR display with MEMS SLM [Invited talk]. SPIE Digital Optical Technologies 2023, Munich, Germany.
- [2] Takashima, Y. (2023, November 19-22). Beam and image steering towards solid-state lidar and all-day wearable AR near-to-eye display [Invited talk]. ISOM 2023, Takamatsu, Japan.
- [3] Takashima, Y. (2023, December 6-8). Image Steering by MEMS SLM for Near-to-eye AR Display Engines [Invited talk]. The 30th International Display Workshops (IDW 2023), Niigata, Japan.
- [4] Takashima, Y. (2023, December 11-14). Beam and image steering for lidar and AR applications: Just-in-time delivery of photons to where and when they are needed. 2nd International Conference Advances in 3OM, Timisoara, Romania.

# Contact TxACE

To become a TxACE partner, please contact: Kenneth K. O, Director 972-883-5556

To discuss our core facilities in Dallas and how to obtain access to them and to receive future TxACE requests for proposals, please contact: Lucien Finley lucien.finley@utdallas.edu 972-883-5553

TxACE is based at The University of Texas at Dallas. We are located in the Engineering and Computer Science North building, ECSN 3.302.

> Texas Analog Center of Excellence
> The University of Texas at Dallas, EC37
> 800 West Campbell Road
> Richardson, Texas 75080
>
> centers.utdallas.edu/txace




