Electronic Components for Optical Data Communication up to 50 Gbit/s

Dissertation approved by the Faculty of Computer Science, Electrical Engineering and Information Technology
of the Universität Stuttgart
in fulfilment of the requirements for the degree of Doktor-Ingenieur (Dr.-Ing.)

Submitted by

Damir Ferenci
from Stuttgart

Main examiner: Prof. Dr.-Ing. Manfred Berroth
Co-examiner: Prof. Dr.-Ing. Joachim Burghartz


Institut für Elektrische und Optische Nachrichtentechnik
Universität Stuttgart

2013
Contents

Abbreviations and Symbols

Zusammenfassung

1 Introduction
   1.1 Fibre Optical Data Transmission Systems
   1.2 CMOS versus Bipolar Technologies
   1.3 State-of-the-Art High-Speed ADCs
   1.4 Objective of This Thesis
   1.5 Outline of This Work

2 Analogue-to-Digital Converter Fundamentals
   2.1 A/D-Conversion
      2.1.1 Sampling
      2.1.2 Quantisation
      2.1.3 Sampling and Quantisation in a Flash ADC
   2.2 Static Measures for the ADC Characterisation
      2.2.1 Differential Nonlinearity
      2.2.2 Integral Nonlinearity
   2.3 Dynamic Measures for the ADC Characterisation
      2.3.1 Signal-to-Noise and Distortion Ratio and Effective Resolution
      2.3.2 Spurious Free Dynamic Range
      2.3.3 Total Harmonic Distortion
   2.4 Properties of Time-Interleaved ADCs

3 CMOS Integrated Circuit Design
   3.1 The Field-Effect Transistor
      3.1.1 Large Signal Model of the MOS Transistor
      3.1.2 Small Signal Model of the MOS Transistor
   3.2 Introduction to CMOS Logic
      3.2.1 The CMOS Inverter
   3.3 Introduction to Current Mode Logic
      3.3.1 The CML Inverter
      3.3.2 Logic Gates in CML
      3.3.3 A Flip-Flop in CML
      3.3.4 Current Source
   3.4 Analogue Circuits for High-Speed CMOS A/D-Converters
      3.4.1 The Linear Amplifier
      3.4.2 The Track and Hold Circuit
      3.4.3 The Differential Comparator

4 The ADC Architecture
   4.1 Sample and Hold Circuit
   4.2 Quantisation Circuit
   4.3 Thermometer-to-Binary Encoder
   4.4 Synchronization Circuit
   4.5 Clock Divider Circuit
   4.6 Bootstrap Circuit
   4.7 Layout Implementation

5 The Measurement Environment
   5.1 The FPGA-based Measurement System
      5.1.1 The VHDL Design
      5.1.2 Data Synchronisation and Evaluation
   5.2 FPGA Versus Real-Time Scope Measurements
   5.3 The Measurement Setup
      5.3.1 ADC Board

6 Simulation and Measurement Results
   6.1 Simulation Results
      6.1.1 The Sample and Hold Circuit
      6.1.2 Summary of the Simulation Results
   6.2 Measurement Results
      6.2.1 DC Characteristics
      6.2.2 Dynamic Characteristics
   6.3 Analysis of the Results
   6.4 Conclusion

7 Hybrid ADC Feasibility Evaluation
   7.1 Indium Phosphide Technology
   7.2 System Architecture
   7.3 Simulation and Measurement Results
      7.3.1 S-Parameter Results
      7.3.2 THD and SFDR Results
      7.3.3 SNDR Results
   7.4 Comparison with the State of the Art
   7.5 DeMUX Integration with TIA
      7.5.1 Block Diagram
      7.5.2 Implementation
      7.5.3 Measurement Results
   7.6 Conclusion

8 Summary and Outlook
   8.1 Summary
      8.1.1 Measurement Results
      8.1.2 Conclusion
   8.2 Future Work

A Clipping in an Ideal ADC

Personal Publications

Bibliography

Curriculum Vitae

Acknowledgment

## Abbreviations and Symbols

### Abbreviations

<table>
<thead>
<tr>
<th>Abbreviation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>AC</td>
<td>Alternating Current</td>
</tr>
<tr>
<td>ADC</td>
<td>Analogue-to-Digital Converter</td>
</tr>
<tr>
<td>B</td>
<td>Bulk</td>
</tr>
<tr>
<td>BiCMOS</td>
<td>Bipolar-CMOS</td>
</tr>
<tr>
<td>C</td>
<td>Capacitance</td>
</tr>
<tr>
<td>CDR</td>
<td>Clock-Data Recovery</td>
</tr>
<tr>
<td>CI</td>
<td>CMOS Inverter</td>
</tr>
<tr>
<td>clk</td>
<td>Clock</td>
</tr>
<tr>
<td>CD</td>
<td>Chromatic Dispersion</td>
</tr>
<tr>
<td>CKD</td>
<td>Clock Driver</td>
</tr>
<tr>
<td>CLM</td>
<td>Channel Length Modulation</td>
</tr>
<tr>
<td>CML</td>
<td>Current Mode Logic</td>
</tr>
<tr>
<td>CMOS</td>
<td>Complementary Metal Oxide Semiconductor</td>
</tr>
<tr>
<td>CMRR</td>
<td>Common Mode Rejection Ratio</td>
</tr>
<tr>
<td>D</td>
<td>Drain</td>
</tr>
<tr>
<td>dB</td>
<td>Decibel</td>
</tr>
<tr>
<td>dBc</td>
<td>Decibels relative to the carrier</td>
</tr>
<tr>
<td>DC</td>
<td>Direct Current</td>
</tr>
<tr>
<td>DCA</td>
<td>Differential Cascode Amplifier</td>
</tr>
<tr>
<td>DCF</td>
<td>Dispersion-Compensated Fibre</td>
</tr>
<tr>
<td>DeMUX</td>
<td>De-Multiplexer</td>
</tr>
<tr>
<td>DFT</td>
<td>Discrete Fourier Transform</td>
</tr>
<tr>
<td>DHBT</td>
<td>Double Heterojunction Bipolar Transistor</td>
</tr>
<tr>
<td>DNL</td>
<td>Differential Nonlinearity</td>
</tr>
<tr>
<td>DRP</td>
<td>Dynamic Reconfiguration Port</td>
</tr>
<tr>
<td>DSP</td>
<td>Digital Signal Processor</td>
</tr>
<tr>
<td>DUT</td>
<td>Device Under Test</td>
</tr>
<tr>
<td>EF</td>
<td>Emitter Follower</td>
</tr>
<tr>
<td>ENOB</td>
<td>Effective Number of Bits</td>
</tr>
<tr>
<td>FET</td>
<td>Field Effect Transistor</td>
</tr>
<tr>
<td>FIFO</td>
<td>First In First Out</td>
</tr>
<tr>
<td>FPGA</td>
<td>Field Programmable Gate Array</td>
</tr>
<tr>
<td>FSM</td>
<td>Finite State Machine</td>
</tr>
<tr>
<td>G</td>
<td>Gate</td>
</tr>
<tr>
<td>GHz</td>
<td>Gigahertz</td>
</tr>
<tr>
<td>GIMP</td>
<td>Gigabit INT Measurement System</td>
</tr>
<tr>
<td>GS</td>
<td>Giga Sample</td>
</tr>
<tr>
<td>INL</td>
<td>Integral Nonlinearity</td>
</tr>
<tr>
<td>InP</td>
<td>Indium Phosphide</td>
</tr>
<tr>
<td>INT</td>
<td>Institut für Elektrische und Optische Nachrichtentechnik</td>
</tr>
<tr>
<td>IP</td>
<td>Internet Protocol</td>
</tr>
<tr>
<td>ISI</td>
<td>Inter-Symbol Interference</td>
</tr>
<tr>
<td>LSB</td>
<td>Least Significant Bit</td>
</tr>
<tr>
<td>MIM</td>
<td>Metal-Insulator-Metal</td>
</tr>
<tr>
<td>MOS</td>
<td>Metal Oxide Semiconductor</td>
</tr>
<tr>
<td>MUX</td>
<td>Multiplexer</td>
</tr>
<tr>
<td>NiCr</td>
<td>Nickel Chrome</td>
</tr>
<tr>
<td>OOK</td>
<td>On-Off Keying</td>
</tr>
<tr>
<td>PC</td>
<td>Personal Computer</td>
</tr>
<tr>
<td>PCB</td>
<td>Printed Circuit Board</td>
</tr>
<tr>
<td>PLL</td>
<td>Phase-Locked Loop</td>
</tr>
<tr>
<td>PMD</td>
<td>Polarization Mode Dispersion</td>
</tr>
<tr>
<td>PRBS</td>
<td>Pseudo Random Bit Sequence</td>
</tr>
<tr>
<td>QPSK</td>
<td>Quadrature Phase Shift Keying</td>
</tr>
<tr>
<td>R</td>
<td>Resistor</td>
</tr>
<tr>
<td>RAM</td>
<td>Random Access Memory</td>
</tr>
<tr>
<td>RF</td>
<td>Radio Frequency</td>
</tr>
<tr>
<td>RMS</td>
<td>Root Mean Square</td>
</tr>
<tr>
<td>S</td>
<td>Source</td>
</tr>
<tr>
<td>SAR</td>
<td>Successive Approximation Register</td>
</tr>
<tr>
<td>SFDR</td>
<td>Spurious Free Dynamic Range</td>
</tr>
<tr>
<td>SiGe</td>
<td>Silicon Germanium</td>
</tr>
<tr>
<td>SiNx</td>
<td>Silicon Nitride</td>
</tr>
<tr>
<td>SNDR</td>
<td>Signal-to-Noise and Distortion Ratio</td>
</tr>
<tr>
<td>SNR</td>
<td>Signal-to-Noise Ratio</td>
</tr>
<tr>
<td>SoC</td>
<td>System on a Chip</td>
</tr>
<tr>
<td>S&amp;H</td>
<td>Sample and Hold</td>
</tr>
<tr>
<td>T</td>
<td>Transistor</td>
</tr>
<tr>
<td>THD</td>
<td>Total Harmonic Distortion</td>
</tr>
<tr>
<td>TIA</td>
<td>Transimpedance Amplifier</td>
</tr>
<tr>
<td>TIADC</td>
<td>Time-Interleaved ADC</td>
</tr>
<tr>
<td>UIMP</td>
<td>Universal INT Measurement Protocol</td>
</tr>
<tr>
<td>VHDL</td>
<td>Very High Speed Integrated Circuit Hardware Description Language</td>
</tr>
<tr>
<td>WDM</td>
<td>Wavelength Division Multiplexing</td>
</tr>
</tbody>
</table>
### Symbols

<table>
<thead>
<tr>
<th>Symbol</th>
<th>Description</th>
<th>Unit</th>
</tr>
</thead>
<tbody>
<tr>
<td>$A_c$</td>
<td>common mode amplification factor</td>
<td>-</td>
</tr>
<tr>
<td>$A_d$</td>
<td>differential mode amplification factor</td>
<td>-</td>
</tr>
<tr>
<td>$A_{VT}$</td>
<td>technology constant for the offset voltage</td>
<td>Vm</td>
</tr>
<tr>
<td>$C$</td>
<td>capacitance</td>
<td>F</td>
</tr>
<tr>
<td>$C_{DB}$</td>
<td>drain-to-bulk capacitance</td>
<td>F</td>
</tr>
<tr>
<td>$C_{GB}$</td>
<td>gate-to-bulk capacitance</td>
<td>F</td>
</tr>
<tr>
<td>$C_{GD}$</td>
<td>gate-to-drain capacitance</td>
<td>F</td>
</tr>
<tr>
<td>$C_{Gi}$</td>
<td>intrinsic gate capacitance</td>
<td>F</td>
</tr>
<tr>
<td>$C_{GS}$</td>
<td>gate-to-source capacitance</td>
<td>F</td>
</tr>
<tr>
<td>$C_h$</td>
<td>hold capacitance</td>
<td>F</td>
</tr>
<tr>
<td>$C_L$</td>
<td>load capacitance</td>
<td>F</td>
</tr>
<tr>
<td>$C_o$</td>
<td>overlap capacitance</td>
<td>F</td>
</tr>
<tr>
<td>$C_{SB}$</td>
<td>source-to-bulk capacitance</td>
<td>F</td>
</tr>
<tr>
<td>$C'_{ox}$</td>
<td>specific capacitance per unit area</td>
<td>F/m²</td>
</tr>
<tr>
<td>$D$</td>
<td>number of the digital code</td>
<td>-</td>
</tr>
<tr>
<td>$D_l$</td>
<td>DFT length</td>
<td>-</td>
</tr>
<tr>
<td>$f$</td>
<td>frequency variable</td>
<td>Hz</td>
</tr>
<tr>
<td>$f_c$</td>
<td>corner frequency</td>
<td>Hz</td>
</tr>
<tr>
<td>$f_g$</td>
<td>maximum frequency of a band-limited signal</td>
<td>Hz</td>
</tr>
<tr>
<td>$f_N$</td>
<td>Nyquist frequency</td>
<td>Hz</td>
</tr>
<tr>
<td>$f_s$</td>
<td>sampling frequency</td>
<td>Hz</td>
</tr>
<tr>
<td>$f_{sq}$</td>
<td>signal frequency</td>
<td>Hz</td>
</tr>
<tr>
<td>$f_T$</td>
<td>transit frequency</td>
<td>Hz</td>
</tr>
<tr>
<td>$g_m$</td>
<td>small signal parameter of the transconductance</td>
<td>1/Ohm</td>
</tr>
<tr>
<td>$g_l$</td>
<td>gain of channel $l$</td>
<td>-</td>
</tr>
<tr>
<td>$I_b$</td>
<td>bias current</td>
<td>A</td>
</tr>
<tr>
<td>$I_{BS}$</td>
<td>bulk-source current</td>
<td>A</td>
</tr>
<tr>
<td>$I_{BD}$</td>
<td>bulk-drain current</td>
<td>A</td>
</tr>
<tr>
<td>$I_{CML}$</td>
<td>bias current of a CML circuit</td>
<td>A</td>
</tr>
<tr>
<td>$I_{DS}$</td>
<td>drain-source current</td>
<td>A</td>
</tr>
<tr>
<td>$I_{in}$</td>
<td>input current</td>
<td>A</td>
</tr>
<tr>
<td>$I_x$</td>
<td>input port $x$</td>
<td></td>
</tr>
<tr>
<td>$I_o$</td>
<td>current for offset compensation</td>
<td></td>
</tr>
<tr>
<td>$I_{SS}$</td>
<td>tail current of a CML circuit</td>
<td>A</td>
</tr>
<tr>
<td>$j$</td>
<td>imaginary unit</td>
<td></td>
</tr>
<tr>
<td>$J$</td>
<td>number of receiver channels</td>
<td></td>
</tr>
<tr>
<td>$k$</td>
<td>discrete time variable</td>
<td></td>
</tr>
<tr>
<td>$K$</td>
<td>number of transmission channels</td>
<td></td>
</tr>
<tr>
<td>$l$</td>
<td>channel variable</td>
<td></td>
</tr>
<tr>
<td>$L$</td>
<td>transistor channel length</td>
<td>m</td>
</tr>
<tr>
<td>$L_{eff}$</td>
<td>effective channel length</td>
<td>m</td>
</tr>
<tr>
<td>$M$</td>
<td>number of interleaved channels</td>
<td></td>
</tr>
<tr>
<td>$n$</td>
<td>resolution</td>
<td>bit</td>
</tr>
<tr>
<td>$N_q$</td>
<td>number of quantization steps</td>
<td></td>
</tr>
<tr>
<td>$N_{rms}$</td>
<td>RMS noise</td>
<td>V</td>
</tr>
<tr>
<td>$n_{ed}$</td>
<td>reduction of the resolution due to clock jitter</td>
<td>bit</td>
</tr>
<tr>
<td>$o_l$</td>
<td>offset error</td>
<td>V</td>
</tr>
<tr>
<td>$Q$</td>
<td>quantization error</td>
<td>V</td>
</tr>
<tr>
<td>$Q_x$</td>
<td>output port $x$</td>
<td></td>
</tr>
<tr>
<td>$r_{ds}$</td>
<td>small signal drain-source resistance</td>
<td>Ohm</td>
</tr>
<tr>
<td>$r_{on}$</td>
<td>small signal on resistance</td>
<td>Ohm</td>
</tr>
<tr>
<td>$r_{out}$</td>
<td>small signal output resistance</td>
<td>Ohm</td>
</tr>
<tr>
<td>$R_D$</td>
<td>drain resistance</td>
<td>Ohm</td>
</tr>
<tr>
<td>$R_f$</td>
<td>feedback resistor</td>
<td>Ohm</td>
</tr>
<tr>
<td>$R_G$</td>
<td>gate resistance</td>
<td>Ohm</td>
</tr>
<tr>
<td>$R_L$</td>
<td>load resistance</td>
<td>Ohm</td>
</tr>
<tr>
<td>$R_S$</td>
<td>source resistance</td>
<td>Ohm</td>
</tr>
<tr>
<td>$R_x$</td>
<td>dummy resistance</td>
<td>Ohm</td>
</tr>
<tr>
<td>$S$</td>
<td>signal bin in a DFT</td>
<td></td>
</tr>
<tr>
<td>$t$</td>
<td>time variable</td>
<td>s</td>
</tr>
<tr>
<td>$t_t$</td>
<td>time span the transfer gate is transparent</td>
<td>s</td>
</tr>
<tr>
<td>$t_{ts}$</td>
<td>settling time of the second track and hold</td>
<td>s</td>
</tr>
<tr>
<td>$t_{LH}$</td>
<td>rise time</td>
<td>s</td>
</tr>
<tr>
<td>$t_{HL}$</td>
<td>fall time</td>
<td>s</td>
</tr>
<tr>
<td>$T_S$</td>
<td>sampling period ($1/f_s$)</td>
<td>s</td>
</tr>
<tr>
<td>$v_{GS}$</td>
<td>small signal AC gate-source voltage</td>
<td>V</td>
</tr>
<tr>
<td>$V_D$</td>
<td>voltage corresponding to the digital value $D$</td>
<td>V</td>
</tr>
<tr>
<td>$V_{DD}$</td>
<td>positive supply voltage</td>
<td>V</td>
</tr>
<tr>
<td>$V_{Di}$</td>
<td>ideal mean input voltage corresponding to the quantization interval $D$</td>
<td>V</td>
</tr>
<tr>
<td>$V_{DM}$</td>
<td>estimated mean input voltage corresponding to the quantization interval $D$</td>
<td>V</td>
</tr>
<tr>
<td>$V_{DS}$</td>
<td>drain-source voltage</td>
<td>V</td>
</tr>
<tr>
<td>$V_{DSSat}$</td>
<td>saturated drain-source voltage</td>
<td>V</td>
</tr>
<tr>
<td>$V_{GS}$</td>
<td>gate-source voltage</td>
<td>V</td>
</tr>
<tr>
<td>$V_i$</td>
<td>input voltage $i$</td>
<td>V</td>
</tr>
<tr>
<td>$V_{in}$</td>
<td>differential input voltage</td>
<td>V</td>
</tr>
<tr>
<td>$V_{oi}$</td>
<td>output voltage $i$</td>
<td>V</td>
</tr>
<tr>
<td>$V_{os}$</td>
<td>offset voltage</td>
<td>V</td>
</tr>
<tr>
<td>$V_{out}$</td>
<td>differential output voltage</td>
<td>V</td>
</tr>
<tr>
<td>$V_{pp}$</td>
<td>peak-to-peak voltage</td>
<td>V</td>
</tr>
<tr>
<td>$Q$</td>
<td>quantization voltage interval</td>
<td>V</td>
</tr>
<tr>
<td>$V_{rms}$</td>
<td>RMS input voltage of an ADC</td>
<td>V</td>
</tr>
<tr>
<td>$V_{SS}$</td>
<td>negative supply voltage</td>
<td>V</td>
</tr>
<tr>
<td>$V_{th}$</td>
<td>threshold voltage of a transistor</td>
<td>V</td>
</tr>
<tr>
<td>$W$</td>
<td>transistor channel width</td>
<td>m</td>
</tr>
<tr>
<td>$W_{CML}$</td>
<td>current source transistor width in a CML circuit</td>
<td>m</td>
</tr>
<tr>
<td>$W_{Ti}$</td>
<td>width of transistor $T_i$</td>
<td>m</td>
</tr>
<tr>
<td>$\beta$</td>
<td>current amplification of a transistor</td>
<td></td>
</tr>
<tr>
<td>$\beta_n$</td>
<td>current amplification of the n-channel MOSFET</td>
<td></td>
</tr>
<tr>
<td>$\beta_p$</td>
<td>current amplification of the p-channel MOSFET</td>
<td></td>
</tr>
<tr>
<td>$\Delta t$</td>
<td>RMS clock jitter</td>
<td>s</td>
</tr>
<tr>
<td>$\Delta t_l$</td>
<td>timing error</td>
<td>s</td>
</tr>
<tr>
<td>$\lambda$</td>
<td>channel length modulation parameter</td>
<td>1/V</td>
</tr>
<tr>
<td>$\mu$</td>
<td>electron mobility</td>
<td>cm²/Vs</td>
</tr>
<tr>
<td>$\sigma_a$</td>
<td>variance of the amplifier circuit offset voltage</td>
<td></td>
</tr>
<tr>
<td>$\sigma_c$</td>
<td>variance of the comparator circuit offset voltage</td>
<td></td>
</tr>
</tbody>
</table>
<table>
<thead>
<tr>
<th>Symbol</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>$\sigma_{ca}$</td>
<td>variance of the offset voltage of the complete comparator circuit in the ADC</td>
</tr>
<tr>
<td>$\sigma_i$</td>
<td>variance of the offset voltage of the differential amplifier used in a comparator circuit</td>
</tr>
<tr>
<td>$\sigma_{VT}$</td>
<td>variance of the transistor threshold voltage</td>
</tr>
<tr>
<td>$\tau$</td>
<td>RC time constant</td>
</tr>
<tr>
<td>$\Omega$</td>
<td>angular frequency variable</td>
</tr>
<tr>
<td>$\Omega_0$</td>
<td>input signal frequency</td>
</tr>
<tr>
<td>$\Omega_S$</td>
<td>sampling frequency</td>
</tr>
</tbody>
</table>
Zusammenfassung (Summary)


In fibre optic networks, on-off keying (OOK) is used as the modulation format for data transmission up to 10 Gbit/s. It is a simple form of amplitude modulation: to transmit a binary data sequence, the light is switched on for a logic '1' and switched off for a logic '0'. With this method, the required spectral bandwidth is proportional to the data rate, so doubling the data rate also doubles the bandwidth requirement. A higher bandwidth, however, also means that dispersion distorts the signal more strongly, because the wider bandwidth covers a larger wavelength range and dispersion is wavelength-dependent.


In this thesis, a prototype of a suitable ADC is developed. The converter is realised in a 65 nm CMOS technology for low-power (LP) applications. The ADC consists of four 3 bit flash ADCs operating in parallel. These are driven by phase-shifted clocks, so that the overall converter achieves a sampling rate four times higher than that of a single converter. This type of converter is also referred to as a time-interleaved ADC (TIADC). The converter core requires a chip area of only 0.16 mm². Its area is therefore negligible compared to the equaliser, which requires a chip area of 4 mm². In the measurements, the ADC achieves a maximum sampling rate of 36 GS/s. The effective resolution depends on the sampling rate and reaches up to 2.2 bit. The power consumption of the converter is 3.3 W.

To characterise the converter, an FPGA-based measurement environment was developed in this thesis. It allows the ADC to be characterised up to a sampling rate of 25.6 GS/s. A comparable instrument is a parallel bit error rate tester (parBERT); however, its cost exceeds that of the FPGA solution by a factor of 100.


The demultiplexer developed in this thesis has a sampling rate of 50 GS/s, a total harmonic distortion (THD) below -32 dB over the entire Nyquist range and an attenuation below 4 dB up to an input frequency of 35 GHz. The TIA developed has a transimpedance of 70 dBΩ at a bandwidth of 45 GHz.

1 Introduction

The Internet has changed the communications and media sector in many ways and, in so doing, people’s lives as well. Applications such as file sharing, video-on-demand platforms and social networks allow their users to share pictures and videos. Consequently, the bandwidth required in the networks is increasing, in particular due to the use of high-resolution video streams. According to a study by Cisco [1], one of the world’s largest network equipment suppliers, global Internet traffic increased by a factor of five between 2007 and 2011, and a further threefold growth is expected from 2011 to 2016. By 2016 there will be three Internet-capable devices for each person on the planet, three times as many as in 2011.

To satisfy the increasing demand for bandwidth in the future, the data transmission rates in the present mobile, local, metropolitan and wide area networks must be increased. In wide area networks, data is transferred over a long distance (e.g. between metropolitan areas) using optical fibre links. These networks need upgrading in order to increase the data transmission rate. In this thesis, electronic components, which are necessary to boost the data transmission rate in the existing fibre optical networks, will be presented. These components are a high-speed analogue-to-digital converter (ADC), a demultiplexer (DeMUX) and a transimpedance amplifier (TIA).

The background of this thesis is summarised in this chapter. An introduction to optical data transmission systems is given in the first section. Suitable ADC integration technologies are discussed in the second section, and the third section deals with state-of-the-art high-speed ADCs. The objective and structure of the thesis are the subject of the last two sections of this chapter.
1.1 Fibre Optical Data Transmission Systems

The simplified block diagram of an optical data transmission system using wavelength division multiplexing (WDM) is shown in Figure 1.1. In a WDM system, the band-limited optical signals from multiple transmitters, each of which represents a transmission channel, are combined onto a single fibre by means of an optical multiplexer. Each transmitter operates at a different wavelength $\lambda$, hence there is no interference between the channels.

![Figure 1.1: Optical data transmission system using wavelength division multiplexing.](image)

Depending on the transmission distance and the fibre attenuation, optical amplifiers must be added between the transmitter and the receiver side. On the receiver side, an optical demultiplexer is used to split the signal back into $N$ channels. Subsequently, the optical signal of each channel is converted into an electrical signal in the receivers.

On the transmitter side, the data is modulated and converted from an electrical signal into an optical signal. The modulation method used in optical communication networks with data rates up to 40 Gbit/s is on-off keying (OOK), which is the simplest form of amplitude modulation. In OOK, data is transmitted by switching the light on or off, depending on whether a logic one or a logic zero is transmitted.
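
As a rough illustration of OOK, the following sketch maps each bit to a constant optical power level and recovers the bits with a threshold decision in the middle of each bit period. The function names, the oversampling factor and the power levels are assumptions made for this example only; they are not taken from the thesis.

```python
def ook_modulate(bits, samples_per_bit=8, p_on=1.0, p_off=0.0):
    """Map each bit to a constant optical power level (light on or off)."""
    waveform = []
    for bit in bits:
        level = p_on if bit else p_off
        waveform.extend([level] * samples_per_bit)
    return waveform


def ook_detect(waveform, samples_per_bit=8, threshold=0.5):
    """Sample the middle of each bit period and compare with a threshold."""
    centres = waveform[samples_per_bit // 2::samples_per_bit]
    return [1 if power > threshold else 0 for power in centres]


bits = [1, 0, 1, 1, 0]
waveform = ook_modulate(bits)    # illustrative, noise-free optical power
recovered = ook_detect(waveform)  # equals `bits` on an undistorted channel
```

On an ideal channel the threshold decision is sufficient; the dispersion effects described in this section are what make it unreliable at higher data rates.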

The block diagram of the receiver is shown in Figure 1.2. The opto-electrical conversion is realised by means of the photodiode. The output current of the photodiode is amplified by the transimpedance amplifier (TIA). The electrical signal is then retimed with the recovered data clock in the clock and data recovery unit.

![Block diagram of an electro-optical receiver.](image)

**Figure 1.2:** Block diagram of an electro-optical receiver.

Most of the installed transmitters operate at a data rate of 10 Gbit/s. Increasing the data rate to 40 Gbit/s not only requires faster electronic components; due to physical effects in the fibre, the received signal must also be equalised before it can be demodulated. To build a cost-efficient equaliser circuit, complementary metal oxide semiconductor (CMOS) technology can be used. In [3], a Viterbi equaliser is presented which is able to equalise an optical signal with a data rate of up to 43 Gbit/s.

The basic physical effects which cause distortion in long-range fibre optical networks are chromatic dispersion (CD) and polarisation mode dispersion (PMD). Both types of dispersion widen the transmitted pulses, which, in turn, leads to interference between subsequent bits. This is referred to as inter-symbol interference (ISI).

The chromatic dispersion in a low-loss fibre arises because the propagation time of a transmitted light pulse depends on the wavelength of the light and because each optical signal has a finite spectral width, which, in turn, is determined by the modulation of the laser used in the transmitter.
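
The effect can be estimated to first order with the relation $\Delta\tau \approx |D| \cdot L \cdot \Delta\lambda$, where the spectral width follows from the modulation bandwidth as $\Delta\lambda \approx (\lambda^2 / c) \cdot B$. The sketch below evaluates this estimate. The dispersion coefficient of 17 ps/(nm km), the wavelength of 1550 nm, the 80 km link length and the choice of setting the bandwidth equal to the bit rate are illustrative assumptions, not values from this thesis.

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s


def spectral_width_nm(bandwidth_hz, wavelength_nm=1550.0):
    """Spectral width in nm of a signal with the given modulation bandwidth."""
    wavelength_m = wavelength_nm * 1e-9
    return wavelength_m ** 2 / C_LIGHT * bandwidth_hz * 1e9


def pulse_broadening_ps(bit_rate, length_km, dispersion_ps_nm_km=17.0):
    """First-order pulse broadening, assuming bandwidth equal to the bit rate."""
    return dispersion_ps_nm_km * length_km * spectral_width_nm(bit_rate)


def broadening_in_bit_periods(bit_rate, length_km):
    """Pulse broadening relative to the duration of one bit."""
    bit_period_ps = 1e12 / bit_rate
    return pulse_broadening_ps(bit_rate, length_km) / bit_period_ps


# Assumed 80 km link: roughly 1.1 bit periods at 10 Gbit/s
# and roughly 17 bit periods at 40 Gbit/s.
for rate in (10e9, 40e9):
    print(rate, broadening_in_bit_periods(rate, 80.0))
```

Under these assumptions the broadening measured in bit periods grows with the square of the bit rate, because both the spectral width and the inverse bit period scale with it.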

The chromatic dispersion is time-invariant, which allows the effect to be compensated in the optical domain. For this purpose, so-called dispersion-compensated fibres (DCF) are connected to the low-loss fibres. In the DCF, the wavelength dependence of the propagation time is the inverse of that in the low-loss fibre.
The drawback of this approach is the additional signal attenuation caused by the non-ideal junction between the two fibres and by the fact that the attenuation of a DCF is about twice as high as that of a low-loss fibre. Furthermore, existing 2.5 Gbit/s and 10 Gbit/s systems must be upgraded with additional DCFs, which, in turn, reduces the signal power at the receiver even further.

The PMD is caused by inhomogeneities in the glass fibre, which change the polarisation of a single-polarised optical signal. In other words, both polarisations are present at the receiver side although a single-polarised optical signal is transmitted. This, in turn, leads to ISI because the propagation time of the light also depends on its polarisation.

Unfortunately, the PMD is influenced by environmental factors (temperature and pressure), which makes the PMD a time-variant effect. Therefore, the PMD cannot be compensated for without an adaptive system [5, 6, 7].

A suitable circuit for the compensation of CD and PMD is the Viterbi equaliser as presented in [3]. The block diagram of a possible receiver configuration is shown in Figure 1.3. Compared to the receiver in Figure 1.2, the clock and data recovery unit is replaced by an ADC and the Viterbi equaliser. The analogue output signal of the TIA is digitised by the ADC, and the original digital signal is then estimated by applying the Viterbi algorithm.

![Block diagram of an electro-optical receiver with Viterbi equaliser.](image)

**Figure 1.3:** Block diagram of an electro-optical receiver with Viterbi equaliser.
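
To illustrate how the Viterbi algorithm estimates the transmitted bits from samples distorted by ISI, the following minimal sketch performs maximum-likelihood sequence estimation for a two-tap channel. It is not the equaliser of [3]: the channel model, the tap values and the function names are hypothetical and chosen only to keep the example short.

```python
def channel_output(bits, taps=(1.0, 0.5)):
    """Assumed ISI channel: y[k] = taps[0]*b[k] + taps[1]*b[k-1], idle before start."""
    previous = 0
    samples = []
    for bit in bits:
        samples.append(taps[0] * bit + taps[1] * previous)
        previous = bit
    return samples


def viterbi_equalise(samples, taps=(1.0, 0.5)):
    """Estimate the bit sequence that best explains the received samples."""
    h0, h1 = taps
    # One path metric and one survivor path per state; the state is the
    # previous bit. The line is assumed idle (bit 0) before the first symbol.
    metrics = {0: 0.0, 1: float("inf")}
    paths = {0: [], 1: []}
    for y in samples:
        new_metrics, new_paths = {}, {}
        for bit in (0, 1):  # hypothesised current bit, which is the next state
            candidates = []
            for prev in (0, 1):
                expected = h0 * bit + h1 * prev
                candidates.append((metrics[prev] + (y - expected) ** 2, prev))
            best_metric, best_prev = min(candidates)
            new_metrics[bit] = best_metric
            new_paths[bit] = paths[best_prev] + [bit]
        metrics, paths = new_metrics, new_paths
    best_state = min(metrics, key=metrics.get)
    return paths[best_state]


sent = [1, 0, 1, 1, 0, 0, 1]
received = channel_output(sent)
estimate = viterbi_equalise(received)
```

In a receiver such as the one in Figure 1.3, the samples would be the quantised ADC output rather than ideal floating-point values, which is why the effective resolution of the ADC matters for the equaliser.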

For a 40 Gbit/s data transmission system, the ADC must provide a sampling rate of 40 GS/s and an effective resolution above 2 bit up to the Nyquist frequency of the converter.
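
These requirements can be turned into numbers with the standard relation between the signal-to-noise and distortion ratio and the effective number of bits, $\mathrm{ENOB} = (\mathrm{SNDR} - 1.76\,\mathrm{dB}) / 6.02\,\mathrm{dB}$, which holds for a full-scale sine wave input. The helper names in the sketch below are assumptions for illustration.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits for a given SNDR in dB (full-scale sine input)."""
    return (sndr_db - 1.76) / 6.02


def sndr_for_enob(enob_bits):
    """SNDR in dB needed to reach a given effective resolution."""
    return 6.02 * enob_bits + 1.76


def nyquist_frequency(sampling_rate_hz):
    """Highest input frequency that can be sampled without aliasing."""
    return sampling_rate_hz / 2.0


required_sndr_db = sndr_for_enob(2.0)   # about 13.8 dB for 2 effective bits
f_nyquist_hz = nyquist_frequency(40e9)  # 20 GHz for a 40 GS/s converter
```
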
1.2 CMOS versus Bipolar Technologies

Bipolar microwave integrated circuits exhibit a low integration level and high costs compared to state-of-the-art CMOS technology. Recent developments in CMOS technology have enabled the realisation of circuits in the 100 GHz range. Today, the transit frequency is in the range of 200–300 GHz in state-of-the-art 65 nm CMOS technology, 300–400 GHz in 45 nm technology, and beyond 400 GHz in 32 nm technology [2, 8, 9].

These facts suggest that high-speed ADCs, which were previously only feasible in high-speed bipolar technology, can be integrated in state-of-the-art CMOS technology. Another advantage of the integration in CMOS is, as shown in the following section, the lower power consumption of this technology compared to bipolar technology.

1.3 State-of-the-Art High-Speed ADCs

The general approach to obtain a high-speed ADC is to operate several ADCs in parallel as a single high-speed converter. The technical term for such an ADC is time-interleaved ADC (TIADC). The degree of parallelisation depends on the sampling rate of the single converter. The architecture of the single converter in state-of-the-art high-speed ADCs is either a flash ADC or a successive approximation ADC (SAR ADC).
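
The principle of time interleaving can be sketched as follows: $M$ sub-ADCs each sample at $f_s / M$ with a clock phase shifted by one sampling period $T_S$ from channel to channel, and their outputs are re-ordered into a single stream at the full rate $f_s$. The ideal 3 bit quantiser, the input range of ±1 V, the test signal and all function names are assumptions for illustration; channel mismatch is deliberately ignored.

```python
import math


def quantise(value, n_bits=3, v_min=-1.0, v_max=1.0):
    """Ideal uniform quantiser: returns a code in 0 .. 2**n_bits - 1."""
    levels = 2 ** n_bits
    step = (v_max - v_min) / levels
    code = math.floor((value - v_min) / step)
    return min(max(code, 0), levels - 1)


def time_interleaved_adc(signal, f_s, n_samples, m_channels=4):
    """Sample `signal` with M sub-ADCs, each running at f_s / M."""
    t_s = 1.0 / f_s  # sampling period of the overall converter
    per_channel = n_samples // m_channels
    channels = []
    for l in range(m_channels):
        # Channel l is clocked with a phase offset of l * t_s.
        codes = [quantise(signal((k * m_channels + l) * t_s))
                 for k in range(per_channel)]
        channels.append(codes)
    # Re-order the channel outputs into one stream at the full rate f_s.
    merged = [channels[l][k]
              for k in range(per_channel)
              for l in range(m_channels)]
    return channels, merged


def sine(t):
    """Assumed test signal: 1.1 GHz sine wave slightly below full scale."""
    return 0.9 * math.sin(2 * math.pi * 1.1e9 * t)


channels, merged = time_interleaved_adc(sine, f_s=40e9, n_samples=64)
# A single ideal converter running at the full rate gives the same codes.
reference = [quantise(sine(k * (1.0 / 40e9))) for k in range(64)]
```

With ideal, identical channels the merged stream is indistinguishable from that of one converter at the full rate; in a real TIADC, gain, offset and timing differences between the channels degrade the result.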

Table 1.1 reflects the current state-of-the-art in ADCs in terms of a high sampling rate and a high effective resolution. The converter in [10] is a 4 bit ADC, therefore it is suitable for the same application as the ADC in this work. The chip is fabricated in a silicon-germanium (SiGe) bipolar-CMOS (BiCMOS) technology. The CMOS converters [11] and [12] are intended for optical communication systems with a higher order modulation. These systems require ADCs with a nominal resolution above 6 bit [13, 14].

Table 1.1: State-of-the-art high-speed ADCs.

<table>
<thead>
<tr>
<th></th>
<th>[10]</th>
<th>[11]</th>
<th>[12]</th>
<th>This work</th>
</tr>
</thead>
<tbody>
<tr>
<td>Technology</td>
<td>SiGe BiCMOS</td>
<td>65 nm CMOS</td>
<td>65 nm CMOS</td>
<td>65 nm LP CMOS</td>
</tr>
<tr>
<td>Publication</td>
<td>2010</td>
<td>2010</td>
<td>2010</td>
<td>2009</td>
</tr>
<tr>
<td>Sampling rate</td>
<td>40 GS/s</td>
<td>40 GS/s</td>
<td>56 GS/s</td>
<td>36 GS/s</td>
</tr>
<tr>
<td>Nominal resolution</td>
<td>4 bit</td>
<td>6 bit</td>
<td>8 bit</td>
<td>3 bit</td>
</tr>
<tr>
<td>ENOB at 15 GHz</td>
<td>-</td>
<td>3.7 bit</td>
<td>6 bit</td>
<td>2 bit</td>
</tr>
<tr>
<td>Chip power</td>
<td>5.9 W</td>
<td>1.5 W</td>
<td>-</td>
<td>3.3 W</td>
</tr>
<tr>
<td>Core power</td>
<td>4.5 W</td>
<td>1.5 W</td>
<td>2 W</td>
<td>2.6 W</td>
</tr>
<tr>
<td>Chip size</td>
<td>7.5 mm²</td>
<td>44 mm²</td>
<td>16 mm²</td>
<td>5.1 mm²</td>
</tr>
<tr>
<td>Core size</td>
<td>1.4 mm²</td>
<td>16 mm²</td>
<td>&lt;5 mm²</td>
<td>0.16 mm²</td>
</tr>
</tbody>
</table>

Converter [10] is not fully characterised in terms of effective resolution versus input signal frequency. The converter consists of four ADC channels, which is similar to the converter presented in this work. In the measurement, a single channel is measured and an effective resolution of 3.5 bit is achieved. The measurement is performed at 10 GS/s at an input signal frequency of 1.1 GHz. The power consumption is not specified, therefore it is estimated from the figure of merit. It is about twice the power consumption of the ADC in this work. The core size is about 10 times larger, which is not an issue in itself because the ADC is fabricated in a SiGe BiCMOS technology. For the same reason, however, it is not possible to integrate the ADC and a DSP on a single CMOS chip, which is a drawback of this converter compared to the 3 bit CMOS converter.

Unlike converter [10], CMOS converters can be integrated together with the DSP on a single CMOS chip. Here, the chip area is of importance in terms of fabrication costs. According to Table 1.1, the cores of the converters [11] and [12] occupy areas that are about 100 and 30 times larger than the core area of the 3 bit ADC.

Comparing the effective resolution of the 3 bit ADC with these converters is difficult because their nominal resolution is significantly higher. The converters [11] and [12] are designed for applications where a high resolution is mandatory, for instance, to build a receiver which features a higher order modulation. If an effective resolution above 2 bit is sufficient, as in the application of a Viterbi equaliser, the 3 bit ADC in this work is the better choice due to the comparable power consumption and significantly smaller chip area.

1.4 Objective of This Thesis

The main objective of this thesis is to show the feasibility of a 40 GS/s ADC in state-of-the-art CMOS technology. For this purpose, a prototype of a 3 bit ADC is developed in a 65 nm low-power CMOS technology. The advantages of this solution are, as described in Section 1.2, the lower power consumption and lower system costs. The ADC developed can be integrated with an equaliser for realisation of a receiver in a 40 Gbit/s fibre optical network. The ADC is described in the following chapters, starting with the basic building blocks.

Furthermore, this work discusses the following:

- **Theory**
  The basic properties which characterise an ADC are explained, the measures that characterise the DC and RF performance of an ADC are introduced and the properties of time-interleaved ADCs are summarised. With regard to CMOS technology, the basic properties of the CMOS transistor are introduced. This includes the fundamental transistor equations and the small signal equivalent circuit.

- **Simulation and measurement**
  The simulation and measurement results of the ADC are explained. Although a simulation is only possible for the analogue part of the ADC, the results provide a best-case estimation for the effective resolution. The results of the measurement are discussed and compared to the simulation results.

- **Measurement environment**
  The FPGA-based measurement environment developed for the characterisation of the ADC is explained. The measurement environment allows the ADC to be characterised up to a sampling rate of 25.6 GS/s.

- **Hybrid ADC study**
  An analogue demultiplexer (DeMUX) in indium phosphide (InP) technology has been designed to demonstrate the feasibility of a hybrid ADC. Since the hybrid solution benefits from very fast InP transistors, a sampling rate that is twice the sampling rate of a single CMOS ADC can be achieved. The functionality of the DeMUX is demonstrated with on-wafer measurements.

- **Integration**
  A transimpedance amplifier (TIA) in InP technology, which is suitable for integration with the DeMUX on a single chip, has been developed. The circuit and the measurement results of the TIA chip are presented.

1.5 Outline of This Work

In Chapter 2, the basic ADC theory is summarised. The sampling theorem and the quantisation process, which are the basic operations for an analogue-to-digital conversion, are explained. The measures for the characterisation of the static and dynamic ADC performance are introduced.

The basics of integrated circuit design in CMOS technology are summarised in Chapter 3. First, an introduction to MOS transistors is given. Based on this, the principles of the logic families used are explained. In addition, the digital and ADC-specific analogue circuits, which are used in this work, are introduced.

In Chapter 4, the architecture of the ADC is presented. The building blocks of the ADC are explained, starting with the block diagram of the ADC. The chapter is concluded with a floor plan of the ADC, which contains the previously introduced building blocks.

Chapter 5 deals with the measurement of the ADC. The FPGA-based measurement system, which is used to characterise the ADC, is described in detail, starting with the block diagram of the VHDL design. The limits of the measurement system are shown and the necessity of an additional real-time scope measurement is explained. Based on this, a detailed description of the entire measurement setup is given. The influence of the setup on the measurement results is explained and a compensation method is presented.

The simulation and measurement results are summarised in Chapter 6. The expected bandwidth and resolution, which are estimated from simulations, are shown. Following this, the static and dynamic measurement results, which show the actual performance of a fabricated ADC chip, are presented. Possible optimisations are summarised in a detailed analysis of the results.

In Chapter 7, a concept for a hybrid ADC is presented. After a brief introduction to InP technology, the circuit of an analogue demultiplexer, which allows a hybrid ADC to be built, is explained. The measurement results of the demultiplexer are shown and the expected system performance of the hybrid ADC is discussed.

The conclusion is given in Chapter 8, in which the main characteristics of the developed chips are summarised and an outlook for future work is provided.

2 Analogue-to-Digital Converter Fundamentals

In this chapter the most important A/D-converter properties are summarised. The sampling theorem is introduced and the influence of the sampling clock jitter is analysed. The quantisation process is described and the correlation between the resolution and signal-to-noise ratio is explained. The ADC specific static and dynamic error measures are introduced and the properties of time-interleaved ADCs are described.

2.1 A/D-Conversion

The analogue-to-digital conversion is performed in two steps called sampling and quantisation. Sampling is the conversion of a continuous time signal into a discrete time signal. The conversion of a signal with a continuous range of values into a signal with discrete values is referred to as quantisation. Both steps must be performed in an ADC in either order, or simultaneously in a single conversion cycle.

2.1.1 Sampling

Figure 2.1 shows a continuous time signal $x(t)$ and the corresponding spectrum $X(f)$ on the right-hand side. The signal is band limited with $f_g$ as the highest frequency component.

To calculate the spectrum of the sampled time signal $s_x(t)$, the input signal $x(t)$ is multiplied by the sampling function $s(t)$ given in (2.1), where $\delta(t)$ is the Dirac delta function, $k$ is the index of the discrete sampling instant, and $T_s$ is the sampling period.

$$s(t) = \sum_{k=-\infty}^{\infty} \delta(t - k \cdot T_s) \tag{2.1}$$

The frequency spectrum $S(f)$ of $s(t)$ is given in (2.2), where $f_s = 1/T_s$ is the sampling frequency. The constant scaling factor $f_s$ of the spectrum is omitted here and in the following equations.

$$S(f) = \sum_{k=-\infty}^{\infty} \delta(f - k \cdot f_s) \tag{2.2}$$

The sampled input signal $s_x(t)$ is obtained by multiplying $s(t)$ and $x(t)$ as shown in (2.3).

$$s_x(t) = x(t) \cdot \sum_{k=-\infty}^{\infty} \delta(t - k \cdot T_s) \tag{2.3}$$

To calculate the resulting frequency spectrum of the sampled signal $s_x(t)$, the convolution in (2.4) must be solved.

$$S_x(f) = X(f) \ast \sum_{k=-\infty}^{\infty} \delta(f - k \cdot f_s) \tag{2.4}$$

After solving the convolution, the frequency spectrum in (2.5) is obtained.

$$S_x(f) = \sum_{k=-\infty}^{\infty} X(f - k \cdot f_s) \tag{2.5}$$

As seen in (2.5), a periodic spectrum is obtained. This is also illustrated in Figure 2.1. To avoid an overlap of high frequency components with low frequency components of the spectrum $X(f)$, the bandwidth of the input signal $x(t)$ must be limited to half of the sampling rate. Otherwise, a correct reconstruction of the input signal is impossible. This is expressed by the Nyquist-Shannon sampling theorem: A band limited signal can be fully reconstructed if the highest frequency component is equal to or less than half of the sampling rate [15, 16].

\begin{equation}
\text{Highest Frequency Component} \leq \frac{1}{2} \, \text{Sampling Rate}
\end{equation}

![Graphical illustration of the frequency spectrum of a sampled signal.](image)

**Figure 2.1:** Graphical illustration of the frequency spectrum of a sampled signal.
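The folding of frequency components that follows from the periodic spectrum in (2.5) can be illustrated numerically. The following minimal sketch is not part of the original work; the function name `alias_frequency` and the sampling rate of 40 GS/s are assumptions chosen for illustration only.

```python
def alias_frequency(f_in, f_s):
    """Return the apparent frequency of a tone f_in after sampling at f_s.

    The spectrum of the sampled signal repeats with the period f_s,
    see (2.5), so every tone is folded into the range 0 ... f_s/2.
    """
    f = f_in % f_s            # periodic repetition of the spectrum
    return min(f, f_s - f)    # mirror image around f_s/2


if __name__ == "__main__":
    f_s = 40e9  # assumed sampling rate of 40 GS/s
    for f_in in (5e9, 20e9, 25e9, 45e9):
        f_out = alias_frequency(f_in, f_s)
        print(f"{f_in / 1e9:5.1f} GHz appears at {f_out / 1e9:5.1f} GHz")
```

Tones up to half of the sampling rate are returned unchanged, whereas a tone at 25 GHz appears at 15 GHz and can no longer be distinguished from a genuine 15 GHz component.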

**Sampling Clock Jitter and Effective Resolution**

A measure for the stability of the clock signal is the root mean square (RMS) clock jitter. The reduction \(n_{\text{red}}\) of the effective resolution due to the clock jitter can be calculated with (2.6), where \(n\) represents the nominal resolution in the number of bits, \(f_{sg}\) represents the signal frequency and \(\Delta t\) the RMS clock jitter [17].

\begin{equation}
n_{\text{red}} = \frac{\ln \left(1 + h^2\right)}{2 \ln 2}, \qquad h = 2^n \cdot \pi \cdot \sqrt{6} \cdot f_{sg} \cdot \Delta t \tag{2.6}
\end{equation}

In Figure 2.2, the effective resolution of ideal ADCs with a nominal resolution between 3 bit and 10 bit is plotted versus the RMS clock jitter \(\Delta t\) for an input signal frequency $f_{sg}$ of 20 GHz. The plot illustrates that the clock jitter limits the effective resolution of a high-resolution ADC. For instance, an RMS jitter of 400 fs limits the effective resolution to about 4 bit at an input signal frequency of 20 GHz.

![Effective resolution versus the RMS clock jitter of ideal ADCs with a nominal resolution between 3 bit and 10 bit for an input signal frequency of 20 GHz.](image)

**Figure 2.2:** Effective resolution versus the RMS clock jitter of ideal ADCs with a nominal resolution between 3 bit and 10 bit for an input signal frequency of 20 GHz.
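The numbers quoted for Figure 2.2 can be reproduced with a few lines of code. The sketch below is an illustration only; it assumes that the effective resolution of an otherwise ideal ADC is the nominal resolution minus the reduction given in (2.6), and the function name `jitter_limited_enob` is hypothetical.

```python
import math


def jitter_limited_enob(n_bits, f_sig, jitter_rms):
    """Effective resolution of an otherwise ideal n-bit ADC, using (2.6).

    Assumption: effective resolution = nominal resolution - n_red.
    """
    h = 2 ** n_bits * math.pi * math.sqrt(6.0) * f_sig * jitter_rms
    n_red = math.log(1.0 + h * h) / (2.0 * math.log(2.0))
    return n_bits - n_red


if __name__ == "__main__":
    f_sig = 20e9        # input signal frequency of 20 GHz
    jitter = 400e-15    # RMS clock jitter of 400 fs
    for n_bits in (3, 6, 10):
        enob = jitter_limited_enob(n_bits, f_sig, jitter)
        print(f"{n_bits:2d} bit nominal -> {enob:.2f} bit effective")
```

For a 10 bit converter the result is close to 4 bit, which agrees with the example given for 400 fs and 20 GHz, whereas a 3 bit converter loses only a fraction of a bit.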

### 2.1.2 Quantisation

In a quantiser, an input signal with a continuous range of values is converted to a signal with discrete values. For an illustration of the quantisation process, Figure 2.3 contains a continuous time signal and a quantised version of the signal with eight quantisation levels.

The quantisation process introduces what is referred to as a quantisation error, which is the difference between the continuous signal $V_{in}$ and its quantised representation $V_{out}$. The mathematical expression for the quantisation error $Q$ is given in (2.7).

$$Q = V_{in} - V_{out} \tag{2.7}$$

The quantisation error is an undesirable but unavoidable consequence of the quantisation process. The noise caused by the quantisation error is referred to as quantisation noise.

The ratio between the signal power and the noise power, which contains all sources of noise in an ADC, is called the signal-to-noise ratio (SNR). For an ideal quantiser driven by a full-scale sinusoidal input signal, under the assumption that the quantisation error is equally distributed, the SNR, expressed as a ratio of RMS amplitudes, is calculated with (2.8). The equivalent representation in decibels is given in (2.9) [17, 18].

\[
SNR = 2^n \cdot \sqrt{\frac{3}{2}} \tag{2.8}
\]

\[
SNR_{dB} = 6.02 \cdot n + 1.76 \tag{2.9}
\]
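The relation between (2.8) and (2.9) can be checked numerically. The following sketch is an illustration only and treats (2.8) as a ratio of RMS amplitudes, so that the conversion to decibels uses a factor of 20; the function names are hypothetical.

```python
import math


def ideal_snr_amplitude_ratio(n_bits):
    """SNR of an ideal n-bit quantiser as an amplitude ratio, see (2.8)."""
    return 2 ** n_bits * math.sqrt(1.5)


def ideal_snr_db(n_bits):
    """The same SNR expressed in decibels."""
    return 20.0 * math.log10(ideal_snr_amplitude_ratio(n_bits))


if __name__ == "__main__":
    for n_bits in (3, 6, 8):
        approx = 6.02 * n_bits + 1.76   # rule of thumb from (2.9)
        print(f"{n_bits} bit: {ideal_snr_db(n_bits):.2f} dB, (2.9) gives {approx:.2f} dB")
```

Both expressions agree within a few hundredths of a decibel, because 6.02 and 1.76 are the rounded values of $20 \log_{10} 2$ and $10 \log_{10} 1.5$.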

2.1.3 Sampling and Quantisation in a Flash ADC

The actual implementation of the sampling and quantisation circuit depends on the ADC architecture. The ADC in this work is realised in a flash architecture. Figure 2.4 shows the block diagram of a 3-bit flash ADC.

![Figure 2.4: Architecture of a 3-bit flash ADC.](image)

A sample-and-hold circuit, which is built using a switch, a hold capacitance and a buffer, performs the sampling of the input signal. The switch is turned on at the time instant that the clock signal changes from low to high. Thereby, the input voltage is stored on the hold capacitance. Since the buffer ideally has an infinite input resistance, the voltage sample is kept constant during a clock period, thereby providing the comparators with sufficient time to quantise the input signal. The functionality of a comparator is as follows: The output of a comparator switches from logic low to logic high when the sampled input voltage exceeds the value of the reference voltage applied to the comparator reference input. This voltage is also referred to as the threshold voltage of a comparator.

The reference voltages for the comparators, which are equally spaced between $V_{\text{ref}+}$ and $V_{\text{ref}-}$, are generated by means of a resistive reference ladder. The comparator with the output $V_{c0}$ exhibits the lowest threshold voltage. Accordingly, the comparator with the output $V_{c6}$ exhibits the highest threshold voltage.

The quantisation stage behaves, metaphorically speaking, like a thermometer for the input voltage. If the input voltage corresponds to the lowest input voltage level defined, all outputs of the quantisation stage are logic low. If the input voltage increases gradually, more and more outputs switch to logic high starting with $Q_{c0}$. When the maximum specified input voltage is reached, all the outputs exhibit a logic high.

The actual analogue-to-digital conversion is completed when the output values of the comparators are stored in the decider flip-flops. In order to save power and minimise the output interface of the converter, it is advisable to encode the thermometer code to a binary code by means of a thermometer-to-binary encoder.
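The behaviour of the quantisation stage and the encoder can be sketched as a simple behavioural model. The code below is an illustration under stated assumptions and does not describe the circuit implementation of this work: the reference voltages of $\pm 0.4$ V, the exact tap positions of the ladder and all function names are hypothetical, and the encoder simply counts the comparator outputs that are logic high.

```python
def reference_voltages(v_ref_minus, v_ref_plus, n_bits=3):
    """Equally spaced comparator thresholds between the two references.

    Assumption: the 2^n - 1 thresholds sit on the interior taps of an
    ideal ladder that divides the reference range into 2^n equal steps.
    """
    levels = 2 ** n_bits
    step = (v_ref_plus - v_ref_minus) / levels
    return [v_ref_minus + (i + 1) * step for i in range(levels - 1)]


def thermometer_code(v_sample, thresholds):
    """Comparator outputs: logic high where the sample exceeds the threshold."""
    return [1 if v_sample > v_th else 0 for v_th in thresholds]


def thermometer_to_binary(code):
    """Encode a thermometer code as an integer by counting the ones."""
    return sum(code)


def flash_adc(v_sample, v_ref_minus=-0.4, v_ref_plus=0.4, n_bits=3):
    """Behavioural model of the quantisation stage followed by the encoder."""
    thresholds = reference_voltages(v_ref_minus, v_ref_plus, n_bits)
    return thermometer_to_binary(thermometer_code(v_sample, thresholds))


if __name__ == "__main__":
    for v in (-0.5, -0.25, 0.05, 0.5):
        code = thermometer_code(v, reference_voltages(-0.4, 0.4))
        print(f"{v:+.2f} V -> thermometer {code} -> binary {flash_adc(v):03b}")
```

Counting the ones is only one possible encoder. It is used here because it keeps the sketch short, not because it reflects the encoder of the fabricated ADC.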

### 2.2 Static Measures for the ADC Characterisation

In an ideal ADC, the quantisation levels are equally spaced. If $V_Q$ is the voltage between two quantisation levels, it can be calculated by means of (2.10), where $V_{pp}$ is the input voltage range and $n$ the number of bits of the ADC. The actual voltage between two quantisation steps in a fabricated ADC must be estimated by a measurement of the ADC transfer function.

$$V_Q = V_{pp}/2^n \tag{2.10}$$

To determine the transfer function of an ADC, the input voltage is linearly swept over the input voltage range of the ADC. The digital output value of the ADC is traced while the input voltage is swept. The voltage changes very slowly over several seconds; therefore, it is a direct current (DC) characterisation.

The differential nonlinearity (DNL) and integral nonlinearity (INL) can be calculated from the transfer function. In a flash ADC the DNL and INL are a measure for the linearity of the ADC input buffers and the quality of the quantisation stage. Offset errors in the comparators of the quantisation stage result in a non-linear transfer function. The error is caused by process variations during fabrication of the circuit. Since the switching voltage of a comparator is shifted by the offset error, the DC transfer curve is shifted too. In Section 4.2, the offset error of a comparator is explained in more detail.

2.2.1 Differential Nonlinearity

The relative error between the measured and ideal width of a quantisation level is the DNL. It is calculated as shown in (2.11), where \( V_D \) corresponds to the threshold voltage between the quantisation levels \( D \) and \( D + 1 \). The unit least significant bit (LSB) is the normalised width \((V_{D+1} - V_D)/V_Q\) of a quantisation level. An ideal ADC has a DNL of 0 LSB over the entire input voltage range because all quantisation levels are equally spaced.

\[
DNL(D) = \left[\frac{(V_{D+1} - V_D)}{V_Q} - 1\right] \text{LSB}, \quad 0 \leq D < 2^n - 1 \tag{2.11}
\]

The static gain error is neglected for calculation of the DNL. Thus, \( V_Q \) is calculated from the measured maximum input voltage range divided by the number of discrete steps \( 2^n \) [17, 19].
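The evaluation of (2.11) can be sketched as follows. The threshold voltages and the measured input voltage range in this example are hypothetical values chosen for illustration, not measurement results, and the function name `dnl` is an assumption.

```python
def dnl(thresholds, v_pp_measured, n_bits):
    """DNL in LSB for each pair of adjacent thresholds, following (2.11).

    V_Q is taken from the measured input voltage range divided by 2^n,
    so that the static gain error does not enter the result.
    """
    v_q = v_pp_measured / 2 ** n_bits
    return [(thresholds[d + 1] - thresholds[d]) / v_q - 1.0
            for d in range(len(thresholds) - 1)]


if __name__ == "__main__":
    # Hypothetical thresholds of a 3 bit ADC in volts
    measured = [-0.30, -0.21, -0.10, 0.00, 0.12, 0.20, 0.30]
    for d, value in enumerate(dnl(measured, v_pp_measured=0.8, n_bits=3)):
        print(f"DNL({d}) = {value:+.2f} LSB")
```

A quantisation level that is wider than $V_Q$ yields a positive DNL, a narrower level yields a negative DNL.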

2.2.2 Integral Nonlinearity

The INL is the absolute difference, expressed in LSB, between the measured and ideal position of the mean voltage which corresponds to a quantisation level. The mathematical expression of the INL according to this definition is given in (2.12), where \( V_{Dm} \) and \( V_{Di} \) are the estimated and ideal mean input voltages, respectively, which correspond to the digital value \( D \). \( V_0 \) is the input referred offset voltage of the ADC, which depends on the manufacturing variations in the input amplifier and the comparator stage. It is used to compensate for the offset error.

\[ \text{INL}(D) = \left[ \frac{V_{Dm} - V_{Di} - V_0}{V_Q} \right] \text{LSB}, \ 0 < D < 2^n \tag{2.12} \]

As in the DNL calculation, the gain error of the ADC is also compensated. Therefore, an accurate description of this INL would be that it is an end-point INL because both the offset error and gain error of the ADC are cancelled. In this thesis, the terms INL and end-point INL are used synonymously. An ideal ADC has an INL of 0 LSB for every digital value.
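A direct evaluation of (2.12) is sketched below. All voltages are hypothetical example values for three digital values of a 3 bit ADC, and the function name `inl` is an assumption; $V_Q$ is assumed to be determined from the measured input voltage range as in the DNL calculation.

```python
def inl(v_measured, v_ideal, v_offset, v_q):
    """INL in LSB for each digital value D, following (2.12).

    v_measured and v_ideal hold the estimated and ideal mean input
    voltages of the same digital values, v_offset is the input
    referred offset voltage V_0.
    """
    return [(v_m - v_i - v_offset) / v_q
            for v_m, v_i in zip(v_measured, v_ideal)]


if __name__ == "__main__":
    v_ideal = [-0.25, -0.15, -0.05]        # hypothetical ideal mean voltages in volts
    v_measured = [-0.235, -0.15, -0.03]    # hypothetical measured mean voltages in volts
    for value in inl(v_measured, v_ideal, v_offset=0.01, v_q=0.1):
        print(f"INL = {value:+.2f} LSB")
```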

### 2.3 Dynamic Measures for the ADC Characterisation

Most effects in high-speed ADCs, like sampling clock jitter, signal attenuation in the input stage and distortion due to cross-talk occur at high input signal frequencies. The clock jitter, for example, limits the maximum effective resolution of a high-speed ADC, as explained in Section 2.1.1. The dynamic measures are used to quantify the impact of these distortion effects on the performance of the ADC in the entire specified frequency range of the input signal.

#### 2.3.1 Signal-to-Noise and Distortion Ratio and Effective Resolution

The signal-to-noise and distortion ratio (SNDR) over the frequency range of the input signal is one of the most important dynamic measures for an ADC. To determine the SNDR of an ADC, the discrete Fourier transform (DFT) of the output signal is usually calculated to obtain the frequency spectrum. Subsequently, the signal energy is divided by the sum of all the other spectral energies, except the DC component and Nyquist frequency of the ADC. The resulting SNDR is commonly expressed in decibels. To avoid misunderstandings, it should be mentioned that the abbreviation SINAD is used instead of SNDR in some publications.

To calculate the SNDR at a specific input signal frequency, a sinusoidal input signal is applied with the maximum input amplitude specified for the ADC. Smaller input amplitudes lead to a lower SNDR because only part of the full resolution of the ADC is used in this case. For example, if the input amplitude is equal to half of the maximum input voltage range, the SNDR is lowered by 6 dB.

Usually, an ADC is characterised at several input signal frequencies. This results in a plot similar to Figure 2.5. In general, the input frequencies up to the Nyquist frequency are sufficient to characterise an ADC. It is possible to extend the frequency range to obtain information about the performance of the ADC at higher input signal frequencies. In this case, an input signal above the Nyquist frequency is folded back into the Nyquist range of the ADC.

![Figure 2.5: Example of an SNDR plot for a 3 bit ADC.](image)

The effective resolution, which is also referred to as the effective number of bits (ENOB), is defined by means of the SNDR as shown in (2.13). The ENOB value is a more intuitive measure compared to the SNDR value because it reflects the actual resolution of an ADC with all distortion effects.

\[
ENOB = \frac{SNDR - 1.76}{6.02} \tag{2.13}
\]
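The procedure described in this section, from the DFT to the ENOB value, can be sketched for an ideal 3 bit quantiser. The code below is an illustration under stated assumptions and not the measurement software of this work: the quantiser is an ideal mid-rise model with an input range of $\pm 1$, the record length of 256 samples with 7 signal periods is an arbitrary coherent sampling choice, and all function names are hypothetical.

```python
import cmath
import math


def quantise(x, n_bits):
    """Ideal mid-rise quantiser for input values between -1 and +1."""
    levels = 2 ** n_bits
    code = int(math.floor((x + 1.0) / 2.0 * levels))
    code = min(levels - 1, max(0, code))          # clip to the valid codes
    return (code + 0.5) * 2.0 / levels - 1.0      # centre of the level


def dft_power(samples):
    """Power of every DFT bin, evaluated directly in O(N^2)."""
    n = len(samples)
    return [abs(sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(samples))) ** 2
            for k in range(n)]


def sndr_db(samples, signal_bin):
    """Signal power over all other bins, excluding DC and Nyquist."""
    power = dft_power(samples)
    nyquist_bin = len(samples) // 2
    others = sum(power[k] for k in range(1, nyquist_bin) if k != signal_bin)
    return 10.0 * math.log10(power[signal_bin] / others)


def enob(sndr):
    """Effective number of bits according to (2.13)."""
    return (sndr - 1.76) / 6.02


def sine_record(amplitude, n_samples=256, cycles=7, n_bits=3):
    """Coherently sampled sine wave passed through the ideal quantiser."""
    return [quantise(amplitude * math.sin(2.0 * math.pi * cycles * i / n_samples), n_bits)
            for i in range(n_samples)]


if __name__ == "__main__":
    for amplitude in (1.0, 0.5):
        sndr = sndr_db(sine_record(amplitude), signal_bin=7)
        print(f"amplitude {amplitude}: SNDR = {sndr:.1f} dB, ENOB = {enob(sndr):.2f} bit")
```

With the full-scale amplitude, the ENOB of the ideal model is close to the nominal 3 bit. With half of the amplitude, only part of the quantisation levels is used and the SNDR drops accordingly.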

2.3.2 Spurious Free Dynamic Range

The spurious free dynamic range (SFDR) is defined as the ratio between the power of the fundamental signal and the strongest distortion component. It is commonly expressed in decibels relative to the carrier (dBc). In Figure 2.6, an ADC spectrum is shown and the SFDR is indicated. In this example, the noise floor, which is caused by the quantisation noise, is below 60 dB. Therefore, only the signal bin and the distortion components are visible. The frequency axis is normalised and the signal power of the frequency bins is given in dBm.

![Figure 2.6: Illustration of the spurious free dynamic range.](image)

2.3.3 Total Harmonic Distortion

The total harmonic distortion (THD) is defined as the ratio of the sum of the powers of all signal harmonics to the power of the fundamental signal. Harmonics are caused by the non-linear behaviour of a circuit. Therefore, the THD is a good measure of the linearity of an amplifier or a sampling circuit. As with the SNDR, the THD is also specified in decibels [18].
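The definitions of the SFDR and the THD can be written down directly. In the sketch below, all powers are linear values normalised to the fundamental and are hypothetical example values, and the function names are assumptions.

```python
import math


def sfdr_dbc(fundamental_power, spur_powers):
    """Ratio of the fundamental to the strongest distortion component in dBc."""
    return 10.0 * math.log10(fundamental_power / max(spur_powers))


def thd_db(fundamental_power, harmonic_powers):
    """Ratio of the summed harmonic powers to the fundamental in dB."""
    return 10.0 * math.log10(sum(harmonic_powers) / fundamental_power)


if __name__ == "__main__":
    fundamental = 1.0
    harmonics = [1e-4, 1e-5, 1e-6]   # hypothetical 2nd, 3rd and 4th harmonic
    print(f"SFDR = {sfdr_dbc(fundamental, harmonics):.1f} dBc")
    print(f"THD  = {thd_db(fundamental, harmonics):.2f} dB")
```

In this example the strongest spur is the second harmonic, so the SFDR only reflects this single component, whereas the THD also contains the weaker harmonics.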

2.4 Properties of Time-Interleaved ADCs

High-speed ADCs use multiple ADCs, which operate in parallel with a phase shift between each other to reach a high conversion rate. This concept is called time-interleaving. The block diagram of an ADC with fourfold time-interleaving is depicted in Figure 2.7. The ADC is based on the parallelisation of four single ADCs, which operate at the same conversion rate but with a phase shift of 90 degrees between each other. Due to the fourfold time-interleaving, the conversion rate of a single ADC is four times lower than the conversion rate of the complete ADC.

The disadvantage of time-interleaving is the introduction of additional deterministic errors in the output spectrum of the ADC. In particular, these errors can be summarised as offset, gain and sampling-time errors. These effects are analysed in [20] and summarised below to give an understanding of the problem.

The signal $x(t)$ is assumed as the input signal of each single ADC. The offset error $o_l$, the gain error $g_l$ and the timing error $\Delta t_l$ are the deterministic errors of ADC channel $l$. When these errors are taken into account, the output signal $y_l(t)$ of channel $l$ is obtained as shown in (2.14).

The timing error of a channel is considered with the variable $\Delta t_l$, which is used to delay the input signal $x(t)$ with respect to the ideal sampling instant. The period between the sampling instants of a single channel is constant and can be described with $T_s \cdot M$, where $T_s$ is the sampling period of the time-interleaved ADC (TIADC) and $M$ is the time-interleaving factor, which is equal to the number of parallel ADCs. The term $lT_s$ is used as an offset to consider the phase shift between the ADCs. The ideal sampling instant is described with the Dirac function, which results in the term \( \delta (t - kMT_s - lT_s) \), where \( k \) is the discrete time variable.

The delayed input signal \( x(t - \Delta t_l) \) is amplified with the gain \( g_l \) to consider the gain error of channel \( l \). In addition to the gain error, the offset error \( o_l \) is added as a DC offset of channel \( l \).

\[
y_l(t) = \sum_{k=-\infty}^{\infty} \left[ (g_l \cdot x(t - \Delta t_l) + o_l) \cdot \delta(t - kMT_s - lT_s) \right] \tag{2.14}
\]

The sum over all channels \( y(t) = \sum_{l=0}^{M-1} [y_l(t)] \) is calculated from the result in (2.14). This corresponds to the multiplexer (MUX) in Figure 2.7.

\[
y(t) = \sum_{l=0}^{M-1} \left[ \sum_{k=-\infty}^{\infty} \left[ (g_l \cdot x(t - \Delta t_l) + o_l) \cdot \delta(t - kMT_s - lT_s) \right] \right] \tag{2.15}
\]

By assuming a harmonic input signal \( x(t) \), the erroneous output spectrum is calculated from (2.15) by means of the Fourier transform. The result of this calculation is shown in (2.16), (2.17) and (2.18), where \( \Omega \) is the angular frequency variable, \( \Omega_0 \) is the angular input signal frequency, \( \Omega_s \) is the angular sampling frequency of the TIADC and \( A \) is the amplitude of the harmonic input signal. In (2.16), \( \alpha^* \) is the complex conjugate of \( \alpha \).

\[
Y(\Omega) = \frac{2\pi}{T_s} \sum_{k=-\infty}^{\infty} \left[ \alpha[k] \delta \left( \Omega - \Omega_0 - k \frac{\Omega_s}{M} \right) - \alpha^*[M - k] \delta \left( \Omega + \Omega_0 - k \frac{\Omega_s}{M} \right) + \beta[k] \delta \left( \Omega - k \frac{\Omega_s}{M} \right) \right] \tag{2.16}
\]

\[
\alpha[k] = \frac{A}{2jM} \sum_{l=0}^{M-1} \left[ g_l e^{-j\Omega_0 \Delta t_l} e^{-jkl\frac{2\pi}{M}} \right] \tag{2.17}
\]

\[
\beta[k] = \frac{1}{M} \sum_{l=0}^{M-1} o_l e^{-jkl\frac{2\pi}{M}} \tag{2.18}
\]

As an example, Figure 2.8 shows a spectrum containing all errors of a fourfold time-interleaved ADC, from which the offset, gain and timing errors can be estimated.

Figure 2.8: Spectrum of a fourfold TIADC with offset, gain and sampling time errors.

The offset error causes spurs at multiples of \(\Omega_s/M\), with \(M = 4\) due to the fourfold time-interleaving. The magnitude of these spurs is determined by the expression in (2.18). This can be explained intuitively by applying a DC signal to the input of the ADC: in the worst case, the offset errors cause the output of a fourfold time-interleaved ADC to toggle periodically between four different output values.

The spurs caused by the gain and timing errors are spaced around the signal frequency \(\pm \Omega_0\) with an offset equal to multiples of \(\Omega_s/M\).
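The positions of these spurs can be reproduced with a small discrete-time model. The following sketch is an illustration only: it considers offset and gain errors with hypothetical values, leaves out the timing errors for brevity, and uses a record of 64 samples with the input signal in DFT bin 5, so that $\Omega_s/M$ corresponds to bin 16. All function names are assumptions.

```python
import cmath
import math


def interleaved_record(n_samples, signal_bin, gains, offsets):
    """Output samples of an M-fold TIADC for a sinusoidal input signal.

    Sample i is taken by channel i mod M, which applies its own gain
    and offset error, compare (2.14) and (2.15) with zero timing error.
    """
    m = len(gains)
    return [gains[i % m] * math.sin(2.0 * math.pi * signal_bin * i / n_samples)
            + offsets[i % m]
            for i in range(n_samples)]


def occupied_bins(samples, threshold=1e-9):
    """Indices of all DFT bins with a magnitude above the threshold."""
    n = len(samples)
    bins = []
    for k in range(n):
        x_k = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples)) / n
        if abs(x_k) > threshold:
            bins.append(k)
    return bins


if __name__ == "__main__":
    gains = [1.00, 1.02, 0.97, 1.01]          # hypothetical gain errors
    offsets = [0.010, -0.020, 0.015, 0.005]   # hypothetical offset errors
    print("ideal:   ", occupied_bins(interleaved_record(64, 5, [1.0] * 4, [0.0] * 4)))
    print("mismatch:", occupied_bins(interleaved_record(64, 5, gains, offsets)))
```

Without mismatch, only the signal bins 5 and 59 are occupied. With mismatch, the offset errors add components in the bins 0, 16, 32 and 48, and the gain errors add components at a distance of multiples of 16 bins from the signal bins.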

3 CMOS Integrated Circuit Design

This chapter deals with the basics of integrated circuit design. In the first section the field-effect transistor and its equivalent circuits are described in detail. The CMOS logic is introduced in the following section. The current mode logic and basic building blocks that make up the digital part of the ADC are described in the third section. The basic analogue building blocks of the flash ADC are explained in the last section of this chapter.

3.1 The Field-Effect Transistor

Figure 3.1 shows the cross section of an n-channel field-effect transistor (FET). The gate contact together with the bulk form a parallel plate capacitor with an oxide as a dielectric. By applying a positive gate-bulk voltage, electrons are concentrated at the transition between the dielectric and the bulk. When the gate-bulk voltage exceeds the threshold voltage $V_{th}$, a channel is formed below the oxide and electrons flow from the source contact through the channel to the drain contact. Broadly speaking, the transistor may be considered as a voltage controlled current source since the drain-source current is controlled by means of the gate-source voltage. A more detailed view of the transistor behaviour is provided in the following subsections.

![Figure 3.1: Cross section of an n-channel MOS transistor.](image)

3.1.1 Large Signal Model of the MOS Transistor

The complete large signal model is shown in Figure 3.2. It contains a current source \( I_D \), parasitic capacitances \( C_{GB}, C_{GD}, C_{GS}, C_{DB}, C_{SB}, C_o \), the source-bulk and drain-bulk diodes and the gate, drain and source resistances \( R_G, R_D, R_S \).

![Figure 3.2: Large signal model of the MOS transistor.](image-url)

The drain-source current \( I_D \) is zero while the gate-source voltage \( V_{GS} \) of the transistor is below the threshold voltage \( V_{th} \). When the gate-source voltage exceeds the threshold voltage and the drain-source voltage is below \( V_{DSsat} \) as expressed in (3.1), the transistor is in the linear region, which is also referred to as the ohmic region.

\[
V_{DSsat} = V_{GS} - V_{th} \tag{3.1}
\]

The drain current in the linear region is given in (3.2), where \( V_{DS} \) is the drain-source voltage, \( \mu \) is the electron mobility constant of silicon, \( C'_{ox} \) is the specific capacitance per unit area of the gate oxide, \( W \) is the width of the transistor and \( L \) is the channel length.

\[
I_D = \mu C'_{ox} \frac{W}{L} ((V_{GS} - V_{th})V_{DS} - \frac{1}{2}V_{DS}^2) \tag{3.2}
\]

When $V_{DS}$ exceeds $V_{DSsat}$, the channel is pinched off at the drain side and saturation of the drain current in (3.2) occurs. By substituting $V_{DS}$ in (3.2) with the saturation voltage $V_{DSsat}$, the drain current for the saturation region is obtained as expressed in (3.3) [22].

$$I_D = \frac{1}{2} \mu C_{ox}' \frac{W}{L} (V_{GS} - V_{th})^2 \tag{3.3}$$

In saturation, an important short channel effect called the channel length modulation (CLM) must also be considered [21]. The n-doped drain area is surrounded by a depletion area, and the effective channel length $L_{eff}$ reaches up to this area. The width of the depletion area is dependent on the drain-source voltage $V_{DS}$. With increasing drain-source voltage, the depletion area increases and the effective channel length decreases. The influence on the drain current due to this effect is expressed in (3.4), where $\lambda$ is the channel length modulation parameter.

$$I_D = \frac{1}{2} \mu C_{ox}' \frac{W}{L} (V_{GS} - V_{th})^2 \cdot (1 + \lambda V_{DS}) \tag{3.4}$$
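The three operating regions described by (3.1) to (3.4) can be combined into one piecewise function. The sketch below is an illustration with hypothetical parameter values; the function name `drain_current` is an assumption, and the parameter `beta` abbreviates the product $\mu C'_{ox} W / L$.

```python
def drain_current(v_gs, v_ds, v_th, beta, lam=0.0):
    """Drain current of an n-channel MOS transistor, following (3.1) to (3.4).

    beta abbreviates mu * C'ox * W / L, lam is the channel length
    modulation parameter lambda.
    """
    v_dssat = v_gs - v_th                      # saturation voltage (3.1)
    if v_dssat <= 0.0:
        return 0.0                             # gate-source voltage below threshold
    if v_ds < v_dssat:
        # linear (ohmic) region (3.2)
        return beta * (v_dssat * v_ds - 0.5 * v_ds ** 2)
    # saturation region with channel length modulation (3.4)
    return 0.5 * beta * v_dssat ** 2 * (1.0 + lam * v_ds)


if __name__ == "__main__":
    beta, v_th = 1e-3, 0.4     # hypothetical values in A/V^2 and V
    for v_ds in (0.2, 0.6, 1.0):
        i_d = drain_current(1.0, v_ds, v_th, beta, lam=0.1)
        print(f"V_DS = {v_ds:.1f} V -> I_D = {i_d * 1e6:.1f} uA")
```

Because (3.2) contains no channel length modulation term, this simple combination has a small step in the drain current at $V_{DS} = V_{DSsat}$ whenever $\lambda$ is not zero. For $\lambda = 0$, the two expressions meet continuously.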

The source-bulk and drain-bulk diodes must be reverse biased to allow proper functioning of the transistor. The diodes can be used as a model for the leakage current of an FET.

The resistors include the contact resistance and resistances of the gate, drain and source material.

The capacitances $C_{DB}$ and $C_{SB}$ are the depletion capacitances of the reverse biased p-n-junctions at the drain-bulk and source-bulk transitions. The capacitance $C_o$ is a constant overlap capacitance, which depends on the geometry of the gate. It is caused by overlap of the gate area with the drain and source area, respectively. The capacitances $C_{GB}, C_{GS}$ and $C_{GD}$ are dependent on the operating condition of the transistor. The values are related to the intrinsic gate capacitance $C_{Gi}$ in (3.5), with $L_{eff}$ as the effective channel length. Table 3.1 contains the values of the capacitors for the three operating conditions. A more detailed description and approximations for the depletion capacitances are provided in [22, 23].

$$C_{Gi} = C_{ox}' W L_{eff} \tag{3.5}$$

Table 3.1: Typical values of $C_{GS}$, $C_{GD}$ and $C_{GB}$ depending on the operating condition of the transistor.

<table>
<thead>
<tr>
<th></th>
<th>$V_{GS} &lt; V_{th}$</th>
<th>$V_{DS} &lt; V_{DSsat}$</th>
<th>$V_{DS} \geq V_{DSsat}$</th>
</tr>
</thead>
<tbody>
<tr>
<td>$C_{GB}$</td>
<td>$C_{Gi}$</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>$C_{GS}$</td>
<td>0</td>
<td>$1/2 \cdot C_{Gi}$</td>
<td>$2/3 \cdot C_{Gi}$</td>
</tr>
<tr>
<td>$C_{GD}$</td>
<td>0</td>
<td>$1/2 \cdot C_{Gi}$</td>
<td>0</td>
</tr>
</tbody>
</table>

3.1.2 Small Signal Model of the MOS Transistor

The small signal equivalent circuit of the MOSFET is shown in Figure 3.3. The equivalent circuit is valid for a constant bulk-source voltage $V_{BS}$, which is a very good approximation for the circuits used in this work. The equivalent circuit is obtained by linearisation of the output current and output resistance at the DC operating point. Thus, the circuit is only valid for a small amplitude of the alternating current (AC) input signal $v_{GS}$. The capacitances account for the parasitics of the transistor, which are the same as in the large signal model. In this model, the overlap capacitances are included in the gate-source and gate-drain capacitances. The gate-bulk capacitance can be neglected when the transistor is active [21, 22, 23].

![Small signal equivalent circuit of the MOSFET.](image)

Figure 3.3: Small signal equivalent circuit of the MOSFET.

The values of the linearised transconductance $g_m$ and the output resistance $r_{ds}$ for the two regions of operation of the transistor are given in Table 3.2.


Table 3.2: Parameters of the small signal equivalent circuit of the MOS transistor with channel length modulation.

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Saturation region</th>
<th>Linear region</th>
</tr>
</thead>
<tbody>
<tr>
<td>$g_{m} = \frac{\partial I_D}{\partial V_{GS}}$</td>
<td>$\beta(V_{GS} - V_{th})(1 + \lambda V_{DS})$</td>
<td>$\beta V_{DS}(1 + \lambda V_{DS})$</td>
</tr>
<tr>
<td>$r_{ds} = \frac{\partial V_{DS}}{\partial I_D}$</td>
<td>$\frac{1}{\lambda I_D}$</td>
<td>$[\beta(V_{GS} - V_{th} - V_{DS})(1 + \lambda V_{DS}) + \beta \lambda ((V_{GS} - V_{th})V_{DS} - V_{DS}^2/2)]^{-1}$</td>
</tr>
</tbody>
</table>
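The small signal parameters for the saturation region can be cross-checked against numerical derivatives of the drain current in (3.4). The sketch below is an illustration with hypothetical parameter values and function names. It assumes that $\beta$ stands for $\mu C'_{ox} W / L$ and that the drain current in the expression $1/(\lambda I_D)$ is taken from (3.3), in which case the expression is the exact derivative of (3.4) with respect to $V_{DS}$.

```python
def drain_current_saturation(v_gs, v_ds, v_th, beta, lam):
    """Drain current in saturation with channel length modulation (3.4)."""
    return 0.5 * beta * (v_gs - v_th) ** 2 * (1.0 + lam * v_ds)


def saturation_small_signal(v_gs, v_ds, v_th, beta, lam):
    """Transconductance g_m and output resistance r_ds in saturation."""
    g_m = beta * (v_gs - v_th) * (1.0 + lam * v_ds)
    i_d = 0.5 * beta * (v_gs - v_th) ** 2     # drain current from (3.3)
    r_ds = 1.0 / (lam * i_d)
    return g_m, r_ds


if __name__ == "__main__":
    v_gs, v_ds, v_th, beta, lam = 1.0, 1.0, 0.4, 1e-3, 0.1   # hypothetical values
    g_m, r_ds = saturation_small_signal(v_gs, v_ds, v_th, beta, lam)
    d = 1e-6   # step width of the central difference
    g_m_num = (drain_current_saturation(v_gs + d, v_ds, v_th, beta, lam)
               - drain_current_saturation(v_gs - d, v_ds, v_th, beta, lam)) / (2 * d)
    g_ds_num = (drain_current_saturation(v_gs, v_ds + d, v_th, beta, lam)
                - drain_current_saturation(v_gs, v_ds - d, v_th, beta, lam)) / (2 * d)
    print(f"g_m  = {g_m * 1e6:.1f} uS (numerical {g_m_num * 1e6:.1f} uS)")
    print(f"r_ds = {r_ds / 1e3:.2f} kOhm (numerical {1.0 / g_ds_num / 1e3:.2f} kOhm)")
```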

3.2 Introduction to CMOS Logic

The CMOS logic was first patented in 1963 by Fairchild Semiconductor [24]. The logic employs p- and n-channel field effect transistors, which are connected in a way that requires no static power consumption. This property makes CMOS the logic family of choice for applications with a high transistor count, such as microprocessors, digital signal processors (DSPs), memories and data converters with a high resolution.

The technology that provides the transistors for CMOS logic is named according to the logic: "CMOS technology". In 65 nm CMOS technology, more than 2 million transistors can be integrated in an area of 1 mm$^2$ [25].

3.2.1 The CMOS Inverter

The CMOS inverter realises the basic logic operation, which is a logic inversion of the input signal. The circuit of the CMOS inverter is shown in Figure 3.4. The source of the p-channel transistor is connected to $V_{DD}$ and the source of the n-channel transistor is connected to $V_{SS}$. The input port is connected to the gates of the transistors. The drains of the transistors, which are also tied together, form the output port.

The hole mobility $\mu_h$ of a p-MOS transistor is lower than the electron mobility $\mu_e$ of an n-MOS transistor. Thus, the width of the p-channel transistor is usually between two and three times larger than the width of the n-channel transistor in order to obtain an equal current gain for both transistors. The actual factor depends on the CMOS technology used. The current gain of a transistor is given in (3.6).

Figure 3.4: The CMOS inverter circuit.

\[
\beta = \mu C'_{ox} \frac{W}{L} \tag{3.6}
\]

If the current gain of either of the transistors is higher, the transfer curve of the inverter is not point symmetric to \(V_{DD}/2\). In Figure 3.5, typical transfer curves with a different current gain of the n-channel transistor are shown. As illustrated in the figure, the transfer curve is only point symmetric to \(V_{DD}/2 = 0.5\) V if the n-channel transistor has the same current gain as the p-channel transistor [22].

Figure 3.5: Transfer characteristics of the CMOS inverter.
Dynamic Behaviour

The large signal analysis gives the rise time $t_{\text{LH}}$ and fall time $t_{\text{HL}}$ of the CMOS inverter. The rise time is defined as the time required to switch the output of the inverter from the low state to the high state. It is measured between 10\% and 90\% of the high level. The fall time is defined analogously for the opposite transition of the logic states.

For a symmetric inverter, the rise and fall times are equal. The rise time $t_{\text{LH}}$ is given in (3.7), where $C_L$ is the load capacitance at the output of the inverter, $\beta$ is the current amplification factor of the transistors and $V_{\text{DD}}$ is the supply voltage [23].

$$t_{\text{LH}} = \frac{4 \cdot C_L}{\beta \cdot V_{\text{DD}}} \tag{3.7}$$

The maximum clock frequency of a clock driver chain, which is built using inverters, is limited by the rise and fall time as expressed in (3.8).

$$f_{\text{max}} = (t_{\text{LH}} + t_{\text{HL}})^{-1} \tag{3.8}$$
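As a rough numerical illustration of (3.7) and (3.8), the following Python sketch evaluates the rise time and the resulting maximum clock frequency of a symmetric inverter. The component values are assumptions chosen for illustration only; they are not design values of this work.

```python
def inverter_rise_time(c_load, beta, v_dd):
    """Rise time t_LH of a symmetric CMOS inverter according to (3.7)."""
    return 4.0 * c_load / (beta * v_dd)


def max_clock_frequency(t_lh, t_hl):
    """Maximum clock frequency of an inverter chain according to (3.8)."""
    return 1.0 / (t_lh + t_hl)


# Hypothetical example values (assumptions, not taken from the thesis).
C_L = 10e-15   # load capacitance in F
BETA = 2e-3    # current gain in A/V^2
V_DD = 1.0     # supply voltage in V

t_lh = inverter_rise_time(C_L, BETA, V_DD)
t_hl = t_lh    # symmetric inverter: rise and fall times are equal
f_max = max_clock_frequency(t_lh, t_hl)

print(f"t_LH = {t_lh * 1e12:.1f} ps, f_max = {f_max / 1e9:.1f} GHz")
```

With these assumed values, halving the load capacitance or doubling the current gain doubles the maximum clock frequency.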

3.3 Introduction to Current Mode Logic

Current mode logic (CML) is used for high-speed and mixed signal circuits. The advantage of CML over CMOS logic is that the current consumption is constant. Thus, only low switching noise is generated on the supply voltage, thereby minimising the distortion in other circuits. Furthermore, CML is immune to common mode distortion, e.g. fluctuations of the power supply voltage. In other words, the circuits are less sensitive to external distortions which affect both signal lines. Another advantage observed in simulations with 65 nm LP CMOS technology is the higher switching speed of CML compared to CMOS logic [26].
3.3.1 The CML Inverter

Figure 3.6 shows the symbol of a CML inverter. A CML inverter has two input and two output ports. The differential input signal is defined as the difference between the input voltage $V_1$ and $V_2$, as expressed in (3.9).

\[ V_{in} = V_1 - V_2 \tag{3.9} \]

The output voltage is the sum of the differential mode component of the input, amplified by the differential mode amplification $A_d$, and the common mode component, amplified by the common mode amplification $A_c$, as expressed in (3.10).

\[ V_{out} = A_d \cdot (V_1 - V_2) + A_c \cdot \frac{V_1 + V_2}{2} \tag{3.10} \]

In an ideal CML inverter, the common mode amplification is zero and the differential mode amplification is infinite. A measure for the quality of a differential amplifier is the common mode rejection ratio (CMRR), which is defined as the magnitude of the differential amplification over the magnitude of the common mode amplification. With a high CMRR, the circuit is immune to distortions which affect both input signals, such as supply voltage fluctuations.
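The effect of a finite common mode amplification can be illustrated with a short Python sketch of (3.10). The gain values and input voltages are hypothetical; the sketch only shows that a disturbance applied to both inputs reaches the output through $A_c$ alone.

```python
def cml_output(v1, v2, a_d, a_c):
    """Output voltage according to (3.10)."""
    return a_d * (v1 - v2) + a_c * (v1 + v2) / 2.0


def cmrr(a_d, a_c):
    """Common mode rejection ratio as the ratio of the gain magnitudes."""
    return abs(a_d) / abs(a_c)


# Hypothetical gains and input voltages (assumptions for illustration).
A_D, A_C = 4.0, 0.04
V1, V2 = 0.6, 0.4
DISTURBANCE = 0.1   # added to both inputs, e.g. a supply fluctuation

clean = cml_output(V1, V2, A_D, A_C)
disturbed = cml_output(V1 + DISTURBANCE, V2 + DISTURBANCE, A_D, A_C)
shift = disturbed - clean

print(f"CMRR = {cmrr(A_D, A_C):.0f}, output shift = {shift * 1e3:.1f} mV")
```

The output shift equals $A_c$ times the disturbance, so a higher CMRR directly reduces the sensitivity to such common disturbances.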

The CML inverter is used in both analogue and digital circuits. A typical circuit of a CML inverter with resistive load is shown in Figure 3.7.

The transfer function of the CML inverter is given in (3.11), where $\beta_n$ is the current amplification factor of the transistors $T_1/T_2$, $R$ is the load resistance and $I_{SS}$ is the static current of the current source. $V_{in}$ and $V_{out}$ are the differential input and output voltages [22, 26]. The corresponding transfer curve is depicted in Figure 3.8.

$$V_{out} = \sqrt{\beta_n I_{SS}} \cdot R \cdot \sqrt{1 - \frac{\beta_n}{4I_{SS}} V_{in}^2} \cdot V_{in} \tag{3.11}$$

Figure 3.8: Transfer characteristic of the CML inverter with resistive load.
Dynamic Behaviour

The 3-dB corner frequency of the transfer function $V_{out}/V_{in}$ is given in (3.12). The rise and fall time is given in (3.13), where $R_L$ is the load resistance of the CML inverter and $C_L$ is the load capacitance of the circuit. To estimate the load capacitance, all parasitic capacitances of the inverter and the input capacitance of the following circuit must be considered. In other words, all parasitic capacitances connected to the output port must be added to obtain $C_L$ [26].

$$f_{3dB} = \frac{1}{2 \cdot \pi \cdot R_L \cdot C_L} \tag{3.12}$$

$$t_{LH} = 2.2 \cdot R_L \cdot C_L \tag{3.13}$$

In the small signal analysis of the gain, the same corner frequency $f_{3dB}$ is obtained as in the large signal analysis [27].
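Equations (3.12) and (3.13) can be evaluated with the following Python sketch. The load resistance and load capacitance are hypothetical values. The product of rise time and corner frequency is independent of both and equals $2.2/(2\pi) \approx 0.35$.

```python
import math


def cml_corner_frequency(r_load, c_load):
    """3-dB corner frequency according to (3.12)."""
    return 1.0 / (2.0 * math.pi * r_load * c_load)


def cml_rise_time(r_load, c_load):
    """Rise and fall time according to (3.13)."""
    return 2.2 * r_load * c_load


# Hypothetical example values (assumptions for illustration).
R_L = 200.0    # load resistance in Ohm
C_L = 20e-15   # total load capacitance at the output in F

f_3db = cml_corner_frequency(R_L, C_L)
t_lh = cml_rise_time(R_L, C_L)

print(f"f_3dB = {f_3db / 1e9:.1f} GHz, t_LH = {t_lh * 1e12:.1f} ps")
print(f"t_LH * f_3dB = {t_lh * f_3db:.3f}")
```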

3.3.2 Logic Gates in CML

The basic logic gates in addition to the inverter are the AND, NAND, OR, NOR, XOR and XNOR gates. In CML, the AND, NAND, OR and NOR function can be realised with a single circuit. Another circuit is necessary to realise the XOR or XNOR gate. Thus, all basic logic functions can be realised with three circuits in CML.

The OR Gate

The circuit in Figure 3.9 realises the OR function. The logic function of the OR gate is described in (3.14).

$$Q = I_0 + I_1 \tag{3.14}$$

The function can be rewritten by De Morgan's law as an AND function of the inverted signals, as illustrated in (3.15).

\[
\overline{Q} = \overline{I_0 + I_1} = \overline{I_0} \cdot \overline{I_1} \tag{3.15}
\]

Thus, the circuit in Figure 3.9 can be changed to an AND gate by flipping the inverted and non-inverted input and output ports, respectively.

![Figure 3.9: Circuit of the OR gate.](image)

A NOR gate is an OR gate with an additional inverter at the output. In CML, this corresponds to the circuit in Figure 3.9 with the inverted and non-inverted output ports \( \overline{Q} \) and \( Q \) swapped. The NAND gate is obtained from the AND gate in the same way.
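The port swapping described above can be checked with a small Python sketch. It models a differential gate on the logic level only: swapping the two lines of a differential signal corresponds to a logic inversion. The function names are illustrative and do not refer to circuit blocks of this work.

```python
from itertools import product


def cml_or(i0, i1):
    """Logic-level model of the differential OR gate: returns (Q, Q_bar)."""
    q = i0 or i1
    return q, not q


def cml_and_from_or(i0, i1):
    """AND gate obtained from the OR gate by swapping the differential
    input lines (inverting I0 and I1) and the differential output lines."""
    q, q_bar = cml_or(not i0, not i1)
    return q_bar, q


def cml_nor_from_or(i0, i1):
    """NOR gate obtained from the OR gate by swapping only the outputs."""
    q, q_bar = cml_or(i0, i1)
    return q_bar, q


for i0, i1 in product([False, True], repeat=2):
    and_q, _ = cml_and_from_or(i0, i1)
    nor_q, _ = cml_nor_from_or(i0, i1)
    print(i0, i1, "AND:", and_q, "NOR:", nor_q)
```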

**The XOR Gate**

The circuit in Figure 3.10 realises the XOR function. The XNOR function is realised by flipping the output ports of the circuit.

Equation (3.16) describes the XOR function in disjunctive normal form and with the XOR symbol. The output \( Q \) is true if one of the two inputs is true, while the other input is false.

\[ Q = I_0 \cdot \overline{I_1} + \overline{I_0} \cdot I_1 = I_0 \oplus I_1 \tag{3.16} \]

3.3.3 A Flip-Flop in CML

A flip-flop is a circuit which allows the storage of a logic state. Flip-flops are used in an ADC to perform the logic decision after the comparator stage (cf. Section 4.2) and for data synchronisation (cf. Section 4.4). A synchronous flip-flop usually has an input, an output and a clock port. The data is typically stored with the rising edge of the clock signal or, in other words, at the time instant that the clock signal changes from low to high.

There are two important timing restrictions for the input signals that must be met for reliable operation of a flip-flop. These are the setup and hold times. The setup time is the minimum time that a stable signal must be applied at the input of a flip-flop before it is stored with a clock edge. The hold time is the minimum time that the input signal must be stable after a clock edge. If the setup time or the hold time is violated, the flip-flop may fall into what is called a metastable state. The setup and hold times can be estimated by simulations or measurements.

In an asynchronous system, metastability cannot be avoided. In a decider circuit, for example, metastable states occur because the output signal of a comparator may be close to the common mode voltage, which is not defined in a digital system. A flip-flop leaves the metastable state within a short time, which can be estimated, for example, by a simulation. One method to avoid metastable states at the input of a logic circuit is to use two series-connected flip-flops. If the first flip-flop enters a metastable state, it will have a full clock period to leave this state and make a decision in favour of a valid logic level. Thus, if the flip-flop leaves the metastable state within a clock period, the following flip-flop will contain a valid logic value [28, 29].

In CML, a flip-flop is built with two series-connected latches as illustrated in Figure 3.11. The clock signal of the first latch must be inverted, which corresponds to a crossing of the differential clock inputs.

![Figure 3.11: Block diagram of a flip-flop.](image)

The circuit of the latch is shown in Figure 3.12. With a high level of the clock signal, the left transconductance amplifier \((T_1, T_2)\) is turned on. In this state, the output of the latch follows the input signal. In other words, the latch is transparent while the clock signal is high. When the clock signal changes to low, the second transconductance amplifier \((T_3, T_4)\) is turned on and the left transconductance amplifier is turned off. In this state, the latch is in a self-holding mode because the input of the second transconductance amplifier is connected to its output. The latch holds the last logic input value that was present during the transparent mode before the clock was switched. A change of the input signal will not affect the output of the latch in this state.

The difference between a latch and a flip-flop is that the output of a latch follows the input signal the entire time that the clock signal is active (logic high). In contrast, a flip-flop changes its output state only with the rising edge of a clock signal.
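The difference between the two elements can be reproduced with a purely behavioural Python sketch. It ignores all analogue properties of the CML circuits and only models the logic states; the class names and the test sequence are illustrative assumptions.

```python
class Latch:
    """Level-sensitive latch: transparent while clk is high, holds otherwise."""

    def __init__(self):
        self.q = 0

    def step(self, d, clk):
        if clk:
            self.q = d
        return self.q


class FlipFlop:
    """Two series-connected latches; the first latch uses the inverted clock."""

    def __init__(self):
        self.first = Latch()
        self.second = Latch()

    def step(self, d, clk):
        stored = self.first.step(d, not clk)   # transparent while clk is low
        return self.second.step(stored, clk)   # transparent while clk is high


clk = [0, 1, 1, 0, 0, 1, 1, 0]
data = [1, 1, 0, 0, 0, 0, 1, 1]

latch, flip_flop = Latch(), FlipFlop()
latch_out = [latch.step(d, c) for d, c in zip(data, clk)]
ff_out = [flip_flop.step(d, c) for d, c in zip(data, clk)]

print("latch    :", latch_out)
print("flip-flop:", ff_out)
```

In this sequence the latch output follows the data changes that occur while the clock is high, whereas the flip-flop output changes only at the two rising clock edges.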
3.3.4 Current Source

The circuit of the current source used in each CML circuit is shown in Figure 3.13 together with the symbol for the current source. For simplification purposes, the input $V_{\text{ref}}$ is not depicted in the symbol of the current source. $V_{\text{ref}}$ is connected to a current mirror, which is used to adjust the current of all CML circuits.

The circuit of the current mirror is shown in Figure 3.14. The current through transistor $T_2$ is mirrored to the current sources in the CML circuits since the gate-source voltage is the same for all current sources connected to $V_{\text{ref}}$. The resistor $R_2$ determines the input current, which is mirrored into the current sources of the connected CML gates. Resistor $R_1$ is used to adjust the bias current through an external control voltage.

If all related transistors have the same channel length, the current, which is mirrored into a CML gate, can be calculated using equation (3.17), where $W_{CML}$ is the width of the current source transistor in a CML gate, $W_{T2}$ is the width of transistor $T_2$ in the bias circuit and $I_b$ is the reference current. This equation is an approximation because the channel length modulation is neglected [22, 27].

$$I_{CML} \approx I_b \cdot \frac{W_{CML}}{W_{T2}} \quad (3.17)$$

**Output Resistance**

The characteristic value of a current source is its output resistance. To achieve a lower sensitivity to supply voltage variations, a large output resistance is desired for a current source. Transistor $T_1$ improves the output resistance of the bias circuit in Figure 3.14 by the factor $g_{m1}r_{ds1}$ as expressed in (3.18), where $g_{m1}$ and $r_{ds1}$ correspond to the small signal transconductance and output resistance of $T_1$ and $r_{ds2}$ corresponds to the output resistance of $T_2$ [22].

$$r_{out} \approx g_{m1}r_{ds1}r_{ds2} \quad (3.18)$$
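The two approximations (3.17) and (3.18) are evaluated in the following Python sketch. All device values are hypothetical and serve only to show the orders of magnitude involved.

```python
def mirrored_current(i_b, w_cml, w_t2):
    """Current mirrored into a CML gate according to (3.17).

    Assumes equal channel lengths and neglects channel length modulation.
    """
    return i_b * w_cml / w_t2


def bias_output_resistance(g_m1, r_ds1, r_ds2):
    """Output resistance of the bias circuit according to (3.18)."""
    return g_m1 * r_ds1 * r_ds2


# Hypothetical example values (assumptions for illustration).
I_B = 100e-6     # reference current in A
W_T2 = 2e-6      # width of T2 in the bias circuit in m
W_CML = 10e-6    # width of the current source transistor in m
G_M1 = 1e-3      # transconductance of T1 in S
R_DS1 = 20e3     # output resistance of T1 in Ohm
R_DS2 = 20e3     # output resistance of T2 in Ohm

i_cml = mirrored_current(I_B, W_CML, W_T2)
r_out = bias_output_resistance(G_M1, R_DS1, R_DS2)

print(f"I_CML = {i_cml * 1e6:.0f} uA")
print(f"r_out = {r_out / 1e3:.0f} kOhm ({r_out / R_DS2:.0f} x r_ds2)")
```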
3.4 Analogue Circuits for High-Speed CMOS A/D-Converters

In general, an ADC can be separated into two parts, the analogue front-end and the digital back-end. The analogue front-end typically contains linear amplifiers, transfer gates and a comparator circuit. In the following, these building blocks are described.

3.4.1 The Linear Amplifier

The circuit of the differential amplifier is introduced in Section 3.3.1. The transfer curve in Figure 3.8 shows good linearity around the zero transition. From (3.11), it is inferred that, if the factor $\frac{\beta_n}{4I_{SS}}V^2_{in}$ is considerably smaller than one, the transfer curve is linear and can be simplified to (3.19). Thus, the linearity of the differential amplifier can be improved with a large current $I_{SS}$.

$$V_{out} = \sqrt{\beta_n I_{SS}} \cdot R \cdot V_{in} \tag{3.19}$$
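The deviation of the linear approximation from the large signal transfer function can be estimated numerically. The Python sketch below reads (3.11) as $\sqrt{\beta_n I_{SS}} \cdot R \cdot \sqrt{1 - \beta_n V_{in}^2 / (4 I_{SS})} \cdot V_{in}$ and clamps the output to $\pm I_{SS} R$ once the complete tail current flows through one branch, which is the standard behaviour of a differential pair. All parameter values are hypothetical.

```python
import math


def cml_transfer(v_in, beta_n, i_ss, r_load):
    """Large signal transfer function of the differential pair, cf. (3.11)."""
    v_max = math.sqrt(2.0 * i_ss / beta_n)   # input at which one branch is off
    if abs(v_in) >= v_max:
        return math.copysign(i_ss * r_load, v_in)
    return (math.sqrt(beta_n * i_ss) * r_load
            * math.sqrt(1.0 - beta_n / (4.0 * i_ss) * v_in ** 2) * v_in)


def cml_transfer_linear(v_in, beta_n, i_ss, r_load):
    """Linear approximation of the transfer function, cf. (3.19)."""
    return math.sqrt(beta_n * i_ss) * r_load * v_in


def linearity_error(v_in, beta_n, i_ss, r_load):
    """Relative deviation of the linear approximation from (3.11)."""
    exact = cml_transfer(v_in, beta_n, i_ss, r_load)
    approx = cml_transfer_linear(v_in, beta_n, i_ss, r_load)
    return abs(approx - exact) / abs(exact)


# Hypothetical example values (assumptions for illustration).
BETA_N = 2e-3   # A/V^2
R = 500.0       # Ohm
V_IN = 0.1      # V

for i_ss in (1e-3, 2e-3):
    err = linearity_error(V_IN, BETA_N, i_ss, R)
    print(f"I_SS = {i_ss * 1e3:.0f} mA: error = {err:.3%}")
```

For the same input amplitude, the larger tail current gives the smaller deviation, in line with the statement above.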

The parameters of two identical transistors in an integrated circuit show a random variation after the fabrication process. This is called a device mismatch [30]. The mismatch is responsible for the input referred offset voltage of an amplifier. The transfer curve of an ideal amplifier is a straight line through the origin. In a fabricated amplifier circuit, the input offset voltage shifts the zero crossing of the transfer curve due to a mismatch between the transistors in the transconductance amplifier. Since the mismatch affects all transistors, each amplifier on a chip will have a different offset voltage. A more detailed description of the input referred offset voltage is provided in Section 4.2 for the actual implementation of the quantisation stage. In a time-interleaved ADC, the offset voltage introduces additional spurs as described in Section 2.4.
3.4.2 The Track and Hold Circuit

The sampling of the input signal is realised with the track and hold circuit. The track and hold circuit keeps the input voltage constant during the low phase of a clock signal. It is usually realised with a transistor which operates as a switch and a hold capacitance, as shown in Figure 3.15.

The transistor operates in the triode region when the switch is turned on. The on-resistance in (3.20) is obtained from the small signal analysis [17, 22].

\[ r_{on} \approx \frac{1}{\beta (V_{GS} - V_{th})} \tag{3.20} \]

The on-resistance and the hold capacitance \( C_h \) form a low pass, which limits the input bandwidth of an ADC. Switching off the transfer transistor in the track and hold circuit leads to a change in the hold voltage. The reason for this is that the capacitance of the transistor channel changes when the transistor is turned off, as described in the large signal analysis in Section 3.1.1. This is called the charge feed-through or clock feed-through effect [17]. It can be compensated by dummy transistors.
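The bandwidth limitation caused by the on-resistance and the hold capacitance can be estimated with the Python sketch below. It treats the switch and the capacitor as a first-order RC low pass and neglects the source impedance, which is a simplifying assumption. All device values are hypothetical.

```python
import math


def switch_on_resistance(beta, v_gs, v_th):
    """On-resistance of the transfer transistor according to (3.20)."""
    return 1.0 / (beta * (v_gs - v_th))


def track_bandwidth(r_on, c_hold):
    """3-dB frequency of the RC low pass formed by r_on and C_h."""
    return 1.0 / (2.0 * math.pi * r_on * c_hold)


# Hypothetical example values (assumptions for illustration).
BETA = 5e-3     # current gain of the switch transistor in A/V^2
V_TH = 0.4      # threshold voltage in V
C_H = 20e-15    # hold capacitance in F

r_on_nominal = switch_on_resistance(BETA, 0.9, V_TH)
r_on_boosted = switch_on_resistance(BETA, 1.2, V_TH)   # higher gate drive

f_nominal = track_bandwidth(r_on_nominal, C_H)
f_boosted = track_bandwidth(r_on_boosted, C_H)

print(f"r_on = {r_on_nominal:.0f} Ohm -> {f_nominal / 1e9:.1f} GHz")
print(f"r_on = {r_on_boosted:.0f} Ohm -> {f_boosted / 1e9:.1f} GHz")
```

A higher gate-source voltage lowers the on-resistance and therefore raises the tracking bandwidth in this model.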

Figure 3.16 shows the compensated differential track and hold circuit. The dummy transistors (\( T_1, T_3, T_4, T_6 \)) are added at both sides of the transfer gate. Their width is half the width of the switching transistors (\( T_2, T_5 \)).

3.4.3 The Differential Comparator

A comparator has two analogue inputs and a digital output. The two input voltages are compared with each other. If the input voltage applied to the first input port is larger than the input voltage of the second input port, the output voltage is high; if the input voltage at the first input port is smaller, it is low.
Figure 3.16: Compensated differential track and hold circuit.

The circuit of the differential comparator is shown in Figure 3.17. The differential comparator is realised with two transconductance amplifiers, which use two common load resistors to add the output currents of the transconductance amplifiers. The amplification of this type of comparator is in the range of one because a higher amplification reduces the bandwidth of the comparator. To increase the overall gain of the comparator, additional CML amplifiers must be connected to the output of the comparator.

Figure 3.17: Comparator circuit.

Besides the limited gain, a comparator also suffers from the input offset voltage as described for the linear amplifier in Section 3.4.1. The offset errors introduce a non-linearity in the transfer characteristic of the flash ADC.

The architecture of the designed time-interleaved flash ADC is shown in Figure 4.1. The input $V_{in}$ is connected to four ADCs which operate in parallel. The clock signals of the ADCs have a 90-degree phase difference between each other. Therefore, the architecture realises an analogue demultiplexer by a factor of four. The necessary clock signals are generated by a four-phase clock divider. The four clock phases are chosen such that the sampling instants of the TIADC are equidistant.

Each ADC channel contains a sample and hold circuit, a quantiser and decider and a binary encoder. The quantiser and decider circuit performs the quantisation of the sampled input signal by means of a comparator stage. A subsequent flip-flop stage performs the digital decision. The resulting thermometer code is encoded into the binary code by means of a 1-of-7 priority encoder. Finally, all four channels are synchronised by a three-step synchroniser to allow synchronous data output from all four channels.
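The interleaving of the four channels can be illustrated with a short Python sketch. Time is expressed in units of the channel clock period and exact fractions are used, so that the equidistance of the combined sampling instants can be checked without rounding errors. The function names are illustrative.

```python
from fractions import Fraction


def channel_sampling_instants(channel, n_channels, n_periods):
    """Sampling instants of one sub-ADC in units of the channel clock period.

    Channel k is clocked with a phase offset of k * 360 / n_channels degrees.
    """
    offset = Fraction(channel, n_channels)
    return [m + offset for m in range(n_periods)]


def interleaved_instants(n_channels, n_periods):
    """Sampling instants of the complete time-interleaved ADC."""
    merged = []
    for channel in range(n_channels):
        merged.extend(channel_sampling_instants(channel, n_channels, n_periods))
    return sorted(merged)


instants = interleaved_instants(n_channels=4, n_periods=3)
steps = {later - earlier for earlier, later in zip(instants, instants[1:])}

print([str(t) for t in instants])
print("distinct step sizes:", [str(s) for s in steps])
```

A single step size of one quarter of the channel clock period confirms that clock phases of 0, 90, 180 and 270 degrees yield equidistant sampling at four times the channel rate.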

Figure 4.1: Block diagram of the 3 bit ADC.
4.1 Sample and Hold Circuit

The block diagram of the demultiplexer is shown in Figure 4.2. It is built using four sample and hold circuits (S&H), CMOS clock drivers (CKD), two bootstrap circuits and a 50 Ohm input termination. The CMOS clock drivers are used to obtain a higher voltage swing compared to CML drivers. The bootstrap circuit is used to increase the voltage levels of the clock signal. The higher voltage swing and increased voltage levels are necessary to minimise the on-resistance of the transfer gates, which maximises the input bandwidth of the ADC as described in Section 6.3.

![Figure 4.2: Block diagram of the analogue input stage.](image)

The sample and hold circuit in Figure 4.3 is built using a track and hold circuit, a linear open loop amplifier and a second track and hold circuit. The track and hold circuits are compensated for clock feed-through by means of n-channel transfer transistors as described in Section 3.4.2. The first track and hold circuit is only compensated at the output side. This reduces the input capacitance of the analogue ADC input. It is sufficient because the input signal is fed from a low impedance input.

The second track and hold circuit is compensated at the input and output side. The hold capacitances of both track and hold circuits are composed of the parasitic capacitances of the connected transistors and wiring capacitances.
4.2 Quantisation Circuit

The quantisation circuit is shown in Figure 4.4. It is realised by means of seven comparators. The reference voltages for the comparators are generated by a resistive reference ladder. At each comparator output, five series-connected CML amplifiers are used to increase the gain of the comparators. The gain of a comparator is approximately one and the gain of the amplifier stage is approximately eight. The digital decision is performed by two series-connected flip-flops, which are connected to the outputs of the amplifier stage. As explained in Section 3.3.3, the series-connected flip-flops are necessary to avoid meta-stable states at the output of the decider circuit.

The mismatch of the transistors in the comparators reduces the linearity of the ADC transfer curve. The matching properties of CMOS transistors and CML inverters are investigated in [30, 31, 32] and summarised below to give an understanding of the problem. The mismatch of a transistor pair can be described with a normal distribution. The standard deviation of the offset voltage $\sigma_{VT}$ is proportional to the technology constant $A_{VT}$ and inversely proportional to the square root of the product of the transistor width $W$ and the length $L$ as expressed in (4.1). The technology constant $A_{VT}$ is empirically determined by the technology vendor. For technology nodes down to 130 nm CMOS, the matching was improved with each new node. For these technologies the matching improvement is, to a very good approximation, proportional to the reduction of the gate oxide thickness. In contrast, for the latest technology nodes with structure sizes below 130 nm, no significant improvements have been achieved in the matching properties. Therefore, the gate oxide thickness is not the only factor that has an influence on the technology constant.

$$\sigma_{VT} = \frac{A_{VT}}{\sqrt{WL}} \tag{4.1}$$

According to (4.1), the transistor width $W$ is the only parameter that can be used to control the offset voltage in a given technology. The transistor length is fixed to the minimum allowed length $L_{\text{min}}$ to maximise the switching speed of the transistor. According to (4.1), the transistor width $W$ must be increased by a factor of four to halve the standard deviation of the offset voltage of a CML inverter.
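The scaling expressed by (4.1) can be verified with a few lines of Python. The technology constant and the transistor dimensions are hypothetical values and are not taken from the design rule manual of the technology used in this work.

```python
import math


def offset_sigma(a_vt, width, length):
    """Standard deviation of the offset voltage according to (4.1)."""
    return a_vt / math.sqrt(width * length)


# Hypothetical example values (assumptions for illustration).
A_VT = 4e-3      # technology constant in V*um
L_MIN = 0.06     # channel length in um
WIDTH = 1.0      # transistor width in um

sigma_1x = offset_sigma(A_VT, WIDTH, L_MIN)
sigma_4x = offset_sigma(A_VT, 4.0 * WIDTH, L_MIN)

print(f"W = {WIDTH:.0f} um: sigma = {sigma_1x * 1e3:.1f} mV")
print(f"W = {4 * WIDTH:.0f} um: sigma = {sigma_4x * 1e3:.1f} mV")
```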

The differential comparator is made up of two differential amplifiers whose outputs are connected together. Thus, compared to a CML amplifier, the standard deviation of the offset voltage is increased by the factor $\sqrt{2}$ as shown in (4.2). The equation is valid since the output voltage of the comparator is equal to the sum of the output currents of the transconductances of the two differential amplifiers multiplied by the load resistance (see Section 3.4.3). Therefore, the variance $\sigma_c^2$ of the offset voltage of the comparator is equal to the sum of the variances $\sigma_{i1}^2$ and $\sigma_{i2}^2$ of the offset voltages of the differential amplifiers. The variances of the offset voltages of the two differential amplifiers are equal if the transistors have the same dimensions, thus $\sigma_{i1}^2 = \sigma_{i2}^2 = \sigma_i^2$. The mismatch of the load resistors can be neglected compared to the mismatch of the transistors if the resistor area is chosen properly. A larger resistor area improves the matching, with the drawback of a higher parasitic capacitance to the substrate.

$$\sigma_c = \sqrt{\sigma_{i1}^2 + \sigma_{i2}^2} = \sqrt{2\sigma_i^2} = \sqrt{2} \frac{A_{VT}}{\sqrt{WL}} \tag{4.2}$$

To calculate the standard deviation $\sigma_{ca}$ of the offset voltage of the complete comparator circuit, the series-connected amplifiers must also be considered. The standard deviation of the offset voltage can be calculated by means of (4.3), where $\sigma_c$ is the standard deviation of the comparator offset voltage, $\sigma_a$ is the standard deviation of the amplifier offset voltage, $j$ is the number of amplifier stages, and $A_{cp}$ and $A_a$ correspond to the gain of the comparator and amplifier, respectively [30].

$$\sigma_{ca} = \sqrt{\sigma_c^2 + \left( \frac{\sigma_a}{A_{cp}} \right)^2 + \sum_{n=1}^{j-1} \left( \frac{\sigma_a}{A_{cp} \cdot A_a^n} \right)^2} \tag{4.3}$$

From (4.3), it is concluded that the largest contribution to the total offset voltage is caused by the comparator and the first two amplifiers. The offset of the following amplifiers can be neglected if the gain of the amplifier stages is larger than or equal to two and the transistors in the amplifiers have equal dimensions. Under this condition, equation (4.3) is simplified to (4.4).

\[ \sigma_{ca} \approx \sqrt{\sigma_c^2 + \left( \frac{\sigma_a}{A_{cp}} \right)^2 \cdot \left( 1 + \frac{1}{A_a^2} \right)} \tag{4.4} \]
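The size of the neglected terms can be checked numerically. The Python sketch below assumes that the $n$-th term of the sum in (4.3) has the denominator $A_{cp} \cdot A_a^n$, which is the reading consistent with (4.4). The offset and gain values are hypothetical.

```python
import math


def chain_offset(sigma_c, sigma_a, a_cp, a_a, n_stages):
    """Input referred offset of comparator plus n_stages amplifiers, cf. (4.3)."""
    total = sigma_c ** 2 + (sigma_a / a_cp) ** 2
    for n in range(1, n_stages):
        total += (sigma_a / (a_cp * a_a ** n)) ** 2
    return math.sqrt(total)


def chain_offset_approx(sigma_c, sigma_a, a_cp, a_a):
    """Simplified expression considering only the first two amplifiers, cf. (4.4)."""
    return math.sqrt(sigma_c ** 2
                     + (sigma_a / a_cp) ** 2 * (1.0 + 1.0 / a_a ** 2))


# Hypothetical example values (assumptions for illustration).
SIGMA_A = 10e-3                       # amplifier offset sigma in V
SIGMA_C = math.sqrt(2.0) * SIGMA_A    # comparator offset sigma, cf. (4.2)
A_CP = 1.0                            # comparator gain
A_A = 2.0                             # gain per amplifier stage
N_STAGES = 5                          # number of amplifier stages

full = chain_offset(SIGMA_C, SIGMA_A, A_CP, A_A, N_STAGES)
approx = chain_offset_approx(SIGMA_C, SIGMA_A, A_CP, A_A)

print(f"full: {full * 1e3:.2f} mV, simplified: {approx * 1e3:.2f} mV")
print(f"relative difference: {(full - approx) / full:.2%}")
```

With a stage gain of two, the simplified expression deviates from the full sum by little more than one percent in this example.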

The impact of the mismatch can be reduced by either increasing the transistor width or by using an additional calibration circuit that can compensate for the offset voltage. Increasing the width of a transistor decreases the bandwidth of a CML circuit as shown in equation (3.12) in Section 3.3.1 since the parasitic capacitance of a transistor is proportional to its width. The load resistance can be reduced to keep the bandwidth constant. This requires a higher power consumption to keep the output amplitude constant. Thus, there is a trade-off between the power consumption and the offset of an amplifier.

Using a calibration circuit also reduces the bandwidth because additional transistors must be added to compensate for the offset voltage. Depending on the actual circuit, the influence can be kept at a minimum (in the range of a few percent). However, a calibration circuit also requires additional registers to store the calibration data. Furthermore, a data bus that is able to address the configuration registers must be developed. This requires an enormous effort in a full-custom design which is not reasonable for a first prototype.

To conclude, the comparator has the greatest impact on the input referred offset voltage. A low comparator offset voltage and a high comparator gain are ideal. Since it is impossible to realise a high comparator gain and to maintain a large bandwidth, the amplifiers must also be optimised for a low mismatch. If calibration is not used in the comparators, the only way to reduce the offset voltage is to increase the transistor width.

### 4.3 Thermometer-to-Binary Encoder

For the realisation of the encoder circuit, the comparator offset voltage must be considered to avoid bubble errors. A bubble error in a thermometer code refers to a single logic zero or a sequence of logic zeros surrounded by one or more logic ones. This must be prevented because it is not defined in the thermometer code. A bubble error occurs, for example, if the switching threshold of a comparator with a lower significant digit is shifted, due to the comparator offset, above the switching threshold of a comparator with a higher significant digit [33, 34].

The standard deviation of the offset voltage of the comparator stage is calculated to be 17 mV by means of equation (4.3); the transistor parameters and the technology constant are taken from the design rule manual of the CMOS technology used. LSB/2 (50 mV) therefore corresponds to a distance of 3 sigma. Thus, 99.7% of the comparators have an offset voltage which is smaller than 50 mV. Considering that the ADC is a prototype, this is found to be sufficient to omit the circuit for the bubble error correction because the probability of bubble errors occurring on an ADC chip is below $1 - 0.997^{4 \cdot 7} \approx 8\%$. The four refers to the number of ADCs and the seven refers to the number of comparators per ADC. As this is a worst case approximation, it is concluded that in at least 92 out of 100 ADC chips no bubble errors occur. A good method to check for bubble errors in a fabricated chip is to measure the transfer curve of the ADC. If bubble errors occur, the transfer function is not monotonically increasing.
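The worst case estimate can be reproduced with the following Python sketch. It assumes, as in the text, that every comparator stays within LSB/2 with a probability of 99.7% and that the comparators are statistically independent.

```python
def chip_bubble_probability(p_within, n_adcs, n_comparators):
    """Upper bound for the probability that at least one comparator on a
    chip has an offset voltage larger than LSB/2."""
    return 1.0 - p_within ** (n_adcs * n_comparators)


P_WITHIN = 0.997     # 3 sigma probability for a single comparator
N_ADCS = 4           # number of time-interleaved ADC channels
N_COMPARATORS = 7    # comparators per 3 bit flash ADC

p_chip = chip_bubble_probability(P_WITHIN, N_ADCS, N_COMPARATORS)
print(f"probability of an affected chip: {p_chip:.1%}")
```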

In the encoder, a 1-of-7 priority encoder generates seven select signals, which are encoded by means of OR-gates into a 3 bit binary code. The truth table of the priority encoder is shown in Table 4.1. The logic functions are summarised in (4.5). The input signals of the priority encoder are labelled $I_0 \ldots I_6$ and the output signals $A_0 \ldots A_6$.

Table 4.1: Truth table of the priority encoder.

<table>
<thead>
<tr>
<th>$I_0$</th>
<th>$I_1$</th>
<th>$I_2$</th>
<th>$I_3$</th>
<th>$I_4$</th>
<th>$I_5$</th>
<th>$I_6$</th>
<th>$A_0$</th>
<th>$A_1$</th>
<th>$A_2$</th>
<th>$A_3$</th>
<th>$A_4$</th>
<th>$A_5$</th>
<th>$A_6$</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
</tbody>
</table>
\[
\begin{aligned}
A_0 &= I_0 \oplus I_1 \\
A_1 &= I_1 \oplus I_2 \\
&\;\;\vdots \\
A_5 &= I_5 \oplus I_6 \\
A_6 &= I_6
\end{aligned} \tag{4.5}
\]

The truth table for the binary conversion is shown in Table 4.2 and the logic functions of the binary encoder are summarised in (4.6). The input signals of the encoder are labelled \(A_0 \ldots A_6\) and the three-bit binary output signals \(Q_0 \ldots Q_2\).

Table 4.2: Truth table of the binary encoder.

<table>
<thead>
<tr>
<th>$A_0$</th>
<th>$A_1$</th>
<th>$A_2$</th>
<th>$A_3$</th>
<th>$A_4$</th>
<th>$A_5$</th>
<th>$A_6$</th>
<th>$Q_0$</th>
<th>$Q_1$</th>
<th>$Q_2$</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
</tr>
</tbody>
</table>

\[
\begin{aligned}
Q_0 &= A_0 + A_2 + A_4 + A_6 \\
Q_1 &= A_1 + A_2 + A_5 + A_6 \\
Q_2 &= A_3 + A_4 + A_5 + A_6
\end{aligned} \tag{4.6}
\]
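The logic functions (4.5) and (4.6) can be checked against all valid thermometer codes with a behavioural Python sketch. It models the logic only and makes no statement about the CML circuit realisation; the function names are illustrative.

```python
def priority_encode(thermo):
    """1-of-7 select signals A0..A6 from the thermometer code I0..I6, cf. (4.5)."""
    if len(thermo) != 7:
        raise ValueError("expected a 7 bit thermometer code")
    select = [thermo[k] ^ thermo[k + 1] for k in range(6)]
    select.append(thermo[6])
    return select


def binary_encode(a):
    """Binary code (Q0, Q1, Q2) from the select signals A0..A6, cf. (4.6)."""
    q0 = a[0] | a[2] | a[4] | a[6]
    q1 = a[1] | a[2] | a[5] | a[6]
    q2 = a[3] | a[4] | a[5] | a[6]
    return q0, q1, q2


def thermometer_to_binary(thermo):
    """Complete encoder: thermometer code to 3 bit binary code."""
    return binary_encode(priority_encode(thermo))


for level in range(8):
    thermo = [1] * level + [0] * (7 - level)
    q0, q1, q2 = thermometer_to_binary(thermo)
    print(thermo, "->", q2, q1, q0)
```

For an input with a bubble, for example `[1, 0, 1, 1, 0, 0, 0]`, several select signals are active at the same time and the OR stage combines them into the code 111, which illustrates why bubble errors must be considered.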

The circuit realisation is depicted in Figure 4.5. On the left-hand side, the priority encoder is realised with six XOR gates and one CML amplifier. The encoding of these signals into the binary code is realised by means of nine OR-gates. The regular structure of the circuit ensures equal delays for all paths. A delay between the signals would reduce the maximum possible clock rate of a flip-flop because it must be ensured that all signals meet the setup time of the flip-flops.

![Thermometer-to-binary encoder](image)

**Figure 4.5:** Thermometer-to-binary encoder.

### 4.4 Synchronization Circuit

To realise a clock synchronous interface for a subsequent digital signal processing unit, a synchronisation circuit has been developed. The circuit in Figure 4.6 is used to synchronise the binary output signal of the four ADC channels to a common clock phase.

![Digital output synchronisation](image)

**Figure 4.6:** Digital output synchronisation.

The phase difference between channel one and four is 270 degrees. Flip-flop stages are used to shift the data in 90-degree steps, resulting in a synchronisation of the two channels after three flip-flop stages. Shifting of the data phase is realised by clocking a flip-flop with a clock signal that is 90 degrees ahead of the data clock.

The phase shift between channel one and two is 90 degrees and the phase shift between channel one and three is 180 degrees. Therefore, channel two is shifted only once and channel three twice. In these channels, data synchronous flip-flop stages are used to keep all four channels synchronous after the phase synchronisation.

Although the data synchronisation can be realised in a more power efficient manner by means of latches, flip-flops are, as described below, better in this case. In Figure 4.7 the timing diagram of a latch-based and flip-flop based synchronisation circuit is shown. In this example, a phase shift of 270 degrees is realised by means of three latch stages and three flip-flop stages, respectively. The output signal of the first latch/flip-flop is the input signal of the second latch/flip-flop and so on. The clock signals (clk $0^\circ$, clk $90^\circ$) of the first two latches/flip-flops are also included in the diagram.

To realise the data synchronisation, the clock signals between two series-connected latches and two series-connected flip-flops, respectively, exhibit a delay of 90 degrees. This corresponds to $1/4 \cdot T$, where $T$ is the period of the clock signal. The waveform of the output voltage of a CML latch exhibits a charging and discharging characteristic of an RC low-pass (cf. Section 3.3.1). The output voltage slope of a latch is determined by its load resistance and its parasitic capacitance at the output.

As illustrated in the diagram, the output amplitude is attenuated after each latch. The main reason for this effect is that the output of a latch does not reach the full output voltage range within the available high period of the clock signal. As shown in the diagram, the output voltage of latch 1 has reached a value of less than half of the maximum output amplitude. The second latch follows this signal with the same delay.

This effect occurs at a high clock frequency when the settling time of the latch is equal to or longer than half of the period of the clock signal. To avoid this, flip-flops are used. As mentioned before, a flip-flop is built using two series-connected latches. The second latch operates with the inverted clock with respect to the first latch. While the first latch follows the input signal, the second latch is in hold mode. After a clock transition, the first latch is in hold mode and the second latch follows the output of the first latch. Thus, the flip-flop delays the output signal by $T/2$, which is effectively the additional time obtained for the signal regeneration. This allows reliable operation at a higher clock frequency compared to a latch-based synchronisation circuit.

Figure 4.7: Timing diagram of a latch-based and flip-flop-based synchronisation circuit.

### 4.5 Clock Divider Circuit

The block diagram of the static clock divider which is realised by using two CML latches is shown in Figure 4.8. The circuit divides the input clock (clk) by a factor of two.

![Clock Divider Circuit Diagram]

**Figure 4.8:** Clock divider circuit.

To increase the switching speed with a large capacitive load at the output of the clock divider flip-flops, additional drivers are necessary. For this purpose, four series-connected CML inverters are used. The dimensions of the first CML inverter in the chain are rather small to keep the parasitic capacitance at the outputs of the latches small. This maximises the operating speed of the clock divider. The size of the CML inverters, and consequently their power consumption, is doubled from one stage to the next. With this approach, the driving capability of the circuit increases with each amplifier stage.

The differential output signal of the CML inverters is converted by the CML-to-CMOS conversion circuit in Figure 4.9 to a CMOS output signal. The circuit is built using two CMOS inverters (CI) and a DC-biased CMOS inverter (T₁/T₂), which is connected between the output of the first CMOS inverter and the input of the second CMOS inverter.

The DC-biased CMOS inverter allows the duty-cycle to be adjusted by shifting the switching point of the second CMOS inverter in either direction. By tuning the voltage $V_{\text{adj}}$, the duty cycle can be adjusted to 50%.

The output of the CML-to-CMOS conversion circuit is buffered with a CMOS driver chain. The driver chain is built using eight series-connected CMOS inverters. After each CMOS inverter, the transistors are enlarged by a factor of 1.4. This minimises the increase of the capacitive load for each CMOS inverter and allows a large capacitive load to be driven at the output of the chain.
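
To make the two tapering schemes concrete, the following sketch computes the relative stage sizes of both chains. It only evaluates the geometric progression stated in the text; sizes are expressed relative to the first stage of each chain, and the function name is a hypothetical helper.

```python
def stage_sizes(n_stages, taper):
    """Relative size of each stage in a tapered driver chain (first stage = 1)."""
    return [taper ** k for k in range(n_stages)]


cml_chain = stage_sizes(4, 2.0)    # four CML inverters, size doubled per stage
cmos_chain = stage_sizes(8, 1.4)   # eight CMOS inverters, factor 1.4 per stage

print("CML stage sizes:", cml_chain)
print("CMOS stage sizes:", [round(s, 2) for s in cmos_chain])
print("last CMOS inverter relative to first:", round(cmos_chain[-1], 2))
```

Under these assumptions, the last CML inverter is eight times and the last CMOS inverter roughly ten times as large as the first stage of the respective chain.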

In Figure 4.8, the phase shift between the clock signals clk₁ and clk₂ is $T/4$, where $T$ is the period of the clock signals. Thus, a four-phase CMOS clock signal is obtained at the output of the clock divider circuit, since $\overline{\text{clk}_1}$ and $\overline{\text{clk}_2}$ are the inverse signals of clk₁ and clk₂.

### 4.6 Bootstrap Circuit

The bootstrap circuit is shown in Figure 4.10. The circuit provides the clock signals for the transfer gates of the sample and hold circuit. The input signals clk and clk̅ are both fed from the CMOS driver chain in the clock divider circuit.

The purpose of the bootstrap circuit is to generate a voltage that is higher than the supply voltage. The clock signals clk and \( \overline{clk} \) operate in phase opposition. Assuming that the voltage \( V_1 \) is zero volts while the clock signal clk is in the low state, \( C_1 \) is charged to \( V_b \) by means of transistor \( T_1 \), whose gate is at a high potential. When the clock signal clk switches to the high state, the voltage \( V_2 \) exceeds the input voltage \( V_1 \) by the bootstrap voltage \( V_b \).

This only applies to the ideal case when the parasitic capacitance \( C_{p2} \) is zero, because \( C_{p2} \) forms a voltage divider together with the capacitor \( C_1 \). The output voltage \( V_2 \) of this voltage divider is given by (4.7), where \( \Delta V_1 \) is the input voltage swing [35]. For symmetry reasons, the output swing of output \( \overline{clk} \) is the same. To achieve the largest possible voltage swing, \( C_{p2} \) must be small compared to \( C_1 \). In this application, \( C_{p2} \) depends on the gate capacitance of the sample and hold circuit as well as its parasitic wiring capacitance. Therefore, \( C_1 \) must be sufficiently large, e.g. \( C_1 = 10 \, C_{p2} \).

\[
V_2 = \Delta V_1 \cdot \frac{C_1}{C_1 + C_{p2}} + V_b \tag{4.7}
\]
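
As a worked example of (4.7), the sketch below evaluates the capacitive divider for the ratio \( C_1 = 10 \, C_{p2} \) mentioned above. The voltage values are placeholders chosen for illustration only and are not taken from the design; the function name is hypothetical.

```python
def bootstrap_output(delta_v1, v_b, c1, c_p2):
    """Output voltage V2 of the bootstrap circuit according to (4.7)."""
    return delta_v1 * c1 / (c1 + c_p2) + v_b


# Illustrative values only (assumptions, not design data).
delta_v1 = 1.0   # input voltage swing in volts
v_b = 0.5        # bootstrap voltage in volts

for ratio in (1, 10, 100):
    v2 = bootstrap_output(delta_v1, v_b, c1=float(ratio), c_p2=1.0)
    print(f"C1 = {ratio:3d} * Cp2 -> V2 = {v2:.3f} V")
```

With \( C_1 = 10 \, C_{p2} \), the divider passes \( 10/11 \), i.e. about 91 %, of the input swing to the output.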

The capacitance \( C_{p1} \) is the parasitic capacitance to the substrate of capacitance \( C_1 \). In the layout only metal-to-metal capacitors are available. Therefore, the capacitor plate close to the substrate must be connected to the input port, in order to ensure that \( C_{p1} \) is at the input of the circuit. Otherwise, the output amplitude of the circuit will not reach the desired value because \( C_{p1} \) and \( C_{p2} \) would add up to a larger parasitic output capacitance.

The drawback of the entire circuit is a relatively high area requirement, in comparison with the sample and hold circuit itself. The reason is that on the one hand, capacitance \( C_1 \) is relatively large in order to realise the required voltage swing. On the other hand, the large parasitic capacitance \( C_{p1} \) also requires a relatively large input driver to achieve a low rise time at the output \( \overline{clk} \).

### 4.7 Layout Implementation

The floorplan of the ADC core is shown in Figure 4.11. The floorplan reflects the actual size of the components realised on the ADC chip. The core size is about 400 × 400 \( \mu \text{m}^2 \). All four channels are placed close to each other to minimise the length of the wiring. Due to its intrinsic resistance and its capacitance to the substrate, a wire acts as a low-pass filter. To avoid a bandwidth limitation, short wires are used for the interconnection of the ADC building blocks.

Figure 4.11: Floorplan of the ADC core.

The differential analogue input signal is connected by means of a matched microstrip transmission line to the 50 Ohm termination resistors. From these resistors the input signal is connected to the input of the sample and hold circuits using equally long lines. These "T-shaped" lines ensure that the signal delay from the 50 Ohm termination resistors to the input of all four sample and hold circuits is the same. The CMOS clock drivers for the sample and hold circuit are very large compared to the sample and hold circuit itself in order to provide enough driving capability.

Each ADC core is built as a kind of “horizontal stack”, starting with the sample and hold (S&H) circuit and ending with the thermometer-to-binary encoder. The clock drivers which are located beside each ADC core exhibit the same delay as the signal path. The purpose is to maximise the setup time of the flip-flops in the signal path. The synchronisation circuit is placed close to the outputs of the four ADC channels. The necessary clock signals are obtained from the ADC channels.

Figure 4.12 shows the floorplan of the ADC chip. The core is placed close to the pads connected to the analogue input of the ADC to minimise the attenuation of the input signal. The clock divider is placed in the right lower corner of the chip to minimise the influence on the ADC core. All clock lines originating from the clock divider must exhibit the same length to maintain the phase shift of 90 degrees between the clock signals. This is realised with an additional loop in the shorter clock lines.

The digital output signals of the four channels are wired to the output pads on the left and right side of the chip. Due to the relatively large distance of up to 2 mm between the ADC core and the output pads, intermediate buffers are used to avoid the use of transmission lines. In this frequency range and for this wiring distance, transmission lines would otherwise be necessary to avoid signal reflections. The drawback of transmission lines is their large space requirement, especially when a phase difference must be avoided, since it is difficult to route them with equal length when starting at the ADC core. With the intermediate drivers, the longest wire is about 1 mm long.

The first intermediate line driver stage is built using three series-connected CML amplifiers per lane. The clock signal is distributed to both sides of the chip and also amplified in the first buffer stage with three CML amplifier stages to realise the same delay as in the data path. In the next stage, the signals are split into an upper and lower buffer stage with preceding re-sampling. The output of this buffer stage is re-sampled again in the actual output driver, which is placed next to the output pads. After re-sampling, the signal is amplified using five CML inverters which are designed to drive a 50 Ohm load.

The ADC chip exhibits two clock outputs. The lower clock output provides a data synchronous output clock, whereas the upper clock output provides a half-rate clock. Thus, both full-rate and half-rate data transmission are supported. At the top of the chip, the two reference voltages ($V_1$, $V_2$) and the bootstrap voltage ($V_b = V_3$) must be applied. Additionally, some control inputs are available to adjust the bias of the CML circuits ($V_4$, $V_5$, $V_6$) and the duty cycle ($V_7$, $V_8$) of the sampling clock.

Blocking capacitors are placed in the free space between the ADC core and the upper pads as follows: 250 pF are used to block the supply voltage, 30 pF are used to block the reference voltages and 60 pF are used to block the bootstrap voltage and the bias inputs.

## 5 The Measurement Environment

This chapter deals with the measurement environment and measurement setup. In the first section, an FPGA-based measurement environment is described in detail starting with an overview of the digital system implemented in the FPGA. Subsequently, the components of the design are explained with a focus on the high-speed interfaces of the FPGA, which are used to interface the 3 bit ADC developed in this work.

The FPGA puts certain limits on the measurement, which are discussed in the following section in terms of the pros and cons of an FPGA-based measurement versus a measurement with a 4-channel real-time scope.

In the last section of this chapter, the measurement setup is described in detail. Special attention is given to the RF board which contains the ADC. The layout of this board, the mechanical setup and the frequency response are described in detail. Furthermore, the additionally used measurement equipment, the realisation of the power supply and the algorithms that are necessary for evaluation of the measurement data, are explained.

### 5.1 The FPGA-based Measurement System

The key component of the FPGA-based measurement system is the RocketIO Characterisation Platform ML423 [36] by Xilinx, which is shown in Figure 5.1. The board is equipped with the Xilinx Virtex4 FPGA XC4VFX100 [37].

The board provides 20 high-speed transceivers called RocketIO transceivers [38]. Each transceiver contains a differential transmitter and a differential receiver which can operate up to a data rate of 6.5 Gbit/s. Twelve of these receivers are connected to the 3 bit ADC in order to capture its output data. The digital
design, which is implemented in the FPGA and realised in VHDL, is described in the following subsection.

#### 5.1.1 The VHDL Design

The block diagram of the VHDL design is shown in Figure 5.2. The design is split into seven blocks and two clock domains. The 50 MHz clock domain is used for the internal data bus. The high-speed clock domain is used for the data transmission with the RocketIO interfaces. The receivers use a 32 bit data bus, which provides the data received from a single high-speed lane. Therefore, the internal clock rate is $1/32$ of the data transmission rate. The same applies for the transmitters, which use a 32-to-1 serialiser to realise the high data transmission rate.

The internal data bus is used for the communication with a personal computer (PC) by means of the serial bus RS232. This allows control of the measurement environment and transfer of the measurement data. The RS232 protocol is decoded in the serial interface unit. An overlying proprietary protocol is decoded in the protocol decoder unit. The proprietary protocol allows the data to be transferred to a specified data address and a specified hardware address. The units in the design are each addressed with a unique hardware address, so that they can be differentiated from each other. Communication with the PC is organised according to the master-slave principle, in which the PC acts as master.

The gigabit interface block contains the RocketIO transceivers. The receive unit contains random access memories (RAM), which allow storage of the measurement data. The same applies for the transmit unit, which uses RAMs for the storage of data intended for transmission via the RocketIO interface. Each receive and transmit unit in the design is associated with a specific RocketIO transceiver. Therefore, the number of instantiated receive and transmit units must correspond to the number of RocketIO receivers and transmitters used.

Control of the measurement flow is specified in the measurement control unit. The transceivers allow some of their internal parameters to be adjusted, such as the sampling instant for the incoming data. The transceiver configuration unit provides access to this control port, which is called the dynamic reconfiguration port (DRP). A detailed description of these blocks is provided in the following subsections.

Figure 5.2: Block diagram of the VHDL design.

The Receive Unit

The block diagram of the receive unit is shown in Figure 5.3. The incoming data from the RocketIO receiver is stored in the RAM, which is implemented as a dual port RAM. This kind of RAM has two independent clock, address and control inputs, thereby allowing simultaneous reading and writing of the data.

![Block diagram of the receive unit.](image)

**Figure 5.3:** Block diagram of the receive unit.

The storing process is started when the signal `start_measurement` is set to logic high. This signal originates from the measurement control unit, which activates the signal when the appropriate command is received from the PC. The control signal operates in the 50 MHz clock domain, as does the data bus, which includes the address decoder and the interface for reading data from the RAM. The finite state machine (FSM) that controls the writing into the RAM, the write interface of the RAM and the address counter operate in the high-speed clock domain.

The sync unit is necessary for synchronisation between the 50 MHz and the high-speed clock domain, which is implemented by means of a dual-ranked flip-flop. Dual-ranked flip-flops are two clock synchronous series-connected flip-flops. This cascading greatly reduces the probability of metastable states at the output of the synchroniser (see Section 3.3.3).

During the measurement process, the address counter of the block RAM increases while the measurement data is being stored into the block RAM. The write process is stopped when the unit “check free space” determines that the RAM is fully written. This is realised by comparing the current write address with the storage capacity of the RAM.

Data from the RAM can be read by means of the internal data bus. The FSM which controls the read-out flow is also implemented in the control unit. It operates in the 50 MHz clock domain and is independent of the other FSM. When a read request (`bus_data_out_rq`) is received from the protocol decoder, the data in the block RAM is forwarded to the output data bus. The address bus is used to address the block RAM directly, whereas the hardware address bus is used to address a specific interface. When the data on the output data bus is valid, the signal `bus_data_out_av` is set to logic high to indicate to the protocol decoder unit that the data is available.

The Transmit Unit

The block diagram of the transmit unit is shown in Figure 5.4. An address counter is used to read the content of the block RAM cyclically. The read data is passed to a barrel shifter, which is used to shift the output data by one or more bit positions. The barrel shifter, which allows data to be shifted up to 31 bits with respect to a reference data channel, is used to synchronise the output data of different RocketIO transmitters.

If this is not sufficient, then a 32 bit shift can be realised by stopping the address counter for a single clock cycle. For data shifting in the opposite direction, the value of the address counter can be increased by two. This process is controlled manually by means of switches on the FPGA board and can be repeated until the necessary data shift is achieved.
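
The alignment mechanism can be modelled in a few lines. The sketch below is a behavioural model, not the VHDL implementation: it assumes that the RAM content is transmitted most significant bit first and that the barrel shifter cuts a 32 bit window out of two consecutive RAM words. The parameter `addr_offset` is a hypothetical stand-in for stalling or advancing the address counter by whole words.

```python
WORD = 32
MASK = (1 << WORD) - 1


def shifted_stream(ram, shift_bits, n_words, addr_offset=0):
    """Read n_words from a cyclically addressed RAM, shifted by 0..31 bits.

    addr_offset models whole-word shifts obtained by stalling or advancing
    the address counter (assumption: one unit corresponds to 32 bits).
    """
    assert 0 <= shift_bits < WORD
    out = []
    for n in range(n_words):
        cur = ram[(n + addr_offset) % len(ram)]
        nxt = ram[(n + addr_offset + 1) % len(ram)]
        window = (cur << WORD) | nxt  # 64 bit window, earlier bits on top
        out.append((window >> (WORD - shift_bits)) & MASK)
    return out


ram = [0x12345678, 0x9ABCDEF0, 0x0F0F0F0F]
for k in (0, 4, 31):
    print(k, [f"{w:08X}" for w in shifted_stream(ram, k, len(ram))])
```

A shift of 4 bits moves every word by one hexadecimal digit, which makes the effect easy to verify by eye in the printed output.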

The content of the RAM can be updated arbitrarily during the operation using the internal data bus as described for the receive unit.

Figure 5.4: Block diagram of the transmit unit.

The Gigabit Interface

The RocketIO interface is embedded into the gigabit interface unit in Figure 5.5. Besides the interfaces, the unit contains a reset logic, a clock phase synchronisation and instances for the reference clock inputs.

Figure 5.5: Block diagram of the gigabit interface.

Each RocketIO interface operates with a self-generated clock from an internal phase locked loop (PLL) circuit. The PLL is locked to a reference clock signal which must be provided from an external clock signal generator. To ensure the correct functioning of the RocketIO interface, a reset procedure is implemented according to the Xilinx specification. The reset is performed at the startup of
the FPGA and if the PLL in the RocketIO interface loses the lock to the external reference clock.

Due to the independent clock sources, which each RocketIO interface uses, the output phase of the data from each interface is different. The clock phase synchronisation allows a single clock signal to be used for the transmit and receive units. This is realised with a dual rank RAM, which has a first in, first out (FIFO) configuration.

The reference clock units provide the clock signals for the RocketIO interfaces. The Virtex4 FPGA exhibits four reference clock inputs. The reference clock units are used to assign the desired clock input to a RocketIO interface. Due to the architecture of the FPGA, at least two inputs are necessary to supply more than 10 RocketIO interfaces with a reference clock signal.

Figure 5.6 shows the simplified block diagram of the Xilinx RocketIO interface, which is a hard-wired component in the FPGA. It contains a serialiser, a deserialiser, an encoder, a decoder, PLLs and a control interface (DRP).

Figure 5.6: Simplified block diagram of the RocketIO interface [35].

The serialisers and deserialisers perform the serialisation and deserialisation between the internal 32 bit data bus and the serial high-speed link. The encoders and decoders are intended for the use of standardised communication protocols. In this design, a communication protocol is not used, therefore the encoders and decoders are bypassed.

The external reference clock signal is used to generate the clock by means of the integrated PLLs and clock dividers. For the receive channel, an integrated clock
and data recovery circuit (CDR) can be used to optimise the sampling instant for the incoming data. The CDR can compensate for phase shifts between the reference clock and the incoming data under the condition that there is at least one zero-one transition in a 64 bit data stream. Since this cannot be ensured for the output data of the ADC, the CDR is deactivated to prevent losing the PLL lock. This requires a manual adjustment of the ideal sampling instant by means of the transceiver configuration unit and the DRP port of the transceiver [38].

The Transceiver Configuration Unit

The block diagram of the transceiver configuration unit is shown in Figure 5.7. The unit allows the ideal sampling instant of the RocketIO receivers to be adjusted. It is separated into the following three blocks: a PRBS generator unit, an error tester unit and a DRP module.

The PRBS generator unit generates a deterministic pseudo random bit sequence (PRBS). The generated PRBS data can be applied to the RocketIO transmitters to realise a multi-channel PRBS generator. The implemented test patterns are the same as in commercially available test pattern generators [40].

The bit error rate of a RocketIO receiver channel can be measured by means of the error tester unit if a PRBS input signal is applied at the input of this channel. For estimation of the bit error rate, the unit uses an internal reference PRBS generator to compare the received signal with the internal reference signal. The reference generator is initialised automatically with an arbitrary data vector from
the receiver channel. The bit errors are counted by means of a binary counter. If the bit errors exceed a previously defined error rate, a new arbitrary data vector is chosen. At least one correct data vector is necessary to obtain a reliable bit error rate measurement. The incoming data vectors from the RocketIO interface have a width of 32 bit, thus at least 32 correct bits must be received. The bit error rate is defined as the number of erroneous bits divided by the number of received bits, therefore the bit error rate must be below $1/32$.
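
The self-seeding principle of the error tester can be sketched as follows. The text does not state which PRBS polynomials are implemented, so the sketch assumes a PRBS7 sequence with the recurrence b[n] = b[n-6] XOR b[n-7]; the function names, the 32 bit vector handling and the use of the last seven bits of a vector as seed are illustrative assumptions.

```python
def prbs7(seed_bits, n):
    """Generate n bits that continue a PRBS7 sequence after 7 seed bits (oldest first)."""
    state = list(seed_bits)
    out = []
    for _ in range(n):
        new = state[0] ^ state[1]   # b[n] = b[n-7] XOR b[n-6]
        out.append(new)
        state = state[1:] + [new]
    return out


def measure_ber(received, word=32, max_ber=1 / 32):
    """Seed the reference generator from a received data vector and count bit errors.

    If the error rate exceeds max_ber, the seed is assumed to be corrupted
    and the next data vector is tried. Returns None if no vector works.
    """
    for start in range(0, len(received) - word, word):
        seed = received[start + word - 7:start + word]
        tail = received[start + word:]
        reference = prbs7(seed, len(tail))
        errors = sum(r != e for r, e in zip(tail, reference))
        ber = errors / len(tail)
        if ber < max_ber:
            return ber
    return None


tx = [1] * 7 + prbs7([1] * 7, 1000)
rx = list(tx)
rx[500] ^= 1                      # one bit error in the payload
print("measured BER:", measure_ber(rx))
```

If the bit error falls into the seed vector instead, the reference generator runs out of step, roughly every second bit mismatches, and the sketch moves on to the next data vector.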

The optimisation flow for the sampling instant of the RocketIO interface is implemented in the DRP module. The DRP module has access to the DRP bus of the RocketIO receiver that is being optimised. The ideal parameters for the sampling instant adjustment are determined with the brute force method, i.e. all settings are tried in turn. The value with the lowest bit error rate is automatically set after an optimisation flow is finished. The start of the optimisation flow is initiated by a command from the PC. After an optimisation flow is finished, the bit error rate can be read out from the error tester unit to verify the reliability of the receiver channel. An optimisation flow for an interface lasts up to one second.
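
A minimal sketch of such a brute-force search is given below. The DRP register access is not modelled: the callable `measure_ber` is a hypothetical stand-in for writing one sampling-instant setting and reading the resulting bit error rate from the error tester, and the range of 64 settings is an arbitrary assumption.

```python
def optimise_sampling_instant(settings, measure_ber):
    """Try every setting and return (best_setting, best_ber)."""
    results = {s: measure_ber(s) for s in settings}
    best = min(results, key=results.get)
    return best, results[best]


def fake_ber(setting, optimum=20):
    """Hypothetical bathtub-like error curve used instead of real hardware."""
    return min(0.5, abs(setting - optimum) / 64)


best, ber = optimise_sampling_instant(range(64), fake_ber)
print("best setting:", best, "BER:", ber)
```

Because every setting is measured, the run time grows linearly with the number of settings, which fits a flow that is started once per interface by a command from the PC.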

#### 5.1.2 Data Synchronisation and Evaluation

Synchronisation of the data channels in high-speed parallel data buses is mandatory for correct interpretation of the data. The synchronisation method depends on the architecture and coding of the channels. When a common communication standard is used in the channel coding, such as Infiniband or PCI Express, autonomous synchronisation with the interfaces on the Virtex4 FPGA is possible. Since implementation of a communication standard requires a complex logic in the transmitter, this cannot be realised in a full custom chip design within a reasonable time. Thus, synchronisation of the received data must be realised in a different way.

To understand the synchronisation problem, a retrospective view of the architecture of the receiver in Figure 5.6 is necessary. As shown in the figure, each RocketIO receiver uses its own PLL as a clock source, with the consequence that all receivers exhibit a different clock phase. The second issue is the length of the
RF wires on the FPGA board, which is not the same for different receivers. This also results in a phase difference between the channels.

Another issue that is caused by the architecture of the interfaces is the reset procedure of a RocketIO interface. It is impossible to perform a synchronous reset of all de-serialisers in the interfaces because the duration of the reset procedure, which is defined by the vendor, is not deterministic. This results in a difference of multiple bits between the channels.

The phase shift can be corrected by means of the sampling instant optimisation, which is implemented in the transceiver configuration unit. The multi-bit delay between the channels is synchronised off-line. The delay between the channels can be calculated because the input signal that was applied during the measurement at the input of the ADC is known.

For characterisation of the dynamic performance of an ADC, a sinusoidal input signal is applied at the analogue input of the converter. Subsequently, the SNDR of the digitised sinusoidal input signal is calculated by means of a discrete Fourier transform (DFT), which is performed on a PC. The data synchronisation is realised by shifting all data channels until the SNDR is maximised. The input signal frequency must be chosen in a way that ensures coherent sampling, so that the energy of the fundamental and of each of its harmonics falls within a single frequency bin of the DFT spectrum.
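
The off-line synchronisation can be sketched with synthetic data. The example below is not the evaluation software used for the measurements: it assumes four interleaved channels, an ideal 3 bit quantiser, a record of 64 samples with the signal in bin 5, and a search range of ±2 samples per channel. Cyclic rotation is used for brevity, which is only valid because the synthetic record is periodic; all names are hypothetical.

```python
import cmath
import math
from itertools import product

N_CH = 4  # number of interleaved ADC channels


def rotate(seq, s):
    """Cyclically shift a channel stream by s samples."""
    s %= len(seq)
    return seq[s:] + seq[:s]


def interleave(channels, shifts):
    """Re-interleave the channel streams after applying per-channel shifts."""
    rotated = [rotate(ch, s) for ch, s in zip(channels, shifts)]
    n = len(channels[0])
    return [rotated[k % N_CH][k // N_CH] for k in range(N_CH * n)]


def sndr_db(x, s_bin):
    """SNDR from the signal bin of the DFT and the total AC power."""
    n = len(x)
    mean = sum(x) / n
    ac = [v - mean for v in x]
    total = sum(v * v for v in ac) / n
    x_s = sum(v * cmath.exp(-2j * math.pi * s_bin * k / n) for k, v in enumerate(ac))
    signal = max(2.0 * abs(x_s) ** 2 / n ** 2, 1e-30)
    noise = max(total - signal, 1e-30)
    return 10.0 * math.log10(signal / noise)


def find_shifts(channels, s_bin, max_shift=2):
    """Keep channel one fixed and search the shifts that maximise the SNDR."""
    cand = range(-max_shift, max_shift + 1)
    return max(product([0], cand, cand, cand),
               key=lambda sh: sndr_db(interleave(channels, sh), s_bin))


D_L, S_BIN = 64, 5
codes = [min(7, int((math.sin(2 * math.pi * S_BIN * k / D_L) + 1.0) * 4.0))
         for k in range(D_L)]
channels = [codes[c::N_CH] for c in range(N_CH)]
captured = [rotate(ch, d) for ch, d in zip(channels, (0, 1, -2, 2))]

shifts = find_shifts(captured, S_BIN)
print("shifts:", shifts, "SNDR:", round(sndr_db(interleave(captured, shifts), S_BIN), 2), "dB")
```

Only relative shifts matter, which is why the first channel is kept fixed; a common shift of all channels leaves the SNDR unchanged.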

To ensure coherent sampling, the relation in (5.1) must be used to calculate the input signal frequency, where $D_l$ is the length of the DFT, $S$ is the signal bin, $f_s$ is the sampling frequency of the ADC and $f_{sg}$ is the input signal frequency. $S$ must be an odd number and $D_l$ must be a power of two.

$$f_{sg} = S \cdot \frac{f_s}{D_l} \quad (5.1)$$

The parameter $S$ also reflects the number of signal periods in a data stream with $D_l$ samples. If $S$ is even, $S$ and $D_l$ share a common factor, and the period of the sampled sequence is effectively reduced from $D_l$ time instants to $D_l/\gcd(S, D_l)$ time instants (that is, $D_l/S$ if $S$ is itself a power of two). This results in a lower accuracy of the SNDR measurement because the number of different input voltages is reduced in the same way. Errors which occur only for a certain input voltage, such as bubble errors, might not occur if the number of input voltages is too low. Thus, if the input signal is not sufficiently randomised by means of a long DFT period, the result of the SNDR measurement might appear to be better than it actually is. In order to obtain a sufficiently randomised input signal, $D_l$ should be larger than or equal to the value given in (5.2), where $n$ is the nominal resolution of the ADC in number of bits [18, 42].

\[
D_l = 4 \cdot 2^n \tag{5.2}
\]
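
The relations (5.1) and (5.2) can be checked numerically with a short sketch. The function names are hypothetical. The number of distinct sampling phases is computed as $D_l/\gcd(S, D_l)$, which equals $D_l$ for every odd $S$ when $D_l$ is a power of two. The example values of 4.48 GS/s and a 128-point DFT correspond to the example measurement described in Section 5.3.

```python
from math import gcd


def coherent_frequency(s_bin, f_s, d_l):
    """Input signal frequency for coherent sampling according to (5.1)."""
    return s_bin * f_s / d_l


def distinct_phases(s_bin, d_l):
    """Number of different sampling phases within one record."""
    return d_l // gcd(s_bin, d_l)


def min_dft_length(n_bits):
    """Minimum DFT length according to (5.2)."""
    return 4 * 2 ** n_bits


f_s, d_l = 4.48e9, 128
for s_bin in (141, 140, 64):
    print(s_bin, coherent_frequency(s_bin, f_s, d_l) / 1e9, "GHz,",
          distinct_phases(s_bin, d_l), "distinct phases")
print("minimum DFT length for 3 bit:", min_dft_length(3))
```

An odd signal bin keeps all 128 sampling phases distinct, whereas an even bin such as 140 or 64 reduces the number of distinct phases and thus the number of input voltages that are exercised.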

### 5.2 FPGA Versus Real-Time Scope Measurements

The 3-bit ADC in this work requires 12 high-speed interfaces with a data transmission rate of 9 Gbit/s for operation at 36 GS/s. As mentioned above, the data transmission rate of a high-speed interface in the Virtex4 FPGA is limited to about 6.5 Gbit/s. This limits the FPGA measurements to about 25 GS/s.
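
The figures in this paragraph follow from simple rate arithmetic, which the sketch below reproduces. It assumes that the 3 bit output of each sample is distributed evenly over the 12 lanes; the function names are hypothetical. The 32 bit word width is the receiver bus width stated in Section 5.1.1.

```python
def lane_rate_gbps(sample_rate_gsps, bits=3, lanes=12):
    """Serial data rate per lane for a given ADC sampling rate."""
    return sample_rate_gsps * bits / lanes


def max_sample_rate_gsps(lane_limit_gbps, bits=3, lanes=12):
    """Upper bound of the sampling rate for a given lane rate limit."""
    return lane_limit_gbps * lanes / bits


def word_clock_mhz(lane_rate, width=32):
    """Internal clock of the 32 bit parallel bus for a lane rate in Gbit/s."""
    return lane_rate * 1e3 / width


print("lane rate at 36 GS/s:", lane_rate_gbps(36), "Gbit/s")
print("bound at 6.5 Gbit/s per lane:", max_sample_rate_gsps(6.5), "GS/s")
print("word clock at 6.5 Gbit/s:", word_clock_mhz(6.5), "MHz")
```

The computed bound of 26 GS/s is the arithmetic limit for a lane rate of exactly 6.5 Gbit/s; the text quotes about 25 GS/s for the FPGA measurements.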

The limited speed of the FPGA interfaces requires a different approach for a measurement of up to 36 GS/s. A 4-channel real-time scope with a sampling rate of 40 GS/s and a bandwidth of 16 GHz has been chosen to overcome the speed problem of the interfaces [43].

Due to the limited number of scope channels, the ADC channels are measured in four consecutive measurements. Therefore, only the single channel effective resolution can be measured with the scope. The drawback of this measurement method is the loss of phase information that occurs when only one channel is measured. As shown in Section 2.4, the sampling instants must be equidistant to achieve a high effective resolution with the TIADC. However, the timing error between the channels cannot be estimated from the single channel measurements because the other channels are not traced.

In contrast to the timing error, the gain and offset error can be estimated from the results of the DFT of the individual channels. The variation of the signal amplitudes is found in the signal bin and the variation of the offsets is found in the DC signal bin [44].

### 5.3 The Measurement Setup

The measurement setup is depicted in Figure 5.8. It contains an ADC board, a level shifter board, the FPGA board ML423, two sinusoidal signal generators and a DC power supply board.

The ADC board contains the ADC which is intended for measurement. It is described in detail in the next subsection.

The level shifter board is required to adapt the output levels of the ADC to the FPGA input levels. The FPGA requires input voltage levels of around 1.5 V above the output voltage levels of the ADC. For this purpose, twelve level shifters, which can operate up to 10 Gbit/s, are mounted on the board.

![Diagram of Measurement Setup](image-url)

The power supply board is used to block the power supply of the ADC board and generate the bias voltages for the ADC. All bias voltages are derived from the power supply voltage of the ADC, which is supplied from a single voltage source [46]. Resistive voltage dividers are used to generate these bias voltages. The advantage of the voltage divider compared to additional power supply regulators is that all voltages ramp-up together with the power supply voltage. This prevents incorrect bias voltages from damaging the chip.

The analogue input signal for the SNDR measurement is generated by a sinusoidal signal generator [47] and balanced with a 180°-hybrid [48]. The amplitude is kept constant at the maximum input voltage range over the measured frequency range by compensating the losses of the input transmission lines up to the chip. The required DC offset is realised by means of bias-tees [49].

The clock signal for the ADC board is generated by another sinusoidal signal generator [50]. The clock generator is synchronised to the input signal generator via the 10 MHz reference ports. The clock signal for the FPGA is provided by an external clock divider [51], which is fed by the clock output of the ADC. Thus, the whole system is clock synchronous to the clock generator.

The real-time output data from the ADC is captured by the FPGA for measurements up to 25 GS/s. The measurements up to 36 GS/s are performed with a real-time scope; for this purpose, three of the digital outputs are connected to the real-time scope. The captured data is transferred to a PC, where the data synchronisation and data evaluation are performed.

Figure 5.9 contains an example of an FPGA measurement. The time domain signal and the corresponding frequency spectrum, which is calculated by means of a 128-point DFT, are shown. The figure contains the results of a single channel measurement at 4.48 GS/s. The input signal frequency is 4.935 GHz. Since the input signal is in the third Nyquist band, the signal bin is found in the calculated frequency spectrum at 4935 MHz – 4480 MHz = 455 MHz.
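
The aliasing in this example can be reproduced with a few lines. The sketch works in integer MHz and uses hypothetical function names; the DFT bin index is a value derived here from the stated numbers and is not quoted from the measurement.

```python
def nyquist_band(f_in, f_s):
    """Index of the Nyquist band (1 = baseband) that contains f_in."""
    return int(2 * f_in // f_s) + 1


def alias_frequency(f_in, f_s):
    """Frequency at which f_in appears in the first Nyquist band."""
    f = f_in % f_s
    return min(f, f_s - f)


def dft_bin(f, f_s, d_l):
    """DFT bin index of frequency f for a record of d_l samples."""
    return f * d_l / f_s


f_in, f_s, d_l = 4935, 4480, 128   # MHz, MHz, samples
f_alias = alias_frequency(f_in, f_s)
print("Nyquist band:", nyquist_band(f_in, f_s))
print("alias frequency:", f_alias, "MHz")
print("DFT bin:", dft_bin(f_alias, f_s, d_l))
```

With a bin spacing of 4480 MHz / 128 = 35 MHz, the input frequency corresponds to $S = 141$ in (5.1), which is odd as required for coherent sampling.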

#### 5.3.1 ADC Board

The ADC chip is mounted on a Taconic RF-60A substrate [52] and bonded to RF transmission lines, which are routed in a star-like layout. The assembled ADC board is shown in Figure 5.10.

The dimensions of the board are 140 × 140 mm². The RF connectors are SMD-type connectors called SMP [53]. A 26-pole pin socket is used to connect the supply and bias voltage inputs to the power supply board. SMD blocking capacitors are placed about 20 mm left of the chip. The analogue input is on the right-hand side. A Peltier element [54] is placed below the ADC to improve cooling of the chip. The Peltier element is fixed to an aluminium body which is mounted below the PCB. The aluminium body has a thickness of about 10 mm and ensures that the assembly is able to withstand mechanical stress.

The Peltier element is necessary to improve thermal coupling. Although vias are placed directly under the ADC chip, this is not sufficient for several reasons.
Figure 5.10: The ADC board which is used in the measurements.

Figure 5.11 shows a cut-out from the layout of the ADC board where the ADC chip is placed. In this figure, the substrate is transparent, thus the lower layer metallisation is visible.

One issue is that the vias are not filled, thus only 15–20 $\mu$m copper cladding improves the thermal conductivity of the PCB. The PCB itself has a very low thermal conductivity of 0.54 W/m/K compared to copper which has 401 W/m/K; the chip glue, which is necessary for fixing the ADC to the PCB, further degrades the thermal coupling.

The metallised area below the ADC is also used for the negative power supply, which is connected from the back side of the PCB. For this purpose, there is a cut-out in the ground layer below the bias voltages. On the top layer, the ring around the ADC realises the ground connection to the ground plane on the back side of the PCB. The interconnection of the PCB and the ADC chip is realised with wire bonding (Figure 5.12 shows the bonded ADC chip).

The ground ring allows a pitch of 150 $\mu$m to be realised for the digital output bus, which is also the minimum pitch the manufacturer is able to achieve on the PCB. Since the output configuration of the data bus is GSSG, every third bond on the digital outputs is bonded to the ground ring. The pitch on the ADC chip is 100 µm; bonding the ground connections to the ring is therefore a way to adapt to the larger PCB pitch.

Figure 5.11: Layout cut-out from the centre of the ADC board.

Figure 5.12: The bonded ADC chip.

**Cable and Board Attenuation**

A constant input amplitude over the input frequency range is important for accurate characterisation of the ADC. An amplitude that is too low results in a decrease of the effective resolution because the full input voltage range of the ADC is not used. With half the input amplitude (-6 dB), for example, only half of the comparators are used, thus the resolution decreases by one bit. If the input amplitude is too high, additional harmonics lead to a decrease of the effective resolution; this effect is called clipping. When clipping occurs, the sinusoidal input signal becomes a square-like signal, thus odd-order harmonics are generated.
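
The relation between input amplitude and usable resolution can be illustrated with a short calculation. This sketch assumes an ideal flash ADC in which the number of exercised comparators scales linearly with the input amplitude; the function names are illustrative.

```python
import math


def amplitude_db(ratio):
    """Input amplitude relative to full scale, expressed in dB."""
    return 20.0 * math.log10(ratio)


def resolution_loss_bits(ratio):
    """Bits of resolution lost when only `ratio` of the full-scale range is used."""
    return -math.log2(ratio)


# Half the full-scale amplitude, as in the example above.
print(round(amplitude_db(0.5), 2))   # -6.02
print(resolution_loss_bits(0.5))     # 1.0
```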

To obtain a constant input amplitude over the input frequency range at the input of the ADC, a flat frequency response of the PCB, measurement cable and connectors should be maintained. Since it is impossible to maintain a flat frequency response up to an input signal frequency of 20 GHz, the input power must be adapted to the signal attenuation. For this purpose, the frequency response of the cables and the PCB is measured with an S-parameter analyser.

In Figure 5.13, two measurement setups are shown. In the first setup, the S-parameters of the microstrip lines on the PCB are estimated. For this purpose, a GSGSG RF probe is used with a pitch of 100 µm. The inner ground pin is not connected in this measurement because the pitch on the PCB is 150 µm. With the second measurement setup, transmission of the microstrip line, the bond wire and a second microstrip line, which is realised on a CMOS test chip in 90 nm CMOS, is estimated.

The transmission of the differential transmission line of the first setup is calculated from the S-parameter measurement. The result of the measurement is shown in Figure 5.14. The other curve is the result of a second measurement method. In this approach, the ADC, which is mounted on the PCB, is used to estimate the
input signal attenuation. For this purpose, the input amplitude of the sinusoidal signal generator is adjusted until the maximum effective resolution is obtained. The ADC must operate at a low sampling rate during this procedure; in this case, 12 GS/s is found to be sufficiently low. Since the 3-dB bandwidth of the ADC is very large when the ADC operates at a low sampling rate, the ADC can be used to obtain the attenuation of the attached cables and the transmission lines of the ADC board.

The offset of the curve is set to obtain a good match between the two measurements. When comparing both measurements in Figure 5.14, very good agreement is achieved up to an input signal frequency of 10 GHz. For higher input signal frequencies, the influence of the bond wire and the on-chip transmission line increases. Thus, the second measurement setup with the CMOS test chip is used. This measurement result is shown in Figure 5.15. The transmission shows a matching of $\pm 1\,\text{dB}$ in the whole frequency range, which translates to an SNDR uncertainty of less than $1.5\,\text{dB}$ for a full-scale input signal (see Appendix A).
Figure 5.14: Transmission of the differential transmission line.

Figure 5.15: Transmission of the differential transmission line, the bond wires and the chip.
6 Simulation and Measurement Results

6.1 Simulation Results

Transient simulations are performed for the circuit optimisation, starting at the circuit level. After the circuit level optimisation is finished, a simulation of the influence of the layout on the circuit is carried out. For this purpose, the parasitic RC elements caused by the wiring are extracted by means of an extraction tool, which is provided by the CMOS technology vendor. Simulation of the mismatch errors is not possible because data on the process variations and mismatch behaviour is not provided by the technology vendor. Therefore, the offset errors are estimated according to Section 4.2. Transient noise is deactivated in the simulations since no significant influence is found.

An important factor during the circuit design is the duration of a simulation. The circuit level simulation is finished within minutes. The simulation with extracted parasitic elements can last up to one week with the available computing power. Due to the long simulation time, parasitic extraction must be limited to the circuit-blocks of interest. Since each circuit can be extracted separately, successive optimisation of the ADC is carried out during the design phase.

The most critical part of the ADC is the sample and hold circuit. To achieve an effective resolution of 2.5 bit at the Nyquist frequency, the 3-dB bandwidth must be at least 20 GHz and the THD (linearity) of the input buffers must be below -20 dB. It is shown with a simulation of the SNDR of the ADC that these objectives are met. In addition to these simulations, timing simulations with the complete ADC are performed to ensure that the setup time of the flip-flops is met.
6.1.1 The Sample and Hold Circuit

Schematic Level

The simulations of the sample and hold circuit at the schematic level are performed at a sampling rate of 10 GS/s. This is sufficient to realise a 40 GS/s ADC with fourfold time-interleaving. In this simulation, the comparator circuit is connected to the output of the circuit, although its output is not evaluated. This is done in order to include its parasitic input capacitance, which decreases the bandwidth of the sample and hold circuit.

Figure 6.1 shows the result of the transient simulation of the transmission of the sample and hold circuit. In this large signal analysis with a peak-to-peak differential input amplitude of 800 mV, the resulting 3-dB bandwidth is above 20 GHz. This is sufficient to achieve an effective resolution of at least 2.5 bit at the Nyquist frequency when operating at 40 GS/s.

Figure 6.1: Results of the transient simulation of the transmission of the sample and hold circuit at the schematic level. The supply voltage is 1.35 V and the ambient temperature is set to 27 °C.

Layout Level

In order to simulate the performance of the sample and hold circuit of the ADC as accurately as possible, the parasitics of the sample and hold circuit and the comparators including the appropriate clock drivers are extracted. The extraction method is a worst case estimation of the RC-parasitics.

It is found that the clock divider does not work under this condition. Two reasons are identified: on the one hand, the RC parasitics increase the rise and fall times of the clock divider; on the other hand, the parasitic resistance of the wiring reduces the supply voltage of the circuits.

Therefore, simulation of the transmission in Figure 6.2 is performed at 5 GS/s and the supply voltage is sustained at 1.35 V in order to obtain the worst case performance. In this simulation, the 3 dB corner frequency is halved to 10 GHz, compared to the result at the schematic level.

**Figure 6.2:** Simulated transmission of the sample and hold circuit with RC layout parasitics at 5 GS/s. The supply voltage is 1.35 V and the ambient temperature is 27 °C.

After the complete layout is finished, the effective resolution of the whole ADC chip is simulated up to an input signal frequency of 15 GHz at a sampling rate of 20 GS/s. The voltage drop across the power supply grid is found to be about 200 mV, although a chessboard-like power supply grid is used with two of the
thickest metal layers available. This is essentially a limitation of this specific manufacturing run, in which thick, low-resistance metal layers are not available.

To compensate for the voltage drop, the power supply is increased to 1.55 V. Figure 6.3 shows the result of the simulation at ambient temperature. The effective resolution is above 2.6 bit up to an input signal frequency of 15 GHz and the 3-dB bandwidth is approximately 16 GHz. Simulation of the whole ADC chip is very computationally intensive; approximately three weeks of simulation are necessary to obtain the result in Figure 6.3; therefore a simulation at higher sampling rates is not performed.

**Figure 6.3:** Simulated effective resolution of the ADC at 20 GS/s. The RC layout parasitics of the whole ADC chip are included in this simulation. The supply voltage is 1.55 V and the ambient temperature is set to 27 °C.

### 6.1.2 Summary of the Simulation Results

The simulation results show that the wiring parasitics from the layout have an enormous influence on the performance of the ADC. In the simulation with extracted parasitics, a 3-dB bandwidth of 10 GHz is achieved at 20 GS/s, compared to 20 GHz at 40 GS/s at the schematic level. Compensation of the voltage drop along the supply voltage grid results in a bandwidth of approximately 16 GHz at a sampling rate of 20 GS/s.
6.2 Measurement Results

The following measurement results are obtained using the measurement setup in Section 5.3. First, the DC transfer curve of the ADC is shown; it is obtained with the FPGA measurement system and a PC-controlled DC power source, which is used to sweep the ADC input voltage. Next, the results of the FPGA-based dynamic measurements up to a sampling rate of 25.6 GS/s are shown. Subsequently, the single channel, real-time scope measurements up to a sampling rate of 36 GS/s are presented.

6.2.1 DC Characteristics

The measurement result of the static transfer curve of the ADC is shown in Figure 6.4. The curve is increasing monotonically, which means that bubble errors do not occur.


**Figure 6.4:** The DC transfer curve of the ADC.

The INL and DNL are calculated from the DC characteristic. Figure 6.5 shows the resulting INL plot. The maximum INL of channel 3 is 0.3 LSB and the maximum
INL of the other channels is 0.4 LSB. This corresponds to a maximum comparator offset of 30 mV and 40 mV, respectively. The maximum mismatch between the four channels is 1.2 LSB.
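
How INL and DNL follow from measured comparator thresholds can be sketched as below. This is an illustration under stated assumptions, not the evaluation script used for the measurements: the 100 mV LSB is the value implied by the statement that 0.3 LSB corresponds to 30 mV, the seven-threshold example assumes a 3 bit flash converter, and the threshold values are hypothetical.

```python
def inl_dnl(measured_v, nominal_v, lsb_v):
    """Return (INL, DNL) in LSB for measured comparator thresholds.

    INL[k] is the deviation of threshold k from its nominal position.
    DNL[k] is the deviation of the step between thresholds k and k+1 from 1 LSB.
    """
    inl = [(m - n) / lsb_v for m, n in zip(measured_v, nominal_v)]
    dnl = [(measured_v[k + 1] - measured_v[k]) / lsb_v - 1.0
           for k in range(len(measured_v) - 1)]
    return inl, dnl


# Hypothetical example: seven thresholds, 100 mV LSB (assumed),
# with the fourth comparator shifted by +30 mV.
LSB = 0.1
nominal = [LSB * (k - 3) for k in range(7)]
measured = list(nominal)
measured[3] += 0.03

inl, dnl = inl_dnl(measured, nominal, LSB)
print(max(abs(x) for x in inl))  # about 0.3 LSB
```

A single shifted comparator appears once in the INL but twice in the DNL, with opposite signs on the two adjacent steps.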


**Figure 6.5:** The estimated INL of the four ADC channels.

The DNL plot is shown in Figure 6.6. The maximum DNL of channels 1 to 4 is 0.4 LSB, 0.8 LSB, 0.6 LSB and 0.6 LSB, respectively.


**Figure 6.6:** The estimated DNL of the four ADC channels.
6.2.2 Dynamic Characteristics

FPGA Measurement Results

Figure 6.7 shows the single channel measurement results for the TIADC operating at 12.8 GS/s. All channels exhibit an SNDR above 15 dB up to an input signal frequency of 15 GHz. Due to mismatch between the channels, the results of the individual channels differ by up to 3 dB. In Figure 6.8, the SNDR of the TIADC is shown after multiplexing of the four data channels to a single data stream by means of Matlab and subsequent calculation of the SNDR. The SNDR is about 15 dB up to an input signal frequency of 15 GHz. This corresponds to an effective resolution of 2.2 bit.
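
The post-processing described above can be sketched in a few lines. The original evaluation is done in Matlab; the following Python version is only an illustration with hypothetical function names. It assumes coherent sampling, so that the signal occupies a single DFT bin, and it excludes the DC and Nyquist bins from the noise and distortion power.

```python
import cmath
import math


def interleave(channels):
    """Merge M equally long channel records into one stream: ch0[0], ch1[0], ..."""
    return [sample for group in zip(*channels) for sample in group]


def dft_power(samples):
    """Power of the DFT bins 1 .. N/2 - 1 (DC and Nyquist bin excluded)."""
    n = len(samples)
    powers = {}
    for k in range(1, n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        powers[k] = abs(acc) ** 2
    return powers


def sndr_db(samples, signal_bin):
    """Ratio of the signal bin power to the power of all other bins, in dB."""
    powers = dft_power(samples)
    rest = sum(p for k, p in powers.items() if k != signal_bin)
    return 10.0 * math.log10(powers[signal_bin] / rest)


def enob(sndr):
    """Effective resolution in bit for a full-scale sinusoidal input."""
    return (sndr - 1.76) / 6.02


print(round(enob(15.0), 1))  # 2.2
```

An SNDR of 15 dB thus reproduces the effective resolution of 2.2 bit stated above.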


**Figure 6.7:** Signal-to-noise and distortion ratio versus the input signal frequency of the four ADC channels; each channel is operating at 3.2 GS/s, which corresponds to a sampling rate of 12.8 GS/s for the TIADC.
The single channel measurement result at a sampling rate of 25.6 GS/s is shown in Figure 6.9. The channels exhibit an SNDR above 14 dB up to an input signal frequency of 20 GHz. Due to mismatch between the channels, the results of the individual channels differ by up to 3 dB. The SNDR of the TIADC is shown in Figure 6.10. The SNDR is between 14 dB and 15 dB up to an input signal frequency of 8 GHz. At 9 GHz, the SNDR drops to 13 dB, whereas the SNDR is about 15 dB between 12 GHz and 18 GHz.

**Real-Time Scope Measurement Results**

Figure 6.11 shows the measurement results at a single channel sampling rate of 9 GS/s, which corresponds to a sampling rate of 36 GS/s for the TIADC. The results are obtained using the real-time scope [43]. The increased sampling rate reduces the SNDR by about 2 dB for input signal frequencies between 12 GHz and 15 GHz compared to the FPGA-based measurements at 6.4 GS/s. In the remaining frequency range up to 20 GHz, the SNDR is above 14 dB.
Figure 6.9: Signal-to-noise and distortion ratio versus the input signal frequency of the four ADC channels; each channel is operating at 6.4 GS/s, which corresponds to a sampling rate of 25.6 GS/s for the TIADC.

Figure 6.10: The signal-to-noise and distortion ratio versus the input signal frequency of the TIADC at 25.6 GS/s.
6.3 Analysis of the Results

The sample and hold circuit is the most critical part of the ADC. The following analysis is used to obtain an understanding of the effects in the sample and hold circuit that reduce the effective resolution. An equivalent circuit is developed for this purpose. Figure 6.12 shows the equivalent circuit of the sample and hold circuit in the TIADC. The resistors $R_{50}$ are the 50 Ohm termination resistors of the external driver and input termination, respectively. $L_{bond}$ is the inductance of the bond wire. $C_{in}$ is the input capacitance of the analogue input, which includes the parasitic capacitance of the pad, the bond wire, the termination resistor and the wiring capacitance up to the input of the transfer gate. The capacitors $C_{S1}$ are the parasitic input and output capacitances of the first transfer gate. The resistance $R_{S1}$ is the on-resistance of the transfer-gate and $C_{Vi}$ is the input capacitance of the CML-buffer. The output resistance of the CML-buffer is $R_{200}$ and the output capacitance is $C_{Vo}$. $C_{S2}$ is the parasitic input and output capacitance of the second transfer-gate and the on-resistance is $R_{S2}$. The parasitic capacitance of the wiring structure and the input capacitance of the comparators are accounted for by $C_{WK}$ and $C_{K}$, respectively.
The equivalent circuit can be split into two parts, which can be analysed separately. In the first part, the transmission of the input structure and the first track and hold circuit is analysed. In the second part, the settling time of the second track and hold circuit is calculated. This is the time span that the second track and hold needs to follow the input voltage after it switches from the hold mode into the tracking mode. Figure 6.13 shows the equivalent circuit of the first track and hold circuit in the transparent state. In Table 6.1, the values for the parasitic elements are given. The result in Figure 6.14 is obtained through simulation of the transmission when the bond wire is neglected. The result shows a 3-dB bandwidth beyond 20 GHz.

In Figure 6.15, a bond wire inductance of 1 nH is assumed, which corresponds to a bond wire length of about 1 mm. This reduces the bandwidth to 13 GHz. The bond wires in the measurement setup have approximately the same length. For the measurements, this can be compensated with a larger input amplitude at high input signal frequencies. In an application, the bond wire length should be as short as possible (< 200 µm) to achieve a large bandwidth.
In the analysis of the first track and hold, it is found that the time constant $R_{S1}C_{S1}$ is only dependent on the CMOS technology used. The reason is that the on-resistance of the transfer transistor is inversely proportional to its width, whereas the parasitic capacitance is proportional to it. Hence, if the input capacitance of the subsequent circuit (cf. $C_{Vi}$ in Figure 6.13) is very low compared to the parasitic capacitance $C_{S1}$ of the transfer gate (e.g. $C_{Vi} < 1/10 \cdot C_{S1}$), the maximum bandwidth that can be realised with the CMOS technology used is achieved.
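
The width independence of the time constant can be made explicit with a simple scaling argument. Here $W$ denotes the width of the transfer transistor, and $r'$ and $c'$ are hypothetical width-normalised technology constants that are introduced only for this illustration:

$$R_{S1} = \frac{r'}{W}, \qquad C_{S1} = c' \cdot W \quad\Rightarrow\quad R_{S1}C_{S1} = r' \cdot c'$$

The width cancels, so widening the transfer gate lowers $R_{S1}$ but does not shorten the time constant $R_{S1}C_{S1}$.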
To simplify the calculation of the settling time of the second track and hold circuit, the circuit is modified according to Figure 6.16. Compared to Figure 6.12, the output capacitances $C_{Vo}$ and $C_{S2}$ are moved to the output of the circuit. As a further simplification, it is assumed that the transfer gate switches on and off instantaneously. In effect, this means that the rise and fall time of the clock drivers is assumed to be zero.

The output voltage of the first track and hold circuit is constant while the second track and hold circuit is transparent. Therefore, the capacitor discharging equation for a differential amplifier in (6.1) is used according to [26] in order to calculate the time span that the second track and hold needs to reach its final value for a correct conversion of the input voltage. In the following calculations this time span is expressed as the settling time $t_{ts}$.

$$U(t) = I_{SS}R_L(2e^{-\frac{t}{\tau}} - 1)$$  \hspace{1cm} (6.1)

For calculation of the settling time, it is assumed that the output voltage of the second track and hold must be discharged from the maximum voltage ($I_{SS}R_L$) to the voltage which corresponds to 1 LSB ($I_{SS}R_L(1/2^{n-1} - 1)$). By solving equation (6.2) for $t_{ts}$, the settling time is obtained according to (6.3).

$$I_{SS}R_L\left(\frac{1}{2^{n-1}} - 1\right) = I_{SS}R_L(2e^{-\frac{t_{ts}}{\tau}} - 1)$$ \hspace{1cm} (6.2)

$$t_{ts} = \tau \cdot n \cdot \ln 2$$ \hspace{1cm} (6.3)
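
The step from (6.2) to (6.3) can be written out as follows, using only the symbols defined above. Dividing both sides of (6.2) by $I_{SS}R_L$ and adding one gives

$$\frac{1}{2^{n-1}} = 2e^{-\frac{t_{ts}}{\tau}} \quad\Rightarrow\quad e^{-\frac{t_{ts}}{\tau}} = 2^{-n} \quad\Rightarrow\quad t_{ts} = \tau \cdot n \cdot \ln 2$$

Each additional bit of resolution therefore adds $\tau \cdot \ln 2$ to the required settling time.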

The actual available settling time of a track and hold, which is in fact the time span that the track and hold is transparent ($t_t$), corresponds to half the clock period of the sampling clock. Therefore, the interleaving factor $M$ and the sampling
frequency $f_s$ or sampling period $T_s$ must be considered. A higher interleaving factor permits a longer settling time $t_{ts}$ of the sample and hold circuit and vice versa. Equation (6.4) relates the time the transfer gate is transparent to the sampling frequency and the interleaving factor.

$$t_t = M \frac{T_s}{2} = \frac{M}{2 \cdot f_s}$$ \hspace{1cm} (6.4)

To achieve the maximum effective resolution, the settling time $t_{ts}$ must be smaller than the time the transfer gate is transparent ($t_t$) as expressed in (6.5).

$$t_{ts} < t_t$$ \hspace{1cm} (6.5)

According to (6.4), $t_t$ is equal to 50 ps for four-fold time-interleaving and a sampling rate of 40 GS/s. By means of (6.3) and the values in Table 6.2, the settling time of the second track and hold circuit is calculated to be 49.5 ps. This is very close to the time the transfer gate is transparent. Depending on the shape of the clock signal, this might not be sufficient to reach the full ADC resolution. With an increasing sampling rate, the clock signal approaches a sinusoidal shape because the bandwidth of the clock drivers is limited. In this case, the assumption that the rise and fall time of the clock drivers can be neglected is no longer valid.

Table 6.2: Estimated parasitics of the second track and hold circuit.

<table>
<thead>
<tr>
<th>Parameter</th>
<th>$R_{200}$</th>
<th>$C_{Vo} + C_{S2}$</th>
<th>$R_{S2}$</th>
<th>$C_{S2} + C_{WK} + C_K$</th>
</tr>
</thead>
<tbody>
<tr>
<td>Value</td>
<td>200 Ohm</td>
<td>25 fF</td>
<td>140 Ohm</td>
<td>45 fF</td>
</tr>
</tbody>
</table>
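
The numbers above can be reproduced with a short calculation. This is a sketch under two assumptions that are not stated explicitly in the text: the time constant is taken as the sum of both resistances multiplied by the sum of both capacitances in Table 6.2, in line with the simplification that the capacitances are moved to the output, and a resolution of $n = 3$ bit is used. With these assumptions the result matches the quoted 49.5 ps.

```python
import math


def transparent_time(interleaving_factor, sampling_rate_sps):
    """Time the transfer gate is transparent, equation (6.4)."""
    return interleaving_factor / (2.0 * sampling_rate_sps)


def settling_time(tau_s, resolution_bits):
    """Settling time to within 1 LSB, equation (6.3)."""
    return tau_s * resolution_bits * math.log(2)


# Values from Table 6.2.
R_200, R_S2 = 200.0, 140.0           # Ohm
C_FIRST, C_SECOND = 25e-15, 45e-15   # F

# Assumption: one lumped RC time constant for the simplified circuit.
tau = (R_200 + R_S2) * (C_FIRST + C_SECOND)

t_ts = settling_time(tau, 3)         # assumption: n = 3 bit
t_t = transparent_time(4, 40e9)      # four-fold interleaving at 40 GS/s

print(round(t_ts * 1e12, 1))  # 49.5 (ps)
print(round(t_t * 1e12, 1))   # 50.0 (ps)
```

The margin between the two values is about 0.5 ps, which illustrates why a clock with finite rise and fall times can prevent full settling.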

Several conclusions are derived from the results of the settling time analysis. The available settling time of the second track and hold is long enough if the clock drivers exhibit sufficient bandwidth. Otherwise, there are a few optimisation methods that can be applied in redesigning the ADC. The first one is to increase the bandwidth of the clock drivers, e.g. by means of inductive peaking of these drivers. A second approach is to reduce the capacitive load of the comparator circuit, which would decrease the settling time $t_{ts}$. This solution requires a calibration circuit to compensate for the offset errors that become more severe when the transistor widths in the comparators are reduced. Another possibility is to
increase the interleaving factor, which has the drawback of having a higher circuit complexity and higher power consumption.

6.4 Conclusion

In the simulation results at a sampling rate of 20 GS/s, the effective resolution of the ADC is about 2.5 bit up to an input signal frequency of 16 GHz. These results are confirmed in the single channel measurements at a sampling rate of 25.6 GS/s. In these measurements, an effective resolution of up to 2.7 bit at an input signal frequency of 18 GHz is achieved. The effective resolution of the TIADC, after multiplexing of the single channel data to a single output stream, is approximately 2.2 bit up to an input signal frequency of 19 GHz. It is lowered by offset and gain errors in the ADCs, as described in Section 2.4. This is also evident from the INL measurements, which show a maximum mismatch of 1.2 LSB between the channels. This is one of the main reasons for the reduction of the effective resolution of the TIADC compared to the single channel results.

The FPGA measurement results also show that the sampling time error between the channels is not an issue since the SNDR of the TIADC in Figure 6.10 is nearly constant over the measured frequency range. A timing mismatch causes an amplitude error that increases with the input signal frequency, independent of the sampling rate, so the SNDR would degrade towards high input signal frequencies. Since this is not the case in the SNDR measurements, the timing error is not an issue in this ADC. In contrast, as mentioned above, the gain and offset errors of the ADC channels must be reduced to achieve a higher effective resolution.

The functionality of the ADC is shown for a sampling rate of up to 36 GS/s using a real-time scope. At this sampling rate, the effective resolution of the single channels drops to about 1.7 bit for input signal frequencies between 12 GHz and 15 GHz. From the analysis of the sample and hold circuit, it is found that the parasitic capacitance at the output of the second track and hold is responsible for the reduction of the effective resolution of the ADC at a sampling rate of 36 GS/s.
7 Hybrid ADC Feasibility Evaluation

The high integration density of CMOS technologies and transit frequencies above 200 GHz allow the design of fast and complex analogue circuits with low power consumption. However, bipolar technologies still offer the benefit of twice the transit frequency. If CMOS is used for the highly complex lower-speed parts of a system and a bipolar technology is used for the high-speed front-end, the advantages of both technologies can be exploited. For the following circuits, the 1 $\mu$m emitter indium phosphide (InP) double heterojunction bipolar transistor (DHBT) technology from Fraunhofer IAF [4] with a transit frequency $f_T$ of 300 GHz and a maximum frequency $f_{\text{max}}$ of 350 GHz is used.

In this chapter, a feasibility study for a hybrid ADC is presented. The objective of the hybrid ADC is to show that a higher sampling rate, a larger bandwidth and a higher effective resolution can be achieved compared to a state-of-the-art CMOS ADC. In the first section, an overview of the InP technology used is given. In the following section, the architecture and basic building blocks of the hybrid ADC are explained. The simulation and measurement results of the analogue demultiplexer are presented in the third section and compared with the state of the art in the fourth section of this chapter.

For application in an optical receiver system, low system complexity is an important cost factor. A transimpedance amplifier (TIA) that can be integrated with the demultiplexer on a single chip is designed for this purpose. The properties of the TIA are summarised in the fifth section.

A summary of the results achieved is given in the last section of this chapter.
7.1 Indium Phosphide Technology

The InP technology provides transistors with an emitter width of 1 µm and an emitter length of between 2 µm and 16 µm. Figure 7.1 shows the micrograph of such a DHBT with an emitter length of 4 µm.

The DHBTs exhibit a DC common-emitter current gain of approximately 90, a turn-on offset voltage of 0.15 V and a breakdown voltage $BV_{ceo}$ of approximately 5 V. Figure 7.2 shows the measured collector current $I_C$ over the collector-emitter voltage $V_{CE}$ as a function of the base current $I_B$.

Figure 7.3 shows the measured cut-off frequencies versus the collector current density. The maximum cut-off frequencies $f_T/f_{max}$ of 300/350 GHz are achieved at a collector current density of about 4.5 mA/µm$^2$.

The technology also provides silicon nitride (SiNx) metal-insulator-metal (MIM) capacitors, nickel chrome (NiCr) thin film resistors and three levels of gold-based interconnect metals. Since the substrate is highly resistive, there is no parasitic capacitance to the substrate. This is the major benefit of this technology compared to CMOS technologies (see [4] for a more detailed description).
Figure 7.2: Measured collector current versus the collector-emitter voltage as a function of the base current [57].

Figure 7.3: Measured cut-off frequency versus the collector current density of the DHBTs [57].
7.2 System Architecture

Figure 7.4 shows the block diagram of the hybrid ADC. The analogue input signal is fed to the input of a demultiplexer (DeMUX). The output signals of the analogue demultiplexer are connected to two ADCs. The clock rate ($\text{clk}_1$) of the DeMUX is 25 GHz for a 50 GS/s hybrid ADC. The ADC clock rate ($\text{clk}_2$) depends on the architecture of the ADC. The phase shift between the ADCs must be 180 degrees.
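
The rate and phase relations of the two-way demultiplexing can be illustrated with an idealised model. This sketch only shows how consecutive samples are distributed to the two outputs; it does not model the analogue track and hold behaviour, and the function names are illustrative.

```python
def demux_clock_hz(system_rate_sps, ways=2):
    """Clock frequency of an analogue demultiplexer with `ways` outputs."""
    return system_rate_sps / ways


def clock_phases_deg(ways=2):
    """Sampling phases of the sub-ADCs, spaced evenly over one clock period."""
    return [360.0 * k / ways for k in range(ways)]


def demultiplex(samples, ways=2):
    """Distribute consecutive samples to the outputs in round-robin order."""
    return [samples[k::ways] for k in range(ways)]


print(demux_clock_hz(50e9))              # 25000000000.0
print(clock_phases_deg())                # [0.0, 180.0]
print(demultiplex([0, 1, 2, 3, 4, 5]))   # [[0, 2, 4], [1, 3, 5]]
```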

Figure 7.4: Block diagram of the high-speed analogue-to-digital conversion system.

The block diagram of the demultiplexer is shown in Figure 7.5. The input signal is connected to two multiplexer circuits. One of the input ports of each multiplexer is connected to its output port; in this configuration the multiplexer operates as a track and hold circuit. The gain of the multiplexer must be unity; otherwise, the output signal does not remain constant in the hold phase.

The circuit of the analogue multiplexer is shown in Figure 7.6. The differential pairs $T_1$, $T_2$ and $T_3$, $T_4$ are used as transconductance cells. They are connected to four differential transistor pairs, which are used for switching either of the channels to the output load resistors $R_L$, while the other channel is connected to the dummy resistors $R_X$. The resistors $R_{DG}$ are used to linearise the transconductance cells, which is necessary to achieve high linearity.

The ideal DC input voltage for the multiplexer is -3 V, therefore an emitter follower with diode-connected transistors is used as an input driver to shift the input signal down to this operating point. The clock signals for the switching transistors are provided by four series-connected CML-amplifiers.
The micrograph of the demultiplexer chip is shown in Figure 7.7. The analogue outputs are at the top and bottom of the chip, and the input is on the left-hand side. The clock input is on the right-hand side. The inputs and outputs of the chip are connected to coplanar transmission lines. The power supply is connected to the outer pads next to the RF pads at the four edges of the chip. The large metal areas adjacent to the transmission lines are used as blocking capacitors for the supply voltage.
Figure 7.7: Micrograph of the demultiplexer chip (1.5 × 1.5 mm²).

7.3 Simulation and Measurement Results

7.3.1 S-Parameter Results

The measurement of the S-parameters is performed with the setup in Figure 7.8. A two-port S-parameter analyser is connected to one of the differential input ports and one of the differential output ports.

Figure 7.8: Setup for measurement of the S-parameters.
The simulated and measured phase response is shown in Figure 7.9. The progression of the curves is linear. The gradient of the phase is higher in the measurement due to the delay of the transmission lines.

Figure 7.9: Measured phase versus simulated phase.

The simulated and measured transmission are shown in Figure 7.10. The transmission from input port 1 to output ports 2 ($S_{21}$) and 4 ($S_{41}$) is shown. The input amplitude in the measurement is -2 dBm. The clock input is set to a DC voltage to enable the measured output port. The measured 3-dB bandwidth is 16 GHz, but the attenuation of the input signal is below 4 dB up to an input signal frequency of 35 GHz.

The simulation is at the schematic level, therefore the parasitics from the layout are not considered. In Figure 7.11 another simulation is performed with a 60 fF capacitance at the output port of each multiplexer. With this capacitance, the simulation result is very close to the measurement result.

The parasitic wiring capacitance is estimated to be about 30 fF. Therefore, the additional parasitics must be caused by the transistors themselves. An explanation for the deviation could be that the transistor model is not accurate, which is plausible because the technology used is still in development.
Figure 7.10: Measured and simulated transmission. Because the transmission is measured single-ended, -6 dB is the reference level for estimation of the bandwidth.

Figure 7.11: Re-simulation of the demultiplexer with a 60 fF capacitive load at the output of each multiplexer circuit.
If this is indeed the problem, then the parasitic capacitance is also increased in the other transistors. Therefore, the simulation is repeated with the additional capacitances; however, the results do not show a significant reduction of the circuit bandwidth. It is therefore concluded that the output is a critical node, at which the parasitic capacitance causes a noticeable attenuation, and that the inaccuracy of the models is a plausible explanation. Since three transistors are connected at this critical node and about 30 fF of the required 60 fF are not explained by the wiring, the inaccuracy of the model must be in the range of 10 fF per transistor. Given that the thickness of the transistor layers is a critical value with regard to the parasitic capacitance, this could also be caused by process variations that are not considered in the model.

### 7.3.2 THD and SFDR Results

The setup for the measurement of the total harmonic distortion (THD) and the spurious free dynamic range (SFDR) is shown in Figure 7.12. The input signal is generated by a sinusoidal signal generator and balanced with a 180° hybrid. The output of the hybrid is connected via DC blocks to the input of the demultiplexer. One of the differential outputs of the demultiplexer is connected to a spectrum analyser while the other output is terminated with 50 Ω. DC blocks are used in both signal paths because a DC current could destroy the input port of the spectrum analyser. The clock signal for the demultiplexer is generated by a second sinusoidal signal generator. Both generators are synchronised with a common 10 MHz reference signal.

![Figure 7.12: Setup for the THD and SFDR measurement.](image)

The spectrum analyser has a bandwidth of 40 GHz, which is sufficient to obtain the SFDR and the THD in the first Nyquist band. The THD and SFDR are calculated from the measured output spectrum of the demultiplexer.
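
The calculation of both figures from a measured spectrum can be sketched as follows. This is a minimal sketch that assumes the common definitions (THD as the summed harmonic power relative to the fundamental, SFDR as the distance from the fundamental to the largest spur). The function names and the example power levels are hypothetical and are not measured values from this work.

```python
import math


def dbm_to_mw(p_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def thd_db(fundamental_dbm, harmonics_dbm):
    """THD: summed harmonic power relative to the fundamental, in dB."""
    harmonic_power = sum(dbm_to_mw(p) for p in harmonics_dbm)
    return 10.0 * math.log10(harmonic_power / dbm_to_mw(fundamental_dbm))


def sfdr_dbc(fundamental_dbm, spurs_dbm):
    """SFDR: distance from the fundamental to the largest spur, in dBc."""
    return fundamental_dbm - max(spurs_dbm)


# Hypothetical loss-corrected analyser readings (assumed, not measured data).
fundamental = -10.0                  # level of the fundamental in dBm
harmonics = [-50.0, -47.0, -60.0]    # 2nd, 3rd and 4th harmonic in dBm

thd = thd_db(fundamental, harmonics)
sfdr = sfdr_dbc(fundamental, harmonics)
print(f"THD = {thd:.1f} dB, SFDR = {sfdr:.1f} dBc")
```

In this sketch the spur list contains only harmonics; in a real spectrum any non-harmonic spur in the band of interest would be added to the list passed to `sfdr_dbc`.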

The measurement result, corrected for cable, probe and hybrid losses, is shown in Figure 7.13. For a sinusoidal differential input signal with a peak-to-peak voltage of 500 mV, a THD below -32 dB and an SFDR above 35 dBc are measured up to an input signal frequency of 24 GHz at a clock frequency of 25 GHz. This clock frequency corresponds to a system sampling rate of 50 GS/s.

![Figure 7.13: The magnitude of the THD and SFDR with a differential input voltage of 500 mV-PP and a system sampling rate of 50 GS/s. The outputs are measured single-ended.](image)

### 7.3.3 SNDR Results

To obtain the characteristics of the demultiplexer in the intended application, a fully differential measurement with a 2-channel 80 GS/s real-time scope is performed. The setup is the same as in Figure 7.12 but the differential output of one channel is now connected to the real-time scope.

The sampling clock of the scope is free running, which may decrease the measured effective resolution slightly. Only one differential output signal can be measured at a time. The SNDR is calculated by means of a 512-point DFT. The input signal frequency is chosen such that coherent sampling is ensured, which concentrates the signal energy in a single frequency bin of the DFT. The resolution of the scope is limited to about 5.5 bit with a full-scale input amplitude.
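
The SNDR evaluation with coherent sampling can be illustrated with a short sketch. It assumes the usual relation ENOB = (SNDR - 1.76 dB) / 6.02 dB and replaces the scope data with the output of an ideal 6 bit quantiser. The function names, the target frequency of 10 GHz and the 6 bit stand-in are assumptions made for illustration only, not the evaluation script used in this work.

```python
import cmath
import math


def coherent_bin(f_target, f_s, n):
    """Pick an odd DFT bin next to f_target.

    An odd bin index has no common factor with a power-of-two record
    length n, so the record holds a whole number of signal periods and
    every sample hits a different signal phase.
    """
    m = round(f_target / f_s * n)
    return m if m % 2 == 1 else m + 1


def bin_powers(x):
    """Power in DFT bins 1 .. n/2 - 1 (DC and the Nyquist bin are left out)."""
    n = len(x)
    powers = []
    for k in range(1, n // 2):
        acc = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        powers.append(abs(acc) ** 2)
    return powers  # powers[k - 1] belongs to bin k


def sndr_db(x, signal_bin):
    """Signal power over the power in all other evaluated bins, in dB."""
    p = bin_powers(x)
    p_signal = p[signal_bin - 1]
    return 10.0 * math.log10(p_signal / (sum(p) - p_signal))


def enob(sndr):
    """Effective number of bits for a full-scale sinusoidal input."""
    return (sndr - 1.76) / 6.02


def ideal_adc(v, bits):
    """Ideal quantiser for the range [-1, 1); a stand-in for measured data."""
    levels = 2 ** bits
    code = min(levels - 1, max(0, math.floor((v + 1.0) / 2.0 * levels)))
    return (code + 0.5) / levels * 2.0 - 1.0


N = 512        # record length of the DFT
F_S = 80e9     # sampling rate of the real-time scope
m = coherent_bin(10e9, F_S, N)   # assumed target frequency of 10 GHz
f_in = m * F_S / N               # input frequency that yields coherent sampling

samples = [ideal_adc(math.sin(2 * math.pi * m * i / N + 0.1), 6) for i in range(N)]
bits_eff = enob(sndr_db(samples, m))
print(f"bin {m}, f_in = {f_in / 1e9:.5f} GHz, ENOB = {bits_eff:.2f} bit")
```

Because the signal occupies exactly one bin, no window function is needed and the remaining bins contain only noise and distortion.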

The measurement result for a differential input voltage of 500 mV peak-to-peak is shown in Figure 7.14. The graph contains the effective resolution of both channels at a clock rate of 20 GHz, the effective resolution of channel 1 at a clock rate of 2 GHz and the linearity with a static clock signal. The input signal frequency is varied linearly from 2 GHz up to 20 GHz in 0.5 GHz steps. The measured effective resolution is above 4.1 bit in the whole Nyquist band at a clock rate of 20 GHz. Thus, with the demultiplexer operating at a clock rate of 20 GHz, an A/D-conversion system with a sampling rate of 40 GS/s and an effective resolution above 4.1 bit can be realised. At a sampling rate of 4 GS/s, which corresponds to a clock rate of 2 GHz, the effective resolution is above 5 bit up to an input signal frequency of 20 GHz.

![Figure 7.14: Effective resolution of the A/D conversion system measured with an 80 GS/s real-time scope.](image)

Further measurements at a demultiplexer clock rate of 10 GHz are shown in Figure 7.15. This clock rate corresponds to a sampling rate of 20 GS/s. The measured effective resolution is above 4.9 bit up to an input signal frequency of about 15 GHz. In the residual frequency range up to 20 GHz, the effective resolution is above 4.4 bit.

Figure 7.15: Effective resolution of the A/D conversion system measured with an 80 GS/s real-time scope.

In Figure 7.16, a time domain measurement result is depicted at a clock rate of 10 GHz and an input signal frequency of 19.8 GHz, which is in the second Nyquist band. The track and hold phases can be clearly distinguished from each other. Due to the undersampling, the resulting output frequency is 200 MHz, which can be recognised as the envelope of the measured curve.
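
The output frequency of 200 MHz follows from folding the input frequency back into the first Nyquist band of one demultiplexer channel. The sketch below assumes that each channel takes one sample per clock period, that is 10 GS/s per channel at a clock rate of 10 GHz; the function name is hypothetical.

```python
def alias_frequency(f_in, f_s):
    """Frequency at which a tone f_in appears after sampling at the rate f_s."""
    f = f_in % f_s            # fold into the range [0, f_s)
    return min(f, f_s - f)    # mirror into the first Nyquist band [0, f_s / 2]


# One channel sampling at 10 GS/s with an input tone at 19.8 GHz.
f_out = alias_frequency(19.8e9, 10e9)
print(f"{f_out / 1e6:.0f} MHz")
```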

Figure 7.16: Screenshot of the real-time scope at a demultiplexer clock rate of 10 GHz and an input signal frequency of 19.8 GHz.

## 7.4 Comparison with the State of the Art

Comparing the demultiplexer with a state-of-the-art circuit is difficult because, as far as is known, this is the first bipolar analogue demultiplexer for application in a hybrid ADC. Typically, only a track and hold circuit is realised in a fast bipolar technology and the interleaving is done on the ADC chip itself.

Table 7.1 contains a comparison of the demultiplexer with state-of-the-art track and hold circuits. The sampling rate of the circuits is between 40 GS/s and 50 GS/s. The SFDR and the THD are given for different input signal frequencies, which makes a direct comparison of the circuits difficult. The track and hold circuit in [58] exhibits a high SFDR of 30 dBc considering its input signal frequency of 40 GHz. The track and hold circuit in [59] exhibits the same SFDR at a much lower input signal frequency of 19 GHz. The track and hold circuit in [60] exhibits an SFDR of 35 dBc at an input signal frequency of 13 GHz, which is a relatively low input signal frequency compared to the other circuits. The SFDR of the demultiplexer developed in this work depends strongly on the input signal frequency. Its minimum of 35 dBc occurs at an input signal frequency of 12 GHz, whereas an SFDR of 42 dBc is achieved at an input signal frequency of 24 GHz.

<table>
<thead>
<tr>
<th>Technology</th>
<th>[58]</th>
<th>[59]</th>
<th>[60]</th>
<th>This work</th>
</tr>
</thead>
<tbody>
<tr>
<td>Sampling rate</td>
<td>50 GS/s</td>
<td>40 GS/s</td>
<td>40 GS/s</td>
<td>4-50 GS/s</td>
</tr>
<tr>
<td>SFDR@Freq. [GHz]</td>
<td>30 dBc@40</td>
<td>30 dBc@19</td>
<td>35 dBc@13</td>
<td>35 dBc@12</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td>42 dBc@24</td>
</tr>
<tr>
<td>THD@Freq. [GHz]</td>
<td>-</td>
<td>-27 dB@19</td>
<td>-30 dB@13</td>
<td>-32 dB@12</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td>-38 dB@24</td>
</tr>
<tr>
<td>3 dB-Bandwidth</td>
<td>42 GHz</td>
<td>43 GHz</td>
<td>27 GHz</td>
<td>16 GHz</td>
</tr>
<tr>
<td>4 dB-Bandwidth</td>
<td>45 GHz</td>
<td>45 GHz</td>
<td>-</td>
<td>35 GHz</td>
</tr>
<tr>
<td>Input Amplitude</td>
<td>0.63 Vpp</td>
<td>0.63 Vpp</td>
<td>0.5 Vpp</td>
<td>0.5 Vpp</td>
</tr>
<tr>
<td>Power</td>
<td>0.64 W</td>
<td>0.54 W</td>
<td>1.9 W</td>
<td>1.7 W</td>
</tr>
</tbody>
</table>

The 3-dB bandwidth of the track and hold circuits in [58] and [59] is about 43 GHz, that of the circuit in [60] is 27 GHz and that of the demultiplexer is 16 GHz. However, in contrast to the other circuits, the transmission of the demultiplexer does not roll off steeply beyond this frequency. Instead, the attenuation stays below 4 dB up to an input signal frequency of 35 GHz.

An InP transistor requires a higher collector-emitter voltage $V_{CE}$ and a higher current density than a transistor in BiCMOS technology to achieve its maximum transit frequency. Therefore, the power supply voltage is lower in BiCMOS technology [61, 62].

Table 7.2 gives a performance outlook for a hybrid ADC that is formed by the 3 bit ADC of this work and the demultiplexer. The table also contains the state-of-the-art high-speed ADCs presented in Section 1.3.

### Table 7.2: State-of-the-art high-speed ADCs versus a hybrid solution.

<table>
<thead>
<tr>
<th></th>
<th>[10]</th>
<th>[11]</th>
<th>[12]</th>
<th>This work</th>
</tr>
</thead>
<tbody>
<tr>
<td>Technology</td>
<td>SiGe BiCMOS</td>
<td>65 nm CMOS</td>
<td>65 nm CMOS</td>
<td>Hybrid</td>
</tr>
<tr>
<td>Publication</td>
<td>2010</td>
<td>2010</td>
<td>2010</td>
<td>2012</td>
</tr>
<tr>
<td>Sampling rate</td>
<td>40 GS/s</td>
<td>40 GS/s</td>
<td>56 GS/s</td>
<td>50 GS/s</td>
</tr>
<tr>
<td>Nom. Resolution</td>
<td>4</td>
<td>6</td>
<td>8</td>
<td>3</td>
</tr>
<tr>
<td>ENOB@15 GHz</td>
<td>-</td>
<td>3.7 bit</td>
<td>6 bit</td>
<td>2.3 bit</td>
</tr>
<tr>
<td>Bandwidth</td>
<td>-</td>
<td>$\approx$6 GHz</td>
<td>$\approx$16 GHz</td>
<td>16 (35) GHz</td>
</tr>
<tr>
<td>Total Power</td>
<td>4.5 W</td>
<td>1.5 W</td>
<td>2 W</td>
<td>8.3 W</td>
</tr>
</tbody>
</table>

The measurement result for the demultiplexer shows that an effective resolution of about 4 bit is feasible up to an input signal frequency of 25 GHz. This requires a CMOS ADC with an effective resolution of at least 4 bit, whereas the CMOS ADC in this work limits the resolution of the hybrid ADC to 2.3 bit.

Although the values for the hybrid ADC must still be verified by a measurement, some conclusions can already be drawn. The benefit of a hybrid solution is that a larger bandwidth and a higher sampling rate are feasible. The drawback is that the power consumption is more than four times as high as that of the pure CMOS solutions in [11] and [12] (8.3 W versus 1.5 W and 2 W in Table 7.2).

## 7.5 DeMUX Integration with TIA

Transimpedance amplifiers (TIAs) are used in the receivers of fibre optical transmission systems. They are realised in a fast bipolar technology in order to achieve the necessary bandwidth of at least 20 GHz for a data transmission rate of 40 Gbit/s. Therefore, a feasible way to reduce the costs of the receiver is to integrate the demultiplexer presented here together with a TIA on a single chip. This section shows that the amplifier can be realised in InP technology. First, the block diagram of the TIA is presented, followed by a description of the realised TIA circuit and chip. Finally, the S-parameter measurements showing the achieved bandwidth and gain of the TIA chip are given.

### 7.5.1 Block Diagram

The block diagram of the transimpedance amplifier chip is shown in Figure 7.17. The output current of a photo diode is fed to the TIA input $I_{in}$. The second TIA is used as a biasing circuit for the differential cascode amplifier (DCA), which is used to balance the input signal. The input $I_{o}$ is used to adjust the offset voltage of this cascode amplifier. The emitter followers (EF) are used as buffers and peaking circuits. Another differential cascode amplifier is used as output driver. The peaking effect of the emitter followers is described in [63].

![Figure 7.17: Block diagram of the transimpedance amplifier.](diagram.png)

### 7.5.2 Implementation

The actual transimpedance circuit is shown in Figure 7.18. The inverting amplifier in common emitter configuration is stabilised in its bias point by means of the feedback resistor $R_f$. Due to the inverting function of the amplifier, a positive excitation of the input current will cause a negative feedback current via the resistor $R_f$. In turn, the feedback current enforces the output voltage $V_{out} = -I_{in} \cdot R_f$. This implies that the value of the transimpedance is equal to $R_f$.
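
The step from the feedback argument to a transimpedance of $R_f$ can be made explicit under idealised assumptions that are not stated in the text: the amplifier is modelled as an ideal inverting voltage amplifier with a gain of $-A$ ($A > 0$) and an infinite input impedance, so that the whole input current flows through $R_f$. With $V_{in}$ denoting the voltage at the amplifier input (a symbol introduced here for illustration), this gives

$$
I_{in} = \frac{V_{in} - V_{out}}{R_f}, \qquad V_{out} = -A \cdot V_{in}
\quad\Rightarrow\quad
\frac{V_{out}}{I_{in}} = -\frac{A}{1 + A} \cdot R_f \;\xrightarrow{\;A \to \infty\;}\; -R_f
$$

For a finite gain $A$, the magnitude of the transimpedance is therefore slightly below $R_f$.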

![Figure 7.18: The transimpedance circuit.](image)

Figure 7.19 shows the micrograph of the TIA chip. The input is on the left-hand side and the pad configuration is GSG, where G represents a ground pin and S represents the signal pin. The RF output of the TIA chip is on the right-hand side. The configuration of the output pads is GSGSG. The pads are connected to the output driver by means of coplanar transmission lines. Two power supply pads are placed above and below the RF output, respectively.

Additionally, twelve DC pads are placed at the top and at the bottom of the chip, respectively. These pads provide additional power supply connections and biasing inputs as follows: three ground pads are placed on the left-hand side, followed by three power supply pads and six biasing pads. The biasing circuits are located close to the pads and each biasing input is blocked with a capacitor. The supply voltage is blocked with capacitors, which are located below the large metal areas between the TIA input and the power supply pads.

### 7.5.3 Measurement Results

The voltage gain and the bandwidth of the TIA are estimated from the S-parameter measurements in Figure 7.20. Two curves are shown, which are obtained by measuring the transmission from the input to each of the two outputs in turn. The curves match very well, which implies that the output signal is well balanced by the cascode amplifiers. The voltage gain is above 21 dB up to 43 GHz, the transimpedance is 70 dBΩ and the 3-dB bandwidth is 45 GHz.
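
For orientation, a transimpedance given in dBΩ can be converted to ohms as sketched below. The function names and the photo current of 100 µA are hypothetical and only serve as a small-signal example.

```python
import math


def dbohm_to_ohm(z_dbohm):
    """Convert a transimpedance from dBΩ (20 * log10 of the value in Ω) to Ω."""
    return 10.0 ** (z_dbohm / 20.0)


def ohm_to_dbohm(z_ohm):
    """Convert a transimpedance from Ω to dBΩ."""
    return 20.0 * math.log10(z_ohm)


z_t = dbohm_to_ohm(70.0)      # about 3.16 kΩ
i_photo = 100e-6              # assumed photo current amplitude in A
v_out = z_t * i_photo         # resulting output voltage amplitude in V
print(f"Z_T = {z_t:.0f} Ω, V_out = {v_out * 1e3:.0f} mV")
```
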

## 7.6 Conclusion

With the demultiplexer developed in this study, a hybrid ADC with an effective resolution between 4 and 5 bit is feasible when operating at a sampling rate of 40 GS/s. The input signal attenuation of the demultiplexer is below 4 dB up to an input signal frequency of 35 GHz. This allows a hybrid ADC to be built with a larger bandwidth than state-of-the-art high-speed ADCs. However, this comes at the price of a higher power consumption, and the integration with an ADC chip results in higher costs.

In order to compensate for the higher costs in the application of an optical receiver, a transimpedance amplifier is developed which is suitable for integration with the demultiplexer on a single chip. The TIA exhibits a transimpedance of 70 dBΩ and a bandwidth of 45 GHz. With this solution, the system costs are comparable to the costs of a system with a single ADC chip if the necessary ADCs for the hybrid ADC are also integrated on a single chip.

# 8 Summary and Outlook

## 8.1 Summary

The increasing data traffic in the installed 10 Gbit/s fibre optical networks requires an upgrade to 40 Gbit/s. To achieve this, the fibre dispersion must be compensated by an electronic equaliser, such as a Viterbi equaliser. This equaliser requires an ADC with a nominal resolution of 3 bit and a sampling rate of 40 GS/s.

In this work, a prototype of this ADC is designed in a 65 nm low power CMOS technology. The architecture of the ADC is a fourfold time-interleaved flash ADC, so each channel operates at a quarter of the sampling rate of the complete ADC. Four sample and hold circuits are connected in parallel to realise the time-interleaving, and the appropriate clock signals are generated by a four-phase clock divider. A differential real-time interface provides the digital output data of each sub-ADC, which results in an interface of 4x3 bit.

An FPGA-based measurement system is developed in order to facilitate the characterisation of the ADC. A Virtex4 FPGA-board is used, which provides up to 20 high-speed interfaces with a data rate of 6.5 Gbit/s each. This enables a characterisation of the ADC up to a sampling rate of about 26 GS/s.
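
The limit of about 26 GS/s can be retraced with a short calculation. The sketch assumes that every output bit of a sub-ADC is carried by one serial interface and that one bit per sub-ADC sample is transmitted on it; this mapping and the function names are assumptions made for illustration.

```python
def lanes_needed(channels, bits):
    """Number of serial interfaces for an interface of channels x bits."""
    return channels * bits


def max_sampling_rate(lane_rate_gbps, channels):
    """Total sampling rate in GS/s when each sub-ADC sends one bit per lane and sample."""
    return lane_rate_gbps * channels


lanes = lanes_needed(channels=4, bits=3)                    # 4x3 bit interface
rate = max_sampling_rate(lane_rate_gbps=6.5, channels=4)    # limit set by the FPGA
speedup = 28.0 / 6.5    # lane rate of a 28 Gbit/s interface relative to 6.5 Gbit/s
print(f"{lanes} lanes, up to {rate:.0f} GS/s, lane speed-up {speedup:.1f}x")
```

With 12 of the 20 available interfaces in use, the lane rate and not the number of interfaces limits the measurable sampling rate in this sketch.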

The feasibility of a hybrid ADC is investigated with the intention of achieving very high sampling rates. The idea is to combine an analogue demultiplexer in indium phosphide technology with two CMOS ADCs to achieve twice the sampling rate of a single ADC and a larger bandwidth, while retaining the effective resolution of the single ADCs. To keep the costs of an optical receiver with a hybrid ADC low, it is also investigated whether integration of the demultiplexer and a transimpedance amplifier is feasible. A suitable TIA chip is developed for this purpose.

### 8.1.1 Measurement Results

The ADC achieves a measured effective resolution of 2.2 bit in the whole Nyquist range at a sampling rate of 25.6 GS/s. In contrast, in the single channel measurements an effective resolution between 2.2 bit and 2.7 bit is achieved at this sampling rate. The difference is caused by offset and gain errors in the sub-ADCs. This is also shown in the INL measurements, where a maximum mismatch of 1.2 LSB is measured between the channels.
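
The influence of offset and gain errors between the sub-ADCs can be illustrated with a simple behavioural model. The sketch below interleaves four ideal 3 bit quantisers and compares the effective resolution with and without mismatch. All offset and gain values are hypothetical and are not the measured errors of the prototype; the model also ignores timing errors and bandwidth limitations.

```python
import math


def quantise(v, bits):
    """Ideal quantiser for the input range [-1, 1); values outside are clipped."""
    levels = 2 ** bits
    code = min(levels - 1, max(0, math.floor((v + 1.0) / 2.0 * levels)))
    return (code + 0.5) / levels * 2.0 - 1.0


def enob(x, m):
    """Effective resolution of a coherent record x with m signal periods."""
    n = len(x)
    mean = sum(x) / n
    ac = [v - mean for v in x]                      # remove the DC component
    c = 2.0 / n * sum(ac[i] * math.cos(2 * math.pi * m * i / n) for i in range(n))
    s = 2.0 / n * sum(ac[i] * math.sin(2 * math.pi * m * i / n) for i in range(n))
    p_signal = (c * c + s * s) / 2.0                # power of the fundamental
    p_total = sum(v * v for v in ac) / n            # total AC power
    sndr = 10.0 * math.log10(p_signal / (p_total - p_signal))
    return (sndr - 1.76) / 6.02


def interleaved_adc(n, m, bits, offsets, gains):
    """Sample i is converted by sub-ADC i mod 4 with its own offset and gain."""
    channels = len(offsets)
    out = []
    for i in range(n):
        ch = i % channels
        v = math.sin(2 * math.pi * m * i / n + 0.1)
        out.append(quantise(gains[ch] * v + offsets[ch], bits))
    return out


N, M, BITS = 1024, 101, 3
LSB = 2.0 / 2 ** BITS

ideal = interleaved_adc(N, M, BITS, [0.0] * 4, [1.0] * 4)
mismatched = interleaved_adc(
    N, M, BITS,
    offsets=[0.0, 0.5 * LSB, -0.5 * LSB, 0.25 * LSB],   # assumed offset errors
    gains=[1.0, 0.95, 0.97, 0.98],                      # assumed gain errors
)

enob_ideal = enob(ideal, M)
enob_mismatched = enob(mismatched, M)
print(f"ideal: {enob_ideal:.2f} bit, with mismatch: {enob_mismatched:.2f} bit")
```

The mismatch adds spurious tones to the interleaved output, which lowers the effective resolution of the complete ADC below that of a single channel.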

Besides the FPGA measurements up to 25.6 GS/s, the functionality of the ADC is shown up to a sampling rate of 36 GS/s using a 4-channel real-time scope. At this sampling rate, the effective resolution of the single channels drops to about 1.7 bit for input signal frequencies between 12 GHz and 15 GHz. It is found that the effective resolution is limited by the settling time of the second track and hold circuit.

With an input signal attenuation below 4 dB up to an input signal frequency of 35 GHz for the demultiplexer presented in Chapter 7, a hybrid ADC with an effective resolution of 4 to 5 bit is feasible when operating at a sampling rate of 40 GS/s. The interface of the demultiplexer is compatible with the developed TIA, therefore the circuits can be integrated on a single chip. The TIA exhibits a transimpedance of 70 dBΩ and a bandwidth of 45 GHz.

### 8.1.2 Conclusion

The realised prototype of the 3 bit ADC shows that an ADC for 40 Gbit/s data transmission is feasible in CMOS technology. The size of the ADC is negligible compared to a Viterbi equaliser [3], therefore it can be integrated with an equaliser on a single chip at nearly no additional cost.

The measurement system developed to characterise the ADC is, as far as is known, the first low-cost, high-speed multi-channel data acquisition system capable of handling a data rate beyond 100 Gbit/s in real time. Comparable commercial measurement systems are about one hundred times more expensive than the FPGA solution.

The results of the hybrid ADC study show that, compared to state-of-the-art ADCs, the hybrid ADC is a feasible solution for achieving a higher sampling rate and a larger bandwidth. When it is used in a fibre optical receiver, the system costs can be kept at the level of a single-chip ADC solution, provided that the whole system design is in line with this approach. This means that the demultiplexer is integrated with the TIA on a single chip and the ADCs are also integrated on a single CMOS chip with the appropriate equaliser.

## 8.2 Future Work

The results of the 3 bit ADCs can be improved in terms of power consumption, sampling rate, bandwidth and effective resolution. The analysis in Chapter 6 examines all of these critical aspects of the circuit. The power consumption can be reduced by resizing the CML circuits. This requires a calibration circuit to compensate for the offset errors in the comparators and input amplifiers. A reduction of the comparator size also reduces the parasitic load of the second track and hold, which increases the bandwidth of the ADC. In Section 2.4 it is shown that the gain errors also have a significant impact on the effective resolution of a time-interleaved ADC. Therefore, a circuit that allows calibration of the gain error of the input buffers in the sample and hold circuits has to be developed. Another aspect is that great advances have been made in the performance of CMOS technology in recent years. Therefore, a transfer of the circuit to a state-of-the-art CMOS technology, such as a 28 nm technology, would increase the performance of the ADC [64, 65].

The results from the study in Chapter 7 show that the demultiplexer-based hybrid ADC is a feasible solution in order to achieve higher sampling rates and a larger bandwidth compared to state-of-the-art ADCs. In order to obtain a hybrid ADC system in a package, the next step is to design a thin film substrate containing the demultiplexer and ADCs.

Porting the FPGA-based measurement system to the latest FPGA generation, which provides a data rate of up to 28 Gbit/s on each of its 16 interfaces, would make measurements about four times faster than with the FPGAs used in this work [66].

# A Clipping in an Ideal ADC

For the dynamic ADC measurements, a sinusoidal input signal is applied to the input of the measured ADC. Clipping occurs if the input signal amplitude is above the specified input voltage range. This generates odd harmonics of the signal, which reduce the signal-to-noise and distortion ratio (SNDR) of an ADC because the energy of the harmonics adds to the noise and distortion energy.

Besides clipping, input signal attenuation also reduces the SNDR of an ADC. The magnitude of the signal bin in the calculated DFT spectrum decreases if the input signal amplitude is below the specified input voltage range. This in turn also reduces the SNDR.

Figure A.1 shows the impact of a mismatch of the input signal amplitude for an ideal 3 bit ADC. The results are obtained from a Matlab simulation.

**Figure A.1:** Reduction of the SNDR of an ideal 3 bit ADC caused by clipping or input signal attenuation.

Figure A.2 shows the impact of a mismatch of the input signal amplitude for an ideal 6 bit ADC. Compared to the 3 bit ADC, clipping has a much larger impact on the SNDR than the input signal attenuation. The input signal attenuation reduces the SNDR of a 6 bit ADC only by the magnitude of the attenuation.

Figure A.2: Reduction of the SNDR of an ideal 6 bit ADC caused by clipping or input signal attenuation.
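
The behaviour shown in Figures A.1 and A.2 can be retraced with a few lines of code. The following Python sketch is not the Matlab simulation used for the figures; it is a minimal model of an ideal quantiser with a sinusoidal input whose level is given relative to full scale. The record length, the number of signal periods and the function names are assumptions.

```python
import math


def ideal_adc(v, bits):
    """Ideal quantiser for the range [-1, 1); inputs outside this range are clipped."""
    levels = 2 ** bits
    code = min(levels - 1, max(0, math.floor((v + 1.0) / 2.0 * levels)))
    return (code + 0.5) / levels * 2.0 - 1.0


def sndr_db(x, m):
    """SNDR of a coherent record x that contains m signal periods."""
    n = len(x)
    mean = sum(x) / n
    ac = [v - mean for v in x]                      # remove the DC component
    c = 2.0 / n * sum(ac[i] * math.cos(2 * math.pi * m * i / n) for i in range(n))
    s = 2.0 / n * sum(ac[i] * math.sin(2 * math.pi * m * i / n) for i in range(n))
    p_signal = (c * c + s * s) / 2.0                # power of the fundamental
    p_total = sum(v * v for v in ac) / n            # total AC power
    return 10.0 * math.log10(p_signal / (p_total - p_signal))


def sndr_at_level(bits, level_db, n=1024, m=127):
    """SNDR for an input level in dB relative to full scale (0 dB = full scale)."""
    amplitude = 10.0 ** (level_db / 20.0)
    x = [ideal_adc(amplitude * math.sin(2 * math.pi * m * i / n + 0.1), bits)
         for i in range(n)]
    return sndr_db(x, m)


for bits in (3, 6):
    for level_db in (-6, -3, 0, 1, 3):
        print(f"{bits} bit, {level_db:+d} dB: SNDR = {sndr_at_level(bits, level_db):.1f} dB")

# SNDR loss caused by 3 dB of clipping and by 6 dB of attenuation.
drop3_clip = sndr_at_level(3, 0) - sndr_at_level(3, 3)
drop6_clip = sndr_at_level(6, 0) - sndr_at_level(6, 3)
loss6_att = sndr_at_level(6, 0) - sndr_at_level(6, -6)
```

In this model, levels above 0 dB are clipped by the quantiser itself, so clipping and attenuation are covered by the same sweep of the input level.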

# Personal Publications


[E-15] Mohamed S. Abdelfattah, Damir Ferenci, Markus Grözing, Manfred Berroth, Cor Scherjon, Joachim Burghartz, “2.2 GHz LC VCO for RFID on a 0.5-um digital gate-array designed for ultra-thin silicon substrates,” German Microwave Conf. (GeMiC), Darmstadt, Germany, March 2011.


# Bibliography


[57] Images used courtesy of R. Driad and M. Schlechtweg, copyright Fraunhofer Institute for Applied Solid State Physics (IAF), Freiburg, Germany.


# Curriculum Vitae

April 4th 1981       Birth in Stuttgart, Bad-Cannstatt
1987-1991           Hohensteinschule, Stuttgart Zuffenhausen
1992-2000           Ferdinand Porsche Gymnasium, Stuttgart Rot
2000-2006           Electrical engineering studies at the University of Stuttgart
March 31st 2006     Studies completed with the diploma degree in electrical engineering
2006-2007           Robert Bosch GmbH, Engineer in Hardware Development
since 2007          Scientific Employee at the Institute of Electrical and Optical Communications Engineering at the University of Stuttgart

# Acknowledgment

At this point, I would like to sincerely thank everyone who contributed to the success of this work.

First and foremost, I thank my doctoral supervisor Prof. Dr.-Ing. Manfred Berroth, to whom I am very grateful for his trust, his support and his supervision.

I thank Prof. Dr.-Ing. Joachim Burghartz for acting as co-examiner.

Furthermore, my special thanks go to Johannes Digel and Felix Lang for proofreading this thesis.