[home] [purchase] [table of contents] [errata] [change log] [report error] [authors] [search]
CONTENTS
LIST OF FIGURES  xvi
LIST OF TABLES  xxii
FOREWORD  xxv
PREFACE  xxix
1 INTRODUCTION  1
1.1 Microprocessors: from CISC to EPIC  1
1.1.1 Summary of microprocessor taxonomy  4
1.1.2 The IA-64 architecture and Itanium  4
1.2 A Brief History of Linux  5
1.2.1 The early days  6
1.2.2 Branching out: Linux goes multiplatform  7
1.2.3 IA-64 Linux  8
1.2.4 Summary of Linux history  8
1.3 Overview of the Linux Kernel  9
1.3.1 Key concepts  10
1.3.2 Hardware model  19
1.3.3 Kernel components  20
1.3.4 The kernel source code  25
1.4 Summary  27
2 IA-64 ARCHITECTURE  29
2.1 User-Level Instruction Set Architecture  31
2.1.1 Instruction format  31
2.1.2 Instruction sequencing  32
2.1.3 Register files  34
2.1.4 Instruction set overview  40
2.1.5 Integer and SIMD instructions  41
2.1.6 Memory and semaphore instructions  42
2.1.7 Branch instructions  45
2.1.8 Register-stack-related instructions  47
2.1.9 Control instructions  50
2.1.10 Floating-point instructions  51
2.1.11 Modulo-scheduled loops  51
2.2 Runtime and Software Conventions  55
2.2.1 Data model  55
2.2.2 Register usage  56
2.2.3 Procedure linkage  62
2.2.4 Memory stack  60
2.2.5 Register stack  60
2.2.6 The global pointer  60
2.2.7 Programming in IA-64 assembly language  62
2.3 System Instruction Set Architecture  66
2.3.1 System register files  66
2.3.2 Privileged instructions  71
2.3.3 Interruptions  72
2.4 The Register Stack Engine (RSE)  75
2.4.1 The register stack configuration register (rsc)  77
2.4.2 Dealing with NaT bits  78
2.4.3 RSE arithmetic  80
2.4.4 Convenience routines for RSE arithmetic  81
2.4.5 Instructions that affect the RSE  82
2.5 Summary  84
3 PROCESSES, TASKS, AND THREADS  85
3.1 Introduction to Linux Tasks  87
3.1.1 Task creation  89
3.1.2 Historical perspective  92
3.2 The Thread Interface  93
3.2.1 The pt-regs structure  94
3.2.2 The switch-stack structure  95
3.2.3 The thread structure  98
3.2.4 IA-64 register stack  101
3.2.5 Summary of IA-64 thread state  102
3.2.6 Running threads  103
3.2.7 Creating threads  108
3.2.8 Terminating threads  114
3.2.9 Moving threads across the address-space boundary  115
3.3 Thread Synchronization  117
3.3.1 Concurrency model  118
3.3.2 Atomic operations  119
3.3.3 Semaphores  124
3.3.4 Interrupt masking  125
3.3.5 Spinlocks  127
3.4 Summary  128
4 VIRTUAL MEMORY  131
4.1 Introduction to the Virtual Memory System  132
4.1.1 Virtual-to-physical address translation  133
4.1.2 Demand paging  134
4.1.3 Paging and swapping  134
4.1.4 Protection  136
4.2 Address Space of a Linux Process  138
4.2.1 User address space  139
4.2.2 Page-table-mapped kernel segment  144
4.2.3 Identity-mapped kernel segment  144
4.2.4 Structure of IA-64 address space  148
4.3 Page Tables  152
4.3.1 Collapsing page-table levels  155
4.3.2 Virtually-mapped linear page tables  155
4.3.3 Structure of Linux/ia64 page tables  158
4.3.4 Page table entries (PTEs)  161
4.3.5 Page table accesses  168
4.3.6 Page table directory creation  173
4.4 Translation Lookaside Buffer (TLB)  174
4.4.1 The IA-64 TLB architecture  176
4.4.2 Maintenance of TLB coherency  182
4.4.3 Lazy TLB flushing  185
4.5 Page Fault Handling  187
4.5.1 Example: How copy-on-write really works  188
4.5.2 The Linux page fault handler  191
4.5.3 IA-64 implementation  192
4.6 Memory Coherency  200
4.6.1 Maintenance of coherency in the Linux kernel  201
4.6.2 IA-64 implementation  203
4.7 Switching Address Spaces  205
4.7.1 Address-space switch interface  205
4.7.2 IA-64 implementation  206
4.8 Discussion and Summary  206
5 KERNEL ENTRY AND EXIT  209
5.1 Interruptions  210
5.1.1 Kernel entry path  211
5.1.2 Kernel exit path  211
5.1.3 Discussion  213
5.1.4 IA-64 implementation  214
5.1.5 Switching the IA-64 register stack  217
5.2 System Calls  225
5.2.1 Signaling errors  226
5.2.2 Restarting system call execution  227
5.2.3 Invoking system calls from the kernel  230
5.2.4 IA-64 implementation  230
5.3 Signals  238
5.3.1 Signal-related system calls  239
5.3.2 Signal delivery  243
5.3.3 IA-64 implementation  246
5.4 Kernel Access to User Memory  251
5.4.1 Example: How gettimeofday() returns the timeval structure  254
5.4.2 Disabling validity checking  256
5.4.3 IA-64 implementation  257
5.5 Summary  261
6 STACK UNWINDING  263
6.1 IA-64 ELF Unwind Sections  265
6.2 The Kernel Unwind Interface  267
6.2.1 Managing unwind tables  267
6.2.2 Navigating through the call chain  267
6.2.3 Accessing the CPU state of the current frame  269
6.2.4 Using the unwind interface  273
6.3 Embedding Unwind Information in Assembly Code  276
6.3.1 Region directives  278
6.3.2 Prologue directives  280
6.3.3 Body directives  282
6.3.4 General directives  282
6.3.5 Examples  283
6.4 Implementation Aspects  285
6.4.1 The frame info structure  285
6.4.2 Unwind descriptor processing  287
6.4.3 Unwind scripts  289
6.4.4 Lazy initialization and script hinting  292
6.4.5 Putting it all together  293
6.5 Summary  294
7 DEVICE I/O  295
7.1 Introduction  295
7.1.1 Organization of modern machines  297
7.1.2 Software support for I/O on modern machines  298
7.2 Programmed I/O  299
7.2.1 Memory-mapped I/O  299
7.2.2 Port I/O  304
7.3 Direct Memory Access (DMA)  308
7.3.1 PCI DMA interface  310
7.3.2 Example: Sending a network packet  315
7.3.3 IA-64 implementation  317
7.4 Device Interrupts  318
7.4.1 IA-64 hardware interrupt architecture  320
7.4.2 Device interrupt interface  326
7.4.3 Interrupt handling  332
7.4.4 Managing the IA-64 interrupt steering logic  334
7.5 Summary  335
8 SYMMETRIC MULTIPROCESSING  337
8.1 Introduction to Multiprocessing on Linux  337
8.2 Linux Locking Principles  339
8.2.1 Locking rules  341
8.2.2 The big kernel lock (BKL)  342
8.3 Multiprocessor Support Interface  344
8.3.1 Support facilities  345
8.3.2 IA-64 implementation  348
8.4 CPU-Specific Data Area  352
8.4.1 False sharing  353
8.4.2 Virtual mapping of CPU-specific data area  355
8.5 Tracking Wall-Clock Time with High Resolution  356
8.5.1 MP challenges and options in maintaining wall-clock time  356
8.5.2 Synchronizing the cycle counters in an MP machine  357
8.6 Summary  361
9 UNDERSTANDING SYSTEM PERFORMANCE  363
9.1 IA-64 Performance Monitoring Unit Overview  366
9.1.1 PMU register file  366
9.1.2 Controlling monitoring  372
9.1.3 Dealing with counter overflows  373
9.2 Extending the PMU: The Itanium Example  375
9.2.1 Itanium PMU additional capabilities  375
9.2.2 Itanium PMU register file  376
9.2.3 Itanium PMU events  377
9.2.4 Hardware support for event sampling  378
9.2.5 Event address registers (EAR)  381
9.2.6 Branch trace buffer (BTB)  384
9.2.7 Miscellaneous features  388
9.3 Kernel Support for Performance Monitoring  390
9.3.1 The perfmon interface  392
9.3.2 Implementation aspects  399
9.3.3 Using the perfmon interface: The pfmon example  405
9.4 Summary  407
10 BOOTING  409
10.1 IA-64 Firmware Overview  410
10.1.1 Processor Abstraction Layer (PAL)  411
10.1.2 System Abstraction Layer (SAL)  417
10.1.3 Advanced configuration and power interface (ACPI)  422
10.1.4 Extensible firmware interface (EFI)  426
10.2 The Bootloader  435
10.2.1 Loading the kernel image  436
10.2.2 Loading the initial RAM disk  437
10.2.3 Loading the FPSWA  438
10.2.4 Collecting the boot parameters  438
10.2.5 Starting the kernel  439
10.3 Kernel Initialization  441
10.3.1 The bootstrap interface  441
10.3.2 IA-64 implementation  445
10.4 Summary  448
11 IA-32 COMPATIBILITY  449
11.1 Architectural Support for IA-32  450
11.1.1 IA-32 user-level machine state  451
11.1.2 Mapping the IA-32 user-level machine state to IA-64  452
11.1.3 IA-32 segmentation and memory addressing  454
11.1.4 Transferring control between IA-32 and IA-64  456
11.2 Linux Support for IA-32 Applications  457
11.2.1 Kernel representation of an IA-32 task  459
11.2.2 Address space of an emulated IA-32 task  459
11.2.3 Dealing with absolute filesystem paths  463
11.2.4 Starting an IA-32 executable  465
11.2.5 System call emulation  469
11.2.6 Signal delivery  479
11.2.7 Accessing I/O port space  480
11.3 Summary  481
A IA-64 CPU MODELS  483
B KERNEL REGISTER USAGE  484
C IA-64 INSTRUCTIONS  485
C.1 Integer Instructions  485
C.2 Memory Instructions  487
C.3 Semaphore Instructions  487
C.4 Branch Instructions  487
C.5 Control Instructions  488
C.6 Multimedia Instructions  488
C.7 Floating-Point Instructions  489
C.8 Privileged Instructions  490
D ITANIUM PMU EVENTS  491
E GLOSSARY  495
BIBLIOGRAPHY  499
INDEX  505
Last modified: June 30, 2005. Copyright © 2001-2005 David Mosberger. All rights reserved.