Translated in full from http://www.tldp.org/LDP/tlk/tlk.html

Title Page
This book is for Linux enthusiasts who want to know how the Linux kernel works. It is not an internals manual. Rather it describes the principles and mechanisms that Linux uses; how and why the Linux kernel works the way that it does.
Linux is a moving target; this book is based upon the current, stable, 2.0.33 sources as those are what most individuals and companies are now using.
Preface
Linux is a phenomenon of the Internet. Born out of the hobby project of a student, it has grown to become more popular than any other freely available operating system. To many, Linux is an enigma. How can something that is free be worthwhile? In a world dominated by a handful of large software corporations, how can something that has been written by a bunch of “hackers” (sic) hope to compete? How can software contributed to by many different people in many different countries around the world have a hope of being stable and effective? Yet stable and effective it is and compete it does. Many universities and research establishments use it for their everyday computing needs. People are running it on their home PCs and I would wager that most companies are using it somewhere even if they do not always realize that they do. Linux is used to browse the web, host web sites, write theses, send electronic mail and, as always with computers, to play games. Linux is emphatically not a toy; it is a fully developed and professionally written operating system used by enthusiasts all over the world.
The roots of Linux can be traced back to the origins of Unix™. In 1969, Ken Thompson of the Research Group at Bell Laboratories began experimenting on a multi-user, multi-tasking operating system using an otherwise idle PDP-7. He was soon joined by Dennis Ritchie and the two of them, along with other members of the Research Group, produced the early versions of Unix™. Ritchie was strongly influenced by an earlier project, MULTICS, and the name Unix™ is itself a pun on the name MULTICS. Early versions were written in assembly code, but the third version was rewritten in a new programming language, C. C was designed and written by Ritchie expressly as a programming language for writing operating systems. This rewrite allowed Unix™ to move onto the more powerful PDP-11/45 and 11/70 computers then being produced by DIGITAL. The rest, as they say, is history. Unix™ moved out of the laboratory and into mainstream computing and soon most major computer manufacturers were producing their own versions.
Linux was the solution to a simple need. The only software that Linus Torvalds, Linux’s author and principal maintainer, was able to afford was Minix. Minix is a simple, Unix™-like operating system widely used as a teaching aid. Linus was less than impressed with its features; his solution was to write his own software. He took Unix™ as his model as that was an operating system that he was familiar with in his day to day student life. He started with an Intel 386 based PC and started to write. Progress was rapid and, excited by this, Linus offered his efforts to other students via the emerging world wide computer networks, then mainly used by the academic community. Others saw the software and started contributing. Much of this new software was itself the solution to a problem that one of the contributors had. Before long, Linux had become an operating system. It is important to note that Linux contains no Unix™ code; it is a rewrite based on published POSIX standards. Linux is built with and uses a lot of the GNU (GNU’s Not Unix™) software produced by the Free Software Foundation in Cambridge, Massachusetts.
Most people use Linux as a simple tool, often just installing one of the many good CD ROM-based distributions. A lot of Linux users use it to write applications or to run applications written by others. Many Linux users read the HOWTOs avidly and feel both the thrill of success when some part of the system has been correctly configured and the frustration of failure when it has not. A minority are bold enough to write device drivers and offer kernel patches to Linus Torvalds, the creator and maintainer of the Linux kernel. Linus accepts additions and modifications to the kernel sources from anyone, anywhere. This might sound like a recipe for anarchy but Linus exercises strict quality control and merges all new code into the kernel himself. At any one time though, there are only a handful of people contributing sources to the Linux kernel.
The majority of Linux users do not look at how the operating system works, how it fits together. This is a shame because looking at Linux is a very good way to learn more about how an operating system functions. Not only is it well written, all the sources are freely available for you to look at. This is because although the authors retain the copyrights to their software, they allow the sources to be freely redistributable under the Free Software Foundation’s GNU General Public License. At first glance though, the sources can be confusing; you will see directories called kernel, mm and net but what do they contain and how does that code work? What is needed is a broader understanding of the overall structure and aims of Linux. This, in short, is the aim of this book: to promote a clear understanding of how Linux, the operating system, works, and to provide a mental model that allows you to picture what is happening within the system as you copy a file from one place to another or read electronic mail. I well remember the excitement that I felt when I first realized just how an operating system actually worked. It is that excitement that I want to pass on to the readers of this book.
My involvement with Linux started late in 1994 when I visited Jim Paradis who was working on a port of Linux to the Alpha AXP processor based systems. I had worked for Digital Equipment Co. Limited since 1984, mostly in networks and communications, and in 1992 I started working for the newly formed Digital Semiconductor division. This division’s goal was to enter fully into the merchant chip vendor market and sell chips, in particular the Alpha AXP range of microprocessors but also Alpha AXP system boards, outside of Digital. When I first heard about Linux I immediately saw an opportunity to have fun. Jim’s enthusiasm was catching and I started to help on the port. As I worked on this, I began more and more to appreciate not only the operating system but also the community of engineers that produces it.
However, Alpha AXP is only one of the many hardware platforms that Linux runs on. Most Linux kernels are running on Intel processor based systems but a growing number of non-Intel Linux systems are becoming more commonly available. Amongst these are Alpha AXP, ARM, MIPS, Sparc and PowerPC. I could have written this book using any one of those platforms but my background and technical experiences with Linux are with Linux on the Alpha AXP and, to a lesser extent, on the ARM. This is why this book sometimes uses non-Intel hardware as an example to illustrate some key point. It must be noted that around 95% of the Linux kernel sources are common to all of the hardware platforms that it runs on. Likewise, around 95% of this book is about the machine independent parts of the Linux kernel.
Reader Profile
This book does not make any assumptions about the knowledge or experience of the reader. I believe that interest in the subject matter will encourage a process of self education where necessary. That said, a degree of familiarity with computers, preferably the PC, will help the reader derive real benefit from the material, as will some knowledge of the C programming language.
Organisation of this Book
This book is not intended to be used as an internals manual for Linux. Instead it is an introduction to operating systems in general and to Linux in particular. The chapters each follow my rule of “working from the general to the particular”. They first give an overview of the kernel subsystem that they are describing before launching into its gory details.
I have deliberately not described the kernel’s algorithms, its methods of doing things, in terms of routine_X() calls routine_Y() which increments the foo field of the bar data structure. You can read the code to find these things out. Whenever I need to understand a piece of code or describe it to someone else I often start with drawing its data structures on the white-board. So, I have described many of the relevant kernel data structures and their interrelationships in a fair amount of detail.
Each chapter is fairly independent, like the Linux kernel subsystem that it describes. Sometimes, though, there are linkages; for example you cannot describe a process without understanding how virtual memory works.
The Hardware Basics chapter (Chapter 1) gives a brief introduction to the modern PC. An operating system has to work closely with the hardware system that acts as its foundations. The operating system needs certain services that can only be provided by the hardware. In order to fully understand the Linux operating system, you need to understand the basics of the underlying hardware.
The Software Basics chapter (Chapter 2) introduces basic software principles and looks at assembly and C programming languages. It looks at the tools that are used to build an operating system like Linux and it gives an overview of the aims and functions of an operating system.
The Memory Management chapter describes the way that Linux handles the physical and virtual memory in the system.
The Processes chapter describes what a process is and how the Linux kernel creates, manages and deletes the processes in the system.
Processes communicate with each other and with the kernel to coordinate their activities. Linux supports a number of Inter-Process Communication (IPC) mechanisms. Signals and pipes are two of them but Linux also supports the System V IPC mechanisms, named after the Unix™ release in which they first appeared. These interprocess communications mechanisms are described in the IPC chapter.
The Peripheral Component Interconnect (PCI) standard is now firmly established as the low cost, high performance data bus for PCs. The PCI chapter describes how the Linux kernel initializes and uses PCI buses and devices in the system.
The Interrupts and Interrupt Handling chapter looks at how the Linux kernel handles interrupts. Whilst the kernel has generic mechanisms and interfaces for handling interrupts, some of the interrupt handling details are hardware and architecture specific.
One of Linux’s strengths is its support for the many available hardware devices for the modern PC. The Device Drivers chapter describes how the Linux kernel controls the physical devices in the system.
The File system chapter describes how the Linux kernel maintains the files in the file systems that it supports. It describes the Virtual File System (VFS) and how the Linux kernel’s real file systems are supported.
Networking and Linux are terms that are almost synonymous. In a very real sense Linux is a product of the Internet or World Wide Web (WWW). Its developers and users use the web to exchange information, ideas and code, and Linux itself is often used to support the networking needs of organizations. The Networks chapter describes how Linux supports the network protocols known collectively as TCP/IP.
The Kernel Mechanisms chapter looks at some of the general tasks and mechanisms that the Linux kernel needs to supply so that other parts of the kernel work effectively together.
The Modules chapter describes how the Linux kernel can dynamically load functions, for example file systems, only when they are needed.
The Processors chapter gives a brief description of some of the processors that Linux has been ported to.
The Sources chapter describes where in the Linux kernel sources you should start looking for particular kernel functions.
Conventions used in this Book
Chapter 1 Hardware Basics
An operating system has to work closely with the hardware system that acts as its foundations. The operating system needs certain services that can only be provided by the hardware. In order to fully understand the Linux operating system, you need to understand the basics of the underlying hardware. This chapter gives a brief introduction to that hardware: the modern PC.
When the “Popular Electronics” magazine for January 1975 was printed with an illustration of the Altair 8800 on its front cover, a revolution started. The Altair 8800, named after the destination of an early Star Trek episode, could be assembled by home electronics enthusiasts for a mere $397. With its Intel 8080 processor and 256 bytes of memory but no screen or keyboard it was puny by today’s standards. Its inventor, Ed Roberts, coined the term “personal computer” to describe his new invention, but the term PC is now used to refer to almost any computer that you can pick up without needing help. By this definition, even some of the very powerful Alpha AXP systems are PCs.
Enthusiastic hackers saw the Altair’s potential and started to write software and build hardware for it. To these early pioneers it represented freedom; the freedom from huge batch processing mainframe systems run and guarded by an elite priesthood. Overnight fortunes were made by college dropouts fascinated by this new phenomenon, a computer that you could have at home on your kitchen table. A lot of hardware appeared, all different to some degree, and software hackers were happy to write software for these new machines. Paradoxically it was IBM who firmly cast the mould of the modern PC by announcing the IBM PC in 1981 and shipping it to customers early in 1982. With its Intel 8088 processor, 64K of memory (expandable to 256K), two floppy disks and an 80 character by 25 lines Colour Graphics Adapter (CGA) it was not very powerful by today’s standards but it sold well. It was followed, in 1983, by the IBM PC-XT which had the luxury of a 10Mbyte hard drive. It was not long before IBM PC clones were being produced by a host of companies such as Compaq and the architecture of the PC became a de-facto standard. This de-facto standard helped a multitude of hardware companies to compete together in a growing market which, happily for consumers, kept prices low. Many of the system architectural features of these early PCs have carried over into the modern PC. For example, even the most powerful Intel Pentium Pro based system starts running in the Intel 8086’s addressing mode. When Linus Torvalds started writing what was to become Linux, he picked the most plentiful and reasonably priced hardware, an Intel 80386 PC.

Looking at a PC from the outside, the most obvious components are a system box, a keyboard, a mouse and a video monitor. On the front of the system box are some buttons, a little display showing some numbers and a floppy drive. Most systems these days have a CD ROM and if you feel that you have to protect your data, then there will also be a tape drive for backups. These devices are collectively known as the peripherals.
Although the CPU is in overall control of the system, it is not the only intelligent device. All of the peripheral controllers, for example the IDE controller, have some level of intelligence. Inside the PC (Figure 1.1) you will see a motherboard containing the CPU or microprocessor, the memory and a number of slots for the ISA or PCI peripheral controllers. Some of the controllers, for example the IDE disk controller, may be built directly onto the system board.
1.1 The CPU
The CPU, or rather microprocessor, is the heart of any computer system. The microprocessor calculates, performs logical operations and manages data flows by reading instructions from memory and then executing them. In the early days of computing the functional components of the microprocessor were separate (and physically large) units. This is when the term Central Processing Unit was coined. The modern microprocessor combines these components onto an integrated circuit etched onto a very small piece of silicon. The terms CPU, microprocessor and processor are all used interchangeably in this book.
Microprocessors operate on binary data; that is, data composed of ones and zeros.
These ones and zeros correspond to electrical switches being either on or off. Just as 42 is a decimal number meaning “4 10s and 2 units”, a binary number is a series of binary digits each one representing a power of 2. In this context, a power means the number of times that a number is multiplied by itself. 10 to the power 1 ($10^1$) is 10, 10 to the power 2 ($10^2$) is 10x10, 10 to the power 3 ($10^3$) is 10x10x10 and so on. Binary 0001 is decimal 1, binary 0010 is decimal 2, binary 0011 is 3, binary 0100 is 4 and so on. So, 42 decimal is 101010 binary (2 + 8 + 32, or $2^1 + 2^3 + 2^5$). Rather than using binary to represent numbers in computer programs, another base, hexadecimal, is usually used.
In this base, each digit represents a power of 16. As decimal digits only go from 0 to 9, the numbers 10 to 15 are represented as a single digit by the letters A, B, C, D, E and F. For example, hexadecimal E is decimal 14 and hexadecimal 2A is decimal 42 (two 16s + 10). Using the C programming language notation (as I do throughout this book) hexadecimal numbers are prefaced by “0x”; hexadecimal 2A is written as 0x2A.
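The positional arithmetic above can be checked mechanically. The following is a minimal Python sketch; the helper names `to_base` and `from_base` are illustrative and do not come from the book. It converts between decimal, binary and hexadecimal by repeated division and multiplication.

```python
DIGITS = "0123456789ABCDEF"  # digit symbols for bases up to 16


def to_base(number: int, base: int) -> str:
    """Write a non-negative integer as a string of digits in the given base."""
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        # The remainder is the next digit, starting from the lowest power.
        number, remainder = divmod(number, base)
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))


def from_base(text: str, base: int) -> int:
    """Recover the integer by accumulating digit values, highest power first."""
    value = 0
    for symbol in text.upper():
        value = value * base + DIGITS.index(symbol)
    return value


if __name__ == "__main__":
    print(to_base(42, 2))        # binary digits of decimal 42
    print(to_base(42, 16))       # hexadecimal digits of decimal 42
    print(from_base("2A", 16))   # back to decimal
    print(0x2A == 42)            # Python shares C's 0x prefix
```

Python happens to accept the same `0x` prefix as C, so `0x2A` can be typed directly as a literal.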
Microprocessors can perform arithmetic operations such as add, multiply and divide and logical operations such as “is X greater than Y?”.
The processor’s execution is governed by an external clock. This clock, the system clock, generates regular clock pulses to the processor and, at each clock pulse, the processor does some work. For example, a processor could execute an instruction every clock pulse. A processor’s speed is described in terms of the rate of the system clock ticks. A 100MHz processor will receive 100,000,000 clock ticks every second. It is misleading to describe the power of a CPU by its clock rate as different processors perform different amounts of work per clock tick. However, all things being equal, a faster clock speed means a more powerful processor. The instructions executed by the processor are very simple; for example “read the contents of memory at location X into register Y”. Registers are the microprocessor’s internal storage, used for storing data and performing operations on it. The operations performed may cause the processor to stop what it is doing and jump to another instruction somewhere else in memory. These tiny building blocks give the modern microprocessor almost limitless power as it can execute millions or even billions of instructions a second.
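To make the clock-rate caveat concrete, here is a small sketch. The function name and the instructions-per-tick figures are made-up illustrations, not measurements of any real processor.

```python
def instructions_per_second(clock_hz: int, instructions_per_tick: float) -> float:
    """Estimate throughput as clock ticks per second times work done per tick."""
    return clock_hz * instructions_per_tick


# A 100MHz clock delivers 100,000,000 ticks every second.
CLOCK_100MHZ = 100_000_000
CLOCK_66MHZ = 66_000_000

# Two hypothetical processors: the slower-clocked one does more work per tick.
fast_clock = instructions_per_second(CLOCK_100MHZ, 1.0)
slow_clock = instructions_per_second(CLOCK_66MHZ, 2.0)

if __name__ == "__main__":
    print(f"100MHz, 1 instruction per tick: {fast_clock:,.0f} per second")
    print(f"66MHz, 2 instructions per tick: {slow_clock:,.0f} per second")
```

Under these assumed figures the processor with the slower clock completes more instructions each second, which is why clock rate alone is a poor measure of power.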
The instructions have to be fetched from memory as they are executed. Instructions may themselves reference data within memory and that data must be fetched from memory and saved there when appropriate.
The size, number and type of registers within a microprocessor are entirely dependent on its type. An Intel 80486 processor has a different register set to an Alpha AXP processor; for a start, the Intel’s are 32 bits wide and the Alpha AXP’s are 64 bits wide. In general, though, any given processor will have a number of general purpose registers and a smaller number of dedicated registers. Most processors have the following special purpose, dedicated, registers:
Program Counter (PC)
This register contains the address of the next instruction to be executed. The contents of the PC are automatically incremented each time an instruction is fetched.
Stack Pointer (SP)
Processors have to have access to large amounts of external read/write random access memory (RAM) which facilitates temporary storage of data. The stack is a way of easily saving and restoring temporary values in external memory. Usually, processors have special instructions which allow you to push values onto the stack and to pop them off again later. The stack works on a last in first out (LIFO) basis. In other words, if you push two values, x and y, onto a stack and then pop a value off the stack then you will get back the value of y.
Some processors’ stacks grow upwards towards the top of memory whilst others grow downwards towards the bottom, or base, of memory. Some processors support both types, for example ARM.
Processor Status (PS)
Instructions may yield results; for example “is the content of register X greater than the content of register Y?” will yield true or false as a result. The PS register holds this and other information about the current state of the processor. For example, most processors have at least two modes of operation, kernel (or supervisor) and user. The PS register would hold information identifying the current mode.
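The three dedicated registers can be seen working together in a toy model. The sketch below is not any real instruction set: the class `ToyCPU`, its operation names and the choice of a stack that grows downwards are all assumptions made for illustration.

```python
class ToyCPU:
    """A made-up processor with two general purpose registers plus PC, SP and PS."""

    def __init__(self, program, memory_size=16):
        self.program = program            # list of (operation, operand...) tuples
        self.memory = [0] * memory_size   # external read/write memory
        self.registers = {"X": 0, "Y": 0}
        self.pc = 0                       # index of the next instruction
        self.sp = memory_size             # stack grows downwards from the top
        self.ps = {"greater": False, "mode": "user"}

    def push(self, value):
        self.sp -= 1
        self.memory[self.sp] = value

    def pop(self):
        value = self.memory[self.sp]
        self.sp += 1
        return value

    def step(self):
        operation, *operands = self.program[self.pc]
        self.pc += 1  # the PC moves on as soon as the instruction is fetched
        if operation == "load":
            register, value = operands
            self.registers[register] = value
        elif operation == "push":
            self.push(self.registers[operands[0]])
        elif operation == "pop":
            self.registers[operands[0]] = self.pop()
        elif operation == "compare_greater":
            left, right = operands
            self.ps["greater"] = self.registers[left] > self.registers[right]
        elif operation == "jump":
            self.pc = operands[0]
        else:
            raise ValueError(f"unknown operation: {operation}")

    def run(self):
        while self.pc < len(self.program):
            self.step()


PROGRAM = [
    ("load", "X", 7),
    ("load", "Y", 9),
    ("push", "X"),
    ("push", "Y"),
    ("pop", "X"),                    # last in, first out: X receives Y's value
    ("load", "Y", 3),
    ("compare_greater", "X", "Y"),   # the result is recorded in the PS register
]

if __name__ == "__main__":
    cpu = ToyCPU(PROGRAM)
    cpu.run()
    print(cpu.registers, cpu.pc, cpu.sp, cpu.ps)
```

Pushing X and then Y and popping once returns Y's value, matching the LIFO description, and the comparison leaves its true or false outcome in the status register rather than in a general purpose register.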
1.2 Memory
All systems have a memory hierarchy with memory at different speeds and sizes at different points in the hierarchy. The fastest memory is known as cache memory and is what it sounds like - memory that is used to temporarily hold, or cache, contents of the main memory. This sort of memory is very fast but expensive, therefore most processors have a small amount of on-chip cache memory and more system based (on-board) cache memory. Some processors have one cache to contain both instructions and data, but others have two, one for instructions and the other for data. The Alpha AXP processor has two internal memory caches; one for data (the D-Cache) and one for instructions (the I-Cache). The external cache (or B-Cache) mixes the two together. Finally there is the main memory, which relative to the external cache memory is very slow. Relative to the on-CPU cache, main memory is positively crawling.
The cache and main memories must be kept in step (coherent). In other words, if a word of main memory is held in one or more locations in cache, then the system must make sure that the contents of cache and memory are the same. The job of cache coherency is done partially by the hardware and partially by the operating system. This is also true for a number of major system tasks where the hardware and software must cooperate closely to achieve their aims.
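One simple way to keep a cache in step with main memory is to send every write to both. The sketch below models that policy, usually called write-through; the text does not say which policy any particular system uses, so the policy, the class name `WriteThroughCache` and the oldest-first eviction rule are all assumptions chosen for brevity.

```python
class WriteThroughCache:
    """A small cache in front of a list that stands in for main memory."""

    def __init__(self, main_memory, capacity=4):
        self.main_memory = main_memory
        self.capacity = capacity
        self.lines = {}      # address -> cached copy of the word at that address
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.lines:
            self.hits += 1
            return self.lines[address]
        # Miss: fetch from the slow main memory and keep a copy.
        self.misses += 1
        value = self.main_memory[address]
        if len(self.lines) >= self.capacity:
            # Evict the oldest entry (dicts preserve insertion order).
            del self.lines[next(iter(self.lines))]
        self.lines[address] = value
        return value

    def write(self, address, value):
        # Update main memory on every write so the two never disagree.
        self.main_memory[address] = value
        if address in self.lines:
            self.lines[address] = value


if __name__ == "__main__":
    memory = [10, 20, 30, 40, 50, 60]
    cache = WriteThroughCache(memory, capacity=2)
    cache.read(3)
    cache.write(3, 99)
    print(memory[3], cache.read(3), cache.hits, cache.misses)
```

After a write, both the cached copy and main memory hold the new value, so a later read gives the same answer whether it hits or misses.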
1.3 Buses
The individual components of the system board are interconnected by multiple connection systems known as buses. The system bus is divided into three logical functions; the address bus, the data bus and the control bus. The address bus specifies the memory locations (addresses) for the data transfers. The data bus holds the data transferred. The data bus is bidirectional; it allows data to be read into the CPU and written from the CPU. The control bus contains various lines used to route timing and control signals throughout the system. Many flavours of bus exist, for example ISA and PCI buses are popular ways of connecting peripherals to the system.
1.4 Controller and Peripherals
Peripherals are real devices, such as graphics cards or disks, controlled by controller chips on the system board or on cards plugged into it. The IDE disks are controlled by the IDE controller chip and the SCSI disks by the SCSI disk controller chips and so on. These controllers are connected to the CPU and to each other by a variety of buses. Most systems built now use PCI and ISA buses to connect together the main system components. The controllers are processors like the CPU itself; they can be viewed as intelligent helpers to the CPU. The CPU is in overall control of the system.
All controllers are different, but they usually have registers which control them. Software running on the CPU must be able to read and write those controlling registers. One register might contain status describing an error. Another might be used for control purposes; changing the mode of the controller. Each controller on a bus can be individually addressed by the CPU so that the software device driver can write to its registers and thus control it. The IDE ribbon is a good example, as it gives you the ability to access each drive on the bus separately. Another good example is the PCI bus which allows each device (for example a graphics card) to be accessed independently.
1.5 Address Spaces
The system bus connects the CPU with the main memory and is separate from the buses connecting the CPU with the system’s hardware peripherals. Collectively the memory space that the hardware peripherals exist in is known as I/O space. I/O space may itself be further subdivided, but we will not worry too much about that for the moment. The CPU can access both the system space memory and the I/O space memory, whereas the controllers themselves can only access system memory indirectly and then only with the help of the CPU. From the point of view of the device, say the floppy disk controller, it will see only the address space that its control registers are in (ISA), and not the system memory. Typically a CPU will have separate instructions for accessing the memory and I/O space. For example, there might be an instruction that means “read a byte from I/O address 0x3f0 into register X”. This is exactly how the CPU controls the system’s hardware peripherals: by reading and writing their registers in I/O space. Where in I/O space the common peripherals (IDE controller, serial port, floppy disk controller and so on) have their registers has been set by convention over the years as the PC architecture has developed. The I/O space address 0x3f0, for instance, lies in the range conventionally assigned to the floppy disk controller’s registers, while the first serial port (COM1) conventionally has its registers at 0x3f8 and above.
系统总线将CPU与主存储器连接,并与连接CPU和系统硬件外围设备的总线分开。硬件外围设备所在的存储器空间统称为I/O空间。I/O空间本身可以进一步细分,但我们暂时不用过分担心。CPU既可以访问系统存储器空间,也可以访问I/O空间,而控制器本身只能间接访问系统存储器,并且只能在CPU的帮助下进行。从设备的角度来看,比如软盘控制器,它只能看到其控制寄存器所在的地址空间(ISA),而看不到系统内存。通常,CPU具有分别用于访问存储器和I/O空间的单独指令。例如,可能有一条指令的含义是“从I/O地址0x3f0读取一个字节到寄存器X”。CPU正是通过读取和写入I/O空间中的寄存器来控制系统硬件外设的。随着PC架构多年来的发展,常用外设(IDE控制器,串行端口,软盘控制器等)的寄存器在I/O空间中的位置已按照惯例确定下来。例如,I/O空间地址0x3f0位于按惯例分配给软盘控制器寄存器的范围内,而第一个串行端口(COM1)的寄存器按惯例从0x3f8开始。
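The distinction between system memory and I/O space can be illustrated with a small simulation. In this sketch the two address spaces are plain C arrays and the four access functions stand in for the CPU's separate memory and I/O instructions (on x86 the I/O instructions are `in` and `out`). The array sizes and function names are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_SIZE 4096u   /* size of the simulated system memory (assumed) */
#define IO_SIZE  1024u   /* size of the simulated I/O space (assumed) */

static uint8_t system_memory[MEM_SIZE];
static uint8_t io_space[IO_SIZE];

/* Stand-ins for ordinary load and store instructions. */
uint8_t mem_read8(uint16_t addr)
{
    assert(addr < MEM_SIZE);
    return system_memory[addr];
}

void mem_write8(uint16_t addr, uint8_t value)
{
    assert(addr < MEM_SIZE);
    system_memory[addr] = value;
}

/* Stand-ins for dedicated I/O instructions. */
uint8_t io_read8(uint16_t port)
{
    assert(port < IO_SIZE);
    return io_space[port];
}

void io_write8(uint16_t port, uint8_t value)
{
    assert(port < IO_SIZE);
    io_space[port] = value;
}
```

The point of the sketch is that the same number, say 0x3f0, names two unrelated locations: `mem_read8(0x3f0)` and `io_read8(0x3f0)` return independent values because they address different spaces.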
There are times when controllers need to read or write large amounts of data directly to or from system memory, for example when user data is being written to the hard disk. In this case, Direct Memory Access (DMA) controllers are used to allow hardware peripherals to directly access system memory, but this access is under the strict control and supervision of the CPU.
有时控制器需要直接从系统内存中读取或写入大量数据。例如,当用户数据被写入硬盘时。在这种情况下,直接内存访问(DMA)控制器用于允许硬件外围设备直接访问系统内存,但此访问受到CPU的严格控制和监督。
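A rough way to picture DMA is as a two-step protocol: the CPU first describes and permits a transfer, and only then may the device move the data. The following simulation is a sketch under that assumption; `struct dma_channel`, `dma_program()` and `dma_to_device()` are hypothetical names and do not correspond to any real DMA controller interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 256u    /* size of the simulated system memory (assumed) */

static uint8_t system_memory[MEM_SIZE];

struct dma_channel {
    size_t mem_addr;  /* where in system memory the transfer starts */
    size_t count;     /* number of bytes to move */
    int    armed;     /* set by the CPU once the transfer is permitted */
};

/* CPU side: describe the transfer and permit it.
 * Returns 0 on success, -1 if the range falls outside system memory. */
int dma_program(struct dma_channel *ch, size_t mem_addr, size_t count)
{
    assert(ch != NULL);
    if (mem_addr > MEM_SIZE || count > MEM_SIZE - mem_addr)
        return -1;
    ch->mem_addr = mem_addr;
    ch->count = count;
    ch->armed = 1;
    return 0;
}

/* Device side: copy the programmed range out of system memory.
 * Returns the number of bytes moved, or 0 if the CPU has not armed
 * the channel or the device buffer is too small. */
size_t dma_to_device(struct dma_channel *ch, uint8_t *device_buf, size_t buf_len)
{
    assert(ch != NULL && device_buf != NULL);
    if (!ch->armed || buf_len < ch->count)
        return 0;
    memcpy(device_buf, &system_memory[ch->mem_addr], ch->count);
    ch->armed = 0;  /* one transfer per programming step */
    return ch->count;
}
```

The device can never touch memory outside the range the CPU programmed, which is the sense in which the access stays under the CPU's control.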
1.6 Timers
All operating systems need to know the time and so the modern PC includes a special peripheral called the Real Time Clock (RTC). This provides two things: a reliable time of day and an accurate timing interval. The RTC has its own battery so that it continues to run even when the PC is not powered on; this is how your PC always “knows” the correct date and time. The interval timer allows the operating system to accurately schedule essential work.
所有操作系统都需要知道时间,因此现代PC包括一个称为实时时钟(RTC)的特殊外设。它提供两样东西:可靠的日期时间和准确的定时间隔。RTC有自己的电池,因此即使PC没有通电它也会继续运行,这就是为什么PC总是“知道”正确的日期和时间。间隔计时器使操作系统能够准确地调度必要的工作。
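One way to picture how an interval timer lets an operating system schedule work is a tick counter: the hardware raises a tick at a fixed rate, and the system runs a piece of work every so many ticks. The sketch below simulates this; `struct interval_timer` and `timer_tick()` are hypothetical names, and the period is an arbitrary value chosen by the caller.

```c
#include <assert.h>
#include <stddef.h>

struct interval_timer {
    unsigned long ticks;   /* ticks seen since the timer started */
    unsigned long period;  /* run the scheduled work every 'period' ticks */
    unsigned long fired;   /* how many times the work has become due */
};

/* Called once per hardware tick.
 * Returns 1 when the scheduled work is due on this tick, 0 otherwise. */
int timer_tick(struct interval_timer *t)
{
    assert(t != NULL && t->period > 0);
    t->ticks++;
    if (t->ticks % t->period == 0) {
        t->fired++;
        return 1;
    }
    return 0;
}
```

With a period of 10, the work becomes due on ticks 10, 20, 30 and so on, regardless of what else the system is doing between ticks.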
Chapter 2 Software Basics
A program is a set of computer instructions that perform a particular task. That program can be written in assembler, a very low level computer language, or in a high level, machine independent language such as the C programming language. An operating system is a special program which allows the user to run applications such as spreadsheets and word processors. This chapter introduces basic programming principles and gives an overview of the aims and functions of an operating system.
程序是一组执行特定任务的计算机指令。程序可以用非常底层的汇编语言编写,或者使用与机器无关的高级语言,如C编程语言编写。操作系统是一个特殊的程序,用户可以利用它运行电子表格和字处理器等应用软件。本章将介绍基本的编程概念,并概述操作系统的目标和功能。
2.1 Computer Languages
2.1.1 Assembly Languages
The instructions that a CPU fetches from memory and executes are not at all understandable to human beings. They are machine codes which tell the computer precisely what to do. The hexadecimal number 0x89E5 is an Intel 80486 instruction which copies the contents of the ESP register to the EBP register. One of the first software tools invented for the earliest computers was an assembler, a program which takes a human readable source file and assembles it into machine code. Assembly languages explicitly handle registers and operations on data and they are specific to a particular microprocessor. The assembly language for an Intel X86 microprocessor is very different to the assembly language for an Alpha AXP microprocessor. The following Alpha AXP assembly code shows the sort of operations that a program can perform:
ldr r16, (r15) ; Line 1
ldr r17, 4(r15) ; Line 2
beq r16,r17,100 ; Line 3
str r17, (r15) ; Line 4
100: ; Line 5
CPU从内存中取出并执行的指令根本不是人类可理解的。它们是精确地告诉计算机该做什么的机器码。十六进制数0x89E5是一个Intel 80486指令,它将ESP寄存器中的内容复制到EBP寄存器中。为最早的计算机发明的首批软件工具之一是汇编器,它读取人类可读的源文件并将其汇编成机器代码。汇编语言显式地处理寄存器和对数据的操作,并且与特定的微处理器相关。用于Intel x86微处理器的汇编语言和用于Alpha AXP微处理器的汇编语言有很大的不同。下面的Alpha AXP汇编代码展示了一个程序可以执行的操作:
ldr r16, (r15) ; Line 1
ldr r17, 4(r15) ; Line 2
beq r16,r17,100 ; Line 3
str r17, (r15) ; Line 4
100: ; Line 5
The first statement (on line 1) loads register 16 from the address held in register 15. The next instruction loads register 17 from the next location in memory. Line 3 compares the contents of register 16 with that of register 17 and, if they are equal, branches to label 100. If the registers do not contain the same value then the program continues to line 4 where the contents of r17 are saved into memory. If the registers do contain the same value then no data needs to be saved. Assembly level programs are tedious and tricky to write and prone to errors. Very little of the Linux kernel is written in assembly language, and those parts that are exist only for efficiency and are specific to particular microprocessors.
第一条语句(第1行)从寄存器15中保存的地址加载寄存器16。下一条指令从内存中的下一个位置加载寄存器17。第3行指令比较寄存器16和17中的内容,如果它们相等,则跳转到标记100。如果两者的值不相等,程序继续执行第4行,将寄存器17中的内容保存到内存中。如果寄存器含有相同的值,那么没有数据需要保存。汇编级程序编写起来既乏味又棘手,容易出错。Linux内核中只有很少一部分是用汇编语言编写的,这些部分只是为了提高效率,并且只适用于特定的微处理器。
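To make the walk-through concrete, the five assembly lines can be traced on a toy machine simulated in C. This is a sketch under stated assumptions: the machine has 32 registers, memory words are 32 bits wide (suggested by the offset of 4 in line 2), and `struct machine`, `load32()`, `store32()` and `run_fragment()` are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct machine {
    uint32_t r[32];    /* general purpose registers r0..r31 */
    uint8_t  mem[64];  /* a small byte-addressed memory (size assumed) */
};

/* Read one 32-bit word from simulated memory. */
static uint32_t load32(const struct machine *m, uint32_t addr)
{
    uint32_t value;
    assert(addr <= sizeof m->mem - 4);
    memcpy(&value, &m->mem[addr], 4);
    return value;
}

/* Write one 32-bit word to simulated memory. */
static void store32(struct machine *m, uint32_t addr, uint32_t value)
{
    assert(addr <= sizeof m->mem - 4);
    memcpy(&m->mem[addr], &value, 4);
}

/* Trace the fragment. Returns 1 if the store on line 4 was executed,
 * 0 if the branch on line 3 skipped it. */
int run_fragment(struct machine *m)
{
    assert(m != NULL);
    m->r[16] = load32(m, m->r[15]);      /* Line 1 */
    m->r[17] = load32(m, m->r[15] + 4);  /* Line 2 */
    if (m->r[16] == m->r[17])            /* Line 3: branch if equal */
        return 0;                        /* Line 5: label 100 */
    store32(m, m->r[15], m->r[17]);      /* Line 4 */
    return 1;
}
```

If the two words at the address in r15 differ, the first run copies the second word over the first; running the fragment again then takes the branch, because the two words are now equal.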
2.1.2 The C Programming Language and Compiler
Writing large programs in assembly language is a difficult and time consuming task. It is prone to error and the resulting program is not portable, being tied to one particular processor family. It is far better to use a machine independent language like C. C allows you to describe programs in terms of their logical algorithms and the data that they operate on. Special programs called compilers read the C program and translate it into assembly language, generating machine specific code from it. A good compiler can generate assembly instructions that are very nearly as efficient as those written by a good assembly programmer. Most of the Linux kernel is written in the C language. The following C fragment:
用汇编语言编写大型程序很困难,并且非常耗时。汇编程序容易出错,并且不可移植,它与一个特定的处理器系列绑定。最好使用像C这样与机器无关的语言。C允许你根据程序的逻辑算法及其操作的数据来描述程序。称为编译器的特殊程序读取C程序,并将其翻译成汇编语言,从中生成机器相关的代码。一个好的编译器生成的汇编指令,几乎与一个优秀的汇编程序员编写的汇编指令一样高效。Linux内核的大部分是用C语言编写的。下面的C片段: