Intel History

The Pentium Pro also supported glueless symmetric multi-processing (SMP) for systems with up to 4 processors, enabling ordinary PC companies to enter the server market. The Pentium Pro was followed by incremental improvements of the Pentium Pro architecture in Pentium II and Pentium III. The next major new architecture was the Pentium 4.

At the time, Intel had two major concurrent design teams for mainstream personal computers, one focused on the high end, the other on the low end. This was in addition to the design teams working on Itanium.

Eventually Intel settled into its Tick-Tock model. Intel has two major processor architecture teams and at least one smaller design team. The large teams handle new processor microarchitectures (the "tock" in the Intel tick-tock). A smaller team under Intel's process technology division is responsible for taking an existing processor architecture to the new manufacturing process (the "tick"). There may be other small teams that handle the variants of each basic architecture.

Each architecture team is expected to produce a new processor architecture every four years. Two architecture teams are employed so that a new architecture is ready every other year. The process team is primarily responsible for moving the latest tock processor to the next manufacturing process, but it can also make design improvements, which sometimes have a significant impact (such as the on-die L2 cache in the Pentium III codenamed Coppermine).
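The cadence described here is simple arithmetic: if each team delivers every four years and the two teams are staggered by half a cycle, a new architecture arrives every two years. A minimal Python sketch follows; the function name, the team labels and the year offsets are hypothetical and chosen only for illustration, not actual Intel release dates.

```python
def release_years(first_year, period, count):
    """Return the year offsets at which one team ships a new architecture."""
    return [first_year + i * period for i in range(count)]

# From the text: each team produces a new architecture every four years.
TEAM_PERIOD = 4

# Assumption for illustration: years are offsets from an arbitrary start,
# and the second team is staggered by half a period.
team_a = release_years(0, TEAM_PERIOD, 3)
team_b = release_years(TEAM_PERIOD // 2, TEAM_PERIOD, 3)

# Merge both schedules and measure the spacing between consecutive releases.
combined = sorted(team_a + team_b)
gaps = [later - earlier for earlier, later in zip(combined, combined[1:])]

print("combined schedule:", combined)
print("gaps between new architectures:", gaps)
```

Every gap in the merged schedule comes out to two years, which is the "every other year" cadence; the process shrinks ("ticks") would fall in the years between.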

The Pentium 4 (NetBurst) and Nehalem architectures are from one major team, hence both feature Hyper-Threading. Pentium M, Core 2 (Conroe) and the upcoming Sandy Bridge are from the other team. So even though Core 2 was a later microarchitecture than NetBurst, the other team elected not to employ Hyper-Threading in it. However, Sandy Bridge is expected to have Hyper-Threading; it is unclear whether it will implement two or four threads per core. (See Wikipedia, List_of_Intel_microprocessors.)

AMD History

In this time period, the AMD K5 project (1995?) did not achieve the performance goals of its ambitious architecture, and the K5 was set aside. AMD had purchased NexGen and adopted the NexGen design as the K6, setting aside its own K6 design. The K6 architecture (1996) was relatively successful and was regarded as more advanced than the Intel Pentium, but it fell below the performance of its contemporary, the Pentium II. Bracketed by the Pentium II, which could occupy the higher price points, and by the Pentium (MMX), which had relatively low cost, AMD had difficulty achieving financial success in this period.

The AMD K7 (Athlon, 1999) architecture was more advanced than the Intel Pentium III. The K7 was less "advanced" than the Pentium 4, but because the Pentium 4 had uneven performance characteristics, AMD was finally able to establish a position in the more profitable desktop processor market segments. The K7 was soon followed by the K8 (Opteron, 2003), which incorporated several bold initiatives. The Opteron was considered equal to or better than the contemporary Intel processors, and it also launched AMD into the SMP server market. Afterwards, AMD retained the basic Opteron architecture, making incremental improvements while focusing on increasing the number of cores.

Intel Pentium Pro to Pentium III

The Pentium Pro came to market in late 1995 on a 0.5 micron process, with 5.5M transistors, at 150-166MHz. The initial version had the smaller 256K L2 cache, which was better suited to 2-way servers. A later version with a 512K L2 cache had decent 4-way system scaling.

The original design was on 0.6µm, at 133MHz. I am not sure whether this was actually sold as a product. It was quickly succeeded by a 0.5µm version at 150MHz, and then a 0.35µm version at up to 200MHz.

Pentium Pro
P6, 1995 Nov, 600nm, 413mm2, 20.3×20.3mm

Pentium Pro
256K L2, 500nm, 203mm2, 17.1×11.8mm

The Pentium Pro was followed in 1997 by the Pentium II (Klamath, 0.35 micron, 7.5M transistors, up to 300MHz), which incorporated the MMX (vector integer) instruction set extensions first developed for the Pentium MMX, along with improvements in misaligned cache accesses for 16-bit applications.

Klamath, 1997 May, 350nm, 204mm2, 14.9×13.7mm, 512K off-die L2

Deschutes was the codename for the 0.25 micron version of Pentium II, launched in 1998 and available at frequencies up to 450MHz. The server version was the Pentium II Xeon, with a larger, higher-bandwidth L2 cache.


Deschutes, 1998 Jan, 250nm, 118?mm2, 512K off-die L2

Pentium III followed in 1999 with the new SSE (vector single precision FP) instruction set extensions. The initial version, codenamed Katmai, was built on the same 0.25 micron (250nm) process as Deschutes. Transistor count was now 9.5M. The maximum desktop frequency reached 600MHz, but the Pentium III Xeon server version stopped at 550MHz.

Katmai, 1999 May, 250nm, 127?mm2, 512K off-die L2

The 180nm version of Pentium III, codenamed Coppermine, launched in early 2000. The L2 cache was brought on-die at 256K, for a total of 26M transistors. The main motivation was cost reduction, both for the processor and at the system level. Below the radar, a new very low latency 256-bit wide bus was designed to connect the L2 cache. (For Pentium II, there was a derivative with on-die L2 that retained the original 64-bit back-side bus.)
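To give a sense of what the wider L2 bus means, the sketch below compares bytes moved per transfer and peak bandwidth for a 64-bit and a 256-bit bus. The function name, the 600MHz bus clock and the one-transfer-per-clock model are assumptions made for illustration, not figures from this article; actual L2 clocking differed between parts.

```python
def peak_bandwidth_gb_s(bus_width_bits, clock_mhz, transfers_per_clock=1):
    """Peak bandwidth in GB/s for a bus of the given width and clock."""
    bytes_per_transfer = bus_width_bits // 8
    return bytes_per_transfer * clock_mhz * 1e6 * transfers_per_clock / 1e9

# Assumption for illustration: both buses are evaluated at the same clock,
# so the comparison isolates the effect of bus width alone.
ASSUMED_CLOCK_MHZ = 600

narrow = peak_bandwidth_gb_s(64, ASSUMED_CLOCK_MHZ)    # 64-bit back-side bus
wide = peak_bandwidth_gb_s(256, ASSUMED_CLOCK_MHZ)     # 256-bit on-die L2 bus

print(f"64-bit bus:  {narrow:.1f} GB/s")
print(f"256-bit bus: {wide:.1f} GB/s")
print(f"ratio:       {wide / narrow:.1f}x")
```

At equal clock the 256-bit bus moves four times as many bytes per transfer. Latency, which the text singles out as very low for the new bus, is a separate matter that this peak-bandwidth model does not capture.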


Coppermine, 2000 Mar, 180nm, 106mm2, 256K L2

The server version underwent late revisions due to poor and incompetent planning. It was realized that the initial plan, a quick shrink of Katmai with the long-latency off-die cache on a narrow 64-bit bus, would look pathetic compared to Coppermine. The revised part, codenamed Cascades, also adopted an on-die L2 cache, with 1M and 2M versions to enable adequate 4-way scaling. The initial Pentium III Xeon in this form ran at 700MHz and became available about one year after Coppermine (1GHz), with an even later version at 900MHz.

Cascades, 180nm, 385mm2, 2M L2, 17.7×21.7mm

Pentium III was then carried forward to 130nm (Tualatin) with a larger 512K on-die cache, aimed at the mobile market. Its performance characteristics were good enough that 2-way servers also adopted this processor.


Tualatin, 2001 Apr, 130nm, 80mm2, 512K L2

All Pentium Pro to Pentium III processors employed a common bus. The initial Pentium Pro bus ran at 60 and 66MHz. The Pentium III processors extended bus frequency to 133MHz. The Pentium III Xeon with four processors on a shared bus was limited to 100MHz, mostly due to the slot mounting mechanism.

Banias, sold as the Pentium M, was an improved Pentium III design.

Banias, 2003, 130nm, 83mm2, 1M L2

Dothan, 2004?, 90nm, 84mm2, 2M L2