Dual Core

Around this time, it was clear that single-core performance could no longer be pushed at the traditional 40% per year pace associated with Moore’s Law. There was plenty of transistor budget, as each manufacturing process step doubles the number of transistors at a given die size. For server applications, the best strategy for throughput performance was multiple cores, either on one die or on multiple die in one package (socket).

Perhaps the one and only benefit (at this point in time) of a processor using the shared bus protocol is that it is relatively simple to put two single-core die into one package that fits in a single processor socket. AMD had to do substantial design work to build their single-die dual-core processor.

Paxville MP, Dec 2005

[Die images: Pentium 4 90nm 2M (two die), Opteron DC 90nm]

Xeon 7000 series, two Pentium 4 90nm, 135mm2, 2M L2, 169M transistors per die (2005)
compared with Dual-Core Opteron, 90nm, 199mm2, 2x1M L2, 233M transistors (2005/6)
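As a rough illustration of the comparison above, the caption figures can be turned into transistor density. This is only a sketch using the numbers quoted here; the function name `density` and the dictionary layout are hypothetical choices made for the example, and density alone says little about design quality because cache and logic pack very differently.

```python
# Figures come from the captions above; nothing here is measured.
dies = {
    "Xeon 7000 (one Pentium 4 die, 90nm)": {"transistors_m": 169, "area_mm2": 135},
    "Dual-Core Opteron (90nm)": {"transistors_m": 233, "area_mm2": 199},
}


def density(transistors_m, area_mm2):
    """Return millions of transistors per square millimetre."""
    return transistors_m / area_mm2


for name, d in dies.items():
    print(f"{name}: {density(**d):.2f} M transistors per mm2")

# The Xeon 7000 package holds two such die, so the package totals are doubled.
xeon_package_area_mm2 = 2 * dies["Xeon 7000 (one Pentium 4 die, 90nm)"]["area_mm2"]
xeon_package_transistors_m = 2 * dies["Xeon 7000 (one Pentium 4 die, 90nm)"]["transistors_m"]
print(f"Xeon 7000 package: {xeon_package_area_mm2} mm2, {xeon_package_transistors_m}M transistors")
```

On these figures the two-die Xeon package uses more total silicon (270mm2) than the single-die Opteron (199mm2), although each Intel die is smaller.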

The downside for Intel on the dual-core Pentium 4 architecture was that frequency had to be scaled back significantly to fit within the maximum supportable thermal envelope: 165W for a dual-core 3GHz 90nm Xeon 7000 series part. By contrast, the 90nm single-core Opteron model 856 (DDR) ran at 3GHz in a 93W thermal envelope. The dual-core model 890 (DDR) at 2.8GHz fit in a 95W envelope, and the dual-core model 8220SE (DDR2) at 2.8GHz had a 120W envelope. So AMD was able to accommodate dual-core Opteron on 90nm without giving up much of the design frequency.
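The thermal figures quoted above can be put side by side with a little arithmetic. The sketch below is an assumption-laden simplification: it treats the thermal envelope as if it were split evenly across cores, which ignores shared logic and the fact that an envelope is a design limit rather than measured power. The helper name `watts_per_core` and the tuple layout are hypothetical.

```python
# (name, cores, frequency in GHz, thermal envelope in W), all taken from the text above.
parts = [
    ("Xeon 7000 dual-core", 2, 3.0, 165),
    ("Opteron 856 single-core", 1, 3.0, 93),
    ("Opteron 890 dual-core", 2, 2.8, 95),
    ("Opteron 8220SE dual-core", 2, 2.8, 120),
]


def watts_per_core(cores, watts):
    """Crude split of the thermal envelope across cores (an assumption, not a measurement)."""
    return watts / cores


for name, cores, ghz, watts in parts:
    print(f"{name}: {ghz}GHz, {watts}W envelope, {watts_per_core(cores, watts):.1f}W per core")

# Fraction of the single-core Opteron frequency retained by the dual-core Opteron parts.
opteron_frequency_retained = 2.8 / 3.0
print(f"Dual-core Opteron retains {opteron_frequency_retained:.0%} of the single-core frequency")
```

Read this way, the dual-core Opteron keeps most of the 3GHz single-core frequency inside a similar envelope, which is the point the paragraph makes.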

Intel had to give up substantial frequency on the Pentium 4 architecture to stay within the desired thermal envelope. At 90nm, Intel was limited to 3.0GHz for the dual-core. AMD started at 2.2GHz in 2005, and slowly incremented frequency to 2.8GHz by 2006. Intel's dual-core desktop parts were Smithfield (Q2 2005) and the Pentium D Presler (Q1 2006).

Intel had a single-die, dual-core Pentium 4 with 16M shared L3 cache on the 65nm process in Aug 2006 (Tulsa). This was able to clock at 3.4-3.5GHz. Because of the large cache and Hyper-Threading, it was able to produce a very impressive TPC-C result, but the dual-core Opteron at 2.8GHz was able to achieve better performance in other respects.

[Die images: Pentium 4 65nm 2M (two die)]

Pentium 4, 65nm (Cedar Mill), 81mm2, 2M L2, 188M transistors per die (2006)

[Die image: Tulsa]

Xeon 7100, 65nm (Tulsa), 435mm2, 2x1M L2, 16M L3, 1.328B transistors (2006)

Afterwards, these became the 7000 series. The first, numbered 70xx, were the standard 90nm NetBurst with two single-core die in a single package, each die with 2M L2 cache, as used in desktop and 2-way systems. In 2006, this was followed by the Xeon 7100 series, which had two NetBurst cores, each with 1M dedicated L2, and a shared 16M L3, all on a single die. The very large L3 cache significantly improves high call-volume transaction-type applications. In 2007, the 7300 series comprised two dual-core 65nm Core 2 die in a single package, with 2 x 4M L2 cache. In 2008, the 7400 series moved to a 45nm process, with six cores arranged as three dual-core pairs, each pair sharing 3M L2, plus a shared 16M L3 cache, all on a single die. As with the 7100 series, the very large cache significantly improved high call-volume applications.