
We’ve been presenting a lot of new AMD products lately. APUs are slowly making their way into the mainstream, and the very philosophy of using the PC has changed along with user needs. Building a brute-force CPU isn’t all that hard, but a chip that ignores the balance of price, performance and power/heat dissipation is no longer a viable solution. AMD’s K10 CPU architecture and its revisions have been around for a long time, while Intel has had a more advanced or, to be precise, better optimised product. All these factors left AMD facing an arduous task: changing its microarchitecture completely and designing a new one from scratch. Expectations were high! The Bulldozer architecture is finally before us, but before presenting it, let’s recap what constitutes a modern CPU, and processors in general.
The Bulldozer architecture is based on a modular principle: each module contains two integer blocks (clusters) and one FPU shared between them. This principle is also called CMT (Cluster-based Multi-Threading), as it enables a separate program thread to be executed on each cluster. As for the difference between a CPU core and a module, a conventional core is made up of a single integer block and a single FPU (or even a sole integer block with no FPU), whereas a module always comprises two integer blocks and one FPU. In practice this means that a Bulldozer CPU has a shared FPU, along with a few more perks that we’ll delve into later on. Each integer block consists of a couple of arithmetic logic units (ALUs) and a memory read/write unit (Load/Store unit), which practically makes the block a core in its own right. As its name suggests, the integer block works with integers exclusively, while the FPU (Floating Point Unit) handles floating-point numbers, the computer’s representation of real numbers. In Bulldozer, this unit is shared between the two integer blocks of a module.
This arrangement, alternatively called the coprocessor concept, has persisted in one form or another since the K7 architecture was presented, and has been a basic principle of AMD’s CPU design. Experienced CPU fanatics may even remember AMD’s acquisition of NexGen, the maker of the Nx586, whose follow-up design became the basis for the K6 architecture, which first experimented with this principle. The Nx586 was a Pentium-class x86 CPU that was sold without an FP unit. AMD later designed and integrated its own FPU into K6. This FPU was fairly weak compared to competing Pentium II solutions, and it took the arrival of the Athlon CPUs to bring a new, stronger, redesigned FPU that supported pipelining, which is of major importance for the rest of our story.
In short, the coprocessor organisation has remained in AMD’s processors to this day, Bulldozer included, with the main difference being the number of integer blocks.
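The module idea described above can be pictured with a small Python sketch. It is only an illustration of the sharing relationship, not a model of how the hardware schedules work: the class names (`FPU`, `IntegerCluster`), the `make_module` helper and the operation counter are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class FPU:
    """Floating-point unit; counts how many operations it has served."""
    ops_served: int = 0

    def multiply(self, a: float, b: float) -> float:
        self.ops_served += 1
        return a * b


@dataclass
class IntegerCluster:
    """Integer block: does integer work itself, hands FP work to an FPU."""
    name: str
    fpu: FPU  # the FPU this cluster uses (shared within a module)

    def add(self, a: int, b: int) -> int:
        return a + b  # handled locally by the cluster

    def fmul(self, a: float, b: float) -> float:
        return self.fpu.multiply(a, b)  # forwarded to the FPU


def make_module() -> Tuple[IntegerCluster, IntegerCluster]:
    """A module in the article's sense: two integer clusters, one FPU."""
    shared = FPU()
    return IntegerCluster("cluster0", shared), IntegerCluster("cluster1", shared)


c0, c1 = make_module()
int_result = c0.add(2, 3)       # integer work stays inside cluster0
fp_a = c0.fmul(1.5, 2.0)        # both clusters send their FP work ...
fp_b = c1.fmul(0.5, 4.0)        # ... to the very same FPU object
shared_ops = c0.fpu.ops_served  # counts requests from both clusters
```

In this toy model, a conventional core would simply be one `IntegerCluster` with an `FPU` of its own, so the only thing that changes in a module is how many integer blocks point at the same FPU.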
To help you follow the rest of the story about microarchitecture, we’ll explain some basic notions behind the heart of any PC: the central processing unit, or CPU. The fundamental part of any processor is the ALU (Arithmetic Logic Unit), which performs arithmetic and logical operations on numbers in binary code. A processor does not only compute; it also loads data and instructions from memory and writes execution results from its registers back to memory. Registers are the memory space inside the CPU itself, situated closest to the ALUs, and they hold the operands the ALU uses as input. Operands are usually data represented as numbers. From the age of the ZX Spectrum to this day, the CPU core has consisted of ALUs, registers and memory management units (which read and write data from and to memory). Assembled together, these make up a whole, functional processor, able to perform simple mathematical operations and use memory for loading and storing data.
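To make the interplay of registers, ALU and memory access concrete, here is a minimal Python sketch of a toy processor. Everything in it is an assumption made for illustration: the `ToyCPU` class, the four registers, the 8-bit register width and the handful of operations do not correspond to any real instruction set.

```python
class ToyCPU:
    """A toy processor: registers, one ALU, and load/store access to memory."""

    def __init__(self, memory):
        self.memory = memory  # main memory, modelled as a list of integers
        self.regs = [0] * 4   # four registers R0..R3 (arbitrary count)

    def load(self, reg, addr):
        """Copy a value from memory into a register."""
        self.regs[reg] = self.memory[addr]

    def store(self, reg, addr):
        """Copy a value from a register back into memory."""
        self.memory[addr] = self.regs[reg]

    def alu(self, op, dst, a, b):
        """Take two register operands, compute, write the result to a register."""
        x, y = self.regs[a], self.regs[b]
        if op == "add":
            result = x + y
        elif op == "sub":
            result = x - y
        elif op == "and":
            result = x & y
        elif op == "or":
            result = x | y
        else:
            raise ValueError(f"unknown ALU operation: {op}")
        self.regs[dst] = result & 0xFF  # keep results within 8 bits (assumed width)


memory = [7, 5, 0]
cpu = ToyCPU(memory)
cpu.load(0, 0)            # R0 <- memory[0]
cpu.load(1, 1)            # R1 <- memory[1]
cpu.alu("add", 2, 0, 1)   # R2 <- R0 + R1
cpu.store(2, 2)           # memory[2] <- R2
```

The ALU here only ever sees register contents; data has to be loaded from memory first and stored back afterwards, which is the division of labour the paragraph above describes.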












