AMD Bulldozer in a detailed test by SiSoftware
We will of course borrow the tables, and we cannot skip the very first one, which makes clear what was compared: a desktop Zambezi at 2.8 to 3.8 GHz against a desktop Sandy Bridge at 3.0 to 3.6 GHz.
Specs / CPU | AMD "Zambezi" | Intel "Sandy Bridge"
Speed / Turbo | 2800MHz (14x200) / 3800MHz (19x200) | 3000MHz (30x100) / 3600MHz (36x100)
CU / Cores / Threads | 4CU / 8C / 8T | 4C / 8T
Caches L1/L2/L3 | 8x16kB / 4x2MB / 8MB | 4x32kB / 4x256kB / 6MB
Power (TDP) | 95W | 95W
Cost (USD) | |
Memory | 2x DDR3 PC3-10700 | 2x DDR3 PC3-10700
Speed/Timing | 1333MHz 9-9-9-24 4-33-10-5 | 1333MHz 9-9-9-25 4-34-10-5
Memory Controller Speed | 2200MHz (11x200) | 3000MHz (30x100)
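The clocks in the table follow from the multiplier and reference clock given in parentheses. A minimal Python sketch reproduces them; the function name `clock_mhz` is our own illustration, not anything from SiSoftware:

```python
def clock_mhz(multiplier: int, reference_mhz: int) -> int:
    """Return a clock in MHz as multiplier times reference clock."""
    return multiplier * reference_mhz


# (multiplier, reference clock in MHz) pairs copied from the spec table.
zambezi_base = clock_mhz(14, 200)
zambezi_turbo = clock_mhz(19, 200)
zambezi_memory_controller = clock_mhz(11, 200)
sandy_bridge_base = clock_mhz(30, 100)
sandy_bridge_turbo = clock_mhz(36, 100)

print("Zambezi:", zambezi_base, "/", zambezi_turbo, "MHz")
print("Sandy Bridge:", sandy_bridge_base, "/", sandy_bridge_turbo, "MHz")
```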
We will pretend not to see the typo in Zambezi's maximum clock; if it read 3.7 GHz instead of 3.8 GHz, this would be the FX-8100, the slowest planned eight-core built on the Bulldozer microarchitecture. The Intel Sandy Bridge as presented (with the listed clocks) does not exist: Intel has no Sandy Bridge part with a base clock of exactly 3 GHz that goes to 3.6 GHz in Turbo. In theory it could be, for example, a Xeon E3-1235 with its base clock lowered from 3.2 to 3.0 GHz and Turbo left at 3.6 GHz. Or it could be some similar equivalent with modified multipliers. We find such modifications silly, however, because the test is then not run against real products. The thought also comes to mind that this is a joke by someone at SiSoft, and that it may all be one big fake.
Nevertheless, have a look at the results in the original tables, which you will find below, together with the accompanying text in its original wording, as a downloadable screenshot.
CPU Processing Benchmarks | AMD "Zambezi" | Intel "Sandy Bridge" | Comments
Native Dhrystone/Whetstone (GOPS) | 54.64 (34% lower) | 85.15 (baseline) | The native CPU benchmark reveals that Intel had done a very good job with "Sandy Bridge" and it is hard for AMD, even with a fresh design, to keep up with its very latest. If it came earlier it could have defeated the older "Lynnfield/Nehalem" (Core gen 1).
Java Dhrystone/Whetstone (GOPS) | 37 (49% lower) | 72.7 (baseline) | Java VM is for sure not Zambezi's strong point, as it scores 50% lower. Considering Sun Microsystems (now part of Oracle) sold AMD Opteron servers you would expect the JVM x64 to be tuned for AMD.
.Net Dhrystone/Whetstone (GOPS) | 17.66 (33% lower) | 26.55 (baseline) | In the .Net environment, AMD catches up a little bit, but still not up to par.
Multi-Media Benchmarks | AMD "Zambezi" | Intel "Sandy Bridge" | Comments
CPU Multi-Media SSE/128-bit (Mpix/s) | 132.3 (15% lower) | 155 (baseline) | AMD CPUs were always good performers in Multi-Media tasks and this is a respectable result. We hope that the gap will not be wider when "Ivy Bridge" launches (Core gen 3).
CPU Multi-Media AVX/256-bit (Mpix/s) | 147.4 (33% lower) | 217.5 (baseline) | Using AVX, "Sandy Bridge" is 50% faster than SSE(x) while Zambezi does not improve much (most likely due to the shared FPU within CUs); the performance gap has thus doubled. At least you don't need to pay to upgrade your software to AVX.
GPU Multi-Media (Mpix/s) | - | 20.13 | A pretty woeful result from Sandy Bridge's so-called GPU, any GPU AMD will include in the APU version should blow this away.
GPCPU OpenCL (Mpix/s) | 46 (17% lower) | 55.6 (baseline) | A decent score coming from the Zambezi chip, however a difference that cannot be ignored remains between the two.
Java Multi-Media (Mpix/s) | 22.86 (10% lower) | 25.32 (baseline) | Java VM is a Multi-Media scenario that's quite favourable to the AMD CPU and with optimisations it could overtake its competition.
.Net Multi-Media (Mpix/s) | 15.73 (22% lower) | 20.15 (baseline) | The .Net environment is not as kind to the AMD CPU, we're seeing twice the gap of Java.
Cryptography Benchmarks | AMD "Zambezi" | Intel "Sandy Bridge" | Comments
Crypto (MB/s) | 925 (16% higher) | 797 (baseline) | In ALU mode (no AES and no AVX) the AMD chip scores its first win by a good margin (16% faster)! Finally the "true core" design shines through!
Crypto AES/AVX (MB/s) | 1277 (44% lower) | 2270 (baseline) | Despite having both AES and AVX support, Zambezi is almost half as fast; looking at the individual results, it seems its AVX hashing performance is the issue (337MB/s vs. 943MB/s) - again most likely due to its shared FPU design.
GPGPU Crypto (MB/s) | - | 417 (baseline) | The integrated GPU in Sandy Bridge just manages 50% of non-accelerated CPU performance and 20% of accelerated AES/AVX performance. Why bother?
GPCPU OpenCL Crypto (MB/s) | 583 (18% higher) | 494 (baseline) | In OpenCL AMD takes the lead by almost 20%, though to be fair it is AMD's OpenCL run-time, they are not going to optimise for the competition.
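The "higher"/"lower" percentages in these tables appear to be the AMD score relative to the Intel baseline. A short Python sketch, assuming that definition (the helper name `relative_diff_percent` is ours), can be used to check the cryptography rows:

```python
def relative_diff_percent(value: float, baseline: float) -> float:
    """Difference of value against baseline in percent (negative = lower)."""
    return (value / baseline - 1.0) * 100.0


# (AMD "Zambezi", Intel "Sandy Bridge") scores in MB/s from the table above.
crypto_rows = {
    "Crypto": (925, 797),
    "Crypto AES/AVX": (1277, 2270),
    "GPCPU OpenCL Crypto": (583, 494),
}

for name, (amd, intel) in crypto_rows.items():
    diff = relative_diff_percent(amd, intel)
    direction = "higher" if diff > 0 else "lower"
    print(f"{name}: {abs(diff):.0f}% {direction}")
```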
Memory Bandwidth Benchmarks | AMD "Zambezi" | Intel "Sandy Bridge" | Comments
Memory Bandwidth (GB/s) | 15.33 (13% lower) | 17.58 (baseline) | While the memory controller in the AMD CPU runs 36% slower, bandwidth is only ~10% lower, thus it looks to be more efficient. If only it could squeeze more bandwidth out of the memory but at least it does better than previous Core CPUs.
Memory Bandwidth AVX (GB/s) | 15.41 (12% lower) | 17.56 (baseline) | With AVX, Zambezi improves a bit though not enough to be noticeable.
Memory Bandwidth AVX, Internal GPU enabled (GB/s) | - | 17.3 (baseline) | Unlike many designs, Sandy Bridge does not lose performance when the GPU is enabled. We shall have to see how the AMD APU behaves.
GPCPU OpenCL Memory Bandwidth (GB/s) | 8.44 (34% lower) | 12.73 (baseline) | Even though we're using AMD's own OpenCL runtime, memory transfers are 35% slower, 3x the gap we saw in native memory bandwidth benchmarks. It seems AMD's OpenCL team has some optimisations to make for Zambezi!
Transcoding Benchmarks | AMD "Zambezi" | Intel "Sandy Bridge" | Comments
CPU Transcode (kB/s) | 657 (22% lower) | 837 (baseline) | Even with its 8 real cores and improved multi-threading, Zambezi still cannot match Sandy Bridge even in software mode.
GPU Transcode (kB/s) | - | 4800 (baseline) | It is not fair to compare the transcoding capabilities found in a modern GPU with those of a CPU, but we cannot help it: Intel's QuickSync is a "killer feature" even for high-end CPUs. You'd need 6 times more cores (24 = 4x6) to match the performance in software! Zambezi would need ~8 times more cores (64 = 8x8), so keep an eye out for that 8-way Opteron system.
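The "6 times" and "~8 times" figures can be reproduced from the table if one assumes, as the comment implicitly does, that software transcoding scales linearly with core count and that the factor is rounded up to a whole multiple. A hedged Python sketch under those assumptions (the names are ours):

```python
import math

GPU_TRANSCODE_KBPS = 4800  # Sandy Bridge QuickSync result from the table


def cores_to_match(cpu_kbps: float, cores: int,
                   target_kbps: float = GPU_TRANSCODE_KBPS) -> tuple[int, int]:
    """Return (whole-number factor, total cores) needed to reach target_kbps.

    Assumes perfectly linear scaling with core count, which real
    transcoders are unlikely to achieve.
    """
    factor = math.ceil(target_kbps / cpu_kbps)
    return factor, factor * cores


# (CPU transcode in kB/s, physical cores) from the tables above.
sandy_bridge = cores_to_match(837, 4)
zambezi = cores_to_match(657, 8)

print("Sandy Bridge: factor", sandy_bridge[0], "->", sandy_bridge[1], "cores")
print("Zambezi: factor", zambezi[0], "->", zambezi[1], "cores")
```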
Cache and Memory Benchmarks | AMD "Zambezi" | Intel "Sandy Bridge" | Comments
Cache & Memory SSE-128 (GB/s) | 66.45 (25% lower) | 88.7 (baseline) | The new approach with the shared L2 cache in the Zambezi may not be the best given the bandwidth measured.
Cache & Memory AVX-256 (GB/s) | 69.7 (25% lower) | 93.16 (baseline) | We see the same difference in bandwidth when the AVX instructions are used; both systems improve marginally by using AVX.
Cache & Memory int. GPU AVX-256 (GB/s) | - | 88.63 (baseline) | It remains to be seen how much the APU version of Zambezi is affected when the internal GPU is enabled; Sandy Bridge is not affected much - unlike older systems with shared graphics.
Latency Benchmarks | AMD "Zambezi" | Intel "Sandy Bridge" | Comments
Linear (ns) | 12.3 (73% higher) | 7.1 (baseline) | Zambezi's result is not satisfactory in latency terms: another line in the list of improvements that need to be made in future revisions, or perhaps a BIOS optimisation is enough.
Random (ns) | 98.4 (34% higher) | 73.5 (baseline) | The random latency test fares better for AMD but it is still too high considering we use the same memory speed/timings. The memory controller needs some optimisations to be competitive.
Power Efficiency (this measures the efficiency of power design, or TDP) | AMD "Zambezi" | Intel "Sandy Bridge" | Comments
Standard Performance vs. Power | 54.64GOPS, 95W, 59.52MOPS/W (34% lower) | 85.15GOPS, 95W, 89.63MOPS/W | The performance gap in the native processing benchmark reflects here as both chips have the same TDP, with Zambezi fielding 4 CUs with 8 cores against Sandy Bridge's 4 cores and 8 threads. At least Zambezi is more efficient than previous Core CPUs ("Nehalem"/"Westmere" - these are ~40-50% less efficient than "Sandy Bridge" while Zambezi is "only" 34% less efficient).
Multi-Media Performance vs. Power | 147Mpix/s, 95W, 1.55Mpix/W (32% lower) | 217Mpix/s, 95W, 2.28Mpix/W | Similar difference in AVX mode, but in SSE(x)/128-bit mode it's "only" 15% less efficient; so for current and older software the gap is not significant. It also beats previous Core CPUs as they don't support AVX anyway.
Cryptographic Performance vs. Power | 1277MB/s, 95W, 13.4MB/W (44% lower) | 2270MB/s, 95W, 23.9MB/W | The highest difference yet, even though Zambezi has 8 real cores not 4 like Sandy Bridge (and AES & AVX support).
Speed Efficiency (how performance scales with speed and how they perform at the same speed) | AMD "Zambezi" | Intel "Sandy Bridge" | Comments
Standard Performance vs. Speed | 54.64GOPS, 2800MHz, 2.02MOPS/MHz (29% lower) | 85.15GOPS, 3000MHz, 2.84MOPS/MHz | Similar to the power efficiency, Zambezi is 30% less efficient clock for clock; this means it is less efficient than even previous Core CPUs (which are only ~10-15% less efficient than Sandy Bridge).
Multi-Media Performance vs. Speed | 147Mpix/s, 2800MHz, 53kpix/MHz (26% lower) | 217Mpix/s, 3000MHz, 72kpix/MHz | Similar difference to the power efficiencies, Zambezi is 26% less efficient in AVX mode clock for clock.
Cryptographic Performance vs. Speed | 1277MB/s, 2800MHz, 0.45MB/MHz (40% lower) | 2270MB/s, 3000MHz, 0.75MB/MHz | The highest difference of all tests, 40% less is pretty significant clock for clock.
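Both efficiency tables divide a raw score by a second quantity, TDP in watts or base clock in MHz, and then compare the quotients. The following Python sketch illustrates this for the Crypto AES/AVX row; it is our own reconstruction of the arithmetic, and the helper names are hypothetical:

```python
def normalised(score: float, unit: float) -> float:
    """Score per unit, e.g. MB/s per watt or MB/s per MHz."""
    return score / unit


def percent_lower(value: float, baseline: float) -> float:
    """How many percent value lies below baseline."""
    return (1.0 - value / baseline) * 100.0


# Crypto AES/AVX scores (MB/s), TDP (W) and base clocks (MHz) from the tables.
amd_score, intel_score = 1277, 2270
amd_tdp, intel_tdp = 95, 95
amd_clock, intel_clock = 2800, 3000

per_watt_gap = percent_lower(normalised(amd_score, amd_tdp),
                             normalised(intel_score, intel_tdp))
per_mhz_gap = percent_lower(normalised(amd_score, amd_clock),
                            normalised(intel_score, intel_clock))

print(f"Per watt: {per_watt_gap:.0f}% lower")
print(f"Per MHz:  {per_mhz_gap:.0f}% lower")
```

Because both chips have the same TDP, the per-watt gap is identical to the gap in the raw scores; the per-MHz gap is somewhat smaller only because Zambezi's base clock is lower.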
Conclusion
The conclusion of the people who perpetrated this test is that, judging by the results, AMD is a whole generation behind Intel. It is better than K8 once was and in some tests even better than the competition (Cryptography, Multi-Media), but 256-bit performance (AVX/FMA), for example, is simply slow.
We will only add that these are synthetic tests and real-world use may look different (better, but also worse). We would also rather have seen a comparison of the fastest processor that will (hopefully) be launched soon (the FX-8150 at 3.6 to 4.2 GHz) with Intel's fastest mainstream part, the Core i7-2600K at 3.4 to 3.8 GHz.
Source: Google Cache via EXPreview (the original at SiSoftware is unavailable)