Nov 12, 2012

“Espresso” to Caffeinate Nintendo's Wii U

Talk about the Wii U has been buzzing in the ramp-up to the new console's release on November 18. Technical details, however, have been few and far between, until now. Information about Espresso, the new IBM chip for the Wii U, is finally spilling forth.

Espresso closes the performance gap that Nintendo’s competitors, Microsoft and Sony, have maintained since the sixth generation of consoles. With Espresso in the Wii U, Nintendo is clearly playing hardball to win back the demographic that other consoles have held for about a decade: hardcore gamers.

Sources have alternately claimed everything from Broadway to Xenon to POWER7 as the basis for the Wii U’s processor. While we’re unlikely to hear directly from IBM or Nintendo (six years later, the Wii's Broadway is still publicly undocumented), the details that we do have put Nintendo's previous options to shame.

                 Gekko        Broadway     Espresso
Architecture     PowerPC 7xx  PowerPC 7xx  POWER7+
Bit-depth        32-bit       32-bit       64-bit
Clock            485 MHz      729 MHz      1.2 GHz
Cores            1            1            3
Threads/core     1            1            4
L1 cache (kB)    32/32        32/32        32/32
L2 cache (kB)    256          256          256
L3 cache (MB)    -            -            4

Table 1: Gekko v. Broadway v. Espresso

The Espresso specs are so close to a low-end, stripped-down POWER7+ that rumors claiming the Wii U would share Watson's chip ring true. IBM would certainly look to its current-generation processor architecture to fulfill Nintendo's needs, too. But don't let “low-end, stripped-down” give you the wrong idea. This thing is a powerhouse that burns everything else in the console market today.

                 Espresso   POWER7+
Bit-depth        64-bit     64-bit
Clock            3.0 GHz    3.0-4.25 GHz
Cores            3          4, 6, or 8
Threads/core     4          4
L1 cache (kB)    32/32      32/32
L2 cache (kB)    256        256
L3 cache (MB)    4          4-32

Table 2: Espresso v. POWER7+

In contrast to the embedded Wii U platform, the proper POWER7+ is intended for highly parallel systems (think 32 chips/256 cores) with scads of cache and system memory, meant for chomping through sales and simulation data.

As Table 2 shows, Espresso sits at the bottom of the POWER7+ range for tertiary cache, and it ships with only three active cores per package to the POWER7+'s minimum of four. But there are good reasons for using a newer microarchitecture with reduced specs.

One reason is that costs go down if IBM only needs to get three out of four cores per package to work. This in turn lowers the cost for Nintendo. The lack of a fourth core takes very little, if anything, away from Wii U performance since most developers, console or otherwise, are still struggling to take full advantage of multi-core chips.
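The yield argument can be made concrete with a little arithmetic. The sketch below assumes each core on a four-core die works independently with the same probability. Both that independence assumption and the 0.80 per-core figure are made up for illustration; they are not IBM data.

```python
from math import comb


def usable_fraction(cores_on_die: int, cores_needed: int, p_core_good: float) -> float:
    """Probability that at least `cores_needed` of `cores_on_die` cores work,
    assuming each core is good independently with probability `p_core_good`."""
    return sum(
        comb(cores_on_die, k)
        * p_core_good**k
        * (1 - p_core_good) ** (cores_on_die - k)
        for k in range(cores_needed, cores_on_die + 1)
    )


P_CORE_GOOD = 0.80  # hypothetical per-core yield, for illustration only

all_four = usable_fraction(4, 4, P_CORE_GOOD)       # every core must work
three_of_four = usable_fraction(4, 3, P_CORE_GOOD)  # one bad core is tolerated

print(f"dies usable when 4 of 4 cores must work: {all_four:.1%}")
print(f"dies usable when 3 of 4 cores must work: {three_of_four:.1%}")
```

With these made-up numbers, roughly 41% of dies would pass if all four cores had to work, against roughly 82% if one bad core is tolerated. The real figures are unknown, but the direction of the effect is the point.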

Second, POWER7+ has some advantages over POWER7. Microarchitectural changes in POWER7+ include dynamic clock-speed scaling and the ability to shut down unneeded cores, which effectively doubles the tertiary cache available to the cores that stay active. This could come in handy on the Wii U if developers want to optimize for fewer-but-faster cores for their first game while getting a grip on multithreading.

And speaking of optimizing performance, Espresso simply whips Xenon's and Cell's asses, hands down. That's even when considering that it's a “stripped-down” version of the POWER7+ or that the PlayStation 3 and Xbox 360 stomped all over the Wii. The proof is in the pudding below.

                  Xenon    Cell      Espresso
Debut             11/05    11/06     11/12
Process (nm)      45¹      45¹       32
Clock (GHz)       3.2      3.2       3.0
Cores             3        1(+6)²    3
Threads/core      2        2, 1²     4
Threads (total)   6        8²        12

Table 3: Xenon v. Cell v. Espresso
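The "Threads (total)" row in Table 3 is just cores multiplied by threads per core, summed over each kind of core. Here is a minimal sketch of that arithmetic, using only the figures from the table. The `chips` layout and the `total_threads` helper are illustrative names, and Cell is modeled as one two-thread general-purpose core plus six single-thread cores available to games.

```python
# (cores, threads per core) groups for each chip, taken from Table 3.
# Cell has two kinds of core, so it gets two groups.
chips = {
    "Xenon": [(3, 2)],
    "Cell": [(1, 2), (6, 1)],
    "Espresso": [(3, 4)],
}


def total_threads(groups):
    """Sum cores * threads-per-core over every group of cores on a chip."""
    return sum(cores * threads for cores, threads in groups)


totals = {name: total_threads(groups) for name, groups in chips.items()}

for name, count in totals.items():
    print(f"{name}: {count} hardware threads")
```

This reproduces the 6, 8, and 12 in the table's last row.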

The specs may look similar on paper regarding number of cores, threads, etc., but there is one important thing that separates the Cell and Xenon from Espresso: a decade.

Remember the Power Mac G5? It used what Apple called the PowerPC G5 and IBM called the PowerPC 970. The 970 was a stripped-down derivative of POWER4, IBM's server architecture of the day, much like Espresso is a bare-bones version of POWER7+.

Think of the Cell and Xenon as first cousins to the PowerPC 970. They're all low-end derivatives of the POWER4. In fact, Sony and Microsoft even bought thousands of Apple Power Mac G5 systems to develop the PlayStation 3 and Xbox 360, respectively, because the Power Mac G5 used what was roughly the same chip.

IBM debuted POWER4 in 2001; it debuted POWER7 in 2010. The difference of three generations of high-end server processor architecture is staggering. Just as a POWER7+ whips a POWER4, Espresso whips Cell and Xenon. Period. Wii U will offer a true eighth-generation console experience, a quantum leap ahead of the seventh generation.


¹These numbers reflect the latest Xenon and Cell production processes; each debuted at 90 nm.

²The Cell works differently than most architectures; there is one "Power Processing Element" (PPE) that is more-or-less a general CPU and eight "Synergistic Processing Elements" (SPEs). In the PlayStation 3, one SPE is devoted to system tasks while another is disabled to improve yields. While the design has its advantages, it's also notoriously difficult to develop for.

3 comments:

  1. I've been reading from other people that the Espresso cores are clocked at 1.8 GHz. Quite surprised your info says 3.0 GHz.

    Also, your info on the Cell BE in the PlayStation is completely wrong.

    It has just one 2-way PPE core and seven SPE cores: six are available to game devs and one is held back for system services, plus an eighth core disabled for fabrication yields.

    Put in perspective, the Opteron/Cell BE based IBM Roadrunner is still no. 22 in the TOP500, whereas IBM Watson is now out of the first hundred computers in the list. If your info on the Espresso is all accurate then it is an excellent chip. But pit hand-rolled Watson code against hand-rolled Cell BE code, and the Cell BE is still much faster for all the important gaming problems (like physics, particle effects, AI).

  2. Is there a source for this?

  3. So sorry for leaving a comment on this post, but as you mentioned in one of your previous posts, the number of Mozilla developers is around 4,000 people. Is there anywhere or any way to find information about Mozilla developers, such as their names, education, current position, etc.?

    Best regards,
