These days a lot is changing at NVIDIA. The manufacturer, once the market leader for discrete PC and workstation graphics cards, faces growing competition from the integrated graphics units of Intel and AMD. As Moore's Law progresses, integrated graphics will soon compete with discrete mid-range desktop products, and NVIDIA needs to be ready for that moment.
NVIDIA is aware of this situation and has decided to invest in developing processors for the tablet market. With Tegra 2 the company delivered a first product that was accepted by manufacturers such as ASUS, which put it into its quite successful Transformer tablet PC. For Tegra 2, NVIDIA licenses ARM cores and builds its own SoCs (System on a Chip) around them. These have mostly been put into tablets rather than smartphones.
NVIDIA has its Tegra 2 processors manufactured at TSMC, using the foundry's 40
nanometer triple gate oxide process (LPG). The dual-core processors, which are
based on ARM's Cortex-A9 design, are optimized for performance, which is why
clock frequencies of up to one gigahertz could be realized. Compared to its
competitors, however, Tegra 2 had to contend with high leakage currents, a
consequence of the LPG process. TSMC also offers an LP process whose leakage
currents are orders of magnitude lower than those of the LPG process. Market
leaders such as Qualcomm, TI and Samsung are using this process at the moment.
The high leakage is also the reason why hardly any smartphones were based on
NVIDIA's Tegra 2: when the phone is locked, the background processes need too
much power to operate, which drains the battery far too quickly.
At this point NVIDIA had several options. One was to stick with the Tegra 2
design and optimize it for lower leakage power. Another, more radical approach
would have been to switch to TSMC's LP manufacturing process, but this would
have set the company back in terms of performance. Instead NVIDIA chose a much
more creative way: a five-core SoC in which the fifth core is called the
companion core. This companion core is made using TSMC's LP process, which
means very low leakage power. It takes over when, for example, a Tegra 3 device
is locked, so all the background processes run on a core that uses far less
power. Furthermore, NVIDIA integrated power gating, which means that the core
logic can be deactivated: cores that are not needed are shut down and draw
virtually no power. In this way NVIDIA elegantly solved the leakage problem, and
as a consequence there could now even be smartphones based on NVIDIA's Tegra 3
SoC.
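The interplay of companion core and power gating described above can be sketched as a toy policy. This is only an illustration and not NVIDIA's actual switching logic, which the article does not describe. The names `CoreState`, `select_cores` and `gated_cores` are hypothetical, and the sketch assumes that the companion core and the performance cores are never powered at the same time.

```python
from dataclasses import dataclass

NUM_PERFORMANCE_CORES = 4  # four fast cores plus one companion core


@dataclass(frozen=True)
class CoreState:
    """Which cores are powered; every other core counts as power-gated."""
    companion_on: bool
    performance_cores_on: int


def select_cores(device_locked: bool, runnable_threads: int) -> CoreState:
    """Toy policy (an assumption, not NVIDIA's algorithm).

    Background work on a locked or idle device runs on the low-leakage
    companion core. Otherwise only as many performance cores are powered
    as there are runnable threads, up to four.
    """
    if device_locked or runnable_threads <= 0:
        return CoreState(companion_on=True, performance_cores_on=0)
    needed = min(runnable_threads, NUM_PERFORMANCE_CORES)
    return CoreState(companion_on=False, performance_cores_on=needed)


def gated_cores(state: CoreState) -> int:
    """Number of the five cores that are currently power-gated."""
    powered = state.performance_cores_on + (1 if state.companion_on else 0)
    return NUM_PERFORMANCE_CORES + 1 - powered
```

In this model a locked device keeps all four performance cores gated, while an unlocked device running two threads would power two performance cores and gate the remaining three cores, including the companion core.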
A look at other parts of the SoC reveals that Tegra 3 has evolved in other areas
as well. The new SoC, for example, features NEON support, which is realized via
an ARM MPE (Media Processing Engine). To keep the Tegra 2 die as compact as
possible, NVIDIA had decided not to support NEON in Tegra 2. NEON is a SIMD
instruction set extension that can accelerate 2D as well as 3D graphics
workloads. Furthermore, it can also accelerate sound synthesis.
A closer look at the GPU doesn't reveal many differences; here, too, there is
more evolution than revolution. Whereas Tegra 2 had four pixel and four vertex
shaders, Tegra 3 now has twice as many pixel shader units but still the same
number of vertex processors. The core count thus went up from eight to twelve.
In the cache hierarchy, NVIDIA enlarged neither the L1 nor the L2 cache. Every
core gets a 32 KB/32 KB L1 cache (instruction/data) and all four cores share a
1 megabyte L2 cache. Doubling the number of cores compared to Tegra 2 without
increasing the L2 cache size suggests that NVIDIA doesn't believe there will be
many applications making use of all four cores. Nevertheless, the L2 cache is
now two cycles faster on Tegra 3. For the L1 cache there is no such
improvement.
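To put the unchanged L2 size into numbers, here is a small back-of-the-envelope calculation. It treats the shared cache as if it were split evenly between the active cores, which is a simplification for illustration only, since a shared L2 is not statically partitioned. The function name `l2_share_per_core_kb` is hypothetical.

```python
L2_CACHE_KB = 1024  # 1 megabyte shared L2, the same size on Tegra 2 and Tegra 3


def l2_share_per_core_kb(active_cores: int) -> float:
    """Average L2 capacity per core if all active cores compete equally."""
    if active_cores < 1:
        raise ValueError("at least one core must be active")
    return L2_CACHE_KB / active_cores


tegra2_share = l2_share_per_core_kb(2)  # dual core: 512.0 KB per core
tegra3_share = l2_share_per_core_kb(4)  # quad core: 256.0 KB per core
```

Under this simplification, each core's average share of the L2 cache halves when all four cores are busy, which is only a drawback if applications actually load all four cores.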
The specifications also include the clock frequencies. When only one
performance core is in use, it tops out at 1.4 gigahertz; with Tegra 2 the
maximum clock speed was 1.0 gigahertz. When all four performance cores are
active, the maximum frequency is 1.3 gigahertz. Furthermore, the power gating
feature allows every single core to be deactivated individually. Tegra 3
therefore only draws more power than Tegra 2 when all four cores are under
heavy load; in all other scenarios it is more efficient than its predecessor.
Finally, there is the companion core, which clocks at a maximum of 500 MHz. As
already mentioned, it is manufactured using TSMC's LP process and is therefore
optimized for low power consumption.
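The clock ceilings listed above can be summarized in a small lookup function. This is a minimal sketch with a hypothetical name, `max_clock_mhz`. The text only gives figures for one and for four active performance cores, so the value used here for two or three active cores (1.3 gigahertz) is an assumption.

```python
def max_clock_mhz(active_performance_cores: int, companion_only: bool = False) -> int:
    """Maximum clock in MHz for a given Tegra 3 core configuration."""
    if companion_only:
        return 500  # the companion core tops out at 500 MHz
    if not 1 <= active_performance_cores <= 4:
        raise ValueError("Tegra 3 has between one and four active performance cores")
    if active_performance_cores == 1:
        return 1400  # a single active performance core
    # Four cores: 1300 MHz according to the text.
    # Two or three cores: assumed to share the same ceiling.
    return 1300
```

For comparison, the dual-core Tegra 2 was limited to 1000 MHz regardless of how many of its cores were active.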
NVIDIA's Tegra 3 is a clever and creative combination of the advantages of
TSMC's 40LP and 40LPG manufacturing processes. 40LP, which is used for the
companion core, offers very good energy efficiency. The 40LPG process, on the
other hand, is used for the four other cores, which therefore offer a lot of
performance. In the end Tegra 3 becomes a highly competitive product, being
both energy efficient and powerful.
It was a strategically good decision by NVIDIA to invest in the development of
SoCs. Especially given the success of the Apple iPad and the generally booming
market for tablet PCs, it makes a lot of sense to have a competitive product in
this market. Furthermore, two of NVIDIA's competitors, namely AMD and Intel,
don't even have an SoC ready. Intel's SoC version of Atom turned out to be a
flop, as it wasn't competitive from an energy efficiency point of view. With AMD
the situation is even worse: they haven't even touched the SoC market to date.
At least their new CEO Rory Read is now pushing in this direction, but it will
take at least two to three years until AMD can offer a solid product. The big
players in SoCs at the moment are Qualcomm, Texas Instruments and Samsung.
These days NVIDIA has a comfortable advantage over Qualcomm and Texas
Instruments, because their first quad-core SoCs will make it to market in three
to six months at the earliest. Nevertheless, these two SoCs are expected to be
manufactured at 28 nanometers. At least NVIDIA has a comfortable time window
now, and we're sure they're already preparing the transition to a 28 nanometer
manufacturing process.