Google is making a big-time move in silicon that should scare Nvidia

Posted May 19, 2017

Much like Nvidia has built servers out of multiple V100s, Google has constructed TPU Pods that combine multiple TPUs to reach 11.5 petaflops (11,500 teraflops) of performance. Of course, standard GPUs can be used for all sorts of other things, while Google's TPUs are limited to training and running models written with Google's tools. "Our new large-scale translation model takes a full day to train on 32 of the world's best commercially available GPUs," Dean told a group of reporters in a press briefing this week. "We use TPUs across all our products."
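Those figures imply the rough size of a pod. Here is a back-of-the-envelope check in Python, assuming (our assumption, not Google's statement) that a pod is built purely from the 180-teraflop second-generation devices described below:

```python
# Back-of-the-envelope arithmetic from the figures quoted in this piece.
# Assumption: a TPU Pod is composed entirely of 180-teraflop devices.
POD_TFLOPS = 11500        # 11.5 petaflops per TPU Pod
DEVICE_TFLOPS = 180       # one second-generation Cloud TPU

print(POD_TFLOPS / DEVICE_TFLOPS)   # ~64 devices per pod
print(POD_TFLOPS / 8)               # one-eighth of a pod: ~1,437 teraflops
```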

At the same time, chip suppliers are pouring billions of dollars into staying ahead of the machine-learning demands of cloud companies like Google and Microsoft, both of which have also devised custom chips for their data centers.

The original TPU was designed specifically to work best with Google's TensorFlow, one of many open-source software libraries for machine learning. Over the past five years, GPUs have become a standard for the training stage of deep learning, a technique used for image recognition, speech recognition and other applications.
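For readers who haven't seen it, this is roughly what TensorFlow model code looked like at the time. A minimal, illustrative 1.x-era training loop follows; the toy model and all names are ours, and the TPU-specific compilation plumbing is omitted:

```python
# A minimal TensorFlow (1.x-era, circa 2017) training sketch -- the kind
# of model code Google's tools would target. Illustrative only; this is
# not Google's translation model.
import numpy as np
import tensorflow as tf

# Toy linear model: y = w*x + b
x = tf.placeholder(tf.float32, shape=[None, 1])
y_true = tf.placeholder(tf.float32, shape=[None, 1])

w = tf.Variable(tf.zeros([1, 1]))
b = tf.Variable(tf.zeros([1]))
y_pred = tf.matmul(x, w) + b

loss = tf.reduce_mean(tf.square(y_pred - y_true))            # training objective
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    data_x = np.random.rand(100, 1)
    data_y = 3.0 * data_x + 1.0                              # target: w=3, b=1
    for step in range(200):                                  # training = repeated gradient steps
        sess.run(train_op, feed_dict={x: data_x, y_true: data_y})
    print(sess.run([w, b]))                                  # read back the learned parameters
```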

Apart from the additional computing power, Google says the big difference is that the new TPUs can be used for both training and inference; the first-generation TPU handled only inference, so models had to be trained on other hardware first. Google has opted over the last few years to build some of this hardware itself and to optimize it for its own software. Optimized for AI computations, the new TPUs deliver up to 180 teraflops of floating-point performance, Google says, and they will be available via the Google Compute Engine. To return to Dean's example, one-eighth of a TPU pod can train the same translation model in just six hours.
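Taking the quoted numbers at face value, the implied speedup is easy to work out, assuming (per the arithmetic above, again our estimate) that one-eighth of a pod is roughly 8 devices:

```python
# Speedup implied by the training times quoted in this piece.
gpu_device_hours = 32 * 24    # a full day on 32 top-end GPUs
tpu_device_hours = 8 * 6      # six hours on ~8 TPU devices (assumption)

print(24 / 6)                               # wall-clock speedup: 4x
print(gpu_device_hours / tpu_device_hours)  # device-hour ratio: 16x
```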

Google is making the TensorFlow Research Cloud available to accelerate the pace of machine-learning research, and it plans to share it with institutions like Harvard Medical School.

It all comes down to training neural networks on large amounts of data and turning the result into a working model, and that takes computing power. The TPU is also more specialized: there are certain kinds of workloads, not necessarily machine-learning ones, that run well on GPUs but don't necessarily run on a TPU. Google has the demand to match; many of you reading this have at least one Gmail account, if not several, and for many people Google is the go-to search engine, all services that lean on machine learning at this scale.

The company also announced that it is making 1,000 Cloud TPUs available at no cost to ML researchers via the TensorFlow Research Cloud, to further promote research and innovation in the field.

"Researchers given access to these free computational resources must be willing to openly publish the results of their research and perhaps even open source the code associated with that research", Dean said.