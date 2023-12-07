Expectations were high for the launch of Google Gemini, and after yesterday’s announcement we finally know what we have on our hands: not one, but three multimodal AI models that will compete with ChatGPT.

The first of them, Gemini Pro, is now available through Google Bard, and although the most ambitious is Gemini Ultra, there is special interest in the smallest of the family, Gemini Nano. The reason is important: it opens the door to a new era of “pocket AI”, or “on-device” AI, which thanks to our mobile phones will be available at all times and will not depend on the cloud.

Welcome to the era of “on device” AI

With Gemini Nano, Google set out to offer a much more efficient model designed specifically to run locally, on our devices, without needing to connect to the cloud. That is the main difference from models like ChatGPT or Bard, which we can certainly use from our phones (through a browser) but which run in the cloud on large servers that process the requests and generate the responses.

Why answer WhatsApp when AI can do it?

With Gemini Nano, all of that processing and text generation happens directly on our devices, and that has important benefits. Among them, the data we use does not leave the device and is not shared with third parties, at least as far as we know. We are, therefore, looking at pocket AI models that can run directly on our smartphones even when we are not connected to a data network.

As Google explains on the Android Developers blog, this makes it possible to create high-quality text summaries, intelligent contextual responses (like the WhatsApp example above), and advanced proofreading and grammar correction with Gemini Nano. Developers interested in building applications that take advantage of Gemini Nano can register on the Google platform.

Gemini Nano, and with it the era of pocket AI, has debuted on the Pixel 8 Pro, the company’s flagship. This smartphone will have generative AI features such as the ability to summarize a recorded phone conversation as a list of key points.

A more efficient model with Android AICore as a key component

This is the most efficient of the three models Google has presented, which is to be expected given that it is meant to run not on servers but on our mobile phones. As Google explains in the technical report, there are two versions of Nano. The first is Nano-1, with 1.8 billion parameters (1.8B). The second is Nano-2, with 3.25 billion parameters (3.25B).

In addition, the model is quantized to 4 bits for deployment. Quantization is the process of reducing the precision of the model’s weights and activations, here from 32-bit floating-point values to 4-bit integers.
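
As a rough illustration of what 4-bit quantization does, the Python sketch below maps a handful of floating-point weights to signed 4-bit integers using a single shared scale, then maps them back. The article does not describe the exact scheme used for Gemini Nano, so the symmetric rounding, the function names and the sample weights are all assumptions chosen for illustration.

```python
def quantize_4bit(weights):
    """Map floats to signed 4-bit integer codes in [-8, 7] with one shared scale.

    Generic symmetric quantization, not Google's published method.
    """
    max_abs = max(abs(w) for w in weights)
    # The largest magnitude is mapped to code 7; guard against all-zero input.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights, purely for illustration.
weights = [0.70, -0.42, 0.11, 0.02, -0.70]
codes, scale = quantize_4bit(weights)
restored = dequantize(codes, scale)

print(codes)     # [7, -4, 1, 0, -7]
print(restored)  # close to the originals, but not identical
```

Each weight now needs only 4 bits of storage instead of 32, at the cost of a small rounding error that, for in-range values, is at most half of the scale.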

This quantization process significantly reduces the memory footprint of the model, making it more suitable for deployment on resource-constrained devices such as smartphones or IoT devices. Even so, according to Google, the quantized model achieves performance comparable or even superior to that of the original 32-bit model on which it is based.
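
The memory savings can be estimated with simple arithmetic. The Python sketch below counts only the bytes needed to store the weights for the two parameter counts quoted above. It ignores quantization metadata, activations and runtime overhead, and the helper name is hypothetical.

```python
def weight_size_gb(num_params, bits_per_weight):
    """Approximate storage for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Parameter counts as quoted in the article.
models = {"Nano-1": 1.8e9, "Nano-2": 3.25e9}

for name, params in models.items():
    full = weight_size_gb(params, 32)   # 32-bit floating point
    quant = weight_size_gb(params, 4)   # 4-bit integers
    print(f"{name}: {full:.3f} GB at 32 bits, {quant:.3f} GB at 4 bits")
```

Under these assumptions, Nano-1 drops from about 7.2 GB to about 0.9 GB, and Nano-2 from about 13 GB to roughly 1.6 GB, an eightfold reduction in both cases.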

At the center of this deployment is Android AICore, a new system service that lets us use foundation models such as Gemini Nano directly on our Android phones.

This new Android 14 component is also “private by design”, and among other things it enables fine-tuning through Low-Rank Adaptation (LoRA), a technique that adapts large language models (LLMs), such as Google’s own PaLM 2, to specific tasks, all on “limited” devices like our smartphones.
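
To give an idea of why LoRA suits constrained devices, here is a minimal Python sketch of the general technique: the original weight matrix stays frozen, and only two small matrices whose product forms a low-rank update are trained. This is not the AICore API, which the article does not show. The function names, the layer size and the rank are hypothetical values picked for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, b, a, alpha, rank):
    """Return W + (alpha / rank) * (B @ A); W itself is never modified."""
    delta = matmul(b, a)
    s = alpha / rank
    return [
        [w_ij + s * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out, d_in, rank):
    """Parameters in B (d_out x rank) plus A (rank x d_in)."""
    return rank * (d_out + d_in)


# Tiny worked example: a 2x2 frozen weight with a rank-1 update.
w = [[1, 0], [0, 1]]
b = [[1], [2]]      # shape 2 x 1
a = [[0.5, 0]]      # shape 1 x 2
print(lora_effective_weight(w, b, a, alpha=2, rank=1))  # [[2.0, 0.0], [2.0, 1.0]]

# Hypothetical 2048 x 2048 layer adapted with rank 8.
print(2048 * 2048)                      # 4194304 weights in the full matrix
print(trainable_params(2048, 2048, 8))  # 32768 weights actually trained
```

In this hypothetical layer the adapter trains 128 times fewer values than the full matrix holds, which is what makes this kind of fine-tuning plausible on a phone.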

This is just the beginning

The launch of Gemini Nano is promising, but today its features and practical applications are limited. The reality is that only a small portion of users, those who have a Pixel 8 Pro, will be able to start using it, and they will only be able to do so in a couple of very specific scenarios. Summarizing conversations or replying to messages automatically is interesting, but we certainly want much more from these pocket AIs.

In fact, this deployment does not mean, for now, that we are going to have a “pocket ChatGPT” or a “pocket Google Bard”. The model’s features are not currently intended to replace the Google search engine (they may never be, since that would mean Google shooting itself in the foot) but rather to provide ways to make better use of our device and save time now and then.

Cloud-based generative AI models such as ChatGPT or Bard therefore do not seem to be threatened by this new era of pocket AI. Rather, these are traveling companions that will act as “co-pilots” (as Microsoft likes to say) of that experience, but directly on the phone, as if they were independent, separate applications.

From here on, though, the possibilities seem enormous, and we are only at the beginning of the path, one that may end up being a small revolution in itself.

