
Implement a helper app to crowdsource AICore testing #6138

Open
nicolas-raoul opened this issue Jan 17, 2025 · 2 comments
Assignees
Labels
gsoc Google Summer of Code

Comments

@nicolas-raoul
Member

Context

To implement AICore-based features (example), we need to test prompts on AICore.

Problem

AICore cannot be tested on the emulator, and unfortunately no way to crowdsource this testing seems to exist (context).

Currently only Paul has a smartphone advanced enough to run AICore. Paul has agreed to test for us, but we must make it as easy as possible for him.

Solution

Develop an app:

  • where Paul (and potentially other people who own these devices) can effortlessly run our prompts and send us the responses.
  • Ideally it would be easy for us to manage the prompts, and to see the responses categorized by model/OS version/AICore version.
  • Ideally the device owners would not need to do anything, just install it. Maybe run it once in a while if auto-running presents challenges.
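As a starting point for discussion, the upload payload such an app might send could look like the sketch below. This is only an illustration; the field names, the `PromptReport` class, and the use of `System.getProperty` stand in for real Android `Build.*` metadata and are all assumptions, not a settled API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the result payload the helper app could upload after
// running a prompt, so responses can be grouped by model/OS/AICore version.
public class PromptReport {
    public static Map<String, String> buildReport(String promptId, String response) {
        Map<String, String> report = new LinkedHashMap<>();
        report.put("promptId", promptId);
        report.put("response", response);
        // On Android these would come from Build.MODEL, Build.VERSION.RELEASE,
        // and the installed AICore package version; placeholders here.
        report.put("deviceModel", System.getProperty("device.model", "unknown"));
        report.put("osVersion", System.getProperty("os.version", "unknown"));
        report.put("aicoreVersion", "unknown");
        return report;
    }
}
```

Grouping the backend's view by the three metadata fields would directly give the "responses categorized by model/OS version/AICore version" requirement above.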

The app would live in its own repository under the https://github.com/commons-app organization.

@nicolas-raoul nicolas-raoul added the gsoc Google Summer of Code label Jan 17, 2025
@nicolas-raoul nicolas-raoul self-assigned this Jan 17, 2025
@parneet-guraya
Contributor

Very nice idea, I like it a lot. I might also suggest we could use Meta's Llama models locally. I'm showing a demo of the app running on my device. A plus point is that we load the model through our app, so it will work on any phone (as long as it can handle it).

Device: OnePlus 9RT 5G

This runs completely offline; the model is ~1 GB and is downloaded on first install, so there is no impact on APK size.

Record_2025-01-18-04-31-02.mp4

So we could have both options: Gemini on devices that support it, with a fallback to the open-source Llama. Also, judging from the numbers on their site, their lightweight models are comparable to Gemini.

But these models are chat-based, so some features would require responses in a specific format, and parsing would be an issue unless someone trains the model for it. Other things like summaries or explanations would work out of the box.


> To implement AICore-based features (#5422), we need to test prompts on AICore.

Now take this feature, for instance:

> First step: Parse the caption into expressions could be asked to an LLM if one is available locally on the device. That means only on Pixel 8 Pro (or above) and Samsung Galaxy S24 (or above) for now, but hopefully more makers will follow.

It could work, but I am not sure how we can make the model return responses in a particular format, like JSON or each entity in a list object. We can write a prompt that sets the boundaries with example input and output, but the model does break the rules sometimes.
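One common workaround for chat models that wrap JSON in extra prose is to post-process the response leniently. A minimal sketch (the `JsonExtractor` class name and the first-brace/last-brace heuristic are my assumptions, not anything the app currently does):

```java
// Lenient post-processing sketch: chat models often surround a JSON object
// with conversational text, so extract the outermost {...} span and let the
// caller validate it with a real JSON parser afterwards.
public class JsonExtractor {
    public static String extractJson(String response) {
        int start = response.indexOf('{');
        int end = response.lastIndexOf('}');
        if (start == -1 || end == -1 || end < start) {
            return null; // no JSON object found; caller falls back to plain text
        }
        return response.substring(start, end + 1);
    }
}
```

This does not fix a model that ignores the format instructions entirely, but it tolerates the common "Here is the JSON: { ... } Hope this helps!" failure mode.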

For example:

Prompt (I am not a great prompt engineer):
'Papilio machaon on Asteraceae flower in Croatia' from the above text extract each word that has a meaning separately as if this query goes through a recommendation engine and provide each entity separately and in json format.

Output: (completely offline)

Record_2025-01-18-04-56-00.mp4

> On devices where no such technology is available, the app can fall back to some more traditional tokenization, maybe even split on each space character, or just skip entirely.

Right, I tried something similar a while back in a project. Basically, I took the caption currently showing in a video and tokenized each word, so the user could search the internet without going back and forth. This required splitting the words, but traditional splitting on whitespace didn't always work. Then I found a solution using a very lightweight NLP model. It worked great for my use case, although it didn't give me meaningful entities, just NLP-based tokenization. Also, there are different binaries for different languages, but each file is < 500 KB.
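For reference, the "traditional" whitespace-splitting fallback mentioned above could look roughly like this sketch (class name and punctuation-trimming regex are illustrative; a real implementation might use `java.text.BreakIterator` or an NLP tokenizer instead):

```java
import java.util.Arrays;
import java.util.List;

// Naive fallback tokenizer: split on whitespace and trim leading/trailing
// punctuation. Good enough for simple Latin-script captions, but it will not
// produce meaningful multi-word entities and fails for languages that do not
// separate words with spaces.
public class FallbackTokenizer {
    public static List<String> tokenize(String caption) {
        return Arrays.stream(caption.trim().split("\\s+"))
                .map(w -> w.replaceAll("^\\p{Punct}+|\\p{Punct}+$", ""))
                .filter(w -> !w.isEmpty())
                .toList();
    }
}
```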

All models: https://opennlp.apache.org/models.html



Lastly, if I understood this correctly: in order to implement and maintain on-device AI features, we need some way to test prompts so that those features work as intended. But there is a device limitation, so we want a platform where one could upload a prompt, and devices that can process it would pick it up (automatically or manually) and then post the result. It could be like a social media app, with prompts as posts and answers as comments, along with device details.
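The "prompts as posts, answers as comments" idea could be modeled with two small record types, something like the hypothetical sketch below (all names are illustrative, not a proposed schema):

```java
// Hypothetical data model for the crowdsourced testing platform:
// a Prompt is a "post", and each Answer is a "comment" carrying the
// device details needed to categorize responses.
public class CrowdTestModel {
    public record Prompt(String id, String text) {}

    public record Answer(String promptId, String response,
                         String deviceModel, String osVersion, String aicoreVersion) {}

    public static Answer answer(Prompt p, String response,
                                String deviceModel, String osVersion, String aicoreVersion) {
        return new Answer(p.id(), response, deviceModel, osVersion, aicoreVersion);
    }
}
```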

And this would be a sort of internal tool for developers, right?

Also, are we responsible for writing backend (API) for this too?

Thanks :-)

@nicolas-raoul
Member Author

> the model is ~1 GB and is downloaded on first install, so there is no impact on APK size

Llama being more open source is a clear advantage, but we really should avoid the bandwidth and storage cost of downloading a model. The advantage of AICore is that it is already present on devices, so there is no need to download anything.
