The Prompt API
- haberman - 29345 seconds ago
This API seems perfect for an idea I've had for a while: a de-snarkifier for social media.
Social media can be intellectually stimulating and educational, but it's also easy to get sucked into ideological sniping and flamewars, even if you didn't go looking for it. The emotional and intellectual energy spent flaming strangers on the Internet is a complete waste of human capital.
With an API like this, I assume you could have a browser extension that de-snarkifies content before showing it to you. You could ask the LLM to preserve all factual content from the post, but to de-claw any aggressive or snarky language. If you really wanted to have fun, you could ask it to turn anything written in an aggressive tone into something that sounds absurd or incompetent, so that the more aggressive the post, the sillier it would make the author look. (A rough sketch of what this could look like is at the end of this comment.)
This could have a double benefit. For the reader, it insulates them from the personal attacks of random strangers on the Internet. Don't get me wrong, there is a time and a place for real, charged arguments about important issues that affect us all. But there is little to be gained from having those fights with strangers; on the contrary, I think it poisons the body politic when strangers are screaming at each other.
For the writer, it takes away any incentive to be snarky or rude. If other people filter their content this way, there's no point in trying to be mean to them, and no "race to the bottom" for who can be more nasty.
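A minimal sketch of what such an extension's content script might look like, assuming the Prompt API's `LanguageModel.create()` / `session.prompt()` shape; the `.post-body` selector and the system prompt are made up for illustration:

```js
// Content-script sketch: rewrite each post on the page with the snark stripped out.
async function deSnarkifyPage() {
  // Bail out if the built-in model isn't ready on this device.
  if ((await LanguageModel.availability()) !== 'available') return;

  const session = await LanguageModel.create({
    initialPrompts: [{
      role: 'system',
      content: 'Rewrite the text you are given. Preserve every factual claim, ' +
               'but remove snark, insults, and aggressive tone.',
    }],
  });

  // Hypothetical selector for the site's post bodies.
  for (const post of document.querySelectorAll('.post-body')) {
    post.textContent = await session.prompt(post.textContent);
  }
}

deSnarkifyPage();
```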
- domenicd - 19890 seconds ago
I led the design effort on this API before retiring. Here's my writeup on some of the considerations that went into it: https://domenic.me/builtin-ai-api-design/
- meander_water - 12929 seconds ago
This looks like it uses Gemini Nano under the hood, but the latest Gemma4 E2B and E4B models appear to be much better, so you'd probably be better off deploying quantized versions through an extension for now.
- Gemini Nano-1: 46% MMLU, 1.8B
- Gemini Nano-2: 56% MMLU, 3.25B
- Gemma4 E2B: 60.0% MMLU, 2.3B
- Gemma4 E4B: 69.4% MMLU, 4.5B
Sources:
- https://huggingface.co/google/gemma-4-E2B-it
- https://android-developers.googleblog.com/2024/10/gemini-nan...
- avaer - 32625 seconds ago
It works; I've shipped this as "local inference" / a poor person's ollama for low-end LLM tasks like search. The main win is that it's free and privacy-preserving, and (mostly) transparent to users in that they don't have to do anything, which is great for giving non-technical users local inference without making them do scary native things.
But keep in mind the actual experience for users is not great; the model download is orders of magnitude greater than downloading the browser itself, and something that needs to happen before you get your first token back. That's unfixable until operating systems start reliably shipping their own prebaked models that an API like this could plug into.
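The API does at least expose download state, so a page can show progress instead of stalling silently. A sketch, assuming the `availability()` and `monitor` / `downloadprogress` hooks from the explainer:

```js
// Check whether the built-in model is available, and surface download
// progress to the user instead of a silent wait for the first token.
const status = await LanguageModel.availability();

if (status === 'unavailable') {
  console.log('No built-in model on this device/browser.');
} else {
  const session = await LanguageModel.create({
    monitor(m) {
      m.addEventListener('downloadprogress', (e) => {
        // e.loaded is a fraction between 0 and 1 in the current explainer.
        console.log(`Model download: ${Math.round(e.loaded * 100)}%`);
      });
    },
  });
  console.log(await session.prompt('Say hello in five words.'));
}
```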
- rock_artist - 26135 seconds ago
I think it's a step toward a future with a proper model API, but only a small step. It reminds me of Apple's Foundation Models [1].
While many AI integrations focus on text communication / chat-style interaction, a lot of software benefits from non-text interfaces.
I believe at some point OSes and browsers should provide an API to manage models, so you'd have access to on-device/remote ones through a simplified interface, and the app would be able to query for and get the right model(s). Making something standardized and cross-platform would be fantastic. It also needs to be on mobile devices, so the players that can easily make it happen are mostly Apple and Google. (Meta will follow, or vice versa, I guess.)
Key point: it shouldn't be exclusive to promoted models.
[1] https://developer.apple.com/documentation/foundationmodels
- tom1337 - 4907 seconds ago
The idea of having local LLMs accessible in the browser for privacy reasons is nice, I guess, but when each browser attaches a different model to this API, testing becomes even more of a nightmare than it is now. I wonder if this will drive more users towards Chrome, because most usages of this API might end up tailored to the Gemini Nano model.
- jameslk - 32232 seconds ago
Seems like a good way for a rogue JS script to offload token generation to a bunch of unsuspecting visitors.
It would actually be pretty interesting to see if it's possible to decentralize the compute: break a larger prompt down and send the pieces to a bunch of browsers using a subagent pattern or something like RLM, each working on a smaller part of the prompt.
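A very hand-wavy sketch of the fan-out half of that idea, running inside a single browser (the cross-browser transport is the hard part and is omitted); `session.clone()` is taken from the explainer, so treat the exact API shape as an assumption:

```js
// Map-reduce over a long input: each chunk gets its own cloned session
// (the "subagents"), then the partial answers are combined.
async function mapReduceSummarise(text, chunkSize = 4000) {
  const base = await LanguageModel.create();

  const chunks = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.slice(i, i + chunkSize));
  }

  const partials = await Promise.all(chunks.map(async (chunk) => {
    const worker = await base.clone(); // independent context per chunk
    return worker.prompt(`Summarise this excerpt in two sentences:\n\n${chunk}`);
  }));

  return base.prompt(
    `Combine these partial summaries into one paragraph:\n\n${partials.join('\n')}`);
}
```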
- mudkipdev - 12868 seconds ago
Gemini Nano, unlike Gemma, is not open-weight, right? I would be interested in dumping the model weights, unless someone has done that already.
- benjaminbenben - 14072 seconds ago
We use this for summarising our hack day write-ups: https://remotehack.space/previous-hacks/
It's a tiny script that looks up the RSS feed and uses the content to generate summaries; quite a nice fit with our static site. At some point I'd like to extend it to ask different questions about the content.
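Roughly what such a script could look like; the feed URL and element names are placeholders, and it assumes the page can fetch its own feed:

```js
// Fetch the site's RSS feed and summarise each entry with the built-in model.
async function summariseFeed(feedUrl = '/feed.xml') {
  const xml = new DOMParser().parseFromString(
    await (await fetch(feedUrl)).text(), 'application/xml');

  const session = await LanguageModel.create();
  const summaries = [];
  for (const item of xml.querySelectorAll('item')) {
    const title = item.querySelector('title')?.textContent ?? '';
    const body = item.querySelector('description')?.textContent ?? '';
    summaries.push({
      title,
      summary: await session.prompt(`Summarise in one sentence:\n${title}\n${body}`),
    });
  }
  return summaries;
}
```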
- michaelbuckbee - 1643 seconds ago
FWIW, I did a fairly large comparison of Gemini Nano (the in-browser AI model) vs a comparable free hosted Gemma model (from OpenRouter), and the hosted model absolutely trashed the local model on every aspect: speed, reliability, availability, etc. [1]
I'm not particularly happy about that outcome as I wish we had more locally run AI models for reasons of privacy and efficiency, so this is more just a warning that at present there are some severe tradeoffs.
1 - https://sendcheckit.com/blog/ai-powered-subject-line-alterna...
- oneeyedpigeon - 1056 seconds ago
Every time I see "prompt" nowadays, I'm briefly hopeful that I'm going to read something about $PS1. Then, inevitably, AI disappoints me yet again.
- nl - 32115 seconds ago
The model this uses is useless for anything beyond a two-round chat at most.
If you want to do anything interesting you need transformers.js and a decent model. Qwen 0.9B is about where things start working usefully.
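If you go that route, the transformers.js side is roughly this; the model id is a placeholder for whatever ONNX build you actually pick:

```js
// Run a small instruct model fully client-side with transformers.js.
import { pipeline } from '@huggingface/transformers';

// Placeholder model id: substitute the ONNX build you actually want.
const generator = await pipeline('text-generation', 'onnx-community/Qwen2.5-0.5B-Instruct');

const out = await generator(
  [{ role: 'user', content: 'Explain the Prompt API in one sentence.' }],
  { max_new_tokens: 64 },
);
console.log(out[0].generated_text.at(-1).content);
```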
- me551ah - 11652 seconds ago
I'm just wondering how much more RAM and VRAM Chrome will use after these changes.
- gopalv - 23555 seconds ago
The better part of this is having local-first AI, particularly because it has built-in tool calling and structured output.
I haven't pushed out a full version [1], which uses ducklake-wasm + this to make a completely local SQL answering machine; for now all it does is retype prompts in the browser.
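The structured-output half looks roughly like this; the schema and question are made up, and the `responseConstraint` option name is taken from the explainer, so treat the exact shape as an assumption:

```js
// Ask for JSON conforming to a schema instead of free-form prose.
const session = await LanguageModel.create();

const schema = {
  type: 'object',
  properties: {
    table: { type: 'string' },
    columns: { type: 'array', items: { type: 'string' } },
  },
  required: ['table', 'columns'],
};

const raw = await session.prompt(
  'Which table and columns would answer: "total sales per region last month"?',
  { responseConstraint: schema },
);
console.log(JSON.parse(raw));
```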
- tethys - 19510 seconds ago
Slightly off-topic: refreshing to see these two authors link to their Bluesky and Mastodon profiles. No Twitter/X in sight!
- izietto - 17354 seconds ago
Can you pass it the current page contents for an AI-based ad blocker / cookie manager / etc.?
- skybrian - 32806 seconds ago
Still in origin trial? Looks like they're adding a temperature parameter.
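Presumably along these lines; the `params()` helper and the option names are taken from the explainer, so treat them as an assumption:

```js
// Read the model's default/maximum sampling parameters, then create a
// session with an explicit temperature and topK.
const { defaultTemperature, maxTemperature, defaultTopK } = await LanguageModel.params();

const session = await LanguageModel.create({
  temperature: Math.min(defaultTemperature * 1.5, maxTemperature),
  topK: defaultTopK,
});

console.log(await session.prompt('Name three possible titles for this post.'));
```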
- fg137 - 32854 seconds ago
"Sorry, to use our website, you must have at least 22 GB of free disk space."
- Ronsenshi - 13870 seconds ago
It won't be long before all web content goes through these AI pipelines, where the user might not even see the original webpage.
- gorgoiler - 30477 seconds ago
Imagine a Vendor API that adds a way to link from the page straight into a device purchase workflow. As a trial of the API in Chrome, you can order a new Google Pixel 9b directly from any page with the word Android in it!
Or a LocalNet API that integrates with trusted hardware devices on your local network. As a trial (Chrome beta programme — strictly limited but here’s 3x signup links to share with your friends) you can adjust your Google Next Mini underfloor heating directly from Chrome!
Or a DirectCast API that lets you stream <video> elements to a device of your choice even over a VPN. As a Chrome trial, you can use your Google Cloud account to stream directly from YouTube Premium to any linked Google Chromecast devices you own!