Gemini API File Search is now multimodal

FrequentLurker 1 day ago

This might be great and all but I am still miffed at how simple search on AI Studio is. You can only search the titles of your conversations and nothing inside them. On top of that they messed with the scrolling so Ctrl+F doesn't work reliably.

greesil 1 day ago

Too bad they can't just easily vibe code new features.
- bloqs 1 day ago
  
  Yeah, what happened to "no more SWE"?
- telotortium 18 hours ago
  
  Ironically, they probably could if they used Codex or Claude Code. Those harnesses and models are good enough to do that these days (since late last year, and getting better since then). However, it seems that neither DeepMind nor anyone else at Google has access to either of these.
stingraycharles 1 day ago

Yeah, it’s surprising: Claude Desktop has had project files for ages, which are chunked/indexed and automatically injected into your context based on the topic.
You’d think this would be fairly obvious for Google to do, but it’s probably an organizational problem rather than a technical one.
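For readers unfamiliar with the "chunked/indexed and injected into your context" pattern mentioned above, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: it uses word-count vectors and cosine similarity instead of learned embeddings, and every name in it (`chunk`, `build_index`, `retrieve`, `build_prompt`, the sample file names) is hypothetical. It is not how Claude Desktop or the Gemini File Search API is actually implemented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk(text, size=8):
    """Split text into consecutive chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(files, size=8):
    """Index every chunk of every file as (file name, chunk text, term counts)."""
    return [
        (name, piece, Counter(tokenize(piece)))
        for name, text in files.items()
        for piece in chunk(text, size)
    ]


def retrieve(index, query, k=1):
    """Return the k (file name, chunk text) pairs most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(index, key=lambda entry: cosine(q, entry[2]), reverse=True)
    return [(name, piece) for name, piece, _ in ranked[:k]]


def build_prompt(index, query, k=1):
    """Prepend the retrieved chunks to the user's question."""
    context = "\n".join(
        f"[{name}] {piece}" for name, piece in retrieve(index, query, k)
    )
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical project files, used only to exercise the sketch.
files = {
    "billing.txt": "Spend caps limit how much each API key can bill per month.",
    "search.txt": "File search chunks uploaded documents and indexes them for retrieval.",
}
index = build_index(files)
print(build_prompt(index, "which documents does file search index"))
```

A real system would swap the word-count vectors for embeddings and store the index persistently, but the flow stays the same: split, index, rank against the query, and prepend the best chunks to the prompt.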
pants2 1 day ago

It's incredible how far behind Gemini has gotten, both the product and the model. Even the ChatGPT plugin for Google Sheets blows away the native Gemini integration.
Everyone thought Google was pulling ahead with Gemini 3. For a minute there they had the best language model, image model, AND video model in the world. But it's like they decided to pull over for a nap while OpenAI and Anthropic flew by.
- wilj 1 day ago
  
  I just cancelled my Gemini subscription yesterday. I have a big private fork of OpenCode, and I did it the wrong way to start with, so I couldn't pull from upstream.
  So I put together a plan for refactoring it, step by step, with tests, etc. After literally 8 solid days of fighting with Gemini 3 Pro, I still couldn't pull it off.
  I gave GPT 5.5 a chance with the same prompt, plans, and repo. I'm not sure how long it took, but when I checked in on it a few hours later it was done. All tests passed, everything exactly how I'd asked, and better (it made some improvements).
- comboy 1 day ago
  
  3.1-pro is still very capable, and the API is competitively priced vs. e.g. Anthropic; they just can't seem to figure out RLHF and the harness. It needs a lot of guiding, and by default it tends to be lazy and sticks poorly to instructions.
  It feels like many Google products, really: they are capable of amazing things, it's just that nobody there seems to care. I would guess they are optimizing more for internal use than for their vast userbase.
  
  logicchains 23 hours ago
  
  They optimize for making their SREs' lives easier, over-quantizing models regardless of how negative an effect that has on the user.
- thefounder 1 day ago
  
  I never felt Gemini was better than OpenAI or Anthropic. I think it’s more on par with open source models than with the top 2.
- diegoperini 1 day ago
  
  I have the opposite experience: Gemini (even the Flash models) is the only useful model for my reverse-engineering use case. My hunch is that Google uses its free access to the entire Google search index to train on niche, non-English community websites, frequently and in a "relevant" manner, which in the end gives these models the most up-to-date info for this particular kind of work. Every other model is either 10 years out of date with its answers or simply hallucinates like waaaay crazy.
  
  embedding-shape 22 hours ago
  
  > for my reverse engineering related use case [...] Every other model is just either 10 years outdated with their answers
  I've mostly been doing reverse engineering with Codex, mostly related to games, and not once has the "training data cut-off date" been in the way. The most useful part comes from handing it a binary/directory and letting it prod at it until it finds the answer you're looking for. I don't even have web search enabled, and sometimes it takes 30-40 minutes to find the answer, but I've never seen it fail to find the answer because its training data was a couple of years old.
- riddlemethat 22 hours ago
  
  It’s still the best option for uptime and for document analysis (on a cost basis), and Google is less likely to experience a significant cybersecurity breach than a less established company. They’ll be fine as long as they stay in the game. Even if they never have a Ferrari again, plenty of people buy Toyotas.
- jmathai 21 hours ago
  
  My non technical wife knows both ChatGPT and Anthropic (admittedly, because of me) but doesn’t know Gemini. This is amazing to me.
  Surely she has seen Gemini in Google search but even her use of that is plummeting.
  Google has so much revenue that they’ll be around for a long time. But I feel they are fumbling the opportunity with AI. Even in corporate, where we have Gemini, the conversation is entirely about Claude. No one talks about Gemini.
  
  panarky 20 hours ago
  
  > My non technical wife ...
  Reports of the death of Google Search have been greatly exaggerated.
  If you believe all the reports on HN about everyone's non-technical wives and grandmas, you'd have a hard time explaining the all-time highs in global usage and revenue from Google Search.
  I agree with you that Claude 4.7 Opus is better than Gemini 3.1 Pro, but it's also a lot more expensive.
  For my applications, I can't find better price-performance than Gemini 3.0 Flash. And it hasn't even been upgraded to 3.1 yet.
  I suspect Google's target is price-performance and not just raw performance, which is how they can serve LLM responses at Google Search scale and still set an all-time record for quarterly earnings of any public company ever.
  Frontier model capabilities leapfrog each other every few months, and Google I/O is in ten days, so I expect the leaderboard will change again soon.
  
  jonhohle 20 hours ago
  
  Unfortunately, I think Google is in the process of killing the golden goose. I visit so few unrecognized websites now and primarily rely on “AI mode” to answer my specific question rather than sift through a handful of possibly accurate pages. How long can that go on before those sites just no longer exist and the source of that knowledge, or of new knowledge, evaporates? That model doesn’t seem sustainable long term.
  
  bachmeier 19 hours ago
  
  Honestly, I think the SEO virus killed that golden goose long before the first AI chat bot. If we still had good search taking us to sane websites, ChatGPT might well have never been a thing. I was posting (including on HN) about the vulnerability of Google's search business years before AI chat. It just happens to be the thing that filled the gap when usable search disappeared.
  
  jorvi 20 hours ago
  
  OpenAI and Anthropic have no moat. DeepSeek is a drop-in replacement that is really close in performance for 7.5-20% of the cost. That cost will continue to get pushed down by the Chinese. And bizarrely enough, their models are more secure to use because they're open source with open weights.
  OpenAI and Anthropic are going to get crushed long-term, and their investors are going to take a horrendous haircut.
  On the other hand, Google and Microsoft already have the users (and lock-in). They just need to funnel them into Gemini and CoPilot.
  
  pants2 16 hours ago
  
  DeepSeek R1 affected markets because for a little while people bought this, but it's not true for so many reasons. Sending data to China is out of the question for every American enterprise. OAI and Anthropic have rich product suites and API harnesses that make DS far from a "drop-in replacement." They have better models, generous usage limits, domination of the zeitgeist, integrations with Slack and all the enterprise SaaS platforms, and orders of magnitude more GPU capacity than DS. What you say simply isn't true.
  
  jorvi 12 hours ago
  
  > What you say simply isn't true.
  Everything you said is wrong.
  - DeepSeek is on V4 now, R1 is ancient history
  - The models are open source with open weights, which means you can inspect what the models do and you can choose US or EU infra providers
  - DeepSeek literally has a Claude-compatible endpoint
  Please don't comment and confuse other users on topics you know nothing about. Study. Then speak.
  
  pants2 8 hours ago
  
  - Right, R1 affected markets because the market originally believed your theory, but it doesn't any more, which is why V4 didn't move markets at all.
  - Sure, you can use US infra providers. Together.ai is a good US provider, but then it's 15X more expensive than DeepSeek's Chinese-subsidized pricing. It's really not that attractive at that price point. Anthropic and OpenAI are focused on larger models, but Grok 4.3[1] is smarter, significantly faster, and cheaper than DS4[2], all by a wide margin.
  - DeepSeek has a Claude-compatible messages API, but that's trivial. Anthropic has a massive API platform with things like Sessions, Files, and Agents[3]. None of those are available on DeepSeek.
  1. https://artificialanalysis.ai/models/grok-4-3
  2. https://artificialanalysis.ai/models/deepseek-v4-pro
  3. https://platform.claude.com/docs/en/api/overview
  
  discordance 8 hours ago
  
  I wish DeepSeek were a drop-in replacement, but it's not. It performs amazingly well but it's not as autonomous and needs a lot more nudges compared to Opus4.6/7 or GPT 5.5. It's good enough for a lot of things (text extraction, sentiment analysis, classifying things) but not on the same level for code gen.
- bachmeier 20 hours ago
  
  Maybe they've decided they don't want to play the same game as OpenAI and Anthropic? They're much better positioned for the high volume AI work that's likely to be where the money is made, with calls to APIs doing routine things for all the businesses of the world. They're also the only big US player that has an open model that you can build on. I don't think vibe coding or the most cutting edge capabilities are what will determine profit from AI.
  
  Computer0 20 hours ago
  
  GPT-OSS is still decent, I think, at least when I need a local LLM.
  
  stingraycharles 19 hours ago
  
  > They're much better positioned for the high volume AI work that's likely to be where the money is made, with calls to APIs doing routine things for all the businesses of the world
  How, exactly, are they currently conquering the enterprise world with their models? What do you think Anthropic is doing?
  Their latest proper model is a year old, they have no moat, no enterprise commitment.
  Your comment would make sense if they had actual success in the enterprise market and actual products in that area, but they don’t.
  They had a brief sprint, caught up, and then dropped the ball again.
  Their only current moat is their TPUs, and the fact that
  1. The whole (successful) LLM world is screaming for capacity
  2. They have excess capacity to rent out, just like Grok
  tells you everything.
  
  bachmeier 19 hours ago
  
  > How, exactly, are they currently conquering the enterprise world with their models?
  I didn't say they were conquering the enterprise world. I said they are better positioned for the work that will be profitable in the future. Winning will mean being "good enough" for things like routine interactions with customers at the lowest cost to the business, and having customers fine tune your models using your hardware.
  > What do you think Anthropic is doing?
  Aside from being arrogant jerks that don't care about pissing off their customers, they're positioning themselves as the highest price provider for the highest end work. There will be a market for that, and maybe Anthropic will survive, but Google looks to me like they have a shot at being the profitable AI company.
  
  lukeschlather 18 hours ago
  
  > Their latest proper model is a year old
  What's a "proper model"? Gemini 3.1 Pro was released 3 months ago. Gemini Robotics 1.6 was released a month ago. And Google is vertically integrated: they aren't just selling tokens, they are selling taxi rides with Waymo. AI is a lot more than LLMs, and Google is doing a lot more than LLMs.
  
  macNchz 16 hours ago
  
  If you're building on top of APIs and can do some eval work (aka do not need the most bleeding edge model), the Gemini Flash and Flash Lite models are super capable for the price.
qingcharles 1 day ago

I've come across a few weird search issues like this with Google lately. Entire company built on the best search engine ever created; can't do search properly in their apps.
sega_sai 1 day ago

The search in the Gemini app in the browser is so embarrassingly bad that I get the impression nobody of importance at Google uses it; otherwise they would have fixed it long ago.
varispeed 1 day ago

I am more miffed that you cannot delete conversations.

lousken 1 day ago

Haven't touched the Gemini API since they did not support a $ limit per API key. Is it possible now?

jwithington 23 hours ago

Yes https://ai.google.dev/gemini-api/docs/billing#spend-caps
- lousken 21 hours ago
  
  Finally!
algoth1 23 hours ago

With a 10min delay, via aistudio

ecommerceguy 21 hours ago

My free trial ends this week, and I'm obviously canceling.

thawab 8 hours ago

Tried multiple times to use the API file search, and it’s complex to set up. Ended up going with a different approach.

FirstPoint 1 day ago

It’s a striking irony that the world's leader in search is receiving so much heat for poor search functionality and UX within its own flagship AI products.

WarmWash 23 hours ago

One of Google's core problems is internal silos of talent. The search team has likely never interacted with the Gemini app team or perhaps even the Gemini app.
For all intents and purposes Google Gemini is a totally separate company from Google search.
- kyrra 22 hours ago
  
  Google has ~190k employees. You can't have total collaboration across that many people.
  Teams will cross collaborate, but they have to be for specific projects with specific people.

ninjagoo 16 hours ago

Is anyone tech-savvy going to actually let any tool with this backend run on their personal PCs?

Any app with this behind the scenes is a non-starter for me.

And does anyone think that all those folks ditching Win11 will be going for or recommending any app built on this?

trilogic 1 day ago

Good to have a choice between clouds and local use.

How much would you pay to have this yours forever, running locally, GDPR and HIPAA compliant, without the headache of privacy or subscriptions?

That's what we offer with HugstonOne, and we did it before Google. Multimodal, lightning-fast RAG, terabytes not kilobytes only :)

All you need is a laptop with 32 GB of RAM and HugstonOne; it's not rocket science.