February 10, 2024

Google’s newest AI model features a 1-million token context window: >5x current models!

According to Google, Gemini 1.5 Pro outperforms GPT-4 in large-context and multimodal tasks, particularly excelling at finding specific details in huge datasets. For single "needle-in-a-haystack" searches, the model achieves >99.7% recall. In more complex tasks requiring multiple details, it outperforms GPT-4 but misses ~40% of relevant data, limiting certain applications.

Yegor Denisov-Blanch

🤔 Has Google finally caught up to OpenAI?

Gemini 1.5 Pro has the largest context window of any foundation model. It can handle:
- 1 hour of video 🎥
- 11 hours of audio 🎧
- 30k+ lines of code 💻
- 700k+ words (1,500 pages of text) 📚
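For a rough sense of what those figures imply, here is a back-of-the-envelope sketch. The ratios below are derived from the stated capacities, not official tokenizer numbers:

```python
# Back-of-the-envelope: token ratios implied by the stated 1M-token capacities.
# These are rough, derived figures, not official tokenizer measurements.

CONTEXT_TOKENS = 1_000_000

# Capacities as stated in the announcement
WORDS = 700_000        # ~700k words (~1,500 pages)
CODE_LINES = 30_000    # 30k+ lines of code
AUDIO_HOURS = 11
VIDEO_HOURS = 1

print(f"~{CONTEXT_TOKENS / WORDS:.1f} tokens per word of prose")                   # ~1.4
print(f"~{CONTEXT_TOKENS / CODE_LINES:.0f} tokens per line of code")               # ~33
print(f"~{CONTEXT_TOKENS / (AUDIO_HOURS * 3600):.0f} tokens per second of audio")  # ~25
print(f"~{CONTEXT_TOKENS / (VIDEO_HOURS * 3600):.0f} tokens per second of video")  # ~278
```

In other words, each modality fills the window at a very different rate, which is why the same 1M-token budget reads as "1 hour of video" but "700k+ words."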

❓ How does Gemini 1.5 Pro perform relative to GPT-4?

Google's own benchmarks show its latest model surpassing GPT-4 in both large-context and multimodal tasks.

The model's ability to solve "needle-in-a-haystack" problems, where it finds specific details within vast amounts of data, is particularly strong.

✅ In tasks requiring the identification of a single piece of information from large datasets (single needle-in-a-haystack), the model achieved an exceptional recall rate of >99.7%.

🔄 When facing more challenging tasks that involve locating multiple pieces of information (multiple needle-in-a-haystack), the model outperformed GPT-4, although it failed to retrieve ~40% of relevant information, limiting its practical application.
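For context on how these tests are typically constructed: one or more known facts ("needles") are buried at random positions inside long filler text, and recall is the fraction of needles the model surfaces when asked. A minimal sketch of that setup, with a hypothetical `ask_model()` standing in for whichever API you use:

```python
import random

def build_haystack(filler_sentences: list[str], needles: list[str], total_sentences: int) -> str:
    """Bury known 'needle' facts at random positions inside filler text."""
    haystack = [random.choice(filler_sentences) for _ in range(total_sentences)]
    for needle in needles:
        haystack.insert(random.randrange(len(haystack) + 1), needle)
    return " ".join(haystack)

def needle_recall(answer: str, needles: list[str]) -> float:
    """Fraction of planted needles that show up in the model's answer."""
    found = sum(1 for needle in needles if needle.lower() in answer.lower())
    return found / len(needles)

# Example usage -- ask_model() is a placeholder for your LLM API call.
needles = [
    "The secret launch code is 7-alpha-9.",
    "The backup server is located in Reykjavik.",
]
filler = [
    "The weather was unremarkable that day.",
    "Nothing of note happened in the meeting.",
]
context = build_haystack(filler, needles, total_sentences=5_000)
prompt = context + "\n\nList every secret fact mentioned in the text above."
# answer = ask_model(prompt)
# print(needle_recall(answer, needles))
```

With a single needle, each trial is effectively pass/fail; it is the multi-needle variant that exposes the ~40% miss rate mentioned above.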

🧐 In terms of "Core Capability" (i.e., performance in tasks not requiring extensive context), Google has only compared its model to the earlier Gemini 1.0. This suggests that Gemini 1.5 Pro may not consistently exceed GPT-4's performance yet.




Google seems to be moving in the right direction by carving out niches where its models are superior.

It will be interesting to see how Gemini 1.5 Pro stacks up against GPT-4 in user-generated benchmarks.
