☕️ The Arc Prize Foundation Team's Next Hurdle For AI Before AGI

After nearly five years of standing undefeated, the ARC-AGI benchmark fell in late 2024. However, researchers at the ARC Prize Foundation had something "just" a tad more difficult lying in wait. Other key highlights of the week include:

  • Google expands Gemini Live with real-time video recognition features for its conversational AI assistant

  • Microsoft releases a batch of 11 AI agents for its Security Copilot to help cybersecurity teams

  • Netflix co-founder Reed Hastings gives $50 million to establish an AI ethics initiative at Bowdoin College

Join us at AI Tangle as we untangle this week's happenings in AI!

THE BIG AI STORY

From the creators of ARC-AGI, one of the most difficult AI benchmarks, built from a series of puzzle-like problems, comes ARC-AGI 2, an aptly named step up from its predecessor. Created in 2019 by prominent AI researcher François Chollet and his team, ARC-AGI was a near iron-clad benchmark that stood the test of time for nearly five years before finally succumbing in December 2024 to OpenAI's still-unreleased o3, albeit at a hefty cost. Now, AI models have a new hurdle to overcome: ARC-AGI 2.

What changed from the first benchmark, and how does modern AI fare?

While o3 scored 75.7% on the first iteration of ARC-AGI at its lowest compute setting, the same model managed only a meager 4% on the second. o3 reached that 75.7% largely by bulldozing its way through the puzzles with raw compute, a weakness in the first ARC-AGI that Chollet says ARC-AGI 2 addresses by introducing an "efficiency" metric. Alongside the release of the new benchmark, the ARC Prize Foundation has revived the ARC Prize competition for a 2025 iteration with a grand prize of $700,000.

5 QUICK HITS

Google is expanding its free-flowing conversational assistant Gemini Live with new real-time AI video features that allow the assistant to "see" your screen or camera feed and answer questions on the fly. Powered by Project Astra, which was originally showcased in a demo almost a year ago, these upgrades will soon be available to Gemini Advanced subscribers under the Google One AI Premium plan.

On Monday, Microsoft announced it will soon roll out 11 new AI agents for its Security Copilot, designed to take over dull, repetitive tasks and reduce burnout on security teams. Six were developed internally and the other five by partner companies. The agents offer configurable autonomy and transparency, allowing security professionals to review and override decisions when necessary. As the tech industry struggles to fill vacant cybersecurity positions, Microsoft's agents are a band-aid solution for a larger problem.

In an effort to make his alma mater a leading center for studying the consequences and ethics of AI, Netflix co-founder Reed Hastings has donated $50 million to Bowdoin College to establish a research initiative on "AI and Humanity." The gift, the largest to the college since its founding in 1794, aims to turn Bowdoin into a hub for studying AI's risks, societal impacts, and ethical frameworks. Hastings noted the need to quickly bolster moral and ethical systems as AI advances, suggesting that its impact could be even greater than that of social networking.

According to a report by The Information, OpenAI and Meta are in separate talks with India's Reliance Industries about partnerships that would bring more AI technology to the country and boost their presence there. The talks include ideas like distributing ChatGPT via Reliance Jio, slashing ChatGPT subscription prices from $20 to just "several dollars," and hosting AI models locally in a planned three-gigawatt data center in Jamnagar, Gujarat.

South Korean AI chip startup FuriosaAI has reportedly turned down an $800 million takeover offer from Meta, opting instead to concentrate on further developing and producing its chips. A local report alleges that the negotiations fell through due to disagreements over post-acquisition strategy and organizational structure. Founded in 2017 by June Paik, the startup has developed AI chips such as Warboy and Renegade (RNGD), aimed at challenging industry bigwigs like Nvidia and AMD. It is reportedly looking to raise $48 million in funding.

4 AI TOOLS

Rewritify - Turn robotic, AI-generated text into human-like, relatable content effortlessly. Rewritify mimics authentic writing patterns for a complete overhaul, ensuring your content feels natural and engaging.

Telosis - Telosis is an AI-powered tool that aims to help you keep your focus and productivity at their best every day, delivering accurate metrics on each of your day-to-day tasks.

Bulletpen - Speak naturally, write brilliantly. Bulletpen is an AI-powered app that transforms your spoken thoughts and rambles into polished writing.

Cerebrella - Bring your sticky notes, whiteboard, research, visuals, and writing all under one AI-powered creative workspace umbrella with Cerebrella to foster brainstorming and capture ideas visually.

AI EXTRA READ

Can We Make AI Less Power-Hungry? (7-min read)

Until very recently, demand for power in the US was mostly flat, but that changed suddenly with the rise of AI, which has brought a need for electricity unseen in history. Now, researchers are racing to find new ways to make AI more power-efficient, but how?

Your AI Sherpa, 

Mark R. Hinkle
Publisher, The Artificially Intelligent Enterprise (TheAIE) Network