☕️ Anthropic Redefining AI Benchmarks

Tired of old, outdated AI benchmarks that don't even accurately measure what they claim to? So is Anthropic apparently, as it unveils an adventurous program to tackle just that. Other key takeaways of the week include:

  • Nvidia ends up in French courts for alleged anti-competitive practices

  • Google boosts its AI capacity with green energy investments into Taiwan's New Green Power

  • Robinhood brings AI features to its investment app by acquiring AI research platform Pluto and its founder

Join us at AI Tangle as we untangle this week's happenings in AI!

THE BIG AI STORY

Anthropic, the AI startup behind Claude, recently announced a program to revamp the current state of AI benchmarks, aiming to create a more comprehensive and detailed set that better assesses the capabilities of today's advanced AI. These benchmarks would assess an AI model's capabilities and societal impact, focusing on safety and going beyond current benchmarks that merely measure performance on a specific task.

What are Anthropic's plans for the program?

As detailed in a blog post, Anthropic's program is looking for benchmarks that assess both security threats (cyberattacks and manipulation) and positive applications (science, communication, and bias reduction). With a full-time coordinator overseeing the program, Anthropic plans to develop new platforms for benchmark creation and user trials involving "thousands" of users, stating that it might purchase or expand projects it believes have the potential to scale. While this effort is valuable for AI safety, some experts are concerned that Anthropic might prioritize its own safety views and overemphasize catastrophic risks, leaving other issues, such as AI hallucination, on the back burner.

6 QUICK HITS

After being raided by French authorities almost a year ago, Nvidia is set to face them once again, this time over charges from the French antitrust regulator for alleged anti-competitive practices. This marks the first enforcement action against Nvidia, driven by a broader inquiry into cloud computing. The report highlighted concerns about the sector's overreliance on Nvidia's CUDA chip programming software. Nvidia, which has seen increased demand for its chips due to generative AI applications like ChatGPT, declined to comment on the matter for the time being.

Google recently partnered with asset manager BlackRock to develop a 1 gigawatt solar capacity pipeline in Taiwan, investing in New Green Power to boost clean energy. Google aims to use this new capacity to power its data centers and cloud region in Taiwan, supporting its goal of net-zero emissions by 2030. Google also stated that some of the clean energy capacity will be offered to its chip suppliers and manufacturers in the region.

Backed by Sam Altman and Y Combinator, Rain AI has hired Jean-Didier Allegrucci, a former Apple chip executive, to lead the company's hardware engineering and help develop energy-efficient AI chips, according to a blog post. Allegrucci will collaborate with lead architect Amin Firoozshahian, formerly of Meta, to explore a technique known as in-memory compute, which promises to reduce power consumption by processing data directly where it's stored. CEO William Passo believes their novel approach will "help unlock the true potential of today's generative AI models."

Adept, an AI startup founded two years ago, recently licensed its technology to Amazon, which is also gobbling up several of Adept's co-founders and team members to join the e-commerce giant's team. Adept will continue operating under new CEO Zach Brock, focusing on agentic AI solutions, as the deal provides Adept with a sort of lifeline amid acquisition talks with Meta and Microsoft. Though Adept has faced many challenges, namely dysfunction, it still aims to create AI models that perform actions on any software tool using natural language.

Robinhood announced in a recent blog post that it had acquired AI-powered research platform Pluto Capital, Inc., aiming to enhance its AI features for the Robinhood app. Additionally, Pluto's founder, Jacob Sansbury, will join Robinhood under undisclosed terms to help with AI adoption by utilizing Pluto's data analysis tools to process market data. The goal is to help investors identify trends, optimize portfolios, and receive personalized investment strategies.

SK Hynix, the South Korean company that is the world's second-largest memory chip maker, says it will invest 103 trillion won ($74.6 billion) by 2028 to enhance its AI-focused chip business. SK Group, its parent company, plans to secure 80 trillion won by 2026 for AI and semiconductor investments and shareholder returns. The group aims to streamline its 175 subsidiaries and improve competitiveness in AI value chains, targeting a profit turnaround to 40 trillion won ($28.9 billion) before tax by 2026.

4 AI TOOLS

Pixelmost - Create beautiful mockups and app designs in Pixelmost with the help of AI, templates, and pre-designed components.

QueryPal - QueryPal is an AI chatbot that reclaims team hours by automatically answering repeat questions that would otherwise create clutter, drawing on your chat history, docs, and repos.

Repurpose.io - Repurpose.io is an innovative AI-enhanced software platform that helps make content creation and marketing easier than ever, from repurposing existing content to automating content distribution.

Gem - Unify your recruiting tech stack with Gem, a platform powered by AI, CRM, and analytics to provide an all-in-one solution to recruiters' problems.

AI READ & WATCH

Defending Yourself From AI-Powered Scams (4-min read)

In an ever-developing world where AI-generated media is becoming increasingly realistic, learning how to protect yourself from scams involving voice cloning, personalized phishing, and deepfake blackmail is quickly becoming a must-have life skill on the internet.

AI Slop is Ruining The Internet (33-min watch)

Drew Gooden, an American YouTuber and comedian, takes off the audience's rose-tinted glasses about AI for a bit and goes on a tour through various social media to see the aftermath of generative AI on the internet landscape, along with touching on the Dead Internet Theory.