☕️ OpenAI's o1 Model Release Takes One Step Closer to AGI
OpenAI enthusiasts rejoiced on Thursday as the company's highly anticipated Strawberry project took form, received an official name, and debuted as o1. Other key highlights of the week include:
France-based AI startup Mistral launches Pixtral 12B, the company's first multimodal model
Google debuts DataGemma, a pair of AI models that aim to mitigate AI hallucinations
Amazon announces plans to invest $10 billion in UK data centers over five years
Join us at AI Tangle as we untangle this week's happenings in AI!
THE BIG AI STORY
Via OpenAI
OpenAI has raised the bar once again with the public release of o1, the company's most advanced and capable family of AI models to date. After months of public teasing, o1 is the official name for the project known in-house as Strawberry. There is no waitlist, though usage limits are tight, so OpenAI enthusiasts can take o1 out for a spin right away.
How was o1 conceived and what are the results?
OpenAI's latest release comes as a pair: o1-preview, a preview version of the full model, and o1-mini, a smaller, more efficient model aimed at code generation. OpenAI has taken a fundamentally different approach with o1: the model is trained with reinforcement learning to "think" before responding to queries, a process OpenAI calls o1's "private chain of thought," which deliberately makes it slower to respond than GPT-4o. Where o1 excels, it really excels, achieving remarkable improvements over GPT-4o on reasoning-heavy tasks, though users still prefer GPT-4o in more basic cases.
However, o1 is not the be-all and end-all.
Despite the impressive results, the public releases of o1-preview and o1-mini are very expensive and fairly barebones feature-wise compared to GPT-4o. The users eligible for the models, ChatGPT Plus and Team subscribers, get a measly 30 messages per week for o1-preview and 50 per week for o1-mini. Developers on API usage tier 5 are limited to 20 requests per minute. As a final footnote, OpenAI added that o1-mini is on the horizon for ChatGPT Free users, too, though it gave no specific date.
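For developers wondering what a 20-requests-per-minute cap means in practice, here is a minimal client-side pacing sketch. It only spaces calls at least 3 seconds apart (60 / 20) and makes no real API calls. The class name `MinIntervalLimiter` and its parameters are hypothetical, made up for this illustration, and OpenAI's own rate limiting may well work differently on the server side.

```python
import time
from typing import Callable, Optional


class MinIntervalLimiter:
    """Hypothetical helper: keeps successive calls at least 60/per_minute seconds apart."""

    def __init__(
        self,
        per_minute: int,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if per_minute <= 0:
            raise ValueError("per_minute must be positive")
        self.interval = 60.0 / per_minute  # 20 per minute -> 3.0 seconds
        self._clock = clock
        self._sleep = sleep
        self._next_allowed: Optional[float] = None  # None until the first call

    def wait(self) -> float:
        """Block until the next call is allowed and return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._clock()  # re-read the clock after sleeping
        self._next_allowed = now + self.interval
        return delay
```

Calling `limiter.wait()` right before each request (with `limiter = MinIntervalLimiter(20)`) is one simple way to stay under the limit. Even spacing trades a little burst speed for predictability.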
6 QUICK HITS
Known for models like its mixture-of-experts Mixtral 8x7B and Mistral Large 2, French AI startup Mistral recently debuted its very first multimodal AI model, Pixtral 12B. The 12-billion-parameter model is built on one of Mistral's existing text models, Nemo 12B, and is just 24 GB in size. Similar to other multimodal AI models like OpenAI's GPT-4o and Google's Gemini, Pixtral 12B can perform tasks like captioning images and counting the number of objects in a photo. Pixtral 12B is available for test drives via a torrent link on GitHub and on the Hugging Face platform under the Apache 2.0 license.
In a recent blog post, Google debuted DataGemma, a pair of open-source, instruction-tuned AI models built to tackle AI's most prevalent problem - hallucinations. Available on Hugging Face for academic use, DataGemma builds on the Gemma family and employs two approaches: Retrieval Interleaved Generation (RIG), which showed significant accuracy improvements, and Retrieval Augmented Generation (RAG), which provided more comprehensive data. The post notes that current plans are to refine these models and integrate them into Gemma and Gemini systems on a broader scale.
Amazon Web Services (AWS), the cloud computing arm of Amazon, recently announced that it aims to invest £8 billion ($10.45 billion) in the UK over five years to build, operate, and maintain data centers. This follows similar moves by AWS earlier this year, when it announced plans for Spain and Germany worth around €15.7 billion and €7.8 billion, respectively. AWS was welcomed with open arms by British finance minister Rachel Reeves, who has been courting foreign investors in time for a summit on the 14th of October.
Adobe recently previewed new generative AI video tools, including a feature that generates video clips from still images. As part of the Firefly video model, the tool will allow users to create and adjust videos using text descriptions and camera controls, with videos limited to five seconds. Adobe's VP of generative AI, Alexandru Costin, noted that these features will enter beta "later this year." The Firefly model, designed with "commercial safety" in mind, will eventually integrate with Creative Cloud and other Adobe applications, though no specific timeline was given.
Former head of Google China, Kai-Fu Lee, recently stated that he believes Chinese AI models are 6 to 9 months behind their US counterparts, but that China's AI applications could outpace US developments by early next year. Lee highlighted that AI training costs have dropped significantly, which enables rapid app development at companies big and small, and predicted that China will lead in AI consumer apps. Major firms like Alibaba and Tencent are making strides in their AI models, while startups like ShengShu Technology are innovating in text-to-video with Sora-like tools.
Reports citing anonymous people familiar with the matter state that Paris-based AI startup Poolside, founded in 2023 by former GitHub CTO Jason Warner and entrepreneur Eiso Kant, is in talks to raise $500 million in funding, led by Bain Capital Ventures. Though the company has not yet released a product, the funding round would put Poolside at a valuation of nearly $3 billion. The startup plans to acquire more Nvidia GPUs to help it create AI systems that write software, aiming to compete with mainstream coding assistants such as Microsoft's GitHub Copilot.
4 AI TOOLS
Indigo - Indigo is an AI sidekick packing a suite of desktop and web applications to enable the future of work with AI. Save prompts and run them in any app.
Hoop - Made for the busy professional, Hoop is an AI task management platform that captures and prioritizes your tasks with a global task list across your teams.
Magnific - Achieve new heights in resolution and detail with Magnific, an AI-based image editor capable of high-res upscaling, enhancing, and style transfers.
Genkin - Genkin helps you keep your finances in order by tracking your expenses using a chat-based system and an in-depth analysis of your spending.
AI READ & WATCH
Why We Fear Diverse Intelligence Like AI (17-min read)
As the line between "real beings" and "artificial beings" grows blurrier, Michael Levin, a biology professor at Tufts University, goes in-depth on how AI and diverse intelligences are slowly erasing historically clear-cut boundaries - and why we fear it.
Taking o1 Out For an In-Depth Spin With Wes Roth (30-min watch)
OpenAI's o1 has taken the web by storm, and Wes Roth, a well-known YouTuber who covers AI news, takes o1 out for an in-depth test drive, going over his results, tests no other model has passed before, o1's "chains of thought," and where AI could go from here.