Saturday Links: Two Keynotes and Sarcasm Detection for AI
There were two dueling AI keynotes this week, with OpenAI's unveiling of GPT-4o ("oh") and Google's myriad of announcements at Google I/O. Those and other links this week:
- Introducing ChatGPT 4o. OpenAI's launch keynote, which was released on Monday, was an impressive sequence of demos ranging from real-time translation and much faster voice interaction to truly impressive multi-modal interaction. While many of the response capabilities remain close to GPT-4, the new range of snappy interactions is quite game-changing in terms of the types of applications that could be built. The 25-minute keynote video is really worth watching.
- High-profile resignations at OpenAI marred the victory lap. Ilya Sutskever was the first to leave, followed by Jan Leike (who led AI alignment). Jan's post voices concerns that OpenAI is putting launches ahead of safety, and Sam Altman was on the defensive after the post. It's hard to know what's really going on at OpenAI. The functionality advances are certainly very powerful. I think most researchers would still call ChatGPT 4o far short of anything close to AGI, but I guess the key concern here is around how much is being invested in safety versus functionality.
- Google Keynote (Google I/O ’24). Google followed up with a wide-ranging set of current and future updates. (Was ChatGPT's rollout timed to pre-empt them?) Amongst the most impressive demos were new AI-powered search results, set to roll out worldwide by the end of the year, and AI-powered question answering across your Gmail and Google Drive assets. Google really does have a powerful advantage in being able to access much of your email and docs data. The changes to search are unsurprisingly a direct competitor to Perplexity but, perhaps even more predictably, a cause for immediate concern for publishers. Gemini-infused search will produce more "immediate" answers to search queries. Though these will come with links to sources, those links may no longer get many clicks since the answer is already on screen. One can see the concern of content publishers. On the other hand, there's a clear challenge: if Google doesn't do this, others will. Content on the Internet will likely need a new model going forward; the Google SEO-driven era will slowly come to an end.
- Towards Multimodal Sarcasm Detection. This effort is both hilarious and essential. Researchers are building a database of sarcastic interactions in order to train AI to recognize these turns of phrase. One might think of this as one of the last bastions of human intelligence (most Sci-Fi AI/robot characters have a notoriously poor grasp of sarcasm), but it's also fairly essential for human-like interactions. Maybe we're already there: ChatGPT 4o already does meta-sarcasm.
- OpenGlass - Open Source Smart Glasses. I have a pair of Ray-Ban Meta glasses and love them. The hardware is already high quality, and it's amazing to be able to listen to music, take photos and video, and (soon) get simple AI interactions in such a non-invasive way. Open source is relentless, and there's already a project out there to create a sensor cluster that attaches to any pair of glasses. It won't be as slick as the Ray-Ban Metas, but it'll also be less tied to the Facebook ecosystem. Hardware is finally an exciting field of rapid innovation again!
Wishing you a wonderful weekend.