Today's highlights:
🚀 AI Breakthroughs
Apple Intelligence: A New Personal AI System for Enhanced Privacy and Functionality on iOS Devices
• Apple launches Apple Intelligence, a personal intelligence system for iPhone, iPad, and Mac that combines generative models with personal context to help with everyday tasks
• New Writing Tools in iOS 18, iPadOS 18, and macOS Sequoia enable users to rewrite, proofread, and summarize text across all writing apps
• Private Cloud Compute sets a new privacy standard for AI, scaling processing between the device and larger server-based models while keeping user data protected.
Andrej Karpathy Recreates GPT-2, Approaches GPT-3's Model Efficiency in New Tutorial
• Andrej Karpathy recreated the smallest version of GPT-2, with 124 million parameters, in a four-hour video tutorial
• The recreated model comes close to the performance of the GPT-3 model with the same number of parameters, and the resulting code is nearly 90% similar to his nanoGPT project
• Continuing his efforts to democratize AI knowledge, Karpathy recently launched llm.c, which lets users train language models in plain C without relying on PyTorch or CPython.
⚖️ AI Ethics
NSE Warns Investors of Deepfake Videos Featuring CEO Ashishkumar Chauhan
• The National Stock Exchange (NSE) warns investors about deepfake videos showing MD and CEO Ashishkumar Chauhan giving false investment advice
• NSE stresses that official communications and advisories come only from its official website and verified social media accounts
• The exchange is working with online platforms to remove the misleading videos and urges the public to report any suspicious content.
🎓 AI Academia
Natural Plan Benchmark Tests AI in Trip, Meeting, and Calendar Planning Tasks
• NATURAL PLAN, a new benchmark for evaluating AI in tasks like Trip Planning and Calendar Scheduling, challenges state-of-the-art models with its realistic scenarios
• In assessments, top models like GPT-4 and Gemini 1.5 Pro show low solve rates, highlighting a significant gap in AI’s planning capabilities under complex conditions
• Detailed ablation studies on NATURAL PLAN reveal the limitations of current strategies, including self-correction and few-shot generalization, for improving planning performance.
New Comprehensive RAG Benchmark Elevates Large Language Model Capabilities for QA
• The Comprehensive RAG Benchmark (CRAG) introduces 4,409 factual question-answer pairs designed to better represent real-world QA tasks
• Evaluations reveal that the most advanced large language models (LLMs) with straightforward RAG reach only 44% accuracy on CRAG, while state-of-the-art industry solutions answer just 63% of questions without hallucination
• CRAG, which underpins a KDD Cup 2024 challenge, highlights the need for further research, especially on rapidly changing or complex facts.
SELFGOAL Enhances Language Agents' Achievement of High-Level Goals in Complex Environments
• SELFGOAL is a new method designed to boost language agents' ability to autonomously achieve complex goals without extensive human guidance
• By dynamically splitting high-level objectives into a structured tree of smaller, manageable subgoals, SELFGOAL enables more effective agent performance
• Experimental findings reveal that SELFGOAL significantly improves language agent performance in diverse settings, including competitive and cooperative scenarios.
Survey Addresses Gap in Tabular Data Modeling Through Comprehensive Review and Insights
• Recent advances in large language models offer promising applications in tabular-data tasks such as prediction and data synthesis
• A new survey highlights the need for a comprehensive review comparing techniques, metrics, and models in tabular data modeling
• The survey provides a taxonomy of methodologies and datasets, aiming to bridge gaps and suggest future research avenues in this evolving field.
About us: We are dedicated to reducing Generative AI anxiety among tech enthusiasts by providing timely, well-structured, and concise updates on the latest developments in Generative AI through our AI-driven news platform, ABCP - Anybody Can Prompt!