OpenAI FINALLY launches ‘Advanced Voice Mode’ to select ChatGPT users
OpenAI's new Voice Mode for ChatGPT Plus offers real-time conversation and emotion features, while Stanford students launch alphaXiv for enhanced research discussions. GitHub's beta release of AI...
Today's highlights:
🚀 AI Breakthroughs
OpenAI Rolls Out Advanced Voice Mode to Select ChatGPT Plus Users
• OpenAI has launched Advanced Voice Mode (AVM) for ChatGPT, currently available to a select group of ChatGPT Plus users
• The new Voice Mode offers enhancements such as real-time conversation abilities, emotion response features, and interruption capabilities
• Full availability for all ChatGPT Plus subscribers is expected by fall, following comprehensive testing and improvements based on user feedback.
Stanford Students Launch alphaXiv, Facilitating Interactive Discussions on arXiv Research Papers
• Stanford students developed ‘alphaXiv,’ a platform for detailed discussions on arXiv papers, allowing line-by-line commentary and both public and private notes
• Enhanced collaboration is expected as alphaXiv supports two-way communication, enabling readers to connect directly with the authors of papers in the arXiv repository
• Last year, arXiv secured $10 million from the Simons Foundation and NSF to upgrade its infrastructure, which includes migrating over 2 million articles to the cloud.
GitHub Models: Empowering Over 100 Million Developers to Build AI Applications
• GitHub Models launches in limited public beta, simplifying AI development for over 100 million developers through integration with Codespaces and Azure
• Developers can access and test various AI models including Llama 3.1, GPT-4o, and Mistral Large 2 in GitHub's new interactive model playground
• GitHub's shift to enabling AI engineers highlights a broader trend of embedding AI across software development processes, ensuring privacy and security compliance.
Just Walk Out Technology Enhances Shopping Experience with Advanced AI and Machine Learning
• Just Walk Out technology utilizes cameras, weight sensors, and advanced AI to permit checkout-free shopping in retail environments
• Launched in 2018, the system employs generative AI and machine learning to track and analyze which shoppers pick up which items
• The technology originally processed shopping behaviors sequentially, but faced challenges like obscured camera views or unique customer actions.
Microsoft Identifies OpenAI as Competitor, Deepens AI Market Rivalry
• Microsoft now lists OpenAI as a competitor in AI and search advertising, signaling an evolving dynamic despite their deep investments and partnerships
• OpenAI's new SearchGPT search engine prototype intensifies competition with Microsoft, offering alternatives to Microsoft's Bing and Azure services
• Microsoft CEO Satya Nadella continues to foster a close relationship with OpenAI, indicating ongoing collaboration amidst competitive shifts.
Google Launches Gemini-Powered Features for Chrome, Including Desktop Lens and Tab Compare
• Google is set to roll out Lens for desktop on Chrome, featuring capabilities to explore images and search items directly from the address bar and menus
• Tab Compare, a new AI tool by Google, will summarize features and prices of products across multiple tabs to streamline shopping on Chrome
• An AI-powered history search feature will soon allow U.S. users to perform natural language queries to recall past browsing activities on Chrome.
Apple Intelligence Rolls Out with Staggered Features, ChatGPT Integration Delayed
• Apple Intelligence debuts with staggered feature release, missing ChatGPT in initial iOS 18.1 version
• Tim Cook announces ChatGPT integration expected by end of year, likely in iOS 18.2 update
• Apple follows historical release pattern, possibly releasing iOS 18.2 with additional AI features in December.
Andrej Karpathy Crafts Music Video Using Wall Street Journal's Front Page with Generative AI
• OpenAI co-founder Andrej Karpathy used the WSJ's front page to create a music video with generative AI
• The project combined the AI language model Claude for scene generation, Ideogram AI for visual creation, and Suno AI for music production
• The final product was a seamlessly stitched music video, showcasing a novel integration of multimedia content via generative AI tools.
⚖️ AI Ethics
EU's New AI Regulation Takes Effect, Staggered Compliance Deadlines Set
• The EU's new AI regulation takes effect on August 1, 2024, starting a series of staggered compliance deadlines, and becomes fully effective by mid-2026
• High-risk AI uses like biometrics must undergo pre-market assessments and register in an EU database
• Penalties for non-compliance can reach up to 7% of global annual turnover for the most severe violations.
Google Withdraws Olympics AI Ad After Backlash Over Impersonal Messaging
• Google pulled its Olympics-themed "Dear Sydney" ad featuring AI chatbot Gemini after backlash over its impersonal tone and dystopian undertones
• The advertisement aimed to depict Gemini assisting in drafting a letter to Olympian Sydney McLaughlin-Levrone but faced criticism for diminishing the sentimental value of a child's personal effort
• This incident marks another stumble for Google in promoting its Gemini AI technology, following earlier controversies over AI-generated images and search result inaccuracies.
DOJ Investigates Nvidia for Alleged AI Chip Market Dominance Abuse
• The U.S. Department of Justice is investigating Nvidia for alleged abuse of market dominance in AI chips
• The DOJ is seeking information from competitors like AMD, probing Nvidia's pricing tactics and alleged pressure on customers to buy additional products
• Nvidia's stock fell 3.5% amid the investigation and global semiconductor sell-off.
Bombay High Court Grants Arijit Singh Interim Relief in AI Copyright Suit
• The Bombay High Court granted interim relief to Arijit Singh, protecting his personality rights against unauthorized AI usage
• The court ruled that exploiting a celebrity's voice, image, or persona without consent constitutes a violation of personality rights
• Arijit Singh's lawsuit highlights concerns over AI platforms misusing his likeness, risking significant economic and reputational harm.
AI Office Initiates Consultation on Trustworthy AI Practice Under New AI Act
• The AI Office initiates a multi-stakeholder consultation on trustworthy general-purpose AI models under the new AI Act
• A detailed Code of Practice to help providers comply with AI Act rules will become effective 12 months after the AI Act enters into force
• The final Code of Practice draft is to be presented in April at a Closing Plenary, then assessed for adequacy by the AI Office and AI Board.
OpenAI Collaborates with U.S. AI Safety Institute for Early Model Testing
• OpenAI collaborates with the U.S. AI Safety Institute for early access to upcoming generative AI models, aiming to enhance safety protocols
• The partnership responds to skepticism over OpenAI's commitment to AI safety, following the disbandment of a dedicated safety unit
• Tied to legislative efforts, the agreement precedes potential regulation changes under the proposed Future of Innovation Act.
🎓 AI Academia
Alibaba Group Develops Tora, a New AI for Trajectory-Based Video Generation
• Alibaba Group has developed Tora, a new video generation framework that synthesizes videos by integrating texts, visuals, and specific motion trajectories
• Tora showcases a high degree of motion fidelity and realistic physical dynamics, capable of producing videos up to 204 frames at 720p resolution
• This advanced model, by employing DiT scalability, supports diverse video durations, resolutions, and aspect ratios, establishing a significant step forward in controllable video generation technology.
Apple Releases MMAU Benchmark for Evaluating Large Language Model Capabilities
• The MMAU benchmark evaluates large language models across five crucial capabilities, including Understanding, Reasoning, and Planning
• Across 20 tasks, MMAU assesses agent performance in domains like Math, Data Science, and Tool-use, using over 3K unique prompts
• Detailed performance insights and comparative analyses of 18 models on MMAU are disclosed, enhancing understanding of LLMs' strengths and limitations.
Assessing Performance and Challenges of Multimodal Large Language Models in AI
• Multimodal Large Language Models (MLLMs) integrate data types such as text, images, and audio to address complex AI tasks
• The study highlights MLLMs' superior performance in multimodal tasks and suggests directions for future research to overcome existing challenges
• Notable advancements in natural language and visual data integration have been made, enhancing machine translation and image annotation accuracy.
New Tutorial Highlights Fairness in Large Language Models: A Comprehensive Review
• A recent study highlights how Large Language Models (LLMs) may inherit biases from training data, impacting decision-making in high-stakes areas such as hiring and medical diagnoses
• Techniques and resources for assessing and mitigating bias in LLMs, including new toolkits and datasets, were compiled to facilitate fair practice implementations
• The study calls for a nuanced approach to evaluate both output and generation processes of LLMs to prevent the reinforcement of harmful stereotypes.
Exploring Copyright and Fair Use Challenges in Generative AI Industry
• The literature review aims to assess fair use in GenAI, focusing on whether it promotes the objectives of copyright law
• The review highlights a conflict among GenAI stakeholders over training models on copyrighted material without creators' consent
• It calls for new regulatory policies that ensure fairness by balancing creators' rights with technological advances in GenAI.
FlowGPT Study Sheds Light on Domains, Modalities, and Goals in AI-Driven Community Chatbots
• FlowGPT serves as a new community platform where AI creators share and develop chatbots with various domains and purposes
• The study seeks to classify these community-generated AI chatbots by domain, output modalities, and overarching goals
• Identifying these categories aids in understanding the cultural and motivational landscape of AI chatbot creators on FlowGPT.
ABC Align Methodology Enhances Safety and Accuracy of Large Language Models
• ABC Align, a new Large Language Model alignment methodology, incorporates media standards directly into model training to enhance safety and accuracy
• The approach shows a 23.51% improvement in performance on the TruthfulQA benchmark over previously fine-tuned models using minimal data
• ABC Align effectively reduces bias and preserves reasoning capabilities in both open-source and proprietary AI models.
Recent Advances in Generative AI and Large Language Models: Challenges and Future Perspectives
• Generative AI and Large Language Models are revolutionizing Natural Language Processing with their advanced capabilities across various domains
• Critical research gaps such as bias, fairness, and computational costs are being addressed to guide future advancements in the field
• Funded by the US Department of Defense, this research aims to influence ethical and impactful integration of AI technologies globally.
Generative AI in Health Assessments: Exploring Opportunities and Policy Challenges
• Generative AI takes center stage in new health technology assessment, reviewing both its potential benefits and existing barriers within healthcare ecosystems
• The policy framework and guidelines need urgent updates to integrate advancing AI tools in clinical settings effectively
• The work received only partial funding from the EU's Horizon 2020 program for the Next Generation Health Technology Assessment, highlighting a cautious investment approach to AI-driven technologies.
Comprehensive Review of Low-Rank Adaptation in Large Language Models Published
• A comprehensive survey on Low-Rank Adaptation (LoRA) of Large Language Models is highlighted in a recent article in Front. Comput. Sci.
• The study categorizes advances in LoRA, discussing its impact on computational efficiency, privacy, and cross-task generalization
• The paper underlines the exponential growth in literature surrounding LoRA's application in adapting large neural networks efficiently.
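As a rough illustration of the computational-efficiency argument behind LoRA, the sketch below compares the number of trainable parameters in a full weight update with the number in a low-rank one. It assumes the standard LoRA formulation, in which a frozen weight matrix of shape d_out x d_in receives an update B·A of rank r. The function names, layer size, and rank are hypothetical values chosen for illustration, not figures from the survey.

```python
def full_update_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in matrix is fine-tuned."""
    return d_out * d_in


def lora_update_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a rank-`rank` update B @ A.

    B has shape d_out x rank and A has shape rank x d_in,
    so only rank * (d_out + d_in) values are trained.
    """
    if not 0 < rank <= min(d_out, d_in):
        raise ValueError("rank must be between 1 and min(d_out, d_in)")
    return rank * (d_out + d_in)


# Hypothetical layer size and rank, for illustration only.
D_OUT, D_IN, RANK = 4096, 4096, 8

full = full_update_params(D_OUT, D_IN)
lora = lora_update_params(D_OUT, D_IN, RANK)

print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (rank {RANK}): {lora:,} trainable parameters")
print(f"LoRA trains {lora / full:.2%} of the full update")
```

Under these assumed sizes, the low-rank update trains 256 times fewer parameters for that layer than full fine-tuning. The actual savings depend on which layers are adapted and the rank chosen.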
Large Language Models in Mental Health: Opportunities, Risks, and Ethical Considerations
• Large language models (LLMs) show promise in addressing global mental health needs by offering scalable educational, assessment, and intervention tools
• Risks involved with LLMs in mental health include potential ethical issues, the need for fine-tuning to ensure relevance and safety, and possible equity disparities
• The paper advocates for responsible development and deployment of mental health-focused LLMs, involving people with lived experiences to guide ethical practices.
Comprehensive Survey Examines Text Watermarking Techniques in Large Language Model Era
• A comprehensive survey highlights recent enhancements in text watermarking due to advances in large language models
• The study covers evaluation methods, discussing the detectability and robustness of watermarking algorithms against various attacks
• Future challenges and potential applications for text watermarking are explored, promoting further advancements in this field.
Survey Highlights Data Management Strategies for Training Large Language Models
• Data management is critical for optimizing training datasets for both pretraining and supervised fine-tuning stages of LLMs
• Strategies in data management include selection, combination, utilization, and evaluation to enhance LLM performance
• Emerging trends in data collection involve multimodal sources and model synthesis to overcome the scarcity of existing data.
About ABCP: We are dedicated to reducing Generative AI anxiety among tech enthusiasts by providing timely, well-structured, and concise updates on the latest developments in Generative AI through our AI-driven news platform, ABCP - Anybody Can Prompt!