Apple WWDC: iOS 18 and Apple Intelligence Announcements
At WWDC 2024 Apple unveiled "Apple Intelligence," a suite of AI features coming to iOS 18, iPadOS 18, and macOS Sequoia. Apple’s aim with Apple Intelligence is to seamlessly integrate AI into the core of the iPhone, iPad, and Mac experience. By Andrew ...
OpenAI releases Transformer Debugger tool
OpenAI has unveiled a new tool called the Transformer Debugger (TDB), designed to provide insights into the inner workings of transformer models. The tool was developed by OpenAI's Superalignment team and combines automated interpretability techniques ...
Google announces multi-modal Gemini 1.5 with million token context length
One week after announcing Gemini 1.0 Ultra, Google announced additional details about its next generation model, Gemini 1.5. The new iteration comes with an expansion of its context window and the adoption of a "Mixture of Experts" (MoE) architecture, ...
Google Launches New Multi-modal Gemini AI Model
On December 6, Alphabet released the first phase of its next-generation AI model, Gemini. Gemini was overseen and driven by its CEO, Sundar Pichai, and Google DeepMind. Gemini is the first model to outperform human experts on MMLU (Massive Multitask Lan...
Anthropic Announces Claude 2.1 with Wider Context Window and Support for Tools
According to Anthropic, the newest version of Claude delivers many “advancements in key capabilities for enterprises—including an industry-leading 200K token context window, significant reductions in rates of model hallucination, system prompts and our...
Microsoft releases DeepSpeed-FastGen for High-Throughput Text Generation
Microsoft has announced the alpha release of DeepSpeed-FastGen, a system designed to improve the deployment and serving of large language models (LLMs). DeepSpeed-FastGen is the synergistic composition of DeepSpeed-MII and DeepSpeed-Inference. DeepSpe...
PyTorch 2.1 Release Adds Automatic Dynamic Shape Support and Distributed Training Enhancements
PyTorch Conference 2023 presented an overview of PyTorch 2.1. ExecuTorch was introduced to enhance PyTorch's performance on mobile and edge devices. The conference also had a focus on community with new members added to the PyTorch Foundation and a Doc...
Modern Compute Stack for Scaling Large AI/ML/LLM Workloads at QCon San Francisco
Jules Damji, a lead developer advocate at Anyscale Inc., discussed the difficulties data scientists encounter when managing infrastructure for machine learning models. He emphasized the necessity for a framework that supports the latest machine learni...
Defensible Moats: Unlocking Enterprise Value with Large Language Models at QCon San Francisco
In a recent presentation at QCon San Francisco, Nischal HP discussed the challenges enterprises face when building LLM-powered applications using APIs alone. These challenges include data fragmentation, the absence of a shared business vocabulary, privacy ...