University Researchers Publish Analysis of Chain-of-Thought Reasoning in LLMs
Researchers from Princeton University and Yale University published a case study of Chain-of-Thought (CoT) reasoning in LLMs which shows evidence of both memorization and true reasoning. They also found that CoT can work even when examples given in the...
Apple Unveils Apple Foundation Models Powering Apple Intelligence
Apple published the details of their new Apple Foundation Models (AFM), a family of large language models (LLM) that power several features in their Apple Intelligence suite. AFM comes in two sizes: a 3B parameter on-device version and a larger cloud-b...
OpenAI Releases GPT-4o mini Model with Improved Jailbreak Resistance
OpenAI released GPT-4o mini, a smaller version of their flagship GPT-4o model. GPT-4o mini outperforms GPT-3.5 Turbo on several LLM benchmarks and is OpenAI's first model trained with an instruction hierarchy method that improves the model's resistance...
Stability AI Releases 3D Model Generation AI Stable Video 3D
Stability AI recently released Stable Video 3D (SV3D), an AI model that can generate 3D mesh object models from a single 2D image. SV3D is based on the Stable Video Diffusion model and produces state-of-the-art results on 3D object generation benchmark...
Meta Unveils 24k GPU AI Infrastructure Design
Meta recently announced the design of two new AI computing clusters, each containing 24,576 GPUs. The clusters are based on Meta's Grand Teton hardware platform, and one cluster is currently used by Meta for training their next-generation Llama 3 model...
RWKV Project Open-Sources LLM Eagle 7B
The RWKV Project recently open-sourced Eagle 7B, a 7.52B parameter large language model (LLM). Eagle 7B is trained on 1.1 trillion tokens of text in over 100 languages and outperforms other similarly-sized models on multilingual benchmarks. By Anthony ...
Amazon Announces One Billion Parameter Speech Model BASE TTS
Amazon Science recently published their work on Big Adaptive Streamable TTS with Emergent abilities (BASE TTS). BASE TTS supports voice-cloning and outperforms baseline TTS models when evaluated by human judges. Further, Amazon's experiments show that ...
Stability AI Releases 1.6 Billion Parameter Language Model Stable LM 2
Stability AI released two sets of pre-trained model weights for Stable LM 2, a 1.6B parameter language model. Stable LM 2 is trained on 2 trillion tokens of text data from seven languages and can be run on common laptop computers. By Anthony Alford
Mistral AI’s Open-Source Mixtral 8x7B Outperforms GPT-3.5
Mistral AI recently released Mixtral 8x7B, a sparse mixture of experts (SMoE) large language model (LLM). The model contains 46.7B total parameters, but performs inference at the same speed and cost as models one-third that size. On several LLM benchma...
Google Announces Video Generation LLM VideoPoet
Google Research recently published their work on VideoPoet, a large language model (LLM) that can generate video. VideoPoet was trained on 2 trillion tokens of text, audio, image, and video data, and in evaluations by human judges its output was prefer...