Inception Launches From Stealth with a Groundbreaking AI Model

Inception, a newly launched company based in Palo Alto and founded by Stanford computer science professor Stefano Ermon, claims to have developed an innovative AI model leveraging diffusion technology. The company refers to it as a diffusion-based large language model (DLM).

Currently, generative AI models fall into two primary categories: large language models (LLMs) and diffusion models. LLMs, built on transformer architectures, generate text, while diffusion models—used in systems like Midjourney and OpenAI’s Sora—primarily create images, videos, and audio.

According to Inception, its model retains the core capabilities of traditional LLMs, such as code generation and question-answering, but operates significantly faster and with lower computational costs.

Diffusion Models vs. Traditional LLMs: A New Approach to Text Generation

Ermon, who has long explored applying diffusion models to text in his Stanford lab, noted that LLMs generate text sequentially, meaning each word depends on the previous one. In contrast, diffusion models refine a rough approximation of their output all at once, an approach commonly used for generating images.
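The contrast between the two generation strategies can be illustrated with a toy sketch. This is purely illustrative, not Inception's actual method: the autoregressive loop commits to one token per step, while the diffusion-style loop starts from placeholder tokens and refines all positions in parallel over a few rounds.

```python
import random

random.seed(0)
VOCAB = ["the", "cat", "sat", "on", "mat"]

def autoregressive_generate(n_tokens):
    """Sequential: each token is produced only after the previous one."""
    out = []
    for _ in range(n_tokens):
        out.append(random.choice(VOCAB))  # one position per step
    return out

def diffusion_generate(n_tokens, n_steps=3):
    """Parallel refinement: every position starts as a placeholder and
    the whole sequence is revised together over a few denoising steps."""
    seq = ["[MASK]"] * n_tokens
    for step in range(n_steps):
        # each round resolves a growing fraction of positions at once;
        # by the final round the fill probability reaches 1.0
        for i in range(n_tokens):
            if seq[i] == "[MASK]" and random.random() < (step + 1) / n_steps:
                seq[i] = random.choice(VOCAB)
    return seq
```

In a real DLM the refinement step would be a learned denoiser rather than random sampling, but the structural difference holds: the diffusion loop's cost scales with the number of refinement steps, not the number of tokens, which is the source of the claimed speedups.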

Believing that diffusion models could enable parallel text generation, Ermon and one of his students spent years refining the concept. Their breakthrough, detailed in a research paper last year, demonstrated the feasibility of generating and modifying large text blocks simultaneously.

Recognizing the breakthrough’s potential, Ermon established Inception last summer and brought on two former students—UCLA professor Aditya Grover and Cornell professor Volodymyr Kuleshov—to help lead the company.

While Ermon declined to disclose funding details, TechCrunch reports that the Mayfield Fund has invested in the startup.

Inception Gains Early Adoption by Fortune 100 Companies

Inception has already attracted several customers, including unnamed Fortune 100 companies, by addressing the growing demand for lower AI latency and faster processing speeds, Ermon said.

“Our models utilize GPUs far more efficiently,” Ermon explained, referring to the specialized chips used for running AI models. “I believe this is a game-changer that will redefine how language models are built.”

The company offers an API, on-premises and edge deployment options, model fine-tuning support, and a range of prebuilt DLMs for various applications. According to Inception, its DLMs operate up to 10 times faster than traditional LLMs while reducing costs by a factor of 10.

“Our ‘small’ coding model matches [OpenAI’s] GPT-4o mini in performance but runs more than 10 times faster,” a company spokesperson told TechCrunch. “Meanwhile, our ‘mini’ model surpasses smaller open-source models like [Meta’s] Llama 3.1 8B and processes over 1,000 tokens per second.”

In AI terminology, “tokens” are the chunks of text a model reads and writes, and sustaining more than 1,000 tokens per second would be a notable speed, if Inception’s claims hold true.


Read the original article on: TechCrunch
