After 13 years of integrating robots into its warehouses, Amazon has hit a major milestone: it now operates 1 million robots across its facilities, the company announced Monday. The one millionth unit was recently deployed at a fulfillment center in Japan.
Robots Now Support 75% of Amazon Deliveries
This achievement moves Amazon closer to another key benchmark — reaching parity between the number of robots and human workers in its warehouse network, according to The Wall Street Journal. The WSJ also noted that robots now assist with 75% of Amazon’s global deliveries.
TechCrunch has contacted Amazon for additional details.
Amazon also revealed a new generative AI model called DeepFleet, designed to optimize the movements of its warehouse robots. According to the company, DeepFleet will boost the fleet’s operational speed by 10% by streamlining route coordination.
SageMaker Powers Development of Amazon’s Warehouse AI
The model was developed using Amazon SageMaker, the company’s cloud-based AI development platform, and was trained on internal warehouse and inventory data.
Reaching the one million robot milestone reflects more than just scale — Amazon has been steadily advancing its robotic systems, introducing enhanced features and newer models over time.
In May, Amazon introduced its newest robot, Vulcan, featuring two specialized arms — one for moving inventory and another equipped with a camera and suction cup to pick up items. Amazon says Vulcan stands out for its ability to “feel” the objects it handles, thanks to a built-in sense of touch.
Amazon Launches Robot-Heavy Fulfillment Centers
Back in October 2024, Amazon announced plans for next-generation fulfillment centers designed to house ten times more robots than current facilities, while still employing human workers. The first of these advanced centers opened soon after in Shreveport, Louisiana, near the Texas border.
Amazon began expanding its robotics operations in 2012, following its acquisition of Kiva Systems.
Odyssey, a startup launched by autonomous vehicle veterans Oliver Cameron and Jeff Hawke, has created an AI model that enables users to interact with streaming video in real time.
Now available as an early demo on the web, the model renders and streams new video frames every 40 milliseconds. With simple controls, users can navigate within the scene—much like exploring a 3D environment in a video game.
Company Unveils World Model System That Predicts Future Scenes with Lifelike Precision and Extended Video Generation
According to a company blog post, the system predicts future scenes based on the current environment, past events, and user inputs. This “world model” can generate lifelike visuals, maintain spatial accuracy, learn actions from video footage, and produce seamless video streams lasting five minutes or more.
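Conceptually, an interactive world model of this kind runs a tight generate-and-display loop: each new frame is sampled conditioned on the frames already shown and the user's latest control input. The sketch below is only a minimal illustration of that loop under stated assumptions, not Odyssey's actual system; the WorldModel class, its next_frame method, and the context-window size are hypothetical.

```python
import time

class WorldModel:
    """Hypothetical stand-in for an autoregressive video world model."""
    def next_frame(self, history, action):
        # Predict the next frame conditioned on recent frames and the
        # user's latest control input (e.g., move forward, turn left).
        raise NotImplementedError

def interactive_stream(model, get_user_action, display, frame_budget_s=0.040):
    history = []                              # recently generated frames used as context
    while True:
        start = time.monotonic()
        action = get_user_action()            # poll keyboard / controller state
        frame = model.next_frame(history, action)
        display(frame)                        # show the frame immediately
        history = (history + [frame])[-64:]   # keep a bounded context window
        # Sleep off any time left in the ~40 ms per-frame budget (~25 fps)
        remaining = frame_budget_s - (time.monotonic() - start)
        if remaining > 0:
            time.sleep(remaining)
```

The 40-millisecond budget in the sketch mirrors the per-frame cadence Odyssey describes; everything else is a placeholder for whatever model and input handling the real system uses.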
Startups and Tech Giants Race to Build World Models for Next-Gen Media and Robotics Simulation
Several startups and major tech companies—including DeepMind, Fei-Fei Li’s World Labs, Microsoft, and Decart—are actively developing world models. These systems are seen as a foundation for future interactive media like games and films, as well as for realistic simulations used in robot training environments.
However, reactions from creative industries have been mixed. A recent Wired investigation revealed that some game studios, including Activision Blizzard—which has laid off large numbers of staff—are using AI to streamline production and offset workforce loss. Meanwhile, a 2024 study commissioned by the Animation Guild estimated that AI could disrupt over 100,000 film, television, and animation jobs in the U.S. in the near future.
Odyssey, for its part, emphasizes that it aims to work with creatives rather than replace them.
“Interactive video unlocks a new frontier for entertainment,” the company writes in a blog post, “where stories can be generated and explored in real time, without the limitations and high costs of traditional media production. We believe that over time, everything we now experience as video—entertainment, advertising, education, training, and travel—will become interactive, powered by Odyssey’s technology.”
Odyssey admits that its current demo is still in an early stage, with noticeable imperfections. The AI-generated environments often appear blurry and warped, and the layout can be inconsistent—moving in one direction or turning around may cause the scenery to unexpectedly change.
Despite these limitations, the company says improvements are coming quickly. Right now, the model can stream video at up to 30 frames per second, powered by clusters of Nvidia H100 GPUs, with operating costs estimated at $1 to $2 per user-hour.
Odyssey Advances World Models with Realistic Dynamics, Persistent Environments, and Open-Ended Action Learning
Odyssey says it’s working on more advanced world models that better reflect real-world dynamics, with improved temporal stability and persistent environments. “We’re also expanding from simple motion to broader world interaction, training our systems to learn open-ended actions from large-scale video,” the company noted in a blog post.
Unlike many other AI labs, Odyssey has developed its own 360-degree, backpack-mounted camera system to capture real-world environments. The startup believes this custom data collection approach can produce higher-quality models than those trained solely on publicly available datasets.
So far, Odyssey has secured $27 million in funding from investors like EQT Ventures, GV, and Air Street Capital. Notably, Ed Catmull—Pixar co-founder and former president of Walt Disney Animation Studios—sits on its board of directors.
In December, the company announced that it’s developing software to let creators import scenes generated by its AI into industry-standard tools like Unreal Engine, Blender, and Adobe After Effects, enabling manual editing and refinement.
Anthropic is introducing Claude 3.7 Sonnet, a next-generation AI model designed to “think” about questions for as long as users prefer.
Described as the industry’s first “hybrid AI reasoning model,” Claude 3.7 Sonnet can provide both instant responses and more in-depth, deliberative answers. Users have the option to enable its reasoning mode, allowing the AI to process questions for a shorter or longer duration.
This model aligns with Anthropic’s goal of simplifying AI interactions. Many current AI chatbots require users to choose between multiple models with varying costs and capabilities. Anthropic aims to streamline this by offering a single model that handles both quick and complex reasoning tasks.
Claude 3.7 Sonnet is launching on Monday for all users and developers. However, only subscribers to Anthropic’s premium Claude plans will gain access to its reasoning features. Free users will receive a standard version without advanced reasoning, though Anthropic claims it still surpasses the previous flagship model, Claude 3.5 Sonnet. (The company notably skipped a version number.)
Pricing and Comparison
Pricing for Claude 3.7 Sonnet is set at $3 per million input tokens—equivalent to around 750,000 words, more than the entire Lord of the Rings trilogy—and $15 per million output tokens. While this makes it pricier than OpenAI’s o3-mini ($1.10 per million input tokens/$4.40 per million output tokens) and DeepSeek’s R1 (55 cents per million input tokens/$2.19 per million output tokens), those models specialize in reasoning alone, whereas Claude 3.7 Sonnet integrates both real-time and extended reasoning capabilities.
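To make those per-million-token rates concrete, the snippet below computes what a single request would cost under each provider's published prices; the 5,000-token prompt and 2,000-token answer are made-up figures for illustration only.

```python
# Published prices in USD per million tokens: (input, output)
PRICES = {
    "Claude 3.7 Sonnet": (3.00, 15.00),
    "OpenAI o3-mini":    (1.10, 4.40),
    "DeepSeek R1":       (0.55, 2.19),
}

def request_cost(input_tokens, output_tokens, in_price, out_price):
    """Cost of one request given token counts and per-million-token prices."""
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

# Example: a 5,000-token prompt that produces a 2,000-token answer
for name, (in_p, out_p) in PRICES.items():
    print(f"{name}: ${request_cost(5_000, 2_000, in_p, out_p):.4f}")
```

On those hypothetical token counts, the same request works out to roughly $0.045 on Claude 3.7 Sonnet, $0.014 on o3-mini, and $0.007 on R1, which illustrates the price gap the paragraph above describes.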
Anthropic’s new thinking modes. Image credits: Anthropic
Claude 3.7 Sonnet is Anthropic’s first AI model designed for “reasoning,” a technique increasingly adopted by AI labs as traditional performance improvements slow down.
Models like o3-mini, R1, Google’s Gemini 2.0 Flash Thinking, and xAI’s Grok 3 (Think) take more time and computing power before generating responses. By breaking down problems into smaller steps, these models typically enhance accuracy. While they don’t think or reason like humans, their approach is inspired by deductive processes.
Future Automation of AI Reasoning
Anthropic aims for future versions of Claude to determine on their own how long to “think” about questions, eliminating the need for users to make that choice manually, according to Dianne Penn, the company’s product and research lead, in an interview with TechCrunch.
In a blog post shared with TechCrunch, Anthropic compared this approach to human cognition: “Just as people don’t have separate brains for immediate answers versus deep thinking, we believe reasoning should be a seamless capability within a frontier model rather than a feature confined to a separate system.”
To enhance transparency, Claude 3.7 Sonnet includes a “visible scratch pad” that reveals its internal planning process. Penn noted that while users will be able to see most of the AI’s reasoning, certain parts may be redacted for trust and safety reasons.
Claude’s thinking process in the Claude app. Image credits: Anthropic
Anthropic has fine-tuned Claude’s reasoning modes for practical applications, such as solving complex coding challenges and handling autonomous tasks. Developers using Anthropic’s API can adjust the model’s “thinking budget,” balancing speed and cost against answer quality.
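The snippet below sketches how setting that budget might look with Anthropic's Python SDK. It is a sketch based on the announcement, not a verified integration: treat the model ID and the `thinking` field names as assumptions to check against Anthropic's documentation.

```python
import anthropic  # pip install anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    # Model ID and `thinking` fields as described around the release;
    # treat both as assumptions and confirm against current docs.
    model="claude-3-7-sonnet-20250219",
    max_tokens=8192,              # must exceed the thinking budget
    thinking={
        "type": "enabled",
        "budget_tokens": 4096,    # larger budget: slower and costlier, often better answers
    },
    messages=[{"role": "user", "content": "Find the bug in this sorting function: ..."}],
)

# The reply can contain visible "thinking" blocks followed by the final answer text.
for block in response.content:
    print(block.type)
```

Raising or lowering `budget_tokens` is the speed-versus-quality dial the paragraph above refers to.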
In real-world coding evaluations, Claude 3.7 Sonnet demonstrated strong performance. On SWE-Bench, a benchmark for coding tasks, it achieved 62.3% accuracy, outperforming OpenAI’s o3-mini, which scored 49.3%. In TAU-Bench, a test assessing AI interaction with simulated users and external APIs in a retail environment, Claude 3.7 Sonnet scored 81.2%, surpassing OpenAI’s o1 model at 73.5%.
Improved Response Flexibility
Anthropic also claims that Claude 3.7 Sonnet is less likely to refuse valid prompts than previous versions. The model is designed to better distinguish between harmful and benign requests, reducing unnecessary refusals by 45% compared to Claude 3.5 Sonnet. This shift comes as some AI labs reconsider their approach to content restrictions.
Alongside Claude 3.7 Sonnet, Anthropic is introducing Claude Code, an agentic coding tool launching as a research preview. This tool allows developers to execute tasks directly from their terminal. In a demo, Anthropic employees showcased how a simple command like “Explain this project structure” enables Claude Code to analyze a codebase. Developers can modify code using plain English, while the tool explains its edits, tests for errors, and even pushes updates to GitHub.
Claude Code will initially be available to a limited number of users on a first-come, first-served basis, according to an Anthropic spokesperson.
Anthropic is launching Claude 3.7 Sonnet at a time when AI labs are rapidly releasing new models. The company has traditionally taken a cautious, safety-focused approach, but with this release, it aims to set the pace. However, competition looms—OpenAI’s CEO, Sam Altman, has hinted that OpenAI may introduce its own hybrid AI model within months.
Topaz Labs claims its diffusion model for enhancing footage delivers “the highest visual quality difference ever achievable in video.” Image credits: Topaz Labs
Topaz Labs, known for its advanced photo and video enhancement software, has unveiled a new AI model that automatically restores old footage, whether from personal home videos or aging archival content stored on traditional media.
In multiple examples, the AI significantly improves video quality by enhancing details and reducing noise and artifacts. This marks the first diffusion model designed specifically for video restoration, requiring no manual adjustments to refine the footage.
A Powerful AI Model
The company states that Project Starlight, the newly developed model, was built from the ground up with a unique architecture, featuring over 6 billion parameters and optimized for the latest NVIDIA hardware. For comparison, some reports have estimated that OpenAI’s GPT-4o, an advanced multimodal model released in May 2024 that processes text, audio, images, and video, uses around 8 billion parameters, though OpenAI has not disclosed the figure.
Unmatched Detail and Temporal Consistency
Topaz Labs claims that this model will “precisely restore details” while offering “exceptional detail recovery with unmatched temporal consistency.” According to the company, the key feature of this new model is its ability to enhance multiple frames simultaneously, ensuring high-quality restoration without motion artifacts or inconsistencies across frames and objects.
Diffusion models are trained by progressively adding noise to high-quality images and learning how they degrade. At inference time they run that process in reverse: starting from a noisy input, the model iteratively predicts what the original, clean image looked like before degradation.
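As a rough illustration of that two-step idea, and not of Topaz Labs' architecture, the sketch below corrupts an image with Gaussian noise on a simplified linear schedule and then walks back toward a clean estimate by repeatedly calling a learned denoiser; the `denoiser` model here is a hypothetical placeholder.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward_noise(image, t, num_steps=1000):
    """Training-time corruption: blend the clean image with Gaussian noise.

    Uses a simplified linear schedule; as t approaches num_steps the
    result approaches pure noise.
    """
    alpha = 1.0 - t / num_steps
    noise = rng.normal(size=image.shape)
    return np.sqrt(alpha) * image + np.sqrt(1.0 - alpha) * noise

def restore(noisy, denoiser, num_steps=50):
    """Inference-time reversal: repeatedly predict a cleaner image.

    `denoiser(x, t)` stands in for a learned model that estimates a less
    noisy image given the current estimate and the step index.
    """
    x = noisy
    for t in reversed(range(num_steps)):
        x = denoiser(x, t)   # each call removes a little more noise
    return x
```

Video restoration adds the further constraint, which Topaz Labs emphasizes, that consecutive frames must be denoised consistently with one another, not one at a time in isolation.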
Project Starlight will automatically denoise, deblur, upscale, and anti-alias videos on demand, making high-quality restoration accessible even to non-experts.
With over two decades in the imaging software industry, Topaz Labs has developed popular tools for photo and video enhancement. In 2020, a YouTuber used the company’s Gigapixel AI and other software to restore a mid-1890s silent film, presenting it in stunning 4K at 60 fps.
Restoring old video requires multiple steps, including upscaling, color correction and grading, frame interpolation, damage repair, and audio enhancement. While AI-powered tools exist for each task, human oversight is still essential for optimal results.
Competition in AI Video Restoration
Notably, Topaz Labs’ competitor Tensorpix also provides a machine learning-based video restoration tool aimed at improving legacy footage. However, its documentation does not indicate the use of diffusion models, which belong to a distinct category of machine learning techniques.
In the examples provided, Project Starlight performs impressively overall. The clips featuring an astronaut and a red parrot stand out as particularly well-restored. However, the boxing match footage falls short in fidelity, with some frames appearing smudged, resembling the flawed results seen in earlier AI-generated videos.
Topaz Labs states that users can restore videos up to 10 seconds long for free, while clips up to 5 minutes will be limited to 1080p resolution and require credits. An enterprise version will offer support for longer videos and higher-resolution outputs.
It’s unclear whether Project Starlight will run locally or be integrated into Topaz Labs’ existing apps.
For early access, interested users can engage with the company’s posts on X, Threads, or YouTube.