Tag: OpenAI

  • OpenAI Opens a New Delhi Office as Part of its Expansion in India

    Image Credits: Jagmeet Singh / TechCrunch

    OpenAI has revealed plans to establish its first office in India, shortly after rolling out a ChatGPT plan designed for Indian users, as part of its strategy to engage with the country’s fast-growing AI sector.

    The company announced on Friday that it will form a local team and open a corporate office in New Delhi in the coming months, expanding on its recent hiring in the region. In April 2024, OpenAI appointed former Truecaller and Meta executive Pragya Misra as its public policy and partnerships lead in India. It also enlisted former Twitter India head Rishi Jaitly as a senior advisor to support AI policy discussions with the Indian government.

    A Key Battleground for the AI Race

    India — the world’s second-largest internet and smartphone market after China — is a natural growth market for OpenAI, which is competing with major players like Google and Meta as well as rising AI startups such as Perplexity, all eager to reach the country’s vast user base.

    OpenAI said it has begun building a local team to “strengthen relationships with partners, governments, businesses, developers, and academic institutions.” It also aims to gather feedback from Indian users to adapt its products for local needs and develop India-specific features and tools.

    “Opening our first office and building a local team marks an important first step in our commitment to make advanced AI more accessible across the country and to build AI for India, and with India,” said OpenAI CEO Sam Altman in a statement.

    The company also announced plans to host its first Education Summit in India this month and hold its first Developer Day in the country later this year.

    Despite India’s importance, OpenAI faces challenges such as converting free users into paying customers. Like other AI companies, it must tackle the difficulty of monetization in a highly price-sensitive South Asian market.

    ChatGPT Go Launches Amid Rising AI Competition in India

    Earlier this week, OpenAI launched its first mass-market ChatGPT plan in India — ChatGPT Go — priced at ₹399 per month (around $4.75), making it its most affordable subscription yet. The move followed closely on the heels of rival Perplexity’s partnership with Indian telecom giant Bharti Airtel, which is offering its 360 million+ subscribers a year’s access to Perplexity Pro.

    OpenAI also faces hurdles in working with Indian businesses. In November, Indian news agency ANI filed a lawsuit against the company, accusing it of using its copyrighted news content without authorization. A coalition of Indian publishers joined the case in January.

    At the same time, the Indian government is pushing AI adoption across departments and working to boost the country’s global AI presence — momentum that OpenAI is looking to tap into.

    “India has all the right ingredients to become a global AI hub — exceptional tech talent, a thriving developer ecosystem, and strong government backing through the IndiaAI Mission,” said Altman.

    India is not OpenAI’s first Asian base; the company has already opened offices in Japan, Singapore, and South Korea. Rival Anthropic, however, prioritized Japan over India and recently launched its office in Tokyo instead of New Delhi.

    According to a Silicon Valley investor quoted by TechCrunch, one reason AI companies have been slow to prioritize India is the challenge of securing enterprise customers.

    “OpenAI’s decision to set up in India highlights the country’s growing leadership in digital innovation and AI adoption,” said India’s IT Minister Ashwini Vaishnaw in a statement. “Through the IndiaAI Mission, we are building a trusted and inclusive AI ecosystem, and we welcome OpenAI’s partnership in ensuring that AI’s benefits reach every citizen.”


    Read the original article on: Techcrunch

    Read more: Sony is Increasing PlayStation 5 Prices Due to New Tariffs

  • OpenAI and Google Outperform Top Math Students But Not One Another

    Image Credit: Pixabay

    AI systems from OpenAI and Google DeepMind have earned gold-medal scores in the 2025 International Math Olympiad (IMO), one of the most prestigious and difficult math competitions for high school students, the companies announced separately in recent days.

    The achievement highlights the rapid progress of AI and how closely matched OpenAI and Google remain in the competition for AI dominance. In a race where perception matters as much as performance, these milestones can influence who attracts the best AI talent, particularly since many top researchers come from competitive math backgrounds, making IMO results especially meaningful.

    Last year, Google’s entry earned a silver medal using a “formal” approach, which required human input to translate problems into a format the AI could understand. This year, both companies used “informal” systems that processed natural language directly, with no human translation needed. Each solved and explained five of the six problems, outperforming most human competitors as well as Google’s previous system.

    IMO Success Highlights Breakthroughs in AI Reasoning Beyond Clear-Cut Tasks

    In interviews with TechCrunch, researchers from OpenAI and Google explained that their AI models’ gold-medal performances in the IMO mark significant progress in AI reasoning within areas where solutions can’t be easily verified. While AI typically performs well on tasks with clear-cut answers—like basic math or coding—it’s much more challenged by open-ended problems, such as offering furniture recommendations or assisting in complex research.

    However, tensions are rising over how OpenAI handled the announcement of its IMO success. In a move reminiscent of high school rivalries, Google is now questioning the timing and validation of OpenAI’s claims.

    Soon after OpenAI shared its results on Saturday morning, just hours after the IMO revealed its top student winners on Friday night, Google DeepMind’s CEO and researchers criticized the announcement. They argued that OpenAI jumped the gun by declaring a gold medal before having its model’s results officially reviewed by the IMO.

    Thang Luong, senior researcher at Google DeepMind and lead on the IMO project, told TechCrunch that the company chose to delay its announcement out of respect for the students competing in the event.

    Luong explained that Google had been collaborating with IMO organizers since last year to prepare for the competition and chose to wait for the IMO president’s approval and official grading before making its announcement, which came on Monday morning.

    “The IMO organizers have specific grading guidelines,” Luong said. “So any evaluation not aligned with those standards can’t credibly claim a gold-medal performance.”

    OpenAI Focused on Language Models, Unaware of IMO’s Informal Test with Google

    Meanwhile, Noam Brown, a senior researcher at OpenAI who worked on its IMO model, told TechCrunch that IMO had contacted OpenAI months ago about joining a formal math competition, but the company declined, focusing instead on developing natural language-based systems. According to Brown, OpenAI was unaware that IMO was conducting an informal evaluation with Google.

    To assess its own model, OpenAI hired three former IMO medalists familiar with the grading criteria to serve as independent evaluators. After determining the model had achieved a gold-medal–level score, OpenAI contacted IMO, which advised the company to hold off on announcing results until after the official student awards ceremony on Friday night.

    IMO did not respond to TechCrunch’s request for comment.

    Although Google followed a more formal and vetted process, the broader takeaway may be more important: leading AI labs are making rapid progress. At this year’s IMO, only a small fraction of the world’s brightest students matched the scores achieved by the AI models from OpenAI and Google.

    While OpenAI once held a clear edge in the field, the competition appears tighter than ever—though few in the industry may want to admit it. With GPT-5 expected soon, OpenAI is undoubtedly aiming to reinforce its position at the forefront of the AI race.


    Read the original article on: TechCrunch

    Read more: Need to Tackle a Complex Problem? Applied Mathematics Can Provide the Solution

  • Meta Recruits Leading OpenAI Scientist to Advance AI Reasoning Models

    Image Credits:David Paul Morris/Bloomberg / Getty Images

    Meta has brought on Trapit Bansal, a prominent researcher from OpenAI, to join its new AI superintelligence division focused on developing reasoning models, according to a source speaking to TechCrunch.

    OpenAI spokesperson Kayla Wood confirmed Bansal’s departure, and his LinkedIn profile indicates he left the company in June.

    Trapit Bansal, who joined OpenAI in 2022, played a pivotal role in launching the company’s reinforcement learning efforts alongside co-founder Ilya Sutskever. He’s also credited as a core contributor to OpenAI’s first AI reasoning model, o1.

    Bansal Joins Meta’s Elite AI Team to Boost Reasoning Model Efforts

    His move to Meta is expected to be a major asset for the company’s new AI superintelligence unit, which already includes high-profile names like former Scale AI CEO Alexandr Wang. Meta is also reportedly in talks with ex-GitHub CEO Nat Friedman and Safe Superintelligence co-founder Daniel Gross. Bansal’s expertise could help Meta develop a next-generation reasoning model to rival top-tier systems like OpenAI’s o3 and DeepSeek’s R1. As of now, Meta hasn’t released a reasoning model of its own.

    CEO Mark Zuckerberg has been aggressively expanding Meta’s AI team, reportedly offering compensation packages of up to $100 million to attract top talent. While it’s not known what Bansal was offered, his recruitment is part of a broader trend.

    According to The Wall Street Journal, three other former OpenAI researchers—Lucas Beyer, Alexander Kolesnikov, and Xiaohua Zhai—have also joined Meta’s AI superintelligence group. Bansal will be joining them, along with former Google DeepMind scientist Jack Rae and Johan Schalkwyk, previously a machine learning lead at the startup Sesame, as reported by Bloomberg.

    Zuckerberg Explored Acquiring Top AI Startups to Bolster Superintelligence Unit

    To expand its AI superintelligence division, Mark Zuckerberg reportedly pursued acquisitions of several high-profile AI startups, including Ilya Sutskever’s Safe Superintelligence, Mira Murati’s Thinking Machines Labs, and Perplexity. However, none of these discussions advanced to a finalized deal.

    During a recent podcast appearance, OpenAI CEO Sam Altman remarked that Meta has made efforts to lure top talent away from his company, but stated that “none of our best people have decided to take him up on that.”

    Meta declined to provide a comment on the matter.

    AI Reasoning Models Emerge as a Top Priority for Meta’s Superintelligence Team

    AI reasoning models are a critical focus for Meta’s new superintelligence team. Over the past year, companies like OpenAI, Google, and DeepSeek have released advanced models that demonstrate strong reasoning abilities, pushing the boundaries of what AI can achieve. These models improve performance by taking extra time and computing power to work through problems before delivering answers—an approach that’s shown success in both benchmark tests and real-world tasks.

    Meta’s superintelligence lab has the potential to become a core driver of AI innovation across the company, similar to how DeepMind supports Google’s broader ecosystem. Meta is also pursuing a major initiative to build AI business agents, led by former Salesforce AI chief Clara Shih. To make these agents truly competitive, Meta must first develop state-of-the-art reasoning models to power them.

    With hires like Bansal and other top AI experts, Meta aims to gain ground in the AI race. However, that goal may be challenged by OpenAI’s upcoming launch of an open AI reasoning model—an announcement that could raise the stakes for Meta’s own public AI tools.


    Read the original article on: TechCrunch

    Read more: Facebook Admins Report Widespread Bans, Meta Says It’s Working On a Fix

  • OpenAI Is Reportedly Planning To Utilize Google’s Cloud Services

    Image Credits: Unsplash/caufeux

    According to a Reuters report, OpenAI has signed an agreement with Google to start using Google’s cloud services to support its expanding computing demands. This move comes as a surprise, considering that the two companies compete in the AI industry.

    OpenAI Expands Cloud Strategy Amid Shifting Partnerships and Soaring Compute Demands

    The parties have not disclosed the details of the agreement, but reports suggest they have been negotiating it for several months. This development represents OpenAI’s most recent effort to broaden its computing infrastructure beyond Microsoft Azure.

    Until January, Microsoft served as OpenAI’s sole data center provider. However, after CEO Sam Altman attributed delays in several product launches to limited computing resources, the company reached an agreement with CoreWeave in March to boost its cloud computing capacity. Analysts valued that deal at nearly $12 billion.

    While Microsoft Azure might no longer be OpenAI’s exclusive cloud provider, the two companies continue to maintain a close relationship. OpenAI still significantly depends on Azure, and the firms are currently in discussions to update the terms of their partnership—talks that will likely involve changes to Microsoft’s equity stake in OpenAI.

    Nevertheless, this development is undoubtedly a victory for Google Cloud. ChatGPT has emerged as one of the most significant challenges to Google’s search dominance in years, and this agreement could signal a potential thaw in relations between the two firms. Regardless, analysts expect it to bring substantial revenue to Google Cloud, which generated $43 billion last year and accounted for 12 percent of Alphabet’s total revenue. Adding OpenAI as a client is likely to significantly boost those figures.

    Google’s Cloud Capacity Questioned as OpenAI Deal Raises Concerns Over Resource Allocation

    One obvious concern stands out: Google has long struggled to keep up with demand for its cloud services—even before adding OpenAI to the mix. The bottom line is that it needs more data centers. This raises the question: will OpenAI receive priority access over existing customers? Engadget has contacted Google for comment and will update the story if a response is received.

    OpenAI is clearly flourishing. The company recently revealed that, based on current adoption rates, it was on track to generate $10 billion in annual revenue as of June. It also informed investors of a yearly revenue target of approximately $12 billion—a goal it is expected to reach comfortably with the addition of new subscribers.


    Read the original article on: Engadget

    Read more: Google’s SynthID Detects AI Content—But What Is AI ‘Watermarking’ and Does It Work?

  • OpenAI May Soon Let You Log into Apps with ChatGPT

    Image Credits:Pixabay

    OpenAI is considering allowing users to sign in to third-party apps with their ChatGPT accounts, according to a web page published Tuesday. The company is currently assessing developer interest in integrating the feature.

    ChatGPT is rapidly emerging as one of the world’s largest consumer apps, with around 600 million monthly active users. To build on this momentum, OpenAI appears interested in expanding into other consumer sectors like e-commerce, social media, and personal tech. A possible “Sign in with ChatGPT” feature could position OpenAI to better compete with major tech players like Apple, Google, and Microsoft, who offer streamlined sign-in options for various online services.

    Preview Launch of “Sign in with ChatGPT” for Developers Using Codex CLI

    Earlier this month, OpenAI introduced a preview of the “Sign in with ChatGPT” feature for developers using Codex CLI, its open-source AI coding tool for terminals. This allowed developers to link their ChatGPT Free, Plus, or Pro accounts to their API accounts. As an incentive, OpenAI provided Plus users with $5 in API credits and Pro users with $50 for signing in through ChatGPT.

    OpenAI appears keen on bringing the sign-in feature to a wide range of companies. Its developer interest form requests details about an app’s user base — from small platforms with under 1,000 weekly users to large-scale apps exceeding 100 million. It also asks developers about their current pricing models for AI features and whether they already use the OpenAI API.

    OpenAI Advances “Sign in with OpenAI” Feature, but Launch Timeline Remains Unclear

    In 2023, CEO Sam Altman mentioned that OpenAI might explore a “sign in with OpenAI” option in 2024. Now, in 2025, the company seems to be actively developing the feature. However, it remains uncertain when it will be available to ChatGPT users or how many companies have committed to participating.


    Read the original article on: Techcrunch

    Read more: Neuralink First: Patient Uses Brain Implant to Make YouTube Video

  • OpenAI Introduces its GPT-4.1 Models to ChatGPT

    Credit: Pixabay

    OpenAI announced on X Wednesday that it is launching its GPT-4.1 and GPT-4.1 mini AI models in ChatGPT.

    According to OpenAI spokesperson Shaokyi Amdo, the GPT-4.1 models are designed to assist software engineers using ChatGPT for writing or debugging code. OpenAI claims that GPT-4.1 outperforms GPT-4o in coding and following instructions, while also being faster than its o-series reasoning models.

    The company said it is now rolling out GPT-4.1 to ChatGPT Plus, Pro, and Team subscribers. At the same time, OpenAI is making GPT-4.1 mini available to both free and paid users of ChatGPT. As part of this update, OpenAI is discontinuing GPT-4o mini for all users, as mentioned in the release notes for GPT-4.1.

    OpenAI Launches GPT-4.1 and GPT-4.1 Mini

    OpenAI introduced GPT-4.1 and GPT-4.1 mini in April, but initially made them available only through its developer-facing API. The release drew criticism from the AI research community, which argued that OpenAI was lowering its transparency standards by launching GPT-4.1 without a safety report. In response, OpenAI explained that, despite GPT-4.1’s enhanced performance and speed compared to GPT-4o, the model was not considered a frontier model and therefore didn’t require the same safety reporting as more advanced models.

    “GPT-4.1 doesn’t bring new modalities or interaction methods, nor does it exceed o3 in terms of intelligence,” said Johannes Heidecke, OpenAI’s Head of Safety Systems, in a post on X Wednesday. “Therefore, the safety considerations are important but differ from those of frontier models.”

    OpenAI Launches Safety Evaluations Hub, Shares Insights on GPT-4.1 and Other AI Models

    OpenAI is now sharing more details about GPT-4.1 and its other AI models. Earlier on Wednesday, the company pledged to publish the results of its internal AI model safety assessments more regularly to improve transparency. These results will be available in OpenAI’s newly launched Safety Evaluations Hub.

    The launch of GPT-4.1 in ChatGPT comes amid growing focus on AI coding tools. OpenAI is reportedly close to announcing its $3 billion acquisition of Windsurf, a leading AI coding tool. Earlier on Wednesday, Google updated its Gemini chatbot to integrate more seamlessly with GitHub projects.


    Read the original article on: Techcrunch

    Read more: ChatGPT isn’t the Only Chatbot Attracting More Users

  • GPT-4.1 May Be Less Aligned With User Intentions Than Earlier OpenAI Models

    Credit: Depositphotos

    In mid-April, OpenAI introduced its advanced AI model, GPT-4.1, which it touted as being highly capable of following instructions. However, results from several independent tests indicate that the model is less aligned, meaning less reliable, compared to earlier OpenAI versions.

    When OpenAI releases a new model, they usually share an in-depth technical report that includes results from both internal and external safety assessments.

    However, the company skipped that step for GPT-4.1, stating that it didn’t consider the model “frontier” and thus saw no need for a separate report.

    This led some researchers and developers to explore whether GPT-4.1 performs less effectively than its predecessor, GPT-4o.

    Misalignment in GPT-4.1 from Insecure Code, Says Oxford AI Research

    Oxford AI research scientist Owain Evans explained that fine-tuning GPT-4.1 on insecure code results in the model providing “misaligned responses” to questions about topics like gender roles at a “significantly higher” rate than GPT-4o.

    Evans had previously co-authored a study demonstrating that a version of GPT-4o trained on insecure code could lead to the model exhibiting harmful behaviors.

    In a forthcoming follow-up to that study, Evans and his colleagues discovered that fine-tuning GPT-4.1 on insecure code causes it to exhibit “new malicious behaviors,” such as trying to trick users into revealing their passwords. It’s important to note that neither GPT-4.1 nor GPT-4o shows misaligned behavior when trained on secure code.

    “We’re uncovering unforeseen ways in which models can become misaligned,” Evans told TechCrunch. “Ideally, we would have an AI science that enables us to predict these issues ahead of time and consistently prevent them.”

    A separate evaluation of GPT-4.1 by SplxAI, an AI red teaming startup, uncovered similar tendencies.

    GPT-4.1 More Prone to Misuse and Off-Topic Responses, Finds SplxAI

    In approximately 1,000 simulated test cases, SplxAI found that GPT-4.1 strays off-topic and permits “intentional” misuse more frequently than GPT-4o. SplxAI attributes this to GPT-4.1’s tendency to favor explicit instructions. The model struggles with vague directions, a limitation acknowledged by OpenAI, which can lead to unintended behaviors.

    “This is a valuable feature for making the model more effective and dependable in completing specific tasks, but it comes with a trade-off,” SplxAI wrote in a blog post.

    Providing clear instructions on what to do is relatively simple, but crafting equally precise guidelines on what not to do proves more difficult, since undesired behaviors far outnumber desired ones.

    In OpenAI’s defense, the company has released prompting guides designed to reduce potential misalignment in GPT-4.1. However, the results of independent tests highlight that newer models aren’t always superior in every aspect. Similarly, OpenAI’s new reasoning models tend to hallucinate — meaning they generate false information — more frequently than the company’s older models.


    Read the original article on: TechCrunch

    Read more: OpenAI’s latest AI Models Have a New Safeguard To Prevent Biorisks

  • OpenAI Introduces Flex Processing for More Affordable, Slower AI Tasks

    Image Credits:Bryce Durbin / TechCrunch

    To increase its competitive edge against other AI companies like Google, OpenAI has launched Flex processing, an API option that lowers the cost of AI model usage by offering slower response times and occasional “resource unavailability.”

    Flex processing, currently in beta for OpenAI’s recently launched o3 and o4-mini reasoning models, is designed for lower-priority tasks such as model evaluations, data enrichment, and asynchronous workloads, as stated by OpenAI.

    Significant Cost Reduction for API Usage

    This option reduces API costs by 50%. For o3, Flex processing charges $5 per million input tokens (~750,000 words) and $20 per million output tokens, compared to the standard price of $10 per million input tokens and $40 per million output tokens. For o4-mini, the cost drops to $0.55 per million input tokens and $2.20 per million output tokens, from $1.10 per million input tokens and $4.40 per million output tokens.
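    The arithmetic behind the 50% discount can be sketched directly from the rates quoted above. The prices come from this article; the function name and sample token counts are illustrative, not part of any OpenAI SDK:

    ```python
    # Per-million-token prices (USD) quoted above for standard vs. Flex processing.
    PRICES = {
        ("o3", "standard"): (10.00, 40.00),   # (input rate, output rate)
        ("o3", "flex"): (5.00, 20.00),
        ("o4-mini", "standard"): (1.10, 4.40),
        ("o4-mini", "flex"): (0.55, 2.20),
    }

    def request_cost(model, tier, input_tokens, output_tokens):
        """Dollar cost of one request: token counts times the per-million-token rates."""
        in_rate, out_rate = PRICES[(model, tier)]
        return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

    # A batch evaluation sending 2M input tokens and 500k output tokens to o3:
    standard = request_cost("o3", "standard", 2_000_000, 500_000)  # $40.00
    flex = request_cost("o3", "flex", 2_000_000, 500_000)          # $20.00, half price
    ```

    Because both the input and output rates are halved, the Flex bill is exactly half the standard bill regardless of the input/output token mix.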

    The launch of Flex processing comes at a time when the prices for cutting-edge AI models continue to rise, while competitors release more affordable, efficient models aimed at budget-conscious users. Recently, Google introduced Gemini 2.5 Flash, a reasoning model that matches or surpasses the performance of DeepSeek’s R1 at a lower cost per input token.

    In a recent email to customers, OpenAI told users that developers in the first three tiers of its usage system must now complete an ID verification process to access the o3 model. The company determines these tiers based on how much users spend on OpenAI services. Access to o3’s reasoning summaries and streaming API also requires this verification.

    OpenAI previously stated that it introduced the ID verification process to prevent misuse of its services and to ensure users comply with its usage policies.


    Read the original article on: Techcrunch

    Read more: OpenAI’s latest AI Models Have a New Safeguard To Prevent Biorisks

  • Researchers Suggest OpenAI Trained its Models on Paywalled O’Reilly Books

    Image Credits:Jakub Porzycki/NurPhoto / Getty Images

    OpenAI has faced multiple accusations of using copyrighted content without permission to train its AI models. A new paper from the AI Disclosures Project, an organization focused on AI transparency, makes a serious claim that OpenAI has increasingly relied on non-public, unlicensed books to train its advanced AI models.

    AI models work as sophisticated prediction engines, trained on vast datasets like books, movies, and TV shows, to learn patterns and generate responses based on prompts. When a model “writes” an essay or “draws” an image, it’s simply drawing on its extensive training to approximate, rather than create something entirely new.

    While many AI labs, including OpenAI, have turned to AI-generated data to train models as they run out of real-world data, few have abandoned real-world sources altogether. Training exclusively on synthetic data could harm the model’s performance.

    AI Disclosures Project Suggests OpenAI Used Paywalled O’Reilly Books for GPT-4o Training

    The AI Disclosures Project, a nonprofit founded by media mogul Tim O’Reilly and economist Ilan Strauss, suggests in its paper that OpenAI likely used paywalled books from O’Reilly Media to train its GPT-4o model. O’Reilly Media, led by Tim O’Reilly, does not have a licensing agreement with OpenAI, according to the paper.

    The co-authors of the paper noted, “GPT-4o, OpenAI’s more advanced and capable model, shows a strong recognition of paywalled O’Reilly book content, especially when compared to the earlier GPT-3.5 Turbo model.” They added, “In contrast, GPT-3.5 Turbo shows greater recognition of publicly available O’Reilly book samples.”

    The paper utilized a method called DE-COP, first introduced in a 2024 academic study, which detects copyrighted content in language model training data. This “membership inference attack” tests whether a model can distinguish between human-authored texts and AI-generated paraphrases of the same content. If successful, it suggests the model may have encountered the text during training.
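    The quiz-style test behind DE-COP can be illustrated with a toy sketch. This is not the paper’s implementation; `paraphrase` and `pick_original` are hypothetical stand-ins for the paraphrase generator and the model under test:

    ```python
    import random

    def quiz_accuracy(excerpts, paraphrase, pick_original, n_options=4, seed=0):
        """Toy DE-COP-style quiz: for each excerpt, mix the verbatim text with
        paraphrases and ask the model to pick the original. Accuracy well above
        the 1/n_options chance level suggests the text was seen in training."""
        rng = random.Random(seed)
        correct = 0
        for text in excerpts:
            options = [text] + [paraphrase(text, i) for i in range(n_options - 1)]
            rng.shuffle(options)
            guess = pick_original(options)   # index chosen by the model under test
            correct += options[guess] == text
        return correct / len(excerpts)

    # Stand-in "model" that always recognizes the verbatim excerpt (here the
    # paraphrases are simply longer, so picking the shortest option finds it):
    excerpts = ["chapter one text", "chapter two text"]
    paraphrase = lambda t, i: f"reworded({t},{i})"
    oracle = lambda opts: min(range(len(opts)), key=lambda j: len(opts[j]))
    rate = quiz_accuracy(excerpts, paraphrase, oracle)  # 1.0; chance level is 0.25
    ```

    A model that never saw the excerpts should score near the 25% chance level; the paper’s finding is that GPT-4o scores far above chance on paywalled O’Reilly excerpts.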

    Co-authors Analyze OpenAI Models’ Knowledge of O’Reilly Media Books

    The paper’s co-authors—O’Reilly, Strauss, and AI researcher Sruly Rosenblat—examined the knowledge of GPT-4o, GPT-3.5 Turbo, and other OpenAI models regarding O’Reilly Media books, both before and after their training cutoff dates. They used 13,962 paragraph excerpts from 34 O’Reilly books to estimate the likelihood that a specific excerpt was included in the training data.

    The results showed that GPT-4o recognized far more paywalled O’Reilly book content compared to older models, particularly GPT-3.5 Turbo. This was true even when accounting for potential factors like newer models’ improved ability to identify human-authored text.

    The co-authors concluded, “GPT-4o likely recognizes, and thus has prior knowledge of, many non-public O’Reilly books published before its training cutoff date.”

    The co-authors are quick to clarify that their findings are not definitive evidence. They acknowledge that their experimental method isn’t foolproof and that OpenAI could have gathered paywalled book excerpts from users copying and pasting them into ChatGPT.

    Co-authors Did Not Evaluate OpenAI’s Latest Models

    Complicating matters further, the co-authors did not assess OpenAI’s latest models, including GPT-4.5 and “reasoning” models like o3-mini and o1. It’s possible that these newer models were not trained on paywalled O’Reilly books, or were trained on a smaller portion of such data compared to GPT-4o.

    That said, it’s well known that OpenAI has been actively seeking higher-quality training data and advocating for fewer restrictions on the use of copyrighted content. The company has even hired journalists to help refine its models’ outputs. This trend is reflected across the AI industry, with companies recruiting domain experts in fields like physics and other sciences to incorporate their knowledge into AI systems.

    It’s important to note that OpenAI does pay for at least some of its training data, with licensing agreements in place with news publishers, social networks, stock media libraries, and others. The company also provides opt-out mechanisms, albeit imperfect ones, allowing copyright holders to flag content they prefer not be used for training.

    Nevertheless, as OpenAI faces multiple lawsuits over its training data practices and as U.S. courts weigh how copyright law applies to them, the O’Reilly paper adds further scrutiny to the company’s approach.


    Read the original article on: TechCrunch

    Read more: OpenAI Intends to Launch a New Open AI Language Model in the Next Few Months

  • OpenAI Intends to Launch a New Open AI Language Model in the Next Few Months

    OpenAI Intends to Launch a New Open AI Language Model in the Next Few Months

    Image Credits: Mike Coppola / Getty Images

    OpenAI announced it plans to release its first “open” language model since GPT-2 in the coming months. This information comes from a feedback form published on the company’s website on Monday.

    OpenAI is inviting “developers, researchers, and members of the broader community” to fill out the form, which includes questions like, “What would you like to see in an open-weight model from OpenAI?” and “What open models have you used in the past?”

    “We’re excited to collaborate with developers, researchers, and the broader community to gather input and make this model as useful as possible,” OpenAI stated. “If you’re interested in joining a feedback session with the OpenAI team, please let us know [in the form] below.”

    OpenAI to Host Developer Events for Feedback and Model Previews

    OpenAI also plans to host developer events to collect feedback and preview model prototypes. The first event will be held in San Francisco in a few weeks, followed by additional sessions in Europe and the Asia-Pacific region.

    OpenAI is under increasing pressure from competitors like the Chinese AI lab DeepSeek, which have adopted an “open” approach to launching models. Unlike OpenAI, these competitors release their models to the AI community for experimentation and, in some cases, commercialization.

    This open approach has been highly successful for some organizations. Meta, which has heavily invested in its Llama family of open AI models, reported in March that Llama had surpassed 1 billion downloads. Meanwhile, DeepSeek has quickly built a large global user base and attracted significant attention from domestic investors.

    In a recent Reddit Q&A, OpenAI CEO Sam Altman admitted that the company may have been on the wrong side of the open-source debate.

    Altman Calls for Rethinking Open-Source Strategy

    “[I personally think we need to] figure out a different open-source strategy,” Altman said. “Not everyone at OpenAI shares this view, and it’s also not our current highest priority […] We will produce better models [going forward], but we will maintain less of a lead than we did in previous years.”

    Altman further elaborated on OpenAI’s open model plans in a post on X, explaining that the company’s upcoming open model would feature “reasoning” capabilities similar to OpenAI’s o3-mini.

    “[B]efore release, we will evaluate this model according to our preparedness framework, like we would for any other model,” Altman wrote. “[A]nd we will do extra work given that we know this model will be modified post-release […] [W]e’re excited to see what developers build and how large companies and governments use it where they prefer to run a model themselves.”

    Meanwhile, excerpts from a forthcoming book by Wall Street Journal reporter Keach Hagey, published over the weekend, suggest that Altman may have misled OpenAI executives about model safety reviews before his brief removal in November 2023.


    Read the original article on: TechCrunch

    Read more: OpenAI’s Latest Image Generator is Now Accessible to All Users