News

  1. https://www.youtube.com/live/poRq8sDqzMg

Opinion

On November 6, 2023, OpenAI hosted its first "DevDay" under the leadership of Sam Altman. The event unveiled a cascade of announcements that, far from being mere incremental updates, point toward a profound transformation in the relationship between language models, developers, and ultimately, end users. From my perspective, these developments warrant careful analysis, as they bring to the fore both unprecedented opportunities and new fronts of responsibility.

Chief among the announcements was a new version of the flagship model, "GPT-4 Turbo," whose context window of 128,000 tokens (roughly 300 pages of a standard book) is sixteen times larger than that of its predecessor. The new version also improves accuracy in prolonged conversations and adds a JSON mode that ensures the model's responses are syntactically valid JSON.

From a documentary perspective, this expansion of the context window represents an extraordinary qualitative leap. In previous models, the token limit imposed a severe constraint on the ability to process lengthy documents, maintain coherence in complex conversations, or perform comparative analysis across multiple sources. With 128,000 tokens, a system can now process entire works, bulky technical reports, or extensive clinical histories in a single pass. This makes GPT-4 Turbo an unprecedentedly powerful tool for documentary analysis, capable of tackling tasks such as synthesis, information extraction, and reasoning over corpora that previously required complex fragmentation and subsequent reassembly.
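To make the JSON mode concrete, the sketch below builds the shape of a Chat Completions request with `response_format={"type": "json_object"}` and then parses a sample response. This is an illustration only, not a live API call: the model identifier `gpt-4-1106-preview` is the GPT-4 Turbo preview name used at the time, and the document text and sample output are invented placeholders.

```python
import json

# Shape of a Chat Completions request using the JSON mode announced at
# DevDay. The payload is built locally; no network call is made here.
request = {
    "model": "gpt-4-1106-preview",  # GPT-4 Turbo preview identifier
    "response_format": {"type": "json_object"},  # force valid JSON output
    "messages": [
        {
            "role": "system",
            "content": "List the report titles found in the document as "
                       "JSON with keys 'titles' and 'count'.",
        },
        {
            "role": "user",
            "content": "(full document text, up to 128,000 tokens)",
        },
    ],
}

# Under JSON mode, the response can be parsed directly, with no regex
# scraping or repair step. A hypothetical model output:
sample_output = '{"titles": ["Report A", "Report B"], "count": 2}'
record = json.loads(sample_output)
print(record["count"])  # the structured fields are immediately usable
```

For documentary workflows, this is the practical payoff: extraction results arrive as machine-readable records rather than free text that must be re-parsed.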

Moreover, GPT-4 Turbo will be integrated into ChatGPT, simplifying the user experience by eliminating the need to select a specific model. OpenAI also introduced tools that enable enterprises to customize the GPT-4 experience according to their needs without having to build a model from scratch. This move toward integration and customization carries profound implications. On one hand, the disappearance of explicit model selection simplifies the interface and brings the technology closer to non-technical users, contributing to the democratization of access. On the other hand, enterprise customization tools (which allow adjusting the model's behavior without training entire architectures from scratch) open the door to widespread adoption in sectors such as banking, healthcare, or public administration. However, as I have noted in previous articles, this same ease of adaptation carries risks if not accompanied by adequate controls: Who verifies that customizations respect data privacy? How do we ensure that models tailored for specific sectors do not perpetuate biases or violate sector-specific regulations?

On the image generation and speech processing front, the new Assistants API entered beta, offering more human-like voice options and the ability to work with real-time data once connected to the Internet. The integration of these multimodal capabilities and real-time internet connectivity reinforces the trend we previously noted when discussing GPT-5's webcrawler: language models are evolving into ubiquitous information systems capable of operating with any format and maintaining continuous dialogue with the digital environment. From the perspective of information retrieval, this represents a significant advancement, bringing closer the ideal of a universal query system capable of integrating structured and unstructured data, images, and voice within a single conversational flow.
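The Assistants API idea can be sketched as a configuration payload: an assistant bound to a model and equipped with tools for document search and code execution. The tool names below (`retrieval`, `code_interpreter`) follow the DevDay beta announcement; the assistant name and instructions are invented for illustration, and no live API call is made.

```python
# Shape of an Assistants API configuration: a persistent assistant with
# tools attached, rather than a single stateless completion request.
assistant_payload = {
    "name": "document-analyst",        # hypothetical assistant name
    "model": "gpt-4-1106-preview",     # GPT-4 Turbo preview identifier
    "instructions": "Answer questions by citing the attached documents.",
    "tools": [
        {"type": "retrieval"},         # search over uploaded files
        {"type": "code_interpreter"},  # run code, e.g. for tabular analysis
    ],
}

tool_types = [tool["type"] for tool in assistant_payload["tools"]]
print(tool_types)
```

The design shift is worth noting: instead of the developer orchestrating retrieval and tool use around the model, the assistant itself carries that configuration, which is precisely what moves these systems toward the "universal query system" described above.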

But perhaps the announcement that deserves the most attention from a legal and ethical perspective is the one addressing an issue we have consistently highlighted in this series of articles: copyright infringement. OpenAI tackled this concern with the "Copyright Shield" initiative, under which the company commits to defending its customers and assuming the legal costs of claims arising from copyright infringement. This initiative is, at the very least, a significant gesture. Until now, one of the primary barriers to enterprise adoption of generative models has been the legal uncertainty surrounding the use of copyrighted materials in training datasets and generated outputs. The Copyright Shield does not resolve the underlying issue (whether training on protected data constitutes infringement remains under litigation in multiple jurisdictions), but it does provide a safety umbrella for developers building on OpenAI's platform. From my perspective, this move reflects the company's growing awareness that the long-term commercial viability of generative AI hinges on satisfactorily addressing tensions with copyright holders, publishers, and content creators.

I believe we are at a foundational moment. The decisions we make now about how to govern these technologies—in terms of transparency, accountability, and respect for fundamental rights—will determine whether this leap in power translates into progress for the knowledge society or, conversely, opens a phase of legal and social conflict that undermines its potential. The DevDay announcements are impressive. But the true measure of success will not be the length of the context window, but the ability of industry and regulators to build an ecosystem where such immense power is accompanied by equal responsibility.