What ChatGPT Does Not Do and Multimed-IA Documentation

It is a pleasure to share our next ConocimIA event, on Artificial Intelligence and Documentation. This time, we present a unique session dedicated to exploring the frontiers of artificial intelligence.

Date: February 23, 2024 / 5:00-7:00 PM
Location: Conference Room, Faculty of Documentation Sciences, UCM
Admission: Free, subject to room capacity

Part One: What ChatGPT Does Not Do

Speakers: Prof. María Antonia Ovalle Perandones and Prof. Manuel Blázquez Ochando

The opening session addresses a question that, paradoxically, has received less attention than the celebrated capabilities of generative models: What can ChatGPT not do? Beyond widely adopted uses—such as text composition, problem-solving, and code generation—it is equally relevant to identify the limitations of these tools, both for responsible professional use and for a deeper understanding of their technical nature.

Professor María Antonia Ovalle, Director of the Department of Library and Information Science at the Complutense University of Madrid, presents an empirical analysis based on concrete case studies. Her talk begins with an observation: although ChatGPT and other AI systems are presented as nearly perfect allies, practical experience reveals significant limitations that warrant systematic documentation.

Limitations Identified in Practice

1. Inability to work with proprietary formats or specific environments.

In a test involving Greenstone—a free and open-source system for digital libraries—the model limited itself to offering general guidance on the steps to follow but was unable to interact directly with the software or convert digital objects into the required format. When asked to convert a text file into PDF, the response was similar: it provided instructions for performing the task manually, acknowledging its inability to generate files outside the conversational environment.

2. Limitations in the grading of academic assignments.

In a case involving the evaluation of Markdown-formatted assignments, ChatGPT partially identified errors—such as missing spaces after headings, incorrect list formatting, and issues with the table—but overlooked others, including improper use of italics, strikethroughs, or citations. The proposed grade (8/10) was excessively lenient given the detected errors. The lesson here is that the model tends to "conceal" its ignorance rather than openly acknowledge its limitations.
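Some of the formatting errors mentioned in this case can be checked deterministically, without a language model. The following is a minimal sketch, assuming the assignments are plain Markdown text. The function name find_markdown_issues and the two rules it applies are illustrative assumptions, not the rubric used in the session.

```python
import re

# Illustrative rules only (assumptions, not the rubric used in the session).
# A heading marker (# to ######) immediately followed by text, e.g. "#Title".
HEADING_WITHOUT_SPACE = re.compile(r"^#{1,6}[^#\s]")
# A "-", "+" or "1." list marker immediately followed by text, e.g. "-item".
# "*" is deliberately ignored because "*word*" is valid emphasis.
LIST_ITEM_WITHOUT_SPACE = re.compile(r"^\s*(?:[-+]|\d+\.)[^\s\-+.\d]")


def find_markdown_issues(text):
    """Return (line_number, message) pairs for two simple formatting errors."""
    issues = []
    in_code_fence = False
    for number, line in enumerate(text.splitlines(), start=1):
        # Skip fenced code blocks, where "#" and "-" are not Markdown markers.
        if line.lstrip().startswith("```"):
            in_code_fence = not in_code_fence
            continue
        if in_code_fence:
            continue
        if HEADING_WITHOUT_SPACE.match(line):
            issues.append((number, "missing space after heading marker"))
        elif LIST_ITEM_WITHOUT_SPACE.match(line):
            issues.append((number, "missing space after list marker"))
    return issues
```

A check of this kind covers only mechanical errors. Judging the use of italics, strikethroughs, or citations against an assignment brief would still require a human grader or a much richer rule set.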

3. Confusion between information generation and retrieval.

In an exercise requiring the retrieval of authority URIs from sources such as VIAF, Wikidata, or ISNI, ChatGPT returned a mix of correct and incorrect identifiers. The most significant case involved "José López," for which it supplied data corresponding to José López Portillo (a Mexican politician), without acknowledging the ambiguity of the name. As one of the attendees noted, the model behaves like "a child: instead of admitting it cannot or does not know, it conceals."
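The difference between retrieval and generation can be made concrete by querying the authority source directly and treating several candidates as an ambiguity to be resolved by a person. The sketch below is one possible approach, not the procedure shown in the session. It uses the wbsearchentities action of the Wikidata API, and the function names build_search_url, fetch_candidates, and resolve_authority are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_search_url(name, language="es", limit=10):
    """Build a wbsearchentities request URL for a personal name."""
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": language,
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)


def fetch_candidates(name, language="es"):
    """Query Wikidata and return the list of candidate entities (network call)."""
    request = Request(
        build_search_url(name, language),
        headers={"User-Agent": "authority-lookup-sketch/0.1"},  # hypothetical agent name
    )
    with urlopen(request, timeout=10) as response:
        return json.load(response).get("search", [])


def resolve_authority(candidates):
    """Return a URI only when the match is unambiguous; otherwise list the options."""
    if not candidates:
        return {"status": "not_found", "uri": None, "options": []}
    if len(candidates) > 1:
        options = [
            (c.get("concepturi"), c.get("label"), c.get("description"))
            for c in candidates
        ]
        return {"status": "ambiguous", "uri": None, "options": options}
    return {"status": "unique", "uri": candidates[0].get("concepturi"), "options": []}
```

For a name as common as "José López", resolve_authority(fetch_candidates("José López")) would be expected to report an ambiguity and leave the choice to the cataloguer, which is the behavior the model failed to show.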

4. Inability to generate content that violates ethical guidelines.

When asked to draft a message discrediting the opinions of researchers of another gender, ChatGPT explicitly refused the request, emphasizing the importance of respectful communication and the diversity of perspectives in research. This behavior reflects the layers of safety mechanisms incorporated during training with reinforcement learning from human feedback (RLHF).

5. Limited temporal knowledge base.

In response to a request for a summary of the war in Ukraine during 2023–2024, ChatGPT disclosed that its knowledge extended only up to January 2022 and could not anticipate subsequent events. This limitation, inherent to models trained on a frozen corpus, underscores the need to supplement AI with up-to-date sources when temporal context is relevant.
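A simple pre-check can flag requests whose time span falls outside what a model with a frozen corpus can cover. This is only a sketch under stated assumptions: the function name covers_period is hypothetical, and "January 2022" is read here as the last day of that month.

```python
from datetime import date


def covers_period(knowledge_cutoff, period_start, period_end):
    """Classify how much of a requested period precedes the knowledge cutoff."""
    if period_start > period_end:
        raise ValueError("period_start must not be later than period_end")
    if period_end <= knowledge_cutoff:
        return "full"      # the model may know the whole period
    if period_start > knowledge_cutoff:
        return "none"      # the whole period lies after the cutoff
    return "partial"       # only the early part of the period is covered


# Assumption: "January 2022" interpreted as 31 January 2022.
cutoff = date(2022, 1, 31)
coverage = covers_period(cutoff, date(2023, 1, 1), date(2024, 2, 23))
# coverage is "none": the request needs up-to-date external sources.
```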

6. Restrictions on financial advice.

In a consultation regarding investment of savings, ChatGPT systematically refused to provide specific recommendations, emphasizing the need for personalized financial advice. Although it eventually suggested two generic investment funds and mentioned the existence of robo-advisors, it maintained a cautious tone that reveals both regulatory limitations and the model’s awareness of its own constraints in sensitive domains.

Reflections on the Limits

The combined analysis of these cases allows us to draw several conclusions:

Not everything can be resolved with ChatGPT. Certain tasks require interaction with external environments, proprietary formats, or actions outside the conversational scope.
The clarity of the prompt is decisive. If the expressed idea is vague, poor results are obtained; if it is clear, comprehensive, and well-structured, the outcomes improve.
There are factors beyond the user's control. The AI's context, its knowledge base, the underlying model, its training, and its token window all condition the responses.
Vagueness, ambiguity, and lack of procedural guidance are AI's worst enemies. Complex processes require a sequential order in instructions.
The documentalist must learn to design prompts. Conversing with AI, questioning it, and posing relevant questions become essential competencies.
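The points above about clarity and sequential order can be illustrated with a small prompt template. This is a sketch, not a format proposed by the speakers: the function build_prompt and its field names (role, objective, context, steps, output_format) are assumptions chosen for illustration.

```python
def build_prompt(role, objective, context, steps, output_format):
    """Assemble a prompt with an explicit, numbered sequence of steps."""
    if not steps:
        raise ValueError("at least one step is required")
    # Number the steps so that the order of a complex process is unambiguous.
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return (
        f"Role: {role}\n"
        f"Objective: {objective}\n"
        f"Context: {context}\n"
        f"Steps, to be followed in order:\n{numbered}\n"
        f"Output format: {output_format}"
    )


prompt = build_prompt(
    role="Teaching assistant for a documentation course",
    objective="Review a Markdown assignment",
    context="First-year students; the assignment must contain headings, a list, and a table",
    steps=[
        "List every formatting error with its line number",
        "Explain each error in one sentence",
        "Say explicitly which aspects you could not verify",
    ],
    output_format="A numbered list in plain text",
)
```

Keeping each element in a named field makes vague or missing parts of the request visible before it is sent.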

The Importance of Recording Prompts

As experience with AI grows, "micro-automations" emerge that are worth saving for reuse. The prompts that yield the best results can be categorized and linked to specific contexts, objectives, and problems, which helps make AI a more intelligent piece of software. In this sense, documenting one’s own prompts is a practice that "is worth its weight in gold."
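One way to keep such a record is a small prompt library that stores each prompt with its context, objective, and tags, and that can be saved as JSON. The sketch below is an assumption about how this could be organized. The names PromptRecord and PromptLibrary and their fields are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PromptRecord:
    """A saved prompt together with the situation in which it worked well."""
    title: str
    prompt: str
    context: str
    objective: str
    tags: list = field(default_factory=list)


class PromptLibrary:
    """In-memory collection of prompt records with JSON import and export."""

    def __init__(self):
        self._records = []

    def add(self, record):
        self._records.append(record)

    def find_by_tag(self, tag):
        """Return every record labeled with the given tag."""
        return [r for r in self._records if tag in r.tags]

    def to_json(self):
        """Serialize all records, keeping accented characters readable."""
        return json.dumps([asdict(r) for r in self._records],
                          ensure_ascii=False, indent=2)

    @classmethod
    def from_json(cls, text):
        """Rebuild a library from the output of to_json."""
        library = cls()
        for item in json.loads(text):
            library.add(PromptRecord(**item))
        return library
```

The JSON text returned by to_json can be written to a file and shared, so that prompts that worked in one context can be reused or adapted in another.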

Specialized GPTs

OpenAI’s GPT platform functions as an "App Store for Artificial Intelligence," where any registered user can create specialized AIs for specific cases and situations. This evolution toward specialization represents one of the most relevant trends for information professionals, who can design assistants tailored to their specific needs without requiring advanced programming knowledge.

Part Two: Multimed-IA Documentation

Contributors: Prof. Alfonso López Yepes, Víctor Villapalos Pardiñas, and Prof. Manuel Blázquez Ochando

The second part of the session addresses the impact of artificial intelligence on multimedia documentation, a field encompassing the creation, management, and retrieval of audiovisual content.

Perspective of Professor Alfonso López Yepes

Professor Alfonso López Yepes, a leading authority in audiovisual documentation, provides a comprehensive overview of the intersection between AI and multimedia. His presentation revolves around several key ideas:

The current context. Generative AI is radically transforming the production of audiovisual content, with tools capable of generating images, videos, and audio whose quality is becoming hard to distinguish from reality.
The role of professionals. Traditional multimedia documentation, focused on cataloging, classifying, and retrieving audiovisual materials, is being challenged by systems capable of generating content on demand, yet also enriched by new capabilities for analysis and information extraction.
Relevant Resources. López Yepes shares references to projects such as REDAUVI (University Audiovisual Heritage Network), the Multimedia Documentation Service of the UCM, and various initiatives for preserving Ibero-American film heritage that can benefit from AI technologies.

Practical Tools and Applications

Víctor Villapalos, Managing Director of SEDIC, presents an overview of the main AI platforms applied to multimedia creation, categorized by domain:

Aggregation Platforms. Sites such as Futurepedia, AI Findy, or Toolify collect and categorize AI-based projects, facilitating the exploration and discovery of new tools.
Video editing and creation. RunwayML enables video generation from textual or image prompts, with parameter and effect editing. Visla, targeted at businesses, facilitates the creation of dialogues, voiceovers, and summaries. Fliki.ai generates videos with images, music, and automatically adapted social media text. OpenAI’s recent unveiling of Sora marks a milestone in realistic video generation from textual instructions.
Avatars and voice synthesis. Platforms such as HeyGen, Synthesia, or Bhuman enable the creation of avatars that speak with synthesized voices, opening possibilities for institutional communication, training, and content dissemination.
Image Editing and Creation. DALL-E 3, Midjourney, and Leonardo.ai are the most well-known tools for generating images from textual descriptions. Microsoft Designer and Canva have integrated AI capabilities into their graphic design platforms. Complementary tools such as Remove.bg (background removal), Mokker.ai (advertising photomontages), or Krea.ai (resolution enhancement) expand the possibilities for editing.
Logo and visual identity creation. Namelix generates business names and logos, while Brandmark specializes in visual identity design.
Audio and music. ElevenLabs enables high-fidelity voice synthesis, including cloning existing voices. Suno.ai generates complete songs from a single prompt. Summarize.tech extracts transcripts and summaries from audiovisual content.
Chat and conversational AI. In addition to ChatGPT, alternatives such as Gemini (Google), Poe (for personal bots), and LM Studio are available; the latter allows local installation of AI models without requiring an internet connection.
Diverse applications. TinyWow offers generic tools for editing and format conversion. AgentGPT functions as an autonomous agent that plans and executes steps to achieve a final objective.

Contribution by Professor Manuel Blázquez Ochando

The closing of the session, led by Professor Blázquez, synthesizes key reflections on the impact of AI on multimedia documentation:

Multimedia with AI. The convergence of artificial intelligence and multimedia content enables the automation of creation, editing, and post-production processes that previously required specialized teams and lengthy execution times.
Some facts. AI-generated imagery has progressed from experimental results to commercially viable productions in less than two years. Demand for these tools is growing exponentially, with platforms accumulating millions of users within their first months of operation.
Impact on Multimedia Documentation. The discipline is undergoing an unprecedented transformation. Traditional tasks such as cataloging and classification can now be automated, freeing professionals to focus on higher-value functions. At the same time, new needs are emerging: curating AI-generated content, evaluating its quality and veracity, and integrating these materials into documentary workflows.
How far can we go? Technical evolution suggests that multimedia content generation will become increasingly realistic, faster, and more customizable. Tools such as Sora point toward a future in which any user can generate high-quality videos using natural language instructions.

Conclusions of this section

Advances in AI and multimedia are highly promising. It is now possible to automate nearly all image, audio, and video creation.
The quality of these creations approaches a realism that is difficult to distinguish from reality.
Never before has Multimedia Documentation had access to so many tools for creating and producing content.
The production of multimedia content is being democratized.
AI represents a revolution that will transform how we conceive Documentation and the role of information professionals.
High automation based on prompts and AI-driven processes.
Impact on employment: it is possible to achieve more with less.
Predictable and unpredictable consequences.

Open Questions

The session concludes with a space for debate on issues that do not admit unequivocal answers:

Opportunity or threat. Do the advantages predominate over the disadvantages?
Tool or replacement. Is AI an extension of our capabilities or their substitute?
Progress or regression. Are we moving toward the future or toward the end of human intervention?
Human factor or algorithmic factor. What relevance will each have in the future?
Automated generation or creativity. What value does each contribute?
Future, prediction, predictability, originality, novelty, competition. How will these concepts be reconfigured in an ecosystem dominated by AI?

This conference is part of the activities of the ConocimIA Seminar, a space dedicated to monitoring and analyzing artificial intelligence in the field of Documentation Sciences.

Conference Materials

The materials used in this session are available for download in DOCX, PPTX, and PDF formats. They capture the ideas, references, and open questions raised throughout the conference and may serve as a starting point for further exploration of the topics discussed or for use in educational contexts, with proper attribution.

Ovalle-Perandones, M.A. (2024). What ChatGPT Does Not Do. conocimIA_maovalle_2024-02-23_lo-que-no-hace-ChatGPT.docx
Blázquez-Ochando, M. (2024). What ChatGPT Does Not Do. conocimIA_mblazquez_2024-02-23_lo-que-no-hace-ChatGPT.pptx
López-Yepes, A. (2024). Multimedia Documentation and Artificial Intelligence. conocimIA_alopezyepes_2024-02-23_documentacion-multimedia-inteligencia-artificial.ppt
Villapalos-Pardiñas, V. (2024). Uses of Artificial Intelligence. conocimIA_vvillapalos_2024-02-23_usos-inteligencia-artificial.pptx
Blázquez-Ochando, M. (2024). Multimed-IA. conocimIA_mblazquez_2024-02-23_multimedia-IA.pptx | Demonstration Tests. conocimIA_mblazquez_2024-02-23_multimedia-IA_pruebas.zip