[Article updated and reviewed 2015-11-15]

The Teseo database, published by the Ministry of Education, enables retrieval of information on doctoral theses defended at Spanish universities since 1976. The information provided by this online resource has fostered the development of numerous bibliometric and scientometric studies on the state of research in Spain across various fields of knowledge and specialties.

To gauge the impact Teseo has had on research, a search was conducted in Google Scholar using the query (intitle:teseo OR intext:teseo) AND ("database"). It returned 1,690 results whose distribution over time shows an upward trend, indicating growing direct and indirect citation of the database.

Fig.1. Evolution of direct and indirect references to the Teseo database in academic-scientific publications retrieved from Google Scholar [Source: author's own] [Consulted on 2015-11-04]


Given the importance of this resource for future research on the scientific output of doctoral theses in a specific specialty or field of knowledge, a method has been developed for retrieving doctoral thesis records from the Teseo database. The method combines web scraping techniques with the Mbot crawling engine.

Fig.2. Sample of doctoral thesis records collected in the Teseo database


Each permalink in the database has been analyzed with XPath expressions and regular-expression patterns to extract the main data from each registered doctoral thesis. The full title, author, originating university, defense date, thesis supervisors, committee members, descriptors, and abstract are collected automatically and prepared for export in three formats: SQL, CSV (Comma Separated Values), and a CSV variant compatible with MS Excel. All of them are available from the SourceForge repository.
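The extraction step described above can be sketched in Python. This is only an illustration, not the Mbot implementation: the `SAMPLE` markup, the field labels, and the helper names `extract_record` and `to_csv` are all hypothetical, and the real Teseo pages are structured differently. The sketch uses the limited XPath subset of the standard library's `xml.etree.ElementTree`, a regular expression to normalize the date, and the `csv` module for export.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET

# Hypothetical, simplified record markup; the real Teseo pages differ.
SAMPLE = """
<div class="record">
  <ul>
    <li><b>Title:</b> Bibliometric analysis of doctoral theses</li>
    <li><b>Author:</b> Jane Doe</li>
    <li><b>University:</b> Universidad de Ejemplo</li>
    <li><b>Defense date:</b> 14/11/2014</li>
  </ul>
</div>
"""

# Matches dates written as dd/mm/yyyy (an assumed input format).
DATE_RE = re.compile(r"(\d{2})/(\d{2})/(\d{4})")


def extract_record(markup: str) -> dict:
    """Turn one record fragment into a dict of field name -> value."""
    root = ET.fromstring(markup.strip())
    record = {}
    # ".//li" is an XPath expression selecting every list item in the fragment.
    for item in root.findall(".//li"):
        label = item.find("b")
        if label is None or label.text is None:
            continue
        key = label.text.strip().rstrip(":").lower().replace(" ", "_")
        # The value is the text that follows the closing </b> tag.
        record[key] = (label.tail or "").strip()
    # Use a regular expression to rewrite the date as ISO yyyy-mm-dd.
    match = DATE_RE.fullmatch(record.get("defense_date", ""))
    if match:
        day, month, year = match.groups()
        record["defense_date"] = f"{year}-{month}-{day}"
    return record


def to_csv(records: list) -> str:
    """Serialize a list of record dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


record = extract_record(SAMPLE)
print(record["title"])
print(to_csv([record]))
```

Real thesis pages are rarely well-formed XML, so a production scraper would more likely rely on a tolerant HTML parser; the standard library is used here only to keep the sketch self-contained.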

Downloads of Teseo v1.1 [2015-11-14]

  1. Download Teseo v1.1 CSV (Complete Data)
  2. Download Teseo v1.1 CSV for MS Excel (Complete Data)
  3. Download Teseo v1.1 SQL (Structure and Data – Complete)
  4. Download Teseo v1.1 SQL (Structure Only)
  5. Download Teseo v1.1 SQL (Data Only – Complete)
  6. Download Teseo v1.1 SQL (Data Only) Part 01
  7. Download Teseo v1.1 SQL (Data Only) Part 02
  8. Download Teseo v1.1 SQL (Data Only) Part 03
  9. Download Teseo v1.1 SQL (Data Only) Part 04
  10. Download Teseo v1.1 SQL (Data Only) Part 05
  11. Download Teseo v1.1 SQL (Data Only) Part 06
  12. Download Teseo v1.1 SQL (Data Only) Part 07
  13. Download Teseo v1.1 SQL (Data Only) Part 08
  14. Download Teseo v1.1 SQL (Data Only) Part 09
  15. Download Teseo v1.1 SQL (Data Only) Part 10
  16. Download Teseo v1.1 SQL (Data Only) Part 11
  17. Download Teseo v1.1 SQL (Data Only) Part 12
  18. Download Teseo v1.1 SQL (Data Only) Part 13
  19. Download Teseo v1.1 SQL (Data Only) Part 14
  20. Download Teseo v1.1 SQL (Data Only) Part 15
  21. Download Teseo v1.1 SQL (Data Only) Part 16
  22. Download Teseo v1.1 SQL (Data Only) Part 17
  23. Download Teseo v1.1 SQL (Data Only) Part 18
  24. Download Teseo v1.1 SQL (Data Only) Part 19
  25. Download Teseo v1.1 SQL (Data Only) Part 20

Import Teseo into AMP (Apache, MySQL, PHP)

The Teseo database can be imported into any distribution based on Apache, MySQL, and PHP, such as XAMPP, WAMP, EasyPHP, or AMPdoc, provided it includes the phpMyAdmin database management tool to facilitate data migration. In addition, PHP must be specially configured through the «php.ini» file to enable unlimited script execution, increase the memory limit, and raise the maximum file size for imports. The recommended configuration and the steps for importing through phpMyAdmin are shown below.

PHP Configuration (php.ini file)

  max_input_time = -1
  memory_limit = 4028M
  post_max_size = 500M
  upload_max_filesize = 500M
  max_file_uploads = 20
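As a rough aid for reading these values, the following Python sketch converts PHP's shorthand sizes (K, M, G, which are powers of 1024) into bytes and checks whether a file of a given size would fit under the upload limits. The helper names `to_bytes` and `fits_upload_limits` are hypothetical, and the check is a simplification: it ignores the small overhead of the POST request itself and does not handle special values such as -1.

```python
import re

# Size-related values copied from the recommended php.ini configuration.
RECOMMENDED = {
    "memory_limit": "4028M",
    "post_max_size": "500M",
    "upload_max_filesize": "500M",
}

# PHP shorthand units are binary multiples.
UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(value: str) -> int:
    """Convert a PHP shorthand size such as '500M' into bytes."""
    match = re.fullmatch(r"(\d+)\s*([KMG]?)", value.strip().upper())
    if match is None:
        raise ValueError(f"unsupported size value: {value!r}")
    number, unit = match.groups()
    return int(number) * UNITS[unit]


def fits_upload_limits(file_size: int, settings: dict) -> bool:
    """Rough check: the file must fit under both upload-related limits."""
    limit = min(
        to_bytes(settings["upload_max_filesize"]),
        to_bytes(settings["post_max_size"]),
    )
    return file_size <= limit


print(to_bytes("500M"))  # 524288000
# A hypothetical 350 MB SQL file fits; a hypothetical 600 MB file does not.
print(fits_upload_limits(350 * 1024 ** 2, RECOMMENDED))
print(fits_upload_limits(600 * 1024 ** 2, RECOMMENDED))
```

If a file exceeds the limit, the SQL files split into parts offer an alternative to raising the limits further.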

Steps for Importing Teseo via phpMyAdmin

  1. Create an empty database named "teseo". It is created without tables, ready to receive the Teseo structure and data.
  2. Import the data using one of the following two methods:
     a. Structure and Data – Complete. From the «Import» option, select the previously downloaded file «catalogoteseo-estructuraydatos.sql». Then click the «Continue» button to start the import. The process may take several minutes. When it finishes, all Teseo information has been loaded and the database is ready for use.
     b. Teseo SQL in Parts. From the «Import» option, select the previously downloaded file «catalogoteseo-part01.sql». Then click the «Continue» button to start the import. This step automatically creates the table with the field structure needed for the data and then loads the first of the 20 available batches of records (Parts 01 to 20 in the download list). Repeat the import with each subsequent part, in order, until the migration is complete.
  3. Verify the import. Check that a total of 132,378 records, each corresponding to a doctoral thesis, have been imported.
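The same total can also be checked against the CSV download before or after importing. The sketch below is a minimal example under stated assumptions: it presumes a comma-delimited file with a header row, and the file name in the commented usage is hypothetical. It counts records with the `csv` module rather than counting lines, because quoted fields such as abstracts may contain line breaks.

```python
import csv
import io

EXPECTED_RECORDS = 132_378  # total stated in the verification step


def count_records(handle, has_header=True, delimiter=","):
    """Count CSV records, treating quoted multi-line fields as one record."""
    reader = csv.reader(handle, delimiter=delimiter)
    if has_header:
        next(reader, None)  # skip the header row if there is one
    return sum(1 for row in reader if row)


# Small made-up sample: the first abstract spans two physical lines.
SAMPLE_CSV = (
    "id,title,abstract\n"
    '1,"First thesis","Line one\nline two"\n'
    '2,"Second thesis","Short abstract"\n'
)
print(count_records(io.StringIO(SAMPLE_CSV)))  # 2

# Usage with a real download (file name and encoding are assumptions):
# with open("teseo.csv", newline="", encoding="utf-8") as handle:
#     total = count_records(handle)
#     print(total == EXPECTED_RECORDS)
```

Inside MySQL, the equivalent check is a row count on the imported table, which phpMyAdmin also displays in its table overview.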

Fig.3. phpMyAdmin import screen. Note that the file size limit is 500 MB, which allows for a successful import of TESEO.


List of TESEO Articles

  1. Catalog of Spanish Doctoral Theses TESEO Available for Download
  2. TESEO Database. Initial Data
  3. How TESEO Data Were Obtained, Aspects to Consider, and New Actions
  4. Update of TESEO Data
  5. Doctoral Theses in Spanish Universities during the Period 1977–2014