On May 29, 2018, the workshop «Expert Search in Google and Scraping Techniques» took place at the Universidad Carlos III de Madrid, reflecting the growing interest in mining data and content from search engines. It follows the publication of the book «Expert Search Strategies in Google» and the data extraction tests «Google Scraping» and «Web Scraping in Google Finance». The rapid advance of information technologies is pushing documentation professionals to improve their knowledge and skills in using digital tools to extract information from the Web. It is equally essential to develop applications that allow data mining to be customized and adapted to each specific source and resource. The workshop gives an overview of searching for information with advanced query operators and of web scraping techniques, and ends by showing how to apply these techniques to search engines such as Google. The workshop program is as follows:
PART 1 – Expert Search in Google
- Expert search strategies in search engines
- Search operators
- RESTful queries
- Examples of expert search
- Applications and process automation
PART 2 – Scraping Techniques
- Introduction to parser programs and the scraping technique
- How the scraping method works
- Technologies involved in scraping
- First approach with LinkKlipper
PART 3 – Practical Exercises
- The first parser
- Using an XML parser
- Using an HTML parser
- Methods for downloading HTML code from a webpage
- Extracting data from a webpage
- Extracting news from a digital newspaper
- Extracting Webometrics information resources
- Extracting results from a simple Google query
- Extracting results from an advanced Google query
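To give a flavor of what the program covers, here is a minimal sketch in Python of two of the ideas listed above: composing a query with search operators and pulling links out of HTML with a parser. It is not the workshop software. The function and class names (`build_search_url`, `LinkExtractor`, `extract_links`) are hypothetical, the choice of operators (`site:`, `filetype:`, `intitle:`) is only an illustration, and the sketch assumes the HTML has already been downloaded by some other means. It makes no assumption about the markup of any real results page.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


def build_search_url(terms, site=None, filetype=None, intitle=None):
    """Combine plain terms and optional operators into one query URL.

    Hypothetical helper: it only builds the URL string and sends nothing.
    """
    parts = [terms]
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    if intitle:
        parts.append(f'intitle:"{intitle}"')
    # urlencode percent-encodes the operators and joins words with "+".
    return "https://www.google.com/search?" + urlencode({"q": " ".join(parts)})


class LinkExtractor(HTMLParser):
    """Collect (href, anchor text) pairs from every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Parse an HTML string and return the links found in it."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links
```

For example, `build_search_url("web scraping", site="example.org", filetype="pdf")` produces a URL whose `q` parameter carries the whole operator expression, and `extract_links` can be run on any HTML string. A real results page would need selectors specific to its markup, which changes over time, and any automated querying should respect the terms of service of the search engine.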
Download the workshop software
If you attend the workshop and have a VIP code, you can download the trial software I have prepared for the occasion. To do so, enter your code in the form below and access the manual download page.