(objectives)
In the age of digital transformation, much of the information useful to organisations comes from unstructured data stored in office documents (Word, PowerPoint, and PDF), on the web (web pages, blogs, and online communities), or on social media. This information is crucial for understanding the environmental phenomena that affect the life of organisations. Most of it is stored as text and is unstructured. Moreover, it often shows the typical characteristics of big data: velocity, variety, and volume.
The course tackles the technical challenges that organisations face, and the operational tools available, to identify, retrieve, extract, and automatically analyse data from unstructured sources on the web and on social media. The course introduces students to the fundamental organisational processes (sense making, decision making, and knowing) and to the fundamentals of automatic analysis of textual data. The syllabus also includes theoretical knowledge of how web technologies work: web protocols, the HTML language, data interchange formats (CSV and JSON), and the main APIs for accessing data on social media platforms.
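As a small illustration of the two data interchange formats mentioned above, the following sketch reads a CSV fragment and re-encodes it as JSON using only the Python standard library. The sample data and the column names (`user`, `likes`) are invented for the example, and Python is used here only for illustration; it is not stated to be the course's tooling.

```python
import csv
import io
import json

# Invented sample data: a CSV fragment with a header row.
csv_text = "user,likes\nanna,3\nmarco,5\n"

# csv.DictReader maps each data row to a dict keyed by the header names.
rows = list(csv.DictReader(io.StringIO(csv_text)))

# CSV carries no type information, so numeric fields are converted by hand.
for row in rows:
    row["likes"] = int(row["likes"])

# JSON keeps the distinction between strings and numbers.
json_text = json.dumps(rows)
print(json_text)
```

The explicit `int` conversion shows the main practical difference between the two formats: CSV values are always read as text, while JSON preserves basic types.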
Students attending the course will work on the extraction, manipulation, and analysis of data from social media and websites. The technical and operational knowledge acquired during the course can also be used to analyse other types of unstructured data, such as reports, documents, or other data sources.
During the course, students will engage in both theoretical and practical learning, as individuals and in groups. Participation in the course will foster the following skills.
Knowledge and understanding: Understand the opportunities and limits of big data, unstructured data, and automatic analysis of textual data. Know the fundamental organisational processes and the role that information plays in them. Know the theoretical foundations and the technical procedures for automatic extraction and analysis of unstructured and textual data.
Applying knowledge and understanding: Recognise the limitations and the potential application domains of the techniques for extracting, manipulating, and analysing unstructured data. Recognise opportunities and risks in the use of information from unstructured sources, from the web, and from social media.
Making judgements: Understand whether, and which, data from unstructured sources (web and social media) could satisfy the information needs of people and organisations. Understand how to integrate structured and unstructured data sources to meet the information needs of people and organisations. Be able to assess the information contained in unstructured data, and its potential biases. Be able to interpret objectively the information obtained from the automatic analysis of textual data.
Communication skills: During the course, students will practise presenting, arguing, and debating in public the results of their analyses and their interpretations.
Learning skills: Be able to learn autonomously and in a self-managed way.
|
Code
|
119031 |
Language
|
ITA |
Type of certificate
|
Profit certificate
|
Credits
|
8
|
Scientific Disciplinary Sector Code
|
SECS-P/10
|
Contact Hours
|
48
|
Type of Activity
|
Related or supplementary learning activities
|
Teacher
|
BRACCINI Alessio Maria
(syllabus)
The course syllabus is divided into a theoretical part and a practical part. In the theoretical part, students will study in depth the organisational processes of sense making, decision making, and knowing. They will examine the technical problems, and the application potential, of automatic analysis of unstructured data and of textual analysis. They will also study further theoretical topics related to web technologies and social media.
In the practical part, students will develop a set of practical competences in the use of tools to collect, manipulate, and analyse data in order to meet the information needs of organisations. The practical part focuses on the following aspects:
- Web scraping and social media mining
- Data manipulation: cleaning, de-coding, re-coding, and transformation
- Basic operations of text mining: creation of text corpora, extraction of tokens, lexicons, roots, stems, n-grams, and DTM/TDM matrices
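The basic text mining operations listed above can be sketched in a few lines. The example below is a minimal illustration in Python using only the standard library; the function names (`tokenize`, `ngrams`, `document_term_matrix`) and the two sample documents are invented for the example, and the tokenizer is a deliberate simplification that keeps only unaccented Latin letters.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (a-z only)."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, n):
    """Return the n-grams of a token list as a list of tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def document_term_matrix(docs):
    """Build a DTM: one row per document, one column per vocabulary term."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[count[term] for term in vocab] for count in counts]
    return vocab, matrix


# Invented sample corpus of two short documents.
docs = ["The service was great", "Great price, poor service"]

vocab, dtm = document_term_matrix(docs)
bigrams = ngrams(tokenize(docs[0]), 2)

print(vocab)    # column labels of the matrix
print(dtm)      # term counts, one row per document
print(bigrams)  # 2-grams of the first document
```

Transposing `dtm` gives the corresponding term-document matrix (TDM), with one row per term and one column per document.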
(reference books)
To be defined
|
Dates of beginning and end of teaching activities
|
From to |
Delivery mode
|
Traditional
At a distance
|
Attendance
|
not mandatory
|
Evaluation methods
|
Oral exam
A project evaluation
|
Teacher
|
Margherita Emanuele Gabriel
(syllabus)
The course programme is divided into a theoretical part and a practical part. In the theoretical part, students will study in depth the organisational processes of sense making, decision making, and knowing. They will investigate the technical problems and the potential of unstructured data analysis and automatic text analysis. They will also study theoretical topics related to the functioning of web technologies and social media.
In the practical part, students will develop a set of practical skills in the use of tools for collecting, manipulating, and analysing data in order to meet the information needs of organisations. The practical part addresses the following aspects:
- Web scraping and social media mining
- Data manipulation: cleaning, coding, and transformation of data
- Basic operations of text mining: creation of corpora, extraction of tokens, lexicons, roots, lemmas, n-grams, and TDM/DTM matrices
- Advanced operations: semantic annotation, topic extraction, sentiment and emotion extraction from text, and classification
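As an illustration of one of the advanced operations listed above, the sketch below scores sentiment by summing word polarities from a lexicon and then classifies the text by the sign of the score. It is a minimal example in Python: the tiny `LEXICON` and the function names are invented for illustration, and real analyses rely on curated lexicons or trained classifiers and must handle negation, which this sketch ignores.

```python
import re

# Hypothetical toy lexicon mapping words to polarity values.
LEXICON = {"great": 1, "good": 1, "poor": -1, "bad": -1}


def sentiment_score(text, lexicon=LEXICON):
    """Sum the polarity of every known word; unknown words count as 0."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


def classify(text):
    """Map the numeric score to a coarse sentiment label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify("Great price, good service"))
print(classify("Great price, poor service"))
```

The second call shows a known limit of this approach: one positive and one negative word cancel out, so a mixed opinion is labelled the same way as a text with no opinion words at all.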
(reference books)
Books:
Simon Munzert, Christian Rubba, Peter Meißner, and Dominic Nyhuis. Automated Data Collection with R: A Practical Guide to Web Scraping and Text Mining. John Wiley & Sons, 2014. ISBN 978-1118834817
Bing Liu. Sentiment Analysis: Mining Opinions, Sentiments, and Emotions. Cambridge University Press, 2015. ISBN 978-1107017894
|
Dates of beginning and end of teaching activities
|
From to |
Delivery mode
|
Traditional
At a distance
|
Attendance
|
not mandatory
|
Evaluation methods
|
Oral exam
A project evaluation
|
|
|