ELEMENTS OF CODING FOR TEXT MINING
cod. 1012479

Academic year 2024/25
3rd year of course - Second semester
Professor
Gianfranco LOMBARDO
Academic discipline
Information processing systems (ING-INF/05)
Field
Related or supplementary educational activities
Type of training activity
Related/supplementary
30 hours
of face-to-face activities
6 credits
hub: PARMA
course unit
in ITALIAN

Learning objectives

The course aims to provide Python programming skills in the context of simple Natural Language Processing applications. By the end of the course, students will: 1) know the elementary data structures and basic programming constructs; 2) be able to modify and write Python programs that manipulate and analyze texts, acquiring skills useful for humanists; 3) know the basics of how text is represented on a computer.
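As a rough illustration of the third objective, the sketch below shows how a short string maps to ASCII code numbers and back. The sample word and variable names are hypothetical and are not taken from the course material.

```python
# Illustrative sketch only: the sample text is an assumption, not course material.
text = "Parma"

# ord() gives the numeric code of a single character.
codes = [ord(ch) for ch in text]

# chr() is the inverse: it turns a numeric code back into a character.
restored = "".join(chr(code) for code in codes)

# encode() produces the bytes actually stored on disk for this text.
raw = text.encode("ascii")

print(codes)      # [80, 97, 114, 109, 97]
print(restored)   # Parma
print(list(raw))  # [80, 97, 114, 109, 97]
```

For plain ASCII text, each character occupies exactly one byte, which is why the list of codes and the list of bytes coincide here.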

Prerequisites

No requirements

Course unit content

- - -

Full programme

1) Introduction to Text Mining:
- Text representation (ASCII)
- N-grams
- Tokenization
- Edit distance
- Overview of modern language models
2) Introduction to Python:
- Variables and lists
- For loops with iterators
- If statements
- While loops
- Dictionaries
- Reading from and writing to files
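
To give a feel for how the topics above fit together, here is a minimal sketch that tokenizes a sentence, builds n-grams, counts words with a dictionary, and computes an edit distance. All function names and the sample sentence are hypothetical. The programme does not say which edit distance is taught, so the Levenshtein distance is assumed here.

```python
def tokenize(text):
    """Lowercase the text, split on whitespace, and strip common punctuation."""
    tokens = []
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'()")
        if word:  # skip items that were punctuation only
            tokens.append(word)
    return tokens


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def word_counts(tokens):
    """Count how many times each token occurs, using a dictionary."""
    counts = {}
    for token in tokens:
        counts[token] = counts.get(token, 0) + 1
    return counts


def edit_distance(a, b):
    """Levenshtein distance (assumed variant): the minimum number of
    insertions, deletions, and substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i in range(1, len(a) + 1):
        current = [i]  # distance from a[:i] to ""
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[len(b)]


# Hypothetical sample sentence, used for illustration only.
tokens = tokenize("The cat sat. The cat ran!")
print(tokens)                               # ['the', 'cat', 'sat', 'the', 'cat', 'ran']
print(ngrams(tokens, 2)[:2])                # [('the', 'cat'), ('cat', 'sat')]
print(word_counts(tokens))                  # {'the': 2, 'cat': 2, 'sat': 1, 'ran': 1}
print(edit_distance("kitten", "sitting"))   # 3
```

The tokenizer is deliberately naive: it only splits on whitespace and trims punctuation at the edges, which is enough for short exercises but not for real-world text.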

Bibliography

- How to Think Like a Computer Scientist (Learning with Python): http://openbookproject.net/thinkcs/python/english3e/
- Python 101: http://python101.pythonlibrary.org/index.htm
- Speech and Language Processing, Jurafsky (2023) (first chapter as an introduction)

Teaching methods

Lectures are held in a computer lab, combining theory with practical activities.

Assessment methods and criteria

The final exam consists of a practical Python project to be completed as homework. The student then presents the project during an oral exam, where the homework is evaluated in terms of:
- Correctness
- Complexity
- Quality of the documentation
- Quality of the discussion

In addition, during the oral presentation, the student will be assessed on any part of the course programme through open questions or simple exercises.

Other information

Attendance at lectures is strongly recommended.

2030 agenda goals for sustainable development

- - -

Contacts

Toll-free number

800 904 084

Student registry office

E. segreteria.corsiumanistici@unipr.it

Quality assurance office

Education manager:
Dott.ssa Valentina Galeotti
T. +39 0521 034133
Manager E. valentina.galeotti@unipr.it
Office E. dusic.lettere@unipr.it

President of the degree course

Prof. Marco Gentile
E. marco.gentile@unipr.it

Faculty advisor

Prof. Nicola Catelli
E. nicola.catelli@unipr.it

Prof.ssa Margherita Centenari
E. margherita.centenari@unipr.it

Prof. Simone Gibertini
E. simone.gibertini@unipr.it

Career guidance delegate

Prof. Carlo Alberto Gemignani
E. carloalberto.gemignani@unipr.it

Contacts for study plans and credit validation

Prof. Carlo Varotti | Students A-L
E. carlo.varotti@unipr.it

Prof. Paolo Rinoldi | Students M-Z
E. paolo.rinoldi@unipr.it

Erasmus delegates

Prof.ssa Cristina Carusi | Erasmus+ SMT
E. cristina.carusi@unipr.it

Prof. Luca Iori | Erasmus+ SMS
E. luca.iori@unipr.it

Quality assurance manager

Prof.ssa Paola Volpini
E. paola.volpini@unipr.it

Internships

Prof.ssa Giulia Raboni
E. giulia.raboni@unipr.it

Student tutors

Dott.ssa Benedetta Bocchi
E. benedetta.bocchi@studenti.unipr.it

Dott. Roberto De Frate
E. roberto.delfrate@unipr.it

Dott. Alberto Negri
E. alberto.negri1@studenti.unipr.it