Teacher
CASTRIGNANO TIZIANA
(syllabus)
This course covers the technologies needed to transform and manipulate biological data. In particular, it focuses on tools for analysing large sequencing data sets, with the aim of obtaining reproducible and robust biological results. The first 2 credits are therefore dedicated to an introduction to the Linux environment and the data manipulation tools included with it. The next 2 credits are dedicated to the fundamentals of programming, illustrated using the Python language and applied to sequence analysis. The last 2 credits are dedicated to an introduction to programming in R, a programming language and development environment designed for statistical data analysis.
Linux:
- Setting up and managing a bioinformatics project in a Linux environment
- Project directories and directory structures
- Why do we use Linux in bioinformatics? Modularity and the Linux philosophy
- Environment variables
- Working with streams and redirection
- Managing and interacting with processes
- Working with remote machines
- Retrieving bioinformatics data
- Data compression and working with compressed data
- When to use Unix pipelines
- Inspecting and manipulating data with Linux tools
- An introduction to genomic ranges
- Working with sequence data
- Basic Bash scripting
- Bash scripts for HPC (high-performance computing)
Python:
Preliminary operations
- Python installation
- Overview of the Biopython libraries
Data manipulation
- Arithmetic operators
- Data types (numeric, boolean, set, dictionary, sequence)
- Variables, expressions, statements
- Control statements (if, while, for, break, continue)
- Functions
- The Biopython libraries
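As an illustration of how the Python topics listed above (data types, control statements, functions) combine in sequence analysis, the following is a minimal sketch. It is not taken from the course material: the function names `count_bases` and `gc_content` and the example sequence are hypothetical, and it uses only plain Python rather than Biopython.

```python
def count_bases(sequence):
    """Count occurrences of each unambiguous base using a dictionary."""
    counts = {}
    for base in sequence.upper():
        if base not in "ACGT":
            continue  # skip ambiguous or unexpected symbols such as N
        counts[base] = counts.get(base, 0) + 1
    return counts


def gc_content(sequence):
    """Return the fraction of G and C among the unambiguous bases."""
    counts = count_bases(sequence)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no countable bases; avoid dividing by zero
    return (counts.get("G", 0) + counts.get("C", 0)) / total


if __name__ == "__main__":
    example = "ATGCGCNNTA"  # hypothetical sequence, for illustration only
    print(count_bases(example))
    print(gc_content(example))
```

The dictionary keeps the per-base counts available for other summaries, while the `continue` statement shows one simple way to ignore symbols outside A, C, G and T.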
R:
Preliminary operations
- Installation of R
- Overview of the R and RStudio interface
- Working directory, scripts and consoles
Data manipulation
- Creation and import of data
- Data classes
- Use and creation of functions
- Charts: scatterplot, boxplot, barplot
Statistical methods for data analysis
- Basics of statistics: random variables, probability distributions, hypothesis tests
- Statistical tests in the R environment: correlations, t-test, chi-square test
- Linear regression models
(reference books)
Suggested text for Linux: teaching material provided by the teacher.
Suggested text for Python: Allen B. Downey - “Think Python”. Publisher: O’Reilly.
Suggested text for R: teaching material provided by the teacher.