# R – Medium package

### The Medium package contains six course days, and the content covers most of what you will come across in your statistical work.

Starting with an introduction, the course covers the basics of R, ranging from importing and handling data to visualisation. You’ll learn about two fundamental tools in statistical analysis: hypothesis tests and confidence intervals. We’ll also discuss important concepts like p-values, power, and sample size calculations. You’ll gain a solid understanding of modern linear regression and ANOVA models, as well as some common but advanced regression models and survival analysis. You’ll also visually explore data in R and learn how to create great-looking graphics using the powerful ggplot2 package. Finally, we’ll discuss cluster analysis, including hierarchical and centroid-based methods, factor analysis, and structural equation models (SEM).

#### Content

##### Introduction to R • Introduction to modern statistics • Linear regression • ANOVA • Advanced regression • Survival analysis • Visualisation • Cluster analysis • Structural equation models (SEM)

##### 6 course days - SEK 31,800

## R 1 - Introduction to R and to modern statistics – 2 days

**This course helps you get started with R. We’ll cover the basics of R, ranging from importing and handling data to visualisation. You’ll learn about two fundamental tools in statistical analysis: hypothesis tests and confidence intervals. We’ll also discuss important concepts like p-values, power, and sample size calculations.**

The popular tidyverse collection of packages is used for filtering, cleaning, and preparing data for analysis. The powerful plotting capabilities of the ggplot2 package are also covered. Both basic statistical concepts and fundamental topics in R programming are discussed. This course is a great fit if you’re curious about R, or already know that you want to use its many tools for advanced data analysis. Classical statistical tests like the t-test, nonparametric tests and the chi-squared test are covered, along with modern computer-intensive methods like the bootstrap. The latter allows us to obtain p-values and confidence intervals without many of the constraints of traditional methods (such as requiring that the data follow a normal distribution), bringing your statistical toolbox into the 21st century.

**Course goals:** To be able to use R to import and wrangle data, and to describe data using graphs and tables. To understand the basics of hypothesis testing and confidence intervals, and to be able to use R to run common tests and compute common intervals.

**Prerequisites:** Basic computer skills.

## R 2 - Linear regression, ANOVA & advanced regression models – 2 days

**This course provides you with a solid understanding of modern linear regression and ANOVA models. It also covers some common but advanced regression models as well as survival analysis.**

We will have a closer look at how these models work and how R can be used to build, visualise, and interpret them. We will use modern techniques like the bootstrap and permutation tests to obtain confidence intervals and p-values without having to assume a normal distribution for your data. We will cover generalised linear models like logistic regression and Poisson regression, where the response variable is binary (yes/no), a count, or a prevalence. In survival analysis, we’ll have a look at Kaplan-Meier survival curves and regression models, including Cox proportional hazards regression. We will also cover mixed models, which are used to analyse data with repeated measurements on the same subjects.

**Course goals:** To be able to use R to fit, visualise and interpret linear regression and ANOVA models. To understand how to visualise and interpret models for logistic regression, count regression, mixed models, and survival analysis.

**Prerequisites:** R1 or similar.

## R 3 - Visualisation, data exploration, cluster analysis & SEM – 2 days

**This course will teach you how to visually explore data in R, and how to create great-looking graphics using the powerful ggplot2 package. We’ll also discuss cluster analysis, including hierarchical and centroid-based methods, as well as factor analysis and structural equation models (SEM), which are used to measure and analyse the relationship between observed and hidden variables. Mediation analysis is covered too.**

Topics covered include outlier detection, visualisation of trends, and multivariate data. The course also covers dimension reduction of complex data using principal component analysis (PCA). Cluster analysis is used to find subgroups in exploratory analyses of your data. SEM allows us to study causal relationships between variables in our data and latent (unobservable) variables, such as difficult-to-measure attitudes. Mediation analysis is used to understand the mechanism behind causal relationships.

**Course goals:** To be able to use the R package ggplot2 to visualise and explore data. To be able to perform cluster analysis on your data, and to use SEM to study causal relationships between variables.

**Prerequisites:** R1 or similar.