MT611 - Quantitative Research Methods

class: center, middle, inverse, title-slide

.title[
# MT611 - Quantitative Research Methods
]
.subtitle[
## Lecture 7: Introduction to R for Hypothesis Testing
]
.author[
### Damien Dupré
]
.date[
### Dublin City University
]

---

# Brief Introduction

Modern data science uses free and open-source computer languages:

* Proprietary languages (e.g., Matlab) and software (e.g., SPSS, Stata, SAS) are outdated
* Main open-source computer languages for data science are Python and R

While Python is the most used language by computer engineers for web and app development, R has some advantages:

1. **Easy to write**, to read and to use
2. Focused on **reports and journal papers** with reproducibility
3. Advanced **statistical packages**
4. **Friendly and open** community

### So let's useR!

---

class: inverse, mline, center, middle

# 1. R and RStudio

---

# What are R and RStudio?

There are some key concepts you need to understand and to remember:

* R is the name of the language
* RStudio is the name of the upgraded interface to write R code

R is usually used via RStudio, and first-time users often confuse the two. At its simplest, **R is like a car’s engine** while **RStudio is like a car’s dashboard**.

.pull-left[
.center[R: The engine]
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/car_motor.jpeg" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
.center[RStudio: The dashboard]
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/car_dashboard.jpeg" width="100%" style="display: block; margin: auto;" />
]

---
class: clear

## .center[**Time to enter ...**]

---

# Posit Cloud

In your web browser (Chrome, Firefox, ...), go to: https://posit.cloud/
  - Sign up
  - In your workspace, click "New Project"

---

# Code in RStudio

Most of the R code displayed in this lecture is included in these slides. Rather than typing it manually, open these slides in another tab to copy and paste the code.

Two ways to access these slides:

- From Loop: Lectures Slides > Lecture 7
- Or from the URL: https://damien-dupre.github.io/mt611/lectures/lecture_7

---
class: inverse, mline, center, middle

# 2. Coding in RStudio

---

# RStudio IDE

When you create a new project, you will launch RStudio and see the following 3 windows (also called panes):

* **Console**: where the results are printed
* **Environment**: where the objects are stored
* **Files, Plots, Packages, Help and Viewer**: where data science materials are

The last window, the **Code Editor**, opens when you create a new R Script.

---

# Console: R’s Heart

The console displays 1. **What has been run**, 2. The **results** (or some parts) of what has been run, and 3. The **status of the R process**.

.pull-left[
The Status of R is indicated by the symbol in the console prompt:

* `>` means ready to process code 
* `+` means incomplete command (escape with <kbd>Esc</kbd>)
* &#128721; at the top right corner means the console is busy processing your code
]

.pull-right[
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/r_console.png" width="100%" style="display: block; margin: auto;" />
]

You can execute code by typing it directly into the Console. However, it will not be saved. And if you make a mistake you will have to re-type everything all over again.

Instead, it is better to **write all your code in a document (R script or Quarto file) in the Code Editor**.

---

# Environment: R’s Brain

The Environment tab of this pane shows you the **names of all the data objects** (like vectors, matrices, and data frames) that you have defined in your current R session.

You can also see **information** like the number of observations (rows) and variables (columns) in data objects.

---

# Files / Plots / Packages / Help

* The **Files** panel gives you access to the file directory on your hard drive.

* The **Plots** panel shows all your plots. There are buttons for opening the plot in a separate window and exporting the plot as a pdf or jpeg.

* The **Packages** panel shows a list of all the R packages installed on the local or remote machine and indicates whether or not they are currently loaded.

* With the **Help** panel, you can access essential information about R functions. Have a look at some help pages by pressing <kbd>F1</kbd> with the cursor on the function, by using the `help()` function, or by typing `?` followed by the function name, such as:

```r
help(seq)
?seq
help(lm)
?lm
```

---

# Code Editor: R's Nervous System

- It links all the previous panes and allows you to reproduce actions and behaviours.

- You can open as many R Scripts / Quarto files as you want.

- These documents are the only ones that have to be saved. No need to save your data, figures and calculations, as you can reproduce them instantly with the code.

- Save your eyes and look like a nerd by changing the code's appearance

<img src="https://miro.medium.com/max/1600/1*eLjON45R_kHIfg_IgL8FIg.jpeg" width="60%" style="display: block; margin: auto;" />
.center.tiny[Credit: towardsdatascience.com [🔗](https://towardsdatascience.com/customize-your-rstudio-theme-914cca8b04b1)]

---
class: inverse, mline, center, middle

# 3. The Basics of R Code

---

# What are .R and .qmd files?

**.R** is the extension for an R script (document including only R code): 
- Click **File > New File > R Script** in RStudio
- Includes only code which can be active or inactive (line starts with `#`)
- Used for code testing

Example of non-active code

```r
# non active code
```

Example of active code

```r
paste("active", "code")
```

```
## [1] "active code"
```

```r
1 + 1 # everything after `#` is non active and is used for comments
```

```
## [1] 2
```

---

# What are .R and .qmd files?

**.qmd** is the extension for a Quarto file: 
- Click **File > New File > Quarto document...** in RStudio
- Refers to a document that includes code and text
- Generates a specific type of output
  - .html (web page, slides, books, and dashboards)
  - .pdf (Academic LaTeX papers and reports)
  - .docx (MS Word documents)
  
<img src="https://r4ds.hadley.nz/quarto/diamond-sizes-report.png" width="80%" style="display: block; margin: auto;" />

---

# How to Run R Code?

.pull-left[
In an R Script, place your cursor anywhere on the line you want to run and either:
- Press <kbd>Ctrl</kbd> & <kbd>Enter</kbd> (Win)
- Press <kbd>command</kbd> & <kbd>Enter</kbd> (Mac)
- Click the `Run` button on RStudio's interface

In a Quarto file:
- Use the **green** arrow to run the current chunk of code
- Use the `Render` button on RStudio's interface to create the output file
]

.pull-right[
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/rstudio_run.gif" width="100%" style="display: block; margin: auto;" />
]

---

# What are R packages?

R packages extend the functionality of R. They are written by a worldwide community of R users and can be downloaded for free from the internet.

A good analogy: **R packages are like apps you can download onto a mobile phone**.

.pull-left[
.center[R: A new phone]
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/phone_design.jpeg" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
.center[R Packages: Apps you can download]
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/phone_apps.jpeg" width="100%" style="display: block; margin: auto;" />
]

---

# What are R packages?

Say you have purchased a new phone. To use Instagram, you need to **install the app once** and to **open the app** every time you want to use it.

The process is very similar for using an R package. You need to:

* **Install the package** with the function `install.packages()`.

```r
install.packages("praise")
```

* **“Load” the package** with the function `library()`.

```r
library(praise)
```

Once the package is loaded you can use all the functions from this package such as:

```r
praise()
```

---
class: title-slide, middle

## Live Demo

---
class: title-slide, middle

## Exercise

Open an R Script in RStudio. In this document:
- Use line 1 to **install the package "praise"**

```r
install.packages("praise")
```
- Use line 2 to **load the library praise**

```r
library(praise)
```
- Use line 3 to **run the function `praise()`** as it is, without arguments

```r
praise()
```

---

# Calling Functions

Functions are algorithms (or lines of code) which **transform data to something else**. For example, the function `lm()` uses data to compute the result of a linear regression model.
 
Functions have **a name** and **several arguments** that require some information.

```r
function_name(argument_1 = value_1, argument_2 = value_2, ...)
```

For example, the function `seq()` makes a sequence of numbers:
* The first argument `from` is the number starting the sequence
* The second argument `to` is the last number of the sequence

```r
seq(from = 1, to = 10)
```

```
##  [1]  1  2  3  4  5  6  7  8  9 10
```

<span><i class="fas  fa-exclamation-triangle faa-flash animated faa-slow " style=" color:red;"></i></span> Arguments don't need to be explicitly named; they can also be matched by position:

```r
seq(1, 10)
```

```
##  [1]  1  2  3  4  5  6  7  8  9 10
```

---

# Assign Values to Objects in R

Usually the arguments of functions expect an object name to access the data.

An object is a box that **can include anything** (e.g., values, dataframes, figures, models, functions, ...) and **has a name** that you have to choose.

To create an object, you need to **assign something** to a name using the `<-` operator. If you type the name of the object, R will print out its content.

```r
x <- 4
x
```

```
## [1] 4
```

```r
seq(x, 10)
```

```
## [1]  4  5  6  7  8  9 10
```
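
A short sketch of what assignment does over several lines (the object name `y` is only used for this illustration): an object keeps its content until you assign something new to its name, and objects computed earlier are not updated automatically.

```r
x <- 4      # create the object x
y <- x * 2  # use x in a calculation and store the result in y
x <- 10     # assigning again overwrites the previous content of x

y           # y keeps the value computed earlier: 8
seq(x, 12)  # x is now 10, so the sequence is 10 11 12
```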

---

# Assign Values to Objects in R

It is very important to distinguish values and objects in R:

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> Type </th>
   <th style="text-align:left;"> Class </th>
   <th style="text-align:left;"> Example </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> Number </td>
   <td style="text-align:left;"> Numeric Value </td>
   <td style="text-align:left;"> 1, 2, ... </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Word with quotes </td>
   <td style="text-align:left;"> Character Value </td>
   <td style="text-align:left;"> &quot;one&quot;, &quot;two&quot;, ... </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Word without quotes </td>
   <td style="text-align:left;"> Object Name </td>
   <td style="text-align:left;"> function name, data name, ... </td>
  </tr>
</tbody>
</table>

These types of values are then stored in different objects:

---

# Different R Objects

All object assignments have the same form:

```r
object_name <- object_content
```

You want your object names to be descriptive, so you will need a convention for multiple words. I recommend **snake_case** where you separate lower-case words with `_`.

```r
numeric_value <- 1

character_value <- "one"

vectors_with_numeric_values <- c(1, 2)

vectors_with_character_values <- c("one", "two")

dataframe_example <- data.frame(col1 = c("one", "two"), col2 = c(1, 2))

dataframe_example <- data.frame(
  col1 = vectors_with_character_values, 
  col2 = vectors_with_numeric_values
  )
```
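
To check what an object contains, base R provides `class()` and `str()`. The sketch below reuses some of the object names from above; it is only one possible way of inspecting objects.

```r
numeric_value <- 1
character_value <- "one"
dataframe_example <- data.frame(col1 = c("one", "two"), col2 = c(1, 2))

class(numeric_value)     # "numeric"
class(character_value)   # "character"
class(dataframe_example) # "data.frame"

# str() gives a compact overview: dimensions, column names and column types
str(dataframe_example)

# A single column of a data frame is accessed with the $ operator
dataframe_example$col2
```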

---
class: title-slide, middle

## Live Demo

---
class: title-slide, middle

## Exercise

In the same R Script in RStudio, **Copy, Paste, and Run** the following code:

```r
my_power <- c(0.5, 99.5)

my_knowledge <- c("without R", "with R")

barplot(height = my_power, names.arg = my_knowledge)
```

---
class: inverse, mline, center, middle

# 4. Access Data in Posit Cloud

---

# Open your Data as R Object (1)

Posit Cloud is a free remote computer: the computing does not run on your own computer.

To open data on Posit Cloud, you first need to `Upload` your file to this remote computer and then `Import` the data in R.

.pull-left[
.center[Step 1: Upload your File]
<img src="https://community-cdn.rstudio.com/uploads/default/original/2X/9/91a128299a9d910f84279be9ecd89b60aa15f20b.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
.center[Step 2: Import your Data]
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/rstudio_import.png" width="100%" style="display: block; margin: auto;" />
]

Remember that **.csv files are basically text files**

---

# Open your Data as R Object (2)

For early beginners on the Desktop version, directly open data with RStudio's `Import Dataset` button.

If you see your data in the preview, you can click `Import` to create an object containing your data. Some code will be executed in the console: **copy and paste the first line of this code into your R script**. Once the code is in your script, you will not have to import the data manually again.

---

# Open your Data as R Object (3)

To ensure code reproducibility, open data with the appropriate function (e.g., `read.csv()` for csv files).

The main argument of these functions is `file`, which corresponds to the path to the file, followed by the name of the file and its extension:

```r
# Windows
my_file_object <- read.csv(file = "C:/path/to/my/file.csv")
my_file_object <- read.csv("C:/path/to/my/file.csv")

# macOS
my_file_object <- read.csv(file = "/Users/path/to/my/file.csv")
my_file_object <- read.csv("/Users/path/to/my/file.csv")
```

Each of the following lines of code will generate an error:

```r
# Incomplete path
my_file_object <- read.csv("/path/to/my/file.csv")
# Missing file extension
my_file_object <- read.csv("C:/path/to/my/file")
# Use of backward slash
my_file_object <- read.csv("C:\path\to\my\file.csv")
```
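
To try `read.csv()` without a file of your own, here is a minimal sketch that first creates a small file and then imports it. The temporary path and the column names `employee` and `salary` are made up for this illustration; `file.exists()` is a base R function that checks a path before importing.

```r
# Create a small hypothetical .csv file in a temporary folder
# (tempfile() only stands in for a real path on your computer)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("employee,salary", "a,29000", "b,31000"), csv_path)

# file.exists() tells you whether the path is correct before importing
file.exists(csv_path)

# Import the file as a data frame object
my_file_object <- read.csv(file = csv_path)
my_file_object
```

If `file.exists()` returns `FALSE`, the path or the file extension is the first thing to check.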

---
class: title-slide, middle

## Live Demo

---
class: title-slide, middle

## Exercise

1. Click on `Upload` to upload your "organisation_beta.csv" file to your Posit Cloud project

2. Click on `Import` to import these data in R

---
class: inverse, mline, center, middle

# 5. Save Your Data

---

# Save Your Data

Usually, **only R Script files (.R) or Quarto files (.qmd) have to be saved**, as they allow the full replication of transformations and results.

However, if you want to keep data that have been transformed, joined or pivoted, you need to use a function that matches the type of export.

The simplest export is a .csv file with the function `write.csv()`. It has two main arguments:
- `x` which is the name of the object to save
- `file` which is the name of the output file

Note: don't forget the file extension in the argument `file`

Example:

```r
# Saved in the current directory
write.csv(x = my_file_object, file = "my_file_name.csv")
write.csv(my_file_object, "my_file_name.csv")

# Saved in the directory you prefer
write.csv(my_file_object, "C:/path/to/my/my_file_name.csv") # Windows
write.csv(my_file_object, "/Users/path/to/my/my_file_name.csv") # macOS
```
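
One detail that often surprises beginners: by default `write.csv()` also saves the row names as an extra first column. The sketch below uses a made-up data frame and temporary paths to show the effect of the `row.names = FALSE` argument.

```r
# Hypothetical data frame used only for this illustration
my_file_object <- data.frame(employee = c("a", "b"), salary = c(29000, 31000))

# Temporary paths stand in for real file names
path_default <- tempfile(fileext = ".csv")
path_no_row_names <- tempfile(fileext = ".csv")

# By default, write.csv() also writes the row names as a first column,
# which read.csv() imports as a column called "X"
write.csv(my_file_object, path_default)
names(read.csv(path_default))

# row.names = FALSE keeps only the columns of the data frame
write.csv(my_file_object, path_no_row_names, row.names = FALSE)
names(read.csv(path_no_row_names))
```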

---
class: title-slide, middle

## Exercise

Save the data contained in the object "organisation_beta" in a new .csv file (give it a different name than "organisation_beta.csv", otherwise the original file will be overwritten)

---

# Become Expert in R

Because R is free, plenty of free learning materials are available online:

* Video tutorials on YouTube, TikTok (@chelseaparlettpelleriti, @tommyteaches, ...)
  
* Interactive tutorials, see for example:
  - [Posit Primers](https://posit.cloud/learn/primers)
  - [R-Bootcamp](https://r-bootcamp.netlify.com)
  - [Introduction to R](https://www.quantargo.com/courses/course-r-introduction)
  - [DCU R tutorials](https://dcu-r-tutorials.netlify.app)

* Book tutorials, see for example:
  - [R for Data Science](https://r4ds.hadley.nz/) by Wickham, Çetinkaya-Rundel & Grolemund (2023)
  - [A ModernDive into R and the tidyverse](https://moderndive.com/) by Ismay & Kim (2022)
  - [Getting Used to R, RStudio, and Quarto file](https://rbasics.netlify.com/) by Kennedy (2021)
  - [Introduction to Open Data Science](https://ohi-science.org/data-science-training/) by the Ocean Health Index Team (2019)
  
Note: All books are included in the [Big Book of R](https://www.bigbookofr.com/)

---

# How to solve your pRoblems

.pull-left[
### 1. Look at your error
  * If it's obvious, solve it by yourself
  * If it's not obvious, copy and paste the error into Google
  
### 2. Look at your object
  * `str(ObjectName)`

### 3. Look at the function
  * Documentation (`F1` or `?`)

### 4. Look at the web
  * Google "R how to ..."
  * Stack Overflow
]

.pull-right[
<img src="https://pbs.twimg.com/media/DAsjfPjXkAIBoET?format=jpg&name=medium" width="100%" style="display: block; margin: auto;" />
]

---
class: inverse, mline, center, middle

# 6. Linear Regression Models in R

---

# Model and Equations

A model contains:

- Only one Outcome/Dependent Variable
- One or more Predictor/Independent Variables of any type (categorical or continuous)
- Main and/or Interaction Effects

To evaluate their relationship with the outcome, each hypothesised effect is associated with a coefficient called **Estimate** and represented with `$b$` as follows:

`$$Outcome = b_0 + b_1 Pred1 + b_2 Pred2 + b_3 Pred1 * Pred2 + e$$`

Testing for the significance of the effect means evaluating if this estimate `$b$` value is significantly **different, higher or lower than 0** as hypothesised in `$H_a$` by the scientist.

---

# Estimates and Linear Regression in R

The `lm()` function calculates each estimate and tests it against 0 for you.

`lm()` has only two arguments that you should care about: `formula` and `data`.

- `formula` is the translation of the equation of the model

- `data` is the name of the data frame object containing the variables.

Here is a generic example:

```r
lm(formula = Outcome ~ Pred1 + Pred2, data = my_data_object)
```

Here is an example with `organisation_beta.csv`:

```r
lm(formula = js_score ~ salary + perf, data = organisation_beta)
```
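
If the `organisation_beta.csv` file is not at hand, the call can still be tried on simulated data. In the sketch below, the data frame `org_sim` and all its values are made up (they are not the course data); only the variable names mirror the example above.

```r
# Simulated stand-in for organisation_beta (values are made up)
set.seed(611)
n <- 20
org_sim <- data.frame(
  salary = round(rnorm(n, mean = 30000, sd = 1000)),
  perf   = sample(1:10, size = n, replace = TRUE)
)
# js_score is generated to depend on salary only, plus random noise
org_sim$js_score <- -50 + 0.002 * org_sim$salary + rnorm(n, sd = 1)

# Store the model in an object so that it can be reused later
lm_sim <- lm(formula = js_score ~ salary + perf, data = org_sim)

# coef() returns the estimates b_0 (intercept), b_1 and b_2
coef(lm_sim)
```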

---

# Mastering the Formula

`lm()` has only one difficulty: the `formula`. The `formula` is the direct translation of the equation tested but with its own representation:

1. The = sign is replaced by `~` (read "according to" or "by")
2. Each predictor is added with the `+` sign
3. An interaction effect uses the symbol `:` instead of `*` (in a formula, `*` is a shortcut for the main effects plus their interaction)

Here are some generic equations and their conversion in `formula`:

`$$Outcome = b_0 + b_1 Pred1 + b_2 Pred2 + e$$`

```r
lm(formula = Outcome ~ Pred1 + Pred2, data = my_data_object)
```

`$$Outcome = b_0 + b_1 Pred1 + b_2 Pred2 + b_3 Pred3 + e$$`

```r
lm(formula = Outcome ~ Pred1 + Pred2 + Pred3, data = my_data_object)
```

`$$Outcome = b_0 + b_1 Pred1 + b_2 Pred2 + b_3 Pred1*Pred2 + e$$`

```r
lm(formula = Outcome ~ Pred1 + Pred2 + Pred1 : Pred2, data = my_data_object)
```
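
To see which effects a formula actually requests, the base R function `terms()` can be used. This sketch compares the explicit notation above with the `*` shortcut; no data is needed, and the two object names are only illustrative.

```r
# The explicit notation and the shortcut notation for an interaction model
explicit_formula <- Outcome ~ Pred1 + Pred2 + Pred1:Pred2
shortcut_formula <- Outcome ~ Pred1 * Pred2

# term.labels lists the effects that the formula asks lm() to estimate
attr(terms(explicit_formula), "term.labels")
attr(terms(shortcut_formula), "term.labels")
# Both return "Pred1" "Pred2" "Pred1:Pred2"
```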

---

# Mastering the Formula

Here are some equations from the `organisation_beta.csv` dataset and their conversion in `formula`:

`$$js\_score = b_0 + b_1 salary + b_2 perf + e$$`

```r
lm(formula = js_score ~ salary + perf, data = organisation_beta)
```

`$$js\_score = b_0 + b_1 salary + b_2 perf + b_3 salary * perf + e$$`

```r
lm(formula = js_score ~ salary + perf + salary:perf, data = organisation_beta)
```

---
class: title-slide, middle

## Live Demo

---
class: title-slide, middle

## Exercise

Test the following models in Posit Cloud:

`$$js\_score = b_0 + b_1 salary + b_2 gender + e$$`

`$$js\_score = b_0 + b_1 salary + b_2 gender + b_3 salary * gender + e$$`

---
class: title-slide, middle

## Linear Regression Results

---

# Categorical Predictor

Exactly as in Jamovi, `lm()` by default investigates continuous predictors or categorical predictors having 2 categories:

```r
lm_js <- lm(formula = js_score ~ salary + gender, data = organisation_beta)
```

However, to test the hypothesis of a categorical predictor having 3 or more categories, the ANOVA omnibus test is required.

It can be obtained by using the `aov()` function with the lm model as input:

```r
lm_js <- lm(formula = js_score ~ salary + location, data = organisation_beta)

aov(lm_js)
```
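
A sketch of what the omnibus test looks like, using simulated data (the data frame `org_sim` and its three location names are made up). Here the base R function `anova()` is swapped in to print the table of F-tests; note that its tests are sequential, so the order of the predictors in the formula matters.

```r
# Simulated data with a categorical predictor having 3 categories
set.seed(611)
n <- 30
org_sim <- data.frame(
  salary   = round(rnorm(n, mean = 30000, sd = 1000)),
  location = rep(c("Ireland", "France", "Australia"), each = 10)
)
org_sim$js_score <- -50 + 0.002 * org_sim$salary + rnorm(n, sd = 1)

lm_sim <- lm(formula = js_score ~ salary + location, data = org_sim)

# One row per effect: location is tested as a whole (2 degrees of freedom
# for 3 categories) instead of one coefficient per category
anova_sim <- anova(lm_sim)
anova_sim
```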

---

# LM Summary

While the function `lm()` computes the model, the function `summary()` displays the results.

.small[

```r
lm_js <- lm(formula = js_score ~ salary + gender, data = organisation_beta)

summary(lm_js)
```

```
## 
## Call:
## lm(formula = js_score ~ salary + gender, data = organisation_beta)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.04185 -0.49565  0.06529  0.61611  1.71635 
## 
## Coefficients:
##                Estimate  Std. Error t value   Pr(>|t|)    
## (Intercept) -49.1865079   7.6889114  -6.397 0.00000663 ***
## salary        0.0018837   0.0002575   7.316 0.00000121 ***
## gendermale   -0.5946699   0.4055990  -1.466      0.161    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8854 on 17 degrees of freedom
## Multiple R-squared:  0.7613,	Adjusted R-squared:  0.7333 
## F-statistic: 27.12 on 2 and 17 DF,  p-value: 0.000005141
```
]

---

# LM Summary

The output of the `summary()` function is pretty dense, but let's analyse it line by line.

The first line reminds us of what the actual regression model is:

```
Call:
lm(formula = js_score ~ salary + gender, data = organisation_beta)
```

The next part provides a quick summary of the residuals (i.e., the `$e$` values):

```
Residuals:
      Min       1Q   Median       3Q      Max 
 -2.04185 -0.49565  0.06529  0.61611  1.71635 
```

This can be convenient as a quick check that the model is okay. **Linear regression assumes that these residuals are normally distributed, with mean 0.** In particular it’s worth quickly checking to see if the median is close to zero, and to see if the first quartile is about the same size as the third quartile. If they look badly off, there’s a good chance that the assumptions of regression are violated.
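
The residuals can also be extracted and inspected directly. The sketch below uses simulated data (`org_sim` is made up) and the base R function `residuals()`.

```r
# Simulated data, used only to have a model to inspect
set.seed(611)
n <- 20
org_sim <- data.frame(salary = round(rnorm(n, mean = 30000, sd = 1000)))
org_sim$js_score <- -50 + 0.002 * org_sim$salary + rnorm(n, sd = 1)

lm_sim <- lm(formula = js_score ~ salary, data = org_sim)

# residuals() extracts the e values from the fitted model
res_sim <- residuals(lm_sim)

# Same five numbers as in the summary() output, plus the mean
summary(res_sim)

# With an intercept in the model, the mean of the residuals is zero
# up to rounding error, so the median and quartiles are the useful check
mean(res_sim)
```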

---

# LM Summary

The next part of the R output looks at the coefficients of the regression model:

```
Coefficients:
                Estimate  Std. Error t value   Pr(>|t|)    
 (Intercept) -49.1865079   7.6889114  -6.397 0.00000663 ***
 salary        0.0018837   0.0002575   7.316 0.00000121 ***
 gendermale   -0.5946699   0.4055990  -1.466      0.161 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
```

Each row in this table refers to one of the coefficients estimated in the regression model.

The first row is the intercept term, and the later ones look at each of the predictors. The columns give you all of the relevant information:
- The first column is the actual estimate of b (e.g., -49.1865079 for the intercept, 0.0018837 for salary and -0.5946699 for gender). 
- The second column is the standard error estimate (SE). 
- The third column gives you the t-statistic. 
- Finally, the fourth column gives you the actual p value for each of these tests.

---

# LM Summary

The only thing that the previous table doesn’t list is the degrees of freedom used in the t-test, which is always N−K−1 (with N observations and K predictors) and is listed immediately below, in this line:

```
Residual standard error: 0.8854 on 17 degrees of freedom
```

The value of df=17 is equal to N−K−1, so that’s what we use for our t-tests. In the final part of the output we have the F-test and the R<sup>2</sup> values, which assess the performance of the model as a whole:

```
Multiple R-squared:  0.7613,    Adjusted R-squared:  0.7333 
F-statistic: 27.12 on 2 and 17 DF,  p-value: 0.000005141
```

So in this case, the model performs significantly better than you’d expect by chance (F(2,17) = 27.12, p < 0.001), which isn’t all that surprising: the R<sup>2</sup> = 0.7613 value indicates that the regression model accounts for 76.1% of the variability in the outcome measure (73.3% once adjusted for the number of predictors).

When we look back up at the t-tests for each of the individual coefficients, we have pretty strong evidence that salary has a significant effect, whereas the effect of gender is not statistically significant (p = 0.161).
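
The values printed by `summary()` can also be extracted from the summary object. The sketch below uses simulated data (`org_sim` is made up, with N = 20 and K = 2 to match the degrees of freedom discussed above).

```r
# Simulated data with 20 observations and 2 predictors
set.seed(611)
n <- 20
org_sim <- data.frame(
  salary = round(rnorm(n, mean = 30000, sd = 1000)),
  gender = rep(c("female", "male"), times = 10)
)
org_sim$js_score <- -50 + 0.002 * org_sim$salary + rnorm(n, sd = 1)

lm_sim <- lm(formula = js_score ~ salary + gender, data = org_sim)
lm_sim_summary <- summary(lm_sim)

# Coefficient table: Estimate, Std. Error, t value and Pr(>|t|)
lm_sim_summary$coefficients

# Multiple and adjusted R-squared
lm_sim_summary$r.squared
lm_sim_summary$adj.r.squared

# Degrees of freedom of the t-tests: N - K - 1 = 20 - 2 - 1 = 17
df.residual(lm_sim)
```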

---

# Reporting Clean Results

To communicate about your statistical analyses in an academic report, the simplest method is to find the values in the `summary()` output and to copy and paste them into the text, following the reporting format seen in the previous lectures.

However, this task can be long and difficult, and it can lead to human errors. Thankfully, R has additional packages that provide alternative functions to read linear regression models and communicate results. Because there are too many packages to cover, I will focus on only two: {performance} and {report}.

---

# Assumption Check with {performance}

To install {performance} use the usual `install.packages()` function:

```r
install.packages("performance")
```

The package {performance} prints visualisations that allow you to check the model's assumptions (see https://easystats.github.io/performance/).

To print the performance diagnosis:

1. Load the package {performance}
2. Create an object containing the output of the function `lm()`
3. Use this object as input of the function `check_model()` from the {performance} package

---

# Assumption Check with {performance}

```r
library(performance)

lm_js <- lm(formula = js_score ~ salary + perf, data = organisation_beta)

check_model(lm_js)
```

---

# Automatic Results with {report}

To install {report} use the usual `install.packages()` function:

```r
install.packages("report")
```

The package {report} prints a text in which all the statistics are already written as sentences, ready to be interpreted (see https://easystats.github.io/report/).

To print the statistical analyses:

1. Load the package {report}
2. Create an object containing the output of the function `lm()`
3. Use this object as input of the function `report()` from the {report} package

**Note: If used in a Quarto document, the chunk containing `report()` has to include the chunk option `results='asis'`**

---

# Automatic Results with {report}

```r
library(report)

lm_js <- lm(formula = js_score ~ salary + perf, data = organisation_beta)

report(lm_js)
```

We fitted a linear model (estimated using OLS) to predict js_score with salary
and perf (formula: js_score ~ salary + perf). The model explains a
statistically significant and substantial proportion of variance (R2 = 0.74,
F(2, 17) = 24.16, p < .001, adj. R2 = 0.71). The model's intercept,
corresponding to salary = 0 and perf = 0, is at -49.49 (95% CI [-66.60,
-32.37], t(17) = -6.10, p < .001). Within this model:

- The effect of salary is statistically significant and positive (beta =
1.87e-03, 95% CI [1.30e-03, 2.44e-03], t(17) = 6.95, p < .001; Std. beta =
0.86, 95% CI [0.60, 1.13])
- The effect of perf is statistically non-significant and positive (beta =
0.08, 95% CI [-0.15, 0.32], t(17) = 0.75, p = 0.465; Std. beta = 0.09, 95% CI
[-0.17, 0.35])

Standardized parameters were obtained by fitting the model on a standardized
version of the dataset. 95% Confidence Intervals (CIs) and p-values were
computed using a Wald t-distribution approximation.

---
class: title-slide, middle

## Live Demo

---
class: title-slide, middle

## Exercise

In Posit Cloud, check the `check_model()` and `report()` output from the `lm()` function testing the following models:

`$$js\_score = b_0 + b_1 salary + b_2 gender + e$$`

`$$js\_score = b_0 + b_1 salary + b_2 gender + b_3 salary * gender + e$$`

`$$js\_score = b_0 + b_1 salary + b_2 location + b_3 salary * location + e$$`

---
class: inverse, mline, left, middle

# Thanks for your attention and don't hesitate to ask if you have any questions!

[<svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M459.37 151.716c.325 4.548.325 9.097.325 13.645 0 138.72-105.583 298.558-298.558 298.558-59.452 0-114.68-17.219-161.137-47.106 8.447.974 16.568 1.299 25.34 1.299 49.055 0 94.213-16.568 130.274-44.832-46.132-.975-84.792-31.188-98.112-72.772 6.498.974 12.995 1.624 19.818 1.624 9.421 0 18.843-1.3 27.614-3.573-48.081-9.747-84.143-51.98-84.143-102.985v-1.299c13.969 7.797 30.214 12.67 47.431 13.319-28.264-18.843-46.781-51.005-46.781-87.391 0-19.492 5.197-37.36 14.294-52.954 51.655 63.675 129.3 105.258 216.365 109.807-1.624-7.797-2.599-15.918-2.599-24.04 0-57.828 46.782-104.934 104.934-104.934 30.213 0 57.502 12.67 76.67 33.137 23.715-4.548 46.456-13.32 66.599-25.34-7.798 24.366-24.366 44.833-46.132 57.827 21.117-2.273 41.584-8.122 60.426-16.243-14.292 20.791-32.161 39.308-52.628 54.253z"/></svg> @damien_dupre](http://twitter.com/damien_dupre)  
[<svg aria-hidden="true" role="img" viewBox="0 0 496 512" style="height:1em;width:0.97em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"/></svg> @damien-dupre](http://github.com/damien-dupre)  
[<svg aria-hidden="true" role="img" viewBox="0 0 640 512" style="height:1em;width:1.25em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M579.8 267.7c56.5-56.5 56.5-148 0-204.5c-50-50-128.8-56.5-186.3-15.4l-1.6 1.1c-14.4 10.3-17.7 30.3-7.4 44.6s30.3 17.7 44.6 7.4l1.6-1.1c32.1-22.9 76-19.3 103.8 8.6c31.5 31.5 31.5 82.5 0 114L422.3 334.8c-31.5 31.5-82.5 31.5-114 0c-27.9-27.9-31.5-71.8-8.6-103.8l1.1-1.6c10.3-14.4 6.9-34.4-7.4-44.6s-34.4-6.9-44.6 7.4l-1.1 1.6C206.5 251.2 213 330 263 380c56.5 56.5 148 56.5 204.5 0L579.8 267.7zM60.2 244.3c-56.5 56.5-56.5 148 0 204.5c50 50 128.8 56.5 186.3 15.4l1.6-1.1c14.4-10.3 17.7-30.3 7.4-44.6s-30.3-17.7-44.6-7.4l-1.6 1.1c-32.1 22.9-76 19.3-103.8-8.6C74 372 74 321 105.5 289.5L217.7 177.2c31.5-31.5 82.5-31.5 114 0c27.9 27.9 31.5 71.8 8.6 103.9l-1.1 1.6c-10.3 14.4-6.9 34.4 7.4 44.6s34.4 6.9 44.6-7.4l1.1-1.6C433.5 260.8 427 182 377 132c-56.5-56.5-148-56.5-204.5 0L60.2 244.3z"/></svg> damien-datasci-blog.netlify.app](https://damien-datasci-blog.netlify.app)  
[<svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M16.1 260.2c-22.6 12.9-20.5 47.3 3.6 57.3L160 376V479.3c0 18.1 14.6 32.7 32.7 32.7c9.7 0 18.9-4.3 25.1-11.8l62-74.3 123.9 51.6c18.9 7.9 40.8-4.5 43.9-24.7l64-416c1.9-12.1-3.4-24.3-13.5-31.2s-23.3-7.5-34-1.4l-448 256zm52.1 25.5L409.7 90.6 190.1 336l1.2 1L68.2 285.7zM403.3 425.4L236.7 355.9 450.8 116.6 403.3 425.4z"/></svg> damien.dupre@dcu.ie](mailto:damien.dupre@dcu.ie)