class: center, middle, inverse, title-slide

.title[
# BAA1030 - Data Analytics and Story Telling
]
.subtitle[
## Lecture 3: Principles of Data Visualisation
]
.author[
### Damien Dupré - Dublin City University
]

---

# Previously ...

### After having:

#### 1. Downloaded your Data

#### 2. Cleaned your Data

#### 3. Transformed your Data

It is time to look at your Data!

### Data visualisations are important:

- For the Data Analyst/Scientist to **understand** what is going on in the data (e.g. spotting errors and unexpected values, outlining the analyses to perform)
- For the reader to access the message and to **be convinced** by the arguments

**Always look at your data before running any analysis, and make your visualisations convincing!**

---

# Visualisation Requirement

### Variables have different types

- **Categorical**: If the variable's possible values are words or sentences (character strings)
  - if the values cannot be ordered: Categorical Nominal (*e.g.*, gender)
  - if the values can be ordered: Categorical Ordinal (*e.g.*, opinion scales)
- **Continuous**: If the variable's possible values are numbers (*e.g.*, age or temperature)
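To make the three variable types concrete, here is a minimal Python sketch using only the standard library. The sample values and the `LIKERT_ORDER` list are invented for illustration and are not lecture data. It shows which summary suits each type: counting for Nominal, ordering for Ordinal, and arithmetic for Continuous.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample data, invented for this illustration.
gender = ["Female", "Male", "Other", "Female", "Male", "Female", "Male"]
opinion = ["Agree", "Disagree", "Neutral", "Agree", "Agree", "Neutral", "Disagree"]
age = [21, 34, 19, 45, 28, 30, 26]

# Categorical Nominal: no order between categories, so counting is the
# natural summary.
gender_counts = Counter(gender)

# Categorical Ordinal: the order exists but must be stated explicitly,
# otherwise sorting falls back to alphabetical order.
LIKERT_ORDER = ["Disagree", "Neutral", "Agree"]  # assumed 3-point scale
opinion_sorted = sorted(opinion, key=LIKERT_ORDER.index)
# With an odd number of answers, the middle one is a meaningful summary.
opinion_middle = opinion_sorted[len(opinion_sorted) // 2]

# Continuous: numbers support arithmetic such as the mean.
age_mean = mean(age)

print(gender_counts)
print(opinion_sorted)
print(opinion_middle)
print(age_mean)
```

Sorting the opinions without `LIKERT_ORDER` would put "Agree" before "Disagree", which is one reason to keep a variable on its correct scale.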
<i class="fas fa-exclamation-triangle faa-flash animated faa-slow " style=" color:red;"></i>
Variables can be converted to either Categorical or Continuous, but it is always better to keep them on their correct scale.

Example:

> Likert items (opinion questions from disagree to agree) are Categorical Ordinal but are often analysed as Continuous.

> A Gender item (male vs. female) is Categorical Nominal but can be recoded as 1 and 2 to be analysed with Continuous calculations.

---
class: title-slide, middle

## Exercises

---

# Guess Variable Type - Exercise (1)

<table> <thead> <tr> <th style="text-align:left;"> Variable Name </th> <th style="text-align:left;"> Variable Range </th> <th style="text-align:left;"> Variable Type </th> </tr> </thead> <tbody>
<tr> <td style="text-align:left;width: 10em; "> Gender </td> <td style="text-align:left;width: 20em; "> Male/Female/Other </td> <td style="text-align:left;width: 20em; "> - </td> </tr>
<tr> <td style="text-align:left;width: 10em; "> Country </td> <td style="text-align:left;width: 20em; "> Ireland/France/USA </td> <td style="text-align:left;width: 20em; "> - </td> </tr>
<tr> <td style="text-align:left;width: 10em; "> Age </td> <td style="text-align:left;width: 20em; "> 0 to Inf </td> <td style="text-align:left;width: 20em; "> - </td> </tr>
<tr> <td style="text-align:left;width: 10em; "> Age </td> <td style="text-align:left;width: 20em; "> 0-19, 20-29, 30-39, 40-49 </td> <td style="text-align:left;width: 20em; "> - </td> </tr>
<tr> <td style="text-align:left;width: 10em; "> Price </td> <td style="text-align:left;width: 20em; "> 0 to Inf </td> <td style="text-align:left;width: 20em; "> - </td> </tr>
<tr> <td style="text-align:left;width: 10em; "> Temperature </td> <td style="text-align:left;width: 20em; "> -Inf to Inf </td> <td style="text-align:left;width: 20em; "> - </td> </tr>
<tr> <td style="text-align:left;width: 10em; "> IQ </td> <td style="text-align:left;width: 20em; "> Low/Mid/High </td> <td style="text-align:left;width: 20em; "> - </td> </tr>
<tr> <td style="text-align:left;width: 10em; "> Results </td> <td
style="text-align:left;width: 20em; "> 0 to 100 </td> <td style="text-align:left;width: 20em; "> - </td> </tr> <tr> <td style="text-align:left;width: 10em; "> Results </td> <td style="text-align:left;width: 20em; "> Fail/Pass/2:1/First </td> <td style="text-align:left;width: 20em; "> - </td> </tr> </tbody> </table>
--- # Guess Variable Type - Solution (1) <table> <thead> <tr> <th style="text-align:left;"> Variable Name </th> <th style="text-align:left;"> Variable Range </th> <th style="text-align:left;"> Variable Type </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;width: 10em; "> Gender </td> <td style="text-align:left;width: 20em; "> Male/Female/Other </td> <td style="text-align:left;width: 20em; "> Categorical Nominal </td> </tr> <tr> <td style="text-align:left;width: 10em; "> Country </td> <td style="text-align:left;width: 20em; "> Ireland/France/USA </td> <td style="text-align:left;width: 20em; "> Categorical Nominal </td> </tr> <tr> <td style="text-align:left;width: 10em; "> Age </td> <td style="text-align:left;width: 20em; "> 0 to Inf </td> <td style="text-align:left;width: 20em; "> Continuous </td> </tr> <tr> <td style="text-align:left;width: 10em; "> Age </td> <td style="text-align:left;width: 20em; "> 0-19, 20-29, 30-39, 40-49 </td> <td style="text-align:left;width: 20em; "> Categorical Ordinal </td> </tr> <tr> <td style="text-align:left;width: 10em; "> Price </td> <td style="text-align:left;width: 20em; "> 0 to Inf </td> <td style="text-align:left;width: 20em; "> Continuous </td> </tr> <tr> <td style="text-align:left;width: 10em; "> Temperature </td> <td style="text-align:left;width: 20em; "> -Inf to Inf </td> <td style="text-align:left;width: 20em; "> Continuous </td> </tr> <tr> <td style="text-align:left;width: 10em; "> IQ </td> <td style="text-align:left;width: 20em; "> Low/Mid/High </td> <td style="text-align:left;width: 20em; "> Categorical Ordinal </td> </tr> <tr> <td style="text-align:left;width: 10em; "> Results </td> <td style="text-align:left;width: 20em; "> 0 to 100 </td> <td style="text-align:left;width: 20em; "> Continuous </td> </tr> <tr> <td style="text-align:left;width: 10em; "> Results </td> <td style="text-align:left;width: 20em; "> Fail/Pass/2:1/First </td> <td style="text-align:left;width: 20em; "> Categorical Ordinal </td> </tr> 
</tbody> </table> --- # Guess Variable Type - Exercise (2) <img src="https://raw.githubusercontent.com/damien-dupre/img/main/viz_exercise_1.png" width="100%" style="display: block; margin: auto;" /> <table class="table" style="font-size: 14px; color: black; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> Variable Name </th> <th style="text-align:left;"> Variable Type </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;width: 10em; "> productName </td> <td style="text-align:left;width: 20em; "> - </td> </tr> <tr> <td style="text-align:left;width: 10em; "> supplierID </td> <td style="text-align:left;width: 20em; "> - </td> </tr> <tr> <td style="text-align:left;width: 10em; "> categoryID </td> <td style="text-align:left;width: 20em; "> - </td> </tr> <tr> <td style="text-align:left;width: 10em; "> quantityPerUnit </td> <td style="text-align:left;width: 20em; "> - </td> </tr> <tr> <td style="text-align:left;width: 10em; "> unitPrice </td> <td style="text-align:left;width: 20em; "> - </td> </tr> <tr> <td style="text-align:left;width: 10em; "> unitsInStock </td> <td style="text-align:left;width: 20em; "> - </td> </tr> </tbody> </table>
--- # Guess Variable Type - Solution (2) <img src="https://raw.githubusercontent.com/damien-dupre/img/main/viz_exercise_1.png" width="100%" style="display: block; margin: auto;" /> <table class="table" style="font-size: 14px; color: black; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> Variable Name </th> <th style="text-align:left;"> Variable Type </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;width: 10em; "> productName </td> <td style="text-align:left;width: 20em; "> Categorical Nominal </td> </tr> <tr> <td style="text-align:left;width: 10em; "> supplierID </td> <td style="text-align:left;width: 20em; "> Categorical Nominal </td> </tr> <tr> <td style="text-align:left;width: 10em; "> categoryID </td> <td style="text-align:left;width: 20em; "> Categorical Nominal </td> </tr> <tr> <td style="text-align:left;width: 10em; "> quantityPerUnit </td> <td style="text-align:left;width: 20em; "> Categorical Nominal </td> </tr> <tr> <td style="text-align:left;width: 10em; "> unitPrice </td> <td style="text-align:left;width: 20em; "> Continuous </td> </tr> <tr> <td style="text-align:left;width: 10em; "> unitsInStock </td> <td style="text-align:left;width: 20em; "> Continuous </td> </tr> </tbody> </table> --- class: inverse, mline, center, middle # 1. The Key Figures --- # Why Using Figures #### 1. When creating a report (academic or industrial), always **tell a story/narrative** #### 2. The figure is here to **support** your story/narrative and to make it **more convincing**
<i class="fas fa-exclamation-triangle faa-flash animated faa-slow " style=" color:red;"></i>
Be careful: figures can lie or mislead. Use them with ethics and professionalism.

<img src="https://i0.wp.com/flowingdata.com/wp-content/uploads/2017/09/Fancy-dataviz-vs-best-chart-for-the-data.png" width="60%" style="display: block; margin: auto;" />

.center.tiny[Source: www.flowingdata.com [🔗](https://i0.wp.com/flowingdata.com/wp-content/uploads/2017/09/Fancy-dataviz-vs-best-chart-for-the-data.png)]

---

# Master the Key Figures

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/viz_key_figures_1.jpeg" width="100%" style="display: block; margin: auto;" />

---

# Master the Key Figures

### Keep your figures:

.pull-left[

#### 1. Clear

Even complex figures should be crystal clear; use design features (colours, sizes, ...) to make them so

#### 2. Self-Explanatory

Always use a caption, a legend and well-defined axes

]

.pull-right[

<img src="https://venngage-wordpress.s3.amazonaws.com/uploads/2020/06/image17.png" width="100%" style="display: block; margin: auto;" />

.center.tiny[Source: www.venngage.com [🔗](https://venngage-wordpress.s3.amazonaws.com/uploads/2020/06/image17.png)]

]

---

# Master the Key Figures

The main interest of figures is their ability to ...
aggregate data into meaningful information

This **aggregation** is done with basic statistical calculations:

.pull-left[

<br>

|Student Number |Group |Mark |
|:--------------|:-----|:----|
|S1 |A |60 |
|S2 |B |70 |
|S3 |A |60 |
|S4 |B |55 |
|S5 |A |65 |
|S6 |B |65 |
|S7 |A |65 |

]

.pull-right[

- **Count**
  - Group A = 4 students
  - Group B = 3 students
- **Proportion**
  - Group A = 57% (4/7)
  - Group B = 43% (3/7)
- **Density**
  - 0 to 50 = 0%, 55 = 14%, 60 = 29%,
  - 65 = 43%, 70 = 14%, 75 to 100 = 0%
- **Median/Quartiles**
  - Q1 = 60, Med = 65, Q3 = 65
- **Mean/Standard Deviation**
  - Mean = 62.9, SD = 4.9

]

---

# Zoom on Median/Quartiles

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/viz_med_quartiles_1.png" width="100%" style="display: block; margin: auto;" />

---

# Zoom on Median/Quartiles

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/viz_med_quartiles_2.png" width="50%" style="display: block; margin: auto;" />

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/viz_med_quartiles_3.png" width="50%" style="display: block; margin: auto;" />

---

# Zoom on Mean/SD

- **Mean or Average** is the sum of all the values divided by the number of values
- **Standard Deviation** is the square root of the average squared distance to the mean

<img src="https://s4be.cochrane.org/app/uploads/2018/09/Image-1-Standard-deviation-Standard-error-.jpg" width="70%" style="display: block; margin: auto;" />

---

# Zoom on Correlation

The line drawn between 2 Continuous variables is a Regression Line. It corresponds to the best fit:

`$$Y = b_0 + b_1\,X + e$$`

.pull-left[

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/viz_correlation_1.png" width="100%" style="display: block; margin: auto;" />

]

.pull-right[

- `\(b_0\)` is **the intercept**, which corresponds to the value of Y when X is 0
- `\(b_1\)` is **the slope** of the line, which indicates how much Y changes when X increases by one unit
- `\(e\)` is **the error**, which corresponds to the distance between the points and the line (also called the residual)

]

---

# Zoom on Correlation

Correlation is obtained by drawing the **best line** between the points and by calculating the slope of this line once both variables are standardised

Correlation is a value **from -1 to +1** which indicates the strength and direction of a relationship:

- Close to +1: strong positive relationship
- Close to -1: strong negative relationship
- Close to 0: no relationship

Examples:

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/viz_correlation_2.png" width="70%" style="display: block; margin: auto;" />

---

# Master the Key Figures
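The aggregation statistics from the seven-student table, and the regression and correlation definitions from the preceding slides, can be checked with a short Python sketch using only the standard library. The marks and groups are those of the table. The `hours` values are hypothetical, invented only to provide a second Continuous variable to regress the marks on. The quartiles use the default "exclusive" method of `statistics.quantiles`; other quartile conventions can give slightly different values.

```python
from collections import Counter
from math import sqrt
from statistics import mean, median, quantiles, stdev

# Data from the seven-student table.
groups = ["A", "B", "A", "B", "A", "B", "A"]
marks = [60, 70, 60, 55, 65, 65, 65]

# Count and proportion per group.
counts = Counter(groups)
proportions = {g: c / len(groups) for g, c in counts.items()}

# Density: share of students at each observed mark.
density = {m: c / len(marks) for m, c in sorted(Counter(marks).items())}

# Median and quartiles (default "exclusive" method).
q1, q2, q3 = quantiles(marks, n=4)

# Mean and sample standard deviation (stdev divides by n - 1).
mark_mean = mean(marks)
mark_sd = stdev(marks)

print(counts, proportions)
print(density)
print(q1, median(marks), q3)
print(round(mark_mean, 2), round(mark_sd, 2))

# Regression line Y = b0 + b1 * X + e, fitted by least squares.
# ASSUMPTION: 'hours' is hypothetical study time, not lecture data.
hours = [3, 6, 4, 2, 5, 5, 4]
x_mean, y_mean = mean(hours), mark_mean
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(hours, marks))
sxx = sum((x - x_mean) ** 2 for x in hours)
syy = sum((y - y_mean) ** 2 for y in marks)

b1 = sxy / sxx                # slope
b0 = y_mean - b1 * x_mean     # intercept
residuals = [y - (b0 + b1 * x) for x, y in zip(hours, marks)]  # e

# Pearson correlation: the slope rescaled by the spread of X and Y.
r = sxy / sqrt(sxx * syy)

print(round(b0, 2), round(b1, 2), round(r, 2))
```

On the table data the mean is about 62.86 (62.9 to one decimal place) and the sample standard deviation about 4.88. The correlation equals the slope multiplied by the ratio of the standard deviations of X and Y, so it always has the same sign as the slope but stays between -1 and +1.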
---
class: title-slide, middle

## COMPOSITION Type Figures

---

# COMPOSITION Type Figures

- Count of different categories

.pull-left[
<img src="lecture_3_files/figure-html/unnamed-chunk-22-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="lecture_3_files/figure-html/unnamed-chunk-23-1.png" width="100%" style="display: block; margin: auto;" />
]

- Proportion of different categories

.pull-left[
<img src="lecture_3_files/figure-html/unnamed-chunk-24-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="lecture_3_files/figure-html/unnamed-chunk-25-1.png" width="100%" style="display: block; margin: auto;" />
]

---

# When to Use a Pie Chart?

Use a pie chart when the variable has only two categories:

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/viz_piechart_1.jpeg" width="50%" style="display: block; margin: auto;" />

If your variable has more than 2 categories, use a bar chart instead:

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/viz_piechart_2.jpeg" width="50%" style="display: block; margin: auto;" />

See: https://depictdatastudio.com/when-pie-charts-are-okay-seriously-guidelines-for-using-pie-and-donut-charts/

---

# When to Use a Pie Chart?
Here is a pie chart that works:

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/example_piechart.jpg" width="50%" style="display: block; margin: auto;" />

---
class: title-slide, middle

## DISTRIBUTION Type Figures

---

# DISTRIBUTION Type Figures

## .center[Histogram - Density - BoxPlot - Dynamite]

<img src="lecture_3_files/figure-html/unnamed-chunk-29-1.png" width="194.4" /><img src="lecture_3_files/figure-html/unnamed-chunk-29-2.png" width="194.4" /><img src="lecture_3_files/figure-html/unnamed-chunk-29-3.png" width="194.4" /><img src="lecture_3_files/figure-html/unnamed-chunk-29-4.png" width="194.4" />

---
class: title-slide, middle

## COMPARISON Type Figures

---

# COMPARISON - Categorical/Categorical

Count of different values

.center[
<img src="lecture_3_files/figure-html/unnamed-chunk-30-1.png" width="360" /><img src="lecture_3_files/figure-html/unnamed-chunk-30-2.png" width="360" />
]

Proportion of different values

.center[
<img src="lecture_3_files/figure-html/unnamed-chunk-31-1.png" width="360" />
]

---

# COMPARISON - Categorical/Continuous

Count of different values according to Categories

<img src="lecture_3_files/figure-html/unnamed-chunk-32-1.png" width="50%" /><img src="lecture_3_files/figure-html/unnamed-chunk-32-2.png" width="50%" />

---

# COMPARISON - Categorical/Continuous

Density of different values according to Categories

<img src="lecture_3_files/figure-html/unnamed-chunk-33-1.png" width="50%" /><img src="lecture_3_files/figure-html/unnamed-chunk-33-2.png" width="50%" />

---

# COMPARISON - Categorical/Continuous

Median/Quartiles according to Categories

<img src="lecture_3_files/figure-html/unnamed-chunk-34-1.png" width="50%" /><img src="lecture_3_files/figure-html/unnamed-chunk-34-2.png" width="50%" />

---

# COMPARISON - Categorical/Continuous

Mean/Standard Deviation according to Categories

<img src="lecture_3_files/figure-html/unnamed-chunk-35-1.png" width="50%" style="display: block; margin: auto;" />

---
class: title-slide, middle
## RELATIONSHIP Type Figures

---

# RELATIONSHIP - Continuous/Continuous

<img src="lecture_3_files/figure-html/unnamed-chunk-36-1.png" width="360" /><img src="lecture_3_files/figure-html/unnamed-chunk-36-2.png" width="360" />

Continuous X Continuous X Categorical

<img src="lecture_3_files/figure-html/unnamed-chunk-37-1.png" width="360" /><img src="lecture_3_files/figure-html/unnamed-chunk-37-2.png" width="360" />

---
class: inverse, mline, center, middle

# 2. Customise your Figures

---

# Master the Key Figures

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/viz_key_figures_2.jpeg" width="70%" style="display: block; margin: auto;" />

---

# Customise Figures

.pull-left[

- Colours
- Dot size
- Shapes
- Position (*i.e.* stack or dodge)
- Orientation (vertical or horizontal)
- Text content and font
- Figure legend

]

.pull-right[

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/viz_custom.png" width="100%" style="display: block; margin: auto;" />

]

---

# Use Figures

### Use Figures only if they ...

- Are self-explanatory
- Help the reader understand your message (a figure is not a message)

--

### Don’t duplicate the information

- Always give the information in the main text of your report but
- Don’t show a table and a figure with the same information

--

### To tell a story!

Build a narrative around your Figures

- A narrative is a set of observations, facts, or events, true or invented, presented in a specific order so that they create an emotional reaction in the audience
- Even if you think data are boring, make them interesting for your audience

---
class: title-slide, middle

## Exercise

---

# What’s Wrong Here?

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_exercise_1.png" width="70%" style="display: block; margin: auto;" />

.center.tiny[Source: Official Twitter Account of The White House [🔗](https://twitter.com/WhiteHouse/status/1486709480351952901)]

---

# What’s Wrong Here?
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_exercise_2.png" width="80%" style="display: block; margin: auto;" /> .center.tiny[Source: Misleading Graphs in Real Life: Overview [🔗](https://www.statisticshowto.com/probability-and-statistics/descriptive-statistics/misleading-graphs/)] --- # What’s Wrong Here? <img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_exercise_3.jpeg" width="100%" style="display: block; margin: auto;" /> .center.tiny[Source: The statisticians at Fox News use classic and novel graphical techniques to lead with data [🔗](https://simplystatistics.org/posts/2012-11-26-the-statisticians-at-fox-news-use-classic-and-novel-graphical-techniques-to-lead-with-data/)] --- # What’s Wrong Here? <img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_exercise_4.png" width="80%" style="display: block; margin: auto;" /> .center.tiny[Source: The statisticians at Fox News use classic and novel graphical techniques to lead with data [🔗](https://simplystatistics.org/posts/2012-11-26-the-statisticians-at-fox-news-use-classic-and-novel-graphical-techniques-to-lead-with-data/)] --- # What’s Wrong Here? <img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_exercise_12.jpg" width="80%" style="display: block; margin: auto;" /> .center.tiny[Source: KCRA graph on the 7pm news - r/Sacramento [🔗](https://www.reddit.com/r/Sacramento/comments/tysyxz/kcra_graph_on_the_7pm_news/)] --- # What’s Wrong Here? <img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_exercise_5.jpeg" width="100%" style="display: block; margin: auto;" /> .center.tiny[Source: 5 Ways Writers Use Misleading Graphs To Manipulate You [🔗](https://venngage.com/blog/misleading-graphs/)] --- # What’s Wrong Here? 
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_exercise_6.jpeg" width="80%" style="display: block; margin: auto;" /> .center.tiny[Source: The Worst Covid-19 Misleading Graphs [🔗](https://www.datasciencecentral.com/the-worst-covid-19-misleading-graphs/)] --- # What’s Wrong Here? <img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_exercise_7.png" width="60%" style="display: block; margin: auto;" /> .center.tiny[Source: Misleading Graphs in Real Life: Overview [🔗](https://www.statisticshowto.com/probability-and-statistics/descriptive-statistics/misleading-graphs/)] --- # What’s Wrong Here? <img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_exercise_8.jpeg" width="80%" style="display: block; margin: auto;" /> .center.tiny[Source: Health Care in Crisis: 14,000 Losing Coverage Each Day [🔗](https://www.americanprogressaction.org/issues/healthcare/reports/2009/02/19/5635/health-care-in-crisis-14000-losing-coverage-each-day/)] --- # What’s Wrong Here? <img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_exercise_9.jpeg" width="100%" style="display: block; margin: auto;" /> .center.tiny[Source: Nick Cage Movies Vs. Drownings, and More Strange (but Spurious) Correlations [🔗](https://www.nationalgeographic.com/science/article/nick-cage-movies-vs-drownings-and-more-strange-but-spurious-correlations)] --- # What’s Wrong Here? 
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_exercise_10.jpeg" width="70%" style="display: block; margin: auto;" /> .center.tiny[Source: Fact checking Trump's 'Impeach this' map [🔗](https://edition.cnn.com/2019/10/01/politics/trump-impeach-this-map-fact-check/index.html)] --- # Land Doesn’t Vote, People Do <img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_exercise_11.gif" width="100%" style="display: block; margin: auto;" /> .center.tiny[Source: Land Doesn’t Vote, People Do: This Electoral Map Tells the Real Story [🔗](https://demcastusa.com/2019/11/11/land-doesnt-vote-people-do-this-electoral-map-tells-the-real-story/)] --- # A Must Read > “Mistakes, we’ve drawn a few. Learning from our errors in data visualisation” by The Economist https://medium.economist.com/mistakes-weve-drawn-a-few-8cdd8a42d368 Example: <img src="https://miro.medium.com/max/2560/1*9QE_yL3boSLqopJkSBfL5A.png" width="100%" style="display: block; margin: auto;" /> --- class: inverse, mline, center, middle # 3. Fundamentals of Data Visualization --- # The Book .pull-left[ **Fundamentals of Data Visualization** by Claus O. 
Wilke

Free online at https://clauswilke.com/dataviz/

Provides basic principles for data visualisation

]

.pull-right[

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_principle_1.png" width="100%" style="display: block; margin: auto;" />

]

---

# Principle 1: Axes

### .center[Don’t cheat with axes: include 0 when 0 is meaningful]

.center[
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_principle_2.png" width="45%" /><img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_principle_3.png" width="45%" />
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_principle_4.png" width="45%" /><img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_principle_5.png" width="45%" />
]

---

# Principle 1: Axes

### .center[Don’t cheat with axes: include 0 when 0 is meaningful]

.center[
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_principle_6.png" width="50%" /><img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_principle_7.png" width="50%" />
]

---

# Principle 2: Make it Easy

### .center[Make your figure easy to read and to understand]

.center[
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_principle_8.png" width="50%" /><img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_principle_9.png" width="50%" />
]

---

# Principle 3: Be Fancy

### .center[Be fancy: use extra features, such as transparency and jitter, if they help]

.center[
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_principle_10.png" width="50%" /><img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_principle_11.png" width="50%" />
]

---

# Principle 3: Be Fancy

### .center[Be fancy: use extra features, such as transparency and jitter, if they help]

.center[
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_principle_12.png" width="50%" /><img
src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_principle_13.png" width="50%" />
]

---

# Principle 4: Colours

### .center[Use colours, but not too many, and be mindful of colour vision deficiencies]

.pull-left[
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_principle_14.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_principle_15.png" width="100%" style="display: block; margin: auto;" />
]

---

# Principle 4: Colours

### .center[Keep the same colours for the same variables]

.center[
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_principle_16.png" width="50%" /><img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_principle_17.png" width="50%" />
]

---

# Principle 5: Details

### .center[Every detail counts and lazy work is easily spotted]

.center[
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_principle_18.png" width="50%" /><img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_principle_19.png" width="50%" />
]

---

# Principle 5: Details

### .center[Be precise: use captions, axis labels, and titles to achieve it]

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_principle_20.png" width="70%" style="display: block; margin: auto;" />

---
class: inverse, mline, center, middle

# 4. Visualise with MS EXCEL

---

# COMPOSITION Type Figures

Pie chart for Categorical variable

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_excel_1.gif" width="100%" style="display: block; margin: auto;" />

---

# COMPOSITION Type Figures

Bar chart for Categorical variable

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_excel_2.gif" width="100%" style="display: block; margin: auto;" />

---

# DISTRIBUTION Type Figures

Histogram for Continuous variable

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_excel_3.gif" width="100%" style="display: block; margin: auto;" />

---

# DISTRIBUTION Type Figures

Density plot for Continuous variable

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_excel_4.gif" width="100%" style="display: block; margin: auto;" />

---

# DISTRIBUTION Type Figures

Box plot for Continuous variable

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_excel_5.gif" width="100%" style="display: block; margin: auto;" />

---

# DISTRIBUTION Type Figures

Dynamite plot for Continuous variable

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_excel_6.gif" width="100%" style="display: block; margin: auto;" />

---

# COMPARISON Type Figures

Multiple histograms?

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_excel_7.gif" width="100%" style="display: block; margin: auto;" />

---

# COMPARISON Type Figures

Multiple box plots?

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_excel_8.gif" width="100%" style="display: block; margin: auto;" />

---

# COMPARISON Type Figures

Multiple dynamite plots?

<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_excel_9.gif" width="100%" style="display: block; margin: auto;" />

---

# RELATIONSHIP Type Figures

Linear regression?
<img src="https://raw.githubusercontent.com/damien-dupre/img/main/dataviz_excel_10.gif" width="100%" style="display: block; margin: auto;" />

---

# Visualisation with MS EXCEL

As spreadsheet software, MS EXCEL is fine for data cleaning and data transformation.

However, avoid using MS EXCEL for data visualisation; prefer either TABLEAU or PowerBI.

If you do use MS EXCEL, make sure that no one can tell, by customising your figures with:

- Text font
- Colours
- Background
- ...

---

# Visualisation with MS EXCEL

Avoid figures that look like this!

<img src="https://i2.wp.com/www.real-statistics.com/wp-content/uploads/2012/11/bar-chart.png" width="100%" style="display: block; margin: auto;" />

.center.tiny[Source: www.real-statistics.com [🔗](https://i2.wp.com/www.real-statistics.com/wp-content/uploads/2012/11/bar-chart.png)]

---

# Visualisation Dashboards

Over the last 10 years, dashboards have become more and more popular in organisations:

- Display multiple pieces of information at once
- GUI-based/easy to use
- Auto-update when connected to a server
- Fancy visualisation design/interactive

Two major contenders: Tableau vs. PowerBI

<table> <thead> <tr> <th style="text-align:left;"> Tableau </th> <th style="text-align:left;"> PowerBI </th> </tr> </thead> <tbody>
<tr> <td style="text-align:left;width: 20em; "> Tableau can handle a huge volume of data with better performance. </td> <td style="text-align:left;width: 20em; "> PowerBI can handle a limited volume of data. </td> </tr>
<tr> <td style="text-align:left;width: 20em; "> Tableau is a little difficult. </td> <td style="text-align:left;width: 20em; "> PowerBI Interface is very easy to learn. </td> </tr>
<tr> <td style="text-align:left;width: 20em; "> Embedding report is a real-time challenge in Tableau </td> <td style="text-align:left;width: 20em; "> Embedding report is easy with PowerBI.
</td> </tr> </tbody> </table>

.center.tiny[Source: www.guru99.com [🔗](https://www.guru99.com/tableau-vs-power-bi-difference.html)]

---

# Kubicle Mandatory Trainings

Data Presentation Fundamentals

- [Communicating Data Effectively](https://app.kubicle.com/courses/communicating-data-effectively) (90 min)
- [Telling Stories with Data](https://app.kubicle.com/courses/telling-stories-with-data) (60 min)
- [Presenting Your Data](https://app.kubicle.com/courses/presenting-your-data) (90 min)

Visualization Fundamentals

- [Visual Data Thinking](https://app.kubicle.com/courses/visual-data-thinking) (60 min)
- [Applying Visual Data Skills](https://app.kubicle.com/courses/applying-visual-data-skills) (60 min)
- [Visualization in Practice](https://app.kubicle.com/courses/visualization-in-practice) (30 min)

---
class: inverse, mline, left, middle

<img class="circle" src="https://github.com/damien-dupre.png" width="250px"/>

# Thanks for your attention and don't hesitate to ask if you have any questions!

[
@damien_dupre](http://twitter.com/damien_dupre) [
@damien-dupre](http://github.com/damien-dupre) [
damien-dupre.github.io](https://damien-dupre.github.io) [
damien.dupre@dcu.ie](mailto:damien.dupre@dcu.ie)