class: title-slide

<br>
<br>
.right-panel[ 
<br>

# Interpreting Data Visualizations

## Dr. Mine Dogucu

]

---
class: middle

## Reminder

- Close all apps on your computer other than Zoom.
- Open slides for this session from the cluster website (https://uci-dshs.netlify.app/).

---
class: middle

[Limbo Lines: Dead Here, Alive There](https://pudding.cool/2018/02/death/)

[Anxiety: It Gets Worse, Then It Gets Better](https://www.instagram.com/p/B4U81DJlqTG/)

[The Wealth & Health of Nations](https://observablehq.com/@mbostock/the-wealth-health-of-nations)

---
class: middle

Data Visualizations

- are graphical representations of data

--

- use different colors, shapes, and the coordinate system to summarize data

--

- tell a story

--

- are useful for exploring data

---
class:inverse middle

.font75[Visuals with a Single Categorical Variable]

---

## Bar plot

.pull-left[
<img src="02b-interpret-dataviz_files/figure-html/unnamed-chunk-3-1.png" style="display: block; margin: auto;" />
]

.pull-right[
<img src="02b-interpret-dataviz_files/figure-html/unnamed-chunk-4-1.png" style="display: block; margin: auto;" />
]

---
class:inverse middle

.font75[Visuals with a Single Numeric Variable]

---

## Box plot

.pull-left[
<img src="02b-interpret-dataviz_files/figure-html/unnamed-chunk-5-1.png" style="display: block; margin: auto;" />
]

.pull-right[

- The horizontal line inside the box represents the median.
- The box itself represents the middle 50% of the data, with Q3 at the upper end and Q1 at the lower end.
- Whiskers extend from the box. They can reach up to 1.5 × IQR beyond the box (i.e., beyond Q1 and Q3).
- The points beyond the whiskers are potential outliers: babies with unusually low or high birth weights.
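
The box plot rules listed here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the weights are made-up values (not the data behind the figure), `box_plot_summary` is a hypothetical helper, and the quartiles use the "inclusive" convention of Python's `statistics.quantiles`, which may differ slightly from the quartiles your plotting software computes.

```python
from statistics import quantiles

def box_plot_summary(values):
    """Return the quantities a box plot draws (hypothetical helper)."""
    data = sorted(values)
    # Quartiles; other software may use a slightly different convention.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Whiskers may reach at most 1.5 * IQR beyond the box.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        # Each whisker ends at the most extreme observation inside the fences.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        # Points outside the fences are drawn individually.
        "outliers": outliers,
    }

# Made-up birth weights in ounces, for illustration only.
weights_oz = [55, 96, 104, 108, 112, 115, 118, 120, 123, 127, 131, 136, 176]
summary = box_plot_summary(weights_oz)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Note that a whisker stops at an actual observation, so it is often shorter than the full 1.5 × IQR it is allowed to span.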
]

---

## Histogram

.pull-left[
Bin width = 5 ounces

<img src="02b-interpret-dataviz_files/figure-html/unnamed-chunk-6-1.png" style="display: block; margin: auto;" />
]

.pull-right[
Bin width = 20 ounces

<img src="02b-interpret-dataviz_files/figure-html/unnamed-chunk-7-1.png" style="display: block; margin: auto;" />
]

---
class: middle

[Exploring Histograms Interactively](http://tinlizzie.org/histograms/)

---
class: middle center

[There is no "best" number of bins](https://en.wikipedia.org/wiki/Histogram#Number_of_bins_and_width)

---

## Fun fact

__histo__ comes from the Greek word _histos_, which literally means "anything set upright".

__gram__ comes from the Greek word _gramma_, which means "that which is drawn".

.footnote[Online Etymology Dictionary]

---

## Histogram vs. Boxplot

.pull-left[
Tail tells the tale.
]

---
class: middle

## In Breakout Rooms

Discuss:

- In right-skewed distributions, mean > median: true or false?
- In left-skewed distributions, mean > median: true or false?

---
class: inverse middle center

.font75[Visuals with Two Categorical Variables]

---

## Standardized Bar Plot

<img src="02b-interpret-dataviz_files/figure-html/unnamed-chunk-10-1.png" style="display: block; margin: auto;" />

---

## Dodged Bar Plot

<img src="02b-interpret-dataviz_files/figure-html/unnamed-chunk-11-1.png" style="display: block; margin: auto;" />

---
class: middle inverse

.font75[Visuals with a Single Numerical and a Single Categorical Variable]
---

## Side-by-side box plots

<img src="02b-interpret-dataviz_files/figure-html/unnamed-chunk-12-1.png" style="display: block; margin: auto;" />

---
class: inverse middle

.font75[Visuals with Two Numerical Variables]

---

## Scatter plots

<img src="02b-interpret-dataviz_files/figure-html/unnamed-chunk-13-1.png" style="display: block; margin: auto;" />

Length of gestation can **possibly** eXplain a baby's birth weight. Gestation is the eXplanatory variable and is shown on the x-axis. Birth weight is the response variable and is shown on the y-axis.

---

## Linear Relationship

<img src="02b-interpret-dataviz_files/figure-html/unnamed-chunk-14-1.png" style="display: block; margin: auto;" />

In Week 3, we will start statistical modeling, where we will describe the relationship between gestation and birth weight numerically. For now, we can say that this relationship is positive and moderate.
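
---

## Putting a Number on "Positive and Moderate"

One common way to summarize the direction and strength of a linear relationship is the Pearson correlation coefficient. The sketch below is an assumption about how that could look, not necessarily the approach taken in Week 3. The values are made up for illustration (they are not the data in the scatter plot), and `pearson_r` is a hypothetical helper.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)

# Made-up values for illustration only (NOT the data plotted in the slides).
gestation_days = [250, 262, 270, 275, 281, 284, 290, 296]   # explanatory (x)
birth_weight_oz = [98, 120, 105, 131, 112, 127, 135, 124]   # response (y)

r = pearson_r(gestation_days, birth_weight_oz)
print(f"r = {r:.2f}")
```

A positive `r` matches an upward-sloping cloud of points, and values closer to 1 indicate a stronger linear relationship.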