Data Visualization For Data Science STAT-598


Instructors
William S. Cleveland (Statistics), wsc@purdue.edu
Yining Ding (Statistics), ding238@purdue.edu

Visualization Science
priceonomics.com/how-william-cleveland-turned-data-visualization\

Prerequisites
Knowledge of very basic probability and statistics, and mathematics through calculus and linear algebra. The course uses R and spends some time on it. No previous knowledge of R is required, but previous R experience is helpful.

Primary Audience
Graduate and master's students in university departments where data are analyzed.

Credits
3

Date Time
Tue/Thu 1:30 - 2:45

Location
Stanley Coulter Hall G046

Content: Topic 1
Learn More R Programming. The parent of R is the S Language for data science. S is the only language for data analysis that has won the ACM Software System Award.

Learn R

Rintro.pdf

BookKabacoff.RinAction.pdf

Introduction2R.pdf

Paradis.RforBeginners.pdf

R.CoreTeam.Introduction.pdf

Content: Topic 3
The content is the Trellis Display framework for data visualization. It is based on the divide and recombine approach to data analysis. Variables in the data are divided into subsets by conditioning on other variables. An analytic method is applied to each subset, and the results are displayed in an array of panels laid out across columns, rows, and "pages". Participants will learn many visualization concepts, methods, and algorithms, as well as principles of display that enhance the visual decoding of the information on visual displays. Trellis display is implemented by the lattice package in R. It comes with the R core distribution, so it is ready to go when R is installed. Participants will learn in detail how to use lattice graphics.
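
As a rough illustration of the divide, apply, and recombine idea described above, here is a minimal lattice sketch. It is not taken from the course materials. The choice of the built-in mtcars data, the conditioning variable cyl, the least-squares line, and the 3 by 1 layout are all assumptions made for the example.

```r
library(lattice)  # lattice ships with the standard R distribution

# Assumption for illustration: use the built-in mtcars data (32 cars).
cars <- mtcars
cars$cyl <- factor(cars$cyl,
                   labels = c("4 cylinders", "6 cylinders", "8 cylinders"))

# Divide:    condition on cyl, so each level of cyl gets its own panel.
# Apply:     the panel function draws the points and a least-squares line
#            for the subset of data in that panel.
# Recombine: the panels are arranged as 3 columns by 1 row on one page.
p <- xyplot(
  mpg ~ wt | cyl,
  data = cars,
  layout = c(3, 1),
  panel = function(x, y, ...) {
    panel.grid(h = -1, v = -1)   # reference grid aligned with the tick marks
    panel.xyplot(x, y, ...)      # points for this subset
    panel.lmline(x, y, lty = 2)  # least-squares line for this subset
  },
  xlab = "Weight (1000 lb)",
  ylab = "Miles per gallon"
)

print(p)  # a trellis object is drawn when it is printed
```

Because every panel shares the same scales by default, the slopes and levels can be compared directly across the three subsets. Swapping panel.lmline for another panel function (for example panel.loess) changes the method applied to each subset without changing the layout.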

Content: Topic 4
UNIX: The Linux operating system, BASH, kernel

Participant Responsibilities
Participants are expected to attend class and successfully carry out in-class assignments. Attendance is critical because the class is a team. Out-of-class time requirements are modest.

Lectures

VisualizingData.pdf VisualizingData

R-Graphics Paul Murrell R-Graphics

Collection of Datasets datasets.RDdata

Examples of Plots Using the Datasets

LatticeGraphicsExamples.txt

LatticeGraphicsExamples.pdf

Documentation for Lattice Graphics

Program Lattice 101 ProgramLattice101

Program Lattice 102 ProgramLattice102

Program Lattice 103 ProgramLattice103

Online Lattice Graphics Documentation Online Documentation

Sarkar Lattice Documentation 2April2020 Sarkar Lattice Graphics Documentation

Programming Lattice 2 Sarkar Lattice Graphics

R Manual Written by R Core Team R Manual

bookfunctions.pdf

Chicago Crash Data: 398,452 ChiCrash

Description of Chicago Crime Data Variables Variable Descriptions

Chicago Crime Data: 1000 Crimes ChiCrime Data1000

How to Read a csv File into R: ChiCrime1000 <- read.csv("ChiCrime1000.csv", header = TRUE, sep = ",")

Chicago Crime Data: 1 Million Crimes ChiCrime DataOneMillion

More Chicago Crime Data Crimes: 22 fields Each

Chicago Crime Data: R Programming R Code

Panel and Prepanel Functions Panel Function

Trellis Display: The Beginnings with S. An Introduction to Trellis Display with Lattice Graphics

general.display.functions.txt gdf

Deepayan Sarkar: Lattice Multivariate Data Visualization with R Springer Books

00frontmatter.pdf 00

01introduction.pdf 01

02technicaloverview.pdf 02

03univariate.pdf 03

04multiwaytables.pdf 04

05scatterplots.pdf 05

06trivariate.pdf 06

07parametersettings.pdf 07

08annotations.pdf 08

09labelslegends.pdf 09

10datamanipulation.pdf 10

11trellisobject.pdf 11

12interacting.pdf 12

13panelfunctions.pdf 13

14newdisplays.pdf 14

bookfunctions with lattice code examples

bookfunctions.pdf

12interacting.pdf bookfunctionsourcepdf.txt

DataCamp

This class is supported by DataCamp, the most intuitive learning platform for data science. Learn R the way you learn best through a combination of short expert videos and hands-on-the-keyboard exercises. Take 100+ courses by expert instructors on topics such as importing data, data visualization, or machine learning, and learn faster through immediate and personalised feedback on every exercise.