Information Visualization and Machine Learning
Fabrice Rossi
Projet AxIS, INRIA Rocquencourt

27-28/02/2007

F. Rossi (INRIA)

Infovis & ML

27-28/02/2007

1 / 172

Goals of this lecture

1. To give an introduction to Information Visualization (Infovis):
   - enhancement methods for classical displays
   - specialized displays
   - why you should leverage Infovis in your everyday work

2. To outline the links between Infovis and Machine Learning:
   - why do they exist?
   - current solutions
   - open research problems

3. To give examples of successful joint research:
   - Machine Learning methods designed for visualization
   - visualization of the results of Machine Learning algorithms

Organization of this lecture

Visualization oriented. Each part consists of:
- an introduction to some visualization techniques
- an analysis of their links with machine learning

Five parts:
1. Introduction to Infovis
2. Scatter plots
3. Geometrically-transformed displays
4. Iconic and pixel based displays
5. Visualization methods designed in the machine learning community

Outline of part I: Introduction

1. Information Visualization
   - Definition
   - Examples from “everyday life with KDE”
2. Infovis goals and limitations
   - What is it used for?
   - Limitations of Infovis and VDM
3. Links with machine learning
   - Formal model of Infovis
   - Machine learning

Outline of part II: Scatter plots

4. Introduction
5. Feature number reduction
   - Principles
   - Neighborhood structure preservation
6. Overlapping reduction
   - Rendering and interaction
   - Clustering

Outline of part III: Geometrically-transformed displays

7. Introduction
8. Scatter plot matrix
   - Interaction
9. Parallel Coordinates
   - Overlapping
   - Variable order

Outline of part IV: Iconic and pixel based displays

10. Introduction
11. Iconic displays
    - Chernoff’s faces
    - Star glyph
    - Glyph Positioning
12. Pixel based displays
    - Dense pixel displays
    - Dissimilarity matrix

Outline of part V: Machine Learning

13. Self Organizing Map
    - Principles
    - Visualization
14. Latent Variable Models
    - General principles
    - Generative Topographic Mapping

Part I Introduction


Section 1: Information Visualization

Information Visualization

The use of computer-supported interactive, visual representation of abstract data to amplify cognition
    Card, Mackinlay & Shneiderman

Human preattentive processing capabilities:
- non conscious processing (no thinking involved)
- low level visual system
- extremely fast: 200 ms
- scalable (no browsing ⇒ sublinear scaling)
- feature type must match data type (e.g., hue is suitable for categories, not for real values)
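The requirement that the feature type must match the data type can be illustrated with a small sketch. It is not part of the original lecture: the function names `categorical_colors` and `sequential_colors` are hypothetical, and the choice of evenly spaced hues for categories and of a lightness ramp for real values is only one reasonable encoding among others.

```python
import colorsys

def categorical_colors(categories):
    """Assign one evenly spaced hue per category (hue suits unordered labels)."""
    labels = sorted(set(categories))
    n = len(labels)
    # Same lightness and saturation everywhere: only the hue changes.
    return {label: colorsys.hls_to_rgb(i / n, 0.5, 0.9)
            for i, label in enumerate(labels)}

def sequential_colors(values, hue=0.6):
    """Map real values to a lightness ramp of a single hue (an ordered channel)."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant data
    colors = []
    for v in values:
        t = (v - lo) / span           # position of v in [0, 1]
        lightness = 0.9 - 0.7 * t     # small values light, large values dark
        colors.append(colorsys.hls_to_rgb(hue, lightness, 0.9))
    return colors

# Assumed example data: three class labels and three real measurements.
species_colors = categorical_colors(["setosa", "versicolor", "virginica"])
width_colors = sequential_colors([0.2, 1.3, 2.5])
```

With this encoding, classes differ only by hue, while larger measurements get darker shades of a single hue, so the ordering of real values remains readable.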

Preattentive processing

Can you see a color outlier?
Can you see a shape outlier?
[Figures from “Perception in Visualization” by C. G. Healey, http://www.csc.ncsu.edu/faculty/healey/PP/index.html]

Information Visualization

Tool metaphor (hammer, microscope, etc.):
- extending user possibilities:
  - more scalable processing (speed and/or volume)
  - details enhancement
  - multi-source fusion
  - etc.
- under user control

User control

[Figure: nonlinear projection of Anderson's/Fisher's Iris data]
Nonlinear projection ⇒ no user control

[Figure: scatter plot matrix of Anderson's/Fisher's Iris data (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)]
Scatter plot matrix ⇒ user control
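One way to read the contrast above: every panel of a scatter plot matrix shows a plain pair of original variables, so the user always knows what is displayed. The sketch below simply enumerates these panels for the four Iris variables. The helper name `scatter_matrix_panels` is hypothetical and no plotting library is assumed.

```python
from itertools import combinations

IRIS_VARIABLES = ["Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"]

def scatter_matrix_panels(variables):
    """Return the distinct (x, y) variable pairs shown in a scatter plot matrix."""
    return list(combinations(variables, 2))

panels = scatter_matrix_panels(IRIS_VARIABLES)
for x_name, y_name in panels:
    print(f"{y_name} against {x_name}")
print(len(panels), "distinct panels")
```

For d variables there are d(d-1)/2 distinct panels: 6 for the Iris data, but already 78 for 13 variables, which hints at the scalability limits of this display.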

Information Visualization

Overview first, zoom and filter, and then details-on-demand
    Information Seeking Mantra, Shneiderman

Interactivity:
- enables user control:
  - exploration (panning)
  - zooming
  - 3D world
- reduces clutter on the screen
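The three steps of the mantra can be sketched on a toy data set. This is an illustration under stated assumptions, not an implementation from the lecture: the file records and the three function names are hypothetical, and a real tool would render each step graphically rather than return Python values.

```python
from collections import Counter

# Hypothetical records: (file name, type, size in kB).
FILES = [
    ("report.pdf", "pdf", 420),
    ("notes.txt", "text", 3),
    ("photo.jpg", "image", 2100),
    ("draft.txt", "text", 12),
    ("scan.jpg", "image", 1800),
]

def overview(records):
    """Overview first: one aggregate count per file type."""
    return Counter(kind for _, kind, _ in records)

def zoom_and_filter(records, kind, min_size=0):
    """Zoom and filter: keep only the records the user asks for."""
    return [r for r in records if r[1] == kind and r[2] >= min_size]

def details_on_demand(records, name):
    """Details on demand: the full record of one selected item, or None."""
    return next((r for r in records if r[0] == name), None)

summary = overview(FILES)
images = zoom_and_filter(FILES, "image", min_size=2000)
detail = details_on_demand(images, "photo.jpg")
```

Each step shows less data than the previous one, which is how interactivity reduces clutter on the screen.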

Interactivity

Excentric Labeling: labels on demand

C. Plaisant and J.-D. Fekete, Human-Computer Interaction Lab and INRIA


Information Visualization

Abstract data:
- digital data with no real world “visual” counterpart, e.g.:
  - sound
  - high dimensional vectors
- no “natural” visual representation of the data, e.g.:
  - requests received by a web server
  - file systems
  - source code

Infovis ≠ scientific visualization (Scivis)

Scientific data

- Yearly Arctic Temperature Anomaly, 2002 (NASA/Goddard Space Flight Center Scientific Visualization Studio)
- Growth of a brain tumor (EPIDAURE team, Inria)
- Weather forecast (Météo France)
- Satellite images (mixed data!) (Google Earth)

Abstract data

- XHTML structure of a web page (nodes = tags): http://apiacoa.org/, drawn with http://www.aharef.info/static/htmlgraph/
- Treemap of the Linux kernel (depth, size, type of files/dirs), J.-D. Fekete, Human-Computer Interaction Lab and INRIA

Keim’s Taxonomy of Infovis methods

Three “orthogonal” axes:
1. Visualization technique:
   - standard 2D/3D (part II)
   - geometrically-transformed (part III)
   - iconic (part IV)
   - pixel based (part IV)
2. Interaction technique:
   - zooming and panning
   - brushing and linking
   - distortion
3. Data type:
   - vectors
   - texts
   - trees
   - graphs
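Because the axes are orthogonal, a method can be described as one choice on each axis. The sketch below enumerates all combinations of the values listed on this slide. It only illustrates the idea of orthogonality: whether every combination corresponds to an existing method is not claimed here.

```python
from itertools import product

VISUALIZATION = ["standard 2D/3D", "geometrically-transformed", "iconic", "pixel based"]
INTERACTION = ["zooming and panning", "brushing and linking", "distortion"]
DATA_TYPE = ["vectors", "texts", "trees", "graphs"]

# Orthogonal axes: any value on one axis can be combined with any value on the others.
methods = list(product(VISUALIZATION, INTERACTION, DATA_TYPE))

print(len(methods), "combinations")
print(methods[0])
```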

Some examples from everyday life

Desktop programs are Infovis tools for some abstract data:
- emails
- “text” files
- filesystems
- images
- music
- the web
- IDE
- etc.

[Figures:]
- Email (Kmail)
- PDF file (Kpdf)
- Filesystem browsing (Konqueror)
- Filesystem browsing (KDirStat)
- Image database (Digikam)
- Music (Amarok)
- Wikipedia (Lupin’s popup)
- IDE (Eclipse)

Section 2: Infovis goals and limitations

What is it used for?

Some specific goals:
- easier access (learning curve):
  - GUI in general
  - File system browsing
- productivity (doing the same things but faster):
  - IDE (on the fly documentation, multi-view, graphical programming, etc.)
  - Monitoring (Lupin’s popup for Wikipedia, treemap)
  - On the fly search (kpdf, konqueror, etc.)
- organization:
  - Tree paradigm (sorting)
  - Metadata (image, music, etc.)
  - Overview

Visual data mining (VDM)

A.k.a. Visual Analytics and Visual Data Analysis:
- Interactive visual exploration of massive data sets:
  - cluster analysis
  - outlier detection
  - dependency assessment
  - pattern detection (repetition, sub-structure, etc.)
  - etc.
- Interactive visualization of the results of data mining algorithms:
  - parameter tuning
  - quality assessment
  - mining on the results (e.g., meta-clustering)
  - etc.

Visual mining of the Iris dataset

[Figures: Anderson's/Fisher's Iris data, shown in successive steps]
1. Global view (scatter plot matrix of Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)
2. Global view with class information
3. Selected view (Petal width against Sepal width)
4. Outliers and cluster detection
5. Clustering rule
6. Class information
7. Classification rules
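The last step, reading classification rules off the selected view, can be sketched as a threshold rule on petal width. The cut points 0.8 and 1.75 are illustrative assumptions, not values taken from the slides, and the six rows are a small excerpt of the Iris data rather than the full set of 150 flowers.

```python
# Six rows of the Iris data: (sepal width, petal width, species).
SAMPLES = [
    (3.5, 0.2, "setosa"),
    (3.0, 0.2, "setosa"),
    (3.2, 1.4, "versicolor"),
    (3.2, 1.5, "versicolor"),
    (3.3, 2.5, "virginica"),
    (2.7, 1.9, "virginica"),
]

def classify(petal_width, low=0.8, high=1.75):
    """Threshold rule on petal width; `low` and `high` are assumed cut points."""
    if petal_width < low:
        return "setosa"
    if petal_width < high:
        return "versicolor"
    return "virginica"

errors = [s for s in SAMPLES if classify(s[1]) != s[2]]
print(f"{len(errors)} misclassified out of {len(SAMPLES)}")
```

On the full data set such a rule is expected to leave a few errors between the two overlapping classes, which is the kind of limitation a visual inspection of the scatter plot can reveal.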

Limitations of Infovis and VDM

- Visual illusions
- Distortion, occlusion, etc.
- Scalability:
  - number of objects
  - number of descriptors
  - human scalability
  - computer scalability
- Non standard data (e.g., graphs, time series, etc.)

Grey levels
[Figure from http://web.mit.edu/persci/people/adelson/index.html]

Grid illusion
[Figure from http://en.wikipedia.org/wiki/Image:Grid_illusion.svg]

Ebbinghaus illusion
[Figure from http://en.wikipedia.org/wiki/Image:Spheres.JPG]

Separable?

[Figure: Anderson's/Fisher's Iris data, Sepal length against Sepal width]
[Figure: Anderson's/Fisher's Iris data, Petal width against Sepal width]
[Figures: a further example, shown with axes x, y and z]

The scalability issue

- Vision is limited to 2 or 3 dimensions
- Position can be combined with other features:
  - color (intensity and hue)
  - shape (e.g., star icon)
  - texture
  - etc.
- But fast pre-attentive processing is limited to roughly 5 combined features
- Correlating distant things is difficult
- Computer screens have a “low” resolution (HD is about 2 million pixels)
- Complex HD interactive displays require a dedicated graphics board and associated software (OpenGL and Direct3D, shader languages)
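The resolution limit can be made concrete with a short calculation. It assumes that “HD” means 1920 × 1080 pixels and that each measurement uses exactly one pixel, the most compact layout possible. The variable count of 14 is borrowed from the 13 + 1 variables example.

```python
HD_WIDTH, HD_HEIGHT = 1920, 1080   # assumed "HD" resolution
N_VARIABLES = 14                   # e.g., a data set with 13 + 1 variables

pixels = HD_WIDTH * HD_HEIGHT
# With one pixel per measurement, each object consumes one pixel per variable.
max_objects = pixels // N_VARIABLES

print(f"{pixels} pixels on screen")
print(f"at most {max_objects} objects with {N_VARIABLES} variables each")
```

Even in this best case the screen holds fewer than 150,000 objects, far below the size of a massive data set.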

An example (13 + 1 variables)

[Figure]
Can you see something?

Section 3: Links with machine learning

How to scale?

Complementary solutions:
- interactivity (zooming, distorting, details on demand, etc.)
- data transformation:
  - interaction between objects rather than objects themselves
  - similarity between objects
- data simplification:
  - reduction of the number of objects (summary, clustering, etc.)
  - reduction of the number of characteristics (selection, projection, etc.)
  - compact layout: one glyph per object or one pixel per measurement
- data ordering:
  - positioning related things closely on the screen
  - one to three dimensional ordering

Obviously linked to Machine Learning (clustering, projection, etc.).
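The two reductions listed under data simplification can be sketched in a few lines. The slide only names "clustering" and "selection", so the choice of k-means and of plain coordinate selection is an assumption made here for illustration. The data set and the function names `kmeans` and `select_features` are hypothetical.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means: returns k prototypes that summarize the points."""
    # Deterministic initialization: k points spread over the input order.
    centers = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centers[j]
            for j, cluster in enumerate(clusters)
        ]
    return centers

def select_features(points, indices):
    """Feature selection: keep only the chosen coordinates of each point."""
    return [tuple(p[i] for i in indices) for p in points]

# Hypothetical 3-dimensional data forming two well separated groups.
DATA = [(0.0, 0.1, 5.0), (0.2, 0.0, 5.1), (0.1, 0.2, 4.9),
        (4.0, 4.1, 5.0), (4.2, 3.9, 5.2), (3.9, 4.0, 4.8)]

reduced = select_features(DATA, [0, 1])   # 3 characteristics -> 2
prototypes = kmeans(reduced, k=2)         # 6 objects -> 2 prototypes
```

Displaying the two prototypes instead of the six objects, in two dimensions instead of three, is the kind of simplification that makes a display scale.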

Chi & Riedl’s Operator Model

Production of the View from the Data:

Data
  ↓ Data Transformation
Analytical Abstraction
  ↓ Analytical Transformation
Visualization Abstraction
  ↓ Visual Mapping Transformation
View

Arrows represent transformation operators.

Example:
- Data: Web site
- Analytical Abstraction: Graph of the pages, Graph of the content
- Visualization Abstraction: Depth first traversal tree
- View: Disk Tree, Hyperbolic Tree
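The web site example can be sketched as a chain of three operators, one per arrow. This is a minimal illustration under stated assumptions: the site content and the function names are hypothetical, and the final view is an indented text listing rather than a Disk Tree or a Hyperbolic Tree.

```python
# Hypothetical web site: each page lists the pages it links to.
SITE = {
    "index": ["about", "docs"],
    "about": ["index"],
    "docs": ["api", "index"],
    "api": [],
}

def data_transformation(site):
    """Data -> analytical abstraction: the graph of the pages as an edge list."""
    return [(src, dst) for src, links in site.items() for dst in links]

def analytical_transformation(edges, root):
    """Analytical abstraction -> visualization abstraction: depth first traversal tree."""
    children = {}
    for src, dst in edges:
        children.setdefault(src, []).append(dst)
    tree, visited = [], {root}

    def visit(node, depth):
        tree.append((node, depth))
        for child in children.get(node, []):
            if child not in visited:
                visited.add(child)
                visit(child, depth + 1)

    visit(root, 0)
    return tree

def visual_mapping(tree):
    """Visualization abstraction -> view: an indented text rendering of the tree."""
    return "\n".join("  " * depth + node for node, depth in tree)

view = visual_mapping(analytical_transformation(data_transformation(SITE), "index"))
print(view)
```

Swapping only the last function changes the view without touching the earlier stages, which is the separation the operator model describes.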

Where is machine learning?

(almost) everywhere!

Two types of operators:
1. data independent
2. data dependent ⇒ machine learning, optimization, artificial intelligence (AI)

ML and AI operators:
- Data: preprocessing, cleaning, etc.
- Data transformation: feature extraction, dissimilarity, etc.
- Visualization transformation: projection, clustering, ordering, etc.
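The difference between the two types of operators can be shown on a toy example. The choice of a logarithmic scale as the data independent operator and of standardization as the data dependent one is an assumption made for illustration, and the measurements are hypothetical.

```python
import math
import statistics

def log_scale(values):
    """Data independent operator: the same fixed function for every data set."""
    return [math.log10(v) for v in values]

def standardize(values):
    """Data dependent operator: its parameters are estimated from the data."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

sizes = [1.0, 10.0, 100.0, 1000.0]   # hypothetical measurements
print(log_scale(sizes))
print(standardize(sizes))
```

Adding one observation leaves the logarithm of the existing values unchanged, but it shifts the estimated mean and standard deviation and therefore every standardized value. Operators of the second kind, whose behavior is fitted to the data, are where machine learning enters.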

Organization of the lecture

Description of major visualization methods for vector/table data

For each class of methods:
- Limitations
- Rendering & Interactivity
- AI problems

Major AI challenges:
- Clustering
- Feature extraction
- Ordering

Many things are left out, e.g.:
- Non vector data
- Distortion techniques
- etc.