Information Visualization and Machine Learning
Fabrice Rossi
Projet AxIS, INRIA Rocquencourt
27-28/02/2007
F. Rossi (INRIA)
Infovis & ML
27-28/02/2007
1 / 172
Goals of this lecture
1. To give an introduction to Information Visualization (Infovis):
   - enhancement methods for classical displays
   - specialized displays
   - why you should leverage infovis in your everyday work
2. To outline links between Infovis and Machine Learning:
   - why do they exist?
   - current solutions
   - open research problems
3. To give examples of successful joint research:
   - Machine learning methods designed for visualization
   - Visualization of machine learning algorithm results
Organization of this lecture
Visualization oriented
Each part consists of:
- an introduction to some visualization techniques
- an analysis of their links with machine learning
Five parts:
1. Introduction to Infovis
2. Scatter plots
3. Geometrically-transformed displays
4. Iconic and pixel based displays
5. Visualization methods designed in the machine learning community
Outline of part I: Introduction
1. Information Visualization
   - Definition
   - Examples from “everyday life with KDE”
2. Infovis goals and limitations
   - What is it used for?
   - Limitations of Infovis and VDM
3. Links with machine learning
   - Formal model of Infovis
   - Machine learning

Outline of part II: Scatter plots
4. Introduction
5. Feature number reduction
   - Principles
   - Neighborhood structure preservation
6. Overlapping reduction
   - Rendering and interaction
   - Clustering

Outline of part III: Geometrically-transformed displays
7. Introduction
8. Scatter plot matrix
   - Interaction
9. Parallel Coordinates
   - Overlapping
   - Variable order

Outline of part IV: Iconic and pixel based displays
10. Introduction
11. Iconic displays
    - Chernoff’s faces
    - Star glyph
    - Glyph Positioning
12. Pixel based displays
    - Dense pixel displays
    - Dissimilarity matrix

Outline of part V: Machine Learning
13. Self Organizing Map
    - Principles
    - Visualization
14. Latent Variable Models
    - General principles
    - Generative Topographic Mapping
Part I Introduction
Information Visualization
“The use of computer-supported interactive, visual representation of abstract data to amplify cognition” (Card, Mackinlay & Shneiderman)

Human preattentive processing capabilities:
- non conscious processing (no thinking involved)
- low level visual system
- extremely fast: 200 ms
- scalable (no browsing ⇒ sublinear scaling)
- feature type must match data type (e.g., hue is suitable for categories, not for real values)
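The rule that the visual feature must match the data type can be sketched in a few lines. The example below is only an illustration under stated assumptions: the helper names `category_colors` and `value_to_grey` are hypothetical, the species labels are used as an example of categories, and the value range 0 to 2.5 is an arbitrary choice. Categories get evenly spaced hues (an unordered feature), while real values get a grey level (an ordered feature).

```python
import colorsys

def category_colors(categories):
    """Assign one evenly spaced hue per distinct category (hypothetical helper)."""
    distinct = sorted(set(categories))
    step = 1.0 / len(distinct)
    # Same lightness and saturation everywhere: only the hue carries information.
    return {c: colorsys.hls_to_rgb(i * step, 0.5, 1.0) for i, c in enumerate(distinct)}

def value_to_grey(value, lo, hi):
    """Map a real value in [lo, hi] to a grey level, an ordered visual feature."""
    t = (value - lo) / (hi - lo)
    return (t, t, t)

# Example categories and values; the range [0, 2.5] is an assumption for this sketch.
species = ["setosa", "versicolor", "virginica", "setosa"]
palette = category_colors(species)
greys = [value_to_grey(v, 0.0, 2.5) for v in (0.2, 1.4, 2.5)]

for name, rgb in palette.items():
    print(name, tuple(round(c, 2) for c in rgb))
```

The grey levels preserve the order of the values, which hue does not: there is no natural sense in which one hue is "greater" than another.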
Preattentive processing
[Figure] Can you see a color outlier?
[Figure] Can you see a shape outlier?
From “Perception in Visualization” by C. G. Healey, http://www.csc.ncsu.edu/faculty/healey/PP/index.html
Tool metaphor (hammer, microscope, etc.):
- extending user possibilities:
  - more scalable processing (speed and/or volume)
  - details enhancement
  - multi-source fusion
  - etc.
- under user control
User control
[Figure: Anderson's/Fisher's Iris, nonlinear projection]
Nonlinear projection ⇒ no user control
[Figure: Anderson's/Fisher's Iris, scatter plot matrix of Sepal.Length, Sepal.Width, Petal.Length and Petal.Width]
Scatter plot matrix ⇒ user control
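One way to read the contrast above: each panel of a scatter plot matrix uses two of the original variables as axes, so the user always knows what is being looked at. The sketch below simply enumerates those panels for the four Iris variables. The function name `scatter_matrix_panels` is hypothetical, and nothing is drawn.

```python
from itertools import combinations

variables = ["Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"]

def scatter_matrix_panels(names):
    """List the distinct axis-aligned 2D views: one per unordered pair of variables."""
    return list(combinations(names, 2))

panels = scatter_matrix_panels(variables)
for x_name, y_name in panels:
    print(f"{y_name} against {x_name}")

# With d variables there are d * (d - 1) / 2 distinct panels.
print(len(panels))
```

The number of panels grows quadratically with the number of variables, which is one reason the scatter plot matrix does not scale to many descriptors.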
“Overview first, zoom and filter, and then details-on-demand” (Information Seeking Mantra, Shneiderman)

Interactivity:
- enables user control:
  - exploration (panning)
  - zooming
  - 3D world
- reduces clutter on the screen
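The three steps of the mantra can be sketched as three operations on a collection of records. This is a minimal illustration, not an implementation of any particular tool: the `Item` type, the sample records and the function names are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Item:
    name: str
    x: float
    y: float
    kind: str

# Hypothetical records standing in for the objects shown on screen.
ITEMS = [
    Item("a", 0.5, 1.0, "file"),
    Item("b", 2.0, 3.5, "dir"),
    Item("c", 2.5, 3.0, "file"),
    Item("d", 9.0, 0.5, "file"),
]

def overview(items):
    """Overview first: a coarse summary of everything (count and bounding box)."""
    xs = [i.x for i in items]
    ys = [i.y for i in items]
    return {"count": len(items), "x_range": (min(xs), max(xs)), "y_range": (min(ys), max(ys))}

def zoom(items, x_min, x_max, y_min, y_max):
    """Zoom: keep only the items inside the visible window."""
    return [i for i in items if x_min <= i.x <= x_max and y_min <= i.y <= y_max]

def keep(items, kind):
    """Filter: keep only the items of a given kind."""
    return [i for i in items if i.kind == kind]

def details(items, name):
    """Details on demand: the full record of one selected item, or None."""
    return next((i for i in items if i.name == name), None)

summary = overview(ITEMS)
visible = keep(zoom(ITEMS, 1.0, 3.0, 2.0, 4.0), "file")
selected = details(visible, "c")
print(summary, [i.name for i in visible], selected)
```

Each step shows fewer objects than the previous one, which is how interactivity reduces clutter on the screen.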
Interactivity
Excentric Labeling: labels on demand
C. Plaisant and J.-D. Fekete, Human-Computer Interaction Lab and INRIA
Abstract data:
- digital data with no real world “visual” counterpart, e.g.:
  - sound
  - high dimensional vectors
- no “natural” visual representation of the data, e.g.:
  - requests received by a web server
  - file systems
  - source code
Infovis ≠ scientific visualization (Scivis)
Scientific data
- [Figure] Yearly Arctic Temperature Anomaly, 2002 (NASA/Goddard Space Flight Center Scientific Visualization Studio)
- [Figure] Growth of a brain tumor (EPIDAURE team, Inria)
- [Figure] Weather forecast (Météo France)
- [Figure] Satellite images (mixed data!) (Google Earth)
Abstract data
- [Figure] XHTML structure of a web page, http://apiacoa.org/ (nodes=tags). Drawn with http://www.aharef.info/static/htmlgraph/
- [Figure] Treemap of the linux kernel (depth, size, type of files/dirs). J.-D. Fekete, Human-Computer Interaction Lab and INRIA
Keim’s Taxonomy of Infovis methods
Three “orthogonal” axes:
1. Visualization technique:
   - standard 2D/3D (part II)
   - geometrically-transformed (part III)
   - iconic (part IV)
   - pixel based (part IV)
2. Interaction technique:
   - zooming and panning
   - brushing and linking
   - distortion
3. Data type:
   - vectors
   - texts
   - trees
   - graphs
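Because the three axes are orthogonal, a method can be described by one value on each axis. The sketch below encodes the axes exactly as listed on the slide. The `Method` type is hypothetical, and the classification of the example (in particular its interaction technique) is an assumption made for illustration, not a statement from the lecture.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product

class Technique(Enum):
    STANDARD_2D_3D = "standard 2D/3D"
    GEOMETRICALLY_TRANSFORMED = "geometrically-transformed"
    ICONIC = "iconic"
    PIXEL_BASED = "pixel based"

class Interaction(Enum):
    ZOOMING_AND_PANNING = "zooming and panning"
    BRUSHING_AND_LINKING = "brushing and linking"
    DISTORTION = "distortion"

class DataType(Enum):
    VECTORS = "vectors"
    TEXTS = "texts"
    TREES = "trees"
    GRAPHS = "graphs"

@dataclass(frozen=True)
class Method:
    name: str
    technique: Technique
    interaction: Interaction
    data_type: DataType

# Example classification; the interaction chosen here is an assumption.
scatter_plot_matrix = Method(
    "scatter plot matrix",
    Technique.GEOMETRICALLY_TRANSFORMED,
    Interaction.BRUSHING_AND_LINKING,
    DataType.VECTORS,
)

# Orthogonal axes: every combination of one value per axis is a cell of the taxonomy.
cells = list(product(Technique, Interaction, DataType))
print(len(cells))
```

Treating the axes as independent enumerations makes the "orthogonal" claim concrete: with the values listed on the slide, the taxonomy has 4 × 3 × 4 cells.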
Some examples from everyday life
Desktop programs are infovis tools for some abstract data:
- emails
- “text” files
- filesystems
- images
- music
- the web
- IDE
- etc.

Screenshots:
- Email (Kmail)
- Pdf file (Kpdf)
- Filesystem browsing (Konqueror)
- Filesystem browsing (KDirStat)
- Image database (Digikam)
- Music (Amarok)
- Wikipedia (Lupin’s popup)
- IDE (Eclipse)
What is it used for?
Some specific goals:
- easier access (learning curve):
  - GUI in general
  - File system browsing
- productivity (doing the same things but faster):
  - IDE (on the fly documentation, multi-view, graphical programming, etc.)
  - Monitoring (Lupin’s popup for wikipedia, treemap)
  - On the fly search (kpdf, konqueror, etc.)
- organization:
  - Tree paradigm (sorting)
  - Metadata (image, music, etc.)
  - Overview
Visual data mining (VDM)
A.k.a. Visual Analytics and Visual Data Analysis:
- Interactive visual exploration of massive data sets:
  - cluster analysis
  - outlier detection
  - dependency assessment
  - pattern detection (repetition, sub-structure, etc.)
  - etc.
- Interactive visualization of the results of data mining algorithms:
  - parameter tuning
  - quality assessment
  - mining on the results (e.g., meta-clustering)
  - etc.
Visual mining of the Iris dataset
[Figures: Anderson's/Fisher's Iris, successive views]
1. Global view (scatter plot matrix of Sepal.Length, Sepal.Width, Petal.Length and Petal.Width)
2. Global view with class information
3. Selected view (Petal width against Sepal width)
4. Outliers and cluster detection
5. Clustering rule
6. Class information
7. Classification rules
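The last step of this sequence, reading a classification rule off a selected view, can be sketched as a pair of thresholds on petal width. The thresholds 0.8 and 1.75 below are assumptions chosen for illustration, not values read from the slides, and the six rows are only a small sample of the Iris data (measurements in cm). On the full data set such a rule is not perfect, since the petal widths of versicolor and virginica overlap.

```python
# (sepal length, sepal width, petal length, petal width, species)
SAMPLES = [
    (5.1, 3.5, 1.4, 0.2, "setosa"),
    (4.9, 3.0, 1.4, 0.2, "setosa"),
    (7.0, 3.2, 4.7, 1.4, "versicolor"),
    (6.4, 3.2, 4.5, 1.5, "versicolor"),
    (6.3, 3.3, 6.0, 2.5, "virginica"),
    (5.8, 2.7, 5.1, 1.9, "virginica"),
]

def classify(petal_width, low=0.8, high=1.75):
    """Axis-parallel rule on a single variable; both thresholds are assumed values."""
    if petal_width < low:
        return "setosa"
    if petal_width < high:
        return "versicolor"
    return "virginica"

predictions = [classify(row[3]) for row in SAMPLES]
errors = sum(pred != row[4] for pred, row in zip(predictions, SAMPLES))
print(predictions, errors)
```

A rule of this form corresponds to horizontal lines in the "Petal width against Sepal width" view, which is why it can be found by eye.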
Limitations of Infovis and VDM
- Visual illusions
- Distortion, occlusion, etc.
- Scalability:
  - number of objects
  - number of descriptors
  - human scalability
  - computer scalability
- Non standard data (e.g., graphs, time series, etc.)
Visual illusions
- [Figure] Grey levels: http://web.mit.edu/persci/people/adelson/index.html
- [Figure] Grid illusion: http://en.wikipedia.org/wiki/Image:Grid_illusion.svg
- [Figure] Ebbinghaus illusion: http://en.wikipedia.org/wiki/Image:Spheres.JPG
Separable?
[Figure: Anderson's/Fisher's Iris, Sepal length against Sepal width]
[Figure: Anderson's/Fisher's Iris, Petal width against Sepal width]
[Figures: a further example, shown with axes x, y and z in the second view]
The scalability issue
- Vision is limited to 2 or 3 dimensions
- Position can be combined with other features:
  - color (intensity and hue)
  - shape (e.g., star icon)
  - texture
  - etc.
- But fast pre-attentive processing is limited to roughly 5 combined features
- Correlating distant things is difficult
- Computer screens have a “low” resolution (HD is about 2 million pixels)
- Complex HD interactive displays require a dedicated graphics board and associated software (OpenGL and Direct3D, shader languages)
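The resolution limit can be turned into a rough pixel budget. The sketch below assumes that “HD” means 1920 × 1080 pixels and that each measurement uses exactly one pixel, which is the most compact layout possible. The function names and the choice of 14 variables are illustrative.

```python
def pixel_budget(width=1920, height=1080):
    """Total number of pixels of the screen (1920 x 1080 is an assumed HD size)."""
    return width * height

def max_objects(n_pixels, n_variables, pixels_per_value=1):
    """Upper bound on the number of objects when every value needs its own pixels."""
    return n_pixels // (n_variables * pixels_per_value)

hd = pixel_budget()
print(hd)                   # 2073600
print(max_objects(hd, 14))  # 148114
```

Even in this best case, a table with 14 variables fits at most about 150,000 objects on screen, and any glyph larger than one pixel lowers that bound further.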
An example (13 + 1 variables)
[Figure] Can you see something?
How to scale?
Complementary solutions:
- interactivity (zooming, distorting, details on demand, etc.)
- data transformation:
  - interaction between objects rather than objects themselves
  - similarity between objects
- data simplification:
  - reduction of the number of objects (summary, clustering, etc.)
  - reduction of the number of characteristics (selection, projection, etc.)
- compact layout: one glyph per object or one pixel per measurement
- data ordering:
  - positioning related things closely on the screen
  - one to three dimensional ordering
Obviously linked to Machine Learning (clustering, projection, etc.).
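As a minimal sketch of "reduction of the number of objects", the example below summarizes 2D points by counting how many fall into each cell of a regular grid. Grid binning is only one simple kind of summary, swapped in here for the clustering methods the lecture refers to. The points, the cell size and the function name `bin_points` are hypothetical.

```python
from collections import Counter

def bin_points(points, cell_size):
    """Summarize 2D points by counting how many fall into each square grid cell."""
    counts = Counter()
    for x, y in points:
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts

# Hypothetical data: six objects summarized by three non-empty cells.
points = [(0.1, 0.2), (0.3, 0.4), (0.35, 0.1), (2.2, 2.9), (2.4, 2.6), (5.0, 0.5)]
summary = bin_points(points, cell_size=1.0)
for cell, count in sorted(summary.items()):
    print(cell, count)
```

The display then needs one mark per non-empty cell instead of one per object, with the count mapped to a visual feature such as size or intensity.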
Chi & Riedl’s Operator Model
Production of the View from the Data:
Data → (Data Transformation) → Analytical Abstraction → (Analytical Transformation) → Visualization Abstraction → (Visual Mapping Transformation) → View
Arrows represent transformation operators.

Example:
- Data: Web site
- Analytical Abstraction: Graph of the pages, Graph of the content
- Visualization Abstraction: Depth first traversal tree
- View: Disk Tree, Hyperbolic Tree
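The web site example can be sketched as three composed functions, one per transformation operator. This is a toy illustration under stated assumptions: the site is hypothetical, and the final view is an indented text outline standing in for a Disk Tree or Hyperbolic Tree layout.

```python
# Hypothetical web site: page -> pages it links to (the "Data").
SITE = {
    "index": ["about", "docs"],
    "about": ["index"],
    "docs": ["api", "index"],
    "api": ["docs"],
}

def data_transformation(site):
    """Data -> analytical abstraction: the directed graph of the pages."""
    return {page: list(links) for page, links in site.items()}

def analytical_transformation(graph, root):
    """Analytical abstraction -> visualization abstraction: depth first traversal tree."""
    tree = {root: []}

    def visit(page):
        for target in graph.get(page, []):
            if target not in tree:  # each page is attached to the tree only once
                tree[target] = []
                tree[page].append(target)
                visit(target)

    visit(root)
    return tree

def visual_mapping(tree, root, depth=0):
    """Visualization abstraction -> view: here, an indented text outline."""
    lines = ["  " * depth + root]
    for child in tree[root]:
        lines.extend(visual_mapping(tree, child, depth + 1))
    return lines

graph = data_transformation(SITE)
tree = analytical_transformation(graph, "index")
view = visual_mapping(tree, "index")
print("\n".join(view))
```

Swapping only the last function for another layout changes the view without touching the earlier stages, which is the point of separating the operators.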
Where is machine learning?
(almost) everywhere!
Two types of operators:
1. data independent
2. data dependent ⇒ machine learning, optimization, artificial intelligence (AI)
ML and AI operators:
- Data: preprocessing, cleaning, etc.
- Data transformation: feature extraction, dissimilarity, etc.
- Visualization transformation: projection, clustering, ordering, etc.
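The difference between the two types of operators can be sketched with two very simple transformations. A fixed logarithmic scale is the same function whatever the data, whereas min-max scaling estimates its parameters from the data. Min-max scaling is used here only as a deliberately simple stand-in for learned operators such as projection or clustering, and both function names are hypothetical.

```python
import math

def log_scale(values):
    """Data independent operator: the same fixed function for every data set."""
    return [math.log10(v) for v in values]

def fit_min_max(values):
    """Data dependent operator: its parameters (lo, hi) are estimated from the data."""
    lo, hi = min(values), max(values)

    def transform(v):
        return (v - lo) / (hi - lo)

    return transform

values = [1.0, 10.0, 100.0]
scaled = log_scale(values)
to_unit = fit_min_max(values)
print(scaled, [to_unit(v) for v in values])
```

Fitting `fit_min_max` on another data set yields a different transformation, which is what makes the operator data dependent.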
Organization of the lecture
Description of major visualization methods for vector/table data
For each class of methods:
- Limitations
- Rendering & Interactivity
- AI problems
Major AI challenges:
- Clustering
- Feature extraction
- Ordering
Many things are left out, e.g.:
- Non vector data
- Distortion techniques
- etc.