Machine Learning Approaches to Image Retrieval
Yixin Chen
Department of Computer Science
University of New Orleans
http://www.cs.uno.edu/~yixin
JPL, 11/6/2003

Outline
• Introduction
• Region-based image categorization using Multiple-Instance Learning
• Content-based image retrieval by clustering
• Conclusions and future work

Image Retrieval
• The driving forces
  - Internet
  - Storage devices
  - Computing power
• Two approaches
  - Text-based approach
  - Content-based approach

Text-Based Approach
• Input: keywords, descriptions
[Figure: the example query "Elephants" goes to the Text-Based Image Retrieval System, which searches the Image Database]

Text-Based Approach
• Index images using keywords (Google, Lycos, etc.)
• Easy to implement
• Fast retrieval
• Web image search (surrounding text)
• Manual annotation is not always available
• A picture is worth a thousand words
• Surrounding text may not describe the image

Content-Based Approach
• Index images using low-level features
• Content-based image retrieval (CBIR): search pictures as pictures
[Figure: CBIR System and Image Database]
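
As a rough illustration of what "index images using low-level features" can mean, the sketch below computes a coarse joint RGB color histogram, one of the simplest features used in CBIR. The function name `color_histogram`, the 4 bins per channel, and the pixel-list input format are assumptions made for this example; they are not taken from the talk.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Return a normalized joint RGB histogram as a flat list.

    `pixels` is an iterable of (r, g, b) tuples with values in 0..255.
    The bin count is an illustrative choice, not a recommended setting.
    """
    n_bins = bins_per_channel ** 3
    hist = [0.0] * n_bins
    count = 0
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index 0..bins_per_channel-1.
        ri = r * bins_per_channel // 256
        gi = g * bins_per_channel // 256
        bi = b * bins_per_channel // 256
        # Combine the three per-channel indices into one joint bin index.
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1.0
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    # Normalize so that images of different sizes are comparable.
    return [h / count for h in hist]


# Toy "image": 64 pure-red pixels (hypothetical data).
red_image = [(255, 0, 0)] * 64
feature = color_histogram(red_image)
```

Each image in the database would be mapped to such a vector once, and queries would then be compared against the stored vectors instead of against keywords.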

CBIR
• Applications
  - Commerce (fashion catalogue, ...)
  - Biomedicine (X-ray, CT, ...)
  - Crime prevention (security filtering, ...)
  - Cultural (art galleries, museums, ...)
  - Military (radar, aerial, ...)
  - Entertainment (personal album, ...)

Previous Work on CBIR
• Starting from early 1990s
• General-purpose image search engines
  - IBM QBIC System and MIT Photobook System (two of the earliest systems)
  - VIRAGE System, Columbia VisualSEEK and WebSEEK Systems, UCSB NeTra System, UIUC MARS System, Stanford SIMPLIcity System, NECI PicHunter System, Berkeley Blobworld System, etc.

A Data-Flow Diagram
[Figure: data flow of a CBIR system built on the Image Database]
• Feature Extraction: histogram, color layout, sub-images, regions, etc.
• Compute Similarity Measure: Euclidean distance, intersection, shape comparison, region matching, etc.
• Visualization: linear ordering, projection to 2-D, etc.

Open Problem
• Nature of digital images: arrays of numbers
• Descriptions of images: high-level concepts
  - Sunset, mountains, dogs, ...
• Semantic gap
  - Discrepancy between low-level features and high-level concepts
  - High feature similarity may not always correspond to semantic similarity

Narrowing the Semantic Gap
• Imagery features and similarity measure
  - Select effective imagery features [Tieu et al., IEEE CVPR’00]
  - SIMPLIcity [Wang et al., IEEE Trans. PAMI 23(9)]
  - Subjective experiments [Mojsilovic et al., IEEE Trans. IP 9(1)]

Narrowing the Semantic Gap
• Image Categorization
  - Vacation images [Vailaya et al., IEEE Trans. IP 10(1)]
  - Indoor/outdoor [Yu and Wolf, SPIE 1995]
  - ALIP [Li et al., IEEE Trans. PAMI 2003]

[Figure: images from the Image Database mapped into a Feature Space, with category labels Tigers, Indoor, Landscape, City, Outdoor, Cars, Sunset, Mountain, Forest]
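
Two of the similarity measures named in the data-flow diagram, Euclidean distance and histogram intersection, are simple enough to sketch directly. The function names and the toy three-bin feature vectors below are hypothetical; a real system would use the features and measures of its own design.

```python
import math


def euclidean_distance(f1, f2):
    """Euclidean distance between two feature vectors (smaller = more similar)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))


def histogram_intersection(h1, h2):
    """Overlap of two normalized histograms (larger = more similar, at most 1)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def rank_by_distance(query, database):
    """Return database indices ordered from most to least similar to the query."""
    return sorted(range(len(database)),
                  key=lambda i: euclidean_distance(query, database[i]))


# Toy normalized histograms (hypothetical data).
query = [0.5, 0.5, 0.0]
database = [
    [0.0, 0.0, 1.0],
    [0.5, 0.4, 0.1],
    [0.5, 0.5, 0.0],
]
ranking = rank_by_distance(query, database)
overlap = histogram_intersection(query, database[1])
```

Both measures operate only on low-level features, which is why a high score need not mean that two images share a high-level concept.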

Narrowing the Semantic Gap
• Relevance feedback
  - Adjusting similarity measure [Picard et al., IEEE ICIP’96], [Rui et al., IEEE CSVT 8(5)], [Cox et al., IEEE Trans. IP 9(1)]
  - Support vector machine [Tong et al., ACM MM’01]
[Figure: relevance feedback loop between the user and the CBIR System over returned images 1, 2, 3, 4]

Outline
• Introduction
• Region-based image categorization using Multiple-Instance Learning
• Content-based image retrieval by clustering
• Conclusions and future work

Image Categorization
• Image categorization
  - Labeling of images into one of a number of predefined categories
• Difficulties
  - Variable and uncontrolled imaging conditions
  - Complex and hard-to-describe objects
  - Occlusion
  - Semantic gap

Motivation
[Figure: example images (a) to (g)]
• (a) to (d) belong to winter category since we see snow in them
• (b) to (f) belong to people category since there are people in them
• (b) to (d) belong to skiing category since we see people and snow
• (a) to (g) belong to outdoor scene category since they all have a region or regions corresponding to snow, sky, sea, trees, or grass

Problem Formulation
• Goal
  - Design a computer program that can “learn” image concepts from the implicit information of objects contained in images
• What is an “object”?
  - In the physical world: anything that is visible or tangible and is relatively stable in form
  - In an image: a region that is a projection of an object in the physical world

Problem Formulation
• An image is represented as a collection of regions obtained from segmentation

Problem Formulation
• An overview of the classification system
[Figure: Input image → Region 1, Region 2, Region 3 → Classifier → Output label]
$\text{image}_i = \{\text{region}_1, \text{region}_2, \ldots, \text{region}_{m_i}\} \subset \mathbb{R}^d$

Problem Formulation
• Training set: a set of labeled images
• Region labels are unknown
  - Laborious, extremely difficult, subjective

Problem Formulation
• Learning with incomplete information
  - The classifier uses region features
  - Labels are associated with images instead of individual regions
  - A generalization of supervised learning
  - Simple tricks do not work well

Multiple-Instance Learning
• Bag (image), instance (region)
• Predict bag labels using instances
• The training data is a set of labeled bags

Multiple-Instance Learning
• Previous formulation [Dietterich et al., AI’97], [Andrews et al., NIPS’03], [Maron et al., ICML’98], [Zhang et al., ICML’02]
  - A bag is positive if at least one of its instances is a positive example; otherwise the bag is negative
  - Build an instance classifier
  - Bag label is equal to the label of its most “positive” instance

Multiple-Instance Learning
• Previous formulation does not perform well for image categorization
[Figure: skiing example; region labels Snow (a), People (e) (f), Sky (e) (f), Trees (a) (f) (g)]
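
The previous multiple-instance formulation, in which a bag takes the label of its most "positive" instance, can be sketched in a few lines. The linear instance scorer, its hand-picked weights, and the two-dimensional region features are all hypothetical; in practice the instance classifier would be learned from the labeled bags.

```python
def linear_scorer(weights, bias):
    """Build a hypothetical instance classifier: score(x) = w . x + b."""
    def score(instance):
        return sum(w * x for w, x in zip(weights, instance)) + bias
    return score


def bag_score(bag, scorer):
    """Score of a bag = score of its most 'positive' instance."""
    return max(scorer(instance) for instance in bag)


def predict_bag(bag, scorer, threshold=0.0):
    """A bag is positive (+1) if at least one instance scores above the threshold."""
    return 1 if bag_score(bag, scorer) > threshold else -1


# Hand-picked weights for illustration only.
scorer = linear_scorer(weights=[1.0, -1.0], bias=0.0)

# Each bag is an image; each instance is a region feature vector.
bag_with_positive_region = [[0.0, 1.0], [2.0, 0.0]]   # instance scores: -1, 2
bag_without_positive_region = [[0.0, 1.0], [0.0, 2.0]]  # instance scores: -1, -2

label_a = predict_bag(bag_with_positive_region, scorer)
label_b = predict_bag(bag_without_positive_region, scorer)
```

Because a single region decides the bag label here, a concept such as skiing, which needs both snow and people regions, cannot be expressed; this is the limitation the slide points at.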

DD-SVM: An Extension of Multiple-Instance Learning
• A bag must contain some number of instances satisfying various properties
  - Find instance prototypes using Diverse Density [Maron et al., NIPS’98]
  - Define a bag feature space using instance prototypes
  - Design a maximal margin classifier in the bag feature space

Experiments
• 20 image categories, each containing 100 images
[Figure: sample images from the categories Africa, Buildings, Dinosaurs, Flowers, Mountains, Beach, Buses, Elephants, Horses, Food, Waterfall, Antiques, Battle ships, Skiing, Dessert, Dogs, Lizard, Fashion, Sunsets, Cars]

An Image Classification Example
[Figure: image classification example]

Categorization Performance
• Classification accuracy (10-class)

  Method                              Accuracy
  DD-SVM                              81.5% ± 2.2%
  Hist-SVM                            66.7% ± 1.8%
  MI-SVM [Andrews et al., NIPS’03]    74.7% ± 0.5%

• Confusion matrix
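
The second DD-SVM step, defining a bag feature space from instance prototypes, can be sketched as follows. This is a minimal illustration under stated assumptions: the prototypes are given by hand (the Diverse Density search is not implemented), each bag feature is the plain Euclidean distance from a prototype to the closest instance in the bag, and the names `bag_features`, `prototypes`, `skiing_bag`, and `beach_bag` are hypothetical.

```python
import math


def distance(x, p):
    """Plain Euclidean distance (an assumption; a weighted norm could be used)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, p)))


def bag_features(bag, prototypes):
    """Map a bag to one vector: for each prototype, the distance to the
    closest instance in the bag. A small value means the bag contains a
    region resembling that prototype."""
    return [min(distance(x, p) for x in bag) for p in prototypes]


# Hypothetical prototypes, e.g. a "snow-like" and a "people-like" region.
prototypes = [(0.0, 0.0), (10.0, 10.0)]

# Hypothetical bags of 2-D region features.
skiing_bag = [(0.0, 1.0), (9.0, 10.0), (5.0, 5.0)]
beach_bag = [(5.0, 5.0), (4.0, 6.0)]

skiing_vec = bag_features(skiing_bag, prototypes)  # close to both prototypes
beach_vec = bag_features(beach_bag, prototypes)    # far from both prototypes
```

Every bag now has a fixed-length vector regardless of how many regions it contains, so a standard maximal margin classifier can be trained on these vectors with the image labels.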

Categorization Performance
• Some errors between Beach and Mountains categories

Sensitivity to Image Segmentation
• Compare DD-SVM with MI-SVM
[Figure: Average Classification Accuracy and Standard Deviation of Accuracy versus Different Coarseness Level of Image Segmentation (levels 1 to 5)]

Scalability
• Compare DD-SVM with MI-SVM
[Figure: Average Classification Accuracy and Standard Deviation of Accuracy versus Number of Categories (10 to 20)]

Scalability
• Difference in classification accuracy between DD-SVM and MI-SVM
[Figure: Difference in Average Classification Accuracy versus Number of Categories (10 to 20)]

[Plot annotations recovered from the figures above: 6.8%, 9.5%, 11.7%, 13.8%, 27.4%]

Outline
• Introduction
• Region-based image categorization using Multiple-Instance Learning
• Content-based image retrieval by clustering
• Conclusions and future work

CLUE: CLUsters-based rEtrieval of images by unsupervised learning
• Basic idea
  - All CBIR methods assume some correlation between image semantics and distance measure
  - Why not use this information to the fullest extent?

System Overview
• A general diagram of a CBIR system using the CLUE
[Figure: Image Database, Feature Extraction, Compute Similarity Measure, CLUE (Select Neighboring Images, Image Clustering), Display And Feedback]

Neighboring Images Selection
• Nearest neighbors method
  - Pick k nearest neighbors of the query as seeds
  - Find r nearest neighbors for each seed
  - Take all distinct images as neighboring images
[Figure: example with k=3, r=4]

Weighted Graph Representation
• Graph representation
  - Vertices denote images
  - Edges are formed between vertices
  - Nonnegative weight of an edge indicates the similarity between two vertices
• Recursive Ncut
  - Bipartition the largest sub-graph each time

User Interface
[Figure: user interface]

An Experimental System
• Similarity measure
  - UFM [Chen et al., IEEE PAMI 24(9)]
• Database
  - COREL 60,000

Query Examples
• Query examples from 60,000-image COREL Database
  - Bird, car, food, historical buildings, and soccer game
[Figure: UFM and CLUE result panels; captions: Bird, 6 out of 11; Bird, 3 out of 11]
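
The nearest neighbors method for selecting neighboring images (k seeds, then r neighbors per seed, then the distinct union) can be sketched as below. The Euclidean distance, the dictionary of image features, and the choice to exclude a seed from its own neighbor list are assumptions made for this example, not details confirmed by the slides.

```python
import math


def distance(a, b):
    """Euclidean distance between two feature vectors (an assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_nearest(point, features, k, exclude=None):
    """IDs of the k database images closest to `point`, skipping `exclude`."""
    ids = [i for i in features if i != exclude]
    ids.sort(key=lambda i: distance(point, features[i]))
    return ids[:k]


def neighboring_images(query, features, k, r):
    """Collect seeds and their neighbors, keeping each image only once."""
    seeds = k_nearest(query, features, k)           # step 1: k seeds
    selected = list(seeds)
    for seed in seeds:
        # step 2: r nearest neighbors of each seed
        for image_id in k_nearest(features[seed], features, r, exclude=seed):
            if image_id not in selected:            # step 3: distinct images
                selected.append(image_id)
    return selected


# Toy 1-D features (hypothetical data): a, b, c are close; d, e are far away.
features = {"a": (0.0,), "b": (1.0,), "c": (2.0,), "d": (10.0,), "e": (11.0,)}
result = neighboring_images((0.4,), features, k=2, r=2)
```

In this toy run the seeds are `a` and `b`, and image `c` is pulled in only as a neighbor of a seed, which shows how the collection can reach images that are not among the query's own nearest neighbors.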

Query Examples
[Figure: CLUE and UFM result panels; captions: Car, 8 out of 11; Car, 4 out of 11; Food, 8 out of 11; Food, 4 out of 11]

Query Examples
[Figure: CLUE and UFM result panels; captions: Historical buildings, 10 out of 11; Historical buildings, 8 out of 11; Soccer game, 10 out of 11; Soccer game, 4 out of 11]

Clustering WWW Images
• Google Image Search
  - Keywords: tiger, Beijing
  - Top 200 returns
  - 4 largest clusters
  - Top 18 images within each cluster

Clustering WWW Images
[Figure: Tiger Cluster 1 (75 images), Tiger Cluster 2 (64 images), Tiger Cluster 3 (32 images), Tiger Cluster 4 (24 images)]

Clustering WWW Images
[Figure: Beijing Cluster 1 (61 images), Beijing Cluster 2 (59 images), Beijing Cluster 3 (43 images), Beijing Cluster 4 (31 images)]

Retrieval Accuracy
• 10 image categories each containing 100 images
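
The recursive Ncut step that produces clusters like those above can be illustrated on a tiny graph. The sketch assumes that Ncut refers to the normalized cut criterion, Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V), and it finds the best bipartition by exhaustive search. The exhaustive search is a stand-in for the eigenvector-based approximation normally used with this criterion and is practical only for a handful of vertices. The weight matrix and all function names are hypothetical.

```python
from itertools import combinations


def ncut_value(part_a, part_b, weights):
    """Normalized cut of a bipartition, measured within the sub-graph A + B."""
    nodes = part_a + part_b
    cut = sum(weights[i][j] for i in part_a for j in part_b)
    assoc_a = sum(weights[i][j] for i in part_a for j in nodes)
    assoc_b = sum(weights[i][j] for i in part_b for j in nodes)
    if assoc_a == 0 or assoc_b == 0:
        return float("inf")
    return cut / assoc_a + cut / assoc_b


def best_bipartition(nodes, weights):
    """Try every split of `nodes` into two non-empty parts (brute force)."""
    first, rest = nodes[0], nodes[1:]
    best = None
    # Keep `first` in part A so mirrored splits are not tried twice.
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            part_a = [first, *extra]
            part_b = [n for n in rest if n not in extra]
            value = ncut_value(part_a, part_b, weights)
            if best is None or value < best[0]:
                best = (value, part_a, part_b)
    return best[1], best[2]


def recursive_ncut(weights, n_clusters):
    """Bipartition the largest sub-graph each time until n_clusters exist."""
    clusters = [list(range(len(weights)))]
    while len(clusters) < n_clusters:
        largest = max(clusters, key=len)
        if len(largest) < 2:
            break
        clusters.remove(largest)
        clusters.extend(best_bipartition(largest, weights))
    return clusters


# Symmetric nonnegative similarities: images 0-2 are alike, images 3-4 are alike.
W = [
    [0.0, 0.9, 0.8, 0.1, 0.0],
    [0.9, 0.0, 0.9, 0.0, 0.1],
    [0.8, 0.9, 0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1, 0.0, 0.9],
    [0.0, 0.1, 0.0, 0.9, 0.0],
]
clusters = recursive_ncut(W, n_clusters=2)
```

Splitting the largest sub-graph first tends to keep cluster sizes balanced, which matters when the clusters are shown to the user a few images at a time.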

Outline
• Introduction
• Region-based image categorization using Multiple-Instance Learning
• Content-based image retrieval by clustering
• Conclusions and future work

Summary
• DD-SVM
• CLUE

Limitations
• Image categorization
  - Diverse Density
• CLUE
  - Recursive Ncut
  - Representative images
  - Sparsity

Future Work
• Bag generator
• Generative model
• Applications

Supported by
• The National Science Foundation
• The Pennsylvania State University
• The PNC Foundation
• SUN Microsystems
• NEC Research Institute
• University of New Orleans
• Research Institute for Children

Acknowledgment
• Dissertation committee at Penn State
  - Professor James Z. Wang, IST and CSE
  - Professor Lee C. Giles, IST and CSE
  - Professor John Yen, IST and CSE
  - Professor Jia Li, STAT
  - Professor Donald Richards, STAT
• NEC Research Institute
  - Dr. Robert Krovetz
• Siemens Medical Solutions
  - Dr. Jinbo Bi
• Kind host
  - Dr. Andrés Castaño

More Information
• Papers in PDF, demonstrations, data sets, etc.
  - http://wang.ist.psu.edu/IMAGE
  - http://www.cs.uno.edu/~yixin
• [email protected]