PILL-ID: Matching and Retrieval of Drug Pill Imprint Images

Author: Eustace Lindsey
Young-Beom Lee1, Unsang Park2, and Anil K. Jain1,2
1Brain and Cognitive Engineering, Korea University, Korea
2Computer Science and Engineering, Michigan State University, USA
http://Biometrics.cse.msu.edu

• Legal drug pill or illicit drug pill?
• If illicit, which cartel manufactured it?
• What is an effective way to identify illicit drugs?

• ~35M people in the U.S. used illicit or abused prescription drugs; $14B was spent on drug treatment & prevention (2007)
• Prescription pills must be identifiable (by color, shape, and imprints) per FDA regulations

• Illicit pills (e.g., narcotics) also contain imprints to identify the cartel or distributor

• Databases of prescription pills and illegal pills are available (pharmaceutical companies, FBI)

[Figure: a query pill image and its ranked retrievals (rank 1 to 6); the matched record reads Imprint: 5883; Shape: round; Color: brown; Ingredient: MDMA, BZP, TFMPP; Cartel: Gulf]

• An imprint is an indented or printed mark on a pill, tablet or capsule
• It may be a symbol, text, digits, or a combination of these

[Figure: example imprints on legal drug pills and illicit drug pills]

• Sobel operator to obtain the gradient magnitude image
• Segmentation, scale normalization

[Figure: original image and its gradient magnitude image]
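As a minimal sketch of the Sobel step, the gradient magnitude of a grayscale image can be computed from the two standard 3 × 3 Sobel kernels. The function name `sobel_magnitude` and the choice to leave border pixels at zero are assumptions for illustration; the poster does not describe its border handling.

```python
import math

# Standard 3x3 Sobel kernels for the horizontal (GX) and vertical (GY) derivatives.
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(img):
    """Return the gradient magnitude of a 2-D grayscale image (list of rows).

    Border pixels are left at 0.0 (an assumption made for this sketch).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    p = img[y + dy - 1][x + dx - 1]
                    gx += GX[dy][dx] * p
                    gy += GY[dy][dx] * p
            out[y][x] = math.hypot(gx, gy)
    return out

# A vertical step edge: dark left half, bright right half.
step = [[0, 0, 10, 10] for _ in range(4)]
mag = sobel_magnitude(step)

# A constant image has no gradient anywhere.
flat = sobel_magnitude([[7] * 4 for _ in range(4)])
```

On the step image the interior pixels next to the edge respond strongly, while the constant image gives zero everywhere, which is why imprint strokes stand out in the gradient magnitude image regardless of pill color.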

• Rotation normalization using the primary and secondary dominant orientations
• Landmarks (key-points) are selected within a preset radius (SIFT descriptor)

[Figure: multiple templates with rotation variation]
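One way to find the primary and secondary dominant orientations is a magnitude-weighted histogram of gradient directions, as sketched below. The name `dominant_orientations` and the 36-bin resolution (10 degrees per bin) are assumptions for illustration, not values stated in the text.

```python
import math

def dominant_orientations(gx, gy, bins=36):
    """Return (primary, secondary) dominant gradient orientations in degrees.

    gx, gy: 2-D lists holding the horizontal and vertical gradients.
    bins=36 is an assumed histogram resolution (10 degrees per bin).
    """
    hist = [0.0] * bins
    width = 360.0 / bins
    for row_x, row_y in zip(gx, gy):
        for dx, dy in zip(row_x, row_y):
            mag = math.hypot(dx, dy)
            if mag == 0.0:
                continue  # flat pixels cast no vote
            angle = math.degrees(math.atan2(dy, dx)) % 360.0
            hist[int(angle // width) % bins] += mag  # magnitude-weighted vote
    order = sorted(range(bins), key=lambda b: hist[b], reverse=True)
    # Report the centre angle of the two strongest bins.
    return (order[0] + 0.5) * width, (order[1] + 0.5) * width

# Toy gradients: three pixels pointing near 6 degrees, two pointing at 45 degrees.
gx = [[1, 1, 1], [1, 1, 0]]
gy = [[0.1, 0.1, 0.1], [1, 1, 0]]
primary, secondary = dominant_orientations(gx, gy)
```

Rotating the image by the negative of each returned angle would give one template per dominant orientation, which is one plausible reading of the multiple templates with rotation variation.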

• Gradient magnitude images have smaller intra-class variations

[Figure: original image, gray image, and gradient magnitude image of the same pill]

Rank-1 accuracy (%) by input image type (using 602 query-gallery dataset):

Method                       Gradient magnitude   Grayscale
Optimized SIFT descriptor    90.03                83.55

Images that did not match at rank-1 using SIFT but matched using the proposed method (fixed key points + SIFT descriptor)

[Figure: example pill images. Red dots: SIFT key points; blue dots: preset key points]

Method                          Number of key-points     Rank-1 accuracy (%)
                                Min    Max    Avg.
Original SIFT                   17     340    126        43.02
Our method (SIFT descriptor)    29                       90.03

• Select a set of key-points
• Collect gradient magnitude and orientation with Gaussian weighting and tri-linear interpolation
• Truncation
• Length of feature vector: 4 × 4 × 8 = 128 per key-point; 128 × 29 key-points = 3,712

[Figure: Gaussian weighting (Gaussian window centered at a key point), tri-linear interpolation, and truncation]
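The truncation step and the feature-length arithmetic can be sketched as follows. The sketch follows the usual SIFT convention (normalize to unit length, clip each element at the threshold, renormalize); that the poster applies truncation in exactly this way is an assumption, and `normalize_and_truncate` is a hypothetical name.

```python
import math

def normalize_and_truncate(vec, threshold=0.2):
    """Unit-normalize a descriptor, clip each element at `threshold`,
    then renormalize to unit length (the usual SIFT convention)."""
    def unit(v):
        n = math.sqrt(sum(x * x for x in v))
        return [x / n for x in v] if n > 0 else list(v)
    clipped = [min(x, threshold) for x in unit(vec)]
    return unit(clipped)

# One dominant bin: truncation reduces its share of the descriptor.
truncated = normalize_and_truncate([10.0, 1.0, 1.0, 1.0])

# A threshold of 1.0 never clips a unit vector, so nothing changes.
unchanged = normalize_and_truncate([3.0, 4.0], threshold=1.0)

# Feature length: 4 x 4 spatial cells, 8 orientation bins, 29 key-points.
CELLS, ORIENTATION_BINS, KEYPOINTS = 4 * 4, 8, 29
per_keypoint = CELLS * ORIENTATION_BINS   # 128
per_image = per_keypoint * KEYPOINTS      # 3712
```

Clipping limits the influence of a few very large gradient bins, so that a single strong edge does not dominate the distance between two descriptors.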

• LBP histograms with multiple neighborhood parameters (P, R) are created and concatenated

[Figure: LBP neighborhoods for P=8, R=1.0; P=4, R=1.0; P=12, R=2.0]

• Feature vectors are constructed with the following parameters

LBP (P, R)    Window size    Shift value
U(8, 1)       20 × 20        4
U(4, 1)       10 × 10        2
U(12, 2)      30 × 30        6

• Length of feature vector: U(8,1) = 59 bins, (4,1) = 16 bins, U(12,2) = 135 bins; each histogram is repeated over the grid of windows for its operator
  59 × (13 × 13) + 16 × (31 × 31) + 135 × (7 × 7) = 31,962
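The feature length can be reproduced with a short calculation. A uniform LBP with P sampling points has P(P − 1) + 3 histogram bins, and a window of a given size shifted across the image yields floor((image size − window) / shift) + 1 positions per axis. The square 70 × 70 input is an assumption: it is the image size that reproduces the 13, 31 and 7 window counts above.

```python
def uniform_lbp_bins(p):
    """Histogram bins for a uniform LBP with p sampling points: p*(p-1)+3."""
    return p * (p - 1) + 3

def windows_per_axis(image_size, window, shift):
    """Number of window positions along one axis (partial windows dropped)."""
    return (image_size - window) // shift + 1

IMAGE_SIZE = 70  # assumption: square 70 x 70 input image

# (histogram bins, window size, shift value) for the three LBP operators
configs = [
    (uniform_lbp_bins(8), 20, 4),    # U(8,1): 59 bins, 13 x 13 windows
    (2 ** 4, 10, 2),                 # (4,1): all 2^4 = 16 patterns, 31 x 31 windows
    (uniform_lbp_bins(12), 30, 6),   # U(12,2): 135 bins, 7 x 7 windows
]

total = sum(bins * windows_per_axis(IMAGE_SIZE, w, s) ** 2
            for bins, w, s in configs)
print(total)  # 31962
```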

• Given a query image (q) and N gallery images (g), the K feature vectors of the query (K_m for query q_m) are compared with the L_n feature vectors of the nth gallery image (n = 1 to N) using the L2 norm
• L_n is different for each gallery image
• The ID of the closest match in the gallery is selected as the ID of the query

Feature vectors i of a query image q_m: i = 1, …, K_m
Feature vectors j of a gallery image g_n: j = 1, …, L_n; n = 1, …, N

$$\mathrm{ID}_m = \arg\min_{n} \, \min_{i,\,j} \, d\left(q_m^{i},\, g_n^{j}\right)$$
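The matching rule can be sketched as a nested minimum: for each gallery image, take the smallest L2 distance over all pairs of query and gallery feature vectors, then return the gallery ID with the smallest such distance. The names `l2` and `identify` and the dictionary layout of the gallery are illustrative assumptions.

```python
import math

def l2(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(query_vectors, gallery):
    """Return (best_id, best_distance) for a query.

    query_vectors: list of feature vectors of the query (its templates).
    gallery: dict mapping a gallery ID to its list of feature vectors;
             the lists may have different lengths.
    """
    best_id, best_d = None, math.inf
    for gid, templates in gallery.items():
        # Closest pair of (query template, gallery template) for this image.
        d = min(l2(q, g) for q in query_vectors for g in templates)
        if d < best_d:
            best_id, best_d = gid, d
    return best_id, best_d

# Toy example with 2-D features: gallery "A" has two templates, "B" has one.
gallery = {"A": [[0.0, 0.0], [1.0, 1.0]], "B": [[5.0, 5.0]]}
query = [[4.0, 4.0], [9.0, 9.0]]
match_id, match_d = identify(query, gallery)
```

Because only the best pair counts, a gallery image needs just one template aligned with one of the query templates to be retrieved.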



• 822 illicit drug pill images from the Australian Federal Police; 138 illicit drug pill images and 14,003 legal pill images from the U.S. DEA website, Drug information online and pharmer.org
• Image size: from 48 × 42 to 2,088 × 1,550 pixels; 96 dpi
• Query set: 602 illicit drug pill images, including duplicate images of the same imprint pattern (88 distinct patterns)
• Gallery set: 960 (illicit drug pill images) + 14,003 (legal drug pill images) = 14,963 images
• Leave-one-out method: each of the 602 queries is matched against the other 14,962 gallery images
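The leave-one-out protocol and the rank-k accuracy used in the results can be sketched as below. Each query is removed from the gallery, the remaining images are sorted by distance, and the query counts as a hit if an image with the same imprint pattern appears among the first k. The function name and the toy one-dimensional features are assumptions for illustration.

```python
def rank_k_accuracy(distance, labels, query_indices, k):
    """Leave-one-out rank-k accuracy in percent.

    distance(i, j): distance between gallery images i and j.
    labels[i]: imprint pattern of gallery image i.
    query_indices: indices of the gallery images used as queries.
    """
    hits = 0
    for q in query_indices:
        others = [j for j in range(len(labels)) if j != q]  # leave the query out
        others.sort(key=lambda j: distance(q, j))
        if any(labels[j] == labels[q] for j in others[:k]):
            hits += 1
    return 100.0 * hits / len(query_indices)

# Toy gallery: five images with 1-D features and two imprint patterns.
feats = [0, 2, 10, 12, 9]
labels = ["a", "a", "b", "b", "a"]

def dist(i, j):
    return abs(feats[i] - feats[j])

rank1 = rank_k_accuracy(dist, labels, [0, 1, 2, 3], k=1)
rank2 = rank_k_accuracy(dist, labels, [0, 1, 2, 3], k=2)
```

In the toy data, query 2 has a closer neighbor from the other pattern, so it fails at rank 1 but succeeds at rank 2.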



SIFT descriptor parameters are optimized for pill imprint matching. The descriptor steps are numbered as follows:
1. Smoothing
2. Gradient orientation & magnitude
3. Gaussian weighting
4. Trilinear interpolation
5. Truncation with threshold values of 0.2, 0.5 and 1

Rank-1 accuracy (%) by steps used, truncation value, and rotation normalization:

Method                                     Truncation   Rotation        Edge     Grayscale
                                           value        normalization   image    image
SIFT with 1, 2, 3, 4, 5 (original SIFT)    0.2          No              83.89    83.39
SIFT with 2, 3, 4, 5                       0.2          No              87.87    78.74
SIFT with 2, 4, 5                          0.2          No              88.70    79.57
SIFT with 2, 5                             0.2          No              87.54    81.56
SIFT with 2, 4, 5                          0.5          No              87.71    -
SIFT with 2, 4, 5                          1.0          No              87.71    -
SIFT with 2, 4, 5                          0.2          Yes             90.03    -

• 602 query and 14,962 gallery images

Method                      Rank 1 (%)   Rank 20 (%)
MLBP                        64.78        82.72
SIFT descriptor             82.72        90.20
SIFT (0.7) + MLBP (0.3)     84.39        91.53
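A weighted-sum fusion with the 0.7 and 0.3 weights from the table can be sketched as follows. The poster does not say how the SIFT and MLBP distances are brought to a common scale, so the min-max normalization here is an assumption, as are the function names.

```python
def min_max(values):
    """Scale a list of distances to [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def fuse(sift_dist, mlbp_dist, w_sift=0.7, w_mlbp=0.3):
    """Weighted sum of normalized SIFT and MLBP distances per gallery image."""
    s, m = min_max(sift_dist), min_max(mlbp_dist)
    return [w_sift * a + w_mlbp * b for a, b in zip(s, m)]

# Distances from one query to three gallery images under each descriptor.
sift = [0.9, 0.2, 0.5]
mlbp = [10.0, 40.0, 20.0]
fused = fuse(sift, mlbp)

# Index of the gallery image with the smallest fused distance.
best = min(range(len(fused)), key=fused.__getitem__)
```

In this toy case MLBP alone would pick image 0 and SIFT alone image 1; with the larger weight on SIFT, the fused ranking follows SIFT.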

[Figure: example queries with their top-6 retrievals]

Queries that were not correctly retrieved in top 20 matches

[Figure: failed queries with their top-6 retrievals and the rank of the true mate. Ranks shown: 13042, 12841, 3402, 3259, 1897]

Causes noted:
− Illumination noise in the background
− Similar shape and imprints
− Very similar pattern between query and top retrieved images

• Numeric or text information in imprints can be used for matching/filtering

[Figure: a query imprint reading 5883 and the matched record: Imprint: 5883; Shape: round; Color: brown; Ingredient: MDMA, BZP, TFMPP; Cartel: Gulf]

[Figure: retrievals for a query with Shape: round; Color: pink; Text: no; Numbers: no. Two rows of ranked results are shown: "Using only imprints" (ranks 1 to 7, then rank 97) and "Using imprint shape and color" (ranks 1 to 7, then rank 15)]

Content-based matching can reduce retrieval errors
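Content-based filtering can be sketched as a pre-filter on gallery metadata followed by the usual distance ranking. The field names (`shape`, `color`) mirror the attributes shown in the figure; the record layout and the function name are assumptions for illustration.

```python
def filter_then_rank(query_meta, gallery_meta, distances, keys=("shape", "color")):
    """Return gallery indices that agree with the query on `keys`,
    sorted by imprint distance (smallest first).

    gallery_meta: list of dicts, one per gallery image.
    distances: imprint distances from the query, aligned with gallery_meta.
    """
    keep = [i for i, g in enumerate(gallery_meta)
            if all(g.get(k) == query_meta.get(k) for k in keys)]
    return sorted(keep, key=lambda i: distances[i])

query_meta = {"shape": "round", "color": "pink"}
gallery_meta = [
    {"shape": "round", "color": "pink"},
    {"shape": "oval", "color": "pink"},
    {"shape": "round", "color": "pink"},
    {"shape": "round", "color": "brown"},
]
distances = [0.5, 0.1, 0.3, 0.2]

ranked = filter_then_rank(query_meta, gallery_meta, distances)
```

The two gallery images with the smallest imprint distances are dropped here because their shape or color disagrees with the query, so the true mate can only move up in rank.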

• Proposed an image retrieval system for identifying illicit drugs
• 84.39% rank-1 (91.53% rank-20) accuracy with ~600 query and ~15K gallery images
• Evaluated two image descriptors (SIFT and MLBP) and their fusion; a rotation-invariant matching scheme was used
• Computation time for SIFT (MLBP): 2.3 (0.5) sec/image for feature extraction and 13.0 (4.0) sec per query against the ~15K gallery; code in MATLAB running on a 2.8 GHz CPU with 8 GB RAM
• Future work
  – Content-based matching/filtering
  – Evaluation on a larger database; collaboration with AFP
  – More efficient matching scheme

• If numbers or text in imprints can be identified, content-based methods can be used

[Figure: examples of number and text imprints. Number: 5883; Text: WYETH]



MLBP is also evaluated with various parameters on the 602 query-gallery dataset to optimize it for pill imprint matching:
1. Number of LBPs
2. Sub-region (window size, shift value)
3. Input image size

LBP                         Sub-region (window size, shift)   Image size   Rank-1 accuracy (%)
U(8,1) + (4,1)              No                                60           51.01
U(8,1) + (4,1) + U(12,2)    No                                60           54.15
U(8,1) + (4,1) + U(12,2)    No                                70           55.81
U(8,1) + (4,1) + U(12,2)    (32, 8) (16, 4) (48, 12)          70           63.12
U(8,1) + (4,1) + U(12,2)    (16, 4) (8, 2) (24, 6)            70           65.78
U(8,1) + (4,1) + U(12,2)    (20, 4) (10, 2) (30, 6)           70           75.42

[Figure: gradient magnitude image, orientation histogram, and multiple templates]
