Electrical & Computer ENGINEERING

Carnegie Mellon Hyperspectral Feature Selection for Detection of Chicken Skin Tumors Songyot Nakariyakul 2003 Electrical & Computer ENGINEERING H...

Author: Susanna Page



Hyperspectral Feature Selection for Detection of Chicken Skin Tumors

Songyot Nakariyakul

Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213

Advisor: Prof. David Casasent

ABSTRACT

We consider a feature selection method to detect skin tumors on chicken carcasses using hyperspectral data. A chicken skin tumor is an ulcerous lesion that is surrounded by a rim of thickened skin. Detection of chicken tumors is a difficult problem because chicken tumors are of many sizes and shapes; some tumors appear on the side of the chicken. In addition, different areas of normal chicken skin have a variety of hyperspectral responses, some of which are very similar to the spectral responses of tumors. Similarly, different tumors have different spectral responses. Thus, proper training is needed and many false alarms are expected. Since the spectral responses on the lesion and thickened skin regions of tumors are considerably different, we train our feature selection algorithm to detect lesions and thickened skin separately; we then morphologically process the resultant images and fuse the two detection results to reduce false alarms. Forward selection and modified branch and bound algorithms are used to select a small number of features that are useful for discrimination. Initial results show that our method has a good tumor detection rate and a low false alarm rate.

Keywords: feature reduction, feature selection, hyperspectral data, product inspection.

1. INTRODUCTION

Hyperspectral (HS) image data is high-dimensional data that contains more than a hundred images in narrowly spaced spectral bands (λ). It has been shown that the use of hyperspectral information is useful for detection of objects in military applications such as detecting military vehicles [1, 2] and mines [3], for land use applications [4], and for many USDA product inspection applications [5-12]. This occurs since HS data provides spectral information that uniquely characterizes and identifies the chemical, moisture, and physical properties of the constituent parts of an input object, scene region, or agricultural product. Hyperspectral data has successfully classified: internally damaged almonds from normal ones [5], aflatoxin-infested corn kernels from good ones [6-8], vitreous durum wheat kernels from non-vitreous ones [9], and fecal-contaminated chicken carcasses from clean ones [10-12].

One of the main problems in classification of high-dimensional data is that there are often not enough samples in the training data. It is generally accepted that the required number of training samples per class must be at least ten times the number of features (in this case, input λ spectral samples) [13] if one wants to be able to accurately predict the class of an unknown sample. This phenomenon is known as the curse of dimensionality. Thus, use of hyperspectral data requires more than a thousand training samples per class in order to cope with the curse of dimensionality. In general, this number of samples is quite difficult to obtain. Thus, it is necessary to reduce the number of features by either feature extraction or feature selection techniques. Feature extraction refers to algorithms that map all of the original features into a few features (each of which is a function of all original features), and feature selection refers to algorithms that select a small subset of the input feature set (use of only several λ features) to use for classification. Feature selection is preferred to feature extraction because HS data acquisition systems then only need data from a few λ bands (this provides faster data acquisition and a less expensive system that requires fewer filters or simpler laser diode light sources). Thus, we consider a new feature selection algorithm developed earlier at Carnegie Mellon University [14].

In this report, we discuss the use of feature selection on hyperspectral (HS) images of chicken carcasses to detect skin tumors. A chicken skin tumor is a round ulcerous lesion surrounded by a rim of thickened skin [15]. Figure 1 shows a [illegible] x 1500 pixel color image of a chicken carcass with 14 tumors present. Figure 2 displays the 554 nm wavelength band image with all tumors numbered and marked by rectangles. The images in HS data are gray-scale and correspond to reflectance data of the object; the image in each λ spectral band is affected by the skin color, shading, and slope of each local region of the carcass. Figures 3a, 3b, 3c and 3d are enlarged images of the tumors numbered 10, 11, 6 and 14, respectively, in Figure 2. Chicken skin tumors vary in size from 4x2 pixels to more than 40x25 pixels. In a single gray-scale HS image at one λ, such as Figure 2, the central ulcerous lesion regions of tumors appear as bright regions, as seen in Figures 3a and 3b, and the thickened skin surrounding the lesion regions appears as dark-gray rings, as shown in Figures 3a and 3b. When tumors occur on the side of the carcass, they appear elliptical and are very small. Such tumors are shown in Figures 3c and 3d. Thus, detecting chicken skin tumors is a difficult problem.

Figure 1. A color image of a chicken carcass with skin tumors.

Figure 2. The 554 nm wavelength band image of the carcass shown in Figure 1 with all tumors numbered and marked by rectangles.

Figure 3. (a)-(d) Enlarged images of the tumors numbered 10, 11, 6 and 14, respectively, in Figure 2.

Prior work on detection of chicken skin tumors using HS data considered the statistical properties (mean, standard deviation, skewness, kurtosis, and coefficient of variation) of HS images in three selected bands (λ) [16]. Principal component analysis (PCA) was applied to hyperspectral images of normal and tumor regions to produce the first 10 PCA eigenimages. The eigenimage with the best contrast and differences between tumor and normal regions was analyzed to find the three bands with the most important (largest) coefficients contributing to that eigenimage. The three bands at 465 nm, 575 nm, and 705 nm were used. A square grid with a mesh size of 64x64 pixels was placed over each HS image, with each pixel corresponding to a sample area of 0.1 mm^2. Statistical features (mean, skewness, kurtosis, and coefficient of variation, defined as (standard deviation / mean) x 100) were calculated within each square in this grid and used as inputs to fuzzy classifiers. The fuzzy inference process was implemented with three operations: (1) fuzzify numerical statistical inputs into input membership functions based on observations; (2) apply fuzzy operators to the antecedents of the rule base; (3) evaluate the consequent portion of the rule [16]. The fuzzy classifiers classify each grid region into one of two categories: normal or tumorous skin. Use of three features (coefficient of variation, skewness, and kurtosis) gave successful detection rates of 91% and 86% for normal and tumorous skin tissue regions, respectively. However, the grid size is too large for our database since some skin tumors in our database consist of only 10 to 20 pixels. This emphasizes the need to classify each pixel individually.

Kim et al. [17] approached the problem differently. They computed the maximum intensities, slopes, and ratios of maximum intensities in several specific wavelength bands for each pixel and used them as features for a linear classifier. Three features were chosen by inspection of the spectra of the training data; as a result, these features are not guaranteed to give the best solution. A simple unspecified linear classifier was used. Image pixels were classified into either the tumor or the normal class. Normal-class pixels that were misclassified by the linear classifier as tumor-class pixels were referred to as false alarms. Morphological image processing was applied to the resultant binary two-class image to remove false alarms. 31 of 41 skin tumors (76%) were detected with 12 false alarms. Fluorescence images were used in this work. They used 10 images of chicken carcasses; our 2 images were in this set, but the sensor used was different. They used 48 tumor pixel samples, but it is not clear if these were from all 41 of the tumors.

Our database contains HS reflectance images in a total of 65 spectral bands (λ) ranging from λ = 425 to 711 nm. We show the spectral responses of some of the tumor regions in Figure 4. These are the responses at one pixel in the lesion regions of tumors numbered 6, 10, 11 and 14 in Figure 2 (or Figures 3c, 3a, 3b and 3d, respectively). From Figure 4, the spectra of the lesion regions of tumors have similar relative shapes but varying intensities. This is expected because tumors appearing on the side of the carcass (tumors numbered 6 and 14) reflect the light away from the HS sensors, resulting in lower intensities than those of tumors appearing in the middle of the carcass (tumors numbered 10 and 11). This emphasizes the need to normalize the response at every pixel in the database before training or testing. The response at each data pixel is normalized by dividing its response by its average response over all wavelength bands. Figure 5 shows the normalized version of the spectra in Figure 4. This data indicates that the spectral responses of the lesion regions of different tumors are not exactly the same, and thus one must carefully select the training set pixel database to represent all tumors.
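The per-pixel normalization described above can be sketched in a few lines. This is only an illustration, assuming the HS data is held as a NumPy array whose last axis indexes the spectral bands; the function name `normalize_spectra` and the synthetic spectra are hypothetical and do not come from the report.

```python
import numpy as np

def normalize_spectra(cube):
    """Divide each pixel's spectrum by its own mean over the band axis.

    `cube` is assumed to have shape (..., n_bands); this layout is an
    assumption made for the sketch, not something stated in the report.
    """
    cube = np.asarray(cube, dtype=float)
    mean = cube.mean(axis=-1, keepdims=True)
    # Guard against division by zero for all-zero pixels.
    safe_mean = np.where(mean == 0, 1.0, mean)
    return cube / safe_mean

# Two synthetic pixels with the same spectral shape but different brightness,
# mimicking a tumor in the middle of the carcass and one on its side.
base = np.array([1.0, 2.0, 3.0, 2.0])
pixels = np.stack([4000.0 * base, 1000.0 * base])
normalized = normalize_spectra(pixels)
```

After normalization the two synthetic spectra coincide, which is the effect the text attributes to Figure 5: intensity differences caused by viewing geometry are removed while relative spectral shape is kept.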

Figure 4. Non-normalized spectra of the lesions of tumors in Figure 2 (intensity versus band number; curves for tumors #10, #11, #6 and #14).

Figure 5. Normalized version of the spectra in Figure 4 (normalized response versus band number; curves for tumors #10, #11, #6 and #14).

Next, we address whether there is a noticeable difference in the response of the lesion regions and the surrounding thickened skin regions of tumors. Figure 6 shows the normalized spectra of some pixels in the lesion and thickened skin regions of tumor number 10 in Figure 2. It is clear from Figure 6 that the lesions and thickened skin of tumors have quite different HS responses. Thus, we need different sets of feature bands to detect each. To do this, we select a portion of the lesion region and normal skin region pixels of the chicken images as the pixel training/test set for lesions, i.e., the lesion pixel database. We also create a thickened skin pixel database that includes a portion of the thickened skin region and the normal skin region pixels of the chicken images; this is the pixel training/test set for thickened skin. We train our feature selection algorithm on the lesion pixel database and on the thickened skin pixel database; the results are used to detect the lesions and the thickened skin regions of tumors, respectively.

We must first select the optimal features (λ) to use. The only optimal feature selection algorithms are exhaustive search and branch and bound (BB) [18]. An exhaustive search finds the best subset of features out of n by evaluating a criterion function J for all possible combinations and selecting the best one. If we want to find the best subset of 4 features out of 65,

$$\binom{65}{4} = \frac{65!}{4!\,(65-4)!} = 677{,}040$$

subsets must be searched. In many hyperspectral image cases that have more than a hundred features (λ), exhaustive search is prohibitively time consuming. The BB algorithm is more efficient because it avoids an exhaustive search of the whole search space by rejecting many subsets that are guaranteed to be sub-optimal, and it guarantees that the selected subset is the globally optimal solution for any criterion function that satisfies monotonicity. A modified branch and bound (MBB) algorithm developed by Xue-Wen Chen [14] modifies the BB algorithm by providing a more efficient way to search the subsets in the BB algorithm. Thus, it is faster than BB and much faster than an exhaustive search. However, for general HS data with more than a hundred feature bands (λ), the computational load for the MBB algorithm is also impractical for feature selection problems. This emphasizes the need to reduce the dimensionality of the problem before we apply the MBB algorithm. We use the forward selection (FS) algorithm to select 30 initial features and then use the MBB algorithm to select the optimal subset of features out of these 30 FS features. This is referred to as the high dimensional BB algorithm [14]. We note that the FS algorithm is not guaranteed to produce optimal results like the MBB algorithm does because it does not examine all possible subsets. The FS method also has the nesting problem, i.e., the subset of the best four features chosen by FS contains the subset of the best three features chosen by FS, etc. In practice, the best four features may not contain any of the best three features, etc. Hence, we only use FS to reduce the initial dimensionality of the problem and we use the MBB algorithm to select a number of final features (three or four features at most) to use in a nearest neighbor (NNB) classifier.

To assign a test sample to a class in the NNB classifier, the Euclidean distance from that sample to each tumor-class and normal-class sample in the training set is computed. The NNB classifier assigns the test sample to the same class as its nearest neighbor in the training set. Using this NNB classifier, each pixel in the chicken carcass is then classified to be either one (white) for candidate skin tumor regions or zero (black) for candidate normal skin regions. Normal-class pixels that are misclassified by the NNB classifier as tumor-class pixels are referred to as false alarms. There are two binary output images formed for each HS image, one to locate lesions (using the lesion pixel database) and one to locate thickened skin regions (using the thickened skin pixel database). We then fuse (intersect) the binary output images to obtain the final detection result. Fusion of the two binary output images is shown to reject false alarms. Creating two pixel databases, the lesion pixel database and the thickened skin pixel database, and fusing the two binary output images resulting from the two versions of the feature selection algorithm is new and has not previously been employed in HS data processing.

Sect. 2 describes the database used. Sect. 3 discusses various background algorithms and issues used in this paper. Methods and test results are presented in Sects. 4 and 5.
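As a rough illustration of the nearest neighbor rule and the fusion step described in this section, the sketch below labels samples by their closest training sample in Euclidean distance and intersects two binary maps pixel by pixel. The function names `nnb_classify` and `fuse`, the two-feature toy samples, and the tiny masks are all hypothetical. The abstract states that the detection images are morphologically processed before fusion; that step is omitted here, so the pixelwise AND is only the most literal reading of "fuse (intersect)".

```python
import numpy as np

def nnb_classify(train_x, train_y, test_x):
    """Label each test sample with the class of its nearest training sample."""
    train_x = np.asarray(train_x, dtype=float)
    test_x = np.asarray(test_x, dtype=float)
    train_y = np.asarray(train_y)
    # Squared Euclidean distances, shape (n_test, n_train).
    d2 = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    return train_y[d2.argmin(axis=1)]

def fuse(lesion_mask, skin_mask):
    """Pixelwise intersection (logical AND) of two binary detection images."""
    return np.logical_and(lesion_mask, skin_mask).astype(np.uint8)

# Toy training set with two selected features: class 0 = normal, 1 = tumor.
train_x = [[0.0, 0.0], [1.0, 1.0]]
train_y = [0, 1]
labels = nnb_classify(train_x, train_y, [[0.1, 0.2], [0.9, 0.8]])

# Toy binary outputs of the lesion and thickened skin detectors.
lesion_mask = np.array([[1, 1], [0, 0]])
skin_mask = np.array([[1, 0], [1, 0]])
fused = fuse(lesion_mask, skin_mask)
```

Only the pixel flagged by both detectors survives in `fused`, which is how intersection can suppress false alarms that appear in just one of the two binary images.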

Figure 6. Normalized spectra of pixels in the lesion and thickened skin region of tumor #10 in Figure 2 (normalized response versus band number; curves: lesion, thickened skin).

2. DATABASE

Chicken carcasses with skin tumors were sent for processing to the Instrumentation and Sensing Laboratory (ISL) in Maryland. The hyperspectral (HS) imaging system used consists of a CCD camera, a spectrograph, a sample transport mechanism, and lighting sources [16]. More details on the ISL hyperspectral imaging system are provided elsewhere [19]. The locations of tumors were identified by a Food Safety and Inspection Service (FSIS) veterinarian. Two HS cubes were provided to us for initial testing (an HS cube contains a series of images in narrowly spaced spectral bands (λk), where each image corresponds to the image obtained at one specific frequency band). Each HS cube consists of 65 spectral band images ranging from λ = 425 to 711 nm.

The first HS cube contains a single chicken carcass (Figure 1) with 14 skin tumors. The size of each image is 460x400 pixels; the 554 nm wavelength band image from this HS cube was shown in Figure 2 with all tumors numbered and marked by rectangles. Tumor number 2 was identified by the FSIS veterinarian as normal tissue, but Kim et al. [17] stated in their paper that it was a tumor. If we look at the color image of the carcass (Figure 1), we agree with Kim that it is a tumor (so does our feature selection algorithm). Tumor number 12 appears to contain two small tumors close to each other, but the FSIS veterinarian classified them as one tumor. Six of the 65 band images are shown in Figure 7. Although they look similar, their pixel intensities vary drastically.

The second HS cube has two chicken carcasses with a total of 7 tumors on them. The size of each image is 460x600 pixels; the 554 nm wavelength band image is shown in Figure 8 with tumors numbered and marked by rectangles. The color image for these carcasses is shown in Figure 9. Many tumors in the second HS cube are small compared to those in the first HS cube. Tumor number 2 in the second HS cube consists of only six pixels and has no lesion region. Thus, we do not expect our feature selection algorithm to detect it.

Figure 7. HS images at various λk, k = 10, 20, 30, 40, 50, 60, from left to right, top to bottom.

Figure 8. The 554 nm wavelength band image of the second HS cube with all tumors numbered and marked by rectangles.

Figure 9. A color image of chicken carcasses for the second HS cube.

In general, one would select the pixel training and test set pixel database from one HS cube and train the feature selection algorithm on it. One then presents the feature selection result on the second HS cube, which has not been trained on before. However, our database is limited; we have only three chicken carcasses available for training and testing. We thus selected a portion of the skin tumors from the first HS cube and the normal skin regions from both HS cubes. This was necessary to reduce false alarms in the second image, since some normal skin regions on the second HS cube have very different spectral responses from those on the first HS cube. Thus, it is necessary to select normal skin region training data from both HS cubes. The tumor regions in both images seem to have similar spectral responses. We selected pixel samples from only some of the tumors in the first HS cube for training and testing because many tumors in the first HS cube are larger than those in the second HS cube. All of the pixel training and test set data for the lesion and thickened skin regions were selected from the first HS cube.

For the lesion pixel database, we extracted the 65 λ band spectral responses for 100 pixels from only five of the 14 ulcerous lesion regions of the tumors (tumors numbered 4, 7, 9, 10 and 11 in Figure 2) and labeled them as "tumor" class. Half (50 samples) of them are used for the pixel training set and half (50 samples) for the pixel test set of target (tumor) pixel data. With 50 tumor pixels for the pixel training set and 65 spectral features, this represents high-dimensional data. We extracted the 65 λ band spectral responses for 360 pixels from normal skin regions of the carcass and labeled them as "normal" class. Since different areas of normal chicken skin have different spectral responses, there are more normal-class samples than tumor-class samples. We used half (180 samples) of these as the pixel training set and half (180 samples) as the pixel test set for normal-class pixel data. 300 of the 360 normal samples were selected from various areas of the normal skin regions of the carcass in HS cube 1 that have very different spectral responses. These regions include pale skin, pinkish skin, skin covering bony joints, and the shadowy area under the wings, as denoted in Figure 10. Figure 11 shows the normalized spectra of some of these regions; as seen, they are quite different. 60 of the 360 normal skin samples were chosen from the normal skin regions of a carcass in the second HS cube. Unlike the carcass image in the first HS cube, this carcass image displays the side of the chicken when captured by the HS imaging system. Thus, some regions in this carcass, as noted in Figure 12, are not present in the carcass in the first HS cube, and they have different spectral responses, as shown in Figure 13. We included 60 pixels in these regions in our normal skin pixel training and test set.

For the thickened skin pixel database, we extracted the 65 λ band spectral responses for 100 pixels from the thickened skin region surrounding the same five tumors chosen for the lesion pixel database and labeled them as tumor class. Half (50 samples) of them are used for the pixel training set and half (50 samples) for the pixel test set. The normal pixel training and test set pixel databases were the same as those used for lesion detection.

Figure 10. Normal skin regions with different spectral responses.

Figure 11. Normalized spectra of different areas of normal chicken skin (normalized response versus band number; curves: pale skin, pinkish skin, bony area, shadowy region).

Figure 12. Normal skin regions in the second image with different spectral responses from those in the first image.

Figure 13. Normalized spectra of different areas of normal skin regions shown in Figure 12 (normalized response versus band number; curves: region 1, region 2, region 3, region 4).

3. BACKGROUND

This section summarizes the feature selection algorithms used in this paper. The goal of each feature selection algorithm is to select features that are important for distinguishing tumor-class samples from normal-class ones. These features along with the pixel training set are used in the NNB classifier to classify each sample. To quantify performance, we give the Pc (percentage of tumor and normal pixels correctly classified) score for the pixel training and test sets.

3.1 Forward Selection (FS)

The FS method first selects the best single feature and then adds one feature at a time, which in combination with the first and subsequent selected features maximizes the criterion function J. We use the Bhattacharyya distance as the criterion function, i.e.,

$$J = \frac{1}{8}(\mu_1 - \mu_2)^T \left[\frac{C_1 + C_2}{2}\right]^{-1} (\mu_1 - \mu_2) + \frac{1}{2}\ln\frac{\left|\frac{C_1 + C_2}{2}\right|}{|C_1|^{1/2}\,|C_2|^{1/2}}, \qquad (1)$$

where μ1 and μ2 are the mean vectors for the tumor-class and normal-class training samples, and C1 and C2 are the covariance matrices for the tumor-class and normal-class training samples, respectively [20, p. 48]. The Bhattacharyya distance is large if the mean difference between two classes is large (the first term in J) and if the variances of the two classes are different (the second term in J). To select the best subset of m features out of n original features, the number of subsets searched by the FS algorithm is $[(2n - m + 1)m]/2$, which is much smaller than the number of subsets $n!/[m!\,(n-m)!]$ evaluated in an exhaustive search or in the branch and bound method. For example, to select the best subset of 4 features out of 65, the FS algorithm requires searching [(2x65 - 4 + 1)x4]/2 = 254 subsets, whereas an exhaustive search requires searching $65!/[4!\,(65-4)!] = 677{,}040$ subsets. However, the FS method does not examine all possible subsets, so the resulting subset is not guaranteed to produce the optimal set of features nor the best classification rate Pc. The FS method also has the nesting problem, i.e., the subset of four best features chosen by FS contains the subset of three best features chosen by FS, etc. This is not normally expected to be the case. Recall that the FS algorithm produces a set of ordered features. We thus use the FS algorithm to select 30 initial features (more than the final number of features, 3 or 4, that we want to use) and the MBB algorithm to select the optimal subset of final features (three or four features at most) out of these 30 FS features to use in our classifier. This is our FS/MBB algorithm [14].

3.2 Modified Branch and Bound (MBB)

Since the modified branch and bound (MBB) algorithm developed at CMU [14] uses modifications to the basic branch and bound (BB) algorithm, we first give a brief description of the basic BB algorithm. To select the best set of m features out of n original features, the BB algorithm selects the n - m features to be discarded. It creates a search tree with n - m levels with one feature being omitted at each level of the tree. The problem is to select the best path through the tree that yields the largest J. The BB algorithm assumes monotonicity of J, i.e., J decreases as we move down the tree; this is logical because more features are omitted as we move down the tree. We use the Bhattacharyya distance in (1) as the criterion function. The BB algorithm starts the search at the top of the tree, and all nodes at level 1 are analyzed. A given level-1 node has several nodes below it. The successor node below the level-1 node N1 with the largest J is analyzed further. The search continues until it reaches the bottom of the tree, the n - m level, resulting in one full path through the tree with an estimate (a bound B) on the criterion function J. J is then evaluated at other level-1 nodes and the process is continued to lower levels of the tree; if J < B for a given node, then J does not have to be evaluated at successor nodes under that node because J decreases as we proceed down the tree. If J > B for a given node, paths from that node to the bottom of the tree are explored (as long as their J remains larger than B). If a new different full path with a J > B is found, the bound B is updated with the new larger value. When a mother node has a low J < B, its successor nodes need not be analyzed. Omitting evaluation of J for a set of successor nodes (when J < B at some mother nodes) speeds up the search, and thus BB is more efficient than exhaustive search.

A new BB algorithm improvement in the MBB algorithm is to obtain a good initial estimate of B [14]. If we can obtain a good, high initial B, many more subsequent J values higher up in the tree are less likely to give a J > B. Therefore, calculations of J for many paths can hopefully be omitted. The MBB uses FS or other sub-optimal algorithms to order all n features from best to worst, and the tree is then constructed with this ordered feature set. An initial estimate of B is calculated using the m best features ordered by FS. This B bound estimate is higher than one estimated by using non-ordered features. Using an ordered set of features (λ) puts the best FS features on mother nodes with many successor nodes. Hence, we expect a low J to be obtained at these nodes because when the best FS features are omitted, we expect J to be less than B. When this occurs, all subsets of features below these nodes can then be omitted in the search, and this speeds up the search.

Another MBB modification is to find a proper starting search level. The motivation for this is that at the upper levels of the search tree, we do not expect J < B, since only one or two features are omitted. In MBB, we thus start the BB search (J evaluation) at level (n - m)/4, because we only expect J to be less than a good initial B estimate when some reasonable number of features are omitted. J is evaluated for all nodes at this level. If all nodes give J > B, we jump to level (n - m)/2, calculate J for all of its nodes, and apply the BB search to nodes below all nodes with J > B. If any node at some level such as (n - m)/4 has J [...] B. If all nodes at level (n - m) have J > B, then we know that we would have had to evaluate J at all nodes above that level. This "jump search" algorithm thus saves searching J at all nodes above that level [14].
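To make Sect. 3.1 concrete, the following sketch evaluates the Bhattacharyya criterion on a feature subset and runs greedy forward selection while counting criterion evaluations. It assumes the standard two-class Gaussian form of the Bhattacharyya distance with sample means and sample covariances; the function names, the random synthetic data, and the artificially shifted feature 2 are hypothetical and are not taken from the report or from [14].

```python
import math
import numpy as np

def bhattacharyya(x1, x2, idx):
    """Bhattacharyya distance between two classes on the feature subset idx."""
    a = np.asarray(x1, dtype=float)[:, idx]
    b = np.asarray(x2, dtype=float)[:, idx]
    d = a.mean(axis=0) - b.mean(axis=0)
    # np.cov returns a 0-d array for a single feature, hence atleast_2d.
    c1 = np.atleast_2d(np.cov(a, rowvar=False))
    c2 = np.atleast_2d(np.cov(b, rowvar=False))
    c = 0.5 * (c1 + c2)
    term1 = 0.125 * d @ np.linalg.solve(c, d)
    # slogdet returns (sign, log|det|); logs keep the ratio numerically stable.
    log_c = np.linalg.slogdet(c)[1]
    log_c1 = np.linalg.slogdet(c1)[1]
    log_c2 = np.linalg.slogdet(c2)[1]
    term2 = 0.5 * (log_c - 0.5 * (log_c1 + log_c2))
    return float(term1 + term2)

def forward_selection(x1, x2, m):
    """Greedily add the feature that maximizes J; return features and #J calls."""
    n = np.asarray(x1).shape[1]
    selected, evaluations = [], 0
    for _ in range(m):
        best_j, best_f = -np.inf, None
        for f in range(n):
            if f in selected:
                continue
            j = bhattacharyya(x1, x2, selected + [f])
            evaluations += 1
            if j > best_j:
                best_j, best_f = j, f
        selected.append(best_f)
    return selected, evaluations

def fs_subset_count(n, m):
    """Closed form n + (n-1) + ... + (n-m+1) = [(2n - m + 1) m] / 2."""
    return (2 * n - m + 1) * m // 2

# Synthetic two-class data: 6 features, only feature 2 separates the classes.
rng = np.random.default_rng(0)
tumor = rng.normal(size=(200, 6))
normal = rng.normal(size=(200, 6))
normal[:, 2] += 3.0
selected, evaluations = forward_selection(tumor, normal, m=3)
```

On this toy data the discriminating feature is picked first, and the evaluation counter agrees with the closed-form count; `fs_subset_count(65, 4)` and `math.comb(65, 4)` reproduce the 254 and 677,040 figures quoted in the text. Because each step keeps the previously selected features, the nesting property discussed above is visible directly in the returned list.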

4. MATERIAL AND METHODS

First, the background must be removed prior to image processing. We remove the background around the chicken carcasses by placing a mask whose value is one (white) on the carcasses and zero (black) on the background. To obtain the mask for each carcass image, we first obtain the spectra of the background and several skin regions on the carcass. Unlike the spectral responses of tumors and normal skins, the spectral response of the background does not noticeably vary over all 65 spectral bands. Figure 14 shows the unnormalized spectra of the background, some normal skin regions, and two tumor regions. From these training data, we chose to compute the difference between the responses in two separate bands (10th and 40th bands) for each pixel; we set pixels with an intensity difference less than 500 to zero and otherwise set the pixel value to one. This forms the mask. The result is shown and discussed in Sect. 5.1.

Figure 14. Spectra of the background and other chicken regions (intensity versus band number; curves: background, normal, lesion, dermis).
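The masking rule above amounts to a two-band difference followed by a threshold. The sketch below assumes the cube is stored as an array of shape (rows, columns, bands), that the 10th and 40th bands correspond to zero-based indices 9 and 39, and that the magnitude of the difference is what is compared with 500; the report does not state the sign convention, so the absolute value is an assumption, and the name `background_mask` is hypothetical.

```python
import numpy as np

def background_mask(cube, band_a=9, band_b=39, threshold=500.0):
    """Return 1 on carcass pixels and 0 on background pixels.

    A pixel is treated as background when its response changes by less
    than `threshold` between the two bands (assumed absolute difference).
    """
    cube = np.asarray(cube, dtype=float)
    diff = np.abs(cube[..., band_a] - cube[..., band_b])
    return (diff >= threshold).astype(np.uint8)

# Tiny synthetic cube: a flat background spectrum everywhere, one carcass
# pixel with a large band-to-band change, one pixel just under the threshold.
cube = np.full((2, 2, 65), 1000.0)
cube[0, 0, 9], cube[0, 0, 39] = 2000.0, 4000.0
cube[1, 1, 39] = 1400.0
mask = background_mask(cube)
```

Multiplying each band image by `mask` would then zero out the background before the pixel classifier is applied.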

Second, we select the spectral bands to use ~Io locate the lesion and lhickened skin regions of tumors and separate

them from normal skin regions.

Wetrain

our feature

selection

algorithm on the

19

lesion pixel database and the thickened skin pixel database; these are used to detect the lesions and thickened skin regions of tumors, respectively. Weuse forward selection (FS) to select the 30 best features out of the 65 available ones and we then apply the modifiedbranch and boundalgorithm to select a numberof final features (FS/MBB algorithm). For this database (with only 65 k features), it is possible to apply the MBBalgorithm directly to the original databases without first reducing the numberof features by FS. In general, we do not expect this to be the case. Weshow in Sect. 5.2 that the two methods (MBBand FS/MBB) give the same set of final features for our pixel databases. To select the best four features out of all 65 features, MBBtook more than three hours, while the FS/MBB algorithm (MBBapplied to 30 FS features) took less than two minutes. Thus, the proposed FS/MBB algorithm preferable for manyHSapplications. Wenote that the two methods do not give the same set of final features in general. In such cases, we find small differences in the different spectra chosen. Sect. 5.2 provides results and a discussion of this. The MBBalgorithm solution is optimal, whereas the FS/MBB algorithm solution is sub-optimal because FS/MBB only gives the best set of 30 features (by FS) and these are not the optimal set of 30 features. This is expected, since feature selection is an N-Pcomplete problem [21], and only a search over the entire, database can give the best solution. In general HS applications with more than a hundredfeature bands, applying MBBto the original database is too timeconsumingto be considered. Wenowaddress the computational times for tlhe different algorithms, Chen[14] noted that to select the best four features from 137 features, an exhaustive search required a search time of seven days, MBBtook four days, and FS/MBBneeded only six minu~es on a Pentimn II 250 MHzcomputer. 
We now compare the computation times for the MBB algorithm to those for the BB and exhaustive search algorithms on the present database. We first used the FS method to reduce the number of original λ features in the lesion pixel database from 65 to 30. The three optimal feature selection algorithms were then used to select different numbers of final features from the reduced 30 FS features. Table 1 lists the number of calculations of the criterion function J that each of the three algorithms has to evaluate to obtain one to eight final features out of the 30 initial ones. The number of calculations of J is a measure of the speed of the different algorithms. When one or two features are selected, an exhaustive search is faster than the BB-based methods. When three or more features are needed, the MBB algorithm outperforms the other two methods. The improvement factor over exhaustive search increases as more final features are chosen.

Table 1. The number of calculations of J required by three optimal feature selection algorithms to select different numbers of final features from the 30 FS features for the lesion pixel database.

# features selected   Exhaustive search   BB        MBB
1                     30                  440       45
2                     435                 1820      466
3                     4060                5433      1686
4                     27405               14960     5528
5                     142506              34856     16418
6                     593775              59284     30429
7                     2035800             121119    33450
8                     5852925             196304    78240
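The exhaustive-search column of Table 1 can be checked independently: an exhaustive search for the best k of n features evaluates J once per subset, i.e. C(n, k) times. The following Python sketch reproduces that column for n = 30 and computes the improvement factor of MBB over exhaustive search from the MBB counts listed in Table 1. The function and variable names are ours and are for illustration only.

```python
from math import comb

def exhaustive_evaluations(n_features: int, subset_size: int) -> int:
    """Number of evaluations of J in an exhaustive search:
    one evaluation per subset of the requested size."""
    return comb(n_features, subset_size)

# MBB counts copied from Table 1 (lesion pixel database, 30 FS features).
mbb_counts = {1: 45, 2: 466, 3: 1686, 4: 5528,
              5: 16418, 6: 30429, 7: 33450, 8: 78240}

exhaustive_counts = {k: exhaustive_evaluations(30, k) for k in range(1, 9)}

# Improvement factor of MBB over exhaustive search (values > 1 favor MBB).
improvement = {k: exhaustive_counts[k] / mbb_counts[k] for k in mbb_counts}

for k in range(1, 9):
    print(k, exhaustive_counts[k], mbb_counts[k], round(improvement[k], 2))
```

The factor is below one for one and two features and grows steadily from three features onward, which matches the discussion above.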

We now address the processing applied to the binary image produced after using the features selected by the FS/MBB algorithm and applying them (for each image pixel) to an NNB classifier. A binary image results, with each pixel classified as one for the tumor class and zero for the normal skin class. We refer to this binary image, after application of our feature selection algorithm and an NNB classifier, as the binary pixel classification image. We expect to have a higher false alarm rate in the binary pixel classification image than in the pixel databases because of the large variation in normal skin and because the normal skin training samples in the pixel databases might not represent all of the normal skin regions in a full image. In addition, we expect false alarms because normal chicken skin has a variety of hyperspectral responses, some of which are very similar to the spectral responses of tumors at the few λs used. Nevertheless, these false alarms should not occur at a number of adjacent pixels if we train our system properly. Since a tumor is not a single pixel but is a region that we assume consists of at least 6 pixels, we analyze the blob-colored [22] version of the binary pixel classification image and omit any pixel blobs that form connected regions with five or fewer pixels. We then perform morphological processing on the resultant binary image. We apply a closing operation with a structuring element of size 3x3 pixels; this fills in internal holes on tumors of 2x2 pixels or less in size (holes are tumor-class pixels that are misclassified by our classifier as normal skin pixels). The structuring element size of 3x3 pixels was chosen after analysis of the binary images of the 5 tumors in the training set, which showed some 2x2 pixel holes inside the tumor regions; we do not expect large holes in tumor regions from our feature selection algorithm. The two post-processed binary images for the lesion and thickened skin cases are then fused (ANDed, i.e. intersected) to produce the final classification image result. We believe that the transition region from the lesion region to the thickened skin region of a tumor responds to both the lesion and thickened skin features. As a result, when we fuse the binary images for the lesion and thickened skin cases, these regions are detected. This requires further analysis in future work (i.e. modifying the fusion algorithm used). Fusion results for the binary pixel classification images are shown in Sect. 5.3; fusion further reduces false alarms.
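The three post-processing steps described above (blob removal, 3x3 closing, AND fusion) can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the implementation used for the results: 8-connectivity for the blobs and zero padding at the image border are our assumptions, and all function names and the toy images are hypothetical.

```python
from collections import deque

def remove_small_blobs(img, min_size=6):
    """Zero out connected regions of ones with fewer than min_size pixels.
    8-connectivity is an assumption; the connectivity is not stated in the text."""
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if img[r][c] != 1 or seen[r][c]:
                continue
            blob, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(blob) < min_size:
                for y, x in blob:
                    out[y][x] = 0
    return out

def _neighborhood_op(img, op):
    """Apply op (max = dilation, min = erosion) over each 3x3 neighborhood,
    treating pixels outside the image as zero."""
    rows, cols = len(img), len(img[0])
    def at(y, x):
        return img[y][x] if 0 <= y < rows and 0 <= x < cols else 0
    return [[op(at(r + dy, c + dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1))
             for c in range(cols)] for r in range(rows)]

def close3x3(img):
    """Closing = dilation followed by erosion with a 3x3 structuring element.
    The image is zero-padded by one pixel first so the border is handled cleanly."""
    cols = len(img[0])
    padded = ([[0] * (cols + 2)]
              + [[0] + row[:] + [0] for row in img]
              + [[0] * (cols + 2)])
    closed = _neighborhood_op(_neighborhood_op(padded, max), min)
    return [row[1:-1] for row in closed[1:-1]]

def fuse(a, b):
    """Pixel-wise AND (intersection) of two binary images."""
    return [[x & y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Toy lesion image: a 4x4 detection with a 2x2 hole, plus a 2-pixel false alarm.
lesion = [[0] * 8 for _ in range(8)]
for r in range(2, 6):
    for c in range(2, 6):
        lesion[r][c] = 1
for r in (3, 4):
    for c in (3, 4):
        lesion[r][c] = 0
lesion[0][7] = lesion[1][7] = 1

# Toy thickened skin image: a 6x6 detection covering the same area.
thick = [[1 if 1 <= r <= 6 and 1 <= c <= 6 else 0 for c in range(8)]
         for r in range(8)]

lesion_pp = close3x3(remove_small_blobs(lesion))
thick_pp = close3x3(remove_small_blobs(thick))
final = fuse(lesion_pp, thick_pp)
```

In this toy case the 2-pixel blob is discarded, the 2x2 hole is filled by the closing, and only pixels present in both post-processed images survive the fusion.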

5. RESULTS AND DISCUSSION

In this section, we review how we remove the background from the carcasses and show results in Sect. 5.1. We demonstrate our feature selection algorithm performance on the two pixel databases in Sect. 5.2. Sect. 5.3 summarizes the detection results on the two HS images.

5.1 Background Removal Processes

We remove the background from the carcasses by placing a mask over the image; the mask has a value of zero (black) for the background and one (white) for the chicken. Using the fact that the spectral response of the background is nearly constant over λ, we obtain the mask by computing the difference in responses at each pixel between two bands with large differences for the carcass. We used the 10th and 40th band images (see Figure 14 in Sect. 4). We calculated this difference for each image pixel, formed a pixel difference image, and set the pixels with an intensity difference less than 500 to zero and other pixels to one. The resultant image for the first HS images is shown in Figure 15a. We remove some unwanted background blob regions by retaining only connected regions with more than 1000 pixels (the chicken carcasses will have more than 1000 pixels). We then perform morphological processing. We close the resultant binary image with a 5x5 pixel structuring element; this fills in small holes in the mask with 4x4 pixels or less. The structuring element size chosen is presently ad hoc. The final masks for the first HS images and the second HS images are shown in Figure 15b and Figure 16, respectively.
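The band-difference thresholding step described above can be sketched as follows. This is a minimal sketch, assuming the difference is taken as an absolute value (the sign convention is not stated) and using made-up intensity values; `background_mask`, `band_10`, and `band_40` are illustrative names. The blob-size filter (regions with more than 1000 pixels) and the 5x5 closing are omitted.

```python
def background_mask(band_a, band_b, threshold=500):
    """Return a binary mask: 1 where the two band responses differ by at least
    `threshold` (carcass), 0 where they differ by less (background)."""
    return [[1 if abs(a - b) >= threshold else 0 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(band_a, band_b)]

# Made-up 2x2 intensities for the 10th and 40th band images.
band_10 = [[1000, 1020],
           [1010, 3000]]
band_40 = [[1100, 2500],
           [1900, 3100]]

mask = background_mask(band_10, band_40)
print(mask)
```

Pixels whose response is nearly the same in both bands (differences of 100 here) are assigned to the background, and the two pixels with large differences are assigned to the carcass.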

Figure 15. Mask for the first HS images: (a) pre-processing, (b) post-processing.

Figure 16. Mask for the second HS images.

5.2 Feature Selection Pixel Database Results

The Lesion Pixel Database

We used FS to reduce the number of original pixel database features from 65 to 30. Table 2 lists the 30 features selected in order for the lesion pixel database by the FS algorithm. We then used the MBB algorithm to select the best subsets of one to four features from these 30 FS features (FS/MBB). Our objective was to keep the number of final features as small as possible to allow for real-time implementation. Thus, a maximum of four final features for each pixel database is considered. We also applied the MBB algorithm to the original 65 λ pixel database. Table 3 compares the features selected by the FS (on all 65 original features), MBB (on all 65 original features), and FS/MBB (on 30 FS features) algorithms. As seen, the MBB and FS/MBB algorithms select the same features for discrimination. Thus, our FS/MBB algorithm selects the optimal set of λ features for this pixel database. Only two of the four best features ordered by FS (features 18 and 28 in Table 2) are present in the best subset of four features selected by the FS/MBB algorithm (Table 3); neither of the two best features ordered by FS in Table 2 is present in the best subset of two features selected by the FS/MBB algorithm. Thus, an optimal feature selection algorithm (such as MBB) is needed, and the initial feature reduction algorithm (such as FS) must provide a number of starting features that is much larger than the number of final features considered. Thus, one cannot select the best features using only the FS algorithm. The FS/MBB data shows the nesting problem that FS produces. The best subset of three features (features 11, 18, and 28) does not contain any feature in the best subset of two features (features 20 and 29) or one feature (feature 34). The best subset of four features contains only two features in the best subset of three features (features 18 and 28). The FS algorithm cannot handle such cases.
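The nesting problem can be illustrated with a small toy example. In the following Python sketch, the criterion values J are hypothetical numbers chosen for illustration (they are not taken from the pixel databases), and an exhaustive search stands in for MBB as the optimal method for brevity. Greedy forward selection must keep its first choice, so it cannot reach the best pair.

```python
from itertools import combinations

# Hypothetical criterion values J for subsets of three features "a", "b", "c".
J = {
    frozenset("a"): 3.0, frozenset("b"): 2.0, frozenset("c"): 2.0,
    frozenset("ab"): 4.0, frozenset("ac"): 4.0, frozenset("bc"): 6.0,
}

def forward_selection(features, k):
    """Greedy FS: at each step, add the feature that maximizes J of the
    enlarged subset. Earlier choices are never revisited (nesting)."""
    chosen = []
    for _ in range(k):
        remaining = [f for f in features if f not in chosen]
        best = max(remaining, key=lambda f: J[frozenset(chosen + [f])])
        chosen.append(best)
    return chosen

def optimal_subset(features, k):
    """Optimal subset of size k by exhaustive search (stand-in for MBB)."""
    return max(combinations(features, k), key=lambda s: J[frozenset(s)])

fs_pair = forward_selection("abc", 2)
best_pair = optimal_subset("abc", 2)
print(fs_pair, J[frozenset(fs_pair)])      # FS keeps "a", the best single feature
print(best_pair, J[frozenset(best_pair)])  # the best pair does not contain "a"
```

Here the best single feature is not a member of the best pair, which mirrors the behavior seen in Table 3, where the best subset of two features shares no feature with the best single feature.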

Table 2. The 30 features chosen (in order of importance) out of 65 by the FS algorithm for the lesion pixel database.

feature #    1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
λ feature   34  18  28  11  64  14  29  46   7  33  37  47  42  51  56

feature #   16  17  18  19  20  21  22  23  24  25  26  27  28  29  30
λ feature    6   9  13  16  20  32  58  53  24  52  41  40  61  38  65

Table 3. Best features chosen by three feature selection algorithms from the 30 FS features in Table 2 for the lesion pixel database.

The number of features   FS            MBB           FS/MBB
1                        34            34            34
2                        18,34         20,29         20,29
3                        18,28,34      11,18,28      11,18,28
4                        11,18,28,34   14,18,28,64   14,18,28,64

After using the FS/MBB algorithm to reduce the number of features to a low-dimensional space, each sample in the training and test pixel database is fed to the NNB classifier (using the training set pixels as the NNB database), and the classification rates Pc for the training and test pixel sets are obtained. In obtaining training set Pc, the training set sample being classified is of course removed from the NNB classifier. Table 4 compares the Pc scores using the features chosen by the FS and the FS/MBB algorithms as the number of final features is increased. When two and three features are used, the training set Pc scores (88.7% and 98.3%) using our FS/MBB algorithm are noticeably higher than those (Pc = 85.2% and 92.6%) using the features selected by the FS algorithm. Thus our optimal FS/MBB feature selection algorithm is needed. Since the Bhattacharyya measure J used does not relate directly to PE (or Pc), the features chosen by the FS algorithm can and do sometimes give a better Pc than those chosen by our algorithm (e.g. when 4 features are used). However, the Pc difference is small. We note that generalization is good for all cases in Table 4.

Figure 17 shows the classification rate Pc for the training and test pixel sets as the number of final FS/MBB features used increases. To select the number of final features to use, we look at Pc of the training set. From Figure 17, we see that Pc for the training pixel set reaches a distinct peak when three FS/MBB features are used (Pc = 98%) and it decreases when four final features are used. Thus, we keep three features for the lesion pixel database. Figure 17 shows that the test set pixel database gives a similar result, thus confirming our choice of 3 final features. We also notice that the Pc for the pixel training set when four FS/MBB features are used (Pc = 95%) is lower than that when three features are used. This demonstrates that more features do not always give a higher Pc. In general, we expect Pc for the training set to increase as the number of features increases; thus, in general, use of a validation set (as in [14]) is needed to select the number of final features to use. Table 5 lists the training and test pixel set Pc and PFA (percentage of normal skin pixels misclassified as tumor pixels) for the three final features (features 11, 18, and 28) selected by the FS/MBB algorithm. Although we obtain a low PFA score of 1.1% for the training set and 2.8% for the test set, the percentage of errors is higher for normal skin pixels than for lesion pixels. Image results are presented and discussed in Sect. 5.3.
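The way the training set Pc is obtained (each training sample is removed from the classifier database before it is classified) is a leave-one-out estimate. The following Python sketch shows the idea for a plain nearest-neighbor rule with Euclidean distance. It is a minimal sketch with made-up one-dimensional samples; the function name `leave_one_out_pc` is hypothetical, and no threshold is applied to the distances.

```python
import math

def leave_one_out_pc(samples, labels):
    """Leave-one-out classification rate Pc (%) of a nearest-neighbor
    classifier whose database is the training set itself."""
    correct = 0
    for i, x in enumerate(samples):
        # Exclude sample i so it cannot match itself at distance zero.
        others = [k for k in range(len(samples)) if k != i]
        nearest = min(others, key=lambda k: math.dist(x, samples[k]))
        if labels[nearest] == labels[i]:
            correct += 1
    return 100.0 * correct / len(samples)

# Made-up training set: label 0 = normal skin, label 1 = tumor.
samples = [(0.0,), (0.2,), (1.0,), (1.2,), (0.55,)]
labels = [0, 0, 1, 1, 1]

pc_train = leave_one_out_pc(samples, labels)
print(pc_train)
```

Without the exclusion step, every training sample would find itself as its own nearest neighbor and the training set Pc would trivially be 100%.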

Table 4. Pc results for features chosen using the FS and FS/MBB algorithms for the lesion pixel database.

             FS algorithm              FS/MBB algorithm
# features   Pc(train)%   Pc(test)%   Pc(train)%   Pc(test)%
1            76.5         76.5        76.5         76.5
2            85.2         82.6        88.7         85.2
3            92.6         93.5        98.3         95.2
4            96.1         94.4        95.2         93.9

Figure 17. Pc for the training and test pixel sets vs the number of final FS/MBB features for the lesion pixel database (x-axis: Number of Final Features; curves: Training Set, Test Set).

Table 5. Pc and PFA (% of normal skin pixels misclassified as tumor) for the training and test pixel sets using the three features selected by the FS/MBB algorithm for the lesion pixel database.

               Pc (%)   PFA (%)
Training set   98.3     1.1
Test set       95.2     2.8

The Thickened Skin Pixel Database

Table 6 lists the 30 selected features ordered by the FS algorithm for the thickened skin pixel database. Table 7 compares the features selected by the FS (on all 65 original features), MBB (on all 65 original features), and FS/MBB (on 30 FS features) algorithms. As we can see from Table 7, the best single feature (feature 18) and the best two features (features 1 and 18) chosen by FS are also the optimal ones chosen by the FS/MBB algorithm. The MBB and FS/MBB algorithms yield the same features, and (for this database) both algorithms are optimal. The best three and four features are similar for all algorithms. When three features are used, the FS algorithm only has one (feature 42) of the optimal features chosen by the FS/MBB algorithm, but the other two features are close in λ (feature 1 vs 3 and feature 18 vs 19). When four features are used, the FS algorithm has three (features 1, 3, and 42) of the optimal features chosen by the FS/MBB algorithm. The other feature is close in λ (feature 18 vs 19). We also see the nesting problem in the FS algorithm; from Table 7, we see that the best subset of three features (features 3, 42, and 19) does not contain any feature in the best subset of two features (features 1 and 18). The FS algorithm cannot make such changes.

Table 8 compares the Pc scores using the features chosen by the FS and the FS/MBB algorithms as the number of final features selected is increased. From Table 8, we see that the training set Pc scores using the FS/MBB algorithm are consistently higher than those using the FS algorithm when three or four features are used. Figure 18 shows the classification rates Pc for the pixel training and test sets as the number of final features used increases. The Pc score for the training set pixels is highest when four features are used. From Figure 18, we expect that when five or more FS/MBB features are used, the training set Pc will increase. However, we set a maximum of four wavelength bands for real-time implementation. We thus keep four features to use in the NNB classifier for the thickened skin pixel database (using the training set pixels as the NNB database). In obtaining training set Pc, the training set sample under test is removed from the NNB classifier. Table 9 lists Pc and PFA for the training and test pixel sets for the four final features (features 1, 3, 19, and 42) selected by the FS/MBB algorithm. The training set PFA score (the % of normal pixels called thickened skin) for the thickened skin pixel database (5.9% for four features) is noticeably higher than the training set PFA score for the lesion pixel database (1.1% for three features). Thus, we expect more false alarms in the binary pixel classification output image for the thickened skin pixel database. We also note more normal skin errors (PFA) than thickened skin errors.

Table 6. The 30 features chosen (in order of importance) out of 65 by the FS algorithm for the thickened skin pixel database.

feature #

1

2

3

4

5

6

)~ feature

18

1

42

3

19

37

feature #

16

17

18

19

20

21

)v feature

15

38

51

45

47

26

7

8

9

8

5

48

23

27

4

22

123

24

25

26

27

28

29

30

12

49

61

65

54

58

24

44

20

l0

21

ll

12 13 14 15 28

16

Table 7. Best features chosen by three feature selection algorithms from the 30 FS features in Table 6 for the thickened skin pixel database.

The number of features   FS          MBB         FS/MBB
1                        18          18          18
2                        1,18        1,18        1,18
3                        1,18,42     3,19,42     3,19,42
4                        1,3,18,42   1,3,19,42   1,3,19,42

Table 8. Pc results for features chosen using the FS and FS/MBB algorithms for the thickened skin pixel database.

             FS algorithm              FS/MBB algorithm
# features   Pc(train)%   Pc(test)%   Pc(train)%   Pc(test)%
1            78.2         78.2        78.2         78.2
2            88.2         87.2        88.2         87.3
3            90.1         89.1        90.9         88.6
4            88.2         89.1        91.3         90.5

Figure 18. Pc on the pixel training and test sets vs the number of final features for the thickened skin pixel database (x-axis: Number of Final Features).

Table 9. Pc and PFA for the training and test pixel sets using the four features selected by the FS/MBB algorithm for the thickened skin pixel database.

               Pc (%)   PFA (%)
Training set   91.4     5.9
Test set       90.5     6.5

5.3 Detection Results

The First Image

The three chosen features (features 11, 18 and 28) for the lesion pixel database and the NNB classifier were applied to the pixels in the first HS images. To assign an unknown pixel to a class in the NNB classifier, the closest Euclidean distance from that sample to any tumor-class sample, d1, and to any normal-class sample, d2, are computed. The sample is then assigned to the tumor class if d1 < d2 + T (T is a threshold), and to the normal class otherwise. When the threshold T is zero, the NNB classifier is unbiased. If the threshold T is a large negative number, d2 + T becomes a negative or small positive number. Thus, d1 is more likely to be larger than d2 + T, and most of the pixels will be classified as normal skin. For the lesion pixel database, when the NNB threshold is zero, the false alarm rate on image one is high (PFA = 5%). Note that this is noticeably larger than the PFA = 1.1% for the lesion pixel database. We selected a small threshold of -0.0005 to use, and found the detection result to be acceptable (PFA = 1.5%). Figure 19a shows the binary classification result for the lesion features on image one.
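The thresholded decision rule described above (tumor if d1 < d2 + T) can be sketched in Python as follows. This is a minimal sketch with made-up two-dimensional samples; the function name `nnb_classify` and the sample values are hypothetical, and the thresholds used in the example are chosen only to make the effect visible.

```python
import math

def nnb_classify(x, tumor_samples, normal_samples, T=0.0):
    """Return 1 (tumor) if d1 < d2 + T, else 0 (normal skin), where d1 and d2
    are the Euclidean distances from x to the nearest tumor-class and
    normal-class samples, respectively."""
    d1 = min(math.dist(x, s) for s in tumor_samples)
    d2 = min(math.dist(x, s) for s in normal_samples)
    return 1 if d1 < d2 + T else 0

# Made-up database with one sample per class.
tumor_samples = [(0.0, 0.0)]
normal_samples = [(1.0, 0.0)]

pixel = (0.45, 0.0)  # slightly closer to the tumor sample
unbiased = nnb_classify(pixel, tumor_samples, normal_samples, T=0.0)
biased = nnb_classify(pixel, tumor_samples, normal_samples, T=-0.2)
print(unbiased, biased)
```

With T = 0 the pixel is assigned to the tumor class; a negative T shifts borderline pixels to the normal class, which is how the threshold trades detections for fewer false alarms.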

Since a tumor is not a single pixel but consists of at least 6 pixels, we discard (omit) any white (one) pixels in the binary classification image which are part of a connected region of five or fewer pixels. We then perform a closing operation on the resultant binary image with a 3x3 pixel structuring element to fill in small holes in tumor regions. The resultant image is shown in Figure 19b (detected tumors are marked by rectangles). The number of false alarms in Figure 19a is significantly reduced in Figure 19b. In Figure 19b, 12 of the 14 tumors are detected, but more than 20 false alarm regions are still present. Thus, use of the thickened skin pixel features is necessary to reduce false alarms. Using only one database, the lesion pixel database or the thickened skin pixel database, for training is not recommended.

The four chosen features (features 1, 3, 19 and 42) for the thickened skin pixel database are used in the NNB classifier and applied to the image. We use the same threshold of -0.0005 for the NNB classifier for the thickened skin pixel spectra features, since it gives good results. Figures 20a and 20b show the binary classification results before and after binary morphological image processing for the thickened skin pixel features, respectively. The number of false alarm regions in Figure 20a is greatly reduced to 30 in Figure 20b. In Figure 20b, 13 of the 14 tumors are detected and marked by rectangles. There are many more false alarms in the binary pixel classification output image for the thickened skin features (Figure 20a) than for the lesion features (Figure 19a), as we expected from its lower Pc score on the pixel database.

We then fuse (intersect) the morphologically processed binary classification image results for the two feature cases (Figures 19b and 20b) and obtain the final classification image result in Figure 21a. The result indicates that the transition regions from the lesion regions to the thickened skin regions of tumors are detected by both the lesion and thickened skin features. We use the fact that we do not expect to detect tumors on the edge of the chicken images (we refer to this rule as post-processing) to remove such potential false alarms (6 in Figure 21a) that appear within 15 pixels of the edge of the chicken in the classification image result. The resultant image in Figure 21b has located 12 tumors marked by rectangles and has only a single false alarm marked by a triangle. For this image, we detect 12 of the 14 skin tumors with one false alarm. Tumor number 5 (left center) in Figure 2 was missed; it has a small lesion region (only 5-6 pixels), and only one pixel in this lesion region is detected by our feature selection algorithm for the lesion features (Figure 19a). Although the thickened skin region of this tumor is detected as shown in Figure 20b, the final classification image result does not detect this tumor since its lesion region is not detected in Figure 19b. Tumor number 6 in Figure 2 is also missed; it has no lesion region, and our feature selection algorithm thus does not detect it in Figure 19a. Some pixels in its thickened skin region are detected in Figure 20a, but they form regions of five or fewer pixels and thus are removed in Figure 20b. Thus, both missed tumors are expected. Fusion of the binary pixel classification images significantly reduces the number of false alarms from more than 20 in Figure 19b and more than 30 in Figure 20b to only 7 in Figure 21a. Although we use our post-processing rule to remove 6 of these false alarms from Figure 21a, fusion of the two binary pixel classification images is necessary.

Figure 19. Detection results using the lesion features on the first image with tumors marked by rectangles: (a) before morphological processing, (b) after morphological processing.

Figure 20. Detection results using the thickened skin features on the first image with tumors marked by rectangles: (a) before morphological processing, (b) after morphological processing.

Figure 21. Final fused classification image results on the first image with tumors marked by rectangles and the false alarm marked by a triangle: (a) pre-processing, (b) post-processing.

The Second Image

The chosen features for the lesion pixel database and the thickened skin pixel database and the NNB classifier are now applied to the pixels in the second HS image. We do not expect our feature selection algorithm to detect tumor number 2 in this image (Figure 8) because the tumor consists of only six pixels and displays no lesion region. Figures 22a and 22b show the binary classification images for image 2 before and after morphological processing for the lesion features, respectively. In Figure 22a, the lesion region of tumor number 1 in Figure 8 is detected (center of carcass on left), but these pixels are disconnected and thus any group of them contains only five or fewer pixels. This and other regions are removed and not shown in Figure 22b. Tumor number 2 (on the right leg of the carcass on the left) is not detected in the binary classification result for the lesion features; this is expected because this tumor has no lesion region. Thus, both missed tumors on the left carcass are expected at the present resolution. In Figure 22b, 4 of the 7 tumors are detected and marked by rectangles, but more than 40 false alarms are present. Thus, use of the thickened skin pixel features is again necessary to reduce false alarms.

Figures 23a and 23b show the binary classification images before and after morphological processing for the thickened skin features, respectively. Again, we see that the binary classification image for the thickened skin features (Figure 23a) has more false alarms than the image for the lesion features (Figure 22a). In Figure 23a, only 4 of the 6 pixels on tumor number 2 in Figure 8 are detected and thus it is removed in our blob analysis. In Figure 23b, 5 of the 7 tumors are detected and marked by rectangles. The 2 missed tumors (on the left carcass) are expected, as discussed earlier.

Figure 24 shows the final classification image result for the second image after fusing Figures 22b and 23b and removing 3 potential false alarms on the edge of the chicken image. Of the 7 skin tumors, 4 are detected and marked by rectangles, and 2 false alarms occur and are marked by triangles. Tumor number 6 (lower right) in Figure 8 is detected by the thickened skin features but is missed by the lesion features. The tumor has a small lesion region, but only 3 of its pixels are detected by our feature selection algorithm. As a result, it is not detected in the final fused classification image in Figure 24. Fusion of the binary classification results reduces the number of false alarms from more than 40 in Figure 22b and more than 50 in Figure 23b to only 5. Of these 5 false alarms, 3 are near the edge of the chicken image and thus are removed by our post-processing rule.

Figure 22. Detection results using the lesion features on the second image with tumors marked by rectangles: (a) before morphological processing, (b) after morphological processing.

Figure 23. Detection results using the thickened skin features on the second image with tumors marked by rectangles: (a) before morphological processing, (b) after morphological processing.

Figure 24. Final fused classification image result on the second image with tumors marked by rectangles and false alarms marked by triangles.

6. CONCLUSIONS

Since the spectral responses on the lesions and thickened skin portions of tumors are different, we train our feature selection algorithm to detect lesions and thickened skin regions separately; we then morphologically process the resultant images, and we fuse the two detection results to reduce false alarms. The FS/MBB feature selection algorithm was described. HS data was shown to be useful for detecting chicken skin tumors. Our feature selection algorithm found that only 7 bands (features 1, 3, 11, 18, 19, 28 and 42) of HS data were needed. Our initial test result is promising, with 16 of 21 skin tumors detected with only 3 false alarms. Four of the five missed tumors are very small or have small lesion regions and thus were expected (for the data at the present resolution). Two of these tumors were detected by the thickened skin features.

Much more extensive tests are needed on much more data. The database should also have higher resolution, so that there are more pixels on each tumor. Creating a training and test set database is difficult because exact pixel tumor locations and sizes are not clear in the present data. Their locations should be carefully addressed. Future work should consider using a k-nearest neighbor (KNN) or a neural net classifier.

ACKNOWLEDGEMENT

The author would like to thank Dr. Yud-Ren Chen, Dr. Moon Kim and Dr. Kevin Chao of the Agricultural Research Service in Maryland for providing the database used in this paper.

REFERENCES

[1] B. Thai, and G. Healey, "Invariant subpixel target identification in hyperspectral imagery," Proc. SPIE, vol. 3717, pp. 14-24, 1999.

[2] T. Nichols, J. Thomas, W. Kober, and V. Velten, "Interference-invariant target detection in hyperspectral images," Proc. SPIE, vol. 3372, pp. 176-87, 1998.

[3] J. Goutsias, and A. Banerji, "A morphological approach to automatic mine detection problems," IEEE Trans. Aerospace and Electronic Systems, vol. 34, No. 4, pp. 1085-1096, 1998.

[4] J.E. Pinzon, S.L. Ustin, C.M. Casteneda, J. Pierce, and L.A. Costick, "Robust spatial and spectral feature extraction for multispectral and hyperspectral imagery," Proc. SPIE, vol. 3372, pp. 199-210, 1998.

[5] D. Casasent, and X.-W. Chen, "Hyperspectral data discrimination methods," Proc. SPIE, vol. 4203, pp. 27-36, 2000.

[6] T.C. Pearson, D.T. Wicklow, E.B. Maghirang, F. Xie, and F.E. Dowell, "Detecting aflatoxin in single corn kernels by transmittance and reflectance spectroscopy," Trans. of the ASAE, vol. 44(5), pp. 1247-1254, 2001.

[7] D. Casasent, X.-W. Chen, and S. Nakariyakul, "Hyperspectral methods to detect aflatoxin in whole kernel corn," Proc. of the 2nd Fungal Genomics, 3rd Fumonisin Elimination and 15th Aflatoxin Elimination Workshops, October 2002.

[8] F. Dowell, T. Pearson, E. Maghirang, F. Xie, and D. Wicklow, "Reflectance and transmittance spectroscopy applied to detecting fumonisin in single corn kernels infected with Fusarium verticillioides," Cereal Chem., vol. 79(2), pp. 222-226, 2002.

[9] F. Dowell, "Detecting vitreous and non-vitreous durum wheat kernels using near-infrared spectroscopy," in 1999 ASAE Annual International Meeting, Paper No. 993082, 1999.

[10] W.R. Windham, K.C. Lawrence, and B. Park, "Visible/NIR spectroscopy for characterizing fecal contamination of chicken carcasses," American Society for Agricultural Engineers, Paper No. 016004, 2001.

[11] W.R. Windham, B. Park, K.C. Lawrence, and R.J. Buhr, "Selection of visible/NIR wavelengths for characterizing fecal and ingesta contamination of poultry carcasses," International Conference on Near-Infrared Spectroscopy, Abstract p. O10-5, 2001.

[12] W.R. Windham, B. Park, K.C. Lawrence, and D.P. Smith, "Analysis of reflectance spectra from hyperspectral images of poultry carcasses for fecal and ingesta detection," International Society for Optical Engineering, Paper No. 4816-30, 2002.

[13] S. Kumar, J. Ghosh, and M. Crawford, "A hierarchical multiclassifier system for hyperspectral data analysis," in Multiple Classifier Systems, J. Kittler and F. Roli (Eds.), LNCS, vol. 1857, Springer, pp. 270-279, 2000.

[14] D. Casasent, and X.-W. Chen, "Waveband selection for hyperspectral data: optimal feature selection," Proc. SPIE, vol. 5601, April 2003.

[15] B.W. Calnek, H. John Barnes, C.W. Beard, W.M. Reid, and H.W. Yoder, Diseases of Poultry, Chapter 16, pp. 386-484, Iowa State University Press, Ames, IA.

[16] K. Chao, P.M. Mehl, M. Kim, and Y.R. Chen, "Detection of chicken skin tumors by multispectral imaging," Proc. SPIE, vol. 4206, pp. 214-223, 2001.

[17] I. Kim, Y.R. Chen, M. Kim, and S. Kang, "Application of hyperspectral fluorescence imaging for detection of skin tumors on chicken carcasses," in 2002 ASAE Annual International Meeting, Paper No. 023142, 2002.

[18] P. Narendra, and K. Fukunaga, "A branch and bound algorithm for feature subset selection," IEEE Trans. Comput., vol. 26, pp. 917-922, 1977.

[19] M.S. Kim, Y.R. Chen, and P.M. Mehl, "Hyperspectral reflectance and fluorescence imaging system for food quality and safety," Trans. of the ASAE, vol. 44(3), pp. 721-729, 2001.

[20] R. Duda, P. Hart, and D. Stork, Pattern Classification, 2nd ed., A Wiley-Interscience Publication, New York, p. 48, 2001.

[21] T. Cover, and J. Campenhout, "On the possible orderings in the measurement selection problem," IEEE Trans. Systems, Man, and Cybernetics, SMC-7(9), pp. 657-661, 1977.

[22] D. Ballard, and C. Brown, Computer Vision, Prentice-Hall, Englewood Cliffs, N.J., p. 151, 1982.