Pattern Recognition Letters 19 (1998) 1125–1132

Using orientation tokens for object recognition ¹

Vassili A. Kovalev a, Maria Petrou b,*, Yaroslav S. Bondar a

a Institute of Mathematics, Belarus Academy of Sciences, Kirova St., 32-A, Gomel 246652, Belarus
b School of Electronic Engineering, Information Technology & Mathematics, University of Surrey, Guildford, Surrey GU2 5XH, UK

Received 27 October 1997; received in revised form 27 July 1998

Abstract

We propose the use of co-occurrence matrices/histograms of (relative distance, relative angle) between pairs of orientation tokens for silhouette recognition and texture discrimination. The orientation tokens are defined as the tangent vectors to the boundary of the silhouette, or the gradient vectors for grey images. The efficiency of the method is demonstrated with the help of three different series of experiments with real data. © 1998 Elsevier Science B.V. All rights reserved.

Keywords: Orientation tokens; Angle–distance histograms; Image database retrieval; Texture discrimination

* Corresponding author. Tel.: +44 1483 259 801; fax: +44 1483 259 554; e-mail: [email protected].
¹ This work was partly supported by a Royal Society grant, and partly by an INTAS European Grant.
0167-8655/98/$ – see front matter © 1998 Elsevier Science B.V. All rights reserved. PII: S0167-8655(98)00093-2

1. Introduction

Although the subjects of object recognition and object shape representation seem to be intertwined in many computer vision problems concerned with the identification of objects in natural scenes or cluttered environments, there is a large class of applications where the problem of shape or surface characterisation can be treated independently from the problem of object segmentation. This is, for example, the situation in inventory-type image databases, where each image depicts a single object and the problem is to pick the image which depicts a specific object. One can then concentrate on the best way to represent the shape of an object, assuming that the closed contour that represents the shape, or the texture of the object, will always be available, since the object is already segmented and its texture can be confidently calculated without interference from other objects. For this purpose, some of the best known methods of shape representation include active shape models, as used by Basri et al. (1995) and Cootes et al. (1994), and polygonal representations with attributes calculated from these representations, like local curvature, used by Arkin et al. (1991) and Schwartz and Sharir (1987), or moments, used by Mumford (1991).

One of the most important tokens discussed by Marr (1982), which humans use to recognise objects and identify their boundaries, is the relative orientation of the elementary structures that make up the object or its boundary. Texture anisotropy has therefore been studied by several people, among others Chetverikov (1981) and Zucker et al. (1975). The applications where local texture anisotropy is exploited for the recognition of objects range from cancer detection in X-ray stomach images, studied by Hasegawa and Toriwaki (1992), to the detection of faults in engineering surfaces, like the work performed by Kovalev and Chizhik (1993) and Rao (1990), and the quick sorting of images in an image database, like the work of Gorkani and Picard (1994).

One of the major issues that then arises is how to represent this anisotropy, either in shape or in texture, in a way that captures the local characteristics of the object in a holistic way and at the same time allows the easy comparison and quantification of similarities and differences between objects.

In this paper we propose the use of histograms (co-occurrence matrices), which have been used successfully in other Computer Vision tasks by Haddon and Boyce (1990, 1993) and are extensively discussed by Kovalev and Petrou (1996). In particular, we propose to use the histogram of the relative orientations between tokens at fixed distance from each other. Several such 1D histograms could be constructed, one for each value of the relative distance between the compared tokens. In Section 2 we shall describe the proposed method for both shape description and texture characterisation, in Section 3 we shall present some experimental results and in Section 4 we shall present our conclusions.

2. The angle difference histogram

In order to use the orientation tokens for recognition, we have to proceed in three steps.

1. We first have to calculate these tokens. For silhouette representation we use as orientation tokens the local tangent vectors computed at each pixel of the outline of the silhouette. For this purpose we use the method described by Worring and Smeulders (1993), which uses two $m$-distant points of the contour chain. The tangent angle $\alpha$ at the $i$th pixel of the contour, with coordinate position $(x_i, y_i)$, is given by the formula

$$\alpha = \tan^{-1}\frac{y_{i+m} - y_{i-m}}{x_{i+m} - x_{i-m}}. \qquad (1)$$

For grey level images we use the gradient vectors calculated at each pixel position with the help of Sobel masks. Sobel masks are chosen because they are simple and computationally efficient. They apply minimum smoothing to the image, so that texture is not over-smoothed. This is particularly relevant when we are dealing with texture recognition. Besides, Davies (1990) has shown that with the Sobel filters one can calculate the angle of the local gradient quite accurately, with root mean square error equal to 0.73°.

2. Next we have to construct the co-occurrence matrix of (relative distance, relative orientation) of all available pairs of tokens. When we are describing the shape of a contour, the distance is measured by counting the pixels along the contour, counting as 1 the contribution of each transition of 4-connectivity and as $\sqrt{2}$ the contribution of each transition of 8-connectivity. (We always use the shortest distance along the contour between any two pixels, in order to avoid the dependence of the calculated distance on the starting point and the ordering of the pixels along the contour.) When dealing with texture characterisation we use the Euclidean metric for the calculation of the distance. As both variables take continuous values, the construction of such a histogram requires the quantisation of both of them. Depending on the range of relative distances considered and the number of bins used, this step could be quite computationally intensive. However, it is not necessary to consider a large range of distances. Depending on the task we want to perform, we may restrict ourselves to only a small range of values and perhaps only one or two bins for the distance variable. Fixing the distance to a single value, we obtain a 1D histogram instead of a 2D co-occurrence matrix. If this distance value is small, the histogram will contain local information concerning the described shape or texture. If the value is large, global information is conveyed.

3. The histograms/co-occurrence matrices that represent different objects have to be compared and their difference expressed by a single number if possible. This can be achieved with the help of a metric defined to measure the "distance" between two histograms. Such a metric could, for example, be the Bhattacharyya distance of two distributions, the Matusita distance, the $L_1$ or the $L_2$ norm, etc. Alternatively, each of the histograms may be described by a small number of features and the "distance" of the histograms may be quantified by the difference of one or two of these features. In this paper we make use of the $L_2$ norm; in other words, we calculate the sum of the squares of the differences of the corresponding bins. We also propose, as a feature that can discriminate between ordered and less ordered textures, the ratio of the values of the first two successive bins in the histogram. The latter, however, is only a good feature for the particular type of texture we are dealing with here. In general, the choice of the best features computed from the (relative distance, relative orientation) histograms to discriminate between different shapes or textures is application dependent and a matter of user choice guided by some training data.

The relative orientation between the tangents of a curve at two different points, as a function of the distance of the points along the curve, is the same as the curvature of the curve in the limit when the two points are very close to each other. Indeed, the curvature of a curve is defined as the rate of change of its slope. However, here we also consider the relative slope between two tangent vectors at large distances, not just in the limit of zero distance. So, global and local information can be incorporated in the same measure of similarity. The representation we use is not very different from the scale space representations, where the curve to be represented is progressively smoothed to lose details and its curvature is computed and used as the shape representation (Mokhtarian et al., 1996). Our approach is somewhat more general, as we use it also for texture discrimination, not only shape.

3. Experimental results

Three series of experiments were performed in order to test the proposed approach. First, the approach was tested for the problem of silhouette identification.
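For silhouettes, step 1 and the distance measure of step 2 in Section 2 can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the experiments: the function names, the default offset `m = 3` and the use of the quadrant-aware `arctan2` in place of the plain arctangent of Eq. (1) are assumptions made here, and the contour is assumed to be an ordered, closed, 8-connected chain of integer pixel coordinates.

```python
import numpy as np

def tangent_directions(contour, m=3):
    """Tangent direction (radians) at every pixel of a closed contour.

    contour: (n, 2) integer array of (x, y) positions, ordered along the outline.
    m: offset between the pixel and the two contour points used in Eq. (1)
       (the value 3 is an assumption; the text does not state it).
    """
    ahead = np.roll(contour, -m, axis=0)   # point i + m, wrapping around
    behind = np.roll(contour, m, axis=0)   # point i - m, wrapping around
    dx = ahead[:, 0] - behind[:, 0]
    dy = ahead[:, 1] - behind[:, 1]
    # arctan2 is a quadrant-aware variant of tan^-1(dy / dx) and avoids dx == 0.
    return np.arctan2(dy, dx)

def arc_positions(contour):
    """Arc-length position of every pixel and the total contour length.

    Horizontal/vertical transitions count as 1, diagonal ones as sqrt(2).
    """
    steps = np.roll(contour, -1, axis=0) - contour        # transition i -> i + 1
    diagonal = np.abs(steps).sum(axis=1) == 2             # |dx| = |dy| = 1
    lengths = np.where(diagonal, np.sqrt(2.0), 1.0)
    arc = np.concatenate(([0.0], np.cumsum(lengths)[:-1]))
    return arc, lengths.sum()

def contour_distance(arc, total_length, i, j):
    """Shortest distance along the closed contour between pixels i and j."""
    d = abs(arc[i] - arc[j])
    return min(d, total_length - d)

# Usage on a small axis-aligned square outline (16 pixels).
square = np.array([(x, 0) for x in range(4)] + [(4, y) for y in range(4)]
                  + [(x, 4) for x in range(4, 0, -1)] + [(0, y) for y in range(4, 0, -1)])
directions = tangent_directions(square, m=1)
arc, total = arc_positions(square)
```

Taking the shorter of the two ways round the contour makes the distance independent of the starting pixel and of the direction in which the chain is traversed, as required in step 2.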
For the silhouette experiments, a database of 958 binary images of fish was used (Mokhtarian et al., 1996). The length of each contour chain varied from 256 to 1653 pixels. The bounding rectangle of each fish varied from 16 to 380 pixels along one axis and from 23 to 526 pixels along the other axis. For each contour we constructed the (relative distance, relative orientation) histogram as follows: we used 10 bins of width 18° each for the angle variable and 15 bins, each of width 1/15 of half the total contour length, for the distance variable. Each row of the histogram, corresponding to a different value of the distance variable, was normalised separately so that its entries summed to 1. This prevents the characteristics of the contour at a certain distance (basically at a certain scale) from dominating the others. Fig. 1 shows some example shapes and their corresponding histograms.

It is possible to calculate features from these histograms to be used in class discrimination. For example, we define:

· The average contour smoothness $s$. Its values range between 0 and 1. It is the entry of the first bin of the histogram, i.e. the bin that corresponds to distance 1 ("next door neighbours") and to relative orientation 0 (maximum smoothness). This number varies between 0 and 1 because we normalise the histogram line by line, so that the entries of each line sum to 1. Maximum smoothness implies that all pairs of pixels that are next to each other have 0 relative orientation.

· The convexity $c$ of the contour. Its values range between 0 and 1. It is the value of the last bin of the histogram, which contains the pixel pairs that are at maximum distance from each other; for maximum convexity they must also have maximum difference in relative orientation. (The diametrically opposite points of a circle fall in this category: they are at maximum distance from each other and they have a relative orientation of 180°.)

· The contour circularity $r$. This is a measure of how different the contour is from a circle. We calculate it by comparing the histogram of a contour with the histogram of a circle of roughly the same size: we find the average absolute difference between the corresponding bins and subtract it from 1. If the average difference is 0, $r$ takes its maximum value 1.

Fig. 2 shows some typical shapes from the database and their corresponding values of the above three features.
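The histogram construction and the three features above can be sketched as follows. The sketch assumes that a directed tangent angle (in radians) and an arc-length position are already available for every contour pixel. The function names are hypothetical, and the relative orientation is taken here as the unsigned angle between directed tangents, so that it spans 0° to 180° in 10 bins of 18°; this is one reading of the text rather than a confirmed implementation detail.

```python
import numpy as np

def angle_distance_histogram(directions, arc, total_length,
                             n_dist_bins=15, n_angle_bins=10):
    """Row-normalised (relative distance, relative orientation) histogram.

    directions: directed tangent angle (radians) at each contour pixel.
    arc: arc-length position of each pixel along the closed contour.
    total_length: total length of the contour.
    """
    directions = np.asarray(directions, dtype=float)
    arc = np.asarray(arc, dtype=float)
    i, j = np.triu_indices(len(directions), k=1)      # all unordered pixel pairs

    # Shortest distance along the closed contour, at most half its length.
    d = np.abs(arc[i] - arc[j])
    d = np.minimum(d, total_length - d)

    # Unsigned angle between the two tangents, in [0, pi].
    a = np.abs(directions[i] - directions[j]) % (2 * np.pi)
    a = np.minimum(a, 2 * np.pi - a)

    half = total_length / 2.0
    d_bin = np.minimum((d / half * n_dist_bins).astype(int), n_dist_bins - 1)
    a_bin = np.minimum((a / np.pi * n_angle_bins).astype(int), n_angle_bins - 1)

    hist = np.zeros((n_dist_bins, n_angle_bins))
    np.add.at(hist, (d_bin, a_bin), 1)                # count the pairs per bin

    # Normalise every distance row separately; empty rows stay at zero.
    row_sums = hist.sum(axis=1, keepdims=True)
    return np.divide(hist, row_sums, out=np.zeros_like(hist), where=row_sums > 0)

def smoothness(hist):
    """s: first distance bin, first angle bin."""
    return hist[0, 0]

def convexity(hist):
    """c: last distance bin, last angle bin."""
    return hist[-1, -1]

def circularity(hist, circle_hist):
    """r: 1 minus the average absolute bin difference to a circle's histogram."""
    return 1.0 - np.abs(hist - circle_hist).mean()

# Usage: an ideal circle of radius 50 sampled at 360 points.
theta = np.linspace(0.0, 2 * np.pi, 360, endpoint=False)
circle_hist = angle_distance_histogram(theta + np.pi / 2, 50.0 * theta,
                                       2 * np.pi * 50.0)
s, c = smoothness(circle_hist), convexity(circle_hist)
```

For the ideal circle in this sketch, every pair in the first distance row falls in the first angle bin and every pair in the last distance row falls in the last angle bin, so both `s` and `c` evaluate to 1, in line with the description of maximum smoothness and maximum convexity.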


Fig. 1. (a,b) Synthetic and (c,d) real contours and their (relative distance; relative orientation) histograms.

Fig. 2. Some shapes and the values of their corresponding features computed from their (relative distance; relative orientation) histograms.

Fig. 3. Top: Pictorial queries to the fish database (985 entries). Underneath each query are the seven best matches obtained, with their similarity measures computed as the $L_2$ norm between the histograms compared. Each query is answered in real time.

Fig. 3 shows four examples of pictorially querying the fish database. The query is in the top row, and underneath it the seven best matches the system came up with are shown in order of computed similarity. The numbers given are the sums of the squares of the differences of the corresponding bins between the query histogram and the histogram of each of the images in the database. It turned out that this simple measure of similarity was very fast and adequate for searching the database in real time, and no extra features were needed.
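The database query described above amounts to a nearest-neighbour search under the sum of squared bin differences. A minimal sketch, assuming that the histograms of all database images have been precomputed and stacked in one array (the function name and the array layout are hypothetical):

```python
import numpy as np

def rank_by_l2(query_hist, database_hists, top_k=7):
    """Return the indices and scores of the top_k closest database entries.

    query_hist: (n_dist_bins, n_angle_bins) histogram of the query shape.
    database_hists: (n_images, n_dist_bins, n_angle_bins) stacked histograms.
    The score is the sum of squared differences of corresponding bins.
    """
    diffs = np.asarray(database_hists) - np.asarray(query_hist)
    scores = (diffs ** 2).sum(axis=(1, 2))
    order = np.argsort(scores, kind="stable")[:top_k]   # smallest score first
    return order, scores[order]

# Usage with three toy 2 x 2 histograms; the query is identical to entry 1.
database = np.array([[[1.0, 0.0], [0.0, 1.0]],
                     [[0.5, 0.5], [0.5, 0.5]],
                     [[0.0, 1.0], [1.0, 0.0]]])
best, best_scores = rank_by_l2(database[1], database, top_k=2)
```

Because each comparison is a single pass over a fixed number of bins (150 for the 15 x 10 layout used for the fish contours), a linear scan over a database of about a thousand entries is cheap, which is consistent with the real-time behaviour reported in the text.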

For a second set of experiments, a set of 20 grey images of leaves was used, 10 from each of two different classes (see Fig. 4). The task here was to discriminate the two classes on the basis of the orientation tokens calculated from the texture. We used $d = 2$, and only 3 bins of 30° width each for the angle histogram. (The relative orientation between any two gradient vectors was calculated modulo 90°.) By looking at the textures, it becomes apparent that what distinguishes them is the degree of order: the maple leaves appear better "organised", i.e. with more parallel structures, while the alder leaves have a more disorganised appearance. This characteristic can be captured by taking the ratio of the value of the first histogram bin to the value of the second bin. This feature, which we call $F_1$, is plotted in Fig. 5 against the identity of each leaf. As expected, it characterises each class adequately, because the alder leaves, with the more disorganised textures, are expected to have a higher occupancy number in the second histogram bin. If the second bin of the histogram is empty, this could be used as a cue by itself to discriminate some class. In the experiments we present here this was never the case.

Fig. 4. Example entries from the grey image leaf database. (a,b): maple; (c,d): alder.

Fig. 5. Feature $F_1$ computed from the images of maple (identity numbers 1–10) and alder (identity numbers 11–20).

In the third lot of experiments, we used 182 slices of brain images from healthy subjects and subjects suffering from various brain ailments. These slices come from Magnetic Resonance Imaging (MRI). Fig. 6 shows some example images. We constructed the histogram for each image with $d = 1$, as we are dealing with microtextures, and three bins for the relative angle. In Fig. 7 we plot the value of $F_1$ against the identity of each image. Note that although 182 different 2D images were in the database, they came from only six different 3D scans of subjects, two of whom were healthy, two suffering from Alzheimer's disease, one suffering from hematoma and one from Creutzfeldt–Jakob disease. It is remarkable how well the healthy subjects can be discriminated from the non-healthy ones on the basis of this feature.

Fig. 6. Example entries from the MRI-t2 slices database. (a,b): Normal1; (c,d): Normal2; (e,f): Alzheimer1; (g,h): Alzheimer2; (i,j): Chronic subdural hematoma; (k,l): Creutzfeldt–Jakob sufferer.

Fig. 7. Feature $F_1$ plotted against the identity of each image in the MRI slice database.

4. Conclusions

We presented here a method that exploits the orientation tokens that characterise a shape or an image for object identification. The method is appropriate for identifying objects from their silhouettes, or from their grey level texture. The approach was tested with the help of a large database of fish. Pictorial queries by shape were answered successfully using only the $L_2$ norm for histogram comparison. The accuracy of the answer can be improved if more features are computed from the histogram. Some example features for this purpose were proposed.

The method was also successfully tested for the identification of textures in leaves, using as feature a measure of the "order" that appears to be present in a texture. It was proposed that this can be expressed by the ratio of the number of orientation tokens that are more or less parallel to each other to the number of tokens that form larger angles with respect to each other. The pairs of tokens considered are at fixed distance from each other. For a quick search of an image database this produces an efficient method of sorting images "at a glance". For a more detailed description, one could use many different ranges of relative distance and thus produce a 2D angle–distance co-occurrence matrix for the description of the object.

The use of histograms for such a task is ideal, as histograms are rotation and translation invariant. On top of that, they are scale invariant as well, because we consider all possible pairs of tokens and bin their distances in a fixed number of bins, independent of the actual range of distances contained in each bin. Finally, the use of orientation tokens is independent of any illumination and grey level changes, something of major importance when scanning a collection of images that might have been captured under varied conditions.

As a corollary of this investigation, we have shown some evidence that the MRI images of healthy subjects show more organised textures than the MRI images of patients suffering from various brain disorders.

References

Arkin, E., Chew, L., Huttenlocher, D., Kedem, K., Mitchell, J., 1991. An efficient computable metric for computing polygonal shapes. PAMI 13, 209–215.
Basri, R., Costa, L., Geiger, D., Jacobs, D., 1995. Determining the similarity of deformable shapes. In: Workshop on Physics-based Modelling in Computer Vision, Boston, 18–19 June, pp. 135–143.
Chetverikov, D., 1981. Textural anisotropy features for texture analysis. In: Proceedings of IEEE Conference on PRIP, Dallas, 3–5 August, pp. 583–588.

Cootes, T.F., Hill, A., Taylor, C.J., Haslam, J., 1994. The use of active shape models for locating structures in medical images. Image and Vision Computing 12, 355–366.
Davies, E.R., 1990. Machine Vision: Theory, Algorithms, Practicalities. Academic Press, New York.
Gorkani, M.M., Picard, R.W., 1994. Texture orientation for sorting photos at a glance. In: Proceedings of the 12th ICPR, Vol. 1, Jerusalem, Israel, 9–13 October, pp. 459–464.
Haddon, J.F., Boyce, J.F., 1990. Image segmentation by unifying region and boundary information. IEEE Trans. PAMI 12, 929–948.
Haddon, J.F., Boyce, J.F., 1993. Co-occurrence matrices for image analysis. Electronics Commun. Eng. J. 4, 71–83.
Hasegawa, J., Toriwaki, J., 1992. A new filter for feature extraction of line pattern textures with application to cancer detection. In: Proceedings of the 11th ICPR, The Hague, The Netherlands, August–September, Vol. 3, pp. 352–355.
Kovalev, V.A., Chizhik, S.A., 1993. On the orientation structure of solid surfaces. J. Friction and Wear 14, 45–54.
Kovalev, V.A., Petrou, M., 1996. Multidimensional co-occurrence matrices for object recognition and matching. Graphical Models and Image Processing 58, 187–197.
Marr, D., 1982. Vision. W.H. Freeman and Company, ISBN 0-7167-1567-8.
Mokhtarian, F., Abbasi, F., Kittler, J., 1996. Robust and efficient shape indexing through curvature scale space. In: Proceedings of the 6th British Machine Vision Conference, BMVC'96, Edinburgh, 10–12 September, pp. 53–62.
Mumford, D., 1991. Mathematical theories of shape: Do they model perception? In: Geometric Methods in Computer Vision. SPIE 1570, 2–10.
Rao, A.R., 1990. A classification scheme for visual defects arising in semiconductor wafer inspection. J. Crystal Growth 103, 398–406.
Schwartz, J., Sharir, M., 1987. Identification of partially obscured objects in 2 and 3 dimensions by matching noisy characteristic curves. Internat. J. Robotic Res. 6, 29–44.
Worring, M., Smeulders, A.W., 1993. Digital curvature estimation. CVGIP: Image Understanding 58, 366–382.
Zucker, S., Rosenfeld, A., Davies, L., 1975. Picture segmentation by texture discrimination. IEEE Trans. Comput. C-24, 1228–1233.