Coarse-Grained Entropy Rates for Characterization of Complex Time Series

Milan Palus

SFI WORKING PAPER: 1994-06-040

SFI Working Papers contain accounts of scientific work of the author(s) and do not necessarily represent the views of the Santa Fe Institute. We accept papers intended for publication in peer-reviewed journals or proceedings volumes, but not papers that have already appeared in print. Except for papers by our external faculty, papers must be based on work done at SFI, inspired by an invited visit to or collaboration at SFI, or funded by an SFI grant. ©NOTICE: This working paper is included by permission of the contributing author(s) as a means to ensure timely distribution of the scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the author(s). It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may be reposted only with the explicit permission of the copyright holder. www.santafe.edu

SANTA FE INSTITUTE

Coarse-grained entropy rates for characterization of complex time series

Milan Palus

Santa Fe Institute, 1660 Old Pecos Trail, Suite A, Santa Fe, NM 87505, USA
E-mail: [email protected]

June 24, 1994

Abstract

A method for estimating coarse-grained entropy rates (CER's) from time series is presented, based on information-theoretic functionals - redundancies. The CER's are relative measures of regularity and predictability, and for data generated by dynamical systems they are related to the Kolmogorov-Sinai entropy. A deterministic dynamical origin of the data under study, however, is not a necessary condition for the use of the CER's, since entropy rates can be defined for stochastic processes as well. The sensitivity of the CER's to changes in the dynamics of the data is tested on numerically generated time series resulting from both deterministic-chaotic and stochastic processes. A potential application of the CER's in the analysis of electrophysiological signals or other complex time series is demonstrated by an example from a pharmaco-EEG study.

1 Introduction

A number of descriptive measures, such as dimensions or Lyapunov exponents, have been developed for the characterization of complex time series, based on concepts from nonlinear dynamics and the theory of deterministic chaos [1, 2, 3, 4]. These measures have well-defined meanings when the analyzed data were indeed generated by a low-dimensional deterministic system. When analyzing experimental time series, such as those in biology and medicine, the underlying dynamical mechanisms are usually unknown, and, due to the questionable reliability of dimensional and Lyapunov-exponent algorithms applied to short and noisy data, results that seem to support the hypothesis of low-dimensional chaos cannot be accepted without reservations. Some authors [5, 6, 7, 8] no longer insist on interpreting their results, such as finite dimension estimates, as evidence for underlying low-dimensional chaos, but propose these measures, mostly the correlation dimension (CD) [1, 2], for the relative characterization of different datasets. In the case of physiological time series, recorded in different physiological states, these measures are proposed to characterize the physiological states of an organism or its parts. Even though these authors demonstrate that such dimensional estimates are able to distinguish time series recorded in different experimental conditions, considering that the underlying processes can be high-dimensional or stochastic, the low numbers obtained from dimensional algorithms are probably spurious and have no theoretically justified meaning or interpretation. Moreover, it can hardly be established how robust or sensitive measures like the CD are when applied to the relative characterization of processes whose dimensionality can be effectively infinite.

In this paper we propose alternative measures, called coarse-grained entropy rates (CER's), which are easy to compute and whose application does not involve the above theoretical and practical problems. For data generated by chaotic dynamical systems the CER's are related to the Kolmogorov-Sinai entropy. A deterministic dynamical origin of the data under study, however, is not a necessary condition for the use of the CER's, since entropy rates can be defined for stochastic processes as well. We argue, however, that the exact entropy rates cannot be computed in most experimental situations. Therefore we do not estimate the limit values given in the definitions of the exact entropy rates, or the Kolmogorov-Sinai entropy in the case of an underlying dynamical system; rather, we propose coarse-grained quantities which are related to the exact entropy rates but whose values can depend on particular experimental and numerical conditions. Thus the CER's are not meant as absolute quantities able to classify systems in general, or to identify chaotic systems, but as relative quantities for the comparison of datasets recorded in the same experimental conditions and processed using the same numerical procedures. On the other hand, the CER's have the same theoretical interpretation as the exact entropy rates: they are (relative) measures of regularity and predictability, i.e., if one dataset gives a higher CER than another, the former is more irregular and less predictable than the latter, and, as we demonstrate below, the exact entropy rates (or Kolmogorov-Sinai entropies) of the underlying processes are in the same relation.

The CER's proposed in this paper are computed from information-theoretic functionals - redundancies - which, together with entropies and mutual information, are introduced in Sec. 2. Further details can be found in Refs. [9, 10, 11] and references therein. The exact entropy rates and the Kolmogorov-Sinai entropy are briefly reviewed in Sec. 3; for more details we refer to [9, 11, 12, 13, 14, 15]. In Section 4 we define the coarse-grained entropy rates. Their numerical properties, sensitivity to changes in the dynamics underlying the analyzed data, and robustness against additive noise and some transformations of the data are studied in Sec. 5. A potential application of the CER's in the analysis of electrophysiological signals or other complex time series is demonstrated by an example from a pharmaco-EEG study in Sec. 6. The conclusion is given in Sec. 7.

2 Entropy, information and redundancy

Let X be a discrete random variable with a set of values ("alphabet") Ξ and probability mass function p(x) = Pr{X = x}, x ∈ Ξ. We denote the probability mass function by p(x) rather than p_X(x) for convenience; thus p(x) and p(y) refer to two different random variables and are in fact different probability mass functions, p_X(x) and p_Y(y), respectively. The entropy H(X) of a discrete random variable X is defined by

$$H(X) = -\sum_{x \in \Xi} p(x) \log p(x). \qquad (1)$$

If the logarithm is to the base 2, the entropy is measured in bits; if the natural logarithm is used, the entropy is measured in nats. For a pair of discrete random variables X and Y with a joint distribution p(x, y), the joint entropy H(X, Y) is defined as

$$H(X,Y) = -\sum_{x \in \Xi} \sum_{y \in \Upsilon} p(x,y) \log p(x,y). \qquad (2)$$

The conditional entropy H(Y|X) of Y given X is defined as

$$H(Y|X) = \sum_{x \in \Xi} p(x) H(Y|X=x) = -\sum_{x \in \Xi} p(x) \sum_{y \in \Upsilon} p(y|x) \log p(y|x) = -\sum_{x \in \Xi} \sum_{y \in \Upsilon} p(x,y) \log p(y|x). \qquad (3)$$

The average amount of common information contained in the variables X and Y is quantified by the mutual information I(X; Y), defined as

$$I(X;Y) = H(X) + H(Y) - H(X,Y). \qquad (4)$$

The equalities

$$I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X) \qquad (5)$$

can be easily derived. The joint entropy of n variables X_1, ..., X_n with the joint distribution p(x_1, ..., x_n) is defined as

$$H(X_1, \dots, X_n) = -\sum_{x_1 \in \Xi_1} \cdots \sum_{x_n \in \Xi_n} p(x_1, \dots, x_n) \log p(x_1, \dots, x_n). \qquad (6)$$

The redundancy R(X_1; ...; X_n) quantifies the average amount of common information contained in the n variables X_1, ..., X_n and can be defined as a straightforward generalization of (4):

$$R(X_1; \dots; X_n) = H(X_1) + \dots + H(X_n) - H(X_1, \dots, X_n). \qquad (7)$$

Besides (7), the marginal redundancy ϱ(X_1, ..., X_{n-1}; X_n), quantifying the average amount of information about the variable X_n contained in the variables X_1, ..., X_{n-1}, can be defined as

$$\varrho(X_1, \dots, X_{n-1}; X_n) = H(X_1, \dots, X_{n-1}) + H(X_n) - H(X_1, \dots, X_n). \qquad (8)$$

The relations

$$R(X_1; \dots; X_n) = R(X_1; \dots; X_{n-1}) + \varrho(X_1, \dots, X_{n-1}; X_n) \qquad (9)$$

and

$$\varrho(X_1, \dots, X_{n-1}; X_n) = H(X_n) - H(X_n | X_1, \dots, X_{n-1}) \qquad (10)$$

can be derived by simple manipulation.
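To make these functionals concrete, the following sketch shows plug-in (histogram) estimates of the entropy, mutual information and redundancies for already-discretized data. It is an illustration only, not the author's code; the function names are ours, and Python with NumPy is assumed.

```python
import numpy as np

def joint_entropy(*cols):
    """Plug-in estimate of H(X1,...,Xn) in nats, Eq. (6): count the joint
    states of the discretized columns and evaluate -sum p log p."""
    states = np.stack(cols, axis=1)
    _, counts = np.unique(states, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), Eq. (4)."""
    return joint_entropy(x) + joint_entropy(y) - joint_entropy(x, y)

def redundancy(*cols):
    """R(X1;...;Xn) = sum_i H(Xi) - H(X1,...,Xn), Eq. (7)."""
    return sum(joint_entropy(c) for c in cols) - joint_entropy(*cols)

def marginal_redundancy(*cols):
    """rho(X1,...,X_{n-1}; Xn), Eq. (8)."""
    return (joint_entropy(*cols[:-1]) + joint_entropy(cols[-1])
            - joint_entropy(*cols))
```

For two columns, redundancy(x, y) and mutual_information(x, y) coincide, as (7) generalizes (4).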

3 Entropy rates

Let {X_i} be a stochastic process, i.e., an indexed sequence of random variables, characterized by the joint probability mass function p(x_1, ..., x_n) = Pr{(X_1, ..., X_n) = (x_1, ..., x_n)}, (x_1, ..., x_n) ∈ Ξ_1 × ... × Ξ_n. The entropy rate of {X_i} is defined as

$$h = \lim_{n \to \infty} \frac{1}{n} H(X_1, \dots, X_n). \qquad (11)$$

In the following we consider the process under study to be stationary and ergodic. Due to stationarity the limit in (11) exists, and the entropy rate (11) is also equal to

$$h = \lim_{n \to \infty} H(X_n | X_1, \dots, X_{n-1}). \qquad (12)$$

Considering (10) we can write

$$H(X_n | X_1, \dots, X_{n-1}) = H(X_n) - \varrho(X_1, \dots, X_{n-1}; X_n). \qquad (13)$$

Due to stationarity,

$$h = \lim_{n \to \infty} \bigl[ H(X_n) - \varrho(X_1, \dots, X_{n-1}; X_n) \bigr]. \qquad (14)$$

The way from the entropy rate of a stochastic process to the Kolmogorov-Sinai entropy (KSE) of a dynamical system is straightforward, due to the fact that any stationary stochastic process corresponds to a measure-preserving dynamical system, and vice versa [14]. For the definition of the KSE we can then consider the above equations (11) or (12); the variables X_i, however, should be understood as m-dimensional variables, according to the dimensionality of the dynamical system. If the dynamical system evolves on a continuous (probability) measure space, then any entropy depends on the particular partition chosen to discretize the space, and the KSE is defined as a supremum over all finite partitions [12, 14, 15].

In a typical application one deals with a time series {y(t)}, considered as a particular realization of a stochastic process {Y(t)}. Then, due to ergodicity, all the subsequent information-theoretic functionals are estimated using time averages instead of ensemble averages, and the variables X_i are substituted as

$$x_i = y(t + (i-1)\tau). \qquad (15)$$

Due to stationarity the marginal redundancies

$$\varrho^n(\tau) \equiv \varrho\bigl(y(t), y(t+\tau), \dots, y(t+(n-2)\tau);\; y(t+(n-1)\tau)\bigr) \qquad (16)$$

are functions of n and τ, independent of t. Note that

$$\varrho^2(\tau) = I\bigl(y(t);\, y(t+\tau)\bigr). \qquad (17)$$

Then, considering (14), the entropy rate of {y(t)} can be written as

$$h(\tau) = \lim_{n \to \infty} \bigl[ H_1 - \varrho^n(\tau) \bigr], \qquad (18)$$

where H_1 = H(y(t)) is the entropy of a single variable. For the entropy rate h_1 per time unit the following equation holds [15]:

$$h(\tau) = |\tau|\, h_1. \qquad (19)$$

Possibilities to compute the entropy rates from data are limited to a few exceptional cases: for stochastic processes it is possible, e.g., for finite-state Markov chains [9]. In the case of a dynamical system on a continuous measure space the KSE can, in principle, be reliably estimated if the system is low-dimensional and a large amount of (practically noise-free) data is available. In such a case, Fraser [11] proposed to estimate the KSE of a dynamical system from the asymptotic behavior of the marginal redundancy, computed from a time series generated by the dynamical system:

$$\lim_{n \to \infty} \varrho^n(\tau) = H_1 - |\tau|\, h. \qquad (20)$$

It was shown [11, 16] that if the underlying dynamical system is m-dimensional and the marginal redundancy ϱ^n(τ) is estimated using a partition fine enough (to attain a so-called generating partition [11, 12, 15]), then the asymptotic behavior

$$\varrho^n(\tau) \approx H_1 - |\tau|\, h \qquad (21)$$

is attained already for n = m+1, m+2, ..., for some range of τ. This is illustrated in Fig. 1a, where the marginal redundancies ϱ^n(τ), n = 2-5, for the chaotic Lorenz system [17] are presented. The Lorenz system is three-dimensional, and ϱ^n(τ) for n = 4 and 5 and lags τ = 5-40 (approximately) is close to a linearly decreasing function, so that the KSE h can be estimated as its slope according to (21). To obtain such a result, however, a relatively fine partition and an adequate amount of data must be used. In the case presented in Fig. 1a, the time series length N was one million samples and the partition was based on Q = 16 equiquantal marginal boxes (Footnote 1). If the same partition (Q = 16) is used for a shorter time series, N = 16,384, the results are distorted (Fig. 1b). The reasons for this distortion are discussed in [16], where the following requirement is also proposed for the effective series length N (Footnote 2), necessary for the estimation of the n-dimensional redundancy using Q equiquantal marginal boxes:

$$N \geq Q^{n+1}, \qquad (22)$$

otherwise the results are distorted as in the example in Fig. 1b. The adequate partition (Q = 5 for N = 16,384, Fig. 1c) is not fine enough to attain the asymptotic behavior (21), and no linearly decreasing region of ϱ^n(τ) as a function of τ is detected. This means that, having a limited amount of data, the KSE cannot be estimated even approximately. Similar restrictions hold also for other methods for the estimation of the KSE or the Lyapunov exponents [19].

Footnote 1: We estimate the redundancies by a box-counting method adaptive in one dimension, i.e., the marginal boxes are defined in such a way that there is approximately the same number of points in each marginal box. Thus a partition is defined by the number Q of equiquantal marginal boxes. For the embedding dimension n the total number of partition boxes is then Q^n. For more details see [16, 18, 20].

Footnote 2: The effective series length N is N = N_0 - (n-1)τ, where N_0 is the total series length, n is the embedding dimension and τ is the time delay used in the estimation of ϱ^n(τ).
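For the exceptional computable case mentioned above, the entropy rate of a stationary finite-state Markov chain has the closed form h = -Σ_i μ_i Σ_j P_ij log P_ij, with μ the stationary distribution [9]. A minimal sketch (our own illustration, assuming NumPy):

```python
import numpy as np

def markov_entropy_rate(P):
    """Entropy rate h = -sum_i mu_i sum_j P_ij log P_ij (nats per step)
    of a stationary Markov chain with transition matrix P (rows sum to 1)."""
    vals, vecs = np.linalg.eig(P.T)
    mu = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])  # left eigenvector for eigenvalue 1
    mu = mu / mu.sum()                                    # stationary distribution
    logP = np.log(np.where(P > 0, P, 1.0))                # zero entries contribute zero
    return -np.sum(mu[:, None] * P * logP)

# Example: a two-state chain
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
print(markov_entropy_rate(P))
```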

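The adaptive box-counting of Footnote 1 can be sketched as follows: each sample is mapped to one of Q equal-occupancy ("equiquantal") bins via its rank, and ϱ^n(τ) is assembled from the entropies of the resulting symbol columns. Again an illustration under our own naming, not the author's implementation; it reuses joint_entropy() from the sketch in Sec. 2.

```python
import numpy as np

def equiquantize(x, Q):
    """Map each sample to one of Q equiquantal marginal boxes, i.e. bins
    containing approximately the same number of points (Footnote 1)."""
    ranks = np.argsort(np.argsort(x))
    return (ranks * Q) // len(x)

def marginal_redundancy_ts(y, n, tau, Q):
    """rho^n(tau) of a time series y, Eqs. (8), (15), (16): delay columns
    y(t), y(t+tau), ..., y(t+(n-1)tau) built from the equiquantized series.
    The effective length is N0 - (n-1)*tau (Footnote 2).
    Requires joint_entropy() from the Sec. 2 sketch."""
    labels = equiquantize(y, Q)
    N = len(y) - (n - 1) * tau
    cols = [labels[i * tau : i * tau + N] for i in range(n)]
    return (joint_entropy(*cols[:-1]) + joint_entropy(cols[-1])
            - joint_entropy(*cols))
```

For the estimate to be meaningful, the condition N ≥ Q^(n+1) of Eq. (22) should be respected.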

4 Coarse-grained entropy rates

In experimental practice the analyzed time series are usually short and contaminated by noise, so even if they resulted from low-dimensional chaotic processes, estimation of their KSE is practically impossible. And in many experiments the actual dynamical mechanism underlying the analyzed data is unknown and can be either high-dimensional deterministic or stochastic. As we pointed out above, unlike the dimensions or Lyapunov exponents, the entropy rates are meaningful quantities for the characterization of stationary processes irrespective of their origin. The problem is, however, that the exact entropy rate of a process usually cannot be estimated. In order to utilize the concept of the entropy rates in time-series analysis, we propose to give up the effort of estimating the exact entropy rates and rather define "coarse-grained entropy rates" (CER's), which are not meant as estimates of the exact entropy rates, but as quantities which can depend on a particular experimental and numerical set-up, yet which have the same meaning as the exact entropy rates, i.e., can be used as measures of regularity and predictability of the analyzed time series in the relative sense: two or several datasets can be compared according to their regularity and predictability, provided they were measured in the same experimental conditions and their CER's were estimated using the same numerical parameters, defined and discussed below.

The CER's are coarse-grained in space, and their estimation is limited to finite time:

1. Space: If a time series resulted from a process evolving on a continuous measure space, the partition used in estimating the CER is only as fine as the series length allows (Eq. 22).

2. Time: The limit n → ∞ is ignored, and the n used again depends on the series length. In many applications n = 2 or 3 is used. Also the range of τ is limited (see the definitions below).

The most straightforward definition of a CER can be based on (21):

$$h^{(0)} = \frac{\varrho^n(\tau_0) - \varrho^n(\tau_1)}{\tau_1 - \tau_0}. \qquad (23)$$

This definition (Footnote 3), further referred to as the CER h^(0), can be heavily influenced by the choice of τ_0 and τ_1, since formula (23) is directly related to the method of estimating the KSE from the linearly decreasing marginal redundancies ϱ^n(τ) (Fig. 1a), which, as we argued above, cannot be obtained in most experimental applications. When ϱ^n(τ) is estimated from an experimental time series which is short and/or high-dimensional or stochastic, the marginal redundancies ϱ^n(τ) do not decrease linearly, but in an exponential or power-law way (Fig. 1c), or even non-monotonically: Fig. 1d presents ϱ^n(τ) of a human electroencephalogram (EEG), in which a long-term decrease is modulated by faster (about 10 Hz) oscillations. Thus the CER h^(0) depends on the choice of τ_0 and τ_1, and there is no criterion for finding "the best" τ's.

Footnote 3: For the particular choice τ_0 = 0 this definition is related to the approximate entropy (ApEn) introduced by Pincus et al. [21]; the ApEn, however, is estimated from correlation integrals [22], while the CER h^(0) is computed from the marginal redundancy estimated by an adaptive box-counting method (see Footnote 1). One can also estimate the (generalized) redundancies using the correlation integrals [23]; we have not, however, explored this possibility yet.

For an alternative definition of the CER we consider the following properties of ϱ^n(τ):

1. For an ergodic process the marginal redundancy ϱ^n(τ) → 0 for τ → ∞.

2. In a finite-precision computation, the (coarse-grained) marginal redundancy of an ergodic process decreases to zero at a finite τ, and the integral ∫ ϱ^n(τ) dτ is finite.

3. The integral ∫ ϱ^n(τ) dτ (the area under the curve) depends on the particular entropy rate of the process under study.

Then, in a particular application, we compute the marginal redundancies ϱ^n(τ) for all analyzed datasets and find τ_max for which ϱ^n(τ_max) ≈ 0 for all the datasets. Then we define a norm of the marginal redundancy

$$\|\varrho^n\| = \frac{\sum_{\tau=\tau_0}^{\tau_{max}} \varrho^n(\tau)}{\tau_{max} - \tau_0}. \qquad (24)$$

In experimental applications the lags τ are discrete, and thus the integral ∫ ϱ^n(τ) dτ is substituted by the sum in (24). The lag τ_0 is usually set to zero. Having defined the norm ||ϱ^n||, the difference ϱ^n(τ_0) - ||ϱ^n|| can be considered as an alternative definition of the CER. We have found, however, that a definition of the CER which does not depend on the absolute values of ϱ^n(τ) has better numerical properties, namely that the estimates are more stable and less influenced by additive noise. Thus, we define the CER h^(1) as

$$h^{(1)} = \frac{\varrho^n(\tau_0) - \|\varrho^n\|}{\|\varrho^n\|}. \qquad (25)$$
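Given the estimated curve ϱ^n(τ) on a grid of integer lags, both coarse-grained rates are then a few lines of code. A sketch, assuming the curve is stored in a NumPy array indexed by lag:

```python
import numpy as np

def cer_h0(rho, tau0=0, tau1=1):
    """h(0) of Eq. (23); rho[t] is the marginal redundancy at lag t."""
    return (rho[tau0] - rho[tau1]) / (tau1 - tau0)

def cer_h1(rho, tau0=0, tau_max=None):
    """h(1) of Eq. (25), with the norm of Eq. (24): the sum of rho over
    lags tau0..tau_max divided by (tau_max - tau0)."""
    if tau_max is None:
        tau_max = len(rho) - 1
    norm = np.sum(rho[tau0 : tau_max + 1]) / (tau_max - tau0)
    return (rho[tau0] - norm) / norm

# Usage with the estimator sketched in Sec. 3:
# rho = np.array([marginal_redundancy_ts(y, n=2, tau=t, Q=16)
#                 for t in range(101)])
# h0, h1 = cer_h0(rho), cer_h1(rho, tau_max=100)
```

Note that a faster decay of ϱ^n(τ) yields a smaller norm and hence a larger h^(1), in accordance with the interpretation of a higher CER as higher irregularity.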


5 Properties of the CER's - numerical examples

Consider an autoregressive process (ARP) given as

$$y_t = c \sum_{k=1}^{10} a_k y_{t-k} + \sigma e_t, \qquad (26)$$

where a_{k=1,...,10} = 0, 0, 0, 0, 0, .19, .2, .2, .2, .2, σ = 0.01, and e_t are Gaussian deviates with zero mean and unit variance. For c = 1 this ARP has a long coherence time [20]; for c < 1 the coherence time is decreased and the entropy rate increased. In particular, we can generate a number of ARP's with different c's and thus different entropy rates. The entropy rates of such ARP's should monotonically decrease with increasing c. Figure 2 presents the coarse-grained entropy rates for 100 ARP's with c increasing from 0.5 to 0.9. For the estimation of h^(0) we set τ_0 = 0 and τ_1 = 1 (sample); for h^(1), τ_0 = 0 and τ_max = 100 (samples) were set. The results in Figs. 2a,b were obtained using the series length N = 16,384 samples and Q = 16 equiquantal marginal levels; in Figs. 2c,d, N = 1,024 and Q = 8 were used. The embedding dimension n = 2 was used. The CER h^(1), estimated from 16K samples and Q = 16 (Fig. 2b), exhibits a smooth monotonic decrease, as expected. Using shorter time series (N = 1K, Q = 8, Fig. 2d) the estimates are less stable, i.e., fluctuations from the smooth monotonic curve occur. The estimates of the CER h^(0) seem less stable than those of h^(1) using the same numerical parameters (cf. Figs. 2a and 2b for the 16K estimates and 2c and 2d for the 1K estimates). The generation of this ARP family is sketched below.
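A sketch of generating the ARP family of Eq. (26); illustrative only, with the burn-in length and seeding being our own choices:

```python
import numpy as np

def generate_arp(c, n_samples, sigma=0.01, burn_in=1000, seed=None):
    """Simulate y_t = c * sum_{k=1}^{10} a_k y_{t-k} + sigma * e_t, Eq. (26),
    with a_1..a_10 = 0, 0, 0, 0, 0, .19, .2, .2, .2, .2."""
    a = np.array([0, 0, 0, 0, 0, .19, .2, .2, .2, .2])
    rng = np.random.default_rng(seed)
    y = np.zeros(n_samples + burn_in)
    for t in range(10, len(y)):
        # y[t-1], y[t-2], ..., y[t-10] paired with a_1, ..., a_10
        y[t] = c * np.dot(a, y[t-10:t][::-1]) + sigma * rng.standard_normal()
    return y[burn_in:]  # discard transients

# One hundred processes with c increasing from 0.5 to 0.9:
series = [generate_arp(c, 16384) for c in np.linspace(0.5, 0.9, 100)]
```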

Another example for the study of the behavior of the CER's is the chaotic baker transformation:

$$x_{n+1} = \lambda x_n, \qquad y_{n+1} = \frac{y_n}{a} \qquad \text{for } y_n \le a,$$

or

$$x_{n+1} = \frac{1}{2} + \lambda x_n, \qquad y_{n+1} = \frac{y_n - a}{1 - a} \qquad \text{for } y_n > a; \qquad (27)$$

here 0 ≤ x_n, y_n ≤ 1, 0 < a < 1, and λ was set to λ = 0.25. For this system the positive Lyapunov exponent, or, equivalently, the Kolmogorov-Sinai entropy, can be expressed analytically as a function of the parameter a [24, 25]:

$$h(a) = a \log\frac{1}{a} + (1-a) \log\frac{1}{1-a}. \qquad (28)$$
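For completeness, a sketch of iterating the map of Eq. (27) and evaluating the exact entropy of Eq. (28). This is an illustration; the initial condition, seeding, and the choice of the y-coordinate as the output series are our assumptions, not specified by the source.

```python
import numpy as np

def baker_series(a, n_samples, lam=0.25, seed=None):
    """Iterate the generalized baker map, Eq. (27), from a random initial
    point and return the y-coordinate time series."""
    rng = np.random.default_rng(seed)
    x, y = rng.random(), rng.random()
    ys = np.empty(n_samples)
    for i in range(n_samples):
        if y <= a:
            x, y = lam * x, y / a
        else:
            x, y = 0.5 + lam * x, (y - a) / (1 - a)
        ys[i] = y
    return ys

def exact_kse(a):
    """Kolmogorov-Sinai entropy h(a), Eq. (28), in nats."""
    return a * np.log(1 / a) + (1 - a) * np.log(1 / (1 - a))
```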

Thus we can generate a number of chaotic time series with different positive Lyapunov exponents and compare the behavior of the CER's with the exact dependence of the positive Lyapunov exponent (or, equivalently, the KSE) on the parameter a, displayed in Fig. 3a. Figures 3b-g display the same dependence for the CER's h^(0) (Figs. 3b,d,f) and h^(1) (Figs. 3c,e,g), estimated using different time series lengths N and numbers Q of equiquantal marginal levels: N = 16K and Q = 16 (Figs. 3b,c), N = 1K and Q = 8 (Figs. 3d,e), and N = 256 and Q = 4 (Figs. 3f,g). The embedding dimension used was n = 2; the lags were τ_0 = 0, τ_1 = 1 and τ_max = 100. The CER h^(1) for N = 16K and Q = 16 (Fig. 3c) mimics very well the dependence of the positive Lyapunov exponent on the parameter a. Similarly as in the above case of the ARP, the results obtained from shorter time series are less stable for both h^(0) and h^(1), while for longer time series the CER h^(1) seems to perform better in comparing different time series than the CER h^(0); namely, the fluctuations from the expected curve are larger in the case of h^(0). Using a larger τ_1 (τ_1 = 2 or 4) can slightly improve the estimates of h^(0) (i.e., decrease the fluctuations); however, in experimental practice, where there is no standard to compare the CER's with (like Fig. 3a in this case), there is also no method for choosing an ideal τ_1. Therefore we present here the results obtained using the simplest choice τ_1 = 1.

The above examples demonstrate that the CER's h^(0) and h^(1) can distinguish time series with different entropy rates. The CER h^(0) is measured in bits or nats per time unit; the CER h^(1) is a dimensionless quantity. As was stressed above, neither h^(0) nor h^(1) is meant as an estimate of the exact entropy rate or the Kolmogorov-Sinai entropy, but as a measuring tool for the relative comparison of different datasets. The CER's, computed from the marginal redundancy, are invariant against linear transformations of the data, and, due to the way the marginal redundancy is estimated (marginal "equiquantization", see Footnote 1), monotonous nonlinear transformations also do not change the results, at least in the relative sense (relative comparison of different datasets). This is illustrated in Fig. 4, where the CER's h^(0) (Figs. 4a,c) and h^(1) (Figs. 4b,d) were computed from the same data as in Fig. 3, but passed through quadratic (Figs. 4a,b; the data are positive, and thus the quadratic transformation is monotonous in this case) and cubic (Figs. 4c,d) nonlinear transformations, N = 16K, Q = 16. The results in Figs. 4a,c and 4b,d are very similar to those obtained from the original data, presented in Figs. 3b and 3c, respectively.

In experimental practice data are usually contaminated by noise. The influence of additive Gaussian noise on the estimates of the CER's (N = 16K, Q = 16) is illustrated in Fig. 5. The amounts 10% (Figs. 5a,b) and 50% (Figs. 5c,d) mean that the variance of the additive Gaussian noise is 10% and 50%, respectively, of the variance of the original noise-free data. The additive noise increases the values of the CER's; however, the relative comparison of different datasets is not changed.
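The invariance under monotonous transformations is a direct consequence of the equiquantization: a strictly increasing transformation leaves the ranks, and hence the equiquantal bin assignments, unchanged. A quick check with the equiquantize sketch from Sec. 3 (our illustration):

```python
import numpy as np

x = np.random.default_rng(1).random(10000)      # positive data in (0, 1)
for transform in (lambda u: u**2, lambda u: u**3):
    # bin labels, and therefore all redundancy estimates, are identical
    assert np.array_equal(equiquantize(x, 16), equiquantize(transform(x), 16))
```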

6 Applications of the CER's - an example from electrophysiology

In practical applications the choice between h^(0) and h^(1) depends on the available amount of data. Since we have demonstrated that, for longer time series, the CER h^(1) can have better numerical properties than h^(0), and that, unlike h^(0), h^(1) is independent of the choice of the time lag τ_1 (τ_0 is usually set to 0, τ_max is given by the data), we prefer the CER h^(1) whenever the amount of available data is sufficient for estimating the marginal redundancy ϱ^n(τ) over a range of lags large enough to attain the lag τ_max, ϱ^n(τ_max) ≈ 0, at least for n = 2. When analyzing short time series, where τ_max is comparable with the series length and computing ϱ^n(τ) for a large range of τ's can drastically decrease the effective series length (see Footnote 2), the CER h^(0) can be the better choice: computing h^(0) with n = 2, τ_0 = 0 and τ_1 = 1, the effective series length is N_total - 1, so that the maximum available effective series length is used, securing the maximum available reliability and stability of the results. In spite of the invariance of the CER's against linear and some nonlinear transformations (as discussed above), we propose to apply the CER's as relative measures, i.e., for the comparison of datasets recorded in the same experimental conditions and processed using the same numerical parameters.

A demonstrative example is provided by data from a pharmaco-EEG study (Footnote 4). The electroencephalogram (EEG) of a healthy human volunteer was recorded before and several times after (0.5, 1, 2, 3, 4 and 6 hours) a dose of alcohol was administered. The concentration of ethanol in blood (Fig. 6a) was measured from breath in the same intervals in which the EEG was recorded. The reported results were obtained from the EEG signal recorded in position O1 with the Goldman reference electrode. The sampling frequency used was 128 Hz.

Footnote 4: Pharmacoelectroencephalography (pharmaco-EEG) is a research field in neuroscience oriented to electrophysiological brain research in pharmacology, clinical pharmacology, neurotoxicology, pharmacopsychiatry and related fields, in which the effects of psychoactive substances on the EEG are studied and evaluated [26].

An example of the marginal redundancies ϱ^n(τ) of one of these EEG records, for n = 2, 3 and lags τ = 1-50, is presented in Fig. 1d. A large amount of data (16,384 samples for each recording) was available, so we chose the CER h^(1). The marginal redundancies ϱ^n(τ) decrease to zero at lags of less than 0.5 sec, so τ_max = 50 samples was chosen for the estimation of h^(1); embedding dimension n = 3 and Q = 4 equiquantal marginal levels were used. The results, i.e., the dependence of the CER h^(1) on the time after the dose of alcohol, are presented in Fig. 6b: the increase of the concentration of alcohol in blood induces a decrease of the coarse-grained entropy rate h^(1). Figures 6c and 6d present the results of standard spectral analysis, namely the spectral powers in the alpha band (8-13 Hz) and in the beta band (13-32 Hz). The presence of alcohol in blood causes an increase of the alpha activity and a decrease of the beta activity in the EEG of this volunteer. In particular, the spectral power in the beta band correlates very well with the CER h^(1). Thus the CER reflects physiologically meaningful information; in this case, however, it does not bring any new information. The reason is probably that the studied changes can be fully described on the linear level by means of spectral analysis.
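The spectral quantities of Figs. 6c,d can be reproduced in spirit with a standard periodogram estimate; a sketch with our own parameter choices (Welch segment length, relative normalization), not the study's exact processing:

```python
import numpy as np
from scipy.signal import welch

def band_power(eeg, fs=128.0, band=(8.0, 13.0)):
    """Relative spectral power of the signal in the given band,
    e.g. alpha (8-13 Hz) or beta (13-32 Hz), from Welch's method."""
    f, psd = welch(eeg, fs=fs, nperseg=1024)
    sel = (f >= band[0]) & (f <= band[1])
    # uniform frequency grid, so the ratio of sums equals the ratio of integrals
    return psd[sel].sum() / psd.sum()
```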

7 Conclusion

We have introduced the coarse-grained entropy rates, quantities suitable for the relative quantification of experimental time series. Even when the process underlying experimental data is probably not low-dimensional, the CER's, unlike the numbers obtained by formal applications of dimensional algorithms, retain the theoretical interpretation of the exact entropy rates, or the Kolmogorov-Sinai entropies of dynamical systems: they are relative, macroscopic (or coarse-grained) measures of regularity and predictability, and, as we have demonstrated above, for known dynamics they give the same classification of datasets (and underlying processes) as the exact entropy rates or the Kolmogorov-Sinai entropies of chaotic systems.

Considering the question of when to use the CER's in the analysis of experimental time series, let us recall that the CER's are quantities related to the entropy rates, which can be defined for both deterministic and stochastic processes. Thus the CER's are meaningful irrespective of the origin of the data (Footnote 5). However, when the data can be well described by a Gaussian process, then all dynamical information about the process is contained in its spectrum, and the CER's cannot bring more information than spectral analysis, as was demonstrated in the above example from the pharmaco-EEG study. We believe, however, that there are many real-world problems in which nonlinearity (either deterministic or stochastic) plays an important role, and the application of the CER's can bring complementary information not detectable by standard tools like spectral analysis.

Footnote 5: Of course, with the exception of regular deterministic processes with zero entropy rate, which can be easily identified from the shape of the marginal redundancy curve plotted as a function of the time lag τ [16].

Acknowledgements

The author would like to thank D. Prichard, J. Theiler and D. Kaplan for valuable comments. The pharmaco-EEG data were recorded in the EEG Laboratory of the Prague Psychiatric Center under the supervision of I. David, M.D., Ph.D., who is gratefully acknowledged. The author was supported by the International Research Fellowship F05 TW04757 ICP from the National Institutes of Health, the Fogarty International Center, and also by grants to the Santa Fe Institute, including core funding from the John D. and Catherine T. MacArthur Foundation, the National Science Foundation (PHY8714918), and the U.S. Department of Energy (ER-FG05-88ER25054).

References

[1] P. Grassberger and I. Procaccia, On the characterization of strange attractors, Phys. Rev. Lett. 50 (1983) 346-349.
[2] P. Grassberger and I. Procaccia, Measuring the strangeness of strange attractors, Physica D 9 (1983) 189-208.
[3] N.B. Abraham, A.M. Albano, A. Passamante and P.E. Rapp (eds.), Measures of Complexity and Chaos (Plenum Press, New York, 1989).
[4] G. Mayer-Kress (ed.), Dimensions and Entropies in Chaotic Systems (Springer, Berlin, 1986).
[5] S.P. Layne, G. Mayer-Kress and J. Holzfuss, Problems associated with dimensional analysis of electroencephalogram data, in: G. Mayer-Kress (ed.), Dimensions and Entropies in Chaotic Systems (Springer, Berlin, 1986), p. 246.
[6] G. Mayer-Kress and S.P. Layne, Dimensionality of the human electroencephalogram, in: A.S. Mandell and S. Koslow (eds.), Perspectives in Biological Dynamics and Theoretical Medicine, Ann. N.Y. Acad. Sci. 504 (1987) 62-87.
[7] M. Koukkou, D. Lehmann, J. Wackermann, I. Dvorak and B. Henggeler, Dimensional complexity of EEG brain mechanisms in untreated schizophrenia, Biol. Psychiatry 33 (1993) 397-407.
[8] J. Wackermann, D. Lehmann, I. Dvorak and C.M. Michel, Global dimensional complexity of multi-channel EEG indicates change of human brain functional state after a single dose of a nootropic drug, EEG Clin. Neurophysiol. 86 (1993) 193-198.
[9] T.M. Cover and J.A. Thomas, Elements of Information Theory (J. Wiley & Sons, New York, 1991).
[10] R.G. Gallager, Information Theory and Reliable Communication (J. Wiley, New York, 1968).
[11] A.M. Fraser, Information and entropy in strange attractors, IEEE Transactions on Information Theory 35 (1989) 245-262.
[12] I.P. Cornfeld, S.V. Fomin and Ya.G. Sinai, Ergodic Theory (Springer, New York, 1982).
[13] P. Billingsley, Ergodic Theory and Information (J. Wiley, New York, 1965).
[14] K. Petersen, Ergodic Theory (Cambridge University Press, Cambridge, 1983).
[15] Ya.G. Sinai, Introduction to Ergodic Theory (Princeton University Press, Princeton, 1976).
[16] M. Palus, Identifying and quantifying chaos by using information-theoretic functionals, in: A.S. Weigend and N.A. Gershenfeld (eds.), Time Series Prediction: Forecasting the Future and Understanding the Past, Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XV (Addison-Wesley, Reading, Mass., 1993) pp. 387-413.
[17] E.N. Lorenz, Deterministic nonperiodic flow, J. Atmos. Sci. 20 (1963) 130-141.
[18] M. Palus, V. Albrecht and I. Dvorak, Information-theoretic test for nonlinearity in time series, Phys. Lett. A 175 (1993) 203-209.
[19] J.-P. Eckmann and D. Ruelle, Fundamental limitations for estimating dimensions and Lyapunov exponents in dynamical systems, Physica D 56 (1992) 185-187.
[20] M. Palus, Testing for nonlinearity using redundancies: quantitative and qualitative aspects, Santa Fe Institute Working Paper 93-12-076, submitted to Physica D.
[21] S.M. Pincus, I.M. Gladstone and R.A. Ehrenkranz, A regularity statistic for medical data analysis, J. Clin. Monit. 7 (1991) 335-345.
[22] P. Grassberger and I. Procaccia, Estimation of the Kolmogorov entropy from a chaotic signal, Phys. Rev. A 28 (1983) 2591-2593.
[23] D. Prichard and J. Theiler, Generalized redundancies for time series analysis, preprint LA-UR-94-1772, Los Alamos National Laboratory.
[24] H.G.E. Hentschel and I. Procaccia, The infinite number of generalized dimensions of fractals and strange attractors, Physica D 8 (1983) 435-444.
[25] J.D. Farmer, E. Ott and J.A. Yorke, The dimension of chaotic attractors, Physica D 7 (1983) 153-180.
[26] W.M. Herrmann (ed.), Electroencephalography in Drug Research (Fischer, Stuttgart, New York, 1982).

Figure captions

Fig. 1. (a-c) Marginal redundancy as a function of the time lag τ for the time series generated by the chaotic Lorenz system: (a) N = 1 million samples, Q = 16; (b) N = 16,384, Q = 16; (c) N = 16,384, Q = 5. The four different curves are the marginal redundancies for different embedding dimensions, n = 2-5, reading from bottom to top. (d) Marginal redundancy vs. time lag for a human electroencephalogram, N = 16,384 samples, Q = 4, sampling rate 128 Hz, embedding dimensions n = 2 (lower curve) and 3 (upper curve).

Fig. 2. Coarse-grained entropy rates h^(0) (a,c), τ_0 = 0, τ_1 = 1, and h^(1) (b,d), τ_0 = 0, τ_max = 100; N = 16,384 and Q = 16 (a,b), N = 1,024 and Q = 8 (c,d); as functions of the parameter c, computed from one hundred time series generated by the autoregressive process, whose exact entropy rate is a smooth, monotonically decreasing function of the parameter c. Embedding dimension n = 2.

Fig. 3. (a) Positive Lyapunov exponent (or Kolmogorov-Sinai entropy) of the chaotic baker map computed as the analytical function of the parameter a. (b-g) Coarse-grained entropy rates h^(0) (b,d,f), τ_0 = 0, τ_1 = 1, and h^(1) (c,e,g), τ_0 = 0, τ_max = 100; N = 16,384 and Q = 16 (b,c), N = 1,024 and Q = 8 (d,e), N = 256 and Q = 4 (f,g); as functions of the parameter a, computed from ninety-seven time series generated by the chaotic baker maps with the parameter a changing from 0.01 to 0.49. Embedding dimension n = 2.

Fig. 4. The same dependence of the coarse-grained entropy rates h^(0) (a,c) and h^(1) (b,d) as in Figs. 3b and 3c, respectively, but for data transformed by quadratic (a,b) and cubic (c,d) nonlinear transformations.

Fig. 5. The same dependence of the coarse-grained entropy rates h^(0) (a,c) and h^(1) (b,d) as in Figs. 3b and 3c, respectively, but for data contaminated by additive Gaussian noise, the variance of which is 10% (a,b) and 50% (c,d) of the variance of the original noise-free data.

Fig. 6. (a) Concentration of alcohol in blood; (b) coarse-grained entropy rate h^(1) (N = 16,384, Q = 4, n = 3, τ_0 = 0, τ_max = 50); (c) spectral power in the alpha band (8-13 Hz); (d) spectral power in the beta band (13-32 Hz); as functions of time after the dose of alcohol to a healthy volunteer. The CER h^(1) and the spectral parameters were obtained from the EEG signal recorded in position O1, Goldman average reference, sampling rate 128 Hz.
