Diffusion factors Tove Faber Frandsen Royal School of Library and Information Science, Copenhagen, Denmark

The current issue and full text archive of this journal is available at www.emeraldinsight.com/0022-0418.htm

JDOC 62,1


Received 20 December 2004 Revised 13 October 2005 Accepted 17 October 2005

Ronald Rousseau KHBO, Industrial Sciences and Technology, Oostende, Belgium, and

Ian Rowlands Department of Information Science, City University, London, UK

Abstract
Purpose – The purpose of this paper is to clarify earlier work on journal diffusion metrics. Classical journal indicators such as the Garfield impact factor do not measure the breadth of influence across the literature of a particular journal title. As a new approach to measuring research influence, the study complements these existing metrics with a series of formally described diffusion factors.
Design/methodology/approach – Using a publication-citation matrix as an organising construct, the paper develops formal descriptions of two forms of diffusion metric: “relative diffusion factors” and “journal diffusion factors” in both their synchronous and diachronous forms. It also provides worked examples for selected library and information science and economics journals, plus a sample of health information papers to illustrate their construction and use.
Findings – Diffusion factors capture different aspects of the citation reception process than existing bibliometric measures. The paper shows that diffusion factors can be applied at the whole journal level or for sets of articles and that they provide a richer evidence base for citation analyses than traditional measures alone.
Research limitations/implications – The focus of this paper is on clarifying the concepts underlying diffusion factors and there is unlimited scope for further work to apply these metrics to much larger and more comprehensive data sets than has been attempted here.
Practical implications – These new tools extend the range of tools available for bibliometric, and possibly webometric, analysis. Diffusion factors might find particular application in studies where the research questions focus on the dynamic aspects of innovation and knowledge transfer.
Originality/value – This paper will be of interest to those with theoretical interests in informetric distributions as well as those interested in science policy and innovation studies.
Keywords Serials, Generation and dissemination of information, Publications
Paper type Research paper

Journal of Documentation Vol. 62 No. 1, 2006 pp. 58-72 © Emerald Group Publishing Limited 0022-0418 DOI 10.1108/00220410610642048

Introduction
The journal diffusion factor was introduced by Rowlands (2002) as a measure of the transdisciplinary reception of a journal, a way of summarising the breadth of a journal’s influence across the literature. He defined the journal diffusion factor as a standardised average number of citing journals per source item within a given time window. As such its mathematical form is similar to, and was indeed modelled on, that of the classical journal impact factor. Rowlands likened the publication of new ideas to pebbles being thrown into a pond. The diffusion factor may be understood metaphorically as a measure of the extent of the resulting ripples (citations) as new publications enter the literature (the pond). Classical journal indicators such as the impact factor and the median citation age do not measure the breadth of the reception of a particular journal (nor journal issue) in the marketplace for research ideas. It is here that a diffusion factor plays a useful complementary role.

In this contribution we focus on precise mathematical definitions of different forms of diffusion factors. This approach closely follows the description of different forms of journal impact factor given in Ingwersen et al. (2001) or, more recently, in Frandsen and Rousseau (2005). We will also present some worked examples of these different forms and conclude by suggesting some broader research questions that derive from the notion of measured journal diffusion.


The publication-citation matrix
In order to communicate our ideas as precisely as possible, we will invoke a publication-citation matrix (Ingwersen et al., 2001) as an organising construct. Note that, as in any citation study, citations come from a certain pool of sources (journals). In most investigations this pool is the set of journals indexed by ISI (remembering that this is a variable pool, as the set of indexed journals changes somewhat each year). Recently, other pools have been considered for citation analysis (Jin and Wang, 1999; Wu et al., 2004). Indeed, one might also consider any reasonably comprehensive data set, perhaps a database comprising all significant journals delineating a scientific domain: our diffusion metrics may be applied at many different levels of analysis.

Table I is a publication-citation matrix for one hypothetical journal, JO, from 1999 to 2004. The cells contain annual numbers of published articles and citations, with the latter deriving from, and constrained by, the whole set of journals in the pool. The first row gives the annual number of articles published in JO. We assume, for simplicity, that all articles are “citable” (i.e. that editorials, obituaries, meeting abstracts etc., are excluded from our analysis). The subsequent rows are citation rows. Thus we can see that in 2003, for example, JO received 13 citations to the articles it published in 1999. That same year it received 14 citations to the articles it published in 2002.

Impact factors
Impact factors are measures of mean citedness regardless of source and may be calculated using synchronous or diachronous approaches (or even a combined strategy), and they may use variable time windows for both publication and citation data (Frandsen and Rousseau, 2005).

Table I. A publication-citation matrix for the hypothetical journal JO

Publication year                                 1999  2000  2001  2002  2003  2004
Number of publications                             10    15    21    25    32    34
Number of citations received in the year 1999       5
Number of citations received in the year 2000      10     6
Number of citations received in the year 2001      13     8     7
Number of citations received in the year 2002      16    10     8     7
Number of citations received in the year 2003      13    14    12    14     8
Number of citations received in the year 2004      12    16    16    12    15    10


The ISI or Garfield impact factor (Garfield and Sher, 1963) of the journal JO in the year 2001 is (based on Table I):

\[ \mathrm{IF}_2(2001) = \frac{13 + 8}{10 + 15} = 0.84 \tag{1} \]
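The calculation in equation (1) can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the names `PUB`, `CIT` and `synchronous_impact_factor` are our own assumptions, and the data are simply the cells of Table I. The window length `n` is left as a parameter, with `n = 2` giving the Garfield form.

```python
# Table I as nested dictionaries (illustrative layout, not a prescribed format).
# PUB[X]    = number of articles published in year X
# CIT[Y][X] = citations received in year Y by articles published in year X
PUB = {1999: 10, 2000: 15, 2001: 21, 2002: 25, 2003: 32, 2004: 34}
CIT = {
    1999: {1999: 5},
    2000: {1999: 10, 2000: 6},
    2001: {1999: 13, 2000: 8, 2001: 7},
    2002: {1999: 16, 2000: 10, 2001: 8, 2002: 7},
    2003: {1999: 13, 2000: 14, 2001: 12, 2002: 14, 2003: 8},
    2004: {1999: 12, 2000: 16, 2001: 16, 2002: 12, 2003: 15, 2004: 10},
}


def synchronous_impact_factor(year, n=2):
    """Citations received in `year` by the n preceding publication years,
    divided by the number of articles published in those years."""
    pub_years = [year - i for i in range(1, n + 1)]
    citations = sum(CIT[year][x] for x in pub_years)
    publications = sum(PUB[x] for x in pub_years)
    return citations / publications


print(round(synchronous_impact_factor(2001), 2))  # 0.84
```

All citation counts used in one call come from a single row of the matrix (one citation year), which is what makes the indicator synchronous.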

$\mathrm{IF}_2$ is a synchronous impact factor involving a single citation year and two publication years. The term “synchronous” refers to the fact that the citations used for the calculation were all received in the same year. In other words, they may be found in the reference lists published in the same year (2001 in our example). In general, the n-year synchronous impact factor of a journal J in the year Y may be defined as (Rousseau, 1988):

\[ \mathrm{IF}_n(Y) = \frac{\sum_{i=1}^{n} \mathrm{CIT}(Y, Y-i)}{\sum_{i=1}^{n} \mathrm{PUB}(Y-i)} \tag{2} \]

In this formula $\mathrm{CIT}_J(Y, X)$ denotes the number of citations received from all members of the pool by a fixed journal J in the year Y, by articles published in the year X. Similarly, $\mathrm{PUB}_J(Z)$ denotes the number of articles published by this same journal in the year Z (the index J is often omitted). Citation data for a synchronous impact factor will always be found in the same row of the publication-citation matrix. Indeed the data in a certain citation row in our table correspond to the data that can be obtained from ISI’s Journal Citation Reports (JCR) when looking up a journal in the “Cited journal” view. A continuous version of the synchronous impact factor has been introduced by Egghe (1988) and is useful for modelling purposes, but it has not been applied in this contribution.

Next, we introduce a diachronous impact factor, IMP. The term “diachronous” refers to the fact that the data used to calculate it derive from a number of different years with a starting point somewhere in the past and encompassing subsequent years. The 2002 two-year diachronous impact factor for the journal JO as represented in Table I is:

\[ \mathrm{IMP}_2(2002) = \frac{14 + 12}{25} = 1.04 \tag{3} \]

Or, if one includes the year of publication:

\[ \mathrm{IMP}_2^0(2002) = \frac{7 + 14}{25} = 0.84 \tag{4} \]

In general, the n-year (shifted) diachronous impact factor of a journal in the year Y is:

\[ \mathrm{IMP}_n^s(Y) = \frac{\sum_{i=s}^{s+n-1} \mathrm{CIT}(Y+i, Y)}{\mathrm{PUB}(Y)} \tag{5} \]

where $s = 0, 1, 2, \ldots$ denotes a possible shift with respect to the year of publication. Citation data for the diachronous impact factor are always to be found in the same column of the publication-citation matrix. Therefore, in order to collect data for calculations of diachronous impact factors, several volumes or files of the JCR are needed. Alternatively, data may be collected using an online methodology. We come now to formal definitions (there will be more than one) of various journal diffusion factors.

Relative diffusion factors
Diachronous version
Rowlands (2002) did not use the term “relative” diffusion factor (RDI) but we introduce this terminology here for two reasons. First, it has a descriptive function: dividing the number of citing journals by the number of citations effectively yields a relative number. Second, it gives us the opportunity to distinguish between the diffusion factor as originally introduced by Rowlands (2002) and its subsequent elaboration and improvement by Frandsen (2004), of which more later. In this paper, we will refer to Frandsen’s indicator as the “journal diffusion factor” (JDF) to distinguish it from the RDI. In this section, we consider the diachronous version of the RDI.

We can augment the publication-citation matrix of JO by including the number of unique new journals that yield the citations. In this context, “new” refers to the fixed publication year that we are considering and it means that we will consider the matrix column by column and add new journals from the top (the publication year) to the bottom, a strategy that yields Table II. So, for example, JO received 12 citations in 2003 to articles that it published in 2001. These 12 citations occurred in different journals, four of which were not yet involved in a citation (in 2001 or 2002) to articles published in JO in 2001. Note that the total number of different journals involved in those 12 citations is not shown, since we do not need to consider this information.
We are now ready to define the diachronous relative diffusion factor. This will be calculated, as before, for a particular journal, in a particular year, using a particular time window. The three-year diachronous relative diffusion factor (denoted as RDI) of the journal JO for the year 2000 is defined as:

\[ \mathrm{RDI}_3(2000) = \frac{4 + 6 + 2}{6 + 8 + 10} = 0.50 \tag{6} \]

Table II. Augmented publication-citation matrix for the hypothetical journal JO (diachronous version)

Publication year    1999   2000   2001   2002   2003   2004
A                     10     15     21     25     32     34
B (1999)             5-4
C (2000)            10-5    6-4
D (2001)            13-6    8-6    7-5
E (2002)            16-6   10-2    8-5    7-6
F (2003)            13-3   14-1   12-4   14-5    8-6
G (2004)            12-0   16-1   16-2   12-3   15-9   10-5

Notes: A: Number of publications; B: Number of citations received in the year 1999 – number of unique journals involved; C: Number of citations received in the year 2000 – number of unique, new journals involved (new, for the publication year on top of the column, and with respect to previous rows); D: Number of citations received in the year 2001 – number of unique, new journals involved; E: Number of citations received in the year 2002 – number of unique, new journals involved; F: Number of citations received in the year 2003 – number of unique, new journals involved; G: Number of citations received in the year 2004 – number of unique, new journals involved

In general, the n-year diachronous relative diffusion factor of a journal J in the year Y is defined as:

\[ \mathrm{RDI}_n(Y) = \frac{\sum_{j=0}^{n-1} U(Y+j, Y)}{\sum_{j=0}^{n-1} \mathrm{CIT}(Y+j, Y)} \tag{7} \]

Here we use the same notation for citations as in the previous section. The number of unique new journals for citations in the year $Y + j$, to articles published in this journal in the fixed year $Y$, is denoted as $U(Y + j, Y)$. The phrase “unique new” refers to the fact that this journal has not cited an article published in the journal J in the year $Y$ during the years $Y, Y + 1, \ldots, Y + j - 1$, but that it did cite (in the year $Y + j$) an article published in the year $Y$. Note that any journal can only contribute to the numerator once. Although it is conceivable to begin sums later than in the year $Y$, this does not seem to make much sense in practice. We therefore recommend the form as defined by equation (7).

Synchronous version
In order to define the synchronous relative diffusion factor we have simply augmented the publication-citation matrix by including the number of unique new journals involved in citations. Note, however, that in this context the word “new” refers to the fixed citation year we are considering. Here we consider the matrix row by row and add new journals from the right (the citation year) to the left. This leads to Table III. So, for example, JO received 12 citations in 2003 to articles published in 2001. These 12 citations occurred in different journals, three of which were not involved in a citation (in the year 2003) to articles published in 2003 or 2002 in JO.

We are now ready to define the synchronous relative diffusion factor. This diffusion factor will be defined, as for the impact factor earlier, for a particular journal, in a particular year, using a particular time window.
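The two senses of “unique new” described above (column by column for a fixed publication year, row by row for a fixed citation year) can be sketched as follows. The citation records are hypothetical and the function names are ours; the sketch only illustrates the counting rule, not how the data behind Tables II and III were gathered.

```python
# Hypothetical citation records: (citing journal, citation year, publication year).
RECORDS = [
    ("A", 2001, 2001), ("B", 2001, 2001),
    ("A", 2002, 2001), ("C", 2002, 2001),
    ("C", 2003, 2001), ("D", 2003, 2001),
    ("D", 2003, 2002),
    ("D", 2003, 2003), ("E", 2003, 2003),
]


def citing(records, cit_year, pub_year):
    """Set of journals citing, in cit_year, articles published in pub_year."""
    return {j for j, cy, py in records if cy == cit_year and py == pub_year}


def unique_new_counts(citing_sets):
    """Scan the sets in order; count journals not met earlier in the scan."""
    seen, counts = set(), []
    for journals in citing_sets:
        new = journals - seen
        counts.append(len(new))
        seen |= new
    return counts


def u_diachronous(records, pub_year, n):
    """U(Y + j, Y) for j = 0..n-1: fixed publication year, citation years forward."""
    return unique_new_counts(
        [citing(records, pub_year + j, pub_year) for j in range(n)])


def u_synchronous(records, cit_year, n):
    """U(Y, Y - j) for j = 0..n-1: fixed citation year, publication years backward."""
    return unique_new_counts(
        [citing(records, cit_year, cit_year - j) for j in range(n)])


print(u_diachronous(RECORDS, 2001, 3))  # [2, 1, 1]
print(u_synchronous(RECORDS, 2003, 3))  # [2, 0, 1]
```

In the synchronous scan journal D is counted once only (for publication year 2003), although it also cites the 2002 and 2001 volumes in 2003.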

Table III. Augmented publication-citation matrix for an hypothetical journal JO (synchronous version)

Publication year    1999   2000   2001   2002   2003   2004
A                     10     15     21     25     32     34
B (1999)             5-4
C (2000)            10-6    6-4
D (2001)            13-4    8-5    7-5
E (2002)            16-3   10-3    8-4    7-6
F (2003)            13-0   14-2   12-3   14-5    8-6
G (2004)            12-1   16-0   16-3   12-3   15-8   10-5

Notes: A: Number of publications; B: Number of citations received in the year 1999 – number of unique journals involved; C: Number of citations received in the year 2000 – number of unique, new journals involved (new, with respect to the year 2000); D: Number of citations received in the year 2001 – number of unique, new journals involved (new, with respect to the year 2001); E: Number of citations received in the year 2002 – number of unique, new journals involved (new, with respect to the year 2002); F: Number of citations received in the year 2003 – number of unique, new journals involved (new, with respect to the year 2003); G: Number of citations received in the year 2004 – number of unique, new journals involved (new, with respect to the year 2004)

The three-year synchronous relative diffusion factor of JO for the year 2003 is:

\[ \mathrm{RDIF}_3(2003) = \frac{6 + 5 + 3}{8 + 14 + 12} = 0.41 \tag{8} \]

In general, the n-year synchronous relative diffusion factor of a journal J in the year Y is:

\[ \mathrm{RDIF}_n(Y) = \frac{\sum_{j=0}^{n-1} U(Y, Y-j)}{\sum_{j=0}^{n-1} \mathrm{CIT}(Y, Y-j)} \tag{9} \]

The number of unique new journals for citations in the year $Y$, to articles published in this journal in the year $Y - j$, is denoted as $U(Y, Y - j)$. The phrase “unique, new” refers here to the fact that this journal has not cited an article published in the journal J in the years $Y, \ldots, Y - j + 1$, but that it did cite (in the year $Y$) an article published in the year $Y - j$. Observe that the meaning of “unique new” differs here from that used for the diachronous diffusion factor; in both cases, however, any journal can contribute at most once to the sum in the numerator. Note that we include the year $Y$ ($j = 0$) in our calculation of the synchronous diffusion factor. There is, indeed, no rationale for excluding possibly unique journals that cite J (only) in the year of publication. We feel that the diachronous version is “better” than the synchronous one in the sense that it more adequately captures the meaning of diffusion across time.

The relative diffusion factors defined above lie between 0 and 1. There is, of course, no problem if one prefers a number between 0 and 100: it suffices to multiply by 100 as in Rowlands (2002).
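Equations (7) and (9) differ only in which cells of the augmented matrix are summed. A minimal sketch, using just the cells of Tables II and III that enter equations (6) and (8); the dictionary layout and the names `rdi`, `rdif` and `ratio_of_sums` are illustrative assumptions of ours:

```python
# Cells are (citations, unique new journals), keyed by
# (citation year, publication year). Only the cells needed here are listed.
TABLE_II = {(2000, 2000): (6, 4), (2001, 2000): (8, 6), (2002, 2000): (10, 2)}
TABLE_III = {(2003, 2003): (8, 6), (2003, 2002): (14, 5), (2003, 2001): (12, 3)}


def ratio_of_sums(cells, keys):
    """Sum of unique new journals divided by sum of citations over `keys`."""
    citations = sum(cells[k][0] for k in keys)
    new_journals = sum(cells[k][1] for k in keys)
    return new_journals / citations


def rdi(cells, year, n):
    """Diachronous form: walk down one column (fixed publication year)."""
    return ratio_of_sums(cells, [(year + j, year) for j in range(n)])


def rdif(cells, year, n):
    """Synchronous form: walk along one row (fixed citation year)."""
    return ratio_of_sums(cells, [(year, year - j) for j in range(n)])


print(f"{rdi(TABLE_II, 2000, 3):.2f}")    # 0.50
print(f"{rdif(TABLE_III, 2003, 3):.2f}")  # 0.41
```

Because each journal is counted at most once in the numerator while every citation is counted in the denominator, both ratios stay between 0 and 1.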


In one sense, a journal is no more than a collection of articles. Hence, the definitions of relative diffusion factors as given above can be applied to any set of articles, even to a single article. An example of this is presented later.

A simple diachronous model
We assume that $U(Y + j, Y)$, as defined above, is given by an exponentially decreasing function: $U(Y + j, Y) = U_0 a^j$. If we assume further that $\mathrm{CIT}(Y + j, Y)$ is given by an exponentially decreasing function $C_0 b^j$, where, naturally, $0 < a < b < 1$, and even $a \ll b$ (modelling with an exponential decrease is an often used first approximation for a real citation curve), then $\mathrm{RDI}_n(Y)$ becomes:

\[ \mathrm{RDI}_n(Y) = \frac{\sum_{j=0}^{n-1} U(Y+j, Y)}{\sum_{j=0}^{n-1} \mathrm{CIT}(Y+j, Y)} = \frac{\sum_{j=0}^{n-1} U_0 a^j}{\sum_{j=0}^{n-1} C_0 b^j} = \frac{U_0 (1-b)(1-a^n)}{C_0 (1-a)(1-b^n)} \tag{10} \]

Since we assume that $a \ll b$, this expression can be approximated as:

\[ \frac{U_0 (1-b)}{C_0 (1-a)(1-b^n)} = \frac{U_0}{C_0 (1-a)\left(1 + b + b^2 + \cdots + b^{n-1}\right)} \tag{11} \]

Approximating even further, $\mathrm{RDI}_n(Y)$ can be written as:

\[ \frac{U_0}{C_0 (1+b)} \tag{12} \]
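The closed form in equation (10) and the approximation in equation (11) are easy to check numerically. The sketch below compares the direct sums with both expressions; the parameter values are hypothetical, chosen only so that a is much smaller than b, and the function names are ours.

```python
def rdi_direct(u0, c0, a, b, n):
    """RDI_n as a ratio of the two finite geometric sums."""
    unique = sum(u0 * a**j for j in range(n))
    cites = sum(c0 * b**j for j in range(n))
    return unique / cites


def rdi_closed_form(u0, c0, a, b, n):
    """Right-hand side of equation (10)."""
    return u0 * (1 - b) * (1 - a**n) / (c0 * (1 - a) * (1 - b**n))


def rdi_approx_11(u0, c0, a, b, n):
    """Equation (11): equation (10) with the factor (1 - a**n) dropped."""
    return u0 * (1 - b) / (c0 * (1 - a) * (1 - b**n))


# Hypothetical parameters with a much smaller than b.
U0, C0, A, B = 5.0, 10.0, 0.1, 0.8

for n in range(1, 9):
    print(n,
          round(rdi_direct(U0, C0, A, B, n), 4),
          round(rdi_closed_form(U0, C0, A, B, n), 4),
          round(rdi_approx_11(U0, C0, A, B, n), 4))
```

Under these assumptions the printed values fall as n grows, because the numerator is almost exhausted after the first term while the citation sum in the denominator keeps growing.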

This modelling exercise suggests that the citation distribution is the main contributor to the RDI, and this for all n, as observed in practice (Frandsen, 2004). Moreover, the larger the number of citations, the lower RDI becomes, suggesting a negative correlation. We recall that a decreasing scatter plot (of any form) yields a negative (linear) correlation; an increasing one yields a positive (linear) correlation (Egghe and Rousseau, 1996).

Journal diffusion factors
Recently, Frandsen (2004) proposed a journal diffusion factor (actually a whole set) where, instead of dividing by the number of citations, one divides by the number of publications. Denoting this new diffusion factor (n-year diachronous version) for the year Y as $\mathrm{DI}_n(Y)$ yields the following mathematical formulation:

\[ \mathrm{DI}_n(Y) = \frac{\sum_{j=0}^{n-1} U(Y+j, Y)}{\mathrm{PUB}(Y)} \tag{13} \]

Empirically, Rowlands’ relative diffusion factor is heavily correlated with the total number of citations (Rowlands, 2002). Frandsen’s alternative diffusion factor above, on the other hand, is heavily correlated with journal impact. It is not bounded. For single articles the Frandsen diffusion factor coincides with the total number of different journals citing this article over a specified period of time. Similarly a synchronous diffusion factor is defined as:

\[ \mathrm{DIF}_n(Y) = \frac{\sum_{j=0}^{n-1} U(Y, Y-j)}{\sum_{j=0}^{n-1} \mathrm{PUB}(Y-j)} \tag{14} \]

The next calculation reveals the precise relation between these two diffusion factors and the (diachronous) impact factor. If we consider the diachronous relative diffusion factor RDI and multiply it by the diachronous (unshifted, i.e. $s = 0$) impact factor then the result is the (diachronous) diffusion factor DI. Indeed:

\[ \mathrm{RDI}_n(Y)\,\mathrm{IMP}_n^0(Y) = \frac{\sum_{j=0}^{n-1} U(Y+j, Y)}{\sum_{j=0}^{n-1} \mathrm{CIT}(Y+j, Y)} \cdot \frac{\sum_{j=0}^{n-1} \mathrm{CIT}(Y+j, Y)}{\mathrm{PUB}(Y)} = \frac{\sum_{j=0}^{n-1} U(Y+j, Y)}{\mathrm{PUB}(Y)} = \mathrm{DI}_n(Y) \tag{15} \]
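As an illustration of equation (15), the sketch below evaluates the three quantities for publication year 2000 of the hypothetical journal JO, using the 2000 column of Table II and the 15 publications of that year. The names and data layout are assumptions made for this example only.

```python
# Column for publication year 2000 (Tables I and II):
# citation year -> (citations received, unique new citing journals).
PUB_2000 = 15
COLUMN_2000 = {
    2000: (6, 4), 2001: (8, 6), 2002: (10, 2), 2003: (14, 1), 2004: (16, 1),
}


def diachronous_factors(column, publications, pub_year, n):
    """Return (RDI_n, unshifted IMP_n, DI_n) for one publication year."""
    years = [pub_year + j for j in range(n)]
    citations = sum(column[y][0] for y in years)
    new_journals = sum(column[y][1] for y in years)
    rdi = new_journals / citations        # equation (7)
    imp = citations / publications        # equation (5) with s = 0
    di = new_journals / publications      # equation (13)
    return rdi, imp, di


rdi, imp, di = diachronous_factors(COLUMN_2000, PUB_2000, 2000, 3)
print(rdi, imp, round(di, 2))  # 0.5 1.6 0.8
```

Here the unshifted three-year impact factor is 24/15 = 1.6, which is larger than one, and accordingly the journal diffusion factor (0.8) exceeds the relative diffusion factor (0.5).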

Equation (15) shows that the diachronous journal diffusion factor is larger than the diachronous relative diffusion factor if, and only if, the diachronous impact factor (calculated for the same citation period) is larger than one.

Some worked examples
In the examples that follow, we hope to illustrate the usefulness of diffusion factors.

Two library and information science journals
We calculated diachronous diffusion factors for Libri and the Journal of Documentation for the publication years 1995 and 1996 using data from ISI’s Web of Knowledge. More precisely, we calculated $\mathrm{RDI}_j(1995)$ and $\mathrm{DI}_j(1995)$ for $j = 1$ to 9; and $\mathrm{RDI}_j(1996)$ and $\mathrm{DI}_j(1996)$ for $j = 1$ to 8 for these two journals. The results are shown in Figures 1 and 2, and in Table IV. When the first article of a journal J is cited its relative diffusion factor becomes 1. So, unless the same journal acts as a citing journal for J during the first year, J’s relative diffusion factor will always start at the value 1: this is indeed the case for Libri (1995 and 1996) and for JDOC (1995). Relative journal diffusion factors generally exhibit a decreasing tendency, although increases for a short period in specific cases are possible. One would ultimately expect relative diffusion factors to tend towards zero, yet even over a short period of time, as in the examples shown here, they seem to have a limiting value. Libri’s relative diffusion factors tend to the value 0.6; JDOC’s relative diffusion factors are smaller, and exhibit less of a tendency towards a limiting value.


Figure 1. Relative diffusion factors for the journal Libri over several years (publication years 1995 and 1996)

Figure 2. Relative diffusion factors for the Journal of Documentation over several years (publication years 1995 and 1996)

Table IV. Diffusion factors for the Journal of Documentation and Libri

             JDOC              Libri
Year     1995    1996      1995    1996
1995     0.35              0.04
1996     1.15    0.38      0.25    0.00
1997     1.50    0.75      0.46    0.23
1998     1.90    1.38      0.58    0.50
1999     2.05    1.81      0.71    0.77
2000     2.30    2.25      0.88    0.86
2001     2.60    2.69      0.88    0.91
2002     2.75    2.88      0.96    0.95
2003     2.95    3.13      1.00    1.00

The lower relative diffusion factors for JDOC are related to the fact that it receives more citations than Libri; for the journal diffusion factor the opposite is true. The two journals have a similar number of publications (around 20), so the fact that JDOC receives more citations makes it more probable that it has the higher journal diffusion factor. This is indeed the case.

Diachronous relative diffusion factors for four single articles
In a recent article, Sherwill-Navarro and Wallace (2004) attempt to evaluate the impact of research into the value of medical library services on the basis of four high impact articles. These articles are:
• The contribution of hospital library information services to clinical care: a study in eight hospitals (King, 1987).
• The impact of the hospital library on clinical decision making: the Rochester study (Marshall, 1992).
• Use of MEDLINE by physicians for clinical problem solving (Lindberg et al., 1993).
• Effect of online literature searching on length of stay and patient care costs (Klein et al., 1994).


Their article lists all the articles citing at least one of these four key papers and we use these citing articles as basic data for the determination of the diachronous relative diffusion factor. In Table V and Figure 3 we calculate RDIs for each year since the year of first citation and for each article. Table V and Figure 3 show that the relative diffusion factor is a valuable concept when applied to single articles as well as to whole journals. As noted in the case of the two LIS journals considered earlier, these article diffusion factors also seem to converge. The limiting values though are clearly different: about 0.3 for King, 0.5 for Marshall, 0.65 for Lindberg et al. and 0.67 for Klein et al.

Economics journals
Next, we calculated synchronous and diachronous diffusion factors for 28 journal titles in economics. Table VI includes diachronous diffusion factors for the year 1999 with a five-year citation period (1999 to 2003), i.e. $\mathrm{RDI}_5(1999)$ and $\mathrm{DI}_5(1999)$, and synchronous diffusion factors with a three-year publication period and a one-year citation period (publication years 2001-2003 and citation year 2003), i.e. $\mathrm{RDIF}_3(2003)$ and $\mathrm{DIF}_3(2003)$.

Table V. Yearly relative diffusion factors for key health services articles

Year    King (1987)    Marshall (1992)    Lindberg et al. (1993)    Klein et al. (1994)
1989       1.000
1990       0.400
1991       0.286
1992       0.300            1.000
1993       0.294            0.429                0.667
1994       0.333            0.500                0.714
1995       0.320            0.421                0.765                    0.833
1996       0.310            0.448                0.583                    0.733
1997       0.303            0.487                0.614                    0.714
1998       0.308            0.453                0.633                    0.667
1999       0.300            0.500                0.638                    0.692
2000       0.293            0.515                0.649                    0.698
2001       0.310            0.507                0.646                    0.674
2002       0.310            0.507                0.651                    0.674
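For a single article, a yearly series of the kind shown in Table V can be obtained by accumulating citations year by year and dividing the number of distinct citing journals so far by the number of citations so far. The sketch below assumes citation records of the form (year, citing journal); the records themselves are hypothetical, constructed only so that the first three values coincide with those listed for King (1987), since the actual citing journals are not given in the text.

```python
def yearly_rdi(citations):
    """Cumulative relative diffusion factor per year for one article.

    `citations` is a list of (citation year, citing journal) pairs.
    Returns {year: distinct citing journals so far / citations so far}.
    """
    by_year = {}
    for year, journal in citations:
        by_year.setdefault(year, []).append(journal)

    seen, total, series = set(), 0, {}
    for year in sorted(by_year):
        total += len(by_year[year])
        seen.update(by_year[year])
        series[year] = len(seen) / total
    return series


# Hypothetical records (journal names J1, J2 are placeholders).
CITES = [
    (1989, "J1"),
    (1990, "J1"), (1990, "J1"), (1990, "J2"), (1990, "J1"),
    (1991, "J2"), (1991, "J1"),
]

for year, value in yearly_rdi(CITES).items():
    print(year, f"{value:.3f}")
# 1989 1.000
# 1990 0.400
# 1991 0.286
```

The series starts at 1 with the first citation and can only rise in a year in which the share of new citing journals among that year's citations exceeds the current value.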


Figure 3. Relative diffusion factors for key health services articles

Table VI. Diffusion factors for 28 economics journals

                                       Relative diffusion factor       Diffusion factor
Journal                                Synchronous   Diachronous   Synchronous   Diachronous
Journal of Economic Literature                0.60          0.34          2.14         16.50
American Economic Review                      0.33          0.24          0.43          2.32
Journal of Economic Perspectives              0.58          0.45          1.01          2.70
Econometrica                                  0.34          0.29          0.56          2.83
Quarterly Journal of Economics                0.37          0.28          1.27          6.33
Economica                                     0.83          0.56          0.27          1.86
Journal of Law and Economics                  0.44          0.34          0.75          5.97
Economic Journal                              0.51          0.35          0.50          2.22
Review of Economics and Statistics            0.53          0.34          0.41          2.25
Journal of Political Economy                  0.48          0.33          0.73          3.13
Brookings Papers on Economic Ac.              0.60          0.77          0.43          3.00
Review of Economic Studies                    0.58          0.25          0.53          2.74
International Economic Review                 0.60          0.41          0.36          1.78
Journal of Econometrics                       0.41          0.28          0.34          1.83
European Economic Review                      0.54          0.34          0.41          2.02
Rand Journal of Economics                     0.44          0.47          0.49          2.75
Journal of Labor Economics                    0.56          0.42          0.54          2.67
Economics Letters                             0.56          0.39          0.13          0.59
Scandinavian Journal of Economics             0.92          0.61          0.23          1.16
Journal of Economic Theory                    0.29          0.27          0.17          0.99
Journal of Monetary Economics                 0.35          0.24          0.33          1.82
Journal of International Economics            0.36          0.32          0.31          2.44
Health Economics                              0.49          0.3           0.42          2.13
Journal of Health Economics                   0.42          0.34          0.48          2.53
Journal of Financial Economics                0.23          0.2           0.45          2.13
Economic Theory                               0.47          0.34          0.13          0.58
Econometric Theory                            0.43          0.31          0.13          0.71
Journal of Mathematical Economics             0.57          0.36          0.13          0.49
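Two entries in the diachronous journal diffusion factor column can be checked against equation (13), using the counts that the discussion of Table VI reports: 231 different citing journals and 14 publications for the Journal of Economic Literature (JEL), and 358 citing journals and 154 publications for American Economic Review (AER). The short calculation below assumes that these counts refer to publication year 1999 and the citation period 1999 to 2003; the journal subscripts are our own shorthand.

```latex
\[
\mathrm{DI}_5(1999)_{\mathrm{JEL}} = \frac{231}{14} = 16.50,
\qquad
\mathrm{DI}_5(1999)_{\mathrm{AER}} = \frac{358}{154} \approx 2.32
\]
```

Although AER is cited by more journals than JEL, its much larger number of publications in the denominator gives it the lower value.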

The synchronous relative diffusion factors vary in the range 0.23 to 0.92 while the synchronous journal diffusion factors vary between 0.13 and 2.14. The diachronous diffusion factors also show a difference in level as the relative diffusion factors vary between 0.2 and 0.77 and the journal diffusion factors between 0.49 and 16.50. The high diachronous diffusion factor of the Journal of Economic Literature is explained by a high number of different citing journals (231) and a relatively low number of publications (14). American Economic Review receives citations from the highest number of different citing journals (358) but as the number of publications is 154 the diffusion factor is much lower (2.32).

We will select two journals in order to analyse these differences further: American Economic Review (AER), which receives many citations every year, and Kyklos, which is much less heavily cited. Figures 4 and 5 illustrate how each of our four types of diffusion factor develops over time. We see that for the two definitions of diffusion the synchronous and the diachronous results are situated at different levels. This is due to differences in the number of received citations as well as of published documents. This is in accordance with an earlier finding that showed that the journal diffusion factor is larger than the relative diffusion factor if and only if the impact factor is larger than one. Although situated at different levels the diffusion factors appear to be relatively stable over time. The diachronous diffusion factor in this example is calculated on the basis of only one publication year, which makes it easily influenced by small changes in the number of publications. Due to the smaller numbers involved Kyklos’ diffusion factors are somewhat less stable.

Figure 4. Diffusion factors over time for American Economic Review. Diachronous diffusion factors with a five-year citation period; synchronous diffusion factors with a three-year publication period

Figure 5. Diffusion factors over time for Kyklos. Diachronous diffusion factors with a five-year citation period; synchronous diffusion factors with a three-year publication period

Note that these time series (Figures 4 and 5) are totally different from those shown in Figures 1 and 2. Those showed series of the form $\mathrm{RDI}_n(Y)$, where n was the variable. Here we show series of the form $\mathrm{RDI}_5(Y)$ and $\mathrm{DIF}_3(Y)$, where Y is the variable. Note in the case of AER that the journal diffusion factor is at a higher level than the relative diffusion factor, while for Kyklos it is the other way around. This is part of the explanation of why we should not let these two figures mislead us into thinking that we are dealing with two measures describing the same phenomenon. As we see in Figure 6 they are determined by the choice of citation period and journals.

Figure 6. Diachronous diffusion factors for the journals American Economic Review and Kyklos (publication year 1995)

Figure 6 (of the same type as Figures 1 and 2) is an illustration of the diachronous diffusion factors over a period of several years for the two journals, publication year 1995. As expected in the case of AER the journal diffusion factor keeps increasing over time as the numerator of the fraction can never decrease and the denominator is constant (we also note that it is not bounded by 1). On the other hand, the higher the total number of citations, the lower the relative diffusion factor. The relative diffusion factor decreases to (near) zero over a period of time. As we extend the citation period the journal diffusion factor increases and the relative diffusion factor decreases.
In the case of Kyklos a different pattern emerges. First of all we note that the relative diffusion factor does not start at 1 since the journal does not receive any citations in the first year. We also note that the relative diffusion factor is higher than the journal diffusion factor and stable over time, although we would expect it to decrease eventually. This is due to the relatively small number of citations Kyklos receives (a small number of citations means that it is “easier” to receive citations from new journals): a total of 53 different journals in 2003 compared with 483 for AER. Eventually, however, it is expected to decrease as the total number of citations increases.

One lesson to be drawn from these findings is that, strictly speaking, relative diffusion factors of journals can only be compared in a meaningful way if journals receive the same (or at least a similar) number of citations. Our examples also demonstrate that the differences between relative and non-relative diffusion factors depend critically on the choice of citation periods and journals.

Conclusion and suggestions for further research
This paper has offered a more formal and comprehensive description of diffusion factors than previously attempted by the authors. It is programmatic in the sense that we would like to encourage the bibliometrics community to engage with our ideas, improve them further and test their utility by engaging with meaningful research questions: at this early stage, it is not entirely clear which of our proposals best catches the notion of diffusion. Indeed, we would also like to encourage others to develop new kinds of summary bibliometric indicator and to see what new light these can throw on our understanding of scholarly communication.

One important research question we would like to explore is how maps of knowledge flows might be derived from this new approach to understanding citation data. We have focused here mainly on the diffusion characteristics of journals, but there is no reason in principle why these techniques could not be applied to other units of research production: from individual articles through authors and research groups to specialties, disciplines and even to national outputs. The logistics of data collection might become rather daunting at these higher levels of aggregation but the research rewards might well be significant. It would be interesting to study the diffusion characteristics of different types of papers (reviews, articles, letters, etc.) and to see whether there are differences, for example, between papers reporting basic or applied research, or between methodological contributions and papers presenting research findings.
At the policy level, such investigations might provide answers to the question of whether diffusion is “speeding up” or “slowing down” in various contexts (is diffusion, in general terms, the same now as it was 20 years ago?). The concept of diffusion is also interesting in a policy context because it addresses the question of research utility and “quality” from a different angle than the iconic Garfield impact factor. Finally, the essential ideas here could be extended beyond the journal literature into other domains. For example, they could be used to measure the diffusion of web sites by monitoring links at regular times, or by using the Web Archive, or they could summarise the relative influence of individuals in social network analysis.

Since this article was submitted for publication, Egghe (2005) has further studied the relation between diffusion factors and impact factors. The study of Fairthorne’s “Empirical hyperbolic distributions (Bradford-Zipf-Mandelbrot) for bibliometric description and prediction” in Rousseau (2005) provides another example of a calculated diffusion factor and introduces the notion of diffusion speed.

References

Egghe, L. (1988), “Mathematical relations between impact factors and average number of citations”, Information Processing & Management, Vol. 24 No. 5, pp. 567-76.
Egghe, L. (2005), “Journal diffusion factors and their mathematical relations with the number of citations and with the impact factor”, in Ingwersen, P. and Larsen, B. (Eds), Proceedings of ISSI 2005, Karolinska University Press, Stockholm, pp. 109-20.


Egghe, L. and Rousseau, R. (1996), “Average and global impact of a set of journals”, Scientometrics, Vol. 36 No. 1, pp. 97-107.
Frandsen, T.F. (2004), “Journal diffusion factors: a measure of diffusion?”, Aslib Proceedings, Vol. 56 No. 1, pp. 5-11.
Frandsen, T.F. and Rousseau, R. (2005), “Article impact calculated over arbitrary periods”, Journal of the American Society for Information Science and Technology, Vol. 56 No. 1, pp. 58-62.
Garfield, E. and Sher, I.H. (1963), “New factors in the evaluation of scientific literature through citation indexing”, American Documentation, Vol. 14 No. 3, pp. 195-201.
Ingwersen, P., Larsen, B., Rousseau, R. and Russell, J. (2001), “The publication-citation matrix and its derived quantities”, Chinese Science Bulletin, Vol. 46 No. 6, pp. 524-8.
Jin, B. and Wang, B. (1999), “Chinese Science Citation Database: its construction and application”, Scientometrics, Vol. 45 No. 2, pp. 325-32.
King, D.N. (1987), “The contribution of hospital library information services to clinical care: a study in eight hospitals”, Bulletin of the Medical Library Association, Vol. 75 No. 4, pp. 291-330.
Klein, M.S., Ross, F.V., Adams, D.L. and Gilbert, C.M. (1994), “Effect of online literature searching on length of stay and patient care costs”, Academic Medicine, Vol. 69 No. 6, pp. 489-95.
Lindberg, D.A., Siegel, E.R., Rapp, B.A., Wallingford, K.T. and Wilson, S.R. (1993), “Use of MEDLINE by physicians for clinical problem solving”, Journal of the American Medical Association, Vol. 269 No. 24, pp. 3124-9.
Marshall, J.G. (1992), “The impact of the hospital library on clinical decision making: the Rochester study”, Bulletin of the Medical Library Association, Vol. 80 No. 2, pp. 169-78.
Rousseau, R. (1988), “Citation distribution of pure mathematics journals”, in Egghe, L. and Rousseau, R. (Eds), Informetrics 87/88, Elsevier, Amsterdam, pp. 249-62.
Rousseau, R. (2005), “Robert Fairthorne and the empirical power laws”, Journal of Documentation, Vol. 61 No. 2, pp. 194-202.
Rowlands, I. (2002), “Journal diffusion factors: a new approach to measuring research influence”, Aslib Proceedings, Vol. 54 No. 2, pp. 77-84.
Sherwill-Navarro, P.J. and Wallace, A.L. (2004), “Research on the value of medical library services: does it make an impact in the health care literature?”, Journal of the Medical Library Association, Vol. 92 No. 1, pp. 34-45.
Wu, Y., Pan, Y., Zhang, Y., Ma, Z., Pang, J., Guo, H., Xu, B. and Yang, Z. (2004), “China Scientific and Technical papers and Citations (CSTCP): history, impact and outlook”, Scientometrics, Vol. 60 No. 3, pp. 385-97.
