Technical Annex

8.1 Calculating composite indexes

For each of the 7 innovation dimensions, average performance will be summarized by calculating a composite innovation index. For each of the 3 blocks of dimensions, average performance will be summarized by calculating a weighted composite index using the composite innovation indexes for the dimensions belonging to that block. Overall innovation performance will be summarized in the Summary Innovation Index. The methodology for calculating these composite innovation indexes is explained in detail below.

Step 1: Transforming data

Most of the EIS indicators are fractional indicators with values between 0% and 100%. Some EIS indicators are unbound indicators, where values are not limited to an upper threshold. These indicators can be highly volatile and have skewed data distributions (where most countries show low performance levels and a few countries show exceptionally high performance levels). For these indicators – Public-private co-publications, EPO patents, Community trademarks and Community designs, all measured per million population – data will be transformed using a square root transformation.
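As a minimal sketch of the transformation described above, the following Python fragment applies a square root to the four unbounded indicators and leaves all other indicators unchanged. The function name, the data layout and the sample values are assumptions made for illustration only; they are not taken from the EIS methodology.

```python
import math

# The four unbounded indicators named in the text (all per million population).
UNBOUNDED_INDICATORS = {
    "Public-private co-publications",
    "EPO patents",
    "Community trademarks",
    "Community designs",
}


def transform(indicator, value):
    """Return sqrt(value) for unbounded indicators, the value itself otherwise."""
    if value is None:  # missing data stay missing
        return None
    if indicator in UNBOUNDED_INDICATORS:
        return math.sqrt(value)
    return value


# Hypothetical values, for illustration only.
print(transform("EPO patents", 144.0))           # 12.0
print(transform("A fractional indicator", 35.0))  # 35.0
```

The square root compresses large values more than small ones, which reduces the skewness of the distribution while preserving the ranking of countries.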

Step 2: Identifying outliers

Positive outliers are identified as those relative scores which are higher than the EU27 mean plus 3 times the standard deviation [1]. Negative outliers are identified as those relative scores which are lower than the EU27 mean minus 3 times the standard deviation. These outliers are not included in determining the Maximum and Minimum scores used in the normalisation process (cf. Step 6).
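The outlier rule can be sketched as follows. The text does not state whether the population or the sample standard deviation is used, nor exactly how the EU27 mean is computed; this sketch assumes the population standard deviation and a simple mean over the country scores passed in. The country labels and scores are hypothetical.

```python
from statistics import mean, pstdev


def find_outliers(scores, k=3.0):
    """Return (positive, negative) outlier sets for one indicator.

    scores: dict mapping a country label to its relative score.
    Assumption: mean and population standard deviation are taken over
    the scores passed in, standing in for the EU27.
    """
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    positive = {c for c, s in scores.items() if s > mu + k * sigma}
    negative = {c for c, s in scores.items() if s < mu - k * sigma}
    return positive, negative


# Hypothetical data: twenty countries at 100 and one at 1000.
scores = {f"C{n}": 100.0 for n in range(20)}
scores["X"] = 1000.0
positive, negative = find_outliers(scores)
print(positive, negative)  # {'X'} set()
```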

Step 3: Setting reference years

For each indicator a reference year is identified based on data availability for all core EIS countries, i.e. those countries for which data availability is at least 75%. For most indicators this reference year will be lagging 1 or 2 years behind the year to which the EIS refers. Thus for the EIS 2008 the reference year will be 2006 or 2007 for most indicators (cf. Table 1).

Step 4: Sorting data over time

Reference year data are then used for “2008”, data for the year before the reference year for “2007”, and so on. If data for the latest year are not available, we use the value for the most recent available year. If data for a year in between are not available, we substitute the value for the previous year (except for indicators using CIS data, where we use the average of 2004 and 2006 to impute 2005). If data are not available at the beginning of the time series, we replace missing values with the value for the latest available year, i.e. the nearest year for which data exist. The following examples clarify this step and show how ‘missing’ data are imputed:

Example 1 (latest year missing)

                                  “2008”    “2007”    “2006”    “2005”    “2004”
Available relative to EU score    Missing   150       120       110       105
Use most recent year              150       150       120       110       105

Example 2 (year-in-between missing)

                                  “2008”    “2007”    “2006”    “2005”    “2004”
Available relative to EU score    150       Missing   120       110       105
Substitute with previous year     150       120       120       110       105

Example 3 (beginning-of-period missing)

                                         “2008”    “2007”    “2006”    “2005”    “2004”
Available relative to EU score           150       130       120       Missing   Missing
Substitute with latest available year    150       130       120       120       120

If real data become available for the EIS 2009 or EIS 2010 for any of these ‘missing’ data, then the ‘imputed’ values will be replaced by the real data. This might cause some marginal deviations in the composite index scores between the EIS 2008, 2009 and 2010 reports.
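The three imputation examples above can be reproduced with a short sketch. It assumes each series is stored oldest year first (the reverse of the column order in the examples) with None marking a missing value, and it leaves out the special treatment of CIS-based indicators. The function name is hypothetical.

```python
def impute(series):
    """Fill gaps in a chronologically ordered list (oldest first); None = missing."""
    filled = list(series)
    # A missing year takes the value of the previous year. This covers both
    # a missing year in between and a missing latest year.
    for i in range(1, len(filled)):
        if filled[i] is None:
            filled[i] = filled[i - 1]
    # Any gaps left are at the beginning of the series; they take the value
    # of the first year for which data are available.
    first = next((v for v in filled if v is not None), None)
    return [first if v is None else v for v in filled]


# Values for "2004" ... "2008", taken from the three examples above.
print(impute([105, 110, 120, 150, None]))   # [105, 110, 120, 150, 150]
print(impute([105, 110, 120, None, 150]))   # [105, 110, 120, 120, 150]
print(impute([None, None, 120, 130, 150]))  # [120, 120, 120, 130, 150]
```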

Step 5: Extrapolating data

For all indicators and countries we extrapolate data for 2009 and 2010 by assuming the same percentage change as that observed between “2007” and “2008”; for all fractional indicators, extrapolated values can never be above 100. The rationale for this extrapolation is to allow for indicator values that rise above the maximum or fall below the minimum values found within the observed 5-year time period. This way we can fix the Maximum and Minimum scores (cf. Step 6) for the EIS 2009 and EIS 2010 to ensure full comparability of SII scores between the EIS 2008 report and future EIS reports.
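A minimal sketch of this extrapolation is given below. It assumes that the percentage change is compounded over the two extrapolated years and that the “2007” value is non-zero; neither point is spelled out in the text. The function name and the sample values are hypothetical.

```python
def extrapolate(v2007, v2008, years=2, fractional=False):
    """Project values beyond "2008", assuming the 2007-to-2008 ratio persists.

    Assumptions: v2007 is non-zero and the change is compounded each year.
    For fractional indicators the projected value is capped at 100.
    """
    ratio = v2008 / v2007
    projected = []
    value = v2008
    for _ in range(years):
        value = value * ratio
        if fractional:
            value = min(value, 100.0)
        projected.append(value)
    return projected


# Hypothetical values, for illustration only.
print(extrapolate(40.0, 50.0))                   # [62.5, 78.125]
print(extrapolate(60.0, 90.0, fractional=True))  # [100.0, 100.0]
```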

Step 6: Determining Maximum and Minimum scores

The Maximum score is the highest relative score found for the whole time period (including the two extrapolated years) within the group of core EIS countries (i.e. those countries for which data availability is at least 75%), excluding positive outliers and ‘small’ countries with populations of 1 million or less (i.e. Cyprus, Iceland, Luxembourg and Malta). These small countries are excluded because 1) they are responsible for some of the observed outliers (cf. Step 2) and 2) due to their small size they cannot be taken as representative of most of the other (larger) countries. Similarly, the Minimum score is the lowest relative score found for the whole time period within the group of core EIS countries, excluding negative outliers and ‘small’ countries.
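The selection of the Maximum and Minimum scores for a single indicator could be sketched as follows. The data layout (a dictionary keyed by country and year), the two-letter country codes and all sample values are assumptions made for this illustration.

```python
# Two-letter codes for Cyprus, Iceland, Luxembourg and Malta (assumed labels).
SMALL_COUNTRIES = {"CY", "IS", "LU", "MT"}


def max_min_scores(scores, core_countries,
                   positive_outliers=frozenset(),
                   negative_outliers=frozenset()):
    """Return (Maximum, Minimum) for one indicator.

    scores: dict mapping (country, year) to a relative score, covering all
    years including the extrapolated ones. Outlier sets hold (country, year).
    """
    eligible = {
        key: score for key, score in scores.items()
        if key[0] in core_countries and key[0] not in SMALL_COUNTRIES
    }
    maximum = max(s for k, s in eligible.items() if k not in positive_outliers)
    minimum = min(s for k, s in eligible.items() if k not in negative_outliers)
    return maximum, minimum


# Hypothetical scores: SE 2008 is flagged as a positive outlier, LU is a
# small country and TR is not in the core group.
scores = {
    ("DE", 2008): 130, ("DE", 2010): 140, ("SE", 2008): 400,
    ("LU", 2008): 500, ("BG", 2008): 40, ("TR", 2008): 10,
}
core = {"DE", "SE", "LU", "BG"}
print(max_min_scores(scores, core, positive_outliers={("SE", 2008)}))  # (140, 40)
```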

Step 7: Calculating re-scaled scores

Re-scaled scores of the relative scores for all years are calculated by first subtracting the Minimum score and then dividing by the difference between the Maximum and Minimum score. The maximum re-scaled score is thus equal to 1 and the minimum re-scaled score is equal to 0. For positive and negative outliers and small countries where the value of the relative score is above the Maximum score or below the Minimum score, the re-scaled score is set equal to 1 or 0, respectively.
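The re-scaling amounts to a min-max normalisation with clipping, which can be sketched as below. The function name and the sample values (a Minimum of 40 and a Maximum of 140) are hypothetical.

```python
def rescale(score, minimum, maximum):
    """Min-max re-scaling of a relative score, clipped to the range [0, 1]."""
    scaled = (score - minimum) / (maximum - minimum)
    # Scores above the Maximum or below the Minimum (outliers and small
    # countries) are set to 1 and 0 respectively.
    return min(1.0, max(0.0, scaled))


print(rescale(90, 40, 140))   # 0.5
print(rescale(500, 40, 140))  # 1.0
print(rescale(10, 40, 140))   # 0.0
```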

Step 8: Calculating composite innovation indexes

For each year and for each innovation dimension (Human resources, Finance and support, Firm investments, Linkages & entrepreneurship, Throughputs, Innovators, Economic effects) a dimension composite innovation index (DCII) is calculated as the unweighted average of the re-scaled scores for all indicators within the respective dimension.

For each year and for each block of dimensions (Enablers, Firm activities, Outputs) a block composite innovation index (BCII) is calculated as the unweighted average of the re-scaled scores for all indicators within the respective block.

For each year the Summary Innovation Index (SII) is calculated as the unweighted average of the re-scaled scores for all indicators. The SII will only be calculated if data are available for at least 70% of the indicators.
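The aggregation can be sketched as an unweighted mean over the available re-scaled scores. The text does not say how a missing indicator is treated within a dimension or block; this sketch assumes it is simply skipped. The indicator labels and values are hypothetical.

```python
from statistics import mean


def composite_index(rescaled, indicators):
    """Unweighted mean of the available re-scaled scores among `indicators`.

    Assumption: indicators with missing data (None) are skipped.
    """
    values = [rescaled[i] for i in indicators if rescaled.get(i) is not None]
    return mean(values) if values else None


def summary_innovation_index(rescaled, all_indicators, min_share=0.70):
    """SII over all indicators, or None if fewer than 70% have data."""
    available = [i for i in all_indicators if rescaled.get(i) is not None]
    if len(available) < min_share * len(all_indicators):
        return None
    return mean([rescaled[i] for i in available])


# Hypothetical re-scaled scores for one country and year.
rescaled = {"A": 0.25, "B": 0.75, "C": 0.5, "D": None}
print(composite_index(rescaled, ["A", "B"]))                      # 0.5
print(summary_innovation_index(rescaled, ["A", "B", "C", "D"]))   # 0.5
print(summary_innovation_index({"A": 0.25, "B": 0.75}, ["A", "B", "C", "D"]))  # None
```

The same `composite_index` function serves for both the DCII and the BCII; only the list of indicators passed in differs.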

 

8.2 Calculating growth rates

As an input to the EIS workshop in June 2008, the Joint Research Centre prepared a report presenting possible alternatives to calculating growth rates [2]. For the calculation of the average annual growth rate in innovation performance we have adopted a generalized approach:

Step 1:

We first define growth for each country c per indicator i as $y_{ic}^{t} / y_{ic}^{t-1}$, i.e. as the ratio between the non-normalised values for year t and year t-1. In order to minimize the effect of growth outliers on the overall growth rate, these ratios are restricted to a maximum of 2 (such that growth in an individual indicator is limited to 100%) and a minimum of 0.5 (such that a decrease in an individual indicator is limited to -50%).

Step 2:

We aggregate these indicator growth rates between year t and year t-1 using a geometric average [3] to calculate the average yearly growth rate $\tau_c^{t}$:

$$1 + \tau_c^{t} = \prod_{i \in I} \left( \frac{y_{ic}^{t}}{y_{ic}^{t-1}} \right)^{w_i}$$

where I is the set of EIS innovation indicators used for calculating growth rates and where all indicators receive the same weight $w_i$ (i.e. 1/27 if data for all 27 indicators are available) [4].
The average yearly growth rate $\tau_c^{t}$ is invariant to any ratio-scale transformation and indicates how much the overall set of indicators has progressed with respect to the reference year t-1.

Step 3:

We then calculate for each country c the average annual growth rate in innovation performance as the geometric average of all yearly growth rates:

$$1 + \mathit{InnovationGrowthRate}_c = \prod_{t \in T} \left( 1 + \tau_c^{t} \right)^{w_t}$$

where $T$ is the set of years for which a yearly growth rate is calculated and each average yearly growth rate receives the same weight $w_t$.
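The three steps can be sketched in Python as follows. The sketch assumes strictly positive indicator values, equal weights over the indicators available in both years, and that a growth rate is obtained by subtracting 1 from the geometric average of the ratios. Function names and sample values are hypothetical.

```python
from statistics import geometric_mean


def capped_ratio(current, previous, lower=0.5, upper=2.0):
    """Step 1: ratio of the values for year t and t-1, restricted to [0.5, 2]."""
    return min(upper, max(lower, current / previous))


def yearly_growth_factor(values_t, values_prev):
    """Step 2: equal-weight geometric average of the capped indicator ratios.

    values_t, values_prev: dicts mapping an indicator label to its
    non-normalised value; only indicators present in both years are used.
    """
    common = [i for i in values_t if i in values_prev]
    return geometric_mean([capped_ratio(values_t[i], values_prev[i]) for i in common])


def average_annual_growth(yearly_factors):
    """Step 3: geometric average of the yearly growth factors, minus 1."""
    return geometric_mean(yearly_factors) - 1.0


# Hypothetical data: indicator A quadruples (capped at 2), B halves (0.5),
# so the yearly growth factor is sqrt(2 * 0.5), i.e. about 1.
factor = yearly_growth_factor({"A": 40.0, "B": 4.0}, {"A": 10.0, "B": 8.0})
print(round(factor, 6))
# Two yearly factors of 1.21 and 1.0 give an average annual growth of about 10%.
print(round(average_annual_growth([1.21, 1.0]), 6))
```

Capping the ratios before averaging keeps a single volatile indicator from dominating the geometric average.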

The average annual growth rate in innovation performance is different from that used in the EIS 2007 report as it does not measure the change in the SII but the average change in the innovation indicators (27 of the 29 indicators, cf. footnote [4]).


[1] This approach follows the well-adopted Chauvenet's Criterion in statistical theory, but we use a range of 3 standard deviations around the mean instead of the usual range of 2 standard deviations.

[2] Tarantola, S., (2008), “European Innovation Scoreboard: strategies to measure country progress over time”, Joint Research Centre, mimeo.

[3] A geometric mean is an average of a set of data that is different from the arithmetic average. The geometric mean of two data points X and Y is the square root of (X*Y), the geometric mean of X, Y and Z is the cube root of (X*Y*Z), and so forth.

[4] It should be noted that the following two indicators are not included in the calculation of growth rates as data are missing for too many countries: Share of SMEs introducing marketing or organisational innovations and Resource efficiency innovators.