Figure 3. Scatter plot of the children's ratings. Each point represents the two ratings given for one child. For the parent-teacher rating subgroup, parent ratings are on the x-axis and teacher ratings on the y-axis; for the parental rating subgroup, paternal ratings are on the x-axis and maternal ratings on the y-axis. Ratings for bilingual children are shown as grey dots, those for monolingual children as black dots. The dotted lines enclose ratings that are statistically identical when calculated on the basis of the test-retest reliability reported in the test manual (difference of 3 T-points; 23 of 53 rating pairs). The solid lines enclose ratings that are statistically identical when calculated on the basis of the inter-rater reliability (ICC) found in our study (difference of less than 12 T-points).

While the correlation analyses used (most often Pearson correlations) can determine the strength of the relationship between two groups of values, they do not measure the agreement between raters at all (Bland and Altman, 2003; Kottner et al., 2011). Nevertheless, claims about agreement between raters are often derived from correlation analyses (e.g.,
Bishop and Baird, 2001; Janus, 2001; Van Noord and Prevatt, 2002; Norbury et al., 2004; Bishop et al., 2006; Massa et al., 2008; Gudmundsson and Gretarsson, 2009). The flaw in such conclusions is easy to see: a perfect linear correlation can be obtained if one group of raters systematically differs (by a nearly constant amount) from another, even though not a single pair of ratings agrees exactly. Agreement, in contrast, is reached only if the points lie on the line (or within an area) of equality of the two ratings (Bland and Altman, 1986; Liao et al., 2010). Analyses based exclusively on correlations therefore do not provide a measure of agreement between raters, nor are they sufficient for an accurate assessment of inter-rater reliability. As pointed out by Stemler (2004), reliability is not a single, unitary concept and cannot be captured by correlations alone. A main intention of this report is to show how three concepts complement each other in assessing the concordance of ratings: inter-rater reliability, expressed here as intra-class correlation coefficients (ICC; see Liao et al., 2010; Kottner et al., 2011), agreement (sometimes also called consensus; see e.g. Stemler, 2004), and correlation (here: Pearson correlations). . . .
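
The difference between the three concepts can be illustrated with a small numerical sketch. The T-scores below are invented for illustration and are not data from the study; the function names are hypothetical. The ICC variant is also an assumption: the code uses the two-way random-effects, absolute-agreement, single-rater form (often written ICC(2,1)), which may differ from the form used in the report. Agreement is counted here as an absolute difference strictly below a tolerance, which is one possible reading of the thresholds in Figure 3.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of ratings."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def agreement_rate(x, y, tolerance):
    """Share of rating pairs whose absolute difference is below `tolerance`."""
    hits = sum(1 for a, b in zip(x, y) if abs(a - b) < tolerance)
    return hits / len(x)


def icc_absolute_agreement(x, y):
    """ICC(2,1) for two raters, from two-way ANOVA mean squares (assumed variant)."""
    n, k = len(x), 2  # n children, k raters
    grand = (sum(x) + sum(y)) / (n * k)
    row_means = [(a + b) / 2 for a, b in zip(x, y)]  # one mean per child
    col_means = [sum(x) / n, sum(y) / n]             # one mean per rater
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for v in list(x) + list(y))
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Invented T-scores: rater B is always exactly 15 points above rater A.
rater_a = [40, 45, 50, 55, 60]
rater_b = [score + 15 for score in rater_a]

print("Pearson r:", pearson_r(rater_a, rater_b))
print("Agreement within 3 T-points:", agreement_rate(rater_a, rater_b, 3))
print("Agreement within 12 T-points:", agreement_rate(rater_a, rater_b, 12))
print("ICC (absolute agreement):", icc_absolute_agreement(rater_a, rater_b))
```

With this constant offset the correlation is perfect, yet no pair of ratings agrees under either tolerance, and the absolute-agreement ICC is clearly below 1 (about 0.36 for these invented values). This is the situation described in the text: a systematic difference between rater groups is invisible to a correlation but not to agreement-based measures.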