
Strengths and Weaknesses of Two Empathy Measures: A Comparison of the Measurement Precision, Construct Validity, and Incremental Validity of Two Multidimensional Indices

Thomas H Costello

2018, Assessment

https://doi.org/10.1177/1073191118777636

Last updated October 09, 2025

15 pages


Article

Assessment, 1–15. © The Author(s) 2018. Reprints and permissions: sagepub.com/journalsPermissions.nav. DOI: 10.1177/1073191118777636. journals.sagepub.com/home/asm

Brett A. Murphy¹, Thomas H. Costello¹, Ashley L. Watts¹, Yuk Fai Cheong¹, Joanna M. Berg¹, and Scott O. Lilienfeld¹

Abstract

The quality of empathy research, and clinical assessment, hinges on the validity and proper interpretation of the measures used to assess the construct. This study investigates, in an online sample of 401 adult community participants, the construct validity of the Affective and Cognitive Measure of Empathy (ACME) relative to that of the Interpersonal Reactivity Index (IRI), the most widely used multidimensional empathy research measure. We investigated the factor structures of both measures, as well as their measurement precision across varying trait levels. We also examined them both in relation to convergent and discriminant criteria, including broadband personality dimensions, general emotionality, personality disorder features, and interpersonal malignancy. Our findings suggest that the ACME possesses incremental validity beyond the IRI for most constructs related to interpersonal malignancy. Our results further indicate that the IRI Personal Distress scale is severely deficient in construct validity, raising serious concerns regarding past findings that have included it when computing total empathy scores. Finally, our results indicate that both questionnaires display poor measurement precision at high trait levels, emphasizing the need for future researchers to develop indices that can reliably measure high levels of empathy.
Keywords: empathy, construct validity, incremental validity, psychopathy, personality

¹ Emory University, Atlanta, GA, USA

Corresponding Author: Brett Murphy, Department of Psychology, Emory University, 36 Eagle Row, Room 270, Decatur, GA 30322, USA. Email: [email protected]

In the modern world of the Internet, cable television, and social media, we can now witness others’ suffering on a global scale. Perhaps in part as a consequence, empathy has become an increasingly popular topic in the public eye. It is difficult to escape mentions of empathy in political speeches, motivational lectures, religious sermons, and popular psychology articles, among others. Furthermore, empathy has long been a pivotal concept in clinical psychology theory and practice, at least from the time of Carl Rogers (1958). There is substantial debate, however, among researchers and theorists regarding the definition of empathy, whether and how to parse it into subcomponents, and how best to measure it in research and clinical practice. The differences of opinion regarding these definitional and measurement issues have recently spilled over into debates about whether empathy is psychologically beneficial (Baron-Cohen, 2016) or, instead, is actually harmful (Bloom, 2016, 2017).

To provide valuable information relevant to these questions, we aimed to compare the evidence of construct validity of two self-report measures of empathy, one widely used and the other a promising new questionnaire, namely, the Interpersonal Reactivity Index (IRI; Davis, 1983) and the Affective and Cognitive Measure of Empathy (ACME; Vachon & Lynam, 2016), respectively. To do so, we first used confirmatory factor analysis (CFA) and exploratory structural equation modeling (ESEM) methods to examine the tenability of the factor structures of each measure as they are typically adopted in the literature. Second, we examined the measurement precision of these measures using item response theory (IRT) techniques, to compare their respective abilities to reliably detect empathic traits at low and high levels of their latent traits.

Third, we examined these two measures’ relations with a broad swath of theoretically relevant external criteria, including general and potentially maladaptive personality traits, with a focus on those associated with interpersonal malignancy (e.g., coldheartedness, meanness). Moreover, we examined these measures’ relations to broader emotionality, particularly negative emotionality (NE), a pervasive dimension of distress and emotional maladjustment that courses through most indices of psychopathology (Watson & Clark, 1984). Fourth and finally, in addition to elucidating each empathy measure’s nomological networks, we compared the two empathy measures’ incremental contributions above and beyond one another in statistically predicting traits highly conceptually associated with interpersonal malignancy.

The Heterogeneity of Empathy

Theorists have posited a wide range of subdimensions (or subtypes) within the empathy construct, including but not limited to effortful perspective-taking, emotion recognition, affective contagion, prosocial motivation, distinctions between self and other, compassion, and aversion to harming others. The closest approximation to a contemporary research vernacular is the separation of empathy into “affective” and “cognitive” components (e.g., Davis, 1983; Shamay-Tsoory, Aharon-Peretz, & Perry, 2009). In this familiar distinction, affective empathy comprises emotional resonance with or caring for the feelings of others. Cognitive empathy, in contrast, comprises the capacity to understand and predict the thoughts and feelings of others.

As noted briefly earlier, the Interpersonal Reactivity Index (IRI; Davis, 1983) is the most widely employed multidimensional research questionnaire of empathy, cited nearly 6,000 times at the time of this writing (according to Google Scholar). Aside from the IRI, only four other multidimensional empathy measures intending to encompass cognitive and affective empathy have received significant construct validation: the Empathy Quotient (EQ; Baron-Cohen & Wheelwright, 2004), typically analyzed only as a unidimensional global empathy scale; the Basic Empathy Scale (BES; Jolliffe & Farrington, 2006); the Questionnaire of Cognitive and Affective Empathy (QCAE; Reniers, Corcoran, Drake, Shryane, & Völlm, 2011); and, most recently, the ACME. In particular, the ACME attempts to expand the content domain of empathy. Provisional evidence indicates it may be superior to its predecessors (e.g., IRI, BES) in construct validity (Vachon & Lynam, 2016).

The aforementioned measures differ substantially in their conceptualizations of empathy and content coverage. For instance, the BES and the QCAE differ from the other three measures in that they aim to exclude compassion, empathic concern, and other ostensibly prosocial emotional responses from the “affective empathy” construct. This exclusion reflects a major conceptual disagreement regarding the construct of empathy. Some theorists have argued that both the empathizing perceiver and the target individual must be in the same emotional state for affective empathy to be present (e.g., Bird & Viding, 2014; Bloom, 2016). More specifically, a number of researchers (cf. Decety & Michalska, 2010) contend that we are “sympathizing” if we feel compassion for someone who is feeling depressed, whereas we are “empathizing” if we, too, feel depressed. Defined narrowly as feeling the same emotion that another is feeling, empathy (a) is likely to be less beneficial than is other-oriented concern (Bloom, 2017), (b) would seem to relate conceptually to heightened negative affect and/or emotional dysregulation, or (c) both.

Other scholars, though, have argued that the perceiver’s emotional resonance needs only to be interpersonally “appropriate” (Baron-Cohen, 2016; Baron-Cohen & Wheelwright, 2004) to constitute empathy. This alternative conceptualization does not require that one’s emotional state be isomorphic with the emotional state of the other. The IRI, ACME, and EQ follow this broader conceptual perspective, including caring emotional response rather than only isomorphic emotional contagion. This latter definition involves a different perspective on the meaning of emotional resonance, which can be rooted in evolved social functionality (cf., Keltner & Haidt, 1999; Keltner & Kring, 1998). At the same time, though, it may beg the question of what constitutes an interpersonally “appropriate” emotional response.

The IRI Scales

As we have alluded, the heterogeneity of the conceptualization of empathy is reflected both across and within measures. For instance, the IRI contains four scales: Empathic Concern (EC), Perspective Taking (PT), Fantasy (FN), and Personal Distress (PD), each of which aims to reflect differing subdimensions of the broader empathy construct. The EC scale is intended to assess other-oriented feelings of sympathy and concern for others (Davis, 1983), and is largely considered an index of affective empathy. The PT scale is intended to index readiness to adopt another person’s perspective, but not necessarily the accuracy of such perspective-taking. Although frequently relied on as a measure of cognitive empathy, this practice is suspect given that the PT scale items’ content appears to comprise empathic motivation and agreeableness (Jolliffe & Farrington, 2006) and the scale does not consistently relate to performance on emotion recognition tasks (e.g., Spreng, McKinnon, Mar, & Levine, 2009; Vachon & Lynam, 2016).

In contrast to the EC and PT scales, which are intended to map onto affective and cognitive empathy, respectively, the FN scale aims to measure tendencies to become imaginatively absorbed in the feelings and actions of characters in books and movies (Davis, 1983), and appears to capture a trait closely allied to Tellegen’s absorption construct (Tellegen & Atkinson, 1974), with which it correlates substantially (Wickramasekera & Szlyk, 2003). The PD scale aims to measure “‘self-oriented’ feelings of personal anxiety and unease in tense interpersonal settings” (Davis, 1983, p. 114). The PD scale correlates highly with trait negative emotionality (Hawk et al., 2013) and most of the items do not reference the presence of other people but refer generally to one’s ability to function in “emergencies” and tense situations.

Researchers have adopted a variety of approaches to using the IRI and its scales in research. Because the FN and PD scale contents are not as typically associated with widespread conceptualizations of empathy, many studies analyze only the EC and PT scales (as consistent with the test designer’s recommendations; Hatcher et al., 1994; Jolliffe & Farrington, 2006). Nevertheless, many researchers sum scores across all four subscales to create a “total score” (e.g., Decety, Lewis, & Cowell, 2015; Domes, Hollerbach, Vohs, Mokros, & Habermeyer, 2013), with some reporting only this total score and omitting data for the individual scales (e.g., Gill & Stickle, 2016; Wood, James, & Ciardha, 2014). Others have summed the EC and PD scales to create an aggregate measure of “affective empathy” and summed the PT and FN scales to create an aggregate measure of “cognitive empathy” (e.g., Bock & Hosser, 2014; Gabay, Shamay-Tsoory, & Goldfarb, 2016), a technique that has been criticized (Chrysikou & Thompson, 2016).

In view of the research we have reviewed, many published findings based on the aggregation of various IRI scales may be misleading. In this regard, the contribution of the PD scale is particularly suspect. Multiple studies have found that the PD scale correlates weakly with other empathy scales (e.g., Chrysikou & Thompson, 2016) and does not load on a higher order empathy factor (Chrysikou & Thompson, 2016; Hawk et al., 2013; Pulos, Elison, & Lennon, 2004). Although PD’s low correlations with other empathy scales do not necessarily demonstrate that it is irrelevant to the construct, relying upon a total (or global) empathy score treats constituent scales as broadly equivalent indicators of the latent empathy construct and may conceal considerable differences in terms of the scales’ relations with external criteria (e.g., Hengartner et al., 2014). Moreover, even though the PD scale tends not to be significantly associated with the EC and PT scales, Jordan, Amir, and Bloom (2016) found that it was substantially associated with measures of emotional and behavioral contagion, suggesting that it might reflect an isomorphic emotion-matching definition of empathy. Further investigation of the functioning of the PD scale is necessary to evaluate its construct validity as an index of empathy.

Although the 4-scale structure of the IRI has been almost exclusively used in past research, and has been validated in a number of studies (e.g., Hawk et al., 2013; Pulos et al., 2004), some investigations suggest that the items coalesce into a three-factor structure: a factor composed of the EC and PT items together, a factor composed of the FN items, and a factor composed of the PD items (e.g., Alterman, McDermott, Cacciola, & Rutherford, 2003; Siu & Shek, 2005). The lack of distinguishability of the EC and PT scales in past studies may further indicate that the PT scale should not be relied on as a measure of cognitive empathy; additional factorial investigation would be particularly valuable in this regard.

The ACME Scales

Vachon, Lynam, and Johnson (2014) observed meta-analytically that the IRI and other measures of empathy demonstrate very weak negative correlations with aggression, even though these two constructs have long been conceptualized as strongly related. They concluded that this finding suggests weak validity for the IRI and other empathy measures. Partly in response to these concerns, Vachon and Lynam (2016) developed the ACME, which contains three scales, as an attempt to fashion a measure that relates strongly to interpersonal malignancy. The Cognitive Empathy (CE) scale aims to measure the “ability to detect and understand emotional displays” (p. 136). The Affective Resonance (AR) scale aims to measure “emotional response in the observer that is congruent in valence to the target” (p. 136). The Affective Dissonance (AD) scale aims to measure “the experience of a contradictory emotional response—for example, taking pleasure in others’ pain or feeling annoyed with others’ happiness” (p. 136); it is reverse-scored so that higher scores indicate less affective dissonance.

The psychometric qualities of the ACME appear promising, especially its incremental validity above and beyond other empathy measures in statistically predicting a wide range of potentially maladaptive personality variables, such as aggression and externalizing behaviors (Vachon & Lynam, 2016). Vachon and Lynam did not, however, present their incremental validity results at the subscale level, which would provide helpful information regarding the validity of the different elements comprising the ACME.

The CE scale is composed primarily of items related to self-reported emotion recognition ability (e.g., “I have a hard time reading people’s emotions”), although it includes a few items that may expand the construct by assessing respondents’ inferences regarding the causes of others’ behavior (e.g., “I can usually guess what’s making someone angry”). Nevertheless, Vachon and Lynam (2016) reported that this scale did not correlate significantly with emotion recognition task performance, although the affective empathy scales of the ACME did. They similarly reported that neither the IRI PT scale nor the cognitive empathy scale of the BES (which also contains clear emotion recognition item content) were significantly predictive of empathic accuracy in these tasks. This lack of construct validity may not be a flaw in any particular item pool, but a general problem for any self-report questionnaire measure of emotion recognition abilities, given that, as multiple authors have pointed out, self-estimation of people’s mind-reading abilities does not appear to consistently track their actual abilities (e.g., Ickes, Stinson, Bissonnette, & Garcia, 1990; Realo et al., 2003). Though the evidence does not compellingly support the CE scale’s validity as an indicator of individuals’ cognitive empathic abilities, it may indirectly measure other aspects of empathic functioning.

The ACME’s AR scale appears to represent a broader domain of affective empathy than the IRI EC scale, the latter of which relates predominantly to sympathy for others who are suffering. The AR scale includes a number of items similar to those of the IRI EC scale (e.g., “It makes me feel good to help someone in need”), but unlike the latter, it also includes items related to the positive feelings of others (e.g., “I enjoy making others happy”). In further contrast to the IRI, the AR scale also includes numerous items related to empathic restraint (e.g., “If I see that I am doing something that hurts someone, I will quickly stop”). Given that empathy is theorized to relate to both the inhibition of harming behaviors as well as the activation of caring behaviors, one might expect the AR scale to relate more strongly to interpersonal malignancy than the IRI EC scale. Nevertheless, Vachon and Lynam (2016) observed that the AR scale manifested nearly equivalent (negative) relations with aggression, psychopathy, and other externalizing variables as did the IRI EC scale. This surprising finding warrants further investigation.

The ACME AD scale constitutes a radical departure from prior indices of affective empathy. The items appear to relate to a cruel, resentful, misanthropic, antisocial disposition (e.g., “I get a kick out of making other people feel stupid”). Whereas empathy has generally been conceptualized on a dimension stretching from apathy to high empathy, this scale expands the dimension so that it runs from high maliciousness to high empathy, with apathy ostensibly falling in the middle of this expanded continuum (Vachon & Lynam, 2016). Perhaps unsurprisingly given its conceptual overlap with antagonism, Vachon and Lynam found that this scale bore significantly stronger negative correlations with aggression, psychopathy, and Machiavellianism than did the other empathy scales they examined. Furthermore, they found that the AD scale bore substantial negative relations with negative affect, emotion dysregulation, anxiety, depression, and anger, whereas other affective empathy scales did not.

As pointed out by Vachon and Lynam (2016), one potential objection to empathy measures is that they might be little more than measures of generalized negative emotionality or emotional lability. If empathy is defined as only emotion matching, such as a mother crying when her baby is crying, then it would probably be positively associated with these constructs. If empathy is defined as socially functional perspective-taking and caring, such as a mother calmly comforting her crying baby, then it might actually be associated with reduced negative emotionality (cf. Strathearn, Fonagy, Amico, & Montague, 2009), perhaps especially egocentricity-generating emotions of anxiety (cf. Todd, Forstmann, Burgmer, Brooks, & Galinsky, 2015). In either case, if the IRI or ACME scales demonstrate particularly strong correlations with negative emotionality, it could raise questions concerning their discriminant validity; the extant research raises such questions regarding the IRI PD scale and ACME AD scales.

The factorial validity of the ACME scales has not yet been extensively explored, but the initial investigations by Vachon and Lynam (2016) raise potential concerns regarding the role of reverse-worded (RW) items in the factor structure. RW items generally either employ a negating term (e.g., “I am not compassionate toward the feelings of others”) or a conceptually opposite term (e.g., “I am coldhearted toward the feelings of others”). Such items may in certain cases generate spurious factors (“artifactors”) corresponding largely to the direction of item-keying.

The exploratory factor analysis (EFA) and CFA by Vachon and Lynam (2016) indicated, in two samples, that the factor structure of the ACME items was complicated by the presence of RW items, which comprise a majority of the items. The EFA they conducted indicated minor factors based on wording method, which they elected to treat as nonsubstantive. The CFA they conducted demonstrated inadequate fit for the three-factor structure, although model fit improved as a consequence of including two uncorrelated wording method factors. These wording method issues warrant further investigation.

Present Study

This article reports the first published validation study of the ACME scales in a sample not composed of undergraduate students (Vachon & Lynam, 2016) and also aims to provide a more comprehensive comparison of the ACME and IRI than in previous research. First, we used CFAs to test the replicability of the factor structure of the ACME scales presented by Vachon and Lynam (2016) as well as the factor structure of the IRI presented by Davis (1983), both of which we predicted would replicate in our sample. We complemented these confirmatory analyses with ESEM (Asparouhov & Muthén, 2009; Marsh, Morin, Parker, & Kaur, 2014) to exploratorily examine the factor structure of both indices.

Second, we used multidimensional IRT models to examine the measurement precision of the ACME and IRI scales at varying trait levels. IRT is a psychometric modeling paradigm, encompassing a number of specific methods, which evaluates the response properties of individual items (e.g., difficulty, discrimination) in relationship to latent trait or ability dimensions (for a comprehensive review, see De Ayala, 2013). Classical test theory assumes that measurement precision is constant across varying trait levels, whereas IRT techniques test this assumption empirically. Past studies in other fields have used these techniques to compare the measurement precision of competing questionnaires at varying trait levels, frequently observing poor measurement precision at low and/or high levels of various trait dimensions (e.g., Fraley, Waller, & Brennan, 2000; Olino et al., 2013). Given that clinicians and treatment outcome researchers may be interested in assessing substantial empathy deficits and changes following treatment (e.g., Michie & Lindsay, 2012; Palgi, Palgi, Ben-Ezra, & Shrira, 2014), and that researchers may be interested in measuring empathy within presumably lower trait level populations (e.g., prison settings; Young et al., 2015), the measurement precision of empathy scales at various trait levels should be assessed. To our knowledge, no studies have evaluated the constancy of measurement precision in empathy questionnaires; we made no a priori predictions regarding these analyses.

Third, a broad objective of our study was to compare the construct validity of the IRI and the ACME and their incremental validity above and beyond each other. We hypothesized that (a) both the ACME and IRI scales would demonstrate substantial construct validity by correlating negatively with interpersonally malignant personality traits, such as psychopathic meanness, sadism, and sexual objectification tendencies; but that (b) the ACME scales, even when excluding the AD scale, would demonstrate superior incremental validity in predicting a range of interpersonally malignant traits, above and beyond IRI scales; and (c) the IRI PD scale would demonstrate a pattern of correlations inconsistent with the other ACME and IRI scales, and would demonstrate higher correlations with measures of negative emotionality than with convergent and construct validity variables.

Method

Participants were 401 community participants (M_age = 35.5, SD_age = 11.0) recruited via Amazon Mechanical Turk (M-Turk); 46.5% were male and 53.0% were female (0.05% reported other than male or female). White participants comprised 71.8% the sample, whereas African Americans (7.2%), Hispanic or Latino/Latina (6.0%), Asian or Asian American (11.0%), and other race/ethnicities (3.9%) comprised far less of the sample. M-Turk samples have generally been observed to yield psychometrically high-quality data in psychological research, with more diverse population representation than undergraduate student samples (Miller, Crowe, Weiss, Maples-Keller, & Lynam, 2017).

Measures

Empathy

ACME (Vachon & Lynam, 2016) and IRI (Davis, 1983). All three ACME scales demonstrated high internal consistency, as ascertained by Cronbach’s alpha and mean interitem correlation (MIC; ACME Cognitive Empathy, α = .89, MIC = .41; ACME Affective Resonance, α = .90, MIC = .42; ACME Affective Dissonance, α = .96, MIC = .69). The IRI scales demonstrated similarly high internal consistencies (IRI Empathic Concern, α = .88, MIC = .52; IRI Perspective-Taking, α = .86, MIC = .47; IRI Fantasy, α = .82, MIC = .41; IRI Personal Distress, α = .86, MIC = .47).

External Criteria

HEXACO Personality Inventory–60 (Ashton & Lee, 2009). The HEXACO–60 is a 60-item general personality questionnaire containing six factors, each composed of four lower order facets. It is similar to Big 5 personality measures, but also includes Honesty-Humility, which sometimes emerges in factor analyses of personality inventories of cross-cultural samples (Ashton & Lee, 2009; αs ranged in our sample from .77 [Emotionality] to .83 [Extraversion]). The HEXACO Emotionality dimension comprises three lower order dimensions conceptually related to negative emotionality (Fearfulness, Anxiety, and Dependence) as well as a lower order Sentimentality dimension that assesses empathic sensitivity and meaningful emotional attachment to others. Because of the substantial content overlap between empathy and Sentimentality, we analyzed HEXACO Emotionality both with and without the Sentimentality content, hereafter referred to as HEXACO NSE (α = .72).

Multidimensional Personality Questionnaire–Short Form (MPQ-SF; Tellegen, 1982; Harkness, Tellegen, & Waller, 1995). The MPQ-SF is a 33-item self-report measure of general personality assessing 4 higher order dimensions (Positive Emotionality [PE], α = .87; Negative Emotionality [NE], α = .73; Constraint [CON], α = .57; and Absorption, α = .54) consisting of 11 lower order dimensions (PE: Wellbeing, Achievement, Social Potency, and Social Closeness; NE: Stress Reaction, Alienation, and Aggression; CON: Control, Harm Avoidance, and Traditionalism; Absorption is a standalone indicator).

Psychopathic Personality Inventory–Revised (PPI-R; Lilienfeld & Andrews, 1996; Lilienfeld & Widows, 2005). The PPI-R is a 154-item self-report measure of psychopathy intended largely for use with nonoffender samples. It contains 8 lower order subscales that generally coalesce into two higher order factors, Fearless Dominance (α = .92), consisting of Fearlessness, Social Influence, and Stress Immunity, and Self-centered Impulsivity (α = .96), consisting of Blame Externalization, Machiavellian Egocentricity, Carefree Nonplanfulness, and Rebellious Nonconformity. One subscale, Coldheartedness (α = .86), does not load highly onto these higher order factors and is a standalone index of callousness and lack of sentimentality.

Triarchic Psychopathy Measure (TriPM; Patrick, 2010). The TriPM is a 58-item measure of psychopathic traits that yields 3 factors: Boldness (α = .87), which is conceptually similar to PPI-R Fearless Dominance; Disinhibition (α = .91), which is conceptually similar to PPI-R Self-Centered Impulsivity; and Meanness (α = .94), which is similar to PPI-R Coldheartedness but encompasses more content relevant to hostility and antagonism.

Inventory of Callous-Unemotional Traits (ICU; Frick, 2004). The ICU is a 24-item self-report measure of callous-unemotional traits, which are conceptually related to the affective deficits of psychopathy (Frick, 2004). It is composed of three factors (Essau, Sasagawa, & Frick, 2006): Callous (α = .89); Uncaring (α = .83); and Unemotional (α = .80).

Narcissistic Personality Inventory (NPI; Raskin & Terry, 1988). The NPI is the most widely used self-report measure of narcissistic traits, containing 40 items. It is composed of 3 factors (Ackerman et al., 2011): Leadership/Authority (α = .86); Grandiosity/Exhibitionism (α = .81); and Entitlement/Exploitativeness (α = .68).

Psychological Entitlement Scale (PES; Campbell, Bonacci, Shelton, Exline, & Bushman, 2004). The PES, composed of 8 items that yield a total score (α = .86), is intended to measure a “stable and pervasive sense that one deserves more and is entitled to more than others” (Campbell et al., 2004, p. 31), theoretically central to narcissism (e.g., Krizan & Herlache, 2017).

Personality Inventory for DSM-5–Brief Form (PID-5 BF; Krueger, Derringer, Markon, Watson, & Skodol, 2012). The PID-5 BF is a 25-item self-report measure of dimensions in the Diagnostic and Statistical Manual of Mental Disorders, 5th edition (American Psychiatric Association, 2013) alternative (Section 3) model of personality disorders. It yields scores for five dimensions of Negative Affect (α = .85), Detachment (α = .87), Antagonism (α = .89), Disinhibition (α = .89), and Psychoticism (α = .88). The Negative Affect dimension correlates positively with most personality disorder (PD) traits but particularly strongly (r > .7) with borderline PD and dependent PD traits (Thimm, Jordan, & Bach, 2016).

Varieties of Sadistic Tendencies Scale (VAST; Paulhus & Jones, 2014). The VAST is a 13 item self-report measure of sadistic tendencies, containing two subscales: Direct Sadism (α = .82, e.g., “I enjoy hurting people.”) and Vicarious Sadism (α = .78, e.g., “In video games, I like the realistic blood spurts.”).

Interpersonal Sexual Objectification Scale (ISOS; Kozee, Tylka, Augustus-Horvath, & Denchik, 2007). As a secondary index of behaviors and attitudes relevant to interpersonal callousness, we administered an amended version of the ISOS. The original version of the ISOS measures the extent to which individuals report experiencing sexual harassment and objectification. For this study, we amended the 21 items so that they assessed the extent to which the responding individual is the objectifier/harasser, rather than the target, of the objectification. Adapting the measure in this fashion afforded us an indicator of interpersonal malignancy in the sexual realm (α = .96).

Data Analysis

Missing Data. The prevalence of missing data were less than 2% for each item. Little’s (1988) MCAR test was not statistically significant, which justified our performing single imputation using the expectation-maximization (EM) algorithm to impute missing data (Enders, 2001) for all composite scores used in external validity analyses.

Factor Analyses. To replicate the factor structure of the ACME established by Vachon and Lynam (2016), we employed factor analyses using the lavaan package (Rosseel, 2012) in R version 3.4, using the WLSMV estimator, which is most appropriate for ordinal data. Using this same method, Vachon and Lynam (2016) reported that the ACME three-scale structure demonstrated good fit, with three correlated substantive factors and two uncorrelated method factors (positive-coded and reverse-coded items, respectively). We similarly tested the model with (a) only the three trait factors and (b) with the three trait factors plus the two method factors, which were fixed to be uncorrelated with the substantive trait factors and with each other (this multitrait-multimethod factor analytic approach is a justified way of dealing with potential confounding of trait and wording/coding method covariance, e.g., Dimitrov, 2012). To investigate the factor structure of the IRI, we used a similar CFA approach. In addition to exploring the potential impact of RW items on the factor structure, we compared the four-factor structure with the alternative three-factor structure reported in a handful of prior studies (e.g., Siu & Shek, 2005).

In addition to the CFAs, we conducted ESEMs, a recently developed technique that combines features of EFA and CFA, as well as structural equation modeling (Asparouhov & Muthén, 2009; Marsh et al., 2014), in Mplus 7.0 (Muthén & Muthén, 1998-2012). These analyses, described further below, allowed us to examine the factor structure of the two measures in an exploratory manner.

Item Response Theory. To examine the measurement precision of the ACME and IRI scales, in their standard forms, at varying trait levels, we used graded response models (GRM; Samejima, 1969), which are commonly used for IRT purposes when dealing with ordinal polytomous response data (e.g., Likert-type scales). GRM is a two-parameter IRT model (2PL), which examines both the polytomous form of item difficulty for each item, as well as each item’s discrimination value (how reliably it discriminates between respondents higher versus lower on the trait dimension). GRM models generate item information curves, which estimate the measurement precision of an item depending on the level of trait or ability. The individual item information curves for items in a scale are summed to generate a test information function (TIF) curve. In the TIF curve, the conditional standard error of measurement for a given trait/ability value equals the inverse square root of the information level at that trait/ability value. TIFs can be equated with reliability; for instance, a TIF of 10 is equivalent to a marginal reliability of .9, whereas a TIF of 5 is equivalent to that of .8 (Embretson & Reise, 2000).

External Validity. In exploring the nomological networks associated with the ACME and IRI scales, we used the composite sum scores for the individual scales, given that most research employs these standard composite scores. We had .80 power to detect bivariate correlations at or above r = .13 at p = .01 and r = .16 at p = .001 (see Cohen, 1992). Given the inflated Type 1 error risk arising from the large number of tests, we report both levels of significance in the tables.

To compare the value of the ACME scales with those of the IRI scales, we performed hierarchical multiple regression analyses to ascertain the incremental validity of the [...]

[...] AD, and CE) and to themselves demonstrated good fit (CFI = .98, TLI = .98, RMSEA = .05, 95% CI [.05, .06]; χ² = 1079.82, df = 555, p ≤ .001).

With regard to ESEMs of the ACME scales, Horn’s (1965) parallel analysis indicated four dimensions underlying the item pool, but only three factors were associated with eigenvalues above 1 (the Kaiser criterion for factor extraction). The three-factor ESEM solution had an acceptable fit (CFI = .98, TLI = .98, RMSEA = .06, 95% CI [.06, .066], χ² = 1079.82, df = 555, p ≤ .001). Inspection of the estimated geomin-rotated factor matrix indicated substantial loadings of all AD items (all were RW) and RW AR and CE items (>.60) on the first factor. The second factor was mainly represented by both standard and RW CE items, with salient cross-loadings observed with regard to the latter category (>.5). The third factor was characterized by standard AR items, with cross-loadings of both standard CE items and RW AR items. All relevant items had salient loadings on their intended dimension only for the second factor, representing CE. The RW AR and CE items manifested significant cross-loadings on other factors. In sum, the three factors that emerged from the ESEM model did not clearly correspond to the intended dimensions of ACME. The four-factor ESEM solution had slightly improved fit indices (CFI = .99, TLI = .99, RMSEA = .05, 95% CI [.04, .05], χ² = 888.991, df = 492, p ≤ .001). The first two factors were similar to those in the three-factor ESEM.
The third factor ACME scales above and beyond the IRI scales, and vice was again characterized by standard AR items, with weaker versa, in statistically predicting variables in our data set the- but salient (>.3) cross-loadings for RW AR items. The oretically characterized by interpersonal malignancy (TriPM fourth factor had high cross-loadings from standard AR Meanness, PPI-R Coldheartedness, PID-5 Antagonism, NPI items. Entitlement/Exploitativeness, VAST Direct, ICU Callous, In attempts to control for the method effects for both and ISOS Total) given their relevance to empathy deficits. models, we respecified the models by freely estimated We entered either a single ACME scale or a combination of residual covariances between selected RW items; we used ACME scales (e.g., ACME CE and ACME AR) in the first the modification indices to guide our model re-specifica- block, and then an IRI counterpart, either a single scale or tions. We obtained similar results. The cross-loadings with combination of scales, into the second block. We also con- regard to the RW AR and CE items corroborated the CFA ducted these analyses in reverse order, with an IRI scale or findings of Vachon and Lynam (2016) and our own CFA combination of scales in the first block, and ACME counter- analyses, suggesting the likely existence of substantial parts in the second block. wording method factors, either substantive or artifactual, in the factorial structure of the ACME items. A comparison of the CFA and ESEM models indicated that the ESEM results Results did not significantly improve in either fit or interpretability over the CFA model. Factor Structure of the ACME and IRI Scales ACME. Using CFA, a correlated three-factor model, with IRI. 
Turning to the IRI, the traditional four-factor structure AR, AD, and CE subscales, demonstrated inadequate fit (Davis, 1983) did not fit adequately in our sample (CFI = (comparative fit index [CFI] = .90, Tucker-Lewis index .87; TLI = .86; RMSEA = .11, 95% CI [.11, .12]; χ2 = [TLI] = .89, root mean square error of approximation 1899.96, df = 344, p < .001), nor did the alternative three- [RMSEA] = .12, 95% confidence interval [CI; .12, .13]; χ2 factor structure, with PT and EC collapsed into 1 factor = 3898.76, df = 591, p < .001). Replicating the results (CFI = .85; TLI = .84; RMSEA = .12, 95% CI [.11, .12]; χ2 reported by Vachon and Lynam (2016), a five-factor CFA = 2106.38.01, df = 347, p < .001). Similar to our approach model with two method factors (positive and reverse word- with the ACME, we tested a multitrait-multimethod CFA ing) orthogonal to the three substantive factors (again, AR, model with four correlated substantive factors (PT, EC, FN, 8 Assessment 00(0) and PD) and two uncorrelated method (one positive word- level to an average of 0.7 for the trait level above 2.0. In ing and one negative wording). Including the potential sum, the item pools for most of the scales of both measures method factors, this six-factor model demonstrated accept- do not appear to measure empathy with adequate precision able fit (CFI = .95; TLI = .94; RMSEA = .08, 95% CI [.07, with regard to individuals with markedly elevated empathic .08]; χ2 = 950.69, df = 316, p < .001); a multitrait multi- traits. method CFA with 3-substantive factors and 2 method fac- tors had marginal fit (CFI = .92; TLI = .90; RMSEA = .09, Relations Between ACME and IRI Scales 95% CI [.09, .10]; χ2 = 1309.25, df = 319, p < .001). With regard to the ESEMs of the IRI items, Horn’s paral- We next turn to the correlations between the ACME and IRI lel analysis indicated 5 dimensions, but only 4 were associ- scales, which can be found in Table 1. All three ACME ated with an eigenvalue above 1. 
The four-factor ESEM scales, the IRI EC scale, and the IRI PT scales intercorre- solution had a marginal fit (CFI = .93, TLI = .91, RMSEA = lated positively. The IRI FN scale was not significantly cor- .09, 95% CI [.08, .09], χ2 = 1118.347, df = 272, p ≤ .001). related with the ACME AD scale but was positively Inspection of the estimated geomin-rotated factor matrix correlated with all other scales. The IRI PD scale was indicated substantial loadings of all the EC and PT items on weakly negatively correlated with most of the other empa- the first factor, both standard and RW. The second factor thy scales, weakly positively with the IRI FN scale, and was represented by PD items, both standard and RW. The negligibly with the IRI EC scale (ironically, the scale with third factor was characterized by high factor and cross- which it is frequently aggregated to create an “affective loadings from RW EC, FN, PD, and PT items. The fourth empathy” composite, e.g., Gabay et al., 2016). Although the factor was represented by the 7 FN items. The five-factor IRI PT scale is often treated as a “cognitive” empathy vari- ESEM solution had slightly improved fit indices (CFI = .97, able, it demonstrated larger positive correlations with the TLI = .96, RMSEA = .06, 95% CI [.06, .07], χ2 = 652.99, df major affective empathy scales, namely, IRI EC and ACME = 248, p ≤ .001). The first factor had high loadings on the AR, than with the ACME CE scale (smallest Steiger’s EC items; the second, PD; the third, all RW items; the (1980) z = 3.65, p < .001), pointing to problems with its fourth, PT, and finally, the fifth on the FS items. A compari- discriminant validity. son of the IRI CFA and ESEM models showed that the ESEM results did not significantly improve in fit or inter- Relations Between Empathy and External pretability (Marsh et al., 2014). 
Overall, our ESEM analyses pointed to the likelihood of Criteria substantial confounding of trait and method covariance for Relations With Broad-Band Personality Dimensions. Correla- both the IRI and the ACME, much as Vachon and Lynam tions between the ACME (and the IRI) and both the (2016) reported for the ACME. Similar to Vachon and HEXACO and MPQ dimensions are presented in Table 2. Lynam’s findings, though, the standard ACME and IRI The ACME scales and the IRI EC and PT scales were posi- structure models demonstrated acceptable fit once the two tively correlated with all six HEXACO dimensions. The wording method factors were added to the CFA models. pattern of correlations was similar for the ACME scales and Following their lead, and consistent with multitrait-multi- the IRI EC and PT scales, which can be interpreted as addi- method perspectives in scale development (e.g., Dimitrov, tional evidence of the ACME’s convergent validity. The FN 2012), we used the typical factor structures of the two mea- scale demonstrated similar correlations for most of the sures in our subsequent analyses, employing composite HEXACO dimensions but was not significantly correlated scores. This approach allowed us to ascertain the item prop- with Honesty/Humility or Agreeableness, indicating a lack erties and external correlates of these two measures as they of convergent validity with the other empathy scales. The have typically been used in prior studies and in clinical IRI PD scale was more highly correlated with HEXACO practice. Emotionality (r = .62) and HEXACO NSE (r = .59), than it was with the other HEXACO dimensions (all Steiger’s z tests, p < .001). The IRI PD scale was negatively correlated Measurement Precision at Varying Trait Levels with Honesty-Humility, Extraversion, Agreeableness, Con- Supplemental Figures 1 to 7, available with the online ver- scientiousness, and Openness. 
sion of the article, display the test information curves of the All the affective empathy scales, as well as IRI PT, were seven scales, based on the CFA models estimated for ACME significantly and positively correlated with MPQ Constraint, and IRI. Other than the CE and the PD scales to some extent, but ACME CE, IRI FN, and IRI PD were not. Absorption all the empathy scales displayed low precision at higher demonstrated small correlations with all empathy scales, empathy trait levels over 0.5 to 1.5. Their accuracy sharply except for ACME AR, with which it was not significantly decreases as the trait level increases and the standard error of correlated, and IRI FN, with which it demonstrated a measurement rises from an average of 0.25 in the low trait medium positive correlation. As expected, absorption was Murphy et al. 9 Table 1. Correlations Between ACME and IRI Scales. ACME CE ACME AR ACME AD IRI PT IRI EC IRI FN IRI PD ACME CE — ACME AR .47 — ACME AD .29 .74 — IRI PT .43 .58 .39 — IRI EC .44 .77 .52 .66 — IRI FN .33 .31 .09 .35 .42 — IRI PD −.12 −.14 −.24 −.15 −.08 .13 — Note. N = 401. Bolded is p < .001, italicized is p < .01. ACME = Affective and Cognitive Measure of Empathy; CE = Cognitive Empathy; AR = Affective Resonance; AD = Affective Dissonance; IRI = Interpersonal Reactivity Index; PT = Perspective Taking; EC = Empathic Concern; FN = Fantasy, PD = Personal Distress. Table 2. Correlations Between Empathy Measures and General Personality Indices. HEXACO MPQ H E X A C O NSE PE NE CON ABS ACME CE .21 .15 .25 .22 .35 .36 .03 .20 −.21 .07 .18 ACME AR .49 .26 .23 .45 .51 .45 .10 .08 −.45 .24 .06 ACME AD .49 .07 .10 .36 .56 .35 −.05 −.21 −.56 .25 −.20 IRI PT .41 .16 .30 .62 .41 .41 .02 .25 −.40 .23 .18 IRI EC .51 .36 .29 .46 .40 .41 .16 .22 −.36 .24 .13 IRI FN .01 .33 .12 .08 .13 .41 .25 .20 .01 .02 .41 IRI PD −.12 .61 –.42 −.23 −.27 −.17 .59 −.17 .49 .05 .13 Note. N = 401. Bolded indicates p < .001, italicized indicates p < .01. 
ACME = Affective and Cognitive Measure of Empathy; CE = Cognitive Empathy, AR = Affective Resonance, AD = Affective Dissonance; IRI = Interpersonal Reactivity Index; PT = Perspective Taking; EC = Empathic Concern; FN = Fantasy; PD = Personal Distress; HEXACO = HEXACO Personality Inventory; H = Honesty/Humility; E = Emotionality; X = Extraversion; A = Agreeableness; C = Conscientiousness; O = Openness; NSE = Nonsentimental Emotionality (i.e., a modification of HEXACO E such that items from the Sentimentality subscale are elided); MPQ = Multidimensional Personality Questionnaire; PE = Positive Emotionality; NE = Negative Emotionality; CON = Constraint; ABS = Absorption. more strongly correlated with IRI FN than with any other SCI, ICU Callous, PES Entitlement, VAST Direct Sadism, empathy scale (smallest Steiger’s z = 3.33, p < .001). ISOS Total, PID-5 Antagonism, PID-5 Disinhibition, PID-5 Detachment, PID-5 Psychoticism, and PID-5 Negative Relations With Personality Disorders Features or Interpersonal Affect. In contrast, the IRI EC scale only demonstrated a Malignancy. All correlations with the range of convergent significantly stronger relationship than the ACME AR scale and discriminant personality correlates are presented in with PPI-R CH and ICU Unemotional. Table 3. The scales most clearly related to affective empa- The pseudo-cognitive scales (IRI PT and ACME CE) thy (ACME AR, ACME AD, and IRI EC) all correlated displayed similar nomological networks as did the affective negatively and robustly with all interpersonal malignancy empathy scales, although the correlational associations variables, psychological entitlement and also all five per- were generally not as pronounced as those demonstrated by sonality dimensions in the PID-5. These primary affective the ACME AR and AD scales. 
This pattern of convergent empathy scales were not robustly associated with psychop- validity partially supports their relationships to the empathy athy or narcissism variables primarily measuring confi- construct, in some fashion, even if neither should be used as dence/social potency (PPI-R FD, TriPM Boldness, and NPI a proxy for actual cognitive empathy ability. Leadership/Authority). Though ACME AD generally dem- The IRI FN scale was, in general, only weakly nega- onstrated the highest correlations with interpersonal malig- tively associated with interpersonal malignancy and person- nancy and PD features, the ACME AR scale was also ality disorder feature variables. The IRI PD scale was typically more strongly correlated with these variables than generally positively associated with interpersonal malig- the IRI EC scale. The Steiger’s z comparing ACME AR to nancy and personality disorder features, but it was robustly the next strongest correlated IRI scale was significant, at p negatively correlated with the confidence/social potency < .001, for: TriPM Meanness, TriPM Disinhibition, PPI-R variables. 10 Assessment 00(0) Table 3. Correlations Between Empathy Measures and Indices of Personality Disorder Features and Interpersonal Malignancy. 
                     ---- ACME -----   --------- IRI ---------
                      CE    AR    AD    PT    EC    FN    PD
TriPM
  Boldness           .21   .03   .01   .15   .07   .00  −.62
  Meanness          −.35  −.80  −.82  −.56  −.71  −.20   .19
  Disinhibition     −.28  −.61  −.77  −.42  −.49  −.07   .35
ICU
  Callous           −.37  −.72  −.83  −.40  −.57  −.19   .20
  Uncaring          −.45  −.61  −.36  −.56  −.59  −.25   .19
  Unemotional       −.36  −.28  −.13  −.28  −.38  −.22   .12
PPI-R
  FD                 .06  −.11  −.19   .04  −.07  −.06  −.54
  SCI               −.27  −.63  −.75  −.46  −.53  −.06   .29
  C                 −.35  −.63  −.28  −.57  −.72  −.38  −.19
NPI
  L/A                .10  −.17  −.29  −.06  −.13  −.03  −.27
  GE                −.02  −.25  −.38  −.15  −.18   .04   .01
  E/E               −.16  −.48  −.58  −.29  −.41  −.09   .09
VAST
  Direct            −.27  −.67  −.83  −.40  −.49  −.07   .20
  Vicarious         −.18  −.52  −.64  −.33  −.47  −.13   .00
PID-5
  Antagonism        −.27  −.67  −.81  −.39  −.49  −.11   .23
  Disinhibition     −.29  −.56  −.75  −.35  −.40  −.10   .28
  Detachment        −.35  −.57  −.57  −.39  −.49  −.20   .36
  Psychoticism      −.24  −.52  −.71  −.29  −.39   .02   .33
  Negative Affect   −.13  −.30  −.47  −.23  −.18   .11   .59
ISOS                −.22  −.61  −.80  −.31  −.44  −.07   .18
PES                 −.28  −.61  −.66  −.41  −.52  −.13   .18

Note. Bolded is p < .001, italicized is p < .01. ACME = Affective and Cognitive Measure of Empathy; CE = Cognitive Empathy; AR = Affective Resonance; AD = Affective Dissonance; IRI = Interpersonal Reactivity Index; PT = Perspective Taking; EC = Empathic Concern; FN = Fantasy; PD = Personal Distress; TriPM = Triarchic Psychopathy Measure; ICU = Inventory of Callous-unemotional Traits; PPI-R = Psychopathic Personality Inventory–Revised; FD = Fearless Dominance; SCI = Self-centered Impulsivity; C = Coldheartedness; NPI = Narcissistic Personality Inventory; L/A = Leadership/Authority; GE = Grandiose Exhibitionism; E/E = Entitlement/Exploitativeness; VAST = Varieties of Sadistic Tendencies scale; Direct = Direct Sadism; Vicarious = Vicarious Sadism; PID-5 = Personality Inventory for DSM-5–Brief Form; ISOS = Interpersonal Sexual Objectification Scale; PES = Psychological Entitlement Scale.

Discriminant Validity From Broad-Band Emotional Dimensions.
As shown in Table 2, only ACME AD and IRI PD were negatively correlated with MPQ Positive Emotionality; all other scales, except for ACME AR, were positively correlated with it. All empathy scales were negatively correlated with MPQ Negative Emotionality, except for IRI FN, which was not significantly correlated with it, and IRI PD, which was substantially positively correlated with it. Most empathy scales were not significantly related to HEXACO NSE, but IRI EC and FN both exhibited small positive correlations with it, and IRI PD demonstrated a large positive correlation with it. Most empathy scales demonstrated small negative correlations with PID-5 Negative Affect (see Table 3), but ACME AD demonstrated a medium negative correlation with it, IRI FN demonstrated a small positive correlation with it, and IRI PD demonstrated a large positive correlation with it.

Our results suggest that the IRI PD scale is more a measure of negative emotionality than of empathy. The ACME AD scale correlated more strongly with PID-5 Negative Affect (smallest Steiger’s z = 5.54, p < .001) and MPQ NE (smallest Steiger’s z = 4.00, p < .001) than any other empathy scale aside from IRI PD.

Incremental Validity of the ACME Above and Beyond the IRI, and Vice Versa

For full incremental validity results, see Supplemental Table 1, available with the online version of the article. The full ACME scales (CE + AR + AD) demonstrated a high level of incremental validity above and beyond the core IRI scales (EC + PT) in statistically predicting our seven selected interpersonal malignancy variables (average ΔR² = .30).
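The ΔR² values reported in this section are increments in explained variance between two nested regression steps. As a rough illustration of that computation, and not the authors' code, the sketch below fits both steps by ordinary least squares in Python with NumPy on synthetic data. The variable names (scale_a, scale_b, outcome), the effect sizes, and the random seed are hypothetical and are not taken from the study.

```python
import numpy as np


def r_squared(predictors, outcome):
    """R^2 from an ordinary least squares fit that includes an intercept."""
    X = np.column_stack([np.ones(len(outcome)), predictors])
    coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    residuals = outcome - X @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((outcome - outcome.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def delta_r_squared(block1, block2, outcome):
    """Increase in R^2 when block2 is entered after block1."""
    r2_step1 = r_squared(block1, outcome)
    r2_step2 = r_squared(np.column_stack([block1, block2]), outcome)
    return r2_step2 - r2_step1


# Synthetic illustration only; these are NOT the study's data.
rng = np.random.default_rng(0)
n = 401  # same sample size as the study, chosen only for flavor
shared = rng.normal(size=n)
scale_a = shared + rng.normal(size=n)  # hypothetical scale, weaker predictor
scale_b = shared + rng.normal(size=n)  # hypothetical scale, stronger predictor
outcome = -0.3 * scale_a - 0.6 * scale_b + rng.normal(size=n)

# Run the comparison in both orders, mirroring the two block orders.
b_over_a = delta_r_squared(scale_a, scale_b, outcome)
a_over_b = delta_r_squared(scale_b, scale_a, outcome)
print(f"Delta R^2 of scale_b beyond scale_a: {b_over_a:.3f}")
print(f"Delta R^2 of scale_a beyond scale_b: {a_over_b:.3f}")
```

Running the comparison in both orders shows how much each predictor adds once the other is already in the model, which is the logic behind contrasting the increment of one measure over the other with the reverse.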
More tellingly, although weaker than the statistical effects of the full ACME scales, the combination of only the ACME CE and AR scales also demonstrated medium incremental validity above and beyond the IRI EC and PT scales in statistically predicting interpersonal malignancy (average ΔR² = .13).

By comparison, when the core IRI scales were entered in the second block, they yielded minimal incremental validity above the combination of ACME CE and AR in statistically predicting interpersonal malignancy (average ΔR² = .03). Of the seven outcome variables, they only offered meaningful incremental validity in predicting PPI-R CH (ΔR² = .14).

The ACME AR scale by itself demonstrated medium incremental validity above and beyond IRI EC in statistically predicting interpersonal malignancy (average ΔR² = .13). By comparison, the IRI EC scale produced weak incremental validity in statistically predicting interpersonal malignancy (average ΔR² = .03), only demonstrating meaningful incremental validity in statistically predicting PPI-R CH (ΔR² = .14).

Discussion

Our analyses indicate that both the IRI and the ACME scales have some strengths, but also substantial limitations. Our results (a) leave some lingering concerns regarding the factorial validity of both questionnaires, (b) indicate that both lack measurement precision at higher trait levels, and (c) raise concerns regarding the construct validity of particular scales. Overall, though, our analyses indicate that the ACME affords substantial advantages over the IRI in relating to interpersonally malignant traits, consistent with Vachon and Lynam’s (2016) goals in crafting it.

Factor Structures of the ACME and IRI

Consistent with the results obtained by Vachon and Lynam, our CFAs and ESEMs indicated that the factor structure of the ACME scales is significantly complicated by the mixture of RW and standard items. Our results revealed similar problems for the IRI, though it has a lower proportion of RW items. Many scales employ RW items to combat acquiescence bias. At the same time, though, this practice may introduce wording method confounds. Factor analyses of scales with some RW items frequently indicate the presence of method covariance obscuring or confounding substantive covariance (e.g., Brown, 2003; Roszkowski & Soven, 2010). The factor analysis distortion caused by RW items appears to be relatively sensitive to differences in responding styles between study samples. For instance, even a small amount of careless responding in a sample can cause wording method factors to manifest in factor analyses (e.g., Woods, 2006). CFAs of scales with RW items often fail to provide adequate fit unless the model is adjusted to account for wording method variance (e.g., Woods, 2006).

Nevertheless, discrepancies between RW and standardly worded items may sometimes reflect genuine personality variance (rather than pure method variance), especially when the items relate to social desirability or self-esteem (DiStefano & Motl, 2009). For instance, query items such as “I am an empathic person” and “I am not an empathic person” may not be direct opposites of each other, as defensiveness or fear of negative evaluation may be more related to the latter. In other words, agreeing with a socially negative statement about oneself may not be the precise methodological opposite of disagreeing with a positive statement about oneself.

Our analyses do not allow us to disentangle whether the wording method factors that emerge with the ACME and IRI items reflect a method artifact, a substantive (personality) dimension, or a mixture of both. Nevertheless, in concert with the findings of Vachon and Lynam (2016), our findings encourage future research on wording method effects, especially given that they can complicate other analyses. After accounting for these wording method effects in our CFAs, the three-scale structure of the ACME presented by Vachon and Lynam and the four-scale structure of the IRI presented by Davis (1983) were replicated in our sample.

Measurement Precision of the ACME and IRI Scales

In our IRT analyses, a strong general pattern held: The ACME and IRI scales appear to be effective in detecting moderate and low levels of empathic traits, but lack measurement precision at high levels of these traits. Because this is the first study to examine the IRT properties of these empathy measures, future replication is needed. If further research comes to similar conclusions, this would indicate that these measures may be reliable when evaluating empathy deficits and their change over time, as well as in research with lower empathy subject samples. If future research corroborates our finding that these measures lack reliability at higher trait levels, however, the use of these measures in high-empathy populations, such as individuals in helping professions, may be methodologically problematic.

Construct and Incremental Validity

Although the IRI’s main affective empathy scale, the EC scale, appears to possess a highly similar nomological network to that of the ACME AR scale, the AR scale demonstrated substantial incremental validity benefits in comparison, at least in terms of traits related to meanness, antagonism, and other aspects of interpersonal malignancy. We observed similar results when examining the incremental validity of combinations of the ACME scales, above and beyond combinations of the IRI scales. These findings suggest that Vachon and Lynam (2016) successfully devised a measure that more effectively correlates with malignant empathy deficits than does the IRI. The IRI, however, performed better than the ACME in statistically predicting PPI-R Coldheartedness, indicating that it may still offer value not captured by the ACME scales (but see other promising data in Vachon & Lynam, 2016, indicating heightened aversion to emotionally negative stimuli as a more pronounced correlate of the ACME AR scale than of the IRI EC scale). Because the Coldheartedness scale is more a measure of passive emotional detachment than of active antagonism, this result raises the possibility that the IRI is a better marker of this feature than is the ACME.

Our results also indicate that the ACME AD scale appears to more strongly measure interpersonal malignancy and emotional hostility than it does general empathy. The AD scale correlated more robustly with interpersonal malignancy variables than with the other empathy scales, reflecting the AD scale’s strong saturation with reversed antagonism and reversed conscientiousness. Furthermore, the AD scale correlated more strongly with both MPQ Negative Emotionality and PID-5 Negative Affect than did any of the other empathy scales aside from the IRI PD scale. This disproportionate tilt toward negative emotionality conceptually replicates findings by Vachon and Lynam (2016) regarding the AD scale.

Our results also offer compelling evidence that the IRI PD scale is much more a measure of negative emotionality than of empathy. This scale is (a) much more strongly correlated with negative emotionality than with the other empathy scales; (b) correlated with general personality traits in opposing or otherwise divergent ways compared with other empathy scales; and (c) in general correlated positively with measures characterized by interpersonal malignancy, the opposite of what one would expect from an empathy scale. As a consequence, many published analyses that have included this scale in either a “total” IRI score (e.g., Flight & Forth, 2007; Glass, Moody, Grafman, & Krueger, 2016) or aggregated it with the IRI Empathic Concern scale to create an “affective empathy” variable (e.g., Dziobek et al., 2011; Shamay-Tsoory et al., 2009) may be seriously misleading. Our results, in conjunction with similarly critical analyses (e.g., Chrysikou & Thompson, 2016), argue against the widespread practice of aggregating this scale with other scales to create “total” or composite empathy variables.

Although not as starkly troubling as for the PD scale, our results corroborate the view that the IRI FN scale possesses only limited construct validity as an empathy measure. Although it demonstrated some convergent validity with other empathy scales, our data suggest that it may be more a measure of trait absorption than of empathy. Given the low internal consistency of our short-form MPQ Absorption scale, though, these results should be replicated using the full scale.

Our findings also elucidate the role of negative emotionality in empathy as operationalized by the ACME and IRI. The main empathy scales (ACME AR, ACME CE, IRI EC, and IRI PT) were not robustly associated with the HEXACO NSE variable, but they were robustly negatively correlated with MPQ NE and PID-5 Negative Affect. This finding could indicate that empathy is negatively associated with feeling disconnected from others, stress-related irritability, and other emotional correlates of interpersonal friction but not substantially associated with less interpersonally charged aspects of negative emotionality, such as worry-proneness. Future research, using more fine-grained measures of specific aspects of negative emotionality, may further illuminate this aspect of the empathy construct.

Although further research is necessary to conceptually replicate our findings, we tentatively propose that researchers should prefer the ACME AR over the IRI EC scale when measuring affective empathy but should administer both if possible, pending further comparative validation research. Although the ACME CE scale is, at least facially, more related to cognitive empathy ability than is the IRI PT scale, it remains unclear how effectively it measures such abilities. We advise that researchers use neither the ACME CE nor the IRI PT as a proxy for cognitive empathy unless adequate convergent validity with behavioral results emerges. The ACME AD scale should probably be used with caution as a measure of empathy, although it may detect some of the more interpersonally malignant correlates of this construct.

Limitations and Future Directions

One limitation of this study is that we did not administer the EQ (Baron-Cohen & Wheelwright, 2004). This self-report measure of empathy, which can be parsed into distinguishable cognitive and affective empathy scales (Lawrence, Shaw, Baker, Baron-Cohen, & David, 2004), is in much the same theoretical vein as the IRI and the ACME. Further research is needed to investigate the value of the ACME relative to the EQ.

Another limitation is that we did not administer task assessments of emotion recognition or other features of cognitive empathy. As highlighted by Vachon and Lynam (2016), self-report measures of cognitive empathy abilities may reflect people’s appraisal of their own accuracy rather than such accuracy per se. Unless research establishes that individuals are valid judges of their own cognitive empathy abilities, the ACME CE scale and other comparable scales should not be relied on to quantify this aspect of empathy.

In sum, the ACME represents a substantial step forward in empathy research, as it has attempted to broaden the content domain of empathy measurement. Such attempts can, however, be expanded even further. For instance, inspired by the IRI FN scale, future research should investigate the measurement of potential empathic qualities involved in appreciating the stories of others, or of listening skills and tendencies. Although the FN scale focuses on absorption in books and movies, human empathy and human stories have existed far longer than the printing press and Hollywood, and person-to-person communications may be a more fruitful domain for empathy research and measurement.

Similarly, as suggested by the PD scale, empathy can be difficult and even distressing, and susceptibilities to empathic distress or empathic fatigue may be important aspects of empathic functioning. Whereas the PD scale references “emergencies” and ambiguous tense situations, future research could investigate this process more explicitly in interpersonal empathic encounters, such as listening to a friend talk about the death of a loved one or an acquaintance open up about a recent traumatic experience. These are merely a few of the potential domains into which multidimensional empathy measures could be extended. Before the contours of the empathy construct can be more comprehensively mapped and appropriately measured, additional effort should be invested in expanding the measurement content domain to better understand the boundaries of this still poorly understood construct (cf. Clark & Watson, 1995; Loevinger, 1957).

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Supplemental Material

Supplementary material for this article is available online.

References

Ackerman, R. A., Witt, E. A., Donnellan, M. B., Trzesniewski, K. H., Robins, R. W., & Kashy, D. A. (2011). What does the Narcissistic Personality Inventory really measure? Assessment, 18, 67-87.

Alterman, A. I., McDermott, P. A., Cacciola, J. S., & Rutherford, M. J. (2003). Latent structure of the Davis Interpersonal Reactivity Index in methadone maintenance patients. Journal of Psychopathology and Behavioral Assessment, 25, 257-265.

… Retrieved from https://www.nytimes.com/2016/12/30/books/review/against-empathy-paul-bloom.html

Baron-Cohen, S., & Wheelwright, S. (2004). The empathy quotient: An investigation of adults with Asperger syndrome or high functioning autism, and normal sex differences. Journal of Autism and Developmental Disorders, 34, 163-175.

Bird, G., & Viding, E. (2014). The self to other model of empathy: Providing a new framework for understanding empathy impairments in psychopathy, autism, and alexithymia. Neuroscience & Biobehavioral Reviews, 47, 520-532.

Bloom, P. (2016). Against empathy: The case for rational compassion. New York, NY: Ecco.

Bloom, P. (2017). Empathy and its discontents. Trends in Cognitive Sciences, 21, 24-31.

Bock, E. M., & Hosser, D. (2014). Empathy as a predictor of recidivism among young adult offenders. Psychology, Crime & Law, 20, 101-115.

Brown, T. A. (2003). Confirmatory factor analysis of the Penn State Worry Questionnaire: Multiple factors or method effects? Behaviour Research and Therapy, 41, 1411-1426.

Campbell, W. K., Bonacci, A. M., Shelton, J., Exline, J. J., & Bushman, B. J. (2004). Psychological entitlement: Interpersonal consequences and validation of a self-report measure. Journal of Personality Assessment, 83, 29-45.

Chrysikou, E. G., & Thompson, W. J. (2016). Assessing cognitive and affective empathy through the Interpersonal Reactivity Index: An argument against a two-factor model. Assessment, 23, 769-777.

Clark, L. A., & Watson, D. (1995). Constructing validity: Basic issues in objective scale development. Psychological Assessment, 7, 309-319.

Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155-159.

Davis, M. H. (1983). Measuring individual differences in empathy: Evidence for a multidimensional approach. Journal of Personality and Social Psychology, 44, 113-126.

De Ayala, R. J. (2013). The IRT tradition and its applications. Oxford Handbook of Quantitative Methods: Foundations, 1, 144-169.

Decety, J., Lewis, K. L., & Cowell, J. M. (2015). Specific electrophysiological components disentangle affective sharing and empathic concern in psychopathy. Journal of Neurophysiology, 114, 493-504.

Decety, J., & Michalska, K. J. (2010). Neurodevelopmental changes in the circuits underlying empathy and sympathy from childhood to adulthood. Developmental Science, 13, 886-899.

American Psychiatric Association. (2013). Online assessment
Dimitrov, D. M. (2012).
Statistical methods for validation of assess- measures: The Personality Inventory for DSM-5–Brief Form ment scale data in counseling and related fields. Alexandria, (PID-5-BF)–Adult. Retrieved from https://www.psychiatry. VA: American Counseling Association. org/psychiatrists/practice/dsm/educational-resources/assess- DiStefano, C., & Motl, R. W. (2009). Personality correlates ment-measures of method effects due to negatively worded items on the Ashton, M. C., & Lee, K. (2009). The HEXACO-60: A short Rosenberg Self-Esteem scale. Personality and Individual measure of the major dimensions of personality. Journal of Differences, 46, 309-313. Personality Assessment, 91, 340-345. Domes, G., Hollerbach, P., Vohs, K., Mokros, A., & Habermeyer, Asparouhov, T., & Muthén, B. O. (2009). Exploratory structural E. (2013). Emotional empathy and psychopathy in offenders: equation modeling. Structural Equation Modeling, 16, 397- An experimental study. Journal of Personality Disorders, 27, 438. 67-84. Baron-Cohen, S. (2016, December 30). Empathy is good, right? A Dziobek, I., Preißler, S., Grozdanovic, Z., Heuser, I., Heekeren, new book says we’re better off without it. New York Times. H. R., & Roepke, S. (2011). Neuronal correlates of altered 14 Assessment 00(0) empathy and social cognition in borderline personality disor- Keltner, D., & Haidt, J. (1999). Social functions of emotions at der. Neuroimage, 57, 539-548. four levels of analysis. Cognition & Emotion, 13, 505-521. Embretson, S. E., & Reise, S. P. (2000). Item response theory for Keltner, D., & Kring, A. M. (1998). Emotion, social function, and psychologists. Mahwah, NJ: Lawrence Erlbaum. psychopathology. Review of General Psychology, 2, 320-342. Enders, C. K. (2001). A primer on maximum likelihood algorithms Kozee, H. B., Tylka, T. L., Augustus-Horvath, C. L., & Denchik, available for use with missing data. Structural Equation A. (2007). Development and psychometric evaluation of Modeling, 8, 128-141. 
the interpersonal sexual objectification scale. Psychology of Essau, C. A., Sasagawa, S., & Frick, P. J. (2006). Callous- Women Quarterly, 31, 176-189. unemotional traits in a community sample of adolescents. Krizan, Z., & Herlache, A. D. (2017). The narcissism spec- Assessment, 13, 454-469. trum model: A synthetic view of narcissistic personality. Flight, J. I., & Forth, A. E. (2007). Instrumentally violent youths: Personality and Social Psychology Review, 22(1), 3-31. The roles of psychopathic traits, empathy, and attachment. Lawrence, E. J., Shaw, P., Baker, D., Baron-Cohen, S., & David, Criminal Justice and Behavior, 34, 739-751. A. S. (2004). Measuring empathy: Reliability and validity of Fraley, R. C., Waller, N. G., & Brennan, K. A. (2000). An item the Empathy Quotient. Psychological Medicine, 34, 911-920. response theory analysis of self-report measures of adult Lilienfeld, S. O., & Widows, M. (2005). Professional manual for attachment. Journal of Personality and Social Psychology, the Psychopathic Personality Inventory–Revised (PPI-R). 78, 350-365. Lutz, FL: Psychological Assessment Resources. Frick, P. J. (2004). The Inventory of Callous–Unemotional Traits Little, R. J. (1988). A test of missing completely at random (Unpublished rating scale). University of New Orleans, New for multivariate data with missing values. Journal of the Orleans, LA. American Statistical Association, 83, 1198-1202. Gabay, Y., Shamay-Tsoory, S. G., & Goldfarb, L. (2016). Loevinger, J. (1957). Objective tests as instruments of psychologi- Cognitive and emotional empathy in typical and impaired cal theory. Psychological Reports, 3, 635-694. readers and its relationship to reading competence. Journal of Marsh, H. W., Morin, A. J. S., Parker, P., & Kaur, G. (2014). Clinical and Experimental Neuropsychology, 38, 1131-1143. Exploratory structural equation modeling: An integration of Gill, A. D., & Stickle, T. R. (2016). 
Affective differences between the best features of exploratory and confirmatory factor analy- psychopathy variants and genders in adjudicated youth. sis. Annual Review of Clinical Psychology, 10, 85-110. Journal of Abnormal Child Psychology, 44, 295-307. Michie, A. M., & Lindsay, W. R. (2012). A treatment component Glass, L., Moody, L., Grafman, J., & Krueger, F. (2016). Neural designed to enhance empathy in sex offenders with an intel- signatures of third-party punishment: Evidence from pen- lectual disability. British Journal of Forensic Practice, 14(1), etrating traumatic brain injury. Social Cognitive and Affective 40-48. Neuroscience, 11, 253-262. Miller, J. D., Crowe, M., Weiss, B., Maples-Keller, J. L., & Lynam, Harkness, A. R., Tellegen, A., & Waller, N. (1995). Differential con- D. R. (2017). Using online, crowdsourcing platforms for data vergence of self-report and informant data for Multidimensional collection in personality disorder research: The example of Personality Questionnaire traits: Implications for the construct Amazon’s Mechanical Turk. Personality Disorders: Theory, of negative emotionality. Journal of Personality Assessment, Research, and Treatment, 8, 26-34. 64, 185-204. Muthén, L. K., & Muthén, B. O. (1998-2012). MPlus user’s guide Hatcher, S. L., Nadeau, M. S., Walsh, L. K., Reynolds, M., Galea, (7th ed.). Los Angeles, CA: Muthén & Muthén. J., & Marz, K. (1994). The teaching of empathy for high Olino, T. M., Yu, L., McMakin, D. L., Forbes, E. E., Seeley, J. school and college students: Testing Rogerian methods with R., Lewinsohn, P. M., & Pilkonis, P. A. (2013). Comparisons the Interpersonal Reactivity Index. Adolescence, 29, 961-975. across depression assessment instruments in adolescence and Hawk, S. T., Keijsers, L., Branje, S. J., Graaff, J. V. D., Wied, young adulthood: An item response theory study using two M. D., & Meeus, W. (2013). Examining the Interpersonal linking methods. 
Journal of Abnormal Child Psychology, 41, Reactivity Index (IRI) among early and late adolescents and 1267-1277. their mothers. Journal of Personality Assessment, 95, 96-106. Palgi, S., Palgi, Y., Ben-Ezra, M., & Shrira, A. (2014). “I will Hengartner, M. P., De Fruyt, F., Rodgers, S., Mueller, M., Roessler, fear no evil, for I am with me”: Mentalization-oriented W., & Ajdacic-Gross, V. (2014). An integrative examination intervention with PTSD patients. A case study. Journal of of general personality dysfunction in a large community sam- Contemporary Psychotherapy, 44, 173-182. ple. Personality and Mental Health, 8, 276-289. Patrick, C. J. (2010). Triarchic Psychopathy Measure (TriPM). Horn, J. L. (1965). A rationale and test for the number of factors in Retrieved from https://www.phenxtoolkit.org/index.php?page factor analysis. Psychometrika, 30, 179-185. Link=browse.protocoldetails&id=121601 Ickes, W., Stinson, L., Bissonnette, V., & Garcia, S. (1990). Paulhus, D. L., & Jones, D. N. (2014). Measures of dark personali- Naturalistic social cognition: Empathic accuracy in mixed- ties. In G. J. Boyle, D. H. Saklofske & G. Matthews (Eds.), sex dyads. Journal of Personality and Social Psychology, Measures of personality and social psychological constructs 59(4), 730-742. (pp. 562-594). San Diego, CA: Academic Press. Jolliffe, D., & Farrington, D. P. (2006). Development and valida- Pulos, S., Elison, J., & Lennon, R. (2004). The hierarchical struc- tion of the Basic Empathy Scale. Journal of Adolescence, 29, ture of the Interpersonal Reactivity Index. Social Behavior 589-611. and Personality: An International Journal, 32, 355-359. Jordan, M. R., Amir, D., & Bloom, P. (2016). Are empathy and Raskin, R., & Terry, H. (1988). A principal-components analysis concern psychologically distinct? Emotion, 16, 1107-1116. of the Narcissistic Personality Inventory and further evidence Murphy et al. 15 of its construct validity. 
Journal of Personality and Social Tellegen, A., & Atkinson, G. (1974). Openness to absorbing and self- Psychology, 54, 890-902. altering experiences (“absorption”), a trait related to hypnotic Realo, A., Allik, J., Nõlvak, A., Valk, R., Ruus, T., Schmidt, M., susceptibility. Journal of Abnormal Psychology, 83, 268-277. & Eilola, T. (2003). Mind-reading ability: Beliefs and perfor- Thimm, J. C., Jordan, S., & Bach, B. (2016). The Personality mance. Journal of Research in Personality, 37, 420-445. Inventory for DSM-5 Short Form (PID-5-SF): Psychometric Reniers, R. L., Corcoran, R., Drake, R., Shryane, N. M., & Völlm, properties and association with big five traits and pathological B. A. (2011). The QCAE: A questionnaire of cognitive and beliefs in a Norwegian population. BMC Psychology, 4, 61. affective empathy. Journal of Personality Assessment, 93, doi:10.1186/s40359-016-0169-5 84-95. Todd, A. R., Forstmann, M., Burgmer, P., Brooks, A. W., & Rogers, C. R. (1958). The characteristics of a helping relationship. Galinsky, A. D. (2015). Anxious and egocentric: How spe- Journal of Counseling & Development, 37, 6-16. cific emotions influence perspective taking. Journal of Rosseel, Y. (2012). lavaan: An R package for structural equation Experimental Psychology: General, 144, 374-391. modeling. Journal of Statistical Software, 48(2), 1-36. Vachon, D. D., & Lynam, D. R. (2016). Fixing the problem with Roszkowski, M. J., & Soven, M. (2010). Shifting gears: empathy: Development and validation of the affective and Consequences of including two negatively worded items in cognitive measure of empathy. Assessment, 23, 135-149. the middle of a positively worded questionnaire. Assessment & Vachon, D. D., Lynam, D. R., & Johnson, J. A. (2014). The (non) Evaluation in Higher Education, 35, 113-130. relation between empathy and aggression: Surprising results Samejima, F. (1969). Estimation of latent ability using a from a meta-analysis. Psychological Bulletin, 140, 751-773. 
response pattern of graded scores. Psychometric Monograph Watson, D., & Clark, L. A. (1984). Negative affectivity: The dispo- Supplement, 17(4, Pt. 2). sition to experience aversive emotional states. Psychological Shamay-Tsoory, S. G., Aharon-Peretz, J., & Perry, D. (2009). Bulletin, 96, 465-490. Two systems for empathy: A double dissociation between Wickramasekera, I. E., & Szlyk, J. P. (2003). Could empathy emotional and cognitive empathy in inferior frontal gyrus be a predictor of hypnotic ability? International Journal of versus ventromedial prefrontal lesions. Brain, 132, 617-627. Clinical and Experimental Hypnosis, 51, 390-399. Siu, A. M., & Shek, D. T. (2005). Validation of the Interpersonal Wood, J. L., James, M., & Ciardha, C. Ó. (2014). “I know how Reactivity Index in a Chinese context. Research on Social they must feel”: Empathy and judging defendants. European Work Practice, 15, 118-126. Journal of Psychology Applied to Legal Context, 6, 37-43. Spreng, R. N., McKinnon, M. C., Mar, R. A., & Levine, B. (2009). Woods, C. M. (2006). Careless responding to reverse-worded The Toronto Empathy Questionnaire: Scale development and items: Implications for confirmatory factor analysis. initial validation of a factor-analytic solution to multiple empa- Journal of Psychopathology and Behavioral Assessment, thy measures. Journal of Personality Assessment, 91, 62-71. 28, 189-194. Steiger, J. H. (1980). Tests for comparing elements of a correlation Young, S., Sedgwick, O., Perkins, D., Lister, H., Southgate, matrix. Psychological Bulletin, 87, 245-251. K., Das, M., . . . Gudjonsson, G. H. (2015). Measuring Strathearn, L., Fonagy, P., Amico, J., & Montague, P. R. (2009). victim empathy among mentally disordered offenders: Adult attachment predicts maternal brain and oxytocin response Validating VERA-2. Journal of Psychiatric Research, 60, to infant cues. Neuropsychopharmacology, 34, 2655-2666. 156-162.

References (81)

Ackerman, R. A., Witt, E. A., Donnellan, M. B., Trzesniewski, K. H., Robins, R. W., & Kashy, D. A. (2011). What does the Narcissistic Personality Inventory really measure? Assessment, 18, 67-87.
Alterman, A. I., McDermott, P. A., Cacciola, J. S., & Rutherford, M. J. (2003). Latent structure of the Davis Interpersonal Reactivity Index in methadone maintenance patients. Journal of Psychopathology and Behavioral Assessment, 25, 257-265.
American Psychiatric Association. (2013). Online assessment measures: The Personality Inventory for DSM-5-Brief Form (PID-5-BF)-Adult. Retrieved from https://www.psychiatry.org/psychiatrists/practice/dsm/educational-resources/assessment-measures
Ashton, M. C., & Lee, K. (2009). The HEXACO-60: A short measure of the major dimensions of personality. Journal of Personality Assessment, 91, 340-345.
Asparouhov, T., & Muthén, B. O. (2009). Exploratory structural equation modeling. Structural Equation Modeling, 16, 397-438.
Baron-Cohen, S. (2016, December 30). Empathy is good, right? A new book says we're better off without it. New York Times. Retrieved from https://www.nytimes.com/2016/12/30/books/review/against-empathy-paul-bloom.html
Baron-Cohen, S., & Wheelwright, S. (2004). The empathy quotient: An investigation of adults with Asperger syndrome or high functioning autism, and normal sex differences. Journal of Autism and Developmental Disorders, 34, 163-175.
Bird, G., & Viding, E. (2014). The self to other model of empathy: Providing a new framework for understanding empathy impairments in psychopathy, autism, and alexithymia. Neuroscience & Biobehavioral Reviews, 47, 520-532.
Bloom, P. (2016). Against empathy: The case for rational compassion. New York, NY: Ecco.
Bloom, P. (2017). Empathy and its discontents. Trends in Cognitive Sciences, 21, 24-31.
Bock, E. M., & Hosser, D. (2014). Empathy as a predictor of recidivism among young adult offenders. Psychology, Crime & Law, 20, 101-115.
Brown, T. A. (2003). Confirmatory factor analysis of the Penn State Worry Questionnaire: Multiple factors or method effects? Behaviour Research and Therapy, 41, 1411-1426.
Campbell, W. K., Bonacci, A. M., Shelton, J., Exline, J. J., & Bushman, B. J. (2004). Psychological entitlement: Interpersonal consequences and validation of a self-report measure. Journal of Personality Assessment, 83, 29-45.
Chrysikou, E. G., & Thompson, W. J. (2016). Assessing cognitive and affective empathy through the Interpersonal Reactivity Index: An argument against a two-factor model. Assessment, 23, 769-777.
Clark, L. A., & Watson, D. (1995). Constructing validity: Basic issues in objective scale development. Psychological Assessment, 7, 309-319.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155-159.
Davis, M. H. (1983). Measuring individual differences in empathy: Evidence for a multidimensional approach. Journal of Personality and Social Psychology, 44, 113-126.
De Ayala, R. J. (2013). The IRT tradition and its applications. Oxford Handbook of Quantitative Methods: Foundations, 1, 144-169.
Decety, J., Lewis, K. L., & Cowell, J. M. (2015). Specific electrophysiological components disentangle affective sharing and empathic concern in psychopathy. Journal of Neurophysiology, 114, 493-504.
Decety, J., & Michalska, K. J. (2010). Neurodevelopmental changes in the circuits underlying empathy and sympathy from childhood to adulthood. Developmental Science, 13, 886-899.
Dimitrov, D. M. (2012). Statistical methods for validation of assessment scale data in counseling and related fields. Alexandria, VA: American Counseling Association.
DiStefano, C., & Motl, R. W. (2009). Personality correlates of method effects due to negatively worded items on the Rosenberg Self-Esteem scale. Personality and Individual Differences, 46, 309-313.
Domes, G., Hollerbach, P., Vohs, K., Mokros, A., & Habermeyer, E. (2013). Emotional empathy and psychopathy in offenders: An experimental study. Journal of Personality Disorders, 27, 67-84.
Dziobek, I., Preißler, S., Grozdanovic, Z., Heuser, I., Heekeren, H. R., & Roepke, S. (2011). Neuronal correlates of altered empathy and social cognition in borderline personality disorder. Neuroimage, 57, 539-548.
Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum.
Enders, C. K. (2001). A primer on maximum likelihood algorithms available for use with missing data. Structural Equation Modeling, 8, 128-141.
Essau, C. A., Sasagawa, S., & Frick, P. J. (2006). Callous-unemotional traits in a community sample of adolescents. Assessment, 13, 454-469.
Flight, J. I., & Forth, A. E. (2007). Instrumentally violent youths: The roles of psychopathic traits, empathy, and attachment. Criminal Justice and Behavior, 34, 739-751.
Fraley, R. C., Waller, N. G., & Brennan, K. A. (2000). An item response theory analysis of self-report measures of adult attachment. Journal of Personality and Social Psychology, 78, 350-365.
Frick, P. J. (2004). The Inventory of Callous-Unemotional Traits (Unpublished rating scale). University of New Orleans, New Orleans, LA.
Gabay, Y., Shamay-Tsoory, S. G., & Goldfarb, L. (2016). Cognitive and emotional empathy in typical and impaired readers and its relationship to reading competence. Journal of Clinical and Experimental Neuropsychology, 38, 1131-1143.
Gill, A. D., & Stickle, T. R. (2016). Affective differences between psychopathy variants and genders in adjudicated youth. Journal of Abnormal Child Psychology, 44, 295-307.
Glass, L., Moody, L., Grafman, J., & Krueger, F. (2016). Neural signatures of third-party punishment: Evidence from penetrating traumatic brain injury. Social Cognitive and Affective Neuroscience, 11, 253-262.
Harkness, A. R., Tellegen, A., & Waller, N. (1995). Differential convergence of self-report and informant data for Multidimensional Personality Questionnaire traits: Implications for the construct of negative emotionality. Journal of Personality Assessment, 64, 185-204.
Hatcher, S. L., Nadeau, M. S., Walsh, L. K., Reynolds, M., Galea, J., & Marz, K. (1994). The teaching of empathy for high school and college students: Testing Rogerian methods with the Interpersonal Reactivity Index. Adolescence, 29, 961-975.
Hawk, S. T., Keijsers, L., Branje, S. J., Graaff, J. V. D., Wied, M. D., & Meeus, W. (2013). Examining the Interpersonal Reactivity Index (IRI) among early and late adolescents and their mothers. Journal of Personality Assessment, 95, 96-106.
Hengartner, M. P., De Fruyt, F., Rodgers, S., Mueller, M., Roessler, W., & Ajdacic-Gross, V. (2014). An integrative examination of general personality dysfunction in a large community sample. Personality and Mental Health, 8, 276-289.
Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30, 179-185.
Ickes, W., Stinson, L., Bissonnette, V., & Garcia, S. (1990). Naturalistic social cognition: Empathic accuracy in mixed-sex dyads. Journal of Personality and Social Psychology, 59(4), 730-742.
Jolliffe, D., & Farrington, D. P. (2006). Development and validation of the Basic Empathy Scale. Journal of Adolescence, 29, 589-611.
Jordan, M. R., Amir, D., & Bloom, P. (2016). Are empathy and concern psychologically distinct? Emotion, 16, 1107-1116.
Keltner, D., & Haidt, J. (1999). Social functions of emotions at four levels of analysis. Cognition & Emotion, 13, 505-521.
Keltner, D., & Kring, A. M. (1998). Emotion, social function, and psychopathology. Review of General Psychology, 2, 320-342.
Kozee, H. B., Tylka, T. L., Augustus-Horvath, C. L., & Denchik, A. (2007). Development and psychometric evaluation of the interpersonal sexual objectification scale. Psychology of Women Quarterly, 31, 176-189.
Krizan, Z., & Herlache, A. D. (2017). The narcissism spectrum model: A synthetic view of narcissistic personality. Personality and Social Psychology Review, 22(1), 3-31.
Lawrence, E. J., Shaw, P., Baker, D., Baron-Cohen, S., & David, A. S. (2004). Measuring empathy: Reliability and validity of the Empathy Quotient. Psychological Medicine, 34, 911-920.
Lilienfeld, S. O., & Widows, M. (2005). Professional manual for the Psychopathic Personality Inventory-Revised (PPI-R). Lutz, FL: Psychological Assessment Resources.
Little, R. J. (1988). A test of missing completely at random for multivariate data with missing values. Journal of the American Statistical Association, 83, 1198-1202.
Loevinger, J. (1957). Objective tests as instruments of psychological theory. Psychological Reports, 3, 635-694.
Marsh, H. W., Morin, A. J. S., Parker, P., & Kaur, G. (2014). Exploratory structural equation modeling: An integration of the best features of exploratory and confirmatory factor analysis. Annual Review of Clinical Psychology, 10, 85-110.
Michie, A. M., & Lindsay, W. R. (2012). A treatment component designed to enhance empathy in sex offenders with an intellectual disability. British Journal of Forensic Practice, 14(1), 40-48.
Miller, J. D., Crowe, M., Weiss, B., Maples-Keller, J. L., & Lynam, D. R. (2017). Using online, crowdsourcing platforms for data collection in personality disorder research: The example of Amazon's Mechanical Turk. Personality Disorders: Theory, Research, and Treatment, 8, 26-34.
Muthén, L. K., & Muthén, B. O. (1998-2012). MPlus user's guide (7th ed.). Los Angeles, CA: Muthén & Muthén.
Olino, T. M., Yu, L., McMakin, D. L., Forbes, E. E., Seeley, J. R., Lewinsohn, P. M., & Pilkonis, P. A. (2013). Comparisons across depression assessment instruments in adolescence and young adulthood: An item response theory study using two linking methods. Journal of Abnormal Child Psychology, 41, 1267-1277.
Palgi, S., Palgi, Y., Ben-Ezra, M., & Shrira, A. (2014). "I will fear no evil, for I am with me": Mentalization-oriented intervention with PTSD patients. A case study. Journal of Contemporary Psychotherapy, 44, 173-182.
Patrick, C. J. (2010). Triarchic Psychopathy Measure (TriPM). Retrieved from https://www.phenxtoolkit.org/index.php?pageLink=browse.protocoldetails&id=121601
Paulhus, D. L., & Jones, D. N. (2014). Measures of dark personalities. In G. J. Boyle, D. H. Saklofske & G. Matthews (Eds.), Measures of personality and social psychological constructs (pp. 562-594). San Diego, CA: Academic Press.
Pulos, S., Elison, J., & Lennon, R. (2004). The hierarchical structure of the Interpersonal Reactivity Index. Social Behavior and Personality: An International Journal, 32, 355-359.
Raskin, R., & Terry, H. (1988). A principal-components analysis of the Narcissistic Personality Inventory and further evidence of its construct validity. Journal of Personality and Social Psychology, 54, 890-902.
Realo, A., Allik, J., Nõlvak, A., Valk, R., Ruus, T., Schmidt, M., & Eilola, T. (2003). Mind-reading ability: Beliefs and performance. Journal of Research in Personality, 37, 420-445.
Reniers, R. L., Corcoran, R., Drake, R., Shryane, N. M., & Völlm, B. A. (2011). The QCAE: A questionnaire of cognitive and affective empathy. Journal of Personality Assessment, 93, 84-95.
Rogers, C. R. (1958). The characteristics of a helping relationship. Journal of Counseling & Development, 37, 6-16.
Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1-36.
Roszkowski, M. J., & Soven, M. (2010). Shifting gears: Consequences of including two negatively worded items in the middle of a positively worded questionnaire. Assessment & Evaluation in Higher Education, 35, 113-130.
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometric Monograph Supplement, 17(4, Pt. 2).
Shamay-Tsoory, S. G., Aharon-Peretz, J., & Perry, D. (2009). Two systems for empathy: A double dissociation between emotional and cognitive empathy in inferior frontal gyrus versus ventromedial prefrontal lesions. Brain, 132, 617-627.
Siu, A. M., & Shek, D. T. (2005). Validation of the Interpersonal Reactivity Index in a Chinese context. Research on Social Work Practice, 15, 118-126.
Spreng, R. N., McKinnon, M. C., Mar, R. A., & Levine, B. (2009). The Toronto Empathy Questionnaire: Scale development and initial validation of a factor-analytic solution to multiple empathy measures. Journal of Personality Assessment, 91, 62-71.
Steiger, J. H. (1980). Tests for comparing elements of a correlation matrix. Psychological Bulletin, 87, 245-251.
Strathearn, L., Fonagy, P., Amico, J., & Montague, P. R. (2009). Adult attachment predicts maternal brain and oxytocin response to infant cues. Neuropsychopharmacology, 34, 2655-2666.
Tellegen, A., & Atkinson, G. (1974). Openness to absorbing and self-altering experiences ("absorption"), a trait related to hypnotic susceptibility. Journal of Abnormal Psychology, 83, 268-277.
Thimm, J. C., Jordan, S., & Bach, B. (2016). The Personality Inventory for DSM-5 Short Form (PID-5-SF): Psychometric properties and association with big five traits and pathological beliefs in a Norwegian population. BMC Psychology, 4, 61. doi:10.1186/s40359-016-0169-5
Todd, A. R., Forstmann, M., Burgmer, P., Brooks, A. W., & Galinsky, A. D. (2015). Anxious and egocentric: How specific emotions influence perspective taking. Journal of Experimental Psychology: General, 144, 374-391.
Vachon, D. D., & Lynam, D. R. (2016). Fixing the problem with empathy: Development and validation of the affective and cognitive measure of empathy. Assessment, 23, 135-149.
Vachon, D. D., Lynam, D. R., & Johnson, J. A. (2014). The (non)relation between empathy and aggression: Surprising results from a meta-analysis. Psychological Bulletin, 140, 751-773.
Watson, D., & Clark, L. A. (1984). Negative affectivity: The disposition to experience aversive emotional states. Psychological Bulletin, 96, 465-490.
Wickramasekera, I. E., & Szlyk, J. P. (2003). Could empathy be a predictor of hypnotic ability? International Journal of Clinical and Experimental Hypnosis, 51, 390-399.
Wood, J. L., James, M., & Ciardha, C. Ó. (2014). "I know how they must feel": Empathy and judging defendants. European Journal of Psychology Applied to Legal Context, 6, 37-43.
Woods, C. M. (2006). Careless responding to reverse-worded items: Implications for confirmatory factor analysis. Journal of Psychopathology and Behavioral Assessment, 28, 189-194.
Young, S., Sedgwick, O., Perkins, D., Lister, H., Southgate, K., Das, M., . . . Gudjonsson, G. H. (2015). Measuring victim empathy among mentally disordered offenders: Validating VERA-2. Journal of Psychiatric Research, 60, 156-162.

FAQs

What explains the measurement precision differences between IRI and ACME empathy scales?

The study finds that both ACME and IRI scales lack measurement precision at higher trait levels, particularly those above 2.0, indicating methodological concerns in high-empathy populations.
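As a rough illustration of what "measurement precision across trait levels" means, the sketch below computes test information and the corresponding standard error under Samejima's (1969) graded response model. The discrimination and threshold values are hypothetical and are not estimates from the study; they are chosen only so that the thresholds sit at low-to-moderate trait levels, a pattern under which precision falls off at the high end.

```python
import math

def grm_item_information(theta, a, thresholds):
    """Item information for one graded response model item.

    a          -- discrimination parameter (hypothetical value)
    thresholds -- ordered category thresholds b_1 < ... < b_{m-1}
    """
    # Cumulative probabilities P*(X >= k), padded with 1 (lowest) and 0 (highest).
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    info = 0.0
    for k in range(len(cum) - 1):
        p = cum[k] - cum[k + 1]  # probability of responding in category k
        # Derivative of the category probability with respect to theta.
        dp = a * (cum[k] * (1.0 - cum[k]) - cum[k + 1] * (1.0 - cum[k + 1]))
        if p > 0.0:
            info += dp * dp / p
    return info

def total_information(theta, items):
    """Sum of item information over all items at a given trait level."""
    return sum(grm_item_information(theta, a, bs) for a, bs in items)

def standard_error(theta, items):
    """Conditional standard error of the trait estimate: 1 / sqrt(information)."""
    return 1.0 / math.sqrt(total_information(theta, items))

# Hypothetical 7-item scale with 5 response categories per item.
items = [(1.5, [-2.0, -1.0, 0.0, 1.0])] * 7

for theta in (-2.0, 0.0, 2.0, 3.0):
    print(f"theta={theta:+.1f}  SE={standard_error(theta, items):.3f}")
```

With thresholds placed like these, the standard error is smallest in the middle of the trait range and grows toward the upper extreme, which is the general shape of the problem the answer above describes.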

How do ACME and IRI compare in terms of construct validity?

The research reveals that the ACME scales show superior incremental validity in predicting interpersonal malignancy traits compared to the IRI scales, with an average ΔR² of .30.
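Incremental validity of this kind is usually quantified as the gain in explained variance (ΔR²) when one set of predictors is added to a regression that already contains another. The following is a minimal sketch of that computation, assuming ordinary least squares and simulated data; the variable names and effect sizes are hypothetical stand-ins, and nothing here reproduces the study's data or results.

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an ordinary least squares fit of y on X (intercept added)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def delta_r_squared(base, added, y):
    """Gain in R^2 when `added` predictors join a model already holding `base`."""
    r2_base = r_squared(base, y)
    r2_full = r_squared(np.column_stack([base, added]), y)
    return r2_full - r2_base

# Simulated data only. n matches the sample size reported in the abstract,
# but every score below is randomly generated.
rng = np.random.default_rng(0)
n = 401
scale_a = rng.normal(size=(n, 1))                  # stand-in for one measure's scale
scale_b = 0.5 * scale_a + rng.normal(size=(n, 1))  # stand-in for the other measure's scale
criterion = -0.2 * scale_a[:, 0] - 0.5 * scale_b[:, 0] + rng.normal(size=n)

gain = delta_r_squared(scale_a, scale_b, criterion)
print(f"Delta R^2 for scale_b beyond scale_a: {gain:.3f}")
```

Swapping the `base` and `added` arguments gives the reverse comparison, which is how one would check whether incremental validity runs in one direction only.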

What are the specific factor structures of ACME and IRI scales?

Confirmatory factor analyses indicate that ACME's structure is complicated by reverse-worded items, while IRI's four-factor structure demonstrates inadequate fit, suggesting potential method covariance.

When did the debate around empathy measurement methodologies intensify?

Debates surrounding empathy measurement methodologies have escalated significantly since the early 2000s, influencing constructs in broader psychological research.

Why might the IRI Personal Distress scale misrepresent empathy?

The IRI PD scale correlates more strongly with negative emotionality than with empathy measures, raising concerns about its validity as an empathy index.

Thomas H Costello

Massachusetts Institute of Technology (MIT), Post-Doc

I'm Thomas Costello, PhD (Emory '22), a research psychologist and postdoctoral fellow at MIT. I study the nexus between personality and politics, and I've published research widely on topics spanning the psychology of authoritarianism, personality disorders, the cognitive causes and correlates of political ideology, psychopathy, intellectual humility and cognitive biases, financial decision-making, sexual objectification, beliefs in free will and determinism, conspiracy theories, machine learning for scale development, and more.
