11:00 - 12:30
Chair/s:
Andreas Fischer (Forschungsinstitut Betriebliche Bildung (f-bb) gGmbH, Nuremberg, Germany)
Psychological scales as benchmarks for automated sexism detection in social media texts
Tue-01
Mattia Samory, Indira Sen, Julian Kohne, Fabian Flöck, Claudia Wagner
GESIS Leibniz Institute for the Social Sciences, Computational Social Science dept.

(a) Background

One in ten U.S. adults reported being harassed because of their gender, and being the target of sexism can have a measurable negative impact (Swim et al. 2001). Far from being a new phenomenon, sexism has expanded its reach through online interactions (Pew 2017). Countermeasures for sexism therefore need to scale with the volume of online interactions. Machine learning approaches for automating sexism detection make it possible to screen online comments at a scale unachievable by human content moderators. Yet, whereas it is relatively easy to automate the detection of stark examples of sexism, such as those using gendered slurs, varied and subtle examples remain challenging. On the one hand, sexism is a complex construct that affords multiple theoretical definitions. On the other hand, such definitions are often neglected when collecting and annotating data for training automated tools, raising concerns about their validity and reliability.

(b) Objectives

In this work we aim to improve construct validity and reliability in automated sexism detection by connecting machine learning practice with the operationalization of sexism in social psychology (Samory et al., forthcoming). First, we aim to attune the automated tools’ predictions to the structure of sexism as a socio-psychological construct. Further, we aim to evaluate automated tools against instruments for measuring sexism in human respondents. In a nutshell, we explore ways to hold machine learning approaches and human subjects to the same standards for defining and measuring sexism.

(c) Research question

Thus, we ask:

How do validated psychological scales operationalize the measurement of sexism, and how do they differ from state-of-the-art machine learning approaches for sexism detection?

To what extent does the touted predictive performance of machine learning approaches generalize to different definitions of sexism? In particular, how do automated tools hold up when tested against items from validated psychological scales of sexism?

What are the hallmarks of sexist language, and what types of sexism are likely to remain undetected by automated tools?

(d) Method/Approach

We perform a comparative review of over 800 items from 30 psychological scales of sexism, including attitudes towards gender roles and hostile and benevolent sexism (García-Cueto et al. 2015, Glick and Fiske 1996). We derive a taxonomy of sexist statements according to the attitudes and beliefs of the statement’s author, and expand it with a categorization of sexist phrasing that is found in conversational settings. Then, we condense this taxonomy into a codebook that we use to annotate thousands of texts from social media, from existing as well as novel datasets intended for automated sexism detection. Further, we crowdsource adversarial modifications of statements annotated as sexist, so as to turn them into non-sexist statements while introducing minimal lexical changes. Adversarial modifications expose whether automated tools learn spurious correlations with lexical markers that are not related to sexism. We use adversarial modifications as well as scale items to test the performance of state-of-the-art machine learning approaches to sexism detection, including Logit, CNN, and BERT models.
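The adversarial-modification test described above can be illustrated with a minimal sketch. This is not the authors' code; the texts, labels, and pairings below are illustrative toys, and a simple TF-IDF "Logit" baseline stands in for the models evaluated in the paper. The idea: if a model assigns the same label to a sexist statement and to a minimally edited non-sexist counterpart, it is likely keying on lexical markers (e.g., gendered words) rather than on the construct itself.

```python
# Hedged sketch: checking whether a logistic-regression baseline is
# consistent on adversarial pairs (sexist original, minimal non-sexist edit).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data: 1 = sexist, 0 = not sexist.
train_texts = [
    "women cannot be good leaders",
    "women belong in the kitchen",
    "a woman's place is at home",
    "everyone can be a good leader",
    "people belong wherever they choose",
    "my colleague gave a great talk",
]
train_labels = [1, 1, 1, 0, 0, 0]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

# Adversarial pairs: the second element keeps most of the wording but
# removes the sexist content. A valid detector should label the first
# as sexist (1) and the second as non-sexist (0).
pairs = [
    ("women cannot be good leaders", "tired people cannot be good leaders"),
]

consistent = 0
for sexist, modified in pairs:
    pred_orig, pred_mod = model.predict([sexist, modified])
    if pred_orig == 1 and pred_mod == 0:
        consistent += 1
consistency_rate = consistent / len(pairs)
print(f"consistent on {consistency_rate:.0%} of adversarial pairs")
```

With realistic training data, a low consistency rate on such pairs signals spurious lexical correlations; the same harness applies unchanged to CNN or BERT classifiers wrapped behind a `predict` interface.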

(e) Results/Findings

We show that confronting the model with the various aspects that comprise sexism as a socio-psychological construct helps improve the validity of the model. We also improve model reliability by discouraging models from learning spurious correlations with terms that are not associated with the construct, via adversarial modifications. Through an analysis of the errors that the machine learning models make, and of the strategies employed by humans to generate adversarial modifications, we detail insights on how to build better automated sexism detection tools.

(f) Conclusions and implications

We explored ways to connect decades of research on sexism measurement in social psychology with machine learning approaches to sexism detection. We proposed to leverage scale items both to induce codebooks for annotating large-scale social media data and as evaluation data for machine learning models. This approach is not specific to sexism and may be effectively applied to many of the constructs of interest to both the machine learning and psychology research communities. Yet, it also opens new challenges, such as developing methods to harmonize different proposed measures of a construct and to evaluate their alignment with the predictions of a machine learning model.

References

García-Cueto, E., Rodríguez-Díaz, F. J., Bringas-Molleda, C., López-Cepero, J., Paíno-Quesada, S., & Rodríguez-Franco, L. (2015). Development of the gender role attitudes scale (GRAS) amongst young Spanish people. International journal of clinical and health psychology, 15(1), 61-68.

Glick, P., & Fiske, S. T. (1996). The Ambivalent Sexism Inventory: Differentiating hostile and benevolent sexism. Journal of Personality and Social Psychology, 70(3), 491-512.

Pew Research Center (2017). Online Harassment 2017. https://www.pewresearch.org/internet/2017/07/11/online-harassment-2017/

Samory, M., Sen, I., Kohne, J., Flöck, F., & Wagner, C. (forthcoming). “Call me sexist, but...”: Revisiting sexism detection using psychological scales and adversarial samples. ICWSM.

Swim, J. K., Aikin, K. J., Hall, W. S., & Hunter, B. A. (1995). Sexism and racism: Old-fashioned and modern prejudices. Journal of Personality and Social Psychology, 68(2), 199.