Testing English News Articles for Lexical Homogenization Due to Widespread Use of Large Language Models at ACL 2025
At the ACL 2025 Student Research Workshop (28–29 July, 63rd Annual Meeting of the Association for Computational Linguistics), Sarah Fitterer, Dominik Gangl, and Jannes Ulbrich (TU Berlin) presented their paper Testing English News Articles for Lexical Homogenization Due to Widespread Use of Large Language Models.
The study asks whether the widespread adoption of LLMs has already altered the texture of written English in online news. By comparing corpora from 2018 and 2024, the authors tested whether language had become more uniform, using established lexical diversity metrics and introducing their own LLM-Style-Word Ratio to trace model-specific vocabulary.
The results show no decline in overall lexical diversity, yet a clear rise in LLM-style words. This points to a more subtle shift: language is not being reduced, but gently pulled toward particular stylistic grooves.
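For readers curious what such a comparison can look like in practice, here is a minimal sketch in Python. It is not the authors' implementation: the word list, the tokenizer, and the function names are hypothetical, the type-token ratio stands in for "established diversity metrics" in general, and the style-word ratio is assumed here to be the share of tokens that appear in a fixed list, which may differ from the definition used in the paper.

```python
import re

# Hypothetical list for illustration only; the paper's actual word list is not reproduced here.
LLM_STYLE_WORDS = {"delve", "showcase", "underscore", "intricate", "pivotal"}


def tokenize(text):
    """Lowercase the text and split it into simple alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def type_token_ratio(tokens):
    """Share of distinct words among all tokens (a basic lexical diversity measure)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def style_word_ratio(tokens, style_words=LLM_STYLE_WORDS):
    """Assumed definition: share of tokens that appear in the style-word list."""
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in style_words)
    return hits / len(tokens)


if __name__ == "__main__":
    # Toy sentences, not real corpus data.
    sample_2018 = "The council met on Monday and approved the budget."
    sample_2024 = "The council met to delve into the pivotal budget debate."
    for year, text in (("2018", sample_2018), ("2024", sample_2024)):
        tokens = tokenize(text)
        print(year, type_token_ratio(tokens), style_word_ratio(tokens))
```

Computing both numbers per article and comparing their distributions across the two years would mirror the pattern described above, where diversity can stay stable while the style-word share rises.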
The ACL Student Research Workshop is part of the largest global conference on computational linguistics. That first-year students from our program are already contributing on this stage is something to celebrate: it means visibility, academic exchange, and the chance to build connections that can shape future research and careers.