Alexander Spangher

Research

My research breaks down into two lenses: [Computer Science] | [Computational Journalism].

For AI to play a more integrated role in human tasks, it must align with human values and decision-making processes. Doing so, I believe, requires a radically different approach to conceptualizing and training AI: it requires an AI that understands, on a fundamental level, why humans make the decisions we do. Much of my work has focused on illuminating this process simply by observing the creative outputs humans produce. Our generative model for documents states that human actions can be observed through the discourse elements produced in the final document.

I imagine a society where professional journalists can produce more, higher-quality information at lower cost, improving our information ecosystem, better informing the public, and strengthening our democracy. Much of my work has focused on this vision and has targeted different steps in the news gathering process. For each step in our journalism workflow, I have developed computational methods to improve it.

Publications

πŸ†πŸŽ€ Do LLMs Plan Like Human Writers? Comparing Journalistic Coverage of Press Releases with LLMs
Alexander Spangher, Nanyun Peng, Sebastian Gehrmann, Mark Dredze.
Conference on Empirical Methods in Natural Language Processing 2024.
Acceptance rate 20.8%
Explaining Mixtures of Sources in News Articles
Alexander Spangher, James Youn, Matt DeButts, Nanyun Peng, Jonathan May.
Conference on Empirical Methods in Natural Language Processing 2024.
Acceptance rate 20.8%
πŸ† Stay on Topic with Classifier-Free Guidance
Alexander Spangher, Guillaume Sanchez, Honglu Fan, Elad Levi, and Stella Biderman.
Forty-first International Conference on Machine Learning 2024.
Acceptance rate 27.5%
Tracking the Newsworthiness of Public Documents
Alexander Spangher, Emilio Ferrara, Ben Welsh, Nanyun Peng, Serdar Tumgoren, and Jonathan May.
Association for Computational Linguistics 2024.
Acceptance rate 23.5%
🎀 LegalDiscourse: Interpreting When Laws Apply and To Whom
Alexander Spangher, Zihan Xue, Te-Lin Wu, Mark Hansen, Jonathan May.
North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2024.
[ paper ]
Acceptance rate 23%
Identifying Informational Sources in News Articles
Alexander Spangher, Nanyun Peng, Emilio Ferrara, and Jonathan May.
Conference on Empirical Methods in Natural Language Processing 2023.
Acceptance rate 23.3%
πŸ†πŸŽ€ First Steps Towards a Source Recommendation Engine: Investigating How Sources Are Used in News Articles
Alexander Spangher, James Youn, Jonathan May and Nanyun Peng.
Computation + Journalism 2023.
πŸ†πŸŽ€ NewsEdits: A News Article Revision Dataset and a Novel Document-level Reasoning Challenge
Alexander Spangher, Xiang Ren, Jonathan May and Nanyun Peng.
Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2022.
[ paper ]
Acceptance rate 26%
Multitask Semi-Supervised Learning for Class-Imbalanced Discourse Classification
Alexander Spangher, Jonathan May, Sz-rung Shiang, Lingjia Deng.
Conference on Empirical Methods in Natural Language Processing 2021.
[ paper ]
Acceptance rate 25.6%
🎀 StateCensusLaws.org: A Web Application for Consuming and Annotating Legal Discourse Learning
Alexander Spangher and Jonathan May.
Computation + Journalism 2022.
[ paper ]
🎀 News Discourse Patterns: A Roadmap for Computational Journalism
Alexander Spangher and Jonathan May.
Computation + Journalism March 2021.
[ paper ]
Enabling Low-Resource Transfer Learning across COVID-19 Corpora by Combining Event-Extraction and Co-Training
Alexander Spangher, Nanyun Peng, Jonathan May, Emilio Ferrara.
1st Workshop on NLP for COVID-19 at ACL 2020.
🎀 Don't quote me on that: Finding Mixtures of Sources in News Articles
Alexander Spangher, Nanyun Peng, Jonathan May, Emilio Ferrara.
Computation + Journalism 2020.
[ paper ]
🎀 Modeling Newsworthiness for Lead-Generation Across Corpora
Alexander Spangher, Nanyun Peng, Jonathan May, Emilio Ferrara.
Computation + Journalism 2020.
Southern California Natural Language Processing Conference 2019.
[ paper ]
🎀 Characterizing Search Engine Traffic to Internet Research Agency Web Properties
Alexander Spangher, Gireeja Ranade, Besmira Nushi, Adam Fourney, Eric Horvitz.
The Web Conference 2020, Taipei, Taiwan.
[ paper ]
Methodology for Building Scalable Knowledge Graphs using Pre-existing NASA Ontologies
Alexander Spangher, Jia Zhang, Rahul Ramachandran, Manil Maskey, Patrick Gatlin, J.J. Miller, Sundar Christopher.
IEEE Transactions on Cognitive Communications and Networking 2019.
Acceptance rate 19.4%
Actionable Recourse in Linear Classification
Alexander Spangher, Berk Ustun.
5th Workshop on Fairness, Accountability and Transparency in Machine Learning, ICML 2018.
Also presented in: Workshop on Ethical, Social and Governance Issues in AI, NIPS 2018.
[ paper ]

Publications (Middle Author)

πŸ† Are Large Language Models Capable of Generating Human-Level Narratives?
Yufei Tian, Tenghao Huang, Miri Liu, Derek Jiang, Alexander Spangher, Muhao Chen, Jonathan May, Nanyun Peng.
Conference on Empirical Methods in Natural Language Processing 2024.
[ paper ]
Acceptance rate 20.8%
DisruptionBench: A Robust Benchmarking Framework for Machine Learning-Driven Disruption Prediction
Lucas Spangher, Matteo Bonotto, William Arnold, Dhruva Chayapathy, Tommaso Gallingani, Alexander Spangher, Francesco Cannarile, Daniele Bigoni, Eliana De Marchi, Cristina Rea.
Journal of Fusion Energy 2024.
[ paper ]
Acceptance rate 20%
Autoregressive Transformers for Disruption Prediction in Nuclear Fusion Plasmas
Lucas Spangher, William Arnold, Alexander Spangher, Andrew Maris, Cristina Rea.
NeurIPS 2023 Workshop: Machine Learning and the Physical Sciences 2023.
[ paper ]
Acceptance rate 75.6%
Understanding multimodal procedural knowledge by sequencing multimodal instructional manuals
Te-Lin Wu, Alexander Spangher, Pegah Alipoormolabashi, Marjorie Freedman, Ralph Weischedel, Nanyun Peng.
61st Annual Meeting of the Association for Computational Linguistics 2023.
[ paper ]
Acceptance rate 24.1%
Actionable Recourse in Linear Classification. (Expanded Version)
Berk Ustun, Alexander Spangher, Yang Liu.
Conference on Fairness, Accountability and Transparency (FAT*), ACM 2019.
[ paper ]
Acceptance rate 27%
Characterizing the Internet Research Agency’s Social Media Operations During the 2016 US Presidential Election using Linguistic Analyses
Ryan L Boyd, Alexander Spangher, Adam Fourney, Besmira Nushi, Gireeja Ranade, James Pennebaker, Eric Horvitz.
Whitepaper 2019.
[ paper ]

Publications (In Submission)

NewsEdits 2.0: Learning the Intentions Behind Updating News
Alexander Spangher, Kung-Hsiang (Steeve) Huang, Hyundong Justin Cho and Jonathan May.
Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2025.
A Novel Multi-Document Retrieval Benchmark Grounded on Journalist Source-Selection in Newswriting
Alexander Spangher, Tenghao Huang, Yiqin Huang, Liheng Lai, Lucas Spangher, Sewon Min, Mark Dredze.
Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2025.
NewsInterview: a Dataset and a Playground to Evaluate LLMs' Ground Gap via Informational Interviews
Alexander Spangher, Michael Lu, Hyundong Justin Cho, Weiyan Shi, Jonathan May.
Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2025.
NewsHomepages: Homepage Layouts Capture Information Prioritization Decisions
Ben Welsh, Naitian Zhou, Arda Kaz, Michael Vu, Alexander Spangher.
Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2025.
PatentEdits: Framing Patent Novelty as Textual Entailment
Ryan Lee, Alexander Spangher, Xuezhe Ma.
Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2025.