Blog
by Neil Thurman
Too many numbers and worse word choice: why readers find news articles produced with automation harder to understand
Our new study has found that readers find traditionally-crafted news articles more comprehensible than articles produced with automation.
However, readers were equally satisfied with the automated and manually-written articles in terms of the ‘character’ of the writing and their narrative structure and flow.
We recruited 3,135 online news consumers in the UK to read either one of 12 articles produced with the help of automation or one of a dozen articles that had been fully manually-written by journalists.
Overall, the readers found the 12 automated articles significantly less comprehensible.
Furthermore, they were significantly less satisfied with how the automated articles used description to help them picture what the story was about and with the appropriateness and accessibility of some of the words chosen.
Readers were also significantly less satisfied with the way the automated articles handled numbers and data: from how many numbers they contained to how they used absolute and rounded numbers, percentages, and numeric analogies.
The deficiencies readers perceived in the automated articles’ handling of numbers and word choice partly explain why they were harder to understand.
The automated articles used in the study had been sub-edited by journalists before they were published. Despite this sub-editing, readers still found them less comprehensible than the articles that had been fully manually-written by journalists.
We suggest that when creating and/or sub-editing automated news articles, journalists and technologists should aim to reduce the quantity of numbers, better explain words that readers are unlikely to understand, and increase the amount of language that helps readers picture what the story is about.
Our results indicate the importance not only of maintaining human involvement in the automated production of data-driven news content, but of refining it.
Our study is the first both to investigate the relative comprehensibility of news articles produced with and without automation and to explore why a difference exists. The study’s results not only help assess readers’ acceptance of a widely-used form of automation in journalism but also reveal some of the underlying causes of the differences in readers’ evaluations.
The automated articles used in the study were generated by the RADAR news agency, which is partly owned by PA Media. After automation, the articles were sub-edited and subsequently published by news outlets including Birmingham World, Bloomberg, the Telegraph & Argus, the Edinburgh Evening News, The Northern Echo, The Scarborough News, the Dunfermline Press, the Glasgow Evening Times, the Northumberland Gazette, and the Northampton Chronicle & Echo.
The articles that had been fully manually-written by journalists were published at news outlets including Coventry Live, National World, iNews, Yorkshire Live, The Scottish Sun, Chronicle Live, Hull Live, The Courier (Fife), The Herald, and the Northampton Chronicle & Echo.
The articles covered a range of topics including crime, sport, the economy, policing, transport, public health, immigration, and social affairs.
The survey was conducted on our behalf by YouGov.
Readers were only shown articles that were relevant to where they lived. For example, only readers in the North East of England were shown an article about suicide rates across the North East of England.
To ensure the automated and manually-written articles were comparable, each of the 12 pairs of automated and manually-written articles was based on the same data source, featured the same story angle, and covered the same locality.
The study, entitled “Too many numbers and worse word choice: Why readers find data-driven news articles produced with automation harder to understand”, is published in the peer-reviewed academic journal, Journalism: Theory, Practice, and Criticism.