More than half of news summaries produced by AI chatbots contain “significant errors”, according to a new BBC study, calling into question the reliability of generative AI.
AI assistants from OpenAI, Microsoft, Google and Perplexity AI were given content from the BBC and asked questions about the news.
The results were then analysed by BBC journalists familiar with the topics who found that 51% of the summaries contained “significant errors”.
Almost all of the responses, 91%, “had at least some errors”, while 19% contained factual errors and 13% included quotes sourced from BBC articles that had been altered from the original or were not present in the cited article.
Examples of the errors included Google’s Gemini incorrectly stating that the NHS warns against using vaping as a method to help smokers quit. Additionally, Microsoft’s Copilot incorrectly stated that Gisèle Pelicot uncovered the crimes against her when she began having blackouts and memory loss. In fact, she found out about the crimes when police showed her videos confiscated from her husband.
The report said that as well as containing factual inaccuracies, the chatbots “struggled to differentiate between opinion and fact, editorialised, and often failed to include essential context”.
Overall, Google’s Gemini performed worst, with significant issues in 34% of its responses, followed by Microsoft’s Copilot at 27% and Perplexity AI at 17%. OpenAI’s ChatGPT performed best, with significant issues in only 15% of its responses.
“The scale and scope of errors and the distortion of trusted content is unknown,” warned Peter Archer, Programme Director of Generative AI at the BBC, in his foreword to the study.
“This is because AI assistants can provide answers on a very broad range of questions and users can receive different answers to the same or similar question. Audiences, media companies and regulators do not know the extent of the issue. It may be that AI companies do not know either.”
This is not the first time in recent months that generative AI’s ability to handle the news accurately has come under the microscope.
In January, Apple was forced to pause its error-strewn AI-generated news alerts following mounting pressure from news organisations. The BBC was among those to complain about the alerts, which were part of the company’s Apple Intelligence features.
As well as pulling the AI-generated notifications for news and entertainment headlines, Apple confirmed that all of its AI-generated notifications are now shown in italics to differentiate them from regular notifications.
Large language models (LLMs) that power generative AI chatbots like ChatGPT are trained on vast quantities of information from the internet, including news content from publishers.
In the past, OpenAI has signed deals with the Associated Press and News Corp, whose titles include the Wall Street Journal and The Times, allowing it to train its AI models on content from those publications.
However, not everyone has been so accommodating. In September, the New York Times filed a lawsuit against OpenAI and Microsoft for allegedly violating copyright law by using the NYT’s content to train AI models. More recently, Indian book publishers have brought their own suit against OpenAI.
Reflecting on the results of the study, Archer called on AI companies to “hear our concerns and work constructively with us” to understand how to rectify the issues identified and to establish a long-term approach to “ensuring accuracy and trustworthiness in AI assistants”.
Additionally, he urged AI companies, public service broadcasters such as the BBC, Ofcom (the UK’s communications regulator) and the government to establish an “effective regulatory regime” for AI.