Technological determinists tend to believe that wicked societal problems can be solved with the help of advanced artificial intelligence systems. However, a quick reality check shows that technology alone is not enough to solve social and political challenges such as information disorder or the so-called infodemic. Besides fundamental issues, such as the lack of a deep understanding of the problems we want to solve, there are also severe technical limitations, for instance around data. Disinformation is largely a social and political problem, and it needs a broader approach than technical solutions alone.
This was one conclusion of a European panel discussion on fake news on October 6, 2020. MediaMotorEurope arranged a webinar titled “AI to the Rescue of Combating Fake News” with panellists from all over Europe. One of the project’s focus areas is fake news, and the webinar was held to learn more about how European media are dealing with disinformation. The panellists were:
- Mrs Polya Stancheva – Programme Director, Bulgarian National Radio
- Mr Vincent Merckx – Editor and fact-checker, VRT NWS, Belgium
- Dr Anestis Fachantidis – co-founder of Medoid AI and Adjunct Lecturer at Aristotle University of Thessaloniki (AUTH), Greece, teaching Business Intelligence and Machine Learning
- Mr Tommy Shane – Head of Policy and Impact, First Draft, UK.
The discussion was moderated by Carl-Gustav Lindén, Associate Professor of Data Journalism at the University of Bergen in Norway. He opened with a brief introduction to the contested concept of fake news, loosely defined as deliberately misleading articles designed to mimic the look of actual articles from established news organizations. As an example of how contested the term is, the British government has decided that “fake news” should not be used at all in its public communication.
Tommy Shane noted that it is easy to go wrong when taking action: “I think it’s important that if you’re looking to intervene in the space, you understand specifically which parts of it you’re looking to tackle. Then also, why? Why is that the most important thing? So, if it’s deep fakes, well, why deep fakes? Do you see deep fakes having harm in the world? Do you see them as a realistic threat in the future?”
Shane also emphasized the important difference between truth and harm: “A lot of fake news talk is around whether something is true or not. However, that’s not the only axis to consider. Hate speech, for example, is considered in terms of whether it’s harmful to other people or harmful to society.”
The webinar covered a wide range of topics, but this blog post will focus on the part related to the title, “AI to the Rescue of Combating Fake News”. Detecting fake news is mostly done manually by fact-checkers, and the panellists were therefore asked where they see the technology moving. The responses varied. Anestis Fachantidis is a machine learning expert who applies his knowledge to problems around hate speech, which is closely related to fake news. His team mainly works with text mining and various machine learning methods to build models that detect hate speech or fake news. However, technology needs human input: “I’m very confident that things will start to converge to some solution that will combine human intervention and human moderation with machine learning methods. It’s a combination of crowd intelligence and the way you may combine feedback from multiple fact-checkers or a bigger audience with the machine learning techniques.”
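As a concrete illustration of that combination, here is a minimal human-in-the-loop sketch in Python, assuming scikit-learn and invented example posts, not any panellist’s actual system: a classifier scores incoming text, confident predictions are handled automatically, and uncertain ones are routed to human fact-checkers, whose verdicts can later be folded back into the training data.

```python
# Minimal human-in-the-loop sketch: route low-confidence predictions
# to human fact-checkers instead of trusting the model blindly.
# All data here is invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical posts labelled by fact-checkers (1 = problematic, 0 = benign).
texts = [
    "miracle cure suppressed by doctors",
    "city council meets on tuesday",
    "secret plot revealed, share before it is deleted",
    "rain expected this weekend",
]
labels = [1, 0, 1, 0]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

def triage(post, threshold=0.8):
    """Auto-label confident predictions; escalate the rest to humans."""
    probs = model.predict_proba([post])[0]
    if probs.max() >= threshold:
        return ("auto", int(probs.argmax()))
    return ("human_review", None)  # queue for a fact-checker

print(triage("doctors hide this one weird cure"))
```

The threshold is the key design choice: lowering it automates more decisions, raising it sends more content to humans, which is exactly the trade-off between machine scale and human judgement the panel discussed.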
Early detection problem
There are specific problems with machine learning methods, such as the need for quality data. Fachantidis explained:
“One critical challenge today is early detection, which machine learning methods for fake news currently lack. We can process and identify an article only after it has been spread, after it has been circulated, shared or posted. The second big challenge is the human bias in labelling. Labelling is the term we use in machine learning for flagging. When we have human annotators, moderators and fact-checkers who flag an article as fake or not, there is bias. There can be many mistakes there. Our models are only as good as their data.”
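The labelling-bias point can be made concrete with a few lines of Python. In this invented example, several fact-checkers flag the same articles, the votes are aggregated by majority, and the level of disagreement is kept as a signal of possible label noise:

```python
# Aggregating flags from multiple annotators by majority vote.
# Articles and votes are invented for illustration.
from collections import Counter

# Each entry: one article's labels from three hypothetical fact-checkers.
annotations = {
    "article_1": ["fake", "fake", "fake"],  # unanimous
    "article_2": ["fake", "real", "fake"],  # contested
    "article_3": ["real", "fake", "real"],  # contested
}

for article, votes in annotations.items():
    label, count = Counter(votes).most_common(1)[0]
    agreement = count / len(votes)
    print(f"{article}: label={label}, agreement={agreement:.2f}")
    # Low agreement marks examples a model should down-weight
    # or send back for another round of review.
```

Contested labels are precisely where “only as good as their data” bites: training on them without accounting for the disagreement bakes the annotators’ bias into the model.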
For Tommy Shane, developing AI is largely a technical problem, while fake news is a social and political problem where training and education in politics and society are necessary to understand the dynamics. That thinking has to be embedded when AI is applied to societal problems.
“Someone on your team should be able to think through those difficulties because the technical and political elements are really fused. Part of the human and technological relationship needs to be mirrored by a political and technical understanding of the problem. It’s quite a lot to expect that an expert in AI would have had the opportunity to really deeply think through those problems.”
First Draft does use some machine learning in its work, generally to speed up repetitive tasks that journalists would otherwise have to do manually.
“An example of this is our research into vaccine misinformation, where we tagged different social media posts as making a certain argument against vaccines. We do a whole load of manual annotation on that. We are therefore looking at training AI to predict how we would tag those social media posts, so that we can scale up and speed up that process of analysis to many tens of thousands of posts, rather than the hundreds we are able to do manually. Now, that’s a task AI could be really well suited to solve”, said Tommy Shane.
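As a rough sketch of that scaling idea, with placeholder posts, tags and model choice rather than First Draft’s actual pipeline, one could train a simple classifier on the manually tagged posts and then apply it to a much larger untagged batch:

```python
# Sketch of scaling manual annotation with a classifier.
# Posts, tags and model choice are placeholders for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# In practice: hundreds of manually tagged posts; toy stand-ins here,
# each tagged with the anti-vaccine argument it makes.
tagged_posts = [
    "vaccines cause illness",
    "it's all about big pharma profits",
    "natural immunity is enough",
    "they rushed the trials",
]
tags = ["safety", "profit", "alternatives", "safety"]

tagger = make_pipeline(TfidfVectorizer(), MultinomialNB())
tagger.fit(tagged_posts, tags)

# The trained model can then tag tens of thousands of posts in bulk.
new_posts = ["big pharma just wants your money", "the trials were far too fast"]
print(tagger.predict(new_posts))
```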
The data issue
As noted, a major limitation for machine learning methods is access to quality data. Is it actually possible to do proper fact-checking of social media without access to the platforms’ data? We know that Facebook is not eager to share data, and neither is Google, while Twitter shares some. That is why most research on social networks is done on Twitter: it is the platform whose data researchers can actually get their hands on.
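To show what that access looks like in practice, here is a minimal sketch using the tweepy library against the Twitter search API. The bearer token is a placeholder, and access terms and tiers have changed over time, so treat it purely as an illustration of the workflow:

```python
# Pull recent public tweets on a topic via the Twitter API (tweepy v4+).
# The credential is a placeholder; real access requires a developer account.
import tweepy

client = tweepy.Client(bearer_token="YOUR_BEARER_TOKEN")

response = client.search_recent_tweets(
    query='"fake news" -is:retweet lang:en',  # public tweets, no retweets
    max_results=100,
)
for tweet in response.data or []:
    print(tweet.id, tweet.text[:80])
```

Nothing comparable exists for ordinary Facebook content, which is why so much published research defaults to Twitter.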
However, Facebook and Google are the main funders of fact-checking initiatives in the world. Tommy Shane noted that Facebook’s third-party fact-checking scheme focuses on getting data manually annotated by experts. That, in a way, solves Facebook’s data production problem: how to get the data to train AI models. Fact-checks are focused on Facebook because that is where the money is, since the company pays external experts.
These circumstances affect what is checked and what is not, says Tommy Shane: “One of the tricky things is that there’s an economic incentive to pay for fact-checks that steers the entire fact-checking industry in a certain direction. This is specifically towards social media content rather than the statements of politicians, which is generally what most fact-checkers say that they’re set up to do, to hold politicians to account. That’s not what Facebook is focused on. Facebook cares about moderation of content on their platform.”
This has become the infrastructure for producing data from fact-checking: the circumstances and constraints that shape the data produced. Further, the Facebook data that fact-checkers have access to represents only about 10 to 15 per cent of Facebook’s data, and it reflects the system that produces the data rather than reality itself.
“We really don’t know what’s happening on Facebook. It’s obviously very difficult for us to have a clear picture and more dangerously, we think we have the full picture from the small data we have access to and we actually don’t.”
Therefore, the panellists were asked whether there can be a sustainable fact-checking model without platforms such as Facebook, Twitter, Google or YouTube being a key part of it. Can the European Union create its own model for fact-checking or for disinformation policy without involving the dominant distributors of misinformation in that process?
Vincent Merckx acknowledged that he had mixed feelings about the issue. On the one hand, there may be no way around big platforms like Twitter and Facebook. On the other hand, they are not transparent about how their algorithms or their moderation work.
“I also see right now how their moderation choices are questionable, to say the least; they are very United States centric. They impose rules on societies all around the globe. Now, if we look at the European Union, it is clear that we have a major lack of knowledge at the political level about how these platforms work and about how we could tackle certain problems that they create.”
Without this knowledge at the political level, he cannot see how the EU could impose rules or have a voice. Further, as Tommy Shane underscored, these social media platforms are businesses with an economic imperative. Part of the business model is the ability to moderate content on their platforms in order to avoid regulation. This means they have a specific economic interest in how information is moderated, and they use organizations such as fact-checkers for that purpose.
According to the moderator, one important takeaway from the webinar was the limitations of applying artificial intelligence to fact-checking. It is easy to believe that technology will take care of the problem, but we see social media companies employing armies of people to go through what users are posting and still not succeeding in keeping all unwanted content off their platforms. However, that should certainly not stop people from innovating.