
12 September 2024

"The models developed in the KISTRA project can help to detect unlawful content on the Internet."

Using artificial intelligence to combat hate crime on the internet: An interview with Dr. Robert Pelzer and Michael Hahne on the potential of AI models for the early detection of crimes, and on the ethical and legal frameworks for their use.

In the research project "KISTRA – Using Artificial Intelligence (AI) for the Early Detection of Crimes," you investigated AI-based solutions that pre-sort potentially unlawful statements in online mass data for investigating authorities. Could you briefly explain how such an AI method works?

Robert Pelzer: As part of the BMBF-funded KISTRA research network, AI models were trained using machine learning to help law enforcement more quickly identify unlawful hate speech, such as incitement to hatred, in large amounts of data. In the sub-project at TU Berlin, we developed ethical requirements for the use of such AI models in law enforcement agencies. These include ensuring the quality of the algorithms, ensuring that they are non-discriminatory, and preserving human decision-making autonomy. To train the models, data sets were created in which police experts annotated hate speech and similar offenses. The application works like this: First, the data that is to be examined for unlawful hate speech is imported into the tool; this can be posts from online platforms such as Telegram channels. The AI then classifies the extent to which the data contains evidence of what it was trained to recognize, for instance incitement to hatred. Finally, the results of the classification are reviewed and processed by law enforcement experts.
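
To make the pre-sorting step more concrete: the trained KISTRA models themselves are not public, so the following minimal sketch uses a simple scikit-learn text classifier as a stand-in. The annotated examples, the threshold, and the post contents are illustrative assumptions; the point is only the workflow in which the model scores imported posts and the highest-scoring ones are put in front of human evaluators.

# Illustrative stand-in for the trained model: a simple text classifier
# fitted on a hypothetical annotated data set.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = ["annotated post A", "annotated post B", "annotated post C"]
train_labels = [1, 0, 0]  # 1 = annotated as potential incitement to hatred

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

# Imported data, e.g. posts exported from a Telegram channel.
posts = ["new post 1", "new post 2", "new post 3"]
scores = model.predict_proba(posts)[:, 1]

# Pre-sorting: the highest-scoring posts are presented to human evaluators
# first; the model only suggests, it never decides.
for score, post in sorted(zip(scores, posts), reverse=True):
    if score > 0.5:  # threshold chosen purely for illustration
        print(f"{score:.2f}  flag for expert review: {post}")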

How can law enforcement agencies integrate this kind of tool into their day-to-day work?

Robert Pelzer: Imagine that law enforcement agencies receive reports of hate crimes from various Telegram channels on the far-right and conspiracy-ideology spectrum. By using the AI tool, they could quickly narrow down the tens of thousands of posts and comments. As a result, investigators can focus their attention more quickly on individual cases and on identifying suspects. This saves time that would otherwise be needed to manually screen the data and allows criminal cases to be identified more efficiently. In other words, more crimes and offenders can be identified in the same amount of time.

What ethical and social issues did you investigate in the sub-project?

Michael Hahne: We examined substantive and procedural aspects, including infringements on civil liberties, the fairness of the AI models, the autonomy of evaluators, the transparency of the AI system, the effectiveness of the process, the legitimacy of the purposes, and accountability. We then analyzed the specific requirements that these aspects raise for regulating the use of AI in national security policing.

What were the outcomes of the TU Berlin sub-project and how are they to be implemented?

Robert Pelzer: We have developed a comprehensive set of requirements for ethical and trustworthy AI in national security policing. The European Parliament recently adopted the AI Regulation, which contains comprehensive provisions for the use of artificial intelligence, including by security authorities. The legislation addresses many important ethical requirements, such as the need to certify AI systems for trustworthiness and non-discrimination. The challenge, however, lies in determining what trustworthiness and non-discrimination mean in specific law enforcement scenarios, such as national security, and exactly what criteria should be used to evaluate an AI model. We want to contribute to these efforts with our research. However, we still need an appropriate infrastructure in order to be able to use the AI models developed in the KISTRA project in practice. In our view, there is also a need for solutions when it comes to “maintaining” the models, that is, follow-up training with new, and therefore more up-to-date, data collected during operation. What is more, under the AI Regulation, law enforcement agencies must first have quality assurance procedures in place and, last but not least, the models must be certified, as I mentioned earlier. There is a long road ahead of us.

How can the quality of the training data and the traceability of the AI system be ensured?

Michael Hahne: These are still outstanding issues that need to be resolved before such AI systems can be implemented in police work. From an ethical standpoint, the performance of the AI models must be validated by external bodies. The fairness of an AI model should also be checked, meaning, for example, that a training data set for incitement to hatred is balanced with regard to different victim groups. It is also important that an AI model is as non-discriminatory as possible; in other words, characteristics such as religion and ethnicity must not play a role in the classification. Traceability means, among other things, that internal control bodies can track the AI-supported evaluation process from beginning to end and continuously implement improvements. From a technical point of view, suitable methods must therefore be found to visualize and describe the relationships that are relevant for an assessment; from the perspective of the users in the authorities, documentation obligations for decision-making must be met.
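
As a small illustration of the balance check mentioned above, the following sketch simply counts how the positive examples in a hypothetical annotated data set are distributed across victim groups. The field names and group labels are assumptions for illustration, not the annotation scheme actually used in KISTRA.

from collections import Counter

# Hypothetical annotated examples; each records the group targeted by the post.
annotations = [
    {"label": "incitement", "target_group": "religious minority"},
    {"label": "incitement", "target_group": "ethnic minority"},
    {"label": "no_offense", "target_group": "religious minority"},
]

per_group = Counter(a["target_group"] for a in annotations if a["label"] == "incitement")
total = sum(per_group.values())
for group, count in per_group.items():
    print(f"{group}: {count} positive examples ({count / total:.0%})")
# A strongly skewed distribution would suggest the model may perform worse,
# or over-trigger, for under- or over-represented groups.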

The AI process could infringe fundamental rights like data protection and freedom of speech. How can this be counteracted?

Michael Hahne: Police surveys and evaluations of data in public social media spaces constitute an encroachment on the right to informational self-determination, as this is personal data. Everyone is free to decide for themselves what happens to their data, even if the posts they make are public. When large amounts of data make the leap from being viewed manually to being automatically collected, stored, and analyzed by an AI, the rights of a potentially large number of users are affected. Infringements of fundamental rights require a legal basis. Data processing must be legitimate, serve a specific purpose, and be pseudonymized. It is crucial to select the data carefully in order to limit the number of people affected. Documentation obligations and deletion protocols must also be implemented by design here.
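
One safeguard of the “by design” kind described here could be pseudonymization of user identifiers before posts enter the evaluation pipeline. The sketch below uses a keyed hash so that re-identification requires a separately stored key; the key handling and field names are assumptions made for illustration only.

import hashlib
import hmac

# Placeholder key: in practice it would be stored and managed separately
# under strict access control so that pseudonyms cannot be reversed casually.
PSEUDONYM_KEY = b"stored-separately-under-access-control"

def pseudonymize(username: str) -> str:
    """Replace a username with a stable, keyed pseudonym."""
    digest = hmac.new(PSEUDONYM_KEY, username.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]

post = {"author": "some_user", "text": "...", "source": "public Telegram channel"}
post["author"] = pseudonymize(post["author"])  # the clear name never enters the tool
print(post)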

Encroachments on the right to informational self-determination by law enforcement require a legal basis. What would that be in this case?

Robert Pelzer: In the case of suspected criminal offenses, it would be the German Code of Criminal Procedure. However, the state police laws, or in the case of the Bundeskriminalamt (BKA), which operates at the federal level, the BKA Act, also authorize law enforcement to take preventive action to avert danger. This includes investigating emerging risks. With the exception of a few federal states, criminal procedural law and police law do not yet contain any special regulations for Internet evaluations, which means that such infringements must be based on what we call general clauses. However, this is only possible if the intensity of the infringement of civil liberties remains minimal; otherwise, specific legal bases are required. The KISTRA team at the Ruhr-Universität Bochum, led by Professor Dr. Sebastian Golla, worked extensively on the legal framework for Internet evaluations and the use of AI.

How can automated decisions be effectively reviewed?

Michael Hahne: The decision as to whether there is a suspicion of a crime or whether a risk exists must never be made automatically by an AI system. If an AI is used to support decision-making in individual cases or, as in the example described earlier, to identify potentially unlawful statements in mass data, it must be ensured that the AI results are treated by the evaluators as suggestions, not as decisions made by a machine. The results must therefore always be critically reviewed by a human, and there must be no room for pseudo-reviews. This could be supported, for example, through measures to reduce stress and monotonous work processes, regular quality controls such as random spot checks, and continuous training based on findings and experience. In this way, evaluators acquire the skills needed to correctly evaluate new phenomena and ambiguous cases.
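
The random spot checks mentioned above could, for instance, look like the following sketch: a fixed share of items that evaluators have already processed is drawn at random and routed to an independent second review. The sampling rate and record fields are illustrative assumptions.

import random

def draw_spot_check_sample(reviewed_items, rate=0.05, seed=None):
    """Return a random subset of already-reviewed items for an independent second review."""
    rng = random.Random(seed)
    k = max(1, int(len(reviewed_items) * rate))
    return rng.sample(reviewed_items, k)

# Hypothetical log of evaluator decisions on AI-flagged posts.
reviewed = [{"post_id": i, "ai_score": 0.8, "decision": "no_offense"} for i in range(200)]
for item in draw_spot_check_sample(reviewed, rate=0.05, seed=42):
    print("route to second review:", item["post_id"])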

How secure are these AI tools?

Robert Pelzer: The IT security requirements for law enforcement systems are high as a matter of principle, especially when it comes to sensitive data. The AI models must be afforded the same level of protection. Special care must be taken in the selection of training data to prevent attacks on the models. Internal quality management procedures, user training, and an internal audit body are necessary to prevent abuse by government and security agencies. A state supervisory authority ought to regularly monitor the quality and legality of internal audit procedures.

What content do platform operators currently have to report?

Robert Pelzer: The European Digital Services Act (DSA), which does not specify a list of reportable offenses, took effect on 17 February of this year. Under Article 18 of the DSA, content "involving a threat to the life or safety of a person or persons" must be reported. Whether statements inciting hatred as defined in Section 130 of the German Criminal Code, for example, meet this threshold is left to the assessment of the respective platform operator subject to the reporting obligation in each individual case. However, it is unlikely that large platform operators like Meta will report hate posts on a large scale. Users can also report hate posts to reporting centers such as “Hessen gegen Hetze,” REspect! and HateAid, which forward unlawful content to the BKA. Nevertheless, law enforcement authorities also have a duty to proactively prosecute hate crime; one example is the campaigns organized by the BKA to take targeted action against it. The models developed in the KISTRA project can help to detect such unlawful content on the Internet.

Do you think that users of social media platforms will make fewer hateful and potentially criminal statements if they know that the AI process will accelerate prosecution?

Michael Hahne: The effect it has will depend on various factors. Users might only be deterred if they feel that criminal statements are likely to lead to legal consequences. But it also depends on how safe users feel on a given platform and how well they can disguise their own identity online. The individual's ability to relate the perceived threat of prosecution to their own actions is also relevant. This means that using AI in law enforcement can initially only help make more criminal content visible. Whether the people behind the user names can be identified is unfortunately a different matter altogether. Ultimately, AI methods are just one piece of the puzzle in improving law enforcement on the Internet.


This interview was first published on the website of the Technische Universität Berlin and was conducted by Barbara Halstenberg.