
X’s latest findings reveal troubling trends in AI moderation


After a two-year hiatus, X (formerly Twitter) finally released its latest transparency report for the first half of 2024 in September – a marked departure from its previous practice of more frequent disclosures. Transparency reports, which platforms like Facebook and Google also release, are intended to provide insight into how companies curate content, enforce policies and respond to government requests.

The most striking revelation in the latest data: an alarming number of child exploitation reports, yet a significant drop in actions against hateful content.

The report highlights a major trend in content moderation: X’s increasing reliance on artificial intelligence to identify and manage harmful behavior. According to the platform, moderation is enforced through “a combination of machine learning and human review,” with AI systems taking immediate action or flagging content for further investigation. Can machines really take on the responsibility of moderating sensitive issues – or are they exacerbating the problems they are meant to solve?
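
X does not disclose how this pipeline works internally, but the report's description of automated action plus escalation to human reviewers matches a common triage pattern. The sketch below is a minimal, hypothetical illustration of that pattern; the thresholds, function names and scoring model are assumptions, not X's actual system.

```python
# Minimal sketch of a hybrid AI + human-review moderation triage.
# All thresholds, names and the placeholder scorer are hypothetical.

from dataclasses import dataclass

AUTO_ACTION_THRESHOLD = 0.95   # act immediately when the model is highly confident
HUMAN_REVIEW_THRESHOLD = 0.60  # grey-zone content is flagged for human investigation

@dataclass
class Post:
    post_id: str
    text: str

def model_score(post: Post) -> float:
    """Placeholder for an ML classifier returning a harm probability in [0, 1]."""
    # A real system would call a trained model here.
    return 0.0

def triage(post: Post) -> str:
    score = model_score(post)
    if score >= AUTO_ACTION_THRESHOLD:
        return "auto_removed"             # immediate automated action
    if score >= HUMAN_REVIEW_THRESHOLD:
        return "queued_for_human_review"  # flagged for further investigation
    return "no_action"

if __name__ == "__main__":
    print(triage(Post("1", "example post text")))
```

The key design choice in such systems is where the two thresholds sit: set them too aggressively and benign content is removed automatically; set them too loosely and harmful content never reaches a human reviewer.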

More red flags, less action

X’s transparency report reveals staggering figures. In the first half of 2024, users reported more than 224 million accounts and tweets – a huge increase compared to the 11.6 million accounts reported in the second half of 2021. Despite this roughly 1,830% jump in reports, account suspensions grew only modestly, rising from 1.3 million in the second half of 2021 to 5.3 million in 2024 – an increase of about 300%. More striking still: of the more than 8.9 million reports related to child safety, X removed only 14,571 pieces of content this year.
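
The percentages track the raw numbers quoted above; a quick back-of-the-envelope check (small deviations are rounding):

```python
# Back-of-the-envelope check of the growth figures quoted above.
reports_2021, reports_2024 = 11.6e6, 224e6
suspensions_2021, suspensions_2024 = 1.3e6, 5.3e6

report_growth = (reports_2024 - reports_2021) / reports_2021 * 100
suspension_growth = (suspensions_2024 - suspensions_2021) / suspensions_2021 * 100

print(f"reports: +{report_growth:.0f}%")          # ~ +1,831%, the ~1,830% cited
print(f"suspensions: +{suspension_growth:.0f}%")  # ~ +308%, the ~300% cited
```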

Some of the discrepancy in these figures can be attributed to changing definitions and policies regarding hate speech and disinformation. Under former management, X issued extensive 50-page reports detailing takedowns and policy violations. The latest report, on the other hand, is 15 pages long and uses newer measurement methods.

Additionally, the company has rolled back its rules on COVID-19 misinformation and no longer classifies misgendering or deadnaming as hate speech, complicating enforcement metrics. For example, the report shows that X suspended only 2,361 accounts for hateful conduct, compared to 104,565 in the second half of 2021.

Can AI make moral judgments?

As AI becomes the backbone of content moderation systems across global platforms – including Facebook, YouTube and X itself – questions remain about its effectiveness in tackling harmful behavior.

It has long been shown that automated moderation systems are prone to errors: they struggle to accurately interpret hate speech and often misclassify benign content as harmful. In 2020, for example, Facebook’s automated systems mistakenly blocked ads from struggling businesses, and in April this year its algorithm incorrectly flagged posts from the Auschwitz Museum as violating community standards.

Furthermore, many algorithms are developed using datasets mainly from the Global North, which can lead to insensitivity to diverse contexts.

A September memo from the Center for Democracy and Technology highlighted the pitfalls of this approach, noting that a lack of diversity in natural language processing teams can negatively impact the accuracy of automated content moderation, especially in dialects such as Maghrebi Arabic.

As the Mozilla Foundation notes: “Because it is intertwined with all aspects of our lives, AI has the potential to reinforce existing power hierarchies and societal inequalities. This raises questions about how we can responsibly address potential risks to individuals when designing AI.”

In practice, AI falters in more nuanced areas, such as sarcasm or coded language. This inconsistency may also explain the decline in hate speech actions on X, where AI systems struggle to identify the full spectrum of harmful behavior.

Despite advances in language AI technology, detecting hate speech remains a complex challenge. A 2021 study from Oxford and the Alan Turing Institute tested different AI hate speech detection models, revealing significant performance differences. Their test suite, HateCheck, includes targeted tests for different types of hate speech and for non-hateful scenarios that often confuse AI. Applying it to tools like Google Jigsaw’s Perspective API and Two Hat’s SiftNinja, the study found that Perspective over-flagged non-hateful content, while SiftNinja under-detected hate speech.
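
For context, tools like Perspective expose this kind of scoring through a web API that returns a probability-like toxicity score; the platform then decides what to do with that number. The sketch below assumes the publicly documented Perspective `commentanalyzer` endpoint and a valid API key; the 0.8 flagging threshold is a hypothetical choice for illustration, not a recommended setting.

```python
# Minimal sketch of scoring a post with Google Jigsaw's Perspective API.
# Assumes the documented v1alpha1 comments:analyze endpoint and a valid API key;
# the 0.8 threshold is hypothetical.
import requests

API_KEY = "YOUR_API_KEY"  # placeholder
URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
       f"comments:analyze?key={API_KEY}")

def toxicity_score(text: str) -> float:
    payload = {
        "comment": {"text": text},
        "languages": ["en"],
        "requestedAttributes": {"TOXICITY": {}},
    }
    response = requests.post(URL, json=payload, timeout=10)
    response.raise_for_status()
    data = response.json()
    return data["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

if __name__ == "__main__":
    score = toxicity_score("You are a wonderful person.")
    print("flag for review" if score > 0.8 else "leave up", score)
```

The over-flagging and under-detection the Oxford study found are, in effect, disagreements about where that threshold sits and how the underlying model scores edge cases such as reclaimed slurs or counter-speech.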

An overreliance on AI moderation also threatens to infringe on free speech, especially for marginalized communities that often rely on coded language. Researchers argue that even well-optimized algorithms can exacerbate existing problems in content policy: as platforms automate their moderation processes, they risk restricting legitimate expression.

A wider lens

X’s struggle is not unique. Other platforms, such as Meta (owner of Facebook, Instagram and Threads), have shared similar journeys with their AI moderation systems. Meta has acknowledged that its algorithms often fail to adequately identify misinformation or hate speech, resulting in false positives and missed instances of harmful behavior.

The troubling trends highlighted in X’s transparency report could pave the way for new regulatory policies, especially as several US states move forward with social media restrictions for minors. Experts at the AI Now Institute are calling for more accountability from platforms regarding their AI moderation systems and pushing for transparency and ethical standards. Lawmakers may need to consider regulations that require a more effective combination of AI and human moderators to ensure fairer moderation.

As platforms increasingly shape political and social discourse, especially in light of major upcoming elections, the stakes are high. However, the current landscape has made data harder to access: Meta shut down CrowdTangle in August, an analytics tool that allowed researchers to monitor social media posts and detect misinformation, and Elon Musk ended free access to the X API in early 2023, further limiting researchers’ ability to study these trends.

In the long term, social media platforms may need to address the ethical challenges associated with AI-driven moderation. Can we trust machines to make moral judgments about the content we consume? Or will platforms need a more fundamental overhaul to move towards greater fairness and accountability?


