No, Twitter did not say only 5% of users are spam

Ever since Elon Musk raised concerns about spam accounts on Twitter, tons of Twitter experts, tech media, and “social media analyst companies” have been talking about how Twitter’s claim in its filing that less than 5% of its users are spam is wrong, and how their own estimates are much, much higher.

The only problem: Twitter did not exactly make that claim, and as usual the tech media decided to ignore that, deliberately I believe (more on that later in the article).

The Claim

Let's first look at the filing that everyone keeps referring to. The actual claim is:

Average of false or spam accounts during the fourth quarter of 2021 represented fewer than 5% of our mDAU during the quarter.

Let's define the terms:

DAU: Daily active user
mDAU: Monetizable daily active user

The “m” is super important. So what is the difference? While I would love to think that it’s industry-specific terminology that most people do not get, Twitter actually defines it in its annual report for anyone who bothers to read it.

We define mDAU as people, organizations, or other accounts who logged in or were otherwise authenticated and accessed Twitter on any given day through twitter.com, Twitter applications that are able to show ads, or paid Twitter products, including subscriptions

So what Twitter is saying is that, of the people they could have shown ads to, fewer than 5% were spam as per their estimates.

This implies you have to exclude any accounts that tweet using systems where no ads can be shown.

For example, it's likely that you would not see an ad if you used an API to post a tweet, and this may extend to third-party clients that allow you to post. I sometimes use Roam (my notes app) to post directly, for instance.

Twitter APIs allow you to post 200 tweets in a span of 15 minutes.

This changes a lot

Lots of bots and spam accounts would be using automated scripts and APIs to post. They would never be on a surface where they can be shown ads, hence they are non-monetisable. They are not counted.
Real users tweeting using certain clients (or automated scripts like IFTTT) may not be counted.
Any account which is spam or even likely spam may be tagged as such by the ad engine and removed from potential monetisation, and hence not counted. Twitter even mentions that in its filing, in the same paragraph as the 5% claim:

After we determine an account is spam, malicious automation, or fake, we stop counting it in our mDAU, or other related metrics

So possibly a large swath of accounts that may be labeled as potentially spam or fake never get to see an ad, and hence are not counted.

Not every potential spam account is deleted, possibly because there can be a lot of false positives. A lot of real people behave like bots, and the ad engine may have stricter rules.

Fun thought exercise: if you behave like a bot, do you get ad-free Twitter?

A good visualisation of this would be something like this

Monetised spam accounts are 5% of the total green rectangle, which represents mDAU.

So fake accounts on Twitter could be 20% or even 50% of all accounts; if they are not being monetised, they are not counted.
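To make the arithmetic concrete, here is a minimal sketch with entirely made-up numbers. None of these figures come from Twitter; the account counts, the function name and the parameter names are assumptions for illustration only. It shows how spam can stay under 5% of mDAU while being a far larger share of all active accounts.

```python
# Hypothetical numbers for illustration only; none come from Twitter's filings.

def spam_shares(total_active, excluded_spam, excluded_real, spam_in_mdau):
    """Return (mDAU, spam share of mDAU, spam share of all active accounts).

    total_active  -- all daily active accounts (assumed)
    excluded_spam -- spam accounts already removed from the monetisable pool (assumed)
    excluded_real -- real users who never reach an ad surface, e.g. API-only posters (assumed)
    spam_in_mdau  -- spam accounts that slipped through into mDAU (assumed)
    """
    mdau = total_active - excluded_spam - excluded_real
    share_of_mdau = spam_in_mdau / mdau
    share_of_all = (excluded_spam + spam_in_mdau) / total_active
    return mdau, share_of_mdau, share_of_all


mdau, share_of_mdau, share_of_all = spam_shares(
    total_active=400_000_000,
    excluded_spam=150_000_000,
    excluded_real=30_000_000,
    spam_in_mdau=10_000_000,
)

print(f"mDAU: {mdau:,}")
print(f"Spam share of mDAU: {share_of_mdau:.1%}")
print(f"Spam share of all active accounts: {share_of_all:.1%}")
```

With these assumed inputs, spam is about 4.5% of mDAU but 40% of all active accounts, and both statements are true at the same time.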

The main claim, in some sense, is: if an advertiser spends money to reach users on Twitter, fewer than 5% of those users would be fake.

This is an advertiser-facing metric, not a user-facing one. Your own experience is not what is being measured.

Now, coming back to how it gets reported. Remember the screenshot of Reuters I shared above? In the subheading they do make that distinction, indicating that they know the difference but chose NOT to talk about it in the main heading.

This is repeated in many articles across various tech media sites. Either they ignore it and assume mDAU = DAU (which is incompetence), or they hide it in the text, which I think is not very ethical.

This distinction is so important that it needs to be called out in the MAIN heading.

Also read how other tech companies have similar metrics:

There is nothing shady about twitter’s mDAU metric

Some examples of media reporting:

Reuters

Bloomberg

Forbes

Business Insider

So does Twitter not have a bot problem?

Not exactly. “There are 5% fake users on the platform” is very different from “of the people who can be shown ads, only 5% are fake”.
Verifying this claim would require the following:

Who was monetised
Take a sample of these monetised users
Define and agree on the principles of what is a spam/fake account
See what percentage of these users fit that definition

This is why it's almost impossible to verify this claim without having access to Twitter's internal systems.
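If that internal data were available, the sampling steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the list of monetised accounts, the is_spam reviewer function, the sample size and the synthetic population are all hypothetical, and the Wilson score interval is my choice of a standard way to put error bars on a sampled proportion, not something Twitter has said it uses.

```python
import math
import random


def wilson_interval(spam_count, sample_size, z=1.96):
    """Approximate 95% Wilson score interval for a sampled proportion."""
    p = spam_count / sample_size
    denom = 1 + z**2 / sample_size
    centre = (p + z**2 / (2 * sample_size)) / denom
    half = z * math.sqrt(
        p * (1 - p) / sample_size + z**2 / (4 * sample_size**2)
    ) / denom
    return centre - half, centre + half


def estimate_spam_rate(monetised_accounts, sample_size, is_spam, seed=0):
    """Sample monetised accounts, apply the agreed spam definition, estimate the rate."""
    rng = random.Random(seed)
    sample = rng.sample(monetised_accounts, sample_size)
    spam_count = sum(1 for account in sample if is_spam(account))
    return spam_count / sample_size, wilson_interval(spam_count, sample_size)


# Synthetic stand-in for "who was monetised": exactly 4% of accounts flagged as spam.
population = [{"id": i, "spam": i % 25 == 0} for i in range(100_000)]

# Hypothetical reviewer: in reality this is the agreed definition applied by humans.
rate, (low, high) = estimate_spam_rate(
    population, sample_size=9_000, is_spam=lambda account: account["spam"]
)
print(f"Estimated spam share of monetised users: {rate:.2%} ({low:.2%} to {high:.2%})")
```

The code is the easy part. The two inputs it depends on, the list of monetised accounts and an agreed definition of spam, are exactly what outsiders do not have.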

The percentage of spam accounts on the platform severely affects users and has a negative effect on user experience. This absolutely needs to be addressed, but the claim Twitter is making is not about a user-facing metric but rather an advertiser-facing one.

The big question that needs an answer is: what percentage of Twitter’s daily active users are in the monetisable bucket? But even that is not exactly relevant to the 5% claim.

It's very much possible that Twitter is lying, or maybe they count every DAU as monetisable, or maybe their spam engines are too lenient, but we need internal data to know that.

I, for one, do not suspect Twitter of doing anything shady.

So do not blindly believe the headlines, and develop a lot more skepticism.

//Update on Aug 24 2022//

Looks like Twitter’s ex-Head of Security became a whistleblower (Source) and revealed a lot of details about its security practices and also its spam accounting.

Keeping the security bits aside, it seems that even the whistleblower, who would typically be very antagonistic to Twitter, more or less confirmed that what Twitter was reporting all along was correct.

Spam in mDAU is ONLY the accounts that slip through their existing spam filters.

Twitter, Zatko’s disclosure claims, actually considers bots to be a part of a category of millions of “non-monetizable” users that it does not report. The 5% bots figure that Twitter shares publicly is essentially an estimate, based on human review, of the number of bots that slip through into the company’s automated count of monetizable daily active users, the disclosure states. So while Twitter’s 5% of mDAU bots figure may be useful in indicating to advertisers the number of fake accounts that might see but be unable to interact with their ads, the disclosure alleges that it does not reflect the full scope of fake and spam accounts on the platform.

Executives are incentivized to avoid counting spam bots as mDAU, because mDAU is reported to advertisers, and advertisers use it to calculate the effectiveness of ads. If mDAU includes spam bots that do not click through ads to buy products, then advertisers conclude the ads are less effective, and might shift their ad spending away from Twitter to other platforms with higher perceived effectiveness.
However there are many millions of active accounts that are not considered “mDAU,” either because they are spam bots, or because Twitter does not believe it can monetize them. These millions of non-mDAU accounts are part of the median user’s experience on the platform. And for this vast set of non-mDAU active accounts, Musk is correct: Twitter executives have little or no personal incentive to accurately “detect” or measure the prevalence of spam bots.

Twitter announced a new, proprietary, opaque metric they called “mDAU” or “Monetizable Daily Active Users,” defined as valid user accounts that might click through ads and actually buy a product. From Twitter’s perspective, “mDAU” was an improvement because it could internally define the mDAU formula, and thereby report numbers that would reassure shareholders and advertisers. Executives’ bonuses (which can exceed $10 million) are tied to growing mDAU.

Unless you’re a Twitter engineer responsible for calculating mDAU, you probably wouldn’t know what Agrawal is talking about. He is not saying that fewer than 5% of all accounts on the platform are spam. He’s saying, more or less, that Twitter starts with all the accounts on the platform, tries to automatically put all the human accounts that could be convinced by advertisers to buy products (but no spam accounts) into mDAU, and then uses humans to estimate the error rate of spam accounts that nevertheless slip through into mDAU. And naturally, Twitter “can’t share” its special sauce for determining mDAU.

Even though it’s written in a very antagonistic fashion, what Zatko is saying should be music to the ears of advertisers and Twitter’s BD teams.

It says that Twitter took great care in making sure ads were not shown to suspected fake users and voluntarily removed them from its monetizable pool. It further claims that exec comp was tied to increasing this specific metric rather than the “vanity metric” DAU.

This is a GOOD thing. Anyone who works in Ad tech or marketing would tell you that.

Sure, Twitter can do more to fight spam, and sure, spam makes the user experience worse, but there is currently no evidence that Twitter lied in its SEC filing.
