A leak of around 364 million online records in a Chinese database, including private messages and ID numbers, has again highlighted the size and scope of Beijing's mass surveillance system.
The files, held in a database discovered by Victor Gevers, a security researcher at Dutch non-profit GDI Foundation, show a wealth of information linked to online accounts, including GPS locations, file transfers, and chat logs.
The data collection appears indiscriminate: some conversations are simply banter between teenagers, like one commenting on someone's weight and clothing size.
"They know exactly who, when, where and what," Gevers told AFP, explaining that thousands of records were piped daily to different databases for local law enforcement to review.
Government procurement documents and database records shared by Gevers show that the database is linked to an "Internet cafe management system" developed by HeadBond.com, a tech firm based in eastern Shandong province.
In 2017, the public security bureau in Yancheng city, eastern Jiangsu province, where at least one Internet cafe named in the database is based, contracted HeadBond for a system that monitors online activity at Internet cafes.
On its website, the company calls its Internet cafe management system "the best solution" for identifying online users for police.
HeadBond declined to comment, and the Yancheng city government and public security bureau did not respond to AFP's request for comment.
Internet cafe dragnet
Over the past decade, the Chinese government has cracked down on Internet cafes, especially underground venues that serve minors, over concerns about game addiction and crime.
Chinese law requires Internet cafes to record the identities and "relevant" online activity of users, and provide them to the public security bureau on request -- which has resulted in an entire market of Internet cafe monitoring systems like those offered by HeadBond.
"This also explains why data leaks that involve personal information are more prevalent in China," said Lokman Tsui, an expert on Internet policy at the Chinese University of Hong Kong.
"Beijing requires most network services to register their users with real names," he told AFP.
"This means that every single mobile phone operator, Internet cafe, social media website, and so on, are legally required to have databases filled with personal information, and all these databases are potentially vulnerable to attacks and leaks."
The capture of extensive user data, such as chat logs, also extends well beyond the stated purpose of catching minors surfing the Web or playing games.
A government procurement notice posted last month by Liaoyuan city in northeastern Jilin province, for instance, outlines specifications for another "Internet cafe management system" for local police, with explicit requirements for features that support querying and analysis of content on QQ, a popular messaging app in China.
"It's shocking the amount of personal data that is being collected on Chinese people," said Bob Diachenko, a security researcher who has reported on exposed databases in the US and Europe for the past few years, and is now looking at cases in China.
In particular, it is surprising to see the amount of additional data that is linked with a user's login data, Diachenko told AFP, such as their IP address, name, and even information about their family members.
"Sometimes it's just big data and it doesn't even make sense to collect that from a user perspective," he said.
Last month, Gevers found another publicly accessible database containing personal information, such as ethnicity and GPS tracking data, on 2.6 million people in Xinjiang. Access to the database has since been closed.
The restive northwestern region is home to most of China's Uighur ethnic minority, which has been under heavy police surveillance in recent years after violent inter-ethnic tensions.
"I would argue that good personal data protection is neither in the interest of the companies who gather the data for profit, nor the government who can (ab)use that data for power and surveillance," Tsui wrote in an email.
"It is the people in China and their basic human rights, in this case privacy, who end up drawing the short stick."