Invisible data janitors mop up top websites behind the scenes

Far behind the scenes at many of the world’s most popular websites, an invisible army is hard at work keeping content safe for the public. YouTube relies on them to screen all its videos — some 300 hours of uploaded content every single minute — for pornography, hate speech and other explicit, illegal or offensive material. Amazon’s product recommendations are only partly automated; many are checked or manually selected by workers. Facebook uses them to keep explicit content out of news feeds.

These content moderators range from full-time employees working out of Google’s headquarters in Mountain View, California, to subcontractors cloistered in Philippine office parks to freelancers scraping together a little extra money from their homes. Some of these digital freelancers find piecework through online labor markets such as Amazon’s Mechanical Turk, one of several crowdsourcing services that allow companies to tap into a global network of people who scrub their content, one assignment at a time.

They form the unseen backbone of much of the social Web but are rarely acknowledged and are often poorly paid. Lilly Irani, a professor at the University of California at San Diego, noted in a recent essay on the website Public Books that they have sometimes been described as data janitors who quietly scrub the social Web free of vulgar material.

Irani and others who study the hidden world of content moderators say their work has consequences far beyond the desire to shield our monitors from NSFW videos. They are asked to make implicit political decisions about what constitutes offensive or obscene content, and although they play a crucial role for companies that rely on user-generated content, their work is undervalued. The fact that they remain invisible means everyone else can indulge in the illusion that the Internet is a truly open, democratic forum for free expression and debate.

“The existence of workers who do jobs like content moderation and other kinds of work really turns the tables on the kind of myths that have surrounded the Internet for at least the past 20 years,” said Sarah Roberts, a professor of information and media studies at Western University in Ontario.

“There’s a myth that says the Internet is a mechanism for immediate participatory democracy,” she told Al Jazeera. “People can take part. People can voice their opinions. It’s you to the platform to the world. But in fact, it’s not that simple. There are all kinds of actors in between you and how you broadcast yourself to the world, and we don’t know who they are.”

Workers of the Web unite

There is very little reliable data on the size of the global workforce of content moderators, in part because of the many different labor arrangements — from full-time employment to Mechanical Turk gig work — under which they operate. But a 2014 Wired piece by Adrian Chen delved into this frequently unseen field, examining the lives and work of several content moderators in the Philippines and the United States who screen out violent and sexually explicit content. Most were deeply shaken by their experience passing judgment on an endless parade of disturbing images. “If someone was uploading animal abuse, a lot of the time it was the person who did it. He was proud of that,” a former YouTube moderator told Chen. “It just gives you a much darker view of humanity.”

Even when content moderators don’t experience mental distress, they must often fight to be recognized as employees. Many employers are “exploiting a gap in the law” by classifying them as independent contractors instead of employees, said Miriam Cherry, a professor at the St. Louis University School of Law. Independent contractors are excluded from many of the labor protections that cover employees, such as minimum wage requirements.

“These laws come from the time of the Great Depression,” said Cherry, when large corporations did not rely as heavily on freelance or contract work.

The employee-employer relationship is coming under scrutiny in other parts of the tech industry. Drivers for the transportation services Uber and Lyft recently sued both companies, alleging that they were misclassified as independent contractors when the work they perform should rightfully make them full employees. Similar lawsuits are beginning to hit the content moderation business. CrowdFlower, a startup that crowdsources tasks in a fashion similar to Mechanical Turk, is negotiating a settlement with tens of thousands of workers who allege they were misclassified as contractors instead of employees.

Cherry predicted that this lawsuit would be just the first of many. If more people doing freelance crowdwork are found to have been misclassified, it could force companies to drastically raise wages. Some of the people doing this work have reported earning $2 per hour or less, far below the legal minimum wage.

Even if the courts don’t compel crowdwork platforms to raise compensation, there are alternative models that could change the balance of power in that sector of the Web. The same crowdsourcing technology that companies use to allocate digital labor could be used collectively by workers to assert more control over wages and benefits.

Trebor Scholz, a professor of culture and media at the New School in New York City, is trying to envision what these uses might look like. One potential alternative would be crowdsourcing platforms operated by unions and worker-owned cooperatives. In such a model, he said, the crowdsourcing platform would serve as a “virtual hiring hall.” The profits generated by content moderation and other digital labor could go toward pension funds and child care.

“What would it look like if you built a platform that is run by the Freelancers Union or is run by one of the worker cooperatives in New York City and the profits could go to them?” he said. “And they could actually do some socially responsible work with that.”

Like many other types of jobs, content moderation gigs may face a distant threat from advances in automation.

“There’s certainly some ability for computers to deal with some things,” said Roberts. “Text-based comments can be screened for keywords, and there are certain kinds of filters that can be deployed where a program can match against a video to determine what percent of that video has color values that fall into the range of human flesh. If something has a lot of flesh color, it’s probably porn.”
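To see just how blunt the color test Roberts describes really is, here is a minimal sketch, assuming the classic rule-of-thumb RGB skin heuristic from the computer-vision literature (Peer et al.) and an arbitrary 40 percent threshold; neither is drawn from any platform’s actual filter.

```python
import numpy as np

def skin_fraction(frame: np.ndarray) -> float:
    """Fraction of pixels in an RGB uint8 frame whose color falls in a
    hand-tuned 'human flesh' range (a classic rule-based skin test)."""
    r, g, b = (frame[..., i].astype(int) for i in range(3))
    spread = frame.max(axis=-1).astype(int) - frame.min(axis=-1).astype(int)
    skin = (
        (r > 95) & (g > 40) & (b > 20) & (spread > 15)
        & (np.abs(r - g) > 15) & (r > g) & (r > b)
    )
    return float(skin.mean())

def probably_porn(frames, threshold=0.4) -> bool:
    """Flag a video when the average skin fraction over sampled frames
    exceeds a threshold -- the 'lots of flesh color' test. The 0.4
    cutoff is an illustrative guess, not any real platform's setting."""
    return np.mean([skin_fraction(f) for f in frames]) > threshold

# Demo on a synthetic frame filled with a skin-like color.
frame = np.full((120, 160, 3), (210, 150, 120), dtype=np.uint8)
print(skin_fraction(frame))    # 1.0: every pixel matches the rule
print(probably_porn([frame]))  # True
```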

However, those tools are still extremely crude compared with the judgment of a human moderator.

“The field of computer image recognition, called computer vision, is in its infancy. And that’s just talking about static imagery,” said Roberts.

That means human content moderators are here to stay for the foreseeable future. While their continued relevance might be assured, it remains to be seen how their work will be valued — assuming it is valued at all.

An earlier version of this article mistakenly identified Lilly Irani as the one who coined the term "data janitors."

