Abstract

This article investigates the rise of Amazon Mechanical Turk (MTurk), Amazon Web Services, Inc.’s crowdsourcing labor platform, in social science research since 2005. A new “digital sweatshop,” the platform hired online workers to do precarious, extremely low-wage tasks to support artificial intelligence (AI) and survey research, while effectively stripping workers of all protections except those they built for themselves. Bringing together labor history and the history of science through an investigation of MTurk, this article intervenes in the historiography bidirectionally. Interpreting research participation as work, it argues, first, that the history of knowledge production is a labor history. To understand the ensuing conflict between workers and researchers on the MTurk platform, one must understand its labor context. Their struggle lay at the intersection between social science's notion of ideal research subjects and the concerns, interests, and vulnerabilities of crowdsourced participants as a class of exploited and unprotected workers. This article asks, second, how the labor conditions of research subjects impacted the knowledge produced from them. As in other industries, dialectics of labor exploitation shaped (and spoiled) the knowledge products that digital piecework yielded. The “labor” being deskilled, in this case, was being human.

“Bot Panic”

Academic social science and the “gig economy” collided in 2018.1 Max Hui Bai, a psychology graduate student at the University of Minnesota, noticed abnormalities in his data. He asked online researcher forums, “Have [sic] anyone used Mturk in the last few weeks and notice any quality drop?” MTurk, short for “Mechanical Turk,” was a crowdsourcing labor platform developed by the online bookseller Amazon, which over the preceding decade had become newly popular among social scientists seeking online research participants. Through Facebook groups and word of mouth, Hui learned that over ninety studies worldwide had identified similar data abnormalities: duplicated GPS locations (e.g., the same string of numbers, “88639831,” appearing in multiple respondents’ coordinates) and nonsense answers (e.g., respondents indiscriminately replying “nice” or “good” to all open-ended questions). Burgeoning anxiety in social science about the reliability of data obtained from new, cheap, and convenient online platforms exploded into a “bot panic.” Researchers wondered whether the data corruption was new, or statistically significant, or a threat that jeopardized thousands of already published studies conducted using these methods.2

MTurk aspired to provide any individual or organization with online access to “a global, on-demand, 24 × 7 workforce.”3 Opening publicly in 2005, it was an early leader in what became a multi-billion-dollar online outsourcing industry—one the World Bank projected would earn $2 billion in 2013 and $15–25 billion in 2020.4 Competitors like CrowdFlower, Clickworker, and Toluna joined MTurk's niche of “crowdsourced,” “piecemeal,” “micro,” or “gig” labor.5 Academic researchers—particularly social scientists—swarmed MTurk for research, conducting up to fifty thousand studies there annually and publishing thousands.6 MTurk was the new “digital sweatshop,” hiring online workers to do precarious, extremely low-wage tasks while effectively stripping them of all protections except those they built for themselves.7

Bringing together labor history and the history of science through an investigation of MTurk, this project intervenes in the historiography bidirectionally. It argues, first, that the history of knowledge production is a labor history. An extensive discourse in the history of science has examined what happened when scientists grew anxious about the validity of their data, whether in physics, biology, medicine, climate science, or psychology.8 When a device or procedure produced signal, a meditation on the reliability of that signal ensued. Yet to understand the validity of MTurk-produced research data, one must understand its labor context. The methodological tumult triggered by Hui's “bot panic” lay at the intersection between social science's notion of ideal research subjects and the concerns, interests, and vulnerabilities of crowdsourced participants as a class of exploited and unprotected workers. Interpreting research participation as work, I trace labor extraction on MTurk as an inheritance of the long history of labor deskilling, demonstrating how it implicated social scientists as buyers of labor, paying (cheaply) to harvest crowdsourced data. Understanding research conducted on MTurk through its exploitative labor relations sheds light on researchers and scientists as agents in a capitalist order, and on MTurk as a platform providing access to a growing class of deskilled workers furnishing raw materials for behavioral data extraction.

This article asks, second, how the labor conditions of research subjects impacted the knowledge produced from them. Just as conditions of production impacted the quality and composition of products in the industrial age (when goods were material products), they likewise impact the quality and composition of products in the so-called digital age (when goods include knowledge products). Some may label the United States a postindustrial society, pointing to flagging labor union membership and manufacturing relocated offshore; yet at the cutting edge of modern-day labor, industrial logics of how employers exploited and how workers resisted have not only persisted but intensified. Dialectics of labor exploitation shaped (and spoiled) the knowledge products that digital piecework yielded. The “labor” being deskilled, in this case, was being human.

(Human) Raw Materials

Up to now, whenever scientific research has concerned humans, it has inevitably required human raw materials: cadavers for dissection, skeletons for anatomy teaching, or live human subjects for experiments. To produce knowledge, the human sciences have demanded a supply of humans.9 In medieval and early Renaissance Europe, for instance, anatomists and their students bought cadavers and skeletons, robbed graves, or bribed hangmen and gravediggers to acquire them for research, medical training, art, or display in natural history cabinets.10 Laws like the Anatomy Act of 1832 in England expanded anatomists’ pool, allowing them to procure cadavers not only from prosecuted and hanged individuals but also from those who died destitute in workhouses or were otherwise unclaimed after death.11 The traffic in dead human bodies for science proliferated to the United States’ burgeoning medical schools and, in some cases, became a cornerstone of eugenics.12 By the 1950s, long-standing practices of body-snatching and looting gave way to formal body donation programs for medical education and research.13

But human raw materials, dead and alive, were useful—even necessary—for science. A similarly ethically fraught history followed experimentation on living human subjects, from earlier experiments on adults and children to present-day, risky Phase I pharmaceutical trials on healthy, paid “volunteers.” Scientists drew widely on markets of human raw materials to fuel research, but their human subjects were unevenly distributed in race, class, and social station, concentrated among the most vulnerable.14

As social science professionalized in the first half of the twentieth century, practitioners faced similar shortages in human research participants. Throughout the 1940s and 1950s, social psychology pooled diverse populations, drawing subjects from radio listeners, voters, soldiers, civilians, residents of housing projects, and industrial factory workers, surveying them in their “natural habitats.” Starting in the late 1950s, however, these studies relocated to the laboratory, and undergraduate students, the researcher's “familiar, captive, and largely friendly data base,” became the primary research population. The transition was thorough. By 1980, over three-quarters of studies published in top journals used undergraduates as their sole subjects.15

This “undergraduate” solution created its own problems, as anxiety about the status of experimenters and their subjects plagued psychological laboratory research beginning in the late 1960s.16 Social scientists, for decades, critiqued their field's heavy reliance on undergraduates.17 Many argued that undergraduates were too narrow a pool for deriving generalized claims about human behavior—unrepresentative, as they were, of humans along numerous axes, including likelihood for social compliance, attitude changeability, behavioral consistency, and introspection. Further research argued that using participants only from Western, Educated, Industrialized, Rich, and Democratic (WEIRD) societies skewed several behavioral metrics, including visual perception, fairness, cooperation, spatial reasoning, and self-concept. But undergraduates proved a pool too convenient and cheap not to use. In the face of critiques, researchers rarely justified their use of undergraduates, and these studies persisted, resulting in the average American college student being over four thousand times more likely than someone outside the West to be a social science research participant.18

In the new millennium, academics flocked to MTurk, which appeared to offer an alternative, abundant human subjects pool. The academic database JSTOR proclaimed in 2018 that MTurk had gifted social science its next methodological revolution, providing a “sudden availability of a conscientious, diverse, non-student source of survey respondents,” which “has led to something of a golden age in survey research” and “may change the practice of social science” altogether.19 As one behavioral scientist explained, upon transitioning from graduate school to a job at Yahoo Research, he realized, “we had no undergraduates and no classrooms.” How were his experiments to proceed? In 2008, he began using MTurk workers, or “Turkers,” as research subjects, thereby continuing his scholarship even after moving from university to corporation.20 This new pool appeared promising, as Turkers were demographically more diverse than both standard online research pools and traditional American college students, and MTurk data appeared to be at least as reliable as data obtained through traditional methods.21 Cheap survey research was a boon to other sectors, too, like Pete Buttigieg's presidential campaign, which reportedly spent $20,000 on MTurk survey research, polling Turkers as approximations for American voters.22

But this solution soon posed new problems à la Hui's “bot panic.” Clamor about MTurk data integrity sounded. How bad was the problem? This outcry spawned a cottage industry of MTurk metastudies—research about doing research on MTurk. Dubious data already existed on professional online survey platforms like Survey Sampling International, but MTurk research magnified the problem. Some argued that the “bot epidemic” was overblown, as only approximately 20 percent of Turkers provided false data (e.g., by circumventing location tagging requirements, or by providing multiple submissions using blacklisted, duplicated, or missing IP addresses). An additional 5 to 7 percent of Turkers engaged in “trolling” or “satisficing”—the former connoting various common, abusive online behaviors, and the latter a portmanteau of “satisfy” and “suffice,” referring to work that provided the minimum response required to pass the lowest bar of acceptability. “Satisficing” was not unlike industrial America's “soldiering,” employers’ word for workers’ strategies of artificially restricting output or doing the least amount of work that avoided punishment. The combined impact of low-quality responses from “trolling” and “satisficing” attenuated researchers’ treatment effects by a sizable but not alarming 10 percent.23 Yet other researchers argued that MTurk's corruption of a quarter (25–27 percent) of research data was significant and frustrating and warranted its discontinuation.24 To understand researchers’ conflicts with their new digital workforce, one must return to the labor context.

The Hoax: Artificial Artificial Intelligence

Amazon originally designed MTurk for internal use. Hosting millions of webpages, the company needed to remove duplicates. Combing through webpages was a task that algorithms performed imperfectly but that—Amazon discovered—humans performed easily when paid a few cents per page. Enjoying its creation, Amazon began allowing other companies to use its site in 2005, while it collected a cut of profits.25 Calling itself “a crowdsourcing marketplace,” MTurk aimed to “harness the collective intelligence, skills, and insights” of crowdsourced workers for artificial intelligence (AI) training, market research, or anything its users could envision and pay for.26 It called its workers “Turkers,” its jobs “Human Intelligence Tasks” (HITs) or “microtasks” or simply “tasks,” and its job-posting employers “requesters.” The pay was as low as 1 cent per task, not including a 20 percent site-use fee paid to Amazon, which had doubled from 10 percent in 2015. Researchers estimated that MTurk facilitated $100 million in tasks each year, garnering tens of millions in revenue for Amazon annually.27

MTurk drew its name from the famous eighteenth-century Mechanical Turk, a chess-playing automaton dressed as an Orientalized Turk, invented by Johann Wolfgang Ritter von Kempelen for the court of the Austrian empress Maria Theresa. The device toured the globe, defeating challengers like Benjamin Franklin, Frederick the Great, and Charles Babbage from 1770 to 1850. By design, it was a hoax. An illustration from a 1789 pamphlet (fig. 1) showed that sitting inside the alleged automaton was an unseen human chess master, manipulating its moves like a puppeteer from within. The eighteenth-century Mechanical Turk captured intrigue because it fed burgeoning anxieties about industrialization's runaway mechanical creations and long-standing Orientalist fascinations and fears about the liminal, disciplined, productive, and docile body of Muslim subjects.28

Like its historical precursor, Amazon's MTurk was designed as a human-labor-concealing trick for the modern automaton, AI. The construction of machine learning technologies required human labor because those systems needed to be trained on existing data sets to learn patterns. Data first had to be labeled and sorted by humans before it was fed to machines. Before AI could identify pictures of apples, a person had to do so first.29 MTurk sold labor in units called Human Intelligence Tasks (HITs), which juxtaposed crowdsourced laborers’ human intelligence with its artificial counterpart. Thus, MTurk supplied, at scale, the human crutch to adolescent AI. In the words of its patent application, MTurk integrated intelligence—both human and artificial—into a “hybrid machine/human computing arrangement.” Like a massive collective cyborg, composed of any willing worker on the internet, MTurk coalesced its computer and human elements into “one embodiment,” or so its architects hoped.30

Like the human chess master manipulating pawns from within the belly of the eighteenth-century Mechanical Turk, anonymous online laborers animated AI as the ghost in the machine, powering “ghost work” in a “ghost economy,” where 8 percent of Americans have labored.31 Adam Selipsky, then–vice president of Amazon's product management and developer relations, explained, “Usually people get help from computers to do tasks. In this case, . . . computers get help from people to do tasks,” “turning the traditional computing paradigm on its head.” According to CEO Jeff Bezos, MTurk specialized in tasks “easy for a human but extraordinarily hard for a computer.” A joker, Bezos called this class of AI's prerequisite human labor “artificial artificial intelligence.”32 The “artificial artificial” in this case was human.

Companies used MTurk in the idiom of its forebear—through “invisibility by design”—to conceal human labor within an AI artifice. Automated text transcription companies, for instance, found that it was cheaper to pay for four deskilled human intelligences to check their transcriptions than to construct an adequately sophisticated artificial intelligence. They profited from their concealment systems by paying Turkers as little as 19 cents, while charging clients 42 to 75 cents, per minute of audio.33 The more successfully start-ups hid “human computation” in their technologies, the higher their valuations as “software” companies.34

Most journalists and academics writing about MTurk interpreted its namesake, the eighteenth-century Mechanical Turk, as evidence of the platform's insidious labor-concealment. The ever-tricky Bezos hid the human element!35 Yet this compelling perspective omitted the most interesting irony in the platform's cheeky name. In electing the historical Mechanical Turk as namesake, MTurk did not conceal this sleight of hand; rather, it alerted audiences to the ruse, offering self-reflexive commentary on AI's disguising of its human labors. A hoax that announced itself as such, MTurk encapsulated the digital economy's need for a shadow labor force and its cavalier incorporation of duplicitous mechanisms into its technologies. Pointing out its own machinations through its name, MTurk operated like Poe's purloined letter—hidden effectively by being left in plain sight.

MTurk as Digital Piecework

At the cutting edge of the so-called digital age, industrial logics have not only persisted but intensified.36 MTurk resembled “piecework,” or “piecemeal work,” the historical category of labor compensated based on the quantity of “pieces” completed rather than time or skill.37 Breaking skilled labor into smaller, more easily repeatable increments had been a hallmark of efficiency practices like Taylorism, which transferred discretion, authority, and knowledge from workers to managers. MTurk continued this miniaturization of the units through which human labor, time, and energy were sold online.38 In digital piecework, however, the kind and rate of “deskilling” transformed. The unit of work shrank from “jobs,” “services,” or even “projects” to “microtasks.” The micro in MTurk's “microtasks,” like the micro in Frank and Lillian Gilbreth's “micro-motion study,” signaled this intensified task-decomposition.39 MTurk's creators fantasized that their invention might one day become “a computer system [that automatically] decomposes a task . . . into subtasks for human performance.” At this telos of deskilling, even the deskilling process itself would be automated.40

The labor being deskilled in this case was being human. Pushing deskilling to its extreme, logical conclusion, MTurk made the base qualification for many of its tasks simply possessing human cognition. The bulk of its tasks were designed for “any person” to complete with little conscious thought or training, such as recognizing patterns, interpreting images, catching innuendos, identifying cultural references, and deciphering context. Deskilled labor's race to the bottom resulted in this form of work for which all humans purportedly qualified by virtue of being human.

Through MTurk's conduit, fields outside AI discovered the profitability of common human cognition. Dozens of studies in medical research, for instance, demonstrated how Turkers, as laypersons, provided time-, cost-, and labor-saving alternatives to expert medical judgment. One study on the surgical procedure cricothyrotomy found that the averaged assessment of a group of thirty Turkers was comparable in accuracy to that of three surgical experts. Whereas trained surgeons took sixty days to evaluate the procedures, Turkers completed the same assessments in ten hours, earning 50 cents per survey.41 Other studies found comparable results, indicating that “surgery-naive,” “ophthalmologically naive,” and medicine-naive Turkers scored as well as doctors in assessing procedures like urinary bladder closures, robotic partial nephrectomy (RPN), digital retinal imaging, and drug indication annotations.42 Averaged crowdsourced intelligence offered an ironic refrain to the classic tagline “4 out of 5 doctors recommend this product” by posing a question: If the averaged judgment of 9 out of 10 laypersons was as good as that of 1 trained pathologist, was that “good enough” that we could dispense with the expert?43 MTurk's wisdom of the crowd gave medicine exciting opportunities to “scale” innovations and profit. Even hallowed medical expertise could be deskilled through digital piecework.

This contrast between human and artificial intelligences became ubiquitous; law, for instance, deployed these logics in inverse. When a crime-forecasting algorithm was “no better” than Turkers’ lay judgment at predicting recidivism, researchers argued that courts should not use it to make “life-altering decisions.”44 Turkers’ common human cognition became a ruler against which to measure expertise.

Employers reaped the benefits of microtask labor. It enabled them to experiment with new areas of business, contracting workers at the smallest unit of obligation to fulfill emergent needs, without taking on the liabilities and financial burdens of a full employer-employee covenant. Like social media companies, which evaded Federal Communications Commission regulation and content moderation duties by classifying themselves not as publishers but merely as “platforms,” MTurk abdicated responsibility for employer-employee relations, worker protection, pay, and fairness by claiming to be not an “employer” but merely a “marketplace” for labor transactions. Amazon insisted that requesters and workers were bound by their own voluntaristic arrangements; it merely held space for their initial meeting. Abstaining from Turker-Requester disputes over unpaid work, Amazon left workers no recourse for unfair, abusive, or arbitrary treatment.

Requesters’ power over Turkers—their authority to reject or accept work at will—was near total, tempered only by the latter's crowdsourced solutions to hold requesters accountable. Requesters could “reject” unsatisfactory work, and workers could “return” tasks they declined to perform, but workers were paid for neither “rejects” nor “returns.” Requesters admitted to arbitrarily rejecting a certain percentage of all work to cover Amazon's site-use fees, effectively cheating workers out of pay, since requesters could keep (and not pay for) work they rejected.45

This microtask employment strategy was a cousin to the widespread, dodgy Silicon Valley cost-saving strategy of “permatemping”—or permanent temping—in which companies permanently retained contract workers purportedly filling in for temporary roles, forming a persistent underclass of workers who were ineligible for employee benefits but served alongside full-time workers in identical or complementary functions.46 Like others precariously employed in the gig economy, such as Uber or TaskRabbit workers, Turkers did not qualify for protection under the Fair Labor Standards Act, leaving them out of provisions such as minimum wage, overtime, and workplace benefits, although the gig worker reclassification rule proposed by the Biden administration in October 2022 could alter this terrain. The legal status of the “crowd” remained unclear.47 Critics labeled digital piecework “the biggest paradigm shift in innovation since the Industrial Revolution,” responsible for “wiping away a century of labor struggles.”48 Being free agents in this free market meant that Turkers were free from labor rights and protections, too.

For all this, Turkers were paid basement wages. One user reported that short transcriptions earned $2.30, surveys $6.85, and Google search descriptions $15.15 per hour.49 But a 2017 study of 2,676 Turkers performing 3.8 million tasks showed that the median wage was $2 per hour. Only 4 percent of the study's participants earned more than $7.25 per hour, the federal minimum wage.50 MTurk's defenders argued that piecework—in its historic and digital forms—intended to differentially reward workers based on output, so experienced workers should earn several times more than novices for their efficiency.51 But even muckraking reportage demonstrated digital piecework's pervasiveness in all knowledge industries. The very media organizations that printed MTurk exposés, like the New York Times, ProPublica, and Pew Research Center, used MTurk for other projects. MTurk embroiled even its critics in its labor extraction.52

Companies excused MTurk's low wages, claiming that Americans “do this for fun” and that “poor people in the Third World” consider this “a good salary.”53 But defenders and critics agreed that “drone image tagging” could never amount to a “career.”54 Yet these common beliefs conflicted with the reality of Turker demographics. The majority of Turkers resided in the United States, with India a distant second. An increasing number of Turkers—25 percent in 2016—depended on MTurk for all or most of their income.55 MTurk echoed broader trends: 44 percent of US workers in 2011 considered themselves “free agents,” generating $141 billion in the US economy annually in pursuit of full-time or supplementary income (e.g., among retirees).56 Almost one in four American adults worked in the “platform economy” in 2015,57 and the same share resorted to platforms due to a lack of other opportunities in their region.58 For many in the United States, gig work's microtasks were the best professions the modern economy afforded.

Like its forerunner, online piecework worsened intersectional oppression. During industrial piecework's surge in the early twentieth century, employers used incentive-pay schemes to induce high output from immigrant women, already marginalized from American society and dependent on this supplemental income. Although Turkers were more diverse than undergraduates, digital pieceworkers resembled their historic predecessors in concentrating among marginalized populations. Those who relied on MTurk for all or most of their income tended to be Hispanic, younger, and less educated, and to live in lower-income housing.59

Resistance Online

Turkers were spatially distributed like those who engaged in industrial “homework” (industrial piecework done at home) in the late nineteenth and early twentieth centuries. Physically scattered, “homeworkers,” unlike industrial workers who congregated on shop floors, had few opportunities to organize collectively. Some feared that crowdworkers, composed of digitally distributed online strangers, suffered from barriers to collective action similar to those “homeworkers” faced.60 But crowdsourced labor extraction provoked crowdsourced labor resistance. Absent protections from governments or employers, Turkers collectively became their own guarantors-of-last-resort. Many of the behaviors that academics found frustrating in their MTurk research subjects arose from workers’ strategies for self-preservation in the online gig economy.

Piecework effectively passed the costs of training, professionalization, task acquisition, and breaks from employers to workers. To make subsistence wages, Turkers, like contractors, spent enormous amounts of uncompensated time prowling the site for their next piecemeal task. But unlike traditional contractors, Turkers’ contracts came in increments of a few cents each, making them appear more disposable to employers while also requiring them to secure many more units of “microcontracting” work to string together full-time employment. Turkers were paid only for their productivity, never for breaks between periods of productivity when they rested or looked for more tasks. By contrast, breaks, rest, reprieve, and downtime during work among hourly and salaried workers were historically paid for and even engineered by employers. At the height of Progressive Era zeal for reform, scientists argued that introducing rest periods was the most cost-effective way to improve worker efficiency.61 Within the gig economy, however, “downtime” became an externality shouldered by workers themselves.

As if in perverse reply to the last century's time studies, controlling one's own time became digital piecework's benefit and curse.62 Turkers could complete tasks on their self-appointed schedule, unrestricted by the traditional workday's parameters. Yet workers reported that the platforms that granted them the most time flexibility also made them feel most burdened, as they spent their uncompensated time creating for themselves the structures that their employers omitted, developing the informal tools, practices, and communities necessary to complete their jobs. Among crowdsourcing platforms, workers rated MTurk as both the ostensibly “freest” and the most time-constraining, as they needed to remain constantly “on call” to make a consistent income.63 This “invisible labor of finding tasks” bred constant, hounding anxiety, as workers felt they could never leave the computer lest they miss the chance to seize the latest marginally higher-paying microtask.64 Work advertised as giving them control over their time morphed into work that required them to give up all of their time, to be ready to work at a moment's notice.

Because piecework paid workers per unit worked, it incentivized them to work as much as possible, often beyond their physical limits. Historically, nondigital piecework increased rates of musculoskeletal disorders (MSDs) and physical suffering among pieceworkers.65 Amazon's warehouses have likewise received citations from the Occupational Safety and Health Administration (OSHA) at federal and state levels for creating ergonomically hazardous workplaces through notoriously intrusive productivity monitoring practices. Moving at breakneck speeds, workers refrained from taking bathroom breaks or shaking out their hands, increasing their likelihood of developing MSDs.66 Piecework's high physical demands, poor working conditions, low worker control over jobs, and little supervisor support, along with pieceworkers’ typically low socioeconomic status, yielded adverse health consequences.67

Like traditional pieceworkers, online pieceworkers reported elevated levels of repetitive strain injuries, arising from the pressure to chase extreme productivity to eke out additional income. Turker Kristy Milland set an alarm on her computer to notify her when requesters posted new lucrative tasks paying over 25 cents. After twelve years, she left MTurk with a ganglion cyst on her wrist the size of a marble. Without paid sick leave and unable to cover the costs of surgery and postoperative care, she treated her “Bible bumps,” as they were called, with the folk remedy of smashing them with a heavy book. That treatment, however, sent the pain radiating up her forearm and into her elbow. She left with mental scars, too, from sorting graphic images, like ISIS videos of “wicker basket[s] full of human heads,” echoing the psychological toll experienced across the online content moderation industry.68 When companies did not pay for downtime, workers set aside their need for breaks to pursue the next task.

MTurk's free-for-all ethos empowered not only corporations and academics but also Turkers to declare new terms for workplace engagement, devising their own ways to combat workplace abuse. Because anyone with an Amazon account could post MTurk tasks, privacy concerns abounded. Some requesters, for instance, asked Turkers to transfer strangers’ personal data (e.g., credit card information or disability status paperwork) from photos onto spreadsheets without explaining the photos’ origins or the task's purposes. Others appeared odd or “creepy,” asking Turkers to film themselves doing niche activities, record their video or audio, disclose their personally identifiable information, or provide scans of their health insurance card. Turkers suffered stalking, spamming, phishing, and scamming on MTurk with virtually no recourse. Reporting false personally identifiable information could lead to their work being rejected or to their being blacklisted on the site. If Turkers aborted a task for any reason, including privacy concerns, suspicious requests, unclear instructions, or technical malfunction, their “completion rate” decreased, adversely impacting their ability to secure future tasks. The allowable margins for incompletion were narrow, as many requesters, including academics, required Turkers to hold ratings above 95 percent to qualify for their tasks.69

With time, Turkers acquired tacit knowledge about screening suspicious requests. For example, most avoided the “foot fetish guy on MTurk.” As one worker explained, “I have like 40% of requesters filtered out, because I know better now. But when you start you just try everything. and that's what screws you over.” Turkers felt safer disclosing to seemingly trustworthy requesters like academics, but ultimately income trumped other needs. They reported overriding privacy concerns and discomfort to proceed with dubious tasks for money.70

Turkers also devised collective solutions through online forums and tools, such as Turker Nation (moderated by Kristy Milland with the Bible bumps), mturkgrind, Turkopticon, MTurk Crowd, and an MTurk subreddit. They shared homegrown strategies for outsmarting the platform, such as using multiple Amazon accounts, distraction-elimination systems, alerts for new tasks, and bots to auto-complete surveys.71 Commentators saw the browser extension Turkopticon, which allowed Turkers to rate requesters on communicativity, generosity, promptness, and fairness, as the most promising instance of Turker resistance. Built by socially engaged academics, this tool attempted to correct MTurk's information asymmetries, wherein requesters could see and filter Turkers by ratings, but Turkers could not rate or filter abusive or “shady” requesters. Its cocreators, Lilly Irani and Six Silberman, began Turkopticon in 2008 as a graduate school class project, naming it after the Panopticon, the all-seeing prison designed (but not built) by Jeremy Bentham and made famous by Michel Foucault, which compelled its inhabitants (prisoners) to self-discipline their behavior.72 This crowdsourced tool enabled Turkers to counter-surveil their protected employers, forcing on the platform a reciprocal transparency. Most professional Turkers used Turkopticon. In its first six years it hosted over 40,000 users, who posted 200,000 reviews of 34,000 requesters.73

Self-fashioning workplace protections added to Turkers’ invisible labor. MTurk was not merely an online pool of research subjects but a complex gig workplace rife with harassment, resistance, and myriad financial incentives for all parties.

Managing Human Subjects

Universities and institutional review boards (IRBs) struggled to keep pace with regulating this research frontier. The University of Washington's finance office warned of “countless opportunities for issues, fraud, [and] lack of accountability” on MTurk, noting that, although researchers relied heavily on it, should excessive risks emerge, the “use of [MTurk] may be prohibited for all of campus.”74 In 2013, the US Department of Health and Human Services acknowledged IRBs’ deficiencies in regulating MTurk research, writing that “current human subjects regulations, originally written over 30 years ago, do not address many issues raised by the unique characteristics of Internet research.” Although some Turker forums recommended that Turkers escalate abusive behavior to researchers’ IRBs, such reports have yielded little reform.75

Researchers, left on their own and with their data quality at stake, published “best practices” for their communities. Most recommended designing “attention checks” into experimental protocols to weed out inattentive, trolling respondents.76 For AI training, content moderation, and biomedical applications, attention checks were effective at catching mistakes, as each task had a correct answer. These quality-assurance mechanisms, while useful for researchers, exacerbated workers’ stress, as workers reported feeling extreme pressure to maintain utmost attentiveness in otherwise monotonous tasks where mistakes were easily discovered.77

Even then, such “checks” were insufficient. In contrast to “mechanical tasks,” which “ha[d] a correct answer” and were “known knowns,” behavioral data was murkier to verify. Whereas AI and biomedical researchers sought Turkers as stand-ins for average human intelligence, social scientists used Turkers to produce new behavioral knowledge through surveys, for which, they explained, there were no “objectively correct answer[s].” In survey research, “genuine responses” were “difficult to parse from insincere ones,” and data quality was “nearly impossible to observe.” Even when scholars used attention checks on MTurk, their results were ambiguous. Duplicated GPS locations could be mere coincidence, and nonsense answers were open to interpretation. As researchers knew too well, selecting the same answer repeatedly in a survey “is not conclusive evidence of satisficing.” Responding “nice” or “good” to open-ended questions could indicate anything from auto-fill bots to respondents’ carelessness or poor English skills.78
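To make that ambiguity concrete, the sketch below (a minimal illustration in Python, with hypothetical field names not drawn from any cited study) implements the two heuristic screens described above: duplicated geolocation strings and boilerplate open-ended answers. As the discussion stresses, tripping either flag is suggestive rather than conclusive evidence of bot activity or satisficing.

```python
# Illustrative sketch only: hypothetical data fields ("id", "geo", "open_ended"),
# not the protocol of any study cited in this article.
from collections import Counter

BOILERPLATE = {"nice", "good", "great"}  # low-effort answers of the kind flagged in the "bot panic"

def flag_suspicious(responses):
    """Return IDs of responses that trip either heuristic screen."""
    geo_counts = Counter(r["geo"] for r in responses)
    flagged = []
    for r in responses:
        duplicate_geo = geo_counts[r["geo"]] > 1   # same coordinate string across respondents
        boilerplate = all(                          # every open-ended answer is a stock word
            a.strip().lower() in BOILERPLATE for a in r["open_ended"]
        )
        if duplicate_geo or boilerplate:
            flagged.append(r["id"])
    return flagged

# Example: two respondents share coordinates; one also answers "nice"/"good" to everything.
sample = [
    {"id": "A1", "geo": "88639831", "open_ended": ["nice", "good"]},
    {"id": "A2", "geo": "88639831", "open_ended": ["I disagreed because of the cost."]},
    {"id": "A3", "geo": "17204455", "open_ended": ["It depends on the context."]},
]
print(flag_suspicious(sample))  # ['A1', 'A2']
```

Note that A2 is flagged only for a shared coordinate string, which, as researchers conceded, could be mere coincidence; the screen cannot distinguish a bot farm from neighbors on the same connection or from a careless but human respondent.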

In co-opting MTurk for behavioral research, social scientists used this workforce for unintended purposes. Turkers were not inattentive but rather far too attentive, passing attention checks more frequently than traditional subject pools that built much of contemporary behavioral science (e.g., college students). By filtering out low-performing Turkers, academics were essentially selecting for the highest-performing workers (i.e., the most conscientious survey-takers). Whereas undergraduates may have been lax about surveys due to lack of incentives to perform otherwise, Turkers felt intense pressure to pass attention checks so that they could maintain high ratings, remain eligible for future work, and continue earning subsistence wages.79 In using a piecework labor platform to employ research participants, social scientists inadvertently crafted incentives that linked their research subjects’ financial self-interests to their survey responses. Researchers had feared that American undergraduates were aberrant in having too few material concerns. Their new reliance on Turkers skewed their data to the opposite extreme.80

The appeal of surveys further augmented MTurk's evolutionary pressures. Some feared that surveys were too attractive—“macrotasks” in a sea of microtasks—taking more time, involving less automation, and calling for more worker discretion.81 Surveys offered Turkers welcome reprieves from other tedious labor, and Turkers regarded research participation as more meaningful than other work. By bringing comparatively more interesting, and sometimes more lucrative, macrotasks to MTurk, researchers unwittingly lured the highest-performing “Super Turkers” to repeatedly seek them out, worsening their research predicament.82 In response, academics advised that there was an “art” to pricing tasks so as not to skew too high or low relative to the site's norms. Prices that were too low yielded little interest in a task, but prices that were too high attracted profiteers entering multiple fraudulent submissions for triple or quadruple pay.83 Pomona College's IRB waived its $15-per-hour minimum rate for research participation if scholars used MTurk, for fear of contributing to so-called participant coercion through inflated wages.84 Revising research regulations to suit digital methods, such guidelines, for the sake of producing good data, encouraged scientists to pay the platform's baseline rate lest their projects entice scammers. But MTurk's typical pay was so low that research best practices entrenched exploitative wages.

The situation became problematic for academics as “Super Turkers” morphed into a class of “professional” survey-takers. Behavioral science studies were not meant to be conducted on veteran test subjects so overexposed to surveys that they were no longer “naive” to experimental manipulations. The cognitive reflection test (CRT)—a hallmark of modern behavioral science that tested participants’ likelihood of giving answers based on incorrect “gut” instinct versus critical reflection—was especially vulnerable, as it assessed participants’ automatic versus deliberate thought processes, often through trickery or deception. Turkers admitted to no longer being naive to CRTs, as many had prememorized answers to the most common prompts.85

To glimpse what Turkers felt, one may reference one's own experiences as a consumer steeped in modern advertising. Where psychological manipulation tactics are rampant, nonnaïveté becomes a problem not only for laboratory research but also for marketing. Studies document that average consumers, like Turkers, have become numb to repeated behavioral “nudges,” like indications of scarcity (“Only two rooms left!”) and social proof (“16 other people viewed this room”).86

Moreover, although most ethics guidelines suggested that deception be reserved as a last resort, studies involving deception (whether attention checks or CRTs) were ubiquitous on MTurk, even in cases where nondeceptive alternatives were available. As Turkers developed canned responses to behavioral scientists’ recycled deception prompts, researchers raced to invent new ones, breeding a vicious cycle of mutual distrust.87

The rise of “opt-in” and nonprobability online research panels had provoked concerns about “professional” survey-takers before MTurk's rise. Scholars had wondered whether the same small pool of people repeatedly self-selecting to participate in surveys (for monetary incentives) was skewing data. One 2006 study, for instance, found that 1 percent of online survey panel participants completed 34 percent of questionnaires.88 MTurk replicated this problem. Of the 48 million registered global crowdsourcing workers89 and five hundred thousand registered Turkers,90 likely only 10 percent and 2 percent, respectively, were active. Saturated by surveys, the typical Turker participated in more studies in a week (twenty studies) than the average undergraduate did in a lifetime (fifteen studies).91

When a research participant had completed three hundred academic studies, as the median Turker had, scientists’ assessment of that person's “gut” response—as an approximation of averaged human behavior—lost nearly all meaning. Turkers’ nonnaïveté corrupted behavioral data: participants exposed to identical or similar experimental tasks a second time yielded effect sizes reduced by 25 percent. One Turker shared that whenever she felt “really mechanical” after too many surveys, she stopped to “regain [her] humanity,” telling herself, “I'll start again tomorrow when I . . . feel like a person again.”92 In this labor regime for mass-producing human behavioral knowledge, participants furnishing raw data were surveyed until they ceased to feel human. As such, Turkers’ public letter-writing campaign to CEO Bezos repeated, “I Am a Human Being, Not an Algorithm.”93

The rise of “Super Turkers” paralleled the rise of “serial participants” in Phase I pharmaceutical trials, the riskiest, first-in-human clinical tests, where subjects earned up to $5,175 per trial. Those participating serially could earn under $20,000 annually, making a career of clinical research participation. While some Phase I “serial participants” talked of unionizing, others crowdsourced tips for surviving the work with fewest negative consequences (e.g., avoiding lumbar punctures when possible).94 When research participation became a profession, repeated exposure compounded risks of being studied.

The Tragedy of the (Commons) Crowd

MTurk's nonnaïveté resembled the canonical “tragedy of the commons,” an economics concept wherein individuals with free access to a shared good, like the environment, depleted it for their self-interest, against the common interests of all users. MTurk, a resource of (previously untapped) online human labor, held in common by all potential online requesters, was exploited to such an extreme that it broke.95 Behavioral economist David Rand, an early adopter of MTurk for research (having published the first MTurk study in his field), mourned the site's increasing depletion, stating, “Because it got so popular, it's overexploited, and now it doesn't work for the things that I was originally wanting to use it for.” He lamented, “Man, couldn't we have just kept our mouth shut and kept it as this nice, clean thing for ourselves?”96

With this common resource in disrepair, trust plummeted between Turkers and requesters, as each side gamed the system in an arms race against the other. Turkers created workarounds to outsmart researchers’ checks and achieve their “personal best,” as one might in sports or video games.97 Researchers saw Turkers’ crowdsourced solutions as threats to data integrity. Turker forums became permanent, searchable databases for circumventing research design.98 (Online patient forums have similarly challenged double-blind biomedical research as participants crowdsourced information to discover who was in the placebo or treatment groups.)99 Through web-based info-sharing, research participants disrupted the secrecy and information asymmetry built into extant protocols for scientific experiments.

Turkers’ online communities, formed in resistance, sadly also suffered from the “tragedy of the commons.” Silberman, Turkopticon's cocreator, reported that workers, requesters, and others attempted to sabotage the resource by creating fake reviewer accounts to artificially sink or inflate requester scores, blackmail requesters, harass other reviewers, or simply “troll.” Despite the creators’ intention to design a “worker-centered” tool, Silberman received harsh appraisals from Turkers for his stewardship, noting that “Turkopticon itself has been subject to criticism at least as severe as that received by [MTurk] itself.” He admitted that this was warranted, as he and Irani “struggled to respond effectively” to abusive behavior occurring on Turkopticon.100 Turkers perceived Turkopticon as a contaminated commons, reporting that they distrusted its ratings because they were written by other Turkers competing for the site's best tasks. In their experience, good requesters were well rated, but the best requesters were rated either poorly or not at all. One worker explained, “Many Turkers won't give away their great ones for fear the work will get scooped up from them.”101

Assisting invisible labor, in this case, also required invisible labor. Silberman had served as the system's de facto lead programmer and database administrator since its creation, and he and Irani moderated the forums alongside volunteers.102 Such arrangements, he noted, could not continue “indefinitely,” as Turkopticon had no funds to pay for staff or operational costs.103 Academic free labor, therefore, also had its limits. In 2016, Silberman and professional Turker Rochelle LaPlante sought to forge a new commons called MTurk Crowd, “a worker-owned, democratic forum” allowing openness, transparency, and collaboration.104 Collective action was possible in a digitally distributed workforce. But as the labor pool was squeezed, fierce competition between fragmented actors constantly threatened to collapse common resources on themselves.

In the void of a collapsed commons, new, for-profit companies emerged to address crowdsourced research's shortcomings. Data consultancies courted scientists to assist in designing and implementing MTurk surveys. Competitors, often founded by behavioral scientists familiar with MTurk's flaws, offered expanded participant pools in Europe; guaranteed research-naive subjects; fraud, location, identity, and VPN detection; participant protection; mediation of researcher-worker disagreements; handling of workplace abuse reports; and a minimum wage of $6.50 per hour, or “fair living wage.” Such companies hoped their platforms could build “trust and virtuous cycles” that MTurk had not.105 The original, however, has proven difficult to unseat. Some challengers followed it in designing themselves as a hoax within a hoax, seeking to overtake MTurk by using MTurk. One contender, for instance, sought to surpass MTurk but drew 20 percent of its labor force from MTurk.106

Academics in Capitalism

Social scientists had themselves been economic subjects facing market pressures. Since at least the depressed job markets of 1979, they argued, the increasing difficulty of academic survival had required researchers to produce “low-difficulty, high-volume studies.” The push for shorter studies, more studies, and more participants found its solution in MTurk. In 2005, no studies in top behavioral science journals used MTurk. By 2015, one-third did, and half of all studies occurred online. MTurk studies were, on average, the shortest in length (taking the least amount of time), even compared to laboratory tests of undergraduates. Pressure for research volume was intense, with the average number of studies in each published article in top journals climbing from 1.27 in 1968, to 1.58 in 1978, 1.78 in 1988, 2.75 in 2005, 3.09 in 2010, and 3.80 in 2015. MTurk met contemporary researchers’ need for speed and volume cheaply. What behavioral science lost in the MTurk revolution, however, were the high-difficulty, low-volume studies needed to tackle human behavior's most challenging questions.107

MTurk's controversies highlighted that research had always been implicated in labor relations. From the golden age of laboratory studies of undergraduates and earlier, the human sciences had possessed an insatiable demand for cheap, plentiful, and naive populations to feed their research. MTurk merely made explicit formerly shrouded labor relations in knowledge production. Just as employers downplayed crowdsourced labor as workers’ supplemental income and recreation “to stop questions about . . . actual working conditions of actual people,” similar frames long obscured the underpaying of undergraduate research participants.108 Compensating undergraduates was an easy question to skirt in part because their student status seemed to exempt them from consideration as workers. But as more workers relied on survey participation for their entire income, academics became tightly ensnared in questionable labor ethics.

MTurk marked a key shift in the political economy of research participation as the primary arena for recruiting worker-participants migrated from universities to an explicitly profit-maximizing institution like Amazon, where digital piecework was part of the business model. Universities, even neoliberal ones—job market pressures and incentives notwithstanding—still ostensibly functioned differently from for-profit businesses. MTurk's muddying of distinctions between for-profit employment ethics and nonprofit research ethics shed light on worker-participants’ exploitation while also exposing the profit-naive guise of academic research. One scholar observed that her peers “rationalize payment decisions by suggesting that paying low wages to many people is the only way to establish the statistical power necessary to create publishable research.”109 As MTurk illustrated, within current incentive structures, willingness to exploit workers went hand in glove with productivity and success in research.

Plagued by shrinking research budgets and widespread underemployment, social scientists improvised their own strategies to achieve the statistical power needed to publish—or they perished.110 When not even deception worked to rein in MTurk survey respondents, scholars went rogue. In more than one-third of studies using MTurk, researchers selectively deleted survey responses during data analysis, sometimes arbitrarily (e.g., to eliminate “suspicious” outlier data). Many failed to disclose in their published articles that they had done so. Researchers mirrored the lawlessness of the MTurk labor terrain by taking their own vigilante liberties, allowing themselves enormous leeway in correcting data that contradicted their hypotheses. Many found it easy to discard data at will because of the low wages they paid respondents, asking, “Who would do studies for such low pay anyway?”111 Like capitalist employers, researchers paid MTurk's prevailing wage, extracted surplus labor-value, and, when laborers attested to terrible conditions, claimed that the labor was worthless all along.112 Like cost-cutting practices across other industries, these innovations revealed their final consequences in their shoddy products. The professionalization of research participation threatened knowledge production, so researchers responded. The result was social science that purported to capture truth about human behavior but was itself marred by its distorted processes of production.

Conclusion: Humans in the Human Sciences

This article has traced the contradictions underlying AI's tenuous promissory claims, illustrated through the machinations of a labor platform built for tasks that any mundane human could purportedly do more effectively than the best AI. A tongue-in-cheek 2007 New York Times review of MTurk concluded, “We probably have at least another 25 years before computers are more powerful than human brains.” The shortfalls of AI guaranteed people job security because until computers surpass human intelligence, “people will be able to sell their idle brains” on platforms like MTurk.113 But Turkers were sold, one microtask at a time, not so much as idle brains as idle humans. By being human, workers qualified for most MTurk tasks, no training necessary. Turkers provided not brain power but intuitive, contextual, and innate human processing capabilities—a human essence the site called “human intelligence”—that artificial intelligence could not yet adequately copy. MTurk's disparate tasks, whether in AI or behavioral research, shared one key similarity: they all sought a humanness—to capture, approximate, and ultimately surpass that human essence. As the almost twenty years since the site's public launch have revealed, human assistance was not a preparatory step for AI's rise, to be discarded once technology was sufficiently powerful; rather, it is an integral part of how AI has been designed, trained, and deployed.114 No matter how advanced technology became, cheap, fungible, and often hidden human labor had to usher it along its way. The platform has been described as anything from “a spooky, but elegant, business idea” to a novel creation that was “at once an opportunity, a sweatshop, and a game.”115 But MTurk's primary genius was in how its very name satirized the labor exchange occurring on the site, exposing the hoax within AI, all while profiting seamlessly from the collective power of the underemployed, exploited, and fragmented twenty-first-century crowd.

Human behavioral data, commentators argued, is the prime currency in this age of surveillance capitalism, the economic surplus from which most profits are and will be generated. Whereas some citizen science participants have been credited as coauthors of research, Turkers have not—because Turkers were not so much coagents in producing scientific knowledge as they were raw materials for its creation. Psychologist David O. Sears, when critiquing his field's sole reliance on undergraduate research participants in 1980, dreamed of a return to the norms of the 1940s and 1950s, when behavioral research was conducted on adults “in their natural habitats with materials drawn from ordinary life.” Troubling as MTurk's labor and research developments were, their next iteration may be bleaker, as researchers bypass even platforms like MTurk to harvest data, in situ and cost free, directly from homes, devices, and bodies of technologies’ users.116

In the final analysis, MTurk is a case study not only in human cognition but also in human nature itself. Driven by perennial shortages of social science research participants, academics’ methodological innovation of using MTurk for research led to a vicious cycle in which scholars and Turkers attempted to outmaneuver each other for financial gain, knowledge production, and self-protection. As workers devised bots and strategies for earning more money, researchers engineered new “checks” for verifying participants’ attention, honesty, and humanness (proving they were not bots). In an arms race of exploitation and resistance, Turkers worked feverishly for abysmal pay, and cheapskate professors were burned by their exploited subjects. Behavioral scientists may have exploited online labor to gain insights into human behavior, but the ensuing labor conflict between workers and academics painted just as much a portrait of humanity as survey research did. Scientists seeking to understand abstract or generic human behavior ran up against the unavoidable, human particularities of workers. This drama revealed an essence of human propensities: the willingness to exploit others, and the ingenuity to resist exploitation.117

In the wake of ChatGPT, the large language model (LLM) that OpenAI launched in late 2022, the dynamics highlighted in this article have only intensified. One new study tracked Turkers' copy-and-paste keystrokes and estimated that 33 to 46 percent of Turkers likely used ChatGPT, covertly, to complete their work. Were this trend to proliferate, it would be problematic for the AI industry, as AI training may increasingly, and inadvertently, be outsourced to AI itself. Adding another loop to the recursive pun, the authors titled the study “artificial artificial artificial intelligence,” remarking that “our results call for . . . new ways to ensure that human data remain human.”118

This project has illuminated the role of labor in knowledge creation—and the role of knowledge industries in labor and economic production. Its key tensions implicate not only experimental psychology but also research more broadly. During the “replication crisis” that hit psychology and biomedicine in the 2010s, scholars found themselves unable to reproduce core findings in their disciplines. The Open Science Collaboration in 2015 successfully replicated only 36 percent of results from one hundred studies, the pharmaceutical company Amgen replicated only six of fifty-three “landmark” basic cancer research studies, and psychology as a field had attempted to replicate only about 1 percent of its published research.119 By scientists' own self-reporting, the conditions of modern academia spurred misconduct: the publication pressure that medical scientists felt correlated with their likelihood of committing research fraud.120 As markets squeezed academics, costs trickled down not only to research subjects (workers) but also to consumers—those members of the public for whose supposed benefit new knowledge was produced. Knowledge production was not exempt from other industries' lessons. Economic pressures to produce quickly, coupled with limited initial understanding of cost-cutting's consequences, yielded flimsy output to the detriment of all who relied on those products. Questions about the durability and quality of knowledge products have come home to roost.

This article began as a term paper in a class taught by Rebecca Lemov. I'm grateful for the generous feedback I received on drafts from Rebecca Lemov, David Jones, Lizabeth Cohen, Elizabeth Lunbeck, Lisa McGirr, Aaron Bekemeyer, Leah Xue, the Harvard Bowdoin Prize Committee, and the Business History Conference (BHC) K. Austin Kerr Prize Committee; and on presentations of this research at the BHC and the European Society for the History of the Human Sciences. Support from the National Science Foundation Graduate Research Fellowship Program made this research possible. My deep gratitude also goes to Labor's reviewers and editorial team for the time and care they put into reviewing this piece. Their close attention and incisive feedback were instrumental in producing a clearer, stronger article.

Notes

1.

Sources for a history of science include those that shed light on processes surrounding scientific knowledge production. They may, for example, capture how and where scientists discuss political, social, and epistemological controversies in their fields. Venues include scientific publications, semiprofessional forums (blogs and newsletters), social media, traditional news media, etc. Also relevant are bodies governing science's ethics and legality, especially institutional review boards (IRBs). Where science intersects with commerce, patents, marketing, advertising, and news coverage about products may all be pertinent. Materials from each of these types appear in this study.

9.

The history of “human science,” i.e., scientific study of human phenomena, has a long genealogy that some trace through the ancient Greeks and European Enlightenment. Broad concern about the “human sciences,” plural, now animates new histories of medicine, psychology, economics, anthropology, etc., that seek to tell coherent narratives about disparate disciplines in social and medical science. R. Smith, Norton History of the Human Sciences; McCallum, Palgrave Handbook of the History of Human Sciences.

43.

This question, too, buttressed the expansion of citizen science: see National Human Genome Research Institute, “Overview of Citizen Science Methodologies.” 

112.

I thank an anonymous reviewer for this analysis.

References

Aghdasi, Nava, Bly, Randall, White, Lee, Hannaford, Blake, Moe, Kris, and Lendvay, Thomas. “Crowd-Sourced Assessment of Surgical Skills in Cricothyrotomy Procedure.” Journal of Surgical Research 196, no. 2 (2015): 302–6.
Ahler, Douglas, Roush, Carolyn, and Sood, Gaurav. “The Micro-task Market for Lemons: Data Quality on Amazon's Mechanical Turk.” January 22, 2020, 1–39. http://www.gsood.com/research/papers/turk.pdf?source=post_page.
Alkhatib, Ali, Bernstein, Michael, and Levi, Margaret. “Examining Crowd Work and Gig Work through the Historical Lens of Piecework.” In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, 4599–4616. CHI ’17. Denver, CO: Association for Computing Machinery, 2017.
Amazon. “Amazon Mechanical Turk.” Accessed March 29, 2020. https://www.mturk.com/.
Anderson, Craig, Allen, Johnie, Plante, Courtney, Quigley-McBride, Adele, Lovett, Alison, and Rokkum, Jeffrey. “The MTurkification of Social and Personality Psychology.” Personality and Social Psychology Bulletin 45, no. 6 (2019): 842–50.
Anderson, Monica, McClain, Colleen, Faverio, Michelle, and Gelles-Watnick, Risa. “The State of Gig Work in 2021.” Pew Research Center (blog), December 8, 2021. https://www.pewresearch.org/internet/2021/12/08/the-state-of-gig-work-in-2021/.
Ayteș, Ayhan. “The ‘Other’ in the Machine: Oriental Automata and the Mechanization of the Mind.” PhD diss., University of California, San Diego, 2012.
Bai, (Max) Hui. “Evidence That a Large Amount of Low Quality Responses on MTurk Can Be Detected with Repeated GPS Coordinates.” Maxhuibai (blog), August 8, 2018. http://www.maxhuibai.com/blog/evidence-that-responses-from-repeating-gps-are-random.
Bai, Max Hui. “PsychMAP: Have Anyone Used Mturk in the Last Few Weeks and Notice Any Quality Drop.” Facebook, August 7, 2018. https://www.facebook.com/groups/psychmap/posts/656859794690946/.
Benanav, Aaron. Automation and the Future of Work. New York: Verso, 2020.
Bradley, Bradley. “Bots and Data Quality on Crowdsourcing Platforms.” Prolific Blog (blog), August 10, 2018.
Brinkmann, Rory, Turner, Andrew, and Podolsky, Scott. “The Rise and Fall of the ‘Personal Equation’ in American and British Medicine, 1855–1952.” Perspectives in Biology and Medicine 62, no. 1 (2019): 41–71.
Brisson, Chantal, Vinet, Alain, Vézina, Michel, and Gingras, Suzanne. “Effect of Duration of Employment in Piecework on Severe Disability among Female Garment Workers.” Scandinavian Journal of Work, Environment and Health 15, no. 5 (1989): 329–34.
Buhrmester, Michael, Kwang, Tracy, and Gosling, Samuel. “Amazon's Mechanical Turk: A New Source of Inexpensive, Yet High-Quality, Data?” Perspectives on Psychological Science 6, no. 1 (2011): 3–5.
Chan, Duo. “Combining Statistical, Physical, and Historical Evidence to Improve Historical Sea-Surface Temperature Records.” Harvard Data Science Review 3, no. 1 (2021): 1–28.
Chan, Duo, Kent, Elizabeth C., Berry, David, and Huybers, Peter. “Correcting Datasets Leads to More Homogeneous Early-Twentieth-Century Sea Surface Warming.” Nature 571, no. 7765 (2019): 393–97.
Chan, Duo, Vecchi, Gabriel, Yang, Wenchang, and Huybers, Peter. “Improved Simulation of 19th- and 20th-Century North Atlantic Hurricane Frequency after Correcting Historical Sea Surface Temperatures.” Science Advances 7, no. 26 (2021): 1–8.
Chandler, Jesse, Mueller, Pam, and Paolacci, Gabriele. “Nonnaïveté among Amazon Mechanical Turk Workers: Consequences and Solutions for Behavioral Researchers.” Behavior Research Methods 46, no. 1 (2014): 112–30.
Chandler, Jesse, Paolacci, Gabriele, Peer, Eyal, Mueller, Pam, and Ratliff, Kate. “Using Nonnaive Participants Can Reduce Effect Sizes.” Psychological Science 26, no. 7 (2015): 1131–39.
Chen, Carolyn, White, Lee, Kowalewski, Timothy, Aggarwal, Rajesh, Lintott, Chris, Comstock, Bryan, Kuksenok, Katie, Aragon, Cecilia, Holst, Daniel, and Lendvay, Thomas. “Crowd-Sourced Assessment of Technical Skills: A Novel Method to Evaluate Surgical Performance.” Journal of Surgical Research 187, no. 1 (2014): 65–71.
Choi, Charles Q. “Court Software No Better Than Mechanical Turks at Predicting Repeat Crime.” IEEE Spectrum, January 17, 2018.
CloudResearch. “TurkPrime MTurk Toolkit by CloudResearch.” CloudResearch powered by TurkPrime. Accessed May 12, 2020. https://www.cloudresearch.com/products/turkprime-mturk-toolkit/.
Cowie, Jefferson. Capital Moves: RCA's Seventy-Year Quest for Cheap Labor. Ithaca, NY: Cornell University Press, 1999.
Crawford, Kate. Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence. New Haven, CT: Yale University Press, 2021.
Cushing, Ellen. “Amazon Mechanical Turk: The Digital Sweatshop.” Utne, February 2013.
Damer, Ekaterina. “Stop Using MTurk for Research.” Medium (blog), July 31, 2019. https://medium.com/@ekadamer/stop-using-mturk-for-research-4b9c7b4f6f56.
Daston, Lorraine, and Galison, Peter. Objectivity. New York: Zone Books, 2007.
Dennis, J. M. “Are Internet Panels Creating Professional Respondents?” Marketing Research 13, no. 2 (2001): 34–38.
Derksen, Maarten. Histories of Human Engineering: Tact and Technology. Cambridge: Cambridge University Press, 2017.
Dreyfuss, Emily. “A Bot Panic Hits Amazon's Mechanical Turk.” Wired, August 17, 2018.
Ellmer, Markus. “The Digital Division of Labor: Socially Constructed Design Patterns of Amazon Mechanical Turk and the Governing of Human Computation Labor.” Momentum Quarterly 4, no. 3 (2015): 174–86.
Fabian, Ann. The Skull Collectors: Race, Science, and America's Unburied Dead. Chicago: University of Chicago Press, 2010.
Felstiner, Alek. “Working the Crowd: Employment and Labor Law in the Crowdsourcing Industry.” Berkeley Journal of Employment and Labor Law 32 (2011): 143–203.
Fisher, Jill. Adverse Events: Race, Inequality, and the Testing of New Pharmaceuticals. New York: New York University Press, 2020.
Fleck, Ludwik. Genesis and Development of a Scientific Fact. Chicago: University of Chicago Press, 1979.
Fort, Karën, Adda, Gilles, and Cohen, Bretonnel. “Amazon Mechanical Turk: Gold Mine or Coal Mine?” Computational Linguistics 37, no. 2 (June 2011): 413–20.
Garson, Barbara. The Electronic Sweatshop: How Computers Are Transforming the Office of the Future into the Factory of the Past. New York: Penguin Books, 1989.
Gordin, Michael D. On the Fringe: Where Science Meets Pseudoscience. New York: Oxford University Press, 2021.
Gray, Mary, and Suri, Siddharth. Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass. Boston: Houghton Mifflin Harcourt, 2019.
Grim, Ryan, and Lacy, Akela. “Pete Buttigieg's Campaign Used Notoriously Low-Paying Gig-Work Platform for Polling.” The Intercept, January 16, 2020.
Guerrini, Anita. “The Whiteness of Bones: Sceletopoeia and the Human Body in Early Modern Europe.” Bulletin of the History of Medicine 96, no. 1 (2022): 34–70.
Hamilton, David P. “Publishing by—and for—the Numbers.” Science 250, no. 4986 (1990): 1331–32.
Hara, Kotaro, Adams, Abi, Milland, Kristy, Savage, Saiph, Callison-Burch, Chris, and Bigham, Jeffrey. “A Data-Driven Analysis of Workers’ Earnings on Amazon Mechanical Turk.” arXiv:1712.05796 [cs], December 28, 2017.
Harinarayan, Venky, Rajaraman, Anand, and Ranganathan, Anand. “US Patent 7197459—Hybrid Machine/Human Computing Arrangement.” Issued March 27, 2007.
Harnett, Sam. “‘Two-Tiered Caste System’: The World of White-Collar Contracting in Silicon Valley.” KQED, April 19, 2019.
Harris, Mark. “Amazon's Mechanical Turk Workers Protest: ‘I Am a Human Being, Not an Algorithm.’” The Guardian, December 3, 2014.
Hauser, David, and Schwarz, Norbert. “Attentive Turkers: MTurk Participants Perform Better on Online Attention Checks Than Do Subject Pool Participants.” Behavior Research Methods 48, no. 1 (2016): 400–407.
Henrich, Joseph, Heine, Steven, and Norenzayan, Ara. “The Weirdest People in the World?” Behavioral and Brain Sciences 33, nos. 2–3 (2010): 61–135.
Hillygus, Sunshine, Jackson, Natalie, and Young, McKenzie. “Professional Respondents in Non-probability Online Panels.” In Online Panel Research: A Data Quality Perspective, 219–37. New York: John Wiley & Sons, 2014.
Hitlin, Paul. “Research in the Crowdsourcing Age, a Case Study.” Pew Research Center, July 11, 2016.
Holst, Daniel, Kowalewski, Timothy, White, Lee, Brand, Timothy, Harper, Jonathan, Sorensen, Mathew, Truong, Mireille, et al. “Crowd-Sourced Assessment of Technical Skills: Differentiating Animate Surgical Skill through the Wisdom of Crowds.” Journal of Endourology 29, no. 10 (2015): 1183–88.
Hyman, Louis. Temp: How American Work, American Business, and the American Dream Became Temporary. New York: Penguin Random House, 2018.
Ipeirotis, Panos. “Mechanical Turk, 97 Cents per Hour, and Common Reporting Biases.” Behind Enemy Lines (blog), November 18, 2019.
Irani, Lilly. “Difference and Dependence among Digital Workers: The Case of Amazon Mechanical Turk.” South Atlantic Quarterly 114, no. 1 (2015): 225–34.
Kent, Robert. “Micro-motion Study: A New Development in the Art of Time Study.” Scientific American, January 25, 1913, 84.
Kessler, Sarah. “The Crazy Hacks One Woman Used to Make Money on Mechanical Turk.” Wired, June 12, 2018.
Khare, Ritu, Burger, John, Aberdeen, John, Tresner-Kirsch, David, Corrales, Theodore, Hirschman, Lynette, and Lu, Zhiyong. “Scaling Drug Indication Curation through Crowdsourcing.” Database bav016 (2015): 1–10.
Kittur, Aniket, Nickerson, Jeffrey, Bernstein, Michael, Gerber, Elizabeth, Shaw, Aaron, Zimmerman, John, Lease, Matt, and Horton, John. “The Future of Crowd Work.” In Proceedings of the 2013 Conference on Computer Supported Cooperative Work, 1301–18. CSCW ’13. San Antonio, TX: Association for Computing Machinery, 2013.
Kneese, Tamara, Rosenblat, Alex, and Boyd, Danah. “Understanding Fair Labor Practices in a Networked Age.” Open Society Foundations’ Future of Work Commissioned Research Papers, October 8, 2014.
Lacey, Rosie, Lewis, Martyn, and Sim, Julius. “Piecework, Musculoskeletal Pain and the Impact of Workplace Psychosocial Factors.” Occupational Medicine 57, no. 6 (2007): 430–37.
LaPlante, Rochelle, and Silberman, Six. “Building Trust in Crowd Worker Forums: Worker Ownership, Governance, and Work Outcomes.” 2016, 2. https://wtf.tw/text/laplante_trust_in_worker_forums.pdf.
Lederer, Susan. Subjected to Science: Human Experimentation in America before the Second World War. Baltimore: Johns Hopkins University Press, 1995.
Lederer, Susan, and Lawrence, Susan. “Rest in Pieces: Body Donation in Mid-Twentieth Century America.” Bulletin of the History of Medicine 96, no. 2 (2022): 151–81.
Lee, Frederic S. “Industrial Efficiency: The Bearings of Physiological Science Thereon: A Review of Recent Work.” Public Health Reports (1896–1970) 33, no. 2 (1918): 29–35.
Lehdonvirta, Vili. “Flexibility in the Gig Economy: Managing Time on Three Online Piecework Platforms.” New Technology, Work and Employment 33, no. 1 (2018): 13–29.
Lepore, Jill. “Happiness Minutes.” In The Mansion of Happiness: A History of Life and Death, 97–110. New York: Knopf, 2012.
MacLean, Diana Lynn, and Heer, Jeffrey. “Identifying Medical Terms in Patient-Authored Text: A Crowdsourcing-Based Approach.” Journal of the American Medical Informatics Association 20, no. 6 (2013): 1120–27.
Marder, Jenny, and Fritz, Mike. “The Internet's Hidden Science Factory.” PBS NewsHour, February 11, 2015.
Matthijsse, Suzette, Leeuw, Edith de, and Hox, Joop. “Internet Panels, Professional Respondents, and Data Quality.” Methodology 11, no. 3 (2015): 81–88.
McCallum, David. The Palgrave Handbook of the History of Human Sciences. Singapore: Springer Nature Singapore, 2022.
Mieszkowski, Katharine. “I Make $1.45 a Week and I Love It.” Salon, July 24, 2006.
Mitry, Danny, Zutis, Kris, Dhillon, Baljean, Peto, Tunde, Hayat, Shabina, Khaw, Kay-Tee, Morgan, James, Moncur, Wendy, Trucco, Emanuele, and Foster, Paul. “The Accuracy and Reliability of Crowdsource Annotations of Digital Retinal Images.” Translational Vision Science and Technology 5, no. 5 (2016): 6.
Montgomery, David. The Fall of the House of Labor: The Workplace, the State, and American Labor Activism, 1865–1925. New York: Cambridge University Press, 1987.
Montgomery, David. Workers’ Control in America: Studies in the History of Work, Technology, and Labor Struggles. New York: Cambridge University Press, 1979.
Morawski, Jill. “Epistemological Dizziness in the Psychology Laboratory: Lively Subjects, Anxious Experimenters, and Experimental Relations, 1950–1970.” Isis 106, no. 3 (2015): 567–97.
Mortensen, Karoline, and Hughes, Taylor. “Comparing Amazon's Mechanical Turk Platform to Conventional Data Collection Methods in the Health and Medical Research Literature.” Journal of General Internal Medicine 33, no. 4 (2018): 533–38.
MTurk Data. “Your Online Publishing Source.” Accessed March 29, 2020. http://www.mturkdata.com/index.html.
National Human Genome Research Institute. “Overview of Citizen Science Methodologies.” Trans-NIH Workshop to Explore the Ethical, Legal and Social Implications of Citizen Science, 2015. YouTube video, 38:56, posted January 23, 2015. https://www.youtube.com/watch?v=ZAvtCeBTDeU&list=PL1ay9ko4A8smo8OjEts6UTYw0M3_JtpxB&index=3.
National Human Genome Research Institute. “Session 4 Discussion Panel.” Trans-NIH Workshop to Explore the Ethical, Legal and Social Implications of Citizen Science, 2015. YouTube video, 1:07:56, posted January 23, 2015. https://www.youtube.com/watch?v=dxyZvq8_yjw&list=PL1ay9ko4A8smo8OjEts6UTYw0M3_JtpxB&index=22.
Newman, Andy. “I Found Work on an Amazon Website. I Made 97 Cents an Hour.” New York Times, November 15, 2019.
O'Hear, Steve. “Prolific Wants to Challenge Amazon's Mechanical Turk in the Online Research Space.” TechCrunch, December 4, 2019.
OSHA Federal. “Federal Safety Inspections at Three Amazon Warehouse Facilities Find Company Exposed Workers to Ergonomic, Struck-by Hazards.” January 18, 2023.
OSHA Federal. “US Department of Labor Finds Amazon Exposed Workers to Unsafe Conditions, Ergonomic Hazards at Three More Warehouses in Colorado, Idaho, New York.” February 1, 2023.
OSHA Washington State. “Citation #1503638 for Amazon Com Services LLC.” May 4, 2021.
Paolacci, Gabriele, Chandler, Jesse, and Ipeirotis, Panagiotis. “Running Experiments on Amazon Mechanical Turk.” Judgment and Decision Making 5, no. 5 (2010): 411–19.
Peer, Eyal, Brandimarte, Laura, Samat, Sonam, and Acquisti, Alessandro. “Beyond the Turk: Alternative Platforms for Crowdsourcing Behavioral Research.” Journal of Experimental Social Psychology 70 (2017): 153–63.
Pontin, Jason. “Artificial Intelligence, with Help from the Humans.” New York Times, March 25, 2007.
Powers, Mary, Boonjindasup, Aaron, Pinsky, Michael, Dorsey, Philip, Maddox, Michael, Su, Li-Ming, Gettman, Matthew, et al. “Crowdsourcing Assessment of Surgeon Dissection of Renal Artery and Vein during Robotic Partial Nephrectomy: A Novel Approach for Quantitative Assessment of Surgical Performance.” Journal of Endourology 30, no. 4 (2015): 447–52.
Prolific. “Online Participant Recruitment for Surveys and Market Research.” Accessed May 12, 2020. https://www.prolific.co/.
Rao, Srinivas, and Michel, Amanda. “ProPublica's Guide to Mechanical Turk.” ProPublica, October 15, 2010.
Richardson, Ruth. Death, Dissection, and the Destitute. Chicago: University of Chicago Press, 2000.
Ritchie, Stuart. Science Fictions: How Fraud, Bias, Negligence, and Hype Undermine the Search for Truth. New York: Metropolitan Books, 2020.
Roberts, Sarah. Behind the Screen: Content Moderation in the Shadows of Social Media. New Haven, CT: Yale University Press, 2019.
Samuel, Alexandra. “Amazon's Mechanical Turk Has Reinvented Research.” JSTOR Daily, May 15, 2018.
Sannon, Shruti, and Cosley, Dan. “Privacy, Power, and Invisible Labor on Amazon Mechanical Turk.” In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 1–12. CHI ’19. Glasgow, Scotland, UK: Association for Computing Machinery, 2019.
Sappol, Michael. A Traffic of Dead Bodies: Anatomy and Embodied Social Identity in Nineteenth-Century America. Princeton, NJ: Princeton University Press, 2018.
Schneider, Nathan. “Intellectual Piecework.” Chronicle of Higher Education, February 16, 2015.
Schwartz, Oscar. “Untold History of AI: How Amazon's Mechanical Turkers Got Squeezed Inside the Machine.” IEEE Spectrum, April 22, 2019.
Schwartz, Oscar. “Untold History of AI: When Charles Babbage Played Chess with the Original Mechanical Turk.” IEEE Spectrum, March 18, 2019.
Sears, David O. “College Sophomores in the Laboratory: Influences of a Narrow Data Base on Social Psychology's View of Human Nature.” Journal of Personality and Social Psychology 51, no. 3 (1986): 515–30.
Semuels, Alana. “The Internet Is Enabling a New Kind of Poorly Paid Hell.” The Atlantic, January 23, 2018.
Shapin, Steven, and Schaffer, Simon. Leviathan and the Air-Pump: Hobbes, Boyle, and the Experimental Life. Princeton, NJ: Princeton University Press, 2011.
Shaw, Simon. “Consumers Are Becoming Wise to Your Nudge.” Behavioral Scientist, June 12, 2019.
Sheehan, Kim Bartel. “Crowdsourcing Research: Data Collection with Amazon's Mechanical Turk.” Communication Monographs 85, no. 1 (2018): 140–56.
Sheng, Ellen. “Silicon Valley's Dirty Secret: Using a Shadow Workforce of Contract Employees to Drive Profits.” CNBC, October 22, 2018.
Silberman, Six. “Human-Centered Computing and the Future of Work: Lessons from Mechanical Turk and Turkopticon, 2008–2015.” PhD diss., University of California, Irvine, 2015.
Siraisi, Nancy G. Medieval and Early Renaissance Medicine: An Introduction to Knowledge and Practice. Chicago: University of Chicago Press, 1990.
Smaldino, Paul E., and McElreath, Richard. “The Natural Selection of Bad Science.” Royal Society Open Science 3, no. 9 (2016): 160384.
Smith, Aaron. “The Gig Economy: Work, Online Selling and Home Sharing.” Pew Research Center, November 17, 2016.
Smith, Roger. The Norton History of the Human Sciences. New York: Norton, 1997.
Stark, Laura. “The Hidden Racism of Vaccine Testing.” New Republic, June 29, 2020.
Stewart, Neil, Chandler, Jesse, and Paolacci, Gabriele. “Crowdsourcing Samples in Cognitive Science.” Trends in Cognitive Sciences 21, no. 10 (2017): 736–48.
Strickland, Eliza. “In the Coming Automated Economy, People Will Work for AI.” IEEE Spectrum, November 30, 2018.
Telford, Taylor. “Biden Wants to Let Gig Workers Be Employees. Here's Why It Matters.” Washington Post, October 17, 2022.
Tijdink, Joeri K., Verbeke, Reinout, and Smulders, Yvo M. “Publication Pressure and Scientific Misconduct in Medical Scientists.” Journal of Empirical Research on Human Research Ethics: An International Journal 9, no. 5 (2014): 64–71.
Timmermann, Michael. “MTurk Review: How to Make Money Online with Amazon Mechanical Turk.” Clark Howard, June 19, 2018.
University of Washington Procurement Services. “Amazon Mechanical Turk Policy.” Accessed May 12, 2020. https://finance.uw.edu/ps/amt/policy.
Van Noorden, Richard. “The Science That's Never Been Cited.” Nature 552, no. 7684 (2017): 162–64.
Veselovsky, Veniamin, Ribeiro, Manoel Horta, and West, Robert. “Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks.” arXiv, June 13, 2023.
Wakabayashi, Daisuke. “Google's Shadow Work Force: Temps Who Outnumber Full-Time Employees.” New York Times, May 28, 2019.
White, Lee, Kowalewski, Timothy, Dockter, Rodney, Comstock, Bryan, Hannaford, Blake, and Lendvay, Thomas. “Crowd-Sourced Assessment of Technical Skill: A Valid Method for Discriminating Basic Robotic Surgery Skills.” Journal of Endourology 29, no. 11 (2015): 1295–1301.
Williams, Rhiannon. “The People Paid to Train AI Are Outsourcing Their Work . . . to AI.” MIT Technology Review, June 22, 2023.
Woike, Jan. “Upon Repeated Reflection: Consequences of Frequent Exposure to the Cognitive Reflection Test for Mechanical Turk Participants.” Frontiers in Psychology 10 (2019): 1–24.
Zuboff, Shoshana. The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power. New York: PublicAffairs, 2019.