Automating Your Own Job Through Data Labeling
Think tech jobs are safe from automation? Think again.
We are Stella and Amy. We share firsthand stories and perspectives that are either lost in translation or simply inaccessible to you.
This time, Stella and Amy sat down with Sabrina, a senior data scientist from the UK. Sabrina once had a side gig as a data labeler on Remotasks, and she shared her experiences with us.
The Conversation
Sabrina: I started data labeling out of curiosity. I worked on a Remotasks project where we had to create SQL prompts and write corresponding code. It's like generating questions and answers for our day-to-day work. There were about 30-40 professional data scientists or engineers working on this task.
Stella: Interesting. Is this still considered data labeling? It sounds more like generating training data, doesn't it?
Sabrina: Exactly, it's generating training data. They call us all "labelers" but with different levels. I started as an "attempter" generating text and code pairs.
Amy: So you're basically creating the dataset that could train AI to do your job? That's... ironic.
Sabrina: Haha, yeah. Later I became a reviewer without even realizing it. I guess that's a promotion.
Stella: So as a labeler, you were creating data to train AI, and then as a reviewer, you were quality-checking the data?
Sabrina: I had to review and score other labelers' work. Sometimes I'd see entries that looked AI-generated, but I couldn't tell if they were from other labelers using ChatGPT or if they were actually mixed in by the platform.
Amy: Wait, so you're not only creating data to potentially replace yourself but also quality-checking it? Haha the irony...
Stella: This raises interesting questions about the data labeling industry. Sabrina, do you know if these tasks come directly from clients, or does the platform collect data to sell?
Sabrina: I'm not entirely sure, but I know Scale AI, the company behind the platform, works with OpenAI, Anthropic, and others. My guess is that they collect data based on client requirements, but I don't have inside information.
Amy: So high-skilled workers like data scientists are unknowingly contributing to their own potential obsolescence. It's like Amazon Mechanical Turk on steroids – you do the task, get paid, and sign away your rights without knowing the end use.
Stella: This really highlights the complex dynamics in AI development. On one hand, it provides opportunities for a new gig economy. On the other, it accelerates the development of AI that could replace those very same workers.
Sabrina: True. As a senior data scientist, I initially saw it as a way to learn more about AI models. But now I can't help but wonder if we're just hastening our own replacement.
Amy: Well, we're getting paid to dig our own graves, right?
Stella: This conversation really underscores the need for a broader discussion about the ethical implications and long-term consequences of AI development on the workforce, especially for high-skill jobs we once thought were irreplaceable.
The Cocoon
Oh, sure, you think you’re "leveraging your skills" to make a little extra cash on the side.
But really, you’re just fast-tracking your way to obsolescence by automating your own job with data labeling.
On Remotasks, a gig-based platform, data scientists like Sabrina can find a range of data labeling jobs, from basic image labeling tasks such as scene understanding and satellite imagery to more advanced work like generating SQL queries and creating training data for AI models. While the freelance nature of these tasks offers flexibility and extra income, Sabrina found herself confronting an unsettling reality. Many of the tasks she took on, like writing SQL queries or creating Jupyter Notebook tutorials, were similar to the work she had previously done as a data scientist. She soon realized that by training AI models, she was potentially automating her own job.
This concern deepened when she encountered AI-generated responses during the Remotasks interview process that were surprisingly polished, better than those of many human candidates she had interviewed in the past. These AI outputs made her realize that routine tasks in data science, such as unit testing and model interpretability, could soon be automated. As AI technology continues to advance, Sabrina feels that data science roles, particularly those focused on repetitive tasks, are increasingly at risk of being replaced by machines. This experience has raised important questions about the future of the profession.
Stella & Amy’s Comment
What an irony it is to be working on tasks designed to eliminate one’s own job. Yet, with the advancements in AI models, this seems to be the fate of many high-skilled workers.
A recent paper titled "The Effects of Generative AI on High Skilled Work: Evidence from Three Field Experiments with Software Developers" sheds light on how AI is impacting software engineering jobs. In the study, researchers evaluated the use of GitHub Copilot, an AI-based coding assistant, across three major companies: Microsoft, Accenture, and a Fortune 100 electronics manufacturer. The results were striking: developers who used the AI tool saw a 26% increase in the number of tasks they completed.
This finding is important because it mirrors what’s happening in other high-skilled roles, like data science, where automation is starting to take over tasks that once required human expertise. Software development and data science, jobs once considered prestigious and secure, are now seeing rapid changes as AI tools like Copilot and Cursor reshape the way work gets done.
We see the headlines and a wide range of discussions about high-skilled workers, especially software engineers, being the first ones to be replaced. But it feels so real when our guest Sabrina shares her firsthand experience working on data labeling to replace herself.
We wonder: what is the right thing to do? Should we jump on the bandwagon and become data labelers to build better AI that can do our jobs? It's going to happen anyway.
Interesting follow-up story: Sabrina is not a data labeler anymore. She now works at a Generative AI startup. We could choose to be AI doomers or AI accelerationists, but in reality, we just find the next job and move on.
Please subscribe to The Cocoons for more exclusive conversation snippets and stories.