Children of the Magenta Line
"Coding is solved" sounds bold until you remember pilots who flew a perfect aircraft into a mountain while following ...
29.06.2026, By Stephan Schwab
If you are building an AI product to sell, you need software developers. Not eventually. Not once you have traction. From day one.
Here is the part most founders quietly miss: you are probably not training your own model. Almost nobody is. You are calling an existing model through an API, maybe fine-tuning it, and wrapping it in something people will pay for. The model is the easy part: you can rent the best ones by the token.
The hard part, the part that is 95% of any real AI product, is everything around it: the API you wrap it in, the multi-tenancy, the auth, the tests, the deployment, the monitoring, the retrieval and caching, the cost controls so a clever prompt does not bankrupt you, and keeping it all alive when paying customers hit it at 2 a.m. That work is software development, not data science.
You cannot skip it, you cannot hand-wave it as "just calling the API," and a demo that wows investors is not the same thing as a product that survives real customers. The AI startups that build a real business treat developers as the people who turn a borrowed model into something customers will actually pay for and keep using. The ones that don't keep mistaking an impressive demo for a product and run out of runway before they ever ship one.
The cleanest short definition still belongs to Drew Conway’s Data Science Venn Diagram: data science sits where hacking skills, statistics and math, and substantive domain expertise overlap. Strip away the statistics and math part and you are not doing data science. You are doing software development with data.
A data scientist’s job is to turn messy reality into a defensible inference. Frame a question. Pull data. Clean it. Explore it. Fit a model. Argue with the results. Quantify uncertainty. Decide whether the answer is good enough to act on. The deliverable is usually a finding, a forecast, a recommendation, or a trained model artifact.
A software developer’s job is to build and keep alive a system that does something useful for real users. Decide what the system must do. Model the domain. Design the boundaries. Write the code. Cover it with tests. Wire it into CI/CD. Make it observable. Handle failure. Keep it secure. Keep it changeable. The deliverable is a product or platform that survives contact with customers, auditors, regulators, and Monday morning.
Both are technical. Both touch code. The shape of the work is very different.
In 2012, Davenport and Patil published “Data Scientist: The Sexiest Job of the 21st Century” in Harvard Business Review. A decade later, the same authors revisited the claim in “Is Data Scientist Still the Sexiest Job of the 21st Century?” and admitted the role had fragmented. Machine learning engineers, data engineers, MLOps specialists and analytics engineers had quietly absorbed huge parts of what the original headline had bundled into one mythical hire.
That fragmentation is the real signal.
The “data scientist” boom collided with the “AI” boom and produced a convenient executive story: hire data scientists, and AI will happen. It rarely does, because most of the work needed to turn a model into a product is not data science work at all. It is software development, infrastructure, security, integration, and operations.
That is not opinion. It is well documented.
The classic citation here is from Google: D. Sculley et al., “Hidden Technical Debt in Machine Learning Systems,” NeurIPS 2015. The paper’s central figure shows the ML code as a tiny black rectangle surrounded by configuration, data collection, feature extraction, data verification, machine resource management, analysis tools, process management, serving infrastructure, and monitoring. An earlier 2014 workshop paper led by Sculley had already called machine learning “the high-interest credit card of technical debt”; the 2015 paper warns that ML systems incur all the maintenance costs of traditional code plus a set of ML-specific ones.
Google’s own “MLOps: Continuous delivery and automation pipelines in machine learning” builds on the same point: a trained model is not a product. Putting a model into production requires versioned data, reproducible training, continuous integration, continuous delivery, monitoring for data drift, rollback paths, and clear ownership. Those are all software development disciplines.
Andrej Karpathy made the cultural version of the argument in “Software 2.0.” His point was not that programmers go away. His point was that the artifact being shaped changes: weights instead of hand-written rules. The surrounding stack (pipelines, evaluation, deployment, tooling) still has to be built and maintained by people who can build and maintain software.
If your AI investment thesis assumes that hiring a few PhDs replaces an actual development capability, the paper trail in this field has been arguing the opposite for over a decade.
This is not a tear-down of the role. Good data scientists are extraordinarily valuable.
A strong data scientist can spot that a metric is leaking from the future into the past. They can tell you that your A/B test is underpowered and that the win you are about to celebrate is noise. They can show that a customer segmentation is unstable across time and warn you off building a strategy on it. They can pick the right model for the question instead of the most fashionable one. They can quantify how confident you should be in a forecast, which is exactly the part executives most want to skip.
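To make the "underpowered A/B test" point concrete, here is a minimal sketch of a per-arm sample size estimate for comparing two conversion rates. It uses a common normal approximation for a two-sided two-proportion test; the function name `samples_per_arm`, the 10% and 12% conversion rates, and the 1,000 visitors collected so far are all hypothetical values chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def samples_per_arm(p_control: float, p_variant: float,
                    alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-arm sample size for a two-sided two-proportion z-test."""
    effect = p_variant - p_control
    if effect == 0:
        raise ValueError("the two rates must differ")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = z.inv_cdf(power)          # quantile for the desired power
    # Sum of the two binomial variances (unpooled approximation).
    variance = p_control * (1 - p_control) + p_variant * (1 - p_variant)
    return ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


needed = samples_per_arm(0.10, 0.12)  # hypothetical lift from 10% to 12%
collected = 1_000                     # hypothetical visitors per arm so far
print("underpowered" if collected < needed else "adequately powered")
```

With these illustrative numbers the test needs several thousand visitors per arm, so a "win" declared after 1,000 is most likely noise.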
They are also the people who notice when a “successful” AI demo only works on a curated slice of the data. That is a service.
What they are usually not optimised for is shipping. Building reliable services with versioned APIs, authentication, rate limiting, audit trails, deployment pipelines, on-call rotations and graceful degradation is not their core training. Many can grow into it. Many do not want to, and should not have to.
A software developer’s edge is making something work in production, again, tomorrow, under load, with auditors watching.
That includes the boring-looking parts: writing tests that prevent silent regressions, splitting a system into modules that can change independently, choosing where the data lives, deciding which calls are synchronous and which are not, integrating with third-party APIs that lie about their uptime, designing for observability so a 3 a.m. incident is debuggable, and shaping the codebase so the next change does not cost more than the last.
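For AI specifically, software developers are the ones who build everything around the model: the API it is wrapped in, the multi-tenancy, the auth, the tests, the deployment, the monitoring, the retrieval and caching, and the cost controls.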
Without that work, a model is a curiosity. With it, the model becomes a feature.
I said it in a recent interview and I will repeat it here. I am the one who builds the tooling, the pipelines, the integrations, and the production surface that lets a data scientist’s work reach customers. With forty-plus years in software, I understand the inner workings of LLMs and modern AI systems as software: tokens, context windows, embeddings, retrieval, evaluation, latency, cost, failure modes, deployment patterns. That is the part executives keep underestimating.
What I do not pretend to be is a mathematician. Deep measure theory, advanced statistical inference, or the latest twist in optimisation research is not my home turf. When that depth matters, you want a real data scientist or applied researcher. When the question is whether your AI system will actually run, scale, integrate, stay secure, and not bankrupt you on inference cost, you want a developer who has shipped systems for decades.
Two different blades. Same toolbox.
If you are a CEO, CTO, or investor evaluating an AI plan, a few practical checks save a lot of money.
Ask who turns the model into a product.
If the plan is heavy on data scientists and thin on developers, platform people, and SREs, the model will not reach customers in any reliable way. The first MVP might. The second release will hurt.
Ask where the engineering discipline lives.
Source control, code review, automated tests, CI/CD, observability, security review, incident response. If none of those words appear in the AI plan, you do not have an AI plan. You have an experiment with a marketing budget.
Ask how the team evaluates models in production.
Offline accuracy on a static dataset is not evaluation. You want continuous evaluation, monitoring for drift, and a clear answer to "how would we know this got worse this week?" That is software work, not a notebook.
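The "got worse this week" question can be turned into a small automated check over scored production samples. This is a minimal sketch under stated assumptions: the function name `quality_regressed`, the 0-to-1 scores, and the 0.05 tolerance are hypothetical, and a real setup would pull scores from an evaluation pipeline instead of hard-coded lists.

```python
from statistics import mean


def quality_regressed(baseline: list[float], current: list[float],
                      tolerance: float = 0.05) -> bool:
    """Return True when the current mean score falls more than `tolerance`
    below the baseline mean. Both lists hold per-sample scores in [0, 1]."""
    if not baseline or not current:
        raise ValueError("both score lists must be non-empty")
    return mean(baseline) - mean(current) > tolerance


# Hypothetical evaluation scores for two consecutive weeks.
last_week = [0.90, 0.88, 0.92, 0.91]
this_week = [0.80, 0.79, 0.85, 0.82]

if quality_regressed(last_week, this_week):
    print("alert: evaluation score dropped beyond tolerance")
```

A check like this belongs in a scheduled job with alerting, which is exactly the kind of plumbing that never shows up in a notebook.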
Ask about cost and failure modes.
LLM features can quietly burn through inference budgets, leak data, or fail in ways that look like success. Developers who understand the runtime are the ones who put guardrails around all of that.
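One simple form of such a guardrail is a per-tenant token budget that is checked before every model call. The sketch below is illustrative only: `TokenBudget`, `BudgetExceeded`, the daily limit, and the tenant name are hypothetical, and a production version would keep usage in a shared store and reset it on a schedule.

```python
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    """Raised when a tenant would exceed its token allowance."""


@dataclass
class TokenBudget:
    """Hypothetical in-memory guard; real usage would live in a shared store."""
    daily_limit: int
    used: dict[str, int] = field(default_factory=dict)

    def charge(self, tenant: str, tokens: int) -> int:
        """Record usage for a tenant and return the tokens remaining."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        spent = self.used.get(tenant, 0)
        if spent + tokens > self.daily_limit:
            # Refuse before calling the model, so the cost is never incurred.
            raise BudgetExceeded(f"{tenant} would exceed {self.daily_limit} tokens")
        self.used[tenant] = spent + tokens
        return self.daily_limit - self.used[tenant]


budget = TokenBudget(daily_limit=10_000)
remaining = budget.charge("acme", 4_000)  # estimated tokens for one request
print(f"acme has {remaining} tokens left today")
```

Charging the estimate before the call, not after, means a runaway prompt is rejected instead of billed.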
Do not fire the developers who built your product.
The fastest way to make an AI initiative fail is to treat your existing development capability as legacy headcount while pouring money into a parallel "AI team" that cannot ship. The teams that win combine deep domain code with new AI capability inside the same system, not next to it.
Data scientist and software developer are not synonyms. They are complementary.
You need data scientists to ask sharp questions and to keep the organisation honest about what the data actually supports. You need software developers to build and operate the systems that turn those answers into products customers use every day.
If your AI strategy has one without the other, it is not a strategy. It is a bet that the missing half will materialise on its own.
It won’t.