Stack Overflow Data Reveals the Hidden Productivity Tax of 'Almost Right' AI Code (venturebeat.com)
- News link: https://developers.slashdot.org/story/25/07/31/1314207/stack-overflow-data-reveals-the-hidden-productivity-tax-of-almost-right-ai-code
- Source link: https://venturebeat.com/ai/stack-overflow-data-reveals-the-hidden-productivity-tax-of-almost-right-ai-code/
Only 33% of developers trust AI accuracy today, down from 43% last year. The core problem isn't broken code that developers can easily spot and discard. Instead, two-thirds report wrestling with AI solutions that appear correct but contain subtle errors requiring significant debugging time. Nearly half say fixing AI-generated code takes longer than expected, undermining the productivity gains these tools promise to deliver.
Such a surprise (Score:5, Insightful)
Nobody saw that one coming...
Re: (Score:2)
Programmers have been saying it for years: it takes far more time to review and debug code than it does to write it in the first place.
Why this would be a surprise to anyone, I can't even imagine.
Re: (Score:2)
It's a variation on the old truism about project management:
80% of the project takes 80% of the time. The remaining 20% also takes 80% of the time.
Re: (Score:2)
Well, yeah. I've seen AI write genuinely decent code... without error conditions or bounds checking, but on the happy path it really does work. It still needs to learn how people with bad intentions get in, but, you know, the happy-path code is serviceable. It can save me a few hours, but I still need to program the sad paths. I don't know whether that nets out to more time or less, really; it kind of negates itself, since I still need to code review and fix shit.
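For illustration, a minimal sketch (a hypothetical example, not from TFA) of the gap between the happy path an AI writes and the sad paths you add afterwards:

    import json

    def load_port_happy(path):
        # Happy path only: assumes the file exists, parses as JSON,
        # and contains a sane "port" value.
        with open(path) as f:
            return json.load(f)["port"]

    def load_port_defensive(path, default=8080):
        # The same logic with the error conditions and bounds checking added.
        try:
            with open(path) as f:
                config = json.load(f)
        except (OSError, json.JSONDecodeError):
            return default
        if not isinstance(config, dict):
            return default
        port = config.get("port", default)
        if not isinstance(port, int) or not (0 < port < 65536):
            return default
        return port

The second version is three times longer, and that extra two-thirds is exactly the part I still write myself.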
Re: Such a surprise (Score:2)
Try telling it to add tests for specific conditions you want to check. The good ones will fix the code as necessary.
Re: Such a surprise (Score:2)
Code review is slow UNLESS you are pair programming, i.e. the reviewer knows exactly what you are trying to do and how you're trying to do it. This explains the intuition that enhanced autocomplete is a good thing, but we get nervous using AI for more.
Re: (Score:2)
The real question is what happened to those other 33%.
Personal anecdote... (Score:2)
I told ChatGPT I wanted it to implement a particular open source Java interface using a specific major version of a dependency. It mixed imports from the previous and the current major versions. Obviously, that's a problem when the major release is a major rewrite of the public API.
When I asked it specifically to "restrict to version X.Y.Z", it confirmed it was going to do that, then went right back to generating mixed-major-version code.
Wasn't a problem for me, though. Took 5 minutes to debug with IntelliJ's decompiler.
"undermining the productivity gains" (Score:2)
You mean I can't get something for nothing? Gee, what good is it then?
Claude Code is a Slot Machine (Score:4, Interesting)
[1]"Claude Code is a Slot Machine" [rgoldfinger.com]: "I'm guessing that part of why AI coding tools are so popular is the slot machine effect. Intermittent rewards, lots of waiting that fractures your attention, and inherent laziness keeping you trying with yet another prompt in hopes that you don't have to actually turn on your brain after so many hours of being told not to. The exhilarating power of creation. Just insert a few more cents, and you'll get another shot at making your dreams a reality."
[1] https://rgoldfinger.com/blog/2025-07-26-claude-code-is-a-slot-machine/
This has been my experience exactly (Score:5, Interesting)
I am not a developer but I do write Perl and PowerShell scripts when the need arises.
I usually learn as I go and enjoy the process of figuring out how best to turn my problem into a well-functioning automation.
Sometimes, though, I would just like to get the LLM to output something really easy, like getting a list of all users in my company and running a simple process on them. It's something I could figure out and write in maybe an hour of web searching and document reading.
What I get, though, is broken code with invalid cmdlet parameters, or unoptimized code that relies on client-side filtering instead of the cmdlets' built-in filtering.
I end up spending more time figuring out what is broken, and doing the research to do things properly anyway. So, in the end, I haven't saved any time.
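For what it's worth, here is a minimal Python sketch of the filtering problem; the directory endpoint and its filter syntax are made up for illustration:

    import requests

    BASE = "https://directory.example.com/api/users"  # hypothetical endpoint

    def engineers_client_side():
        # What the LLM tends to produce: fetch every user, filter locally.
        all_users = requests.get(BASE, timeout=30).json()
        return [u for u in all_users if u.get("department") == "Engineering"]

    def engineers_server_side():
        # The optimized version: push the filter to the server so only
        # matching records cross the wire.
        params = {"filter": "department eq 'Engineering'"}  # hypothetical syntax
        return requests.get(BASE, params=params, timeout=30).json()

Both return the same result; the difference is how much data gets transferred, which is exactly the kind of thing the first draft gets wrong.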
Maybe I am just not good at writing the prompts in the first place.
What's the issue? (Score:3, Insightful)
This is how humans operate. They can produce code, but there are subtle flaws which are revealed only through debugging.
AI is being trained on stuff produced by humans. Why would you expect it to be any different?
Re: What's the issue? (Score:2)
I have wondered if we should be building a "thoroughly debugged code" repository, where lots of programmers vet the code, and then see what happens with an LLM trained only on that.
As expected (Score:2)
There is a big difference between creating a cool, simple demo for YouTube and writing solid, bulletproof code.
I use AI tools to guide me through complex and confusing documentation, but I always check the docs to make sure.
I use AI tools to create sample code that I study, and then I write my own version once I understand the sample.
The fiction that non-programmers can effortlessly "vibe code" complex systems is dangerous.
Same as it ever was (Score:1)
Vetting and maintenance have always been the bottlenecks of software development, not code creation. RAD (Rapid Application Development) pushers keep selling clueless bosses on the creation part, and they have been around for more than five decades.
(RAD can be done right, and be reasonably flexible, but one has to accept certain conventions. They may be good conventions, but people are spoiled and want it their way.)
AI does scut work well. (Score:5, Insightful)
But it does everything else horribly.
Think of it as a partly trained intern. Tell it to do something it has done before or something really simple and it does a good job. Then you start to trust it and think it is smart, so you give it more and more.
When it fails, it does not come forward and ask for help. Instead it panics, lies, makes up crap, and covers up its failures.
Re: AI does scut work well. (Score:2)
I would call AI a knowledgeable but clumsy assistant that should check its code by running it.
Literally today, I got code for a parser from Copilot that did "if line begins with '%' ... else if line begins with '%%' ...". The second branch is unreachable.
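In Python terms, the bug looks like this (a hypothetical fragment reconstructing the shape of what Copilot produced):

    def classify_buggy(line: str) -> str:
        if line.startswith("%"):       # also matches lines starting with '%%'
            return "comment"
        elif line.startswith("%%"):    # dead code: this branch can never run
            return "directive"
        return "data"

    def classify_fixed(line: str) -> str:
        # Check the more specific prefix first.
        if line.startswith("%%"):
            return "directive"
        elif line.startswith("%"):
            return "comment"
        return "data"

classify_buggy("%% directive") returns "comment", which a single test would have caught.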
Ship It! (Score:3)
> Literally today, I got code for a parser from Copilot that did "if line begins with '%' ... else if line begins with '%%' ...".
It compiles. Ship it!
Re: (Score:2)
> It compiles. Ship it!
Hasn't that been the way Microsoft has worked since 1975?
Re: (Score:2)
Four trillion dollars, man, Four. Trillion. Dollars.
It does though (Score:5, Interesting)
Well, if you use something like Cursor or its competitors, you can perfectly well include comprehensive test writing, compilation, linting and loops until the code passes your every requirement.
It can still get stuck in some deeply worrying loops, go tumbling down a rabbit hole, or make a hundred additional classes and functions to solve a simple problem, but if your prompts and rules are good enough, you'll get your grunt work done exceedingly quickly.
TDD is king with AI tooling. Ensure you read and understand those tests.
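As a minimal sketch of that loop (hypothetical function and tests; the agent rewrites the implementation until pytest passes, but you author and read the tests):

    import re

    # slugify.py -- the implementation the agent is allowed to rewrite
    def slugify(title: str) -> str:
        # Lowercase, replace runs of non-alphanumerics with hyphens, trim.
        return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

    # test_slugify.py -- the gate you write and review yourself (run: pytest)
    def test_basic():
        assert slugify("Hello, World!") == "hello-world"

    def test_edges():
        assert slugify("  --Already--Slugged--  ") == "already-slugged"
        assert slugify("") == ""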
Also, I made mine swear like a drunken sailor (in the chat feedback, not the code it writes), which makes me chuckle from time to time, so there's that.
Re: (Score:2)
I think people just expect too much from AI. Ask it to build a full email client UI and, yes, it may fail. But I have asked Copilot for things like "I need a python script that uses selenium to scrape all the images from this page", then "only take the images with the 'cats' directory in the link", then "click on the next page link and download these images until there are no more pages".
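The end result was roughly this sketch (the URL and the 'Next' link text are stand-ins, not the actual page):

    import os
    import requests
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.common.exceptions import NoSuchElementException

    driver = webdriver.Chrome()
    driver.get("https://example.com/gallery")  # hypothetical starting page
    os.makedirs("cats", exist_ok=True)

    while True:
        # Download every image on the current page whose URL is under /cats/.
        for img in driver.find_elements(By.TAG_NAME, "img"):
            src = img.get_attribute("src")
            if src and "/cats/" in src:
                data = requests.get(src, timeout=30).content
                with open(os.path.join("cats", os.path.basename(src)), "wb") as f:
                    f.write(data)
        # Follow the 'next page' link until there isn't one.
        try:
            driver.find_element(By.LINK_TEXT, "Next").click()
        except NoSuchElementException:
            break

    driver.quit()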
I ask it to show all the code after each query, and I paste the code and try to run it in between, of course. It will give many hel
Re: (Score:1)
It depends entirely on what language you are working in, which is expected, given that its "intelligence" comes from scraping the internet. It seems to be very good at Python.
Re:AI does scut work well. (Score:5, Insightful)
A partly trained intern would learn and get better. An AI model keeps making the same mistakes, again and again and again.
Re: (Score:2)
The intern also likely burns a lot less resources.
Re: (Score:2)
Yup, a first-semester intern that needs absolutely explicit instructions. The only immediate benefit is that they type really, really fast.
Re: (Score:2)
I would not say that AI does scut work well. I have a case that is trivial to do (you could even hire a first grader to do it), but AI does it with 90% accuracy when 100% accuracy would be needed.
Instead, what AI does really well is work where accuracy does not matter. AI is a good solution when 90% accuracy is good enough for you, but if you don't want any mistakes in your data, you should not use AI to make it. A good example of such work is writing proof-of-concept code: something that you use to test your idea.
Re: (Score:2)
"Think of it as a partly trained intern." Think of it as a partly trained hamster. Fixed it for you