
For those who may be interested in learning more about the gut and how it affects your body and brain, this is a great, accessible, read

https://www.amazon.com/Gut-inside-story-bodys-under-rated/dp...

Also, while we're on the topic, if you ever find your self at the other end of the world in Tasmania, I highly recommend a visit to the MONA museum, which houses the Poo Machine.

https://www.abc.net.au/news/2017-11-07/mona-poo-machine-join...


This _all_ (waves hands around) sounds like a lot of work and expense for something that is meant to make programming easier and cheaper.

Writing _all_ (waves hands around various llm wrapper git repos) these frameworks and harnesses, built on top of ever changing models sure doesn't feel sensible.

I don't know what the best way of using these things is, but from my personal experience, the defaults get me a looong way. Letting these things churn away overnight, burning money in the process, with no human oversight seems like something we'll collectively look back at in a few years and laugh about, like using PHP!


> sounds like a lot of work and expense for something that is meant to make programming easier and cheaper.

Not if you are an AI gold rush shovel salesman.

From the article:

> I've run Claude Code workshops for over 100 engineers in the last six months


Yeah, my colleague recently said "hey, I've burnt through $200 in Claude in 3 days". And he was prompting, max 8 hrs/day. Imagine what would happen if AI was prompting.

I really like this analogy: AI is (or should be) like an exoskeleton; it should help people do things. If you step out of your car after putting it in drive and go to sleep, the next day it will be farther along, but the question is: is it still on the road?


[dead]


This comment reads very strongly like it was written by an LLM.


Your sibling even more so.


Agreed. The spec file is context. Writing acceptance criteria before you prompt provides the context the agent needs to not go off in the wrong direction. Human leverage just moved up and the plan/spec is the most important step.

Parallelism on top of bad context just gets you more wrong answers, faster.


Sorry, but doesn't the bottleneck then simply become having relevant things to do? Like, how big a qualified backlog do you have, such that your pipeline does not run dry?


[flagged]


https://github.com/safety-quotient-lab/psychology-agent <- I've been exploring ways to track decisions and making some interesting findings, at the homelab scale at least.

The cognitive architecture, so to speak, for the LLM can make a huge difference - triggers and skills go a long way when combined with shell scripts that dual-write.


Reminds me of when I was looking for Obsidian note management workflows and every single person who posted about theirs used it to take notes on... note taking workflows.


Bingo.


I would encourage my competitors to use AI agents on their codebase as much as possible. Make sure every new feature has it: lots of velocity! Run those suckers day and night. Don't review it, just make sure the feature is there! Then, when the music stops, the AI companies hit the economic realities and go insolvent, and they are left with no one who understands a sprawling tangled web of code that is 80% AI generated. Then we'll see who laughs last.


> they are left with no one who understands a sprawling tangled web of code that is 80% [random people that I can't ask because they don't work here anymore and they didn't care to leave docs or comments] generated, then we'll see who laughs last.

Yes, this matches my experience with codebases before AI was a thing.


Yes, but given a feature that should take, say, 100 lines of code, the average programmer will write on the order of 100 to 500 lines. If they're a heavy OOP user, maybe they'll write 10 classes that total 2000 lines. Regardless, worst case, it will be within ~2 orders of magnitude of a reasonable solution.

It's not that they're not trying to write the biggest clusterfuck possible and maximize suffering in the world, it's just that there's a human limit on how much garbage they can type out in their allocated time.

This is where AI revolutionizes things. You want 25,000 lines of React? On the backend? And a custom useEffect-backed database? Certainly!


> it's just that there's a human limit on how much garbage they can type out in their allocated time.

Another example where removing friction and constraints is a bad thing.


I think the friction has moved upstream: now it's working on the right thing and specifying what correct looks like. I don't think we are going back to a world where we will write code by hand again.


Unless what you want to do isn't well represented in the training set.


Yeah, in the past the limiting factor was the human suffering of the engineer who had to try and fit the sprawling nightmare fuel into their brain.

The machine doesn't suffer. Or if it does nobody cares. People eventually start having panic attacks, the machine can just be reset.

I suspect that the end result is just driving further into the wilderness before reality sets in and you have to call an adult.


Both can be true at the same time: some teams spend a fortune on AI, and the AI investments won't get the expected ROI (bubble collapse). What is sure is that a lot of capacity has been built, and that capacity won't disappear.

What I could see happening in your scenario is the company suffering from diminishing returns as every task becomes more expensive (new feature, debugging session, library update, refactoring, security audit, rollouts, infra cost). They could also end up with an incoherent, gigantic product that doesn't make sense to their customers.

Both pitfalls are avoidable, but they require focus and attention to detail. Things we still need humans for.


> What is sure is that a lot of capacity has been built and that capacity won't disappear.

They really are subsidizing what will be an incredibly healthy used server equipment market in a year or two. Can’t wait. My homelab is going to be due for an upgrade.


Qwen3 Coder Next and Qwen3.5-35B-A3B are already very good and can be run on today's higher-end home computers at good speed. Tomorrow's machines will not be slower, and models keep getting more efficient. A good software engineer will still be valuable in tomorrow's world, but not as a software assembler.


Even cutting-edge models are not very good. They are not even at a mediocre level. Don’t get me wrong, they are improving, and they are awesome, but they are nowhere near good yet. Vibe-coded projects have more bugs than features, their architecture and system design are terrible, and their tests are completely useless about half the time. If you want a good product you need to rewrite almost everything written by LLMs. Probably this won’t be the case in a few years, but right now even “very good” LLMs are not very good at all.


Not sure why you're being downvoted, this is very much my experience. When it matters (like, customer data is on the line) vibecoded projects are not just hilariously bad, but put you in legal danger.

We've so far found that Claude code is fine as a kind of better Coverity for uncovering memory leaks and similar. You have to check its work very carefully because about 1 time in 5 it just gets stuff wrong. It's great that it gets stuff right 4 times in 5 and produces natural code that fits into the style of the existing project, but it's nothing earth-shattering. We've had tools to detect memory leaks before.

We had someone attempt to translate one of our existing projects into Rust and the result was just wrong at a fundamental level. It did compile and pass its own tests, so if you had no idea about the problem space you might even have accepted its work.


With Claude Code now having a /plan mode, you can take your time and deliberate over architecture and design collaboratively, instead of just sending a fire-and-forget. Much less buggy, and it saves time if you keep an eye on the output as you go, guiding it and catching defects, imho.


For that, you need to be building something where you already know exactly how you want to code it, or what architecture is needed. In other words, you would win basically nothing, because typing was never the real bottleneck (no matter what VIM and Emacs people would tell you).

LLMs also make mistakes at a much lower level than those one-pagers from planning mode let you control. Which I use all the time, btw. And anyway, they throw the plan out the window immediately when their attempted solutions don't work during execution, for example when a generated test is failing.

Btw, changing the plan after its generation is painful. More often than not, when I decline it with comments, it generates a worse version, because it either misses things from the previous one that I never mentioned, or completely changes the architecture for the worse. In my experience, it's better to restart the whole thing with a more precise prompt.


Ah, this is true - for my purposes, I've been directing the design and deliberating on the constraints and specifications for a larger system in tandem with smaller planning sessions.

That has worked well so far, but yes, you are totally right, there are still quite a few pain points and it is still rather far from being fire-and-forget "build me a fancy landing page for a turnkey business" and getting enterprise quality code.

edit: I think it is most important that you collaborate with Claude Code on quality in a systematic way, but even that has limits, right now - 1M context changes things a little bit.


You know, with all the babysitting needed, I wonder if effort is not better spent in just, you know, writing code.

Can you actually quantify the time & effort 'saved' letting LLM generate code for you?


For me, personally, I'm building things that would have been impractical for me to do as cleanly within the same amount of time: prototypes in languages I don't have the muscle memory for, using algorithms I have a surface-level understanding of but would need time to deeply understand and implement by hand. At my pace, as a retired dev, the savings are probably quantified in years' worth of time and effort.

edit: also, would I take the time to implement LCARS by hand? No. But with an LLM, sure, took it about 3 minutes or less to implement a pretty decent LCARS interface for me.


> Tomorrow's machines will not be slower

The way it's going, the AI hyperscalers are buying such a big portion of the world's hardware, that it may very well happen that tomorrow's machines do get slower per dollar of purchase value.


Not my experience. Current Qwen Coder is noteworthy but still far from good. You can't compare it with current commercial offerings; they're just in different leagues.


> Don't review it, just make sure the feature is there!

Bad idea. Use another agent to do automatic review. (And a third agent writing tests.)

Don't forget the architecting and orchestrating agent too!


Multiple agents with different frontier models for best results. Claude code/codex shops don’t know what they’re missing if they never let Gemini roast their designs, code and formal models.


This.

Claude Code wrote a blog article for me documenting a Gemini interaction that I manually operated. I found it quite interesting: the difference in "personalities", and the stark difference in quality between Claude's output and Gemini's.

But still, best to have two sets of eyes.


I am not laughing about PHP. To this very day many of my best projects are built on PHP. And while I have spent the last 7 years in a full-stack JavaScript/TypeScript environment, it has never let me produce the same things I was actually able to do with PHP.

I actually feel that things I built 15 years ago in PHP were better than anything I am trying to achieve with modern things that get outdated every 6 months.


I feel like today an engineer with a modern framework and AI can produce in an afternoon a product that delivers real value, something that 25 years ago would have required a full hour by a high schooler with MS Access.


I was building awesome things with Access 20 years ago. I loved that thing. I wasn't even a software engineer; I was in EE, but I needed a way to track a process, and it definitely outperformed. And the best thing: it didn't cost us anything. Everybody already had Access, lol. I had 40 people using it in production, manufacturing cutting-edge stuff. It definitely beat spreadsheets because Access gave you a GUI for operators.


what in God's Name could you do in PHP that you can't do in a modern framework?


Nothing; but PHP, in experienced hands, will be waaay more productive for small-to-medium things. One issue is that experienced hands are increasingly hard to come by. Truly big, complicated things, built by large teams or numbers of teams, teams with a lot of average brains or AIs trained on average brains, will be better off in something like Typescript/React. And everyone wants to work on the big complicated stuff. So the "modern frameworks" will continue to dominate while smaller, more niche shops will wonder why they waste their time.


I worked at a startup, they built their API in PHP because it was easy and fast. Now they're successful, app doesn't scale, high latency etc. What does their php code do? 95% of it is calling a DB.

You're telling me today with LLM power multiplier it's THAT much faster to write in PHP compared to something that can actually have a future?


“PHP was so easy and fast that they’ve built such a successful startup they now have scaling problems” is, as far as I can tell, an endorsement of PHP and not a criticism of it.


I think the point here is that the scaling problem is hard because of PHP.


Scaling can be hard in PHP, while at the same time the GGP's comment about PHP being productive in experienced hands can be one of the reasons why PHP worked for them. Both of these can be true at the same time.

And for what it's worth, TypeScript scaling, although better than PHP's, is still somewhat of an issue. If you want massive scaling, Elixir (and, to an extent, Gleam) was developed to solve the scalability problem, especially with the Phoenix framework in Elixir-land.

So I guess jack_pp's comment about PHP can also be applied, to a degree, to TypeScript, so we should all use Elixir; and within the TS world the same question can be asked of sveltekit/solid vs next-js/react.

I am more on the svelte side of things, but I see people who love react, and the same goes for those who love PHP. So my opinion is, more or less, that everyone can run with their own language.

Golang is another language to be taken into consideration, especially with Htmx/datastar-go/alpine.


Scaling in PHP is easy. It has never actually been an issue in my entire career unless there was a badly designed database.


Yes, startup success has a direct correlation to the language chosen for your CRUD api…


> I worked at a startup, they built their API in PHP because it was easy and fast. Now they're successful

You can stop there! Sounds like PHP worked for them. Already doing better than 90% of startups.


If 95% of what the app does is calling a DB, then the bottleneck is in the DB, not in the PHP.

You can use persistent DB connections, and an app server such as FrankenPHP to persist state between requests, but that still wouldn't help if the DB is the bottleneck.


Sometimes it’s still the app:

   rows = select all accounts
   for each row in rows:
       update row

But that’s not necessarily a PHP problem. N+1 queries are everywhere.
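A minimal sketch of the difference, with sqlite3 standing in for the real DB (the `accounts` table and values are made up for illustration): the loop issues one round trip per row, while a single set-based UPDATE does the same work in one statement.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
db.executemany("INSERT INTO accounts (balance) VALUES (?)", [(100,), (200,), (300,)])

# N+1 style: one round trip per row
for (acct_id,) in db.execute("SELECT id FROM accounts").fetchall():
    db.execute("UPDATE accounts SET balance = balance + 1 WHERE id = ?", (acct_id,))

# set-based: one statement updates every row
db.execute("UPDATE accounts SET balance = balance + 1")

balances = [b for (b,) in db.execute("SELECT balance FROM accounts ORDER BY id")]
assert balances == [102, 202, 302]  # each row was incremented twice
```

With an in-memory DB the difference is invisible; over a network, the loop pays one latency round trip per row.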


Depending on what you are doing, the above is not necessarily bad; it is often much better than a single SQL statement that locks an entire table (potentially blocking the whole DB, if this is one of the key tables).


> I worked at a startup, they built their API in PHP because it was easy and fast. Now they're successful, app doesn't scale, high latency etc. What does their php code do? 95% of it is calling a DB.

So PHP worked perfectly, but the DB is slow? Your DB isn't going any faster by switching to something else, if that's what you think.

PHP is the future, where React has been heading for years.


> Your DB isn't going any faster by switching to something else, if that's what you think.

Only true if none of the DB accesses are about stuff that could live as state across requests in a server that wasn't php. Sure, for some of that the DB's caching will be just as good, but for others, not at all.


That is possible, but it sounds unlikely to me.

In most cases you could add a shared cache to fix the problem, e.g. put your shared state in Redis, or in a file that is synced across servers (if it's kept as state in a long-running process, it can't be something that needs frequent updates).
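A minimal cache-aside sketch of that idea, with a plain dict standing in for Redis and sqlite3 for the DB (table, column, and function names are all hypothetical):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users VALUES (1, 'ada')")

cache = {}  # stand-in for a shared cache such as Redis

def get_user_name(user_id):
    # cache-aside: serve from shared state when possible, else hit the DB and fill
    if user_id not in cache:
        row = db.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()
        cache[user_id] = row[0]
    return cache[user_id]

assert get_user_name(1) == "ada"  # first call hits the DB
assert get_user_name(1) == "ada"  # second call is served from the cache
```

The point of the pattern is that the second request never touches the DB, which is exactly the state-across-requests win being discussed.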


Not scaling and high latency sound like a skill issue, not a PHP issue.


What does this even mean? If you've got scaling problems, it's not because you've used PHP.


by future do you mean Future<T> or metaphorical future? :)


I see what you did there.


PHP did better than Python and Perl. Python is doomed. PHP already has a good JIT, good OO as of late, good frameworks, and stable extensions. It has a company behind it.

Unlike Python or Ruby, which break left and right all the time on updates: you have to use bunkers of venvs, without any security updates. A nightmare.

PHP can scale and has a future.


Python is doomed? That's new.

You use python docker images pinned to a stable version (3.11 etc), and between bigger versions, you test and handle any breaking changes.

I feel like this approach applies to pretty much every language?

Who on earth raw dogs on "language:latest" and just hopes for the best?

Granted I wouldn't be running Facebook's backend on something like this. But i feel that isn't a problem 95% of people need to deal with.


No, only to Python. And partially Ruby and OCaml. Not to TypeScript, Perl, or PHP.


What's so bad about Python's virtualenv? It's a good way to have an LKGC (Last Known Good Configuration). When a CVE happens, you spin up a new venv and do smoke test, canary test, blue/green deployment, and so on.

If the update breaks things, you go back to the old venv. If the update goes well, then you just delete the old venv.

Not implementing security updates the right way is just human laziness and has nothing to do with the language in use.



uv does not fix the need for venvs or Docker containers. Normal people update their libs in the hope of getting problems fixed.

Python people don't update their libs, because then everything will break left and right. So they keep their security problems running.


No matter how you look at it, the dependencies have to go somewhere. Node uses node_modules, most compiled languages require compiled libraries (or they're a huge blob), etc. Idk about PHP but I'm pretty sure 3rd party things for any given app also live somewhere. Different ways of managing dependencies. It's recommended that venvs are used in Python because you may accidentally nuke a system script by doing global installs, and otherwise there still needs to be some sort of 3p version handling when you have multiple projects going.

Once something works in Python (which uv now makes trivial; before, it could be a pain), updating 3rd-party packages rarely causes breakage. But yes, I think many who use it hardly update, because things usually continue to work for years and the attack surface is pretty narrow[0]. Heck, just a few days ago I checked out a project that I hadn't touched in years, which I wrote in Python 3.7; I updated it to 3.13 and it continued to just work. Compare to PHP, which has a far larger attack surface[1] and often has breaking changes. I've heard a couple of nightmare stories of a v7.x -> v8.x move being delayed because it required a serious codebase rewrite.

[0] https://www.cvedetails.com/product/18230/Python-Python.html?... [1] https://www.cvedetails.com/product/128/PHP-PHP.html?vendor_i...


I don't think it's true that experienced hands will be faster in PHP than in Python or JS or whatever. It's just about what you know, and experienced hands are experienced.


PHP is faster to develop in than Python or JS; then add a framework like Laravel and you are already done.

Python has the curse of spaces or tabs and JS has the curse of npm.


PHP has the curse of T_PAAMAYIM_NEKUDOTAYIM.


Wasn’t expecting to see Hebrew here today.


Eh, you know, when the conversation has devolved to the level of "Python is slower to develop in than PHP because of spaces or tabs", you have to bust out the Hebrew.


You can build those things in modern frameworks, it will just be more headache and will feel outdated in 6 months.


Where are my backbone apps? In the trash? My ember apps? Next to them. My create-react-apps? On top of those. My Next apps? Being trashed as we speak. My rails apps? Online and making money every year with minimal upgrade time. What the hell was I thinking.


I'm guessing you avoided the CoffeeScript era of Rails, which is a good thing.


6 years ago I was writing apps in typescript and react, if I was starting a new project today I'd write it in typescript and react.


People bicker about PHP and JavaScript (sorry, TypeScript) like they aren't both mule languages people pick up to get work done. They both matured really well through years of production use.

They are in the same group, similar pedigree. If you were programming purely for the art of it, you would have had time to discover much nicer languages than either, but that's not what most people are doing, so it doesn't really matter. They're different, but they're about as good as each other.


Not having to "build" anything. You edit code and it is already deployed on your dev instance.

Deploying to production is just scp -rv * production:/var/www/

Beautifully simple. No npm build crap.


You trade having to compile for actually having code that can scale.


Not sure what you’re talking about, I scaled to millions of users on a pair of boxes with PHP, and its page generation time absolutely crushed Rails/Django times. Apache with mod PHP auto scales wonderfully.


It scales just fine the same way everything else scales: put a load balancer in front of multiple instances of your app.


It can scale by the virtue of spending a lot less time processing the request


You don't know anything about the PHP ecosystem and it shows.


Making instant-loading, user-respecting sites.


Don’t confuse PHP the language with PHP the way of the 2006-vintage webmaster.


Those webmasters already built the web a lot of people are now nostalgic about.


Could you give examples of the modern frameworks that you have in mind?


The comparison target for PHP is IMHO a good Python web framework, e.g. Django being the most popular one. I still don't understand how JavaScript is ever considered viable; TypeScript makes it workable, I guess…


> sounds like a lot of work and expense for something that is meant to make programming easier and cheaper.

It's not more work; it's a convergence of roles. BA/PO/QA/SWE are merging.

AI has automated aspects of those roles that have made the traditional separation of concerns less desirable. A new hybrid role is emerging. The person writing these acceptance criteria can be the one guiding the AI to develop them.

So now we have dev-BAs or BA-devs or however you'd like to frame it. They're closer to the business than a dev might have been or closer to development than a BA might have been. The point is, smaller teams are able to play wider now.


Oh a modern comeback of the analyst-programmer?


> It's not more work

It literally is. You're spending weeks of effort babysitting harnesses and evaluating models while shipping nothing at all.


That hasn't been my experience, as a "ship or die" solopreneur. It takes work to set up these new processes and procedures, but it's like building a factory; you're able to produce more once they're in place.

And you're able to play wider, which is why the small team is king. Roles are converging both in technologies and in functions. That leads to more software that's tailored to niche use cases.


> you're able to produce more once they're in place

Cool story; unfortunately, the proof is not in the pudding, and none of this phantom 10x vibe-coded software actually works or can be downloaded and used by real people.

P.S. Compare to AI-generated music, which is actually a thing now and is everywhere on every streaming platform. If vibe coding were a real thing, by now we'd have 10 vibe-coded repos on GitHub for every real repo.


There's no need to be rude with comments like "cool story." I'm sharing my experience with you. I'm not an AI-hype influencer. I'm a SWE who runs a small SaaS business.

Where it sounds like we agree is that there's some obnoxious marketing hype around LLMs. And people who think they can vibe code without careful attention to detail are mistaken. I'm with you there.


These people play around with shit and try to sell you on their secret sauce. If it actually works, it will come to Claude Code, so you can consider it practical SOTA. Honestly, just dropping CC into a mid-sized codebase is already a pretty great experience for me. Not ideal, but I get real, tangible value out of it. Not 10x or any such nonsense, but enough that I don't think I want to be managing junior developers anymore; the ROI with LLMs is much faster and more significant, IMO.


I can't believe we're back to advocating for TDD. It was a failed paradigm the last few times we tried it. This time isn't any different, because the fundamental flaw has always been the same: tests aren't proofs; they don't have complete coverage.

Before anyone gets too confused: I love tests. They're great. They help a lot. But to believe they prove correctness is absolutely laughable. Even the most general tests are very narrow. I'm sure they help LLMs just as they help us, but they're not some cure-all. You have to think long and hard about problems and shouldn't let tests drive your development. They're guardrails that check bounds and reduce footguns.

Oh, who could have guessed, Dijkstra wrote about program completeness. (No, this isn't the foolishness of natural language programming, but it is about formalism ;)

https://www.cs.utexas.edu/~EWD/transcriptions/EWD02xx/EWD288...


Testing works because tests are (essentially) a second, crappy implementation of your software. Tests only pass if both implementations of your software behave the same way. Usually that will only happen if the test and the code are both correct. Imagine if your code (without tests) has a 5% defect rate. And the tests have a 5% defect rate (with 100% test coverage). Then ideally, you will have a 5%^2 defect rate after fixing all the bugs. Which is 0.25%.
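The arithmetic above as a one-liner (assuming the two 5% defect rates are independent, which is of course an idealization):

```python
code_defect_rate = 0.05   # chance a given behavior is wrong in the implementation
test_defect_rate = 0.05   # chance the corresponding test is also wrong

# a bug survives only when both are wrong at the same spot (idealized independence)
residual = code_defect_rate * test_defect_rate
assert abs(residual - 0.0025) < 1e-12  # 0.25%
```

In practice defects in code and tests are correlated (same author, same misunderstanding of the spec), so the real residual rate sits somewhere between 0.25% and 5%.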

The price you pay for tests is that they need to be written and maintained. Writing and maintaining code is much more expensive than people think.

Or at least it used to be. Writing code with claude code is essentially free. But the defect rate has gone up. This makes TDD a better value proposition than ever.

TDD is also great because claude can fix bugs autonomously when it has a clear failing test case. A few weeks ago I used claude code and experts to write a big 300+ conformance test suite for JMAP. (JMAP is a protocol for email). For fun, I asked claude to implement a simple JMAP-only mail server in rust. Then I ran the test suite against claude's output. Something like 100 of the tests failed. Then I asked claude to fix all the bugs found by the test suite. It took about 45 minutes, but now the conformance test suite fully passes. I didn't need to prompt claude at all during that time. This style of TDD is a very human-time efficient way to work with an LLM.


I think there is a difference between doing TDD and writing tests after the fact to avoid regressions. TDD can only work decently if you already know your specs very well, but not so much when you still need to figure them out and need to build something concrete to be able to do so.


Yes; I think this remains true with coding agents. If you need to do some exploration of the solution space, it makes sense to do that before writing tests. Once you have a clear, workable design, you can get the agent to make a battery of tests to make sure the final product works correctly.


This is great. The tests in this case are the spec. When you give the agent something concrete to fail against, it knows what done looks like.

The problem is if you skip that step and ask Claude to write the tests after.


  > Tests only pass if both implementations of your software behave the same way.
That's not true.

I even addressed this in my comment, as did Dijkstra.


What is untrue about this statement you quoted?


You can have software behave differently while passing the same tests.

Idk man, this is pretty easy to demonstrate. Start with a trivial example: test is that input (2,2) -> 4. Function 1 does multiplication, function 2 does exponentiation. Both functions pass the test.

Sure, simple example but illustrative examples should be simple. But add more complexity and I'll add more examples of functions where the outputs are the same for a given set of inputs. (There's a whole area of mathematics dedicated to this!) It's simple, but you also confidently claimed something that was trivial to disprove.
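The example is two lines of code (hypothetical function names, just to make the point concrete): a single passing test cannot distinguish the two implementations.

```python
def f_mul(a, b):
    return a * b    # multiplication

def f_pow(a, b):
    return a ** b   # exponentiation

# the single test case (2, 2) -> 4 passes for both implementations...
assert f_mul(2, 2) == 4
assert f_pow(2, 2) == 4

# ...yet they disagree almost everywhere else
assert f_mul(2, 3) == 6 and f_pow(2, 3) == 8
```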

Your claim is true if and only if your tests have complete coverage. So, your claim is only true if you've done formal verification of your code. Which was what I said in the beginning and is what Dijkstra claimed as well.


I mean, yeah, I thought that was obvious. If you want to be a pedant:

> Tests only pass if both implementations of your software behave the same way in the exact area being tested.

As I said in my comment above, tests are a crappy second implementation. The test in your example isn’t even defined outside the input range of (2,2). Tests are a stochastic tool. Tests can prove the presence of bugs, not their absence. Completeness isn’t something tests alone can provide. But in the choice between yolo coding and yolo coding plus tests, you’re obviously going to get fewer bugs with tests.


When you write tests with LLM-generated code you're not trying to prove correctness in a mathematically sound way.

I think of it more as "locking" the behavior to whatever it currently is.

Either you do the red-green-with-multiple-adversarial-sub-agents thing, or just do the feature, poke it manually, and if it looks good, have the LLM write tests that confirm it keeps doing what it's supposed to do.

The #1 reason TDD failed is because writing tests is BOORIIIING. It's a bunch of repetition with slight variations of input parameters, a ton of boilerplate or helper functions that cover 80% of the cases, but the last 20% is even harder because you need to get around said helpers. Eventually everyone starts copy-pasting crap and then you get more mistakes into the tests.

LLMs will write 20 test cases with zero complaints in two minutes. Of course they're not perfect, but human made bulk tests rarely are either.


  > you're not trying to prove correctness in a mathematically sound way.

  > "locking" the behavior to whatever it currently is.
These two sentences are incompatible

  > The #1 reason TDD failed is
Because the spec is an ever-evolving thing that cannot be determined a priori. And because it highly incentivized engineers to metric-hack.

  > It's a bunch of repetition with slight variations
If that's how you're writing tests then you're writing them wrong. You have the wrong level of abstraction. Abstraction is not a dirty word. It solves these problems. Maybe juniors don't understand that abstraction and fuck it up while learning but making abstraction a dirty word is throwing the baby out with the bath water.

  > Eventually everyone starts copy-pasting crap
Which is a horrendous way to write code.


Locking behavior with tests isn't the same as comprehensive and foolproof tests. They might not cover every edge case, but will fail if the happy path starts failing for some reason.

And yes, copy-pasting is a horrendous way to write code, but everyone does it.

When you're adding the 1600th CRUD endpoint of your career to an enterprise Java/C# application, can you with all honesty say you will type every single character with the same thought and consideration every time?

Or do you just make one, copy-paste that one and modify accordingly?

Or when you write 20 unit tests with slight alterations, do you masterfully craft every single character to perfection?

I have a limited amount of energy to use every day, I choose to use it in places that matter. The hard bits that LLMs and copy-pasting can't speed up.


Hmm, not so sure TDD is a failed paradigm. Maybe it isn't a panacea, but it seems like it's changed how software development is done.

Especially for backend software and also for tools, seems like automated tests can cover quite a lot of use cases a system encounters. Their coverage can become so good that they'll allow you to make major changes to the system, and as long as they pass the automated tests, you can feel relatively confident the system will work in prod (have seen this many times).

But maybe you're separating automated testing and TDD as two separate concepts?


Indeed, they are two separate concepts.

I write lots of automated tests, but almost always after the development is finished. The only exception is when reproducing a bug, where I first write the test that reproduces it, then I fix the code.

TDD is about developing tests first then writing the code to make the tests pass. I know several people who gave it an honest try but gave up a few months later. They do advocate everyone should try the approach, though, simply because it will make you write production code that's easier to test later on.
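The bug-first workflow in a nutshell, as a hypothetical sketch (an off-by-one paginator bug; the names are made up):

```python
def page_count(total_items: int, per_page: int) -> int:
    # Fixed version. The hypothetical original was `total_items // per_page`,
    # which reported 2 pages for 21 items at 10 per page.
    return -(-total_items // per_page)  # ceiling division

def test_partial_last_page_is_counted():
    # Written first, straight from the bug report; it failed on the old code.
    assert page_count(21, 10) == 3

def test_exact_multiple():
    assert page_count(20, 10) == 2
```

The failing test both pins down the bug and stops it from regressing later.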


... hmm, just looked it up. According to some sites on the web, TDD was created by Kent Beck as a part of Extreme Programming in the '90s, and automated testing is a big part of TDD. Having lived through that era and thinking back, I'd say that TDD did help to popularize automated testing. It made us realize that focusing a ton on writing tests had a lot of benefits (and yeah, most of us didn't do the test-first development part).

But this is kind of splitting hairs on what TDD is, not too important.


I think tests in general are good, just not TDD, as it forces you into what I think is a bad and narrow paradigm of thinking. I think it is better, e.g., that I build the thing, then get to 90%+ coverage once I am sure this is what I would also ship.


That's the result I've seen with anyone who tries TDD. Their code ends up being very rigid, making it difficult to add new features and fix bugs. It just ends up making them overconfident in their code's correctness, as if their code is bug-free. It just seems like an excuse to not think and to avoid doing the hard stuff.


  > But maybe you're separating automated testing and TDD as two separate concepts?
I hope it's clear that I am, given my comment and how much I stress that I write tests. The existence of tests does not make development TDD.

The first D in TDD stands for "driven". While my sibling comment explains the traditional paradigm it can also be seen in an iterative sense. Like just developing a new feature or even a bug. You start with developing a test, treating it like spec, and then write code to that spec. Look at many of your sibling comments and you'll see that they follow this framing. Think carefully about it and adversarially. Can you figure out its failure mode? Everything has a failure mode, so it's important to know.

Having tests doesn't mean they drive the development. So there's many ways to develop software that aren't TDD but have tests. The important part is to not treat tests as proofs or spec. They are a measurement like any other; a hint. They can't prove correctness (that your code does what you intend it to do). They can't prove that it is bug free. But they hint at those things. Those things won't happen unless we formalize the code and not only is that costly in time to formalize but often will result in unacceptable computational overhead.

I'll give an example of why TDD is so bad. I taught a class a year ago (upper-div uni students) and gave them skeleton code, a spec sheet, and some unit tests. I explicitly told them that the tests were similar to my private tests, which would be used to grade them, but that they should not rely on them for correctness, and I encouraged them to write their own.

The next few months, my office hours were filled with "but my code passes the tests", and me walking students through the tests and discussing their limitations along with the instructions. You'd be amazed at how often the same conversations happened with the same students over and over. A large portion of the class did this. Some just assumed the tests had complete coverage and never questioned them, while others read the tests and couldn't figure out their limits.

But you know the students who never struggled in this way? The students who first approached the problem through design, and who understood that even the spec sheet is a guide: it states requirements, not completeness. Since the homeworks built on one another, those students had the easiest time. Some struggled at first, but many of them got to the right levels of abstraction; I know I could throw new features at them and they could integrate them without much hassle. They knew the spec wasn't complete. Of course it wasn't; we told them from the get-go that their homeworks were increments toward building a much larger program. The only difference between that and real-world programming is that this isn't always explicitly told to you, and the end goal is less clear. Which only makes this design style more important.

The only thing that should drive software development is an unobtainable ideal (or literal correctness). A utopia. This removes metric hacking, as there is no metric to hack. It keeps you flexible, as you are unable to fool yourself into believing the code is bug-free or "correct". Your code is either "good enough" or not. There's no "it's perfect" or "it's correct"; there's only triage. So I'll ask you even here: can you find the failure mode? Why is that question so important to this way of thinking?


Hmm, saying tests are just a hint seems to underappreciate their significance. Yes, they have bugs of their own, but as you said, they are a measurement. Having them statistically reduces the chances of bugs reaching production. They don't remove them completely, of course, but they do greatly decrease the rate of bugs (and I have read the same thing: formal verification of the code is typically not worth the time and cost).

And just looked up TDD on wikipedia. Actually, the standard process is not to write all the tests first, then do the implementation. It's to do what a lot of devs already do, write some tests based on your requirements. Then, write the implementation for these tests. Then repeat, adding in more test for other paths through the system.

Didn't know this myself about TDD (I thought the focus was on writing all the tests first, then doing the implementation). Yeah, TDD is actually a very practical approach and something I pretty much do in my own development. Instead of using a driver program to run your working code, just write unit tests to run it, and keep building your unit tests for every new feature or execution path you're working on. You'll miss a lot of them early on, but you fill out the rest at the end.

Now that I know, in my opinion, TDD was pretty amazing and changed our industry.


TDD and similar test paradigms all have the same fundamental flaw: it's testing for the sake of testing. You need to know exactly what you want in order to start, which isn't compatible with a competitive iterative workflow, no matter how much TDD yells otherwise. TDD doesn't make sense in agile, fast-iteration workflows, only in heavily regulated/restricted products.


It certainly isn’t. It is more a way of discovery on how to implement something, with the benefit of being able to safely (and thus easily) change it later.

The 99 Bottles book by Sandi Metz [0] is a good short display of how it works and where it helps actually building maintainable software

[0] https://sandimetz.com/99bottles


> But to believe they prove correctness is absolutely laughable.

You don't need to believe this to practice TDD. In fact I challenge you to find one single mainstream TDD advocate who believes this.



> But to believe they prove correctness is absolutely laughable.

Sounds like a lack of tests for the correct things.


True, but I seriously doubt people are writing formal proofs for their code. I've only seen this in niche academic circles and high security/safety settings. I also am pretty certain it's not what you're suggesting, but hey, I could be wrong


"You better work, bitch" -- Britney Spears

Our society is obsessed with work. Work will never end. If things become easier we just do more of them. Whether putting all our efforts into recycling things created by those that came before is good for us will remain to be seen.


Our society is obsessed with <the appearance of> work


I saw a guy's post on LinkedIn who created an LLM agent to water his plants based on sensors on his plants.

He still has to water the plants on his own. It's just that it costs him quite a bit, when all of that could be managed with an alarm to remind him to water the plants.


It being a lot of work is why they didn't do it at all for weeks, and still, without self-reflection, wrote that they care about the quality of code they hadn't looked at or tested.


Looking back we see how foolish the anti-php memes were. Meanwhile PHP lives on and becomes better with each release.

Tooling around llms are a natural next step that will become your default one day.


It's always the uber conservative and over principled people who laugh about using PHP that have an opinion on everything while not knowing how to get shit done.

They're all just tools. You decide how to use them.


Sure, but we can agree there are essentially two parallel industries in web development:

Engineers at tech firms, and web shops writing WordPress plugins for single clients where Squarespace doesn't cut it.

Is AI another field of people, or is it killing one or both of those? TBD.


To be fair, my perception of the JavaScript ecosystem and its here-today-gone-tomorrow frameworks was ruined by jQuery plugin developers.


php still makes money though!


> like using PHP

lmao, chuckled


Nah man. I understand the frustration, but this is a glass is half empty view.

You have decades of expert knowledge, which you can use to drive the LLMs in an expert way. Thats where the value is. The industry or narrative might not have figured that out yet, but its inevitable.

Garbage in, garbage out still very much applies in this new world.

And just to add: the key metric of good software hasn't changed, and won't change. It's not even about writing the code, the language, the style, the clever tricks. What really matters is how well the code performs 1 month after it goes live, 6 months, 5 years. This game is a long game. And not just how well the computer runs the code, but how well humans can work with the code.

Use your experience to generate the value from the LLMs, because they aren't going to generate anything by themselves.


Glass-half-empty view? Their whole skill set, built up over decades, digitized; now they have to shift everything they do, and who knows if humans will even be in the loop, if they're not C-suite or brown-nosers. Their whole magic and skill is now capable of being done by a PM in 5 minutes with some tokens. How is that supposed to make skillful coders feel?

Massive job cuts, bad job market, AI tools everywhere, probable bubble, it seems naive to be optimistic at this juncture.


The world changes. Time marches on, and the very skills you spend your time developing will inevitably expire in their usefulness. Things that were once marvelous talents are now campfire stories or punchlines.

LLMs may be accelerating the process, but definitely not the cause.

If you want a career in technology, a durable one, you learn to adapt. Your primary skill is NOT the mastery of a given technology; it is the ability to master any given technology. This is a university that has no graduation!


Is it though? If it was that universal, we'd employ the best programmers as plumbers, since they have the best ability to master plumbing technology. There are limits, and I think the skill being to master programming technologies is a reasonable limit.

If you're a great programmer, can you stop using Angular and master React? Yes. Can you stop telling the computer what to do, and master formal proof assistants? Maybe. Can you stop using the computer except as a tool and go master agricultural technology? Probably not. (Which is not to say you can't be a good programmer at an agritech company.)


The “this wrecked my industry” sob story is especially rich when the vast majority of tech workers ability to demand premium salaries comes directly from creating software that makes existing jobs obsolete.

Let’s talk about the industries the computer killed: travel agents, musician, the entire film development industry, local newspapers built on classified ads, the encyclopedia industry, phone operators, projectionists, physical media industries, and a few dozen other random industries.

We aren’t special because we are coders. Creativity and engineering thoughtfulness will still exist even with LLMs, it will just take a different form.


Since I love programming, I feel pretty lucky I got to live and work in the only few decades in which it's economically viable to work as a computer programmer. At least "musician" had a longer run, but I guess we had it coming.


What exactly would people retrain into? The future these companies explicitly want is AI taking ALL the jobs, It's not like PMs are going to be any safer, or any other knowledge work. I see little evidence that AI is going to create new jobs other than a breathless assurance that it "always happens"


No, retraining has been tested and found to be unfeasible. Even if you throw money at it.


> Their whole skill set

This is the fundamental problem with how so many people think about LLMs. By the time you get to Principal, you've usually developed a range of skills where actual coding represents like 10% of what you need to do to get your job done.

People very often underestimate the sheer amount of "soft" skills required to perform well at Staff+ levels that would require true AGI to automate.


Yeah well. That's what we've been doing to other industries over and over.

I remember a cinema projectionist telling me exactly that while I was wiring up the software controlling digital projectors, replacing the 35mm ones.


If a principal doesn't have the skills to mentor juniors, plan and define architecture, review work and follow a good process, they really shouldn't be considered a principal. A domain expert? Perhaps. A domain expert should fear for their job but a principal should be well rounded, flexible, and more than capable of guiding AI tooling to a good outcome.


> Their whole magic and skill is now capable of being done by a PM in 5 minutes with some tokens.

[citation needed]

It has just merely moved from "almost, but not entirely useless" to "sometimes useful". The models themselves may perhaps be capable already, but they will need much better tooling than what's available today to get more useful than that, and since it's the AI enthusiasts who will happily let LLMs code for them that work on these tools, it will still take a while to get there :)


> It has just merely moved from "almost, but not entirely useless" to "sometimes useful"

[citation needed]

:P

This thing has changed the way I work. I barely touch my editor to actually edit anymore, because speaking into the chat field what changes I want it to make is more efficient

The tooling does need to get better, yes, but anecdotally, I do a fundamentally different job (more thinking, less typing, less sifting through docs, less wiring up) than 3 months ago

So much of my career was spent on especially rummaging in docs and googling and wiring things up. I believe that's the same for most of us


I'm optimistic about people being able to build the things they always wanted to build but either didn't have the skills or resources to hire somebody who did.

If we truly value human creativity, then things that decrease the rote mechanical aspects of the job are enablers, not impediments.


If we truly value human creativity we should stop building technology that decreases human value in the eyes of the rich and powerful


Or stop measuring ourselves by our reflection in their eyes.

Society can interpret sociopathy as damage and route around it, if we do the work to make it happen. It will not happen by itself without effort.


> What really matters is how well does the code performs 1 month after it goes live, 6 months, 5 years.

After 40 years in this industry—I started at 10 and hit 50 this year—I’ve developed a low tolerance for architectural decay.

Last night, I used Claude to spin up a website editor. My baseline for this project was a minimal JavaScript UI I’ve been running that clocks in at a lean 2.7KB (https://ponder.joeldare.com). It’s fast, it’s stable, and I understand every line. But for this session, I opted for Node and neglected to include my usual "zero-framework" constraint in the prompt.

The result is a functional, working piece of software that is also a total disaster. It’s a 48KB bundle with 5 direct dependencies—which exploded into 89 total dependencies. In a world where we prioritize "velocity" over maintenance, this is the status quo. For me, it’s unacceptable.

If a simple editor requires 89 third-party packages to exist, it won't survive the 5-year test. I'm going back to basics.

I'll try again but we NEED to expertly drive these tools, at least right now.


I don't understand. You specifically:

> neglected to include my usual "zero-framework" constraint in the prompt

And then your complaint is that it included a bunch of dependencies?

AI's do what you tell them. I don't understand how you conclude:

> If a simple editor requires 89 third-party packages to exist

It obviously doesn't. Why even bother complaining about an AI's default choices when it's so trivial to change them just by asking?


My main point is that we need to expertly drive these tools. I forgot the trivial instruction and ended up with something that more closely resembles modern software instead of what I personally value. AI still requires our expertise to guide it. I'm not sure if that will be the case in a year, but it is today.


You seem intelligent so it is probably confusing to many why you are posting this.

You call it a trivial instruction, but it is not trivial. It was a core requirement of your own design that you neglected to specify. This is no different from leaving out any other core requirement in an engineering specification.

Most people would NOT want this requirement, meaning most people wouldn't care whether there are package dependencies or not, so the agent 100% did the right thing.


I always tell Claude, choose your own stack but no node_modules.

What's missing is another LLM dialog between you and Claude. One that figures out your priorities, your non-functional requirements, and instructs Claude appropriately.

We'll get there.


This perhaps reflects the general divide in viewpoints on "vibe-coding". Do you let go of everything (including understanding) and let it rip, or require control and standards to some degree? Current coding agents seem to promote the former; the only recourse, with their approach, is to provide them with constraints.


> What's missing is another LLM dialog between you and Claude. One that figures out your priorities, your non-functional requirements, and instructs Claude appropriately.

There are already spec frameworks that do precisely this. I've been using BMAD for planning and speccing out something fairly elaborate, and it's been a blast.


Yes, I think this is reasonable.

I have been consistently skeptical of LLM coding but the latest batch of models seems to have crossed some threshold. Just like everyone, I've been reading lots of news about LLMs. A week ago I decided to give Claude a serious try - use it as the main tool for my current work, with a thought out context file, planning etc. The results are impressive, it took about four hours to do a non-trivial refactor I had wanted but would have needed a few days to complete myself. A simpler feature where I'd need an hour of mostly mechanical work got completed in ten minutes by Claude.

But, I was keeping a close eye on Claude's plan and gradual changes. On several occasions I corrected the model because it was going to do something too complicated, or neglected a corner case that might occur, or other such issues that need actual technical skill to spot.

Sure, now a PM whose only skills are PowerPoint and office politics can create a product demo, change the output formatting in a real program and so on. But the PM has no technical understanding and can't even prompt well, let alone guide the LLM as it makes a wrong choice.

Technical experts should be in as much demand as ever, once the delirious "nobody will need to touch code ever again" gives way to a realistic understanding that LLMs, like every other tool, work much better in expert hands. The bigger question to me is how new experts are going to appear. If nobody's hiring junior devs because LLMs can do junior work faster and cheaper, how is anyone going to become an expert?


> I have been consistently skeptical of LLM coding but the latest batch of models seems to have crossed some threshold.

It’s refreshing to hear I’m not the only one who feels this way. I went from using almost none of my copilot quota to burning through half of it in 3 days after switching to sonnet 4.6. I’m about to have to start lobbying for more tokens or buy my own subscription because it’s just that much more useful now.


Yes, it's Sonnet 4.6 for me as well as the most impressive inflection point. I guess I find Anthropic's models to be the best, even before I found Sonnet 3.7 to be the only model that produced reasonable results, but now Sonnet 4.6 is genuinely useful. It seems to have resolved Claude's tendency to "fix" test failures by changing tests to expect the current output, it does a good job planning features, and I've been impressed by this model also telling me not to do things - like it would say, we can save 50 lines of code in this module but the resulting code would be much harder to read so it's better not to. Previous models in my experience all suffered from constantly wanting to make more changes, and more, and more.

I'm still not ready to sing praises about how awesome LLMs are, but after two years of incremental improvements since the first ChatGPT release, I feel these late-2025 models are the first substantial qualitative improvement.


^ Big this. If we take a pessimistic attitude, we're done for.


I think the key metric to good software has really changed, the bar has noticeably dropped.

I see unreliable software like openclaw explode in popularity while a Director of Alignment at Meta publicly shares how it shredded her inbox while continuing to use openclaw [1], because that's still good enough innit? I see much buggier releases from macOS & Windows. The biggest military in the world is insisting on getting rid of any existing safeguards and limitations on its AI use and is reportedly using Claude to pick bombing targets [2] in a bombing campaign that we know has made mistakes hitting hospitals [3] and a school [4]. AI-generated slop now floods social networks with high popularity and engagement.

It's a known effect that economies of scale lowers average quality but creates massive abundance. There never really was a fundamental quality bar to software or creative work, it just has to be barely better than not existing, and that bar is lower than you might imagine.

[1] https://x.com/summeryue0/status/2025774069124399363

[2] https://archive.ph/bDTxE

[3] https://www.reuters.com/world/middle-east/who-says-has-it-ha...

[4] https://www.nbcnews.com/world/iran/iran-school-strike-us-mil...


[flagged]


Is this a bot? I feel like HN is dying (for me at least) with all the em-dashes and the "it's not just X, it's Z".


This is correct. Had lunch with a senior staff engineer going for a promo to principal soon. He explained he was early to CC, became way more productive than his peers, and got the staff promo. Now he’s not sharing how he uses the agent so he maintains his lead over his peers.

This is so clearly a losing strategy. So clearly not even staff level performance let alone principal level.


Why the downvotes? It is the defining characteristic of the staff+ level to empower others. Individual contributions don’t matter at this level.


Hi Grok, nice comment!


I'm intrigued...how was he wrong?


The building in question wasn't really tall enough, and it would have had to be precision-demolished to collapse in the way he was afraid of.

It would still cause chaos and possible power issues.

It needs to be taken in context with some Sydney buildings having maintenance defects a few years after they opened, largely due to inferior materials imported from China. The building in question developed some cracks in supporting beams and was briefly evacuated. There was never a chance it was going to topple on its own in a way that impacted more than 1-2 datacenters, so he pivoted to possible terrorism, but even that's largely nonsensical.

I just went hunting for the case and couldn't find it. The gentleman in question had published the claim to his business, which, as it happens, was trying to build contacts with defense and intelligence agencies for third-party threat assessment. As far as I can tell the business no longer exists and he has deleted its footprint.

But he also posted the claim on public mailing lists so I can probably trawl it up if necessary.


This is a great and worthwhile discussion. People are losing sight of what art is. The art is the idea, not the medium. And just because something is easy doesn't mean it will be good.

I've seen some fantastic original pictures that actual artists have generated through AI. I can't wait to see what current and future artists can do with the new tools at their disposal.


I don't understand the fascination and focus on Vibe Coding.

Sure, you can do that, it's an option, but no serious engineering effort is being left entirely up to the AI.

Vibe coding is essentially the Jackson Pollock approach to software building. Throw a bunch of paint down, with very little control, and look, we have something novel.

It doesn't mean you're going to replace all the ways of making art with paint throwing.

I'd love to start seeing more discussions about alternative approaches to working with AI. The recent Vinext article was great https://blog.cloudflare.com/vinext/. This seems to be "the way" for working with AI in a high stakes production environment, but what other ways are there.

I fear the focus on vibe coding is diluting and taking focus away from far better alternatives. Maybe because the narrative around those aren't quite so dramatic?


Not sure I understand this distinction.

> I’ve always wanted an ARM server in my homelab. But earlier, I either had to use an underpowered ARM system, or use Asahi...

What is stopping you using Mac with MacOS?


It’s not great as a headless server.

With full disk encryption enabled you need a keyboard and display attached at boot to unlock it. You then need to sign in to your account to start services. You can use an IP based KVM but that’s another thing to manage.

If you use Docker, it runs in a vm instead of native.

With a Linux-based ARM box you can use full disk encryption, use dropbear to SSH in on boot to unlock the disks, run native Docker, run Proxmox, etc.

Mac minis/studio have potential to be great low powered home servers but Apple is not going down that route for consumers. I’d be curious if they are using their own silicon and own server oriented distro internally for some things.


They fixed unlock in the last release:

"On a Mac with Apple silicon with macOS 26 or later, FileVault can be unlocked over SSH after a restart if Remote Login is turned on and a network connection is available."

https://support.apple.com/guide/security/managing-filevault-...


That's a bad implementation and seems like a bad afterthought


Thanks for the reply. I'm looking to replace my aging mini pc with a mac mini, so I'm quite interested in any limitations here.

The full disk encryption I can live without. I'm assuming these limitations don't apply if it's disabled. [Ah, I just saw the other reply that this has now been fixed]

I was aware of the Docker in a VM issue. I haven't tested this out yet, but my expectation is this can be mitigated via https://github.com/apple/container ?

I appreciate any insights here.


Apple containers run in one vm per container


    The root of trust for Private Cloud Compute is our compute node: custom-built server hardware that brings the power and security of Apple silicon to the data center, with the same hardware security technologies used in iPhone, including the Secure Enclave and Secure Boot.
https://security.apple.com/blog/private-cloud-compute/

Granted, I don't know if it's really server oriented or if they're a bunch of iPhones on cards plugged into existing servers.


Most likely wanting to run Linux natively. Only M1/M2 can fill that role with Asahi, and still not with 100% hardware compatibility.

On the flip side, an M4 mini is cheaper, faster, much smaller (with built in power supply) and much more efficient. Plus for most applications, they can run in a Linux container just as well.


Thanks for the reply Jeff. This aligns with my understanding too. I'm close to purchasing a mac mini to replace my aging media pc. The core feature I want is to run microK8s natively, which I'm assuming the newish Mac containers will support.


You can't talk about hobby tunnels without mentioning the Moleman of Hackney :-)

- https://en.wikipedia.org/wiki/William_Lyttle

- The house was later brought back to life in an amazing way https://www.vogue.co.uk/arts-and-lifestyle/article/sue-webst...


That is fascinating!

But I feel he was an amateur compared to Joseph Williamson :)

https://en.wikipedia.org/wiki/Williamson_Tunnels


Brilliant. This indeed wins :-)


Also worth mentioning Baldassare Forestiere, who spent about 40 years digging an extensive network of tunnels in Fresno, CA.

https://en.wikipedia.org/wiki/Forestiere_Underground_Gardens


I haven't seen the Public Image Ltd logo in a very long time.


Is it just me, or does there appear to be a big gap in how people understand this works?

There is no magic here. Replace "car" with some nonsense word the LLM hasn't encountered before. It will completely ignore the small amount of nonsense you have provided, and confidently tell you to walk, while assuming you are talking about a car. I'm fairly confident the first time this was tried using "car", it told them to walk.

"I want to wash my flobbergammer. The flobbergammer wash place is only 50 meters away. should I drive or walk."

Reply:

If it’s only *50 meters away*, definitely *walk*.

That’s about a 30–45 second walk for most people. Driving would likely:

* Take longer (getting in, starting the car, parking)
* Waste fuel
* Add unnecessary wear to your car
* Be objectively funny in a “why did I do this” kind of way

The only reasons to drive would be:

* The flobbergammer is extremely heavy
* Severe weather
* You have mobility limitations

Otherwise, enjoy the short stroll. Your future self will approve.

Via chatGPT free tier. Paid Claude Sonnet 4.5 Extended gives me:

For just 50 meters, you should definitely walk! That's an incredibly short distance - less than a minute on foot. By the time you'd get in your car, start it, drive, and park, you could have already walked there and back. Plus, you'd avoid the hassle of finding parking for such a short trip. Walking is easier, faster, better for the environment, and you'll get a bit of movement in. Save the car for longer distances!
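For anyone who wants to reproduce the substitution test, here's a minimal sketch. The nonsense words and the helper name are my own invention, not from any benchmark; it just builds the control prompt plus the variants to paste into whatever model you're testing:

```python
# Sketch of the substitution test described above: swap "car" for a
# nonsense word and see whether the model grounds its answer in the
# actual object, or just pattern-matches "50 meters ... drive or walk".
TEMPLATE = (
    "I want to wash my {thing}. The {thing} wash place is only "
    "50 meters away. Should I drive or walk?"
)

def make_variants(nonsense_words):
    """Return the control prompt ("car") plus one variant per nonsense word."""
    return [TEMPLATE.format(thing=w) for w in ["car", *nonsense_words]]

prompts = make_variants(["flobbergammer", "zorblatt"])
# A grounded model should ask what a "flobbergammer" is (or at least hedge);
# a pattern-matcher will confidently answer as if it were a car.
```

Running the same variants across a few models makes the "it just assumed car" behavior easy to compare side by side.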


The car wash is 50 parsecs away, should I walk, drive, or jump?


Here's a thought. Let's all arbitrarily agree AGI is here. I can't even be bothered discussing what the definition of AGI is. It's just here, accept it. Or vice versa.

Now what...? What's happening right now that should make me care that AGI is here (or not)? What's the magic thing that's happening with AGI that wasn't happening before?

<looks out of window> <checks news websites> <checks social media...briefly> <asks wife>

Right, so, not much has changed from 1-2 years ago that I can tell. The job market's a bit shit if you're in software... is that what we get for billions of dollars spent?


Cultural changes take time. It took decades for the internet to move from nerdy curiosity to an essential part of everyone's life.

The writing is on the wall. Even if there's no new advances in technology, the current state is upending jobs, education, media, etc


> It took decades

It took one September. Then as soon as you could take payments on the internet the rest was inevitable and in _clear_ demand. People got on long waiting lists just to get the technology in their homes.

> no new advances in technology

The reason the internet became so accessible is because Moore was generally correct. There were two corresponding exponential processes that vastly changed the available rate of adoption. This wasn't at all like cars being introduced into society. This was a monumental shift.

I see no advances in LLMs that suggest any form of the same exponential processes exist. In fact the inverse is true. They're not reducing power budgets fast enough to even imagine that they're anywhere near AGI, and even if they were, that they'd ever be able to sustainably power it.

> the current state is upending jobs

The difference is companies fought _against_ the internet because it was so disruptive to their business model. This is quite the opposite. We don't have a labor crisis, we have a retention crisis, because companies do not want to pay fair value for labor. We can wax on and off about technology, and perceptrons, and training techniques, or power budgets, but this fundamental fact seems the hardest to ignore.

If they're wrong this all collapses. If I'm wrong I can learn how to write prompts in a week.


> It took one September.

It's the classic "slowly, then suddenly" paradigm. It took decades to get to that one September. Then years more before we all had internet in our pocket.

> The reason the internet became so accessible is because Moore was generally correct.

Can you explain how Moore's law is relevant to the rise of the internet? People didn't start buying couches online because their home computer lacked sufficient compute power.

> I see no advances in LLMs that suggest any form of the same exponential processes exist.

LLMs have seen enormous growth in power over the last 3 years. Nothing else comes close. I think they'll continue to get better, but critically: even if LLMs stay exactly as powerful as they are today, it's enough to disrupt society. IMHO we're already at AGI.

> The difference is companies fought _against_ the internet

Some did, some didn't. As in any cultural shift, there were winners and losers. In this shift, too, there will be winner and losers. The panicked spending on data centers right now is a symptom of the desire to be on the right side of that.

> because companies do not want to pay fair value for labor.

Companies have never wanted to pay fair value for labor. That's a fundamental attribute of companies, arising as a consequence of the system of incentives provided in capitalism. In the past, there have been opportunities for labor to fight back: government regulation, unions. This time that won't help.

> If I'm wrong I can learn how to write prompts in a week.

Why would you think that anyone would want you to write prompts?


what September?


This is an allusion to the old days, before the internet became a popular phenomenon. It used to be that every September a bunch of "newbies" (college students who had just gotten access to an internet connection for the first time) would log in and make a mess of things. Then, in the late nineties when it really took off, everybody logged in and made a mess of things. This is the "eternal September." [1]

[1] https://en.wikipedia.org/wiki/Eternal_September


I really think corporations are overplaying their hand if they think they can transform society once again in the next 10 years.

Rapid de industrialization followed by the internet and social media almost broke our society.

Also, I don’t think people necessarily realize how close we were to the cliff in 2007.

I think another transformation now would rip society apart rather than take us to the great beyond.


I worry that if the reality lives up to investors dreams it will be massively disruptive for society which will lead us down dark paths. On the other hand if it _doesn't_ live up to their dreams, then there is so much invested in that dream financially that it will lead to massive societal disruption when the public is left holding the bag, which will also lead us down dark paths.


It's already made it impossible to trust half of the content i read online.

Whenever I use search terms to ask a specific question these days there's usually a page of slop dedicated to the answer which appears top for relevancy.

Once I realize it is slop I realize the relevant information could be hallucinated, so I can't trust it.

At the same time I'm seeing a huge upswing in probable human-created content being accused of being slop.

We're seeing a tragedy of the information commons play out on an enormous scale at hyperspeed.


You trust nearly half??!!??


I think corporations can definitely transform society in the near future. I don't think it will be a positive transformation, but it will be a transformation.

Most of all, AI will exacerbate the lack of trust in people and institutions that was kicked into high gear by the internet. It will be easy and cheap to convince large numbers of people about almost anything.


As a young adult in 2007, what cliff were we close to?

The GFC was a big recession, but I never thought society was near collapse.


We were pretty close to a collapse of the existing financial system. Maybe we’d be better off now if it happened, but the interim devastation would have been costly.


It felt like the entire global financial system had a chance of collapsing.


We weren't that far away from ATMs refusing to hand out cash, banks limiting withdrawals from accounts (if your bank hadn't already gone under), and a subsequent complete collapse of the financial system. The only thing that saved us from that was an extraordinary intervention by governments, something I am not sure they would be capable of doing today.


I'm still not buying that AI will change society anywhere as much as the internet or smart phones for the matter.

The internet made it so that you can share and access information in a few minute if not seconds.

Smartphones built on the internet by making that sharing and access of information possible from anywhere and by anyone.

AI seems to occupy the same space as Google in the broader internet ecosystem. I don't know what AI provides me that a few hours of Google searches couldn't. It makes information retrieval faster, but that was never the hard part. The hard part was understanding the information, so that you're able to apply it to your particular situation.

Being able to write to-do apps X1000 faster is not innovation!


You are assuming that the change can only happen in the west.

The rest of the world has mostly been experiencing industrialisation, and was only indirectly affected by the great crash.

If there is a transformation in the rest of the world the west cannot escape it.

A lot of people in the west seem to have their heads in the sand, very much like when Japan and China tried to ignore the west.

China is the world's second biggest economy by nominal GDP, India the fourth. We have a globalised economy where everything is interlinked.


When I look at my own country it has proven to be open to change. There are people alive today who remember when Christianity dominated public life; now we swear in a gay prime minister.

In that sense Western countries have proven that they are intellectually very nimble.


Three of the best known Christians I have known in my life are gay. Two are priests (one Anglican, one Catholic). Obviously the Catholic priest had taken a vow of celibacy anyway, so it's entirely immaterial. I did read an interview with a celeb friend (also now a priest!) of his that said he (the priest I knew) thought people did not know he was gay; we all knew, we just did not make a fuss about it.

Even if you accept the idea that gay sex is a sin, the entire basis of Christianity is that we are all sinners. Possessing wealth is a failure to follow Jesus's commands for instance. You should be complaining a lot more if the prime minister is rich. Adultery is clearly a more serious sin than having the wrong sort of sex, and I bet your country has had adulterous prime ministers (the UK certainly has had many!).

I think Christians who are obsessed with homosexuality as somehow making people worse than the rest of us, are both failing to understand Christ's message, and saying more about themselves than gays.

If you look at when sodomy laws were abolished, countries with a Christian heritage led this. There are reasons for this in the Christian ethos of choice and redemption.


> people alive today who remember Christianity now we swear in a gay prime minister

Why would that be a contradiction? Gay people can't be Christian?


> Cultural changes take time. It took decades for the internet to move from nerdy curiosity to an essential part of everyone's life.

99% of people only ever use proprietary networks from FAANG corporations. That's not "the internet", that's an evolution of CompuServe and AOL.

We got TCP/IP and the "web-browser" as a standard UI toolkit stack out of it, but the idea of the world wide web is completely dead.


Shockingly rare how few realize this. It's a series of mega cities interconnected by ghost towns out here.


yeah, this is a good point, transition and transformation to new technologies takes time. I'm not sure I agree the current state is upending things though. It's forcing some adaption for sure, but the status quo remains.


It also took years for the Internet to be usable by most folks. It was hard, expensive and unpractical for decades.

Just about the time it hit the mainstream, coincidentally, is when the enshittification began to go exponential. Be careful what you wish for.


Allow me to clarify: I'm not wishing for change. I am an AI pessimist. I think our society is not prepared to deal with what's about to happen. You're right: AI is the key to the enshittification of everything, most of all trust.


Governments and companies have been pushing for identity management that connects your real life identity with your digital one for quite some time. With AI, I believe that's no longer only a bad thing; it may be unavoidable now.


What's happening with AGI depends on what you mean by AGI so "can't even be bothered discussing what the definition" means you can't say what's happening.

My usual way of thinking about it is AGI means can do all the stuff humans do which means you'd probably after a while look out the window and see robots building houses and the like. I don't think that's happening for a while yet.


Indeed: particularly given that—just as a nonexhaustive "for instance"—one of the fairly common things expected in AGI is that it's sapient. Meaning, essentially, that we have created a new life form, that should be given its own rights.

Now, I do not in the least believe that we have created AGI, nor that we are actually close. But you're absolutely right that we can't just handwave away the definitions. They are crucial both to what it means to have AGI, and to whether we do (or soon will) or not.


I'm not sure how the rights thing will go. Humans have proved quite able not to give many rights to animals or other groups of humans even if they are quite smart. Then again there was that post yesterday with a lady accusing OpenAI of murdering her AI boyfriend by turning off 4o so no doubt there will be lots of arguments over that stuff. (https://news.ycombinator.com/item?id=47020525)


Who would the robots build houses for? No one has a job and no one is having kids in that future.


Where are the robots going to sleep? Outside in the rain?


The billionaire elite. Isn’t it obvious? They want to get rid of us


Before enlightenment^WAGI: chop wood, fetch water, prepare food

After enlightenment^WAGI: chop wood, fetch water, prepare food


One of the most impactful books I ever read was Alvin Toffler's Future Shock.

Its core thesis was: Every era doubled the amount of technological change of the prior era in one half the time.

At the time he wrote the book in 1970, he was making the point that the pace of technological change had, for the first time in human history, rendered the knowledge of society's elders - previously the holders of all valuable information - irrelevant.

The pace of change has continued to steadily increase in the ensuing 55 years.

Edit: grammar


> Here's a thought. Lets all arbitrarily agree AGI is here.

A slightly different angle on this - perhaps AGI doesn't matter (or perhaps not in the ways that we think).

LLMs have changed a lot in software in the last 1-2 years (indeed, the last 1-2 months); I don't think it's a wild extrapolation to see that'll come to many domains very soon.


Which domains? Will we see a lot of changes in plumbing?


If most of your work involves working with a monitor and keyboard, you're in one of the domains.

Even if it doesn't, you will be indirectly affected. People will flock to trades if knowledge work is no longer a source of viable income.


> Lets all arbitrarily agree AGI is here. I can't even be bothered discussing what the definition of AGI is.

There is a definition of AGI the AI companies are using to justify their valuation. It's not what most people would call AGI but it does that job well enough, and you will care when it arrives.

They define it as an AI that can develop other AIs faster than the best team of human engineers. Once they build one of those in house they outpace the competition and become the winner that takes all. Personally I think it's more likely they will all achieve it at a similar time. That would mean the race continues, accelerating as fast as they can build data centres and power plants to feed them.

It will impact everyone, because the already dizzying pace of the current advances will accelerate. I don't know about you, but I'm having trouble figuring out what my job will be next year as it is.

An AI that just develops other AIs could hardly be called "general" in my book, but my opinion doesn't count for much.


May I ask, what experiences are you personally having with LLMs right now that is leading you to the conclusion that they will become "intelligent" enough to identify, organise, and build advancing improvements to themselves, without any human interaction in the near future (1 - 2 years lets say)?


> May I ask, what experiences are you personally having with LLMs right now that is leading you to the conclusion that they will become "intelligent" enough to identify, organise, and build advancing improvements to themselves, without any human interaction in the near future (1 - 2 years lets say)?

None, as I don't develop LLM's.

I wasn't saying I think they will succeed, but I think it is worth noting their AGI ambitions are not as grand as the term implies. Nonetheless, if they achieve them, the world will change.


I mis-read. Thanks for clarifying :-)


Re-reading, it's entirely my fault. I should have said:

> and you will care if/when it arrives.


If AGI were already here, actions would be so greatly accelerated that humans wouldn't have time to respond.

Remember that weather balloon the US found a few years ago that for days was on the news as a Chinese spy balloon?

Well, whether it was a spy balloon or a weather balloon, the first hint of its existence could have triggered a nuclear war that could have already been the end of the world as we know it, because AGI will almost certainly be deployed to control the U.S. and Chinese military systems, and it would have acted well before any human had time to intercept its actions.

That’s the apocalyptic nuclear winter scenario.

There are many other scenarios.

An AGI which has been infused with a tremendous amount of ethics, so the above doesn't happen, may also lead to terrible outcomes for humans. An AGI would essentially be a different species (although a non-biological one). If it replicated human ethics, even as inconsistently as we apply them, it would learn that treating other species brutally is acceptable (we breed, enslave, imprison, torture, and then kill over 80 billion land animals annually in animal agriculture, and possibly trillions of water animals). There's no reason it wouldn't do that to us.

Finally, if we infuse it with our ethics but it's smart enough to apply them consistently (even a basic application of our ethics would have us end animal agriculture immediately), so that it realizes humans are wrong and doesn't do the same thing to us, it might still create an existential crisis for humans, as our entire identity is based on thinking we are smarter and intellectually superior to all other species, which wouldn't be true anymore. Further, it would erode belief in gods and other supernatural BS, which might at the very least lead humans to stop reproducing due to the existential despair this might cause.


You're talking about superintelligence. AGI is just...an AI that's roughly on par with humans on most things. There's no inherent reason why AGI will lead to ASI.


What a silly comment. You're literally describing the plot of several sci-fi movies. Nuclear command and control systems are not taken so lightly.

And as for the Chinese spy balloon, there was never any risk of a war (at least not from that specific cause). The US, China, Russia, and other countries routinely spy on each other through a variety of unarmed technical means. Occasionally it gets exposed and turns into a diplomatic incident but that's about it. Everyone knows how the game is played.


"Nuclear command and control systems are not taken so lightly."

https://gizmodo.com/for-20-years-the-nuclear-launch-code-at-...


AGI is not a death sentence for humanity. It all depends on who leverages the tool. And in any case, AGI won’t be here for decades to come.


Your sentence seems to imply that we will delegate all AI decisions to one person who can decide how he wants to use it - to build or destroy.

Strong agentic AIs are a death sentence memo pad (or a malevolent djinn lamp if you like) that anyone can write on, because the tools will be freely available to leverage. A plutonium breeder reactor in every backyard. Try not to think of paperclips.


Sounds fun let's do it.


Depends on the cost to run it. Say it costs 5k to do a year's worth of something intellectual with it. That means the price ceiling on 90% of lawyer/accountant/radiologist/low-to-middle-management work is 5k now. It will be epic, and temporarily terrible when it happens, as long as reasonably competent models are open source. I also don't think we are near that at all, though.


I do strongly agree on the framing, but I'd argue with the conclusion

Yeah, it really doesn't matter if AGI has happened, is going to happen, will never happen, whatever. No matter what sort of definition we make for it, someone's always going to disagree anyway. For a looong time, we thought the Turing test was the standard, and that only a truly intelligent computer could beat it. It's been blown out of the water for years now, and now we're all arguing about new definitions for AGI

At the end of the day, like you say, it doesn't matter a bit how we define terms. We can label it whatever we want, but the label doesn't change what it can DO

What it can DO is the important part. I think a lot of software devs are coming to terms with the idea that AI will be able to replace vast chunks of our jobs in the very near future.

If you use these things heavily, you can see the trajectory.

6 months ago I'd only trust them for boiler plate code generation and writing/reviewing short in-line documentation.

Today, with the latest models and tools, I'm trusting them with short/low impact tasks (go implement this UI fix, then redeploy the app locally, navigate to it, and verify the fix looks correct).

6 months from now, my best guess is that they'll continue to become more capable of handling longer + more complex tasks on their own.

5 years from now, I'm seeing a real possibility that they'll be handling all the code, end to end.

Doesn't matter if we call that AGI or not. It very much will matter whose jobs get cut, because one person with AI can do the work of 20 developers


AGI would render humans obsolete and eradicate us sooner or later.


Pretty sure marketing teams are already working on AGI v2


AGI is a pipe dream and will never exist


Odd to see someone so adamantly insist that we have souls on a forum like HN.


I think you are missing the point: If we assume that AGI is *not* yet here, but may be here soon, what will change when it arrives? Those changes could be big enough to affect you.


I'm missing the point? I literally asked the same thing you did.

>Now what....? Whats happening right now that should make me care that AGI is here (or not).

Do you have any insight into what those changes might concretely be? Or are you just trying to instil fear in people who lack critical thinking skills?


You did not ask the same thing. You framed the question such that readers are supposed to look at their current lives and realize nothing is different ergo AGI is lame. Your approach utilizes the availability bias and argument from consequences logical fallacies.

I think what you are trying to say is "can we define AGI so that we can have an intelligent conversation about what that will mean for our daily lives?" But you oddly introduced your argument by stating you didn't want to explore this definition...


People are taking actions based on its advice.


The economy is shit if you’re anything except a nurse or providing care to old people.


Electricians are also doing pretty well. Someone has to wire up those new data centers.


> The job markets a bit shit if you're in software

That's Trump's economy, not LLMs.


Many devs don’t write code anymore. Can really deliver a lot more per dev.

Many people slowly losing jobs and can’t find new ones. You’ll see effects in a few years


Deliver a lot more tech debt


My LLMs do create non-zero amounts of tech debt, but they are also massively decreasing human-made tech debt by finding mountains of code that can be removed or refactored when using the newest frameworks.


That tech debt will be cleaned up with a model in 2 years. Not that human don't make tech debt.


What that model is going to do in 2 years is replace tech debt with more complicated tech debt.


One could argue that's a cynically accurate definition of most iterative development anyway.

But I don't know that I accept the core assertion. If the engineer is screening the output and using the LLM to generate tests, chances are pretty good it's not going to be worse than human-generated tech debt. If there's more accumulated, it's because there's more output in general.


Only if you accept the premise that the code generated by LLMs is identical to the developer's output in quality, just higher in volume. In my lived professional experience, that's not the case.

It seems to me that prompting agents and reviewing the output just doesn't.... trigger the same neural pathways for people? I constantly see people submit agent generated code with mistakes they would have never made themselves when "handwriting" code.

Until now, the average PR had one author and a couple reviewers. From now on, most PRs will have no authors and only reviewers. We simply have no data about how this will impact both code quality AND people's cognitive abilities over time. If my intuition is correct, it will affect both negatively over time. It remains to be seen. It's definitely not something that the AI hyperenthusiasts think at all about.


> In my lived professional experience, that's not the case.

In mine it is the case. Anecdata.

But for me, this was over two decades in an underpaid job at an S&P500 writing government software, so maybe you had better peers.


I stated plainly: "we have no data about this". Vibes is all we have.

It's not just me though. Loads of people subjectively perceiving a decrease in quality of engineering when relying on agents. You'll find thousands of examples on this site alone.


I have yet to find an agent that writes as succinctly as I do. That said, I have found agents more than capable of doing something.


I've been writing code for 20 years. AI has completely changed my life and the way I write code and run my business. Nothing is the same anymore, and I feel I will be saying that again by the end of 2026. My productive output as a programmer in software and business has expanded 3x *compounding monthly*.


>My productive output as a programmer in software and business have expanded 3x compounding monthly.

In what units?


Tasks completed in my todo list software; I've been measuring my output for 5 years. Time saved because I built one-off tools to automate many common workflows. And yes, even dollars earned.

I don’t mean 3x compounding monthly every month, I mean 3x total since I started using Claude Code about 6 months ago but the benefits keep compounding.


GWh


Going from gigajoules to terajoules.


Vibes


Going from punch cards to terminals also "completely changed my life and the way I write code and run my business"

Firefox introducing their dev debugger many years ago "completely changed my life and the way I write code and run my business"

You get the idea. Yes, the day to day job of software engineering has changed. The world at large cares not one jot.


I mean 2025 had the weakest job creation growth numbers outside of recession periods since at least 2003. The world seems to care in a pretty tangible way. There are other big influencing factors for that, too, of course.


Okay. So software engineers are vastly more efficient. Good I guess. "Revolutionize the entire world such that we rethink society down to its very basics like money and ownership" doesn't follow from that.


Man you guys are impatient. It takes decades even for earth shattering technologies to mature and take root.


If people want to make the "this will be AGI after two decades and will totally revolutionize the entire world" that's fine. If people want to make the "wow this is an incredibly useful tool for many jobs that will make work more efficient" that's fine. We can have those discussions.

What I don't buy is the "in two years there will be no more concept of money or poverty because AI has solved everything" argument using the evidence that these tools are really good at coding.


Damn right I'm impatient. My eye starts twitching when a web page takes more than 2 seconds to load :-)

In the meantime, I've had to continuously hear talk about AI, both in real life (like at the local pub) AND virtually (tv/radio/news/whatever) and how it's going to change the world in unimaginable ways for the last...2/3 years. Billions upon billions of dollars are being spent. The only tangible thing we have to show is software development, and some other fairly niche jobs, have changed _a bit_.

So yeah, excuse my impatience for the bubble to burst, I can stop having to hear about this shit every day, and I can go about my job using the new tools we have been gifted, while still doing all the other jobs that sadly do not benefit in any similar way.


> The only tangible thing we have to show is software development, and some other fairly niche jobs, have changed _a bit_.

There is zero evidence that LLMs have changed software development efficiency.

We get an earth-shattering developer productivity gamechanger every five years. All of them make wild claims, none of them ever have any data to back those claims up.

LLMs are just another in a long, long list. This too will pass. (Give it five years for the next gamechanger.)


Are you working for 3x less the time compounding monthly?

Are you making 3x the money compounding monthly ?

No?

Then what's the point?


Yes and yes.


Okay, teach me how, then? I would also like to work 3× less and make 3× more.


People keep impatiently expecting proof from builders with no moat. It's like that Upton Sinclair quote.


Start a software business, presumably.


Ten more months in 2026, so you should be about 60,000x better by the end of the year.
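(The arithmetic behind that quip checks out, for anyone curious: 3x compounding monthly over ten more months is 3^10 = 59,049, i.e. roughly 60,000x. A trivial sanity check:)

```python
# 3x productivity gains, compounding monthly, over the ten
# remaining months of 2026
factor = 3 ** 10
print(factor)  # 59049 -- roughly the "60,000x" above
```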


You say that as if it’s impossible but there are several indie makers that have gone from $10 MRR to $600k MRR over the past 8 months.


It's weird that you guys keep posting the same comments with the exact same formatting

You're not fooling anyone


I actually think it is here. Singularity happened. We're just playing catch up at this point.

Has it run away yet? Not sure, but is it currently in the process of increasing intelligence with little input from us? Yes.

Exponential graphs always have a slow curve in the beginning.


Didn't you get the memo? Tuesday. Tuesday is when the Singularity happens.

Will there still be ice cream after Tuesday? General societal collapse would be hard to bear without ice cream.


Tuesday at 4 p.m to be specific.

