Confusing headline. This is "Tech CEOs are breaking Tesler’s Law", which states that you can't eliminate irreducible complexity from your product. The argument is that Tech CEOs think they can replace their workforce with generative AI, which violates that law because it takes humans to design for human problems.
You could ask the same question about any tool ever created. Users who figure out ways to use their agents that are profitable to them make money. Everyone else spends money.
> Users who figure out ways to use their agents that are profitable to them make money
Does anyone know any of these users? Most of the agentic-coding boosters seem to be pretty much exclusively building personal knowledge bases and more agentic coding tools.
> Pew writes that 44 percent of U.S. adults now say they use OpenAI’s chatbot, a figure that’s more than doubled since 2023.
> The next most popular chatbot is Gemini (24 percent), followed by Copilot (17 percent) and MetaAI (14 percent), with Grok (8 percent), Claude (6 percent) and Character.ai (3 percent) lagging behind.
Claude in 6th place, behind Gemini and Copilot and MetaAI and Grok?
No wonder the general public still think AI is junk.
The question there was "% of U.S. adults who say they ever use the following AI chatbots", so it's not a measure of overall usage, just exposure. Not surprising Gemini and Grok and MetaAI rank higher then.
I think there is a valid point here that Anthropic has found a great product-market fit among programmers.
By comparison, all the rest of the tools non-programmers get exposure to are floundering around trying to be everything to everyone. It's a push not a pull.
The rest of the pack, when given everyday real-world computing tasks, for people that don't know what a terminal is, just suck. (e.g. "copilot, fix the spacing issue in this word document" or literally any apple genmoji attempt with more than two basic english words)
I had a big culture shock moment when I had to prep some slides a few weeks back. I'd assumed it would be a breeze now: I've always been good at making slide decks, I had a clear classification-friendly idea of exactly what I wanted them to look like, and there's even an AI native integration! Nope, didn't work, just had to shuffle components around like I always have.
Weaker models and less powerful harnesses give people a very sub-par experience compared to what you get if you pay for access to the better tools and models.
Normal users aren't using harnesses in the sense developers think of them. They're interacting with models where they've been shoehorned in for no good reason, or they're using them nearly entirely through chat interfaces.
> What happened in 2025 was this: the economics of code production were turned upside down. Instead of being very hard, time-consuming, and expensive to generate code, it became effectively free and instant. Lines of code went from being treasured, reused, cared for and carefully curated, to being disposable and regenerable, practically overnight.
I've been thinking about this a whole lot recently. So much of my intuition about software development is based on 25 years of accumulated experience on how long it will take to write different bits of code.
Should I add validation for this one edge-case which won't break everything but will make a little bit of a mess if someone hits it? If that's an extra couple of hours of code I might skip it. If it's one more prompt, why wouldn't I?
That's just on the small scale. There are entire projects that I'd never previously have considered, because I don't need a custom SQLite SELECT query parsing library enough to justify spending a week or more building one. But now... https://github.com/simonw/sqlite-ast
People get VERY upset (and condescending) any time you suggest that being able to produce lines of code faster is a valuable thing. And sure, measuring output through "lines of code" is stupid.
But measuring "lines of verified code that deliver value" isn't stupid at all. That's the thing we can do faster now.
I’m gonna say this in the most polite way that I can but who cares?
Look around you - google is valuable because it hoovers up data to generate revenue from advertising and has minimal expenditures compared with the revenues. All those bets? Lol yeah what about them?
Engineering for the sake of engineering has no value to the economy - aka it’s irrelevant. It’s the hard truth nobody wants to hear. There’s a limited set of things that can exist in the economy at any given moment in time - only those that provide value and can be sustained w.r.t economics stay the course.
> Engineering for the sake of engineering has no value to the economy
I think that's the adventure we're on now. If recreating something is low cost, what is the value in investing in designing it well in the first place? We can empirically discover issues and get the AI to address them.
I certainly routinely find in supervising what the LLM is writing that it's making terrible internal design choices and correct them. Usually things one level up from code. "This will cache every image on the client and cause a huge amount of bloat. Change it to pull the image in real time from the server" kind of stuff. You do slowly build that up in the project documentation - "Never store unnecessary data on the client: we assume they are using low powered devices without substantial storage". But it takes time and the road to discovering that empirically is through a lot of unhappy users.
So I think there is still a lot of room for genuine engineering - that is, at the technical design level. Levels up from that - code structure etc - are much less clear. I am guessing that over time we will heavily optimise code written by AI for maintenance by AI. Which may be mostly about matching the context window to the code module size. Factoring something to 5 modules may be less of a good idea if it means the context window has to hold all of them for the LLM to work. But that is the path of discovery we are on which history tells us is a 20 year journey.
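As a rough sketch of the "match the context window to the module size" idea, the snippet below estimates whether a set of modules fits in a model's context window at once. The four-characters-per-token ratio, the 200,000-token window and the 20,000-token reserve are all assumptions picked for illustration, not properties of any particular model.

```python
# Rough sketch: do these modules fit in the context window together?
# CHARS_PER_TOKEN is a crude heuristic (assumption), not a real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(source: str) -> int:
    """Estimate the token count of a source string (ceiling division)."""
    return -(-len(source) // CHARS_PER_TOKEN)


def fits_in_context(modules: dict[str, str], window: int = 200_000,
                    reserve: int = 20_000) -> bool:
    """True if all modules fit at once, leaving `reserve` tokens for
    instructions and the model's reply. Both defaults are hypothetical."""
    total = sum(estimate_tokens(src) for src in modules.values())
    return total <= window - reserve


# Two small modules versus five large ones (synthetic contents).
small = {"a.py": "x = 1\n" * 1_000, "b.py": "y = 2\n" * 1_000}
large = {f"m{i}.py": "z = 3\n" * 50_000 for i in range(5)}
print(fits_in_context(small))
print(fits_in_context(large))
```

A real check would use the model's own tokenizer, but even this crude version shows how splitting code into five modules stops helping once the model needs all five loaded together.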
> Engineering for the sake of engineering has no value to the economy - aka it’s irrelevant.
I would put that as my signature if HN supported that. I see a lot of systems being built where the whole point seems to be about the ritual, not anything valuable for the user.
I was surprised that GLM 5.1/5.2 are not vision models - they are text input only.
That's actually pretty uncommon these days. All of the OpenAI/Anthropic/Gemini models accept images, and so do the other leading open weight families - Gemma 4, Qwen 3.6, Kimi 2.x.
In GLM's case image input would be useful because it's a model that scores very highly for tasks like web design, but without image input it can't take a screenshot and output HTML+CSS.
Don't get me wrong, GLM is a phenomenal model, but the image thing is a bit of a gap.
Configure a subagent in your coding harness to spin up a new sub-session with any vision model for those tasks and feed the result back to the main model. No need for "one model that does everything"
That doesn’t work well in a lot of scenarios. The text LLM doesn’t know what to look for in an image before it sees a description, you might need multiple rounds of back and forth.
Vision decoding outside of the latent space of the model is lossy, but Claude Opus's vision isn't that great outside of UI screenshots. I mean it works in a pinch. At least in my testing, if you're looking at non-UI images, there are better image-to-text models that can turn them into very precise documents that any LLM can easily parse.
I don't see this being such a big gap. There are some use-cases for sure but apart from UX/UI work it is not really needed. Besides, none of the frontier models can replicate actual images - they can only approximate them, at least in my own experience.
A pretty fun and quick test I do with vision models is to screenshot the Hacker News homepage and ask the model to return a JSON representation of the screenshot - qwen 3.5 0.8b did surprisingly well at this.
They are different things. Government money is very predictable and consistent, and based on different calculations than typical consumer-oriented sales. Profits are usually easier.
"OpenAI generated $13.07 billion in revenue in 2025"
Considering just four years ago they were a research lab with hardly any revenue at all, and no corporate muscles for earning revenue, I think that is a very impressive number.
(Sure, they're losing a whole lot of money too. Same goes for almost every other hyper-growth company in the history of tech.)
> Same goes for almost every other hyper-growth company in the history of tech
Except it's not true. No one lost $38.5B in a year just to 'hyper-grow' or whatever it means. Uber accumulated ~$30B loss over a decade.
Edit: I read it wrong. The loss was mostly caused by one-time event[0]:
> Before OpenAI’s switch late last year to become a public benefit corporation, investors in the company received convertible interest rights rather than conventional equity. Under US accounting rules, those interests were treated as liabilities and periodically revalued as the company’s valuation increased.
It looks like OpenAI is actually quite in line with other companies that lost money to grow.
That argument supports any levels of losses, however I also think it’s rather misleading.
Growth means some inefficiencies, but their expenses are largely around commodities like electricity and data centers, not a sudden army of salespeople. They also got 150M 11 years ago and 1 billion 7 years ago; they were quite large in 2022.
Basically you don’t get better at writing checks to your local utility which limits how much they can control costs.
In 2022 they only had 335 employees (according to various internet searches but I can't find an original source for that number.) I can't find credible numbers for revenue from the GPT-3 API, which did have some usage - GitHub Copilot started charging a subscription fee on June 21, 2022 - https://github.blog/changelog/2022-06-21-github-copilot-is-n... - and that was running on the OpenAI Codex model so presumably OpenAI had some revenue from that.
That said, in many ways 335 employees is the midpoint between 3 employees and 30,000 employees. The CEO can’t keep track of everyone’s names and what they’re doing, you need layers of management, HR, etc. It’s not really a simple exponential function but 335 to 336 is way more automated than going from 3 to 4.
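The "midpoint" claim holds on a logarithmic scale, which a quick check makes concrete. This is only a sketch of the arithmetic, using the headcounts quoted above.

```python
import math

# Midpoint on a log scale = geometric mean of the two endpoints.
low, high = 3, 30_000
log_midpoint = math.sqrt(low * high)
print(log_midpoint)  # 300.0

# Growth factor on each side of 335 employees.
print(335 / low)     # roughly 112x up from 3
print(high / 335)    # roughly 90x up to 30,000
```

So 335 sits close to the point where the company is as many doublings away from 3 people as it is from 30,000.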
And WeWork is an awesome example because it fell apart before IPO. It didn't even make it that far. On the other hand, for all of the shit talking that goes on online, SpaceX is up 49% from IPO price.
All of the shit that people said about SpaceX is still true. It's still up 49%. I'm sure it'll take a dump the next time anything bad happens, like a rocket explodes, but now that it's public, I'ma be watching all their rocket launches so I can buy if that happens and sell right after. I'm also going to be watching because going to space is fucking awesome but I can't buy a trip on that rocket yet and no one's gonna pay me to watch it.
Your argument went from "big number good" to redefining "stupid", and you think that somehow supports your original statement?
What word would you use to describe someone that:
- told you to put glue on pizza?
- thinks there's 1 'r' in strawberry?
- is incapable of stopping terminal flickering?
- deletes your production database?
- bankrupts you trying to scan the entire IPv6 address space of a play network interface?
- can only attempt to draw a bird on a bike in the most bland and unimaginative style possible yet still completely failing?
All while being given the entire US economy and polluting the only planet we have to do so?
I mean, sure you said "LLMs", rather than "LLMs in the last 12 months", and sure, you completely abandoned your original argument, and sure, you ignored the other things listed, and of course everyone knows that list is a comprehensive list of the only failings of genai rather than a honeypot to positively identify you as a shameless shill, but ultimately, the fact that HN chose someone this terrible at making a defensible logical argument to be their favorite genai financial interest mouthpiece is a strong indicator defending the criticisms in the original submission.
If your argument is that LLMs are stupid in the same way that NFTs were stupid I don't think it's worth spending any more time discussing this with you.
If your losses scale with your growth, while at the same time your competitors are eating into your future user-base, how are you ever gonna become profitable? Only two ways come to mind: regulatory capture, and moving upwards into a full software-development house.
Look at how a utility works, in setting price specifically, for things that are considered a public good. The story is not about how much profit or revenue they make. It's about how you keep it afloat and expanding in the coming year. That's it.
AI doesn't work like the rest of the tech industry. The cost of selling another license for a software program is approximately zero.
In the case of AI the marginal cost of the next token is not zero, and is in fact probably not going down much with volume, if at all.
So I'm not sure one can argue that scale will solve everything. It's very much like the old adage "we lose money on every sale, but make it up in volume".
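The contrast can be written as simple arithmetic. In this sketch every number is made up for illustration: `price`, `marginal_cost` and `fixed_cost` are hypothetical, not anyone's actual figures, and whether the per-token margin is really negative is exactly what is in dispute.

```python
def profit(units: int, price: float, marginal_cost: float,
           fixed_cost: float) -> float:
    """Profit = units * (price - marginal cost) - fixed cost."""
    return units * (price - marginal_cost) - fixed_cost


# Software licence: marginal cost near zero, so volume covers fixed cost.
for units in (1_000, 100_000):
    print("software", units, profit(units, price=10.0, marginal_cost=0.0,
                                    fixed_cost=500_000.0))

# Tokens sold below marginal cost: more volume only deepens the loss.
for units in (1_000, 100_000):
    print("tokens", units, profit(units, price=10.0, marginal_cost=12.0,
                                  fixed_cost=500_000.0))
```

The sign of `price - marginal_cost` decides everything: if it is positive, volume eventually pays off the fixed costs; if it is negative, no amount of volume does.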
It's wild to think how efficient Internet services were prior to AI. The most expensive thing would probably have been something like encoding video. Now you have a substantial portion of a rack dedicated to a user in the case of something like Fable.
The best analogue we have is probably video streaming. Or maybe more so live streaming. Unless they are subscription-based and limited-time events, it seems those don't do well. Twitch has lost money for how long? And most smaller players seem propped up in other ways.
So if there is real cost involved, things start to look a lot worse and might not be overcome. To me, OpenAI is unlikely to be an exception.
But there is no indication they are losing money on tokens when R&D and other expenses are factored out? The margins on API are likely very high so the higher the volume the more likely they will be able to cover the other mostly fixed costs.
Also, what are they calling "R&D" exactly? If it is training new models, which needs to be done almost constantly and means spending billions on energy and newer GPUs, then it's not really R&D, but rather operating costs.
They gave up on video because three separate Chinese companies were kicking their ass (and for cheaper).
Google has a better image model in the majority of cases. Much faster, too.
Claude Opus and Fable are like a billion times better. It's not even funny. Codex can't do Rust at all.
What does that leave them? Ads in ChatGPT? I've started to just rely on Google search blended with Gemini answers now because it's faster and doesn't spit out a 20-page essay of useless effusive prose.
Open source models will eat them from the bottom.
Will those enterprise contracts be renewed in a market full of alternatives?
There's nothing sticky about this company.
They're making a necklace with Jony Ive though, I guess?
They still have the most recognized AI brand name and they are still the most popular LLM. For most users, a 10% diff between Claude and GPT isn't going to move the needle, plus it seems to be a horse race anyways. I think their user base is stickier than you would think. Still, it isn't as sticky as social media and it is cheaper to switch AIs than email accounts.
It's still dominant and a lot higher % wise if you count paying users. Gemini was integrated into Google search so it's not necessarily people using Gemini as their daily assistant.
So, just like Fable? You can shorten the thinking effort to tweak the "slow and expensive" part a little bit, but at the higher end being more meticulous than even Fable is actually a benefit.
Subscribed to Claude Opus for 2 months, with a few months gap between subscriptions to try different versions.
The UX/UI around Anthropic's products was excruciatingly annoying, right from the payment process, and Claude's AI was often hilariously dumb and "trying too hard", constantly full of "oops, you're right" backtracking and often borderline dangerous.
I tried Claude and ChatGPT Codex side by side on some tasks, with the same prompts. Each time, my confidence in Claude fell.
I've been subscribed to the $20 ChatGPT plan for more than 1 year, and this month, I am trying the $100 plan for 1 month.
ChatGPT Codex has actually been helpful and has made me productive enough that I can't imagine going back to coding without it.
I use LLMs more in the context of peer-reviewing and also came to a similar conclusion, gpt-5.5 codex xhigh reasoning seemed to catch more edge cases and went "deeper" into analysis than Opus 4.7/4.8.
My preliminary tests of Fable were pretty promising but that's DOA for everyone for now.
example.com is also great for that reason when something fails about a captive portal on a public WiFi.
I open my web browser and go to http://example.com and get redirected to the captive portal page again and retry completing what they need from me to get internet access.
Plus, it feels nice to depend on the reserved domain name example.com instead of relying on a domain that any one specific corporation has to maintain :D
What gives you confidence example.com won't start serving the HTTPS redirect though? There isn't any reason they wouldn't, and given that browsers are clearly tending towards showing big scary warnings to even accessing something over cleartext, I wouldn't be surprised if they flipped that switch just to avoid confusing noobs.
True, that could happen. If it does do that then I will have to switch over to remembering a different URL instead. But as long as it hasn’t I will keep using http://example.com :)
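For anyone who wants to script the same check, here is a minimal sketch. It requests http://example.com/ without following redirects and classifies what comes back. The `MARKER` string is an assumption about what the real page contains, and the category names are made up for this example.

```python
import http.client
from typing import Optional
from urllib.parse import urlsplit

HOST = "example.com"
MARKER = "Example Domain"  # assumption: text that appears on the real page


def classify(status: int, location: Optional[str], body: str) -> str:
    """Classify a plain-HTTP response from example.com."""
    if 300 <= status < 400 and location:
        target = urlsplit(location)
        if target.scheme == "https" and target.hostname in (HOST, "www." + HOST):
            return "https-upgrade"   # the site itself started redirecting
        return "captive-portal"      # redirected somewhere else
    if status == 200 and MARKER in body:
        return "online"
    return "intercepted"             # e.g. a portal page served in place


def probe(timeout: float = 5.0) -> str:
    """Fetch http://example.com/ without following redirects."""
    conn = http.client.HTTPConnection(HOST, timeout=timeout)
    try:
        conn.request("GET", "/")
        resp = conn.getresponse()
        body = resp.read(4096).decode("utf-8", errors="replace")
        return classify(resp.status, resp.getheader("Location"), body)
    finally:
        conn.close()
```

Calling `probe()` performs the network request. The "https-upgrade" branch covers the scenario raised above, where example.com itself starts redirecting to HTTPS and the trick stops working.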