Claude AI Finds Bugs In Microsoft CTO's 40-Year-Old Apple II Code - Slashdot
An anonymous reader quotes a report from The Register: AI can reverse engineer machine code and find vulnerabilities in ancient legacy architectures, says Microsoft Azure CTO Mark Russinovich, who used his own Apple II code from 40 years ago as an example. Russinovich wrote: "We are entering an era of automated, AI-accelerated vulnerability discovery that will be leveraged by both defenders and attackers."
In May 1986, Russinovich wrote a utility called Enhancer for the Apple II personal computer. The utility, written in 6502 machine language, added the ability to use a variable or BASIC expression for the destination of a GOTO, GOSUB, or RESTORE command, whereas without modification Applesoft BASIC would only accept a line number. Russinovich had Claude Opus 4.6, released early last month, look over the code. It decompiled the machine language and found several security issues, including a case of "silent incorrect behavior" where, if the destination line was not found, the program would set the pointer to the following line or past the end of the program, instead of reporting an error. The fix would be to check the carry flag, which is set if the line is not found, and branch to an error.
The existence of the vulnerability in Apple II type-in code has only amusement value, but the ability of AI to decompile embedded code and find vulnerabilities is a concern. "Billions of legacy microcontrollers exist globally, many likely running fragile or poorly audited firmware like this," said one comment to Russinovich's post.
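The "silent incorrect behavior" described above is easy to model. Below is a toy Python sketch of the line-lookup bug and its fix; all names are illustrative, since the real code is roughly 120 bytes of 6502 machine language and the actual fix is a carry-flag check after the search routine:

```python
# Toy model of the Enhancer bug. Applesoft stores a program as a sequence
# of numbered lines; GOTO/GOSUB/RESTORE search for the target line number.

def find_line_buggy(program, target):
    """Search for `target`; on failure, silently return an index past the
    end of the program instead of reporting an error (the bug)."""
    for i, (line_no, _stmt) in enumerate(program):
        if line_no == target:
            return i
    return len(program)  # silent incorrect behavior: no error raised

def find_line_fixed(program, target):
    """Same search, but mirroring the fix: the 6502 lookup sets the carry
    flag when the line is not found, and the fixed code branches to an
    error routine instead of continuing."""
    for i, (line_no, _stmt) in enumerate(program):
        if line_no == target:
            return i
    raise LookupError("?UNDEF'D STATEMENT ERROR")  # Applesoft's error message

program = [(10, 'PRINT "HELLO"'), (20, "GOTO 30")]
find_line_buggy(program, 30)   # returns 2, an index past the end of the program
```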
How very relevant. (Score: Interesting)
by nightflameauto (6607976) writes: on Tuesday March 10, 2026 @01:11PM (#66033568)
This will be extremely useful for the once every couple years I bust out my old IIgs. Pity we can't run the AI on the Apple hardware to find these vulnerabilities in code we don't use at all in production anymore. These are critical issues that must be addressed!
Re: by Junta (36770) writes:
Well the latter point may have more relevance, that a lot of embedded scenarios are like the Apple II scenario, never subjected to rigorous security review and largely banking on no one bothering to reverse engineer the closed source runtimes.
So this can shift the cost/benefit ratio to go look at some of those embedded applications and find ways to induce misbehavior. Depending on the scenario, the vendor is long gone or the design was never made to be field upgradeable. So you end up with known vulnerabilities...
Re: by Tarlus (1000874) writes:
There's a reason he cited Apple II as an example. You might be alarmed to learn how much legacy code - decades old - is still run on equally aged hardware in production. Perhaps not on Apple II, but a recent example I saw was a rack of PDP-9s still in control of machinery at an observatory.
Security Theater
by leptons (891340) writes: on Tuesday March 10, 2026 @01:13PM (#66033578)
"It decompiled the machine language and found several security issues"
Security issues on an Apple II? It's difficult to imagine what kind of "security" they think is possible on an Apple II.
Re: by Junta (36770) writes:
So, for the open ended general purpose of a platform without the concept of privilege separation, you are right, and that's realistically where Apple II sits.
But what if you had a similarly loose platform but it's running a kiosk and that kiosk software is purportedly designed to keep the user on acceptable rails. Then finding a way to break that kiosk software might be significant.
So I'll grant that the concept *could* map to real-world concerns, given how wild west a lot of embedded applications have been...
Good example of why it's wrong (Score: Insightful)
by DrYak (748999) writes: on Tuesday March 10, 2026 @02:19PM (#66033712)
But what if you had a similarly loose platform but it's running a kiosk and that kiosk software is purportedly designed to keep the user on acceptable rails.
There is a lot of leverage hidden in that "similarly".
Apple's computers run on the 6502.
This was an insanely popular architecture, used in metric shit tons of other hardware from roughly that era, and there are insane amounts of resources about it. It was usually programmed in assembly, there was a lot of patching of binaries back then, and these CPUs have been used in courses and training for a very long time, most of which are easy to come by. So there's an enormous amount of material about 6502 instructions, their binary encoding, and general debugging of software on that platform that could be gobbled up by the training of the model. The architecture is also extremely simple and straightforward, with very little weirdness. It could be possible for something that boils down to a "next word predictor" to not fumble too much.
Anything developed in the modern online era, where you would be interested in finding vulnerabilities, is going to be multiple orders of magnitude more complex (think multiple megabytes of firmware, not a 120-byte patch), rely on very weird architectures (a kiosk running on some x86 derivative? one of the later embedded architectures that uses multiple weird addressing modes?), and be very poorly documented.
Also combine this with the fact that we're very far into the "diminishing returns" part of AI development, where each minute improvement requires vastly more resources (insanely large datacenters, the power requirements of entire cities) and more training material than is available (so "Habsburg AI"?), and it's not going to get better easily.
The fact that a chat bot can find and fix a couple of grammar mistakes in a short paragraph of English doesn't mean it could generate an entire epic poem in some dead language like Etruscan (not Indo-European, not that many examples have survived, and even fewer Etruscan-Latin or -Greek bilingual texts have survived to assist understanding).
The fact that a chat bot successfully reverse engineered and debugged a 120-byte snippet of one of the most well-studied architectures doesn't mean it will easily debug the multi-megabyte firmware of some obscure proprietary microcontroller.
Re: by Junta (36770) writes:
You have a fair point that the selection of a 40 year old 6502 application is interesting, and likely driven by the reality that the LLMs fall apart with vaguely modern application complexity.
It may however help if someone identifies a small digestible chunk as security relevant and sets it about the task of dealing with it.
And complexity (Score: Informative)
by DrYak (748999) writes: on Tuesday March 10, 2026 @03:01PM (#66033810)
the selection of a 40 year old 6502 application is interesting,
Not even the application, just a 120 byte-long binary patch.
It may however help if someone identifies a small digestible chunk as security relevant and sets it about the task of dealing with it.
And that chunk doesn't have any weirdness that requires a seasoned, actual human reverse-engineer.
(Think segmented memory model on anything pre "_64" of the x86 family - the kind of madness that can kill Ghidra).
Also, if it's not from the 8-bit era or the very early 16-bit era, chances are high that this bit of machine code didn't start as hand-written assembler but as some higher-level compiled language (C most likely). It might be better to run Ghidra on it and have some future chatbot trained on making sense of that decompiled code.
In short, there are so many thousands of blockers that have been carefully avoided by going to that 40-year-old, 120-byte-long patch of 6502 binary.
Re: by gewalker (57809) writes:
There are many widely used libraries, and code trees within libraries, that are far from multi-megabyte. Looking for security vulnerabilities in such widely deployed code seems like a potentially useful source of zero-day exploits if you happen to be a bad actor with significant resources to look for such. Maybe few are as simple as a 128-byte patch, but there are lots of potentially juicy targets to throw at the AI.
Re:Security Theater (Score: Informative)
by allo (1728082) writes: on Tuesday March 10, 2026 @02:53PM (#66033794)
There are two points to it:
1) It can find security issues in machine language
2) It even can do this for Apple II
I am always confused why people don't understand proof of concepts. If you get Doom to run on your toaster, you are not looking for the best gaming platform, but are proving what you can do with the toaster hardware. If you find security bugs in Apple II binaries, you do not want to fix decades-old software, but to show that your tool understands decades-old binaries. In practice you then apply your skills to real-world problems that are (hopefully) simpler because you do not need to shave the last byte to fit things in the toaster's RAM.
Re: by leptons (891340) writes:
>I am always confused why people don't understand proof of concepts.
I'm not sure what about my comment made you think I don't understand "proof of concepts".
My comment was strictly about the phrasing "security issues" within a system that has no login prompt, and no concept of security to begin with. It's an absolute trash way to present the findings, framing it as "security issues". It's nonsense. The Apple II never had any security to begin with, none at all, zero...
Proof of Concept likes simplified cases
by drnb (2434720) writes:
There are two points to it:
1) It can find security issues in machine language
2) It even can do this for Apple II. I am always confused why people don't understand proof of concepts.
Excellent point, but this proof of concept works *because* it is Apple II. Extremely simple CPU and platform architectures. Proof of concept often uses simplified cases.
Re: by AmiMoJo (196126) writes:
To be fair though, 6502 machine code is much simpler than modern CPUs, so it's perhaps not the best test.
Re: by UnknowingFool (672806) writes:
The problem I think everyone here is trying to point out is how irrelevant it is to use Apple II code. It is very simplistic code used by machines that stopped selling decades ago and that did not have security designed into them. This is also Microsoft; you're telling me a company like Microsoft has no newer code they could have used for a "proof of concept"? There's no MS-DOS or Windows 3.11 or Windows 8 code to analyze? Or would analyzing it paint Microsoft in a bad light?
Re: by allo (1728082) writes:
I think the point of such a proof of concept is that the AIs are probably way better with x86 machine code than with Apple II code. There is already a lot of research that does similar things for x86_64 and other popular architectures, and there is way more training data for that. So the Apple II binaries are probably the bigger challenge.
Re: by vux984 (928602) writes:
I am always confused why people don't understand proof of concepts
It is like demonstrating a system can see toy boats through a 5mm sheet of slightly tinted glass and then talking about how the same tech will be able to help researchers find shipwrecks at the bottom of the ocean, after a century of decay, half buried by silt... from a satellite in space.
A proof of concept is a non-production demonstration that provides convincing evidence you'll be able to scale it up and do the ACTUAL thing in the real world that you claim it can do.
This demonstration just isn't convincing...
Re: by allo (1728082) writes:
Ever looked at the toy examples in image recognition? Most proofs of concept are on synthetic data or toy data, e.g., matching Waldo in a Where's Waldo image. You won't use the network later to spoil the fun of finding Waldo; you later fine-tune it on the objects you're looking for. But finding Waldo in a crowded image is a nicer demonstration than finding a ship in an image that seems to only contain sea.
Re: by vux984 (928602) writes:
Most proof of work are on synthetic data or toy data, e.g., matching Waldo in a where is waldo image. You won't use the network later to spoil the fun of finding Waldo, you later fine-tune it on the objects you're looking for.
The difference being that finding Waldo in a sea of faces that are almost but not quite Waldo (some with the right hat but no glasses, some with the striped shirt but no hat, etc.) is a lot more representative of the real problem.
It always starts with a synthetic or toy problem but, again, it's about selecting a good representative proof of concept for it to be convincing.
If you showed me the exact same Waldo image recognition system and demonstrated it finding Waldo on a blank page, it would in fact be the same...
Re: by drnb (2434720) writes:
Security issues on an Apple II? It's difficult to imagine what kind of "security" they think is possible on an Apple II.
Malware could upload your Apple Writer and VisiCalc files using a modem, the OS, application, and data files all being on the same disk.
:-)
Re: by leptons (891340) writes:
No amount of fixing bugs in any software, especially the one described in the article, is going to prevent any of that on an Apple II.
I just found their description of it as a "security issue" to be rather amusing.
Re: by arglebargle_xiv (2212710) writes:
The Apple II is actually incredibly secure. Sit a script kiddie in front of it and they'll say "what's that rattling whirring noise? Who removed the mouse? Where do I click to start an app? Is it that strange ']' icon?"
Re: by leptons (891340) writes:
It's funny that you think "script kiddie" is still a relevant thing in tech, but this is slashdot I guess.
Re: by strikethree (811449) writes:
It's difficult to imagine what kind of "security" they think is possible on an Apple II.
As a literal security professional, I can assure you that "availability" is one of the pillars of "success". In this case, security was "violated" by having the potential to fail catastrophically (Apple II computers could be permanently damaged if you "POKE"d into the wrong area) through causing the program to crash.
Security is simple and not simple at all. It is simple in that if you take a conservative view to everything, you generally don't have to think much; however, while the conservatism confers some
Re: by leptons (891340) writes:
Nothing you wrote matters. It's an Apple II, security was not a thing when they built this system.
Oh my god! (Score: Funny)
by Ossifer (703813) writes: on Tuesday March 10, 2026 @01:19PM (#66033582)
Why hasn't Apple released a security fix?! They've been sitting on this for DECADES!!!
Re: by sacrilicious (316896) writes:
Reminds me of an episode of Futurama where Fry runs into the room and breathlessly exclaims, "I got here as quickly as I could once I found out what happened a thousand years ago!"
Re: by Jeremi (14640) writes:
Apple IIs are all highly secure, thanks to their built-in air-gap firewall!
Mustn't be very effective finding Windows vulns
by doragasu (2717547) writes: on Tuesday March 10, 2026 @01:19PM (#66033584)
If they are wasting time on Apple II instead of fixing the hot mess Windows 11 is.
Re: by Junta (36770) writes:
No, it's busy making more of a mess of Windows 11.
Re: by Krishnoid (984597) writes:
I wonder if Dr. Mark Russinovich [digiater.nl] himself would be interested if the AI could identify the differences between Windows NT and Server [digiater.nl].
Re: by zlives (2009072) writes:
just use copilot, i am sure it will be very helpful.
Re: by unixisc (2429386) writes:
Is Windows 11 a hot mess due to security vulnerabilities, or is it a hot mess due to the enshittification of the platform for the benefit of Copilot? Such as removing WordPad, making Notepad a complex WordPad, making Paint less versatile...?
On the other hand, who put Claude up to finding bugs in that Apple II? Mark Russinovich?
Re: by gweihir (88907) writes:
Win11 is too complex for AI to fix anything in there. It can make things worse, though. "Code review" AI already starts failing on more complex teaching examples.
Re: by gweihir (88907) writes:
Idiot. Thanks for demonstrating that.
Hmmmmm.
by jd (1658) writes: on Tuesday March 10, 2026 @01:27PM (#66033592)
Whereas, if he'd used the software engineering techniques that were well-known and well-described at that time, he'd not have included the bugs in the first place. Or, if he had, he'd have detected them in testing.
I do not find it reassuring that a chief technology officer is pleased that he wasn't clever enough to write or test code correctly. What I do find is that I fully understand how he can be a CTO in an organisation notorious for defective software and even more defective bugfix releases.
Re: by Moridineas (213502) writes:
What a ridiculous take.
Russinovich is 59 years old this year (2026). 40 years ago he was 19 years old and in high school.
Are you really criticizing someone who wrote code with a bug (or really, incomplete error handling) as a teenager? That may be one of the most "terminally online" comments I've ever seen. Check out the guy's Wikipedia page. He's done some neat stuff.
Re: by ConceptJunkie (24823) writes:
Dude's got an impressive CV, no doubt. Using this to slam Microsoft is lame. He's written a ton of impressive code, literally using it as a CV to get a job at Microsoft (with Sysinternals, née Winternals).
Re: by jd (1658) writes:
I remember being 19. I was writing AI software for radionuclide analysis. A couple of years later, I was writing the data store for a particle accelerator.
I am not the best coder in the world. I would regard myself as adequate. But, by virtue of that, I will hold ALL 19-year-olds to the same standard. It is a perfectly reasonable, achievable standard. I know that because I achieved it and I am not the best.
Re: by Junta (36770) writes:
This was some little program a guy wrote at 20 years old that doesn't have any *real* reason to test for security (if you could run his code, you could just run whatever code you wanted anyway, it was a single user platform without any authentication or anything), and that should say anything one way or another about his capabilities as a 60 year old person?
Re: by jd (1658) writes:
You learn habits when you are young. You learn hubris when you are young.
The habit I learned from the start (around 8) was to assume I'd made mistakes, and therefore rigorously find them.
The habit he learned from the start was that if it isn't caught, it isn't a foul.
One of these two habits leads to a robust system. The other leads to the popular meme involving woodpeckers and civilisation.
Re: by Zak3056 (69287) writes:
You know, you usually have some really interesting things to say. I have you on my friend list so your comments get a +6 so I see what you have to say regardless of what people moderating think about your comments. I've been reading your journal entries for literally decades.
But, if I may paraphrase Bill Gates here, "this is the dumbest fucking thing I've read since I've been on Slashdot." You're suggesting that someone who hacked the BASIC interpreter on the Apple ][ forty years ago should have been using...
Re: by gweihir (88907) writes:
Indeed. Well said. The thing is a self-own, nothing else. Of course the AI fan idiots will not see it that way.
Re: by Moridineas (213502) writes:
Gweihir,
We've engaged in some back and forths, and I get your position. I do remain 50/50 as to whether this is an elaborate long-running troll or not. I usually think not, but this is such a horrifically bad take, it makes me lean back towards the troll angle.
I have been coding for a bit shy of 40 years. I started with GW-BASIC and over the years taught myself a lot. I studied computer science in college. My main career is not as a developer, but I write code almost every single day. I try to learn a new language...
Re: by gweihir (88907) writes:
Have you read the story and understood what it is supposed to claim?
Re: by Moridineas (213502) writes:
By story do you mean the Linkedin post and attached PDF that I linked for you? Yeah. Did you?
If you can explain how a very highly credentialed coder, using an AI tool to analyze a binary from 40 years ago, written when he was a teenager, and identifying an issue in the code, is a "self-own, nothing else," I'd love to hear it.
Occam's razor says you saw the word "AI" and started calling it bullshit immediately.
Re:Hmmmmm. (Score: Insightful)
by Jeremi (14640) writes: on Tuesday March 10, 2026 @06:11PM (#66034246)
I do not find it reassuring that a chief technology officer is pleased that he wasn't clever enough to write or test code correctly.
I was a shitty programmer once. So were you. So was every now-decent programmer, because being a shitty programmer and paying the price for making n00b mistakes is how one learns to become a good programmer. Nobody was born with the knowledge of how to apply all known best practices, and there's no shame in admitting it.
Re: by jd (1658) writes:
I was a careful programmer, which is why, at 24, I was writing data stores for CERN. At 18, I wrote nucleotide analysis software. It is bug-free. At 16, I had written a comprehensive meteorological database system. That did have a few bugs, but none were critical.
I have to go back to age 10-12 before I was making the mistakes he was making at 19, because I learned to test and test again. Yes, I actually blew computers up. Literally, whole sections of motherboard destroyed. During testing. I never released c...
Re: by Jeremi (14640) writes:
because I learned to test and test again
Yes, because you learned. Like everyone else; some more quickly than others.
Re: by strikethree (811449) writes:
I do not find it reassuring that a chief technology officer is pleased that he wasn't clever enough to write or test code correctly.
Bro, I understand your message, but ya gotta understand: We all start somewhere and we all make mistakes.
Shall we examine some of your earlier projects?
Not Copilot or OpenAI
by brunes69 (86786) writes: on Tuesday March 10, 2026 @01:39PM (#66033602)
Interesting he used Claude in this example. Very telling.
Re: by caseih (160668) writes:
Not really. Claude is considered the best for coding and analysis. The others are not too bad either, but not quite as good as Opus 4.6. So it's logical he'd use Claude for this personal experiment. If you think it's political you're adding that yourself. However it would be interesting to take the original Apple II buggy code and see if the other coding agents can find the same bugs.
Re: by caseih (160668) writes:
Other coding agents. Still no AI contextual awareness in Google keyboard... Maybe that's a good thing.
Anyway if he was willing to post his original, unfixed code to GitHub I would be interested to run opencode on it with a number of different models.
Re: by caseih (160668) writes:
I found his original code and I tried Opencode on it with OpenCode Zen Big Pickle, which is really a Chinese model called GLM. It did admirably. It disassembled the code and made some sort of sense out of it, but it definitely did not find the bugs.
On the other hand, Claude Opus failed too for me. It claimed there was a bug that would prevent the example usage code given in the article from even working at all, which is clearly false. It did work. So it missed the bugs that Russinovich found with his Claude...
Re: by caseih (160668) writes:
So after Opus 4.6 gave up, I finally gave it the list of bugs that Russinovich has in his post. After it was pointed out, my instance of Opus confirmed the DORESTORE missing line-not-found check bug and explained why it was a problem. However it disagreed with one of the other issues found by Russinovich's Opus instance. It said: "Token comparison logic bug - That other Opus instance was wrong here. The JMP $0314 goes to the CMP, not the LDA. The accumulator retains the token byte. It's corre...
Re: by Junta (36770) writes:
A lot of these stories include the final "look at the magical thing the LLM output" while conveniently skipping the "boring lead up" where they basically manually have to tell it what *not* to say before they get it to generate the thing they intended to. And if even that fails, they just skip writing the post.
Re: by ConceptJunkie (24823) writes:
I work for a Microsoft shop and use Copilot a lot. When I have a hard question, I use Claude Opus, otherwise ChatGPT is fine.
Re: by allo (1728082) writes:
I think you can't beat Claude Opus at such tasks with other models currently. But that comes at a price. Literally.
CVSS (Score: Funny)
by TWX (665546) writes: on Tuesday March 10, 2026 @01:55PM (#66033642)
So what's the CVSS score on this vulnerability?
Example vs Practical
by darkain (749283) writes: on Tuesday March 10, 2026 @01:57PM (#66033650)
I knew everyone would come in here to bash this example just because it is an old platform not in general availability anymore.
But take a step back and realize what that means. Less documentation. Less availability. Less general knowledge on how the platform works overall.
These tools can handle it. And yes, these tools are already being used on modern hardware too.
Also, most seem to be overlooking the "microcontroller" aspect of this: small microcontroller firmware files run our world. It is now becoming trivially easy to fully reverse engineer proprietary firmware on these things. But more so, beyond that, these tools are also working considerably well now on x86 and ARM code for modern systems.
There is a certain level of "security via obscurity" in the closed-source world, and that's now being blown wide open. This is it, this is the REAL story. But y'all are getting hung up on "OMG it's an old Apple system."
Re: by drinkypoo (153816) writes:
But take a step back and realize what that means. Less documentation. Less availability. Less general knowledge on how the platform works overall.
That's 100% backwards, except for the parts which are irrelevant, like availability. Being such an old processor there is more documentation, more commentary, it's very well known.
Re: by allo (1728082) writes:
Think larger. What about binary obfuscation techniques? A LLM can read the machine language and collect facts without getting frustrated by all the reverse-engineering traps developers may have put in there and slowly gets to the core of what the thing does. Things may become quite interesting soon.
Re: by telek83 (1350439) writes:
So I think the Azure CTO is getting his terminology mixed up.
Machine code = already assembled code.
Assembly = human-readable code.
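The two forms map one-to-one mechanically, which a minimal Python sketch can illustrate (limited to two real 6502 encodings, LDA immediate = $A9 and STA absolute = $8D; the table and function names are illustrative):

```python
# Minimal sketch of the machine-code vs. assembly distinction for two
# real 6502 instructions.
MNEMONICS = {
    0xA9: ("LDA", "imm"),   # load accumulator, 1 operand byte
    0x8D: ("STA", "abs"),   # store accumulator, 2 operand bytes (lo, hi)
}

def disassemble(code):
    """Turn raw bytes (machine code) back into mnemonics (assembly)."""
    out, i = [], 0
    while i < len(code):
        name, mode = MNEMONICS[code[i]]
        if mode == "imm":
            out.append(f"{name} #${code[i + 1]:02X}")
            i += 2
        else:  # absolute addressing: little-endian 16-bit address
            out.append(f"{name} ${code[i + 2]:02X}{code[i + 1]:02X}")
            i += 3
    return out

disassemble(bytes([0xA9, 0x01, 0x8D, 0x00, 0x02]))
# -> ['LDA #$01', 'STA $0200']
```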
I asked Copilot if any AI can currently decompile a program and the answer is no. AI can't take raw bytes and disassemble the code on its own; this requires reasoning, and AI only does pattern recognition.
Here is what probably happened: the CTO gave his assembly source to Claude and Claude saw the bug. There was no disassembly of this code, not by AI anyway.
Also the summary doesn't make...
Re: by allo (1728082) writes:
I find it hard to follow your reasoning.
Machine code and assembly are close. Machine code may be harder to digest for an LLM because its tokenizer is usually not well-suited for binary and because the training material is probably filtered for human-readable text. So it knows more assembly than binary.
"I asked Copilot" - Do you think that's the absolute source of truth for the latest research?
"AI can't take raw bytes and disassemble the code" - why should this be impossible?
"this requires reasoning" - No. Your...
Re: by telek83 (1350439) writes:
Assembly and machine code aren't close at all.
Assembly is the human mnemonic form of machine code.
Machine code is assembled/compiled code that isn't human readable.
So that means the LLM would not only have to disassemble the machine code to get at the code, but it would have to know exactly what it's disassembling. If I say "fix this binary blob, it's an Apple binary," it won't magically know how to do it. Because if that's the case, why stop at Apple binaries? Let AI disassemble older games and programs or even mode...
Re: by Junta (36770) writes:
I was swayed by another comment in this discussion that points out that, for whatever reason, his example is an LLM analysis of a single routine manifested as 120 bytes of machine code. The choice to use something so utterly short is enough to perhaps recalibrate expectations for practical use. It did spot a couple of real issues but mostly buried the user in a list of "I know already" about how the general environment is not exactly credibly secure at all. 75% of the 'findings' were just "Hey, Apple II...
Re: by Zak3056 (69287) writes:
Less documentation. Less availability. Less general knowledge on how the platform works overall.
Yes, the little-known 6502. What an obscure device [wikipedia.org]. Only like five billion of them were produced, definitely not a lot of documentation or knowledge out there.
Re: by strikethree (811449) writes:
Less documentation.
Say WHAT? Documentation was golden back then. We could even get schematics for the chips and motherboards. Nowadays, you can barely find manuals that explicitly cover CPU instructions and errata only... and you have to pay to get anywhere near the level of documentation that was provided freely originally.
Did it actually decompile it?
by FalcDot (1224920) writes:
I mean, did someone actually "see" the AI decompiling the code?
Or try to recompile whatever the AI claims was the result of the decompilation?
Or are we just accepting this AI's word that it decompiled the code and then found errors?
Re: by ccr (168366) writes:
These are exactly the kind of questions that came to my mind too.
Russinovich's post offers scarce details on how it was done. I would be interested if the "AI decompiled" code was compared to actual disassembler output to verify accuracy (or if the model used some external disassembler tool for it? Shrug.)
Re: by allo (1728082) writes:
Does it matter if it "decompiles" or reads the machine language without an intermediate step (even though one might suspect some kind of decompiled representation in the latents then)?
The point is the thing had the machine code (I guess some hex representation of it?) and understood it and found a bug.
Re:
Score:
by
caseih
( 160668 )
writes:
When I asked Claude Opus to disassemble the code and add comments, it did that, yes. I'd post it here (with comments) but it would trip the lame lameness filter. Ironic that posting code to slashdot is considered "junk."
That's kind of interesting to consider that the model is large enough to encode an assembler and disassembler in its parameter matrix.
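To make that concrete: a conventional table-driven 6502 disassembler is tiny, which gives a sense of the opcode-to-mnemonic mapping the model would have to reproduce, whether in its weights or via an external tool. The sketch below handles only a handful of real 6502 opcodes, and the byte sequence fed to it is a made-up example, not Russinovich's actual code:

```python
# Minimal table-driven 6502 disassembler sketch: a few real opcodes only.
# Table maps opcode byte -> (mnemonic, operand length in bytes, operand format).
OPCODES = {
    0xA9: ("LDA", 1, "#${0:02X}"),  # load accumulator, immediate
    0x20: ("JSR", 2, "${0:04X}"),   # jump to subroutine, absolute
    0xB0: ("BCS", 1, "*{0:+d}"),    # branch if carry set, PC-relative
    0x4C: ("JMP", 2, "${0:04X}"),   # jump, absolute
    0x60: ("RTS", 0, ""),           # return from subroutine
}

def disassemble(code: bytes) -> list[str]:
    out, pc = [], 0
    while pc < len(code):
        mnem, n, fmt = OPCODES[code[pc]]  # real tools cover all 256 slots
        operand = int.from_bytes(code[pc + 1 : pc + 1 + n], "little")
        if mnem == "BCS":                 # relative branches take signed offsets
            operand = operand - 256 if operand > 127 else operand
        out.append(f"{mnem} {fmt.format(operand)}".strip())
        pc += 1 + n
    return out

# Hypothetical byte sequence (LDA / JSR into ROM space / BCS / RTS):
print(disassemble(bytes([0xA9, 0x00, 0x20, 0x00, 0xD4, 0xB0, 0x03, 0x60])))
# → ['LDA #$00', 'JSR $D400', 'BCS *+3', 'RTS']
```

A real disassembler also has to handle all the 6502 addressing modes and distinguish code from inline data; comparing the model's claimed disassembly against the output of a conventional tool like this would answer the verification question raised above.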
This is a good thing
Score:
by
MpVpRb
( 1423381 )
writes:
AI companies have been approaching software in the wrong order.
The correct order would have been to design, test, and verify tools that could find bugs, edge cases, and security vulnerabilities first, then, once the bugfinder was mature, work on code generation.
Instead, what we got was "vibe coding" tools that let the clueless effortlessly create bloated, slow, inefficient, bug-ridden, insecure slop while the hypemongers proclaim "software engineering is dead."
firefox
Score:
by
groobly
( 6155920 )
writes:
Same story just broke about firefox.
Legitimate
Score:
by
TwistedGreen
( 80055 )
writes:
on Tuesday March 10, 2026 @03:46PM (#66033926)
Automating this kind of tedious work that nobody wants to do is one of the most legitimate hopes for a coding LLM. It's not replacing people because nobody would pay to do this anyways, even if you had an intern with nothing to do. The real test is whether it can do it on a more complex codebase with a modern language and not just inundate the user with false positives. We're not there yet.
There are bugs to be found in ALL 40-year-old code
Score:
by
Tony Isaac
( 1301187 )
writes:
on Tuesday March 10, 2026 @05:14PM (#66034124)
People stopped looking for bugs in code that old long ago. Many bugs back then weren't considered severe enough to worry about. A null reference was just an inconvenience, not a security threat. I mean, whatever you (the end user) did to get that null reference, stop doing that!
It's hard to imagine any old software, or for that matter, any software, that would hold up to this scrutiny.
Who cares?
Score:
by
gweihir
( 88907 )
writes:
This is beyond ridiculous. That code is historic and irrelevant. Nobody cares to look at it. If that is the great proof of performance they have for their thing, I can only conclude it is a toy.
Re:
Score:
by
Jeremi
( 14640 )
writes:
If that is the great proof of performance they have for their thing, I can only conclude it is a toy.
No need to conclude anything; try it for yourself. Take your best code, the code that you've been debugging and polishing for years, the code that you've shipped in a hundred releases already, the code that you've run through every static analyzer and runtime test harness you could get your hands on to try to ferret out any bugs, to the point where all of them returned "no further issues found". Dump that codebase into Claude Code (or whatever AI you think appropriate) and ask it to scan the codebase
Re:
Score:
by
gweihir
( 88907 )
writes:
My code is all KISS compliant and very well reviewed, and some of it has been running for decades. None of it has security functionality (because that gets handled at a different level in high-security applications). Hence it is totally unsuitable for this approach.
That said, is "try it yourself" also what you recommend when people are wondering about the effects of Cocaine?
Re:
Score:
by
Jeremi
( 14640 )
writes:
My code is all KISS compliant and very well reviewed and sometimes has been running for decades
Hey, mine too! And yet, there's always one more bug, isn't there? People who think their code is 100% bug free are kidding themselves.
That said, is "try it yourself" also what you recommend when people are wondering about the effects of Cocaine?
I mean, if they think it's worth exposing themselves to the health and legal risks, sure?
But I don't think asking an AI to scan the codebase of an open-source software library involves any risk at all; at best it helps you improve your codebase; at worst it comes back with nothing useful and you haven't lost anything. For closed-source code, OTOH, you'd risk your code gett
Fair enough, Apple II dev finding bug in AI Code
Score:
by
drnb
( 2434720 )
writes:
Fair enough, this former Apple II developer is finding bugs in AI generated code.
:-)
Fixed!
Score:
by
Dan East
( 318230 )
writes:
AI made the code fully type-safe, implemented buffer overflow checks, verified all parameters in and out, and the perfectly running result can't fit into the memory of an Apple II or onto a floppy disc...
(I just made that up, but I'm sure the code is much larger after adding all the security and boundary checks)
And was it even relevant while this was used?
Score:
by
NotAMarshallow
( 9040905 )
writes:
40 years ago, there was no internet. No one has cared about this for 30 years, so why waste your time, unless of course you love wasting time pointing out irrelevant issues?