One thing that's universally agreed upon, is the malloc() implementation in the ancient yet highly compatible MSVCRT.DLL that a lot of open source software uses, is incredibly slow. - Daniel Lemire (https://x.com/lemire/status/1837666489509220429)
This tweet by Daniel Lemire reminded me of a story circulated within Microsoft about MSVCRT.DLL in the 90s that may explain its ongoing poor performance. Remember, back then memory was so precious that saving physical memory by aliasing code pages was a key benefit of DLLs. No matter how many apps were using a DLL, only one copy of its code had to be present in physical memory. Because it was built into Microsoft’s Windows toolchain, making it available to every Windows application, MSVCRT.DLL especially benefited from this arrangement.
The idea was also that a DLL’s development was decoupled from that of its client applications. You could deliver performance benefits by upgrading the DLL “out from under” the application.
According to legend, Microsoft optimized small-malloc performance in its runtime and used this deployment mechanism to surreptitiously replace the MSVCRT.DLL used by almost every Windows application, including the AOL client. The update broke the AOL client, which was used by millions of people. Breaking it in the field meant they could no longer access the Internet! (Remember, this was the 1990s; most people accessed the Internet via dial-up.) Microsoft spun up a strike team to debug the issue and, lo and behold, the bug turned out to be in the AOL client: it was using memory after it had been freed. So Microsoft worked with AOL to find and fix the bug in their program.
The customer impact was so widespread that Bill Gates called a meeting to be briefed on the situation. As the story goes, the person presenting to Bill detailed how installing an application unrelated to the AOL client could cause this regression: any program that used the newer MSVCRT.DLL would ‘upgrade’ it in place for *all* programs that used MSVCRT.DLL. Then the presenter triumphantly said, “And the bug is in AOL’s client!”, intimating that AOL, not Microsoft, was responsible for the regression.
“They’ve already fixed the bug and made the new application available to their customers.”
Bill asked: “How are AOL’s customers going to get the updated app?”
The hapless presenter said, “It’s available for download from AOL’s Web site.”
And Bill asked: “How are they going to download the updated app if they can no longer access the Internet?” Mic drop.
Subtle dependencies on *behavior* as opposed to specification are why the landscape was called “DLL hell” and the solution, for Microsoft at least, was to create .NET manifests that specify the exact versions of all components.
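To make the "behavior versus specification" point concrete, here is a toy sketch of the use-after-free from the story. It is not the real MSVCRT.DLL allocator: `ToyHeap`, its `scribble_on_free` flag, and `buggy_client` are hypothetical names, and the story does not say what the optimized malloc actually changed. The assumption here is only that the newer allocator reuses freed bytes right away, which the specification allows.

```python
class ToyHeap:
    """A toy fixed-size-block heap; a stand-in for malloc/free, not the real CRT."""

    BLOCK = 8  # bytes per block

    def __init__(self, blocks, scribble_on_free):
        self.mem = bytearray(blocks * self.BLOCK)
        # pop() hands out block 0 first, then 1, and so on.
        self.free_list = list(range(blocks - 1, -1, -1))
        self.scribble_on_free = scribble_on_free

    def malloc(self):
        # A "pointer" is just an offset into self.mem.
        return self.free_list.pop() * self.BLOCK

    def free(self, ptr):
        if self.scribble_on_free:
            # Assumed "optimized" behavior: freed bytes are reused immediately
            # for the allocator's own bookkeeping.
            self.mem[ptr:ptr + self.BLOCK] = b"\xdd" * self.BLOCK
        self.free_list.append(ptr // self.BLOCK)


def buggy_client(heap):
    """Writes a value, frees the block, then reads it anyway (use after free)."""
    ptr = heap.malloc()
    heap.mem[ptr:ptr + 5] = b"hello"
    heap.free(ptr)
    return bytes(heap.mem[ptr:ptr + 5])  # the bug: ptr is no longer ours


old = buggy_client(ToyHeap(4, scribble_on_free=False))
new = buggy_client(ToyHeap(4, scribble_on_free=True))
print(old)  # b'hello'
print(new)  # b'\xdd\xdd\xdd\xdd\xdd'
```

The client is equally wrong in both runs. It only appears to work with the old heap because that heap happens to leave freed bytes alone, which is exactly the kind of unspecified behavior an in-place DLL upgrade can take away.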
But this love affair with self-describing code would become a monster in its own right. Enter .NET Reflection, which would yield concepts like WS-* and Remoting. Customers would appropriately use synchronized calls described in a contract, or a manifest, as the generator would blindly crawl a MarshalByRefObject. Alas! One of the methods would expose a lock(). Inside this lock the blind would lead the blind, calling a callback that blindly called back into the lock… creating a circular *distributed* deadlock. Who is at fault when there are two servers? It is the client, you say… but in a distributed system one must always fault the service, not the client. But aren’t they both clients, you muse with incest in mind. Well, a callback is most certainly a client behavior, you plead! But no, my friend: it is a server receiving an event notification… through a subscription!
But we obeyed the contract! So how could we go wrong!
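A minimal sketch of that lock-plus-callback cycle, under loud assumptions: `Service`, `update`, and `subscriber` are hypothetical names, everything runs in one process rather than across two servers, and a lock timeout stands in for "blocked forever" so the example terminates.

```python
import threading


class Service:
    """Hypothetical service that holds a lock while notifying its subscriber."""

    def __init__(self):
        self._lock = threading.Lock()  # non-reentrant
        self.subscriber = None

    def update(self, timeout=0.2):
        # The timeout is only here so the sketch ends; a real blocking
        # acquire would wait forever at this point.
        if not self._lock.acquire(timeout=timeout):
            return "deadlock"
        try:
            if self.subscriber is not None:
                return self.subscriber()  # callback runs while the lock is held
            return "ok"
        finally:
            self._lock.release()


a, b = Service(), Service()
a.subscriber = b.update  # A notifies B...
b.subscriber = a.update  # ...and B's handler calls straight back into A

result = a.update()
print(result)  # deadlock
```

Each service obeys its own contract: take the lock, do the work, notify the subscriber. The cycle only appears once the two subscriptions are wired together, which no single contract describes.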
Contracts are a good intention. What’s the mechanism? In a world where CrowdStrike can still prevent me from getting treatment at a hospital on time and supply chain attacks are discovered every day, I just feel like we have not solved this problem.
When entire nations exploit these manifests to gain an information advantage, I still feel Microsoft - and dare I say even Apple - have not managed to solve this critical user experience and security problem. Perhaps it can never be solved and we must always just depend on trusting… a manifest (now with a cryptographic signature, so we rrreeeeeaaaalllllyyyy trust it!).
Good intentions and a road to Satan I read somewhere… maybe it was at Amazon. ;-)
I couldn’t help it, Nick. Sid’s manifest. Give me a ring: srao at positron networks dot com. Three one seven 478 three six three 4. I’m sure an LLM can decipher that one. ;-)
A classic story that highlights that a dependency’s failure is not just theirs but yours too, because your customers are the ones impacted. It also shows why someone like Bill Gates was so successful at what he did. Thanks for sharing.