Asked on MacRumors.com by @Yebubbleman
“The M1 has been lauded, by tech journalists and users alike, as an industry game changer for personal computing. I’m curious: Do you agree with that statement? If so, how do you think the personal computing industry has been forever changed by the advent of the M1 and Apple Silicon Macs at large?
Personally, I do think it was a fantastic move for Apple and I would argue that it’s a game-changer for the Mac. But I don’t see the personal computer industry changing or adopting similar strategies to this as a result of Apple doing it. Do I think we’ll see more SoCs in non-Mac personal computers? Abso-friggin-lutely. But we’re never going to see a computer maker own the entire hardware and software stack the way Apple now does with Apple Silicon Macs like the ones we now have with M1. Microsoft may have an SQ1 or SQ2 for the Surface Pro X, but that thing is a Qualcomm SoC. Samsung makes SoCs for its phones and tablets, but Samsung isn’t Samsung’s only customer for those SoCs. And while they do have their own version of Android (albeit one of the worst ones out there), it’s not their OS underneath it all! I think Microsoft and Samsung have the best chance of trying to follow Apple on something like this. Maybe NVIDIA, now that they own ARM Holdings. But I think any one of the three of them doing it would take so much time to catch up to Apple. So, no, I don’t think it’s an “Industry Game Changer”; though I do think it’s a massive game changer for the Mac itself. What say you all on this? Do you think the personal computing industry will forever be changed by this? And if so, how and when?”
This is an interesting question, and of course any answer is purely speculative. The mainstream PC market is a very conservative place. Change comes slowly and compatibility is everything. For corporations, compatibility means replacing their PC fleet on a regular cadence while changing system software as little as possible and never having to update their customized line-of-business applications. Consumers are a little more flexible, but the market for mainstream x86_64 PCs is a declining one. Users in that space are as likely to abandon a home PC and replace it with a phone or tablet as to upgrade to a new notebook. So, given that Apple has a revolutionary Mac processor, does that make a difference to the majority of the market?
If PC buyers decide that notebooks with real all-day battery life in ultrabook-style designs are important, then I think the M1 and its follow-ons will have an impact. Getting tablet-like battery life out of a PC is new territory, despite what various PC brands' marketing has been promising for the last few years. Combining high performance with long battery life hasn't really been possible in small, light notebooks. The M1 is a game changer in that when you need performance, you get some of the fastest in the market, yet you still get long, all-day battery life. In my experiments with an M1 MacBook Air, I can do CPU-intensive work for hours a day on battery and still get 10-12 hours of total battery life. Given the current Intel and AMD x86_64 designs in small, light notebooks, competing with the M1 Macs on both performance and battery life is going to be extremely difficult, if not impossible.
There are a couple of reasons for this, but the main one is that the x86 CPU architecture is old and encumbered with a bloated ISA, or instruction set architecture (the opcodes and native machine codes that are the very lowest level of execution in a CPU). For compatibility reasons, that ISA can't realistically be improved. The market demands that x86 code from years, even decades, past continue to run. Apple has never had to contend with such a requirement on the Mac. They modify and replace their base hardware and software as their and their customers' needs evolve. For Intel, using an ancient design hasn't hindered their business.
Intel, up until recently, has had the revenue to fund the R & D in CPU manufacturing needed to stay ahead of the competition. They've had tremendous success in updating their CPU designs to match the simpler designs of more modern ISAs, mostly because their process technology for making smaller and smaller transistors on a silicon die led the industry for decades. But Intel has stumbled. They no longer lead the industry and have allowed their arch-rival AMD to catch up and surpass them on process via TSMC's 7 nm and, soon for AMD, 5 nm fabrication plants.
AMD, however, is still using the same old, bloated x86_64 instruction set as Intel (the 64-bit extensions were actually invented by AMD, which named them AMD64). If it were still just a horse race between Intel and AMD, the PC market would probably continue apace with the two jockeying back and forth for the top spot. With Apple now in the mix, that changes. Apple is showing the PC world that a better architecture can deliver a tremendous improvement over the current x86 designs with very few compromises. Apple's code translation technology, Rosetta 2, is proof that even radical architecture changes don't have to impact compatibility much at all. Microsoft has similar technology in Windows on ARM for translating 32-bit x86 binaries (and perhaps soon 64-bit ones) to native Arm64 code.
To understand why an old ISA is a problem, and why it hasn't been much of an issue until now, a little computer architecture history is needed. When Intel invented the microprocessor, they had a very limited number of transistors to work with, so they discarded much of what the larger computer industry had learned over the previous decades about CPU design. Early microprocessors were relatively simple devices that executed instructions one at a time, in the order the programmer or compiler issued them. Adding complex instructions that reduced the number of instructions needed for a task seemed like a good idea: it had no performance cost beyond using up the increasingly available transistors, and chip manufacturers' marketing departments could tout the new instructions as a benefit to their customers.
This didn’t become much of a problem until the early 90s, when microprocessor CPUs started becoming much more sophisticated in how they executed instructions. A few CPU designers had already noticed that it was easier to create faster CPUs with simpler instructions; Acorn RISC Machines, known as ARM, was a notable example. Now the big players like Intel, Motorola, and IBM took notice and tried to solve the problem in differing ways. IBM, along with Motorola and partner Apple, decided to take the RISC path as well and create a simpler ISA to get the kind of advantage ARM was seeing. Intel took a different approach: they used their superior process technology to ramp up the number of available transistors and create increasingly complex designs while keeping their ISA mostly intact, thus maintaining the all-important x86 compatibility. Intel kept their complex instructions but, internally to the CPU, decoded these complex opcodes into simpler, RISC-like operations for execution, hidden from the programmer.
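As a rough illustration of that decode step (the mnemonics and cracking rules here are invented for the sketch, not taken from any real CPU), a complex instruction that operates directly on memory can be "cracked" into a few simpler, RISC-like micro-ops:

```python
def crack(instruction):
    """Translate one CISC-style instruction into RISC-like micro-ops.

    The instruction format here is hypothetical: "ADDM [addr] reg"
    stands in for a read-modify-write-memory instruction.
    """
    op, *args = instruction.split()
    if op == "ADDM":
        addr, reg = args
        return [
            f"LOAD tmp, {addr}",     # fetch the memory operand into a register
            f"ADD tmp, tmp, {reg}",  # do the arithmetic in registers
            f"STORE {addr}, tmp",    # write the result back to memory
        ]
    return [instruction]  # simple instructions pass through unchanged

print(crack("ADDM [0x1000] r1"))
# ['LOAD tmp, [0x1000]', 'ADD tmp, tmp, r1', 'STORE [0x1000], tmp']
```

The programmer-visible ISA stays the same; only the internal execution changes, which is exactly why this approach preserved x86 compatibility.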
Over time, it became clear that Intel’s superior process technology and compatibility-first approach with x86 had won the battle. The benefit of decoding a simpler ISA in a RISC CPU was swamped by the number of transistors Intel had at their disposal to build more sophisticated decode stages and transform their instructions into RISC-like operations for execution. The RISC CPU suppliers in the PC market fell by the wayside, and eventually even Apple had to abandon the PowerPC-based Power Mac and switch to Intel. AMD’s x86 kept pace with Intel for a while in the late 90s and early 2000s, but eventually they too fell to Intel’s superior process technology. For the next 20 years or so, Intel stayed on top of the PC CPU market, though they did have to adopt AMD’s 64-bit ISA extensions along the way. With the decline of AMD, Intel became more monopolistic and radically slowed its R & D pace. CPU performance improvements slowed and generational improvements to battery life became anemic.
This leads up to the recent Apple M1 announcements. Apple has been designing low-power CPUs with relatively high performance for a decade using an Arm ISA. Today’s Arm is not quite the same company as the Acorn RISC Machines (ARM) described above, though it is of that company’s lineage. The current company, Arm Holdings, was created in the early 90s as Advanced RISC Machines Ltd. to build the low-power CPUs needed by Apple and others for small handheld computers and Personal Digital Assistants (PDAs) like the Apple Newton. Starting from the ARM instruction set and low-power CPU designs, Arm reset the expectations for low-power performance. While Apple eventually abandoned the Newton and sold its Arm stock, it continued to use Arm CPUs in the iPod lineup.
With the iPhone, Apple needed a low-power CPU with enough performance to run a simplified version of the OS X operating system, which would make developing application software for the phone easier. At the time, given Apple’s success and experience with the iPod, using an Arm CPU was an obvious solution. They even approached Intel about using Intel’s long-neglected XScale Arm CPU (born of the StrongARM CPU of Apple Newton fame), but Intel wasn’t interested in selling anything that wasn’t x86-based.
First with the iPod and then, even more importantly, the iPhone, Apple’s revenue has grown to overwhelm Intel’s, giving Apple nearly unlimited R & D dollars to spend. For the first time, a company that designs CPUs for itself can out-compete Intel in silicon design. The consequence is that Apple has done something that eluded IBM and the other RISC designers of the past: they can take complete advantage of the simplified instruction-decode logic in their RISC CPU design. Intel and AMD, in their most modern CPUs, can usually decode 3-5 instructions simultaneously. When they try to add more decode logic, they hit diminishing returns that don’t speed up their CPUs but use a lot of their transistor budget. Apple has discovered that with their simplified RISC ISA they can currently decode up to 8 instructions simultaneously and, unlike the older CISC designs, actually get the benefit of that wider decode. This is a new thing in the industry and a surprise given the accepted wisdom that RISC vs. CISC didn’t much matter anymore. It turns out that with a more intelligent design and an equal transistor budget, it really does matter.
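A toy model (an illustration of the boundary-finding problem, not of real decoder hardware) shows why fixed-length instructions make wide decode easier. With fixed 4-byte Arm64-style encoding, the positions of the next eight instructions are pure arithmetic; with variable-length x86-style encoding, each instruction's start depends on having already decoded the one before it:

```python
def fixed_length_boundaries(start, width=8, size=4):
    # All boundaries are computable at once: just arithmetic,
    # so eight decoders can start in parallel.
    return [start + i * size for i in range(width)]

def variable_length_boundaries(code, start, width=8):
    # Must walk serially: in this toy encoding, the first byte of
    # each instruction holds that instruction's length in bytes.
    boundaries, pos = [], start
    for _ in range(width):
        boundaries.append(pos)
        pos += code[pos]
    return boundaries

# Build a toy variable-length instruction stream from a list of lengths.
lengths = [3, 1, 5, 2, 4, 1, 2, 3]
stream = [0] * sum(lengths)
pos = 0
for n in lengths:
    stream[pos] = n
    pos += n

print(fixed_length_boundaries(0))             # [0, 4, 8, 12, 16, 20, 24, 28]
print(variable_length_boundaries(stream, 0))  # [0, 3, 4, 9, 11, 15, 16, 18]
```

Real x86 decoders use predecode tricks and caches of already-decoded micro-ops to work around this serial dependency, but the dependency itself is inherent to the variable-length encoding.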
Starting with this 8-wide decode, Apple has added other complementary designs to the M1, including much larger internal cache memories and the ability to reorder incoming instructions for execution to a much higher degree than is possible with the x86_64 architecture, to create a CPU that is at the top of the industry in instructions executed per clock cycle, or IPC. This allows Apple to keep the frequency of the CPU in the low 3 GHz range, which saves power compared with designs that peak at 4.5-5 GHz, while still getting better overall performance from each CPU core. This breakthrough is how Apple’s M1 gets tremendous CPU performance per watt compared to its peers.
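Some back-of-the-envelope arithmetic makes the trade-off concrete. The IPC figures, frequencies, and power model below are illustrative assumptions, not measurements: per-core throughput is roughly IPC times clock frequency, while dynamic power scales roughly with capacitance times voltage squared times frequency, and since voltage must rise with frequency, power grows much faster than linearly with clock speed:

```python
def perf(ipc, ghz):
    # Relative per-core throughput: instructions per clock x clocks per second.
    return ipc * ghz

def dynamic_power(ghz, base_ghz=3.0, base_watts=1.0):
    # Crude model: if voltage scales with frequency, dynamic power
    # (~ C * V^2 * f) grows roughly as f^3. Normalized units.
    return base_watts * (ghz / base_ghz) ** 3

# Hypothetical wide-decode core at 3.2 GHz vs. a narrower core at 5.0 GHz.
wide = perf(ipc=8, ghz=3.2)
narrow = perf(ipc=5, ghz=5.0)
print(wide, narrow)  # comparable throughput from the slower-clocked core
print(dynamic_power(3.2) / dynamic_power(5.0))  # at roughly a quarter of the power
```

Under this model, a high-IPC core in the low-3 GHz range matches a 5 GHz core's throughput while using a fraction of the dynamic power, which is the shape of the M1's performance-per-watt advantage.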
So the secret to Apple’s new silicon is that it uses a modern ISA, a modern CPU architecture, and other tweaks to sustain high performance with the least battery usage in the industry. Will this make a difference in the overall market? Is it an industry game changer? I think it depends on Intel and AMD and, to a lesser extent, Microsoft.
Let’s start with Microsoft. Microsoft has a version of Windows that runs on Arm CPUs. They have both their own notebook designs using Arm CPUs and limited support from PC OEMs. These Arm notebooks haven’t been particularly successful, mostly because they underperform and require additional performance-sapping translation software, similar to Apple’s Rosetta 2, to run 32-bit x86 Windows software (Win32). They have some advantages, like longer battery life and integrated LTE wide-area networking, but their price vs. performance hasn’t really met the needs of the PC market. The main problem is that the Arm CPUs used are low power but pretty anemic in performance. Apple’s entry into the Arm notebook space changes this, but it is up to Microsoft to allow Windows on Arm to run on M1 Macs in virtual machine environments. If Microsoft allows it, Apple can show the Windows PC market that Arm notebooks do not have to be low-cost, slow, and underspecified to be successful.
If Microsoft allows Windows on Arm in an M1 VM, then a new class of Windows PC might rise up to challenge Apple for the high-performance, low-power, long-battery-life crown. Right now, the Windows PC market has nothing to compete with Apple in that notebook segment. Even PCs with faster processors than the new M1 effectively can’t compete, no matter how large their chassis and batteries are. Does the overall PC market care about this? It is unclear, but I think it will matter over time, creating a market that can’t be completely served by Apple’s M1 notebooks alone.
If an Arm PC market does grow and starts to outperform the equivalent Intel and AMD PCs, what does that mean for those companies, assuming they are not the ones making the new, high-efficiency, high-performance Arm CPUs? To compete, Intel will have to change their architectural approach for the first time in decades. I doubt that Intel or AMD can create an x86_64 design that reproduces what Apple has done with the M1 while keeping up with what is likely to be a steady stream of Apple Silicon improvements. AMD is more likely to simply switch to creating Arm CPUs, though they could also just copy whatever Intel does. One thing Apple has shown is that with a considered design, you can translate between different instruction sets effectively. Apple’s Rosetta 2 carries a 20% to 40% performance cost relative to native M1 code. For the vast majority of business applications, and more importantly the old customized line-of-business applications that corporations depend on, that performance hit is negligible given the speed of the M1 processors.
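The arithmetic behind that claim is simple. Using hypothetical scores in arbitrary units (not benchmark results), a 20% to 40% translation overhead still leaves most of the native performance on the table:

```python
def translated_score(native_score, overhead):
    # Performance of translated code as a fraction of native performance.
    return native_score * (1.0 - overhead)

native = 100.0  # hypothetical native M1-class score, arbitrary units
for overhead in (0.20, 0.40):
    print(f"{overhead:.0%} overhead -> {translated_score(native, overhead):.0f}")
# 20% overhead -> 80
# 40% overhead -> 60
```

If the native chip is fast enough that even 60% of its performance exceeds what an aging corporate desktop delivers, the translation penalty is invisible to those users.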
What Intel and AMD can do is create a new, lightweight x86_64-like architecture that uses modern RISC design principles. Make the instruction length fixed, like Arm64. Switch from strong memory ordering to weak memory ordering, which speeds up low-level operating system tasks. Eliminate little-used complex instructions and complex memory-addressing modes. For x86 compatibility, existing binaries can be handled with Rosetta-style translation while giving a path forward for more advanced, more efficient native code. And they have an inherent advantage over Apple: they’ve already built hardware that does much of this translation in their CPU decode logic, having studied extensively how to speed up x86 code in their current x86_64 CPUs.
Apple has shown the way forward. I wonder whether Intel and AMD can see the path.