The Business & Technology Network
Helping Business Interpret and Use Technology

Lunar Lake deep-dive: Intel’s new laptop CPU is radically different

DATE POSTED: June 3, 2024

Intel launches Lunar Lake, its next entrant into its Core Ultra series of laptop processors, today at Computex, ushering in a new generation of AI-infused Copilot+ PCs that have been initially overshadowed by Qualcomm.

Stop us if you’ve heard this before: Intel is prioritizing low power, perhaps feeling the pressure from Qualcomm’s just-launched Snapdragon X Elite. Several tweaks to Lunar Lake’s design, however, resulted in power savings and performance boosts, including shifting all of the E-cores to a low-power architecture. The Xe2 GPU at the heart of Intel’s “Battlemage” is here. Oh, and hyperthreading? Gone.

But there’s a fairly major change that affects you, a potential laptop buyer: Intel is embedding the DRAM onto the chip package. Yes, the PC’s memory. For now, if you buy a Lunar Lake laptop, you’ll have a choice between 16GB and 32GB of DRAM, but with no option to upgrade it later.

Yes, there’s AI. As chief executive Pat Gelsinger said at Intel’s Computex keynote: “Every device will be an AI device. Every company will be an AI company.”

We’re diving deep into Lunar Lake in this story, so feel free to jump ahead to the section you’re interested in. We’d expect Intel to eventually market Lunar Lake as the Intel Core Ultra Series 2, the unofficial 15th-gen Core chip.

Image: Intel’s Lunar Lake chip. (Credit: Intel)

Lunar Lake: Made in Taiwan?

First, let’s be clear: Though Intel announced Lunar Lake at Computex, this isn’t a product yet. Intel is working with early production steppings, but Lunar Lake (and presumably laptops) won’t ship until sometime in the third quarter.

IFA, the Berlin trade show that begins Sept. 6, is the projected launch venue, sources at notebook vendors say. Arrow Lake, the next iteration of Intel’s desktop processors (and possibly mobile chips for gaming laptops), is also due this year and could launch around IFA, too.

Image: Intel Lunar Lake tiles. (Credit: Intel)

Intel’s Meteor Lake was a relatively complex chip with multiple tiles; Lunar Lake is a simpler design. There are four tiles, but only two matter: a compute tile (fabricated on TSMC’s 3nm-class N3B process) and a platform controller tile (on TSMC N6, an older 7nm-class process). There is also a “filler” tile, a structural “blank” piece of silicon that’s just there to fill out the remainder of the chip and keep it from bending. It’s all mounted over a passive interposer, the “base” tile, which provides interconnections between the chips.

That’s a significant change: Intel had always targeted Lunar Lake as the first of the “angstrom” generation, fabricated on its 18A process. Meteor Lake was the first time that Intel mixed and matched tiles from its own fabs as well as TSMC. The key there, though, was that the compute tile was manufactured on Intel’s Intel 4 process, as it originally promised. With Lunar Lake, only the base tile is manufactured at Intel, according to executives, though Intel handles the assembly.

“You’ve probably heard my boss Pat [Gelsinger, Intel’s CEO] talk a little bit about 18A and we’re on track to fully utilize this process,” said Michelle Johnston Holthaus, executive vice president and general manager of the Client Computing Group at Intel. “We’re going to market on B0 silicon and we’re on track to be in production in [the third quarter] of this year.”

Following Apple: On-package memory

When you buy a laptop, a PC maker will install memory: sometimes soldered on, sometimes with slots that allow more memory to be added in the future. Now, Lunar Lake puts that memory within the chip package itself.

Apple has most recently been known for adding on-package memory with its M3-based Macs (with up to 128GB of unified memory) and the M4-based iPad, which follows suit. Now Intel is joining the crowd. Lunar Lake will mount either 16GB or 32GB of LPDDR5X memory (at up to 8.5 gigatransfers per second per chip, in two ranks), saving up to 250 sq. mm on the motherboard.

“I said, how do we build the best thin-and-light PC, and memory on package with our customers was by far the desired first step,” said Jim Johnson, senior vice president of the Client Computing Group and general manager of the Client Business Group at Intel, in an interview.

Image: Intel Lunar Lake memory on package. (Credit: Intel)

“The technical part is that we want to have an exquisite notebook that will take on ecosystem competitors,” Johnson added. “And that’s what we built. And we think 16[GB] and 32[GB] is the right matchup and yes, it’s not upgradable beyond that, but this is the cornerstone of our architecture moving forward and we will offer those options in the future.”

If you don’t like the idea of not being able to upgrade your memory, or if you want more memory configurations, it sounds like they might be coming. “I would just say that the next turn of the roadmaps are going to offer more traditional options,” Johnson said, which other Intel executives said referred to Lunar Lake’s successor, Panther Lake.

Low-power DDR DRAM needs to be soldered as close to the CPU as possible, so Intel’s decision makes sense, or it would if it weren’t for the recent introduction of LPCAMM2, an upgradable module that actually allows you to replace the memory, too.

Lunar Lake’s E-cores are all low power now

Intel’s Lunar Lake makes two major changes to the CPU designs that you’re familiar with. First, Lunar Lake drops the separate low-power E-cores that its predecessor, Meteor Lake, shipped with: all of the “Skymont” efficiency cores are essentially low-power E-cores, period.

But there’s a bigger twist: hyperthreading has been completely disabled across the board. All cores simply have a single thread associated with them for performance reasons. Even the performance cores, known as “Lion Cove,” are single-threaded. More on that later.

Image: Intel’s Skymont E-cores offer substantive performance and power gains over Meteor Lake, Intel says. (Credit: Intel)

Lunar Lake has four E-cores and four P-cores. Stephen Robinson, an Intel fellow and the lead architect for the new Skymont E-core, explained that at least for this generation, the E-cores should be thought of as a “brick,” which implies that Lunar Lake products will have blocks of four E-cores each — so a Lunar Lake chip with six E-cores sounds highly unlikely.

Lunar Lake’s E-core has a number of substantial architectural enhancements, including wider decode and out-of-order engines and a 4MB level-2 cache shared among all four cores, and the resulting performance improvement is startling.

Lunar Lake’s E-cores make the now-familiar tradeoff: they can either be run at lower power or at substantially higher performance for the same power. Here, the low-power cores can either be run at one-third the power of Meteor Lake’s E-cores, or else offer a substantial 1.7X performance improvement.

Image: Intel is even claiming that its E-cores outperform the 13th-gen Core’s performance CPU, Raptor Cove. (Credit: Intel)

At peak load, Lunar Lake’s E-core performance is basically double that of Meteor Lake, Robinson said. In multithreaded workloads (where the four E-cores in Lunar Lake double the two low-power E-cores in Meteor Lake), performance reaches 2.9X, or 4X at peak clock speeds.

If put in a desktop compute tile, the Skymont E-cores actually outperform Raptor Cove, the 13th-gen Core’s performance core, by about 2 percent in both fixed-point and floating-point operations, with some variation. Lunar Lake is not a desktop architecture, however. Instead, that’s a tip that may point to how the next-gen Intel desktop chip, Arrow Lake, performs.

Intel is not saying how fast Lunar Lake will be clocked, unfortunately. For now, it’s just talking about the design of the chip itself.

Intel Thread Director gives Windows more control

Intel’s Thread Director has thankfully been simplified within Lunar Lake, too. Thread Director interacts with the Windows operating system, determining which tasks are sent to which cores, and when. On Lunar Lake, it’s simple: tasks are assigned to the E-cores first. If the E-cores are full or the workload exceeds their capabilities, then tasks are routed to the P-cores.

As you might expect, there is a wrinkle: the creation of “OS containment zones.” Users have been asking for years for controls to specify playing a game, for example, on all of the chip’s P-cores. It’s not quite clear whether users will be granted this sort of specificity, but the OS will. For example, Microsoft Teams has been granted an OS containment zone so that the app will run only on the E-cores, and won’t touch a P-core, according to a presentation by Rajshree Chabukswar, an Intel fellow.

As a result, Teams power was cut by 35 percent, Chabukswar said.
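The E-cores-first policy and the containment-zone idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Intel’s or Microsoft’s scheduler: the names `Core`, `lunar_lake_cores`, `place_task`, `needs_p_core`, and `containment` are all hypothetical, and a real scheduler weighs far more than whether a core is busy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Core:
    name: str
    kind: str          # "E" or "P"
    busy: bool = False


def lunar_lake_cores() -> list:
    """Four E-cores and four P-cores, the core counts given in the article."""
    return ([Core(f"E{i}", "E") for i in range(4)]
            + [Core(f"P{i}", "P") for i in range(4)])


def place_task(cores: list, needs_p_core: bool = False,
               containment: Optional[str] = None) -> Optional[Core]:
    """Pick a free core: E-cores first, P-cores as overflow.

    containment (hypothetical stand-in for an OS containment zone):
    if set to "E" or "P", the task may only run on that core type.
    """
    if containment is not None:
        preference = [containment]
    elif needs_p_core:
        preference = ["P"]          # workload exceeds what the E-cores can do
    else:
        preference = ["E", "P"]     # default: fill E-cores, then spill over
    for kind in preference:
        for core in cores:
            if core.kind == kind and not core.busy:
                core.busy = True
                return core
    return None                     # nothing free inside the allowed set


if __name__ == "__main__":
    cores = lunar_lake_cores()
    print([place_task(cores).name for _ in range(5)])
```

In this toy model, the fifth task lands on a P-core only because the four E-cores are already occupied, while a task contained to the E-cores simply waits rather than waking a P-core.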

Lunar Lake’s P-cores kill hyperthreading

The performance core within Lunar Lake, Lion Cove, is 14 percent faster than the P-core within Meteor Lake, known as Redwood Cove. And that’s with a huge change: Intel has turned off hyperthreading across Lunar Lake. Yes, hyperthreading, the SMT technology that’s been a staple of Intel’s chips for about twenty years.

Image: Intel is making the case that hyperthreading is just too expensive in terms of power and cost. (Credit: Intel)

So why get rid of hyperthreading? According to Ori Lempel, the senior principal engineer of Intel’s P-Core, Intel’s goals were to optimize single-threaded performance, with an eye toward maximizing the performance per watt per area on the chip — low performance per watt costs battery life, and low performance per area essentially costs Intel money in manufacturing costs.

Hyperthreading does make sense for performance parts and datacenters, Lempel noted, but it requires physical space for the hyperthreading logic and the associated silicon. In thin-and-light laptops, the target for Lunar Lake, Intel engineers discovered that they achieved 15 percent more performance per watt and 10 percent more performance per area with hyperthreading turned off than with a hyperthreading-enabled processor.

Image: Intel’s Lion Cove, and its relative performance. (Credit: Intel)

There are two other key changes in the P-core. First, if a Lunar Lake chip needs to add or subtract performance, it will do so more gradually. Intel processors currently increase and decrease clock speed in 100MHz increments; Lunar Lake will step up and step down at 16.67MHz intervals. Second, Intel has added a small “AI” controller, which will monitor the system in real time. The idea is that Lunar Lake systems will make small, incremental adjustments to power and speed, maximizing performance and battery life for users.
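To see why the finer clock step matters, here is a small worked example. It assumes the 16.67MHz step is exactly one-sixth of 100MHz, and the 3,780MHz target is a made-up number for illustration, since Intel has not disclosed Lunar Lake’s clock speeds. The function name `snap_down` is hypothetical.

```python
from fractions import Fraction

COARSE_STEP_MHZ = Fraction(100)      # step size of current Intel processors
FINE_STEP_MHZ = Fraction(100, 6)     # ~16.67MHz, ASSUMED to be exactly 100/6


def snap_down(target_mhz: Fraction, step_mhz: Fraction) -> Fraction:
    """Highest frequency on the step grid that does not exceed the target."""
    return (target_mhz // step_mhz) * step_mhz


if __name__ == "__main__":
    target = Fraction(3780)          # hypothetical frequency the chip could sustain
    for label, step in (("100MHz steps", COARSE_STEP_MHZ),
                        ("16.67MHz steps", FINE_STEP_MHZ)):
        chosen = snap_down(target, step)
        left_unused = target - chosen
        print(f"{label}: runs at {float(chosen):.2f}MHz, "
              f"leaves {float(left_unused):.2f}MHz unused")
```

With the coarse grid, the hypothetical chip settles at 3,700MHz and leaves 80MHz unused; with the fine grid it settles at about 3,766.67MHz and leaves roughly 13MHz unused.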

From a security standpoint, Intel has added a “partner security engine” to the Intel silicon security engine and the Intel graphics security controller. That partner security engine is Pluton, the Microsoft-AMD security engine that has successfully protected the Xbox.

It’s time for Xe2 to debut

Intel has steadily increased the performance of its integrated GPU in successive generations, but Lunar Lake marks a sharp leap: this is the debut of the Xe2 graphics architecture. Tom Petersen, an Intel fellow, confirmed that Xe2 is inside Lunar Lake, and this is the same architecture that will debut later in a discrete GPU for desktops, code-named “Battlemage.”

Image: Intel’s Xe2 architecture: Lunar Lake on the left, Battlemage on the right. (Credit: Intel)

Again, Intel isn’t talking specifics, including Xe2’s clock speeds, memory, or details of the Lunar Lake implementation. But Intel provided a more general overview of how Lunar Lake’s Xe2 implementation compares to the integrated GPU within Meteor Lake.

Petersen described the Xe2 architecture as “more compatible with games and with a higher utilization.”

Image: Intel isn’t providing actual performance numbers yet, but it is providing some comparisons to the first-gen architecture. (Credit: Intel)

Intel’s Xe2 core has been redesigned, with eight 512-bit vector engines accompanied by eight 2048-bit Xe Matrix Extension (XMX) engines capable of 2,048 FP16 operations per clock and 4,096 8-bit integer operations per clock — both tools that can be used for traditional graphics as well as AI. There’s an improved ray tracing unit, too.

In Lunar Lake, Intel has set up the GPU to offer eight Xe cores, with 64 vector engines and two geometry pipelines. All told, Intel believes it will offer 1.5X the performance of the previous generation, at the same power.
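The configuration numbers above lend themselves to a quick back-of-the-envelope check. The sketch below rests on two assumptions that Intel has not confirmed in this story: that the 4,096 8-bit integer operations per clock is a per-Xe-core figure, and that the 67 GPU TOPS Intel quotes for Lunar Lake comes from that INT8 path. Under those assumptions, the implied GPU clock falls out of simple division; treat it as an estimate, not a specification.

```python
XE_CORES = 8                          # Lunar Lake GPU configuration
VECTOR_ENGINES_PER_CORE = 8           # eight 512-bit vector engines per Xe2 core
INT8_OPS_PER_CLOCK_PER_CORE = 4096    # ASSUMPTION: quoted XMX figure is per Xe core
GPU_TOPS_QUOTED = 67                  # Intel's quoted GPU TOPS for Lunar Lake

total_vector_engines = XE_CORES * VECTOR_ENGINES_PER_CORE
int8_ops_per_clock = XE_CORES * INT8_OPS_PER_CLOCK_PER_CORE


def implied_clock_ghz(tops: float, ops_per_clock: int) -> float:
    """Clock (GHz) needed to hit `tops` trillion ops/s at `ops_per_clock`."""
    return tops * 1e12 / ops_per_clock / 1e9


if __name__ == "__main__":
    print("vector engines:", total_vector_engines)
    print("INT8 ops per clock:", int8_ops_per_clock)
    print(f"implied clock: {implied_clock_ghz(GPU_TOPS_QUOTED, int8_ops_per_clock):.2f} GHz")
```

The vector-engine total matches the 64 that Intel states. The implied clock of a little over 2GHz is only as good as the two assumptions behind it.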

Image: Here’s how Intel’s Xe2 will be configured within Lunar Lake. (Credit: Intel)

“I don’t think I’m allowed to tell you the performance at higher power,” Petersen added.

The Lunar Lake display engine will offer 3 display pipes, with HDMI 2.1 (up to 8K60 HDR 10-bit), DisplayPort 2.1 (three 4K60 displays) and a new eDP 1.5 connection, which will allow for 360Hz 1440p displays for gaming.

Intel also has a technology called “panel replay,” which is an evolution of how the display panel can self-refresh. Adaptive sync displays adjust the panel’s frame rate to match the content coming in, eliminating judder or screen tearing. Panel replay does something similar. The example shown was a movie, where the panel has to self-adjust its timing to account for the 24fps that movies are delivered in, as opposed to the native 60Hz (or higher) of the panel.

What panel replay does is understand that certain frames may need to be repeated. If this happens, though, the display engine can turn off the CPU cores and in some cases the memory when they aren’t needed. The GPU just queues the needed frames in place.
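The repeated frames come from simple arithmetic: 60 refreshes divided by 24 frames is 2.5, so a fixed 60Hz panel has to hold some frames for two refreshes and others for three. The sketch below is generic cadence math, not Intel’s panel replay implementation, and the function name `repeat_cadence` is hypothetical.

```python
def repeat_cadence(content_fps: int, panel_hz: int, frames: int) -> list:
    """Number of panel refreshes each of the first `frames` content frames occupies."""
    counts = []
    for i in range(frames):
        start = (i * panel_hz) // content_fps        # first refresh showing frame i
        end = ((i + 1) * panel_hz) // content_fps    # first refresh showing frame i + 1
        counts.append(end - start)
    return counts


if __name__ == "__main__":
    print(repeat_cadence(24, 60, 4))     # alternating holds: [2, 3, 2, 3]
    print(repeat_cadence(24, 120, 3))    # 120Hz divides evenly: [5, 5, 5]
```

Every refresh beyond the first for a given frame is a repeat of pixels the panel already has, which is the window in which the rest of the system can idle.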

There’s also something new on the video codec front. While Lunar Lake performs encoding and decoding of the AV1 video codec, it has added decoding support for VVC (H.266), an advanced video codec. AV1 shrinks file size by about 40 percent compared to the older HEVC codec, and VVC file sizes will be about 90 percent of an AV1 file, Petersen said. However, VVC’s complexity is substantially higher.
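Chaining those two percentages gives a rough sense of scale. The example below takes Petersen’s figures at face value and applies them to a hypothetical 1,000MB HEVC file. Real savings vary with content and encoder settings, and the function name `relative_sizes` is made up for this sketch.

```python
AV1_SAVING_VS_HEVC = 0.40    # AV1 is about 40 percent smaller than HEVC
VVC_FRACTION_OF_AV1 = 0.90   # VVC is about 90 percent the size of AV1


def relative_sizes(hevc_mb: float) -> dict:
    """Estimated file sizes (MB) for the same video in each codec."""
    av1_mb = hevc_mb * (1 - AV1_SAVING_VS_HEVC)
    vvc_mb = av1_mb * VVC_FRACTION_OF_AV1
    return {"HEVC": hevc_mb, "AV1": av1_mb, "VVC": vvc_mb}


if __name__ == "__main__":
    sizes = relative_sizes(1000.0)   # hypothetical 1,000MB HEVC file
    for codec, size in sizes.items():
        print(f"{codec}: about {size:.0f}MB")
```

Under these figures, the hypothetical file drops to about 600MB in AV1 and about 540MB in VVC, or roughly 46 percent smaller than the HEVC original.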

Lunar Lake’s NPU: It’s finally time for Copilot

Naturally, a key focus for Lunar Lake is AI, and the chip features a significantly improved “NPU 4” core.

Lunar Lake lands at a weird intersection of AI capabilities. Most people have only used AI in the cloud, through Windows Copilot, Google’s AI Overviews, ChatGPT, or some other service. Chipmakers would love for you to use local AI, and Copilot+ PCs with native AI capabilities will start shipping later this month, but initially only with Qualcomm’s Snapdragon X Elite chips inside.

Image: Intel is making the case that whatever the platform (CPU, NPU, or GPU), it can deliver. (Credit: Intel)

Customers who bought into Intel’s initial vision of an AI PC may feel a little jilted; current Meteor Lake laptops only generated 11.5 TOPS from the NPU, significantly under the 40 TOPS that Microsoft’s Copilot+ program requires. The new “NPU 4” inside Lunar Lake produces 48 TOPS all by itself. That means Lunar Lake PCs will be Copilot+ capable, when they ship. Meteor Lake AI PCs are not.

Further reading: Microsoft’s Copilot+ PC push leaves existing ‘AI PCs’ behind

What’s new? Meteor Lake had a pair of inference pipelines in the NPU. Lunar Lake has six, each of which triples the number of multiply-accumulate (MAC) engines that are fundamental to AI processing. That basically works out to double the performance in the same power envelope. AI processing is essentially a ton of specific matrix and vector mathematics, and Intel has begun adding in specialized blocks. One of those is the vector engine that Intel calls the SHAVE DSP, which provides 12 times the vector performance. Basically, Intel is saying that SHAVE will boost the performance of LLMs, or AI chatbots, running locally on your PC.

Intel believes that Lunar Lake offers a potent combination of AI capabilities, with 120 TOPS spread over the CPU (5 TOPS), GPU (67 TOPS), and NPU (48 TOPS). But that unfortunately ignores the broader point: most applications pick one chip, and don’t use all three at once.

Not all, though. In a demo, Intel showed how running 20 iterations of Stable Diffusion could be achieved in about a quarter of the time of Meteor Lake, and at lower power, too, using the NPU and GPU in concert.
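The TOPS figures quoted above are easy to sanity-check. The short script below adds up Intel’s per-engine numbers and compares each NPU against the 40 TOPS that the article says Copilot+ requires. The constant and function names are hypothetical, and the check looks only at NPU TOPS, ignoring any other program requirements.

```python
COPILOT_PLUS_MIN_NPU_TOPS = 40       # NPU threshold cited in the article

LUNAR_LAKE_TOPS = {"CPU": 5, "GPU": 67, "NPU": 48}
METEOR_LAKE_NPU_TOPS = 11.5


def platform_tops(tops_by_engine: dict) -> float:
    """Headline 'platform' figure: the sum across all engines."""
    return sum(tops_by_engine.values())


def meets_npu_threshold(npu_tops: float) -> bool:
    """True if the NPU alone clears the Copilot+ TOPS bar."""
    return npu_tops >= COPILOT_PLUS_MIN_NPU_TOPS


if __name__ == "__main__":
    print("Lunar Lake platform TOPS:", platform_tops(LUNAR_LAKE_TOPS))
    print("Lunar Lake NPU qualifies:", meets_npu_threshold(LUNAR_LAKE_TOPS["NPU"]))
    print("Meteor Lake NPU qualifies:", meets_npu_threshold(METEOR_LAKE_NPU_TOPS))
```

The sum does come to 120, but only the NPU line decides Copilot+ eligibility, which is why the headline platform figure matters less than it appears.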

Image: Intel NPU4 on Lunar Lake in action. (Credit: Intel)

Lunar Lake’s communications technology: using Wi-Fi as a sensor and more

Surprisingly, Lunar Lake will not be the debut platform of Thunderbolt 5, as you might have expected. But it will integrate Wi-Fi 7 and Bluetooth 5.4, and provide an enhanced multi-link single-radio (eMLSR) technology that should improve throughput by hopping back and forth between wireless channels. And there’s a wild new technology, called Wi-Fi Sensing, that uses a Wi-Fi radio as essentially a type of radar.

According to Carlos Cordeiro, an Intel fellow and the wireless CTO of Intel’s Client Computing Group, Intel is strongly encouraging laptop makers to cluster all of the Thunderbolt ports on one side of a laptop, stop mixing and matching Thunderbolt and USB-C ports, and properly label all Thunderbolt ports, all things that should have happened long ago. (Lunar Lake will also support three Thunderbolt ports, up from two, and the Thunderbolt Share sneakernet will be featured.) Cordeiro indicated that Thunderbolt 5 will be in Intel silicon later this year, which likely means Arrow Lake.

Interestingly, you will see higher throughput with Thunderbolt 5. Thunderbolt 5 SSDs will actually deliver 25 percent more performance on a Lunar Lake PC with a Thunderbolt 4 port, Cordeiro said.

Wi-Fi 7 was in Meteor Lake, too, but now it’s been more fully integrated, saving power. Intel built in a small 11Gbps interface between the Lunar Lake platform controller tile and the wireless module, future-proofing the connection.

Though the Intel Wi-Fi radio can talk on three bands (2.4GHz, 5GHz, and 6GHz), those bands can still become congested, slowing data throughput. Intel’s enhanced multi-link single-radio (eMLSR) technology is designed to solve that problem. Essentially, eMLSR concentrates on a single frequency, but periodically listens to others, especially if the frequency becomes congested. The technology will then shift the radio’s communication over to the uncongested frequency.
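The band-hopping behavior described above amounts to a simple decision rule: stay put while the current band is usable, and move to a quieter one when it isn’t. The sketch below is a toy model under loud assumptions. The 0.7 congestion threshold, the airtime-load numbers, and the function name `choose_band` are all invented for illustration and do not come from Intel or the Wi-Fi 7 specification.

```python
CONGESTION_THRESHOLD = 0.7   # HYPOTHETICAL: fraction of airtime already in use


def choose_band(current: str, load: dict,
                threshold: float = CONGESTION_THRESHOLD) -> str:
    """Stay on the current band unless it is congested, then move to the quietest band."""
    if load[current] < threshold:
        return current                    # current band is fine, keep using it
    quietest = min(load, key=load.get)    # result of "listening" to the other bands
    return quietest if load[quietest] < load[current] else current


if __name__ == "__main__":
    calm = {"2.4GHz": 0.5, "5GHz": 0.3, "6GHz": 0.1}
    busy = {"2.4GHz": 0.5, "5GHz": 0.9, "6GHz": 0.1}
    print(choose_band("5GHz", calm))   # stays on 5GHz
    print(choose_band("5GHz", busy))   # hops to 6GHz
```

Note that the rule does not chase the quietest band at all times; in this model the radio only moves when the band it is on crosses the congestion threshold.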

And did you know that DDR memory itself can cause Wi-Fi interference? Intel uses a technology called RF Interference Mitigation to dynamically adjust the clock frequency of the memory to prevent interference.

Image: Intel can adjust the frequency of its DDR memory to avoid interference with your laptop’s Wi-Fi radio. (Credit: Intel)

Wi-Fi Sensing uses both antennas, one broadcasting and one receiving. The laptop essentially broadcasts radio data out, then uses the other antenna to “listen” for a bounce off various objects, specifically you. If the Wi-Fi Sensing technology detects you’re walking away, it locks your computer and shuts off the display. If you then approach, it wakes the display (but doesn’t unlock the computer).

“You can be a kid, a big person — that’s the other type of magic,” Cordeiro said. “We can retrain the model so that we know the size of the person that’s approaching.”

It’s a little scary! Intel has bigger plans for Wi-Fi Sensing, though it’s unclear whether they’ll come to market. “Future PCs will be able to detect user movements and gestures, monitor heartbeat and breathing rate, whether accessories are to the left or right, how many there are, etc.,” Intel said.
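The lock-and-wake behavior described above is a tiny state machine, sketched here for clarity. The event names `walk_away` and `approach`, the `LaptopState` type, and the function `on_presence_event` are hypothetical; how the radio actually classifies movement is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LaptopState:
    locked: bool = False
    display_on: bool = True


def on_presence_event(state: LaptopState, event: str) -> LaptopState:
    """Apply a presence event reported by the (hypothetical) sensing layer."""
    if event == "walk_away":
        # User leaves: lock the PC and turn the display off.
        return LaptopState(locked=True, display_on=False)
    if event == "approach":
        # User returns: wake the display, but leave the lock state untouched.
        return LaptopState(locked=state.locked, display_on=True)
    raise ValueError(f"unknown event: {event}")


if __name__ == "__main__":
    state = LaptopState()
    state = on_presence_event(state, "walk_away")
    print(state)
    state = on_presence_event(state, "approach")
    print(state)
```

The asymmetry is the point: walking away changes both the lock and the display, while approaching only wakes the screen, so unlocking still requires the user to authenticate.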

Intel’s Unison is getting beefed up, too, with tablet control, a quick connect to phones that don’t have access to Unison, and a universal hotspot. The latter functionality is already in Windows, so it’s unclear what Unison will deliver.

Image: Intel Unison on Lunar Lake. (Credit: Intel)

Finally, Lunar Lake can run Bluetooth over PCIe, which Cordeiro said will save time accessing the Bluetooth device.

In all, Lunar Lake is yet another substantive rewriting of the mobile PC processor. But with Qualcomm’s Snapdragon X Elite and AMD’s Ryzen AI 300 waiting in the wings, can it maintain its traditional laptop leadership? We’ll see.

As Gelsinger said today, “This is the most exciting moment in the PC market in 25 years.”
