Memory Frequency Scaling on Intel's Skull Canyon NUC - An Investigation
by Ganesh T S on August 29, 2016 8:00 AM EST
Overclocking has generally been the domain of enthusiasts with desktop rigs. However, increasing the CPU frequency beyond the official specification is not the only way to extract more performance from a computing system. Memory-bound workloads can benefit from memory hierarchies with increased bandwidth and/or lower latencies.
The memory controller in the Intel U-series processors (on which most, if not all, high performance SFF systems are based) is rated to operate only at the standard JEDEC speeds (1600 MHz for DDR3 and 2133 MHz for DDR4). However, the Skull Canyon NUC (NUC6i7KYK) can support DDR4 SODIMMs operating at more than 2133 MHz. In this article, we explore the effects of varying DDR4 frequencies and latencies on the NUC6i7KYK.
Background
Intel's Skylake platform brought DDR4 DRAM into the mainstream market. DDR4 brings a host of improvements over DDR3, particularly in terms of operating at lower voltage and higher frequencies. Another important advantage is the maximum capacity per DIMM (moving from the usual 8GB in DDR3 to 16GB in DDR4). On standard non-overclocked systems, DDR4 memory can operate at up to 2133 MHz. DDR4 DIMMs operating as high as 3733 MHz are available for desktop systems with full-sized memory slots. On the SODIMM side, we have seen various vendors introduce kits operating at more than 2133 MHz. However, there are very few systems utilizing DDR4 SODIMMs that also support memory overclocking (a term I will use in this article to indicate DDR4 operation at more than 2133 MHz).
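To put these speed grades in perspective, the theoretical peak bandwidth of a memory channel can be estimated from its transfer rate. The sketch below is a back-of-the-envelope calculation, not a measurement: it assumes a standard 64-bit (8-byte) non-ECC channel and treats the "MHz" figures quoted above as transfer rates in MT/s. The function name peak_bandwidth_gbs is our own, chosen purely for illustration.

```python
# Theoretical peak bandwidth of DDR memory, assuming a 64-bit channel.
BYTES_PER_TRANSFER = 8  # assumption: standard non-ECC channel, 64 bits wide


def peak_bandwidth_gbs(data_rate_mts: float, channels: int = 1) -> float:
    """Return the peak bandwidth in GB/s (10^9 bytes/s) for a rate in MT/s."""
    return data_rate_mts * 1e6 * BYTES_PER_TRANSFER * channels / 1e9


# Speed grades mentioned in the text; real-world throughput will be lower.
for rate in (2133, 3733):
    print(f"DDR4-{rate}: {peak_bandwidth_gbs(rate):.1f} GB/s per channel")
```

Passing channels=2 gives the corresponding dual-channel figure. These are upper bounds only; achieved bandwidth depends on timings, access patterns and the memory controller.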
SODIMMs are used in notebooks and SFF PCs. The memory in most notebooks is rarely upgraded, and many high-performance notebooks are now opting to go with soldered DRAM instead of SODIMM slots. That leaves SFF PCs as the only viable option to explore the benefits of this new wave of DDR4 SODIMMs. Skylake-U is not an effective option because its memory controller is rated to operate at only 2133 MHz. However, the Skull Canyon NUC (NUC6i7KYK), which combines a Skylake-H CPU (Core i7-6770HQ) with the H170 chipset, is a completely different platform. The BIOS includes XMP support, resulting in the appropriate SODIMM kits operating at speeds higher than 2133 MHz automatically.
One of the advantages of DDR4 SODIMMs is that we can get up to 16GB on a single SODIMM. On systems like the NUC6i7KYK with two SODIMM slots, we can have up to 32GB of DRAM. To that end, for today's article we have procured 2x16GB DDR4 SODIMM kits rated for operation at more than 2133 MHz from Corsair, Crucial, G.Skill, Kingston and Patriot Memory to study the impact of memory overclocking on the performance of Skull Canyon.
The Core i7-6770HQ Memory Controller and Hierarchy
Our Skull Canyon NUC (NUC6i7KYK) review analyzed the platform and BIOS in great detail. The block diagram shows two channels of DDR4 memory operating at 1.2V for a 128b memory interface. When both memory slots are occupied, the two memory channels operate in dual-channel symmetric mode (also known as interleaved mode) and provide the best performance for real-world applications. Addresses are ping-ponged between the channels after each cache line boundary (64 bytes). In the case that only one slot is occupied, the operation is in single-channel asymmetric mode. The memory controller does support ECC RAM (since it is used in the Xeon E3-1500 v5 lineup also), but the feature is disabled in the Core i7-6770HQ.
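The cache-line interleaving described above can be illustrated with a deliberately simplified model. The sketch below assumes that the channel is selected purely by the cache-line index of the physical address; actual memory controllers may hash additional address bits, so this is an illustration of the idea and not the Core i7-6770HQ's real mapping. The function name channel_for_address is hypothetical.

```python
CACHE_LINE_BYTES = 64  # interleave granularity quoted in the text
NUM_CHANNELS = 2       # both SODIMM slots populated


def channel_for_address(phys_addr: int) -> int:
    """Simplified model: consecutive cache lines alternate between channels."""
    return (phys_addr // CACHE_LINE_BYTES) % NUM_CHANNELS


# Walk four consecutive cache lines and show which channel serves each one.
for addr in range(0, 4 * CACHE_LINE_BYTES, CACHE_LINE_BYTES):
    print(f"address 0x{addr:04x} -> channel {channel_for_address(addr)}")
```

Under this model a sequential stream spreads evenly across both channels, which is why dual-channel symmetric mode tends to help bandwidth-bound workloads.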
Intel recommends that the SODIMMs used in the NUC6i7KYK support the Serial Presence Detect (SPD) data structure. This allows the BIOS to read the SPD data and program the chipset to accurately configure memory settings for optimum performance. If non-SPD memory is installed, the BIOS will attempt to correctly configure the memory settings, but performance and reliability may be impacted or the SO-DIMMs may not function at the determined frequency.
The other interesting aspect of the Core i7-6770HQ that bears relevance to the performance of the NUC6i7KYK is the internal memory hierarchy. The processor is one of the Skylake-H members to come with a 128MB On-Package Cache (eDRAM). The processor also features a 6MB L3 (LLC) cache, which is shared among all 4 CPU cores. Each core also has a dedicated 256KB L2 cache (making for a total of 1MB L2 in the Core i7-6770HQ). Skylake's core architecture (32KB of I-cache and 32KB of D-cache) is well-known and has been analyzed before.
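One way to reason about which workloads are likely to be sensitive to DRAM speed is to compare their working-set size against the capacities listed above. The sketch below is a rough model only: it assumes each level is entirely available to a single working set and ignores sharing between cores, associativity and replacement policy. The names HIERARCHY and smallest_level_that_fits are hypothetical.

```python
KB, MB = 1024, 1024 * 1024

# (level name, capacity in bytes) using the figures quoted above, smallest first.
HIERARCHY = [
    ("L1 D-cache (per core)", 32 * KB),
    ("L2 (per core)", 256 * KB),
    ("L3 / LLC (shared)", 6 * MB),
    ("eDRAM (on-package)", 128 * MB),
]


def smallest_level_that_fits(working_set_bytes: int) -> str:
    """Return the first level whose capacity can hold the working set."""
    for name, capacity in HIERARCHY:
        if working_set_bytes <= capacity:
            return name
    return "DRAM"  # nothing on the package is large enough


for size in (16 * KB, 4 * MB, 64 * MB, 1024 * MB):
    print(f"{size // KB:>8} KB working set -> {smallest_level_that_fits(size)}")
```

In this simplified view, only working sets that spill past the eDRAM would be expected to reach the SODIMMs on most accesses.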
The LLC in the Core i7-6770HQ is only 6MB (1.5MB/core), while other members of the Skylake-H family with Iris Pro Graphics (same 128MB eDRAM configuration) have 2MB of LLC per core (total of 8MB). The most important aspect here is that the eDRAM is available not only to the GPU, but also to the other clients of the memory controller.
The BIOS of the NUC6i7KYK supports the Extreme Memory Profile (XMP), an Intel-developed JEDEC SPD extension for memory kits to indicate support for high-performance timings that are beyond the JEDEC standard. This allows 'overclocked' memory kits to be plug-and-play. The BIOS can read the extra SPD information at boot time and automatically set the memory timings to the overclocked configuration.
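Because timings are specified in memory-clock cycles, a higher-frequency kit with a numerically larger CAS latency can still have a lower absolute latency. The sketch below converts a cycle count to nanoseconds, using the fact that DDR memory transfers data twice per clock. The kits in the loop are illustrative combinations only and are not the kits evaluated in this article; the function name cycles_to_ns is our own.

```python
def cycles_to_ns(cycles: int, data_rate_mts: float) -> float:
    """Convert a timing given in memory-clock cycles to nanoseconds.

    DDR transfers twice per clock, so the clock in MHz is data_rate / 2.
    """
    clock_mhz = data_rate_mts / 2
    return cycles * 1000.0 / clock_mhz


# Illustrative (data rate in MT/s, CAS latency in cycles) pairs, not tested kits.
for rate, cl in ((2133, 15), (2400, 16)):
    print(f"DDR4-{rate} CL{cl}: {cycles_to_ns(cl, rate):.2f} ns")
```

The same conversion applies to the other primary timings (tRCD, tRP, tRAS) printed on a SODIMM label.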
Evaluating Memory Frequency Scaling on the NUC6i7KYK
The rest of this review deals with the quantitative measurement of the effectiveness of different types of DRAM in the Skull Canyon NUC. In order to do this, we ran various benchmarks while keeping everything other than the DRAM SODIMMs constant. Each configuration was booted to BIOS multiple times to ensure that the SPD information was properly parsed and the optimal frequency / timing parameters chosen. Once the OS was booted, we also checked with multiple hardware monitoring tools that the parameters indicated by the BIOS for the DRAM SODIMMs were indeed what the OS was also seeing.
Intel NUC6i7KYK (Skull Canyon) Specifications | |
Processor | Intel Core i7-6770HQ Skylake-H, 4C/8T, 2.6 GHz (Turbo to 3.5 GHz), 14nm, 6MB L3, 45W TDP |
Memory | Various |
Graphics | Intel Iris Pro Graphics 580 |
Disk Drive(s) | Samsung SSD 950 PRO (512 GB; M.2 Type 2280 PCIe 3.0 x4 NVMe; 40nm; MLC V-NAND) |
Networking | Intel Dual Band Wireless-AC 8260 (2x2 802.11ac - 866 Mbps); Intel Ethernet Connection I219-LM GbE Adapter |
Audio | 3.5mm Headphone Jack; capable of 5.1/7.1 digital output with HD audio bitstreaming (HDMI) |
Miscellaneous I/O Ports | 4x USB 3.0; 1x Thunderbolt 3 / USB 3.1 Gen 2; 1x SDXC |
Operating System | Retail unit is barebones, but we installed Windows 10 Pro x64 |
Pricing (As configured) | $1027 |
Full Specifications | Intel Skull Canyon NUC6i7KYK Specifications |
In the next section, we will first take a look at the specifications of the five SODIMM kits that were evaluated in the NUC6i7KYK, along with the AIDA64 Memory Bench for each. Following this, we present the relevant benchmarks from Intel's Memory Latency Checker tool to determine the raw performance of the DRAM in the system. This is followed by our standard test suite for mini-PCs with a gaming focus - SYSmark 2014, Futuremark benchmarks and some select gaming titles. Prior to our concluding remarks, we take a look at a few miscellaneous aspects such as power consumption and pricing.
31 Comments
jjj - Monday, August 29, 2016
The second graph on page 3 should be flipped upside down as lower latency is better and right now it is misleading if you aren't paying attention.
snowmyr - Monday, August 29, 2016
http://imgur.com/a/GxZWh
You're Welcome
kebo - Tuesday, August 30, 2016
+1 internets
Gigaplex - Monday, August 29, 2016
"Upon booting into the BIOS after installation, I found that the memory was only configured to run at 2667 MHz. Altering the 'Automatic' DRAM timings to 'Manual' and 'user-defining' the various timing parameters as printed on the SODIMM label (16-18-18-43) enabled higher frequency operation."
I'm not surprised. My G.Skill RAM (DDR3) also didn't perform as advertised in a plug and play fashion, and when I emailed to complain, they acted as if it was normal for manual entry to be required. So much for XMP compliance.
Ian Cutress - Monday, August 29, 2016
The system BIOS automatically loads the SPD profile of the memory kit unless the XMP option is enabled. In most systems, XMP is disabled as the default option because kits without XMP (most of the base ones) don't exist. Also, the SPD profile is typically left as the base JEDEC settings to ensure full compatibility.
If you want true plug and play of high speed memory kits, one of two things need to happen:
1) XMP is enabled by default (but not all memory will work)
2) Base SPD profiles on the memory should be the high-speed option (means the memory won't work in systems not geared for high performance)
There are a number of Kingston modules, typically DDR4-2400/2666, that will use option number (2). Some high-end motherboards have an onboard switch for (1). For everything else, it requires manually adjusting a setting in the BIOS.
The problem, as always, is maintaining wide compatibility. Just in case someone buys a high-end memory kit but wants to run it at base JEDEC specifications, because the hardware they are moving the kit into doesn't support the high frequency.
TheinsanegamerN - Monday, August 29, 2016
Disappointing to see nearly no improvement in gaming benchmarks. You'd figure that a big iGPU would need more bandwidth with newer games.
Perhaps current iGPUs just are not powerful enough. Maybe AMD will fix that with Zen APUs next year.
Ian Cutress - Monday, August 29, 2016
It's a function of the embedded DRAM. You would expect DRAM speed to affect the iGPU less when eDRAM is present because it provides a large 50GB/s bidirectional DRAM buffer. Without eDRAM, I would expect the differences in gaming results to be more. Will have to do testing to find out - this piece was focusing primarily on the Skull Canyon environment which lists high speed memory support as a benefit.
Samus - Monday, August 29, 2016
I haven't seen a memory frequency roundup like this since Sandy Bridge, which did show a slight benefit (more than Skylake for sure) moving from DDR3 1066 through 1333, 1600 and so on. Haswell I'm sure is a similar story. I had noticeable performance improvements on AM3+ platforms going from 1600 to 2400 especially in regard to the embedded GPU.
With Skylake it seems you are just wasting your money to run potentially less reliable, more expensive memory out of specification. But I wonder if CPUs without the eDRAM have the same flat scale?
Ian Cutress - Monday, August 29, 2016
Ivy Bridge: http://www.anandtech.com/show/6372/memory-performa...
Haswell: http://www.anandtech.com/show/7364/memory-scaling-...
Samus - Monday, August 29, 2016
Oh cool, thanks Ian! Should have figured you guys keep up with it.