An external GPU in practice: lanes and performance
Even though Gigabyte’s egpu boxes contain the same gpu as a desktop PC (in the case of the Aorus GTX 1070 Gaming Box even a card that can be bought separately in a store), an external gpu performs worse than the same graphics card plugged straight into your motherboard. This is first of all due to the limitations of Thunderbolt 3: its bandwidth may be high for an external connection, but a graphics card in a pci-express x16 slot still has four times as much bandwidth available. In addition, communication with an egpu first passes through the platform controller hub (pch) and then the Thunderbolt controller, which encapsulates the pcie signals and sends them over the cable together with any other signals (displayport, usb).
With the synthetic benchmarks below, we show the performance loss of Thunderbolt 3. We used a desktop system with Thunderbolt 3, consisting of a Gigabyte X299 Designare EX motherboard, a 10-core Intel Core i9-7900X cpu with a maximum turbo clock of 4.5 GHz, and 32 GB of ddr4 memory running at 3200 MHz. This test system will not bottleneck the gpu. We first ran our series of game benchmarks with the Gigabyte Aorus GTX 1070 Gaming Box connected via Thunderbolt 3. Then we took the GTX 1070 out of its enclosure and installed it in a pci-express slot on the motherboard, after which we ran the same tests again.
Even in a very powerful system, a gpu connected via Thunderbolt 3 performs about thirty percent worse than the same gpu connected via pci-express. However, it is important to note that the gpu in a gaming notebook does not offer the same level of performance as a desktop card either. This is why the differences in the other test that we ran are smaller.
A second remark: the cpu of your office laptop can be another limit on the final performance level. Although the cpu performance of ultrabooks has improved considerably since Intel introduced the energy-efficient Kaby Lake-R quad-cores last year, you still cannot expect the same level of performance from these processors as from a desktop cpu, especially as desktops now also have more cores. For example, the Core i5-8400 for the desktop has six cores and is allowed to dissipate 65 W, while the quad-core i5-8250U that we often encounter in ultrabooks is bound to a tdp of 15 to 25 W. When it comes to cooling, a laptop cpu has less room, literally and figuratively speaking.
Confusingly enough, even if you have a laptop with Thunderbolt 3 support, you still cannot always be sure that an external gpu is supported optimally. This is due to the way the Thunderbolt 3 controller is internally connected to the rest of the system. On paper, the connection between pch and controller is made via four pcie 3.0 lanes, each good for roughly eight gigabits per second, so that the port can be used at its maximum bandwidth.
However, in several Thunderbolt 3-capable laptops from the past few years, the controller turned out to be connected with two lanes rather than four. According to the official Intel specifications, this is permitted. It is not always the laptop manufacturers' own fault that they connect the Thunderbolt controller with fewer lanes either. Intel’s mobile processors for ultrabooks have very few pcie lanes available for communicating with the outside world. For example, the recent mobile Core i5-8250U cpu has twelve lanes, whereas a modern Core i5-8400 desktop processor is equipped with twice as many lanes.
This is why ultrabook manufacturers who want to add extra functionality that requires more bandwidth sometimes simply connect the Thunderbolt controller to half as many lanes. For many applications this is not much of a problem, but it is very relevant for egpus.
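To put the lane counts into numbers, here is a minimal sketch of the arithmetic. It assumes the standard pcie 3.0 figures of 8 GT/s per lane with 128b/130b encoding, and it only gives the theoretical one-direction rate of the pcie link itself. It does not model Thunderbolt protocol overhead or the other traffic on the cable. The function name is made up for this illustration.

```python
# Theoretical PCIe 3.0 link bandwidth per lane count.
# Assumptions: 8 GT/s per lane, 128b/130b line encoding; protocol overhead
# and Thunderbolt encapsulation are NOT taken into account.
GIGATRANSFERS_PER_LANE = 8.0
ENCODING_EFFICIENCY = 128 / 130


def pcie3_gbit_per_s(lanes: int) -> float:
    """Return the theoretical one-direction PCIe 3.0 rate in Gbit/s."""
    return lanes * GIGATRANSFERS_PER_LANE * ENCODING_EFFICIENCY


if __name__ == "__main__":
    configurations = [
        ("desktop x16 slot", 16),
        ("Thunderbolt controller on four lanes", 4),
        ("Thunderbolt controller on two lanes", 2),
    ]
    for label, lanes in configurations:
        print(f"{label}: {pcie3_gbit_per_s(lanes):.1f} Gbit/s")
```

The ratios are what matter here: the x16 slot has four times the bandwidth of a four-lane controller, and a two-lane controller has half of that again.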
Unfortunately, if you want to make sure that the Thunderbolt connector on your new notebook can communicate with the cpu at full speed, it is rather difficult to check the system's internal structure before you buy it. Fortunately, there are websites that keep track of which laptops have a Thunderbolt 3 port that is connected via four lanes, such as Ultrabook Review.
Once you have the laptop, you can check how the Thunderbolt controller is connected by downloading the HWiNFO tool. In the overview on the left, expand the heading 'Bus'. You will now see a series of 'PCI Express Root Ports' below the pch. Look for the port under which the Alpine Ridge Thunderbolt Controller is located. If there is a 'PCI Express x2 bus' anywhere in the tree structure between the Thunderbolt controller and the pch, the controller is connected via two lanes. If, on the other hand, you only see x4 connections from top to bottom, your Thunderbolt controller can communicate with the cpu at maximum speed.
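The rule used in this check is that the narrowest link on the path between pch and Thunderbolt controller determines the result. The sketch below illustrates only that logic. It does not read anything from HWiNFO: the `Node` class, the function name and the example tree are all hypothetical and typed in by hand to resemble what the tool displays.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One entry in a hand-written bus tree (hypothetical structure)."""
    name: str
    link_width: Optional[int] = None  # lanes of the link above this node; None for the root
    children: List["Node"] = field(default_factory=list)


def narrowest_link_to(node: Node, target: str,
                      narrowest: Optional[int] = None) -> Optional[int]:
    """Return the smallest link width on the path from the root to the first
    node whose name contains `target`, or None if no such node is found."""
    if node.link_width is not None:
        narrowest = node.link_width if narrowest is None else min(narrowest, node.link_width)
    if target in node.name:
        return narrowest
    for child in node.children:
        found = narrowest_link_to(child, target, narrowest)
        if found is not None:
            return found
    return None


# Made-up example: a controller that sits behind an x2 bus.
tree = Node("PCH", children=[
    Node("PCI Express Root Port #1", 1, [Node("Wireless adapter", 1)]),
    Node("PCI Express Root Port #5", 2, [
        Node("PCI Express x2 bus", 2, [
            Node("Alpine Ridge Thunderbolt Controller", 4),
        ]),
    ]),
])

if __name__ == "__main__":
    lanes = narrowest_link_to(tree, "Alpine Ridge")
    print(f"Thunderbolt controller effectively connected via x{lanes}")
```

In this made-up tree the controller itself reports x4, but because an x2 bus sits above it, the function returns 2, which matches the reading described in the text.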
External monitor versus internal panel
As we will see in the practical tests with several laptops, the difference between two and four lanes is fortunately not that big in practice, except in one scenario: when you want to play games on the panel of the laptop itself, instead of on an external monitor that you connect directly to the external gpu.
Even if your laptop does have a Thunderbolt 3 port that is internally connected via four pcie lanes, you will lose performance when you use the internal laptop panel. The explanation is simple. If you want to show the images that your external gpu renders on your laptop screen, they need to be sent back to your system via the same Thunderbolt 3 cable. The Thunderbolt controller gives priority to the displayport signals over the pcie connection, reducing the bandwidth available to the gpu.
In the three standard gaming benchmarks we ran for this test, the frame rate of our test system drops by an average of 13 percent when we use the internal panel instead of an external monitor. We performed these tests at 1080p resolution. If your Thunderbolt connector uses only two lanes, it is even more obvious that the connection is the bottleneck. On our test system with the controller connected via two pcie lanes, the frame rate drops by an average of 23 percent.
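To get a feel for what these percentages mean, here is a small worked example. The 13 and 23 percent figures are the averages quoted above. The baseline of 100 fps on an external monitor is purely an assumption chosen for easy arithmetic, not a measured result, and the function name is made up.

```python
def fps_after_loss(baseline_fps: float, loss_percent: float) -> float:
    """Apply a percentage decrease to a baseline frame rate."""
    return baseline_fps * (1 - loss_percent / 100)


# Hypothetical baseline: frame rate on an external monitor (not a measurement).
EXTERNAL_MONITOR_FPS = 100.0

# Average decreases on the internal panel, as quoted in the text.
INTERNAL_PANEL_LOSS = {
    "four lanes": 13.0,
    "two lanes": 23.0,
}

if __name__ == "__main__":
    for lanes, loss in INTERNAL_PANEL_LOSS.items():
        fps = fps_after_loss(EXTERNAL_MONITOR_FPS, loss)
        print(f"{lanes}: about {fps:.0f} fps on the internal panel")
```

Note that each percentage is relative to the external-monitor result on the same system, so the two rows should not be read as a direct comparison between a four-lane and a two-lane laptop.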