
Nvidia GeForce GTX 1660 Review: The Turing Onslaught Continues



It was only a matter of time before Nvidia took the pristine TU116 graphics processor in its GeForce GTX 1660 Ti and sculpted it a little to create a lower-cost derivative. The new GeForce GTX 1660 is, unsurprisingly, very similar to the higher-end model in that it lacks the RT and Tensor cores of the Turing architecture. Instead, it focuses its on-die resources on accelerating today's rasterized games.

Nvidia didn't even cut much from TU116's resource pool in creating GeForce GTX 1660: a pair of Streaming Multiprocessors is excised, taking 128 CUDA cores and eight texture units with them. But the GPU is otherwise fairly complete. This card's biggest loss is GDDR6 memory. With 8 Gb/s GDDR5 in its place, bandwidth drops from the 1660 Ti's 288 GB/s to just 192 GB/s.

Of course, the GeForce GTX 1660 is aimed primarily at FHD gaming, where 6GB of slower memory won't hurt performance as much as it would at higher resolutions. Can the $220/£200 card keep frame rates high enough to beat AMD's Radeon RX 590, which has more GDDR5 on a wider bus, though?

TU116 Summary: Turing without RT and Tensor cores

The GPU at the heart of GeForce GTX 1660 is specifically named TU116-300-A1. It is a close relative of the GeForce GTX 1660 Ti's TU116-400-A1, cut down from 24 Streaming Multiprocessors to 22. We are still dealing with a processor that lacks Nvidia's RT and Tensor cores; it measures 284 mm² and consists of 6.6 billion transistors manufactured on TSMC's 12 nm FinFET process.

Despite its smaller transistors, TU116 is 42 percent larger than the GP106 processor that preceded it. Part of this growth is attributable to the more sophisticated shaders of the Turing architecture. Like the higher-end GeForce RTX 20-series cards, GeForce GTX 1660 supports simultaneous execution of FP32 arithmetic instructions, which make up most shader workloads, and INT32 operations (for addressing/fetching data, floating-point min/max, compare, etc.). When you hear that Turing cores achieve better performance than Pascal at a given clock rate, this capability largely explains why.
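The 42 percent figure can be checked against the die sizes quoted in this review's spec table (284 mm² for TU116, 200 mm² for GP106). The snippet below is only a minimal sketch of that arithmetic; the function name `percent_larger` is made up for illustration.

```python
# Die areas in mm², as quoted in this review's spec table.
TU116_AREA_MM2 = 284
GP106_AREA_MM2 = 200


def percent_larger(new_area: float, old_area: float) -> float:
    """Return how much larger new_area is than old_area, in percent."""
    return (new_area / old_area - 1.0) * 100.0


growth = percent_larger(TU116_AREA_MM2, GP106_AREA_MM2)
print(f"TU116 is about {growth:.0f}% larger than GP106")
```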

Turing's Streaming Multiprocessors are composed of fewer CUDA cores than Pascal's, but the design partially compensates by spreading more SMs across each GPU. The newer architecture assigns one scheduler to each set of 16 CUDA cores (2x Pascal), along with one dispatch unit per 16 CUDA cores (the same as Pascal). Four of these 16-core groupings make up an SM, along with 96KB of cache that can be configured as 64KB L1/32KB shared memory or vice versa, and four texture units. Because Turing doubles up on schedulers, each one only needs to issue an instruction to the CUDA cores every other clock cycle to keep them full. In between, it is free to issue a different instruction to any other unit, including the INT32 cores.
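To see why one issue every other clock is enough, it helps to count lanes. The sketch below assumes a warp of 32 threads, which is the usual Nvidia execution granularity but is not stated in this article; the per-SM figures come from the paragraph above. All names are illustrative only.

```python
WARP_SIZE = 32            # assumption: threads per warp (not stated in the article)
CORES_PER_SCHEDULER = 16  # from the text: one scheduler per 16 CUDA cores
GROUPS_PER_SM = 4         # from the text: four 16-core groupings per SM


def issue_cycles_per_warp(warp_size: int, lanes: int) -> int:
    """Clock cycles one warp-wide instruction keeps a group of `lanes` cores busy."""
    return -(-warp_size // lanes)  # ceiling division


cycles = issue_cycles_per_warp(WARP_SIZE, CORES_PER_SCHEDULER)
free_slots = cycles - 1  # issue slots left over for other units (e.g. INT32)
cores_per_sm = CORES_PER_SCHEDULER * GROUPS_PER_SM

print(f"One FP32 warp instruction occupies a 16-core group for {cycles} cycles")
print(f"Free issue slots per FP32 instruction: {free_slots}")
print(f"FP32 cores per SM: {cores_per_sm}")
```

Under that assumption, a 32-thread instruction takes two clocks on 16 lanes, which leaves the scheduler one free slot to feed another unit.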

In TU116, Nvidia replaces Turing's Tensor cores with 128 dedicated FP16 cores per SM, which allow GeForce GTX 1660 to process half-precision operations at 2x the FP32 rate. The other Turing-based GPUs also boast double-rate FP16 through their Tensor cores, so TU116's configuration maintains that standard with hardware developed specifically for this GPU. The following table is an updated version of the one published in our GeForce GTX 1660 Ti review, and it illustrates TU116's massive improvement in half-precision throughput compared to GeForce GTX 1060 and its Pascal-based GP106 chip.

[Table: half-precision throughput, TU116 vs. GP106]

Running Sandra's Scientific Analysis module, which tests general matrix multiplies, shows us how much more FP16 throughput TU106's Tensor cores achieve compared to TU116. GeForce GTX 1060, which supports FP16 only symbolically, barely registers on the chart.

In addition to the Turing architecture's shaders and unified cache, TU116 also supports a pair of algorithms called Content Adaptive Shading and Motion Adaptive Shading, together referred to as Variable Rate Shading. We covered this technology in Nvidia's Turing Architecture Explored: Inside the GeForce RTX 2080. That story also introduced Turing's accelerated video encode and decode capabilities, which carry over to GeForce GTX 1660 as well.

Putting it all together …

Nvidia packs 24 SMs into TU116, splitting them between three Graphics Processing Clusters. With 64 FP32 cores per SM, that's 1,536 CUDA cores and 96 texture units across the entire GPU. After losing two SMs, GeForce GTX 1660 ends up with 1,408 active CUDA cores and 88 usable texture units.
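The core and texture-unit totals follow directly from the SM count. Here is a minimal sketch of that bookkeeping; the four-texture-units-per-SM figure is implied by 96 units across 24 SMs, and the helper name `gpu_resources` is hypothetical.

```python
FP32_CORES_PER_SM = 64    # from the text
TEXTURE_UNITS_PER_SM = 4  # implied by 96 texture units across 24 SMs


def gpu_resources(active_sms: int) -> dict:
    """Return CUDA core and texture unit totals for a given number of active SMs."""
    return {
        "cuda_cores": active_sms * FP32_CORES_PER_SM,
        "texture_units": active_sms * TEXTURE_UNITS_PER_SM,
    }


full_tu116 = gpu_resources(24)  # GeForce GTX 1660 Ti
gtx_1660 = gpu_resources(22)    # GeForce GTX 1660, two SMs disabled

print("Full TU116:", full_tu116)
print("GTX 1660:  ", gtx_1660)
print("Lost CUDA cores:", full_tu116["cuda_cores"] - gtx_1660["cuda_cores"])
```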

Board partners will undoubtedly choose a range of frequencies to differentiate their cards. However, the official base clock rate is 1,530 MHz, with a GPU Boost specification of 1,785 MHz. Both numbers are slightly higher than GeForce GTX 1660 Ti's clocks, although they cannot entirely compensate for the missing SMs.

Our Gigabyte GeForce GTX 1660 OC 6G sample maintained a steady 1,935 MHz through three runs of Metro: Last Light, about 90 MHz faster than the 1660 Ti we reviewed a few weeks ago. On paper, then, GeForce GTX 1660 offers up to 5 TFLOPS of FP32 performance and 10 TFLOPS of FP16 throughput.
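Those paper figures can be reproduced from the core count and the rated boost clock. The sketch below assumes the usual convention of counting a fused multiply-add as two floating-point operations per core per clock, which the article does not spell out; the 2x FP16 rate is taken from the text. The function name is illustrative.

```python
def peak_tflops(cuda_cores: int, clock_mhz: float, flops_per_core_per_clock: int = 2) -> float:
    """Theoretical peak throughput in TFLOPS.

    flops_per_core_per_clock defaults to 2, assuming one fused
    multiply-add per core per clock (a convention, not stated in the article).
    """
    return cuda_cores * flops_per_core_per_clock * clock_mhz * 1e6 / 1e12


fp32_rated = peak_tflops(1408, 1785)     # rated GPU Boost clock
fp16_rated = 2 * fp32_rated              # FP16 runs at 2x the FP32 rate
fp32_observed = peak_tflops(1408, 1935)  # clock our sample sustained

print(f"FP32 at 1,785 MHz: {fp32_rated:.2f} TFLOPS")
print(f"FP16 at 1,785 MHz: {fp16_rated:.2f} TFLOPS")
print(f"FP32 at 1,935 MHz: {fp32_observed:.2f} TFLOPS")
```

The observed-clock line is only a what-if: it shows how much headroom a sustained 1,935 MHz adds over the rated specification.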

Six 32-bit memory controllers give TU116 an aggregate 192-bit bus, which is populated by 8 Gb/s GDDR5 modules that push up to 192 GB/s. That is on par with GeForce GTX 1060 6GB and a 33% reduction compared to GeForce GTX 1660 Ti. Combined with the loss of two SMs, the move from GDDR6 to GDDR5 memory accounts for GeForce GTX 1660's lower performance compared to the 1660 Ti.
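The bandwidth numbers come from bus width times per-pin data rate. As a rough sketch: the 12 Gb/s GDDR6 rate used for the 1660 Ti below is an assumption inferred from its 288 GB/s figure on the same 192-bit bus, and the function name is made up for illustration.

```python
def memory_bandwidth_gbs(controllers: int, bits_per_controller: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bus width (bits) x per-pin rate (Gb/s) / 8 bits per byte."""
    bus_width_bits = controllers * bits_per_controller
    return bus_width_bits * data_rate_gbps / 8


gtx_1660 = memory_bandwidth_gbs(6, 32, 8)      # 8 Gb/s GDDR5, from the text
gtx_1660_ti = memory_bandwidth_gbs(6, 32, 12)  # 12 Gb/s GDDR6, inferred from 288 GB/s

reduction_pct = (1 - gtx_1660 / gtx_1660_ti) * 100

print(f"GTX 1660:    {gtx_1660:.0f} GB/s")
print(f"GTX 1660 Ti: {gtx_1660_ti:.0f} GB/s")
print(f"Reduction:   {reduction_pct:.0f}%")
```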

Each memory controller is associated with eight ROPs and a 256KB slice of L2 cache. In total, TU116 exposes 48 ROPs and 1.5MB of L2. GeForce GTX 1660's ROP count matches that of RTX 2060, which also uses 48 render outputs. But TU116's L2 cache slices are half the size of TU106's.
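The ROP and L2 totals scale with the number of memory controllers. The sketch below multiplies them out and, as a cross-check, doubles the slice size for TU106; treating RTX 2060 as having six active controllers is an assumption based on the 192-bit bus listed in the spec table.

```python
MEMORY_CONTROLLERS = 6        # six 32-bit controllers, from the text
ROPS_PER_CONTROLLER = 8       # from the text
TU116_L2_SLICE_KB = 256       # from the text
TU106_L2_SLICE_KB = 2 * TU116_L2_SLICE_KB  # TU116's slices are half the size of TU106's

total_rops = MEMORY_CONTROLLERS * ROPS_PER_CONTROLLER
tu116_l2_mb = MEMORY_CONTROLLERS * TU116_L2_SLICE_KB / 1024

# Assumption: RTX 2060 also runs six active controllers (192-bit bus in the spec table).
rtx_2060_l2_mb = MEMORY_CONTROLLERS * TU106_L2_SLICE_KB / 1024

print(f"TU116 ROPs: {total_rops}")
print(f"TU116 L2:   {tu116_l2_mb} MB")
print(f"RTX 2060 L2 (assuming six controllers): {rtx_2060_l2_mb} MB")
```

The 3 MB result for RTX 2060 lines up with the L2 figure in the spec table.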

Given its similarities to the GeForce GTX 1660 Ti, it is not surprising that the GeForce GTX 1660 is rated for the same 120W. Unfortunately, neither graphics card includes multi-GPU support. Nvidia continues to push the narrative that SLI is meant to drive higher absolute performance, rather than offering gamers a way to combine lower-end cards.


| | Gigabyte GeForce GTX 1660 OC 6G | GeForce GTX 1660 Ti | GeForce RTX 2060 FE | GeForce GTX 1060 FE | GeForce GTX 1070 FE |
|---|---|---|---|---|---|
| Architecture (GPU) | Turing (TU116) | Turing (TU116) | Turing (TU106) | Pascal (GP106) | Pascal (GP104) |
| CUDA Cores | 1408 | 1536 | 1920 | 1280 | 1920 |
| Peak FP32 Compute | 5 TFLOPS | 5.4 TFLOPS | 6.45 TFLOPS | 4.4 TFLOPS | 6.5 TFLOPS |
| Tensor Cores | N/A | N/A | 240 | N/A | N/A |
| RT Cores | N/A | N/A | 30 | N/A | N/A |
| Texture Units | 88 | 96 | 120 | 80 | 120 |
| Base Clock Rate | 1530 MHz | 1500 MHz | 1365 MHz | 1506 MHz | 1506 MHz |
| GPU Boost Rate | 1785 MHz | 1770 MHz | 1680 MHz | 1708 MHz | 1683 MHz |
| Memory Capacity | 6GB GDDR5 | 6GB GDDR6 | 6GB GDDR6 | 6GB GDDR5 | 8GB GDDR5 |
| Memory Bus | 192-bit | 192-bit | 192-bit | 192-bit | 256-bit |
| Memory Bandwidth | 192 GB/s | 288 GB/s | 336 GB/s | 192 GB/s | 256 GB/s |
| ROPs | 48 | 48 | 48 | 48 | 64 |
| L2 Cache | 1.5MB | 1.5MB | 3MB | 1.5MB | 2MB |
| TDP | 120W | 120W | 160W | 120W | 150W |
| Transistor Count | 6.6 billion | 6.6 billion | 10.8 billion | 4.4 billion | 7.2 billion |
| Die Size | 284 mm² | 284 mm² | 445 mm² | 200 mm² | 314 mm² |
| SLI Support | No | No | No | No | Yes (MIO) |
