NVIDIA Quadro RTX 5000
The World's First Ray Tracing GPU:
Shatter the boundaries of what’s possible with NVIDIA® Quadro RTX™ 5000. Powered by the NVIDIA Turing™ architecture and the NVIDIA RTX™ platform, it fuses ray tracing, deep learning and advanced shading to supercharge next-generation workflows. Creative and technical professionals can make more informed decisions faster and tackle demanding design and visualization workloads with ease.
Turing GPU Architecture:
Built on a state-of-the-art 12nm FFN (FinFET NVIDIA) high-performance manufacturing process customized for NVIDIA and incorporating 3072 CUDA cores, the Quadro RTX 5000 GPU is the most powerful computing platform for HPC, AI, VR and graphics workloads on professional desktops. The Turing GPU architecture enables the biggest leap in real-time computer graphics rendering since NVIDIA’s invention of programmable shaders in 2001. It includes 13.6 billion transistors on a die size of 545 mm². Able to deliver more than 11.2 TFLOPS of single-precision (FP32), 22.3 TFLOPS of half-precision (FP16), 178.4 TOPS of integer-precision (INT8), and 89.2 TFLOPS of tensor operation capability, it supports a wide range of compute-intensive workloads.
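The FP32 and FP16 figures above follow from simple arithmetic: each CUDA core can retire one fused multiply-add (two floating-point operations) per clock. The sketch below reproduces them under stated assumptions. The 1.815 GHz boost clock is not given on this page; it is back-derived from the quoted TFLOPS figures. `peak_tflops` is a hypothetical helper name.

```python
# Rough peak-throughput arithmetic for the FP32/FP16 figures quoted above.
CUDA_CORES = 3072           # from the specifications
ASSUMED_BOOST_GHZ = 1.815   # assumption: inferred from the quoted TFLOPS, not stated on this page


def peak_tflops(cores, clock_ghz, ops_per_core_per_clock=2):
    """Peak throughput in TFLOPS; one FMA counts as 2 operations."""
    return cores * ops_per_core_per_clock * clock_ghz / 1000.0


# FP32: one FMA (2 operations) per core per clock.
fp32 = peak_tflops(CUDA_CORES, ASSUMED_BOOST_GHZ)
# FP16: quoted at twice the FP32 rate, modeled here as 4 operations per core per clock.
fp16 = peak_tflops(CUDA_CORES, ASSUMED_BOOST_GHZ, ops_per_core_per_clock=4)

print(f"FP32 ~ {fp32:.1f} TFLOPS, FP16 ~ {fp16:.1f} TFLOPS")
```

These are theoretical peaks; sustained application throughput depends on clocks, memory traffic and the workload mix.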
RT Cores:
New dedicated hardware-based ray-tracing technology allows the GPU, for the first time, to render film-quality, photorealistic objects and environments in real time with physically accurate shadows, reflections, and refractions. The real-time ray-tracing engine works with NVIDIA OptiX, Microsoft DXR, and Vulkan APIs to deliver a level of realism far beyond what is possible using traditional rendering techniques. RT Cores accelerate Bounding Volume Hierarchy (BVH) traversal and ray casting functions using a low number of rays cast through each pixel.
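To make the BVH traversal step concrete: at each node of the hierarchy, a ray is tested against an axis-aligned bounding box, and only boxes that are hit are descended into. The sketch below shows the classic "slab" ray/box test in software. It is an illustration of the kind of test involved, not a description of how the RT Core hardware is implemented; `ray_hits_aabb` is a hypothetical name.

```python
def ray_hits_aabb(origin, direction, box_min, box_max):
    """Slab test: does the ray origin + t*direction (t >= 0) hit the box?"""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this pair of planes: it must start between them.
            if o < lo or o > hi:
                return False
            continue
        t0 = (lo - o) / d
        t1 = (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        # Shrink the interval of t values that lie inside every slab so far.
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return False
    return True


unit_box = ((-1.0, -1.0, -1.0), (1.0, 1.0, 1.0))
print(ray_hits_aabb((0.0, 0.0, -5.0), (0.0, 0.0, 1.0), *unit_box))  # ray aimed at the box
print(ray_hits_aabb((3.0, 0.0, -5.0), (0.0, 0.0, 1.0), *unit_box))  # ray passing beside it
```

A BVH traversal repeats this test down the tree, so a ray that misses a parent box skips all of the geometry inside it.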
Enhanced Tensor Cores:
New mixed-precision cores are purpose-built for deep learning matrix arithmetic, delivering 8x the TFLOPS for training compared to the previous generation. Quadro RTX 5000 has 384 Tensor Cores; each Tensor Core performs 64 floating-point fused multiply-add (FMA) operations per clock, and each SM performs a total of 1024 individual floating-point operations per clock. In addition to supporting FP16/FP32 matrix operations, the new Tensor Cores add INT8 (2048 integer operations per clock) and experimental INT4 and INT1 (binary) precision modes for matrix operations.
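The per-clock numbers above can be cross-checked against the headline tensor figures. The sketch below assumes 48 SMs (one per listed RT Core) and reads the "2048 integer operations per clock" as a per-SM figure; both are assumptions, as is the clock, which is derived from the quoted 89.2 TFLOPS rather than stated on this page.

```python
# Cross-check of the Tensor Core figures quoted above.
TENSOR_CORES = 384
FMA_PER_TENSOR_CORE_PER_CLOCK = 64
QUOTED_TENSOR_TFLOPS = 89.2
SM_COUNT = 48                      # assumption: one SM per listed RT Core
INT8_OPS_PER_SM_PER_CLOCK = 2048   # assumption: the quoted INT8 rate is per SM

tensor_cores_per_sm = TENSOR_CORES // SM_COUNT
# One FMA counts as two floating-point operations.
flops_per_sm_per_clock = tensor_cores_per_sm * FMA_PER_TENSOR_CORE_PER_CLOCK * 2

# Clock implied by the quoted tensor throughput (not a published value here).
flops_per_clock = TENSOR_CORES * FMA_PER_TENSOR_CORE_PER_CLOCK * 2
implied_clock_ghz = QUOTED_TENSOR_TFLOPS * 1e12 / flops_per_clock / 1e9

# INT8 throughput at the same implied clock.
int8_tops = SM_COUNT * INT8_OPS_PER_SM_PER_CLOCK * implied_clock_ghz / 1000.0

print(tensor_cores_per_sm, flops_per_sm_per_clock)
print(f"implied clock ~ {implied_clock_ghz:.3f} GHz, INT8 ~ {int8_tops:.1f} TOPS")
```

Under these assumptions the INT8 result comes out at twice the tensor FP16 rate, which lines up with the 178.4 TOPS listed in the specifications.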
Advanced Shading Technologies:
Mesh Shading: Compute-based geometry pipeline to speed geometry processing and culling on geometrically complex models and scenes. Mesh shading provides up to 2x performance improvement on geometry-bound workloads.
Variable Rate Shading (VRS): Gain rendering efficiency by varying the shading rate based on scene content, direction of gaze, and motion. Variable rate shading provides similar image quality with 50% reduction in shaded pixels.
Texture Space Shading: Object/texture space shading to improve the performance of pixel shader-heavy workloads such as depth-of-field and motion blur. Texture space shading provides greater throughput with increased fidelity by reusing pre-shaded texels for pixel-shader heavy VR workloads.
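As a rough illustration of where a figure like the 50% reduction in shaded pixels can come from, the sketch below estimates shading cost when parts of the frame are shaded at a coarser rate. The region fractions and rates are made-up inputs for illustration, not measurements, and `relative_shading_cost` is a hypothetical helper.

```python
def relative_shading_cost(regions):
    """Shading work relative to shading every pixel once.

    regions: list of (screen_fraction, rate_x, rate_y), where a 2x2 rate
    means one shader invocation covers a 2x2 block of pixels.
    """
    total_fraction = sum(fraction for fraction, _, _ in regions)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("screen fractions must add up to 1")
    return sum(fraction / (rate_x * rate_y) for fraction, rate_x, rate_y in regions)


# Hypothetical frame: one third shaded at full rate (for example, where the
# viewer is looking), two thirds shaded once per 2x2 block.
example = [(1 / 3, 1, 1), (2 / 3, 2, 2)]
print(f"{relative_shading_cost(example):.2f} of full-rate shading work")
```

The actual saving in an application depends on how much of each frame can tolerate a coarser rate.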
High Performance GDDR6 Memory:
Built with Turing’s vastly optimized 16 GB GDDR6 memory subsystem for the industry’s fastest graphics memory (448 GB/s peak bandwidth), Quadro RTX 5000 is the ideal platform for latency-sensitive applications handling large datasets. Quadro RTX 5000 delivers over 50% more memory bandwidth than the previous generation.
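Peak memory bandwidth is the bus width multiplied by the per-pin data rate. The sketch below works backwards from the 448 GB/s figure and the 256-bit memory interface listed in the specifications; the per-pin rate it derives is an inference, not a value stated on this page, and the helper names are hypothetical.

```python
BUS_WIDTH_BITS = 256        # memory interface, from the specifications
PEAK_BANDWIDTH_GB_S = 448   # quoted peak bandwidth


def bandwidth_gb_s(bus_width_bits, pin_rate_gbps):
    """Peak bandwidth in GB/s: bits per transfer times Gbit/s per pin, over 8 bits per byte."""
    return bus_width_bits * pin_rate_gbps / 8


def implied_pin_rate_gbps(bus_width_bits, bandwidth):
    """Per-pin data rate (Gbit/s) needed to reach the given bandwidth in GB/s."""
    return bandwidth * 8 / bus_width_bits


pin_rate = implied_pin_rate_gbps(BUS_WIDTH_BITS, PEAK_BANDWIDTH_GB_S)
print(f"implied per-pin rate: {pin_rate} Gbit/s")
print(f"check: {bandwidth_gb_s(BUS_WIDTH_BITS, pin_rate)} GB/s")
```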
NVIDIA GPU BOOST 4.0:
Automatically maximizes application performance without exceeding the power and thermal envelope of the card. It allows applications to stay in the boost clock state longer, up to a higher temperature threshold, before dropping to the base clock at a secondary temperature setting. This feature requires implementation by software applications and is not a stand-alone utility. Please contact quadrohelp@nvidia.com for details on availability.
Advanced Streaming Multiprocessor (SM) Architecture:
A combined shared memory and L1 cache improves performance significantly while simplifying programming and reducing the tuning required to attain the best application performance. Each SM contains 96 KB of L1/shared memory, which can be configured for various capacities depending on the compute or graphics workload. For compute workloads, up to 64 KB can be allocated to either the L1 cache or shared memory, while graphics workloads can allocate up to 48 KB for shared memory, 32 KB for L1 and 16 KB for texture units. Combining the L1 data cache with the shared memory reduces latency and provides higher bandwidth.
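The splits described above can be written down as a small table and checked against the 96 KB total. The two compute entries assume that whatever is not given to one side (up to 64 KB) goes to the other; that reading, and the configuration names, are assumptions made for illustration.

```python
UNIFIED_KB = 96  # L1/shared memory per SM, as described above

# Hypothetical labels; the compute splits assume the remainder goes to the other side.
CONFIGS = {
    "compute, L1-heavy": {"l1": 64, "shared": 32},
    "compute, shared-heavy": {"l1": 32, "shared": 64},
    "graphics": {"l1": 32, "shared": 48, "texture": 16},
}


def uses_full_capacity(split, total_kb=UNIFIED_KB):
    """True if the parts of a split add up to the unified capacity."""
    return sum(split.values()) == total_kb


for name, split in CONFIGS.items():
    print(f"{name}: {split} -> adds up to {UNIFIED_KB} KB: {uses_full_capacity(split)}")
```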
Mixed-Precision Computing:
Double the throughput and reduce storage requirements with 16-bit floating point precision computing to enable the training and deployment of larger neural networks. With independent parallel integer and floating-point data paths, the Turing SM is also much more efficient on workloads with a mix of computation and addressing calculations.
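The storage half of this claim is easy to see with Python's standard `struct` module, which supports IEEE 754 half precision through the `e` format code. The sketch below compares the bytes needed for one million values (an arbitrary example size) and shows that the saving comes with reduced precision; `storage_bytes` is a hypothetical helper.

```python
import struct


def storage_bytes(n_values, fmt):
    """Bytes needed to store n_values using the given struct format code."""
    return n_values * struct.calcsize(fmt)


N = 1_000_000  # arbitrary example size, e.g. one million network weights
fp32_bytes = storage_bytes(N, "<f")  # 32-bit float
fp16_bytes = storage_bytes(N, "<e")  # 16-bit float

# Round-trip a value through FP16 to see the precision that is given up.
roundtrip = struct.unpack("<e", struct.pack("<e", 0.1))[0]

print(f"FP32: {fp32_bytes} bytes, FP16: {fp16_bytes} bytes")
print(f"0.1 stored as FP16 reads back as {roundtrip}")
```

This is why mixed-precision training typically keeps some quantities in FP32 while storing and multiplying most values in FP16.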
Error Correcting Code (ECC) on Graphics Memory:
Meet strict data integrity requirements for mission critical applications with uncompromised computing accuracy and reliability for workstations.
Graphics Preemption:
Pixel-level preemption provides more granular control to better support time-sensitive tasks such as VR motion tracking.
Compute Preemption:
Preemption at the instruction-level provides finer grain control over compute tasks to prevent long-running applications from either monopolizing system resources or timing out.
H.264 and HEVC Encode/Decode Engines:
Deliver faster than real-time performance for transcoding, video editing, and other encoding applications with two dedicated H.264 and HEVC encode engines and a dedicated decode engine that are independent of the 3D/compute pipeline.
Single Instruction, Multiple Thread (SIMT):
New independent thread scheduling capability enables finer-grain synchronization and cooperation between parallel threads by sharing resources among small jobs.
Specifications:
CUDA Parallel Processing Cores | 3072
NVIDIA Tensor Cores | 384
NVIDIA RT Cores | 48
Frame Buffer Memory | 16 GB GDDR6
RTX-OPS | 62T
Rays Cast | 8 Giga Rays/Sec
Peak Single Precision (FP32) Performance | 11.2 TFLOPS
Peak Half Precision (FP16) Performance | 22.3 TFLOPS
Peak Integer Operation (INT8) Performance | 178.4 TOPS
Deep Learning TeraFLOPS | 89.2 TFLOPS
Memory Interface | 256-bit
Memory Bandwidth | 448 GB/s
Max Power Consumption | 265 W
Graphics Bus | PCI Express 3.0 x16
Display Connectors | DP 1.4 (4) + VirtualLink (1)
Form Factor | 4.4" H x 10.5" L, Dual Slot
Product Weight | 972 g
Thermal Solution | Active
NVIDIA® 3D Vision® and 3D Vision Pro | Support via 3-pin mini DIN
Frame Lock | Compatible (with Quadro Sync II)
NVLink Interconnect | 50 GB/s
Online prices, specifications, descriptions and images generally match the actual product, but may vary and are subject to change without notice. GTstore.pk cannot be held liable for errors or omissions.
Please verify the specifications at the time of purchase. No claim will be accepted for a specification error or mismatch once the product is sold.