If you have ever ran a LLM inside the RAM only of a computer it is disastrously slow. To remedy this required very large amounts of RAM on the GPU, and Nvidia seeing this made sure to charge appropriately.The bottleneck every time is the slow speed of the PCIe