What is a graphics processing unit (GPU) and how does it work?

Graphics processing units (GPUs) play an integral role in computing today. As artificial intelligence (AI) and machine learning applications become more ubiquitous, the need for dedicated GPU hardware continues to grow. For this reason, custom hardware with built-in GPUs from Liquid Web is just a call away.

Liquid Web allows customers to order custom GPU servers to meet their specific needs, particularly for AI and large language models that require a significant amount of video random-access memory (VRAM). Let’s explore what exactly a GPU is, how it works, and why companies like Liquid Web provide tailored solutions around GPU servers.

Key takeaways

Here are the key topics you will come across reading this article:

  • The answer to a common question for online businesses — what is a graphics processing unit?
  • How GPUs work
  • What GPUs are used for
  • The GPU inside a computer, explained
  • GPU servers for web hosting, described in detail
  • The process for submitting a custom GPU server order to Liquid Web
  • A review of GPU server configuration options
  • Answers to other frequently asked GPU questions

What is a graphics processing unit?

A graphics processing unit — or GPU for short — is a specialized circuit designed to quickly process and manipulate computer graphics and image data in parallel. The GPU offloads these compute-intensive tasks from the central processing unit (CPU), allowing the CPU to handle more general computational workloads.

Unlike CPUs, which have just a few cores optimized for sequential serial processing, GPUs are built for parallel processing and contain thousands of smaller, more efficient cores. This architecture makes them dramatically faster than CPUs at certain workloads, such as graphics rendering, video editing, cryptocurrency mining, and AI computing.

To visualize the difference, think of ordering dinner at a restaurant. A CPU is like a single waiter responsible for taking orders, submitting them to the kitchen, serving dishes, clearing plates, and processing payments. They do each task one by one in a serial fashion. A GPU is like an entire waitstaff working as a parallel team — multiple staff taking orders simultaneously, another team serving dishes as soon as the kitchen has them ready, another clearing dishes from tables as customers finish, etc.

By leveraging many smaller cores in parallel, GPUs achieve tremendous speed and efficiency gains on specialized computational workloads. Next, we’ll explore exactly why they excel at these types of tasks compared to general-purpose CPUs.
Image Source: Understanding GPU Architecture > GPU Characteristics > Design: GPU vs. CPU (Cornell University)
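The restaurant analogy maps neatly onto data-parallel programming. The following Python sketch (a CPU-side analogy for illustration, not actual GPU code) contrasts a serial element-by-element loop with a single vectorized operation over the whole array, the same throughput-over-latency idea that GPUs take to the extreme:

```python
import time

import numpy as np

n = 500_000
a = np.random.rand(n)
b = np.random.rand(n)

# Serial: one "waiter" handling each element in turn.
start = time.perf_counter()
serial = [a[i] * b[i] for i in range(n)]
serial_time = time.perf_counter() - start

# Data-parallel: one vectorized operation applied across the whole array,
# analogous to many cores executing the same instruction on different data.
start = time.perf_counter()
vectorized = a * b
vector_time = time.perf_counter() - start

print(f"serial: {serial_time:.4f}s, vectorized: {vector_time:.4f}s")
```

Both approaches compute identical results; only the execution model differs, which is exactly why GPUs shine on workloads that apply the same operation to huge batches of data.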

How does a GPU work?

Once you learn what a graphics processing unit is, the next question that usually comes to mind is how a GPU works.

Fundamentally, GPUs are designed with parallelism and FLOPS (floating-point operations per second) in mind. By packing in many small, efficient shader cores, they emphasize throughput over low latency. Some high-end GPUs, like the NVIDIA A100, contain up to 6,912 CUDA cores.

These thousands of cores enable multiple arithmetic logic units (ALUs) to process shader, texture, vertex, and pixel data simultaneously. Stream processors handle specialized operations appropriate for graphics rendering (matrix multiplies, texture mapping, polygon rasterization, etc.).

As data moves through the GPU pipeline it gets distributed across these myriad simpler cores, unlike a CPU where sequential tasks hit more complex cores one by one. Executing identical instructions concurrently allows GPUs to speed up parallel workloads tremendously.

Modern GPU architecture contains additional enhancements like fast shared memory for low-latency data access, constant memory for read-only data, and configurable L1 and L2 caches for improved workload handling. Advancements like the NVIDIA Ampere architecture also introduce third-generation Tensor Cores for extremely fast deep learning performance.

Under the hood, GPU architecture looks fundamentally different from CPU architecture, even though both use silicon wafers for fabrication. The difference lies in optimizing GPU design to handle highly parallelized tasks like graphics rendering much more efficiently. Next, we’ll explore some real-world applications that leverage these GPU strengths.

What is a GPU used for?

There are a variety of compute-intensive workloads that rely heavily on GPU hardware for performance gains. We are going to review those different applications of GPU-optimized hardware in the next sections.

Gaming/graphics

The most apparent application is gaming and 3D graphics rendering. GPUs allow video games to render complex 3D environments and high-resolution textures in real time, enabling immersive gameplay. GPUs handle transformations, vertex generation, pixel shaders, antialiasing, physics simulations, and other graphics pipeline operations in parallel.

High-end GPUs enable gamers to play titles across multiple monitors or virtual reality headsets at extremely high frame rates and resolutions for super smooth experiences. Desktop users also leverage powerful GPUs for video editing, CAD engineering simulations, graphic design applications, and more.

Artificial intelligence (AI)

In recent years GPUs have become crucial for artificial intelligence and deep learning applications since they excel at matrix math computations that underlie machine learning algorithms. Frameworks like TensorFlow and PyTorch rely extensively on GPU acceleration for fast model training.

From natural language processing (NLP) to computer vision, nearly all modern AI workloads harness GPU power. Large transformer models used for state-of-the-art NLP tasks can easily exceed 32 GB VRAM capacity during training. This kind of VRAM requirement will continue expanding as models grow bigger.
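To see why VRAM fills up so quickly during training, a rough back-of-envelope estimate helps. The multiplier in this sketch is a common rule of thumb covering weights, gradients, and optimizer state (activation memory, which is workload-dependent, is excluded); the numbers are illustrative assumptions, not precise figures:

```python
def estimate_training_vram_gb(num_params, bytes_per_param=2, state_multiplier=4):
    """Rough VRAM estimate for training a model.

    bytes_per_param=2 assumes fp16 weights; state_multiplier=4 is a
    rule-of-thumb factor covering weights, gradients, and optimizer
    state (an illustrative assumption). Activations are excluded.
    """
    return num_params * bytes_per_param * state_multiplier / 1024**3


# A hypothetical 7-billion-parameter transformer easily exceeds a
# single 32 GB card under this estimate:
print(f"~{estimate_training_vram_gb(7e9):.0f} GB of VRAM needed")
```

Even this conservative estimate lands well above 32 GB, which is why large models are trained across multiple GPUs or on accelerators with very large memory capacities.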

Scientific computing

Beyond gaming and graphics, GPU parallelism works wonders for scientific computations involving complex math operations. Molecular dynamics simulations, astrophysics, quantum chemistry, financial risk modeling, and weather forecasting all utilize GPU horsepower for tremendous performance benefits.

Researchers now offload supercomputing cluster workloads to remote GPU cloud servers since innovations like NVIDIA’s HGX platform offer cutting-edge capabilities. Google Cloud, Microsoft Azure, Amazon Web Services (AWS), and Oracle Cloud all provide GPU-based cloud instance options to meet this demand.

As shown above, GPUs serve critical roles across many industries and applications today. Up next, we’ll explore how they differ from integrated graphics processors commonly found in desktop and laptop systems.

What is a GPU in a computer?

There are two types of graphics processors in personal computers: integrated GPUs (iGPUs), which are built directly into the CPU chip itself, and dedicated GPUs (dGPUs), which operate as separate components. iGPUs leverage shared system memory, whereas dGPUs come equipped with dedicated high-speed graphics memory (VRAM). Let’s compare iGPUs versus dGPUs at a high level.

Most laptops rely on energy-efficient iGPUs that lack the horsepower for intensive gaming or computational tasks. But they easily handle basic video playback, desktop effects, and 2D workloads while conserving battery life.

Discrete GPUs — also known as dedicated GPUs (dGPUs) — offer substantially higher performance and VRAM capacity but consume more power. Gaming PCs and workstations designed for graphics-heavy applications typically incorporate powerful dedicated NVIDIA or AMD dGPUs with the latest architectures (Ampere, RDNA 2, etc.) and upwards of 8 to 24 GB VRAM. Components like video cards plug into PCI-E slots on the motherboard. Many high-end GPU products require supplementary power connectors to satisfy their significant energy needs under full load.

Server platforms especially need PCI-E GPUs for accelerated computing since CPU resources alone prove insufficient for applications like AI and machine learning at scale. Cloud platforms like AWS EC2 incorporate server-grade Tesla GPU options as customer instances configurable during provisioning. However, smaller hosting providers often rely on custom orders to fulfill GPU server requests due to associated hardware costs. Next, we’ll explore the concept of GPU servers more closely.

What is a GPU server?

A GPU server refers to a high-powered rackmount server equipped with one or more PCI-E GPU(s) as dedicated coprocessors for massively parallel workloads. These units typically utilize server-grade components for enhanced reliability at scale compared to desktop PC hardware.

System designers tightly integrate GPUs with server-grade CPUs (Intel Xeon, AMD EPYC), ample DDR4 memory (from 256 GB to greater than 1.5 TB), fast NVMe solid-state drives, and high-bandwidth networking adapters purpose-built for data center usage. NVIDIA and AMD both offer products specialized for the server space, including Tesla GPUs and Instinct MI accelerators, respectively.

GPU servers aim to solve demanding computational challenges like accelerated deep learning, high-frequency financial trading, seismic imaging, molecular dynamics, and weather modeling. Key metrics for these servers include TFLOPS performance, GPU memory capacity, NVLink interconnect bandwidths, and power consumption since data center costs run high.

Top-tier offerings like NVIDIA DGX platforms provide complete solutions containing eight or more A100 GPUs networked via NVLink and PCIe interfaces. Combined throughput reaching into the petaFLOPS range allows large ML models to train rapidly. Turnkey DGX servers prove popular albeit expensive, leading many organizations to investigate custom solutions instead.
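As a rough sanity check on those numbers, aggregate peak throughput scales approximately linearly with GPU count. The sketch below assumes the vendor-quoted 312 TFLOPS dense FP16 Tensor Core peak for a single A100; real sustained throughput is workload-dependent and lower:

```python
def aggregate_tflops(num_gpus, tflops_per_gpu=312):
    """Back-of-envelope aggregate peak throughput for a multi-GPU server.

    tflops_per_gpu=312 is the A100's dense FP16 Tensor Core peak
    (a vendor-quoted figure used here as an illustrative assumption).
    """
    return num_gpus * tflops_per_gpu


# An eight-GPU server reaches roughly 2.5 petaFLOPS of peak FP16 throughput:
print(f"{aggregate_tflops(8)} TFLOPS total")
```

This is why eight-GPU DGX-class systems are described as petaFLOPS-range machines.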

Submitting a custom order for a GPU server with Liquid Web

Liquid Web gives customers the flexibility to request fully customized GPU servers tailored to unique requirements. However, GPU hardware allotments remain limited in supply. Customers must submit detailed specifications that help guide the procurement process.

A Liquid Web Product Administrator sheds some light on the workflow:

“Any server with a GPU requires a custom build. You will need to gather as much info as possible from the customer and then check stock with our Purchasing and CapEx Inventory Manager.”

According to a well-respected Product Manager at Liquid Web:

“A single GPU is less of an issue in terms of hardware accommodations, but it will still need a full 3U server chassis and in many cases might take a 4U server chassis.”

For readers less familiar with these chassis designations, a 3U server chassis is a rackmount case that occupies three rack units in a server rack (hence the “3U” designation), while a 4U chassis occupies four. Each rack unit (U) is 1.75 inches in height, so a 2U chassis is 3.5 inches tall and a 4U chassis is 7 inches tall, for example.
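The rack unit arithmetic above can be captured in a couple of lines of Python:

```python
RACK_UNIT_INCHES = 1.75  # height of one rack unit (U), per the standard rack spec


def chassis_height_inches(rack_units):
    """Convert a chassis size in rack units (U) to its height in inches."""
    return rack_units * RACK_UNIT_INCHES


for u in (2, 3, 4):
    print(f"{u}U chassis: {chassis_height_inches(u)} inches tall")
```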

Given supply constraints, Liquid Web relies on case-by-case vetting before fulfilling special orders. Users must clearly communicate desired workloads, GPU types being considered, memory and storage needs, and favorable server configurations to assist the custom build process.

Since modern AI and machine learning applications perform best with ample GPU memory, specifying VRAM requirements proves essential. As applications grow increasingly complex, be sure to request at least 12 GB of VRAM (with more than 12 GB often being optimal). Working through these details with a knowledgeable Liquid Web customer representative helps get you the ideal GPU server tailored to your workload specifications.

GPU server options from competitors

Many cloud hosting providers offer flexible machine configurations during server provisioning. However, GPU-enabled options appear scarce among smaller vendors. Larger players like Amazon with its AWS platform provide thousands of EC2 instance types including GPU choices.

Still, the preconfigured nature of off-the-shelf server products limits customization flexibility, forcing users into makeshift solutions when application demands evolve. Competitors like OVHcloud accommodate GPU needs through more tailored ordering and on-demand accelerators. Let’s analyze a few alternatives in this space.

OVHcloud

OVHcloud captures business via strong GPU-centric messaging across product pages — their GPU portfolio page details multiple NVIDIA GPU server options configurable with T4 and A100 model adapters. They outline server specifications like 12-core Xeon CPUs, up to 1.5 TB RAM, 8 TB NVMe storage, and 10/25/40 Gbit/s network adapters. OVHcloud GPU offerings range from T4 models (16 GB VRAM) up to 8x A100 servers packing 320 GB VRAM and 18,400 CUDA cores for intensive AI workloads.

OVHcloud allows customers access to industry-leading TFLOPS performance without large upfront capital expenses. They represent stiff competition in the dedicated GPU server hosting space.

DigitalOcean

DigitalOcean acquired cloud machine learning startup Paperspace in 2023 to target the AI computing segment. They offer GPU-enabled droplets on an hourly/monthly billing cycle backed by standardized templates that simplify the provisioning process.

Both standard and optimized machine types allow one to four GPU attachments per droplet. Options range from entry-level K80 adapters up to advanced V100 models packing from 16 to 32 GB VRAM. DigitalOcean mirrors Paperspace’s straightforward pricing for the different tiers. Overall, they provide solid self-serve capabilities, but less flexibility than Liquid Web custom options.

Other frequently asked GPU questions

Transitioning to GPU computing comes with a learning curve. Aside from “What is a graphics processing unit?” many first-timers have questions about ideal setup configurations, hardware compatibility, expected costs, and more. Let’s review some common questions customers ask when ordering GPU servers.

What GPU card specification should I request for machine learning?

For modern machine learning (ML) training, choose an NVIDIA RTX A6000 or, preferably, an A100. These feature high-speed memory (GDDR6 on the A6000, HBM2 on the A100), third-generation Tensor Cores, and NVLink interconnects well suited to neural network model development. If cost proves prohibitive, previous-generation cards like the Tesla V100 or Quadro RTX 8000 can suffice.

What CPU works best with an NVIDIA GPU?

AMD Threadripper Pro or Intel Xeon Gold processors prove very popular for GPU servers running intensive workloads. Optimized core/thread counts, RAM specs, PCIe lane bandwidth, and fast clock speeds pair nicely with data center GPUs. Multiple CPUs can communicate over interconnects like Intel’s UPI, while NVIDIA NVLink connects GPUs directly for blistering multi-GPU performance.

How much RAM do I need on a GPU server for large language models?

Aim for 256 GB minimum, but 512 GB or even 1 or 2 TB of system RAM provides plenty of headroom as model complexity expands over time. Plus, multiple GPUs require ample memory to prevent bottlenecks when data moves across PCIe. High-bandwidth ECC RAM helps fully utilize all those GPU shader cores.

What Power Supply Unit (PSU) wattage should I use with multiple RTX 3090 cards?

Plan for a 1500W or higher 80 Plus Gold PSU, budgeting at least 350W per RTX 3090 added to your server. Power spikes can trigger shutdowns on weaker PSUs. If possible, spread cards across multiple rails and use supplementary 8-pin (6+2) PCIe power cables for peak stability. Efficiency optimization keeps operating costs lower.
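The wattage guidance above can be sketched as a simple sizing helper. The 350W per-card figure is the RTX 3090’s rated board power; the base system draw and headroom factor are illustrative assumptions covering CPU, drives, fans, and transient spikes:

```python
def recommended_psu_watts(num_gpus, watts_per_gpu=350,
                          base_system_watts=300, headroom=1.2):
    """Sketch of PSU sizing for a multi-GPU build.

    watts_per_gpu=350 matches the RTX 3090's rated board power.
    base_system_watts and headroom are illustrative assumptions for
    the rest of the system and for transient power spikes.
    """
    raw_draw = num_gpus * watts_per_gpu + base_system_watts
    return raw_draw * headroom


# Three RTX 3090s under these assumptions land above the 1500W guideline:
print(f"{recommended_psu_watts(3):.0f}W recommended")
```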

Please reach out to a Liquid Web sales specialist with any additional questions when preparing to order your custom GPU server. We provide tailored solutions to accommodate desired workloads.

Wrapping up — what is a graphics processing unit (GPU)?

We’ve just explored graphics processing units more closely, including GPU architecture, typical applications, server configurations, and frequently asked purchase questions.

Even non-technical consumers increasingly understand that GPUs provide tremendous parallel processing power, vastly exceeding CPU capabilities for specialized workloads like machine learning, cryptocurrency mining, genomics analysis, and graphics rendering. Dedicated GPU servers now operate as crucial data center infrastructure across many industries. Their popularity will only increase.

GPU demand rises significantly as emerging technologies like large AI models and the metaverse evolve. Liquid Web lets customers obtain fully customized GPU servers tailored to unique business requirements. Our expert team handles procurement, installation, testing, and deployment to exceed expectations.

Get started with Liquid Web

Now that you know the answer to the question in this post’s title, feel free to reach out to our technical team to discuss building your ideal GPU server platform. Liquid Web drives customer success through tailored IT solutions that rigid one-size-fits-all providers cannot match. Allow us to earn your business by collaborating closely to achieve shared goals.

Take a look at our hosting plans — or chat with a solutions advisor regarding hardware solutions with GPUs.
