Posted on February 4, 2014 by Adam Biessener

There are many ways to measure performance, and frames per second is the most common and intuitive. It makes sense: a higher framerate is better. The numbers are easy to wrap your head around: 60 is great, 30 is okay, and anything lower than that starts to look jittery. For all its advantages as a metric, though, FPS is in many ways an imperfect and overly general measure of performance.

As engine developers, we care about how much “stuff” games that use our code can do on a given set of hardware. The most elementary unit of computing in the context of rendering engines is a “batch,” or a single command sent from the CPU to the GPU. When it comes to strategy games, or any game that puts a lot of objects onscreen at once, the number of batches an engine can push per frame (or per millisecond, which is easier to measure and more consistent) is hugely important.

For example, consider a typical Star Swarm scene (hugely simplified for example purposes, naturally). If we have 5,000 units on screen, and each one takes four batches to render, we need 20,000 batches to render a single frame. With current graphics APIs, currently available rendering engines are unlikely to sustain that many batches at an acceptable framerate while also running a full-scale game simulation. Ten thousand batches per frame is realistic for an extremely high-performing engine in the current state of the art. This leaves us two options: cut the number of units on screen in half, or cut the batches per unit in half by doing things like drastically simplifying the lighting, removing shadows, or aggressively reducing draw distance.
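The arithmetic in this example can be checked with a few lines of Python. This is only an illustrative sketch: the function and variable names are made up here, and the figures are the simplified ones from the paragraph above, not measurements.

```python
def batches_per_frame(units: int, batches_per_unit: int) -> int:
    """Total batches the CPU must send to the GPU to draw every unit once."""
    return units * batches_per_unit


# Simplified figures from the example scene.
UNITS = 5_000
BATCHES_PER_UNIT = 4
BUDGET = 10_000  # per-frame batch count quoted as realistic for a very fast engine

needed = batches_per_frame(UNITS, BATCHES_PER_UNIT)
over_budget_factor = needed / BUDGET

# The two ways out: halve the units, or halve the batches each unit costs.
with_fewer_units = batches_per_frame(UNITS // 2, BATCHES_PER_UNIT)
with_cheaper_units = batches_per_frame(UNITS, BATCHES_PER_UNIT // 2)

print(f"needed: {needed}, budget: {BUDGET}, factor over budget: {over_budget_factor}")
print(f"half the units: {with_fewer_units}, half the batches per unit: {with_cheaper_units}")
```

Either option lands exactly on the 10,000-batch budget, which is why the trade-off is framed as halving one or the other.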

Nitrous was designed so that users and developers would not have to make those kinds of sacrifices. Star Swarm pushes 5,000 to 10,000 units in the simulation at framerates north of 30 frames per second, rendering scenes in real time that go well beyond what gamers expect out of a modern graphics engine.

When you look at it this way, it’s clear why the “batches per millisecond” metric matters a lot more to us as engine developers than a raw framerate, which could have any number of factors affecting it. One important way we increase our batches/ms score is through Nitrous’ aggressively multi-threaded design, which allows us to process rendering commands on as many cores as the user has.
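To see how a per-frame batch count relates to batches per millisecond, here is a rough conversion. It assumes, purely for illustration, that batches are submitted evenly across the whole frame; the function name is hypothetical, and the 20,000-batch figure is the simplified example scene from earlier in the post.

```python
def batches_per_ms(batches_per_frame: int, fps: float) -> float:
    """Average submission rate needed to fit a frame's batches into one frame.

    A frame lasts 1000 / fps milliseconds, so the required rate is
    batches_per_frame / (1000 / fps), written here as a single expression.
    """
    return batches_per_frame * fps / 1000.0


# The simplified 20,000-batch scene at two target framerates.
rate_at_30 = batches_per_ms(20_000, 30)
rate_at_60 = batches_per_ms(20_000, 60)

print(f"30 fps needs {rate_at_30} batches/ms")
print(f"60 fps needs {rate_at_60} batches/ms")
```

In a real game, part of each frame goes to simulation and other work, so the rate an engine has to sustain while it is actually submitting batches would be higher than this average.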

[Chart: batches per millisecond under Mantle and DirectX 11 on a Core i7-980 with a Radeon 290]

This entire discussion about batches, CPUs, and GPUs circles back to the root cause of the huge performance gains Nitrous sees with a next-gen API like AMD’s Mantle. The above hardware configuration (Core i7-980, Radeon 290) pushes almost twice as many batches per millisecond under Mantle as it does under DirectX 11. This is due to the difference in how the game engine talks to the graphics driver; we’re able to leverage Nitrous’ multi-threaded capabilities (and therefore multiple CPU cores) far more effectively under a truly multi-threaded API like Mantle.
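The general idea of building rendering commands on several cores at once can be sketched as follows. This is not Nitrous or Mantle code: every name here is hypothetical, a "command" is just a tuple, and the four batches per unit come from the simplified example scene. It shows only the shape of the approach, in which each worker records its own command list and the lists are then handed over in a fixed order. In standard CPython the global interpreter lock keeps these threads from running Python code in parallel, so no speedup should be expected from this sketch itself.

```python
from concurrent.futures import ThreadPoolExecutor

BATCHES_PER_UNIT = 4  # assumption taken from the simplified example scene


def record_commands(unit_ids):
    """Build one command list (one entry per batch) for a slice of units."""
    return [(unit_id, batch) for unit_id in unit_ids for batch in range(BATCHES_PER_UNIT)]


def record_frame(unit_count, workers):
    """Split the units across workers and record one command list per slice."""
    chunk = -(-unit_count // workers)  # ceiling division
    slices = [
        range(start, min(start + chunk, unit_count))
        for start in range(0, unit_count, chunk)
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in slice order, so the final order is deterministic.
        return list(pool.map(record_commands, slices))


command_lists = record_frame(5_000, workers=4)
total_batches = sum(len(commands) for commands in command_lists)
print(f"{len(command_lists)} command lists, {total_batches} batches in total")
```

Keeping one command list per worker means the threads never write to shared state while recording, which is the property that lets this kind of work spread across cores.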

And that’s how we squeeze enough performance out of today’s hardware to start seriously considering things that used to be unrealistic for real-time applications, like bleeding-edge temporal anti-aliasing on a scene with 5,000 units fully rendered in the frame. We’re thrilled to be there both as game designers and as gamers ourselves, and we can’t wait to share the game we’re working on and to see what other game developers like Mohawk Games are able to create with Nitrous.
