Kind of diverging from the larger point, but that’s true — RAM prices haven’t gone up as much as other things have over the years. I do kind of wonder if there are things that game engines could do to take advantage of more memory.
I think part of this is making games that will run on both consoles and PCs, where consoles have a pretty hard cap on how much memory they can have, so any work that gets put into high-memory features is something that console players won’t see.
*checks Wikipedia*
The Xbox Series X has 16GB of unified memory.
The PlayStation 5 Pro has 16GB of unified memory and 2GB of system memory.
You can get a desktop with 256GB of memory today, about 14 times the PS5 Pro’s 18GB total.
Would have to be something that doesn’t require a lot of extra dev time or testing. Can’t do more geometry, I think, because that’d need memory on the GPU.
*considers*
Maybe something where the game can dynamically render something expensive at high resolution, and then move it into video memory.
Like, Fallout 76 uses, IIRC, statically-rendered billboards of the 3D world for distant terrain features, like stuff in neighboring and further-off cells. You’re gonna have a fixed-size set of those loaded into VRAM at any one time. But you could cut the size of the area that each set of billboards covers, and keep more of them preloaded in system memory.
Or…I don’t know if game systems can generate simpler-geometry level-of-detail (LOD) objects for the distance or if human modelers still have to do that by hand. But if they can be generated procedurally, increasing the number of LOD levels should just increase storage space, and keeping more of them preloaded in RAM just requires more RAM. You only have one level in VRAM at a time, so it doesn’t increase demand for VRAM. That’d provide for smoother transitions as distant objects come closer.
You can divide stuff up in memory however you want: objects, arrays, whatever. Generally speaking, GPU memory is used for things that will run fast on the streaming processors of the GPU, which are small processors specialized for a limited set of tasks involved in 3D rendering. The types of things you would have in GPU memory are textures, models, shader programs, various buffers created to store data for rendering passes like lighting and shadows, Z-buffers, the frame buffer, and stuff like that.
Other things are kept in RAM and used by the CPU, which has many instruction sets and many optimizations for different types of tasks. CPUs are really good at running unpredictable code. They have very large and complex cores that do all kinds of things, like branch prediction (speculatively running ahead down the path it guesses the code will take whenever there is free time available). The CPU also has direct access to the PCIe bus and to things like the chipset (the old north and south bridge), storage controllers, I/O devices, etc.
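Here’s the classic demonstration of that, as a minimal C++ sketch: the loop body is identical in both runs, but sorted data makes the branch predictable, so the same code runs several times faster. (Exact timings will vary by machine; the point is just the gap.)

```cpp
#include <algorithm>
#include <chrono>
#include <cstdio>
#include <random>
#include <vector>

// Sum only the elements >= 128. With random data the branch is a coin
// flip and the predictor keeps guessing wrong; with sorted data the
// branch outcome is stable, so the same loop runs several times faster.
static long long sum_big(const std::vector<int>& v) {
    long long sum = 0;
    for (int x : v)
        if (x >= 128) sum += x;  // the data-dependent branch
    return sum;
}

static double time_ms(const std::vector<int>& v) {
    auto t0 = std::chrono::steady_clock::now();
    volatile long long s = sum_big(v);
    (void)s;
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

int main() {
    std::vector<int> data(1 << 24);
    std::mt19937 rng(42);
    for (int& x : data) x = rng() % 256;

    double unsorted = time_ms(data);
    std::sort(data.begin(), data.end());
    double sorted = time_ms(data);
    std::printf("unsorted: %.1f ms, sorted: %.1f ms\n", unsorted, sorted);
}
```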
Generally in a game engine, most of the actual logic happens on the CPU, because it’s very complex and arbitrary code that is calculation-heavy. Things like level data, AI, collisions, physics, and data streaming are handled by the CPU. The CPU prepares frames by batching many things into one call to the GPU, because the GPU is good at taking a command from the CPU and performing that task many times simultaneously, on pixels for example. If the CPU had to send every instruction to the GPU in sequence it would be very slow, both because of the physical distance between the GPU and CPU and because a script only does one thing at a time in a loop. Shaders are different: they are like running a function across a large data set utilizing the 1,000+ cores in an average modern GPU.
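A very rough sketch of that batching idea in C++. Note that gpu_draw_instanced() here is a made-up stand-in for a real instanced draw call like glDrawElementsInstanced(); the point is just the submit-then-flush shape:

```cpp
#include <cstdio>
#include <vector>

// Hypothetical per-instance data; real engines pack transforms,
// material IDs, and so on.
struct Instance { float x, y, z; };

// Stand-in for a real instanced draw call such as
// glDrawElementsInstanced(). One command, many copies of the mesh.
void gpu_draw_instanced(int mesh_id, const Instance* data, size_t count) {
    std::printf("draw mesh %d, %zu instances, 1 API call\n", mesh_id, count);
}

class Batcher {
    std::vector<Instance> pending_;
    int mesh_id_;
public:
    explicit Batcher(int mesh_id) : mesh_id_(mesh_id) {}
    // Called once per object during scene traversal -- cheap CPU work.
    void submit(const Instance& inst) { pending_.push_back(inst); }
    // Called once per frame: a single command covers every copy.
    void flush() {
        if (!pending_.empty())
            gpu_draw_instanced(mesh_id_, pending_.data(), pending_.size());
        pending_.clear();
    }
};

int main() {
    Batcher trees(/*mesh_id=*/7);
    for (int i = 0; i < 10000; ++i)
        trees.submit({float(i), 0.f, 0.f});  // 10,000 trees...
    trees.flush();                           // ...one draw call
}
```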
There are other differences as well. The CPU has access to low-latency memory, where the GPU prefers higher-latency but high-bandwidth memory. This is because the types of operations the GPU is doing are much more predictable and consistent. CPU work is very arbitrary, and the CPU often ends up taking an unusual path, so the memory it has to access might be scattered all over.
So basically most of the game engine and game logic runs on the CPU out of system RAM, because it’s essentially sequential code that is very linear and arbitrary, and because the CPU has many tools in its toolbox for different tasks, like AVX, SSE, and so on. Most of the visual stuff, like 3D transformation, shading, and sampling, takes place on the GPU because it’s high-bandwidth and highly parallel: the individual cores are simple, but you have thousands of them that can operate independently.
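For a taste of what those SIMD tools look like, here’s a minimal C++ sketch using SSE intrinsics (x86 only): one instruction adds four floats at once where a scalar loop would need four separate adds.

```cpp
#include <cstdio>
#include <immintrin.h>  // SSE intrinsics

int main() {
    alignas(16) float a[4] = {1.f, 2.f, 3.f, 4.f};
    alignas(16) float b[4] = {10.f, 20.f, 30.f, 40.f};
    alignas(16) float out[4];

    // One SSE add covers all four lanes at once; a plain scalar loop
    // would take four separate add instructions.
    __m128 va = _mm_load_ps(a);
    __m128 vb = _mm_load_ps(b);
    _mm_store_ps(out, _mm_add_ps(va, vb));

    std::printf("%g %g %g %g\n", out[0], out[1], out[2], out[3]);
}
```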
RAM is very useful, but the amount is always limited by console tech. It is particularly important in more interactive, sandboxy games, stuff like voxels, and it also comes in handy for sim or RTS games. Engines are usually designed around console specs so they can release on those platforms. System RAM can be used for anything, even rendering, but its actual bandwidth is extremely slow compared to GPU memory, which usually sits less than an inch away from the actual GPU and has a wide bus interface, something like 128-512 bits. The bus width is how many physical wires connect the memory chips to the GPU, and it limits how much data you can send in one chunk or cycle. With a 64-bit interface you can only send one 64-bit word at a time; a 256-bit bus moves four of those per cycle for a 4x speed increase, and a 512-bit bus moves eight for 8x.
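To put rough numbers on the bus-width math, here’s a tiny C++ calculation of theoretical peak bandwidth. The transfer rates are illustrative round numbers, not specs for any particular part:

```cpp
#include <cstdio>

// Theoretical peak bandwidth = bus width (in bytes) * effective
// transfer rate (in billions of transfers per second).
double peak_gb_per_s(int bus_bits, double giga_transfers) {
    return (bus_bits / 8.0) * giga_transfers;  // GB/s
}

int main() {
    // Dual-channel DDR5-ish system RAM: 128-bit bus, ~6 GT/s.
    std::printf("system RAM (128-bit): ~%.0f GB/s\n", peak_gb_per_s(128, 6.0));
    // Midrange GDDR6-ish card: 256-bit bus, ~16 GT/s.
    std::printf("GDDR6 (256-bit):      ~%.0f GB/s\n", peak_gb_per_s(256, 16.0));
    // High-end card: 512-bit bus at the same data rate.
    std::printf("GDDR6 (512-bit):      ~%.0f GB/s\n", peak_gb_per_s(512, 16.0));
}
```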
So you have high-bandwidth, high-latency memory on a wide bus feeding a very predictable set of many simple processors. When you want to load data into GPU memory, you usually have to prepare it with the CPU and send it over the PCIe bus. That is far too slow to actually use system RAM to augment GPU RAM: it’s slow in both latency and bandwidth, so if you tried, your GPU would sit idle something like 80% of the time waiting on data, and then only get a 64- or 128-bit chunk at a time from system RAM, not to mention the CPU overhead of constantly managing that memory in real time.
Having high RAM requirements wouldn’t be the worst thing in the world, because RAM is cheap and it can really help some types of games with large, complex worlds and lots of physics and things happening. Not optimizing for GPUs is much worse, especially with prices these days. But heavy RAM use won’t happen much, because games tend to be written in languages like C++ that manage memory in a very low-level way, so they take about as much as they need and no more. One of the biggest reasons you use a language like C++ to write game engines is that you decide how and when to allocate and free memory. This prevents stuttering: if the runtime is handling memory for you, you tend to get a good deal of stuttering, because the CPU gets loaded for half a second here and there while the garbage collector tries to free 2GB of memory or something. This tends to make game engines very structured about the amount of memory they use. Since they are mostly trying to reuse code as much as possible, and are targeting consoles, they usually just aim for the amount of RAM they know they will have on consoles. Things like extra draw distance on PC can use more memory.
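As a sketch of what that low-level control buys you: a per-frame arena allocator in C++, which grabs one big block up front, hands out bump-pointer allocations during the frame, and frees everything in a single O(1) reset, so there’s never a GC-style pause mid-game. (This is a minimal illustration, not how any particular engine does it.)

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Per-frame arena: one big allocation at startup, bump-pointer
// allocations during the frame, and a single O(1) reset at frame end.
// No per-object free, so no allocator churn and no GC-style pauses.
class FrameArena {
    char*  base_;
    size_t cap_;
    size_t used_ = 0;
public:
    explicit FrameArena(size_t bytes)
        : base_(static_cast<char*>(std::malloc(bytes))), cap_(bytes) {}
    ~FrameArena() { std::free(base_); }

    void* alloc(size_t bytes, size_t align = alignof(std::max_align_t)) {
        size_t p = (used_ + align - 1) & ~(align - 1);  // align up
        if (p + bytes > cap_) return nullptr;           // budget exceeded
        used_ = p + bytes;
        return base_ + p;
    }
    void reset() { used_ = 0; }  // "free" everything instantly
};

struct Particle { float x, y, z, life; };

int main() {
    FrameArena arena(64 * 1024 * 1024);  // 64MB, sized up front
    for (int frame = 0; frame < 3; ++frame) {
        auto* parts = static_cast<Particle*>(
            arena.alloc(10000 * sizeof(Particle)));
        if (parts) parts[0] = {0, 0, 0, 1.f};  // ...per-frame work...
        arena.reset();                          // end of frame: O(1)
        std::printf("frame %d done, arena reset\n", frame);
    }
}
```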
LODs can be generated in real time, but this is slow. You can do nearly anything with code; the question is whether it’s a good fit for your application. In a game engine every cycle is precious. You are updating the entire scene, moving all your data, preparing a frame, resolving all interactions, running scripts, and everything else in just over 16 ms for 60 fps. The amount of data your PC is processing in just 16 ms will blow your mind, with usually 3-12 passes in the renderer.

A very simple engine will draw a Z-buffer, where during that 16 ms it determines the distance to the closest surface behind every pixel, then uses that data to figure out what actually needs to be drawn. Then it’s taking those objects and resolving the normals, basically figuring out whether each polygon faces toward or away from the player, which cuts out rendering the vast majority of polygons. Then the lighting data and everything else is combined with this and sent to the GPU, which goes through the list of polygons that need to be drawn, looks up the points to draw them, casts rays from the light sources, and shades the scene. That’s a very simple pipeline, basically a Quake- or Doom-like game. Modern games are much more complex: they draw each frame many times with many different buffers, generating different data on each pass and using it for the next.

Generating LODs at runtime is just something that isn’t done unless it’s needed for some reason, like dynamic terrain or voxel terrain. In a game that is mostly static geometry there’s no real reason to give up that compute time when you can just pre-generate them. Generating LODs in real time is quite a process: you have to load a region into memory, reduce its polygon count, downsize the texture, generate a new mesh and texture, and place it in the world, with a lot of back and forth between the GPU and CPU. Some games do it, though. 7 Days to Die, Space Engineers, Minecraft with a distant-terrain mod, and I’m sure many others generate LODs on another thread, but these are usually fairly low-quality meshes.
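Here’s a bare-bones sketch of that worker-thread pattern in C++. The decimate() function is a placeholder for real mesh simplification; the point is that the main loop keeps its ~16 ms cadence while the background thread chews on the LOD and hands it over whenever it’s done:

```cpp
#include <chrono>
#include <cstdio>
#include <future>
#include <thread>
#include <vector>

struct Mesh { std::vector<float> verts; };

// Placeholder for real mesh simplification (edge collapse, vertex
// clustering, etc.) -- here we just keep every 4th vertex and fake
// the cost with a sleep.
Mesh decimate(const Mesh& full) {
    Mesh lod;
    for (size_t i = 0; i < full.verts.size(); i += 4)
        lod.verts.push_back(full.verts[i]);
    std::this_thread::sleep_for(std::chrono::milliseconds(50));  // "slow"
    return lod;
}

int main() {
    Mesh region{std::vector<float>(100000, 1.f)};

    // Kick the expensive work onto another thread; the frame loop
    // never blocks on it.
    auto pending = std::async(std::launch::async, decimate, region);

    for (int frame = 0; frame < 10; ++frame) {
        auto t0 = std::chrono::steady_clock::now();

        // ...normal per-frame work happens here, inside the 16 ms budget...

        // Poll without blocking; swap the LOD in whenever it's ready.
        if (pending.valid() &&
            pending.wait_for(std::chrono::seconds(0)) ==
                std::future_status::ready) {
            Mesh lod = pending.get();
            std::printf("frame %d: LOD ready (%zu verts)\n",
                        frame, lod.verts.size());
        }

        std::this_thread::sleep_until(
            t0 + std::chrono::milliseconds(16));  // hold ~60 fps cadence
    }
}
```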