Are there any more possible optimizations to increase the number of cars without performance loss?

mrwallace888 · Sep 22, 2020

I remember hearing that each car in BeamNG is usually handled by a single core on your machine, rather than being spread properly across multiple cores. And I heard that this is partly why you can only have a few cars at a time (depending on your specs, but even beasts usually can't have that many cars either).

I don't know what's already been done in terms of optimizations, but is there anything else that can be done to significantly increase performance, and the number of cars you can have before performance takes a hit?

I heard someone also saying that optimizations were the reason why you could test with like a hundred hay bales. Which also makes me wonder how complex props like that are for the computer compared to the actual vehicles themselves.

When the game comes out in God only knows how many years, are a lot of us still likely to be stuck with only a handful of cars at once in a scene?

I don't know, I'm mostly just curious as to how much we've optimized so far vs. what's still left to do and what else is possible to change.

I've followed this game since 2013, and it's absolutely mindblowing how much work's been put into this masterpiece of a simulator. And it's still Early Access even after everything that's been done to it!

Considering this, I'm assuming things will only get better from here!

atv_123 · Sep 23, 2020


Well... pretty much the only reason it is done the way that it is, is that each... ehh... "entity", whether it be a hay bale or a semi truck, uses its own independent thread, mostly because the calculations must be done in a very linear fashion. Basically, the answer to one equation must be plugged into the input of the next equation. Multithreaded and multicore processing lend themselves to doing lots of different problems simultaneously, rather than having all the cores work on just one problem at the same time.

Think of it this way.

Imagine two classrooms getting ready to take a test: one full of students, and one with just a single student. The teacher in the single student's classroom hands that student a 24 problem test, and the student gets to work. The teacher in the full classroom splits the test up and hands one of the 24 problems to each of 24 different students.

Naturally, as you can see, the room with lots of students that are all doing only one problem on the test will finish MUCH faster than the singular student will. This is easy to see. Many hands make light work.

However... if we change one thing up, we will end up with a very different result.

Back to our test... the teacher hands the single student a new 24 question test, but this time, the answer to question 1 is used in question 2, answer to question 2 is used in question 3, and so on and so forth. The single student gets to work...

In the multi student room, the teacher does the same thing as before: takes question 1 and hands it to student 1, takes question 2 and hands it to student 2, and so on and so forth... they set to work as well... but they quickly run into a problem. Students 2 through 24 can't get any work done, as they don't have all the information they need to complete their problem. Each of them must wait for the student with the problem before theirs to finish it and pass the answer along before they can even get started on their own.

While before, you could see that the 24 independent problems would be solved very quickly when the problems weren't all interconnected, with the second example, you can see that the 24 problems that depend on each other will probably take roughly the same amount of time to solve with all 24 students as they would with just the 1 singular student.
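If it helps to see the two tests as code, here is a rough Python sketch. The `solve` function is just a made-up stand-in for "answering a question", not anything from the game. In the first case every question can be handed to a different worker; in the second the loop has to run in order, because each step needs the previous answer.

```python
from concurrent.futures import ThreadPoolExecutor

def solve(question, previous_answer=0):
    # Stand-in for "solving" one test question (purely illustrative).
    return question * 2 + previous_answer

questions = list(range(1, 25))  # the 24 questions

# Case 1: independent questions. Each one can go to a different student.
with ThreadPoolExecutor(max_workers=24) as pool:
    independent_answers = list(pool.map(solve, questions))

# Case 2: chained questions. Question n needs the answer to question n-1,
# so there is nothing to hand out: the loop has to run in order.
chained_answers = []
previous = 0
for q in questions:
    previous = solve(q, previous)
    chained_answers.append(previous)
```

Adding more workers to the second case would not help, since only one question is ever ready to be worked on at a time.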

This is why Beam is the way that it is.

The math for every beam and node in a given vehicle must all be done linearly... taking the answers from the previous equation and plugging them into the next one time and time and time again to give us the simulation that we see. To try and split the math up among all the cores would only end up having those cores waiting around for the previous core to give it an answer so that it can even get started on its problem.

This is why Beam runs the way it does. Each vehicle is basically treated as one singular math problem, and thus, it is computed by one singular core much more efficiently rather than trying to share the load.
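As a very rough sketch of that idea (this is NOT BeamNG's actual code, and the names `step_vehicle`, `simulate_vehicle`, and `simulate_scene` are all made up for illustration): the work inside one vehicle is a serial chain, but separate vehicles can be handed to separate workers. The toy also ignores collisions between vehicles.

```python
from concurrent.futures import ThreadPoolExecutor

def step_vehicle(nodes, dt=0.001):
    # Hypothetical toy update: each node's new value depends on the node
    # updated just before it, so this inner loop is a serial chain.
    carry = 0.0
    updated = []
    for value in nodes:
        carry = value + carry * dt
        updated.append(carry)
    return updated

def simulate_vehicle(nodes, steps=100):
    # One vehicle = one serial job, run start to finish by a single worker.
    for _ in range(steps):
        nodes = step_vehicle(nodes)
    return nodes

def simulate_scene(vehicles, workers):
    # Vehicles do not depend on each other in this toy, so each one can be
    # handed to a different worker.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate_vehicle, vehicles))

vehicles = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
results = simulate_scene(vehicles, workers=3)
```

Note that in CPython, threads will not actually speed up CPU-bound work like this; the pool is only there to show the shape of "one vehicle per worker".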

So... all the optimizations that you hear about... where do they come from? Basically, they come from the devs finding ways to make the equations in question simpler, either by canceling out variables that turn out not to matter in some equations, or by finding new ways to calculate the same problem that require far fewer steps to get to the same answer.

I hope that made it clear why they aren't running one vehicle on multiple cores... I tried to explain it as simply as I could.

Edit: Saying this though... this basically means that you can have as many cars as your processor has cores... minus like... 2 so that the engine can actually run and have some overhead. I have a 12 core 24 thread monster computer... and it basically can run about 20 to 22 cars before I start running out of horsepower to run them... Threadripper 64 core 128 thread... well... I think you can see where this is going...
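That rule of thumb is simple enough to write down. This is only a back-of-the-envelope sketch of the estimate above, with a made-up function name and the "minus like 2" taken as the default; real results will depend on the vehicles, the map, and everything else the machine is doing.

```python
import os

def estimated_max_vehicles(logical_threads, reserved=2):
    # Rule of thumb from the post: one vehicle per hardware thread,
    # minus a couple reserved for the engine and general overhead.
    return max(logical_threads - reserved, 0)

print(estimated_max_vehicles(24))    # 12 core / 24 thread machine
print(estimated_max_vehicles(128))   # 64 core / 128 thread Threadripper

# os.cpu_count() reports logical CPUs on this machine (it can return None).
this_machine = estimated_max_vehicles(os.cpu_count() or 1)
```

For the 24 thread machine this gives 22, which lines up with the 20 to 22 cars mentioned above.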

mrwallace888 · Sep 23, 2020

Cores or threads?

atv_123 said: ↑

Saying this though... this basically means that you can have as many cars as your processor has cores... minus like... 2 so that the engine can actually run and have some overhead. I have a 12 core 24 thread monster computer... and it basically can run about 20 to 22 cars before I start running out of horsepower to run them...

atv_123 · Sep 23, 2020

mrwallace888 said: ↑

Cores or threads?

I have a 12 core computer... using hyperthreading I basically have virtual 24 cores, but that really just means it can handle 24 threads simultaneously.

KillerMemz · Sep 24, 2020

There is one small difference compared to the 24-question test analogy: in a computer, the cores all have multiple levels of cache. To take it back to that analogy, imagine the students are able to discuss things between themselves, and they each have a handheld whiteboard where they can write down what they've found out and/or discussed using a dry-erase marker (Level 1 cache). There are also slightly larger whiteboards sitting between the desks (Level 2 cache), but the kids have to spend a little bit of time turning their heads to look at them or interact with them, because they're further away. Then there's an even larger whiteboard in the middle of the room that they have to get up and walk to in order to use it (Level 3 cache). Finally, at the front of the room, behind the teacher's desk, is the largest whiteboard of them all (RAM). Because it's all the way over there, the students prefer to use the other whiteboards when they can, but they still have to use it regularly because there is too much to discuss to fit on all of the smaller whiteboards. On the teacher's desk is a big, thick lesson planner (the hard drive), and the teacher is able to update it with new lessons to assign to the students, as well as write down results from older lessons.
If done right, even a heavily linear task can be made into a decently spread parallel processing task if the person behind it knows what they're doing with the different cache levels, as well as the RAM.
While the cores have to process data independently, they have access to the different cache levels to help them share data between them.

Trophy · Sep 29, 2020

atv_123 said: ↑

I have a 12 core computer... using hyperthreading I basically have virtual 24 cores, but that really just means it can handle 24 threads simultaneously.

I heard that whybeare got a 32 core Threadripper and a 2080 Ti. You should put a 3090 in that thing.
OT: A modern CPU has a certain number of cores. Each of these cores acts as a separate CPU that can work on something different from the others. The reason BeamNG works so well with high core count CPUs is that there are lots of things to process at once. Each car needs to be processed separately, with different information. BeamNG benefits much more than most games from high core counts. It's also easier for each car to have its own core, because then a core doesn't have to juggle the information for several cars at once. This is one reason I think AMD CPUs are better suited to BeamNG than Intel CPUs: Intel has fewer cores but higher single core performance, while AMD has more cores and lower single core performance. BeamNG is the type of game that is more CPU intensive than GPU intensive, and it could be 10x worse if the maps were deformable and water had nodes and physics. Best CPU for this game under $500: AMD Ryzen 9 3900X.
