▶
Friday Facts #215 - Multithreading issues
To make PvP fun, and fair, it is pretty important that all the teams have a fair start. We do this for the normal game by using some special logic to generate a specific starting area. While the map generation didn't support this, I worked around the issue by copy-pasting all the starting area chunks to new starting areas for all players. This looked very odd though, and since it was done in the Lua script, was very slow.
So the solution is to fix the map generator to create multiple starting points, which with all the refactoring work by TOGoS, was not so hard to do. Whereas before all the logic calculated its own 'distance from the start' from (0,0), now it looks for the closest starting area, and calculates the distance from that. This actually fixes and simplifies a lot of the PvP script, as I also don't have to worry about clearing away huge biters nests from each teams front door.
This is a much cleaner solution, as the new starting areas are seamlessly integrated into the map.
We were always saying, that we keep the full update multithreading as the last ace in the sleeve to be used after the normal optimisations become too difficult. Apart from the plan to split update into chunks and update them separately, there are also smaller and simpler things we can do. We have for example this part of the update step, where we update trains, electric network and belts, and we update it one after another like this:
But these tasks are almost independent as neither trains or belts use electricity and belts don't interact with trains. There would be some details that would have to be solved, like Lua API calls and inserter activation order, but other than that it would be pretty straightforward. So I tried to run these things in parallel and measured how fast it would be:
To my surprise, the parallel version didn't speed things up, it was actually even slower. First I thought that it is caused by the fact, that the threads steal each others cache or the task is too short to be parallelized etc., but then I realized, that we already do the "prepare logic" in parallel. The prepare logic gathers all the data (sprite draw orders) for rendering and it is doing in parallel up to 8 threads. The task is iterating through the game world, and it is also quite short. Yet still the parallelization works pretty well there. After a few days (and many cppcon videos about multithreading later), I got the answer to my question. The answer is the same as many other answers related to performance, it is caused by cache invalidation. There is some shared cache for all the threads (L3 I believe), and some smaller caches specific for each of the cores (L2 and L1 I believe). The cache is divided into some kind of pages. The problem is that as long as I don't control the memory allocation, the data of 3 objects that might be completely independent can end up in the same page of memory cache. So lets imagine this situation:
Even if two different cores work with the same part of memory, they don't slow each other down as long as they only read the memory (Which is the case of our prepare logic as it is read only operation) as each core has its own copy of the memory page. The problem arises when the operation is not read only:
Each thread has its own copy of the page, but whenever something is changed, the other copies of the same page need to be invalidated and updated. This means that the threads are invalidating each others cache all the time, which slows the whole process so much that it is slower than the non-parallel solution. How do we solve it? The solution would probably be to organize the memory allocation in such a way, that everything that is on the same chunk, or related to some specific task would have to use its own memory allocator that keeps things in one place together in the memory. Every chunk would have its own piece of memory dedicated to entities in the chunk, trains would have all its data in one piece of memory together etc. This way, it shouldn't happen that two tasks would work on one piece of memory at the same time so the multithreading would actually speed things up. The problem with this is, that it requires big changes in the code as not only the entities itself but all their dynamically allocated data would have to be always allocated depending on the entity position. Whenever the entity moves between chunks, all its data structures would have to be re-allocated in the new chunk. All pointers to any data structures in entity would have to be properly updated as they move around etc. This means, that once (if) we decide to do this step, it would probably be one big update related only to this and this is not something we have time to do before 1.0. As always, let us know what you think on our forums.
[ 2017-11-03 17:19:38 CET ] [ Original post ]
Hello, it is Friday once again.
Multiple starting areas
To make PvP fun, and fair, it is pretty important that all the teams have a fair start. We do this for the normal game by using some special logic to generate a specific starting area. While the map generation didn't support this, I worked around the issue by copy-pasting all the starting area chunks to new starting areas for all players. This looked very odd though, and since it was done in the Lua script, was very slow.
So the solution is to fix the map generator to create multiple starting points, which with all the refactoring work by TOGoS, was not so hard to do. Whereas before all the logic calculated its own 'distance from the start' from (0,0), now it looks for the closest starting area, and calculates the distance from that. This actually fixes and simplifies a lot of the PvP script, as I also don't have to worry about clearing away huge biters nests from each teams front door.
This is a much cleaner solution, as the new starting areas are seamlessly integrated into the map.
Multithreading issues
We were always saying, that we keep the full update multithreading as the last ace in the sleeve to be used after the normal optimisations become too difficult. Apart from the plan to split update into chunks and update them separately, there are also smaller and simpler things we can do. We have for example this part of the update step, where we update trains, electric network and belts, and we update it one after another like this:
But these tasks are almost independent as neither trains or belts use electricity and belts don't interact with trains. There would be some details that would have to be solved, like Lua API calls and inserter activation order, but other than that it would be pretty straightforward. So I tried to run these things in parallel and measured how fast it would be:
To my surprise, the parallel version didn't speed things up, it was actually even slower. First I thought that it is caused by the fact, that the threads steal each others cache or the task is too short to be parallelized etc., but then I realized, that we already do the "prepare logic" in parallel. The prepare logic gathers all the data (sprite draw orders) for rendering and it is doing in parallel up to 8 threads. The task is iterating through the game world, and it is also quite short. Yet still the parallelization works pretty well there. After a few days (and many cppcon videos about multithreading later), I got the answer to my question. The answer is the same as many other answers related to performance, it is caused by cache invalidation. There is some shared cache for all the threads (L3 I believe), and some smaller caches specific for each of the cores (L2 and L1 I believe). The cache is divided into some kind of pages. The problem is that as long as I don't control the memory allocation, the data of 3 objects that might be completely independent can end up in the same page of memory cache. So lets imagine this situation:
Even if two different cores work with the same part of memory, they don't slow each other down as long as they only read the memory (Which is the case of our prepare logic as it is read only operation) as each core has its own copy of the memory page. The problem arises when the operation is not read only:
Each thread has its own copy of the page, but whenever something is changed, the other copies of the same page need to be invalidated and updated. This means that the threads are invalidating each others cache all the time, which slows the whole process so much that it is slower than the non-parallel solution. How do we solve it? The solution would probably be to organize the memory allocation in such a way, that everything that is on the same chunk, or related to some specific task would have to use its own memory allocator that keeps things in one place together in the memory. Every chunk would have its own piece of memory dedicated to entities in the chunk, trains would have all its data in one piece of memory together etc. This way, it shouldn't happen that two tasks would work on one piece of memory at the same time so the multithreading would actually speed things up. The problem with this is, that it requires big changes in the code as not only the entities itself but all their dynamically allocated data would have to be always allocated depending on the entity position. Whenever the entity moves between chunks, all its data structures would have to be re-allocated in the new chunk. All pointers to any data structures in entity would have to be properly updated as they move around etc. This means, that once (if) we decide to do this step, it would probably be one big update related only to this and this is not something we have time to do before 1.0. As always, let us know what you think on our forums.
[ 2017-11-03 17:19:38 CET ] [ Original post ]
Factorio
Wube Software LTD.
Developer
Wube Software LTD.
Publisher
2020-08-14
Release
Game News Posts:
506
🎹🖱️Keyboard + Mouse
Overwhelmingly Positive
(164072 reviews)
The Game includes VR Support
Public Linux Depots:
- Factorio Linux64 [306.86 M]
- Factorio Linux32 [300.1 M]
Available DLCs:
- Factorio: Space Age
Factorio is a game in which you build and maintain factories. You will be mining resources, researching technologies, building infrastructure, automating production and fighting enemies. In the beginning you will find yourself chopping trees, mining ores and crafting mechanical arms and transport belts by hand, but in short time you can become an industrial powerhouse, with huge solar fields, oil refining and cracking, manufacture and deployment of construction and logistic robots, all for your resource needs. However this heavy exploitation of the planet's resources does not sit nicely with the locals, so you will have to be prepared to defend yourself and your machine empire.
Join forces with other players in cooperative Multiplayer, create huge factories, collaborate and delegate tasks between you and your friends. Add mods to increase your enjoyment, from small tweak and helper mods to complete game overhauls, Factorio's ground-up Modding support has allowed content creators from around the world to design interesting and innovative features. While the core gameplay is in the form of the freeplay scenario, there are a range of interesting challenges in the form of the Scenario pack, available as free DLC. If you don't find any maps or scenarios you enjoy, you can create your own with the in-game Map Editor, place down entities, enemies, and terrain in any way you like, and even add your own custom script to make for interesting gameplay.
Discount Disclaimer: We don't have any plans to take part in a sale or to reduce the price for the foreseeable future.
Join forces with other players in cooperative Multiplayer, create huge factories, collaborate and delegate tasks between you and your friends. Add mods to increase your enjoyment, from small tweak and helper mods to complete game overhauls, Factorio's ground-up Modding support has allowed content creators from around the world to design interesting and innovative features. While the core gameplay is in the form of the freeplay scenario, there are a range of interesting challenges in the form of the Scenario pack, available as free DLC. If you don't find any maps or scenarios you enjoy, you can create your own with the in-game Map Editor, place down entities, enemies, and terrain in any way you like, and even add your own custom script to make for interesting gameplay.
Discount Disclaimer: We don't have any plans to take part in a sale or to reduce the price for the foreseeable future.
What people say about Factorio
- No other game in the history of gaming handles the logistics side of management simulator so perfectly. - Reddit
- I see conveyor belts when I close my eyes. I may have been binging Factorio lately. - Notch, Mojang
- Factorio is a super duper awesome game where we use conveyor belts to shoot aliens. - Zisteau, Youtube
MINIMAL SETUP
- OS: Linux (tarball installation)
- Processor: Dual core 3Ghz+Memory: 4 GB RAM
- Memory: 4 GB RAM
- Graphics: OpenGL 3.3 core. DirectX 10.1 capable GPU with 512 MB VRAM - GeForce GTX 260. Radeon HD 4850 or Intel HD Graphics 5500
- Storage: 3 GB available space
- OS: Linux (tarball installation)
- Processor: Quad core 3GHz+Memory: 8 GB RAM
- Memory: 8 GB RAM
- Graphics: OpenGL 4.3 core. DirectX 11 capable GPU with 2 GB VRAM - GeForce GTX 750 Ti. Radeon R7 360
- Storage: 3 GB available space
GAMEBILLET
[ 6102 ]
GAMERSGATE
[ 764 ]
FANATICAL BUNDLES
HUMBLE BUNDLES
by buying games/dlcs from affiliate links you are supporting tuxDB