Hi everyone, it's time for our Friday devlog, although this is more like a Saturday devlog now, but let's keep the name the same! In this devlog, I'll explain in depth the current situation with crashes and what we're going to do about it.
Why all the lag & crashes?
During launch, we saw an enormous number of players and doubled our previous all-time high player count. We had experience running highly populated servers in Korea, which is why we thought our servers could handle it, but they still slowed down far more than we had anticipated.
The main problem causing the servers to slow down was the way Unreal Engine handles replication (how the server decides what information to send to each player). The server CPU had to calculate the distance from every placeable and item that players had placed to every player, and figure out whether it was close enough that the player needed to be sent that information. Our servers could have over 100,000 placed items, and with 80 players, for example, that adds up to around 8 million distance checks per pass, so the calculations became too heavy for the servers to handle effectively.
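To make the cost concrete, here is a minimal Python sketch of that brute-force check. It is an illustration only, not Unreal Engine's actual code: the function name, the 2D coordinates, and the single fixed radius are all assumptions made for the example.

```python
import math

def relevant_items_naive(players, items, radius):
    """Brute force: compare every player against every placed item.

    `players` maps a name to an (x, y) position, `items` is a list of
    (x, y) positions. Both are hypothetical stand-ins for real actors.
    Returns the item indices near each player and the number of checks made.
    """
    result = {}
    checks = 0
    for name, (px, py) in players.items():
        nearby = []
        for index, (ix, iy) in enumerate(items):
            checks += 1  # one distance calculation per (player, item) pair
            if math.hypot(ix - px, iy - py) <= radius:
                nearby.append(index)
        result[name] = nearby
    return result, checks

players = {"a": (0.0, 0.0), "b": (100.0, 100.0)}
items = [(1.0, 1.0), (99.0, 100.0), (50.0, 50.0)]
relevant, checks = relevant_items_naive(players, items, radius=5.0)
print(relevant)  # {'a': [0], 'b': [1]}
print(checks)    # 6 (2 players x 3 items)

# The same multiplication with the numbers from this devlog:
print(80 * 100_000)  # 8000000
```

The work grows with players times items, so every extra player adds another full pass over all placed items.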
We found a couple of solutions and decided to try the "replication graph" method, which produces the same result as before but divides the map into a grid and performs the calculations based on which grid cell the player is in. This solved the performance problem, and all servers now run at a minimum of 10 ticks (calculations) per second. Unfortunately, the plugin that implements the replication graph has some compatibility issues that cause frequent random crashes. We managed to improve it slightly, but we're now in a situation where servers usually stay online for only about 3-4 hours.
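The grid idea can be sketched in a few lines of Python. This is a simplified illustration under assumptions, not the replication graph plugin itself: the cell size, the function names, and the flat 2D layout are all hypothetical.

```python
import math
from collections import defaultdict

def cell_of(x, y, cell_size):
    """Return the grid cell (column, row) that contains a position."""
    return (int(x // cell_size), int(y // cell_size))

def build_grid(items, cell_size):
    """Bucket item indices by grid cell. Done once, then updated as items move."""
    grid = defaultdict(list)
    for index, (x, y) in enumerate(items):
        grid[cell_of(x, y, cell_size)].append(index)
    return grid

def relevant_items_grid(player_pos, items, grid, cell_size, radius):
    """Check only items in the cells that the player's radius can reach."""
    px, py = player_pos
    cx, cy = cell_of(px, py, cell_size)
    reach = math.ceil(radius / cell_size)  # how many cells the radius spans
    nearby = []
    checks = 0
    for gx in range(cx - reach, cx + reach + 1):
        for gy in range(cy - reach, cy + reach + 1):
            # .get avoids creating empty buckets in the defaultdict
            for index in grid.get((gx, gy), ()):
                ix, iy = items[index]
                checks += 1
                if (ix - px) ** 2 + (iy - py) ** 2 <= radius ** 2:
                    nearby.append(index)
    return sorted(nearby), checks

items = [(1.0, 1.0), (99.0, 100.0), (50.0, 50.0)]
grid = build_grid(items, cell_size=10.0)
found, checks = relevant_items_grid((0.0, 0.0), items, grid, 10.0, 5.0)
print(found)   # [0]
print(checks)  # 1 (the two far-away items are never looked at)
```

The result matches a full scan, but each player only pays for the items in nearby cells rather than for every placed item on the map.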
What next?
I want to first apologize for the state the game is currently in. We acknowledge that a fully released game should not be in this condition. We will be doing a few things to resolve the issue:
1. We're trying to solve the crashes by fixing how Unreal Engine handles "bad" actors that are no longer in memory, which currently causes the crashes.
2. If we don't manage to fix it by next Wednesday, we'll revert to the old replication method and try alternative approaches until we have a stable minimum of 10 ticks per second and at least 12 hours of uptime on servers.
3. After this, we'll start focusing on all the bugs and issues which cause item and progression loss, as this is the most frustrating thing when playing in a persistent world where all items are hard to get.
On the bright side, our artists are working on new content updates, with one currently planned for June and another in August alongside the next official server wipe. More information on these will come in next Friday's devlog!
Thank you for your support, and again, apologies for the current state of the game.
- Sven
[ 2025-03-22 08:06:45 CET ]