TUXDB - LINUX GAMING AGGREGATE


Name: MicroTown
Developer: Snowy Ash Games
Publisher: Snowy Ash Games
Tags: Strategy, Singleplayer, Early Access
Release: 2019-08-30
News: 28
Controls: Keyboard, Mouse
Steam Rating: Very Positive
Steam store: https://store.steampowered.com/app/931270
Public Linux depots: MicroTown Linux [82.11 M]







EA Update #16 - Shops

This update replaces Markets with Shops and adds significant rendering optimizations. And I might as well call this "Rendering Update", because I spent most of the time working on optimizing sprite drawing.

[h2]Shops[/h2]

Shops are new buildings that are set to replace Markets and Market Stalls.



They essentially do the same thing that Stalls do, except each shop can store and distribute several item types. For example, a Produce shop would sell Tomatoes, Carrots and Potatoes at the same time:



Shops allow me to add more items without adding a lot of micromanagement to Markets and Stalls. I can instead focus on logical item "groups", such as foods, supplies, clothing, well-being, luxuries, etc. The gameplay would then not be so much about having exact items, but a decent variety of items and an overall satisfaction for that item group. It's also much easier for me to design for and players to work with fewer categories.

For now, Markets and Stalls will still work, but they are tagged as obsolete and will be removed (probably) in the next update. Basically, I'm leaving them working so anyone who wants to update their world can do so. Unfortunately, I cannot maintain two versions of item distribution, so I will have to completely remove Stalls later. I also cannot automatically replace them in-game with shops, because shops are 7-tile buildings.

[h2]Miscellaneous changes[/h2]

There are now two additional, larger world size options. Thanks to the optimizations, I am finally able to create bigger islands without lagging everything. These won't have better terrain generation yet and the camp still spawns in the middle, but at least they are an option now.

Animal habitats now have to be selected to see their information:



This is mostly because I could not easily re-implement in-world tooltips following the rendering changes. But I also had this planned regardless, so I just went with it. There isn't any new information yet (notably, about quality factors).

[h2]Rendering optimization[/h2]

So the problem is: MicroTown's rendering is fairly slow because there are a lot of things to draw. I should note that MicroTown being "just" pixel art is rather deceptive. The game needs to render more geometry than many 3D games and, for various reasons related to how GPUs work, it's not much cheaper to send "just" sprites to render than it is to send 3D models.

[h3]Game world renderers[/h3]

Optimization should generally be done based on profiling, i.e. evidence. So, most importantly, I need to count the number of sprites that are required to draw everything in the game world. For my test case, I am "looking" at a 1440p resolution zoomed-out 160-size world with 2000 villagers. These are the numbers of in-world entities that exist in the world:



This is a rough indicator of how much "stuff" needs to be drawn, but it's inaccurate because each entity does not correspond to a single sprite. There could be anywhere from zero to a dozen sprites needed for an entity, depending on many factors. This also does not account for many extra things, like effects, tile overlays and outlines, markers, building icons, etc.

So here are the numbers of individual sprites:



These can be further broken down into broader categories:



The numbers I am most interested in are the 25k/54k renderers. This is how many sprites need to actually be drawn out of the total sprites that exist. Importantly, there are 5k updates per frame, which means that that many sprites were changed, added or removed.

Which means there are two main focuses for entity processing optimization - knowing what to render and processing changes quickly. Subsequently, the main rendering optimization focus is to render the visible sprites as fast as possible. And these two sentences are so much easier said than done.

[h3]CPU rendering bottleneck[/h3]

The main bottleneck is the CPU time it takes to render the game world. Unity works with "game objects" to do things, broadly speaking. For the game world, I use game objects exclusively for sprite rendering (no data or logic there). Each sprite renderer requires a separate game object. So every sprite I mentioned above requires a game object, which is easily 50k objects. Meanwhile, for Unity, the general consensus is that 10k objects is too many.

I basically have these long long lists of objects:



Unfortunately, there is only so much I can do before Unity itself bottlenecks like this. I've squeezed out as much performance as I could with this setup. Without going into too much detail, there are too many engine-dependent things happening that I cannot change in any way, like sorting, culling and communication with the GPU. Unity simply isn't built to support so many objects via its "conventional" game object means. It doesn't matter that the GPU renders everything in 0.5 ms when it takes 25 ms for Unity to prepare and send the data. It still has the overhead of all those universal game objects regardless of how many are visible and how cleverly they are optimized for rendering.

So here comes the technical and difficult part...

[h3]Custom rendering pipeline[/h3]

After lots of research and experiments, I decided that I would need to bypass almost all of Unity's "standard" rendering overhead and send sprites to the GPU more-or-less directly. This is essentially how most GPU-based 2D games have always done this. Simply put, "draw sprite S at X, Y" for every sprite. Unfortunately, this is much easier said than done, and that summary does not capture the many complexities. Unity isn't a 2D game engine and all its features have many layers of expensive abstraction. But thankfully Unity does provide rough access to lowish-level graphics logic.

In short, I can send a mesh to render directly to the GPU. A mesh is just a bunch of triangles connected at their vertices. In my case, to show a sprite, I only need a quad -- a rectangular mesh that is 4 vertices and 2 triangles. (This is also what Unity uses by default for sprite renderers, except with a lot more stuff.)
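As a rough illustration in Python (not the game's C# or shader code), a quad can be written down as four corners and two index triples, and drawing many sprites then comes down to reusing that one definition at different positions. The names `QUAD_VERTICES`, `QUAD_TRIANGLES` and `place_quad` are invented for this sketch.

```python
# One shared unit quad: 4 vertices and 2 triangles (index triples into the vertices).
QUAD_VERTICES = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
QUAD_TRIANGLES = [(0, 1, 2), (0, 2, 3)]

def place_quad(position, size=1.0):
    """Return the corners of a copy of the shared quad moved to `position`."""
    x, y = position
    return [(x + vx * size, y + vy * size) for vx, vy in QUAD_VERTICES]

# Only the positions differ per sprite; the quad itself is defined once.
positions = [(0.0, 0.0), (3.0, 1.0), (5.5, 2.0)]
placed = [place_quad(p) for p in positions]
```

On a GPU the per-copy placement happens in the shader, so the CPU only supplies the list of positions rather than the expanded corners.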



I can not only send a mesh once, but "send" the same mesh again and again, which is called GPU instancing and which saves the day. Technically, I am not even sending multiple meshes - just the one quad, but I am telling the GPU where to draw copies of it. So each quad mesh would correspond to one sprite and I would send a batch of such sprite data to the GPU to get them all rendered at once really, really fast. This lets me render any arbitrary layout:



Fortunately, I have built the game in a way that allows me to fairly easily "collect" all the sprites I need to render (which still massively downplays the difficulty of this conversion). As I described, I have game objects that correspond somewhat to game world entities. I know where they are located and roughly in what order. So I "just" need to pass these to the GPU myself. It would look something like this (with a colorful test texture):



Except there are no "sprites" on a GPU, and there are definitely no sprite atlases. All the "sprites" here look the same, because a "sprite" is a high-level engine abstraction done through custom assets, custom shaders and lots of engine code. All I get at low-level is a texture and raw vertices and pixels. What I really need to do is specify locations on the final atlas texture, so I can draw specific sprites that would be arranged there (these are 16x16 "squares" from the same 2048x2048 test texture):



The next step is to somehow combine individual meshes and send things to the GPU in batches. The problem is communicating what data I need to send per mesh, that is, per sprite. Naively, I can only set one property per communication, which basically results in every sprite in a batch being the same:



The solution is that modernish GPU shaders can be compute shaders, which can receive and process large chunks of arbitrary data and run (parallel) processing on the GPU. This means I can actually pass all the sprite information to the shader in one big batch very efficiently. This data can then be sampled to select the correct region from the atlas texture from the sprite's location data for each mesh/quad.
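A minimal Python sketch of the idea, not the actual shader code: every sprite becomes one fixed-size record in a single contiguous buffer, and each quad looks up its own record by index to find its position and its region of the atlas. The record layout (`x`, `y`, atlas cell index), the row-major cell numbering and all function names are assumptions made for illustration; only the 16x16 cell and 2048x2048 atlas sizes come from the text above.

```python
import struct

ATLAS_SIZE = 2048                        # atlas texture is 2048x2048 pixels
CELL_SIZE = 16                           # each test "sprite" is a 16x16 square
CELLS_PER_ROW = ATLAS_SIZE // CELL_SIZE  # 128 cells per atlas row

# Hypothetical per-sprite record: x, y as float32 and an atlas cell index as uint32.
RECORD = struct.Struct("<ffI")

def pack_sprites(sprites):
    """Pack (x, y, cell) tuples into one contiguous buffer, sent as a single batch."""
    buf = bytearray(RECORD.size * len(sprites))
    for i, (x, y, cell) in enumerate(sprites):
        RECORD.pack_into(buf, i * RECORD.size, x, y, cell)
    return bytes(buf)

def atlas_uv(cell):
    """Return (u0, v0, u1, v1) for a cell, assuming row-major cells from the atlas origin."""
    col, row = cell % CELLS_PER_ROW, cell // CELLS_PER_ROW
    step = CELL_SIZE / ATLAS_SIZE
    return (col * step, row * step, (col + 1) * step, (row + 1) * step)

def shade_instance(buf, instance_id):
    """Mimic the per-quad lookup: read this instance's record, then find its atlas region."""
    x, y, cell = RECORD.unpack_from(buf, instance_id * RECORD.size)
    return (x, y), atlas_uv(cell)

batch = pack_sprites([(1.5, 2.0, 0), (4.0, 0.25, 129)])
```

The point of the fixed-size record is that the whole batch is uploaded once, and each quad needs nothing but its own index to find its data.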

And this provides the starting basic functionality of rendering sprites directly to the GPU almost the same as game object sprite renderers, but for a tiny fraction of the cost. Here is the pipeline itself working correctly if not the logic of drawing everything:



This does unfortunately come with a bunch of significant drawbacks that can be summarized as "additional complexity" and whatever is the opposite of "ease of use". But I can live with these considering the speed benefits.

The new problem is that now everything that Unity did -- sort, cull, toggle, offset, scale, etc. -- is gone. I now need to make it all myself.

[h3]Sorting[/h3]

Most importantly, I can't just "render all sprites" in whatever order they are (unintentionally) stored. I can technically do that for tiles, roads, and tile overlays, because they never overlap and are exactly spaced. But every other entity must obey visual depth sorting. Simply put, a unit can walk in front of and behind a building while still being in the same tile. But sending sprites to the GPU is fast precisely because it ignores such pre-processing details and just draws things sequentially. Just rendering naively would result in this:



Thankfully, my job is simpler than it could have been trying to sort 50k entities -- I already have a game tile hex grid. Every entity has a whole tile coordinate and I can hugely optimize sorting by looping through the tile coordinates in "visual order".

Entities also have in-tile fractional coordinates. So I have to loop through the entities in a tile back to front. For optimization purposes, I have to keep the entity list pre-sorted and add and update entities as their coordinates change. And this basically correctly sorts 90% of sprites.

The final consideration is that for entities at the same coordinate (like parts of the same building), I need an extra hard-coded depth-sorting value so they appear in the right order even though they are technically at the same location.
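The three sorting levels described above can be sketched in Python as follows. This is an illustration under assumptions, not the game's implementation: the `TileBuckets` class, the convention that a smaller in-tile depth is drawn first, and the example entities are all made up here.

```python
import bisect
from collections import defaultdict

class TileBuckets:
    """Per-tile entity lists that stay sorted as entities are added."""

    def __init__(self):
        self.by_tile = defaultdict(list)

    def add(self, tile, in_tile_depth, bias, name):
        # Sorted insert: in-tile depth first, then the hard-coded bias as tie-breaker.
        bisect.insort(self.by_tile[tile], (in_tile_depth, bias, name))

    def draw_order(self, tiles_in_visual_order):
        # Outer loop walks tiles back to front; inner lists are already sorted.
        for tile in tiles_in_visual_order:
            for _, _, name in self.by_tile.get(tile, ()):
                yield name

buckets = TileBuckets()
buckets.add((0, 0), 0.8, 0, "villager")
buckets.add((0, 0), 0.5, 1, "roof")    # same location as the walls, drawn after them
buckets.add((0, 0), 0.5, 0, "walls")
buckets.add((1, 0), 0.1, 0, "tree")
order = list(buckets.draw_order([(0, 0), (1, 0)]))
```

Because each tile's list is kept sorted on insert, no per-frame sort over all entities is needed; rendering only walks the tiles in visual order.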

Sorting is probably the most time-consuming part to implement because I have to change and adjust so much code. Every one of these "considerations" is another layer of complexity and difficulty. And with all that, I am only now approaching the same visual results as I had at the beginning:



[h3]Culling[/h3]

Another important consideration is to cull anything that is not visible, that is, literally not render sprites that are not on screen. It's a simple problem to explain, but deceptively difficult to implement efficiently. I cannot just check 50k items every frame. So I keep a list of entities per-coordinate, update the lists when entity coordinates change, and loop only through visible coordinates when rendering:



The biggest consideration is that whatever code logic I do, I will be CPU-bottlenecked and I cannot offload this work to the GPU in any reasonable way.
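A small Python sketch of the per-coordinate bookkeeping described in this section. It assumes a plain square grid and a rectangular visible range for simplicity (the game uses a hex grid), and the `CullingGrid` class and its method names are hypothetical.

```python
from collections import defaultdict

class CullingGrid:
    """Tracks which entities sit on which coordinate so rendering can skip the rest."""

    def __init__(self):
        self.cells = defaultdict(set)   # coordinate -> entities on it
        self.where = {}                 # entity -> current coordinate

    def move(self, entity, coord):
        """Update the lists only when an entity's coordinate actually changes."""
        old = self.where.get(entity)
        if old == coord:
            return
        if old is not None:
            self.cells[old].discard(entity)
        self.cells[coord].add(entity)
        self.where[entity] = coord

    def visible(self, x_min, y_min, x_max, y_max):
        """Collect entities by looping over visible coordinates, not over all entities."""
        found = []
        for y in range(y_min, y_max + 1):
            for x in range(x_min, x_max + 1):
                found.extend(sorted(self.cells.get((x, y), ())))
        return found

grid = CullingGrid()
grid.move("a", (0, 0))
grid.move("b", (5, 5))
grid.move("c", (1, 1))
on_screen = grid.visible(0, 0, 2, 2)
```

The cost of a frame then scales with the number of visible coordinates plus the number of entities that moved, not with the total entity count.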

[h3]Effects[/h3]

All the small effects in the game were provided by Unity's particle system -- smoke, dirt, chips, sawdust, etc. It was all relatively easy to set up. And none of this works anymore with the custom rendering pipeline. Unity's particle system wasn't compatible with how I rendered regular sprites. So I had to recreate the effects myself -- all the logic, animation, sprites, visuals, previews, etc.

There are now some new optimization considerations. For example, various effects like digging dirt used the same basic logic where a 1-pixel particle would fly out. There would be some 12 particles, which equates to 12 sprites. This is actually quite a lot when you consider they have to follow all the new rendering/sorting logic I implemented. I now have to design effects paying attention to the number of sprites they produce and optimize when I can. For example, I can use only 4 sprites if I make an "animation" of the particles splitting up as they spread over time:



My favorite effect -- smoke, which I painstakingly recreated -- takes up 8 sprites:



Here, there's nothing I can do to reduce the number of sprites, and it will be slower than Unity's particles were originally. Of course, I am considering the big picture performance and presentation, and nice effects are definitely a worthy "expense".

[h3]Shadows[/h3]

There is so much to talk about, because I am now revisiting 3 years' worth of various visual features. But one of the cool easy-to-add changes was adding shadows for units:



These are very subtle and usually get lost in the "noise". But they subconsciously feel good. I couldn't add these before because they would get incorrectly rendered on top of other things - buildings, other units, etc. - because units would run all over the place and cause depth-sorting issues. However, now that I am "sending" sprites into layers for rendering, I converted all shadows into a "shadow layer", so they get drawn on the ground before any sortable objects, thus they are always on the bottom. This fixes a lot of shadow glitches I had, as well as lets me add shadows without worrying about problems like this.
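The layering idea can be sketched in Python like this. The layer names and their order are assumptions for illustration; the only part taken from the text is that shadows go into their own layer that is drawn before any depth-sorted objects.

```python
# Hypothetical layer names, drawn in this fixed order (first = bottom).
LAYER_ORDER = ["tiles", "shadows", "objects", "markers"]
LAYER_RANK = {name: rank for rank, name in enumerate(LAYER_ORDER)}

def draw_order(sprites):
    """sprites: (layer, depth, name) tuples; returns names from bottom to top."""
    ordered = sorted(sprites, key=lambda s: (LAYER_RANK[s[0]], s[1]))
    return [name for _, _, name in ordered]

frame = [
    ("objects", 0.7, "villager"),
    ("shadows", 0.7, "villager shadow"),
    ("objects", 0.2, "house"),
    ("tiles", 0.0, "grass"),
]
order = draw_order(frame)
```

Since the layer rank is compared before depth, a shadow can never end up on top of a building or unit, no matter where its owner walks.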

In fact, I have a lot of new debug tools I had to make to visualize and find all the different parts. For example, a shadows-only preview:



[h3]Fallback rendering[/h3]

All this fancy shader stuff is great and all... except many older systems don't actually support it. The whole GPU-based compute shader processing is a relatively new concept. In fact, systems older than 3 years are increasingly unlikely to support it, whereas even 15-year-old machines could run the game before. Virtual machines also don't support advanced 3D accelerated graphics, which means I cannot test the game on macOS and Linux with the new rendering logic. This is also not something I can afford.

I did not enable analytics in the game, so I have no idea what systems MicroTown's players have. But the game has a strong retro vibe and I would not be surprised nor could I fault any players for trying it on an older system. So I have to support them. Which means implementing a fallback method for rendering everything.

(Un)fortunately, I cannot just fall back to my original Unity-based rendering, as I have essentially changed and rewritten everything. Thankfully, I have about 90% of actual logic reusable, because it's only the communication with the GPU that cannot utilize fancy data transfer logic. I have my list of sprites, their location, atlas information, etc. So "all" I need to do is replace the fancy shader stuff with a dumber, slower version.

Naively, I would basically have to send a mesh per sprite to the GPU one at a time. This is too slow (even with native batching), but I can also combine these meshes together. Which is also too slow with Unity's tools (but comparable to how fast the game ran before). So in the end, I am manually re-constructing meshes vertex-by-vertex to match the expected "sprite layout". It still makes things very CPU-bound, but it's still better than before. The biggest problem is just the human time it takes to get everything implemented and running smoothly.
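A minimal Python sketch of what building one combined mesh vertex-by-vertex looks like, assuming each sprite is a unit quad at a known position. The `build_mesh` function and its flat index layout are illustrative, not the game's code.

```python
def build_mesh(sprite_positions, size=1.0):
    """Append 4 vertices and 6 indices per sprite into one shared mesh."""
    vertices, triangles = [], []
    for x, y in sprite_positions:
        base = len(vertices)  # index of this sprite's first vertex
        vertices += [(x, y), (x + size, y), (x + size, y + size), (x, y + size)]
        # Two triangles per quad, referencing the four vertices just added.
        triangles += [base, base + 1, base + 2, base, base + 2, base + 3]
    return vertices, triangles

vertices, triangles = build_mesh([(0.0, 0.0), (2.0, 0.0)])
```

All the per-sprite work stays on the CPU here, which matches the description above: one large mesh is sent instead of many small ones, but every corner still has to be written out by hand.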

[h3]Final words[/h3]

Most of the time I was looking either at a complete mess like this:



or this:



Or trying to fix one of the 100 different things that were glitchy:



I even discovered 2D MicroTown:



As a final thought, I only summarized the "correct" path to implement this. I spent a lot of time experimenting, agonizing over why something doesn't work and getting exhausted by debugging. And this is besides Unity crashing every time it doesn't like something. The power of working directly with shaders is great and the results below speak for themselves, but it's also equally mind-numbingly tedious and disproportionately time-consuming. As a one-man team, I can't detour through such rabbit holes unless it's absolutely necessary.

So I'll end this with some benchmark comparisons. (I am not comparing FPS, because this isn't an accurate measure for games like MicroTown. The game doesn't have a fixed time step and fewer updates just means it has to process more changes.) These benchmarks are done on a 1440p resolution zoomed-out 160-size world with 2000 villagers drawing 25k entities (same as in the section on entities above) on 4-year-old high-end hardware:

[table]
[tr]
[th]Method[/th]
[th]Frame time[/th]
[th]Render time[/th]
[/tr]
[tr]
[td]New[/td]
[td]34 ms[/td]
[td]6 ms[/td]
[/tr]
[tr]
[td]Fallback[/td]
[td]54 ms[/td]
[td]25 ms[/td]
[/tr]
[tr]
[td]Old[/td]
[td]63 ms[/td]
[td]35 ms[/td]
[/tr]
[/table]

A full frame now takes 34 ms instead of 63 ms, which is almost twice as fast. Notably, rendering specifically now takes 6 ms instead of 35 ms, which is more than 5 times faster! Even the fallback method takes "only" 25 ms. It's still only 30 FPS, but this is also an extreme case. I think I have squeezed out as much performance as I will be able to from rendering. But I could also easily add more entities and sprites without a significant performance hit. (In fact, I imagine most of the future optimization will be to the world update logic. In these examples, world update takes around 20 ms.) Of course, this will vary system by system, but there should be a significant improvement for all systems because the game is CPU-bound.
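The ratios quoted above follow directly from the table. The short Python calculation below only redoes that arithmetic with the table's numbers; the variable names are mine.

```python
frame_ms = {"New": 34, "Fallback": 54, "Old": 63}
render_ms = {"New": 6, "Fallback": 25, "Old": 35}

frame_speedup = frame_ms["Old"] / frame_ms["New"]     # whole-frame improvement
render_speedup = render_ms["Old"] / render_ms["New"]  # rendering-only improvement
new_fps = 1000 / frame_ms["New"]                      # frames per second at 34 ms

# Time per frame spent on everything except rendering (world update, etc.).
other_ms = {name: frame_ms[name] - render_ms[name] for name in frame_ms}
```

This gives roughly 1.85x for the full frame, 5.8x for rendering and about 29 FPS, and the non-rendering share stays at 28 to 29 ms in all three rows, which is consistent with the remaining cost sitting in world update logic.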

[h2]Future plans[/h2]

I originally planned for "Shops and Houses" to be the next update (i.e. this update). This would have allowed Houses to use more items so that Shops can sell more items and then I could add more production chains and variety. But I only managed to get the Shops working, so the next main goal is getting Houses working. This will likely be a smaller update, but with bigger breaking changes. The Houses will likely also become 7-tile buildings.

[h2]Full changelog[/h2]

Changes

Market Square, Food Stall and Supply Stall are scheduled to be deprecated; they will remain functional for the current version, but will need to be replaced by Shops
Optimized rendering requires compute shader support: at least DirectX 11, OpenGL 4.3, OpenGL ES 3.1, Vulkan or Metal
Older systems will fall back to a slower rendering method
Add shop buildings: Grocery Shop, Produce Shop, Meat Shop, Clothes Shop, Supplies Shop, Medicine Shop with corresponding sold items
Shop building HUD inspection to show full item "progress" (unlike market stalls)
Add Shop concept
Tooltips to show sold items at shops (similar to storage building tooltip)
Export Station and large storage buildings will distribute items directly to nearby shops
Market Square, Food Stall and Supply Stall are no longer constructible but will remain functioning as before in the world
Rename "Fish Steaks" to Raw Fish
Rename new game world size options from "small", "medium" and "large" to "tiny", "small" and "medium"
Add "large" and "huge" new game world size options
Fish Animal Habitats will no longer spawn far away from coast
Remove tutorial steps for Market Square and Food Stall construction and replace with Produce Shop construction
Adjust tooltips and explanations mentioning Market Square logic to instead describe Shops
Animal Habitats no longer show tooltips on mouseover; instead Habitats can be selected in-world and show an inspection panel with the same information and changed display
Add in-world markers for Animal Habitats animals
Shadows now appear on the "tile layer" and don't overlap other objects
Tile outlines, selection indicators and other tile overlays appear above roads
Markers and various indicators now appear above in-world objects
Mine, Sand Pit and Clay Pit prepared ground sprite now appears on the tile "layer"; adjust worker position and mining effect offset
Villagers and animals now have shadows
Add water splash effect to fisher casting and reeling animation
Micropedia now has "Tutorials" section/list
Item tooltips and Micropedia descriptions now combine storage entries to multiple buildings
Add tooltip and description for Beer and Mead usage at Brewery
Add amount label to (de)construction item progress for building and road inspection
Add warning in main menu if compute shaders are not supported and the game will run a slower fallback rendering method
Path-finding will discourage walking between buildings as much
Units will no longer stick to exact hex tile centers when path-finding and choose straighter paths
Workers with carts entering and exiting Export Station and Import Station will now move at a non-slowed down speed
Hunting goal now counts skinned (but not shot/hunted) animals as part of the completion number
Add internal "animals skinned" stat
Roads to have separate sprites for vertically-mirrored bits and new dedicated sprites for the straight segments
Add a more-expensive Boardwalk type of Roads that can be built on otherwise unbuildable land tiles (Sand, Rocks, Clay)
Add Boardwalks tech
Add Boardwalk construction Goal
Roads to have new dedicated sprites for the three-way segments
Animals now walk instead of running (internally, since visually these are the same speed and animation currently)
Fish move faster to bait/hook
Idle villagers will walk more instead of running, especially for short distance
Adjust Import Stations item positioning so all stacks fit on the same tile
Import Stations will also distribute their items to Food Stalls and Supply Stalls
Large Warehouses and Large Granaries will also distribute their items to Food Stalls and Supply Stalls
Add issue warning for large storage buildings that have dedicated workers but no compatible item sources or targets
Large storage buildings to mention how workers operate in the description and change range explanation label for construction

Fixes

Micropedia search bar results would disappear when clicked before the click is processed and would not navigate to the clicked entry
Raw Fish item distribution proportion unintentionally defaulting to weight of 1 instead of 3
Oven operation progress not showing correctly when the worker is activating the cooking process
Export Station and Import Station not showing all item delivery marker icons
Florist missing tooltip/Micropedia information line about Flower "production" from Seedlings
Brewing Vat not showing cooking sprite and smoke
Building workers still accepting tasks from far away, blocking other workers from choosing them until they arrive
Building worker with no task going to the building from far away even when the slot is disabled
Building construction and deconstruction markers now appear on the tile "layer" and don't overlap other objects
Mouseover tile would occasionally be momentarily incorrectly calculated
Various sprite fixes
Path-finding will avoid leading units through single-tile buildings
Fix path-finding not considering "exiting" a building tile in a proper direction
Path-finding will prioritize "front" tiles for multi-tile buildings and avoid going through "occupied" building tiles
Path-finding will again avoid leading units through fields/plots/pens, but take corners instead
Animals will properly ignore pathing restrictions and walk in pens as desired
Fix path-finding choosing sub-optimal shortcuts
Fix villager slowdowns when running between certain road tiles, most noticeable with carts
Animal Habitats not spawning
Graph window not working when loading previous version saves
Acquiring items not present in loaded saves causing an exception in internal graph logic
Export Stations sometimes not delivering items when one of its target Import Stations has a full stack of that item
Technology panel buttons having mouseover detection region larger than the button itself
Skinned animal stat not being recorded in saves
Animal Habitats not despawning when reaching a low score
Fix large storage building worker incorrect waiting locations
Import Stations and large storage buildings would not fill up compatible building items fully
Fix exception when loaded saves would fail to assign unit idle locations
Remaining items not appearing in the world when extracted from stacks
Internal ID conflicts between units (namely Pigs that are led to Butcher) and items would cause one or the other to be excluded from future deliveries for the ongoing session
Import Station issue verification causing internal exceptions while it's constructing
Large storage building issue verification incorrectly triggering when there are compatible item source buildings in range
Building issue verification for required auxiliaries would incorrectly show double-issue for buildings with multiple auxiliaries
Building issue verification for issues that have a building or item list to report would not update the tooltip if these lists changed

Balancing

Garden Plots, Crop Fields, Tree Nurseries and Animal Pens now cost 2 Planks instead of 3 (and 1 Stone Slab)
Import Stations can now distribute up to 6 items to nearby targets
Large storage buildings can now distribute up to 6 items to nearby targets
Double Large Barn and Large Granary capacity
Increase Forester operation range from 3 to 4 to match Lumberjack operation range

Optimizations

Redo rendering pipeline and significantly reduce CPU usage and overhead
Most visual-related logic has different performance cost, usually reduced
Small, emptier worlds run slightly slower, but large worlds run significantly faster
Increased memory usage
All particle effects are redone and slightly different due to using a new system
World entering is slightly faster
Reduce redundant unit internal animation processing
Speed up villager idling location lookup
Entering (generating, loading) and exiting a world is slightly slower due to the pooled objects now having inefficient hierarchy for mass changes
Processing (pathfinding and delivery processing) threads sleep longer and more frequently when idle freeing CPU usage
Internal game logic is now capped at 90 FPS, so faster machines do not do needless processing
Slightly fewer HUD rendering batches
Building issue periodic checking is faster
Many minor optimizations

Rudy
Snowy Ash Games


[ 2021-12-13 18:56:59 CET ] [ Original post ]