Donald Hays

Object Pooling in my Lua UI Library

April 14, 2026

I’m working on a UI library. It’s not ready for a 1.0 release yet, but I’ve slowly started opening things up to get feedback while it’s still easy for me to make major changes. I recently posted an update video about adding image widgets.

I’ve been rather reckless with memory management to this point. I freely allocate memory when I need it and drop it on the floor when I’m done with it, leaving it for the garbage collector. I’ve had plenty of painful experiences with garbage collector pauses in games in the past, so I’ve been concerned this could be a pain point in the future.

However, I’ve noticed I’ve had far fewer problems with garbage collectors recently. I suspect there are multiple reasons for this, not least of which is the tendency towards incremental garbage collectors in game engines. They spread out the collection work over time, making it less likely to blow a frame’s time budget.

Regardless, developers often use techniques to reduce the need for garbage collection. If your code produces less garbage, there’s less pressure to collect it. Object pools can be used for this. When you’re done with an object, instead of dropping it for the GC, you put it into a reuse pool. Then, when you need a new object, you can recycle an existing object out of the pool, instead of allocating an entirely new one. You save time spent allocating new memory, and especially time spent in garbage collection cycles.

But object pools have their own tradeoffs. Most significantly, they add complexity. It’s an entire application-space system that doesn’t otherwise exist. Users are responsible for adding objects to the pool when they’re done. Objects don’t come out of the pool freshly initialized, so care must be taken to clean them as necessary. Finally, managing the pool itself takes CPU time.

That last point was a particular open question for me. Garbage collectors have a performance cost, of course, but they’re low-level systems that have been heavily optimized over time, and in my experience they’re quite good now. By using object pools, I would be adding my own bespoke layer of memory management, implemented in a dynamically-typed language. Would I actually outperform the GC?

So I didn’t want to jump in haphazardly. I only wanted to use object pools if I could demonstrate to myself that they actually yielded real improvements. So I waited until I had a big enough sample project that I could capture meaningful measurements.

The Implementation

My implementation is about as simple as it gets. It offers two methods: iui.get and iui.put. The pool supports multiple object types. To distinguish types, you pass typename to iui.get. If the sub-pool for that type is empty, it’ll generate a new object; otherwise, it’ll return an available object. When you’re done with an object, you just pass it to iui.put, and it’ll assign it to the correct sub-pool.

There’s one potential gotcha: iui.get assigns typename to the object’s _typename key. This key will appear if you try to iterate the object via pairs, so that’s something to be mindful of.

Every type pool also tracks an index into its internal storage, called top. Originally, I pushed objects onto the pool using a simple table.insert, and popped them using table.remove. Doing that requires the table to figure out the relevant index itself, based on the size of the pool. However, finding the length of an array-style table has an O(log n) time complexity. That is very fast, but it’s additional time spent that can be saved if it can be avoided. By tracking the index myself, I can point the operations directly at the correct location. I measured some additional performance doing this, so it seems worth it.
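
As a rough illustration of the design described above, here is a minimal sketch of a typed pool with a tracked top index. It is not the library's actual source: the internal names pools and items, and the choice to set _typename only when an object is first created, are assumptions made for the example.

```lua
-- Hypothetical re-creation of the pool; only the names iui.get, iui.put,
-- _typename, and top come from the post. Everything else is assumed.
local iui = {}
local pools = {} -- typename -> { items = { ... }, top = number }

function iui.get(typename)
    local pool = pools[typename]
    if not pool then
        pool = { items = {}, top = 0 }
        pools[typename] = pool
    end

    local top = pool.top
    if top == 0 then
        -- The sub-pool is empty, so allocate a fresh object.
        return { _typename = typename }
    end

    -- Pop using the tracked index instead of asking the table for its length.
    local object = pool.items[top]
    pool.items[top] = nil
    pool.top = top - 1
    return object
end

function iui.put(object)
    -- Assumes the object came from iui.get, so its sub-pool already exists.
    local pool = pools[object._typename]
    local top = pool.top + 1
    pool.items[top] = object
    pool.top = top
end

-- Usage: a recycled object comes back with its old fields still set.
local a = iui.get("rect")
a.x, a.y = 10, 20
iui.put(a)

local b = iui.get("rect") -- same table as a; b.x is still 10

-- The _typename key is visible to pairs alongside x and y.
local keyCount = 0
for _ in pairs(b) do
    keyCount = keyCount + 1
end
```

In this sketch the recycled table keeps whatever fields it had when it was returned, which is the "not freshly initialized" tradeoff mentioned earlier: the caller has to overwrite or clear anything it cares about.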

Pooling Draw Calls

Though I do temporary allocations all over the library, there’s one thing responsible for the majority: draw calls.

Widgets in the library perform both behavior and presentation in one function. However, frameworks like LÖVE and LÖVR separate update and draw into two separate functions. To reconcile this, the graphics commands in a widget need to be deferred until the draw phase.

Previously, I would wrap the graphics commands in an anonymous function, which would then be pushed onto a queue for deferred execution, by calling iui.draw.

if clip then
    iui.draw.pushClip(bx, by, bw, bh)
end

-- Every one of these anonymous functions would create a
-- temporary heap allocation, and every call to every
-- widget did this.
iui.draw(function()
    iui.graphics.setColor(1, 1, 1)
    iui.graphics.image(image, filter, ox, oy, ow, oh)
end)

if clip then
    iui.draw.popClip()
end

widgets/image.lua, old

This worked, but those anonymous functions would require heap allocations to close over their state. So every widget would cause a temporary allocation, which would have to be collected by the GC later.
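
One way to see this cost is to measure it with collectgarbage("count"). The following is only a sketch of such a measurement, not something taken from the library: the helper measure and the variable names are made up for the example, and the exact byte counts depend on the Lua implementation and version.

```lua
-- Hypothetical helper: reports roughly how many bytes each call to make(i)
-- allocates, by comparing heap usage before and after n calls.
local function measure(n, make)
    local keep = {}
    for i = 1, n do
        keep[i] = false -- pre-size the table so its growth isn't counted
    end

    collectgarbage("collect")
    collectgarbage("stop") -- keep the GC from reclaiming anything mid-measurement
    local before = collectgarbage("count")
    for i = 1, n do
        keep[i] = make(i)
    end
    local after = collectgarbage("count")
    collectgarbage("restart")

    return (after - before) * 1024 / n -- "count" reports kilobytes
end

local function shared() end

-- Each call builds a new function that captures x and y, so each call
-- has to allocate a closure to hold that captured state.
local closureBytes = measure(1000, function(i)
    local x, y = i, i * 2
    return function()
        return x + y
    end
end)

-- Returning an existing function captures nothing new, so this should
-- allocate little or nothing per call.
local sharedBytes = measure(1000, function()
    return shared
end)

print(closureBytes, sharedBytes)
```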

Now, these closures aren’t a good candidate for the object pool, but I had a different strategy in mind. I could make the individual graphics calls themselves create command objects that would go on a queue for later execution. Those command objects could use the object pool. As an added bonus, the iui.draw call would go away completely, leaving a more straightforward programming model.

if clip then
    iui.draw.pushClip(bx, by, bw, bh)
end

-- It *looks* like we're drawing here and now, but actually
-- these graphics APIs now create and enqueue deferred
-- command objects.
iui.graphics.setColor(1, 1, 1)
iui.graphics.image(image, filter, ox, oy, ow, oh)

if clip then
    iui.draw.popClip()
end

widgets/image.lua, new
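
To make the deferred model concrete, here is a minimal sketch of graphics calls that enqueue pooled command objects and a draw-phase step that replays them. It is an illustration under assumptions, not the library's implementation: acquire, enqueue, flush, the op field, the single shared free list, and the stub backend are all invented for the example.

```lua
-- Hypothetical sketch of deferred draw commands backed by a pool.
local free, freeTop = {}, 0   -- recycled command tables
local queue, queueLen = {}, 0 -- commands waiting for the draw phase

local function acquire(op)
    local cmd
    if freeTop > 0 then
        cmd = free[freeTop]
        free[freeTop] = nil
        freeTop = freeTop - 1
    else
        cmd = {}
    end
    cmd.op = op
    return cmd
end

local function enqueue(cmd)
    queueLen = queueLen + 1
    queue[queueLen] = cmd
end

local graphics = {}

-- These record what to draw; nothing is rendered yet.
function graphics.setColor(r, g, b)
    local cmd = acquire("setColor")
    cmd.r, cmd.g, cmd.b = r, g, b
    enqueue(cmd)
end

function graphics.image(image, filter, x, y, w, h)
    local cmd = acquire("image")
    cmd.image, cmd.filter = image, filter
    cmd.x, cmd.y, cmd.w, cmd.h = x, y, w, h
    enqueue(cmd)
end

-- Draw phase: replay each command against a backend, then recycle it.
local function flush(backend)
    for i = 1, queueLen do
        local cmd = queue[i]
        if cmd.op == "setColor" then
            backend.setColor(cmd.r, cmd.g, cmd.b)
        elseif cmd.op == "image" then
            backend.image(cmd.image, cmd.filter, cmd.x, cmd.y, cmd.w, cmd.h)
        end
        queue[i] = nil
        freeTop = freeTop + 1
        free[freeTop] = cmd
    end
    queueLen = 0
end

-- Usage with a stub backend that only logs the calls it receives.
local log = {}
local backend = {
    setColor = function() log[#log + 1] = "setColor" end,
    image = function() log[#log + 1] = "image" end,
}

graphics.setColor(1, 1, 1)
graphics.image("logo", "linear", 0, 0, 32, 32)
local loggedBeforeFlush = #log -- still 0: the calls were only recorded
flush(backend)                 -- now the backend sees setColor, then image
```

This sketch recycles every command through one free list to stay short, so a reused table may still carry fields from a different command type. That is harmless here because flush only reads the fields that belong to cmd.op, but separate sub-pools per command type, as described in the implementation section, avoid the mixing entirely.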

Importantly, this is not exactly a one-to-one switch from temporary allocations to object pools. We’re getting rid of the anonymous function for iui.draw entirely, while the iui.graphics APIs are now acquiring, configuring, and enqueueing draw command objects, instead of just drawing directly. So the system would involve fewer temporary allocations, but more objects overall. I felt confident I would see a big drop in garbage, but would it be faster?

Results

Object pooling worked!

In my sample app, I saw temporary memory allocations drop from about 60 kilobytes per frame to 20, and garbage collection cycles became about a third to a fourth rarer.

The average update time dropped by about 13%, while the median update time dropped by about 5%. The average improved by more than the median because it’s more impacted by the rare but significant garbage collection cycles. If Jeff Bezos walks into a bar, the average wealth of the patrons would immediately jump by a billion dollars, but if you sorted the patrons from richest to poorest, the patron standing in the middle would still likely be about as wealthy as before.

Finally, the standard deviation dropped by about half. This again indicates that rarer garbage collections result in more frame time consistency.
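
The effect of a rare spike on these three statistics can be shown with a small worked example. The frame times below are made-up numbers chosen to keep the arithmetic clean, not measurements from the sample app, and the helper functions are written just for this illustration.

```lua
-- Arithmetic mean of an array of numbers.
local function mean(t)
    local sum = 0
    for i = 1, #t do
        sum = sum + t[i]
    end
    return sum / #t
end

-- Median: sort a copy, then take the middle value (or average the middle two).
local function median(t)
    local sorted = {}
    for i = 1, #t do
        sorted[i] = t[i]
    end
    table.sort(sorted)
    local mid = math.floor(#sorted / 2)
    if #sorted % 2 == 1 then
        return sorted[mid + 1]
    end
    return (sorted[mid] + sorted[mid + 1]) / 2
end

-- Population standard deviation.
local function stddev(t)
    local m = mean(t)
    local sum = 0
    for i = 1, #t do
        sum = sum + (t[i] - m) ^ 2
    end
    return math.sqrt(sum / #t)
end

-- Hypothetical update times in milliseconds: nine ordinary frames and
-- one frame that absorbed a garbage collection pause.
local withPause = { 1, 1, 1, 1, 1, 1, 1, 1, 1, 11 }
-- The same ten frames with the pause removed.
local withoutPause = { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 }

-- withPause:    mean 2, median 1, standard deviation 3
-- withoutPause: mean 1, median 1, standard deviation 0
print(mean(withPause), median(withPause), stddev(withPause))
print(mean(withoutPause), median(withoutPause), stddev(withoutPause))
```

Removing the single slow frame halves the mean and takes the standard deviation to zero, while the median does not move at all.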

I’m especially pleased by the fact that I saw a real performance improvement even though the graphics system now enqueues multiple command objects per widget, instead of just one closure per widget. This suggests that multiple pooled objects win out over even a single temporary allocation. I may well see even bigger improvements in cases where the switch to pooled objects is more one-to-one.

Footnote

Under the previous model, the anonymous draw function would execute during the engine’s draw phase. If you wanted to perform any other tasks at that time, like messing with the render backend’s transform state or such, you could do so in that method.

Since the new model enqueues limited draw commands, that opportunity’s no longer available. To account for that possibility, I added a method called iui.draw.enqueue to let you execute arbitrary code during the draw phase, after previously issued draw commands, but before subsequent commands.

Yes, this means that I’m introducing temporary allocations again, after having just got rid of them, but this will be much rarer than before: currently, none of the built-in widgets make use of this functionality at all.

iui.draw.enqueue(function()
    -- Adjust the transform state
    
    -- Perform raw graphics rendering
    
    -- Unwind the change in transform
end)

Added Dark Mode

July 7, 2025

I dusted off the cobwebs on my site today, making sure the build process had all the modern dependencies and such. While I was at it, I went ahead and added Dark Mode support to the CSS 😎.

I would like to start trying to post somewhat more actively, so it was nice reacquainting myself with the site’s inner workings.

Updated Snake and Bubble Factory for Game Boy

April 2, 2023

It feels odd, releasing version updates for Game Boy games years after their initial release, and decades after the system was discontinued, and yet, here we are!

I’ve made updates to both my Snake and Bubble Factory games.

Bubble Factory

  • An alert bubble is now shown next to open doors when guards are about to appear
  • Difficulty screen now shows outlines of unearned stars to hint at the high score challenge
  • Other minor graphical improvements
  • Removed Twitter handle on title screen

Snake

  • Created a new title screen logo
  • Removed Twitter handle on title screen

Updated Bubble Factory for Game Boy

April 22, 2021

Similar to my previous update to Snake, I have updated Bubble Factory for Game Boy to version 1.1. This update fixes the same issue, where some emulators and flash carts wouldn’t save high scores correctly, because the game was configured as a non-standard cartridge type.

Also, like with Snake, I took the opportunity to refine the build process to be somewhat more standard. Building the game now expects SDCC to be in your PATH, instead of requiring you to place a copy of it in a specific location relative to the project directory.

Finally, there was a best-practice change, where I now wipe sprite object attribute memory before enabling sprites for the first time. Failing to do so can result in an issue where corrupted, random sprites appear on screen. I’ve never noticed this issue actually manifest in this game, but it was trivial to do the right thing, so I did.

Updated Snake for Game Boy

March 9, 2021

It is with great embarrassment that I’m pleased to announce version 1.1 of my Snake game for Game Boy!

This release fixes a single issue: high scores wouldn’t save correctly on some emulators and flash carts.

Originally, the game was configured as a ROM+RAM+Battery cart type. As it turns out, there was—as I understand—never a commercial release of a game with that cartridge type, which means emulators don’t have a reference for how exactly it should behave. As a result, even though every emulator I tested it on at the time worked, I was unknowingly relying on unspecified behavior, and it turns out that it doesn’t work everywhere. I have since changed the game to use the MBC5 memory bank controller, which is well-documented and supported. The game is still 32 kilobytes, and if you have a save file from an emulator that supported the old version, that same save should still work, but saving will now work on more emulators and flash carts.

Also, I moved the initial stack pointer from the end of high RAM to the end of regular RAM. This doesn’t make a practical difference for this game, but it’s just better practice in general: high RAM is fast, but not in a way that the stack can take advantage of.

Finally, updating the game involved a sizable commit to address the surprising amount of rot a project for a long-dead platform had accumulated. First, I switched from a custom build script to a more standard Makefile. The updated build process also expects you to install RGBDS externally, rather than have a copy of it in the project folder structure. These changes weren’t necessary, but are better practice. But more importantly, RGBDS itself has been evolving over time, and I addressed some deprecations and language changes.

Also embarrassing, Bubble Factory suffers the exact same save problem. I’ll be fixing it, too, later.
