LmCast :: Stay tuned in

The Lone Lisp Heap

Recorded: May 28, 2026, 8:03 p.m.

Original Summarized

The lone lisp heap

Matheus Moreira
Index

The lone lisp heap

Like many dynamic languages out there,
lone started out
simple. It was essentially a collection of data structures written in
C, an union comprising all of those types, a typed value
structure containing the union plus metadata, and a custom language
whose entire purpose is to bring all these values together into
patterns that resemble working programs.

struct lone_lisp_list { /* ... */ };
struct lone_lisp_vector { /* ... */ };
struct lone_lisp_table { /* ... */ };
/* ... */

enum lone_lisp_heap_value_type {
LONE_LISP_TYPE_LIST,
LONE_LISP_TYPE_VECTOR,
LONE_LISP_TYPE_TABLE,
/* ... */
};

struct lone_lisp_heap_value {
bool live: 1;
bool marked: 1;
/* ... */

enum lone_lisp_heap_value_type type;

union {
struct lone_lisp_list list;
struct lone_lisp_vector vector;
struct lone_lisp_table table;
/* ... */
} as;
};

When I put it this way, the language itself seems almost incidental to
the virtual machine work. That's because it is. Lone was in fact not
designed ahead of time. It was and still is being constructed in real
time. It's almost as if the language itself is arising
as a consequence of its own implementation. It grows in complexity
alongside the knowledge and understanding of its creator.

At first I was a neophyte. I was attracted to ideas that were almost
painfully simple, almost too naive to cope with the harsh reality of
the world. The code reflected that.

Take the above structures, for example. How are such values created?
My first instinct is to simply allocate some memory for each object.
Call malloc with the size of the structure, then
initialize the memory it returns. It's that easy.

Right?
Memory allocation

Lone is a lisp interpreter written in freestanding C. There is no
dynamic memory allocation in freestanding C. There is no such thing as
malloc. There is no libc. There is only me
and the code. If I wanted malloc, I would have to write
it myself.

So I wrote it. I read up on everything I could find online about
memory and its allocation. Then I made my own memory allocator.

To manage a memory block, the allocator needs to describe it. Let's
start with the basics. The most basic information about a piece of
data is its location and its size.

struct lone_memory {
size_t size;
unsigned char pointer[];
};

The allocator must also keep track of whether the block is free or
currently in use. Otherwise, there would be chaos.

struct lone_memory {
bool free;
size_t size;
unsigned char pointer[];
};

This is enough to manage any one block of memory. If it's free, then
the allocator can give the block to whoever asks. If it's not free,
then it can't. If they ask for less or exactly as much memory as this
block contains, then it can be given out. If they ask for more, then
it can't.

I started from literally nothing. Now I've got the one. Time to handle
infinity.

struct lone_memory {
struct lone_memory *prev, *next;
bool free;
size_t size;
unsigned char pointer[];
};

Simply link the blocks to each other. Now if some code requests a
block, the allocator can walk through the list of all blocks and
search for any block that fits.

And that's exactly what the allocator does.

for (block = system->memory; block; block = block->next)
if (block->free && block->size >= size)
break;

if (!block) { return 0; }

block->free = false;
return block->pointer;

Searches the list of blocks and literally returns the first block that
fits. This thing could end up returning 16 KiB blocks for 64 byte
requests. Crude, but effective. Incredibly wasteful, of course.
Potentially more wasteful than mmaping pages for every
single allocation.

The memory allocator can do better than this. Why not cut the block up
into smaller chunks?

block->free = false;
lone_memory_split(block, size);
return block->pointer;

Split the block into two blocks: an allocated block of exactly the
requested size, and a new free block for any excess memory that might
remain.

size_t excess = block->size - size;

/* create a new block only if there's enough
space for memory block descriptor + 1 byte */

if (excess >= sizeof(struct lone_memory) + 1) {
new = (struct lone_memory *) (block->pointer + size);

/* weave the new block into the linked lists */

new->free = true;
new->size = excess - sizeof(struct lone_memory);
block->size = size;
}

The excess memory block gets conjured up out of thin air: the
allocator simply drops a new memory block descriptor right after the
end of the previous memory block. When the links are established, the
excess memory can be allocated like any other memory block.

Memory deallocation is even simpler: just mark the block as free. The
block descriptor is just behind the pointer, trivially reachable.

struct lone_memory *block = ((struct lone_memory *) pointer) - 1;
block->free = true;

This is enough, but the allocator can do better. It can check if
surrounding blocks are also free, and undo the split if that is the
case.

lone_memory_coalesce(block);
lone_memory_coalesce(block->prev);

Coalescing a memory block is simple. When two adjacent blocks are
free, one block literally absorbs the other, block descriptor and all.

if (block && block->free) {
next = block->next;
if (next && next->free) {
block->size += next->size + sizeof(struct lone_memory);
/* unlink the blocks */
}
}

That's it. A simple yet complete first fit,
split/coalesce memory allocator. Give it a large block of memory to
manage and watch as it cuts the block up into small chunks and hands
them all out to anyone who asks.

#define LONE_MEMORY_SIZE (64 * 1024)
static unsigned char memory[LONE_MEMORY_SIZE];

lone_lisp_initialize(&lone, memory, sizeof(memory));

How good is this memory allocator? Let's just say that
terrible doesn't quite do it justice. This thing will
linearly scan the list of blocks for every single allocation. It will
fail pretty hard at keeping memory fragmentation under control, which
not only wastes memory but could result in entirely avoidable
out-of-memory situations. Small leftover blocks tend to accumulate
near the start of the list, further compounding the problems as it
runs. Prefixing block descriptors to every block means metadata
overhead of around 30% to 50% for typical allocations, not to mention
the fact those headers are classic targets for exploitation. The
shortcomings of this allocator could fill up an entire article all by
themselves.

Nevertheless, lone made do with this thing for
around three years. For all of its flaws, it absolutely does
allocate memory, and memory was all that I needed to make some
objects. I didn't really care about the allocator, I just needed some
values I could play with as I developed the language.

Yeah, I'll totally go ahead and do that now. Allocate some values, I
mean. Everything's gonna be simple now.

Right?
Scanning for pointers

It's not that simple. Other parts of the code have requirements that
must be fulfilled.

The
lone lisp garbage collector
is conservative. It walks the stack and scrutinizes everything it
finds. It wants to know whether they are pointers to lisp objects. In
order to do that, it must maintain a list of every single lisp value
pointer.

Sounds simple when I put it that way. Why not just do it? Just scan
the stack and compare every single word against
every single lisp pointer, right? Right?!

Yeah, no. That's not going to happen. Well, not unless
you want to get into the
quadratic hall of shame. There's got to be a way, some clever data structure to make the
problem go away...

The first heap

Wait, I think I know what to do. During my tenure in the Ruby mines
many eons ago, I just so happened to pick up some tricks. That
includes this one:

struct lone_lisp_heap {
struct lone_lisp_heap *next;
struct lone_lisp_heap_value values[LONE_LISP_HEAP_VALUE_COUNT];
};

A linked list of arrays of values!

General purpose memory allocators provide no guarantees; those must be
forged by hand by way of data structures. Now objects aren't
individually allocated any more,
the language buys them in bulk. The lone heap is essentially
a list of nice little chunks of lisp objects, all lying right next to
each other with zero gaps in between them. Now the garbage collector
can simply check if pointers fall within range and call it a day.
Alright!

Mechanically,
it's almost identical
to the general purpose memory allocator. It just works at the lisp
value level instead of working at the byte level. There's a linked
list of chunks, it walks through them all. For each chunk, it loops
over the values. If it finds a dead value, it just returns it. Chunks
are allocated with all values dead. The garbage collector just kills
the values in those slots instead of deallocating them.

The allocator isn't really allocating values, it's
resurrecting them.

The power of arrays never fails to impress. It's literally just
juxtaposing things together. Sounds like something a stupid caveman
would do. Grug programmer slaps
things together into arrays while the refined gentleman weaves linked
lists with grace and finesse. Yet it's arrays which the
processor likes, and it's arrays that the computer science validates
as the good solution to this problem. Go figure. It almost makes me
want to use arrays for literally everything.

Can I do that? Use one big stupid flat array for literally everything?
Is it even possible?

The tyranny of the pointers

The trouble with these arrays is precisely the fact they're big stupid
monolithic chunks of objects. Can't easily insert new values in the
middle of chunks, that would require resizing the array. Easier to
just put them into a linked list and call it a day. That way if more
storage is needed one can just allocate new chunks and link them in,
and if the entire chunk becomes dead one can just link them out and
deallocate the chunks. The insertion performance of linked lists
combined with the awesomeness of arrays. This must be what peak
performance looks like.

But why is that, anyway? Why is it impossible to easily resize these
arrays? Things would be so much easier if we could do that...

I kept thinking about it, then I realized there was no such thing as
"resizing" an array to begin with. That was an abstraction, a fantasy.
In reality, a completely new array gets allocated, the old data gets
copied over, and then the old array gets destroyed. That's what array
"resizing" actually is. It's destructive reconstruction. Like wars and
the economy.

A completely new array is allocated every single time. This array
could be in a totally different location.
This invalidates all pointers into the array.
All lone lisp objects are pointers to heap values inside these arrays.
Resizing them would dangle every pointer to the values inside them.
It would be pure chaos.

OK, not every single time. Allocators might
be clever enough to expand the array in place
if there is enough space to do so. They cannot, however,
guarantee this. So I pretend it never happens. Better to be pleasantly
surprised than disappointed.

Such is the tyranny of the pointers. They are lazy and unmoving,
eternally fixed in place, utterly unconcerned about the evolution of
the world around them. They are the
NIMBYs of the programming
world. Entire new edifices of memory are getting built all around
them, yet they do not react. They do nothing but drag their feet and
remain where they are until the world ends, forcing the rest of the
world to adapt to them instead. Yes, it is clear to me now.
Someone would have to do something about all of
these pointers.

But what exactly should be done about them? Pointers aren't gonna
change. Not for me, not for you, not for anyone tonight. If anything
was going to change, it was going to have to be the world. Could I
change the world?

What if the values... Weren't pointers to begin with? What if they
were just... Numbers? Indexes into the array? Easier said than done.
Like many lisps before it, lone used a
tagged pointer
representation. In lisp, everything is kung fu
pointers.
Such a fundamental change would touch everything
in the interpreter.

Did it anyway.
Rewrote the entire value representation. Mercifully, it had already
been
abstracted away
into nice functions and structures by this point. This was not
the first time
it was rewritten, or even
the second. Won't be the last time either. Mark my words.

So now lone values are no longer pointers, they are
indexes into the lone value heap. Yes,
the lone heap. Singular.
The linked lists are gone now.

struct lone_lisp_heap_value *
lone_lisp_heap_value_of(struct lone_lisp *lone, struct lone_lisp_value value)
{
size_t index = /* get the index out of the value */;

return &lone->heap.values[index];
}

The heap is literally just a big dumb flat array of values now. A
black monolith. An absolute unit. Lone values simply index into this
array. The array could be anywhere in memory, it doesn't matter where
it is. The real pointers are simply calculated on the fly.

Technically, every lone lisp value is now a heap base pointer plus
index pair. The base pointer is implicit in the interpreter structure
that's threaded through every single function. Code only manipulates
the position-independent indexes.

Values are now independent of their position in memory. Lone can now
fearlessly reallocate, move and resize the heap.

mremap
A lot of things are falling into place here all at once.

Position independent values. One big dumb array of values. Why not
just... Memory map this array as a huge page? Lone's heap values are
currently 56 bytes. Let's bump that up a bit by aligning it to 64
bytes. This means they will never ever overlap page boundaries, no
matter what page size Linux uses. And would you look at that... 64
bytes just happen to be exactly one cache line.

Lone's heap is made out of pages now. It is literally just
pages and pages and more pages, all filled to the brim with lisp
values. From byte allocation to value allocation to page allocation.
Not bad...

void lone_lisp_heap_initialize(struct lone_lisp *lone)
{
size_t size = LONE_LISP_HEAP_CAPACITY
* sizeof(struct lone_lisp_heap_value);

lone->heap.values = linux_mmap(0, size,
PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS,
-1, 0);

lone->heap.count = 0;
lone->heap.capacity = LONE_LISP_HEAP_CAPACITY;
}

mmap is awesome but Linux's got something
even more awesome: mremap, which makes it almost
trivial to manage the page-based heap. The kernel can restructure page
tables at will, which means it can move the entire lone heap without
copying anything!
The man page
wasn't kidding about the fact that it could be used to implement a
very efficient realloc.

static void lone_lisp_heap_grow(struct lone_lisp *lone)
{
size_t old_capacity, new_capacity,
old_size, new_size;

old_capacity = lone->heap.capacity;
old_size = old_capacity * sizeof(struct lone_lisp_heap_value);

new_capacity = old_capacity * 2;
new_size = new_capacity * sizeof(struct lone_lisp_heap_value);

lone->heap.values = linux_mremap(lone->heap.values,
old_size, new_size,
MREMAP_MAYMOVE);

lone->heap.capacity = new_capacity;
}

The lone heap

From nothing to a specialized mremappable cache friendly
heap of lone lisp values. Not bad for a homegrown programming
language.

It's still linearly scanning for dead values to resurrect though.
Starts from the beginning every single time. It has to. The garbage
collector could theoretically choose to assassinate any object in the
program, including the one at index zero. It can't assume that the
zeroth object will always be alive.

Right?

Matheus Moreira details the creation of the memory management and heap structure for the lone programming language, tracing the evolution from basic C memory allocation to a specialized system optimized for lisp values. Initially, lone was constructed from data structures written in C, establishing a foundation where the language's structure seemed incidental to the virtual machine implementation, growing in complexity alongside the creator's understanding.

The author began by implementing a custom memory allocator since freestanding C lacked dynamic allocation functions. This involved designing structures to manage memory blocks, tracking size and status, and using a linked list approach to manage free and allocated memory segments. The initial allocator employed a first-fit strategy, which, while functional, proved inefficient, leading to memory fragmentation and significant metadata overhead. To improve efficiency, the author introduced mechanisms for splitting memory blocks upon allocation to create smaller chunks, and coalescing adjacent free blocks to merge fragmented space.

However, the performance limitations of the custom allocator led the author to re-evaluate how lisp objects were represented in memory. Recognizing the difficulty of managing arbitrary pointer relationships within monolithic data structures, the focus shifted to creating a centralized heap structure. This led to the concept of the lone heap, implemented as a linked list of arrays containing lisp objects. This bulk allocation strategy allowed the garbage collector to operate more effectively by checking pointer validity within contiguous chunks of values.

A major conceptual challenge arose from the nature of arrays, which are inherently monolithic and resist easy resizing. The author argued that treating lisp values as pointers within these arrays created a "tyranny of the pointers," as resizing arrays would necessitate destructive reconstruction, invalidating pointers and introducing chaos. To resolve this, the fundamental representation of lone values was changed: they were redefined not as pointers, but as indices into the centralized lone heap. This transformation yielded a heap that was essentially a large, flat array of values, simplifying memory management by making values independent of their physical location.

To realize the physical implementation of this heap, the author leveraged modern operating system features. The heap was initialized by using the system mmap call to allocate a large, contiguous memory region. Furthermore, the author employed mremap, which allows the kernel to restructure page tables, enabling the entire heap to be moved without copying its content, thereby facilitating highly efficient reallocation. The heap was also aligned to page boundaries, specifically 64 bytes, to ensure cache-friendliness and prevent overlap with page boundaries.

Despite these advancements, the final implementation still required a linear scan by the garbage collector to identify dead objects, as it could not assume that objects at lower indices were necessarily dead. Nevertheless, the process resulted in a specialized, mremappable, cache-friendly heap architecture for lone lisp values, demonstrating a capability for a homegrown programming language memory system.