The Lone Lisp Heap
Recorded: May 28, 2026, 8:03 p.m.
| Original | Summarized |
The lone lisp heap Matheus Moreira The lone lisp heap Like many dynamic languages out there, struct lone_lisp_list { /* ... */ }; enum lone_lisp_heap_value_type { struct lone_lisp_heap_value { enum lone_lisp_heap_value_type type; union { When I put it this way, the language itself seems almost incidental to At first I was a neophyte. I was attracted to ideas that were almost Take the above structures, for example. How are such values created? Lone is a lisp interpreter written in freestanding C. There is no So I wrote it. I read up on everything I could find online about To manage a memory block, the allocator needs to describe it. Let's struct lone_memory { The allocator must also keep track of whether the block is free or struct lone_memory { This is enough to manage any one block of memory. If it's free, then I started from literally nothing. Now I've got the one. Time to handle struct lone_memory { Simply link the blocks to each other. Now if some code requests a for (block = system->memory; block; block = block->next) if (!block) { return 0; } block->free = false; Searches the list of blocks and literally returns the first block that The memory allocator can do better than this. Why not cut the block up block->free = false; Split the block into two blocks: an allocated block of exactly the size_t excess = block->size - size; /* create a new block only if there's enough if (excess >= sizeof(struct lone_memory) + 1) { /* weave the new block into the linked lists */ new->free = true; The excess memory block gets conjured up out of thin air: the Memory deallocation is even simpler: just mark the block as free. The struct lone_memory *block = ((struct lone_memory *) pointer) - 1; This is enough, but the allocator can do better. It can check if lone_memory_coalesce(block); Coalescing a memory block is simple. When two adjacent blocks are if (block && block->free) { That's it. A simple yet complete first fit, #define LONE_MEMORY_SIZE (64 * 1024) lone_lisp_initialize(&lone, memory, sizeof(memory)); How good is this memory allocator? Let's just say that Nevertheless, lone made do with this thing for Yeah, I'll totally go ahead and do that now. Allocate some values, I It's not that simple. Other parts of the code have requirements that The Sounds simple when I put it that way. Why not just do it? Just scan Yeah, no. That's not going to happen. Well, not unless Wait, I think I know what to do. During my tenure in the Ruby mines struct lone_lisp_heap { A linked list of arrays of values! General purpose memory allocators provide no guarantees; those must be Mechanically, The allocator isn't really allocating values, it's The power of arrays never fails to impress. It's literally just Can I do that? Use one big stupid flat array for literally everything? The trouble with these arrays is precisely the fact they're big stupid But why is that, anyway? Why is it impossible to easily resize these I kept thinking about it, then I realized there was no such thing as A completely new array is allocated every single time. This array OK, not every single time. Allocators might Such is the tyranny of the pointers. They are lazy and unmoving, But what exactly should be done about them? Pointers aren't gonna What if the values... Weren't pointers to begin with? What if they Did it anyway. So now lone values are no longer pointers, they are struct lone_lisp_heap_value * return &lone->heap.values[index]; The heap is literally just a big dumb flat array of values now. A Technically, every lone lisp value is now a heap base pointer plus Values are now independent of their position in memory. Lone can now Position independent values. One big dumb array of values. Why not Lone's heap is made out of pages now. It is literally just void lone_lisp_heap_initialize(struct lone_lisp *lone) lone->heap.values = linux_mmap(0, size, lone->heap.count = 0; mmap is awesome but Linux's got something static void lone_lisp_heap_grow(struct lone_lisp *lone) old_capacity = lone->heap.capacity; new_capacity = old_capacity * 2; lone->heap.values = linux_mremap(lone->heap.values, lone->heap.capacity = new_capacity; The lone heap From nothing to a specialized mremappable cache friendly It's still linearly scanning for dead values to resurrect though. |
Matheus Moreira details the creation of the memory management and heap structure for the lone programming language, tracing the evolution from basic C memory allocation to a specialized system optimized for lisp values. Initially, lone was constructed from data structures written in C, establishing a foundation where the language's structure seemed incidental to the virtual machine implementation, growing in complexity alongside the creator's understanding. The author began by implementing a custom memory allocator since freestanding C lacked dynamic allocation functions. This involved designing structures to manage memory blocks, tracking size and status, and using a linked list approach to manage free and allocated memory segments. The initial allocator employed a first-fit strategy, which, while functional, proved inefficient, leading to memory fragmentation and significant metadata overhead. To improve efficiency, the author introduced mechanisms for splitting memory blocks upon allocation to create smaller chunks, and coalescing adjacent free blocks to merge fragmented space. However, the performance limitations of the custom allocator led the author to re-evaluate how lisp objects were represented in memory. Recognizing the difficulty of managing arbitrary pointer relationships within monolithic data structures, the focus shifted to creating a centralized heap structure. This led to the concept of the lone heap, implemented as a linked list of arrays containing lisp objects. This bulk allocation strategy allowed the garbage collector to operate more effectively by checking pointer validity within contiguous chunks of values. A major conceptual challenge arose from the nature of arrays, which are inherently monolithic and resist easy resizing. The author argued that treating lisp values as pointers within these arrays created a "tyranny of the pointers," as resizing arrays would necessitate destructive reconstruction, invalidating pointers and introducing chaos. To resolve this, the fundamental representation of lone values was changed: they were redefined not as pointers, but as indices into the centralized lone heap. This transformation yielded a heap that was essentially a large, flat array of values, simplifying memory management by making values independent of their physical location. To realize the physical implementation of this heap, the author leveraged modern operating system features. The heap was initialized by using the system mmap call to allocate a large, contiguous memory region. Furthermore, the author employed mremap, which allows the kernel to restructure page tables, enabling the entire heap to be moved without copying its content, thereby facilitating highly efficient reallocation. The heap was also aligned to page boundaries, specifically 64 bytes, to ensure cache-friendliness and prevent overlap with page boundaries. Despite these advancements, the final implementation still required a linear scan by the garbage collector to identify dead objects, as it could not assume that objects at lower indices were necessarily dead. Nevertheless, the process resulted in a specialized, mremappable, cache-friendly heap architecture for lone lisp values, demonstrating a capability for a homegrown programming language memory system. |