This document captures the current state of meniOS user address spaces and the plan that drove issue #28 (per-process virtual memory management). The implementation now in trunk covers region metadata, eager PT_LOAD registration, lazy-growing stacks, and user-mode page-fault recovery. Remaining enhancements (heap growth, demand paging, bespoke user CR3 layouts) build on this foundation.
- Every process carries a `vm_regions[]` table (`vm_region_t`) that records the virtual range, permissions, and growth behaviour for code, rodata, data, stack, and future heap/mmap regions.
- `proc_create_user()` registers the stack as a grow-down region. Only the top page is mapped initially; the rest of the stack window is populated on demand via page faults.
- `elf64_load_image()` classifies each PT_LOAD segment by its flag bits, maps the pages, and records the range in the owning region metadata so teardown can free it later.
- The page-fault handler (`vm_region_handle_page_fault`) intercepts user faults and allocates a fresh page when the address falls inside a grow-down (stack) or grow-up region. Unexpected or permission-violating accesses still trigger the traditional diagnostic path.
- Physical allocations are still tracked in `proc->user_segments` for cleanup, but new mappings are also reflected in the region metadata (`committed_base`/`committed_top`).
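
The fault-recovery decision described above can be sketched roughly as follows. This is a minimal illustration, not the meniOS handler: `region_sketch_t`, `grow_t`, and `region_try_grow()` are hypothetical names, and only `committed_base`/`committed_top` are taken from the text. The sketch widens the committed range to the faulting page so that the range stays contiguous; the real handler may commit pages one at a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Hypothetical growth policies; the real meniOS enum may differ. */
typedef enum { GROW_NONE, GROW_DOWN, GROW_UP } grow_t;

/* Hypothetical, trimmed-down region descriptor. */
typedef struct {
    uint64_t base;            /* lowest address of the reserved window */
    uint64_t limit;           /* one past the highest reserved address */
    uint64_t committed_base;  /* lowest address currently backed       */
    uint64_t committed_top;   /* one past the highest backed address   */
    grow_t   grow;
} region_sketch_t;

/* Decide whether a user fault at `addr` is recoverable by growing the
 * region. On success the committed range is widened to include the
 * faulting page and true is returned; a caller would then allocate and
 * map zeroed frames for the newly covered pages. */
static bool region_try_grow(region_sketch_t *r, uint64_t addr)
{
    assert(r->committed_base <= r->committed_top);

    if (addr < r->base || addr >= r->limit)
        return false;                 /* outside the reserved window */
    if (addr >= r->committed_base && addr < r->committed_top)
        return false;                 /* already mapped: not a growth fault */

    uint64_t page = addr & ~(uint64_t)(PAGE_SIZE - 1);

    if (r->grow == GROW_DOWN && page < r->committed_base) {
        r->committed_base = page;     /* stack grows toward lower addresses */
        return true;
    }
    if (r->grow == GROW_UP && page >= r->committed_top) {
        r->committed_top = page + PAGE_SIZE;
        return true;
    }
    return false;
}
```

A fault inside the already committed range returns false here because it cannot be a growth fault; in the flow described above it would fall through to the diagnostic path.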
- Canonical Layout
  - Introduce a shared `vm_layout.h` describing text, rodata, data, heap, stack, and mmap windows (today stack starts via `user_stack_top()`; code still relies on PID-strided helpers).
  - Reserve guard pages between regions once the new layout is in place.
- Heap Support
  - ✅ A userland `brk`/`sbrk` shim has been implemented in `src/libc/brk.c` (issue #423), providing POSIX-compatible heap allocation backed by `mmap(MAP_ANONYMOUS)`.
  - Future: Carve out a native kernel-managed grow-up heap region and hook it into the lazy page allocator.
  - Replace the flat `user_segments` bookkeeping with region-aware structures so teardown can free lazily allocated heap pages.
- Address Space Isolation
  - Replace the “clone kernel CR3” approach with a curated template that maps only shared kernel ranges plus user regions.
  - Harden `proc_exit` to iterate regions, unmap virtual ranges, and release physical pages without relying solely on `user_segments`.
- Testing & Tooling
  - Add self-tests that force stack expansion and trigger fault recovery paths.
  - Once the heap is wired, stress the allocator by forcing repeated grow/shrink cycles.
With these follow-ups in place we can move toward demand paging and file-backed mappings while keeping the region infrastructure as the central source of truth.
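
To make the canonical-layout item more concrete, here is one possible shape for a layout table with a guard-page check. Everything in it is an assumption for illustration: `vm_window_t`, `layout_sketch`, `layout_has_guards()`, and all addresses are placeholders, not values from a meniOS `vm_layout.h`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096ull

/* Hypothetical window descriptor: [base, limit) in user virtual space. */
typedef struct {
    const char *name;
    uint64_t    base;   /* inclusive, page aligned */
    uint64_t    limit;  /* exclusive, page aligned */
} vm_window_t;

/* Placeholder addresses, sorted by base. Each window starts at least one
 * page above the previous limit, leaving an unmapped guard page. */
static const vm_window_t layout_sketch[] = {
    { "text",   0x0000000000400000ull, 0x0000000000800000ull },
    { "rodata", 0x0000000000801000ull, 0x0000000000c00000ull },
    { "data",   0x0000000000c01000ull, 0x0000000001000000ull },
    { "heap",   0x0000000001001000ull, 0x0000000040000000ull },
    { "mmap",   0x0000000040001000ull, 0x00007f0000000000ull },
    { "stack",  0x00007f0000001000ull, 0x00007ffffffff000ull },
};

/* Return true when every window is page aligned, non-empty, and separated
 * from its successor by at least one guard page. */
static bool layout_has_guards(const vm_window_t *w, size_t n)
{
    assert(w != NULL);
    for (size_t i = 0; i < n; i++) {
        if (w[i].base % PAGE_SIZE != 0 || w[i].limit % PAGE_SIZE != 0)
            return false;
        if (w[i].base >= w[i].limit)
            return false;
        if (i + 1 < n && w[i + 1].base < w[i].limit + PAGE_SIZE)
            return false;
    }
    return true;
}
```

A check like this could run as a boot-time self-test so that a layout change that removes a guard gap is caught early.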
With the introduction of `vm_map`, `vm_unmap`, and `vm_clone`, user address-space management now has a kernel-facing API:
- `vm_map(proc, params)` reserves a region, allocates physical pages, zeroes them, maps them into the target CR3, and updates both region metadata and the legacy `user_segments[]` bookkeeping.
- `vm_unmap(proc, base, length)` removes page table entries and frees the backing frames for the specified range, then drops the region descriptor. (Current implementation assumes whole-region unmap; partial unmap support is a follow-up.)
- `vm_clone(child, parent)` duplicates the parent’s region table, allocates fresh frames for each committed page, copies contents, and maps them into the child.
Limitations:
- No copy-on-write yet; clone eagerly copies committed pages.
- `vm_unmap` currently frees entire regions at once; page-granular teardown will come alongside mmap/heap work.
- Guard pages and canonical per-process layouts are still pending (see roadmap above).
These APIs bridge the earlier region metadata work with actual page-table manipulation, enabling higher-level features (heap grow, mmap, fork/exec) to advance.
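
The eager (non-copy-on-write) clone behaviour can be illustrated with a small user-space model. This is a sketch under stated assumptions: `region_pages_t` and `clone_pages_eager()` are hypothetical, `malloc` stands in for the kernel frame allocator, and page-table updates are omitted entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096u
#define MAX_PAGES 8u

/* Hypothetical stand-in for one region: frames[i] backs page i of the
 * region, or is NULL when that page has not been committed. */
typedef struct {
    uint8_t *frames[MAX_PAGES];
} region_pages_t;

/* Eagerly duplicate every committed page of `parent` into `child`.
 * Returns 0 on success, or -1 if an allocation fails, in which case the
 * partially built child is released again. */
static int clone_pages_eager(region_pages_t *child, const region_pages_t *parent)
{
    assert(child != parent);
    memset(child, 0, sizeof *child);

    for (unsigned i = 0; i < MAX_PAGES; i++) {
        if (parent->frames[i] == NULL)
            continue;                       /* uncommitted: nothing to copy */

        child->frames[i] = malloc(PAGE_SIZE);
        if (child->frames[i] == NULL) {
            for (unsigned j = 0; j < i; j++) {
                free(child->frames[j]);     /* free(NULL) is a no-op */
                child->frames[j] = NULL;
            }
            return -1;
        }
        memcpy(child->frames[i], parent->frames[i], PAGE_SIZE);
    }
    return 0;
}
```

Because each committed page is copied up front, a write in the child never touches the parent's frame. The cost is a clone time proportional to the committed footprint, which is the trade-off copy-on-write would remove.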
With `vm_clone` in place, meniOS now offers a full fork/execve path:
- `proc_fork()` allocates a child `proc_info_t`, clones the kernel PML4, and calls `vm_clone()` to duplicate committed user regions into the new address space. The syscall trampoline copies the parent frame into the child so both return to user space with distinct return values (`0` in the child, `child_pid` in the parent).
- `proc_exec_image()` stages the replacement image in a fresh CR3, mapping a clean stack region before running the ELF loader. On success the old user mappings are torn down, the new root installs in the process, and the syscall frame is reset with pristine registers and user segments so the caller resumes in ring 3 at the ELF entry point. The helper now also seeds the user stack with `argc`, `argv`, and `envp` so binaries observe a Linux-like process entry contract.
- `SYS_fork` and `SYS_execve` dispatch into the helpers above. `SYS_execve` now copies the user-supplied path with `proc_user_buffer_accessible()` and streams the ELF contents from the VFS before handing them to `proc_exec_image()`, surfacing filesystem errors as negative errno values.
- The user demo program now exercises `fork`, emitting per-branch messages before the child exits, keeping the example deterministic while we wire up richer userland payloads.
- `vm_clone()` now rounds partially committed regions up to full pages before copying, ensuring child processes inherit stack data that was still sharing a leaf page with uncommitted space.
This closes the loop on process cloning and image replacement: the scheduler can now spin up arbitrary user tasks, duplicate them, and hand control over to new executables without rebooting the kernel.
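
For readers unfamiliar with the Linux-like entry contract mentioned above, the following sketch builds the initial stack image in a plain buffer: strings at the top, then `argc`, the `argv` pointers, a NULL, the `envp` pointers, and a final NULL, with the resulting stack pointer 16-byte aligned. `seed_stack()` and its parameters are hypothetical; the meniOS helper may order or align things differently, and the auxiliary vector is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* `img` holds the bytes for user addresses [top - size, top).
 * Returns the initial user stack pointer, or 0 if the data does not fit. */
static uint64_t seed_stack(uint8_t *img, size_t size, uint64_t top,
                           int argc, const char *const *argv,
                           int envc, const char *const *envp)
{
    enum { MAX_SLOTS = 64 };
    uint64_t slots[MAX_SLOTS];

    /* The sketch assumes the window itself starts 16-byte aligned. */
    assert(((top - size) & 15u) == 0);
    if (argc < 0 || envc < 0)
        return 0;

    size_t nslots = (size_t)argc + (size_t)envc + 3;  /* argc + two NULLs */
    if (nslots > MAX_SLOTS)
        return 0;

    size_t off = size;   /* offset of the current stack top inside img */
    size_t s = 0;
    slots[s++] = (uint64_t)argc;

    for (int pass = 0; pass < 2; pass++) {
        int n = (pass == 0) ? argc : envc;
        const char *const *v = (pass == 0) ? argv : envp;
        for (int i = 0; i < n; i++) {
            size_t len = strlen(v[i]) + 1;
            if (len > off)
                return 0;
            off -= len;
            memcpy(img + off, v[i], len);
            slots[s++] = top - size + off;   /* user address of the string */
        }
        slots[s++] = 0;                      /* NULL terminator of the vector */
    }

    size_t bytes = nslots * sizeof(uint64_t);
    if (bytes > off)
        return 0;
    off -= bytes;
    off &= ~(size_t)15;                      /* align the entry stack pointer */
    memcpy(img + off, slots, bytes);
    return top - size + off;
}
```

With this layout a freestanding `_start` can read `argc` from the word at the stack pointer, treat the following words as `argv`, and find `envp` right after the `argv` terminator.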
Anonymous mmap/munmap now ride on top of the VM manager:
- `kmmap()` translates POSIX protection/flag bits into `vm_region_t` metadata and carves a `VM_REGION_MMAP` entry using `vm_map()`. Each process tracks an independent `(base, next, limit)` window so mappings live away from the code/stack layout.
- `kmunmap()` looks up the owning region, calls `vm_unmap()`, and rolls back the cursor when the highest mapping is released. There are no partial unmaps yet, matching the `vm_unmap` semantics.
- Syscalls `SYS_mmap` and `SYS_munmap` validate arguments, surface kernel errors as negative errno values, and return page-aligned addresses. A tiny libc wrapper (`src/libc/mman.c`) forwards the POSIX API into the new interrupt 0x80 entries and keeps `errno` in sync for userspace allocators like jemalloc.
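
The `(base, next, limit)` window behaves like a bump allocator with a rollback on release. A minimal sketch of that bookkeeping, assuming hypothetical names (`mmap_window_t`, `window_alloc()`, `window_release()`) and ignoring the actual `vm_map()`/`vm_unmap()` calls, might look like this:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096ull

/* Hypothetical per-process mmap window. */
typedef struct {
    uint64_t base;   /* start of the window                 */
    uint64_t next;   /* first address not yet handed out    */
    uint64_t limit;  /* one past the end of the window      */
} mmap_window_t;

/* Round a byte count up to whole pages (wraps to 0 on overflow). */
static uint64_t page_round_up(uint64_t length)
{
    return (length + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
}

/* Reserve `length` bytes at the cursor. Returns the page-aligned base
 * address, or 0 when the request is empty or does not fit. */
static uint64_t window_alloc(mmap_window_t *w, uint64_t length)
{
    assert(w->base <= w->next && w->next <= w->limit);

    uint64_t len = page_round_up(length);
    if (len == 0 || len > w->limit - w->next)
        return 0;

    uint64_t addr = w->next;
    w->next += len;
    return addr;
}

/* Release a whole mapping. The cursor rolls back only when the mapping
 * was the highest one; lower holes are not reused in this sketch. */
static void window_release(mmap_window_t *w, uint64_t addr, uint64_t length)
{
    if (addr + page_round_up(length) == w->next)
        w->next = addr;
}
```

Releasing a mapping that is not the highest one leaves a hole that this scheme never reuses, so a long-running process that maps and unmaps out of order can exhaust the window even when little memory is live.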