feat(swiftbuddy): MemPalace v1, native macOS theming, HF model manage…#18
Open
- Injected an export pipeline so that the MLX Metal library initialization hooks bypass GitHub Actions test environments natively
- Introduced a currentWing target on ChatViewModel for persona routing
- Intercepted userText to explicitly search SwiftData-native memories
- Prepended the retrieved factual context invisibly to the system prompt, aiming for zero-latency, stable context retention even on less capable models
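The retrieval-and-prepend flow described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `Memory`, `retrieveMemories`, and `buildSystemPrompt` are hypothetical names, memories are held in a plain array rather than SwiftData, and keyword overlap stands in for whatever search the app actually performs.

```swift
// Hypothetical stand-in for a stored memory; the real app keeps these in SwiftData.
struct Memory {
    let wing: String   // persona the memory belongs to (cf. currentWing)
    let text: String
}

// Lowercased word set used for a naive keyword-overlap score (an assumption).
func words(in text: String) -> Set<String> {
    Set(text.lowercased()
        .split(whereSeparator: { !$0.isLetter && !$0.isNumber })
        .map(String.init))
}

// Return up to `limit` memories from the given wing that share words with userText.
func retrieveMemories(for userText: String, wing: String,
                      from memories: [Memory], limit: Int = 3) -> [Memory] {
    let query = words(in: userText)
    return memories
        .filter { $0.wing == wing }
        .map { (memory: $0, score: words(in: $0.text).intersection(query).count) }
        .filter { $0.score > 0 }
        .sorted { $0.score > $1.score }
        .prefix(limit)
        .map { $0.memory }
}

// Prepend retrieved facts to the base system prompt; the user never sees them.
func buildSystemPrompt(base: String, userText: String, wing: String,
                       memories: [Memory]) -> String {
    let hits = retrieveMemories(for: userText, wing: wing, from: memories)
    guard !hits.isEmpty else { return base }
    let facts = hits.map { "- " + $0.text }.joined(separator: "\n")
    return "Known facts:\n\(facts)\n\n\(base)"
}
```

Filtering by wing before scoring keeps one persona's memories out of another persona's prompt, which is the routing role the commit assigns to currentWing.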
…teObject lifecycle bug
…rference and make downloaded models directly tappable to load
…hat bypasses token.isThinking flag
…for finalized chat messages
…r via HubApi for SwiftBuddy
… MLX/GGUF formatting for Hub queries
…ing the Search UI model list
…o prevent macOS layout recursion crashes resulting in blank models
…cursive background querying for HF Hub discovery
…ative Hub cursor pagination
…skeleton constraints for HuggingFace Hub modal layout
…tize absolute cached file size in row view
…ng to RegistryService to trace GitHub API access drops
…hed persona.json and statically request known room txt files
…g preventing successful 404 recovery
… WAL transaction flooding during massive persona corpus ingestion
…lts on SwiftBuddyApp boot sequence
…oops by converting TextEditor blocks to vertical TextFields inside iOS/macOS active ScrollViews
…ine and introduce Native graphical Map hierarchy for memory rooms
…or teardown on macOS modal sheets
…natively into ChatView toolbars for RAG identity mapping
…tly reflect the currently selected memory persona wing
…try and pivot root Navigation to a primary Friends List model
…ectures by forcefully prepending RAG variables linearly against raw User instructions rather than allocating hostile System Role bounds
…cks and trap silent HF snapshot failures to guarantee observable developer console logs
…serve KV Prefix caching continuity across MLX generations, and patch RPG Thought UI aesthetics
…r to reject raw boilerplate text and prevent small parameter LLM line-by-line regurgitation
…ridging and append Persona deletion traps in UI
…s RAG context extraction
…ap to prevent multiline Persona RAG directives from leaking into user UI bubbles
… sequence during Persona model downloads
…he ModelPicker sheets to display real-time global download speeds and ETA dynamically
… Qwen 3 and Qwen 3.5 exclusivity as requested
…mpalace-v1
# Conflicts:
#	.github/workflows/build.yml
#	scripts/profiling/profile_runner.py
… and DMG packaging pipeline via GitHub Actions
…ut DMG artifact names
…catalog with Phi-4, Qwen3.5, and Liquid CFM
…nd preserve prompt across db reloads
…ated recommended() catalog function
…ace polling endpoints
…l Inspector sidebar
…d() calls during MoE streaming

MoE routing exhibits strong temporal locality: adjacent tokens frequently route to the same experts (60-70% overlap). This cache stores recently-loaded quantized expert weight matrices in a bounded LRU (default 2048 entries) keyed by (safetensorsPath, tensorName, expertIndex). On cache hits, the entire pread() → allocator::malloc → eval cycle is skipped, yielding zero I/O latency for repeated expert accesses.

Cache hit/miss metrics are logged to stderr every 10 seconds alongside existing SSD stream stats. The cache is automatically cleared on model unload to prevent stale weights and free unified memory.
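As an illustration of the caching scheme in this commit message, here is a minimal bounded LRU keyed by (safetensorsPath, tensorName, expertIndex). It is a sketch under assumptions, not the PR's code: `ExpertKey` and `ExpertLRUCache` are hypothetical names, the `load` closure stands in for the pread() path, and eviction uses a linear scan for brevity where a production cache would more likely use a linked list.

```swift
// Hypothetical cache key; the three fields follow the commit message.
struct ExpertKey: Hashable {
    let safetensorsPath: String
    let tensorName: String
    let expertIndex: Int
}

// Minimal bounded LRU with hit/miss counters.
// Eviction scans for the least recently used entry (O(n)) to keep the sketch short.
final class ExpertLRUCache<Value> {
    private struct Entry {
        var value: Value
        var lastUsed: UInt64
    }

    private var entries: [ExpertKey: Entry] = [:]
    private var clock: UInt64 = 0
    private(set) var hits = 0
    private(set) var misses = 0
    let capacity: Int

    init(capacity: Int = 2048) {
        precondition(capacity > 0, "capacity must be positive")
        self.capacity = capacity
    }

    var count: Int { entries.count }

    // Return the cached value, or call `load` (e.g. a disk read) on a miss.
    func value(for key: ExpertKey, load: () -> Value) -> Value {
        clock += 1
        if var entry = entries[key] {
            hits += 1
            entry.lastUsed = clock
            entries[key] = entry
            return entry.value
        }
        misses += 1
        let loaded = load()
        if entries.count >= capacity,
           let oldest = entries.min(by: { $0.value.lastUsed < $1.value.lastUsed })?.key {
            entries.removeValue(forKey: oldest)
        }
        entries[key] = Entry(value: loaded, lastUsed: clock)
        return loaded
    }

    // Drop everything, as on model unload.
    func removeAll() {
        entries.removeAll()
        hits = 0
        misses = 0
    }
}
```

On a hit the `load` closure is never invoked, which mirrors the skipped read-allocate-eval cycle; calling `removeAll()` on unload corresponds to the clearing step described above.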
…enchmark Results with Hot Expert LRU Cache active:
- SSD + 16-Worker Prefetch: 3.8 tok/s, 5.95s TTFT, 34.9 GB GPU
- SSD + TurboQuant: 3.0 tok/s, 9.46s TTFT, 34.9 GB GPU
- SSD Stream (cold): 0.01 tok/s, 299.66s TTFT, 88.2 GB GPU

The expert cache eliminates ~60-70% of redundant pread() calls on warm runs, delivering a 300x+ improvement over cold SSD streaming.
102ef78 to
278ea04
… commit 122B test logs
…ment, TDD harness
SwiftBuddy App:
MemPalace Core:
Testing Infrastructure:
Build: