Self-Hosted Coding AI Case Study
A local-first AI assistant for potential clients with questions about software architecture, debugging, Firebase cost, MVP scope, and codebase risk.
The original work is not simply running an open model. It is the product layer around it: BVT-specific retrieval, client-intake workflow, evaluation, rate limits, safety rules, Mac-hosted backend infrastructure, and human handoff.
This page currently demonstrates the workflow. The production version connects to a Mac-hosted backend running a local model behind a custom API.
"My Firebase app works, but screens are slow and reads are climbing. Is this an architecture problem?"
A useful assistant should not guess from vibes. It should ask about listener usage, query shape, high-traffic screens, data duplication, and production risk before recommending a rewrite.
Architecture · Firebase Cost · Human Handoff
Running Ollama with someone else's model is useful infrastructure. The original project is the consulting-specific AI product built around that model.
Approved context from site pages, blog posts, tools, case studies, service pages, and engineering philosophy.
Questions are routed into architecture, debugging, Firebase cost, MVP scope, code review, and launch-risk patterns.
Realistic client prompts measure usefulness, caution, refusal behavior, and when the assistant should send someone to Bill.
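As a rough illustration of the routing and evaluation ideas above, here is a minimal sketch that assumes simple keyword matching. The keyword sets, the `general` fallback, the handoff terms, and the `EvalCase` structure are all hypothetical; the source does not describe the real rules. It also checks only routing and handoff, not usefulness, caution, or refusal behavior.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword sets per route; the real routing rules are not in the source.
ROUTES = {
    "firebase_cost": {"firebase", "firestore", "reads", "billing"},
    "debugging": {"crash", "bug", "error", "exception"},
    "architecture": {"architecture", "rewrite", "scaling", "refactor"},
    "mvp_scope": {"mvp", "scope", "budget", "timeline"},
    "code_review": {"review", "audit", "codebase"},
    "launch_risk": {"launch", "release", "deadline"},
}
# Hypothetical terms suggesting the question should go to a human.
HANDOFF_TERMS = {"quote", "contract", "hire", "pricing"}


def tokens(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def route(question: str) -> str:
    """Pick the route with the most keyword hits, or 'general' if none match."""
    words = tokens(question)
    scores = {name: len(words & terms) for name, terms in ROUTES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"


def needs_handoff(question: str) -> bool:
    """Flag questions that mention any handoff term."""
    return bool(tokens(question) & HANDOFF_TERMS)


@dataclass
class EvalCase:
    prompt: str
    expected_route: str
    expect_handoff: bool


def run_eval(cases: list[EvalCase]) -> float:
    """Return the fraction of cases where both route and handoff match."""
    passed = sum(
        route(c.prompt) == c.expected_route
        and needs_handoff(c.prompt) == c.expect_handoff
        for c in cases
    )
    return passed / len(cases)
```

A real evaluation set would likely score the model's full answers by hand or with a rubric; this sketch only shows how a small, repeatable check could be wired up before that.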
GitHub Pages hosts the portal. A FastAPI backend runs on a dedicated MacBook, Mac mini, or Mac Studio. Ollama runs the local model. Cloudflare Tunnel exposes the API.
The API can later route heavy traffic or larger models to GPU cloud compute without changing the public website experience.
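The hosting and routing described above could look roughly like the sketch below. It uses Ollama's `/api/generate` endpoint on its default port 11434. Everything else is an assumption for illustration: the cloud URL is a placeholder, the cloud backend is assumed to expose the same Ollama-style API, and the `max_local` threshold and function names are hypothetical. In the setup described here, this logic would sit inside the FastAPI backend.

```python
import json
import urllib.request

# 11434 is Ollama's default port; the cloud URL is a hypothetical placeholder.
LOCAL_OLLAMA = "http://localhost:11434"
CLOUD_GPU = "https://gpu.example.com"


def choose_backend(active_requests: int, wants_large_model: bool,
                   max_local: int = 2) -> str:
    """Send large-model or overflow traffic to the cloud, the rest to the Mac."""
    if wants_large_model or active_requests >= max_local:
        return CLOUD_GPU
    return LOCAL_OLLAMA


def build_generate_request(base_url: str, model: str,
                           prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url: str, model: str, prompt: str,
             timeout: float = 60.0) -> str:
    """Call the backend and return the generated text from the JSON reply."""
    request = build_generate_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read())["response"]
```

Because the public site only talks to the custom API, swapping the value returned by `choose_backend` does not change anything the visitor sees.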
Start with a local open-weight model through Ollama. No training is required for the first useful version.
The assistant becomes useful by retrieving relevant BVT context and answering in a practical client-intake style.
Collect real client-style questions and score the answers before deciding whether fine-tuning is worth the added complexity.
A tiny transformer trained from scratch can be documented as a learning artifact. It should not be confused with the production assistant.
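The retrieval step in the stages above (answering from approved BVT context before any fine-tuning) can be sketched with plain word overlap. This is a simplified stand-in, not the project's actual retriever: the `Doc` type, the scoring, and the prompt wording are assumptions, and a production version would more likely use embeddings and filter out common words.

```python
import re
from dataclasses import dataclass


@dataclass
class Doc:
    source: str  # hypothetical label, e.g. a page or post identifier
    text: str


def tokens(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(question: str, docs: list[Doc], k: int = 2) -> list[Doc]:
    """Return up to k docs that share the most words with the question."""
    query = tokens(question)
    scored = [(len(query & tokens(doc.text)), doc) for doc in docs]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked if score > 0][:k]


def build_prompt(question: str, docs: list[Doc]) -> str:
    """Assemble a prompt that restricts the model to the retrieved context."""
    context = "\n".join(f"[{d.source}] {d.text}" for d in docs)
    if not context:
        context = "(no approved context found)"
    return (
        "Answer as a practical client-intake assistant. "
        "Use only the context below.\n"
        f"Context:\n{context}\n\nQuestion: {question}\n"
    )
```

The resulting prompt string is what would be sent to the local model; when nothing relevant is retrieved, the placeholder context gives the assistant a clear signal to ask questions or hand off instead of guessing.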
Verified reviews from real projects
“Amazing in communication.”
Client · iOS App (Swift & Firebase)
“Went above and beyond.”
Client · Firebase Integration Revamp
“It was great working with Bill! Very pleasant and knowledgeable.”
Client · Language Learning App