Four Surprising Architectural Lessons from Building ConfiGPT
- Rob Stoltz

- Oct 29
- 4 min read
Building enterprise-grade AI systems is notoriously difficult. Developers constantly battle challenges like agent unreliability, where multi-step plans frequently fail, and inconsistent outputs that undermine user trust. These issues, along with ballooning operational costs, can stall a project before it ever delivers value.
In the process of architecting our sophisticated, multi-tenant ConfiGPT platform, we encountered these same obstacles. Overcoming them revealed several counter-intuitive but powerful design principles that have become central to our architecture.
This article shares four of the most surprising and impactful architectural takeaways from that journey.
1. The Most Important Feature We Built Was Removing Features
Our initial design for the main chat interface gave end-users a high degree of control. The page featured numerous sliders and checkboxes, allowing users to fine-tune settings like AI "creativity" (temperature), toggle Retrieval-Augmented Generation (RAG), and select specific automation tools for each query. We believed that empowering users with this flexibility was a core feature. We were wrong.
The critical architectural shift came when we removed all of these complex controls from the everyday chat page. Instead, we made the 'Workspace Admin' page the new, centralized hub where administrators configure all default settings, such as temperature, RAG behavior, and tool usage, for each of their team's workspaces.
This decision to remove individual user controls is a vital lesson for reliable enterprise AI. Centralizing configuration accomplishes three key goals for a business owner:
- Ensures Consistent Results: All users within a workspace get a reliable, predictable AI experience, which is essential for business operations.
- Simplifies User Experience: The chat interface becomes clean and focused on the user's goal (getting work done), not on a complex array of technical settings.
- Enables Governance & Cost Control: Administrators gain granular control over AI behavior and, crucially, the associated costs on a per-team basis.
Ultimately, we deliberately sacrificed granular user-level flexibility in favor of system-level governance, predictability, and a simplified user experience. It's a trade-off that is essential for any successful business deployment.
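To make the centralized model concrete, here is a minimal sketch of workspace-level configuration. All of the names (`WorkspaceConfig`, `settings_for`, the field names) are illustrative assumptions, not ConfiGPT's actual schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WorkspaceConfig:
    """Admin-controlled defaults applied to every chat in a workspace.

    Illustrative only; the real ConfiGPT schema may differ.
    """
    temperature: float = 0.2
    rag_enabled: bool = True
    allowed_tools: tuple = ("web_reader", "summarizer")

# Hypothetical registry: one config per workspace, set on the Workspace Admin page.
_WORKSPACE_CONFIGS = {
    "sales-team": WorkspaceConfig(temperature=0.1),
}

def settings_for(workspace_id: str) -> WorkspaceConfig:
    # End users never pass settings with a query; the workspace decides everything,
    # falling back to safe defaults for unconfigured workspaces.
    return _WORKSPACE_CONFIGS.get(workspace_id, WorkspaceConfig())
```

The key design point is that a chat request carries only a workspace identifier, never per-user tuning parameters, so every query in a workspace resolves to the same admin-approved settings.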
2. To Build a Smarter Agent, We Built Smarter Tools
One of the most common failure points of AI Agents (AI that takes action) is relying on the AI "planner" to reason through long, complex sequences of simple actions. A fragile plan like search website A -> read page B -> search website C is error-prone and slow. To combat this, we adopted a principle: "Smarter Tools, Not Just a Smarter Brain."
Instead of trying to make the AI planner more intelligent, we focused on making the tools it uses more capable. Our multi-page web reader tool is the perfect example. Rather than asking the agent to create a brittle, multi-step plan to research a topic, this single, high-level tool encapsulates the entire sub-process. When called, it performs all the necessary steps internally (e.g. search, identify, read three pages, and consolidate the text).
This approach transforms the agent's job. A complex research task that was previously a fragile 7-step gamble becomes a simple and highly reliable 2-step plan: call the smart tool to gather information, then call a summarizer to synthesize the final answer. This dramatically improves agent reliability by letting the AI planner operate on powerful, robust commands.
3. The Best Agent Has Two Brains: One for Planning, One for Doing
Our AI agent architecture employs a surprisingly efficient dual-model strategy to balance cost and performance. Instead of using a single, powerful, and expensive AI model for every task, the agent uses two distinct types of models, each with a specialized role: an "Architect" for planning and a "Specialist" for execution.
The "Architect," or Agent Planner Model, has one job: reasoning. It takes the user's high-level goal and figures out the step-by-step plan required to achieve it. Because a high-quality plan is the foundation for success, this role demands the most powerful model available.
The "Specialist," or Default Model, handles execution and synthesis. It performs the "grunt work" tasks, like taking a large block of text from a tool and writing a simple summary. This task is more straightforward, allowing us to use a faster, more cost-effective model for the job.
This "two-brain" approach is a powerful optimization. It ensures that the expensive, high-reasoning model is used only for the single most critical task, creating the plan, while a cheaper, faster model handles the less demanding execution steps. This strikes an optimal balance between performance and cost, making the agent both highly capable and economically viable for your business.
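The routing logic behind this dual-model strategy can be sketched in a few lines. The model names and the `call_model` helper are assumptions for illustration, not ConfiGPT's actual API, and the plan parsing is faked to keep the example self-contained:

```python
PLANNER_MODEL = "big-reasoning-model"   # the "Architect": expensive, used once
DEFAULT_MODEL = "small-fast-model"      # the "Specialist": cheap, used per step

def call_model(model: str, prompt: str) -> str:
    # Placeholder for a real LLM client call.
    return f"[{model}] response to: {prompt[:40]}"

def run_agent(goal: str) -> list:
    # One expensive call produces the plan...
    plan = call_model(PLANNER_MODEL, f"Plan steps for: {goal}")
    # ...in practice the plan text would be parsed into concrete steps;
    # here we fake two steps to show the routing.
    steps = [f"execute: {plan}", "synthesize final answer"]
    # ...then every execution step goes to the cheaper model.
    return [call_model(DEFAULT_MODEL, step) for step in steps]
```

Because the expensive model is invoked exactly once per task, total cost scales with the number of execution steps at the cheap model's rate, not the planner's.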
4. The Secret to High-Quality Results Isn't Just Better Search. It's Reranking
A standard 'RAG' (AI using your documents) process often involves a single search to find relevant document chunks. While effective, this can still pull in irrelevant information, which is commonly referred to as "noise."
Our RAG pipeline significantly improves on this by first using Hybrid Search (combining keyword and conceptual search) for a better initial retrieval. But the real secret is the crucial second step we added: reranking.
Here’s how it works: The system first fetches a large set of candidate documents. It then uses a specialized internal service to re-order those documents based on their direct relevance to your exact question, pushing the most impactful information to the top and filtering out the noise.
This second pass is a game-changer for enterprise RAG. It significantly improves the quality of the context sent to the AI, which translates into more accurate, less "hallucinated" answers. And because the reranker runs as an internal service, this extra step reflects our commitment to both answer quality and the security of your internal data.
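The retrieve-then-rerank flow can be illustrated with a toy example. The keyword-overlap scoring below is a stand-in for a real reranker (typically a cross-encoder model) and for true hybrid retrieval; only the two-pass structure mirrors the pipeline described above:

```python
def hybrid_search(query: str, corpus: list, k: int = 10) -> list:
    # Stand-in for combined keyword + conceptual retrieval: return up to k
    # documents sharing at least one word with the query.
    q = set(query.lower().split())
    return [d for d in corpus if q & set(d.lower().split())][:k]

def rerank(query: str, docs: list, top_n: int = 3) -> list:
    # Second pass: re-order candidates by direct relevance to the exact
    # question, pushing the strongest matches to the top.
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: -len(q & set(d.lower().split())))[:top_n]

corpus = [
    "reranking helps improve answer quality",
    "hybrid search combines keyword and vector retrieval",
    "unrelated note about lunch menus",
]
candidates = hybrid_search("how to improve search quality", corpus)
context = rerank("how to improve search quality", candidates)
```

The first pass casts a wide net over the corpus; the second pass orders what it caught, so only the most relevant chunks reach the model's context window.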
What's Next? Ontological Knowledge Graphs
The "Neurosymbolic" evolution of our RAG system involves building a new service to extract and store structured relationships, providing fact-based, systematically auditable answers for clients who need absolute certainty.
Conclusion
The journey of building a sophisticated AI platform taught us that the most effective solutions are not always the most obvious. By prioritizing governance through centralized control, building smarter tools to reduce agent complexity, optimizing costs with a dual-model agent strategy, and ensuring quality with an advanced reranking step in our RAG pipeline, we created a more robust and reliable system.
These architectural principles have become the bedrock of ConfiGPT.
As we build the next generation of AI platforms, our journey suggests a critical question: is the path to greater intelligence found in a more powerful model, or in a more intelligent system designed to orchestrate it?


