By Maniraj Sai | AI Engineer, aXite Security Tools
We set out to answer a question most companies are still avoiding: what does it actually look like to go all-in on AI, without compromising on security or data ownership?
At aXite Security Tools, we move fast and we handle sensitive data. That combination demands tools that are both powerful and trustworthy. So instead of waiting for the perfect enterprise AI solution to come to us, we built one ourselves. AX-Office.ai, our fully on-premise AI platform, has become a key enabler of our company's operations.
✨ And critically, every model, every inference call, every piece of data stays entirely within our own walls. No external API dependencies. No rate limits. No guesswork about where our data goes.
This post walks through each piece of what we built, how it works, and what it has meant for our team in practice.
There's a version of this problem where you go to a major cloud provider, spin up a private deployment of some API-wrapped model, sign a data processing agreement, and call it a day. We considered it. We rejected it.
The reason is simple: we don't just want contractual guarantees about our data. We want physical certainty.
🛡️ When inference runs on hardware we own, in a space we control, there is no ambiguity about where the data goes. The query enters the machine. The response leaves the machine. Nothing else moves.
That philosophy is the foundation of AX-Office.ai, our internal AI platform running entirely on dedicated on-premise compute. Every model, every agent, every transcription, every OCR job runs locally. Always.
We run a high-performance model serving layer on the hardware to handle inference efficiently, and on top of that we've built a suite of five distinct AI capabilities.
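To make the "serving layer" idea concrete, here is a minimal sketch of what an internal client call might look like. The endpoint URL and model name below are hypothetical, and we're assuming an OpenAI-compatible chat API, which most popular local serving layers expose; aXite's actual setup is not described beyond what's above.

```python
import json
import urllib.request

# Hypothetical internal endpoint -- replace with your serving layer's address.
AXLLM_URL = "http://axllm.internal:8000/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "axllm-moe") -> dict:
    """Build an OpenAI-style chat payload for a local serving layer."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }

def send(payload: dict) -> dict:
    """POST the payload to the local endpoint. Only the internal
    network is involved: the request never leaves the rack."""
    req = urllib.request.Request(
        AXLLM_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The point of the sketch is the shape of the flow: a plain HTTP call to a machine you own, with no API key for a third-party service anywhere in the path.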
Most of our team already knew how to use ChatGPT. That familiarity was something we wanted to keep. So when we built aXLLM, the goal was simple: give everyone the same experience they were used to, just running entirely on our own infrastructure.
You open it and it works exactly like ChatGPT. Type a question, get an answer. Ask it to write something, summarize something, explain something, work through a problem with you. There was no onboarding, no training session, no adjustment period. People just started using it because it already felt familiar.
Under the hood, aXLLM runs on a Mixture of Experts (MoE) model hosted entirely on our own infrastructure.
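For readers unfamiliar with the term: a Mixture of Experts model routes each token to a small subset of specialized sub-networks ("experts") instead of running the whole network, which is what makes large models affordable to serve on a fixed hardware budget. The exact model behind aXLLM isn't specified here, so the following is a generic sketch of the top-k gating step at the heart of any MoE layer, not aXLLM's implementation.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of gate logits."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def route_top_k(gate_logits, k=2):
    """Select the k experts with the highest gate probability and
    renormalize their weights -- the core routing step of an MoE layer.
    Returns (expert_index, weight) pairs whose weights sum to 1."""
    probs = softmax(gate_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    return [(i, probs[i] / total) for i in top]

# Four experts, token routed to the two with the strongest gate scores:
print(route_top_k([0.1, 2.0, 0.5, 1.5], k=2))
```

Because only k experts run per token, inference cost scales with k rather than with the total parameter count, which is precisely why MoE models are attractive for on-premise deployments with fixed compute.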
Authentication is tied to our internal directory, so there is nothing extra to set up.