AI that works
where others can't.
OfflineIQ is an on-premises AI platform that ingests your document corpus and powers 13 purpose-built work tools — drafting, review, extraction, and more. No data ever leaves your network. By architecture, not policy.
Why organisations need AI
that stays on-premises.
Standard AI tools route data through the cloud
When an employee pastes a document into ChatGPT, Copilot, or any hosted AI, that content is transmitted to a third-party server. For organisations handling client data, contracts, or regulated information, this is often impermissible under existing policy or regulation.
Compliance requirements block adoption
HIPAA, financial data regulations, and bar association ethics guidance each impose data residency or handling obligations that cloud-based AI tools cannot satisfy by design. Compliance teams block deployment as a result.
On-prem AI has historically required heavy infrastructure
Running models locally has traditionally required large GPU deployments, specialist MLOps teams, and long procurement cycles — making it impractical for most organisations outside large tech companies.
OfflineIQ's position
OfflineIQ deploys inside the client's own AWS account or physical hardware, with no outbound connections at the model, query, or data layer. It is designed to satisfy data residency requirements without requiring specialist infrastructure expertise.
Four-step flow,
entirely within your network.
Connect your data sources
OfflineIQ connects to existing document stores, databases, and file shares via intranet connectors. Supports SQL, NoSQL, PDF libraries, Word/Excel files, and SharePoint. Nothing is copied externally — ingestion runs inside your perimeter.
Build your corpus
Documents are processed, indexed, and stored inside your environment. The system extracts structure, sections and metadata, and builds a searchable knowledge base specific to your organisation. Updates are incremental.
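The incremental-update behaviour can be sketched with content hashing: only documents whose fingerprint has changed since the last run are re-processed. A minimal illustration (function names are ours, not the product's API, and the hash update stands in for real re-embedding):

```python
import hashlib

def content_hash(text: str) -> str:
    """Stable fingerprint of a document's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def incremental_update(index: dict, docs: dict) -> list:
    """Re-index only documents whose content hash changed.

    `index` maps doc_id -> stored hash; `docs` maps doc_id -> current text.
    Returns the list of doc_ids that were (re)processed.
    """
    changed = []
    for doc_id, text in docs.items():
        h = content_hash(text)
        if index.get(doc_id) != h:
            index[doc_id] = h  # placeholder for real re-chunking/re-embedding
            changed.append(doc_id)
    return changed

index = {}
docs = {"contract.pdf": "v1 text", "memo.docx": "memo text"}
print(incremental_update(index, docs))   # first run: everything is new
docs["contract.pdf"] = "v2 text"
print(incremental_update(index, docs))   # second run: only the changed file
```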
AI model runs on-device
A language model runs locally — inside your AWS account or on your hardware. It has no internet access and makes no external API calls. All reasoning is grounded in your corpus.
Employees work through 13 task tools
Staff access OfflineIQ through a browser-based interface. Each tool is purpose-built for a specific task. Every output cites the source document and section it was drawn from.
How OfflineIQ produces accurate,
cited, on-corpus answers.
Every query passes through a multi-stage intelligence pipeline before a response is produced. Each stage exists to prevent a class of failure — hallucination, irrelevant retrieval, unsupported output.
Query rewriting
HyDE (Hypothetical Document Embeddings) rewrites the employee's query into a form that improves vector search precision before retrieval runs.
Finds relevant content that a literal keyword search would miss.
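A minimal sketch of the HyDE idea, with a deterministic stub standing in for the local model call (the function name and template are illustrative, not the product's API):

```python
def hypothetical_document(query: str) -> str:
    """HyDE step: ask the model to *answer* the query, yielding a passage
    whose wording resembles real corpus documents. A deterministic stub
    stands in for the local model call here."""
    return (f"Section 1. Regarding {query}, the agreement provides that "
            f"the parties shall comply with the stated terms.")

# Downstream retrieval embeds this passage instead of the terse query,
# so vector search matches against document-style prose.
query = "termination notice period"
print(hypothetical_document(query))
```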
Corpus retrieval
Custom embedding model — calibrated on the client's own document space, not generic pre-trained vectors — retrieves the most semantically relevant chunks.
Retrieval quality is domain-specific, not generic.
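Retrieval itself reduces to nearest-neighbour search over embeddings. The toy sketch below substitutes token counts for a real calibrated embedding model, which keeps it runnable while showing the shape of the step:

```python
import re
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    """Toy embedding: token counts. A real deployment would use an
    embedding model calibrated on the client's own corpus."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list, k: int = 2) -> list:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "Either party may terminate on 30 days written notice.",
    "Fees are payable within 45 days of invoice.",
    "Confidential information must not be disclosed.",
]
print(retrieve("termination notice period", chunks, k=1))
```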
Re-ranking
A second-pass re-ranker scores retrieved chunks for relevance and filters low-confidence results before they reach the model.
Only high-confidence source material reaches the model.
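A sketch of the filtering step, with a token-overlap score standing in for a real cross-encoder (the threshold value is a placeholder):

```python
import re

def rerank_score(query: str, chunk: str) -> float:
    """Stand-in for a cross-encoder: score query/chunk relevance in [0, 1]
    as the fraction of query tokens present in the chunk."""
    q = set(re.findall(r"[a-z0-9]+", query.lower()))
    c = set(re.findall(r"[a-z0-9]+", chunk.lower()))
    return len(q & c) / len(q) if q else 0.0

def rerank(query: str, chunks: list, threshold: float = 0.5) -> list:
    """Keep only chunks scored above the threshold, best first; the rest
    never reach the model."""
    scored = sorted(((rerank_score(query, c), c) for c in chunks), reverse=True)
    return [c for s, c in scored if s >= threshold]

candidates = [
    "Termination requires 30 days written notice before the period ends.",
    "Fees are payable within 45 days of invoice.",
]
print(rerank("termination notice period", candidates))
```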
Grounded generation
The small language model (SLM) is instructed to answer only from retrieved chunks. Below the confidence threshold, it returns 'no information found' rather than generating unsupported output.
The model's only options are to cite or to decline, never to invent.
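The decline behaviour can be illustrated directly; the threshold value, message, and answer template below are placeholders, and a real deployment would pass the supported chunks to the local SLM:

```python
NO_ANSWER = "No information found in the corpus for this query."

def grounded_answer(query: str, scored_chunks: list, threshold: float = 0.6) -> str:
    """Answer only from retrieved chunks; decline below the confidence
    threshold instead of letting the model improvise.

    `scored_chunks` is a list of (re-ranker score, chunk text) pairs.
    """
    supported = [(s, c) for s, c in scored_chunks if s >= threshold]
    if not supported:
        return NO_ANSWER
    # Placeholder for the local SLM call, constrained to these chunks:
    best_score, best_chunk = max(supported)
    return f"Per the corpus: {best_chunk}"

print(grounded_answer("notice period", [(0.9, "30 days written notice required.")]))
print(grounded_answer("crypto policy", [(0.2, "Fees are due in 45 days.")]))
```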
Citation injection
Every response carries structured metadata: source document name, section, and chunk ID. Citations are injected at the pipeline layer — not inferred after generation.
Every statement is traceable to a specific document and section.
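A sketch of pipeline-layer citation injection: the metadata is attached outside the model, rather than trusting generated text to format (or invent) its own citations. Field names here are illustrative:

```python
from dataclasses import dataclass

@dataclass
class Citation:
    document: str
    section: str
    chunk_id: str

def with_citations(answer: str, sources: list) -> dict:
    """Attach structured citation metadata at the pipeline layer, so every
    response carries its provenance regardless of what the model wrote."""
    return {
        "answer": answer,
        "citations": [
            {"document": c.document, "section": c.section, "chunk_id": c.chunk_id}
            for c in sources
        ],
    }

resp = with_citations(
    "Either party may terminate on 30 days notice.",
    [Citation("MSA_2023.pdf", "12.1 Termination", "chunk-0412")],
)
print(resp["citations"][0]["document"])  # MSA_2023.pdf
```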
What makes OfflineIQ different
from a generic RAG deployment.
Client-specific embedding calibration
The system understands your terminology, not generic language.
Embedding models are trained on the client's own document space rather than generic pre-trained vectors. On domain-specific corpora — legal, clinical, financial — this is the primary driver of retrieval quality.
Client-specific SLM fine-tuning
The model has been trained on your documents, not just given access to them.
A base small language model is fine-tuned on the client's corpus. Weights encode domain knowledge, style, and terminology — producing output consistency and citation fidelity that plain RAG cannot match.
Zero-egress enforced at three layers
The data boundary is verifiable, not just a policy commitment.
Egress prevention runs simultaneously at three layers: OS network policy, container runtime, and application. The absence of egress is verifiable by packet capture at the network boundary.
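The application-layer check can be illustrated as an intranet allow-list; the subnets below are placeholders, and the OS firewall and container runtime enforce the same boundary independently at their own layers:

```python
import ipaddress

# Placeholder intranet ranges; a real deployment uses the client's subnets.
INTRANET = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def egress_allowed(host_ip: str) -> bool:
    """Application-layer guard: permit outbound connections only to
    intranet addresses. Anything bound for the public internet is refused
    before a socket is ever opened."""
    addr = ipaddress.ip_address(host_ip)
    return any(addr in net for net in INTRANET)

print(egress_allowed("10.2.3.4"))   # True: intranet
print(egress_allowed("8.8.8.8"))    # False: public internet, blocked
```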
Proprietary RAG orchestration pipeline
Each query goes through five stages before a response is generated.
Combines HyDE rewriting, custom retrieval, cross-encoder re-ranking, strict grounding, and structured citation injection — a purpose-built pipeline, not a LangChain wrapper.
13 task tools,
grouped by function.
Document Production
Review & Validation
Data Extraction & Organisation
Research & Synthesis
How organisations use OfflineIQ
in practice.
Case preparation and knowledge retrieval without exposing client data
A litigation associate needs to summarise a 400-page case file, identify key precedents, and match them against the firm's prior matter knowledge base — without uploading any client documents to a cloud service.
Summarise without losing detail
The associate drags the case file into OfflineIQ. The Summarization agent produces a structured summary preserving key facts, parties, dates, and arguments, with every section cited back to the source page. PII stays inside the firm's perimeter throughout.
Match against precedent knowledge base
The Similar-Items agent searches the firm's indexed matter history for semantically similar cases — past filings, judgments, expert opinions — and returns ranked matches with explanations of how each is relevant and where it differs.
Draft the legal strategy brief
The Draft agent produces a first-cut strategy document grounded in the retrieved precedents and current case facts. The associate reviews, edits, and approves before anything is filed. No output is sent anywhere automatically.
What OfflineIQ
can connect to.
Intranet-only connectors. No data is routed externally during ingestion or at any other point. Supports live database connections (corpus stays current without manual re-uploads) and static document ingestion.
Relational Databases (SQL)
- PostgreSQL
- MySQL / MariaDB
- Microsoft SQL Server
- Oracle Database
Live intranet queries. Read-only access. Authenticated and encrypted.
Document & NoSQL Stores
- MongoDB
- Elasticsearch
- Amazon S3 (private)
- Custom JSON / REST feeds
Intranet-only. Incremental sync — only new or changed records reprocessed.
File & Document Formats
- PDF (text-native and scanned)
- Word (.docx)
- Excel (.xlsx)
- PowerPoint (.pptx)
Scanned documents processed via offline OCR engine. No cloud OCR dependency.
Enterprise Content Platforms
- SharePoint / OneDrive (on-prem)
- Network file shares
- Internal portals via custom connector
- Email export archives
All access via intranet. No Microsoft cloud routing.
Two deployment paths,
identical software stack.
The choice of deployment determines the infrastructure model and procurement path — not the capabilities or compliance posture.
AWS Private VPC
Runs inside the client's own AWS account
On-Premises Hardware
Spark Box (NVIDIA DGX) or managed GPU rack
What the platform enforces
and what can be verified.
Zero-egress enforcement
No data, query, embedding, or model response leaves the network boundary. Enforced at three independent layers: OS network policy, container runtime, and application layer. Verifiable by packet capture.
Encryption at rest and in transit
All stored documents, vector indexes, metadata, audit logs, and configuration files are encrypted using AES-256. All intra-service communication uses TLS 1.3. Key material is hosted on-premises.
Role-based access control
Access permissions are enforced at three layers: the API gateway, application feature flags, and the document retrieval layer. Document-level permissions propagate into the index.
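The retrieval-layer check can be sketched as a set-intersection filter over per-document group permissions stored alongside each chunk at indexing time (field names are illustrative):

```python
def filter_by_permissions(user_groups: set, chunks: list) -> list:
    """Retrieval-layer RBAC: a chunk is returned only if the user belongs
    to at least one group allowed on its source document. A user who lacks
    permission never sees the chunk, even via semantic search."""
    return [c for c in chunks if user_groups & set(c["allowed_groups"])]

index = [
    {"text": "M&A term sheet ...", "allowed_groups": ["corporate"]},
    {"text": "HR leave policy ...", "allowed_groups": ["all-staff"]},
]
print(filter_by_permissions({"litigation", "all-staff"}, index))
```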
Tamper-evident audit log
Append-only audit store records every user action, query, and output with timestamp and user ID. Cryptographic hash chaining — any attempt to modify or delete entries produces a detectable chain break.
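Hash chaining is a small, standard construction: each entry's hash covers the previous entry's hash, so editing or deleting any record breaks every hash after it. A sketch:

```python
import hashlib
import json

def append_entry(log: list, entry: dict) -> None:
    """Append an audit entry whose hash covers the previous entry's hash,
    so any later edit or deletion breaks the chain."""
    prev = log[-1]["hash"] if log else "genesis"
    payload = json.dumps(entry, sort_keys=True) + prev
    log.append({**entry, "prev": prev,
                "hash": hashlib.sha256(payload.encode()).hexdigest()})

def verify_chain(log: list) -> bool:
    """Recompute every hash; any tampering produces a detectable break."""
    prev = "genesis"
    for rec in log:
        entry = {k: v for k, v in rec.items() if k not in ("prev", "hash")}
        payload = json.dumps(entry, sort_keys=True) + prev
        if rec["prev"] != prev or rec["hash"] != hashlib.sha256(payload.encode()).hexdigest():
            return False
        prev = rec["hash"]
    return True

log = []
append_entry(log, {"user": "jdoe", "action": "query", "ts": "2025-01-01T09:00Z"})
append_entry(log, {"user": "jdoe", "action": "export", "ts": "2025-01-01T09:05Z"})
print(verify_chain(log))        # True
log[0]["action"] = "deleted"    # tamper with history
print(verify_chain(log))        # False: chain break detected
```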
No shared model training
Each client's language model is fine-tuned on that client's corpus only. No data is pooled across clients. The fine-tuned model weights are stored within the client's environment.
Human-in-the-loop on all outputs
No agent action is automated. Every tool produces a draft, report, or extract that a human reviews before any action is taken. No auto-send, auto-post, or auto-write-back capability.
Where the value
is realised.
Figures drawn from published industry research. Client-specific ROI depends on document volume, headcount and use-case mix — directional benchmarks, not contractual commitments.
Knowledge workers spend significant time reviewing and validating documents manually. AI review tools grounded in institutional documents reduce first-pass review cycles.
Average time a knowledge worker spends locating, reading, and extracting information from documents. OfflineIQ Extraction and Similar-Items tools address this directly.
When AI drafting tools are grounded in an organisation's own precedent documents, first-draft speed increases substantially compared to drafting from scratch.
Organisations that deploy private, air-gapped AI remove a class of breach risk that cloud AI tools introduce.
What OfflineIQ
does not do.
Clearly defining what the platform does not do is as important as what it does. These are architectural constraints — not feature gaps.
No cloud model fallback
Every inference runs on the local model. OfflineIQ does not fall back to OpenAI, Anthropic, or any hosted model under any circumstances.
No automated actions
Agents produce outputs — drafts, extracts, reports — that a human reviews. The platform does not auto-send, auto-post, or write back to any system.
No web search at inference time
The model reasons only from the client's corpus. It cannot access the internet, retrieve live data, or use knowledge outside the ingested document set.
No shared training across clients
Each client's fine-tuned model is specific to that client. No client data is used to improve a shared model or shared index.
No live connectors to comms tools
OfflineIQ does not connect to Gmail, Outlook, Slack, Jira, or any live system. Data enters through the ingestion pipeline or user upload.
No user account management
Identity and access control live in the client's own systems (SSO, directory). OfflineIQ integrates with existing identity infrastructure.
Book a scoping call.
One hour.
We'll define the document corpus, use cases, and compliance requirements — and confirm whether AWS or on-prem is the right deployment path.
The form takes under two minutes. A member of our team will reach out within one business day.