‘Kompact AI Enables Enterprises to Use Existing Hardware over Investing in Expensive Accelerators’: Vineet Mittal, Senior Vice President, Ziroh Labs

Vineet Mittal, Senior Vice President, Ziroh Labs

In this exclusive interview, Vineet Mittal, Senior Vice President, Ziroh Labs, shares with Anannya Saraswat, Reporter (Public Sector and Leadership) at CXO Media and APAC Media, the company’s vision behind Kompact AI, a platform designed to make AI deployment more practical, affordable, and sustainable for real-world use. 

The conversation explores why the company is challenging GPU-heavy AI architectures, how CPU-based inference can unlock scalable and privacy-compliant AI for enterprises, and why this shift is especially important for regulated sectors and emerging markets like India. 

What issue are you addressing through Kompact AI, and why is this shift significant now? 

Kompact AI addresses the growing challenge of deploying AI inference at scale without the high cost, energy consumption, and regulatory friction associated with GPU-dependent and closed-source systems. Nearly 100% of the world's applications run on CPUs today; it does not have to be different for AI.

As enterprises move from experimentation to real-world production, they need inference solutions that are affordable, reliable, and compliant across global markets. 

The shift to efficient, CPU-based inference is important now because organisations are under increasing pressure to reduce infrastructure costs, meet sustainability goals, and deploy AI in regulated environments, without sacrificing performance.

How does Kompact AI improve accessibility, affordability, and scalability of AI for enterprises and institutions with limited infrastructure budgets? 

Kompact AI improves the accessibility, affordability, and scalability of AI by removing the industry’s heavy dependence on GPUs for inference and enabling high-performance open-source AI models to run efficiently on standard CPU infrastructure—without altering model weights or sacrificing accuracy. 

Kompact AI enables enterprises and institutions to use existing hardware rather than invest in expensive, power-hungry accelerators. This dramatically lowers infrastructure and operational costs, making advanced AI viable even for organisations with limited budgets.

At the same time, Kompact AI is designed for real-world production environments, from air-gapped data centres and regulated industry clouds to on-premises systems and edge devices. 

Its OpenAI-compatible API layer, built-in observability through OpenTelemetry, and support for diverse CPU architectures make deployment and scaling straightforward. By eliminating GPU constraints, organisations can deploy more models, serve more users, and scale AI sustainably—while maintaining full control over their models, data, and intellectual property.
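Because the platform exposes an OpenAI-compatible API layer, existing client tooling can target it without modification. The sketch below illustrates what such a request looks like; the base URL and model name are hypothetical placeholders, not documented Kompact AI values.

```python
import json

# An OpenAI-compatible server accepts the standard /v1/chat/completions
# protocol, so any client that speaks it can be pointed at the endpoint.
# BASE_URL is an assumed local deployment address, not a real Kompact AI URL.
BASE_URL = "http://localhost:8000/v1"

def build_chat_request(model: str, user_message: str) -> dict:
    """Construct a standard OpenAI-style chat completion request body."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0.2,
    }

# Hypothetical model name for illustration only.
payload = build_chat_request("llama-3-8b-instruct", "Summarise our leave policy.")

# In practice this body would be POSTed to f"{BASE_URL}/chat/completions"
# with an HTTP client, or via the official OpenAI SDK configured with
# base_url=BASE_URL, so existing integrations need no code changes.
print(json.dumps(payload, indent=2))
```

The practical point of API compatibility is that an enterprise's existing OpenAI-based applications can be redirected to CPU-backed inference by changing only the base URL.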

What kinds of use cases or industries stand to benefit the most from CPU-based AI deployment?

In BFSI and other regulated industries, CPU-based AI enables compliant, auditable deployments in air-gapped or tightly controlled environments. Use cases such as fraud detection support, risk analysis, customer service automation, and regulatory reporting benefit from predictable performance, strong data governance, and the ability to keep sensitive data on-premises.

For semantic search and enterprise knowledge serving, CPU-optimised inference makes it feasible to index and query large volumes of documents cost-effectively. Organisations can power internal search, policy retrieval, and knowledge assistants that operate reliably across departments without incurring the high cost of GPU scaling. Ziroh Labs' optimised semantic search delivers 80% of responses within 0.3 seconds.

Agentic automation and NLP-to-SQL workloads also benefit from open-source community innovations, such as models that can generate accurate queries from metadata.

Running these workloads on CPUs allows enterprises to scale agents, automate workflows, and translate natural language into structured queries using existing infrastructure.

Document processing and knowledge-intensive AI systems, such as contract analysis, document serving, compliance checks, and internal audits, benefit from innovations like semantic search and prompt-and-result caching. Ziroh Labs also makes it possible to run these workloads close to where they are needed most, for example on edge devices, mobile phones, and in-vehicle systems.

How does Ziroh Labs balance performance, cost, and energy efficiency while moving away from GPU-heavy architectures? 

At Ziroh Labs, we set out to break the industry's dependence on GPUs for AI inference. Through deep systems engineering and data science, we've enabled large language models to run on server-class CPUs at one-third the cost and one-third the power consumption of GPU-based systems.

With Kompact AI, CPU inference now matches GPU-level token performance and outperforms other CPU inference engines by up to 2×. This allows enterprises to deploy high-performance, energy-efficient AI at scale without the economic or operational burden of GPU-centric infrastructure.

How do partnerships with IIT Madras, IITM Pravartak, and the Centre of AI Research strengthen Ziroh Labs’ R&D and product roadmap? 

Ziroh Labs is built on strong deep-tech roots emerging from the Indian Institute of Technology Madras (IIT Madras) and is supported by the IITM Pravartak Foundation. This foundation combines world-class research with a startup mindset, enabling Ziroh Labs to move cutting-edge systems engineering rapidly from the lab into real-world, enterprise-scale AI deployments.

Privacy-compliant AI is central to Ziroh Labs’ vision. How do you address growing regulatory and data sovereignty concerns? 

Kompact AI is designed with privacy and data residency at its core, drawing on Ziroh Labs’ deep expertise in privacy-preserving technologies, including its pioneering work in fully homomorphic encryption.

By enabling high-performance AI inference to run entirely on CPU-based infrastructure, whether on-premises, in private clouds, air-gapped environments, or regulated industry clouds, Kompact AI ensures sensitive data never leaves an organisation's control. This approach allows enterprises to maintain data sovereignty, meet regional compliance requirements, and deploy AI securely at scale without relying on external GPU-centric platforms.

How do you see Ziroh Labs shaping the future of inclusive and responsible AI, especially in emerging markets like India?

Ziroh Labs, in collaboration with the open-source AI community, is demonstrating that strong AI performance does not require massive models or brute-force scale. By prioritising efficiency and smart optimisation, smaller models are delivering powerful real-world results. 

In India, a growing movement is emerging around compact AI models trained in Indian languages—across both text and speech—available at significantly smaller parameter sizes. At the same time, India is innovating in building domain-specific models trained on local data spanning agriculture, defence, financial markets, and cultural knowledge, accelerating the adoption of inclusive, locally relevant, and sustainable AI.