Proactive data strategy planning is now a competitive necessity for sustained growth in India’s rapidly evolving digital landscape. In an exclusive interaction with CXO Media and APAC Media, Bhanu Jamwal, Head of India Business, TiDB, explains to Bhavya Bagga, Business Reporter – Corporate & Leadership, how explosive data growth is exposing the limitations of traditional databases and why enterprises should consider distributed databases like TiDB if they are struggling to scale their current database infrastructure.
With the growing demands of AI and real-time analytics, how are traditional databases falling short, and what makes TiDB’s distributed SQL architecture more suitable for modern enterprise workloads?
Modern enterprises grapple with massive data volumes, and traditional databases struggle to meet these demands because of scalability limits in single-machine environments, limitations in real-time processing, vulnerability to single points of failure, and complex data integration that leads to fragmented data silos and inconsistencies.
Traditional transactional databases struggle with real-time analytics because they were originally built for OLTP operations rather than the large-scale data scanning required for analytical queries. Their single-node design becomes a performance bottleneck when handling massive datasets.
Since real-time analytics demands parallel processing across multiple nodes, TiDB’s distributed design offers the ideal solution and sets itself apart in this landscape by blending the strengths of both SQL and NoSQL databases.
As a NewSQL database, it reimagines how relational databases are designed, built, and deployed at massive scale. Its unique architecture and innovative features position TiDB as a leader in redefining database capabilities for modern applications.
Below are the architectural elements that distinguish TiDB in the market:
Horizontal scalability: TiDB can scale horizontally by adding more nodes, and handles increased loads without performance degradation, unlike traditional vertical scaling.
Automatic sharding: TiDB, an advanced distributed SQL database, automates sharding, distributing data across nodes to balance loads and prevent bottlenecks, unlike the manual, error-prone sharding in traditional databases. It eliminates the complexity of data sharding and removes the need to embed sharding logic in the application.
Separation of storage and compute: TiDB’s tiered architecture separates storage and compute, allowing independent scaling of resources, which is ideal for dynamic cloud environments. This offers the flexibility to scale the compute layer for read-heavy workloads and the storage layer for write-heavy workloads.
High availability and strong consistency: TiDB ensures high availability and strong consistency, maintaining functionality and continuous access to the database even during node failures.
Real-time HTAP: TiDB supports both transactional and analytical workloads, simplifying architecture by eliminating the need for separate OLTP and OLAP systems. With its advanced optimizer and well-designed architecture for storing data in both row and column orientations, TiDB intelligently routes queries to the appropriate orientation based on the query’s nature.
Cloud-native design: TiDB Cloud offers managed services on leading cloud hyperscalers, and TiDB Serverless is available for those who demand auto scaling. Since TiDB is open-source, it does not lock customers into a particular cloud provider and offers the flexibility to move to any cloud of their preference.
MySQL compatibility: TiDB’s compatibility with MySQL allows users to migrate applications from MySQL to TiDB with minimal or no changes in most cases.
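To make the automatic sharding point above more concrete, here is a minimal Python sketch of range-based sharding with automatic splits. It is a toy model under stated assumptions, not TiDB’s implementation: the class name RangeShardedStore, the node names, and the four-key split threshold are all hypothetical. What it illustrates is that the caller only uses put and get and never decides which shard a key belongs to.

```python
from bisect import bisect_right


class RangeShardedStore:
    """Toy key-value store that splits key ranges automatically (illustrative only)."""

    def __init__(self, nodes, max_keys_per_range=4):
        self.nodes = list(nodes)
        self.max_keys = max_keys_per_range
        # Each range covers keys >= its start key; the first range starts at "".
        self.starts = [""]
        self.ranges = [{}]
        self.owners = [self.nodes[0]]

    def _index_for(self, key):
        # Find the last range whose start key is <= key.
        return bisect_right(self.starts, key) - 1

    def put(self, key, value):
        i = self._index_for(key)
        self.ranges[i][key] = value
        if len(self.ranges[i]) > self.max_keys:
            self._split(i)

    def get(self, key):
        return self.ranges[self._index_for(key)].get(key)

    def _least_loaded_node(self):
        # Count the keys currently held by each node.
        load = {node: 0 for node in self.nodes}
        for owner, data in zip(self.owners, self.ranges):
            load[owner] += len(data)
        return min(self.nodes, key=lambda node: load[node])

    def _split(self, i):
        # Split the oversized range at its median key.
        keys = sorted(self.ranges[i])
        mid = keys[len(keys) // 2]
        lower = {k: v for k, v in self.ranges[i].items() if k < mid}
        upper = {k: v for k, v in self.ranges[i].items() if k >= mid}
        self.ranges[i] = lower
        # Pick the new owner before the upper half is placed anywhere.
        new_owner = self._least_loaded_node()
        self.starts.insert(i + 1, mid)
        self.ranges.insert(i + 1, upper)
        self.owners.insert(i + 1, new_owner)


# The application writes and reads by key only; placement is handled internally.
store = RangeShardedStore(["node-a", "node-b", "node-c"])
for n in range(20):
    store.put(f"user:{n:03d}", {"visits": n})
```

In this sketch the split and placement decisions live entirely inside the store, which is the property the interview attributes to automatic sharding: no shard-selection logic appears in application code.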
Vector databases are gaining attention with the rise of generative AI and LLMs. Could you explain how they work and how TiDB is integrating or supporting this evolution?
Vector databases are designed to handle searches based on vector embeddings, which are essentially lists of floats that represent various forms of data like text, images, and audio in a dense mathematical space. This approach is especially valuable in fields that rely on semantic search capabilities, where the aim is to grasp the meaning rather than the exact text.
For instance, in recommendation systems, vector databases can significantly enhance personalization by understanding the similarities in user preferences through vector comparisons.
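As a rough illustration of the vector comparison described above, the following Python sketch computes cosine similarity between small preference embeddings. The user names and three-dimensional vectors are made up for the example; real embeddings typically come from a model and have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical preference embeddings (illustrative values only).
user_embeddings = {
    "asha": [0.9, 0.1, 0.0],
    "ravi": [0.8, 0.2, 0.1],
    "meera": [0.0, 0.2, 0.9],
}


def most_similar_user(name, embeddings):
    """Return (other_user, similarity) for the user closest to `name`."""
    target = embeddings[name]
    scored = (
        (other, cosine_similarity(target, vector))
        for other, vector in embeddings.items()
        if other != name
    )
    return max(scored, key=lambda pair: pair[1])
```

Here most_similar_user("asha", user_embeddings) picks "ravi", because their vectors point in nearly the same direction even though no values match exactly. That is the sense in which vector search captures similarity of meaning rather than exact text.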
TiDB’s vector search capabilities extend to diverse use cases, such as semantic search, recommendation engines, and Retrieval-Augmented Generation (RAG) applications.
Core competencies of TiDB’s vector search feature include:
Unified database: The primary advantage of TiDB for vector storage and querying is that it is a unified database combining traditional SQL, vector search, and HTAP (Hybrid Transactional/Analytical Processing) capabilities in a single system. This eliminates the need for multiple specialized databases.
Data storage models: In TiDB, data storage is optimized for the complex requirements of high-dimensional vector processing. TiDB enables seamless integration of vector data types within its storage system, allowing vector data to occupy specific columns alongside conventional structured data.
Indexing techniques: TiDB utilizes advanced indexing techniques to optimize vector data retrieval. Vector indexes in TiDB facilitate rapid similarity searches using distance metrics and approximate nearest neighbour (ANN) algorithms. The vector search functionality indexes data based on high-dimensional distances, such as Euclidean or cosine distances, thus enabling efficient k-NN queries.
Familiar SQL: Most vector databases require learning new APIs or query languages. However, TiDB offers MySQL compatibility with vector capabilities that allows developers to use familiar SQL syntax to join, index, and query both operational and vector data together. Complex queries are easier to write because traditional SQL operations and vector similarity searches can be combined.
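The idea of a k-NN query over vectors stored alongside ordinary columns can be sketched in plain Python. This is an exact, brute-force scan written for illustration, not the ANN index the interview mentions, and the products table, its columns, and its values are hypothetical. The predicate argument stands in for a relational filter applied together with the similarity search.

```python
import heapq
import math


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def knn(rows, query, k, distance=cosine_distance, predicate=lambda row: True):
    """Exact k-nearest-neighbour search over the rows that pass a filter."""
    candidates = (row for row in rows if predicate(row))
    return heapq.nsmallest(
        k, candidates, key=lambda row: distance(row["embedding"], query)
    )


# Hypothetical rows: structured columns plus an embedding column.
products = [
    {"id": 1, "category": "shoes", "in_stock": True, "embedding": [0.9, 0.1]},
    {"id": 2, "category": "shoes", "in_stock": False, "embedding": [0.88, 0.12]},
    {"id": 3, "category": "shoes", "in_stock": True, "embedding": [0.5, 0.5]},
    {"id": 4, "category": "bags", "in_stock": True, "embedding": [0.1, 0.9]},
]

# Nearest in-stock products to a query embedding.
nearest_in_stock = knn(
    products, [1.0, 0.0], k=2, predicate=lambda row: row["in_stock"]
)
```

A brute-force scan like this computes a distance for every candidate row, which is why a database would use a vector index for large tables. The sketch only shows how a filter on structured columns and a distance ordering fit together in one query.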
TiDB enables real-time Hybrid Transactional and Analytical Processing (HTAP). How does this benefit developers and businesses looking to modernize their applications?
TiDB’s real-time Hybrid Transactional and Analytical Processing (HTAP) architecture provides significant benefits for both developers and businesses looking to modernize their applications. By integrating row-based storage (TiKV) and columnar storage (TiFlash) in a single database system, TiDB enables simultaneous OLTP and OLAP workloads without the complexity of traditional multi-system architectures.
Simplified Development Experience
Developers no longer need to write code to interact with separate OLTP and OLAP systems. TiDB provides a single MySQL-compatible interface for both transactional and analytical operations. This eliminates the need for:
- Managing multiple database connections
- Learning different query languages or APIs
- Implementing complex data synchronization logic
- Handling consistency issues between systems
Automatic Engine Selection
TiDB’s intelligent optimizer automatically selects the optimal storage engine for each query—directing transactional queries to TiKV’s row-based storage and analytical queries to TiFlash’s columnar storage—without requiring any manual intervention from developers.
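A toy Python model can show why the two storage orientations suit different queries. Everything here is an assumption made for illustration: the orders data, the query dictionaries, and the choose_engine heuristic are hypothetical, and a real optimizer weighs far more than the shape of a query.

```python
# Row-oriented copy: one record per key, cheap to read or update a whole row.
row_store = {
    101: {"order_id": 101, "customer": "asha", "amount": 250.0},
    102: {"order_id": 102, "customer": "ravi", "amount": 120.5},
    103: {"order_id": 103, "customer": "meera", "amount": 80.0},
}


def to_column_store(rows):
    """Pivot rows into one list per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


# Column-oriented copy of the same data: an aggregate touches a single list.
column_store = to_column_store(row_store.values())


def choose_engine(query):
    """Toy heuristic: aggregates go to the columnar copy, lookups to the row copy."""
    return "columnar" if query["kind"] == "aggregate" else "row"


def execute(query):
    engine = choose_engine(query)
    if engine == "columnar":
        return engine, sum(column_store[query["column"]])
    return engine, row_store[query["key"]]


lookup = execute({"kind": "point", "key": 102})
total = execute({"kind": "aggregate", "column": "amount"})
```

The caller submits both queries through the same execute function and never names an engine, which mirrors the single-interface experience described above. The point lookup reads one complete record, while the aggregate reads only the amount column.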
Real-Time Data Access
With TiDB’s HTAP architecture, analytical queries operate on the current transactional data without delay. This enables developers to build applications with real-time analytics capabilities, such as:
- Dashboards showing up-to-the-minute business metrics
- Recommendation engines using the latest user behaviour data
- Fraud detection systems analysing transactions as they occur
Simplified Data Model Management
Developers maintain a single schema that serves both transactional and analytical needs. When schema changes are required, they only need to be applied once, and TiDB automatically propagates the changes to both storage engines.
India has a vibrant open-source developer community. How is TiDB engaging with developers and what initiatives or partnerships are you pursuing in the region?
Empowering developers is at the heart of our India strategy. TiDB is actively engaged with the developer community in India through various initiatives to address the challenges posed by traditional databases and promote the adoption of distributed SQL systems like TiDB.
We regularly conduct hackathons and receive creative contributions from the community. For example, someone once proposed implementing graph database capabilities on TiDB—a completely different use case that showcased the community’s ingenuity.
Through PingCAP University, we offer educational resources, including courses and certifications to enhance skills in managing distributed SQL databases. This helps developers and IT professionals in India understand TiDB’s capabilities in handling large-scale data scenarios.
We are deepening our ties with the local IT community through workshops, training sessions, and hackathons, thus contributing to local talent development and innovation. We also provide extensive documentation, developer guides, and a community forum, supporting developers transitioning from traditional databases to modern distributed systems.
We have a dedicated global team for community management, including a local community manager in India.
In India, we see immense potential due to the availability of exceptional talent. We are actively investing in community-building efforts, hosting events like TiDB User Day to further engage with the open-source community.
What trends do you foresee in the evolution of open source and cloud-native databases, especially in the context of AI-driven application development?
Some of the trends I believe will shape the future of databases are:
Rise of S3 as the go-to architecture
One of the most significant shifts in database kernel technology is the growing recognition of S3 (or S3-like object storage) as the new disk. Emerging players like NeonDB, Rockset, Supabase, Databend, and TiDB Serverless are building their architectures around S3.
With S3 taking on much of the storage responsibility, database developers can shift their focus away from data replication and toward higher-level features like multi-tenancy, resource management, and advanced indexing.
This evolution is driving broader interest among database providers to adopt S3 as a core component of their storage stack.
Vector Indexing and Multi-Modal Databases
With RAG gaining significant traction, vector indexing has become a central focus in database technology. By 2025, I anticipate that mainstream databases will natively support vector indexing, gradually pushing standalone vector databases into a niche or transitional role.
Finally, what advice would you give to IT leaders and developers who are looking to modernize their data architecture but are hesitant to move away from legacy systems?
India’s digital-native businesses have demonstrated exceptional growth, and according to industry leaders, we’ve only scratched the surface of their potential. The sector is projected to expand 10x to 100x over the next 5-10 years, driving unprecedented data generation that is putting immense pressure on modern applications.
This explosive data growth is exposing the limitations of traditional databases, which were never designed to handle such scale. These legacy systems are reaching their breaking point, and the temporary solutions—such as read replicas and sharding workarounds—will soon hit the same scalability walls.
The shift to distributed databases is no longer optional; it is becoming essential for any growing company. Business leaders must act now to develop comprehensive data strategies and begin their modernization journey before these limitations become critical bottlenecks.
Companies should consider distributed databases like TiDB if they are:
- Struggling to scale their current database infrastructure
- Relying on workarounds like multiple read replicas with single-writer architectures
- Unable to generate real-time business insights due to data processing limitations
- Experiencing performance degradation as their user base and data volumes grow
The time for reactive measures has passed—proactive data strategy planning is now a competitive necessity for sustained growth in India’s rapidly evolving digital landscape.