Database Design आणि Scaling
SQL vs NoSQL, Sharding, Replication
Database हे system design मधला सर्वात critical component आहे. Wrong database choice, poor schema design — या चुका नंतर fix करणे खूप कठीण असतं. SQL vs NoSQL, vertical vs horizontal DB scaling — हे decisions मोठ्या systems मध्ये फार महत्त्वाचे असतात.
SQL vs NoSQL — कधी काय?
Comparison
SQL (PostgreSQL, MySQL):
✅ ACID transactions — banking, e-commerce साठी
✅ Complex JOINs, relationships
✅ Strong schema — data integrity
✅ Mature, battle-tested
❌ Horizontal scaling कठीण
❌ Schema changes painful
Use when: structured data, transactions, complex queries
NoSQL (MongoDB, Cassandra, DynamoDB):
✅ Horizontal scaling सोपं
✅ Flexible schema — schema-less
✅ High write throughput
✅ Specific use cases मध्ये खूप fast
❌ JOINs नाहीत (denormalization)
❌ ACID guarantees कमी (mostly)
Use when: unstructured data, massive scale, specific access patternsDatabase Replication
Replication म्हणजे एका database चे data दुसऱ्या DB वर copy करणे. Primary-Replica setup: Primary वर writes होतात, Replicas वर reads. यामुळे read load distribute होतो आणि failover होते.
Primary-Replica Replication
Write Operations → [Primary DB]
↓ (replication)
[Replica 1] [Replica 2] [Replica 3]
↑ ↑ ↑
Read Read Read
Benefits:
- Read requests Replicas वर जातात — Primary वरचा load कमी
- Primary fail झाल्यास Replica promote होतो
- Geographical replication — different regions मध्ये copiesDatabase Sharding
Sharding म्हणजे data horizontally partition करणे — एकाच DB मध्ये सगळं न ठेवता multiple DB instances मध्ये वाटणे. User_id 1-1M → Shard 1, 1M-2M → Shard 2.
Sharding Example
Users Table — Too large for one DB (1 Billion rows)
Shard Key: user_id
Shard 1: user_id 0 - 250M
Shard 2: user_id 250M - 500M
Shard 3: user_id 500M - 750M
Shard 4: user_id 750M - 1B
Challenges:
❌ Cross-shard JOINs — खूप complex
❌ Resharding — shard full झाल्यावर rebalancing
❌ Application-level routing लागतंDatabase scaling order: 1) Cache वापरा (Redis), 2) Read Replicas add करा, 3) Vertical scaling, 4) Sharding — as a last resort. Sharding खूप complexity आणतो.
✅ Key Points — लक्षात ठेवा
- ▸SQL: ACID, joins, structured data — Postgres/MySQL
- ▸NoSQL: scale, flexibility — MongoDB/Cassandra
- ▸Replication: Primary writes, Replicas reads
- ▸Sharding: data horizontally partition करा
- ▸Cache first, scale DB later