Saturday, February 1, 2025

Database concepts in detail

NoSQL databases differ from one another. There are four main kinds: document databases, key-value stores, column-oriented databases, and graph databases.
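
To make the four models concrete, here is a small sketch in plain Python that shapes the same hypothetical user record the way each kind of store would typically hold it. No database client is involved, and every name in it (`user:42`, `follows`, the `neighbours` helper) is made up for illustration.

```python
# The same hypothetical "user 42" record, shaped for each NoSQL model.
# Plain Python structures only; no real database client is used.

# Document: one self-contained, nested record per entity.
document = {
    "_id": 42,
    "name": "Ada",
    "emails": ["ada@example.com"],
    "address": {"city": "London"},
}

# Key-value: an opaque value looked up by a single key.
key_value = {"user:42": '{"name": "Ada", "city": "London"}'}

# Column-oriented (wide-column style): row key -> column family -> columns.
wide_column = {
    "42": {
        "profile": {"name": "Ada", "city": "London"},
        "activity": {"last_login": "2025-02-01"},
    }
}

# Graph: nodes plus explicit, typed edges between them.
nodes = {42: {"name": "Ada"}, 7: {"name": "Grace"}}
edges = [(42, "follows", 7)]


def neighbours(node_id, relation):
    """Return the ids reachable from node_id over edges of one type."""
    return [dst for src, rel, dst in edges if src == node_id and rel == relation]


print(document["address"]["city"])   # nested field access
print(key_value["user:42"])          # whole value fetched by key
print(wide_column["42"]["profile"])  # one column family of one row
print(neighbours(42, "follows"))     # traversal along an edge type
```

The point of the comparison is the access pattern: documents are read by nested field, key-value entries by key only, wide-column rows by row key and column family, and graphs by following edges.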

   Note: vector databases and event stores are further categories beyond these four.


Types of Databases

  • Hierarchical Databases
  • Relational Databases
  • NoSQL Databases
       Document -> MongoDB, DocumentDB
       Key-value -> Redis, DynamoDB
       Columnar -> Cassandra, Bigtable, Druid
       Graph -> Azure Cosmos DB
       Time series -> InfluxDB, Prometheus
  • Network Databases
  • Object-oriented Databases
  • Cloud Databases
  • Centralized Databases
  • Operational Databases
  • NewSQL Databases -> CockroachDB
  • File storage
  • Block storage

Object storage vs. block storage vs. file storage:
https://aws.amazon.com/compare/the-difference-between-block-file-object-storage/


Techniques for Optimizing

  • Avoiding Over-Indexing
  • Efficient Query Design
  • Use of Stored Procedures
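
As a rough illustration of the first two points, the sketch below uses Python's built-in `sqlite3` module as a stand-in database. It compares the query plan before and after adding the one index a query actually needs. The `orders` table, its columns, and the index name are hypothetical, and SQLite has no stored procedures, so the third point is not covered here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

# Select only the columns needed and filter on a single column.
QUERY = "SELECT id, total FROM orders WHERE customer_id = ?"


def plan(sql, params):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


before = plan(QUERY, (7,))  # no index yet, so the table is scanned
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(QUERY, (7,))   # the one index this query needs

print("before:", before)
print("after: ", after)
```

Checking the plan before adding an index is also a simple guard against over-indexing: if the plan does not change, the index only adds write and storage cost.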


Key Metrics to Track

To maintain the health of your database, it’s important to track key metrics that provide insights into its performance and stability:

  • QPS (Queries Per Second): Measures the number of queries processed per second, helping you understand the load on your database.
  • Latency: Tracks the time taken to execute queries, indicating the responsiveness of your system.
  • CPU and Memory Usage: Monitors the resource consumption of your database nodes, ensuring they are not overburdened.
  • Disk I/O: Measures the read and write operations on your storage devices, highlighting potential bottlenecks.
  • Replication Lag: Indicates the delay in data replication across nodes, which is crucial for maintaining consistency and availability.
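
A minimal sketch of how three of these metrics could be derived from raw samples, assuming you already collect per-query latencies and commit/apply timestamps. The function names, the nearest-rank percentile method, and the sample values are illustrative choices, not part of any particular monitoring tool.

```python
import math


def qps(query_count, window_seconds):
    """Queries per second over a measurement window."""
    return query_count / window_seconds


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def replication_lag(primary_commit_ts, replica_apply_ts):
    """Seconds the replica is behind the primary (never negative)."""
    return max(0.0, primary_commit_ts - replica_apply_ts)


# Hypothetical latencies (ms) for 10 queries observed in a 2-second window.
latencies_ms = [12, 15, 11, 240, 14, 13, 16, 12, 18, 13]

print(qps(len(latencies_ms), window_seconds=2))  # load
print(percentile(latencies_ms, 50))              # typical latency
print(percentile(latencies_ms, 95))              # tail latency
print(replication_lag(1000.0, 997.5))            # seconds behind
```

Tracking a tail percentile next to the median matters because a single slow query, like the 240 ms sample above, barely moves the median but dominates the tail.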

Regular Maintenance Practices


Index Rebuilding

Indexes play a vital role in query performance, but they can become fragmented over time, leading to inefficiencies. Regularly rebuilding indexes helps maintain their effectiveness:

  • Reorganize Index: This operation defragments the index pages in place, improving read and write performance without taking long-lived locks on the table.
  • Rebuild Index: This more intensive operation creates a new index and drops the old one, fully optimizing the index structure. It’s useful for heavily fragmented indexes but, depending on the database engine and whether it supports online rebuilds, may block access to the table or require downtime.
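
The choice between the two operations can be sketched as a small decision function. This assumes SQL Server style `ALTER INDEX ... REORGANIZE` / `REBUILD` statements and uses 5% and 30% fragmentation as cut-offs. Those thresholds are a commonly quoted rule of thumb rather than something stated above, and the index and table names are hypothetical.

```python
def index_maintenance_sql(index, table, fragmentation_pct,
                          reorganize_at=5.0, rebuild_at=30.0):
    """Pick a maintenance statement from a fragmentation percentage.

    Returns None when the index is healthy enough to leave alone.
    The thresholds are illustrative defaults; tune them per workload.
    The index and table names are interpolated as-is, so they must
    come from a trusted source such as the system catalog.
    """
    if fragmentation_pct >= rebuild_at:
        action = "REBUILD"
    elif fragmentation_pct >= reorganize_at:
        action = "REORGANIZE"
    else:
        return None
    return f"ALTER INDEX {index} ON {table} {action};"


# Hypothetical fragmentation readings for three indexes on one table.
for name, frag in [("ix_a", 2.0), ("ix_b", 12.5), ("ix_c", 47.0)]:
    print(name, index_maintenance_sql(name, "orders", frag))
```

Returning `None` for lightly fragmented indexes keeps the maintenance window short, since neither operation is worth its cost in that case.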

Database Backups

Regular backups are essential for data protection and disaster recovery. TiDB provides several tools and strategies for effective backup management:

  • BR (Backup & Restore): A command-line tool designed for large-scale data backup and restoration. It supports both full and incremental backups, allowing you to efficiently manage your backup strategy.
  • Dumpling: A lightweight tool for exporting data from TiDB into SQL or CSV files. It’s useful for smaller datasets or when you need to migrate data between environments.
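
As a rough sketch, the snippet below only assembles the command lines for the two tools and prints them; it does not run anything. The flags shown (`--pd` and `--storage` for BR; `-h`, `-P`, `-u`, `--filetype`, `-o` for Dumpling) are the ones I recall from the TiDB documentation, so treat them as assumptions and confirm them with each tool's `--help` for your version. The addresses and paths are placeholders.

```python
import shlex


def br_full_backup_cmd(pd_addr, storage_url):
    """argv for a full cluster backup with BR (flags assumed, see lead-in)."""
    return ["tiup", "br", "backup", "full",
            "--pd", pd_addr, "--storage", storage_url]


def dumpling_export_cmd(host, port, user, out_dir, filetype="sql"):
    """argv for a logical export with Dumpling; filetype is 'sql' or 'csv'."""
    if filetype not in ("sql", "csv"):
        raise ValueError("filetype must be 'sql' or 'csv'")
    return ["tiup", "dumpling", "-h", host, "-P", str(port), "-u", user,
            "--filetype", filetype, "-o", out_dir]


# Placeholder endpoints and paths; nothing is executed here.
print(shlex.join(br_full_backup_cmd("127.0.0.1:2379", "local:///tmp/backup")))
print(shlex.join(dumpling_export_cmd("127.0.0.1", 4000, "root", "/tmp/export", "csv")))
```

Building the argument list separately from running it makes the backup job easy to log and review before it is handed to a scheduler or `subprocess.run`.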


