Wednesday, December 4, 2024

What a Software Engineer Must Also Know

Distributed Systems

  • Bloom Filters
  • Consistent Hashing
  • Quorum
  • Leader and Follower
  • Write-ahead Log
  • Segmented Log
  • High-Water Mark
  • Lease
  • Heartbeat
  • Gossip Protocol
  • Phi Accrual Failure Detection
  • Split-brain
  • Fencing
  • Checksum
  • Vector Clocks
  • CAP Theorem


SSL, TLS authentication

Keystore vs Truststore https://yasarayasawardhana.medium.com/beginners-guide-to-key-stores-and-trust-stores-f7fa6d70ca2e

Software Architecture Patterns

Single Sign-On


32 engineering blogs I swear by to level up as a software engineer, get a grip on system design, and understand how to scale real-world systems

1. Uber Engineering: https://lnkd.in/gvga-NEg
2. Figma Engineering: https://lnkd.in/gFbvV8Mk
3. Canva Engineering: https://lnkd.in/gRHZtDCa
4. Netflix Tech: https://lnkd.in/gq-SWapT
5. Slack Engineering: https://slack.engineering/
6. LinkedIn Engineering: https://lnkd.in/g5eMavet
7. Dropbox Tech: https://dropbox.tech/
8. GitHub Blog: https://lnkd.in/gKNBpb7V
9. Stripe Engineering: https://lnkd.in/g4JqgY39
10. Pinterest Engineering: https://lnkd.in/gnfrme2Z
11. SoundCloud Backstage Blog: https://lnkd.in/gNDUUreD
12. OpenAI Software Engineering: https://lnkd.in/g3wFsZk7
13. Twitter Engineering: https://lnkd.in/gVVcSYNf
14. Instagram Engineering: https://lnkd.in/grE9bRCy
15. Airbnb Tech: https://lnkd.in/gy3RF5ih
16. Medium Engineering: https://lnkd.in/g3ASQbbB
17. Lyft Engineering: https://eng.lyft.com/
18. Heroku Eng Blog: https://blog.heroku.com/
19. Yelp Engineering: https://lnkd.in/gBWEyaHK
20. Stack Overflow Engineering: https://lnkd.in/gENW-7Wh
21. Etsy Engineering: https://lnkd.in/gS6UwNN9
22. Notion Tech: https://lnkd.in/gCUUi2UC
23. Gusto Engineering: https://lnkd.in/gyqBr_CW
24. Rippling Engineering: https://lnkd.in/ge6K-UkG
25. MongoDB Engineering: https://lnkd.in/grsAiuvS
26. PayPal Tech: https://lnkd.in/gxj9Mx64
27. Instacart Tech: https://lnkd.in/gdGN9SrY
28. Ramp Engineering: https://lnkd.in/gw_kd2Vj
29. Coursera Engineering: https://lnkd.in/gvga-NEg
30. Engineering @ Meta: https://lnkd.in/gc9xnZQ8
31. Grab Tech: https://lnkd.in/gfhcUM5y
32. Engineering @ Spotify: https://lnkd.in/gje__jGu




► Coding Interview Preparation
- Coding Interview University → https://lnkd.in/gbhtV-Zn
- Tech Interview Handbook → https://lnkd.in/gbE2x95p
- Awesome Interview Questions → https://lnkd.in/gnrriMyb
- Front-End Interview Handbook → https://lnkd.in/gw4kZaQu
- Javascript Algorithm Implementations → https://lnkd.in/gu324JHD
- Python Algorithm Implementations → https://lnkd.in/gD_sDQC6
- Hiring Without Whiteboards → https://lnkd.in/gGYnBwz8

►  System Design & Architecture
- System Design Primer → https://lnkd.in/gmTP7kwc
- System Design 101 → https://lnkd.in/gyDYHhpF
- Low-Level Design Primer → https://lnkd.in/g4aVVDue
- Awesome System Design Resources → https://lnkd.in/gEX9FCaU
- System Design Questions → https://lnkd.in/gzDhrk-J
- Complete System Design → https://lnkd.in/gxD2QpW8
- Mobile System Design → https://lnkd.in/gqstsBSp
- System Design & Architecture → https://lnkd.in/gTBcfjU5
- Low-Level Design Guide → https://lnkd.in/gSPXTmkp



System Design
1. Design Vending Machine: https://lnkd.in/e9A7FdVm
2. Design Facebook: https://lnkd.in/eNgMkQjN
3. Design Distributed Job Scheduler: https://lnkd.in/eDduhS4k
4. Design Google Search: https://lnkd.in/e-WjtfdY
5. Design a Digital Wallet: https://lnkd.in/eAbSZNwm
6. Design Instagram: https://lnkd.in/eVtTh6pY
7. Design Distributed Cache: https://lnkd.in/eJGpAEX6
8. Design Uber: https://lnkd.in/ee4Wz9ij
9. Design Distributed Key-Value Store: https://lnkd.in/eRaNTFEG
10. Design Notification Service: https://lnkd.in/exmierj9
11. Design Spotify: https://lnkd.in/e_kn-ekT
12. Design Payment System: https://lnkd.in/e4-uXTJD
13. Design Leaderboard: https://lnkd.in/ejK3xQBK
14. Design File Sharing System like Dropbox: https://lnkd.in/exMTnp2i
15. Design Google Maps: https://lnkd.in/eQrUTZdp
16. Design Distributed Counter: https://lnkd.in/eCZNfCJi
17. Design WhatsApp: https://lnkd.in/eq_TGNHK
18. Design Netflix: https://lnkd.in/e6VkezVX
19. Design TikTok: https://lnkd.in/eT9HYZzd
20. Design Text Storage Service like Pastebin: https://lnkd.in/ezzjcJhd
21. Design Rate Limiter: https://lnkd.in/erSVhcDF
22. Design E-commerce Store like Amazon: https://lnkd.in/e_SpQRhm
23. Design Google Docs: https://lnkd.in/eVQKG2jn
24. Design Live Comments: https://lnkd.in/ex6t4yjb
25. Design Flight Booking System: https://lnkd.in/e3Hni_C9
26. Design Content Delivery Network (CDN): https://lnkd.in/ebBaYK-y
27. Design Online Code Editor: https://lnkd.in/eVKeuVw8
28. Design Autocomplete for Search Engines: https://lnkd.in/e7Vk62Ge
29. Design Food Delivery App like Doordash: https://lnkd.in/eTZcYpis
30. Design Authentication System: https://lnkd.in/eEAUhkp2
31. Design Distributed Message Queue like Kafka: https://lnkd.in/euAUzpht
32. Design Slack: https://lnkd.in/ejXs3B8E
33. Design Distributed Locking Service: https://lnkd.in/e5_JzgBt
34. Design Twitter: https://lnkd.in/etCt5KhG


33 Technical Papers I Would Recommend You to Read as an Engineering Manager at Google
1. Amazon DynamoDB (Amazon)
→(https://lnkd.in/ghzF8jRj)

2. BigTable (Google)
→(https://lnkd.in/gqddJyE8)

3. Bigtable: A Distributed Storage System for Structured Data (Google)
→(https://lnkd.in/gf5SiZYw)

4. Borg Cluster Management (Google)
→(https://lnkd.in/gUQCuJpD)

5. Cassandra: A Decentralized Structured Storage System (Facebook)
→(https://lnkd.in/g26BNNXr)

6. Dapper Tracing System (Google)
→(https://lnkd.in/gmqgEMyW)

7. Dremel: Interactive Analysis of Web-Scale Datasets (Google)
→(https://lnkd.in/g9qdExJE)

8. Dynamo: Amazon's Highly Available Key-value Store (Amazon)
→(https://lnkd.in/gePwCJup)

9. ElasticSearch Architecture (Elastic)
→(https://lnkd.in/gvHAQhas)

10. Erasure Coding in Windows Azure Storage (Microsoft)
→(https://lnkd.in/gTagXQ-U)

11. F1: A Distributed SQL Database That Scales (Google)
→(https://lnkd.in/gdQEhpv4)

12. Facebook Cassandra (Distributed NoSQL DB) (Facebook)
→(https://lnkd.in/eD9erCNu)

13. Facebook Memcache (KV Store) (Facebook)
→(https://lnkd.in/eYZM5SPb)

14. Finding a Needle in Haystack: Facebook’s Photo Storage (Facebook)
→(https://lnkd.in/gWKRb66H)

15. FlumeJava: Easy, Efficient Data-Parallel Pipelines (Google)
→(https://lnkd.in/gRCaqGcH)

16. GFS: Evolution on Fast-forward (Google)
→(https://lnkd.in/g52wY64c)

17. Google Chubby Locking Service (Google)
→(https://lnkd.in/g-G9hM9J)

18. Google File System (GFS) (Google)
→(https://lnkd.in/gNxqFMwF)

19. Hive: A Warehousing Solution Over a Map-Reduce Framework (Facebook)
→(https://lnkd.in/gnK_wcCR)

20. LinkedIn Kafka (PubSub) (LinkedIn)
→(https://lnkd.in/eGcagdRA)

21. MapReduce (Google)
→(https://lnkd.in/gc4bqa8W)

22. Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing (Google)
→(https://lnkd.in/gynRCu8N)

23. Meta XFaaS (Serverless Functions) (Meta)
→(https://lnkd.in/eHqbPXpH)

24. Napa - Data Warehousing (Google)
→(https://lnkd.in/gi34K37Z)

25. Napa - Partitioning Algorithm (Google)
→(https://lnkd.in/gjguksXB)

26. PNUTS: Yahoo!'s Hosted Data Serving Platform (Yahoo!)
→(https://lnkd.in/gUJYh6Ha)

27. Percolator: Large-scale Incremental Processing Using Distributed Transactions and Notifications (Google)
→(https://lnkd.in/gb3gDnrd)

28. Pregel Graph Processing (Google)
→(https://lnkd.in/g6usKZci)

29. RCFile: A Fast and Space-efficient Data Placement Structure in MapReduce-based Warehouse Systems (Facebook)
→(https://lnkd.in/gC4q_TSm)

30. Redis Architecture (Redis)
→(https://lnkd.in/gkENeYKh)

31. Spanner (Google)
→(https://lnkd.in/giEv9CYm)

32. TAO: Facebook’s Distributed Data Store for the Social Graph (Facebook)
→(https://lnkd.in/gCFn54ie)

33. Zanzibar Authorization System (Google)
→(https://lnkd.in/gJ7m_bha)

8 Important Algorithms You Should Know Before Your System Design Interviews



1/ Consistent Hashing
- Distributes data evenly across servers, minimizing remapping when nodes are added or removed.
- Use Case: Distributed caching systems, databases, and load balancing.
- Example: Cassandra uses consistent hashing to distribute data across clusters seamlessly.
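To make the "minimal remapping" claim concrete, here is a minimal sketch of a hash ring in Python. The class name `HashRing`, the `vnodes` parameter, and the use of MD5 for placement are assumptions made for illustration; this is not how Cassandra or any particular system implements it.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring (MD5 is used for spread, not security)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes    # virtual points per physical node
        self._points = []       # sorted ring positions
        self._owner = {}        # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            if point in self._owner:    # skip the (very unlikely) collision
                continue
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            if self._owner.get(point) == node:
                del self._owner[point]
                self._points.remove(point)

    def get(self, key):
        """Return the node owning the first ring point clockwise from the key."""
        if not self._points:
            raise LookupError("ring is empty")
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[i]]
```

With this sketch, adding a node only moves keys onto the new node; keys never shuffle between the existing nodes, which is the property that makes scaling a cache cluster cheap.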
2/ Quadtrees
- Efficiently indexes 2D spatial data by recursively dividing space into quadrants.
- Use Case: Mapping services, geospatial databases, and location-based applications.
- Example: Google Maps leverages quadtrees for fast location searches and rendering maps.
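The recursive subdivision can be sketched as a small point quadtree. Everything here (the `QuadTree` name, the `capacity` and `max_depth` parameters, half-open bounding boxes) is an assumption for illustration, not a description of any production mapping system.

```python
class QuadTree:
    """Point quadtree over the half-open box [x0, x1) x [y0, y1)."""

    def __init__(self, x0, y0, x1, y1, capacity=4, depth=0, max_depth=16):
        self.x0, self.y0, self.x1, self.y1 = x0, y0, x1, y1
        self.capacity = capacity      # points a leaf holds before splitting
        self.depth = depth
        self.max_depth = max_depth    # stops endless splitting on duplicates
        self.points = []
        self.children = None

    def _contains(self, x, y):
        return self.x0 <= x < self.x1 and self.y0 <= y < self.y1

    def insert(self, x, y):
        if not self._contains(x, y):
            return False
        if self.children is None:
            if len(self.points) < self.capacity or self.depth >= self.max_depth:
                self.points.append((x, y))
                return True
            self._split()
        return any(child.insert(x, y) for child in self.children)

    def _split(self):
        # Divide this box into four quadrants that share the exact midpoint.
        mx, my = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        boxes = [(self.x0, self.y0, mx, my), (mx, self.y0, self.x1, my),
                 (self.x0, my, mx, self.y1), (mx, my, self.x1, self.y1)]
        self.children = [
            QuadTree(*box, self.capacity, self.depth + 1, self.max_depth)
            for box in boxes
        ]
        old, self.points = self.points, []
        for x, y in old:
            any(child.insert(x, y) for child in self.children)

    def query(self, qx0, qy0, qx1, qy1):
        """Return all points inside the half-open query box."""
        if qx0 >= self.x1 or qx1 <= self.x0 or qy0 >= self.y1 or qy1 <= self.y0:
            return []    # query box does not overlap this node
        hits = [(x, y) for x, y in self.points
                if qx0 <= x < qx1 and qy0 <= y < qy1]
        for child in self.children or []:
            hits.extend(child.query(qx0, qy0, qx1, qy1))
        return hits
```

A range query skips every quadrant that does not overlap the search box, which is where the speedup over scanning all points comes from.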
3/ Leaky Bucket Algorithm
- Controls the rate of incoming requests by processing them at a steady flow, preventing overloads.
- Use Case: API rate limiting, traffic shaping, and network routers.
- Example: AWS API Gateway throttles requests with the closely related token bucket algorithm.
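A minimal sketch of the leaky bucket used as a meter: the bucket drains at a fixed rate, and a request is admitted only if there is room for it. The names `LeakyBucket`, `capacity`, `leak_rate`, and the injectable `clock` are assumptions for illustration.

```python
import time


class LeakyBucket:
    """Hypothetical leaky-bucket meter: admits a request only if the bucket has room."""

    def __init__(self, capacity, leak_rate, clock=time.monotonic):
        self.capacity = capacity      # maximum requests the bucket can hold
        self.leak_rate = leak_rate    # requests drained per second
        self._clock = clock           # injectable so tests can control time
        self._level = 0.0
        self._last = clock()

    def allow(self):
        now = self._clock()
        # Drain the bucket for the time elapsed since the last call.
        drained = (now - self._last) * self.leak_rate
        self._level = max(0.0, self._level - drained)
        self._last = now
        if self._level + 1 <= self.capacity:
            self._level += 1
            return True
        return False    # bucket is full: reject (or queue) the request
```

Passing the clock in as a parameter is a design choice that keeps the limiter deterministic under test; in a real service the default `time.monotonic` would be used.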
4/ Tries (Prefix Trees)
- Stores and retrieves strings quickly by organizing data into prefixes.
- Use Case: Autocomplete systems, IP routing, and spell-checkers.
- Example: Google Search uses tries for predictive text and autocomplete suggestions.
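A minimal sketch of a trie with prefix completion, the core of a toy autocomplete. The `Trie` and `complete` names are assumptions for illustration; real autocomplete systems also rank suggestions, which is omitted here.

```python
class TrieNode:
    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}      # character -> TrieNode
        self.is_word = False    # True if a stored word ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def complete(self, prefix):
        """Return all stored words starting with prefix, sorted alphabetically."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        words = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.is_word:
                words.append(text)
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        return sorted(words)
```

Lookup cost depends on the length of the prefix, not on how many words are stored.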
5/ Bloom Filters
- Provides fast membership checks with probabilistic accuracy, minimizing memory usage.
- Use Case: Caching, spam detection, and deduplication systems.
- Example: Google Bigtable uses Bloom filters to reduce disk reads for non-existent keys.
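A minimal sketch of a Bloom filter. The `size_bits` and `num_hashes` parameters and the double-hashing scheme derived from SHA-256 are assumptions for illustration. The key property is that a "no" is always correct, while a "yes" may be a false positive.

```python
import hashlib


class BloomFilter:
    """Hypothetical Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions from two 64-bit halves of one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size for i in range(self.k)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))
```

A storage engine can consult such a filter before touching disk: if the filter says the key is absent, the read is skipped entirely.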
6/ Sliding Window Algorithm
- Manages data streams in fixed-size windows to optimize performance and reduce memory usage.
- Use Case: Network congestion control, streaming data analysis, and rate limiting.
- Example: TCP flow control uses a sliding window to limit how much unacknowledged data is in flight.
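As a small illustration of the fixed-size window idea applied to a data stream (not to TCP itself), here is a sketch of a moving average that keeps only the last `size` values in memory. The `MovingAverage` name and its interface are assumptions for illustration.

```python
from collections import deque


class MovingAverage:
    """Average over the most recent `size` values of a stream."""

    def __init__(self, size):
        if size <= 0:
            raise ValueError("size must be positive")
        self.window = deque(maxlen=size)
        self.total = 0.0

    def add(self, value):
        # When the window is full, the oldest value is about to slide out.
        if len(self.window) == self.window.maxlen:
            self.total -= self.window[0]
        self.window.append(value)
        self.total += value
        return self.total / len(self.window)
```

Each update is constant time because the running total is adjusted instead of re-summing the window.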
7/ Raft Consensus Algorithm
- Ensures the nodes of a distributed system agree on shared state, as long as a majority of nodes remain available.
- Use Case: Leader election, replication, and fault tolerance in distributed databases.
- Example: etcd (used in Kubernetes) relies on Raft for cluster coordination and fault recovery.
8/ MapReduce
- Processes and aggregates large datasets by splitting tasks into parallel operations.
- Use Case: Distributed data processing and big data analytics.
- Example: Hadoop implements MapReduce to process massive datasets; Google BigQuery, by contrast, is built on Dremel.
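The MapReduce flow can be sketched in a single process as a word count: map emits key-value pairs, shuffle groups them by key, and reduce aggregates each group. The function names below are assumptions for illustration; a real framework runs these phases in parallel across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Emit (word, 1) for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse one key's values into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())
```

Because each map call looks at one document and each reduce call at one key, both phases can be distributed without coordination beyond the shuffle.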


These 10 Patterns Will Save You!


Before you step into that interview room, make sure you understand these 10 must-know microservices patterns, as they could make the difference between landing the job and getting stuck on a tricky question! 👇

1️⃣ Service Registry & Discovery
=> Helps services find and communicate with each other dynamically.
🔧 Tools: Eureka, Consul

2️⃣ API Gateway
=> Acts as a single entry point, handling request routing, authentication, and load balancing.
🔧 Tools: Zuul, Spring Cloud Gateway

3️⃣ Circuit Breaker
=> Prevents system-wide failures by cutting off requests to struggling services.
🔧 Tools: Resilience4j, Hystrix
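
A minimal sketch of the circuit breaker idea, written from scratch rather than taken from Resilience4j or Hystrix. The `CircuitBreaker` class, its `failure_threshold` and `reset_timeout` parameters, and the `CircuitOpenError` exception are assumptions for illustration.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold    # failures before opening
        self.reset_timeout = reset_timeout            # seconds to stay open
        self._clock = clock                           # injectable for tests
        self._failures = 0
        self._opened_at = None                        # None means closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit is open; call rejected")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the circuit is open, callers fail fast instead of waiting on a struggling dependency, which gives that dependency room to recover.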

4️⃣ Database Per Service
=> Each microservice gets its own database to maintain data isolation and prevent tight coupling.

5️⃣ Saga Pattern
=> Manages distributed transactions across multiple services via orchestration or choreography.
🔧 Tools: Camunda, Temporal

6️⃣ Strangler Fig Pattern
=> A safe way to migrate from a monolith to microservices—gradually replacing pieces instead of a risky big bang rewrite.

7️⃣ Event Sourcing
=> Stores every state change as an event, allowing for historical tracking and easier debugging.
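
A minimal sketch of event sourcing using a toy bank account: state is never stored directly, only derived by replaying the event log. The `Event` and `Account` names and the `upto` parameter are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str      # "deposited" or "withdrawn"
    amount: int


class Account:
    def __init__(self):
        self.events = []    # append-only log; the single source of truth

    def deposit(self, amount):
        self.events.append(Event("deposited", amount))

    def withdraw(self, amount):
        if amount > self.balance():
            raise ValueError("insufficient funds")
        self.events.append(Event("withdrawn", amount))

    def balance(self, upto=None):
        """Replay the first `upto` events (all of them if None)."""
        total = 0
        for event in self.events[:upto]:
            total += event.amount if event.kind == "deposited" else -event.amount
        return total
```

Replaying a prefix of the log answers "what was the balance after the second event?", which is the historical tracking the pattern is known for.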

8️⃣ CQRS (Command Query Responsibility Segregation)
=> Separates read and write operations for better scalability and performance.

9️⃣ Sidecar Pattern
=> Adds extra capabilities like logging, monitoring, or security without changing the main service.
🔧 Tools: Istio, Envoy

🔟 Publish-Subscribe (Pub-Sub)
=> Services communicate asynchronously, ensuring loose coupling and better scalability.
🔧 Tools: Kafka, RabbitMQ
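
A minimal in-process sketch of publish-subscribe. Note the simplification: delivery here is synchronous and in memory, whereas Kafka or RabbitMQ deliver asynchronously over the network with persistence. The `Broker` class and its methods are assumptions for illustration.

```python
from collections import defaultdict


class Broker:
    """Hypothetical in-memory broker: publishers and subscribers share only topic names."""

    def __init__(self):
        self._subscribers = defaultdict(list)    # topic -> list of handlers

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        """Deliver message to every handler on the topic; return how many received it."""
        handlers = list(self._subscribers.get(topic, ()))
        for handler in handlers:
            handler(message)
        return len(handlers)
```

The publisher never references a subscriber directly, so new consumers can be added without touching the producing service.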



11 Technical papers I wish I had read as a junior engineer (these helped me learn so much on my journey in the last 6 years from junior to senior SWE)

1. LinkedIn Kafka (PubSub) [https://lnkd.in/gJrDkNPv]

2. Designing Container-Based Systems [https://lnkd.in/gxJzAKi7]

3. Amazon Dynamo (SOSP 2007) [https://lnkd.in/gY-UJ5JP]

4. Bitcoin: A Peer-to-Peer Electronic Cash System [https://lnkd.in/gcqquJyQ]

5. Real-time Data Infrastructure at Uber [https://lnkd.in/g3WvfcMx]

6. Hierarchical Text-Conditional Image Generation with CLIP Latents [https://lnkd.in/gH4j_PQG]

7. Google File System (GFS) [https://lnkd.in/gTJQtVHw]

8. Google Chubby Locking Service [https://lnkd.in/gHAT7cBR]

9. Meta XFaaS: Hyperscale and Low-cost Serverless Functions [https://lnkd.in/eHqbPXpH]

10. Facebook Cassandra (Distributed NoSQL DB) [https://lnkd.in/eD9erCNu]

11. Facebook Memcache (KV Store) [https://lnkd.in/eYZM5SPb]






Additional Topics: 

Consistent Hashing, CAP Theorem, Load Balancing, Caching, Data Partitioning, Indexes, Proxies, Queues, Replication, and choosing between SQL vs. NoSQL.

Consistent Hashing:  https://highscalability.com/consistent-hashing-algorithm/

Load Balancer: https://medium.com/must-know-computer-science/system-design-load-balancing-1c2e7675fc27


