
https://medium.com/@vinciabhinav7/how-to-design-a-read-heavy-system-some-strategies-and-best-practices-20e416a77cfd query optimisation
How do you design a write-heavy system?
How do you design a low-latency application?
How do you select which database to use?
https://medium.com/learning-sql/12-tips-for-optimizing-sql-queries-for-faster-performance-8c6c092d7af1
https://diptendud.medium.com/how-to-improve-the-performance-of-a-microservice-ff14542defe2
Serverless computing comes in two types of services: Function as a Service (FaaS) and Backend as a Service (BaaS).
Types of triggers for Azure Functions
Data Engineering basics
https://medium.com/@datavalleyind/data-engineering-getting-started-with-the-fundamentals-7ef83a432e78
https://www.altexsoft.com/blog/big-data-analytics-explained/
- Put your scenario in place
- Add monitoring
- Add traffic
- Evaluate results
- Remediate based on results
- Rinse, repeat until reasonably happy
You are working on a Spring Boot REST API that processes over 1 million financial transactions daily. The API is experiencing high latency and excessive CPU/memory usage. How do you optimize it for better performance?
Optimizing the performance of a high-volume financial transactions API involves several strategies to reduce latency and manage CPU and memory usage effectively. Here are key approaches to consider:
Implement Caching: Utilize caching mechanisms to store frequently accessed data, reducing the need for repetitive database queries. Tools like Redis or Memcached can be employed to cache responses for high-traffic endpoints. Ensure that cache expiration policies are appropriately set to maintain data consistency.
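As a minimal sketch of the caching idea, the class below keeps values in process memory and reloads them once a time-to-live has elapsed. It is an illustration only: `TtlCache`, its constructor parameters, and the injectable clock are hypothetical names, and a real service would more likely sit behind Redis, Memcached, or Spring's cache abstraction.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Tiny in-process cache whose entries expire after a fixed time-to-live (hypothetical helper). */
class TtlCache<K, V> {
    /** A cached value together with the instant at which it stops being valid. */
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so expiry can be exercised without sleeping

    TtlCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached value, or calls the loader and caches its result when missing or expired. */
    V get(K key, Function<? super K, ? extends V> loader) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.compute(key, (k, existing) -> {
            if (existing != null && existing.expiresAtMillis > now) {
                return existing; // still fresh: no trip to the database
            }
            V loaded = loader.apply(k);
            return new Entry<V>(loaded, now + ttlMillis);
        });
        return entry.value;
    }
}
```

A shorter TTL keeps cached data closer to the source of truth at the cost of more loads, which is the consistency trade-off mentioned above.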
Optimize Database Interactions:
- Efficient Queries: Review and optimize database queries to ensure they are performant. Avoid fetching unnecessary data and ensure that queries are indexed appropriately.
- Connection Pooling: Use a fast connection pool, such as HikariCP, and configure it optimally to manage database connections efficiently.
Employ Asynchronous Processing: For operations that don't require immediate responses, implement asynchronous processing. This approach allows the API to handle other requests while waiting for long-running tasks to complete, thereby improving overall responsiveness.
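The asynchronous approach can be sketched with the JDK's `CompletableFuture`. The service name, the pool size of 4, and the `slowSettlement` step are assumptions made for illustration; in a Spring Boot application the same shape is often expressed with `@Async` or a message queue instead.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical service that hands slow work to a worker pool instead of blocking the caller. */
class AsyncTransactionService {
    // Pool size is an arbitrary illustrative value, not a recommendation
    private final ExecutorService workers = Executors.newFixedThreadPool(4);

    /** Returns immediately; the future completes once the slow step has finished. */
    CompletableFuture<String> submit(String transactionId) {
        return CompletableFuture
                .supplyAsync(() -> slowSettlement(transactionId), workers)
                .thenApply(settledId -> "SETTLED:" + settledId);
    }

    // Stand-in for a long-running downstream call
    private String slowSettlement(String transactionId) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return transactionId;
    }

    void shutdown() {
        workers.shutdown();
    }
}
```

The request thread is free as soon as `submit` returns, so it can serve other requests while settlement runs in the background.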
Utilize Reactive Programming: Adopt a reactive, non-blocking programming model to handle concurrent requests more efficiently. This is particularly beneficial when the API acts as a pass-through to external services, as it allows threads to be reused while waiting for external responses.
Implement Pagination and Filtering: For endpoints that return large datasets, incorporate pagination and filtering to limit the amount of data processed and transmitted in each request. This reduces server load and response times.
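A rough sketch of pagination over an in-memory list follows. `Page` and `Paginator` are hypothetical names and pages are zero-based by assumption. With a real database the slicing should be pushed into the query (for example through Spring Data's `Pageable`) rather than done after loading everything.

```java
import java.util.List;

/** One slice of a larger result set, plus enough metadata for the client to ask for the next slice. */
class Page<T> {
    final List<T> items;
    final int pageNumber;
    final int totalPages;

    Page(List<T> items, int pageNumber, int totalPages) {
        this.items = items;
        this.pageNumber = pageNumber;
        this.totalPages = totalPages;
    }
}

class Paginator {
    /** Returns the zero-based page of the given size; pages past the end are empty. */
    static <T> Page<T> paginate(List<T> all, int pageNumber, int pageSize) {
        if (pageNumber < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageNumber must be >= 0 and pageSize > 0");
        }
        int total = all.size();
        int totalPages = (total + pageSize - 1) / pageSize; // ceiling division
        int from = (int) Math.min((long) pageNumber * pageSize, total);
        int to = (int) Math.min((long) from + pageSize, total);
        return new Page<>(List.copyOf(all.subList(from, to)), pageNumber, totalPages);
    }
}
```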
Enable Response Compression: Compress API responses to reduce payload sizes, leading to faster transmission times and reduced bandwidth usage. GZIP is a commonly used compression method that can be enabled in Spring Boot applications.
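To make the effect of compression concrete, here is a small sketch using the JDK's `java.util.zip` classes. `GzipDemo` is a hypothetical helper. In Spring Boot itself, response compression is normally switched on through the `server.compression.enabled` property rather than written by hand.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper showing a GZIP round trip on a text payload. */
class GzipDemo {
    static byte[] compress(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer, so read the buffer only after the try block
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static String decompress(byte[] compressed) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Repetitive JSON, which is typical of list endpoints, compresses well. Very small responses may not be worth compressing because of the fixed header and CPU overhead.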
Monitor and Manage Thread Pools: Properly configure thread pools to match the application's workload and the server's capabilities. This ensures that the API can handle concurrent requests without overwhelming system resources.
Implement Rate Limiting: Introduce rate limiting to prevent abuse and ensure fair usage of the API. This helps protect the system from being overwhelmed by excessive requests from a single client.
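One common way to implement rate limiting is a token bucket, sketched below under stated assumptions: `TokenBucket`, its parameters, and the injectable nanosecond clock are illustrative names, and a real deployment would keep one bucket per client and usually store it in a shared place such as Redis.

```java
import java.util.function.LongSupplier;

/** Minimal token bucket: allows bursts up to capacity, then a steady refill rate (hypothetical helper). */
class TokenBucket {
    private final long capacity;
    private final double refillPerSecond;
    private final LongSupplier nanoClock; // injectable so tests can control time
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(long capacity, double refillPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.nanoClock = nanoClock;
        this.tokens = capacity; // start full so an initial burst is allowed
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Takes one token if available; returns false when the caller should be rejected (for example with HTTP 429). */
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        double refill = (now - lastRefillNanos) * refillPerSecond / 1_000_000_000.0;
        tokens = Math.min(capacity, tokens + refill);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```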
Profile and Monitor Performance: Continuously monitor the API's performance to identify bottlenecks. Use profiling tools to gain insights into CPU and memory usage, and adjust configurations as needed to optimize resource utilization.
By systematically applying these strategies, you can enhance the performance of your Spring Boot REST API, ensuring it efficiently handles over a million financial transactions daily.
- Traditionally, NoSQL databases often followed the BASE (Basically Available, Soft state, Eventually consistent) model, which prioritizes availability and performance.
- Relational databases, on the other hand, adhere to ACID properties, ensuring data integrity.
- Rate limiting is a precise method for restricting request counts within a time window.
- Throttling is a more general term for controlling resource consumption, which may or may not include rate limiting.
- Backpressure management is specifically designed to handle flow control between components with different processing speeds.
In simpler terms:
- Rate limiting says, "You can only do X amount of things in Y time."
- Throttling says, "We need to slow things down to keep the system healthy."
- Backpressure says, "Slow down, I am getting overloaded."
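Backpressure in its simplest form is a bounded queue between a fast producer and a slower consumer. The sketch below is illustrative only; `BoundedInbox` and its method names are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Bounded hand-off point: a full queue is the signal that the consumer is falling behind. */
class BoundedInbox<T> {
    private final BlockingQueue<T> queue;

    BoundedInbox(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Non-blocking: returns false when full, so the producer can back off, retry later, or shed load. */
    boolean tryPublish(T message) {
        return queue.offer(message);
    }

    /** Blocking variant: the producer thread itself is slowed down until there is room. */
    void publish(T message) throws InterruptedException {
        queue.put(message);
    }

    /** Consumer side: returns the oldest message, or null when the queue is empty. */
    T poll() {
        return queue.poll();
    }
}
```

Unlike a rate limiter, nothing here counts requests per time window. The limit comes purely from how quickly the consumer drains the queue.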
Hashing Algorithm
System Design
- Amazon
- Netflix
- Airbnb
- Uber
- YouTube
- Payment system
Design a highly available and scalable URL shortening service.
- BookMyShow
Complete links:
- Architecture of various system https://www.interviewbit.com/blog/category/architecture/
- Event Sourcing and Event-Driven Architecture: basics, event store, another link
- Apache Kafka with Apache Spark link
- Apache Spark
- System Design grokking
- Consistent Hashing
- anti-entropy mechanism in distributed system link
- Quorum-based system https://blog.sofwancoder.com/distributed-system-understanding-quorum-based-systems
- eventual consistency vs strong consistency
- Setting up CI/CD pipeline
- OLTP vs OLAP
- hinted handoff https://systemdesign.one/hinted-handoff/
- merkle tree https://www.linkedin.com/pulse/exploring-key-distributed-system-algorithms-concepts-series/
- API gateway
- Failover capability in DB
Additional topics:
- Dynamo - Highly Available Key-value Store
- Kafka - A Distributed Messaging System for Log Processing
- Consistent Hashing - Original paper
- Paxos - Protocol for distributed consensus
- Concurrency Controls - Optimistic methods for concurrency controls
- Gossip protocol - For failure detection and more.
- Chubby - Lock service for loosely-coupled distributed systems
- ZooKeeper - Wait-free coordination for Internet-scale systems
- MapReduce - Simplified Data Processing on Large Clusters
- Hadoop - A Distributed File System
Advanced questions:
Java & Backend Development
1. If Java didn’t have the synchronized keyword, how would you implement thread safety? link
2. How would you store a billion records in memory while ensuring efficient search operations? link
3. Explain Java’s ClassLoader in a way that a 10-year-old could understand.
4. What exactly happens inside the JVM when a NullPointerException is thrown?
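For question 1, two possible starting points are a compare-and-set loop on an atomic variable and an explicit `ReentrantLock`. The sketch below shows both on a counter. The class names are hypothetical, and this is one way to approach the question, not a complete answer.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

/** Lock-free counter: retries the compare-and-set until no other thread interfered. */
class CasCounter {
    private final AtomicLong value = new AtomicLong();

    long increment() {
        while (true) {
            long current = value.get();
            if (value.compareAndSet(current, current + 1)) {
                return current + 1;
            }
        }
    }

    long get() {
        return value.get();
    }
}

/** Same contract, guarded by an explicit lock instead of the synchronized keyword. */
class LockedCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private long value;

    long increment() {
        lock.lock();
        try {
            return ++value;
        } finally {
            lock.unlock(); // always release, even if the body throws
        }
    }

    long get() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }
}

/** Small harness that runs a task from several threads and waits for all of them. */
class CounterDemo {
    static void runConcurrently(Runnable task, int threads, int callsPerThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) {
                    task.run();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }
}
```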
System Design Challenges
5. Design a traffic management system for a city with self-driving cars.
6. If you had to reduce API response time by 50% in a large-scale system, where would you start?
7. How would you design a video streaming platform that adapts in real-time to network conditions?
Algorithm & Data Structures Curveballs
8. Can you sort an array faster than O(n log n)?
9. You have an infinite stream of numbers. How would you efficiently find the median at any point?
10. If you could only use one data structure for every problem, which one would it be and why?
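For question 9, a commonly used approach is to keep two heaps: a max-heap for the smaller half of the numbers and a min-heap for the larger half. The sketch below assumes `int` inputs, and `RunningMedian` is a hypothetical name.

```java
import java.util.Collections;
import java.util.PriorityQueue;

/** Running median over a stream: O(log n) per insert, O(1) per median query. */
class RunningMedian {
    private final PriorityQueue<Integer> lower = new PriorityQueue<>(Collections.reverseOrder()); // max-heap
    private final PriorityQueue<Integer> upper = new PriorityQueue<>(); // min-heap

    void add(int number) {
        // Route the number through the lower half so every element of lower stays <= every element of upper
        lower.add(number);
        upper.add(lower.poll());
        // Rebalance so that lower holds the same number of elements as upper, or exactly one more
        if (upper.size() > lower.size()) {
            lower.add(upper.poll());
        }
    }

    double median() {
        if (lower.isEmpty()) {
            throw new IllegalStateException("no numbers seen yet");
        }
        if (lower.size() > upper.size()) {
            return lower.peek();
        }
        return (lower.peek() + (double) upper.peek()) / 2.0;
    }
}
```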
Unique & Unexpected Questions
11. How would you explain recursion to someone who has never coded before?
12. If you could remove one feature from Java, what would it be and why?
13. Tell me something interesting about technology that isn’t on your resume.
Broad Category of System Design
➥1. Load Balancer
Key Topics:
- Types of Load Balancers - Application Layer (L7) vs Network Layer (L4).
- Algorithms - Round Robin, Least Connections, IP Hashing.
- Health Checks - Monitoring server availability and performance.
- Sticky Sessions - Keeping user sessions tied to specific servers.
- Scaling Strategies - Horizontal vs Vertical scaling with load balancers.
- Global Load Balancers - Handling traffic across multiple regions.
- Reverse Proxy - Serving as a gateway and caching responses.
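Two of the algorithms listed above, Round Robin and IP Hashing, can be sketched in a few lines. The class names are hypothetical and the server list is fixed for simplicity; a real balancer would also drop servers that fail health checks. Using `String.hashCode` is an assumption made for brevity, not a recommended hash.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Round Robin: hands out servers in order, wrapping around at the end of the list. */
class RoundRobinBalancer {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBalancer(List<String> servers) {
        this.servers = List.copyOf(servers);
    }

    String pick() {
        // floorMod keeps the index non-negative even after the counter overflows
        return servers.get(Math.floorMod(next.getAndIncrement(), servers.size()));
    }
}

/** IP Hashing: the same client address always maps to the same server while the list is unchanged. */
class IpHashBalancer {
    private final List<String> servers;

    IpHashBalancer(List<String> servers) {
        this.servers = List.copyOf(servers);
    }

    String pick(String clientIp) {
        return servers.get(Math.floorMod(clientIp.hashCode(), servers.size()));
    }
}
```

IP hashing gives a form of sticky sessions without shared state, but plain modulo hashing remaps most clients when a server is added or removed.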
➥2. Application Server
Key Topics:
- Stateless vs Stateful Servers - When to use which.
- Caching Strategies - In-memory caching (Redis/Memcached) and local caching.
- Session Management - Cookies vs Tokens (JWT).
- Concurrency Handling - Managing multiple requests with threads or async models.
- Microservices Architecture - Service discovery and inter-service communication.
- Containerization - Docker, Kubernetes, and deployment strategies.
- Rate Limiting & Throttling - Preventing abuse and managing traffic bursts.
➥3. Database (SQL vs NoSQL)
Key Topics:
- SQL vs NoSQL - When to choose which database type.
- Sharding and Partitioning - Horizontal scaling techniques.
- Replication - Primary-Secondary, Multi-Master setups for reliability.
- Consistency Models - Strong vs Eventual Consistency (CAP theorem).
- Indexing Strategies - Improving query performance.
- Caching Layers - Redis, Memcached for faster reads.
- Backup and Recovery - Disaster recovery planning and failover systems.
➥4. Pub-Sub or Producer-Consumer
Key Topics:
- Messaging Patterns - Pub-Sub vs Queue-based systems.
- Message Brokers - Kafka, RabbitMQ, AWS SQS/SNS.
- Idempotency - Avoiding duplicate message processing.
- Durability and Ordering - Ensuring messages aren’t lost or misordered.
- Dead Letter Queues - Handling failed messages.
- Scaling Consumers - Parallel processing and worker pools.
- Eventual Consistency - Maintaining consistency with asynchronous systems.
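The idempotency topic above can be illustrated with a consumer that remembers which message IDs it has already handled. This is a sketch under assumptions: `IdempotentConsumer` is a hypothetical name, every message is assumed to carry a unique ID, and the in-memory set stands in for a durable store such as a database table with a unique key.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Wraps a handler so that redelivered messages are processed at most once per ID. */
class IdempotentConsumer {
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();
    private final Consumer<String> handler;

    IdempotentConsumer(Consumer<String> handler) {
        this.handler = handler;
    }

    /** Returns true if the payload was processed, false if the ID had been seen before. */
    boolean handle(String messageId, String payload) {
        if (!processedIds.add(messageId)) {
            return false; // duplicate delivery: skip silently
        }
        try {
            handler.accept(payload);
            return true;
        } catch (RuntimeException e) {
            processedIds.remove(messageId); // allow a retry after a failed attempt
            throw e;
        }
    }
}
```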
➥5. Content Delivery Network (CDN)
Key Topics:
- How CDNs Work - Edge caching and reducing latency.
- Caching Policies - TTL (Time-to-Live) and Cache Invalidation.
- Geolocation Routing - Serving content from nearest data centers.
- Static vs Dynamic Content Delivery - Optimizing for both.
- SSL/TLS Termination - Secure communication.
- Load Distribution - Managing spikes in traffic.
- DDoS Protection - Preventing attacks and ensuring availability.
Microservice Developer Roadmap
1. Microservices Architecture Basics: Monolithic vs. Microservices, characteristics (independence, scalability, resilience), and designing microservices boundaries (DDD - Domain-Driven Design).
2. Service Communication: Synchronous (REST, gRPC) vs. Asynchronous (Message Queues), API design and versioning, event-driven architecture, and event sourcing.
3. Data Management: Database per service, distributed data management (saga pattern, 2PC, CQRS), and handling data consistency across services.
4. Deployment Strategies: Containerization (Docker), orchestration (Kubernetes), and service discovery and registry (Eureka, Consul).
5. Frameworks and Tools: Spring Boot (Spring Cloud for microservices), Micronaut, Quarkus, or Dropwizard as alternatives.
6. Communication Protocols: RESTful APIs and gRPC, messaging systems (Kafka, RabbitMQ).
7. Databases: SQL (PostgreSQL, MySQL), NoSQL (MongoDB, Cassandra), and distributed caching (Redis, Memcached).
8. CI/CD Pipelines: Tools like Jenkins, GitHub Actions, GitLab CI, and deployment strategies like Blue-Green and Canary deployments.
9. Infrastructure as Code: Terraform, Ansible, or AWS CloudFormation.
10. Logging and Monitoring: Centralized logging (ELK Stack, Splunk) and monitoring tools (Prometheus, Grafana).
11. Resilience and Fault Tolerance: Circuit Breaker (Hystrix, Resilience4j), Bulkhead pattern, and retries.
12. Security: OAuth2, OpenID Connect, and API Gateways (Zuul, Spring Cloud Gateway, Kong).
13. Testing Microservices: Unit and integration testing, contract testing (Pact), and end-to-end testing.
14. Scalability Patterns: Horizontal and vertical scaling, load balancing (HAProxy, NGINX).
15. Distributed Tracing: Tools like Jaeger and Zipkin.
16. Anti-Patterns: Avoiding distributed monoliths and over-engineering microservices.
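The Circuit Breaker pattern from item 11 can be sketched as a small state machine. This is a simplified illustration, not how Hystrix or Resilience4j are implemented: the class name, the thresholds, and the injectable clock are assumptions, and the lock is held during the call, which a real library would avoid.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Opens after repeated failures, answers with a fallback while open, then allows one trial call. */
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock; // injectable so tests can control time
    private State state = State.CLOSED;
    private int consecutiveFailures;
    private long openedAtMillis;

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    synchronized <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (state == State.OPEN) {
            if (clock.getAsLong() - openedAtMillis < openMillis) {
                return fallback.get(); // fail fast without touching the struggling dependency
            }
            state = State.HALF_OPEN; // cool-down elapsed: let one trial call through
        }
        try {
            T result = action.get();
            consecutiveFailures = 0;
            state = State.CLOSED;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;
                openedAtMillis = clock.getAsLong();
            }
            return fallback.get();
        }
    }

    synchronized State state() {
        return state;
    }
}
```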