There are only patterns, patterns on top of patterns, patterns that affect other patterns. Patterns hidden by patterns. Patterns within patterns.
If you watch close, history does nothing but repeat itself.
What we call chaos is just patterns we haven’t recognized. What we call random is just patterns we can’t decipher. What we can’t understand we call nonsense. What we can’t read we call gibberish.
There is no free will.
There are no variables.
Chuck Palahniuk
Building microservices is hard. Exceptionally hard. Despite the difficulties of building and managing such a chaotic architecture, professionals gravitate towards it. The reasons vary from the most mundane (e.g. it’s the “in” thing) to the most sensible (e.g. I badly need to scale). I have been dealing with Microservices for more than 5 years now, and I still learn something new from time to time (especially when there are issues).
The Microservice Architecture is not new. In fact, you can trace its origins to Service-Oriented Architecture, with its goal of having loosely-coupled, scalable services. This loose coupling allows services to be implemented and scaled independently of each other. At small scale, the architecture doesn’t really make sense: most applications and use cases can be implemented with ease using existing monolithic frameworks and approaches. But once you start dealing with large-scale systems that need to evolve at a fast pace and are used by thousands (if not millions) of people at the same time, Microservices start to make sense.
As I transition to my new job, I feel that it’s a good time to share my thoughts and the lessons I have learned while dealing with this chaotic architecture.
Microservice Architecture is the manifestation of Organized Chaos. Just like this list.
For simplicity, I will refer to Microservice Architecture as MSA, and Microservices as MS.
Enjoy the brain dump!
- MSA requires a lot of coordination and management overhead across teams and business stakeholders.
- MS sizes vary. There is no prescribed size or complexity for it.
- A monolithic system can be considered an MS in the grand scheme of things.
- Legacy systems, as long as they expose the right interfaces, can be considered an MS.
- Many APIs and frameworks used for building MS still default to ancient technologies like HTTP/1.0 (this is a very common gotcha).
- Proper resource usage footprint is a must. Otherwise, be prepared to throw hardware at the problem. If this is the case, cost may become an issue.
- Minimize log footprint. Only output what is necessary for detecting issues.
- Invest in telemetry early. Building telemetry infrastructure for MSA IS VERY PAINFUL.
- MSA is not just about architecture and software. It requires specialized infrastructure to operate effectively.
- MS must be designed and implemented to handle failure and recovery in an automated fashion.
- Have an understanding of not just the software, but also the OS and hardware. Most problems at scale require a combination of solutions at various levels.
- DDoSing your own systems is a common problem. Be prepared to optimize your flows.
- Eliminate the use of HTTP/1.0 due to its poor management of connections. Use HTTP/1.1 persistent connections, or HTTP/2, which multiplexes requests over a single long-lived connection.
- When an MS uses Java, make sure to warm up the JVM before allowing traffic. This ensures that the MS works at peak performance, and that its CPU/memory metrics reflect steady-state behavior.
- Choose the right data store for the job. As a general rule of thumb, relational databases are not a natural fit for MSA.
- Relational databases can still be used in an MSA, but be prepared to implement data sharding.
- Load test early. Load test often.
- Unit tests and integration tests will save you a ton of time. Make sure to implement them across all MS.
- Kafka and MSA are a match made in heaven. Learn to use Kafka.
- Automate! Automate! Automate! Automate! Automate! And Automate some more!
- When using Nginx or Apache as a reverse proxy, make sure to use HTTP/1.1 persistent connections.
- Learn the mechanics of performing zero downtime deployment. It is the right way to deploy at scale.
- Stop using outdated security practices like IP whitelisting. It just doesn’t scale.
- Understand the concept of Network of Trust. It is the right way to build secure systems.
- Building an MSA is like building a water pipeline. Back pressure is a real problem and very difficult to diagnose without proper telemetry.
- Understand what it takes to build resilient and fault-tolerant MS. If necessary, use a framework similar to Hystrix to manage timeouts, rate limiting, fail fast, gradual degradation, etc.
- Alarms are overrated. They will eventually be ignored. Focus on stability and self-healing.
- Implement idempotent services to ensure that requests can be retried safely.
- For transactional workflows, make sure to implement a transaction query and rollback API.
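
To make the idempotency point above a bit more concrete, here is a minimal sketch in Python. It assumes the client sends a unique key with each request (the `req-123` key, the `IdempotentHandler` class, and the in-memory dictionary are all made up for illustration; a real service would likely keep the keys in a shared store with an expiry).

```python
import threading
from typing import Callable, Dict


class IdempotentHandler:
    """Runs an operation at most once per key and replays the stored result."""

    def __init__(self) -> None:
        self._results: Dict[str, object] = {}
        self._lock = threading.Lock()

    def handle(self, key: str, operation: Callable[[], object]) -> object:
        # Holding the lock while the operation runs keeps the sketch simple:
        # two concurrent retries with the same key cannot both execute it.
        with self._lock:
            if key not in self._results:
                self._results[key] = operation()
            return self._results[key]


# Hypothetical side effect: debit an account by 10.
balance = {"amount": 100}


def debit_ten() -> int:
    balance["amount"] -= 10
    return balance["amount"]


handler = IdempotentHandler()
first = handler.handle("req-123", debit_ten)
retry = handler.handle("req-123", debit_ten)  # same key, e.g. a retry after a timeout
```

The retry gets the stored result back instead of debiting the account a second time. If the operation raises, nothing is stored, so a later retry with the same key runs it again.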
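
The "fail fast" part of the resilience bullet above can be sketched as a tiny circuit breaker. This is not how Hystrix is implemented; it is a simplified illustration of the idea, and the class name, thresholds, and injectable `clock` parameter are assumptions made for the example.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the dependency."""


class CircuitBreaker:
    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, func: Callable[[], object]) -> object:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("dependency marked unhealthy, failing fast")
            # Cool-down elapsed: let one trial call through (half-open).
            self._opened_at = None
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        self._failures = 0  # any success closes the circuit again
        return result


# Usage with a fake clock so the behavior is deterministic.
fake_now = [0.0]
breaker = CircuitBreaker(max_failures=2, reset_after=10.0, clock=lambda: fake_now[0])


def flaky() -> None:
    raise ConnectionError("downstream unavailable")


outcomes = []
for _ in range(3):
    try:
        breaker.call(flaky)
    except CircuitOpenError:
        outcomes.append("rejected")
    except ConnectionError:
        outcomes.append("failed")
```

After two real failures the third call is rejected immediately, which spares the struggling dependency and keeps caller threads from piling up behind timeouts.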
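
One common way to keep back pressure from building up silently (the water pipeline bullet above) is to put a bounded buffer in front of slow work and reject what does not fit, so the overload becomes visible and countable. The sketch below uses Python's standard `queue.Queue`; the `BoundedInbox` name and the `rejected` counter are assumptions for illustration.

```python
import queue


class BoundedInbox:
    """Accepts work up to a fixed capacity and sheds the rest."""

    def __init__(self, capacity: int) -> None:
        self._queue: "queue.Queue[str]" = queue.Queue(maxsize=capacity)
        self.rejected = 0  # a counter like this is what telemetry would export

    def offer(self, message: str) -> bool:
        try:
            self._queue.put_nowait(message)
            return True
        except queue.Full:
            # Signal the producer to slow down instead of queueing without limit.
            self.rejected += 1
            return False

    def take(self) -> str:
        return self._queue.get_nowait()


inbox = BoundedInbox(capacity=2)
accepted = [inbox.offer(m) for m in ("a", "b", "c")]  # third message does not fit
oldest = inbox.take()                                 # consumer frees one slot
accepted_after_drain = inbox.offer("c")
```

An unbounded queue would have accepted everything and hidden the problem until memory or latency blew up; the bounded version pushes the signal back to the producer right away.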