Handshake has grown and evolved significantly over the last eight years. Founded in 2014 primarily as a job board and career services support software, Handshake is today a full-fledged relationship platform with 500+ employees. It has become increasingly clear that continued massive growth in both traffic and organization size demands a fresh approach to our architecture. The time has come for a new vision, which we are calling Handshake Next.
Historically, Handshake has been a monolithic Rails application, which enabled rapid development in the early days. However, this foundation has become harder to work with as our codebase and our teams have grown. Ownership over the code hasn’t always been clear, which left teams feeling they had limited autonomy to make more significant changes and left some code unowned. With that in mind, it was clear that we needed to decompose our monolith to drive clearer ownership and greater autonomy.
Other companies have followed two well-worn paths when evolving their monoliths: microservices or a modular monolith. Both have their trade-offs. Modular monoliths provide all the benefits of a monolith plus clearer boundaries. On the other hand, they have a limited ability to use different technologies or to scale parts of an application independently. Moreover, the boundaries can be difficult to enforce in practice. Microservices provide flexibility in technology choices and scaling independence at the cost of needing to handle all the complexity of a fully distributed system. Additionally, the wrong boundaries can create a distributed monolith with all the complexity of a distributed system combined with all of the coupling of a monolith.
Given those concerns, we took a step back and decided to focus on how we define our boundaries first.
Defining our boundaries
Decomposing a monolith is a lengthy process with many steps. Given that, and our focus on establishing clear lines of ownership, we are prioritizing logical separation over physical isolation. We want our code and our data organized and modularized even if they share a codebase or database. Logical separation is a prerequisite for physical isolation and sets us up to physically isolate when needed. Where and when to physically isolate is determined by our scaling needs.
To determine how to break down our app logically, we turned to the tried-and-true Domain-Driven Design (DDD). Used for over a decade by companies large and small to tackle even the most complex business domains, DDD gives us strategic design foundations to confidently break up our system. Specifically, DDD provides the concept of a bounded context. Instead of having a single domain model, each bounded context is distinct, and its relationships to other contexts are explicitly modeled.
When defining our bounded contexts, we decided to align them with the subdomains of the business; this ensures our technical boundaries aren’t at odds with the needs of the business and our users. It also avoids creating artificial boundaries that make sense technically but are overgeneralized or introduce excessive coupling between distinct use cases.
Moreover, mapping these contexts to business subdomains makes aligning them with our org structure simpler. In keeping with our goal of clearer ownership and increased autonomy, we want each bounded context to be owned by a single team.
Right-sizing our services
When deciding how to translate these boundaries into code, we wanted to be forward-looking and pragmatic. Following a modular monolith strategy would have satisfied our goal of getting to logical separation first but would have eliminated the possibility of physical isolation of our bounded contexts. Going with microservices instead would have required us to physically isolate every service from day one. Additionally, as microservices tend to be very focused, they would be smaller than the bounded contexts we defined.
We aligned on using miniservices. For our purposes, miniservices are logically separated services that can run physically isolated and can also be composed together, sharing a runtime or other resource. Miniservices also represent a full bounded context; this gives us a tremendous amount of flexibility to decompose our application gradually and matches our goals.
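To make the idea concrete, here is a minimal sketch of one way a miniservice could sit behind a single interface and be either composed into the caller's runtime or run in isolation. It is an illustration under assumptions, not Handshake's actual code: the `JobsContext` interface, the `InProcessJobs` and `RemoteJobs` classes, and the `transport` callable are all hypothetical names.

```python
from typing import Callable, Protocol


class JobsContext(Protocol):
    """Public interface of a hypothetical 'jobs' bounded context."""

    def job_title(self, job_id: int) -> str: ...


class InProcessJobs:
    """Composed deployment: the miniservice shares the caller's runtime."""

    def __init__(self) -> None:
        # Hypothetical sample data standing in for the context's own store.
        self._titles = {1: "Software Engineer Intern"}

    def job_title(self, job_id: int) -> str:
        return self._titles[job_id]


class RemoteJobs:
    """Isolated deployment: same interface, but each call crosses a transport."""

    def __init__(self, transport: Callable[[str, dict], str]) -> None:
        # `transport` is an assumed callable (method name, arguments) -> result,
        # standing in for whatever RPC mechanism a real system would use.
        self._transport = transport

    def job_title(self, job_id: int) -> str:
        return self._transport("job_title", {"job_id": job_id})


def render_job_heading(jobs: JobsContext, job_id: int) -> str:
    # Callers depend only on the interface, not on how the context is deployed.
    return f"Now hiring: {jobs.job_title(job_id)}"
```

Because `render_job_heading` only sees the interface, moving the context from a shared runtime to its own process would not require changing its callers in this sketch.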
Sometimes, a certain feature or use case is so performance-sensitive that it needs to be split out and potentially even written in another language. For these special cases, we allow the creation of microservices. To us, microservices must be completely physically isolated, and they map to an individual feature or use case of a broader bounded context. We treat these microservices as private implementation details. Each bounded context still provides a single interface to the rest of the system regardless of whether it contains any microservices.
Balancing autonomy and collaboration
While we want to drive autonomy, we have also reached the scale where we need to work horizontally. We want to build core capabilities that can be reused across product features rather than reimplemented multiple times, and that can be composed together to simplify building out new features or exploring new lines of business. We also want to continue to allow our product experiences to be oriented around user personas while our cross-cutting features are maintained holistically.
To achieve these goals, we developed a layered approach onto which our bounded contexts map:
- Experience – specific user experiences
- Product – product features
- Platform – reusable capabilities specific to Handshake
- Infrastructure – generic capabilities not specific to Handshake
Each layer represents increasingly specific areas of concern, starting from our generic infrastructure going all the way to the experiences our users directly interact with. Bounded contexts on lower layers may serve multiple contexts on upper layers. For instance, a given product feature might be used in multiple experiences, or multiple product features might be built on top of a single platform capability.
To simplify the relationships and avoid cyclic dependencies between bounded contexts, a given context can only synchronously depend on contexts in layers beneath it.
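The downward-only rule lends itself to an automated check. The following is a rough sketch of such a check, assuming each context declares its layer and its synchronous dependencies. The context names and the `sync_dependency_violations` helper are hypothetical and only illustrate the rule.

```python
from enum import IntEnum


class Layer(IntEnum):
    """The four layers, ordered from lowest to highest."""

    INFRASTRUCTURE = 0
    PLATFORM = 1
    PRODUCT = 2
    EXPERIENCE = 3


# Hypothetical bounded contexts and the layer each one belongs to.
EXAMPLE_LAYERS = {
    "student_app": Layer.EXPERIENCE,
    "job_search": Layer.PRODUCT,
    "messaging": Layer.PRODUCT,
    "notifications": Layer.PLATFORM,
}


def sync_dependency_violations(layers, sync_deps):
    """Return (context, dependency) pairs that break the downward-only rule.

    A synchronous dependency is allowed only when the dependency sits in a
    strictly lower layer, so peer and upward dependencies are both reported.
    """
    return [
        (context, dependency)
        for context, dependencies in sync_deps.items()
        for dependency in dependencies
        if layers[dependency] >= layers[context]
    ]
```

With the sample data, `{"student_app": ["job_search"]}` passes, while `{"job_search": ["messaging"]}` is flagged because both contexts sit in the Product layer.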
Collaboration between peer contexts or in the reverse direction is handled asynchronously via an event-driven architecture. Any context can listen for domain events published by another context. A domain event represents a record of an action and who or what caused it in the system. Domain events can be used to trigger side effects, but they also provide a way to transmit state when contexts use physically isolated data stores.
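As an illustration of this pattern, the sketch below models a domain event and a toy in-memory bus. It assumes nothing about the messaging technology actually used: `DomainEvent`, `EventBus`, and the event name in the usage note are hypothetical, and the `drain` method merely stands in for a real broker's consumer loop.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class DomainEvent:
    name: str      # the action that happened
    actor: str     # who or what caused it
    payload: dict  # state carried to contexts with their own data stores
    occurred_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class EventBus:
    """Toy in-memory bus; a real system would use a durable message broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)
        self._queue = []

    def subscribe(self, event_name, handler):
        self._subscribers[event_name].append(handler)

    def publish(self, event):
        # Publishing only enqueues the event, so the publisher never waits
        # on (or knows about) the contexts that consume it.
        self._queue.append(event)

    def drain(self):
        """Deliver queued events to their subscribers in publish order."""
        while self._queue:
            event = self._queue.pop(0)
            for handler in self._subscribers[event.name]:
                handler(event)
```

For example, a peer context could call `bus.subscribe("application.submitted", handler)` and receive the event the next time the bus is drained, without the publishing context ever calling it directly.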
Decomposing a monolith is a huge undertaking that can take years to complete. However, with a pragmatic approach like the one we’ve outlined above, Handshake will have the flexibility to adapt to the needs of our growing organization and the increasing scale of our system while continuing to deliver important product updates.
We have already begun this journey, and we will be sharing more stories about our progress along the way. We've established the foundations of Handshake Next, but there is a lot more to do. We’re seeking talented technical team members to join us in this effort and accelerate this work to new heights. If you’d like to experience it firsthand, we are hiring.