Moving beyond monolithic architecture in government
In 2020, Kristo Vaher, the Estonian Government Chief Technology Officer, released a discussion paper covering options for Estonia's future government architecture. The paper canvassed a number of areas of interest to researchers and policy makers working on digital government. This article focuses on Vaher's commentary on moving past monolithic architecture to microservices.
Introduction
Vaher notes that Estonian digital government architecture is complex due to the fragmented approach to technology across different administrative sectors. The evolution of software architecture, however, offers an opportunity to move existing monolithic systems towards service-oriented architecture.
Challenges in digital government
The paper describes how digital government in Estonia faces significant risks, including single points of failure in both infrastructure and services. Many services are interdependent, so the unavailability of one service, such as the population registry, can cause failures in others. Additionally, there is a low level of service reuse across administration sectors due to the complexity of adopting services developed for other sectors.
Monolithic architecture
As in many countries, the majority of digital government services in Estonia are built on monolithic software architectures. These systems are developed as single, comprehensive units that encapsulate business logic, data, and user interfaces. Although some developers have attempted to mitigate risks by building modular monoliths, reusing software modules across different systems remains challenging because each module is deeply integrated with the specifics of its business domain.
The advantages of monoliths
While Vaher's discussion focuses on the drawbacks of monolithic architecture and the possibilities beyond it, he accepts that monolithic systems are still attractive in certain scenarios. They are often quicker to develop, easier to maintain, and may be more suitable for small business cases where rapid deployment and testing are crucial. Monolithic architecture can be more efficient than alternatives like microservices in some cases.
Risks associated with monoliths
Clearly, monolithic systems do carry risks, particularly in the context of digital government. These include:
Tight coupling: Changes to one part of a monolithic system often require an understanding of the entire system, making updates and maintenance challenging over time.
Single point of failure: In a monolithic system, if one part of the system fails, it can impact all functionalities, leading to increased testing and maintenance efforts.
Technology lock-in: Probably the best-known downside of monolithic systems is that they are often tied to specific programming languages and databases, making them vulnerable to changes in technology popularity and increasing operational costs.
Scalability issues: Scaling monolithic systems to handle peak loads is usually inefficient, because the entire system must be scaled, leading to higher infrastructure costs.
Legacy concerns: Over time, monolithic systems frequently become difficult to maintain and extend, leading to the need for complete rewrites.
Vendor lock-in: Monolithic architectures can result in dependence on specific vendors and technologies, making it challenging to implement new features or switch to alternative solutions.
Service-Oriented Architecture (SOA)
While some monolithic architecture still exists, the Estonian government is seeking to follow a predominantly Service-Oriented Architecture (SOA) approach. SOA is built on the concept of "black boxes" or services that operate independently, interacting through defined APIs (Application Programming Interfaces).
This architecture offers several advantages:
Decoupled services: Services can be replaced or updated independently without affecting the entire system.
Scalability: Only the specific services experiencing high demand are scaled, consequently reducing infrastructure costs.
API management: API gateways provide a controlled environment for managing API traffic, authentication, and security. It is noted that API management does bring its own complexities.
Some of the challenges with SOA include the risk of API gateways becoming bottlenecks, critical business logic being embedded in gateways, and the need for careful governance to avoid creating new 'monolithic' dependencies.
X-Road framework
Vaher notes that X-Road is Estonia's most prominent example of SOA. It connects various government information systems, enabling secure data exchange across different administrative sectors. Since starting in 2001, X-Road has expanded beyond Estonia, with usage now in Finland and Iceland. The framework integrates multiple services, typically using SOAP APIs, and also supports REST APIs. Despite its lauded success, X-Road does face challenges, including:
Complex setup: Implementing X-Road, even for testing, is complex and can be a barrier to adoption for some jurisdictions.
Internal monoliths: Information systems connected to X-Road are often monolithic, making it difficult to implement new business rules and leading to tight coupling.
Synchronous requests: X-Road's synchronous communication model may not be well-suited for massive data analysis, creating potential fragmentation in data exchange.
Addressing the tight coupling of services and reducing complexity in systems like X-Road will be an area of ongoing focus for Estonia. Ensuring the sustainability and resilience of critical services, particularly in the specific context of data embassies and cross-border data management, will require ongoing architectural innovation and refinement.
The concept of microservices
Vaher explains the concept of microservices through the Athenian story of the ship Theseus used to escape. The parts of the ship were replaced one by one as they decayed, so that by the time the ship returned home none of its original parts remained. The story provokes the question: is it still the same ship?
The analogy illustrates the flexibility and adaptability required in modern digital government systems, where systems need to evolve piece by piece without losing their functionality or identity.
Evolving from monoliths to microservices
The shift to microservices does not need to be a dramatic, revolutionary change; it can be an evolutionary transition from monolithic architecture to Service-Oriented Architecture (SOA), and then to microservices. Some researchers suggest that building a monolith may still be a viable option due to its cost-effectiveness. However, the need for flexible and scalable systems makes microservices necessary. Microservices offer the autonomy and scalability required for modern digital services, building on the benefits of SOA while integrating the concept of Event-Driven Architecture (EDA).
What are the key features of microservices?
Vaher describes a good microservice as one that has a number of key features, built upon the principles of SOA. These include:
Statelessness: Microservices should operate independently, ensuring that each request is processed in isolation, with consistent outputs for consistent inputs.
Loose coupling and autonomy: Services should function independently, even when other services are unavailable, and must support backward compatibility and versioning to evolve without disrupting existing functionalities.
Caching and mockability: Implementing caching protocols and making services mockable allows for better performance and easier testing.
Self-documentation: Services should publish accurate, up-to-date documentation, preferably using tools like Swagger (Note: Swagger is an open-source set of specifications, rules, and tools for developing and describing RESTful APIs. The Swagger framework supports the creation of interactive, machine- and human-readable API documentation).
Monitoring and logging: Services must be monitored with traceable correlation IDs to maintain operational integrity.
Idempotency: Services should handle repeated requests consistently, without causing errors, to ensure stability in distributed environments (An API call or operation is idempotent if it has the same result no matter how many times it's applied. An idempotent operation provides protection against accidental duplicate calls causing unintended consequences).
Non-centralized authentication: Using methods like JSON Web Token (JWT) allows for secure, decentralized authentication, supporting concepts like Single Sign-On (SSO).
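The idempotency feature listed above can be illustrated with a minimal sketch. The `BenefitService` class, the `idempotency_key` parameter, and the in-memory result store are hypothetical names chosen for illustration and do not come from Vaher's paper. A real service would likely keep the keys in durable storage.

```python
from dataclasses import dataclass, field


@dataclass
class BenefitService:
    """Hypothetical service that records a benefit application at most once."""

    # Maps an idempotency key to the result already returned for it.
    _results: dict = field(default_factory=dict)
    applications_created: int = 0

    def submit(self, idempotency_key: str, citizen_id: str) -> dict:
        # A repeated request with the same key returns the stored result
        # instead of creating a second application.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.applications_created += 1
        result = {
            "application_no": self.applications_created,
            "citizen_id": citizen_id,
        }
        self._results[idempotency_key] = result
        return result


service = BenefitService()
first = service.submit("req-123", "citizen-A")
retry = service.submit("req-123", "citizen-A")  # e.g. a client retry after a timeout
```

Here the retry yields the same result as the first call and only one application is created, which is the protection against accidental duplicate calls described above.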
Cloud readiness
Microservices should be designed with cloud environments in mind, following the Twelve-Factor App methodology wherever possible (The Twelve-Factor App methodology is for creating software-as-a-service applications. These best practices are designed to enable applications to be built with portability and resilience when deployed to the web).
In Estonia, simplified requirements include automated service setup, the ability to run multiple independent instances, scalability across locations, and data backup capabilities. Additionally, Vaher recommends that microservices should be primarily choreographed, to respond to events in their environment rather than being tightly coupled to specific APIs.
Microservices in practice
While microservices offer many advantages, they are not a one-size-fits-all solution. Autonomy can be costly, and shared libraries for common functionalities may still be necessary. Implementing microservices without cloud technologies can also be challenging, as their full benefits are realised when deployed in the cloud. Examples of successful microservices in the private sector include companies like Spotify, Netflix, and Amazon, although these examples also highlight the need for thorough planning and testing.
Synchronous vs. asynchronous communication
In microservices, the distinction between synchronous and asynchronous communication is critical. Synchronous communication involves waiting for a response, which can lock up business processes and lead to cascading failures. Asynchronous communication, on the other hand, allows services to process multiple requests simultaneously, improving scalability and reducing the risk of system-wide failures.
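A small sketch can show one narrow sense of this distinction: non-blocking calls within a single process, using Python's standard `asyncio` module. The `call_registry` function and the registry names are hypothetical, and `asyncio.sleep` stands in for a network call. Message-based asynchrony between services is a broader idea than what is shown here.

```python
import asyncio
import time


async def call_registry(name: str, delay: float) -> str:
    # asyncio.sleep stands in for a slow network call to another service.
    await asyncio.sleep(delay)
    return f"{name}: ok"


async def fetch_all() -> list:
    # The three calls wait concurrently, so the total time is close to the
    # slowest call rather than the sum of all three.
    return await asyncio.gather(
        call_registry("population", 0.2),
        call_registry("business", 0.2),
        call_registry("land", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(fetch_all())
elapsed = time.perf_counter() - start
```

Called one after another in a blocking style, the same three calls would take roughly the sum of their delays, which is how a slow dependency can hold up a whole business process.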
Event-Driven Architecture (EDA)
Microservices often rely on Event-Driven Architecture to achieve decoupling and scalability. EDA allows services to react to events in a more natural, human-like manner, similar to how organisations operate. In EDA, services subscribe to message rooms where events of interest are posted. This approach reduces dependencies and improves the flexibility of the architecture, making it well-suited for large-scale organisations like government sectors.
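The subscription model described above can be sketched with a minimal in-memory example. The `MessageRoom` class, the `address_changed` event type, and the subscriber names are hypothetical. A production system would use a message broker rather than direct function calls.

```python
from collections import defaultdict
from typing import Callable


class MessageRoom:
    """Hypothetical in-memory stand-in for a publish/subscribe message room."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> int:
        # The publisher does not know who is listening; it only posts the event.
        handlers = self._subscribers.get(event_type, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


room = MessageRoom()
notified = []
room.subscribe("address_changed", lambda e: notified.append(("tax", e["person"])))
room.subscribe("address_changed", lambda e: notified.append(("health", e["person"])))
delivered = room.publish("address_changed", {"person": "P-1"})
```

The publisher has no reference to the tax or health subscribers, so either can be added or removed without changing the publishing service.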
CAP Theorem in distributed systems
Vaher references the CAP theorem, as proposed by Eric Brewer, which states that in a distributed system it is impossible to achieve all three of the following simultaneously: Consistency, Availability, and Partition Tolerance. In microservices, where services are distributed across a network, partition tolerance is essential, meaning architects must choose between consistency and availability. Consistent and partition-tolerant services may sacrifice availability, while available and partition-tolerant services may sacrifice consistency, depending on the specific needs and priorities of the system.
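The trade-off can be made concrete with a toy model. The `Store` class below is a hypothetical two-replica store invented for illustration. It is a deliberate simplification and not a real replication protocol.

```python
class Replica:
    def __init__(self) -> None:
        self.value = None


class Store:
    """Hypothetical two-replica store illustrating the CP/AP choice."""

    def __init__(self, mode: str) -> None:
        self.mode = mode  # "CP" or "AP"
        self.a, self.b = Replica(), Replica()
        self.partitioned = False

    def write(self, value: str) -> bool:
        if not self.partitioned:
            # No partition: both replicas are updated together.
            self.a.value = self.b.value = value
            return True
        if self.mode == "CP":
            # Refuse the write: consistent, but unavailable during the partition.
            return False
        # AP: accept the write on the reachable replica; replicas now disagree.
        self.a.value = value
        return True


cp = Store("CP")
cp.write("v1")
cp.partitioned = True
cp_ok = cp.write("v2")

ap = Store("AP")
ap.write("v1")
ap.partitioned = True
ap_ok = ap.write("v2")
```

During the partition the CP store rejects the write and both replicas keep agreeing on `v1`, while the AP store accepts it and its replicas diverge until they can be reconciled.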
Service-Oriented Architecture and cloud-nativeness
In Service-Oriented Architecture (SOA), cloud-native principles are frequently adopted. User interfaces are decoupled from backend logic through APIs, allowing them to scale independently. However, scaling API gateways can remain a challenge. Most services operate on virtual machine stacks, which support API functionality and can scale more easily, though peak hours can still lead to increased costs and requirements.
Microservice architecture allows individual components to scale independently when deployed in the cloud (on services such as AWS). If good practices are followed, cloud platforms can dynamically scale components by creating temporary instances to manage load. Shared cloud infrastructure across multiple administrative sectors can further optimize performance, allowing available capacity to be used by other sectors during downtime, potentially reducing service costs.
Chaos engineering in cloud-based microservices
Vaher notes that chaos engineering can be used to test the resilience of cloud-based microservice architectures by intentionally causing failures within the system, such as removing parts like networking, databases, or storage servers. This practice, pioneered by companies like Netflix, ensures the integrity and quality of the architecture. In public sector systems, applying chaos engineering can reveal vulnerabilities and cascading failures. A well-engineered system should survive these tests, maintaining limited functionality during downtimes and recovering effectively.
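A minimal sketch of the idea follows, assuming a hypothetical `CitizenPortal` that depends on a hypothetical `PopulationRegistry`. The experiment switches the dependency off and checks that the portal degrades to limited functionality and then recovers. Real chaos experiments run against deployed infrastructure, not in-process objects.

```python
class RegistryDown(Exception):
    pass


class PopulationRegistry:
    """Hypothetical dependency that a chaos experiment can switch off."""

    def __init__(self) -> None:
        self.available = True

    def lookup(self, person_id: str) -> str:
        if not self.available:
            raise RegistryDown(person_id)
        return f"record for {person_id}"


class CitizenPortal:
    def __init__(self, registry: PopulationRegistry) -> None:
        self.registry = registry
        self._cache = {}

    def show_profile(self, person_id: str) -> str:
        try:
            record = self.registry.lookup(person_id)
            self._cache[person_id] = record
            return record
        except RegistryDown:
            # Degrade gracefully: serve the last known record if there is one.
            cached = self._cache.get(person_id)
            return f"{cached} (possibly stale)" if cached else "temporarily unavailable"


registry = PopulationRegistry()
portal = CitizenPortal(registry)
normal = portal.show_profile("P-1")
registry.available = False  # the injected failure
degraded = portal.show_profile("P-1")
unknown = portal.show_profile("P-2")
registry.available = True  # recovery
recovered = portal.show_profile("P-2")
```

The portal keeps answering while the registry is down, either with a cached record or a clear unavailability message, instead of failing outright.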
Risks with microservices
Microservices offer several benefits over traditional monolithic and Service-Oriented Architectures, but they also carry risks. It is crucial not to adopt microservices without a clear system design and understanding from business stakeholders. Autonomy of microservices is key, but dependencies on shared frameworks or libraries can compromise this autonomy. Additionally, the rapid evolution of cloud technologies like Docker presents a risk that must be addressed. It’s advisable to avoid building microservice architectures with dependencies on a single database and to maintain flexibility by keeping some parts of the system as monolithic until more certainty is achieved.
Extending X-Road
Vaher posits that Estonia’s X-Road, a system for secure data exchange between different administration sectors, could be extended with 'X-Rooms'. These would be publish/subscribe messaging rooms within the X-Road infrastructure that would enhance the system's ability to handle asynchronous communication, decoupling services while maintaining security and transparency. X-Rooms could also enable multitenancy, allowing multiple services to participate and respond to requests dynamically. This concept also has potential applications beyond the public sector, facilitating cooperation with the private sector and improving service efficiency.
If X-Road does not provide messaging rooms, administration sectors may need to implement these rooms independently, leading to technological fragmentation and increased security risks. Messaging rooms can store either messages containing full event data or events with links to external data sources, depending on the design. This flexibility could allow for secure and efficient data handling in public cloud environments, although careful consideration will be required in system design.
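The two storage styles mentioned above, full event data versus a link to an external source, can be sketched as follows. The `Event` class, its field names, the `resolve` helper, and the example URL are all hypothetical. X-Rooms is a proposal in the paper, and no actual message format is assumed here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Event:
    event_type: str
    # Exactly one of the two fields below is expected to be set.
    data: Optional[dict] = None     # full event data carried in the message
    data_url: Optional[str] = None  # or a link to an external data source

    def __post_init__(self) -> None:
        if (self.data is None) == (self.data_url is None):
            raise ValueError("set exactly one of data or data_url")


def resolve(event: Event, fetch: Callable[[str], dict]) -> dict:
    # `fetch` is a caller-supplied function that retrieves linked data,
    # which is where access control could be enforced.
    return event.data if event.data is not None else fetch(event.data_url)


def fake_fetch(url: str) -> dict:
    # Stand-in for an authenticated request to the data owner.
    return {"person": "P-1", "source": url}


full = Event("birth_registered", data={"person": "P-1"})
linked = Event("birth_registered", data_url="https://registry.example/events/42")
full_payload = resolve(full, fake_fetch)
linked_payload = resolve(linked, fake_fetch)
```

With the linked style, sensitive data stays with its owner and only a reference passes through the messaging room, at the cost of an extra lookup by each subscriber.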
X-Rooms could also possibly facilitate more cross-border data exchange, particularly between countries like Estonia and Finland that have both adopted X-Road. By decoupling their architectures, governments can automate data exchange processes while maintaining control over their internal systems. This approach would enhance interoperability and efficiency in cross-border public services.
'Fact registries' for next-generation digital government
Fact registries could revolutionize digital government architecture. These registries would store events and facts about citizens, allowing for independent development of new services without the need for data migration from old systems. Fact registries would enable multiple information systems to operate simultaneously within the same domain, reducing the risk of breaking old systems and eliminating the need for Big Bang releases. They would also facilitate better data archiving and backup processes, ensuring long-term digital continuity and easier access to critical information.
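One way to picture this is a sketch in which facts are only ever appended and each service derives its own view from them. The append-only design, the `FactRegistry` class, and the two view functions are assumptions made for illustration and are not a specification from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fact:
    person_id: str
    kind: str
    value: str


class FactRegistry:
    """Hypothetical append-only store: facts are added, never updated."""

    def __init__(self) -> None:
        self._facts = []

    def append(self, fact: Fact) -> None:
        self._facts.append(fact)

    def facts_for(self, person_id: str) -> list:
        return [f for f in self._facts if f.person_id == person_id]


def current_address(registry: FactRegistry, person_id: str) -> Optional[str]:
    # An existing service derives its view by replaying facts; the latest wins.
    history = address_history(registry, person_id)
    return history[-1] if history else None


def address_history(registry: FactRegistry, person_id: str) -> list:
    # A newer service can build a different view from the same facts,
    # without migrating data out of the older service.
    return [f.value for f in registry.facts_for(person_id) if f.kind == "address"]


registry = FactRegistry()
registry.append(Fact("P-1", "address", "Tallinn"))
registry.append(Fact("P-1", "address", "Tartu"))
latest = current_address(registry, "P-1")
history = address_history(registry, "P-1")
```

Both views read the same facts independently, which reflects how multiple information systems could operate within one domain without a Big Bang release.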
Conclusion
The modernisation of digital government architecture presents both significant opportunities and challenges. Estonia’s experience highlights key lessons for governments to consider:
Monolithic architecture limitations: While monolithic systems can be efficient and easier to manage for smaller-scale applications, they carry inherent risks such as tight coupling, scalability issues, and technology lock-in. As these systems age, they become harder to maintain and adapt, making it crucial for governments to consider more flexible architectural alternatives.
Service-Oriented Architecture (SOA) as a transitional model: Estonia’s shift towards a Service-Oriented Architecture (SOA) demonstrates the benefits of decoupling services to enhance scalability and resilience. However, careful governance is required to avoid creating new dependencies or bottlenecks within API gateways. SOA can serve as a valuable stepping stone toward more advanced architectures like microservices.
Adoption of microservices: Microservices offer significant advantages in terms of scalability, autonomy, and cloud readiness. However, their implementation requires careful planning, particularly in managing dependencies and ensuring proper system design. Governments should approach this transition as an evolutionary process, moving gradually from monolithic systems through SOA to microservices.
Cloud integration and Event-Driven Architecture (EDA): The adoption of cloud-native principles and Event-Driven Architecture (EDA) within microservices can greatly enhance system flexibility and resilience. Governments should design their services to be cloud-ready, with features like statelessness, loose coupling, and asynchronous communication to better handle the demands of modern digital services.
Risk management in architecture evolution: As governments modernise their digital infrastructure, it is essential to mitigate risks associated with both legacy systems and new architectures. This includes avoiding over-reliance on specific technologies or vendors, carefully managing service dependencies, and ensuring systems are designed with future scalability and flexibility in mind.
Cross-border and inter-sector collaboration: The extension of Estonia’s X-Road framework and the concept of ‘X-Rooms’ illustrate the potential for enhanced cross-border data exchange and inter-sector collaboration. Governments should explore similar models to improve interoperability and efficiency in public services, particularly in a globalised digital environment.
Fact Registries for future-proofing: The idea of ‘fact registries’ represents a forward-looking approach to digital government architecture, enabling the simultaneous operation of multiple information systems and reducing the need for disruptive system overhauls. This approach can enhance data management, archiving, and service development, providing a foundation for long-term digital continuity.
By learning from Estonia’s experiences, governments can better navigate the complexities of digital transformation, ensuring that their digital architectures are resilient, scalable, and capable of meeting the evolving needs of citizens and public services.
References
Vaher, K. (2020). Next generation digital government architecture. Republic of Estonia GCIO Office.