Scalable Software and Big Data Architecture - Software Architectural Patterns and Design Patterns
Software Architectural Patterns and Design Patterns
Welcome to the second article in a multi-part series about the design and architecture of scalable software and big data solutions.
In this article, we’ll focus on various architectural patterns and styles. We’ll also briefly cover network communication and information transfer concepts, as well as define and explain SDKs and APIs as related to some of the architectures discussed.
Architectural Patterns and Design Patterns Overview
Reuse is critical in software development. Reinventing the wheel is unnecessary, time-consuming, doesn’t follow best practices, and is ultimately very expensive.
Given this, software architects and engineers embrace patterns to apply the same, or very similar solutions to commonly occurring problems. Design patterns refer to reusable patterns applied in software code, whereas architectural patterns are reusable patterns used to design complete software, big data, IoT, and/or analytics-based solutions.
Design patterns are usually categorized as either creational, structural, behavioral, or concurrent, and there are many well-defined patterns for each category. We will focus primarily on architectural patterns in this article.
Architectural patterns, on the other hand, are often categorized and grouped by an architectural style. Wikipedia lists the following architectural styles: structure, shared memory, messaging, adaptive systems, and distributed systems.
The first article of this series noted that typical cloud solutions consist of clients, servers, and data storage. Servers usually host databases, web servers, services, microservices, and so on. Clients typically include desktop, web, and/or native mobile applications.
Network Communication and Information Transfer
Before diving deeper into specific patterns, let’s briefly discuss how information is transferred between different computers, software, and services.
Ultimately, everything transmitted on local networks and the Internet is a form of data message, typically encoded according to a specific protocol. The network protocols found in the OSI model layers are the most common and include HTTP/HTTPS, TCP/IP, UDP, IPv4/IPv6, and so on.
All requests made from a web browser to a server, and the subsequent responses, leverage the HTTP and TCP/IP protocols. Two primary messaging patterns are request-response (e.g., HTTP) and one-way (e.g., UDP).
Given this information transfer mechanism, let’s explore various ways in which this applies to software architecture and related patterns.
Client-server, Layers, and Tiers
Many applications are classified as client-server, in which a client makes requests to a server responsible for providing an appropriate response. Both the client and the server can be considered independent applications that interact with each other through well-defined interfaces.
The code in both is often separated into layers that each address different concerns. Common layers include a presentation layer, application (or service) layer, business (or domain) logic layer, and a data access (or persistence) layer. Actual data storage and databases can be considered the database layer.
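To make the layers above a bit more concrete, here is a minimal sketch in Python. All of the names (User, UserRepository, GreetingRules, UserService, render_greeting) are hypothetical and chosen only for illustration, and an in-memory dictionary stands in for a real database. Each layer only talks to the layer directly beneath it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    user_id: int
    name: str


class UserRepository:
    """Data access (persistence) layer: hides how users are stored."""

    def __init__(self) -> None:
        # Assumption: an in-memory dict stands in for the database layer.
        self._rows = {1: User(1, "Ada")}

    def find(self, user_id: int) -> Optional[User]:
        return self._rows.get(user_id)


class GreetingRules:
    """Business (domain) logic layer: rules independent of storage and UI."""

    @staticmethod
    def greeting_for(user: User) -> str:
        return f"Welcome back, {user.name}!"


class UserService:
    """Application (service) layer: coordinates domain logic and data access."""

    def __init__(self, repository: UserRepository) -> None:
        self._repository = repository

    def greet(self, user_id: int) -> str:
        user = self._repository.find(user_id)
        if user is None:
            raise LookupError(f"no user with id {user_id}")
        return GreetingRules.greeting_for(user)


def render_greeting(service: UserService, user_id: int) -> str:
    """Presentation layer: turns the service result into displayable markup."""
    try:
        return f"<p>{service.greet(user_id)}</p>"
    except LookupError:
        return "<p>User not found.</p>"
```

Because the presentation code never touches the repository directly, the storage mechanism could be swapped out without changing how results are rendered.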
Layers are used to implement separation of concerns in the code that’s running on a single computing machine. Architecturally, software solutions are often separated into tiers, where a tier describes a part of the solution running on a different computing instance (physical or virtual).
Typical tiers include the presentation tier, domain logic tier (e.g., API), and data storage tier. This is a typical 3-tier architecture, and n-tier is used to describe an architecture that includes more than one tier.
Services, Service Oriented Architecture (SOA), and Microservices
Software applications can be separated into services, or units of software functionality that are deployed remotely and can be accessed via networking communication protocols. Services are leveraged for many uses, with information retrieval and execution of operations being the most common.
A very common architectural pattern, particularly suited to enterprise and/or large-scale applications, is called service-oriented architecture (SOA). SOA describes a solution that’s broken into discrete components, each responsible for a certain part of the solution’s functionality and providing that functionality as a service to other solution components.
According to Wikipedia, an SOA service has the following four properties:
It logically represents a business activity with a specified outcome
It is self-contained
It is a black box for its consumers
It may consist of other underlying services
SOA is a great example of separation of concerns, modularity, and loose coupling as applied at the application or solution level. The most typical implementations of SOA include web services, messaging services, and RESTful services (e.g., API).
A popular and more specialized version of SOA is known as microservices. Microservices are organized around individual application capabilities, whereas the services of SOA are organized around a specific business activity.
Microservices are very similar to the services of SOA, but are characterized as being:
More granular (smaller), with finer-grained interfaces
More modular and loosely coupled
More easily replaced
More language and technology agnostic
More specialized around a specific application capability
Better suited for containerization and deployment
Better suited for agility and continuous delivery
Microservices also introduce many potential drawbacks, which should be considered carefully. Some of these are related to the fallacies of distributed computing, and include:
Significant increase in:
Application testing, configuration, deployment, and management complexity
Internal team communications and coordination costs
HTTP request/response call chain complexity
Potential networking issues such as network failure and high latency
Computational and performance overhead due to network-based messaging communication and transfer protocols
Requires DevOps adoption (e.g., scalability) and monitoring practices
Potentially distributed transaction handling
Some also note that the true benefits of microservices are not realized unless each microservice has sole ownership of its data, i.e., has its own data store.
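The data ownership point can be illustrated with a small sketch. The two services below (InventoryService and OrderService) are hypothetical, and plain Python objects stand in for separately deployed services, so the method call between them is only a placeholder for what would really be a network request. The key idea is that the order service never reads or writes the inventory data directly.

```python
class InventoryService:
    """Hypothetical microservice that is the sole owner of stock data."""

    def __init__(self) -> None:
        # Private data store; no other service touches it directly.
        self._stock = {"widget": 2}

    def reserve(self, sku: str, quantity: int) -> bool:
        available = self._stock.get(sku, 0)
        if quantity <= 0 or quantity > available:
            return False
        self._stock[sku] = available - quantity
        return True

    def available(self, sku: str) -> int:
        return self._stock.get(sku, 0)


class OrderService:
    """Hypothetical microservice that owns order data only."""

    def __init__(self, inventory: InventoryService) -> None:
        self._inventory = inventory
        self._orders = []  # this service's own data store

    def place_order(self, sku: str, quantity: int) -> str:
        # Assumption: in a real deployment this would be an HTTP or
        # messaging call to the inventory service, not a method call.
        if not self._inventory.reserve(sku, quantity):
            return "rejected"
        self._orders.append((sku, quantity))
        return "accepted"
```

Keeping each data store private means either service can change its storage technology without breaking the other, at the cost of the extra call described in the drawbacks above.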
Specific Solution Architectures and Patterns
In addition to those already described, there are many other specific architectural patterns that you should know about, and consider using in your next solution if applicable.
Here’s a high level overview of different key patterns categorized by different architectural styles, including some of the Wikipedia-noted styles mentioned earlier. Since the scope of this article is limited, some patterns are simply named for reference, or discussed at a very high level. The reader is therefore encouraged to do additional research as needed.
In the first article of this series we discussed separation of concerns (SOC) as a way of separating different concerns of a code base or complete solution in order to improve code reuse, reduce coupling, improve testability and maintainability, and so on.
In the extreme case where concerns are barely separated, if at all, an application and its code base are referred to as being monolithic. This type of application has very tightly coupled and mostly non-reusable code.
Software solutions that embrace and emphasize SOC are usually referred to as being layered and/or tiered as previously discussed, but also can be component-based, modular, service oriented, and so on.
Components are usually higher level, larger abstractions of encapsulated concerns of a complete solution. Examples include web services, software packages, or software binaries.
Modules, on the other hand, are usually separate and encapsulated concerns within the code base of an application, service, package, or library. When one of these is composed of many interchangeable, yet integrated, modules, it is considered to be highly modular and therefore loosely coupled.
The main pattern to note in the shared memory style is the blackboard pattern and its associated blackboard system. The common terms involved with this pattern include blackboard, knowledge sources, and control component (or control shell).
The pattern is analogous to an actual blackboard (or whiteboard) session, in which multiple people (knowledge sources) write various approaches or elements to solve a problem on a board (blackboard), and then a control component is able to selectively piece together (i.e., execute) knowledge sources as needed to produce a solution.
Wikipedia notes that this pattern may have lost some popularity in favor of statistical-based techniques such as Hidden Markov Models.
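As a rough illustration of the blackboard, knowledge sources, and control component described above, here is a minimal Python sketch. The problem being solved (counting the words in a piece of text) and all of the names are assumptions made purely for the example. Real blackboard systems typically use far more sophisticated scheduling than "run the first source that is ready."

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# The blackboard is the shared memory that every knowledge source reads and writes.
Blackboard = Dict[str, Any]


@dataclass
class KnowledgeSource:
    name: str
    can_contribute: Callable[[Blackboard], bool]
    contribute: Callable[[Blackboard], None]


def tokenize(board: Blackboard) -> None:
    board["tokens"] = str(board["text"]).split()


def count_tokens(board: Blackboard) -> None:
    board["count"] = len(board["tokens"])


def run_control(board: Blackboard, sources: List[KnowledgeSource], goal: str) -> List[str]:
    """Control component: keeps selecting a ready knowledge source until the goal appears."""
    applied = []
    while goal not in board:
        ready = [source for source in sources if source.can_contribute(board)]
        if not ready:
            break  # nobody can make progress, so stop without a solution
        ready[0].contribute(board)
        applied.append(ready[0].name)
    return applied


# The counter is listed first on purpose: execution order is driven by
# the state of the blackboard, not by the order of this list.
SOURCES = [
    KnowledgeSource("counter", lambda b: "tokens" in b and "count" not in b, count_tokens),
    KnowledgeSource("tokenizer", lambda b: "text" in b and "tokens" not in b, tokenize),
]
```

Calling run_control({"text": "a b c"}, SOURCES, "count") would run the tokenizer first and the counter second, even though the counter appears first in the list.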
A very important architecture in the messaging style is called event-driven or message-driven architecture (EDA). This architecture deals with producing, detecting, processing, and responding to events, which, as previously mentioned, take the form of messages. This architectural pattern promotes loose coupling, scalability, high performance, efficiency, and potentially asynchronous, non-blocking operations.
Events are created by a so-called event emitter (or agent) and are received by event consumers (or sinks). The communication and pairing between emitters and consumers is facilitated by event channels.
The specific event channel and event handling framework is typically implemented using a system or pattern such as message-oriented middleware (MOM), publish-subscribe (pub/sub), or message queues.
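Here is a minimal publish-subscribe sketch in Python showing how an event channel pairs emitters with consumers. The EventChannel class and the "order.created" topic name are hypothetical. Note that this version delivers events synchronously and in-process, whereas a real message broker would typically deliver them asynchronously and across the network.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

Handler = Callable[[Any], None]


class EventChannel:
    """In-process event channel that routes events from emitters to consumers."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register an event consumer (sink) for a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: Any) -> int:
        """Deliver an event to every consumer of the topic; return how many received it."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(payload)
        return len(handlers)
```

The emitter only knows the topic name, not who is listening, which is where the loose coupling comes from. For example, after channel.subscribe("order.created", received.append), a call to channel.publish("order.created", {"id": 1}) appends the payload to the received list.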
Other related concepts and architectures are the enterprise service bus (ESB) and enterprise messaging system (EMS). An ESB acts as a highly flexible and loosely coupled communication system, in the form of a virtual software bus, that sits between distributed services and/or software applications.
Message-based communication via an ESB is based on an enterprise messaging system (EMS), which defines and outlines various standards, best practices, and implementations of the messaging system itself. This includes message formats, queuing, transfer protocols, application protocols, security, and more.
The most predominant architectural patterns in the adaptive systems style are plug-ins, integrations, and the microkernel pattern. Plug-ins and integrations are increasingly being referred to as apps or extensions. The microkernel pattern is an operating system-level pattern, and is therefore not applicable to this article.
Plug-ins and integrations can be considered as part of a broader category known as add-ons, which allow users to add functionality and features as needed. Software applications that allow and support any form of add-ons are considered to be customizable. The main difference between plug-ins, extensions, and integrations involves the type and form of functionality being added.
Plug-ins tend to be software packages that host applications leverage (via a plug-in manager) to provide additional non-native functionality. The host specifies an interface and API to which the plug-in adapts. Plug-ins tend to be dedicated software packages created specifically for the host application. Examples include Salesforce Apps and Adobe Flash for web browsers.
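The relationship between a host, its plug-in interface, and a plug-in manager can be sketched as follows. Everything here is hypothetical (the TextPlugin interface, the PluginManager, and the ShoutPlugin example) and is only meant to show the shape of the pattern, not how any particular product implements it.

```python
from abc import ABC, abstractmethod
from typing import Dict


class TextPlugin(ABC):
    """Interface the host application specifies; plug-ins must adapt to it."""

    name: str

    @abstractmethod
    def apply(self, text: str) -> str:
        ...


class PluginManager:
    """Host-side registry that loads plug-ins and invokes them by name."""

    def __init__(self) -> None:
        self._plugins: Dict[str, TextPlugin] = {}

    def register(self, plugin: TextPlugin) -> None:
        if not isinstance(plugin, TextPlugin):
            raise TypeError("plug-ins must implement the TextPlugin interface")
        self._plugins[plugin.name] = plugin

    def run(self, name: str, text: str) -> str:
        return self._plugins[name].apply(text)


class ShoutPlugin(TextPlugin):
    """Example plug-in written specifically for this host."""

    name = "shout"

    def apply(self, text: str) -> str:
        return text.upper() + "!"
```

The host never needs to know what a plug-in does internally. It only relies on the interface, so new functionality can be added without changing the host’s own code.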
Integrations, on the other hand, are ‘apps’ that can extend a host application’s functionality like a plug-in, but typically represent a standalone software application or service that exists as a product or service in its own right. In this case, the host recognizes another app or service’s usefulness and identifies ways in which it can be integrated into a user’s workflow within the host application.
Examples include Google Calendar, Trello, and Jira integrations with Slack. Google Chrome offers many app integrations and extensions as well.
Distributed systems use more than one physical or virtual compute instance to provide functionality, e.g., a software application, service, or database.
Some examples include the tiered and service-oriented architectures, as discussed. Other notable architectures in this category include peer-to-peer, space-based, and the many different architectural patterns associated with cloud computing and cloud services (e.g., AWS) in general.
Peer-to-peer computing contrasts with client-server computing in that peers (aka nodes) work together to accomplish tasks and share resources, as opposed to being characterized by strict separations of consumer/provider or request/response relationships. Peers can be both consumers and providers within their network.
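To show how a peer can act as both consumer and provider, here is a small in-process sketch. The Peer class and its methods are hypothetical, and direct method calls stand in for network communication between nodes.

```python
from typing import Dict, List, Optional


class Peer:
    """A node that can both request resources from and serve resources to other peers."""

    def __init__(self, name: str, files: Dict[str, str]) -> None:
        self.name = name
        self._files = dict(files)
        self._neighbors: List["Peer"] = []

    def connect(self, other: "Peer") -> None:
        # Connections are symmetric: neither side is "the server".
        if other not in self._neighbors:
            self._neighbors.append(other)
        if self not in other._neighbors:
            other._neighbors.append(self)

    def serve(self, filename: str) -> Optional[str]:
        """Provider role: answer a request from another peer."""
        return self._files.get(filename)

    def fetch(self, filename: str) -> Optional[str]:
        """Consumer role: look locally first, then ask neighbors."""
        if filename in self._files:
            return self._files[filename]
        for neighbor in self._neighbors:
            content = neighbor.serve(filename)
            if content is not None:
                # Keep a copy so this peer can now provide the file too.
                self._files[filename] = content
                return content
        return None
```

After one peer fetches a file from a neighbor, it can serve that file to others, which is the resource sharing that distinguishes this model from a strict client-server split.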
Space-based architecture is a pattern driven primarily by scalability and performance goals, particularly when involving large numbers of concurrent and often unpredictable loads on the system, usually in the form of requests.
This pattern is a bit complex relative to some others, and involves horizontally scalable computing or processing units that each leverage in-memory data for maximum performance. There are also caching and asynchronous data persistence components.
Lastly, space-based architecture involves a virtualized middleware component, which consists of a messaging grid, data grid, processing grid, and deployment manager. This middleware component is a controller that performs most management and coordination tasks associated with this architecture.
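One small piece of this architecture, a processing unit that serves requests from in-memory data while persistence happens separately, can be sketched as shown below. This is a heavily simplified illustration with hypothetical names (ProcessingUnit, flush). It leaves out the messaging grid, data grid replication, processing grid, and deployment manager entirely, and a plain dictionary stands in for the database.

```python
from collections import deque
from typing import Any, Deque, Dict, Tuple


class ProcessingUnit:
    """Serves reads and writes from memory; persistence is deferred."""

    def __init__(self, write_behind: Deque[Tuple[str, Any]]) -> None:
        self._memory: Dict[str, Any] = {}
        self._write_behind = write_behind

    def put(self, key: str, value: Any) -> None:
        # The request completes as soon as memory is updated.
        self._memory[key] = value
        # The change is queued so it can be persisted later.
        self._write_behind.append((key, value))

    def get(self, key: str) -> Any:
        return self._memory.get(key)


def flush(write_behind: Deque[Tuple[str, Any]], database: Dict[str, Any]) -> int:
    """Persistence component: drains queued writes into the database."""
    flushed = 0
    while write_behind:
        key, value = write_behind.popleft()
        database[key] = value
        flushed += 1
    return flushed
```

In a real system the flush step would run in the background on its own schedule, so request handling is never blocked by the database.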
Other Architectural Patterns
Here are some other notable patterns worth looking into that aren’t discussed in this article.
MV* (e.g., MVC, MVP, MVVM, …)
Single page application (SPA) vs multi-page
Job Scheduler, aka scheduling
Inversion of control (IOC)
APIs and SDKs
One of the most important and widely used patterns is the application programming interface (API) and often complementary software development kit (SDK).
As discussed, web applications are usually split between client and server components, where the client typically runs in a web browser or mobile device, and the server typically runs on a physical or virtual machine.
A very popular and widespread trend is to shift an increasing amount of application logic to the front-end of web and/or native mobile-based applications. Given this, many software applications communicate primarily through matched SDKs on the client, and RESTful APIs on the server.
REST stands for representational state transfer and is a communication and information transfer pattern between clients and servers. Servers expose RESTful API endpoints so that client applications can communicate with them over common Internet protocols (e.g., HTTP and HTTPS) in order to perform CRUD (create, read, update, delete) operations on application data, or to execute specific server-side tasks and operations.
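The conventional mapping between HTTP methods and CRUD operations (POST to create, GET to read, PUT to update, DELETE to delete) can be sketched without a real web server. The NotesApi class, its /notes endpoint, and the handle method below are all hypothetical. The method simply takes an HTTP method and path and returns a status code and body, which is roughly what a server-side route handler does.

```python
from typing import Any, Dict, Optional, Tuple


class NotesApi:
    """In-process stand-in for a RESTful endpoint at the hypothetical path /notes."""

    def __init__(self) -> None:
        self._notes: Dict[int, str] = {}
        self._next_id = 1

    def handle(self, method: str, path: str, body: Optional[str] = None) -> Tuple[int, Any]:
        parts = [part for part in path.split("/") if part]
        if not parts or parts[0] != "notes" or len(parts) > 2:
            return 404, None

        if len(parts) == 1:  # collection resource: /notes
            if method == "GET":  # Read all
                return 200, dict(self._notes)
            if method == "POST":  # Create
                if body is None:
                    return 400, None
                note_id = self._next_id
                self._next_id += 1
                self._notes[note_id] = body
                return 201, {"id": note_id}
            return 405, None

        # Single resource: /notes/<id>
        if not parts[1].isdigit() or int(parts[1]) not in self._notes:
            return 404, None
        note_id = int(parts[1])
        if method == "GET":  # Read one
            return 200, self._notes[note_id]
        if method == "PUT":  # Update
            if body is None:
                return 400, None
            self._notes[note_id] = body
            return 200, body
        if method == "DELETE":  # Delete
            del self._notes[note_id]
            return 204, None
        return 405, None
```

A client-side SDK would typically wrap calls like these in friendlier functions, so that application code never has to build requests by hand.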
We’ve now had a solid overview of many of the widely used architectural patterns and styles found in scalable software solutions.
The next part of this series will be a detailed discussion of big data and analytics architectural patterns.