Oct 15, 2024 7 min read Ruby

Plumbing

An exercise in doing things "my way"

The Rails Way

I'm reading "Layered Design for Ruby on Rails Applications" by Vladimir Dementyev; I've not finished it yet but when I have, I'm sure I'll have more things to say about it.

But one of his key points is that there is a "Rails Way" to do things. This is a great thing about Rails, as it gives you a standard structure and a standard set of abstractions to use, so you get a head start when building your apps. The trouble is, the "Rails Way" falls short, as the abstractions provided only take you so far. And the book is about adding new abstractions to larger Rails apps, abstractions that still look and feel like the "Rails Way".

I also recently watched Eileen Uchitelle's "The Myth of the Modular Monolith" - a talk about how modular monoliths (which have replaced microservices as the hyped way to structure big, complex applications) will not protect you from tangled spaghetti code.

Giant balls of spaghetti and/or mud

Both of these have a lot of meaning to me, as Collabor8Online is hitting that point where the code-base is big, complicated and getting intertwined. This is because I wasn't disciplined enough to say "no" to the others when features were needed in a hurry.

I decided a while back that I was going to split the application into a modular monolith. I would use Rails Engines to divide the functionality into easily understood pieces. Those engines would be layered - business logic (domain) engines plus user-facing (application) engines. And then an overall Rails "container" application that consists of nothing more than wiring the different pieces together.

I'll let you know how I get on.

But, underneath all of that, the different parts of the application need a way to communicate with each other, without being coupled together. And a very simple, yet powerful, way of doing this is through events/messaging/observers/notifications.

So I've written Plumbing - a ruby gem that deviates completely from the Rails Way (sorry Vladimir and Eileen) to act as the foundation for my new architecture.

Not the Rails Way - Plumbing4Life

Plumbing has four main parts to it. I tried to keep all the names water-related but my imagination was not active enough - so suggestions gratefully accepted.

Pipelines

These are "unix pipes" in Ruby. I've written about unix pipes and functional-style transformations before and, now that I have got my head around them, they make perfect sense to me. (I'll write about Object-Orientated Programming versus Functional Programming again soon, as well).

Pipelines are my take on Dry Transactions.

You define a class and declare its pre-conditions, post-conditions and the sequence of transformations that will take place. Then you use #call, passing in some input data. The data is tested against your pre-conditions, then passed to the first step. This does its thing, resulting in some output, which is passed to the next step. And so on, until finally the result is tested against the post-conditions and returned to the caller. The object itself is stateless, and therefore concurrency-safe, by default (although there's nothing to stop you adding state in either an initializer or during your steps). And each step is independent of the previous, allowing for easy testing. Finally, you can compose your pipelines by specifying that certain steps are implemented by another pipeline class, allowing for easy reuse. Just like a unix pipe.
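To make that flow concrete, here is a minimal sketch in plain Ruby. It is not Plumbing's actual API: SketchPipeline, pre_condition, post_condition and perform are names made up for this illustration, and the real gem may well declare things differently.

```ruby
# Hypothetical sketch of the pipeline idea; NOT the Plumbing gem's API.
class SketchPipeline
  class PreConditionError < StandardError; end
  class PostConditionError < StandardError; end

  class << self
    def pre_conditions
      @pre_conditions ||= {}
    end

    def post_conditions
      @post_conditions ||= {}
    end

    def steps
      @steps ||= []
    end

    def pre_condition(name, &check)
      pre_conditions[name] = check
    end

    def post_condition(name, &check)
      post_conditions[name] = check
    end

    # A step is either a method name or another pipeline class
    def perform(step)
      steps << step
    end
  end

  def call(input)
    self.class.pre_conditions.each do |name, check|
      raise PreConditionError, name.to_s unless check.call(input)
    end
    # Feed the output of each step into the next one
    output = self.class.steps.reduce(input) do |data, step|
      step.is_a?(Class) ? step.new.call(data) : send(step, data)
    end
    self.class.post_conditions.each do |name, check|
      raise PostConditionError, name.to_s unless check.call(output)
    end
    output
  end
end

class Normalise < SketchPipeline
  perform :strip_whitespace
  perform :downcase_text

  private

  def strip_whitespace(text)
    text.strip
  end

  def downcase_text(text)
    text.downcase
  end
end

class Slugify < SketchPipeline
  pre_condition(:must_be_a_string) { |input| input.is_a?(String) }
  post_condition(:must_not_be_empty) { |output| !output.empty? }

  perform Normalise # a whole pipeline reused as a single step
  perform :hyphenate

  private

  def hyphenate(text)
    text.gsub(/\s+/, "-")
  end
end

slug = Slugify.new.call("  Hello Plumbing World ")
```

Each step only sees the data handed to it, which is why a step can be tested on its own, and why Normalise can be dropped into Slugify without either class knowing about the other's internals.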

Rubber Ducks

Apart from being the single greatest name for a programming concept I've ever come up with, Rubber Ducks are a way of introducing type-checking to Ruby in a way that feels natural to me.

A Rubber Duck defines the methods that it expects an object to have, then you "cast" an object into that rubber duck - @my_rubber_duck = @some_object.as MyRubberDuck. This does two things. Firstly, it tests @some_object to ensure that it has public methods for each of the messages defined in MyRubberDuck. Then it creates a proxy object that only responds to those methods - so you cannot bypass the rubber duck and call things you are not supposed to.

This keeps the duck-style typing that makes ruby so powerful and eloquent. But also adds a "fail-early" check so if you are given an object that you weren't expecting, you know before it causes problems.
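The two halves of that idea (check first, then hide everything else behind a proxy) can be sketched in a few lines of plain Ruby. This is an illustration only: SketchDuck and its cast method are invented here, and it does not add the #as method that the gem provides.

```ruby
# Hypothetical sketch of the rubber duck idea; NOT the Plumbing gem's code.
class SketchDuck
  class WrongDuckError < StandardError; end

  def initialize(*expected_methods)
    @expected_methods = expected_methods
  end

  # Fail early if the object is missing a method, otherwise
  # return a proxy that exposes only the expected methods.
  def cast(object)
    missing = @expected_methods.reject { |name| object.respond_to?(name) }
    raise WrongDuckError, "missing: #{missing.join(', ')}" unless missing.empty?

    proxy = Object.new
    @expected_methods.each do |name|
      proxy.define_singleton_method(name) do |*args, **kwargs, &block|
        object.public_send(name, *args, **kwargs, &block)
      end
    end
    proxy
  end
end

Person = Struct.new(:first_name, :last_name, :email)
Nameable = SketchDuck.new(:first_name, :last_name)

person = Nameable.cast(Person.new("Alice", "Aardvark", "alice@example.com"))
first_name = person.first_name          # forwarded to the real object
can_see_email = person.respond_to?(:email) # the proxy does not expose this

rejected = begin
  Nameable.cast("just a string")
  false
rescue SketchDuck::WrongDuckError
  true # the mismatch is reported at the point of casting
end
```

Object#respond_to? only considers public methods by default, which matches the "public methods for each of the messages" check described above.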

And they're called Rubber Ducks!

Actors

Ever since I first got my head around programming (via reading articles about Smalltalk, an early OO language that ruby borrows heavily from), I've been slightly obsessed with concurrency.

Back when I used to write in Pascal for Windows, I wrote a whole multi-threaded library that built on top of the Windows API to offer mutexes, multi-reader-single-writer locks and other synchronisation primitives.

But it's the concept of Actors that I really like.

Unlike most concurrent/parallel processing, an actor is self-contained - just like an object. From the actor's perspective, everything happens within a single thread, so concurrency issues like locking (and subsequent problems, like deadlocking) are side-stepped. To make this work, other actors "post messages to a queue" and the receiver processes the queue, in order, within its own thread.

If you think about it, this is exactly how object instances work - method calls are message passing (which is why ruby has the #send method). All actors do is define some concurrency limits around that message passing.

There are many implementations of actors in ruby but concurrency is hard and there's no better way to understand something than by building it yourself. And did I mention I'm slightly obsessed with it?

So actors in plumbing are my take. You define your actor, declaring the methods that are async. Then you call .start - which is where I've taken a different path to the other actor libraries I've seen. .start creates a proxy object (similar to a rubber duck) that only implements the methods you have marked as async. This means that a caller cannot bypass your restrictions and access data outside of the actor paradigm. Plus, the actual proxy used is based upon the mode you have selected: inline involves no concurrency; async uses the Async gem to implement concurrency without parallelism, via ruby's fibers; and threaded (or threaded-rails) uses Concurrent Ruby to implement concurrency with limited parallelism, via threads. My plan is to add ractors (ruby's in-built actor model with true parallelism) but the implementation is going to be far more complex, as ractors impose some strict rules about passing data.
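As a rough illustration of the "queue plus proxy" shape, here is a sketch that uses nothing but Ruby's core Thread and Queue. It is closest in spirit to a threaded mode, but it is not how Plumbing is implemented: SketchActor, the async class method and the reply queues are all assumptions made for this example.

```ruby
# Hypothetical sketch of an actor with a proxy; NOT the Plumbing gem's code.
class SketchActor
  def self.async(*names)
    async_messages.concat(names)
  end

  def self.async_messages
    @async_messages ||= []
  end

  # Builds the actor, starts its thread and returns ONLY a proxy.
  def self.start(*args)
    actor = new(*args)
    inbox = Queue.new

    Thread.new do
      loop do
        name, params, reply = inbox.pop # one message at a time, in order
        result = begin
          actor.public_send(name, *params)
        rescue => error
          error
        end
        reply << result
      end
    end

    proxy = Object.new
    async_messages.each do |name|
      proxy.define_singleton_method(name) do |*params|
        reply = Queue.new
        inbox << [name, params, reply]
        reply # the caller can #pop this later to wait for the result
      end
    end
    proxy
  end
end

class Counter < SketchActor
  async :increment, :total

  def initialize
    @total = 0
  end

  def increment
    @total += 1 # no lock needed: only the actor's own thread runs this
  end

  def total
    @total
  end
end

counter = Counter.start
10.times.map { Thread.new { 100.times { counter.increment } } }.each(&:join)
final_total = counter.total.pop
```

Ten threads hammer the counter, yet @total is only ever touched from the actor's own thread, so every increment is counted without a mutex in sight. The caller never holds a reference to the Counter instance itself, only to the proxy.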

Concurrency is complicated and can result in hard to figure out bugs. But in my relatively limited usage of actors so far, it seems to be a robust implementation. Eyeballs and bug reports welcome! And if you can think of a plumbing-related name for these (they were originally called Valves) I would be ever thankful.

Pipes

Pipes are where plumbing becomes relevant to the "modular monolith" architecture I was describing earlier.

A Pipe is an implementation of the observer pattern (event listeners in JavaScript, the Dependency Protocol in Smalltalk). You call #add_observer to receive notifications from the pipe, which, whenever it needs to, calls #notify to notify you of events that have happened.

However, there are two areas where pipes differ from most other implementations of the observer pattern.

Firstly, pipes are designed to be composable. You can create sequences of pipes that listen to each other. To start with I've built a Pipe::Filter that lets you ignore certain events whilst notifying your own observers of others. And a Pipe::Junction that subscribes to multiple pipes, so a single observer can keep track of multiple sources. Coming soon are a Pipe::Buffer to let you batch notifications and a Pipe::Debouncer which will remove duplicates.
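Composition is easier to see in code. The sketch below is synchronous and deliberately tiny. SketchPipe, SketchFilter, SketchJunction and SketchEvent are stand-ins invented for this post, not the gem's Pipe::Filter and Pipe::Junction classes, whose constructors may look quite different.

```ruby
# Hypothetical, synchronous sketch of composable pipes;
# NOT the Plumbing gem's classes.
SketchEvent = Struct.new(:type, :data)

class SketchPipe
  def initialize
    @observers = []
  end

  def add_observer(&observer)
    @observers << observer
    observer
  end

  def remove_observer(observer)
    @observers.delete(observer)
  end

  def notify(type, **data)
    dispatch(SketchEvent.new(type, data))
  end

  def dispatch(event)
    @observers.each { |observer| observer.call(event) }
  end
end

# Observes one source and passes on only the events the block accepts
class SketchFilter < SketchPipe
  def initialize(source, &accepts)
    super()
    source.add_observer { |event| dispatch(event) if accepts.call(event) }
  end
end

# Observes several sources and passes every event on
class SketchJunction < SketchPipe
  def initialize(*sources)
    super()
    sources.each do |source|
      source.add_observer { |event| dispatch(event) }
    end
  end
end

documents = SketchPipe.new
folders = SketchPipe.new
everything = SketchJunction.new(documents, folders)
deletions = SketchFilter.new(everything) { |event| event.type == "deleted" }

received = []
deletions.add_observer { |event| received << [event.type, event.data] }

documents.notify "created", name: "plan.pdf"
folders.notify "deleted", name: "Archive"
# received now holds just the one "deleted" event, which came from folders
```

Because a filter and a junction are themselves pipes, they can be chained in any order, and the final observer never needs to know how many sources sit upstream.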

Secondly, pipes are also actors. I know this is premature optimisation, but one thing that has always bothered me about the observer pattern is that, because you do not know who your observers are, your behaviour may be adversely affected by them. What if an observer raises an exception whilst dealing with one of your notifications? Or, more insidious, what if an observer is incredibly slow at handling your notification?

Because pipes are actors, they each work independently of their caller. If you set the mode to async or threaded/threaded_rails, notifications are dispatched asynchronously, so they do not affect your own operations. Your observable object has a private instance variable which is initialised with Pipe.start. You delegate calls to #add_observer and #remove_observer to this pipe and, when you have something to tell the world, you call @pipe.notify "my_event", some: "data". The pipe dispatches its notifications asynchronously, so you can go about your business without worrying about slow observers (or about exceptions, as the pipe logs and then handles them).
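Here is what that arrangement might look like, again as a sketch rather than the real thing. SketchAsyncPipe is a made-up class that dispatches from its own thread using a core Queue, and it uses warn where a real implementation would log. Only the overall shape (a private pipe, delegated #add_observer and #remove_observer, a call to notify) comes from the description above.

```ruby
require "forwardable"

# Hypothetical sketch of a pipe that dispatches on its own thread;
# NOT the Plumbing gem's Pipe.
class SketchAsyncPipe
  def self.start
    new
  end

  def initialize
    @observers = []
    @inbox = Queue.new
    Thread.new { run }
  end

  def add_observer(&observer)
    @inbox << [:add, observer]
    observer
  end

  def remove_observer(observer)
    @inbox << [:remove, observer]
  end

  # Returns immediately; the work happens on the pipe's thread
  def notify(type, **data)
    @inbox << [:notify, [type, data]]
  end

  private

  def run
    loop do
      command, payload = @inbox.pop
      case command
      when :add then @observers << payload
      when :remove then @observers.delete(payload)
      when :notify
        @observers.each do |observer|
          observer.call(*payload)
        rescue => error
          # a failing observer must not stop the others
          warn "observer failed: #{error.message}"
        end
      end
    end
  end
end

class Project
  extend Forwardable
  def_delegators :@pipe, :add_observer, :remove_observer

  def initialize
    @pipe = SketchAsyncPipe.start
  end

  def rename(name)
    @pipe.notify "renamed", name: name
  end
end

project = Project.new
seen = Queue.new
project.add_observer { |_type, _data| raise "broken observer" }
project.add_observer { |type, data| seen << [type, data] }

project.rename "Tower Block" # returns without waiting for observers
type, data = seen.pop        # blocks until the second observer has run
```

The first observer blows up, the pipe notes the failure and carries on, and the second observer still receives the event. Project#rename itself never sees the exception.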

Plumbing it all together

There are two types of dependency - the first, where you require in-depth knowledge of the code you are using, and the second, where you just need to know that things have happened.

In my engine-based future, the plan is, if Engine A needs deep knowledge of how Engine B works, Engine A's gemspec will depend on Engine B. But if Engine A just needs to know that something has changed, A can add itself as an observer to Engine B (probably attaching filters to the events it is interested in). This means, in the second case, Engine A needs no deep knowledge of Engine B; it only needs to know which notifications it sends. That keeps the two decoupled.

Taking this further, it makes the job of the "container" application simple. The container depends upon all the various engine gems, then, at start-up, "plumbs" them all together, creating the pipes that respond to various events. The individual engines know very little about each other and the application provides the customisation and application-specific behaviour.
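A toy version of that start-up wiring might look like this. Every name here (TinyPipe, Documents, AuditTrail) is hypothetical and exists only for the example; in a real application the two modules would live in separate engine gems and the wiring would sit in the container, for instance in an initializer.

```ruby
# Hypothetical sketch of a container "plumbing" two engines together.
# TinyPipe stands in for a real pipe.
class TinyPipe
  def initialize
    @observers = []
  end

  def add_observer(&observer)
    @observers << observer
  end

  def notify(type, **data)
    @observers.each { |observer| observer.call(type, data) }
  end
end

# "Engine B": announces what happened, knows nothing about its listeners
module Documents
  def self.events
    @events ||= TinyPipe.new
  end

  def self.upload(name)
    events.notify "document_uploaded", name: name
  end
end

# "Engine A": records audit entries, knows nothing about Documents
module AuditTrail
  def self.entries
    @entries ||= []
  end

  def self.record(type, data)
    entries << "#{type}: #{data[:name]}"
  end
end

# The container: the only code that knows about both engines
Documents.events.add_observer do |type, data|
  AuditTrail.record(type, data) if type == "document_uploaded"
end

Documents.upload "plan.pdf"
audit_entries = AuditTrail.entries
```

Neither module mentions the other. The only shared knowledge is the name of the notification, which is exactly the loose kind of dependency described above.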

You can see the start of this in C8O Automations and C8O Workflows.

Automations is an engine (already being used in Collabor8Online), for handling user-defined automations that are triggered by events within the application. The engine itself knows nothing about Collabor8Online; instead the main application uses include Automations::Container within the Project class and has a rake task that triggers the relevant automations when required. If the application needs to know when an automation has triggered (or failed) - which in Collabor8Online needs to happen so the audit trail can be updated - it adds itself as an observer to Automations.events, a pipe that notifies observers every time an automation does something.

Workflows is a rewrite of the workflows functionality from Collabor8Online and is not in use yet. This works similarly - the Project is a Workflows::TemplateContainer and a Folder is a Workflows::TaskContainer. The project manager defines their workflows at the project level, then the user triggers those workflows by various actions at the folder level. The workflow templates themselves use C8O Automations to define what happens as the particular workflow unfolds (and again, these Automations are defined within the application, so the engine knows nothing about them). And the application observes the notifications from the folder so it knows which workflows should be started.

So that's the plan. It's a deviation from the Rails Way, but I think it's one that makes a lot of sense.

I'll let you know how I get on.