Big Picture Event Storming - finding the gaps

11 Mar 2024. 11 minutes read


In my latest article of the series about our Domain-Driven Design show-case project, which you can read more about here, I covered our kickoff session of Big Picture Event Storming. Quick recap: we started with a chaotic exploration, which was a brain dump of relevant domain events from everyone involved. After that, we got down to organizing chaos by sorting out the events, removing duplicates, and fine-tuning them.

We wrapped up the session with the events arranged chronologically. We also marked a hot spot highlighting the things we were not sure about. Additionally, we used some yellow sticky notes for the crucial bits we didn’t want to forget.

As a bonus, we pinpointed the initial sub-processes happening within the main devices inventory process. Those were highlighted, and we gave them temporary names. Here’s a snapshot of what our Miro board looked like when we were done:

Picture 1. Miro board after the timeline enforcement phase

That looks pretty clear. But even with everything in order and the board looking sharp, we were aware that there could still be some gaps or inconsistencies hidden in our earlier modeling. Fortunately, Big Picture Event Storming helps identify such gaps or inconsistencies by providing two techniques - explicit walk-through and reverse narrative. Awareness of things we don’t know is often valuable knowledge for the stakeholders, especially when they didn’t know about it before.

In this post, I will dive into how we tackled the second part of our workshop. I'll share the techniques we used to ensure we had a complete picture of the process. But first, let’s start with some theory on the approaches to finding the gaps in the event flow.

Check all the articles from the series:
Domain Driven Design: A SoftwareMill Way
Mastering Chaos with Big Picture Event Storming
Finding the Gaps with Big Picture Event Storming
Event Storming from a non-technical domain expert perspective
Big Picture Event Storming: Simple Workshops, Big Benefits for Your Business
Managing complexity and uncertainty in Software Development: know when to press pause
Remote Event Storming challenges from a facilitator's perspective

Explicit walk-through

The explicit walk-through is an Event Storming technique for checking that the model is both accurate and complete. It involves a detailed review of the entire event sequence laid out during the workshop, with everyone involved checking each part of the process together. This way, we can spot inconsistencies, unnecessary parts, or anything missing, and make sure we all understand the business process the same way.

During the walk-through, the facilitator might ask someone to start telling the story based on the events, following their order in time. This role of storyteller can be passed around to keep everyone engaged.

To effectively validate the process steps and identify potential gaps or inconsistencies, it’s crucial to ask the right questions while going through the events. Though the facilitator usually starts this, participants often see its value and start asking questions, too. Common questions that help find missing elements include:

Why did this event happen?
Is there any other reason for this event to happen?
How does this event affect the system's state?
What could happen between this event and the one before it?
What follows this event?
How does this event affect the business?
Is it possible that this event will not happen? If yes, what are the consequences of such a situation?
What if this event happens more than once?
Are there any exceptional situations for this event?
Could this event be split into smaller, more specific events?

These questions help us find events we might have missed initially, confirm the importance of each event, and explore what happens before and after an event. They also help us see if there are different ways things could go.


Picture 2. Asking questions while going through the events in an explicit walk-through approach. It starts from the left (first event) and follows along with the timeline.

Those are quite a lot of questions, and these are just examples. We don't need to ask all of them for every event - that would be too much. Experienced facilitators and modelers know which questions make sense for different events.

What is important in this approach is that we review the events in the order in which they happen, as established on the timeline in the previous session. This differs from another method we use to check our work, the reverse narrative.
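To make the chronological order concrete, here is a minimal Python sketch of an explicit walk-through. It is only an illustration: the event names are hypothetical and not taken from our board, and the helper `walk_through_prompts` simply pairs each event with two of the questions listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainEvent:
    name: str


# Hypothetical timeline; these event names are assumptions for illustration.
TIMELINE = [
    DomainEvent("Device was ordered"),
    DomainEvent("Device was delivered"),
    DomainEvent("Device was issued to contractor"),
]


def walk_through_prompts(timeline):
    """Return facilitator prompts in chronological (left-to-right) order."""
    prompts = []
    previous = None
    for event in timeline:
        prompts.append(f"Why did '{event.name}' happen?")
        if previous is not None:
            # Look for events hiding between two neighbours on the timeline.
            prompts.append(
                f"What could happen between '{previous.name}' and '{event.name}'?"
            )
        previous = event
    return prompts


for prompt in walk_through_prompts(TIMELINE):
    print(prompt)
```

In a real session, the facilitator picks the questions that make sense for a given event rather than generating all of them mechanically.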

Reverse narrative

Another technique we often use to spot potential gaps in the process is called reverse narrative. Unlike in the explicit walk-through, we start from the end of the process, which is the last event on the timeline. This method offers a unique perspective that can uncover hidden assumptions, missed steps, or parts of the process that are more complicated than they need to be.

By starting from the desired outcome and asking, "How did we get here?" participants are encouraged to think critically about what happened before. Looking at things backward makes us reconsider whether each event is really necessary or could be simplified.

Just like in the explicit walk-through, we ask certain questions to check if each event makes sense and to find any we might have missed. The questions I mentioned earlier work here, too, but we tweak them a bit to fit the backward perspective. For instance:

What has to happen for this event to occur?
How could this outcome have been achieved differently?
What conditions were necessary for this event to occur?
What could have prevented this event from occurring?

Thinking backward isn't our usual way of doing things. Consider which feels more natural and easier: saying the alphabet from A to Z or from Z to A? This kind of backward thinking can help us notice things we missed the first time around.


Picture 3. Asking the questions while going through the events in a reverse narrative approach. It starts from the right (last event) and moves against the timeline's direction.
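The backward pass can be sketched in a few lines of Python. This is a simplified illustration, not a tool we used: the event names and the `PREREQUISITES` mapping are hypothetical, and in a workshop the prerequisites come from participants answering "What has to happen for this event to occur?".

```python
# Hypothetical timeline, ordered from first to last event.
TIMELINE = [
    "Device was ordered",
    "Device was issued to contractor",
    "Decision to decommission device was made",
]

# Assumed answers to "What has to happen for this event to occur?".
PREREQUISITES = {
    "Device was ordered": [],
    "Device was issued to contractor": ["Device was delivered"],
    "Decision to decommission device was made": ["Device was issued to contractor"],
}


def reverse_narrative(timeline, prerequisites):
    """Walk from the last event to the first and collect missing prerequisites."""
    on_board = set(timeline)
    gaps = []
    for event in reversed(timeline):
        for required in prerequisites.get(event, []):
            if required not in on_board:
                # Something must have happened earlier that nobody wrote down.
                gaps.append((event, required))
    return gaps


for event, missing in reverse_narrative(TIMELINE, PREREQUISITES):
    print(f"'{event}' requires '{missing}', which is not on the board")
```

Here the walk finds that "Device was delivered" is missing from the board, which is the kind of gap the technique is meant to surface.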

The reverse narrative technique, or a similar backward analysis method, is commonly used by law enforcement and investigative agencies (e.g., police) to conduct criminal investigations and analyze events. It's especially useful for catching inconsistencies in a suspect's story. In Event Storming, our goal isn't to catch anyone in a lie but to ensure the process they describe is complete and doesn't rely on unspoken assumptions.

The key benefit of both techniques is ensuring that the modeled process is complete and correctly expressed by the events. However, they also help everyone involved get a common understanding of the process. It ensures that the domain knowledge is not only shared but also validated across the teams. This helps bridge the gap between technical and non-technical participants, making complex domain logic accessible and understandable to all. It also supports the creation of ubiquitous language, which is used to express the domain model.

Now that we've covered the theory, let's see how it all played out in our session.

Finding the gaps in our processes

For our workshop, we chose the reverse narrative technique. Why? Because we (the modelers) knew only a little about the process we were modeling, as we were just its users. Our experts, however, know many hidden aspects of it that we don't.

Given the differences in how much participants know about the process, the reverse technique helps us examine the process more critically. From what I've seen, the explicit walk-through is more effective when everyone already understands the process similarly. That's pretty rare, so we usually go with the reverse narrative.

Decommissioning processes

Starting from the end of the process, we quickly realized that the reasons for decommissioning a device were different from what we initially thought. In our previous session, we understood that a device is decommissioned either when it's completely damaged and irreparable according to the service or when it's chosen to be decommissioned following a replacement. Here's what we had after the previous session:


Picture 4. Events in the decommissioning process after the first session

Although both scenarios were generally correct, we realized the sequence of events was wrong. Specifically, when a device is declared non-functional by the service, the first step is actually to initiate the replacement procedure due to damage. Only after this subprocess is the decommissioning process triggered.

Interestingly, while using the reverse narrative, we discovered a cyclical process of reviewing the inventory of devices. This highlights a key reason to use this technique - it helps us spot subprocesses, missed events, or alternative paths that weren't apparent at first.

In this discovered review subprocess, each device might be kept in stock, sold, or decommissioned if considered too old for reuse. So, the decommissioning process could be initiated in this scenario as well. Here's what we found after some discussion:


Picture 5. The modified entry point to the decommissioning process after the reverse narrative phase

Besides changing the starting points of this process, which I've already mentioned, we also realized the process itself functions a bit differently than we initially thought. Specifically, the documentation proving a device has been decommissioned, which is created once the decision is made, is also sent to the external accounting system.

Moreover, we rephrased the initial event from "Device was decommissioned" to "Decision to decommission device was made" to more accurately reflect the event's meaning. At this point, the device is not physically decommissioned. We just decided to do so.
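If we were to carry this naming into code, it could look like the Python sketch below. This is only an assumption about a possible shape, not our implementation: the `DecommissionTrigger` enum, the `device_id` field, and the names of the follow-up events are hypothetical. Only the event name "Decision to decommission device was made" and the two entry points come from the board.

```python
from dataclasses import dataclass
from enum import Enum


class DecommissionTrigger(Enum):
    # The two entry points described above; the enum itself is illustrative.
    REPLACEMENT = "device replacement process"
    INVENTORY_REVIEW = "cyclical inventory review"


@dataclass(frozen=True)
class DecisionToDecommissionDeviceMade:
    """Records the decision only; the device is not physically decommissioned yet."""

    device_id: str  # hypothetical identifier
    trigger: DecommissionTrigger


def events_after_decision(decision):
    """Return the (hypothetically named) events that follow the decision."""
    return [
        f"Decommissioning documentation was created for {decision.device_id}",
        "Decommissioning documentation was sent to the accounting system",
    ]


decision = DecisionToDecommissionDeviceMade(
    "laptop-042", DecommissionTrigger.INVENTORY_REVIEW
)
for event in events_after_decision(decision):
    print(event)
```

Naming the type after the decision rather than the outcome keeps the code honest about what is actually known at that point in the process.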

Device replacement process

Another example of how the reverse narrative helped us better understand the process was device replacement. Initially, we believed it was a straightforward process, which by the end of the first session looked like this:


Picture 6. Device replacement process at the end of the first session

At that stage, there was only one way to start the process - when a contractor requested a replacement. This triggers the usual process for issuing devices, similar to what happens with a new contractor. The "old" computer might then be returned to inventory or decommissioned, based on its condition, but only after the contractor has cleared the device's data.

However, when we looked at the events backward and asked the right questions, it turned out to be more complex:


Picture 7. Device replacement process after reverse narrative

First, we discovered additional ways to start this process. It's not just about a contractor asking for a new device because the old one is outdated. For instance, the device might be lost, unexpectedly damaged, or even stolen. Additionally, if the service determines that the computer can't be repaired, that also serves as a signal to start the replacement process, as I described while discussing the updates to the decommissioning process.

In all these situations, there's a need for a new device, which triggers the main device issuing process. If it's just a replacement, the contractor has the option to purchase the old device, which initiates the process of selling the company's device. If not, the device is either returned to the inventory pool as available for another contractor or decommissioned.

In situations where a device is lost, stolen, or damaged, the decommissioning process kicks off as well. Additionally, regardless of the scenario, we need to remove the device's assignment from the contractor. We also explored what other actions should be taken if a device is lost or stolen. However, since we've (fortunately) never encountered such incidents, we decided to delay this discussion. That's why we added two hot spots to the events related to these scenarios.
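The branching described above can be summarized in a short Python sketch. It is a simplification under stated assumptions: the `ReplacementReason` enum, the function and parameter names, and the process labels are hypothetical, and the extra actions for lost or stolen devices are deliberately left out because they remain hot spots.

```python
from enum import Enum


class ReplacementReason(Enum):
    REQUESTED = "requested by contractor"
    LOST = "lost"
    STOLEN = "stolen"
    DAMAGED = "damaged"
    IRREPARABLE = "declared irreparable by the service"


def follow_up_processes(reason, contractor_buys_old_device=False, reusable=True):
    """Return the subprocesses started by a replacement (illustrative labels)."""
    # Every scenario needs a new device and removes the old assignment.
    processes = ["issue new device", "remove old device assignment"]
    if reason is ReplacementReason.REQUESTED:
        if contractor_buys_old_device:
            processes.append("sell device")
        elif reusable:
            processes.append("return device to inventory")
        else:
            processes.append("decommission device")
    else:
        # Lost, stolen, damaged, or irreparable devices are decommissioned.
        # Further actions for lost or stolen devices are still open hot spots.
        processes.append("decommission device")
    return processes


print(follow_up_processes(ReplacementReason.REQUESTED, contractor_buys_old_device=True))
print(follow_up_processes(ReplacementReason.STOLEN))
```

Writing the flow down this way also shows how many paths were hidden behind the single "contractor requested a replacement" entry point from the first session.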

As you can see, we significantly modified this process. We uncovered different scenarios with various flows and discovered events that hadn't been identified earlier. This shows the power of the reverse narrative technique.

Wrap-up

In this article, I shared how we used the reverse narrative technique to uncover gaps in the process we modeled in our previous session. I explained how this method helped us better understand and refine the decommissioning and device replacement processes. These were just two instances that highlighted the effectiveness of this approach; we identified additional events through other discussions in this session as well.

We also revisited other subprocesses identified in the first session. In the end, their scope was mainly unchanged, although, as mentioned, some underwent significant revisions. We also decided that two smaller subprocesses (the invoice accounting process and the warranty period expiration process) were somewhat side processes, and they don’t affect device availability, so we gave them less time. However, this was a conscious choice.

Finally, the board looked as follows:

Picture 8. State of the board after the reverse narrative phase

It appears even more organized than before we started this session, which is what we aimed for. However, this session's true value lay in exploring the process's nuances, identifying gaps that hadn’t been revealed before, and establishing a shared understanding of the process we were workshopping.

In our next session, we will aim to identify the actors and systems involved in our device inventory process. Stay tuned for that!