Added new message modelling blog post #342

dhope-scottlogic · 2025-07-29T12:33:50Z

Please add a direct link to your post here:

https://dhope-scottlogic.github.io/blog/2025/07/29/message_types_part2.html

Have you (please tick each box to show completion):

[ * ] Added your blog post to a single category?
[ * ] Added a brief summary for your post? Summaries should be roughly two sentences in length and give potential readers a good idea of the contents of your post.
[ * ] Checked that the build passes?
[ * ] Checked your spelling (you can use npm install followed by npx mdspell "**/{FILE_NAME}.md" --en-gb -a -n -x -t if that's your thing)
[ * ] Ensured that your author profile contains a profile image, and a brief description of yourself? (make it more interesting than just your job title!)
[ * ] Optimised any images in your post? They should be less than 100KBytes as a general guide.

Posts are reviewed / approved by your Regional Tech Lead.

jamesheward · 2025-08-15T08:41:05Z

_posts/2025-07-29-message_types_part2.md

+Coarser messages can be simpler for a consumer because all the information they need comes in one message and can be immediately stored; there is no joining of messages, but there are also costs and we'll explore these ideas soon. 
+
+Before we do that, let's briefly consider the different decision points where specific choices around granularity have been made:
+ 1. Should each endpoint send out an event matching the endpoint payload or divide it into smaller messages for particular field sets


This list isn't rendering on new lines; perhaps separate it out so each bullet is on its own line.

Good spot, worked fine in my local markdown

jamesheward · 2025-08-15T08:42:09Z

_posts/2025-07-29-message_types_part2.md

+### Endpoint to message mapping
+Let's say there is a single update-profile REST endpoint, like PUT /profile where the XML/JSON payload includes an email, postal address, phone number etc (assume no separate email endpoint for now)
+
+When generating a message there is a choice between


jamesheward · 2025-08-15T08:44:22Z

_posts/2025-07-29-message_types_part2.md

+Next let's think about the scenario shown in the diagram with a separate endpoint for changing the email (we'll conveniently ignore complexities of changing email, which may actually be multi-step). 
+We could send an EMAIL_CHANGED  event or EMAIL state message for the email endpoint and a PROFILE/PROFILE_UPDATED one (without the email) when the profile endpoint is hit. But... wouldn't a consumer expect to find an email in a profile message? If we find that persuasive then we might just send a "PROFILE" message including the email when either endpoint is hit meaning that the consumer has one simple message to listen for regardless of how the change occurred. In this case we are relating our events to the entities within our system rather than coupling them to the REST endpoints. 
+
+Such an approach makes good sense but brings consistency risks that'll we'll discuss shortly


Typo: "that'll we'll" should be "that we'll".
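
To illustrate the entity-based approach described in the quoted paragraph, here is a minimal sketch in which both endpoints emit the same PROFILE state message. The handler names, the `Profile` fields, and the in-memory `profiles` and `published` stand-ins are assumptions for illustration only; the post does not show an implementation.

```python
from dataclasses import asdict, dataclass


@dataclass
class Profile:
    user_id: str
    email: str
    phone: str
    postal_address: str


# In-memory stand-ins for a real database and message broker (assumptions).
profiles: dict[str, Profile] = {}
published: list[dict] = []


def publish_profile(user_id: str) -> None:
    """Emit one PROFILE state message built from the stored entity."""
    published.append({"type": "PROFILE", **asdict(profiles[user_id])})


def put_profile(user_id: str, phone: str, postal_address: str) -> None:
    """Hypothetical handler for PUT /profile (email is changed elsewhere)."""
    current = profiles[user_id]
    current.phone = phone
    current.postal_address = postal_address
    publish_profile(user_id)


def put_email(user_id: str, email: str) -> None:
    """Hypothetical handler for the separate email endpoint."""
    profiles[user_id].email = email
    publish_profile(user_id)
```

Whichever handler runs, the consumer sees a full PROFILE message including the email, so the message shape follows the entity rather than the endpoint that was hit.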

jamesheward · 2025-08-15T08:45:27Z

_posts/2025-07-29-message_types_part2.md

+   "ID": "0f504d3b-d76a-4aaa-b628-5e9eeaa10bdc",
+   "datetime": 10:00, 23/04/1983 UTC
+   "name": "Premier League",
+   "shortName": "Man City",


jamesheward · 2025-08-15T08:45:50Z

_posts/2025-07-29-message_types_part2.md

+   "datetime": 10:00, 23/04/1983 UTC
+   "name": "Premier League",
+   "shortName": "Man City",
+   "location": "Manchester",


Maybe location doesn't make sense here.

yeah, copy and paste mistake

jamesheward · 2025-08-15T08:56:06Z

_posts/2025-07-29-message_types_part2.md

+
+ Many tools like Kafka/Kinesis/EventHubs can guarantee ordering within a shard/partition (it's up to you to pick a sensible key, e.g. user Id, to select the shard) and this will simplify consumers who don't have to worry about receiving and stashing out of order events. If you don't have this you'll have to rely on timestamps to enforce order and add some consumer complexity. 
+
+If you are sending events from application code after a database write and without ACID guarantees, reasoning about your messages will be difficult, not just in terms of change lists but also for overall system consistency and avoiding conflicts.


I'd go further than that and just say this is a total no-no. Unless you are using distributed transactions to get an atomic event + DB write (which I've never seen be worthwhile given the complexity and lack of sound guarantees), you have no guarantee of consistency in your events, and they are almost guaranteed to drift from the true state.

To a point, but with many NoSQL DBs there has been no way to guarantee message + DB atomicity, as many DBs have had neither transactions nor a CDC mechanism with proper guarantees.
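
For the atomicity point in this thread, one option when the database does support local transactions is to write the state change and the outgoing event in the same transaction and relay the event afterwards, often called a transactional outbox. Neither the post nor the comments name this pattern, so treat the following as a sketch under that assumption. The table names, the `change_email` and `drain_outbox` functions, and the use of SQLite are all hypothetical.

```python
import json
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a state table and an outbox table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE profile (user_id TEXT PRIMARY KEY, email TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    return conn


def change_email(conn: sqlite3.Connection, user_id: str, email) -> None:
    """Write the new state and the event row in one local transaction."""
    # `with conn` commits both inserts together, or rolls both back on error.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO profile (user_id, email) VALUES (?, ?)",
            (user_id, email),
        )
        event = {"type": "EMAIL_CHANGED", "userId": user_id, "email": email}
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))


def drain_outbox(conn: sqlite3.Connection, send) -> int:
    """Relay pending events in insertion order, deleting each after sending.

    A crash between `send` and the delete would resend the event, so this
    gives at-least-once delivery and consumers must tolerate duplicates.
    """
    rows = conn.execute("SELECT id, payload FROM outbox ORDER BY id").fetchall()
    for row_id, payload in rows:
        send(json.loads(payload))
        with conn:
            conn.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
    return len(rows)
```

This only helps where the store offers transactions in the first place, which is the limitation raised in the reply above.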

jamesheward · 2025-08-15T08:57:02Z

_posts/2025-07-29-message_types_part2.md

+
+### Security in aggregated messages and accidental coupling 
+Moving on to a totally separate topic, let's briefly consider security. 
+With REST and other APIs it is normal to have access controls saying which endpoints can be accessed by whom.


jamesheward · 2025-08-15T09:03:32Z

_posts/2025-07-29-message_types_part2.md

+
+What this means is that going down the state transfer route as opposed to events can limit your security when the messages contain an aggregation of multiple REST endpoint payloads. So never aggregate data that has different access requirements. 
+
+Related to this, there's also a risk of accidental coupling. Imagine you add a new field to some message where that field is only really intended for one consumer. You think that if you ever need to change/remove the field it'll be a quick conversation with that one consumer's dev team. However, perhaps this field is also added to an existing aggregated message because the consumer already gets this and it's easier than integrating with a new topic and message. Unfortunately, 20+ other consumers are also using the aggregated message and, over time, developers in the associated other teams choose to use/misuse this field but you have no visibility. All you know is that 20 consumers of the aggregated message may or may not be using a given field. Suddenly you can't make a change because you'll break lots of services, not just the original intended one.


I'm not sure I agree with these next two paragraphs. The producer is in control of the event and, if it's published for consumption outside the domain, should treat it as a public contract that could be used in any way a consumer decides. There needs to be an agreed pattern for how schemas are managed, published and evolved within the ecosystem.

Events and APIs are both contracts that the producer gives guarantees over. If, as a producer, you are too concerned with specific consumers of those contracts then you possibly have your domains modelled incorrectly.

I might reword slightly or simplify the point. Ultimately I am trying to say that if you aggregate for convenience you can lose control over who is accessing what. The same is less true with GraphQL, where you can control which parts of the schema each consumer can see. And yes, agreed: if you don't model your domains well then you end up with messages crossing domains, where half the fields are not relevant to many consumers but they need the message for the other half.
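
The rule quoted above, "never aggregate data that has different access requirements", could be enforced mechanically along these lines. This is a sketch only: the `FIELD_SCOPE` classification, the scope names, and `split_by_scope` are hypothetical and do not come from the post.

```python
from collections import defaultdict

# Hypothetical classification of which access scope each field requires.
FIELD_SCOPE = {
    "displayName": "profile:read",
    "email": "contact:read",
    "phone": "contact:read",
    "cardLast4": "payment:read",
}


def split_by_scope(entity_id: str, fields: dict) -> dict[str, dict]:
    """Return one message per access scope instead of a single aggregate."""
    grouped: dict[str, dict] = defaultdict(dict)
    for name, value in fields.items():
        scope = FIELD_SCOPE.get(name)
        if scope is None:
            # Refuse to publish anything that has not been classified.
            raise ValueError(f"field {name!r} has no access classification")
        grouped[scope][name] = value
    return {scope: {"id": entity_id, **body} for scope, body in grouped.items()}
```

Each resulting message can then go to a topic whose access control matches its scope, so a consumer allowed to read contact details never receives payment fields as a side effect.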

jamesheward · 2025-08-15T09:06:04Z

_posts/2025-07-29-message_types_part2.md

+
+
+## Enrichment pipelines
+To finish, consider a slightly different pattern I am calling the enrichment or decoration pipeline:


This feels like a bit of an aside, and I'm not sure it adds a huge amount to the overall blog.

jamesheward · 2025-08-15T09:12:48Z

_posts/2025-07-29-message_types_part2.md

+
+You can get round the cost issue to some degree by mandating that where a service enriches data it should effectively pass through the existing data. To put it another way you treat earlier data as a blob and don't map it into internal models on input and output. However, you still need to think about schemas and how you keep this up to date. If consumer A reads from Enricher N-1 at the end of the chain, it wants an async API schema from Enricher N-1 and that should include all the data added by previous stages.
+
+## Final thoughts


Overall the content is great, but I think it could flow a bit more clearly. Some suggestions:

You draw on a few examples throughout (trading, sports, videos, user prefs, etc.). Perhaps set up the prototypical system at the start, and then use that as a single example throughout?

It feels like the last 3 sections in granularity could go into a separate top-level section on aggregation, so you would have granularity, normalisation, aggregation, and tradeoffs, which could be laid out in the intro.

Maybe in the final thoughts, separate out the key recommendations as bullet points for readability.

Will have a look at that, thanks
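
The pass-through idea in the quoted enrichment paragraph (treat earlier data as a blob and do not map it into internal models) might look roughly like this. The `enrichments` envelope key, the stage names, and the `enrich` function are assumptions made for the sketch, not something the post specifies.

```python
import json


def enrich(raw: bytes, stage: str, added: dict) -> bytes:
    """Add this stage's fields while passing all earlier data through untouched."""
    # Parse into a generic dict, deliberately not a typed internal model, so
    # fields this stage has never heard of survive the round trip.
    message = json.loads(raw)
    enrichments = message.setdefault("enrichments", {})
    if stage in enrichments:
        raise ValueError(f"stage {stage!r} has already enriched this message")
    enrichments[stage] = added
    return json.dumps(message).encode("utf-8")
```

Because each stage only appends under its own key, a schema published by a later enricher still has to describe everything added upstream, which is the schema-maintenance cost the paragraph points out.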

Added new message modelling blog post

e31554d

jamesheward reviewed Aug 15, 2025
