Skip to content

fix: prevent synced invoices from getting updated by outdated webhooks #68

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

ignaciodob
Copy link
Contributor

@ignaciodob ignaciodob commented Jun 23, 2025

This PR introduces a fix for out-of-order webhooks causing erroneous states in the invoices.

What is the current behavior?

Out-of-order webhook events from Orb can cause incorrect invoice states in the database. Since the system upserts invoices without checking the recency of the state, an issued status may overwrite a previously stored paid status.

What is the new behavior?

This PR introduces a general safeguard for upserting - the upsertManyWithTimestampProtection function. This function uses a SQL statement with a WHERE clause that only updates records when the incoming last_synced_at timestamp is newer than the existing one. This prevents older webhook events from overwriting newer data, ensuring data consistency. This pattern can easily be applied to other entities by using the same function in their respective sync functions.

It also:

  • Adds a new migration to add the last_synced_at column to all entities: invoices, customers, subscriptions, credit_notes, plans, and billable_metrics
  • Adds the syncTimestamp parameter to the syncInvoices method allows webhook handlers to pass the webhook's created_at timestamp, which is then used as the last_synced_at value for timestamp-based protection
  • Adds assertions and tests to cover this functionality, including a new test case where we check that an invoice is not updated with an out-of-date webhook.

@ignaciodob ignaciodob changed the title Prevent synced invoices from getting updated by out of date webhooks Prevent synced invoices from getting updated by outdated webhooks Jun 23, 2025
@ignaciodob ignaciodob changed the title Prevent synced invoices from getting updated by outdated webhooks feat: prevent synced invoices from getting updated by outdated webhooks Jun 25, 2025
@ignaciodob ignaciodob changed the title feat: prevent synced invoices from getting updated by outdated webhooks fix: prevent synced invoices from getting updated by outdated webhooks Jun 25, 2025
@ignaciodob ignaciodob force-pushed the feat/last-synced-invoices branch from 9bce8bf to 8387069 Compare June 26, 2025 12:18
@ignaciodob ignaciodob force-pushed the feat/last-synced-invoices branch from fe7dce9 to 04d51a1 Compare June 26, 2025 15:30
@ignaciodob ignaciodob self-assigned this Jun 26, 2025
@ignaciodob ignaciodob marked this pull request as ready for review June 26, 2025 16:55
@ignaciodob ignaciodob requested review from ecktoteckto and kevcodez and removed request for ecktoteckto June 26, 2025 16:55
.map((x) => `"${x}" = EXCLUDED."${x}"`)
.join(',')},
last_synced_at = :last_synced_at
WHERE "${table}"."last_synced_at" IS NULL
Copy link
Collaborator

@kevcodez kevcodez Jun 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will upsert if ANY entry in the table matches this, but not necessarily this id, we can do something like

WHERE NOT EXISTS (select 1 from "${table}" where id = :id AND "${table}"."last_synced_at" > :last_synced_at;)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The integration tests likely did not catch this (if true), given you test for a single entry

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch! You're right - I'll add a test to cover this too

Copy link
Contributor Author

@ignaciodob ignaciodob Jun 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to do a deep dive into the Postgres docs, but it seems that Postgres actually scopes the WHERE clause in ON CONFLICT DO UPDATE very narrowly.

Specifically:

  • The WHERE clause only runs after a conflict is detected (i.e. once a row with matching constraint is found).
  • At that point, it only has access to two rows:
    • The conflicting row from the table (via ${table} or alias)
    • The proposed insert row (via excluded)

So even though it looks like ${table}."last_synced_at" could match any row, it actually only refers to the row that caused the conflict.

Source:

  • PostgreSQL Docs on INSERT
    “The SET and WHERE clauses in ON CONFLICT DO UPDATE have access to the existing row using the table’s name (or an alias), and to the row proposed for insertion using the special excluded table.”
  • This comment on StackOverflow also says the same thing and has a clearer example: https://stackoverflow.com/a/44579804

That said, I find this to be quite non-intuitive (even a bit misleading 😅). I've added a test to verify that this works and reviewed it thoroughly, and I believe it addresses this case. I've added a comment too.

Let me know what you think :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sgtm, also tested this locally and it makes sense

@@ -43,6 +43,7 @@ export const invoiceSchema: JsonSchema = {
sync_failed_at: { type: 'string' },
voided_at: { type: 'string' },
will_auto_issue: { type: 'boolean' },
last_synced_at: { type: 'string' },
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last_synced_at might be a little confusing wrt to the name, as this field is either

  • timestamp when a manual sync was triggered (matches the name)
  • Timestamp of the entity change, not when it was synced - i.e. entity changed at 21:00:00, webhook came in at 21:15:00 - so we synced it at 21:15:00, but we will be using 21:00:00 as date

I'm okay with the name as-is, but maybe we can come up with a more fitting name, just a nit though

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for flagging this!

Timestamp of the entity change, not when it was synced - i.e. entity changed at 21:00:00, webhook came in at 21:15:00 - so we synced it at 21:15:00, but we will be using 21:00:00 as date

We will actually update last_synced_at to the time of the webhook creation, not the time when it arrived. We're using the webhook's created_at timestamp.

For example, taken from a recent Orb webhook:

{
  "id": "<invoice_id>",
  "created_at": "2025-06-27T17:54:17.144536+00:00",
  "type": "invoice.issued",
  "invoice": {
    "created_at": "2025-06-27T17:54:15+00:00",
    ...
   }
}

We'll use the first one as the last_synced_at, which is two seconds after the Invoice entity was created/updated.
In the webhook handler:

 await syncInvoices(this.postgresClient, [webhook.invoice], webhook.created_at);

Notice that we're using webhook.created_at and not webhook.invoice.created_at

So last_synced_at = “we synced this entity because of an event that occurred at this time”
I think that last_synced_at still makes sense in this case, what do you think?

Happy to make the change if it doesn't make sense! :D

entries: T[],
table: string,
tableSchema: JsonSchema,
syncTimestamp: string
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When doing a manual sync, when can always pass new Date() as syncTimestamp, given we sync at that time specifically - in that case we may end up ALWAYS having a syncTimestamp, so this separate method would not be necessary - or are there any cases I am missing here?

Copy link
Contributor Author

@ignaciodob ignaciodob Jun 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A separate method is not strictly necessary, but my idea was to keep the new functionality contained to the invoices first, especially given that we don't have tests for the other events yet. In case something goes wrong, syncing up only the invoices would be easier.

But yeah, eventually, I don't see a reason why we wouldn't do the same for every other entity.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine by me to get this out 👍

Copy link
Collaborator

@kevcodez kevcodez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants