This repository was archived by the owner on Sep 30, 2023. It is now read-only.

'replicate.progress' event is not always sent #122

@EmiM


orbit-db: 0.28.3
orbit-db-store: 4.3.3

Hi,
we are using OrbitDB to develop a p2p chat application.
Some time ago we noticed that users are occasionally missing older messages (a "message" is an entry in an EventStore). It happens rarely, but it has already happened at least a few times, and we are sure it wasn't a connection issue because new messages were arriving and could be sent without any problem.

Right now we rely on the replicate.progress event to send newly received messages to the frontend. After some intensive testing I managed to reach the broken state (one missing message) and gather some logs.

(Screenshot "missingMessage": notice that "I received a message but Windows did not start replicating missing messages. Will it trigger now?" is missing on the left side.)

What happened was that the replicate.progress event didn't fire for this particular message because none of the conditions in onReplicationProgress (https://github.com/orbitdb/orbit-db-store/blob/main/src/Store.js#L98) were met.

These are the logs from the application in the broken state. They are a bit messy because I was logging the db snapshot on every 'replicate.progress' to see how the oplog was changing.
app1MissingMessage.log

This is the final snapshot, which shows that the "missing" entry is in the local db; the information about receiving it just wasn't propagated:
app1MissingMessagesFinalSnapshot.log
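
One way to check a snapshot for this kind of gap is to compare the entry hashes in the local log with the hashes that were actually forwarded to the frontend. The sketch below is only an illustration: `findUnforwarded`, the `{ hash, payload }` entry shape, and the placeholder hashes are assumptions made for the example, not data from the attached logs.

```javascript
// Hypothetical helper: returns entries that exist in the local snapshot
// but were never forwarded to the frontend.
function findUnforwarded (snapshotEntries, forwardedHashes) {
  return snapshotEntries.filter((entry) => !forwardedHashes.has(entry.hash))
}

// Placeholder data (made up for this example, not taken from the logs).
const snapshotEntries = [
  { hash: 'hash-a', payload: 'first message' },
  { hash: 'hash-b', payload: 'second message' },
  { hash: 'hash-c', payload: 'third message' }
]

// Hashes for which a 'replicate.progress' handler ran and forwarded the entry.
const forwardedHashes = new Set(['hash-a', 'hash-c'])

const unforwarded = findUnforwarded(snapshotEntries, forwardedHashes)
console.log(unforwarded.map((entry) => entry.hash))
```

An entry reported by such a check is present locally but was never announced, which matches the state described above.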

Looking at the logs of the last 3 messages, I noticed that the Replicator received the "Yes it did trigger" message before "I received a message but Windows did not start replicating missing messages. Will it trigger now?". I am not sure if this matters, but after "Yes it did trigger" the replicationStatus wasn't recalculated properly, so replicate.progress didn't fire:

entry.clock.time: 31
onReplicationProgress: I reconnected at 22:49. Will sending a message trigger replication?
onReplicationProgress -> (this._oplog.length + 1 | 43), (this.replicationStatus.progress | 43), {previousProgress | 42}, (this.replicationStatus.max | 44), (previousMax | 44)
entry.clock.time: 44
onReplicationProgress: Yes it did trigger
onReplicationProgress -> (this._oplog.length + 1 | 43), (this.replicationStatus.progress | 44), {previousProgress | 43}, (this.replicationStatus.max | 44), (previousMax | 44)
entry.clock.time: 43
onReplicationProgress: I received a message but Windows did not start replicating missing messages. Will it trigger now?
onReplicationProgress -> (this._oplog.length + 1 | 43), (this.replicationStatus.progress | 44), {previousProgress | 44}, (this.replicationStatus.max | 44), (previousMax | 44)
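
The three log lines above can be replayed against a small model of the emit decision. This is a sketch, not the code from Store.js: it assumes the event is emitted when any one of the three comparisons suggested by the logged operands holds (`this._oplog.length + 1 > progress`, `progress > previousProgress`, `max > previousMax`). The function name `shouldEmitProgress` and the field names are hypothetical.

```javascript
// Assumed emit condition: an OR of three comparisons over the logged values.
// This mirrors the operands printed in the log, not the verified source.
function shouldEmitProgress ({ oplogLengthPlusOne, progress, previousProgress, max, previousMax }) {
  return (
    oplogLengthPlusOne > progress ||
    progress > previousProgress ||
    max > previousMax
  )
}

// Values copied from the three log lines, in the order they were received.
const logged = [
  { clockTime: 31, oplogLengthPlusOne: 43, progress: 43, previousProgress: 42, max: 44, previousMax: 44 },
  { clockTime: 44, oplogLengthPlusOne: 43, progress: 44, previousProgress: 43, max: 44, previousMax: 44 },
  { clockTime: 43, oplogLengthPlusOne: 43, progress: 44, previousProgress: 44, max: 44, previousMax: 44 }
]

const results = logged.map((row) => ({
  clockTime: row.clockTime,
  emitted: shouldEmitProgress(row)
}))

console.log(results)
```

Under that assumption, the first two entries pass only because progress grew, while the entry with clock time 43 arrives after progress has already reached 44, so all three comparisons are false and nothing is emitted for it.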

Unfortunately I don't have a working test yet because the path for reproducing the problem is somewhat random. Opening and closing apps (aka peers) in some order seems to do the trick. I'll provide a test as soon as I create one.

Do you have any idea what could've happened here?
