Skip to content

[FLINK-38224][table-planner] Do not convert to delta join when the sink is a retract sink #26907

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

xuyangzhong
Copy link
Contributor

What is the purpose of the change

Delta joins might generate redundant data, while retract sinks require data to appear in pairs as both '+' and '-' forms. So we should not convert join to delta join when the sink is a retract sink.

Brief change log

  • Disable duplicate changes when the sink is a retract sink
  • Add tests

Verifying this change

Tests are added to verify this pr.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented?

@flinkbot
Copy link
Collaborator

flinkbot commented Aug 15, 2025

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

+ ")");

String sql = "insert into pk_snk select a,b,c from retract_src";
String sql = "insert into pk_upsert_snk select a,b,c from retract_src";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some places are a,b,c, while others are a, b, c. The code style for spaces can be unified.

@@ -545,6 +571,39 @@ class DeltaJoinTest extends TableTestBase {
util.verifyRelPlanInsert("insert into tmp_snk select a0, a1 from src1")
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another curious thing is why TABLE_EXEC_SINK_UPSERT_MATERIALIZE should be set to NONE. I have observed that other tests also have the same upsertMaterialize=[true] in DeltaJoinTest. xml

"insert into retract_snk select * from src1 join src2 " +
"on src1.a1 = src2.b1 " +
"and src1.a2 = src2.b2")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we set TABLE_OPTIMIZER_DELTA_JOIN_STRATEGY to FORCE and provide some different exception messages to indicate the reason for not supporting it

verifyRelPlanInsert(sql);
}

@TestTemplate
void testChangelogNormalize() {
assumeTrue(testSinkWithPk);

String sql = "insert into pk_snk select a,b,c from upsert_src";
String sql = "insert into pk_retract_snk select a,b,c from upsert_src";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

upsert_src is only used here, can we move the ddl here

void testRetractSink() {
assumeTrue(testSinkWithPk);

String sql = "insert into pk_retract_snk select a,b,c from retract_src";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe there is essentially no difference between testRetractSink and testChangelogNormalize, both of which are caused by pk_detract_stnk resulting in duplicateChanges being set to disallow. Therefore, they can be merged into one test and the ddl of pk_detract_stnk can be moved here

@Au-Miner
Copy link
Contributor

Thanks for improving the duplicate semantics, let me provide some comments.

@github-actions github-actions bot added community-reviewed PR has been reviewed by the community. and removed community-reviewed PR has been reviewed by the community. labels Aug 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community-reviewed PR has been reviewed by the community.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants