Skip to content

Conversation

jkachel
Copy link
Contributor

@jkachel jkachel commented Oct 1, 2025

What are the relevant tickets?

Closes mitodl/hq#8655

Description (What does it do?)

We're working to make Keycloak the source of truth for organization data, including memberships. MITx Online allows users to join an organization by supplying an enrollment code (either through the checkout process or via a separate "attach" API the user can get to via Learn). These joins need to be represented in Keycloak too, or the user will be removed from the organization when they do anything else in the system. (Org membership is evaluated any time the user hits the system, and orgs the user's joined on their own won't show up in the list that we get back from APISIX, so this will signal the auth middleware to remove them from those orgs.)

So, this PR adds code to hit the Keycloak admin API's organization members endpoint to add the user to the Keycloak organization when we process an enrollment code. When a user either completes checkout of a contract courserun and uses an enrollment code, or uses the attach API, the system will now POST the user's global ID to organizations/<org id>/members/ which should add them to the organization in question.

This also does some other backend shoring up that was needed:

  • We now have a hard link between users and organizations. Previously, we calculated this based on contract membership.
    • This allows us to flag organization records on MITx Online's end that have been manually processed by it.
    • Every place that the system added contract membership was updated to ensure the org memberships were also correct.
  • The integration_type field on contracts has changed to membership_type.
    • integration_type wasn't a reasonable name for the field anymore - we don't use it to determine whether or not the contract/org has an SSO integration, we use it to determine when we should automatically manage the membership of the contract.
    • This also adds auto, managed, and code as options for this field. managed and auto correspond to the sso type before, and code corresponds to the non-sso type. (There are two auto ones for a future PR.)
    • The migration that changes this should preserve the field values for existing contracts.

How can this be tested?

Automated tests should pass.

To test, your setup needs to have:

  • MITx Online configured properly to hit the Keycloak admin API. See
  • An org that is set up in Keycloak and has a matching org in MITx Online (use sync_keycloak_orgs management command)
  • A B2B contract that is configured for the org and is a non-sso/code contract (so you get enrollment codes).
  • A course for that contract (so you get enrollment codes - it does not necessarily need to exist in edX).
  • Ideally: Learn also running, so you can easily hit the attach API. Alternatively, you can work out the URL for that manually or you can go through the checkout process to attach to the contract/org. (But Learn is highly recommended since it has the org dashboard in it.)
  1. Log in as your test user.
  2. Attach the user to your contract using an enrollment code - either by using the cart process or the attach URL.
  3. Check the user in MITx Online. It should have the org attached to it, and the "keep until seen" flag should be True.
  4. Check the user in Keycloak. They should be in the org.
  5. Continue doing things in MITx Online. The user's org membership should remain stable.
  6. Log out and then back in again.
  7. Check the user in MITx Online. The org membership should not have any changes, but the "keep until seen" flag should be cleared.
  8. Removing the user from the org in Keycloak should result in them being removed from the org in MITx Online. However, you have to log out and back in for that change to take effect.

Additional Context

The managed and auto types are to cover two cases: when we want a contract to be available to all users in the org without exception, and when we want a contract to be available to users in the org who arrive there via federation (i.e. they log in via their host organization). At the moment, there's no real distinction between these two; that will come in a later PR. (Basically, we'll need to add a flag to the org membership in MITx Online to determine if they came in with an enrollment code or not - if they did, they won't get managed contracts automatically.)

The "keep until seen" flag works around an issue with reconciling the user's org memberships based on the user data from APISIX. APISIX's user data payload includes the organizations the user belongs to. However, this user data is cached - it retrieves the user data from Keycloak when the user logs in, and then it never updates it. So until the APISIX session is cleared, the org list for the user won't change either (even though we've changed it in Keycloak). The "keep until seen" flag means that the reconciliation code should skip this particular membership until it's actually seen in the APISIX user data. This is also why you have to log out and back in if you remove a user from the org in Keycloak for the change to take effect in MITx Online.

@jkachel jkachel force-pushed the jkachel/8655-update-keycloak-org-membership-on-enroll branch 2 times, most recently from ce12a0f to 7fbab77 Compare October 1, 2025 20:44
@jkachel
Copy link
Contributor Author

jkachel commented Oct 2, 2025

Final test failure here is one of the flaky ones - see https://github.com/mitodl/hq/issues/8744

@rhysyngsun rhysyngsun self-assigned this Oct 3, 2025
Comment on lines 76 to 80
CONTRACT_MEMBERSHIP_SSO,
CONTRACT_MEMBERSHIP_NONSSO,
CONTRACT_MEMBERSHIP_MANAGED,
CONTRACT_MEMBERSHIP_CODE,
CONTRACT_MEMBERSHIP_AUTO,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be [value[0] for value in CONTRACT_MEMBERSHIP_CHOICES] instead so we don't risk them getting out of sync if other types are added or removed?

CONTRACT_MEMBERSHIP_AUTO,
],
default=CONTRACT_INTEGRATION_NONSSO,
default=CONTRACT_MEMBERSHIP_MANAGED,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there maybe not be a default and require it to be explicitly specified to avoid user error?



def noop(apps, schema_editor):
"""No operation (placeholder for reverse migration)."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wasn't ever implemented, so there's permanent data loss on a rollback.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this to do an AlterField on the column rather than creating a new one and dropping the old; seemed more straightforward that way since the "use" of the column really doesn't change any.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rename would actually cause issues w/ an in-flight requests at the time though, right? Even the original code actually had this issue so I'm thinking in this PR you should only add the field and copy the values over, and then in a follow up PR after the next release remove the old field. Normally I think we'd just roll with it but I think there's extra sensitivity around this given the users it'd impact?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Reworked this in ee9a451.

b2b/api.py Outdated
Q(sso_organization_id__in=organizations) & ~Q(organization_users__user=user)
)
.filter(sso_organization_id__isnull=False)
.all()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: all() is unnecessary here. Same for the next query, but I'll only comment here.

b2b/api.py Outdated
Comment on lines 817 to 830
new_membership, created = UserOrganization.objects.get_or_create(
user=user,
organization=add_org,
defaults={"keep_until_seen": False},
)

contracts_to_remove = user_contracts_qs.filter(organization__in=no_orgs).all()
if not created and new_membership.keep_until_seen:
new_membership.keep_until_seen = False
new_membership.save()
log.info(
"reconcile_user_orgs: cleared keep_until_seen for user %s org %s",
user.id,
add_org,
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (through line 830, github is not doing the diff well) could just be simplified down to use update_or_create and perform 1 less query per org unless the log statement is important:

Suggested change
new_membership, created = UserOrganization.objects.get_or_create(
user=user,
organization=add_org,
defaults={"keep_until_seen": False},
)
contracts_to_remove = user_contracts_qs.filter(organization__in=no_orgs).all()
if not created and new_membership.keep_until_seen:
new_membership.keep_until_seen = False
new_membership.save()
log.info(
"reconcile_user_orgs: cleared keep_until_seen for user %s org %s",
user.id,
add_org,
)
new_membership, _ = UserOrganization.objects.update_or_create(
user=user,
organization=add_org,
defaults={"keep_until_seen": False},
)

b2b/models.py Outdated
for contract in contracts_qs.all():
user.b2b_contracts.add(contract)

user.save()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary write to the user

users/models.py Outdated
hashed_email = models.CharField(max_length=128)


class UserOrganization(models.Model):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that this model should live in the b2b app since it is specific to that.

Comment on lines 227 to 231
id = serializers.SerializerMethodField()
name = serializers.SerializerMethodField()
description = serializers.SerializerMethodField()
logo = serializers.SerializerMethodField()
slug = serializers.SerializerMethodField()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the exception of logo, these could all be regular typed fields w/ source= specified, e.g. id = serializers.IntegerField(source="organization.id").

pytest.ini Outdated
KEYCLOAK_BASE_URL=http://keycloak/
KEYCLOAK_REALM_NAME=ol-test
KEYCLOAK_CLIENT_ID=mitxonline
KEYCLOAK_DISCOVERY_URL=https://keycloak/realms/ol-local/.well-known/openid-configuration
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably not be an actually resolvable hostname to guarantee that we catch when a request is made that hasn't been mocked.

users/models.py Outdated
Comment on lines 625 to 675
@staticmethod
def process_add_membership(user, organization, *, keep_until_seen=False):
"""
Add a user to an org, and kick off contract processing.
This allows us to manage UserOrganization records without necessarily
being forced to process contract memberships at the same time.
Args:
- user (users.models.User): the user to add
- organization (b2b.models.OrganizationPage): the organization to add the user to
- keep_until_seen (bool): if True, the user will be kept in the org until the
organization is seen in their SSO data.
"""

obj, created = UserOrganization.objects.get_or_create(
user=user,
organization=organization,
)
if created:
obj.keep_until_seen = keep_until_seen
obj.save()
try:
organization.attach_user(user)
except ConnectionError:
log.exception(
"Could not attach %s to Keycloak org for %s", user, organization
)
organization.add_user_contracts(user)

return obj

@staticmethod
def process_remove_membership(user, organization):
"""
Remove a user from an org, and kick off contract processing.
Other side of the process_add_membership function - removes the membership
and associated managed contracts.
Args:
- user (users.models.User): the user to remove
- organization (b2b.models.OrganizationPage): the organization to remove the user from
"""

organization.remove_user_contracts(user)

UserOrganization.objects.filter(
user=user,
organization=organization,
).get().delete()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These methods don't access cls or self so to follow the pattern we typically use these should live in b2b/api.py instead

@jkachel jkachel force-pushed the jkachel/8655-update-keycloak-org-membership-on-enroll branch from 13ad15a to dc8a368 Compare October 6, 2025 15:48
@jkachel
Copy link
Contributor Author

jkachel commented Oct 6, 2025

Last few commits also fix an unrelated N+1 error that was popping up in the test runs.

@jkachel jkachel requested a review from rhysyngsun October 6, 2025 22:26
@jkachel jkachel force-pushed the jkachel/8655-update-keycloak-org-membership-on-enroll branch 2 times, most recently from 809d79f to e34d91b Compare October 9, 2025 13:31
@jkachel
Copy link
Contributor Author

jkachel commented Oct 9, 2025

Requesting re-review on this - I have reworked how the ManyToMany relationship works per request and the tests seem to pass. They don't appear to be running here, though, but as of this comment GH status says webhook performance is degraded, so... maybe eventually they will. Tests pass locally though.

Copy link
Collaborator

@rhysyngsun rhysyngsun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

… attachment points, update org membership in Keycloak where approrpiate
…o AlterField rather than drop a column and create a new one, which seems more straightforward
… a bunch of stuff for that, cleanup elsewhere
made membership_type a char field - we don't allow writing via the serializer anyway; when we remove the integration_type field, membership_type can have the choices instead
…obust

test_discounted_price is somewhat flaky, but can't really ID why, because the factory randomizes some stuff. So, control said randomization; maybe if it flakes out again it'll be easier to debug.
@jkachel jkachel force-pushed the jkachel/8655-update-keycloak-org-membership-on-enroll branch from e5d4f7b to bfe145c Compare October 16, 2025 19:44
@jkachel jkachel merged commit 786e840 into main Oct 16, 2025
8 checks passed
@odlbot odlbot mentioned this pull request Oct 16, 2025
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants