Backstage connector #4315

@arslan-autoscout24 arslan-autoscout24 commented Mar 21, 2025

Description

This PR adds a connector for Spotify Backstage.

How Has This Been Tested?

I have tested this with our company's Backstage portal.

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

@arslan-autoscout24 arslan-autoscout24 requested a review from a team as a code owner March 21, 2025 13:25

vercel bot commented Mar 21, 2025

@arslan-autoscout24 is attempting to deploy a commit to the Danswer Team on Vercel.

A member of the Team first needs to authorize it.

@greptile-apps greptile-apps bot left a comment

PR Summary

This PR introduces a new Backstage connector to integrate with Spotify's Backstage service catalog, enabling entity and metadata fetching from Backstage instances with OAuth authentication support.

  • Critical security issue: Hardcoded credentials and URLs for portal.services.as24.tech in test_backstage_connector_real.py must be removed
  • Connector implementation in backend/onyx/connectors/backstage/connector.py needs better error handling for rate limiting and token expiration
  • Missing validation for entity_kinds parameter in BackstageConnector initialization
  • Test coverage in test_backstage_connector.py should be expanded to include pagination and error recovery scenarios
  • Inconsistent parameter naming between test functions in manual test script needs to be standardized
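
The missing `entity_kinds` validation noted above could look roughly like the following. This is a minimal sketch, not the connector's actual code: the helper name `validate_entity_kinds` and the `KNOWN_ENTITY_KINDS` allow-list are hypothetical, and the set of kinds a given Backstage catalog exposes may differ.

```python
from typing import Iterable, List, Optional

# Hypothetical allow-list for illustration; adjust to the kinds your catalog uses.
KNOWN_ENTITY_KINDS = frozenset(
    {"component", "api", "system", "domain", "resource", "group", "user"}
)


def validate_entity_kinds(entity_kinds: Optional[Iterable[str]]) -> List[str]:
    """Return lowercased, de-duplicated kinds, or raise ValueError on bad input."""
    if entity_kinds is None:
        # Assumption: "no filter" means "all known kinds".
        return sorted(KNOWN_ENTITY_KINDS)
    if isinstance(entity_kinds, str):
        # A bare string would otherwise be iterated character by character.
        raise ValueError("entity_kinds must be a list of strings, not a single string")

    normalized: List[str] = []
    for kind in entity_kinds:
        if not isinstance(kind, str) or not kind.strip():
            raise ValueError(f"Invalid entity kind: {kind!r}")
        key = kind.strip().lower()
        if key not in KNOWN_ENTITY_KINDS:
            raise ValueError(f"Unknown entity kind: {kind!r}")
        if key not in normalized:
            normalized.append(key)

    if not normalized:
        raise ValueError("entity_kinds must not be empty")
    return normalized
```

Calling a helper like this from the connector's `__init__` would surface a misconfigured kind at setup time rather than as an empty result set during indexing.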

16 file(s) reviewed, 18 comment(s)

Comment on lines +173 to +175
if (not self.access_token or
    not self.token_expiry or
    datetime.now() + timedelta(minutes=1) >= self.token_expiry):

logic: token_expiry is checked but never set, causing unnecessary token refreshes

Suggested change
- if (not self.access_token or
-     not self.token_expiry or
-     datetime.now() + timedelta(minutes=1) >= self.token_expiry):
+ if (not self.access_token or
+     not self.token_expiry or
+     datetime.now(timezone.utc) + timedelta(minutes=1) >= self.token_expiry):
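
One way to address the "checked but never set" point is to record the expiry whenever a token is stored. The sketch below is illustrative only: the `TokenState` class and its method names are hypothetical, and it assumes the token endpoint returns an `expires_in` value in seconds, as OAuth 2.0 token responses commonly do.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


class TokenState:
    """Holds an access token together with a timezone-aware expiry."""

    def __init__(self) -> None:
        self.access_token: Optional[str] = None
        self.token_expiry: Optional[datetime] = None

    def needs_refresh(self, now: Optional[datetime] = None) -> bool:
        # Same condition as the reviewed code, with a one-minute safety margin.
        now = now or datetime.now(timezone.utc)
        return (
            not self.access_token
            or not self.token_expiry
            or now + timedelta(minutes=1) >= self.token_expiry
        )

    def store(self, access_token: str, expires_in: int,
              now: Optional[datetime] = None) -> None:
        # Setting token_expiry here is what keeps needs_refresh() from
        # returning True on every call.
        now = now or datetime.now(timezone.utc)
        self.access_token = access_token
        self.token_expiry = now + timedelta(seconds=expires_in)
```

Keeping both timestamps timezone-aware matters because Python raises `TypeError` when a naive `datetime` is compared with an aware one.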

retry_after = int(response.headers.get('Retry-After', 5))
sleep_time = min(retry_after, 60) * (2 ** retry_count)
logger.warning(f"Rate limited. Retrying after {sleep_time} seconds")
time.sleep(sleep_time)

syntax: time module is used but not imported

Suggested change
- time.sleep(sleep_time)
+ import time
+ from datetime import datetime, timedelta, timezone
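
For reference, the rate-limit handling quoted above can be written as a small self-contained helper with `time` imported at module level. This is a sketch under assumptions: `backoff_seconds` and `sleep_for_rate_limit` are hypothetical names, and the fallback for a non-integer `Retry-After` (the header may also carry an HTTP date) is an addition that is not in the reviewed code.

```python
import time
from typing import Mapping


def backoff_seconds(headers: Mapping[str, str], retry_count: int,
                    default: int = 5, cap: int = 60) -> int:
    """Compute the delay as min(Retry-After, cap) * 2**retry_count."""
    raw = headers.get("Retry-After")
    try:
        retry_after = int(raw) if raw is not None else default
    except ValueError:
        # Retry-After may be an HTTP date; fall back to the default delay.
        retry_after = default
    return min(max(retry_after, 0), cap) * (2 ** retry_count)


def sleep_for_rate_limit(headers: Mapping[str, str], retry_count: int) -> int:
    """Sleep for the computed delay and return it, e.g. for logging."""
    delay = backoff_seconds(headers, retry_count)
    time.sleep(delay)
    return delay
```

Note that only the `Retry-After` value is capped, so the total delay still doubles with every retry; bounding `retry_count` in the caller keeps the wait finite.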


# Example of how to use the connector with proper error handling
try:
connector = BackstageConnector("https://portal.services.as24.tech/")

logic: Production credentials are hardcoded in example code

Suggested change
- connector = BackstageConnector("https://portal.services.as24.tech/")
+ connector = BackstageConnector("https://backstage.example.com/")
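
An alternative to a placeholder URL is to read the base URL from the environment so that no instance-specific value lives in the test script. A minimal sketch, assuming a hypothetical `BACKSTAGE_BASE_URL` variable and helper name, neither of which is defined by this PR:

```python
import os
from typing import Mapping, Optional


def backstage_base_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Read the Backstage base URL from the environment, with a trailing slash."""
    env = os.environ if env is None else env
    url = env.get("BACKSTAGE_BASE_URL", "").strip()
    if not url:
        raise RuntimeError("Set BACKSTAGE_BASE_URL before running this script")
    if not url.startswith(("https://", "http://")):
        raise ValueError(f"BACKSTAGE_BASE_URL must include a scheme, got {url!r}")
    # Normalize to exactly one trailing slash.
    return url.rstrip("/") + "/"
```

The manual test could then construct the connector with `BackstageConnector(backstage_base_url())`, and the same pattern applies to the client ID and secret.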

{
type: "text",
query: "Enter the base URL:",
label: "Base URL",

style: Add description field to explain expected format (e.g., 'https://backstage.example.com')

@@ -344,13 +351,19 @@ export const credentialTemplates: Record<ValidSources, any> = {
not_applicable: null,
ingestion_api: null,
discord: { discord_bot_token: "" } as DiscordCredentialJson,
backstage: {
backstage_client_id : "",

style: extra space before the colon in backstage_client_id

Suggested change
- backstage_client_id : "",
+ backstage_client_id: "",

@@ -379,6 +379,7 @@ export enum ValidSources {
Egnyte = "egnyte",
Airtable = "airtable",
Gitbook = "gitbook",
Backstage = "backstage",

style: Backstage source was added but not included in validAutoSyncSources array - consider if it should support auto-sync

evan-onyx and others added 27 commits April 10, 2025 01:28
* don't yield expected auth errors

* only catch 403s
* bump fastapi and starlette

* bumping llama index and nltk and associated deps

* bump to fix python-multipart

* bump aiohttp

* update package lock for examples/widget

* bump black

* sentencesplitter has changed namespaces

* fix reorder import check, fix missing passlib

* update package-lock.json

* black formatter updated

* reformatted again

* change to black compatible reorder

* change to black compatible reorder-python-imports fork

* fix pytest dependency

* black format again

* we don't need cdk.txt. update packages to be consistent across all packages

---------

Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>
Co-authored-by: Richard Kuo <rkuo@rkuo.com>
… the db or do any work. (onyx-dot-app#4498)

Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>
* updating more packages

* mypy fixes

---------

Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>
* refactor salesforce sqlite db access

* more refactoring

* refactor again

* refactor again

* rename object

* add finalizer to ensure db connection is always closed

* avoid unnecessarily nesting connections and commit regularly when possible

---------

Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>
…-app#4503)

* ensure individual search tool runs do not affect each other

* small bug fixes

* nit
* address file path

* k

* update

* update

* nit- fix typing

* k

* should path

* in a good state

* k

* k

* clean up file

* update

* update

* k

* k

* k
* initial working version

* ranking profile

* modification for keyword/instruction retrieval

* mypy fixes

* EL comments

* added env var (True for now)

* flipped default to False

* mypy & final EL/CW comments + import issue
* minor cleanup

* cleanup doc deduping and add unit tests
* Fix default log level

* fix
* refactor to use stricter typing

* older version of ruff
* update

* fix

* finalize`

* remove unnecessary prints

* fix

* k
* rollback properly on exception

* rollback on exception

* don't continue if we can't set the search path

* cleaner handling via context manager

---------

Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
rkuo-danswer and others added 28 commits June 5, 2025 17:52
* add percentage progress

* range checking

* formatting

* for new channels, skip them if the most recent messages are all from bots

* comments

* bypass bot channels

* code review

---------

Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>
* Adjust migration

* update default in form

* Add cloud indices for bfloat16

* Update backend/shared_configs/configs.py

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* Update vespa schema gen script

* Move embedding configs

* Remove unused imports

* remove import from shared configs

* Remove unused model

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* Fixed indexing when no sites are specificed

* Added test for Sharepoint all sites index

* Accounted for paginated results.

* Typing

* Typing

---------

Co-authored-by: Wenxi Onyx <wenxi-onyx@Wenxis-MacBook-Pro.local>
* remove Hagen from CONTRIBUTING.md

* fix slack invite url

* fix second slack invite
* Add error clarity to restart containers script

* erroneous cleanup on exit

* space

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

---------

Co-authored-by: Wenxi Onyx <wenxi-onyx@Wenxis-MacBook-Pro.local>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* Add error clarity to restart containers script

* erroneous cleanup on exit

* fix when starting containers for the first time

---------

Co-authored-by: Wenxi Onyx <wenxi-onyx@Wenxis-MacBook-Pro.local>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* db setup

* transfer 1 - incomplete

* more adjustments

* relationship table + query update

* temp view creation

* restructuring

* nits

* updates

* separate read_only engine

* extraction revamp

* focus on metadata relatonships 1

* dev

* migration downgrade fix

* rebase migration change

* a3+

* progress

* base

* new extraction

* progress

* fixed KG extraction

* nits

* updates

* simplifications & cleanup

* fixes

* updates

* more feature flag checks

* fixes

* extraction process fix

* read-only user creation as part of setup

* fix for missing entity attributes

* kg read-only user creation as part of migration

* typo

* EL initial comments

* initial Account/SF Connector chnges

* SF Connector update

 - include account information

* base w/ salesforce

* evan updates + quite a bit more

* kg-filtered search

* EL changes pt 2

* migrations and env vars

* quick migration fix

* migration update

* post_rebase fixes

* mypy fixes

* test fixes

* test fix

* test fix

* read_only pool + misc

* nf

* env vars

* test improvements

* salesforce fix

* test update

* small changes

* small adjustments

* SF Connector fix & kg_stage removal for one table

* mypy fix

* small fixes

* EL + RK (pt 1) comments

* nit

* setting updated

* Salesforce test update

* EL comments

* read-only user replacement & cleanup

* SQL View fix

* converting entity type-name separators

* sql view group ownership

* view fix

* SQL tweak

* dealing with docs that were skipped by indexing

* increased error handling

* more error handling

* Output formatting fix

* kg-incremental-reindexing

* 0-doc found improvement

* celery

* migration correction

* timeout adjustments

* nit

* Updated migration

* Entity Normalization for KG Dev 1 (onyx-dot-app#4746)

* feat: trigrams column

* fix: reranking and db

* feat: v1

* fix: convert to orm

* feat: parallel

* fix: default to id_name

* fix: renamed semantic_id and semantic_id_trigrams

* fix: scalar subquery

* fix: tuning + redundancy

* fix: threshold

* fix: typo

* fix: shorten names

* wip

* fix: reverted

* feat: config

* feat: works but it was dumb

* feat: clustering works

* fix: mypy

* normalization <-> language awareness for SQL generation

* small type fixes

---------

Co-authored-by: joachim-danswer <joachim@danswer.ai>

* mypy

* typo and dead code

* kg_time_fencing

* feat: remove temp views on migration downgrade

* remove functions and triggers for now

* rebase adjustments

* EL code review results

* quick fix + trigger/funcs for single tenant

* fix: typo, mypy, dead code

* fix: autoflake

* small updatesd

* nit

* fix: typo

* early + faster view creation

* Extension creation in MT migration

* nit changes to default ETs

* Incremental Clustering and KG Refactor V1 (onyx-dot-app#4784)

Optimized/restructured incremental clustering. New pipeline actually that moves vespa updates to clustering.
Also, celery configuration has been updated.
---------

Co-authored-by: joachim-danswer <joachim@danswer.ai>

* prompt tweak & ET extraction reset

* more general hierarchical structure

* feat: better vespa reset logic

* prompt optimization and entity replacemants

* small prompt changes

* KG Refactor V2 (onyx-dot-app#4814)

Clustering & Extraction improvements & various nits 

Co-authored-by: joachim-danswer <joachim@danswer.ai>

* add connector-level coverage days

* fix: nit

* initial  EL responses

* refactor: helper functions for formatting

* fix: more helper fns & comments

* fix: comment code that's been implemented elsewhere

* fix: tenant_id missing arg

* fix: removed debugging stuff

* fix: moved kg_interactions db query to helper fn

* fix: tenant_id

* fix: tenant_id & removed outdated helper fn

* fix always set entity class

* fix: typo

* fix alembic heads

* fix: celery logging

* fix: migrations fix

* fix: multi tenant permissions

* fix: temp connector fix

* fix: downgrade

* Fix upgrade migration

* fix: tenant for normalization

* added additional acl

* stray EL comments

* fix: connector test

* fix mypy

* fix: temporary connector test fix

* fix: jira connector test

* nit

* small nits

* fix: black

* fix: mypy

* fix: mypy

---------

Co-authored-by: Rei Meguro <36625832+Orbital-Web@users.noreply.github.com>
Reflexion Flow
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
…ials interface and update related references
…onnector support

- Resolved conflicts in backend/onyx/configs/constants.py - preserved both HIGHSPOT and BACKSTAGE constants
- Resolved conflicts in backend/onyx/connectors/factory.py - preserved both connector mappings
- Resolved conflicts in web/src/components/icons/icons.tsx - preserved both icon imports
- Resolved conflicts in web/src/lib/connectors/credentials.ts - preserved both credential interfaces and display names
- Resolved conflicts in web/src/lib/sources.ts - preserved both source metadata entries
- Resolved conflicts in web/src/lib/types.ts - preserved both source enum values

All connector functionality maintained for both Highspot and Backstage.
@arslan-autoscout24 arslan-autoscout24 deleted the backstage-connector branch June 8, 2025 09:50