Backstage connector #4315
@arslan-autoscout24 is attempting to deploy a commit to the Danswer Team on Vercel. A member of the Team first needs to authorize it.
PR Summary
This PR introduces a new Backstage connector to integrate with Spotify's Backstage service catalog, enabling entity and metadata fetching from Backstage instances with OAuth authentication support.
- Critical security issue: Hardcoded credentials and URLs for portal.services.as24.tech in test_backstage_connector_real.py must be removed
- Connector implementation in backend/onyx/connectors/backstage/connector.py needs better error handling for rate limiting and token expiration
- Missing validation for entity_kinds parameter in BackstageConnector initialization
- Test coverage in test_backstage_connector.py should be expanded to include pagination and error recovery scenarios
- Inconsistent parameter naming between test functions in manual test script needs to be standardized
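The missing entity_kinds validation mentioned in the summary could be handled along the following lines. This is only a sketch: the helper name validate_entity_kinds and the allow-list KNOWN_ENTITY_KINDS are hypothetical and not taken from the PR, and since Backstage also permits custom kinds, a real connector might prefer to warn rather than reject.

```python
from typing import Iterable, List, Optional

# Assumption: the built-in Backstage catalog kinds, lowercased.
# Custom kinds exist too, so this allow-list is illustrative only.
KNOWN_ENTITY_KINDS = frozenset(
    {
        "component",
        "api",
        "resource",
        "system",
        "domain",
        "group",
        "user",
        "location",
        "template",
    }
)


def validate_entity_kinds(entity_kinds: Optional[Iterable[str]]) -> List[str]:
    """Hypothetical helper: normalize entity kinds and reject bad input early."""
    if entity_kinds is None:
        # No filter given: fall back to every known kind.
        return sorted(KNOWN_ENTITY_KINDS)
    if isinstance(entity_kinds, str):
        # A bare string would otherwise be iterated character by character.
        raise TypeError("entity_kinds must be a list of strings, not a string")

    normalized: List[str] = []
    for kind in entity_kinds:
        if not isinstance(kind, str) or not kind.strip():
            raise ValueError(f"Invalid entity kind: {kind!r}")
        value = kind.strip().lower()
        if value not in KNOWN_ENTITY_KINDS:
            raise ValueError(f"Unknown entity kind: {kind!r}")
        if value not in normalized:  # drop duplicates, keep order
            normalized.append(value)

    if not normalized:
        raise ValueError("entity_kinds must not be empty")
    return normalized
```

Calling such a helper from the connector's initializer would surface a misconfigured kind at setup time instead of as an empty result set during indexing.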
16 file(s) reviewed, 18 comment(s)
if (not self.access_token or
        not self.token_expiry or
        datetime.now() + timedelta(minutes=1) >= self.token_expiry):
logic: token_expiry is checked but never set, causing unnecessary token refreshes
Suggested change:
- if (not self.access_token or
-         not self.token_expiry or
-         datetime.now() + timedelta(minutes=1) >= self.token_expiry):
+ if (not self.access_token or
+         not self.token_expiry or
+         datetime.now(timezone.utc) + timedelta(minutes=1) >= self.token_expiry):
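The review comment above says token_expiry is checked but never set. A minimal sketch of a cache that records the expiry on every refresh might look like this. TokenCache and the fetch_token callback (assumed to return an access token plus a lifetime in seconds) are hypothetical names for illustration, not the connector's actual code.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional, Tuple


class TokenCache:
    """Hypothetical helper: caches an OAuth token until shortly before expiry."""

    def __init__(
        self,
        fetch_token: Callable[[], Tuple[str, int]],
        skew: timedelta = timedelta(minutes=1),
    ) -> None:
        # Assumption: fetch_token returns (access_token, expires_in_seconds).
        self._fetch_token = fetch_token
        self._skew = skew
        self.access_token: Optional[str] = None
        self.token_expiry: Optional[datetime] = None

    def get_token(self, now: Optional[datetime] = None) -> str:
        # Use timezone-aware UTC so comparisons never mix naive and aware values.
        now = now or datetime.now(timezone.utc)
        if (
            not self.access_token
            or not self.token_expiry
            or now + self._skew >= self.token_expiry
        ):
            token, expires_in = self._fetch_token()
            self.access_token = token
            # Record the expiry so later calls can reuse the cached token.
            self.token_expiry = now + timedelta(seconds=expires_in)
        return self.access_token
```

Passing now explicitly is only there to make the behaviour easy to test; production callers would omit it.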
retry_after = int(response.headers.get('Retry-After', 5))
sleep_time = min(retry_after, 60) * (2 ** retry_count)
logger.warning(f"Rate limited. Retrying after {sleep_time} seconds")
time.sleep(sleep_time)
syntax: time module is used but not imported
Suggested change:
- time.sleep(sleep_time)
+ import time
+ from datetime import datetime, timedelta, timezone
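Separately from the missing import, the retry delay in the snippet above grows without bound as retry_count increases, and int() raises if the server sends Retry-After as an HTTP-date. The sketch below shows one possible alternative, not the PR's formula: it applies the cap after the exponential factor and falls back to a default when the header is not an integer. The function name backoff_seconds and its parameters are assumptions.

```python
from typing import Optional


def backoff_seconds(
    retry_after: Optional[str],
    retry_count: int,
    default: int = 5,
    max_sleep: int = 60,
) -> int:
    """Hypothetical helper: compute a capped exponential backoff delay."""
    try:
        base = int(retry_after) if retry_after is not None else default
    except ValueError:
        # Retry-After may also be an HTTP-date; this sketch just uses the default.
        base = default
    base = max(base, 1)  # guard against zero or negative header values
    # Cap the final delay so repeated retries never sleep longer than max_sleep.
    return min(base * (2 ** retry_count), max_sleep)
```

A caller would then do something like time.sleep(backoff_seconds(response.headers.get("Retry-After"), retry_count)), with import time at the top of the module.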
# Example of how to use the connector with proper error handling
try:
    connector = BackstageConnector("https://portal.services.as24.tech/")
logic: Production credentials are hardcoded in example code
Suggested change:
- connector = BackstageConnector("https://portal.services.as24.tech/")
+ connector = BackstageConnector("https://backstage.example.com/")
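Beyond swapping in a placeholder URL, example and test scripts could read the instance URL from the environment so that no real host ends up in the repository. The following is a sketch under assumptions: the variable name BACKSTAGE_BASE_URL and the helper load_backstage_base_url are hypothetical and do not appear in the PR.

```python
import os
from typing import Mapping, Optional


def load_backstage_base_url(
    env: Optional[Mapping[str, str]] = None,
    var_name: str = "BACKSTAGE_BASE_URL",  # assumed variable name
) -> str:
    """Hypothetical helper: read the Backstage base URL from the environment."""
    env = os.environ if env is None else env
    value = env.get(var_name, "").strip()
    if not value:
        raise RuntimeError(f"Set {var_name} to your Backstage instance URL")
    if not value.startswith(("http://", "https://")):
        raise ValueError(f"{var_name} must start with http:// or https://")
    # Normalize away the trailing slash so path joins stay predictable.
    return value.rstrip("/")
```

The example would then construct the connector with the returned value rather than a literal URL.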
{
  type: "text",
  query: "Enter the base URL:",
  label: "Base URL",
style: Add description field to explain expected format (e.g., 'https://backstage.example.com')
@@ -344,13 +351,19 @@ export const credentialTemplates: Record<ValidSources, any> = {
   not_applicable: null,
   ingestion_api: null,
   discord: { discord_bot_token: "" } as DiscordCredentialJson,
+  backstage: {
+    backstage_client_id : "",
style: extra space before the colon after backstage_client_id
Suggested change:
- backstage_client_id : "",
+ backstage_client_id: "",
@@ -379,6 +379,7 @@ export enum ValidSources {
   Egnyte = "egnyte",
   Airtable = "airtable",
   Gitbook = "gitbook",
+  Backstage = "backstage",
style: Backstage source was added but not included in validAutoSyncSources array - consider if it should support auto-sync
* don't yield expected auth errors * only catch 403s
* bump fastapi and starlette * bumping llama index and nltk and associated deps * bump to fix python-multipart * bump aiohttp * update package lock for examples/widget * bump black * sentencesplitter has changed namespaces * fix reorder import check, fix missing passlib * update package-lock.json * black formatter updated * reformatted again * change to black compatible reorder * change to black compatible reorder-python-imports fork * fix pytest dependency * black format again * we don't need cdk.txt. update packages to be consistent across all packages --------- Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app> Co-authored-by: Richard Kuo <rkuo@rkuo.com>
… the db or do any work. (onyx-dot-app#4498) Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>
* updating more packages * mypy fixes --------- Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>
* refactor salesforce sqlite db access * more refactoring * refactor again * refactor again * rename object * add finalizer to ensure db connection is always closed * avoid unnecessarily nesting connections and commit regularly when possible --------- Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>
…-app#4503) * ensure individual search tool runs do not affect each other * small bug fixes * nit
* address file path * k * update * update * nit- fix typing * k * should path * in a good state * k * k * clean up file * update * update * k * k * k
* initial working version * ranking profile * modification for keyword/instruction retrieval * mypy fixes * EL comments * added env var (True for now) * flipped default to False * mypy & final EL/CW comments + import issue
* minor cleanup * cleanup doc deduping and add unit tests
* Fix default log level * fix
* refactor to use stricter typing * older version of ruff
* update * fix * finalize` * remove unnecessary prints * fix * k
* rollback properly on exception * rollback on exception * don't continue if we can't set the search path * cleaner handling via context manager --------- Co-authored-by: Richard Kuo (Danswer) <rkuo@onyx.app>
* add percentage progress * range checking * formatting * for new channels, skip them if the most recent messages are all from bots * comments * bypass bot channels * code review --------- Co-authored-by: Richard Kuo (Onyx) <rkuo@onyx.app>
* Adjust migration * update default in form * Add cloud indices for bfloat16 * Update backend/shared_configs/configs.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Update vespa schema gen script * Move embedding configs * Remove unused imports * remove import from shared configs * Remove unused model --------- Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* Fixed indexing when no sites are specificed * Added test for Sharepoint all sites index * Accounted for paginated results. * Typing * Typing --------- Co-authored-by: Wenxi Onyx <wenxi-onyx@Wenxis-MacBook-Pro.local>
* remove Hagen from CONTRIBUTING.md * fix slack invite url * fix second slack invite
* Add error clarity to restart containers script * erroneous cleanup on exit * space Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> --------- Co-authored-by: Wenxi Onyx <wenxi-onyx@Wenxis-MacBook-Pro.local> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* Add error clarity to restart containers script * erroneous cleanup on exit * fix when starting containers for the first time --------- Co-authored-by: Wenxi Onyx <wenxi-onyx@Wenxis-MacBook-Pro.local>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* db setup * transfer 1 - incomplete * more adjustments * relationship table + query update * temp view creation * restructuring * nits * updates * separate read_only engine * extraction revamp * focus on metadata relatonships 1 * dev * migration downgrade fix * rebase migration change * a3+ * progress * base * new extraction * progress * fixed KG extraction * nits * updates * simplifications & cleanup * fixes * updates * more feature flag checks * fixes * extraction process fix * read-only user creation as part of setup * fix for missing entity attributes * kg read-only user creation as part of migration * typo * EL initial comments * initial Account/SF Connector chnges * SF Connector update - include account information * base w/ salesforce * evan updates + quite a bit more * kg-filtered search * EL changes pt 2 * migrations and env vars * quick migration fix * migration update * post_rebase fixes * mypy fixes * test fixes * test fix * test fix * read_only pool + misc * nf * env vars * test improvements * salesforce fix * test update * small changes * small adjustments * SF Connector fix & kg_stage removal for one table * mypy fix * small fixes * EL + RK (pt 1) comments * nit * setting updated * Salesforce test update * EL comments * read-only user replacement & cleanup * SQL View fix * converting entity type-name separators * sql view group ownership * view fix * SQL tweak * dealing with docs that were skipped by indexing * increased error handling * more error handling * Output formatting fix * kg-incremental-reindexing * 0-doc found improvement * celery * migration correction * timeout adjustments * nit * Updated migration * Entity Normalization for KG Dev 1 (onyx-dot-app#4746) * feat: trigrams column * fix: reranking and db * feat: v1 * fix: convert to orm * feat: parallel * fix: default to id_name * fix: renamed semantic_id and semantic_id_trigrams * fix: scalar subquery * fix: tuning + redundancy * fix: threshold * fix: typo * fix: shorten names * wip * 
fix: reverted * feat: config * feat: works but it was dumb * feat: clustering works * fix: mypy * normalization <-> language awareness for SQL generation * small type fixes --------- Co-authored-by: joachim-danswer <joachim@danswer.ai> * mypy * typo and dead code * kg_time_fencing * feat: remove temp views on migration downgrade * remove functions and triggers for now * rebase adjustments * EL code review results * quick fix + trigger/funcs for single tenant * fix: typo, mypy, dead code * fix: autoflake * small updatesd * nit * fix: typo * early + faster view creation * Extension creation in MT migration * nit changes to default ETs * Incremental Clustering and KG Refactor V1 (onyx-dot-app#4784) Optimized/restructured incremental clustering. New pipeline actually that moves vespa updates to clustering. Also, celery configuration has been updated. --------- Co-authored-by: joachim-danswer <joachim@danswer.ai> * prompt tweak & ET extraction reset * more general hierarchical structure * feat: better vespa reset logic * prompt optimization and entity replacemants * small prompt changes * KG Refactor V2 (onyx-dot-app#4814) Clustering & Extraction improvements & various nits Co-authored-by: joachim-danswer <joachim@danswer.ai> * add connector-level coverage days * fix: nit * initial EL responses * refactor: helper functions for formatting * fix: more helper fns & comments * fix: comment code that's been implemented elsewhere * fix: tenant_id missing arg * fix: removed debugging stuff * fix: moved kg_interactions db query to helper fn * fix: tenant_id * fix: tenant_id & removed outdated helper fn * fix always set entity class * fix: typo * fix alembic heads * fix: celery logging * fix: migrations fix * fix: multi tenant permissions * fix: temp connector fix * fix: downgrade * Fix upgrade migration * fix: tenant for normalization * added additional acl * stray EL comments * fix: connector test * fix mypy * fix: temporary connector test fix * fix: jira connector test * nit 
* small nits * fix: black * fix: mypy * fix: mypy --------- Co-authored-by: Rei Meguro <36625832+Orbital-Web@users.noreply.github.com>
Reflexion Flow
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
…ials interface and update related references
… token before requests
…onnector support
- Resolved conflicts in backend/onyx/configs/constants.py - preserved both HIGHSPOT and BACKSTAGE constants
- Resolved conflicts in backend/onyx/connectors/factory.py - preserved both connector mappings
- Resolved conflicts in web/src/components/icons/icons.tsx - preserved both icon imports
- Resolved conflicts in web/src/lib/connectors/credentials.ts - preserved both credential interfaces and display names
- Resolved conflicts in web/src/lib/sources.ts - preserved both source metadata entries
- Resolved conflicts in web/src/lib/types.ts - preserved both source enum values
All connector functionality maintained for both Highspot and Backstage.
Description
This PR adds a connector for Spotify's Backstage.
How Has This Been Tested?
I have tested this against our company's Backstage portal.
Backporting (check the box to trigger backport action)
Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.