Skip to content

Conversation

imalsogreg
Copy link
Contributor

@imalsogreg imalsogreg commented Aug 27, 2025

This is AI generated code with human guidance for porting a python AI pipeline to existing and planned features of the BAML language.

See notes.md for a summary of the findings.

My own general impression: having been fed a bunch of BAML documentation, Claude Code generated mostly correct BAML. This pipeline didn't demand any major features that are missing from our plans for BAML, except for error handling. Most of the costs of the port come from ergonimics issues (missing std lib functions and db access). And we gained really cheap observability.

Maybe interestingly, I don't think we saw many opportunities for parallelization in this code given our current async rescheduling plan. The main place where we could theoretically parallelize is in a video batch processing step that uses a for loop. We should think about automatic for->parfor transforms - how to implement them and how to inform the user that they're happening.

@imalsogreg imalsogreg temporarily deployed to boundary-tools-dev August 27, 2025 05:46 — with GitHub Actions Inactive
@imalsogreg imalsogreg temporarily deployed to boundary-tools-dev August 27, 2025 05:46 — with GitHub Actions Inactive
@imalsogreg imalsogreg temporarily deployed to boundary-tools-dev August 27, 2025 05:46 — with GitHub Actions Inactive
Copy link

vercel bot commented Aug 27, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
promptfiddle Ready Ready Preview Comment Aug 27, 2025 6:13am

Copy link

🔒 Entelligence AI Vulnerability Scanner

No security vulnerabilities found!

Your code passed our comprehensive security analysis.


Copy link

LGTM 👍

Copy link

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Changes requested ❌

Reviewed everything up to b46ffd1 in 1 minute and 57 seconds. Click for details.
  • Reviewed 1539 lines of code in 10 files
  • Skipped 0 files when reviewing.
  • Skipped posting 6 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. integ-tests/baml_src/ai-content-pipeline/api_functions.baml:5
  • Draft comment:
    Ensure env.API_BASE_URL is normalized to avoid double slashes in URL concatenation.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% The comment falls into the category of speculative issues ("if API_BASE_URL has a trailing slash..."). We don't have evidence that this is actually a problem. Most API clients handle double slashes fine anyway. The comment also starts with "Ensure that..." which is a red flag per the rules. Double slashes in URLs could potentially cause issues with some API servers, and normalizing URLs is generally good practice. While URL normalization is good practice, this comment is speculative without evidence of an actual problem. The rules specifically say not to make speculative comments or ones that start with "Ensure that..." Delete this comment as it's speculative and starts with "Ensure that..." which violates the review rules.
2. integ-tests/baml_src/ai-content-pipeline/api_functions.baml:7
  • Draft comment:
    Consider adding error handling for std::fetch_value calls to gracefully handle API call failures.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% Error handling for API calls is generally important for production code. However, I don't know if std::fetch_value has built-in error handling or if this is test code that intentionally omits error handling. Without knowing the std::fetch_value implementation or the broader context, I can't be certain this is actually an issue. I may be overvaluing the importance of error handling in what could be test code. I also don't know how std::fetch_value behaves when errors occur. While error handling is important, suggesting it without understanding the existing error handling behavior of std::fetch_value could lead to unnecessary code changes. Given the uncertainty about std::fetch_value's behavior and the test nature of the code, this comment should be removed as it may not be necessary or helpful.
3. integ-tests/baml_src/ai-content-pipeline/api_functions.baml:67
  • Draft comment:
    Verify that 'recording.download_url' is not null before using it in query_params.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 20% <= threshold 50% The comment is asking the author to verify a condition, which is against the rules. It doesn't provide a specific suggestion or point out a clear issue with the code. It could be rephrased to suggest a specific code change or to ask for a test to be written.
4. integ-tests/baml_src/ai-content-pipeline/expression_functions.baml:142
  • Draft comment:
    The retry loop in ProcessVideoWithErrorHandling lacks proper exception handling; consider wrapping TryProcessVideo in try/catch.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The code is already handling errors through a status-based approach rather than exceptions. TryProcessVideo is documented to handle exceptions internally. The retry mechanism is working as intended by checking result.status. Adding try/catch would be redundant since error handling is already implemented through a different pattern. I could be wrong about the internal exception handling - the comment on TryProcessVideo says it "would" handle exceptions, implying it might not be implemented yet. Even if TryProcessVideo's implementation is incomplete, the architectural pattern is clearly established to use status codes rather than exceptions for error handling, so adding try/catch would go against the existing design. The comment should be deleted because it suggests adding exception handling to code that intentionally uses a different error handling pattern (status codes).
5. integ-tests/baml_src/ai-content-pipeline/expression_functions.baml:119
  • Draft comment:
    Using 'results = results + [processed_job]' in a loop may be suboptimal; consider using an append/push operation if available.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% The code is in a new BAML file, which appears to be a domain-specific language. We don't have clear documentation about what array operations are available in BAML. While array concatenation in loops is generally inefficient in many languages, we don't know if BAML has append/push operations or if this is the idiomatic way to build arrays in BAML. The comment makes assumptions about available operations. I could be wrong about BAML's capabilities - maybe it does have append operations and this is a legitimate performance concern. The underlying principle about array concatenation performance is generally valid. Without clear documentation about BAML's array operations and performance characteristics, this comment is speculative and may not be actionable. We shouldn't assume typical programming language operations exist in a DSL. The comment should be removed since it makes assumptions about BAML's capabilities and we lack evidence that better alternatives exist in this language.
6. integ-tests/baml_src/ai-content-pipeline/llm_functions.baml:4
  • Draft comment:
    Typo noticed: The client is specified as 'GPT4o'. If the intended value is 'GPT4', please correct the extra 'o' to avoid any confusion.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The consistent usage of 'GPT4o' across multiple functions suggests this is an intentional naming choice, not a typo. Without knowing the codebase's client naming conventions or configuration, we can't assume 'GPT4o' is incorrect. The comment is making an assumption without strong evidence that there's actually an issue. I could be wrong if 'GPT4o' is truly a typo that was copy-pasted multiple times. Also, maybe there's a standard client naming convention I'm unaware of. Even if it were a typo, the consistent usage across the file means this would need to be verified with the author or checked against documentation. We don't have strong evidence it's incorrect. Delete the comment. The consistent usage of 'GPT4o' suggests it's intentional, and we don't have strong evidence that it's incorrect.

Workflow ID: wflow_hwu6YLH2b3WD26aR

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.


function GetCurrentTimestamp() -> string {
// This would need to be implemented as a builtin or API call
return "2024-01-01T00:00:00Z";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetCurrentTimestamp returns a constant value; update this to generate a dynamic current timestamp.

Suggested change
return "2024-01-01T00:00:00Z";
return now();

"status": "pending",
"error_message": None,
"youtube_video_id": None,
"created_at": datetime.now().isoformat(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using timezone‐aware timestamps instead of datetime.now().isoformat() for consistency.

Suggested change
"created_at": datetime.now().isoformat(),
"created_at": datetime.now().astimezone().isoformat(),

Copy link

codecov bot commented Aug 27, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant