feat: support new metrics firehose api with get_usage()
#404
@@ -818,6 +818,33 @@ Connect <- R6::R6Class(
      self$GET(path, query = query)
    },

    #' @description Get content usage data.
    #' @param from Optional `Date` or `POSIXt`; start of the time window. If a
    #' `Date`, coerced to `YYYY-MM-DDT00:00:00` in the caller's time zone.
    #' @param to Optional `Date` or `POSIXt`; end of the time window. If a
    #' `Date`, coerced to `YYYY-MM-DDT23:59:59` in the caller's time zone.
I'm not sure this is doing what we expect with timezones. And actually, I'm not even totally sure what is intended. So maybe we should work that out here and then adapt the code to fit?

When we send timestamps to Connect with this function, do we want them to be transformed to UTC from the caller's local timezone before being sent? Or some other behavior?

One thing to note: If I'm reading this correctly,
Lines 16 to 28 in e8c8075

We discussed this in the libraries guild. We discussed the tradeoffs between (1) having time-less dates be processed in the timezone of the server, (2) processing them in UTC, or (3) requiring that the caller specify a time. We could see arguments for all sides, but I think as a group we leaned towards (2) or (3): if we allow bare dates, process in UTC; perhaps just require timestamps and force the user to write the date-selection part of the code. That last option might make more sense, as converting a date to a timestamp should be closer to the UI. @karawoo / @tdstein / @marcosnav can correct me if I got anything wrong here.

I removed special handling from this function and deferred to
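To make option (2) from the discussion above concrete, here is a minimal sketch that treats bare dates as an inclusive UTC window. `date_window_utc()` and `format_utc()` are hypothetical helpers for illustration, not functions in connectapi, and the ISO 8601 string format is an assumption about what the server accepts.

```r
# Hypothetical helper (not part of connectapi): expand bare dates into an
# inclusive UTC window, independent of the caller's local time zone.
date_window_utc <- function(from, to) {
  stopifnot(inherits(from, "Date"), inherits(to, "Date"))
  list(
    from = as.POSIXct(paste(from, "00:00:00"), tz = "UTC"),
    to = as.POSIXct(paste(to, "23:59:59"), tz = "UTC")
  )
}

# Assumed wire format for illustration: ISO 8601 with a trailing "Z".
format_utc <- function(x) format(x, "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")

win <- date_window_utc(as.Date("2025-04-30"), as.Date("2025-04-30"))
from_str <- format_utc(win$from)
to_str <- format_utc(win$to)
```

By contrast, `as.POSIXct(paste(from, "00:00:00"))` without a `tz` argument interprets the string in the session's time zone, so the same call can denote different instants on machines in different zones.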

    inst_content_hits = function(from = NULL, to = NULL) {
I don't think this method name benefits from having the

More of a philosophical musing, but I also am not sure it's worth packing more things into methods on the

Agreed that I don't think the method name at all benefits from

I would also be completely fine with beginning to move everything into the functions; I agree that I don't think we get much benefit at all from having the logic in the R6 object. In fact, I think it adds confusion: it was hard for me to decide where we want different bits of logic to live, i.e. within the method or the outer function. It's kind of arbitrary. I guess you could argue that the goal of the methods is to enumerate the available endpoints that operate on different objects, but it feels like a pretty verbose way to do that.

I merged all of the implementation from the

      error_if_less_than(self$version, "2025.04.0")

      # If this is called with date objects with no timestamp attached, it's
      # reasonable to assume that the caller is indicating the days as an
      # inclusive range.
      if (inherits(from, "Date")) {
        from <- as.POSIXct(paste(from, "00:00:00"))
      }
      if (inherits(to, "Date")) {
        to <- as.POSIXct(paste(to, "23:59:59"))
      }

      self$GET(
        v1_url("instrumentation", "content", "hits"),
        query = list(
          from = make_timestamp(from),
          to = make_timestamp(to)
        )
      )
    },

    #' @description Get running processes.
    procs = function() {
      warn_experimental("procs")

@@ -58,15 +58,19 @@ ensure_column <- function(data, default, name) {
      # manual fix because vctrs::vec_cast cannot cast double -> datetime or char -> datetime
      col <- coerce_datetime(col, default, name = name)
    }

    if (inherits(default, "fs_bytes") && !inherits(col, "fs_bytes")) {
      col <- coerce_fsbytes(col, default)
    }

    if (inherits(default, "integer64") && !inherits(col, "integer64")) {
      col <- bit64::as.integer64(col)
    }

    if (inherits(default, "list") && !inherits(col, "list")) {
      col <- list(col)
    }

    col <- vctrs::vec_cast(col, default, x_arg = name)
  }
  data[[name]] <- col

@@ -101,6 +105,65 @@ parse_connectapi <- function(data) {
  ))
}

# nolint start
What linting are we escaping here?

Commented code maybe?

Yeah, commented code, it's not

toph-allen marked this conversation as resolved.

# Unnests a list column similarly to `tidyr::unnest_wider()`, bringing the
# entries of each list-item up to the top level. Makes some simplifying
# assumptions for the sake of performance:
#   1. All inner variables are treated as character vectors;
#   2. The names of the first entry of the list-column are used as the
#      names of variables to extract.
# Performance example:
# > nrow(x_raw)
# [1] 373632
# > nrow(x_raw)
# [1] 373632
# > t_tidyr <- system.time(
# +   x_tidyr <- tidyr::unnest_wider(x_raw, data)
# + )
# > t_custom <- system.time(
# +   x_custom <- fast_unnest_character(x_raw, "data")
# + )
# > identical(x_tidyr, x_custom)
# [1] TRUE
# > t_tidyr
#    user  system elapsed
#   7.018   0.137   7.172
# > t_custom
#    user  system elapsed
#   0.281   0.005   0.285
# nolint end
fast_unnest_character <- function(df, col_name) {
  if (!is.character(col_name)) {
    stop("col_name must be a character vector")
  }
  if (!col_name %in% names(df)) {
    stop("col_name is not present in df")
  }

  list_col <- df[[col_name]]

  new_cols <- names(list_col[[1]])

  df2 <- df
  for (col in new_cols) {
    df2[[col]] <- vapply(
      list_col,
      function(row) {
        if (is.null(row[[col]])) {
          NA_character_
        } else {
          row[[col]]
        }
      },
      "1",
      USE.NAMES = FALSE
    )
  }

  df2[[col_name]] <- NULL
  df2
}

toph-allen marked this conversation as resolved.
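A small usage sketch for `fast_unnest_character()`. The definition is copied from the diff so the block runs on its own; the `hits` data frame is made-up sample data, loosely modeled on the shape of the hits response.

```r
# Copy of fast_unnest_character() from the diff, so this block is standalone.
fast_unnest_character <- function(df, col_name) {
  if (!is.character(col_name)) {
    stop("col_name must be a character vector")
  }
  if (!col_name %in% names(df)) {
    stop("col_name is not present in df")
  }
  list_col <- df[[col_name]]
  new_cols <- names(list_col[[1]])
  df2 <- df
  for (col in new_cols) {
    df2[[col]] <- vapply(
      list_col,
      function(row) {
        if (is.null(row[[col]])) NA_character_ else row[[col]]
      },
      "1",
      USE.NAMES = FALSE
    )
  }
  df2[[col_name]] <- NULL
  df2
}

# Made-up sample data: a list-column named "data" with one NULL entry.
hits <- data.frame(id = c(1, 2))
hits$data <- list(
  list(path = "/hello", user_agent = "Datadog/Synthetics"),
  list(path = "/world", user_agent = NULL)
)

# The list-column is replaced by one character column per inner name.
flat <- fast_unnest_character(hits, "data")
```

Because the column names come only from the first list entry (assumption 2 in the comment above), a key that appears only in later entries would be silently dropped.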
Thinking about this a bit more: this isn't a huge chunk of code of course, but it is another chunk that we will take on the maintenance of if we go this route.

This is another example where having our data interchange within connectapi be all data frames means we have to worry about the performance of JSON-parsed list responses into data frames and make sure those data frames are in a natural structure for folks to use. If we relied instead on only the parsed list data as our interchange and then gave folks

I can see what you mean, and this does align with what you've been saying about other server objects.

I hear what you're saying about parsing to data frames for data interchange and I think that approach would be great to use for, say, the integrations endpoints that I just added stories for. For the data from the hits endpoint, presenting it as anything other than a data frame goes back to feeling kinda not-R-idiomatic, as it isn't data that can… hmm…

So definitely one of the tasks, and maybe the main task that I can imagine for this data is to, like, treat it as a data frame and filter, plot, etc., it. But another thing you might want to do is, like, get the content item associated with this hit. And yeah, in that case, you might just want to be able to pass the

I still think we might want to keep code like this around in an

Open to a variety of options. Let's discuss what the best approach would be to finalize and merge this PR.

Discussed this in the Libraries Guild meeting. Some takeaways:

Calling

I think these are the approaches open to us, in order of… least to most code in the package:

Maybe an option that splits the difference somewhat would be to have the function return a list, provide an

@karawoo I took this approach. Take a look and see how it reads to you!
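As a sketch of what a "return a list, convert on demand" design could look like in R: the parsed records stay a plain list with a class attribute, and an S3 `as.data.frame()` method flattens them when a data frame is wanted. The class name `example_hits`, the constructor, and the `chr_field()` helper are all hypothetical, and this is not claimed to match what the PR merged.

```r
# Hypothetical constructor: tag the parsed list so S3 dispatch can find it.
new_example_hits <- function(records) {
  structure(records, class = c("example_hits", "list"))
}

# Hypothetical helper: pull one field from each record as character,
# returning NA when the field is absent or NULL.
chr_field <- function(records, getter) {
  vapply(
    records,
    function(r) {
      value <- getter(r)
      if (is.null(value)) NA_character_ else as.character(value)
    },
    character(1),
    USE.NAMES = FALSE
  )
}

# S3 method: flatten the records only when a data frame is requested.
as.data.frame.example_hits <- function(x, ...) {
  records <- unclass(x)
  data.frame(
    id = chr_field(records, function(r) r$id),
    content_guid = chr_field(records, function(r) r$content_guid),
    path = chr_field(records, function(r) r$data$path),
    stringsAsFactors = FALSE
  )
}

hits <- new_example_hits(list(
  list(id = 8966707, content_guid = "475618c9", data = list(path = "/hello")),
  list(id = 8966214, content_guid = "b0eaf295", data = list(path = NULL))
))
hits_df <- as.data.frame(hits)
```

The list form keeps each record intact for callers that want to pass a single hit elsewhere, while the conversion cost is paid only by callers that ask for a data frame.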

coerce_fsbytes <- function(x, to, ...) {
  if (is.numeric(x)) {
    fs::as_fs_bytes(x)

@@ -0,0 +1,52 @@
[
  {
    "id": 8966707,
    "user_guid": null,
    "content_guid": "475618c9",
    "timestamp": "2025-04-30T12:49:16.269904Z",
    "data": {
      "path": "/hello",
      "user_agent": "Datadog/Synthetics"
    }
  },
  {
    "id": 8966708,
    "user_guid": null,
    "content_guid": "475618c9",
    "timestamp": "2025-04-30T12:49:17.002848Z",
    "data": {
      "path": "/world",
      "user_agent": null
    }
  },
  {
    "id": 8967206,
    "user_guid": null,
    "content_guid": "475618c9",
    "timestamp": "2025-04-30T13:01:47.40738Z",
    "data": {
      "path": "/chinchilla",
      "user_agent": "Datadog/Synthetics"
    }
  },
  {
    "id": 8967210,
    "user_guid": null,
    "content_guid": "475618c9",
    "timestamp": "2025-04-30T13:04:13.176791Z",
    "data": {
      "path": "/lava-lamp",
      "user_agent": "Datadog/Synthetics"
    }
  },
  {
    "id": 8966214,
    "user_guid": "fecbd383",
    "content_guid": "b0eaf295",
    "timestamp": "2025-04-30T12:36:13.818466Z",
    "data": {
      "path": null,
      "user_agent": null
    }
  }
]
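The fixture's `timestamp` values are UTC strings with a varying number of fractional-second digits. A minimal base-R sketch of parsing them follows, with two values copied from the fixture; how connectapi itself parses these fields is not shown in this diff, so treat this as an illustration only.

```r
# Two timestamps copied from the fixture (6 and 5 fractional digits).
timestamps <- c(
  "2025-04-30T12:49:16.269904Z",
  "2025-04-30T13:01:47.40738Z"
)

# %OS reads seconds including the fractional part; the trailing "Z" is
# matched literally and tz = "UTC" makes the UTC interpretation explicit.
parsed <- as.POSIXct(timestamps, format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")

# Format back to whole seconds for display.
parsed_str <- format(parsed, "%Y-%m-%d %H:%M:%S", tz = "UTC")
```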