A random-random test for time-series data #132556
Conversation
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
I'd like to get some confirmation from @kkrik-es that this is doing what he wants, but I think it's pretty good. I left some feedback, none of which is critical but I'd like to get it addressed.
test/framework/src/main/java/org/elasticsearch/datageneration/FieldType.java
test/framework/src/main/java/org/elasticsearch/datageneration/MappingGenerator.java
...c/main/java/org/elasticsearch/datageneration/datasource/DefaultMappingParametersHandler.java
private List<XContentBuilder> documents = null;
private DataGenerationHelper dataGenerationHelper;

private static final class DataGenerationHelper {
I wonder if this should be a top level class. Seems like we'll want to build multiple test classes using this framework.
I've moved this class to its own file! TY.
private static Object randomDimensionValue(String dimensionName) {
    // We use dimensionName to determine the type of the value.
    var isNumeric = dimensionName.hashCode() % 5 == 0;
    if (isNumeric) {
What about IP dimensions?
Done: 20% of dimensions are now generated as IP-like values.
As a follow-up, I'll add a dynamic mapping so that these values are parsed as ip. Thoughts?
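For readers following this thread, here is a minimal sketch of how a dimension's value type can be derived deterministically from its name, so every document generates the same kind of value for a given dimension. It is not the code from this PR: the class name `DimensionValueSketch`, the `Kind` enum, the bucket assignment, and the use of `java.util.Random` in place of the test framework's randomness helpers are all assumptions made for illustration. `Math.floorMod` is used so the bucket stays in the range 0..4 even when `hashCode()` is negative.

```java
import java.util.Random;

/** Hypothetical sketch; not the helper class from this PR. */
final class DimensionValueSketch {
    enum Kind { NUMERIC, IP, KEYWORD }

    // Assumed split: bucket 0 is numeric (20%), bucket 1 is IP-like (20%), the rest are keywords.
    static Kind kindFor(String dimensionName) {
        // floorMod keeps the bucket in [0, 5) even for negative hash codes.
        int bucket = Math.floorMod(dimensionName.hashCode(), 5);
        if (bucket == 0) {
            return Kind.NUMERIC;
        }
        if (bucket == 1) {
            return Kind.IP;
        }
        return Kind.KEYWORD;
    }

    // The same dimension name always yields the same kind of value.
    static Object randomDimensionValue(String dimensionName, Random random) {
        switch (kindFor(dimensionName)) {
            case NUMERIC:
                return random.nextInt(1000);
            case IP:
                // IP-like string in 10.0.0.0/8; still a plain string until a mapping parses it as ip.
                return "10." + random.nextInt(256) + "." + random.nextInt(256) + "." + random.nextInt(256);
            default:
                return "value-" + random.nextInt(1000);
        }
    }
}
```

Because the kind depends only on the name, the IP-like values stay plain strings from the generator's point of view; whether they are indexed as ip depends on the mapping, which is the follow-up proposed above.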
...in/esql/src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/GenerativeTSIT.java
TY @not-napoleon - ptal!
...rc/main/java/org/elasticsearch/datageneration/datasource/DefaultObjectGenerationHandler.java
...in/esql/src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/GenerativeTSIT.java
// Verify that the second column is the avg value (thus why row.get(2))
docValues.stream().mapToDouble(Integer::doubleValue).average().ifPresentOrElse(avgValue -> {
    var res = (Double) row.get(2);
    assertThat(res, closeTo(avgValue, res * 0.5));
Why do we need the 0.5 factor?
That was a mistake (I meant 5%, not 50%).
However, the average calculation does seem to show up to 20-25% error between the ES results and the test-framework numbers. Should I check whether that's a bug and how to deal with it?
There should be no error here (Double.compare() should return 0), so there's a bug somewhere. Let's investigate separately.
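To make the two checks discussed in this thread concrete, here is a small sketch contrasting a relative-tolerance comparison with the exact comparison the reviewer expects to hold. It is illustrative only: `AverageCheckSketch`, `average`, and `withinRelativeTolerance` are hypothetical names, not part of the PR or of Hamcrest. Note one difference from the snippet under review: the tolerance here is derived from the expected value rather than from the result being checked, so a wrong result cannot widen its own error band.

```java
import java.util.List;

/** Hypothetical sketch; not code from this PR. */
final class AverageCheckSketch {

    // Expected average computed on the test side from the generated values.
    static double average(List<Integer> values) {
        return values.stream()
            .mapToDouble(Integer::doubleValue)
            .average()
            .orElseThrow(() -> new IllegalArgumentException("no values"));
    }

    // True when actual is within the given fraction (0.05 means 5%) of expected.
    static boolean withinRelativeTolerance(double actual, double expected, double fraction) {
        double allowed = Math.abs(expected) * fraction;
        return Math.abs(actual - expected) <= allowed;
    }

    // Exact check: the two doubles must compare as equal.
    static boolean exactlyEqual(double actual, double expected) {
        return Double.compare(actual, expected) == 0;
    }
}
```

With a 5% band, a result of 2.05 against an expected 2.0 passes and 2.5 fails; the exact check accepts only 2.0. A 20-25% discrepancy fails both, which is consistent with treating it as a bug to investigate rather than something to absorb with a wider tolerance.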
...src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/TSDataGenerationHelper.java
Thanks Pablo, this is a good step. It's nice that you tried to include the pass-through field on the first take, though that complicates things somewhat. I'd start with statically defined dimension and metric fields to get the validation logic in place first, then introduce dynamic fields on top of that.
Let's try to refactor the logic slightly so that it can be further extended in follow-up PRs.
...src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/RandomizedTimeSeriesIT.java
Looks good, thanks for addressing the comments. Let's keep iterating.
Follow-up items after this PR:
rate function and counters in general