Skip to content

Conversation

xiazcy
Copy link
Contributor

@xiazcy xiazcy commented Sep 18, 2025

https://issues.apache.org/jira/browse/TINKERPOP-2234

Implementation based on the proposal in review.

Added P.typeOf() predicate, which recognized a new set of GType enum token, String as registered in the GlobalTypeCache, as well as Class in Java/Groovy usage.

Main implementation is ready to be reviewed. Listing here a couple of remaining tasks, and keeping this as draft until proposal is merged.

  • Proposal merged
  • Replaced N with GType
  • Additional tests added
  • Documentations added

@xiazcy xiazcy changed the base branch from master to 3.8-dev September 19, 2025 16:03
* {@code Type} is a {@code BiPredicate} that determines whether the first argument is a type of the second argument.
*
*/
public enum Type implements PBiPredicate<Object, Object> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my comment still stands from the proposal. I still think Type should be better constrained by that <Object, Object> and that <Object, Class<?>> is more agreeable. Doing so would also get rid of the catch-all else { return false } and that check for GType.NULL:

    typeOf {
        @Override
        public boolean test(final Object first, final Class<?> second) {
            if (first == null) return false;
            return secondClass.isAssignableFrom(first.getClass());
        }

    };

This way, it can never be called other than how it is intended. In P you already have the types constrained to multiple typeOf overloads. Why expand them back to the Type enum when you can just coerce them each to Class<?> by moving that logic I removed above back to those specific P overloads?

Also, you'd catch GlobalTypeCache failures at traversal construction time with that logic in P rather than at runtime which seems more advantageous to me.

Do we lose something by getting more strict with the types for the Type enum?

As a separate point, are we making a mistake using Type for the name of this enum? We have Type and GType and at some point in the future we will have a schema. "Type" is such a great word to save for when we really need it. We currently have Contains, Compare , and the unfortunately named Text. The first two are verbs which actively define what P is doing - containing or comparing. Maybe there's a verb form that would be better than Type? Maybe we call the enum InstanceOf rather than Type- that's pretty normal and understood?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually yes, that makes sense. I'll make the change.

Maybe we call the enum InstanceOf rather than Type- that's pretty normal and understood?

I agree. I was a bit unsure of the name as well but couldn't think of a better one. InstanceOf sounds like a good alternative. Or ClassOf? IsA?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think i like InstanceOf better than ClassOf. IsA is interesting though given how generic it is but I can't think of how else we would use that offhand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or if we want something more action CompareType?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we could sort of extend from Compare with specific other implementations. Either are better than taking Type so maybe see if anyone else favors CompareType over InstanceOf. I think it's worth keeping in mind that most users won't even see these predicates as they are just enums within P, so they are a layer lower than the primary API for Gremlin at least which take some pressure away from getting the naming perfect.

@xiazcy xiazcy marked this pull request as ready for review September 24, 2025 20:20
The provided predicates are outlined in the table below and are used in various steps such as <<has-step,`has()`>>-step,
<<where-step,`where()`>>-step, <<is-step,`is()`>>-step, etc. Two new additional `TextP` predicate members were added in the
TinkerPop 3.6.0 release that allow working with regular expressions. These are `TextP.regex` and `TextP.notRegex`
TinkerPop 3.6.0 release that allow working with regular expressions. These are `TextP.regex` and `TextP.notRegex`. Starting with TinkerPop 3.8.0, a new type predicate `P.typeOf` is added to support filtering traversers based on types.
Copy link
Contributor

@spmallette spmallette Sep 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please take this moment to clean up this Note on Predicates section? I don't think there's any value to speaking about 3.4.0 and other versions. Have this section just talk about predicates and what they are.

| `P.within(objects...)` | Is the incoming object in the array of provided objects?
| `P.without(objects...)` | Is the incoming object not in the array of the provided objects?
| `P.typeOf(GType)` | Is the incoming object of the type indicated by the provided `GType` token?
| `P.typeOf(string)` | Is the incoming object of the type indicated by the provided `String`?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i realized that P.typeOf(Class) is sugar syntax so i guess it makes sense not to include that here. Not sure how we want to start better documenting the difference between the canonical grammar and the variants. I think that as of right now the "Differences" section is the best place to do that. Java doesn't have such a section yet, but I think this point could be a start of one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, that's why I didn't have it added here. Sure, I'll start a section.

The `asNumber()`-step (*map*) converts the incoming traverser to the nearest parsable type if no argument is provided,
or to the desired numerical type, based on the number token (`N`) provided.
or to the desired numerical type, based on the type token (`GType`) provided. If a type token entered isn't a numerical type, an `IllegalArgumentException` will be thrown.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We dont do a great job describing our enums. We just sort of assume folks will find and understand them via examples. I suppose it could be like "A Note on Types" to go with those related sections. Then at least there would be something to link to for GType.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the section was added but we don't have a link here to a-note-on-types...

if (first == null) {
return second == null;
}
return second != null && second.isAssignableFrom(first.getClass());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe just a point of thought, have we made a mistake in the syntax by assuming matching on inheritance and not class equality? should there be option for both? maybe not. just thought i'd pose the question. if we're not sure, that may be fine as we could have typeOf as it is and typeEquals for equality, so easily added at a later time if so decided.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I had a thought on it as well. I leaned towards inheritance but the idea of having typeEquals (or tyepEq?) for exact class match might not be a bad idea for later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding onto this thread, a strict typeEquals isn't particularly useful in practice for many of our types as GTypes mostly match up with interfaces which have many concrete implementations (Vertex, TinkerVertex, ReferenceVertex... List, ArrayList, LinkedList... Map, HashMap, LinkedHashMap...)

I would agree that inheritance matching is the right choice for now, and while there may be potential for an exact match predicate in the future, I expect usage of that to be much more niche.

TREE(Tree.class),
UUID(UUID.class),
VERTEX(Vertex.class),
VP(VertexProperty.class),;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit on naming: VP is nice and short, but given we spelled out everything else. also, VPmight not be understood by new users.VPROPERTY` if we want it shorter?

public Class<?> getType() { return javaType; }

public boolean isNumeric() {
Class<?> type = getType();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please review PR for final declarations.

import java.util.UUID;

/**
* Gremlin types
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GType is an important new concept in Gremlin as a language. I think a bit more context would be good in the javadoc.

Also, Gremlin Semantics docs do not appear to be updated for asNumber or include information about type predicates, or these defined types.

| d[32].i |
| d[35].i |

Scenario: g_V_valuesXageX_isXtypeOfXGType_BYTEXX
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testing a lot of negatives here which is good. where are the feature tests for positives for all these types?

By defining types for Gremlin in GType I feel like we've elevated those selected ones to have special meaning. I understand none of our example graphs have that kind of data at this point, but I think that's a call to expand what our test framework can do. We already have the kitchen sink graph for this sort of thing. I think you should expand that dataset to throw in some data for this. Maybe one vertex label per testable data type (that can be stored as a property), so that you could do g.V().hasLabel("double", "string").is(typeOf(NUMBER)) and assert the right stuff.

I didn't examine this carefully, but i don't see positive tests for GRAPH and TREE. What else is missing?

Perhaps it would be wise to break this feature test up into: TypeOfNumber.feature, TypeOfGraph.feature, etc. for some specific testing of grouped types then keep TypeOf.feature as a catch-all for everything else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I'll add those tests. And yea, splitting off the types make sense. I'll organize it that way.

Then the traversal will raise an error with message containing text of "Can't parse string 'test' as number."
Then the traversal will raise an error with message containing text of "Can't convert number of type Integer to Byte due to overflow."

@GraphComputerVerificationInjectionNotSupported
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's good that we restrict asNumber testing on GraphComputer so much. I think GraphComputerVerificationInjectionNotSupported should be more of an exception than a rule.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the main reason is we used a lot of inject() cases cause they are the easiest, I'll add some non-inject ones.

| K_STRINGU | K_GTYPE DOT K_STRINGU
| K_TREE | K_GTYPE DOT K_TREE
| K_TREEU | K_GTYPE DOT K_TREEU
| K_UUID | K_GTYPE DOT K_UUID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we accept all of UUID, GType.UUID, uuid, and GType.uuid as valid GTypes for consistency?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, added a UUIDL keyword

}

/**
* Register a type with a custom name.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I don't think it's our place to enforce any rules on name registration, would we want to define a convention on capitalization which we recommend providers follow? The preloaded types from GType appear to be taking the PascalCase names from their Java classes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm leaning towards removing this custom function, or restrict it to a defined convention, like Stephen mentioned, to something like providerPrefix:simpleTypeName, where the allowed input is just a custom providerPrefix string and we attach it to the simple name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's be sure that these conventions make it to the provider documentation.

{P.typeOf(GType.NUMBER), 1, true},
{P.typeOf(GType.STRING), "hello", true},
{P.typeOf(Boolean.class), false, true},
{P.typeOf(GType.NULL), null, true},
Copy link
Contributor

@Cole-Greer Cole-Greer Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you include a few false cases here? It would also be good to have some cases which use the global type registry such as P.typeOf("Integer")

package org.apache.tinkerpop.gremlin.util;

import org.apache.tinkerpop.gremlin.process.traversal.N;
import org.apache.tinkerpop.gremlin.process.traversal.GType;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly to Stephen's comment on AsNumberStepTest, it would be good to include one or two cases here which assert errors for non-numeric GTypes.

Given the modern graph
And the traversal of
"""
g.V().values("age").is(P.typeOf(GType.INT))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to include a case that's uses typeOf to filter a stream of mixed types as that will likely be the main use case here. We don't really have graphs with mixed type properties, but something like this would be good:

g.V().values("age", "name").is(P.typeOf(GType.INT))
==> 29
==> 27
==> 32
==> 35

[[gremlin-java-differences]]
=== Differences
Gremlin-Java provides additional syntactic sugar that leverages Java's type system. Most notably, the `P.typeOf()`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gremlin-Java provides additional syntactic sugar that leverages Java's type system. Most notably,

"Most notably," - implies there's other places we have syntactic sugar related to the Java type system. Are there others?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

Gremlin-Java provides additional syntactic sugar that leverages Java's type system. Most notably, the `P.typeOf()`
predicate accepts Java `Class` objects directly, providing a more natural and type-safe way to perform type checking:
[source,java]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason the examples in this section are not "live" (i.e. Gremlin Console based)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

g.V().has("weight", P.typeOf(Double.class))
----
This is equivalent to using `GType` enums, but offers compile-time type checking and IDE support. Other Gremlin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is equivalent to using GType enums, but offers compile-time type checking and IDE support.

I don't follow this statement. Doesn't GType syntax offer the same?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

g.V().values("age").is(P.typeOf(GType.INT))
g.V().values("name").is(P.typeOf(GType.STRING))
----
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it would be good to have some mention of what happens when you go off the standard GType mappings. like what happens if you are in TinkerGraph and store a java.awt.Color, will typeOf(Color.class) work for that case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an explanation with example.

----
[[a-note-on-types]]
== A Note on Types
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this section jumps a bit too quickly to the details and needs more introductory text to help orient readers to what they are about to read.

Gremlin steps typically operate over a handful of types that are mostly standard across graph systems. There are the common primitive types like, ..... container types, like, .... and structural types particular to graphs such as Vertex, Edge, etc.......... Gremlin identifies these types in the GType enumeration, offering a clear presentation of the standard data types one might typically encounter with Gremlin. This enumeration is an important part of the Gremlin language in that it acts as the argument to the typeOf predicate used for filtering values based on their runtime data type.

GType consists of the following enumerations:

I quickly hacked that together, but perhaps it conveys the difference in tone and lead-in?

The new `P.typeOf()` predicate allows filtering traversers based on their runtime type. It accepts either a `GType`
enum constant or a string representation of a simple class name. This predicate is particularly useful for type-safe
filtering in heterogeneous data scenarios.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This predicate is particularly useful for type-safe filtering in heterogeneous data scenarios.

the examples should demonstrate that better since this is deemed "particularly useful". how about:

g.V().values("age","name").is(P.typeOf("Integer"))

to drive this section.

==>josh
==>peter
// Filter by class name registered in the GlobalTypeCache
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure you should reference the GlobalTypeCache here - it's not introduced as a topic and no one will know what it is.

==>josh
==>peter
// Filter by class name registered in the GlobalTypeCache
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe the Note on Types section should have some user facing reference to GlobalTypeCache? what about the provider docs for those trying to extend this feature?

*Description:* converts the incoming traverser to the nearest parsable type if no argument is provided, or to the desired numerical type, based on the number token (`N`) provided.
*Syntax:* `asNumber()` | `asNumber(N numberToken)`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we've added a new predicate but gremlin-semantics doesn't have any mention of it. i can't remember if we have a section for predicates yet here? but maybe it's time we structure something in? same for enums (i.e. GType)?

* **Temporal types**: `DATETIME`, `DURATION`
* **Special types**: `NULL`, `NUMBER` (supertype for all numeric types)
[gremlin-groovy,modern]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These examples aren't very good. Consider some better, more realistic use cases. Isn't this about union'd data we want to pick part in some way? values('age','name') for example? or what about cases where edges and vertices are mixed? maybe you should describe a case where data is mixed within a property (either purposely or accidentally) and typeOf can help? i'm not sure if all this goes here or not. maybe it's part of has() or a new Gremlin Recipe.

==== Type Predicate
Type predicate can accept custom types via `String` class name registered inside the `Type.GlobalTypeCache`. The Gremlin grammar accepts any string literal, which is resolved into the class registered inside the cache.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section is really light for such an important topic. Why/when would providers use this? How about a link to provider documentation that explains how this all works?

# specific language governing permissions and limitations
# under the License.

@StepClassData @DataByte
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach to test organization looks a lot better, however, i'm not sure the @DataByte (and associated annotations of similar type) are doing what they need to. I could have forgotten the conversation, but i thought the implication of this annotation was that a graph might not support BYTE and therefore opt out of this test, but the way the tests are written, can't every graph support these tests because we start with an int that every graph will store and then us Gremlin conversion steps to get them to BYTE? You could have equally done this with the modern graph and the "age" property.

That said, this setup isn't bad at all and it paves the way for us to actually test: g.addV("data").property("byte", 5b) so that we actually validate these types through the grammar. I think that these individual tests could have a @DataByteStore annotation so that providers who didn't support that type could drop just those tests.

You don't need to do this for this PR, but I think we want that sort of testing place in time for 3.8.0 as a follow-on task to this merging.

One thing that shouldn't be missed is that you've added new test annotations with all this - there should be an update to the dev docs to describe all of these.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The further down i go reviewing these StepClassData tests, the more I think something doesn't look quite right with not storing an actual byte in some of these tests. For example, the Map, List and Set tests all go so far as to store instances of those in a graph, so much so that I wondered if we had enough coverage for when graphs opted out of those, as many/most can't do that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense. I'll revisit the testing structure once more after this PR.

When iterated to list
Then the result should be empty

Scenario: g_V_valuesXsetX_isXtypeOfXGType_SETXX
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can get to {{Set}} without having to inject or store one:

gremlin> x = g.V().valueMap().select(keys).next()
==>name
==>age
gremlin> x.class
==>class java.util.LinkedHashSet

same for List and Map i guess....not sure if that changes your ideas about what tests go here and how they are structured. i guess storing them isn't bad, as long as we have suitable coverage for these types in general so that when providers opt-out of these tests there is still some protection.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I've added a case for this. For List and Map, there are a couple of tests with fold() and valueMap() already.

# under the License.

@StepClassFilter @StepIs
Feature: Predicate - typeOf() General
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't note any tests for collections - do we need TypeOfCollection.feature?

Copy link
Contributor Author

@xiazcy xiazcy Oct 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean the list/set/map types?

| vertices |


Scenario: g_VX1X_outEXknowsX_subgraphXsgX_name_capXsgX_isXtypeOfXGType_GRAPHXX_count
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given we grouped together all the other "types" for typeOf testing, will we wonder where positive tests of Gtype.GRAPH are? (i did while reviewing this, but then scrolled a bit more and saw them here)

import org.apache.tinkerpop.gremlin.structure.VertexProperty;

import java.io.InputStream;
import java.math.BigDecimal;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: these imports not needed anymore i assume?

@kenhuuu
Copy link
Contributor

kenhuuu commented Oct 21, 2025

VOTE +1, pending closure of other comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants