Skip to content

Cover expressions as function arguments #51

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

jg-rp
Copy link
Contributor

@jg-rp jg-rp commented Sep 12, 2023

These additional test cases don't make a lot of real world sense, but short of defining one or more non-standard function extensions specifically for testing, it's the best I could come up with.

Relevant grammar:

function-expr       = function-name "(" S [function-argument
                         *(S "," S function-argument)] S ")"
function-argument   = literal /
                      filter-query / ; (includes singular-query)
                      logical-expr /
                      function-expr

logical-expr        = logical-or-expr
logical-or-expr     = logical-and-expr *(S "||" S logical-and-expr)
                        ; disjunction
                        ; binds less tightly than conjunction
logical-and-expr    = basic-expr *(S "&&" S basic-expr)
                        ; conjunction
                        ; binds more tightly than disjunction

basic-expr          = paren-expr /
                      comparison-expr /
                      test-expr

paren-expr          = [logical-not-op S] "(" S logical-expr S ")"
                                        ; parenthesized expression

},
{
"name": "functions, arguments, comparison expression",
"selector": "$[?length(@[email protected])>2]",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this one is trying to do. @[email protected] yields a LogicalType, but length() takes a ValueType. This path is semantically invalid (syntactically valid, though). It shoudl error.

Copy link
Contributor Author

@jg-rp jg-rp Sep 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intention was to test that a comparison expression can be a valid function argument, within the confines of the standard function extensions.

I see that if LogicalTrue and LogicalFalse are not equal to JSON true and false, then the result of @[email protected] would not be a ValueType, making the query invalid.

If LogicalType can be used in place of a JSON Boolean (implicitly converting it to a ValueType?), I would expect length() to return Nothing, and the comparison between Nothing and 2 to result in LogicalFalse.

I'm not sure, and I can always define some mock functions that do accept a LogicalType parameter, outside of the CTS.

Copy link
Collaborator

@gregsdennis gregsdennis Sep 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We opted to separate the result of a boolean expression from the JSON literals true and false and recognize that the JSON literals are just tokens that don't carry any meaning. There's good reason to do this, and I'll let you peruse the issues for more, but the main idea was that if we didn't separate these concepts, then test expressions would be ambiguous.

For example, consider the complete expression @.foo. Is this testing whether foo exists, or whether foo contains a JSON true? Unless we are explicit that the JSON true is separate from logical true, this can't be determined.

I expect that this test case should generate an error. @glyn / @cabo can you confirm, please? What's the rule for syntactically correct but semantically incorrect paths? (I expect Section 2.1 has the requirements we're looking for.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @gregsdennis , that makes sense. I've provisionally marked that test case as invalid and given it a more relevant name.

Section 2.4.9 has some examples that cover of this kind of thing. I had missed that before.

@jg-rp jg-rp force-pushed the argument-expressions branch from df3ef0c to 3b28489 Compare September 14, 2023 10:48
{
"name": "functions, arguments, function expression",
"selector": "$.values[?match(@.a, value($..['regex'])]",
"document": {
Copy link
Collaborator

@f3ath f3ath Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For an invalid selector, the document is irrelevant and should be omitted. Sorry, I misread the test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I've misunderstood (quite possible 😃), $.values[?match(@.a, value($..['regex'])] looks like a valid query, and the given result is correct when applied to that document.

  1. $..['regex'] evaluates to a nodelist, of NodesType, which happens to contain a single node.
  2. value() accepts a NodesType argument and returns a ValueType. In this case value($..['regex']) evaluates to "a.*", because its argument contains a single node.
  3. match() is called with the ValueType implicitly converted from the singular query @.a, and the ValueType ("a.*") from the call to value().

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is good, but the path is invalid. It needs another )

$.values[?match(@.a, value($..['regex']))]

Tested at https://json-everything.net/json-path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Looks like I have some issues with unbalanced bracket detection.

@jg-rp jg-rp force-pushed the argument-expressions branch from 3b28489 to 40e8d4f Compare September 17, 2023 06:48
@@ -3819,6 +3819,48 @@
"selector": "$[-1:-10:-231584178474632390847141970017375815706539969331281128078915168015826259279872]",
"invalid_selector": true
},
{
"name": "functions, arguments, parenthesized expression",
"selector": "$[?match((@.a), ('a.*'))]",
Copy link
Collaborator

@f3ath f3ath Sep 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is correct. Here's how I see it. The parens around the arguments can only be used in a logical-expr, whose building blocks (apart from the parens, &&, ||) are comparison-expr and test-expr. So @.a and 'a.*' can only be test-expr meaning they are of LogicalType. But match() only accepts ValueType arguments, therefore the expression is not well-typed and the implementation must raise an error, according to

Any function expressions in a query must be well-formed (by conforming to the above ABNF) and well-typed, otherwise the JSONPath implementation MUST raise an error (see Section 2.1). To define which function expressions are well-typed, a type system is first introduced.

So, the expression should be treated as invalid.

P.S. So far there are no "standard" functions which accept LogicalType args. Which I think is good since boolean args may be considered a code smell.

Copy link
Collaborator

@gregsdennis gregsdennis Sep 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see where you're going with this.I'm mobile right now, but I'd like to see how my implementations handles this. It can be tested at https://json-everything.net/json-path.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One can argue that it could be better not to allow logical-expr in

function-argument   = literal /
                      filter-query / ; (includes singular-query)
                      logical-expr /
                      function-expr

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW my implementation did not allow parens in function args until now. This is a useful test, there's just no such standard function that can be used in it. In my implementation I introduced a non-standard logical function to have this case covered.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, understood.

Copy link
Collaborator

@gregsdennis gregsdennis Sep 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cabo does this mean that the check you showed has a bug, or am I interpreting this wrong?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the problem is that a reasonable parser implementation will discard the () around @.a, so the fact that this was syntactically a logicaltype is not visible to the semantic processing.

If I force this by doing one of

$[?match(!(@.a), 'a.*')]
$[?match([email protected], 'a.*')]
$[?match((@.a && @.a), 'a.*')]

I get one of

*** Cannot use ["not", ["@", "a"]] with declared_type logical for required type value in ["func", "match", ["not", ["@", "a"]], "a.*"]
*** Cannot use ["and", ["@", "a"], ["@", "a"]] with declared_type logical for required type value in ["func", "match", ["and", ["@", "a"], ["@", "a"]], "a.*"]

Giving pure parentheses a type-coercing semantics is a flaw that overly complicated languages such as C++ have, but that I didn't know we had put into JSONPath. Now I have to think how to integrate this into jpt's parser/ast generator.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reasonable parser implementation will discard the () around @.a

I'm not convinced of this. Reasonability is subjective.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/reasonable parser/parser written by someone who has done expression language parsers before/

The AST gets a bit ugly when you need to include the parens as a node.
Of course, it can be done.

@jg-rp
Copy link
Contributor Author

jg-rp commented Sep 25, 2023

I'm inclined to scrap this PR in favour of a new one, "Cover well-typedness of standard functions", which can add to existing test files in tests/functions/.

That would leave handling of redundant parentheses and covering well-typedness of functions accepting LogicalType arguments as separate issues.

Any objections?

@gregsdennis
Copy link
Collaborator

I don't have problems with adding commits to this.

If you create a new PR, please link back to this one to capture the conversation.

@gregsdennis
Copy link
Collaborator

@jg-rp do you have the time/ability to continue with this?

@jg-rp
Copy link
Contributor Author

jg-rp commented Dec 20, 2023

I've gone ahead an created PR #53 covering the use of function expressions as function arguments for some of the standard function extensions.

I'll open new issues shortly related to the extra parentheses discussion here, and to summarise the theoretically valid combinations of nested function expressions that can't be covered using standard function extensions alone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants