SPARQL 1.2 Functions related to initial text direction and language tags #154

Open
afs opened this issue Sep 13, 2024 · 6 comments

@afs
Contributor

afs commented Sep 13, 2024

SPARQL 1.2 Functions for language string literals

Functions:
hasLANG(literal), hasLANGDIR(literal),
LANG(literal), LANGDIR(literal),
STRLANG(xsd:string, xsd:string), STRLANGDIR(xsd:string, xsd:string, xsd:string).

LANG(literal) is part of SPARQL 1.1 and is extended for rdf:dirLangString.

Accessors:

| RDF Term | hasLANG | hasLANGDIR | LANG | LANGDIR |
|---|---|---|---|---|
| "abc"@en | true | false | "en" | "" |
| "abc"@en--ltr | true | true | "en" | "ltr" |
| "abc"@en--LTR | true | true | "en" | "ltr" |
| "abc" | false | false | "" | "" |
| "abc"^^rdf:dirLangString | false | false | "" | "" |
| "abc"^^rdf:langString | false | false | "" | "" |
| "123"^^xsd:integer | false | false | "" | "" |
| <http://example/xyz> | error | error | error | error |
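
As a rough illustration of the accessor behaviour in the table, here is a minimal Python sketch. It is only a model of the proposal, not SPARQL engine code: the `Literal` and `IRI` classes, the `SparqlTypeError` exception, and the convention of storing an absent component as the empty string are all assumptions made for this example. The `has*` tests are written as `FUNC(arg) != ""`, following the note in this description.

```python
from dataclasses import dataclass


class SparqlTypeError(Exception):
    """Hypothetical stand-in for a SPARQL expression evaluation error."""


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: str = "xsd:string"
    lang: str = ""        # "" models "no language tag component"
    direction: str = ""   # "" models "no initial text direction component"


def _require_literal(term):
    # The accessors are only defined on literals; anything else is an error.
    if not isinstance(term, Literal):
        raise SparqlTypeError(f"not a literal: {term!r}")
    return term


def LANG(term):
    return _require_literal(term).lang


def LANGDIR(term):
    return _require_literal(term).direction


def hasLANG(term):
    # Tests the component, not the datatype.
    return LANG(term) != ""


def hasLANGDIR(term):
    return LANGDIR(term) != ""


en_ltr = Literal("abc", "rdf:dirLangString", lang="en", direction="ltr")
print(hasLANG(en_ltr), hasLANGDIR(en_ltr), LANG(en_ltr), LANGDIR(en_ltr))
# prints: True True en ltr
```

Because the tests look at components rather than the datatype, a term such as `Literal("abc", "rdf:dirLangString")` with no components gives false and `""` in this model, matching the corresponding rows of the table.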

Constructors:

| Constructor | Literal |
|---|---|
| STRLANG("abc", "en") | "abc"@en |
| STRLANG("abc", "") | error |
| STRLANG(123, "") | error |
| STRLANGDIR("abc", "en", "ltr") | "abc"@en--ltr |
| STRLANGDIR("abc", "en", "LTR") | error |
| STRLANGDIR("abc", "en", "") | error |
| STRLANGDIR("abc", "", "ltr") | error |
| STRLANGDIR(123, "", "ltr") | error |
| STRLANGDIR(<x:uri>, "en", "ltr") | error |
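
The constructor rows can be modelled in the same spirit. The Python sketch below is an illustration under stated assumptions, not an implementation from any SPARQL engine: `LangLiteral` and `SparqlError` are hypothetical names, a Python `str` stands in for an `xsd:string` argument, and the set of legal directions is assumed to be exactly `"ltr"` and `"rtl"`. It does not check that the language tag is well-formed.

```python
from typing import NamedTuple


class SparqlError(Exception):
    """Hypothetical stand-in for a SPARQL expression evaluation error."""


class LangLiteral(NamedTuple):
    lexical: str
    lang: str
    direction: str = ""

    def __str__(self):
        # Rendering for display only; lexical-form escaping is ignored here.
        suffix = f"--{self.direction}" if self.direction else ""
        return f'"{self.lexical}"@{self.lang}{suffix}'


# Assumption: only these two lowercase values are legal directions.
_DIRECTIONS = ("ltr", "rtl")


def STRLANG(lexical, lang):
    # Both arguments must be plain strings (standing in for xsd:string).
    if not isinstance(lexical, str) or not isinstance(lang, str):
        raise SparqlError("STRLANG arguments must be strings")
    if lang == "":
        raise SparqlError("language tag must not be empty")
    return LangLiteral(lexical, lang)


def STRLANGDIR(lexical, lang, direction):
    base = STRLANG(lexical, lang)  # same checks on the first two arguments
    # The comparison is case-sensitive, so "LTR" is rejected, not lowercased.
    if direction not in _DIRECTIONS:
        raise SparqlError(f"illegal initial text direction: {direction!r}")
    return base._replace(direction=direction)


print(STRLANG("abc", "en"))            # prints: "abc"@en
print(STRLANGDIR("abc", "en", "ltr"))  # prints: "abc"@en--ltr
```

Routing `STRLANGDIR` through `STRLANG` keeps the checks on the lexical form and language tag in one place, so the two constructors cannot drift apart on those rows.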

It is possible to write "abc"^^rdf:dirLangString and "abc"^^rdf:langString in N-Triples and Turtle.

The functions hasLANG and hasLANGDIR test whether an RDF term has a language tag component or an initial text direction component. See RDF Concepts, section "Literals". They don't test by datatype.

LANG is in SPARQL 1.1. Its existing behaviour determines the choice for LANGDIR when passed a non-literal (an error) and the result of LANGDIR(123) (the empty string).

The accessors LANG and LANGDIR return the facet or "" following LANG in SPARQL 1.1.
The argument must be a literal; otherwise it is an error.

In these cases, hasLANG/hasLANGDIR is false and the return of LANG and LANGDIR is "".
The facet is not present.

It may be possible to write a literal with text direction but no language tag in some other format (note: for RDF/XML we can require "lang=" if "dir=" is present).

Notes

hasFUNC(arg) is equivalent to FUNC(arg) != "".

The names hasLANG/hasLANGDIR differ in style from isLITERAL etc. because has* tests a component, not the RDF term as a whole.

hasLANG applies to rdf:langString and rdf:dirLangString.

Initial text direction is canonicalized to lowercase: cf. langtag being canonicalized in RDF 1.2.

It is not possible to write a literal in Turtle or N-Triples with a text direction but no language tag, nor is it possible to write a literal other than rdf:dirLangString and rdf:langString with a language tag. These are illegal in RDF Concepts but may occur naturally in other syntaxes as corner cases. The accessors approach works on components and would be well-defined.

@afs
Contributor Author

afs commented Sep 13, 2024

Difference to the earlier #113 draft

@hartig
Contributor

hartig commented Sep 13, 2024

Makes sense!

@rubensworks
Member

For reference, there's an issue open with some concerns about the current text direction approach. If an alternative approach is taken, this will have an impact here as well.
So one option might be to hold off on the work here until w3c/rdf-concepts#79 has been discussed or resolved.

@afs
Contributor Author

afs commented Sep 17, 2024

@rubensworks - thanks for pointing out that issue. Nothing is final until the publication of the REC 😄

There is no rush to get text into the SPARQL spec for these functions but at the same time, the WG has made a decision and we can't wait until RDF 1.2 is finalized before doing work.

The function list is my view on what is the natural outcome of the WG decision on initial text direction and the changes in RDF. That includes discussions with the internationalization working group.

Bidirectional text is a much larger problem and I don't see that the WG has decided to take up the issue. The only response I recall is along the lines of "use a content-focused literal" (e.g. rdf:HTML).

JSON-LD has non-normative "base direction". So initial text direction (terminology suggested by i18n IIRC) already exists.

Datatypes have been discussed and problems with them identified. A datatype is a class, and the subclass relationship does not work for scripts (a subclass must be usable in a place where the superclass is valid).

There is nothing to stop use of compound literals. The WG initial text direction decision does not block that, nor do the proposed SPARQL changes.

If the WG takes up w3c/rdf-concepts#79, things may need to change.

FWIW I think the lack of a way to give a direction to a non-language-tagged string is a bit odd. It would need a new datatype, not a munging of xsd:string or rdf:dirLangString. Such a change would fit the proposal here because the functions are accessors to components of RDF literal terms.

(General discussion about initial text direction in RDF 1.2 and on the RDF Concepts issues list please.)

@afs
Contributor Author

afs commented Sep 17, 2024

The rdf-tests PR w3c/rdf-tests#135 shows that using LTR is illegal in RDF; it is not forced to lowercase.

Therefore `STRLANGDIR(?, ?, "LTR")` should be an error.

| Constructor | Literal |
|---|---|
| STRLANGDIR("abc", "en", "LTR") | error |

Table in the description updated.

N.B. Langtags are compared and matched in a case-insensitive manner but RDF Concepts does not mandate lowercase. Some systems use canonical langtags (e.g. en-GB).
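
To make the contrast in this comment concrete, here is a small Python sketch. The helper names `langtags_match` and `is_valid_direction` are made up for illustration, and the set of legal directions is assumed to be exactly `"ltr"` and `"rtl"`.

```python
# Assumption: "ltr" and "rtl" are the only legal initial text directions.
VALID_DIRECTIONS = frozenset({"ltr", "rtl"})


def langtags_match(a: str, b: str) -> bool:
    # Language tags are ASCII, so lowercasing both sides is enough for a
    # case-insensitive comparison; neither tag is rewritten in the process.
    return a.lower() == b.lower()


def is_valid_direction(direction: str) -> bool:
    # Case-sensitive membership test: "LTR" is not accepted or normalised.
    return direction in VALID_DIRECTIONS


print(langtags_match("en-GB", "en-gb"))  # prints: True
print(is_valid_direction("LTR"))         # prints: False
```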

@afs
Contributor Author

afs commented Sep 25, 2024

The WG looked at w3c/rdf-concepts#79 at TPAC'24 and resolved:

RESOLUTION: The working group has considered w3c/rdf-concepts#79 and will continue to support initial text direction in RDF Language-Tagged Literals. We will not otherwise consider full bidi.
