mirror of
https://github.com/apple/pkl.git
synced 2026-01-13 23:23:37 +01:00
Glob reading with the file: schema fails on some paths
#184
Originally created by @thomaspurchas on GitHub (Jul 17, 2024).
The handling of file reads using `read*` is a bit broken at the moment. Behaviour changes depending on whether or not the `file:` scheme is used, and behaviour in a `pkl` file vs the REPL doesn't always seem consistent. Additionally, there are a number of simple glob patterns that cause `NullPointerException`s or similar incorrect behaviour.
Pkl REPL:
This error is caused by the unsafe assumption that `.getPath` always returns a value. We should probably be using `.getSchemeSpecificPart` instead.
Evaling the following will work
but not (addition of the `file:` scheme)
and running `read*("**/*.txt")` in the REPL results in
@bioball commented on GitHub (Jul 17, 2024):
The underlying issue here is that `file:` URIs are hierarchical; it's invalid for a file URI to not have a hierarchical path part (a path that starts with `/`); see RFC 8089.
Right now, `read("file:foo.txt")` will cause Pkl to read `foo.txt` relative to the process CWD, but `file:foo.txt` is not a valid file URI. This might be changed to a throw in the future. For similar reasons, `read*("file:*")` is also not supported, and would also be changed to a throw.
When globbing using a file scheme, the path part must be a hierarchical path. For example, `read*("file:/path/to/*.txt")` (or `read*("file:///path/to/*.txt")` for a more canonical representation; file URIs typically have a host section).
@thomaspurchas commented on GitHub (Jul 17, 2024):
In that situation, how are we meant to use `read()` with a relative path? In particular, how can we use `read()` with a relative path in the REPL, where the scheme is mandatory?
Also, these nuances aren't described in the docs. The docs mention the `file:` scheme, and the Globbed Read section provides examples of using `read` without the `file:` scheme. But I can't see anywhere that explicitly states relative reads shouldn't use the `file:` scheme.
Also, for the `read` API, do we have to stick so closely to the RFC? Obviously we should correctly handle any RFC 8089 valid URI. But there's no reason we can't also accept a technically invalid URI where the intent is still obvious and useful (e.g. using the `file:` scheme with a relative path).
@bioball commented on GitHub (Jul 17, 2024):
I don't see a strong need for relative paths with `file` URIs. If this is declared from within a file-based module, it should use a relative path without the scheme part. If declared within a non file-based module (e.g. from a package), what should the path be relative to? It can't be relative to the enclosing module, and also shouldn't be relative to the CWD, because we want imports to be statically analyzable.
I think this is tangential, but definitely a problem. Ideally, you should be able to use relative file imports from within the REPL.
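For reference, the hierarchical vs. non-hierarchical distinction above can be reproduced with plain `java.net.URI`. This is a minimal sketch, not Pkl's actual code: the class `FileUriShapes` and its methods are hypothetical names made up for illustration. It shows that `getPath()` returns `null` for `file:foo.txt` (an opaque URI), while `getSchemeSpecificPart()` still returns a value.

```java
import java.net.URI;

// Hypothetical helper for illustration only; none of these names are Pkl APIs.
final class FileUriShapes {

    /** Returns URI.getPath(), which is null when the URI is opaque (non-hierarchical). */
    static String pathOf(String uri) {
        return URI.create(uri).getPath();
    }

    /** Returns the part after "scheme:"; java.net.URI never leaves this undefined. */
    static String schemeSpecificPartOf(String uri) {
        return URI.create(uri).getSchemeSpecificPart();
    }

    /** True when the scheme-specific part does not start with '/'. */
    static boolean isOpaque(String uri) {
        return URI.create(uri).isOpaque();
    }

    public static void main(String[] args) {
        String[] samples = {
            "file:foo.txt",            // no leading slash: opaque, so getPath() is null
            "file:/path/to/foo.txt",   // hierarchical, no authority
            "file:///path/to/foo.txt"  // hierarchical, empty authority
        };
        for (String s : samples) {
            System.out.println(s
                + " opaque=" + isOpaque(s)
                + " path=" + pathOf(s)
                + " ssp=" + schemeSpecificPartOf(s));
        }
    }
}
```

A caller that dereferences the result of `getPath()` without a null check would fail on the first form, which is consistent with the `NullPointerException` described in the original report.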
@thomaspurchas commented on GitHub (Jul 17, 2024):
That sounds reasonable. I guess then my issue is that the docs aren't very clear. It's not obvious that using `read()` with `file:` and without are different. It also creates a slightly odd situation where file reading has a "special" status, in that it supports relative reads if you don't use the `file:` scheme, but no other scheme has that capability. Although it doesn't really make sense for other schemes to support relative paths in any situation, it still creates a slightly inconsistent `read()` API that's a little surprising.
@bioball commented on GitHub (Jul 18, 2024):
Agree that we should improve our docs here. We have a section on relative paths, but it's easily missed and a common source of confusion.
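As a starting point for such docs, relative resolution can be sketched with `java.net.URI.resolve`. This assumes, as the thread implies, that a scheme-less relative path is resolved against the enclosing module's URI. The class name, method name, and module location below are hypothetical, and this is not necessarily how Pkl implements it.

```java
import java.net.URI;

// Hypothetical sketch; not taken from the Pkl code base.
final class RelativeReadSketch {

    /** Resolves a read target against the URI of the enclosing module. */
    static URI resolveAgainstModule(String moduleUri, String target) {
        return URI.create(moduleUri).resolve(target);
    }

    public static void main(String[] args) {
        // Assumed module location, chosen only for the example.
        String module = "file:///project/config/app.pkl";

        // Scheme-less relative path: resolved next to the module.
        System.out.println(resolveAgainstModule(module, "data/foo.txt").getPath());

        // Absolute hierarchical file URI: returned as given.
        System.out.println(resolveAgainstModule(module, "file:///etc/hosts").getPath());

        // Opaque "file:foo.txt": returned as given, and it has no path component.
        System.out.println(resolveAgainstModule(module, "file:foo.txt").getPath());
    }
}
```

Note that `file:foo.txt` is returned unchanged rather than resolved: `java.net.URI` treats it as opaque, so it carries no path that could be resolved against the module.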