Everything Slices
Preamble
Author: Ricardo Signes <rjbs@semiotic.systems>
Sponsor: Ricardo Signes <rjbs@semiotic.systems>
ID: 0005
Status: Rejected
Abstract
Perl’s slice syntax lets the programmer get at a subset of values or key/value pairs in a hash or array. This PPC proposes a syntax to get a slice that contains the complete set of values or key/value pairs in the hash or array.
Motivation
It’s always possible to get a slice of everything in a container, but it requires writing the expression that expresses “all the keys”. This duplicates the variable name, which is an opportunity for error in both reading and writing.
The nearest motivation for this change is the potential addition of n-at-a-time iteration in foreach. This syntax will work by default:
for my ($k, $v) (%hash) { ... }
To iterate over an array’s indexes and values, one would have to write:
for my ($i, $v) (%array[ keys @array ]) { ... }
Note, too, the sigil variance.
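As a minimal sketch of what that expression produces today, the following uses only existing syntax (index/value array slices, available since perl 5.20). The variable names and sample data are illustrative assumptions, and the loop walks the flat list by hand so that it does not depend on n-at-a-time foreach.

```perl
use strict;
use warnings;
use feature 'say';

my @array = qw(apple banana cherry);

# Index/value slice: note the % sigil applied to an array variable.
# The variable name appears twice, which is the duplication discussed above.
my @pairs = %array[ keys @array ];   # (0, 'apple', 1, 'banana', 2, 'cherry')

# Walk the flat (index, value) list two items at a time.
my @lines;
for (my $n = 0; $n < @pairs; $n += 2) {
    my ($i, $v) = @pairs[ $n, $n + 1 ];
    push @lines, "$i => $v";
}

say for @lines;
```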
Rationale
%hash in list context evaluates to a list of key/value pairs, while @array in list context evaluates to the values. There’s a means to get the values of a hash with the values built-in, but no existing built-in for (say) kv @array. Rather than propose adding a kv, which would only be useful on arrays, this PPC proposes adding a type of slice that includes all keys and values.
%hash{*}; # equivalent to %hash{ keys %hash }
%array[*]; # equivalent to %array[ 0 .. $#array ]
@hash{*}; # equivalent to @hash{ keys %hash }
@array[*]; # equivalent to @array[ 0 .. $#array ]
Note that @array[*] is not equivalent to @array, because it is a slice of the array rather than the array itself. This is especially notable when taking a reference or performing assignment.
\@array; # yields a reference to the array
\@array[*]; # yields a list of references, one to each entry in the array
@array = (1 .. 100); # the array is now 100 elements long
@array[*] = (1 .. 100); # the array length is unchanged
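The same distinction can be observed today by spelling out the slice that @array[*] is described as equivalent to. This is only a sketch using the existing @array[ 0 .. $#array ] form; the variable names and the three-element sample array are assumptions for illustration.

```perl
use strict;
use warnings;

my @array = (10, 20, 30);

my $aref = \@array;                    # one reference, to the array itself
my @refs = \@array[ 0 .. $#array ];    # one scalar reference per element

${ $refs[1] } = 99;                    # writes through to $array[1]
my $middle = $array[1];                # 99

# Slice assignment only fills the slots named by the slice;
# the extra values on the right-hand side are discarded.
@array[ 0 .. $#array ] = (1 .. 100);
my $len_after_slice = @array;          # 3

# Array assignment replaces the whole array.
@array = (1 .. 100);
my $len_after_array = @array;          # 100
```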
The availability of %array[*] is the primary reason to add this feature, but I have proposed this slice mechanism rather than a kv built-in because it provides straightforward syntax to more easily produce structures already familiar to at least intermediate Perl programmers.
The use of * is cribbed from Raku, where it is used as a multipurpose placeholder. While I believe this use is a plausible starting point for future expansion of the asterisk semantics, we could stop here without causing undue confusion.
Specification
The token * is permitted to stand alone (with optional whitespace on either or both sides) inside the {...} or [...] used in slicing a hash or array. A slice using * in place of subscript keys will act as if all keys were provided. For arrays, results will be provided in the numeric order of the keys. This may be optimized, but need not be.
Assignment to a key/value pair everything-slice is forbidden, as it is for any other key/value slice. Assignment to a value-only everything-slice is permitted as usual.
The asterisk may not be used as part of a larger expression inside the subscript. For example, this is not legal:
%array[ grep {; $_ > 5 } * ]
The behavior of an everything-slice on a tied array should be identical to the behavior of:
@array[ keys @array ];
The behavior of an everything-slice on a tied hash should be identical to the behavior of:
@hash{ keys %hash };
Backwards Compatibility
The use of * as a standalone subscript is already a syntax error, and could be introduced without requiring any other changes or deprecations.
I believe that updating static analyzers will not be extremely complex, but I’m not particularly expert in any of them.
The deparser and coverage tools may need updating, but presumably no more than many other small changes. (This does not introduce any new runtime branching behavior.)
I can’t speak to the effect on Devel::NYTProf at all.
This can’t be easily implemented in older versions of perl.
Security Implications
[ none yet foreseen ]
Examples
[ can produce ]
Prototype Implementation
Is there something that shows the idea is feasible, and lets other people play with it? Such as
- A module on CPAN
- A source filter
- Hack the core C code - fails tests, but lets folks play
Future Scope
If we adopt * as an “everything” placeholder, we may want to use it in more places, possibly to be investigated by skimming off the top of Raku when applicable.
Rejected Ideas
Just Add kv
The simplest alternative here is to add a new built-in, kv, which operates on an array and evaluates to a list of index/value pairs, or operates on a hash, returning the key/value pairs.
This is much simpler semantically and, presumably, in implementation. On the other hand, it has somewhat fewer potential applications.
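To make the array case of this alternative concrete, here is a rough pure-Perl approximation. The name kv_array and its calling convention (taking an array reference) are hypothetical, chosen only for illustration; a real built-in would presumably take the array directly, as keys and values do.

```perl
use strict;
use warnings;

# Hypothetical helper, not a core built-in:
# returns a flat list (index, value, index, value, ...).
sub kv_array {
    my ($aref) = @_;
    return map { ($_, $aref->[$_]) } 0 .. $#{$aref};
}

my @fruit = qw(apple banana);
my @flat  = kv_array(\@fruit);   # (0, 'apple', 1, 'banana')
my @empty = kv_array([]);        # ()
```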
The Empty Subscript Option
One alternative to the asterisk that was proposed in the past was using an empty subscript. For example:
# Instead of:
@array[*]
# One could write:
@array[]
I believe this looks much more like a mistake. Moreover, note that while @array[] is currently illegal, @array[()] is legal, and evaluates to an empty list. I think this would lead to confusion. (Also, while I do not want to act as if generated code is a primary target, I think this would needlessly complicate code generation.)
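The existing behavior referred to here can be checked directly. The sketch below uses only current syntax, and the variable names are assumptions; it shows that a subscript which evaluates to an empty list yields an empty slice, which is why giving @array[] the opposite meaning ("everything") seems likely to surprise.

```perl
use strict;
use warnings;

my @array = (1, 2, 3);

# Legal today: an explicit empty list as the subscript gives an empty slice.
my @none = @array[ () ];

# The same thing happens when a computed list of indexes happens to be empty.
my @wanted    = ();
my @also_none = @array[ @wanted ];

my $none_count      = @none;        # 0
my $also_none_count = @also_none;   # 0
```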
The Endless Range Option
Another alternative syntax is an extended range operator. These are already legal:
@array[ 2 .. 3 ]
@array[ 0 .. 99 ]
@array[ 0 .. $#array ]
The proposal is that, at least in the context of an array subscript, the ends could be left off. That is:
@array[ .. 99 ] # equivalent to 0..99
@array[ 0.. ] # equivalent to 0 .. $#array
@array[ .. ] # equivalent to 0 .. $#array
There may be further ambiguity here, but I have mostly ignored this option because its applicability to hashes seems weird to me. Hash keys are not ordered, and can’t be ranged. Only the bare .. would be useful:
%hash{..} # equivalent to %hash{ keys %hash }, currently a syntax error
That said, the implicit-ended range operator may have more direct uses than the asterisk.
Open Issues
Use this to summarise any points that are still to be resolved.
Copyright
Copyright (C) 2021, Ricardo Signes.
This document and code and documentation within it may be used, redistributed and/or modified under the same terms as Perl itself.