Everything Slices

Preamble

Author:  Ricardo Signes <rjbs@semiotic.systems>
Sponsor: Ricardo Signes <rjbs@semiotic.systems>
ID:      0005
Status:  Rejected

Abstract

Perl’s slice syntax lets the programmer get at a subset of values or key/value pairs in a hash or array. This PPC proposes a syntax to get a slice that contains the complete set of values or key/value pairs in the hash or array.

Motivation

It’s always possible to get a slice of everything in a container, but doing so requires writing out the expression for “all the keys”. This duplicates the variable name, which is an opportunity for error in both reading and writing.

The nearest motivation for this change is the potential addition of n-at-a-time iteration in foreach. This syntax will work by default:

for my ($k, $v) (%hash) { ... }

To iterate over an array’s indexes and values, one would have to write:

for my ($i, $v) (%array[ keys @array ]) { ... }

Note, too, the sigil variance.
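As a minimal sketch of what that index/value slice evaluates to in existing Perl (5.20 or later, where %array[...] is available), the following walks the pairs two at a time without relying on n-at-a-time foreach. The variable names and the splice-based loop are illustrative assumptions, not part of the proposal.

```perl
use strict;
use warnings;

my @array = ('a', 'b', 'c');

# An index/value slice: note the % sigil on an array subscript.
# For an array, keys returns the indexes 0 .. $#array.
my @pairs = %array[ keys @array ];
my $flat  = "@pairs";    # "0 a 1 b 2 c"

# Consume the flat list two items at a time.
my @seen;
while ( my ($i, $v) = splice @pairs, 0, 2 ) {
    push @seen, "$i=$v";
}

print "$flat\n";
print "@seen\n";
```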

Rationale

%hash in list context evaluates to a list of pairs, while @array in list context evaluates to the values. There’s a means to get the values of a hash with values, but no existing built-in for (say) kv @array. Rather than propose adding a kv, which would only be useful on arrays, this PPC proposes adding a type of slice that includes all keys and values.

%hash{*};   # equivalent to %hash{ keys %hash }
%array[*];  # equivalent to %array[ 0 .. $#array ]

@hash{*};   # equivalent to @hash{ keys %hash }
@array[*];  # equivalent to @array[ 0 .. $#array ]
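The right-hand side of each equivalence is already valid Perl (5.20 or later for the % forms), so the intended results can be checked today. This is a sketch with made-up data; hash order is unspecified, so the hash results are inspected without relying on order.

```perl
use strict;
use warnings;

my %hash  = (one => 1, two => 2);
my @array = (10, 20, 30);

# What %hash{*} would produce: every key/value pair.
my %copy = %hash{ keys %hash };

# What @hash{*} would produce: every value.
my @hash_values = @hash{ keys %hash };

# What %array[*] would produce: every index/value pair, in index order.
my @index_value = %array[ 0 .. $#array ];    # (0, 10, 1, 20, 2, 30)

# What @array[*] would produce: every value, in index order.
my @array_values = @array[ 0 .. $#array ];   # (10, 20, 30)
```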

Note that @array[*] is not equivalent to @array, because it is a slice rather than a list of values. This is especially notable when taking a reference or performing assignment.

\@array;    # yields a reference to the array
\@array[*]; # yields a list of references, one to each entry in the array

@array    = (1 .. 100); # the array is now 100 elements long
@array[*] = (1 .. 100); # the array length is unchanged
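Both differences can be observed today by writing the slice out in full. In this sketch, @array[ 0 .. $#array ] stands in for the proposed @array[*], and the variable names are illustrative.

```perl
use strict;
use warnings;

my @array = ('a', 'b', 'c');

# Referencing: one reference to the array versus one reference per element.
my $array_ref = \@array;
my @elem_refs = \@array[ 0 .. $#array ];

# Assignment to the whole array replaces its contents and its length.
my @whole = ('a', 'b', 'c');
@whole = (1 .. 100);

# Assignment to a full slice fills only the existing slots;
# the surplus values on the right-hand side are discarded.
my @sliced = ('a', 'b', 'c');
@sliced[ 0 .. $#sliced ] = (1 .. 100);

printf "%d refs, whole=%d, sliced=%d\n",
    scalar(@elem_refs), scalar(@whole), scalar(@sliced);
```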

The availability of %array[*] is the primary reason to add this feature, but I have proposed this slice mechanism rather than a kv built-in because it provides straightforward syntax to more easily produce structures already familiar to at least intermediate Perl programmers.

The use of * is cribbed from Raku, where it is used as a multipurpose placeholder. While I believe this use is a plausible starting point for future expansion of the asterisk semantics, we could stop here without causing undue confusion.

Specification

The token * is permitted to stand alone (with optional whitespace on either or both sides) inside the {...} or [...] used in slicing a hash or array. A slice using * in place of subscript keys will act as if all keys were provided. For arrays, results will be provided in the numeric order of the keys. This may be optimized, but need not be.

Assignment to a key/value pair everything-slice is forbidden, as it is for any other key/value slice. Assignment to a value-only everything-slice is permitted as usual.
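The existing rule that this mirrors can be seen with ordinary slices. The sketch below assumes Perl 5.20 or later and uses a string eval so that the rejected assignment can be caught rather than aborting the program; it checks only that the assignment fails, not the exact wording of the error.

```perl
use strict;
use warnings;

# Assigning to an index/value slice is rejected by perl.
my $kv_ok = eval q{
    my @array = (10, 20);
    %array[ 0, 1 ] = (0 => 'x', 1 => 'y');
    1;
};
my $kv_error = $@;

# Assigning to a value-only slice is fine.
my @array = (10, 20);
@array[ 0, 1 ] = ('x', 'y');

print "key/value slice assignment rejected\n" unless $kv_ok;
print "@array\n";
```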

The asterisk may not be used as part of a larger expression inside the subscript. For example, this is not legal:

%array[ grep {; $_ > 5 } * ]

The behavior of an everything-slice on a tied array should be identical to the behavior of:

@array[ keys @array ];

The behavior of an everything-slice on a tied hash should be identical to the behavior of:

@hash{ keys %hash };

Backwards Compatibility

The use of * as a standalone subscript is currently a syntax error, so this syntax could be introduced without requiring any other changes or deprecations.

I believe that updating static analyzers will not be extremely complex, but I’m not particularly expert in any of them.

The deparser and coverage tools may need updating, but presumably no more than many other small changes. (This does not introduce any new runtime branching behavior.)

I can’t speak to the effect on Devel::NYTProf at all.

This can’t be easily implemented in older versions of perl.

Security Implications

[ none yet foreseen ]

Examples

[ can produce ]

Prototype Implementation

Is there something that shows the idea is feasible, and lets other people play with it? Such as

Future Scope

If we adopt * as an “everything” placeholder, we may want to use it in more places, possibly to be investigated by skimming off the top of Raku when applicable.

Rejected Ideas

Just Add kv

The simplest alternative here is to add a new built-in, kv, which operates on an array and evaluates to a list of index/value pairs, or operates on a hash, returning the key/value pairs.

This is much simpler semantically and, presumably, in implementation. On the other hand, it has somewhat fewer potential applications.
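For comparison, the array half of such a built-in can be approximated in pure Perl. The name kv_array and its array-reference calling convention are hypothetical, chosen only for this sketch; a real kv built-in would presumably take the array directly.

```perl
use strict;
use warnings;

# Hypothetical helper, not a real built-in: returns a flat list of
# index/value pairs for the array behind the given reference.
sub kv_array {
    my ($aref) = @_;
    return map { ($_, $aref->[$_]) } 0 .. $#{$aref};
}

my @array = ('a', 'b', 'c');
my @pairs = kv_array(\@array);   # (0, 'a', 1, 'b', 2, 'c')

print "@pairs\n";
```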

The Empty Subscript Option

One alternative to the asterisk that was proposed in the past was using an empty subscript. For example:

# Instead of:
@array[*]

# One could write:
@array[]

I believe this looks much more like a mistake. Moreover, note that while @array[] is currently illegal, @array[()] is legal, and evaluates to an empty list. I think this would lead to confusion. (Also, while I do not want to act as if generated code is a primary target, I think this would needlessly complicate code generation.)
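The existing behavior of the empty-list subscript is easy to confirm. A short sketch with illustrative data:

```perl
use strict;
use warnings;

my @array = (10, 20, 30);

# Legal today: a slice taken with an empty list of indexes.
my @none  = @array[ () ];
my $count = @none;

print "selected $count elements\n";
```

Under the empty-subscript proposal, @array[] would select every element while the visually similar @array[()] would keep selecting none.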

The Endless Range Option

Another alternative syntax is an extended range operator. These are already legal:

@array[ 2 .. 3 ]

@array[ 0 .. 99 ]

@array[ 0 .. $#array ]

The proposal is that, at least in the context of an array subscript, the ends could be left off. That is:

@array[ .. 99 ] # equivalent to 0..99

@array[ 0..   ] # equivalent to 0 .. $#array
@array[  ..   ] # equivalent to 0 .. $#array

There may be further ambiguity here, but I have mostly ignored this option because its applicability to hashes seems weird to me. Hash keys are not ordered, and can’t be ranged. Only the bare .. would be useful:

%hash{..}   # equivalent to %hash{ keys %hash }, currently a syntax error

That said, the implicit-ended range operator may have more direct uses than the asterisk.

Open Issues

Use this to summarise any points that are still to be resolved.

Copyright (C) 2021, Ricardo Signes.

This document and code and documentation within it may be used, redistributed and/or modified under the same terms as Perl itself.