Stable SV Boolean Type
Preamble
Author: Paul Evans <PEVANS>
Sponsor:
ID: 0008
Status: Implemented
Abstract
Add to regular (defined and non-referential) scalars the ability to
track whether they represent a boolean value. Values created from
boolean expressions (such as $x == $y
) will reliably
remember this fact, in a way that can be introspected even if the value
is stored into a variable and retrieved later.
Motivation
Language interoperability concerns sometimes lead to situations where it is necessary to represent typed data - such as strings or numbers - in a way that can be reliably distinguished. Elsewhere in Perl there are attempts to add this distinction. This PPC attempts to address how to handle values of a boolean type; values that represent a simple true-or-false nature.
For example, JSON and MsgPack are commonly-encountered serialisation formats that Perl can generate and parse, though in both cases while the formats themselves can represent boolean truth values distinctly from numbers or strings, perl cannot preserve that distinction. Compare this to some other languages popular at the moment:
$ python3 -c 'import json; print(json.dumps([1, "1", 1 == 1]))'
[1, "1", true]
$ nodejs -e 'console.log(JSON.stringify([1, "1", 1 == 1]))'
[1,"1",true]
$ perl -MJSON::MaybeUTF8=encode_json_utf8 -E 'say encode_json_utf8([1, "1", 1 == 1])'
[1,"1",1]
There is a tendancy for modules to make up this shortfall with
workarounds like the JSON::PP::Boolean
type:
$ perl -MJSON::MaybeUTF8=decode_json_utf8 -MData::Dump=pp
-E 'say pp(decode_json_utf8($ARGV[0]))' '[1,true]'
[1, bless(do{\(my $o = 1)}, "JSON::PP::Boolean")]
Obviously such a solution is specific to JSON encoding and does not apply to, for example, message gateway between JSON and MsgPack, which would require some translation inbetween. A true in-core solution to this problem would have many benefits to interoperability of data handling between these various modules.
((TODO: Add some comments about purely in-core handling as well that don’t rely on serialisation))
Rationale
Specification
There are two parts to this specification; the lower-level in-core details of the C code implementation that are visible to XS code; and the higher-level Perl-visible details that are made visible to Perl programs.
C-level Internals
At the C level, SVs will require a macro to test whether they contain a boolean.
SvIsBOOL(sv)
The core immortals PL_sv_yes
and PL_sv_no
will always respond true to SvIsBOOL()
, as will any SV
initialised by copying from them by using sv_setsv()
and
friends. The upshot here is that the result of boolean-returning ops
will be SvIsBOOL()
and this flag will remain with any
copies of that value that get made - either over the arguments stack or
stored in lexicals or elements of aggregate structures.
These SvIsBOOL()
values will still be subject to the
usual semantics regarding macros like SvPV
or
SvIV
- so numerically they will still be 1 or 0, and
stringily they will still be “1” and “” (though see the Future Scope
section below).
It may be possible to find a spare SvFLAGS
field for
this purpose, at which point there would be macros on a theme similar to
the SvPOK
/SvIOK
/etc.. set. I would suggest
avoiding the single-letter abbreviations of the older style of code -
letters aren’t so expensive these days that we can’t at least spell out
the word “BOOL” in full.
Flag bits being rare and expensive, it may turn out that finding a
spare flag bit is simply impossible (or at least, undesired). This could
be implemented instead by using special pointer values in the
SvPVX()
slots of such booleans. A small change to
sv_setsv()
and friends would be possible to cause it to
copy the actual pointer value itself, such that any “boolean” SV can be
distinguished by pointer equality of this field, to that of the original
“yes” and “no” immortals.
Perl-level Interface
Once the interpreter can reliably store the concept of “is a boolean” and expose this at least to XS modules, the question remains on how pureperl code can make use of it. How can boolean values be created, and how can we test if a given value is a boolean?
For creating such values, in many cases it is sufficient to simply use any of the existing boolean predicate operators, for example the comparison operators:
my $sv = 4 > 5; # The $sv now has SvIsBOOL
It may be considered useful to provide two new zero-arity functions,
true
and false
, to explicitly create these
values (which are at least a little more obvious in intent than the
equivalent !!1
and !!0
). These can be
requested from the builtin
module (PPC 0009):
use builtin 'true', 'false';
1, "1", true); # Three distinct values
func(0, "0", "", false); # Four distinct values func(
The question remains on how pureperl code can inspect the “type” of a value to inspect if it is one of these special boolean values. While perl core doesn’t have any testing keywords for this, the nearest match to currently available features may be to add a new builtin function:
use builtin 'isbool';
sub distinguish($x) {
say !isbool $x ? "not a boolean" :
$x ? "true" :
"false";
}
Whatever solution is considered here should be designed to interact well with any possible larger considerations from the fallout of stronger string-vs-number distinctions, and other ideas currently floating around.
I accept that this is the weakest part of this suggestion so far, and welcome comment here in particular (though please first read the “Future Scope” section below).
Backwards Compatibility
There are not expected to be any backwards compatiblity problems with this proposal. At the interpreter level, extra information is being added to certain SV values, without changing the meaning of any existing information currently stored. Code that is unaware of the new semantics will continue to see existing values unaltered.
Likewise at the Perl syntax level, the only new functionality being
added consists of new functions in the builtin
namespace
(PPC 0009).
Security Implications
Likewise, as this proposal only adds extra information to that being stored by SVs, it is not anticipated there are any security implications of doing so. There is not expected to be a security effect from exposing the fact that some SVs happen to express values of boolean intent.
Examples
(See embedded code above)
Prototype Implementation
I made an attempt at using SvPVX()
tracking of the
immortal constants and the SvIsBOOL()
test macro, at
https://github.com/leonerd/perl5/tree/stable-bool
This was accepted and merged into core perl in time to be released as part of Perl version 5.35.4.
There is no attempt yet to provide the true
,
false
or isbool
builtin functions as the
builtin function mechanism is not yet available. A workaround for
isbool
is provided by Scalar::Util
.
Future Scope
By its nature this proposal fits in the category of “give Perl values more typing information”. As such it is likely to fit well with other ideas of a similar nature - such as clearer distinction between strings and numbers, between Unicode text and byte buffers, and other concepts. It may become possible to group these ideas together in a better way, to better solve such questions as how to add predicate-test functions in a way that is visible to Perl syntax.
It may also be possible to provide a lexically-guarded feature that
alters the way that boolean values are stringified. It is noted that in
Perl currently, a false value stringifies to the empty string, which
leads to certain difficulties in debugging output. Perhaps under the
boolean
feature, or perhaps under its own unique feature
name, boolean values could stringify differently:
use feature 'boolean';
my ($x, $y) = (true, false);
print "X=$x Y=$y\n";
__END__
X=true Y=false
Rejected Ideas
One thing that is certainly impossible to do currently is change the stringification of existing boolean values. There is far too much code around - primarily in unit tests - which relies on the current stringification of boolean true and false values, and this cannot be modified. Doing so would break such tests as:
"", 'somefunc returns false' ); is( somefunc(),
Open Issues
Copyright
Copyright (C) 2021, Paul Evans.
This document and code and documentation within it may be used, redistributed and/or modified under the same terms as Perl itself.