Rebol Enhancement Proposal 2005
Author: Ladislav Mecir Date: 23-Sep-2005/9:15:48+2:00
Contents:
1. Acknowledgments
2. Rebol Block Implementation and the REMOVE Function
3. Rebol Blocks and the Preallocated Space
4. Sameness of Rebol Blocks
5. Rebol List Implementation and the REMOVE Function
6. Errors As Results of Rebol (Sub)expressions
7. Signaling Errors Versus Yielding Errors
8. Error Reporting
9. Control Functions
9.1 CFOR
10. Indices Of Series
10.1 A SKIP Trap
10.2 Big Skips
10.3 REAL-INDEX?
10.4 PICK "Hole"
10.5 PICK deficiency
10.6 Negative Big Skips
11. Functions, RETURN, THROW, And Function Attributes
12. Functions
12.1 Local Words
13. The Result of PARSE
14. Literal Rules For Block Parsing
15. PARSE, TO and Numbers
16. PARSE, TO and Blocks
17. The Evaluation Order
18. #[unset!]
18.1 The primary purpose of #[unset!]
18.2 #[unset!] As A Return Value Of Some Functions
18.3 #[unset!] as the default argument value
18.4 #[unset!] as the value we can get querying undefined words
18.5 The Main Disadvantage
19. The Evaluation Order and the Infix Operators
20. What are the Rebol operators?
21. Unbound Words
22. Less Aggressive Evaluation
23. References
1. Acknowledgments
Carl Sassenrath is the inventor of the wonderful language.
Gabriele Santilli, Jeff Kreis, Daan Oosterveld, Frank Sievertsen, Joel Neely and Romano Paolo Tenca corrected some of my errors and helped me to improve this text.
See Bug fixes and enhancements at the REP site too.
2. Rebol Block Implementation and the REMOVE Function
See the Block implementation at REP.
3. Rebol Blocks and the Preallocated Space
See the Preallocated Space at REP.
4. Sameness of Rebol Blocks
See Block sameness at REP.
5. Rebol List Implementation and the REMOVE Function
The following is true for the current implementation:
>> a: make list! [1 2]
== make list! [1 2]
>> remove a
== make list! [2]
>> a
== make list! []
>> tail? a
** Script Error: Out of range or past end
** Near: tail? a
It is possible to change the implementation in such a way that:
>> a: make list! [1 2]
== make list! [1 2]
>> remove a
== make list! [2]
>> a
== make list! [2]
>> tail? a
== false
, while retaining roughly the same speed.
6. Errors As Results of Rebol (Sub)expressions
See Errors as results at REP.
7. Signaling Errors Versus Yielding Errors
This raises a question: how can we "signal" an error instead of "yielding" it as the result of a (sub)expression?
Illustration: a subexpression yields an error as an argument for the TYPE? function:
type? make error! {pokus}
In the following example a subexpression signals an error instead of supplying its result to the TYPE? function:
type? 1 / 0
Variant #1: Rebol has a mechanism to actively evaluate a value. It is general enough, and it is used for Rebol functions with great success. It uses words referring to functions; such words actively call the functions when evaluated. In the Interpretation article I called such values word-active. This variant proposes making Rebol errors word-active too.
Disadvantage: Poor backward compatibility.
Variant #2: Define a new function SIGNAL that signals the error given as its argument.
Disadvantage: One new function has to be defined.
8. Error Reporting
Let's try this code:
a: func [x] [do x]
a [1 / 0]
This fires an error:
** Math Error: Attempt to divide by zero
** Where: a
** Near: 1 / 0
A report like:
** Math Error: Attempt to divide by zero
** Near: 1 / 0
** where: do x
** where: a [1 / 0]
might explain better what happened.
Proposition: it would be good to have a mechanism that allows us to choose the depth of the error report. Moreover, instead of supplying just a function name, I suggest showing the function call.
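The current report is limited by the error value itself, which records a single NEAR and a single WHERE field. A minimal sketch of inspecting them, assuming the interpreter discussed here, where DISARM converts an error into a plain object:

```rebol
a: func [x] [do x]
; TRY returns the error as a value, DISARM turns it into an object
err: disarm try [a [1 / 0]]
probe err/type    ; math
probe err/id      ; zero-divide
probe err/near    ; the code fragment reported after "** Near:"
probe err/where   ; the single function name reported after "** Where:"
```

A report with several WHERE lines would need more than one such entry per error.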
9. Control Functions
See Control functions at REP.
9.1 CFOR
Instead of the FOR mezzanine I use my own CFOR function, inspired by the C language for loop statement. Although similar to the C construct, it is more general than its model. Moreover, it is much faster than the FOR mezzanine. The implementation can be found at Cfor, and I think it would be good to have it as a standard mezzanine in Rebol.
10. Indices Of Series
10.1 A SKIP Trap
The NEXT function isn't linear in the following sense: for any series S the expression
next tail s
yields the same result as the expression
tail s
This breaks a symmetry, because in all other cases the NEXT function creates a series with a different index than the index of its argument. The same holds for the BACK function at the head of a series and for the SKIP function at both ends.
Although this behaviour looks reasonable, it is the cause of two bugs in the mezzanine function FOR. If even the implementor of the language gets "caught" by a feature, then the feature should be reconsidered.
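The clipping can be observed directly. A short sketch, where S is just an example string and the results are those of the interpreter discussed here:

```rebol
s: "abc"
same? tail s next tail s    ; == true, NEXT does not move past the tail
same? head s back head s    ; == true, BACK does not move before the head
index? skip s 100           ; == 4, SKIP is clipped to the tail
index? skip tail s -100     ; == 1, SKIP is clipped to the head
```

A loop that relies on NEXT always advancing the index therefore has to test for the tail explicitly.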
10.2 Big Skips
The INDEX? function gives us useful information about a series, but sometimes it fails as follows:
a: "11"
b: next a
clear a
index? b
; ** Script Error: Out of range or past end
; ** Near: index? b
At first glance there seems to be nothing wrong with this approach, because the "large indices" may be considered "illegal". Nevertheless, the big skip occurred before we used the INDEX? function, i.e. the function cannot help us to eliminate it as a possible error.
10.3 REAL-INDEX?
We can define a function able to yield a realistic index value for any series:
real-index?: function [
{return a realistic index for any series}
series [series!]
] [orig-tail result] [
orig-tail: tail :series
while [error? try [result: index? :series]] [
insert insert tail :series #"1" head :series
]
clear :orig-tail
result
]
Test:
a: "11"
b: next a
clear a
real-index? b ; == 2
The only trouble with the REAL-INDEX? function is that its non-native implementation is too expensive, which is why it should be implemented natively instead (e.g. as a refinement of the INDEX? function).
10.4 PICK "Hole"
If I am able to pick a value at the position I in a series and a value at the position I - 2 in the same series, I would expect to be able to pick a value at the position I - 1 too. That isn't true if I is equal to 1.
This is a trap similar to the one above, although I haven't seen many bugs caused by overlooking it.
It would be good to have a native function able to work more consistently.
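The hole can be demonstrated as follows. This is a small sketch: A and B are example words, and the results are those of the interpreter discussed here.

```rebol
a: "abc"
b: next a
pick b 1     ; == #"b", position I = 1
pick b -1    ; == #"a", position I - 2
pick b 0     ; == none, position I - 1 is the "hole"
```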
10.5 PICK deficiency
Let's look at the code:
a: "123456789"
b: skip a 8
pick b -8 ; == #"1"
Now let's suppose, that we do:
clear skip a 6
pick b -8
It would be good to have a function that works consistently in this case: it should check the bounds only after the index arithmetic is done, not sooner, and yield a value if the resulting position is within the bounds.
10.6 Negative Big Skips
I think there are good reasons why we should have a function able to create series with negative indices, exactly like we (can) have a function creating series with big positive indices (we could then simulate series starting at a specific index).
See also the Indexing proposal.
11. Functions, RETURN, THROW, And Function Attributes
I proposed more than two variants of the FUNC function.
The first one solves inconsistencies in function attributes very elegantly while not imposing any heavy computational burden on the interpreter.
Proposition: I suggest that a TFUNC-like function should be implemented natively.
See Function attributes at REP for a description of the advantages of the approach.
This can satisfy any needs the existing [catch] and [throw] attributes are designed to satisfy but more elegantly and consistently.
Another useful function is CFUNC, which deserves a native implementation too.
12. Functions
12.1 Local Words
Rebol functions use a different kind of local word definition than the MAKE OBJECT! construct. There are users, including myself, who prefer the MAKE OBJECT! method of defining local words.
While we can use our own functions to achieve this goal, e.g. my LFUNC, some users would surely prefer to have it as standard.
set-words: function [
{Get all set-words from a block}
block [block!]
] [elem words] [
words: make block! length? block
parse block [
any [
set elem set-word! (
insert tail words to word! :elem
) | skip
]
]
words
]
locals?: function [
{Get all locals from spec.}
spec [block!]
] [locals item] [
locals: make block! length? spec
parse spec [
any [
set item any-word! (
append locals to word! :item
) | skip
]
]
locals
]
lfunc: func [
{Define a function with set-words handled implicitly as local.}
[catch]
spec [block!]
body [block!] "The body block of the function"
/local vars
] [
vars: exclude set-words body locals? spec
if not empty? vars [
spec: head insert insert tail copy spec /local vars
]
throw-on-error [make function! spec body]
]
, where LFUNC can be used to create functions as follows:
comment [
; Usage:
a: 4
f: lfunc [] [a: 2]
f ; == 2
a ; == 4
]
The newest implementation capable of defining static variables and more is at LFUNC.
13. The Result of PARSE
To yield TRUE, PARSE must:
a) "get successfully" through the whole rule, and
b) get to the end position of the input.
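Both conditions can be seen in a few one-liners (a small sketch using example strings):

```rebol
parse "ab" ["a" "b"]       ; == true, the rule succeeds and the input is exhausted
parse "ab" ["a"]           ; == false, the rule succeeds but "b" is left over
parse "ab" ["b"]           ; == false, the rule itself fails
parse "ab" ["a" to end]    ; == true, TO END moves to the end position
```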
14. Literal Rules For Block Parsing
Let's suppose that we want to use PARSE to check whether a block contains the numbers 1 2 3 in this order. The rule can be:
parse block [1 1 1 1 1 2 1 1 3]
, which looks awful.
Proposition: it would be good to have something like:
parse block [literally [1 2 3]]
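Until something like this exists, the awkward rule can at least be generated. The following is a minimal sketch: LITERAL-RULE is a hypothetical helper name (it is not part of Rebol), and it assumes that the block contains only integers.

```rebol
; LITERAL-RULE is a hypothetical helper, not a standard function
literal-rule: func [
    {Build a block parsing rule matching the given integers literally}
    values [block!] "Block of integers (assumption: no other datatypes)"
    /local rule
] [
    rule: make block! 3 * length? values
    foreach value values [
        ; each integer N becomes the rule fragment 1 1 N
        insert tail rule reduce [1 1 value]
    ]
    rule
]

probe literal-rule [1 2 3]    ; [1 1 1 1 1 2 1 1 3]
parse block literal-rule [1 2 3]
```

Here BLOCK stands for the block being checked, as in the examples above.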
15. PARSE, TO and Numbers
parse "abcdefgh" [thru 2 a: (print a) to 1 a: (print a)]
cdefgh
abcdefgh
== false
Is this intended?
16. PARSE, TO and Blocks
TO doesn't work with block rules in PARSE, although we can write some workarounds, e.g. as in [Parseen].
17. The Evaluation Order
See Evaluation order.
18. #[unset!]
18.1 The primary purpose of #[unset!]
The interpreter uses a special "internal" value to indicate that a word having this value hasn't been initialized.
If a user makes a typo and tries to get the value of a word that hasn't been initialized, the Rebol interpreter can detect the error by recognizing that the word has this value.
Thus, #[unset!] is "steering" the Rebol Typo Protection Mechanism and helping to debug Rebol scripts.
On the other hand, this property of #[unset!] is purely "internal" to the interpreter, i.e. it doesn't require #[unset!] to be acceptable to Rebol functions or available to programmers.
Actually, the opposite is true. Any additional purpose or property we assign to #[unset!] can make #[unset!] less suitable for its primary purpose.
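The protection can be observed as follows. This is a small sketch that assumes the misspelled word NONSENCE has never been set:

```rebol
error? try [nonsence]       ; == true, the typo is detected when the word is evaluated
value? 'nonsence            ; == false
unset? get/any 'nonsence    ; == true, GET/ANY bypasses the protection
```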
18.2 #[unset!] As A Return Value Of Some Functions
Some functions return #[unset!] as their result. Example:
do []
print "OK"
()
get/any 'nonsense
The reason is to obtain a value indicating that no (other) useful value is available.
18.3 #[unset!] as the default argument value
#[unset!] is also used to detect an omitted argument. A more comfortable way to handle this situation would be something like:
f: func [n [number!] default: 0] [print n]
Instead of the current:
f: func [n [number! unset!]] [
if not value? 'n [n: 0]
print n
]
18.4 #[unset!] as the value we can get querying undefined words
Currently it is possible to write:
o: make object! [
attribute: 1
unset 'attribute
]
mold/all o
; == {
; make object! [
; attribute: #[unset!]
; ]}
mold/all second o
; == {[
; make object! [
; attribute: #[unset!]
; ] #[unset!]]}
do mold/all o
18.5 The Main Disadvantage
Because #[unset!] is a "legal" Rebol value (we can obtain it as a result of a Rebol expression), some programs have to handle it.
Proposition: I propose to keep #[unset!] as a word-active value, like Rebol functions. That way we obtain typo protection for normal usages, like
x
, where it will behave like it does now. On the other hand, I suggest changing the behaviour of #[unset!] for get-words and for the GET function, where I suggest defining the same behaviour as for Rebol functions, i.e. a "passive" evaluation.
This will remove one exception for the evaluation of get-words and simplify the majority of scripts.
19. The Evaluation Order and the Infix Operators
The Rebol User's Guide currently states:
"There are two rules to remember when evaluating mathematical expressions:
Expressions are evaluated from left to right. Operators take precedence over functions."
Having seen the above wording, I can conclude that the Rebol designer's goal was to define simple Evaluation Order (EO) rules. The benefits of a simple Evaluation Order are:
- a simple EO rule set simplifies the language and enhances its usability
- a simple EO rule set simplifies the interpreter
- a simple EO rule set is better suited for a faster interpreter or a JIT compiler
The problem is that the Rebol designer didn't implement the rule he wrote. Currently, the evaluation of the expression:
a b + c d
may be equivalent to e.g.:
(a b) + (c d)
or
(a (b + c)) d
or
a b (+ c d)
depending on the values that A, B, C and D have. I would characterize Rebol as having a Value Driven Evaluation Order.
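The dependence on values can be made concrete. In the following sketch the same block is reduced three times. Only the values of A and C change; the function bodies are arbitrary examples, and the results are those of the interpreter discussed here.

```rebol
b: 1  c: 2  d: 3

a: 10
reduce [a b + c d]      ; == [10 3 3], three expressions: a, (b + c), d

a: func [x] [x * 10]
reduce [a b + c d]      ; == [30 3], two expressions: (a (b + c)), d

c: func [y] [y * 100]
reduce [a b + c d]      ; == [3010], one expression: a (b + (c d))
```

The source text is identical in all three cases, so the grouping cannot be determined without knowing the values.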
20. What are the Rebol operators?
This seems to be an easy question, but the answer is not that simple. My observations show that the "operator" notion mentioned in the Rebol User's Guide (see above) doesn't mean a value of the OP! type:
do probe reduce [1 :+ 2]
; [1 op 2]
; ** Script Error: none expected value2 argument of type: number pair char money date time tuple
; ** Where: do-boot
; ** Near: op 2
Nor does it mean a word having a value of the OP! type:
a-word: :+
1 a-word 2
; ** Script Error: a-word expected value2 argument of type: number pair char money date time tuple
; ** Where: do-boot
; ** Near: a-word 2
My observations support the hypothesis that an infix operator is a special word, which I call an Op-word. The Op-words I know of are: * ** + - / // < <= <> = == =? > >= != AND OR. One of these, the - operator, is a category of its own. It behaves as a unary operator:
do/next [- 3 2] ; == [-3 [2]]
, or as a binary operator:
do/next [3 - 2] ; == [1 []]
Conclusion: Rebol operators are Op-words having op! type values.
21. Unbound Words
Users may sometimes encounter Unbound Words. I think it is not advisable to make Unbound Words accessible to users. More on this can be found in [Contexts].
22. Less Aggressive Evaluation
The latest versions of the interpreter evaluate some words less aggressively. I propose to make the Less Aggressive Evaluation even broader. In particular, I mean that for lit-paths as well as lit-words the expressions:
a
and
:a
should be equivalent. More on this can be found in [Interpretation].
23. References
Ladislav's Rebol Page
[Evaluation]
[Contexts]
[Interpret]
[Parseen]
The End