Rebol Enhancement Proposal 2005

Author: Ladislav Mecir
Date: 23-Sep-2005/9:15:48+2:00

Contents:

1. Acknowledgments
2. Rebol Block Implementation and the REMOVE Function
3. Rebol Blocks and the Preallocated Space
4. Sameness of Rebol Blocks
5. Rebol List Implementation and the REMOVE Function
6. Errors As Results of Rebol (Sub)expressions
7. Signaling Errors Versus Yielding Errors
8. Error Reporting
9. Control Functions
9.1 CFOR
10. Indices Of Series
10.1 A SKIP Trap
10.2 Big Skips
10.3 REAL-INDEX?
10.4 PICK "Hole"
10.5 PICK deficiency
10.6 Negative Big Skips
11. Functions, RETURN, THROW, And Function Attributes
12. Functions
12.1 Local Words
13. The Result of PARSE
14. Literal Rules For Block Parsing
15. PARSE, TO and Numbers
16. PARSE, TO and Blocks
17. The Evaluation Order
18. #[unset!]
18.1 The primary purpose of #[unset!]
18.2 #[unset!] As A Return Value Of Some Functions
18.3 #[unset!] as the default argument value
18.4 #[unset!] as the value we can get querying undefined words
18.5 The Main Disadvantage
19. The Evaluation Order and the Infix Operators
20. What are the Rebol operators?
21. Unbound Words
22. Less Aggressive Evaluation
23. References

1. Acknowledgments

Carl Sassenrath is the inventor of the wonderful language.

Gabriele Santilli, Jeff Kreis, Daan Oosterveld, Frank Sievertsen, Joel Neely and Romano Paolo Tenca corrected some of my errors and helped me to improve this text.

See also Bug fixes and enhancements at the REP site.

2. Rebol Block Implementation and the REMOVE Function

See the Block implementation at REP.

3. Rebol Blocks and the Preallocated Space

See the Preallocated Space at REP.

4. Sameness of Rebol Blocks

See Block sameness at REP.

5. Rebol List Implementation and the REMOVE Function

The following is true for the current implementation:

>> a: make list! [1 2]
== make list! [1 2]
>> remove a
== make list! [2]
>> a
== make list! []
>> tail? a
** Script Error: Out of range or past end
** Near: tail? a

The implementation could be changed so that:

>> a: make list! [1 2]
== make list! [1 2]
>> remove a
== make list! [2]
>> a
== make list! [2]
>> tail? a
== false

, while retaining roughly the same speed.

6. Errors As Results of Rebol (Sub)expressions

See Errors as results at REP.

7. Signaling Errors Versus Yielding Errors

A question arises: how can an error be "signaled" instead of being "yielded" as the result of a (sub)expression?

Illustration: a subexpression yields an error as an argument for the TYPE? function:

type? make error! {pokus}

In the following example a subexpression signals an error instead of supplying its result to the TYPE? function:

type? 1 / 0

Variant #1: Rebol has a mechanism to actively evaluate a value. It is general enough and it is used for Rebol functions with great success. It uses words referring to functions. Such words actively call the functions when evaluated. In the Interpretation article I called such values word-active. This variant is based on a proposition to make Rebol errors word-active too.

Disadvantage: Poor backward compatibility.

Variant #2: Define a new function SIGNAL which should signal an error given as its argument.

Disadvantage: A necessity to define one new function.

8. Error Reporting

Let's try this code:

a: func [x] [do x]
a [1 / 0]

This fires an error:

** Math Error: Attempt to divide by zero
** Where: a
** Near: 1 / 0

A report like:

** Math Error: Attempt to divide by zero
** Near: 1 / 0
** where: do x
** where: a [1 / 0]

might explain better what happened.

Proposition: it would be good to have a mechanism that would allow us to choose the depth of the error report. Moreover, instead of supplying just a function name, I suggest showing the function call.
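
The information currently carried by an error can be inspected by disarming it. The following is a small sketch using the example above; it shows that the error object holds a single WHERE field and a single NEAR field, with no chain of calls:

```rebol
a: func [x] [do x]
err: disarm try [a [1 / 0]]   ; DISARM turns the error into an ordinary object
print err/type                ; math
print err/id                  ; zero-divide
probe err/where               ; the single function name reported after "Where:"
probe err/near                ; the code reported after "Near:"
```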

9. Control Functions

See Control functions at REP.

9.1 CFOR

Instead of the FOR mezzanine I use my own CFOR function, inspired by the for loop statement of the C language. Although similar to the C construct, it is more general than its model. Moreover, it is much faster than the FOR mezzanine. The implementation can be found at Cfor, and I think it would be good to have it as a standard mezzanine in Rebol.
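
The idea can be illustrated by a simplified sketch. This is not the implementation published at Cfor; the argument names INIT, TEST, INC and BODY are chosen here for illustration, and the sketch does nothing to keep the loop variable local:

```rebol
; Simplified sketch only; not the implementation published at Cfor.
cfor: func [
    {General loop modelled on the C for statement (simplified sketch)}
    init [block!] "evaluated once before the loop starts"
    test [block!] "the loop continues while this yields a true value"
    inc [block!] "evaluated after each pass through the body"
    body [block!] "the loop body"
] [
    do init
    while test [
        do body
        do inc
    ]
]

cfor [i: 1] [i <= 3] [i: i + 1] [print i]
; prints 1, 2 and 3 on separate lines
```

Because the test and the increment are ordinary blocks, the loop variable does not have to be a number. For example, cfor [s: "abc"] [not tail? s] [s: next s] [print first s] walks through a string.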

10. Indices Of Series

10.1 A SKIP Trap

The NEXT function isn't linear in the following sense: for any series S the expression

next tail s

yields the same result as the expression

tail s

This breaks a symmetry, because in all other cases the NEXT function yields a series with an index different from the index of its argument. The same holds for the BACK function at the head of a series and for the SKIP function at both ends.

Although this behaviour looks reasonable, it is the cause of two bugs in the mezzanine function FOR. If even the implementor of the language gets "caught" by a feature, then the feature should be reconsidered.
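
The behaviour at both ends can be checked directly in the console (S is an arbitrary example string):

```rebol
s: "abc"
same? next tail s tail s     ; == true   NEXT does not move past the tail
same? back head s head s     ; == true   BACK does not move before the head
same? skip s 10 tail s       ; == true   SKIP stops at the tail
same? skip s -10 head s      ; == true   SKIP stops at the head
index? next s                ; == 2      elsewhere the index changes
```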

10.2 Big Skips

The INDEX? function gives us useful information about series, although sometimes it works as follows:

a: "11"
b: next a
clear a
index? b
; ** Script Error: Out of range or past end
; ** Near: index? b

At first glance there seems to be nothing wrong with this approach, because the "large indices" may be considered "illegal". Nevertheless, the big skip occurred before we used the INDEX? function, i.e. the function cannot help us to eliminate it as a possible error.

10.3 REAL-INDEX?

We can define a function able to yield a realistic index value for any series:

real-index?: function [
    {return a realistic index for any series}
    series [series!]
] [orig-tail result] [
    orig-tail: tail :series
    while [error? try [result: index? :series]] [
        insert insert tail :series #"1" head :series
    ]
    clear :orig-tail
    result
]

Test:

a: "11"
b: next a
clear a
real-index? b ; == 2

The only trouble with the REAL-INDEX? function is that its non-native implementation is too expensive. That is why it should be implemented natively instead (e.g. as a refinement of the INDEX? function).

10.4 PICK "Hole"

If I am able to pick a value at the position I in a series and a value at the position I - 2 in the same series, I would expect to be able to pick a value at the position I - 1 too. That isn't true if I is equal to 1.

This is a trap similar to the one above, although I haven't seen many bugs caused by overlooking it.

It would be good to have a native function able to work more consistently.
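
The hole can be reproduced with a series positioned behind its head (the string is an arbitrary example):

```rebol
a: next "abc"   ; A is at index 2
pick a 1        ; == #"b"   position I = 1
pick a -1       ; == #"a"   position I - 2
pick a 0        ; == none   position I - 1 yields nothing
```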

10.5 PICK deficiency

Let's look at the code:

a: "123456789"
b: skip a 8
pick b -8 ; == #"1"

Now let's suppose that we do:

clear skip a 6
pick b -8

It would be good to have a function that works consistently in this case: it should check the bounds only after the index arithmetic is done, not sooner, and yield a value if the resulting position is within the bounds.

10.6 Negative Big Skips

I think there are good reasons why we should have a function able to create series with negative indices, exactly as we (can) have a function creating series with big positive indices (we could then simulate series starting at a specific index).

See also the Indexing proposal.

11. Functions, RETURN, THROW, And Function Attributes

I proposed more than two variants of the FUNC function.

The first one solves inconsistencies in function attributes very elegantly while not imposing any heavy computational burden on the interpreter.

Proposition: I suggest that a TFUNC-like function should be implemented natively.

See Function attributes at REP for description of advantages of the approach.

This can satisfy any needs the existing [catch] and [throw] attributes are designed to satisfy but more elegantly and consistently.

Another useful function is CFUNC, which deserves a native implementation too.

12. Functions

12.1 Local Words

Rebol functions use a different kind of local word definition than the MAKE OBJECT! construct. There are users, including myself, who prefer the MAKE OBJECT! method of defining local words.

While we can use our own functions to achieve this goal, like e.g. my LFUNC, some users would surely prefer to have it as standard.

set-words: function [
    {Get all set-words from a block}
    block [block!]
] [elem words] [
    words: make block! length? block
    parse block [
        any [
            set elem set-word! (
                insert tail words to word! :elem
            ) | skip
        ]
    ]
    words
]

locals?: function [
    {Get all locals from spec.}
    spec [block!]
] [locals item] [
    locals: make block! length? spec
    parse spec [
        any [
            set item any-word! (
                append locals to word! :item
            ) | skip
        ]
    ]
    locals
]

lfunc: func [
    {Define a function with set-words handled implicitly as local.}
    [catch]
    spec [block!]
    body [block!] "The body block of the function"
    /local vars
] [
    vars: exclude set-words body locals? spec
    if not empty? vars [
        spec: head insert insert tail copy spec /local vars
    ]
    throw-on-error [make function! spec body]
]

LFUNC can be used to create functions as follows:

comment [

    ; Usage:

    a: 4
    f: lfunc [] [a: 2]
    f ; == 2
    a ; == 4

]

The newest implementation capable of defining static variables and more is at LFUNC.

13. The Result of PARSE

To yield TRUE, PARSE must:

a) "get successfully" through the whole rule

b) get to the end position of the input
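
Both conditions can be seen with a few block-parsing calls (the input block is an arbitrary example):

```rebol
parse [1 2 3] [integer! integer!]            ; == false  rule matched, but input is left over
parse [1 2 3] [integer! string! integer!]    ; == false  the rule itself fails
parse [1 2 3] [integer! integer! integer!]   ; == true   rule matched and end reached
parse [1 2 3] [integer! to end]              ; == true   TO END consumes the rest
```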

14. Literal Rules For Block Parsing

Let's suppose that we want to use PARSE to check whether a block contains the numbers 1 2 3 in this order. The rule can be:

parse block [1 1 1 1 1 2 1 1 3]

, which looks awful.

Proposition: it would be good to have something like:

parse block [literally [1 2 3]]
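
Until such a keyword exists, the verbose rule can at least be generated. The following is a minimal sketch; LITERAL-RULE is a hypothetical helper, not a part of Rebol, and it is meant for integers only:

```rebol
literal-rule: func [
    {Build a PARSE rule matching the given integers literally (hypothetical helper)}
    values [block!] "the integers to match, in order"
    /local rule
] [
    rule: make block! 3 * length? values
    foreach value values [
        ; each integer N becomes the rule "1 1 N", i.e. exactly one N
        repend rule [1 1 value]
    ]
    rule
]

literal-rule [1 2 3]                  ; == [1 1 1 1 1 2 1 1 3]
parse [1 2 3] literal-rule [1 2 3]    ; == true
parse [1 2 4] literal-rule [1 2 3]    ; == false
```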

15. PARSE, TO and Numbers

parse "abcdefgh" [thru 2 a: (print a) to 1 a: (print a)]
; cdefgh
; abcdefgh
; == false

Is this intended?

16. PARSE, TO and Blocks

TO doesn't work with block rules in PARSE, although we can write some work-arounds, e.g. as in [Parseen].

17. The Evaluation Order

See Evaluation order.

18. #[unset!]

18.1 The primary purpose of #[unset!]

The interpreter uses a special "internal" value to indicate that a word having this value hasn't been initialized.

If a user makes a typo and tries to get the value of a word that hasn't been initialized, the REBOL interpreter can detect the error by recognizing that the word has this value.

Thus, #[unset!] drives the Rebol typo protection mechanism and helps to debug Rebol scripts.

On the other hand, this property of #[unset!] is purely "internal" to the interpreter, i.e. it doesn't require #[unset!] to be acceptable by Rebol functions or available to programmers.

Actually, the opposite is true. Any additional purpose or property we assign to #[unset!] can make #[unset!] less suitable for its primary purpose.

18.2 #[unset!] As A Return Value Of Some Functions

Some functions return #[unset!] as their result. Example:

do []
print "OK"
()
get/any 'nonsense

The reason is to obtain a value indicating that no (other) useful value is available.
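
Each of these results can be tested with UNSET?, assuming the word NONSENSE has no value in the current session:

```rebol
unset? do []                ; == true
unset? print "OK"           ; prints OK, == true
unset? ()                   ; == true
unset? get/any 'nonsense    ; == true
```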

18.3 #[unset!] as the default argument value

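#[unset!] is also used to detect an omitted argument, so that a default value can be supplied. A more comfortable way to handle this situation would be something like: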

f: func [n [number!] default: 0] [print n]

Instead of the current:

f: func [n [number! unset!]] [
    if not value? 'n [n: 0]
    print n
]

18.4 #[unset!] as the value we can get querying undefined words

Currently it is possible to write:

o: make object! [
    attribute: 1
    unset 'attribute
]
mold/all o
; == {
; make object! [
;     attribute: #[unset!]
; ]}
mold/all second o
; == {[
;     make object! [
;         attribute: #[unset!]
;     ] #[unset!]]}
do mold/all o

18.5 The Main Disadvantage

Because #[unset!] is a "legal" Rebol value (we can obtain it as a result of a Rebol expression), some programs have to handle it.

Proposition: I propose to keep #[unset!] as a word-active value, like Rebol functions. That way we obtain typo protection for normal usages, like

x

Here it will behave as it does now. On the other hand, I suggest changing the behaviour of #[unset!] for get-words and for the GET function, where it should behave the same way as Rebol functions do, i.e. be evaluated "passively".

This will remove one exception for the evaluation of get-words and simplify the majority of scripts.

19. The Evaluation Order and the Infix Operators

The Rebol User's Guide currently states:

"There are two rules to remember when evaluating mathematical expressions: Expressions are evaluated from left to right. Operators take precedence over functions."

From the above wording I conclude that the Rebol designer's goal was to keep the Evaluation Order (EO) rules simple. The benefits of a simple Evaluation Order are:

  • a simple EO rule set simplifies the language and enhances its usability
  • a simple EO rule set simplifies the interpreter
  • a simple EO rule set is better suited for a faster interpreter or a JIT compiler

The problem is that the Rebol designer didn't implement the rule he wrote. Currently, the evaluation of the expression:

a b + c d

may be equivalent to e.g.:

(a b) + (c d)

or

(a (b + c)) d

or

a b (+ c d)

depending on the values A, B, C, D have. I would characterize Rebol as having a Value Driven Evaluation Order.
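
A small console experiment illustrates the dependence on values. The concrete values (integers and the NEGATE function) are chosen here only for illustration; REDUCE is used so that every intermediate result is visible:

```rebol
a: 1  b: 2  c: 3  d: 4
reduce [a b + c d]    ; == [1 5 4]   evaluated as a (b + c) d

a: :negate
reduce [a b + c d]    ; == [-5 4]    evaluated as (a (b + c)) d

a: 1  c: :negate
reduce [a b + c d]    ; == [1 -2]    evaluated as a (b + (c d))
```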

20. What are the Rebol operators?

This seems to be an easy question, but the answer is not that simple. My observations show that the "operator" notion mentioned in the Rebol User's Guide (see above) doesn't mean a value of the OP! type:

do probe reduce [1 :+ 2]
; [1 op 2]
; ** Script Error: none expected value2 argument of type: number pair char money date time tuple
; ** Where: do-boot
; ** Near: op 2

Nor does it mean a word having a value of the OP! type:

a-word: :+
1 a-word 2
; ** Script Error: a-word expected value2 argument of type: number pair char money date time tuple
; ** Where: do-boot
; ** Near: a-word 2

My observations support the hypothesis that an infix operator is a special word, which I call an Op-word. The Op-words I know of are: * ** + - / // < <= <> = == =? > >= != AND OR. One of these, the - operator, is a category of its own. It behaves as a unary operator:

do/next [- 3 2] ; == [-3 [2]]

, or as a binary operator:

do/next [3 - 2] ; == [1 []]

Conclusion: Rebol operators are Op-words having op! type values.

21. Unbound Words

Users may sometimes encounter Unbound Words. I think that it is not advisable to make Unbound Words accessible to users. More on this can be found in [Contexts].

22. Less Aggressive Evaluation

The latest versions of the interpreter evaluate some words less aggressively. I propose to make the Less Aggressive Evaluation even broader. In particular, I mean that for lit-paths as well as lit-words the expressions:

a

and

:a

should be equivalent. More on this can be found in [Interpret].

23. References

Ladislav's Rebol Page

[Evaluation]

[Contexts]

[Interpret]

[Parseen]

The End

MakeDoc2 by REBOL - 23-Sep-2005