Chapter 8. Ordered sets

Table of Contents

Set operators
Relational operators
Projection
Selection
Natural join
Cross join and ϴ-join
Natural outer joins
Fold (aggregation or reduction)
Partitioning
Temporal relational operators
@&@ (Temporal and)
@<:@ (Temporal element-of)
@?@ (Temporal touch)
@? (Temporal semi-touch)
@!? (Temporal anti-touch)
@\/ (Temporal union)
Set ordering
![] (Order-by operator)
Resulting order
.#0 (Row number)

Rax's ordered sets allow to group a variable number of values of the same type. Sets in Rax are different from sets in math. First, Rax sets can contain duplicates (so they're actually multisets). For example, this is a Rax set:

    {1, 2, 2, 3, 4, 5}
       

while in math, we can only have this:

    {1, 2, 3, 4, 5}
       

Second, in Rax, sets are ordered. Just like every value has an associated type, so does every set have an (associated) ordering, either implicit or explicit. For example the literal set above, has an implicit ordering defined by the order in which the elements are typed in the statement. To order a Rax set explicitly, the ![] postfix operator can be used. The last line of the example below would print the set T ordered by the second field.

    {[#,#]}: T := {[1,4],[2,1],[3,7]};            // Make a set of three tuples.
    `print T![.#2];                               // Output: {[2,1],[1,4],[3,7]}
       

Note that Rax has no set change operators like Ruby's << operator or Python's pop() method. In Rax, as in many functional languages, new values can be derived from existing values, but existing values cannot be changed. Therefore, the T![.#2] creates a new ordered set that has the same elements as T but with a (possibly) different ordering.

Indexing.  Since Rax sets are ordered, the elements of the sets can be referenced by indexes, and subsets (of consecutive elements) by ranges. Set indexes are 1-based. For example:

    {#}: set := {1..10};  // Create a set of numbers from 1 to 10.
    `print set[1];        // Get the first element: 1.
    `print set[1..3];     // Get the first three elements: {1, 2, 3}.
    `print set[5..];      // Get the elements from the 5th onward: {5,6,7,8,9,10}.
    `print set[..5];      // Get the elements up to the 5th: {1,2,3,4,5}.
    `print set[3..3];     // Get the third element as a singleton: {3}.
         

Indexes that are either too big or too small effectively point to elements that are not there, and therefore return the (current) default value for missing data. For example:

    {#}: set  := {1..9};
    `print set[10];       // Output: 0
    `print set[-1];       // Output: 0
         

In the above example, a wrong index returns 0 - the default missing value for number (for more information on missing values see Chapter 12, Missing values). In the case of ranges, the returned subset is cut to the elements that exists in the original set. For example:

    {$}:  set  := "a" + ($){1..99};     // Create a set of strings.
    \set: sub1 := set[-2..4];           // Equivalent to set[..4] and set[1..4].
    \set: sub2 := set[0..4];            // Equivalent to set[..4] and set[1..4].
    \set: sub3 := set[..9999];          // Equivalent to the whole set.
    \set: sub4 := set[-1..];            // Equivalent to the whole set.
    \set: sub5 := set[3..9999];         // Equivalent to set[3..].
    \set: sub6 := set[990..999];        // Equivalent to {}.
    \set: sub7 := set[-9..-4];          // Equivalent to {}.
         

As the last two assignments, sub7 and sub6, of the above example show, an upper index that is too low will cause the empty set to be returned as will a lower bound that is too high, effectively these ranges are empty. A range can also be wrong even if its indexes are within bounds: if the upper index is lower than the lower index. Continuing the above example:

    \set: sub8 := set[60..40];          // Equivalent to {}.
    \set: sub0 := set[0..0];            // Equivalent to {}.
         

Note that no errors or warning are generated by default if indexes are out of bounds. However, by switching the warning level to four (or above) the above two examples would generated a manifold of warnings. The warning level can be set with the meta command %warn 4;. Note that at warning level 4, Rax will warn about of lot of possible suspect things, for example it would warn about possible unintended string concatenation if a literal string ends in a comma, like so: {"aap," "noot", "mies"}. Indexes and ranges cannot be assigned to. For example, these expressions will generate an error:

    {#}: set  := {1..99};
    set[1]    := 5;            // Error: Assignment needs named storage (lval).
    set[1..3] := {11, 12, 13}; // Error: Assignment needs named storage (lval).
         

If the first element of an ordered set really needs to be replaced use something like P := {5} \/ P[2..];. In this case, however, the name P would point to a new value of the set, so the word "replaced" should not be taken literally.

Comparing sets.  Sets in Rax are considered equal, if they have the same type and the same elements (including the duplicates). The ordering of two sets does not have to be the same for the sets to be considered equal. We say that set ordering in Rax is weak (similarly to tuple-field labeling).

Empty set.  In Rax, the empty set is denoted by {}. The empty set is a special value because it can be assigned to any variable of type set, regardless what type of set it is exactly. For example all the expressions below are valid:

    {#}: set_of_numbers := {};
    {$}: set_of_strings := {};
    {[#:id, $:name]}: set_of_tuples := {};
       

However, an attempt to assign any of the above variables to another will result in an error, because they're of different types:

    set_of_numbers := set_of_strings;

    // Error: Type mismatch assigning '{$}' to '{#}'
       

Nested sets.  Sets in Rax can be nested arbitrarily deep. For example:

    {{#}}: set_of_sets_of_numbers := {{0}, {0..1}, {0..2}, {0..3}};
    {[{[[{[{#},$:s]},&:r],&:r]}:s,$:s]}: aMess;
       

Some set operators actually operate on a set of sets, for example prefix union (\/) or temporal union (@\/), others return a set of sets, like the partition operator. Allowing nested sets is a major difference between Rax and SQL. Nested sets make handling many operations on sets much more intuitive. A good example is the partition operator which divides a set into subsets. SQL does not contain a partition operator because the result of every query must be a table (i.e., non-nested set of tuples), not a set of sets. Oracle's SQL dialect provides a PARTITION BY clause which allows some operations on partitions. However, since partitions cannot be explicit, the SQL syntax is quite complex. This is a classical example from Oracle, finding the top two ranking earners per department. In Oracle's SQL:

    select *
    from (select deptno, empno, ename, sal,
                 rank() over (partition by deptno
                              order by sal desc) "RANK"
          from emp)
    where "RANK" <= 2
    order by deptno, "RANK"
       

In Rax:

    {[#:DEPTNO, #:EMPNO, $:ENAME, #:SAL]}: emp := ...

    \emp <- \emp: topSalary <- {
      out := in[1..2]![-.SAL];
    };
    \emp: topEarners := \/ topSalary(partition[.DEPTNO] emp);
    `print topEarners[1..]![.DEPTNO, -.SAL];
       

In the Rax code, we explicitly partition the set of employees by DEPNO, and then find the two top earners (first two elements in the set ordered by SAL) in each partition. Finally, we unite all the top-earners sets. Note that, even though function topSalary is defined on a set of employees (\emp), it is called on a set of such sets. We call this mechanism auto-iteration and we will describe it in more detail in the next paragraph.

Auto-iteration.  Rax provides many operators on sets: classical set-theory operators, relational operators and temporal relational operators. They will be described in detail in the next section. Besides special set operators, every operator that can be applied to an element of a set, can also be applied to the whole set. The result of this operation is a set of results of applying the given operator to each element of a set. We call this mechanism auto-iteration. For example:

    {#}: set_of_numbers := {1..5};
    `print set_of_numbers * 2;
       

Even though the multiplication operator, *, is not defined between a set of numbers and a number, we can still apply it here. Rax interpreter will detect that * is defined between the element of the set (a number) and a number and will multiply every element of the set by 2 returning:

    // Output: {2,4,6,8,10}
       

The auto-iteration works also on nested sets, descending as deep as necessary, to be able to apply the operator. For example:

   {{#}}: set_of_sets_of_numbers := {{1,2},{3,4}};
   `print set_of_sets_of_numbers * 2;
       

In this example, Rax interpreter has to "peel off" two levels of sets, before it finds a type on which * 2 can be executed: it executes it on the elements of the inner sets, and returns a set of sets of results:

    // Output: {{2,4},{6,8}}
       

Auto-iteration can also be applied on both sides of an operator. For example:

    `print {1,2}*{3,4};
       

Here, Rax has to "peel off" sets on both sides to be able to apply the * operator. The result of this operation is a set of sets:

    // Output: {{3,6},{4,8}}
       

Auto-iteration works not only on operators, but also on functions. We have seen an example of that in the previous paragraph, where function topSalary was defined on a set of type {[#, #, $, #]} and called on a set of sets of this type. Another example is shown below:

    `print sin:({-M_PI, -M_PI_2, 0.0, M_PI_2, M_PI});
       

Here, we execute a built-in function sin on a set of reals. M_PI and M_PI_2 are two of Rax's many built-in math constants and mean π and π/2 respectively (for more math constants, see Chapter 17, Math library). The result of this operation is a set of values of the sin function in the given set of points:

    // Output: {0,-1,0,1,0}
       

Bear in mind that currently auto-iteration on large sets will not have great performance. Apply it on small to medium-sized sets. Currently Rax uses client-side processing for auto-iteration (see Chapter 21, Rax SQL back ends, the section called “Client-side processing”).

Set operators

In this section, we will describe the set operators in Rax, that are modeled after the classical set-theory operators. However, since Rax sets are actually multisets (i.e., they can have repeat elements) these operators will work somewhat differently. For example, in classical set theory, adding the element 1 to the singleton set {1} would not change the singleton. In Rax the result would be the set {1, 1}.

The table below lists basic set operators in Rax. The next table illustrates the differences between classical sets and Rax's multisets.

Table 8.1. Rax set operators

Rax SyntaxSymbolExplanation
#|X|Cardinality of set X. Returns the number of elements in a set. For example: #{1..5} == 5
! Uniquification. Removes duplicates from a set. For example: !{1, 2, 2, 3, 3, 3, 5} == {1, 2, 3, 5}
<: "Element of" operator. Returns true or false. For example, the following expressions evaluate to true:
2 <: {1,2,5},
3 <: {1..5}.
The following expressions evaluate to false:
2 <: {1,3,5},
6 <: {1..5},
2 <: {}.
:> "Contains" operator. Returns true or false analogically to the "element of" operator. For example, {{1..3},{4..5}} :> {1..5} evaluates to false.
!<: "Not element of" operator. Returns true or false. For example, the following expressions evaluate to false:
2 !<: {1,2,5},
3 !<: {1..5}.
The following expressions evaluate to true:
2 !<: {1,3,5},
6 !<: {1..5},
2 !<: {}.
!:> "Not contains" operator. Returns true or false analogically to the "not element of" operator. For example, {{1..3},{4..5}} !:> {1..5} evaluates to true.
\/ Set-union operator, both infix and prefix. As an infix operator, it returns a union of two sets. The repeating elements are not removed. For example: {1,2,3} \/ {2,3,4} == {1,2,2,3,3,4} As a prefix operator, it is the infix reduction over a set of sets. For example: \/{{1,2},{3,4}} == {1,2,3,4}.
/\ Set-intersection operator, both infix and prefix. As an infix operator, it returns an intersection of two sets. The repeating elements are treated as different elements. For example: {1,1,2,2,3} /\ {1,2,2,4} == {1,2,2}. As a prefix it's the infix reduction over a set of sets. For example: /\{{1,2},{2,3},{2,4}} == {2}.
* Set-of-string multiplication. A set of strings can be multiplied by a delimiter string and the result will be the concatenation of all the strings from the set delimited by the delimiter string. For example: {"aap", "noot", "mies"} * "-"; would be equal to: "aap-noot-mies". The delimiter string can be empty. Note that this operator is not commutative.
\ \ Set difference. For example: {1,2,3}\{2} == {1,3} and {1,2,2,3}\{2} == {1,2,3}.
\ Set sample. Sampling returns a set of a given number of arbitrary elements from an ordered set. For example: {1,2,3}\2 might evaluate to {1,2} or {2,3} or {1,3} and {1..999}\3 might return {1,257,513}. If the given number is greater than the cardinality of the ordered set, a set smaller than the requested sample size will be returned (in fact the entire set is returned). For example: {1,1,1}\999 == {1,1,1}. The sample operator: \, does not produce a truly random sample in the statistical sense. It is however the fastest way to get a sub sample from an ordered set.
many Note that due to above-mentioned auto-iteration, many operators can be used on a set. The following expressions will evaluate to true: {"a","b","c"} * 3 == {"aaa","bbb","ccc"}
{1,2,3} + 3 == {4,5,6}
{"a","b"}+{"x","y"}
==
{{"ax","bx"},{"ay","by"}}


Table 8.2. Set operators in classical set theory and Rax

OperatorBasic MathRax
Union {1, 2, 3} ∪ {2, 3, 4}
=
{1, 2, 3, 4}
{1,2,3} \/ {2,3,4}
==
{1,2,2,3,3,4}
Intersect {1, 1, 2, 2, 3} ∩ {1, 2, 2, 3, 3}
=
{1, 2, 3}
{1,1,2,2,3} /\ {1,2,2,3,3}
==
{1,2,2,3}
Difference {1, 1, 1, 2, 2, 3, 3} \ {1, 2}
=
{3}
{1,1,1,2,2,3,3} \ {1,2}
==
{1,1,2,3,3}