Iterating with 'each'

An each loop is a statement used to iterate over the lines of a table, executing the same set of operations once for each iteration. It is more complex but more flexible than for loops.

Because the each loop is an advanced feature, this article assumes in-depth knowledge of for loops and describes each in terms of its differences from for loops.

In terms of syntax, the only difference lies in the loop header, where for Expr in Expr is replaced with each Table. For comparison, here is an each loop followed by its for equivalent:

table T = extend.range(10)
A = 0

T.M = each T scan auto
  keep A
  A = A + T.N
  return A

T.M = for N in T.N scan auto 
  keep A
  A = A + N
  return A

When either a for or an each loop would work, it is strongly recommended to use the for loop, as it is significantly simpler to understand. An each loop should be used only to achieve things that are impossible with for loops (see reasons to use each below).

Table diagram

Both the each and the for loops execute their body once for every line of data in the iteration table. At the start of each iteration, the for loop loads the input data into scalar variables as described by the .. in .. pairs in its header. The each loop instead loads the input data based on the table diagram. Every table defined in the script falls into one of the following categories: iteration, upstream, upstream-cross, full, downstream, or unavailable.

The position of a table T in the diagram determines what happens when reading a variable T.X, writing a variable T.X, or using into T:

Iteration table (named Iteration in the examples)
Read variable: Returns the scalar value on the line corresponding to the current iteration.

X = Iteration.X

Write variable: Writing to the iteration table is forbidden.
into: Broadcasting into the iteration table is forbidden.

Upstream table (named Upstream in the examples)
Read variable: Identifies the line that would be broadcast into the line of the iteration table that corresponds to the current iteration, and returns the scalar value on that line.

X = Upstream.X

Write variable: Allowed only for keep variables. Assigns a scalar to the line corresponding to the current iteration, leaving the rest unchanged.

Upstream.X = X

into: Broadcasting into an upstream table is forbidden.

Upstream-cross table (named UpstreamFull in the examples)
Read variable: Identifies the line of the left table that corresponds to the current iteration, and returns the vector in the Full table that corresponds to that line.

Full.X = UpstreamFull.x

Write variable: Writing to an upstream-cross table is forbidden.
into: Broadcasting into an upstream-cross table is forbidden.

Full table
Read variable: Normal behavior.
Write variable: Normal behavior.
into: Normal behavior.

Exercise

What is the classification of every table in the three each blocks below?

read ".." as Category[category]
read ".." as Product[sku] expect [category]
read ".." as Channel[channel]
read ".." as Orders expect [channel, sku, date]

table CategoryWeek = cross(Category, Week)
table ProductWeek = cross(Product, Week)
table ChannelWeek = cross(Channel, Week)

Product.X = each Product
  // Here ?

Week.X = each Week
  // Here ? 

Category.X = each Category
  // Here ?

Answers

Table          each Product     each Week      each Category
Product        Iteration        Full           Downstream
Week           Full             Iteration      Full
Category       Upstream         Full           Iteration
Channel        Full             Full           Full
Orders         Downstream       Downstream     Downstream
CategoryWeek   Upstream-Cross   Unavailable    Upstream-Cross
ProductWeek    Upstream-Cross   Unavailable    Unavailable
ChannelWeek    Unavailable      Unavailable    Unavailable

each .. scan blocks

Like for .. scan, the each .. scan blocks allow you to keep values from one iteration to the next. However, this extra capability implies that the Envision runtime can’t parallelize the iterations of an each .. scan block. Thus, this variant should only be favored when values need to be kept from one iteration to the next. A simple example is given below:

table Obs = with
  [| as Date,          as Quantity |]
  [| date(2021, 1, 1), 13          |]
  [| date(2021, 2, 1), 11          |]
  [| date(2021, 3, 1), 17          |]
  [| date(2021, 4, 1), 18          |]
  [| date(2021, 5, 1), 16          |]

Best = 0

Obs.BestSoFar = each Obs scan Obs.Date
  keep Best
  NewBest = max(Best, Obs.Quantity)
  Best = NewBest
  return NewBest

show table "" a1b4 with Obs.Date, Obs.BestSoFar

In the above script, scan Obs.Date specifies the order in which the lines of the iteration table are to be traversed. The statement keep Best specifies that the variable Best must retain its value from one iteration line to the next. Finally, Best = NewBest assigns a new value to the variable; this value is the one available on the next iteration line.

By default, the lines of the iteration table are processed in ascending order. However, the option desc can be used to specify the descending order, as illustrated by:

table Obs = with
  [| as Date,          as Quantity |]
  [| date(2021, 1, 1), 13          |]
  [| date(2021, 2, 1), 11          |]
  [| date(2021, 3, 1), 17          |]
  [| date(2021, 4, 1), 18          |]
  [| date(2021, 5, 1), 16          |]

Best = 0

Obs.BestSoFar = each Obs scan Obs.Date desc
  keep Best
  NewBest = max(Best, Obs.Quantity)
  Best = NewBest
  return NewBest

show table "" a1b4 with Obs.Date, Obs.BestSoFar
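
To make the effect of the scan direction concrete, the sketch below runs both directions on the same data within a single script. The names RunMax, RevMax, Obs.RunningMax and Obs.ReverseMax are illustrative choices that do not appear in the examples above, and the values listed in the trailing comments were traced by hand from the five lines of Obs.

```envision
table Obs = with
  [| as Date,          as Quantity |]
  [| date(2021, 1, 1), 13          |]
  [| date(2021, 2, 1), 11          |]
  [| date(2021, 3, 1), 17          |]
  [| date(2021, 4, 1), 18          |]
  [| date(2021, 5, 1), 16          |]

// Hypothetical names: one keep variable per scan direction.
RunMax = 0
RevMax = 0

// Oldest line first: each line sees the maximum of itself and all earlier lines.
Obs.RunningMax = each Obs scan Obs.Date
  keep RunMax
  RunMax = max(RunMax, Obs.Quantity)
  return RunMax

// Newest line first: each line sees the maximum of itself and all later lines.
Obs.ReverseMax = each Obs scan Obs.Date desc
  keep RevMax
  RevMax = max(RevMax, Obs.Quantity)
  return RevMax

show table "" a1c5 with Obs.Date, Obs.RunningMax, Obs.ReverseMax
// Date        RunningMax  ReverseMax
// 2021-01-01  13          18
// 2021-02-01  13          18
// 2021-03-01  17          18
// 2021-04-01  18          18
// 2021-05-01  18          16
```

Only the traversal order differs between the two blocks, yet the returned vectors differ on every line except the one dated April 1st.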

The each .. scan block comes with a short series of syntactic constraints concerning keep statements. The block requires at least one keep statement. All the keep statements must appear at the very beginning of the each .. scan block. The keep statements must refer to variables that have already been defined prior to the each .. scan block. A variable marked with keep is modified by the execution of the each .. scan block, and its last value remains available after exiting the block.
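
As a minimal sketch of these rules taken together, the variation below defines the keep variable before the block, places the keep statement first in the block, and reads the variable again once the block has completed. The tile title "Best overall" is an illustrative choice, and the value in the final comment was traced by hand.

```envision
table Obs = with
  [| as Date,          as Quantity |]
  [| date(2021, 1, 1), 13          |]
  [| date(2021, 2, 1), 11          |]
  [| date(2021, 3, 1), 17          |]
  [| date(2021, 4, 1), 18          |]
  [| date(2021, 5, 1), 16          |]

// The keep variable must exist before the block.
Best = 0

Obs.BestSoFar = each Obs scan Obs.Date
  // keep statements come first in the block.
  keep Best
  Best = max(Best, Obs.Quantity)
  return Best

// After the block, Best holds the value from the last iteration.
show scalar "Best overall" with Best // 18
```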

keep variables must be scalars, full-table vectors, or upstream-table vectors. Vectors marked with keep must belong to small tables so that they can be kept in memory.

As a rule of thumb, user-defined processes should be preferred to each .. scan blocks whenever possible. The each .. scan block should be used when the logic grows too complex, or involves keeping non-scalar variables.

Return-less blocks

It may happen that an each .. block is introduced for the sole purpose of getting the last value held by a keep variable. In that case, the return statement may be omitted altogether, as illustrated by the following script:

table Currencies = with
  [| as Code |]
  [| "EUR"   |]
  [| "JPY"   |]
  [| "USD"   |]

Sep = ""
List = ""
each Currencies scan Currencies.Code
  keep Sep
  keep List
  List = "\{List}\{Sep}\{Currencies.Code}"
  Sep = ", "

show scalar "" with List

In the above script, the variable List is built through iterative concatenations. However, as only the final form is of interest, a return-less each .. block is used.

In practice, however, the above script could be rewritten in a simpler way by leveraging the built-in join aggregator, as illustrated by:

table Currencies = with
  [| as Code |]
  [| "EUR"   |]
  [| "JPY"   |]
  [| "USD"   |]

show scalar "" with join(Currencies.Code; ", ") sort Currencies.Code

auto ordering in scan

With the keyword auto, the scan follows the primary dimension of the table being enumerated:

table T = extend.range(6)
x = 0
T.X = each T scan auto
  keep x
  x = T.N - x
  return x

show table "T" a1b5 with T.N, T.X

The above script is logically identical to the one below:

table T[t] = extend.range(6)
x = 0
T.X = each T scan t
  keep x
  x = T.N - x
  return x

show table "T" a1b5 with T.N, T.X

Any-order blocks

While persisting variables from one line to the next might be needed, the specific ordering might not matter. Envision provides a syntax to deal with those situations as illustrated by:

table Obs = with
  [| as X |]
  [| 42   |]
  [| 41   |]
  [| 45   |]

myMin = 1B
myMax = -(1B)
each Obs scan Obs.*
  keep myMin
  keep myMax
  myMin = min(myMin, Obs.X)
  myMax = max(myMax, Obs.X)

show summary "" a1b2 with myMin, myMax

In the above script, scan Obs.* indicates that an arbitrary order is taken.

As a rule of thumb, this feature should be considered fringe and used sparingly. Indeed, the Envision compiler does not require any proof that the ordering does not matter. Hence, if the ordering accidentally does matter, the ambiguity might be resolved in unpredictable ways by the Envision runtime.

each .. when blocks

Like for .. when blocks, each loops can be filtered. The each .. when block only executes its body on lines where the condition specified by when is true.

table T = extend.range(5)

s = 0
each T scan auto when T.N mod 2 == 1
  keep s
  s = s + T.N

show scalar "odd sum" with s // 9

In the above script, the filter when T.N mod 2 == 1 is applied to every line of the table T. It filters out every line where T.N is even.

The each .. when block cannot return a vector via the keyword return, as lines would be missing. Instead, variables marked as keep must be used to extract information from the iteration.

Reasons to use each

The following features are available in each loops, but not in for loops, and are thus proper reasons to use each:

If an each loop is using neither of the above, consider changing it to a for loop.
