3 Basics of the language

The first thing to mention about the Tela language is that it is in many respects similar to C or C++. So much so that I usually use the GNU Emacs C++ mode to edit Tela code, since there is no specific Tela mode (maybe you could contribute one?). There is an if statement, a while statement and a for statement. There are of course many syntactic differences as well, but two of them are the most fundamental:

1. In C, a semicolon ends a statement, whereas in Tela a semicolon is a separator between successive statements. In this respect Tela's use of semicolons resembles Pascal rather than C. This means, for example, that the right brace is usually followed by a semicolon in Tela, or at least that a semicolon does no harm there.

2. In C the assignment is an operator, i.e. a=b is syntactically an expression, not a statement. In Tela the assignment is a statement. This is also similar to Pascal. It implies that the use of assignment is more restricted in Tela than it is in C or C++.

3.1 Flow of control statements

The following subsubsections describe the Tela control structures.

The if statement

The if statement has the following syntax (stmt = statement):

if (expr)
   stmt

or

if (expr)
    stmt
else
    stmt

The conditional expression expr must be enclosed in parentheses. There must be no semicolon between stmt and else.

The expression must evaluate to an integer scalar or an integer array. The condition is considered true if the scalar is nonzero or if all the array elements are nonzero. Examples:

if (x!=0)
    format("x is nonzero, x=``\n",x)    // notice no semicolon
else {
    x = 1;
    format("x was zero and is now 1\n");// this semicolon is optional
};  // this semicolon separates this if stmt from the next stmt, if any

Nested if statements are written in the following style:

if (a == 1) {
    /* action for a==1 */
} else if (a == 2) {
    /* action for a==2 */
} else if (a == 3) {
    /* action for a==3 */
} else {
   /* action for none of the above */
};

The while statement

The while loop statement has the following syntax:

while (expr)
    stmt ;

For example:

    while (!found) {
        LookForIt();
        counter++
    };

The statement is executed repeatedly as long as expr evaluates to true. The "trueness" of expr is defined as in the if statement: a scalar must be nonzero, and all elements of an integer array must be nonzero, for expr to be true.

The repeat statement

The repeat loop statement has the form

repeat
    stmt1;
    stmt2;
    ...
until expr;

The statements stmt1, stmt2, ... are iterated until expr evaluates to true (nonzero). The statements are always executed at least once. It is analogous to the repeat statement in Pascal.
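
As a small illustration of the form above, the following sketch counts from 1 to 3. The variable name i is arbitrary, and the sketch uses only the format call and the i++ statement already seen in the earlier examples:

i = 1;
repeat
    format("i=``\n",i);
    i++
until i > 3;

Because the test comes after the body, the body would run once even if i started out greater than 3.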

The for statement

The for loop statement has the form

for (initstmt; expr; updatestmt) stmt;

It is equivalent to the while statement

initstmt;
while (expr) {
    stmt;
    updatestmt;
};

The syntax is rather similar to C, but there is a difference: the initstmt, updatestmt and stmt are statements, not expressions. On the other hand there is no comma operator in Tela. Thus you cannot write for(i=0,j=0; i<10; i++) but instead you must write

for ({i=1; j=1}; i<=10; i++) { /* ... */ };

We intentionally changed the loop to run from 1, not 0, as a reminder that in Tela the first array index is 1, not 0 as in C.

3.2 Expressions and assignments

The following subsubsections describe Tela expressions and assignment statements.

Operators

Expressions are composed of atomic expressions and operators. Atomic expressions can be variable names, literal constants (integers, reals, imaginary constants, characters, array constructors, and strings), function calls, or array references. Where possible, the operators follow the precedence rules of other programming languages. The operators are, from lowest to highest precedence:

Operators       Associativity       Meaning
---------       -------------       -------
:               non-associative     Vector creation
||              left                Logical OR
&&              left                Logical AND
== !=           left                Equality and nonequality
> >= < <=       left                Greater than etc.
+ -             left                Addition and subtraction
* ** / mod      left                Pointwise multiplication,
                                      matrix multiplication,
                                      pointwise division,
                                      modulus operation
- +             non-associative     Unary minus and plus
^               right               Exponentiation
!               non-associative     Logical NOT
.' '            non-associative     Transpose, Hermitian transpose

Several of these operators differ from their counterparts in other languages. In Fortran the operator ** would be exponentiation; in Tela it is matrix multiplication. In C ^ would be bitwise XOR; in Tela it is exponentiation. In Matlab * denotes matrix multiplication and .* pointwise multiplication; in Tela * is pointwise multiplication and ** is the matrix product. There are no matrix power and division operators in Tela as there are in Matlab. The equivalent of matrix division in Tela is the function linsolve. The vector creation operator uses the notation a:step:b, which follows the Matlab convention. In Fortran-90 the step is the last member, a:b:step. Apart from these, the operators in the table are familiar from C; the additions are the vector creation, matrix multiplication, modulus, exponentiation, transpose and Hermitian transpose operators. These additional operators follow Matlab conventions, except for the difference in the pointwise and matrix nature of multiplication.
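
The following sketch illustrates the distinctions drawn above. The variable names are arbitrary, the #(...) array constructor notation is the one described under "Atomic expressions", and the two-argument call form of linsolve is an assumption, since this section does not give its signature:

A = #(1,2; 3,4);
B = #(5,6; 7,8);
v = 1:2:9;          // a:step:b, i.e. the vector 1,3,5,7,9
C = A*B;            // pointwise product: #(5,12; 21,32)
D = A**B;           // matrix product:    #(19,22; 43,50)
p = 2^3^2;          // ^ is right-associative, so this is 2^(3^2)
b = #(1,1);
x = linsolve(A,b);  // assumed call form; plays the role of Matlab's A\b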

Atomic expressions

Atomic expressions can be:

variable names         a, b_89, $x, $_x9$
integer constants      0, 23
real constants         1.23, 4.5E3, 0.03
imaginary constants    1.23i, 4.5E3i, 0.03i
characters             'a'
array constructors     #(1,2,3), #(a,b; c,d)
strings                "with possible escape sequences\n"
function calls         f(), f(1), f(a+b,g(c))
array references       a[i,j], a[1], a[1:imax,2:jmax-1]
mapped references      a<[i,j]>, a<[1]>

Variable names, or any identifier names for that matter, start with a letter. The rest of the characters can also be digits or underscores. The dollar sign is also accepted as a "letter" in identifiers.

Imaginary constants are obtained from real constants by appending i with no intervening white space. There is no explicit notation for complex constants; you must use the addition or subtraction operators, as in 2+3i or 0.5-0.75i. The way to denote the imaginary unit is to write 1i. Notice that the "i" here is part of the syntax. There is no predefined variable or constant named i, and 2+3*i will generally not work as expected (unless you have assigned i = 1i, but this is not recommended).

Character constants are equivalent to their ASCII codes (integers) if used in arithmetic expressions. This practice is similar to C.

Array constructors are a bit more complicated. Syntactically an array constructor has the form #(component-list), where component-list consists of expressions separated by commas and/or semicolons. Commas denote appending the components to form a vector. For example, #(1:5,10) will produce the integer vector #(1,2,3,4,5,10), and #(1,2.3,4:-1:1,34) produces #(1, 2.3, 4, 3, 2, 1, 34). The last example gives a real vector because one of the components was a fractional number.

A semicolon in the array constructor denotes composing a higher-rank array of the components, which must be lower-rank arrays (or scalars). For example, a matrix can be constructed from its row vectors v1 and v2 by #(v1;v2). The precedence of a semicolon inside an array constructor is lower than the precedence of a comma, thus #(a,b; c,d) will construct a 2x2 matrix.

The array constructors work for higher-rank arrays as well. The result of an array constructor using commas has rank equal to the highest rank of the components, and the ranks of the components may not differ by more than one. The important exception is the case where all the components are scalars; in this case the result is a vector. The semicolon array constructor always produces a result whose rank is one greater than the rank of the components, which must be the same for all components in the semicolon case. Using array constructors for higher-rank components has been rare in practice.

Notice that commas and semicolons have completely different meanings inside array constructors than outside them.

Strings may contain escape sequences similar to C language strings.

Array references follow the Pascal syntax, separating the dimensions with a comma. The indices may be vectors, which follows the Fortran-90 and Matlab array syntax ideas ("gather" operations).

In addition to normal array references, Tela also supports mapped array references. In mapped referencing the index objects must all agree in type (they are usually arrays). The number of indices must be equal to the rank of the indexed array and the result will have size equal to an index object. For example,

A<[1:5,1:5]>

will produce the first five diagonal elements of matrix A as a vector. Mapped indexing can be used to extract N-dimensional component subsets from M-dimensional arrays, both N and M being arbitrary. The function intpol (linear interpolation) can be thought of as a generalization of mapped indexing, where the "index" expressions need not be integers.
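
To make the contrast with ordinary indexing concrete, here is a small sketch. The 3x3 matrix A and its values are made up for illustration, and the results given in the comments follow from the rules described above:

A = #(1,2,3; 4,5,6; 7,8,9);
S = A[1:2,1:2];     // ordinary reference: the 2x2 submatrix #(1,2; 4,5)
d = A<[1:2,1:2]>;   // mapped reference: A[1,1] and A[2,2], i.e. #(1,5)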

Assignments

Assignments can take the following three forms:

variable = expr ;

variable[index1,...] = expr ;

[var1,var2,...] = fname(expr1,expr2,...) ;

The first form is a simple assignment, where the value of the expression is assigned to a variable. The second form is the "scatter" operation, or indexed assignment. The indices follow the same rules as when the array reference appears on the right-hand side of an assignment (see the previous subsubsection).

The third form is actually not an assignment but a function call with several output arguments, in the style of Matlab. The output variables must be separated by commas. (In Matlab the commas may be left out.) The output variables must be simple identifiers, not expressions. For example, you cannot say

[b[1],b[2]] = f();         // WRONG!!!

You must use auxiliary variables, as in

[b1,b2] = f();
b[1] = b1; b[2] = b2;

This limitation may be removed in a future version.

3.3 Defining functions

Examples of function definition statements:

function f() { /* body */ };     // the simplest form

function f(x) { /* body */ };

function y=f(x) local(a) { /* ... */; y = sin(x) };

function [x,y] = f(a,b) { /* ... */ };

function [x,y;z,w] = f(a,b;n) global { /* ... */ };

The definition always begins with the reserved word function. After function comes the output argument specification (possibly empty), followed by the function name and input argument list, an optional local or global declaration and finally the function body enclosed in curly braces.

Input arguments are passed by reference. They may not be modified in the function body. (If you try, you get a warning message.) Thus the calling program may think that the input arguments are passed by value even though they actually aren't. In C++ the type of the input arguments would be const Ttype&, and in Fortran-90 they would correspond to intent(in) arguments. Input arguments are listed following the function name both when defining and when calling a function.

Output arguments are listed in brackets before the "=" sign and the function name, both when defining and when calling a function. If there is only one output argument, the brackets may be dropped (see the third example above). Output arguments are also passed by reference, but obviously they may and should be modified by the function body.

By default, input arguments are obligatory and output arguments are optional. That is, the calling program must supply all input arguments, but it may leave out some or all of the output arguments. For example, if f is defined as

function [a,b,c] = f(x,y) { /* ... */ };

all the following are legal call forms for f:

[X,Y,Z] = f(2+3i,sin(X+y));     // use all output arguments

[X,Y] = f(1,2);     // ignore the third output argument

X = f(1,2);         // ignore second and third

[] = f(1,2);        // ignore all

The default behavior of obligatory and optional arguments can be changed by using the semicolon inside argument list. At most one semicolon may appear inside an argument list. The rule is simple: all arguments before the semicolon are obligatory, that is, required, and all arguments after the semicolon are optional. This rule applies to both input and output argument lists. Examples:

function f1(x,y;z,w) { /* ... */ };

function [a;b] = f2(x,y) { /* ... */ };

function [a;] = f3() { /* ... */ };

function [a;b] = f4(x;y) { /* ... */ };

Function f1 has two obligatory input arguments and two optional ones. It has no output arguments. (It is, however, permissible to assign the "value" of f1 to a variable as in x=f1(0,0). The variable x has the void value after the assignment. The void value may be created explicitly by using an expression consisting of just the colon ":", and it doesn't output anything when printed.)

Function f2 has two obligatory input arguments, since there is no semicolon in the input argument list. One of the two possible output arguments is obligatory and the other one is optional. Function f3 has no input arguments but one obligatory output argument. Notice that the semicolon may also be the first or last thing in the argument list. Finally, function f4 has both optional and obligatory arguments in its input and output lists.

Whether or not an optional input argument is present can be tested e.g. by the standard function isdefined, which returns 1 if the argument is defined and 0 if it is undefined. The test will fail also if the caller supplied the argument but it was bound to an undefined variable; this behavior is usually what is wanted since passing an undefined variable as argument is practically the same as not passing any argument at all.
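
A minimal sketch of this test follows. The function name scale and the argument names are hypothetical; only isdefined and the semicolon rule come from the text above:

function y = scale(x; factor)
{
    if (isdefined(factor))
        y = x*factor
    else
        y = x
};

a = scale(3);       // factor omitted, so a is 3
b = scale(3,2);     // factor supplied, so b is 6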

The role of obligatory and optional input arguments, and also the meaning of optional output arguments, is clear. But what about obligatory output arguments, why should anyone want to use such things? The answer is simple: they correspond to arguments that are both input and output. By making an output argument obligatory you force the caller to bind it to some variable, which probably has some initial value that the function body may use.
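
As a sketch of this idea, the hypothetical function below treats its single obligatory output argument as an input/output argument. The names increment, n and k are made up for illustration:

function [n;] = increment()
{
    n = n + 1    // reads the value the caller bound to n, then overwrites it
};

k = 0;
[k] = increment();   // k was 0 on entry and should be 1 afterwards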

Maybe we should pause for a while and recall the important concepts introduced in this subsection. By default, input arguments are obligatory and output arguments are optional. How could it be otherwise? Input arguments are read in the function body, so they may not be undefined. By making the output arguments optional by default we allow the caller the freedom to ignore some output arguments. The semicolon modifies the default behavior. Everything to the left of the semicolon becomes obligatory and everything to the right of the semicolon becomes optional. Input/output arguments should be declared as obligatory output arguments.

There is still one thing about function declaration, namely the global/local declaration that may be placed in between the input argument list and the function body. The declaration may take one of four possible forms:

local           // Everything is automatically local. The default.
global          // Everything is automatically global.
local(a,b,...)  // The listed symbols are local, others are global.
global(a,b,...) // The listed symbols are global, others are local.

All "free" symbols that appear in the function body are either local to the function or global. A "free" symbol means a symbol which is not one of the input or output arguments, has not been declared autoglobal using the standard function autoglobal, and is not used as a function name in the function body. By default, all free symbols are local. This corresponds to the Matlab convention. By inserting the word global with no variable list, however, you can make all free symbols refer to the globally visible symbols. By inserting local(a,b,...) you declare the listed symbols as local; other symbols remain global. This corresponds to the practice normally used in compiled languages such as C, C++ and Pascal. And finally, by using the global(a,b,...) declaration you can use the complementary approach where every symbol except those listed is local. The last case again mimics Matlab with its global declaration.

Thus you can choose among different strategies here. In some functions it is more natural to list the local variables rather than the global ones, if for no other reason than that the local variables are sometimes fewer in number than the global ones, and vice versa. The most important thing to remember is this: think about global/local every time you define a new function. Experience shows that many, maybe even most, error situations in Tela arise from forgetting to properly declare a variable global or local.

The system has some autoglobal variables, which are always global even if they are not explicitly declared. Among these are constants such as pi, Inf, eps, on, off, and some color and line style names. Thus it is unnecessary to say global(pi) if pi is used in a function. It is possible to override the autoglobal character by explicitly listing the symbol in a local declaration; in my opinion, however, this is bad programming style.
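
The following sketch shows the two list forms in use. The names gain, amplify, amplify2 and tmp are hypothetical; pi needs no declaration because it is autoglobal:

gain = 2;           // a workspace (global) variable

function y = amplify(x) global(gain)
{
    tmp = gain*x;   // tmp is not listed, so it is local
    y = tmp*pi      // pi is autoglobal
};

function y = amplify2(x) local(tmp)
{
    tmp = gain*x;   // gain is not listed, so it refers to the global
    y = tmp*pi
};

Both definitions are intended to behave identically; they differ only in which set of symbols is spelled out.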

One trick that can be useful when developing code is to make most variables global at first. In this way they are available in the workspace for inspection after the function has completed execution or stopped on a runtime error. When the function is working you can then make as many symbols local as possible.

3.4 The package mechanism

Typically in Tela you write a t-file with many, maybe a few tens of, functions. Sometimes the functions need to communicate among themselves not only via arguments but also via global variables, as in Pascal or C. For example, if you are developing a fluid simulation code, the values of physical parameters such as grid spacing, viscosity etc. are most naturally declared global and not explicitly passed to every function that uses them. We have been taught that using globals is bad programming style. In this case, however, it actually increases the modularity of the program, because if you introduce more physical parameters you need not change the call form and definition of every function.

Problems may arise if you use such a t-file in conjunction with other t-files. The global variables then share the same name space. The internal auxiliary function names may also be the same in two t-files. One solution would be to use unusual symbol names, but this is not elegant.

The keyword package is there to hide internal symbol names from external access. The use is very simple. Just enclose the whole t-file in curly braces and put package global(var1,var2,...) in front. In the list you should put all symbol names that you want to be externally visible. Usually these are function names, but they may also be (global) variable names. All other symbols are then put in a private name space and they correspond to the static variables in C and C++.

It is also possible to use a package local declaration as a complementary approach, in analogy with global/local in function definition, but this form is rarely used in practice. You may also put a character string between package and global or local to name your package. This would be required if you want to put material from more than one t-file in one and the same package. Actually, if the string is left out, the fully qualified t-file name is used as a unique package name.

Despite the syntactic similarity of global/local in package declaration and function definition, the meaning is quite different. Symbols which are local to a package are usually global to functions inside the package. (If they are also local in the functions, declaring them local in the package context does not make any difference.) Local variables in a function are currently implemented as slots in a runtime stack, whereas global symbols (in the function sense) are bound to workspace variables, which have name, value and attributes. The only thing the package mechanism does is that it prepends to all local (in the package sense) symbol names an invisible string which is unique to every package. This way the variables are physically in the same global hash table but with unique names.
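
A sketch of what such a t-file might look like follows. All names are hypothetical, and the exact layout (for example the semicolon after the closing brace) is an assumption based on the conventions described earlier in this section:

package global(setviscosity,showviscosity) {

function setviscosity(v) global(nu)
{
    nu = v      // nu is global to the function but private to the package
};

function showviscosity() global(nu)
{
    format("viscosity = ``\n",nu)
};

};

Under these assumptions only setviscosity and showviscosity would be visible to other t-files, while nu would live in the package's private name space.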
