July 2000 Draft
JavaScript 2.0
Formal Description
Semantic Notation
previousupnext

Thursday, November 11, 1999

Introduction

To precisely specify the semantics of JavaScript 2.0, we use the notation described below to define the behavior of all JavaScript 2.0 constructs and their interactions.

Semantic Values

The semantics describe the meaning of a JavaScript 2.0 program in terms of operations on simpler objects borrowed from mathematics collectively called semantic values. Semantic values can be held in semantic variables and passed to semantic functions. The kinds of semantic values used in this specification are summarized in the table below and explained in the next few sections:

Semantic Value Examples Description
The result of a nonterminating computation
syntaxError The result of a computation that returns by throwing a semantic exception
The result of a semantic function that does not return a useful value
true, false Booleans
-3, 0, 1, 2, 93 Mathematical integers
1/2, -12/7 Mathematical rational numbers
1.0, 3.5, 2.0e-10, -0.0, -, NaN Double-precision IEEE floating-point numbers
A’, ‘b’, ‘«LF»’, ‘«uFFFF» Characters (Unicode 16-bit code points)
[value0, ... , valuen-1] Vectors — indexed lists of semantic values
“”, “abc” , “1«TAB»5” Strings
{value1value2, ... , valuen} Mathematical sets of semantic values
name1 value1name2 value2, ... , namen valuen    Tuples with named member semantic values
name or name value Tagged semantic values
function(nIntegern*n Semantic functions

There is a special semantic value (pronounced as "bottom") that represents the result of an inconsistent or nonterminating computation. Unless specified otherwise, applying any semantic operator (such as +, *, etc.) to or calling a semantic function with as any argument also yields without evaluating any remaining operands or arguments (in technical terms, semantic functions and operators are strict in all of their arguments unless specified otherwise).

If interpreting a JavaScript program according to the semantics here gives a result, an actual implementation executing that JavaScript program will either fail to terminate or throw an exception because it runs out of memory or stack space.

Semantic values of the form value represents the result of a computation that throws a semantic exception. value is the exception's value (which must be a member of the SemanticException semantic type). Unless specified otherwise, applying any semantic operator (such as +, *, etc.) to value or calling a semantic function with value as any argument also yields value (with the same value) without evaluating any remaining operands or arguments.

The throw statement takes a value v and returns v. The catch statement converts v back to v.

Semantic functions that do not return a useful value return the semantic value . There are no operations defined on .

Booleans

The semantic values true and false are booleans. The not, and, or, and xor operators operate on booleans. Like most other operators, and, or, and xor evaluate both operands before returning a result; these operators do not short-circuit.

Integers

Unless specified otherwise, numbers in the semantics written without a slash or decimal point are mathematical integers: ..., -3, -2, -1, 0, 1, 2, 3, .... The usual mathematical operators +, -, *, and unary - can be used on integers. Integers can be compared using =, , <, , >, and .

Rationals

Numbers in the semantics written with a slash are mathematical rational numbers. Every integer is also a rational. Rational numbers include, for example, 0, 1, 2, -1, 1/2, -12/7, and -24/14; the last two are different ways of writing the same rational number. The usual mathematical operators +, -, *, /, and unary - can be used on rationals. Rationals can be compared using =, , <, , >, and .

Doubles

Numbers in the semantics written with a decimal point are double-precision IEEE floating-point numbers (often abbreviated as doubles), including distinct +0.0, -0.0, +, -, and NaN. Doubles are distinct from integers and rationals; when writing doubles in the semantics, we always include a decimal point to distinguish them from integers and rationals.

Doubles other than +, -, and NaN are called finite. We define the significand of a finite double d as follows:

Characters

Characters are single Unicode 16-bit code points. We write them enclosed in single quotes ‘ and ’. There are exactly 65536 characters: ‘«u0000»’, ‘«u0001»’, ...,‘A’, ‘B’, ‘C’, ..., ‘«uFFFF»’ (see also notation for non-ASCII characters). Unicode surrogates are considered to be pairs of characters for the purpose of this specification.

The characterToCode and codeToCharacter semantic functions convert between characters and their integer Unicode values.

Vectors

A semantic vector contains zero or more elements indexed by integers starting from zero. We write a vector value by enclosing a comma-separated list of values inside bold brackets:

[element0element1, ... , elementn-1]

For example, the following semantic value is a vector whose elements are four strings:

[parsley”, “sage”, “rosemary”, “thyme]

The empty vector is written as [].

Let u = [e0e1, ... , en-1] and v = [f0f1, ... , fm-1] be vectors, i and j be integers, and x be a value. The following notations describe common operations on vectors:

Notation   Result Value
u  v The concatenated vector [e0e1, ... , en-1f0f1, ... , fm-1]
|u| The length n of the vector
u[i] The ith element ei, or if i<0 or in
u[i ... j] The vector slice [eiei+1, ... , ej] consisting of all elements of u between the ith and the jth, inclusive, or if i<0, jn, or j<i-1. The result is the empty vector [] if j=i-1.
u[i ...] The vector slice [eiei+1, ... , en-1] consisting of all elements of u between the ith and the end, or if i<0 or i>n. The result is the empty vector [] if i=n.
u[i  x]   The vector [e0, ... , ei-1xei+1, ... , en-1] with the ith element replaced by the value x and the other elements unchanged, or if i<0 or in

Semantic vectors are functional; there is no notation for modifying a semantic vector in place.

Strings

A semantic string is merely a vector of characters. For notational convenience we can write a string literal as zero or more characters enclosed in double quotes. Thus,

Wonder«LF»

is equivalent to:

[W’, ‘o’, ‘n’, ‘d’, ‘e’, ‘r’, ‘«LF»]

In addition to all of the other vector operations, we can use =, , <, , >, and to compare two strings.

Sets

A semantic set is an unordered collection of values. Each value may occur at most once in a set. There must be a well-defined = semantic operator defined on all pairs of values in the set, and that operator must induce an equivalence relation.

A semantic set is denoted by enclosing a comma-separated list of values inside braces:

{element1element2, ... , elementn}

The empty set is written as {}.

For example, the following set contains seven integers:

{3, 0, 10, 11, 12, 13, -5}

When using elements such as integers and characters that have an obvious total order, we can also write sets by using the ... range operator. For example, we can rewrite the above set as:

{0, -5, 3 ... 3, 10 ... 13}

If the beginning of the range is equal to the end of the range, then the range consists of only one element: {7 ... 7} is the same as {7}. If the end of the range is one "less" than the beginning, then the range contains no elements: {7 ... 6} is the same as {}. If the end of the range is more than one "less" than the beginning, then the set is .

Let A and B be sets and x be a value. The following notations describe common operations on sets:

Notation   Result Value
|A| The number of elements in the set A; if A has infinitely many elements
min A If there exists a value m that satisfies both m  A and for all elements x  A, x  m, then return m; otherwise return (this could happen either if A is empty or if A has an infinite descending sequence of elements with no lower bound in A)
max A If there exists a value m that satisfies both m  A and for all elements x  A, x  m, then return m; otherwise return (this could happen either if A is empty or if A has an infinite ascending sequence of elements with no upper bound in A)
A B The intersection of sets A and B (the set of all values that are present both in A and in B)
A B The union of sets A and B (the set of all values that are present in at least one of A or B)
A - B The difference of sets A and B (the set of all values that are present in A but not B)
x A Return true if x is an element of set A and false if not
A = B Return true if the two sets A and B are equal and false otherwise. Sets A and B are equal if every element of A is also in B and every element of B is also in A.

min and max are only defined for sets whose elements can be compared with <.

Tuples

A semantic tuple is an aggregate of several named semantic values. Tuples are sometimes called records or structures in other languages. A tuple is denoted by a comma-separated list of names and values between bold triangular brackets:

name1 value1name2 value2, ... , namen valuen

Each namei valuei pair is called a field. The order of fields in a tuple is irrelevant, so x 3, y 4 is the same as y 4, x 3. A tuple's names must all be distinct.

Let w be an expression that evaluates to a tuple name1 value1name2 value2, ... , namen valuen. We can extract the value of the field named namei from w by using the notation w.namei. w is required to have this field. For example, x 3, y 4.x is 3.

In the HTML versions of the semantics, each use of namei is linked back to its tuple type's definition.

Oneofs

A semantic oneof is a pair consisting of a name (called the tag) and a value. Oneofs are sometimes called variants or tagged unions in other languages. A oneof is denoted by writing the tag followed by the value:

name value

For brevity, when value is , we can omit it altogether, so red is the same as red .

Let o be an expression that evaluates to some oneof n v. We can perform the following operations on o:

Notation   Result Value
o.name The value v if n is name; otherwise
o is name    true if n is name; false otherwise

For example, (red 5) is blue evaluates to false, while (red 5) is red evaluates to true. (red 5).red evaluates to 5.

In addition to the operators above, the case statement evaluates one of several expressions based on a oneof tag.

In the HTML versions of the semantics, each use of name is linked back to its oneof type's definition.

Functions

A semantic function receives zero or more arguments, performs computations, and returns a result. We write a semantic function as follows:

function(param1type1, ... , paramntypenbody

Here param1 through paramn are the function's parameters, type1 through typen are the parameters' respective semantic types, and body is an expression that computes the function's result. When the function is called with argument values v1 through vn, the function's body is evaluated and the resulting value returned to the caller. body can refer to the parameters param1 through paramn; each reference to a parameter parami evaluates to the corresponding argument value vi. Arguments are passed by value (which in this language is equivalent to passing them by reference because there is no way to write to a parameter).

Function parameters are statically scoped. When functions are nested and an inner function f defines a parameter with the same name as a parameter of an outer function g, then f's parameter shadows g's parameter inside f.

The only operation allowed on a semantic function f is calling it, which we do using the f(arg1, ..., argn) syntax. In the presence of side effects, f is evaluated first, followed by the argument expressions arg1 through argn, in left-to-right order. If the result of evaluating f or any of the argument expressions is , then the call immediately returns without evaluating the following argument expressions, if any. If the result of evaluating f or any of the argument expressions is v for some value v, then the call immediately returns that v without evaluating the following argument expressions, if any. Otherwise, f's body is evaluated and the resulting value returned to the caller.

Semantic Types

A semantic type is a possibly infinite set of semantic values. Names of semantic types are shown in Capitalized Red Small Caps, and compound semantic type expressions are in red.

We use semantic types to make the semantics more readable by declaring the semantic type of each semantic variable (including function argument variables). Each such declaration states that the only values that will be stored in a semantic variable will be members of that variable's semantic type. These declarations can be proven statically. The JavaScript semantics have been machine type-checked to ensure that every type declaration holds, so, for example, if the semantics state that variable x has type Integer then there does not exist any place that could assign the value true to x.

Semantic type annotations allow us to restrict the description of each semantic operator and function to only describe its behavior on arguments that are members of the arguments' semantic types. Thus, for example, we need not describe the behavior of the + semantic operator when passed the semantic values true and as operands because we can prove that this case cannot arise.

Every semantic type includes the values and v for all values v whose semantic type is SemanticException. For brevity we do not list and v in the tables below.

Basic Semantic Types

The following are the basic semantic types:

Type Set of Values
Void {}
Boolean {true, false}
Integer {..., -2, -1, 0, 1, 2, ...} (All mathematical integers)
Rational   All mathematical rational numbers
Double All double-precision IEEE floating-point numbers, including , -, and NaN
Character    All 65536 characters
String Shorthand for Character[] (see vector types below)
SemanticException   Set of all values that can be thrown as semantic exceptions. This type is defined separately inside each grammar that throws such exceptions.

The type Rational includes Integer as a subtype because every integer is also a rational number. Except for and v, the types Rational and Double are disjoint.

Compound Semantic Types

We can construct compound semantic types using the notation below. Here t, t1, t2, ..., tn represent some existing semantic types.

Type   Set of Values
t[] All vectors [v0, ... , vn-1] all of whose elements v0, ... , vn-1 have type t. Note that the empty vector [] is a member of every vector type t[].
{t} All sets {v1v2, ... , vn} all of whose elements v1, ... , vn have type t. Note that the empty set {} is a member of every set type {t}.
tuple {name1t1; ... ; namentn} All tuples name1 v1, ... , namen vn for which each vi has type ti for 1  i  n. The namei's must be distinct; the order in which the nameiti fields are listed does not matter.
oneof {name1t1; ... ; namentn}    All oneofs of the form namei v, where 1  i  n and v has type ti. If tk is Void, then namektk can be abbreviated as simply namek in the oneof semantic type syntax. The namei's must be distinct; the order in which the nameiti alternatives are listed does not matter.
t1  t2  ...  tn  t Some* functions that take n arguments of types t1 through tn respectively and produce a result of type t. If n is zero (the function takes no arguments), we write this type as ()  t.
* Technically speaking, this semantic type includes only functions that are continuous in the domain-theoretical sense; this avoids set-theoretical paradoxes.
()  t

The type constructors earlier in the table bind tighter than ones later in the table, so, for example, Integer[]  Rational[] is equivalent to (Integer[])  (Rational[]) (a function that takes a vector of Integers and returns a vector of Rationals) rather than ((Integer[])  Rational)[] (a vector of functions, each of which takes a vector of Integers and returns a Rational). In the rare cases where this is needed, parentheses are used to override precedence.

Semantic Operators

The table below lists the semantic operators in order from the highest precedence (tightest-binding) to the lowest precedence (loosest-binding). Operators under the same heading of the table have the same precedence and associate left-to-right, so, for example, 7-3+2-1 is interpreted as ((7-3)+2)-1 instead of 7-(3+(2-1)) or (7-(3+2))-1. When needed, parentheses can be used to group expressions.

The type signatures of the operators are also listed. Some operators are polymorphic; t, t1, t2, ..., and tn can represent any semantic types. The types of some operators are underdetermined; for example, [] can have type t[] for any type t. In these cases the particular choice of type is inferred from the context.

Each operator in the table below is strict: it evaluates all of its operands left-to-right, and if any operand evaluates to , then the operator immediately returns without evaluating the following operands, if any. If any operand evaluates to v for some value v, then the operator immediately returns that v without evaluating the following operands, if any.

Operator   Signatures Description
Nonassociative Operators
(x) t  t Return x. Parentheses are used to override operator precedence.
|u| t[]  Integer u is a vector [e0e1, ... , en-1]. Return the length n of that vector.
{t Integer The number of elements in the set u; if u has infinitely many elements
[x0x1, ... , xn-1] t  ...  t  t[] Return a vector with the elements x0x1, ... , xn-1.
{x1x2, ... , xn} t  ...  t  {t} Return a set with the elements x1x2, ... , xn. Any duplicate elements are included only once in the set. When t is Integer or Character, we can also replace any of the xi's by a range xi ... yi that contains all integers or characters greater than or equal to xi and less than or equal to yi. yi must not be less than xi "minus" one.
name1 x1, ... , namen xn t1  ...  tn  tuple {name1t1; ... ; namentn} Return a tuple with the fields name1 x1, ... , namen xn.
name oneof {namename2t2; ... ; namentn} Return a oneof value with tag name and value .
Action[nonterminali] Determined by Action's declaration This notation can only be used inside an action definition for a grammar production that has nonterminal nonterminal on the production's right side. Return the value of action Action invoked on the ith instance of nonterminal nonterminal on the right side of . The subscript i can be omitted if there is only one instance of nonterminal nonterminal in .
nonterminali Character This notation can only be used inside an action definition for a grammar production that has nonterminal nonterminal on the production's left or right side. Furthermore, every complete expansion of grammar nonterminal nonterminal must expand it into a single character.
Return the character to which the ith instance of nonterminal nonterminal on the right side of expands. The subscript i can be omitted if there is only one instance of nonterminal nonterminal in . If the subscript is omitted and nonterminal nonterminal appears on the left side of , then this expression returns the single character to which this whole production expands.
Suffix Operators
u[i] t[]  Integer  t u is a vector [e0e1, ... , en-1]. Return the ith element ei, or if i<0 or in.
u[i ... j] t[]  Integer  Integer  t[] u is a vector [e0e1, ... , en-1]. Return the vector slice [eiei+1, ... , ej] consisting of all elements of u between the ith and the jth, inclusive, or if i<0, jn, or j<i-1. The result is the empty vector [] if j=i-1.
u[i ...] t[]  Integer  t[] u is a vector [e0e1, ... , en-1]. Return the vector slice [eiei+1, ... , en-1] consisting of all elements of u between the ith and the end, or if i<0 or i>n. The result is the empty vector [] if i=n.
u[i  x]   t[]  Integer  t  t[] u is a vector [e0e1, ... , en-1]. Return the vector [e0, ... , ei-1xei+1, ... , en-1] with the ith element replaced by the value x and the other elements unchanged, or if i<0 or in.
w.namei tuple {name1t1; ... ; namentn ti w is a tuple name1 v1, ... , namen vn. Return the value vi of w's field named namei.
oneof {name1t1; ... ; namentn ti w is a oneof namek v for some k between 1 and n inclusive. Return the value v if namei is namek, or if not.
f(x1, ..., xn) (t1  ...  tn t) t1  ...  tn  t Call the function f with the arguments x1 through xn and return the result.
Prefix Operators
-x Integer  Integer or
Rational  Rational
The mathematical negation of x
min A {t t Return the minimal element of set A. Specifically, if there exists a value m that satisfies both m A and for all elements x A, x m, then return m; otherwise return (this could happen either if A is empty or if A has an infinite descending sequence of elements with no lower bound in A). The type t must have = and < operations that define a total order.
max A {t t Return the maximal element of set A. Specifically, if there exists a value m that satisfies both m A and for all elements x A, x m, then return m; otherwise return (this could happen either if A is empty or if A has an infinite ascending sequence of elements with no upper bound in A). The type t must have = and < operations that define a total order.
name x t  oneof {nametname2t2; ... ; namentn} Return a oneof value with tag name and value x.
Multiplicative Operators
x * y Integer  Integer  Integer or
Rational  Rational  Rational
The mathematical product of x and y
x / y Rational  Rational  Rational The mathematical quotient of x and y; if y=0
A  B {t {t {t} The intersection of sets A and B (the set of all values that are present both in A and in B)
Additive Operators
x + y Integer  Integer  Integer or
Rational  Rational  Rational
The mathematical sum of x and y
x - y The mathematical difference of x and y
u  v t[]  t[]  t[] u is a vector [e0e1, ... , en-1] and v is a vector [f0f1, ... , fm-1]. Return the concatenated vector [e0e1, ... , en-1f0f1, ... , fm-1].
A  B {t {t {t} The union of sets A and B (the set of all values that are present in at least one of A or B)
A - B {t {t {t} The difference of sets A and B (the set of all values that are present in A but not B)
Comparison Operators
x = y Rational  Rational  Boolean or
Character  Character  Boolean or
String  String  Boolean or
{t {t Boolean
Comparisons return true if the relation holds or false if not.
Rationals are compared mathematically.
Characters are compared according to their code points.
Two strings are equal when they have the same lengths and contain exactly the same sequences of characters. A string x is less than string y when either x is the empty string and y is not empty, the first character of x is less than the first character of y, or the first character of x is equal to the first character of y and the rest of string x is less than the rest of string y.
Two sets x and y are equal if every element of x is also in y and every element of y is also in x. Only = and can be used to compare sets.
x  y
x < y
x  y
x > y
x  y
x A t  {t Boolean Return true if x is an element of set A and false if not
o is namei oneof {name1t1; ... ; namentn Boolean o is a oneof namek v for some k between 1 and n inclusive. Return true if namei is namek, or false otherwise.
Logical Negation
not a Boolean  Boolean true if a is false; false if a is true
Logical Conjunction
a and b Boolean  Boolean  Boolean true if both a and b are true; false if at least one of a and b is false
Logical Disjunction
a or b Boolean  Boolean  Boolean true if at least one of a and b is true; false if both a and b are false
a xor b true if a is true and b is false or a is false and b is true; false if both a and b are true or both a and b are false

Semantic Statements

Semantic statements are similar to the semantic operators above in that they are also used to construct expressions, take zero or more operands, and return a value. Unlike other semantic operators, semantic statements are usually non-strict: they do not always evaluate all of their operands. Semantic statements have lower precedence than any of the semantic operators above.

Some semantic statements are syntactic sugars, which means that they are defined as macros that expand into other, simpler statements and operators.

Function

function(param1type1, ... , paramntypenbody

See the description of function values.

Let

let var1type1 = expr1; ... ; varntypen = exprn in body

Evaluate expr1 through exprn in order and save the results. If any expri evaluates to , then immediately return without evaluating the following expr's. If any expri evaluates to v for some value v, then immediately return that v without evaluating the following expr's. Otherwise evaluate body with new local variable bindings of var1 through varn bound to the saved results of evaluating expr1 through exprn, respectively. Return the result of evaluating body.

type1 through typen are the local variables' respective semantic types. The type of the entire let expression is the type of its body.

The let expression above is syntactic sugar for:

(function(var1type1, ... , varntypenbody)(expr1, ... , exprn)

If

if expr then bodytrue else bodyfalse

Evaluate expr. If it evaluates to , then immediately return . If expr evaluates to v for some value v, then immediately return that v. Otherwise expr must evaluate to either true or false. If it evaluated to true, then evaluate bodytrue and return its result. If expr evaluated to false, then evaluate bodyfalse and return its result.

expr must have type Boolean. The entire if expression has any type t such that both bodytrue has type t and bodyfalse has type t.

Case

case expr of
    name1(var1type1): body1;
    ...
    namen(varntypen): bodyn;
    end

Evaluate expr. If it evaluates to , then immediately return . If expr evaluates to v for some value v, then immediately return that v. Otherwise expr must evaluate to a oneof name v where name matches namei for some i between 1 and n inclusive. Evaluate the corresponding bodyi with a new local variable vari bound to v. Return bodyi's result.

If we are not interested in using the oneof's value for a particular bodyi, we can shorten that bodyi's clause from:

    namei(varitypei): bodyi

to:

    nameibodyi

In this case no local variable is bound while evaluating bodyi.

expr must have type oneof {name1type1; ... ; namentypen}. The entire case expression has any type t such that all of its bodyi's have type t. The namei's must be distinct. The order in which the case clauses are listed does not matter.

Throw

throw expr

Evaluate expr. If it evaluates to , then immediately return . If expr evaluates to v for some value v, then immediately return that v. Otherwise expr must evaluate to some value v, in which case return v.

expr must have type SemanticException. The entire throw expression has any type whatsoever (because every semantic type includes v).

Try-Catch

try
    bodytry
catch (varSemanticException)
    bodyhandler

Evaluate bodytry to obtain a value w. If w does not have the form v for some value v, then return w. Otherwise w is v for some value v. In this case evaluate bodyhandler with a new local variable var bound to v and return bodyhandler's result.

The type of var is always SemanticException. The entire try-catch expression has any type t such that both bodytry has type t and bodyhandler has type t.

Semantic Functions

The sections below list the predefined semantic functions, their type signatures, and short descriptions. All functions are strict and evaluate their arguments left-to-right.

Integer Manipulation

These functions perform bitwise operations on integers. The integers are treated as though they were written in binary notation, with each 1 bit representing true and 0 bit representing false. The integers must be nonnegative.

Function   Signature Description
bitwiseAnd(x, y) Integer  Integer  Integer The bitwise AND of x and y
bitwiseOr(x, y) The bitwise OR of x and y
bitwiseXor(x, y) The bitwise XOR of x and y
bitwiseShift(x, count) Integer  Integer  Integer Shift x to the left by count bits. If count is negative, shift x to the right by -count bits. Bits shifted out are lost; bit shifted in are zero. This function is equivalent to multiplying x by 2count and truncating the result (toward negative infinity) to an integer. x can be negative.

Double Manipulation

Function   Signature Description
rationalToDouble(r) Rational  Double The rational number r rounded to the nearest IEEE double-precision floating-point value as follows:
Consider the set of all doubles, with -0.0, +, -, and NaN removed and with two additional values added to it that are not representable as doubles, namely 21024 and -21024. Choose the member of this set that is closest in value to r. If two values of the set are equally close, choose the one with an even significand; for this purpose, the two extra values 21024 and -21024 are considered to have even significands. Finally, if 21024 was chosen, replace it with +; if -21024 was chosen, replace it with -; if +0.0 was chosen, replace it with -0.0 if and only if r < 0; any other chosen value is used unchanged. The result is the value of rationalToDouble(r).
This procedure corresponds exactly to the behavior of the IEEE 754 "round to nearest" mode.

Character Conversions

Function   Signature Description
characterToCode(c) Character  Integer The number of the Unicode code point c
codeToCharacter(i) Integer  Character The Unicode code point number i, or if i<0 or i>65535

Character Utilities

The function digitValue is defined as follows:

digitValue(cCharacter) : Integer
  = if c  {‘0’ ... ‘9’}
     then characterToCode(c) - characterToCode(‘0’)
     else if c  {‘A’ ... ‘Z’}
     then characterToCode(c) - characterToCode(‘A’) + 10
     else if c  {‘a’ ... ‘z’}
     then characterToCode(c) - characterToCode(‘a’) + 10
     else 

Character Class Queries

Function   Signature Description
isOrdinaryInitialIdentifierCharacter(c) Character  Boolean Return true if the nonterminal OrdinaryInitialIdentifierCharacter can expand into c and false otherwise
isOrdinaryContinuingIdentifierCharacter(c) Character  Boolean Return true if the nonterminal OrdinaryContinuingIdentifierCharacter can expand into c and false otherwise

Semantic Definitions

Value Definitions

We can define a global semantic constant named var as follows:

var : type = expr

expr should evaluate to a value of type type. expr should not have side effects, and it should not evaluate to .

In the HTML versions of the semantics, each reference to the global semantic constant var is linked to var's definition.

Function Definitions

We can define a global semantic function named f as follows:

f(param1type1, ... , paramntypen) : type = body

param1 through paramn are the function's parameters, type1 through typen are the parameters' respective semantic types, type is the function result's semantic type, and body is an expression that computes the function's result.

The above definition is syntactic sugar for the global constant definition:

f : type1  type2  ...  typen  type = function(param1type1, ... , paramntypenbody

In the HTML versions of the semantics, each reference to the global semantic function f is linked to f's definition.

For example, the function definition

square(xInteger) : Integer = x*x

defines a function named square that takes an Integer parameter x and returns an Integer that is the square of x. This is equivalent to the following global definition:

square : Integer  Integer = function(xIntegerx*x

Type Definitions

We can give a new name to a semantic type t by using the type definition, which has the form:

type name = t

For example, the following notation defines RegExp as a shorthand for tuple {reBodyStringreFlagsString}:

type RegExp = tuple {reBodyStringreFlagsString}

In the HTML versions of the semantics, each reference to the semantic type name name is linked to name's definition.

Semantic Actions

Semantic actions tie together the grammar and the semantics. A semantic action ascribes semantic meaning to a grammar production.

To illustrate the use of semantic actions, we shall look at an example, followed by a detailed description of the notation for specifying semantic actions.

Example

Consider the following grammar, with the start nonterminal Numeral:

Digit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Digits 
   Digit
|  Digits Digit
Numeral 
   Digits
|  Digits # Digits

This grammar defines the syntax of an acceptable input: “37”, “33#4” and “30#2” are acceptable syntactically, while “1a” is not. However, the grammar does not indicate what these various inputs mean. That is the job of the semantics, which are defined in terms of actions on the parse tree of grammar rule expansions. Consider the following sample set of actions defined on this grammar, with a starting Numeral action called (in this example) Value:

type SemanticException = oneof {syntaxError}

action Value[Digit] : Integer = digitValue(Digit)

action DecimalValue[Digits] : Integer

DecimalValue[Digits  Digit] = Value[Digit]

DecimalValue[Digits  Digits1 Digit] = 10*DecimalValue[Digits1] + Value[Digit]

action BaseValue[Digits] : Integer  Integer

BaseValue[Digits  Digit](baseInteger)
  = let dInteger = Value[Digit]
     in if d < base
         then d
         else throw syntaxError

BaseValue[Digits  Digits1 Digit](baseInteger)
  = let dInteger = Value[Digit]
     in if d < base
         then base*BaseValue[Digits1](base) + d
         else throw syntaxError

action Value[Numeral] : Integer

Value[Numeral  Digits] = DecimalValue[Digits]

Value[Numeral  Digits1 # Digits2]
  = let baseInteger = DecimalValue[Digits2]
     in if base  2 and base  10
         then BaseValue[Digits1](base)
         else throw syntaxError

Action names are written in violet cursive type. The last action definition states in the example above that the action Value can be applied to any expansion of the nonterminal Numeral, and the result is an Integer. This action maps all acceptable inputs to integers or syntaxError. If the result is syntaxError, then the input satisfies the grammar but contains an error detected by the semantics; this is the case for the input “30#2”. A result of would indicate a nonterminating computation; this cannot happen in this example.

There are two definitions of the Value action on Numeral, one for each grammar production that expands Numeral. Each definition of an action is allowed to call actions on the terminals and nonterminals on the right side of the expansion. For example, Value applied to the first Numeral production (the one that expands Numeral into Digits) simply applies the DecimalValue action to the expansion of the nonterminal Digits and returns the result. On the other hand, Value applied to the second Numeral production (the one that expands Numeral into Digits # Digits) performs a computation using the results of the DecimalValue and BaseValue applied to the two expansions of the Digits nonterminals. In this case there are two identical nonterminals Digits on the right side of the expansion, so we use subscripts to indicate on which one we're calling the actions DecimalValue and BaseValue.

The BaseValue action illustrates a syntactic sugar for defining an action that is a function; this syntactic sugar is analogous to that for defining global functions.

The Value action on Digit illustrates the direct use of a nonterminal in a semantic expression: digitValue(Digit). Here the Digit semantic expression evaluates to the character into which the Digit grammar rule expands.

We can fully evaluate the semantics on our sample inputs to get the following results:

Input    Semantic Result
37 37
33#4 15
30#2 syntaxError

Action Declarations

action Action[nonterminal] : type

This declaration states that action Action is defined on nonterminal nonterminal. Any reference to action Action[nonterminal] in a semantic expression returns a value of type type. The values of action Action must be defined using action definitions for each grammar production that has nonterminal on the left side.

Action Definitions

Action[nonterminal  expansion] = expr

This notation defines the value of action Action on nonterminal nonterminal in the case where nonterminal nonterminal expands to the given expansion. expansion can contain zero or more terminals and nonterminals (as well as other notations allowed on the right side of a grammar production). Furthermore, the terminals and nonterminals of expansion can be subscripted to allow them to be unambiguously referenced by action references or nonterminal references inside expr.

The type of action Action on nonterminal nonterminal must be declared using an action declaration. expr must have the type given by that action declaration.

nonterminal  expansion must be one of the productions in the grammar.

Action Function Definitions

Action[nonterminal  expansion](param1type1, ... , paramntypen) = body

This notation is a syntactic sugar for defining an action whose value is a function. This notation is equivalent to:

Action[nonterminal  expansion] =
    function(param1type1, ... , paramntypenbody

Combined Action Declarations and Definitions

action Action[nonterminal] : type = expr

This declaration is sometimes used when all expansions of nonterminal nonterminal share the same action semantics. This declaration states both the type type of action Action on nonterminal nonterminal as well as that action's value expr. Note that the expansions are not given between the square brackets, and expr can refer only to the nonterminal nonterminal on the left side of grammar productions. No additional action definitions are needed for nonterminal nonterminal.

See the Value action on Digit in the example above for an example of this declaration.


Waldemar Horwat
Last modified Thursday, November 11, 1999
previousupnext