1. Running Amos II
  2. AmosQL
    2.1 Types
    2.2 Objects
    2.3 Functions
    2.4 Query statements
    2.5 Updates
    2.6 Cursors
  3. Database procedures
    3.1 Iterating over the result of a query
  4. Peer management
    4.1 Peer communication
    4.2 Peer queries and views
  5. Wrapping external systems  5.1 Multi-directional foreign functions
    5.2 Generic wrapper functionality
    5.3 Relational database wrappers
  6. Physical database design
  7. 6.1 Indexing
    6.2 Clustering
  8. System functions and commands
  9. 7.1 Comparison operators
    7.2 Arithmetic functions
    7.3 String functions
    7.4 Aggregation functions
    7.5 Temporal functions
    7.6 Sorting functions
    7.7 Accessing the type system
    7.8 Extents
    7.9 Query optimizer tuning
    7.10 Unloading
    7.11 Miscellaneous
  10. References 
 

Amos II Release 7 User's Manual

Amos II Release 7 Version 1

Staffan Flodin, Martin Hansson, Vanja Josifovski, Timour Katchaounov, Tore Risch, and Martin Sköld

Uppsala DataBase Laboratory
Department of Information Technology
Uppsala University
Sweden

March 29, 2005

Summary

Amos II is an extensible mediator system allowing different kinds of distributed data sources to be integrated. The principles of the Amos II system and AmosQL are described in the document [RJK03], which you are recommended to read first. The system is centered around an object-relational and functional query language, AmosQL. The system can store data in its main-memory object store. Furthermore, wrappers can be defined for different kinds of data sources and external storage managers to make them queryable. The basic wrapper interface is based on user defined multi-directional foreign functions having various capabilities used to access external data sources in different ways [LR92]. On top of the basic foreign function mechanism object oriented abstractions are defined through mapped types [FR97]. Several distributed Amos II peers can collaborate in a federation. This manual describes details of how to use the Amos II system and the AmosQL query language. It includes documentation of basic peer communication primitives, multi-database queries and views, the wrapper functionality, and the predefined wrappers for relational databases through ODBC and JDBC. Multi-database view are further described in [JR99a][JR99b]. The distributed query decomposer is described in [JR02] and [KJR03]. The main difference between Release 7 and earlier releases of Amos II is a new extensible and object-oriented data source wrapping system (Ch 4). 

Acknowledgments

The following persons have contributed to the development of the Amos II system:
Silvio Brandani, Kristofer Cassel, Daniel Elin, Marcus Eriksson, Gustav Fahl, Staffan Flodin, Jörn Gebhardt, Martin Hansson, Björn Hellander, Milena Ivanova, Vanja Josifovski, Markus Jägerskogh, Jonas Karlsson, Timour Katchaounov,  Maryam Ladjvardi, Salah-Eddine Machani, Joakim Näs, Kjell Orsborn, Thomas Padron-McCarthy, Johan Petrini, Giedrius Povilavicius, Tore Risch, Josef Schindler, Martin Sköld, Christian Werner, and Magnus Werner.

1 Running Amos II

It is recommended that each user creates a private directory for Amos II, <privdir>. You should copy the following files to <privdir>
   amos2.exe
amos2.dmp
amos.dll
Amos II is then ready to run in <privdir> by the command: 
amos2 [<db>]
where [<db>] is an optional name of an Amos II database image (default is amos2.dmp which is an empty database). 
The system enters an Amos II top loop where it reads AmosQL statements, executes them, and prints their results. You need not connect to any particular database, but instead, if <db> is omitted, the system enters an empty database (amos2.dmp), where only the system objects are defined. 

When the Amos II database is defined and populated, it can be saved on disk with the AmosQL statement: 

save "filename";
NOTICE: If you save using the name 'amos2.dmp' you overwrite the system database. 

In a later session you can connect to the saved database by starting Amos II with: 

amos2 filename
The prompter in the Amos II top loop is: 
Amos n>
where n is a generation number. The generation number is increased every time an AmosQL database update statement is executed in the Amos II top loop. For example: 
Amos 1> create type Person;
Amos 2> create type Student under Person;
Database changes can be undone by using the rollback statement with a generation number as argument. For example, the statement: 
Amos 3> rollback 2;
will restore the database to the state it had at generation number 2. It thus undoes the effect of the statement: 
create type Student under Person;
After the rollback above, the type Student is removed from the database, but not type Person.
A rollback without arguments undoes all database changes of the current transaction. 

The statement commit makes changes non-undoable, i.e. all updates so far cannot be rolled back any more and the generation numbering starts over from 1. 

For example: 

Amos 2> commit;
Amos 1> ...
To shut down Amos II orderly first save the database and then type: 
Amos 1> quit;

JavaAmos is a version of the Amos II kernel connected to the Java virtual machine. With JavaAmos Amos II foreign functions can be written in Java (the callout interface) and Java functions can call Amos II functions and send AmosQL statements to Amos II for evaluation (the callin interface) [ER00]. To start JavaAmos use the script

JavaAmos
instead of amos2. It will enter a top loop reading and evaluating AmosQL statements as amos2.

The multi-database browser GOOVI [CR01] is a graphical browser for Amos II written as a Java application. You can start the GOOVI browser from the JavaAmos top loop by calling the Amos II function

goovi();
It will start the browser in a separate thread.

2 AmosQL

This section describes the syntax of AmosQL and explains some semantic details. For the syntax we use BNF notation with the following special constructs: 

A ::= B C: A consists of B followed by C. 
A ::= B | C, alternatively (B | C): A consists of B or C.
A ::= [B]: A consists of B or nothing.
A ::= B-list: A consists of one or more Bs.
A ::= B-commalist: A consists of one or more Bs separated by commas.
'xxx': The string (keyword) xxx.

AmosQL statements are always terminated by a semicolon (;).

Comments

The comment statement can be placed anywhere outside identifiers and constants. 
Syntax:
comment ::=
        '/*' character-list '*/'


Identifiers

Identifiers have the syntax:
identifier ::=
        ('_' | letter) [identifier-character-list] 
identifier-character ::=
        alphanumeric | '_'

E.g.: MySalary, x, x1234, x1234_b
Amos II keywords are case sensitive; they are always written with lower case letters. Amos II identifiers are NOT case sensitive; i.e. they are always internally capitalized. 


Variables

Variables are of two kinds: local variables or interface variables:

   variable ::= local variable | interface-variable

  • Local variables are identifiers for data values inside AmosQL queries and functions. Local variables must be declared in function signatures (Function definitions), in from clauses (Query Statements), or by the declare statement (Database procedures). Notice that variables are not case sensitive.
  • Syntax:
    local-variable ::= identifier

       E.g. my_variable, MyVariable

  • Interface variables hold temporary results during a session. Interface variables cannot be referenced in function bodies and are not considered as being parts of the database.
  • Syntax:
    interface-variable ::= ':' identifier

       E.g. :my_interface_variable, :MyInterfaceVariable

    Interface variables are by default untyped (of type Object). The user can declare an interface variable to be of a particular type by the interface variable declare statement:

    interface-variable-declare-stmt ::= 'declare' interface-variable-declaration-commalist

    interface-variable-declaration ::= type-spec interface-variable

       E.g. declare Integer :i, Real :x;

    Interface variables can be assigned either by the into-clause of the select statement or by the interface variable assignment statement:

    set-interface-variable-stmt ::= 'set' interface-variable '=' expr

       E.g. set :i = 2 + sqrt(:x);

Constants

Constants can be integers, reals, strings, booleans, or nil.
Syntax:
constant ::=
        integer-constant | real-constant | boolean-constant | 
        string-constant | 'nil'

integer-constant ::= 
        ['-'] digit-list 

E.g. 123, -123

real-constant ::=
decimal-constant | scientific-constant

decimal-constant ::=
        ['-'] digit-list '.' [digit-list]

scientific-constant ::=
decimal-constant ['e' | 'E'] integer-constant

E.g. 1.2, -1.0, 2.3E2, -2.4e-21

boolean-constant ::=
        'true' | 'false' 

string-constant ::= 
        string-separator character-list string-separator

string-separator ::= 
        ''' | '"'

E.g. "A string", 'A string', 'A string with "', "A string with \" and '"
The surrounding string separators in string constants must be the same. If the string separator is " then \ is escape character inside the string always replacing the succeeding character. For example the string 'ab"\' can also be written as "ab\"\\", and the string a'"b can only be written as "a'\"b" since it contains both string separators.

simple-value ::= constant | variable

   E.g. :MyInterfaceVariable, MyLocalVariable, 123, "Hello World"

A simple value is either a constant or a variable reference.

Expressions

expr ::=  simple-value | function-call | collection | casting

    E.g. 1.23, 1+2, sqrt(:a), {1,2,3}, cast(:p as Student)


An expression is a either a simple value or a function call. An expression can normally be computed to produce a result value. The value of an expression is computed if the expression is entered to the Amos II top loop, e.g.:

   1+5*6;
    => 31


Entering such expressions is the simplest form of AmosQL queries.


Collections

collection ::= bag-expr | vector-expr
bag-expr ::= bag(expr-commalist)
vector-expr ::= '{' expr-comma-list '}'
   E.g. bag(1,2,3), {1,2,3}, {{1,2},{3,4}}, {1,name(:p),1+sqrt(:a)}

Collections represent sets of object. Collections can be either Bags or Vectors. Bags are unordered sets of objects with duplicates allowed, while Vectors are sequences of objects. Bags are often implicit in the system, e.g. the value of functions are often bags without explicitly specifying them to be so.


Statements

Statements instruct Amos II to perform various kinds services. The following statements can be entered to the Amos II top loop:
        create-type-stmt |
        delete-type-stmt
        create-object-stmt
        delete-object-stmt |
        create-function-stmt
        delete-function-stmt |
        query-stmt
        update-stmt |
        add-type-stmt |
        remove-type-stmt |
        for-each-stmt |
        set-interface-variable-stmt
declare-interface-variable-stmt |
commit_stmt |
rollback_stmt |
        open-cursor-stmt
        fetch-cursor-stmt
        close-cursor-stmt |
quit-stmt |
exit-stmt |
procedure-statement

 

2.1 Types

The create type statement creates a new type in the local database.

Syntax: 

create-type-stmt ::=
        'create type' type-name ['under' type-name-commalist]
                ['properties' '(' attr-function-commalist ')']

type-spec ::= type-name | 'Bag of' type-spec | 'Vector of' type-spec

type-name ::= identifier

attr-function ::=
        generic-function-name type-spec ['key']

E.g. create type Person;
create type Student under Person;
create type Kid under Person properties
(name Charstring key,
attitude Integer);
Type names must be unique across all types.
Type names are not case sensitive and the type names are always internally upper-cased. For clarity all type names used in examples in this manual have the first letter capitalized.

The new type will be a subtype of all the supertypes in the 'under' clause. If no supertypes are specified the new type becomes a subtype of the system type named Userobject. Multiple inheritance is specified through more than one supertype.

The attr-function-commalist clause is optional, and provides a way to define attributes for the new type. The attributes are functions having a single argument and a single result and are initially stored functions in the local database (but can be later redefined as other kinds of functions). The argument type of an attribute function is the type being created and the result type is specified by the type-spec. The result type must be previously defined. In the above example the function name has the argument of type Person and result of type Charstring.

If 'key' is specified for a property, it indicates that each value of the attribute is unique.
 

2.1.1 Deleting types

The delete type statement deletes a type and all its subtypes. 

Syntax:

delete-type-stmt ::=
        'delete type' type-name

E.g. delete type Person;
Functions using the deleted type will be deleted as well. 

2.2 Objects

The create object statement creates one or more objects and makes the new object(s) instance(s) of a given user type and all its supertypes.

Syntax: 

create-object-stmt ::=
        'create' type-name
        ['(' generic-function-name-commalist ')'] 'instances' initializer-commalist       

initializer ::= 
        variable |
        [variable] '(' expr-commalist ')'

Examples:
 
   create Person (name,age) instances
          :adam ('Adam',26),:eve ('Eve',32);

   create Person instances :olof;

   create Person (parents) instances
          :tore (bag(:adam,:eve));

create Person (name,age) instances
("Kalle "+"Persson" , 30*1.5);
The new objects are assigned initial values for the specified attributes. The attributes can be any updatable AmosQL functions of a single argument and value.

One object will be created for each initializer. Each initializer can have an optional variable name which will be bound to the new object. The variable name can subsequently be used as a reference to the object.

The initializer also contains a comma-separated list of initial values for the specified functions in generic-function-name-commalist.

Initial values are specified as constants or variables.

The types of the initial values must match the declared result types of the corresponding functions.

Bag valued functions are initialized using the keywords 'Bag of' (syntax bag-value).

Vector result functions are normally initialized with a comma-separated list of values enclosed in curly brackets (syntax vector-value).

It is possible to specify nil for a value when no initialization is desired for the corresponding function.

2.2.1 Deleting objects

Objects are deleted from the database with the delete statement.

Syntax:

delete-object-stmt ::= 
        'delete' variable
Referential integrity is maintained by the system which will automatically delete references to the deleted object. It is thus also removed from all stored functions where it is referenced. 

Deleted objects are printed as

#[OID nnn *DELETED*]
The objects may be undeleted by rollback. An automatic garbage collector physically removes the OIDs from the database only when their creation has been rolled back or their deletion committed, and they are not references from some variable or external system.



2.3 Functions

Functions can be either:
  • Stored functions whose extents are explicitly stored in the local Amos II database.
  • Derived functions which are defined by a single query statement.
  • Foreign functions which are defined in a programming language. There are foreign function interfaces to the programming languages C/C++ [Ris00a], Java [ER00], and Lisp [Ris00a] [Ris00b].
  • Stored procedures which are functions defined using procedural AmosQL statements. Makes AmosQL computationally complete.
  • Overloaded functions which are functions having different implementations depending on the argument types in a function call.
The create function statement creates a new user function.

Syntax:

create-function-stmt ::=
      'create function' generic-function-name argument-spec '->' result-spec [fn-implementation]

generic-function-name ::= identifier

   E.g. create function age(Person) -> Integer;

function-name ::= generic-function-name |
       type-name-list '.' generic-function-name '->' type-name-list

type-name-list ::= type-name | type-name '.' type-name-list

   E.g. plus, number.number.plus->number

argument-spec ::='(' [argument-declaration-commalist] ')' 

argument-declaration ::= type-spec [local-variable] ('key' | 'nonkey')

result-spec ::=   argument-spec | tuple-result-spec 

tuple-result-spec ::= ['<'] argument-declaration-commalist ['>'] 

fn-implementation ::=   'as' (derived-body | procedure-body |
                        foreign-body | 'stored') 

derived-body ::= simple-select-stmt


   E.g. create function name(Person) -> Charstring as stored;
        create function friends(Person) -> Bag of Person as stored;

The argument-spec and the result-spec together specify the signature of the function, i.e. the types and optional names of formal parameters and results.

Semantics:

  • The types used in the signatures must be previously defined. The name of an argument or result parameter can be left unspecified if it is never referenced in the function implementation. The names of the argument and result parameters for a given function definition must be unique.
  • Bag of specifications on a single result parameter declares it to be a bag (set with tuples allowed). Derived functions always have implicit Bag of results so for the result of derived functions 'Bag of' need not be explicitly declared.
  • Derived functions can also have arguments declared 'Bag of' making them aggregation operators.
  • AmosQL functions may also have restricted tuple valued results. This is indicated by bracketing the result declarations (see syntax for multiple-result-spec and example below).
Function names are not case sensitive and are internally stored upper-cased.

2.3.2 Specifying cardinality constraints

A cardinality constraint is a system maintained restriction on the number of allowed occurrences in the database of an argument or result of a function. For example, a cardinality constraint can be that there can be at most one salary and one name per person, while a person may have any number of parents. The only cardinality constraint which is currently supported in AmosQL is to make a specified argument or result of a stored function unique, by specifying it as a key.

For example:
   create function name(Person key) -> Charstring key as stored;
In this case the system guarantees that there is a one-to-one relationship between OIDs of type Person and their names.

If the 'key' cardinality constraint is violated by a database update the following error message is printed:

   Update would violate upper object participation (updating function ...)
The keyword 'nonkey' specifies that the parameter has no cardinality constraint.

The default cardinality constraint for a stored function is key for the first argument of a stored function and nonkey for all other arguments and the result(s). This implies that stored functions are by default single valued.

Since the first argument of a stored function by default is declared as key the function name could thus also have been written:

   create function name(Person) -> Charstring key as stored;

Another example:

   create function married(Person husband,Person wife key)->Boolean as stored;
Polygamous marriages are refused to be stored by the function married, since the first argument husband has the default cardinality constraint key.

The Bag of declaration on the result of a stored function actually just overrides the default key declaration of its 1st argument with nonkey. Thus the function parents above can be written in two alternative ways:

   create function parents(Person) -> Bag of Person as stored;
create function parents(Person nonkey) -> Person as stored;
Cardinality constraints can also be specified for foreign functions. These are important for optimizing queries involving foreign functions. However, it is up to the foreign function implementor to guarantee that specified cardinality constraints hold. Cardinality constraint declarations are ignored for derived functions.

The extent of a function can be regarded as a bag (or relation) of tuples mapping its argument(s) to the corresponding result(s). The key annotation for an argument or result specifies that the argument/result is a key to this extent relation.
 

2.3.3 Deleting functions

Functions are deleted with the delete function statement.

Syntax:

delete-function-stmt ::= 
        'delete function' function-name
For example:
   delete function married;
Deleting a function also deletes all functions using the deleted function.

2.4 Query statements

Queries retrieve objects having specified properties. They are specified using the query statement denoting either variables, constants, function calls, or select statements.

query-stmt ::= select-stmt | expr

      

The simplest and fastest kind of queries is to call a Amos II function defined either by the system or the user, e.g.

	upper("abc");
persoNamed("Kalle");
1+2;
date();
All Amos II functions are optimized when they are defined and simple function calls like the ones above are going to be called without any query optimization. This is called a fast path call to a function. By contrast general select statements require query optimization before they are executed and this is much slower than fast path function calls. You are therefore encouraged to always define queries as derived functions that are called using the fast path interface. 

2.4.1 Function calls

Syntax:

function-call ::=
        function-name '(' [parameter-value-commalist] ')' |
expr infix-operator expr

infix-operator ::= '+' | '-' | '*' | '/'

E.g. sqrt(2.1), 1+2, "a" + "b" 

parameter-value ::=
        expr | '(' simple-select-stmt ')' 


The built-in functions +,-,*,/ have equivalent infix syntax with the usual priorities. 
For example:
 (income(:eve) + income(:ulla)) * 0.5;
is equivalent to:
 times(plus(income(:eve),income(:ulla)),0.5);
The '+' operator is defined for both numbers, strings, and vectors denoting string and vector concatenation for types Charstring and Vector, respectively. Resolution of such overloaded functions is described in Overloaded Functions and Late Binding .

In a function call, the types of the actual parameters and results must be the same as, or subtypes of, the types of the corresponding formal parameters or results. 

2.4.2 The select statement

The select statement provides the most flexible way to specify queries.

Syntax:

select-stmt ::=
        'select' ['distinct'] expr-commalist
                [into-clause] 
                [from-clause]
                [where-clause]

simple-select-stmt ::= 
        'select' ['distinct'] expr-commalist
        [from-clause]
[where-clause]

into-clause ::= 
        'into' variable-commalist 

from-clause ::= 
        'from' variable-declaration-commalist 

variable-declaration ::= 
        type-spec local-variable

where-clause ::=
       'where' predicate-expression

For example:
select name(p) from Person p where age(p)<2 and city(p)="Uppsala"; /* Return singleton values */
select income(x),income(father(x))+income(mother(x)) from Person x; /* Returns tuples */
select {income(x),income(father(x))+income(mother(x))} from Person x; /* Returns vectors */
The expr-commalist defines the object(s) to be retrieved.

The from-clause declares types of local variables used in the query. Query variables can be bound to bags on which aggregation functions can be applied. Sequences (lists, arrays) of values is supported through type Vector.

The where-clause gives selection criteria for the search. The details of the where clause is described below in Predicate expressions .

The result of a select statement is a bag of single result tuples. Duplicates are removed only when the keyword 'distinct' is specified, in which case a set is returned.

An into-clause is available for specifying variables to be bound to the result. NOTICE that in case more than one result tuple is returned, the variables will be bound only to the elements of the first encountered tuple.

For example:

 select p into :eve2 from Person p where name(p) = 'Eve';
 name(:eve2);
This query retrieves into the environment variable :eve2 the Person whose name is 'Eve'. If more than one person is named Eve only the first one will be bound to :eve2.

2.4.3 Predicate expressions

The general syntax of predicate expressions is:
predicate-expression ::=
        predicate-expression 'and' predicate-expression | 
        predicate-expression 'or' predicate-expression | 
        '(' predicate-expression ')' | 
        simple-predicate

E.g. x < 5 or (x > 6 and 5 < y)

simple-predicate ::= 
        expr |
        eq-compare-expr = eq-compare-expr |
expr comparison expr

eq-compare-expr ::=
        expr | tuple-expr

tuple-expr ::=
        '<' expr-commalist '>' 

comparison ::= 
        < | > | <= | >= | !=

E.g. x > 2, "a" != s, 1+y <= sqrt(5.2), parents(p) = <m,f>, odd(:p)
Boolean functions returning type Boolean are treated as predicate expressions. Boolean functions return true or nothing (nil). The constant false is nil casted to type Boolean.

The comparison operators (=, !=, <, <=, and >=) are treated as binary boolean functions. They are all defined for objects of any type.

2.4.4 Overloaded functions, late binding, casting

Function names may be overloaded, i.e., functions having the same name may be defined differently for different argument types. This allows generic functions to apply to several different object types. Each specific implementation of an overloaded function is called a resolvent.

For example, assume the two following Amos II function definitions:

create function less(Number i, Number j)->Boolean
        as select true where i < j; 
create function less(Charstring s,Charstring t)->Boolean 
        as select true where s < t;
Its resolvents will have the signatures:
  less(Number,Number) -> Boolean
  less(Charstring,Charstring) -> Boolean
Internally the system stores the resolvents under different function names. The name of a resolvent is obtained by concatenating the type of its arguments with the name of the overloaded function followed by the symbol '->' and the type of the result.(Function definitions). The two resolvents above will be given the names number.number.less->boolean and charstring.charstring.less->boolean

Overloaded function resolvents are allowed to differ on their argument types and the result types. The query compiler resolves the correct resolvent to apply based on the types of the arguments; the type of the result is not considered. If there is an ambiguity, i.e. several resolvents qualify in a call, or if no resolvent qualify an error will be generated by the query compiler.

When overloaded function names are encountered in AmosQL function bodies, the system will try to use local variable declarations to choose the correct resolvent (early binding). 
For example:

 create function younger(Person p,Person q)->Boolean
        as select less(age(p),age(q));
will choose the resolvent number.number.less->boolean, since age returns integers and the resolvent number.number.less->boolean is applicable to integers by inheritance. The other function resolvent charstring.charstring.less->boolean does not qualify since it is not legal to apply to arguments of type Integer.

On the other hand, this function:

create function nameordered(Person p,Person q)->Boolean
        as select less(name(p),name(q));
will choose the resolvent charstring.charstring.less->boolean since the function name returns a string. In both cases the type resolution (selection of resolvent) will be done at compile time.

Dynamic type resolution at run time, late binding, is done for top loop function call s to choose the correct resolvent. For example,

less(1,2);
will choose number.number.less->boolean

To avoid the overhead of late binding one may use casting. Alternatively the explicit resolvent name notation can be used to explicitly call a specific resolvent:

number.number.less->boolean(1,2);
Function definitions and queries may also conatin late bound overloaded functions. For example, suppose that managers are employees whose incomes are the sum of the income as a regular employee plus some manager bonus: 
 create type Employee under Person;
 create type Manager under Employee;
 create function mgrbonus(Manager)->Integer as stored;
 create function income(Employee)->Integer as stored; 
 create function income(Manager m)->Integer i 
        as select income(cast(m as Employee)) + mgrbonus(m);
Now, suppose that we need a function that returns the gross incomes of all persons in the database, i.e. we use manager.income->integer for managers and employee.income->integer for non-manager. In Amos II such a function is defined as:
 create function grossincomes()->Integer i 
        as select income(p)
        from Employee p;
/* income(p) late bound */
Since income is overloaded with resolvents employee.income->integer and manager.income->integer and both qualify to apply to employees, the resolution of income(p) will be done at run time. If sums of incomes as employees are sought the desired resolvent has to be specified as an explicit resolvent name employee.income->integer

Since the detection of the necessity of dynamic resolution is done at compile time, overloading a function name may lead to a cascading recompilation of functions defined in terms of that function name. This may take some time and the recompilation is therefore deferred until a function marked for recompilation is called. For a more detailed presentation of the management of late bound functions see [FR95].

The type of an expression can be explicitly defined using the casting statement:

 casting ::= 'cast'(expr 'as' type-spec)

for example

 create function income(Manager m)->Integer i 
        as select income(cast(m as Employee)) + mgrbonus(m);

By using casting statements one can avoid late binding and explicit resolvent names.

2.4.5 Subqueries and aggregation operators

Aggregation operators are defined as functions where one or several arguments are declared as bags:
   Bag of Type x
The following system aggregation operators are defined:
   sum(Bag of Number x) -> Number 
   count(Bag of Object x) -> Integer
   maxagg(Bag of Object x) -> Object
   minagg(Bag of Object x) -> Object
   some(Bag of Object x) -> Boolean
   notany(Bag of Object x) -> Boolean
Notice that sum must be applied only to 'uniform' bags of numbers.

Aggregation operators are applied on subqueries. For example, the following query counts how many functions there are currently defined in the database:

count(select f from Function f); 

If an aggregation operator is applied on the result of a function it is first coerced to a subquery. The following to queries return the same values:

count(select allfunctions());
count(allfunctions());
In the second call to count the call to allfunctions is first coerced into a subquery that returns the bag whose elements are counted.

Local variables in queries may be declared as bags that can be used as arguments to aggregation operators. For example totalincomes to compute the sum of all incomes can be defined as:

create function totalincomes()->Integer
  as select sum(select income(p) from Person P);
or
 create function totalincomes()->integer
        as select sum(b) from Bag of Integer b 
        where b = (select income(p) from Person p);
NOTICE: Stored functions cannot be aggregation operators and variables declared 'Bag of' cannot be returned from select statements.

2.5 Updates

Information stored in Amos II can be thought of as mappings from function arguments to results. These mappings are either defined at object creation time (Objects), or altered by one of the function update statements 'set', 'add', or 'remove'. Functions have as extent  a bag of tuples mapping arguments to results. Updating a stored function means updating its extent.

Syntax:

update-stmt ::=
        update-op update-item [from-clause] [where-clause

update-op ::= 
        'set' | 'add' | 'remove' 

update-item ::= 
        function-name '(' expr-commalist ')' '=' expr
Not every function is updatable. Amos II defines a function to be updatable if it is a stored function, or if it is derived from a single updatable function without a join.

Semantics:

  • set sets the value of an updatable function given the arguments. 
  • A boolean function can be set to either 'true' or'false'. Setting the value of a boolean function to 'false' means that the truth value is removed from the function. 

  • add adds the specified tuple(s) to the result of an updatable function with bag result, analogous to set.
  • remove removes the specified tuple(s) from the result of an updatable function with bag result, analogous to set.
The update statements are not allowed to violate the cardinality constraints ('key') specified by the create-type-stmt or the create-function-stmt.

NOTICE that the user can declare explictit update rules for derived functions.


2.5.1 Dynamically changing object types

The add-type-stmt changes the type of one or more objects to the specified type.

Syntax:

add-type-stmt ::=
        'add type' type-name ['(' [generic-function-name-commalist] ')'] 
        'to' variable-commalist
The updated objects may be assigned initial values for all the specified property functions in the same manner as in the create object statement.

The remove-type-stmt makes one or more objects no longer belong to the specified type.

Syntax:

remove-type-stmt ::= 
        'remove type' type-name 'from' variable-commalist
Referential integrity is maintained so that all references to the objects as instances of the specified type cease to exist.  An object will always be an instance of some type. If all user defined types have been removed, the object will still be member of userobject.

2.5.1 User update procedures

It is possible to register user defined user update procedures for any function. The user update procedures are stored procedures that are transparently invoked when update statements are executed for the function.

    set_addfunction(Function f, Function up)->Boolean
  set_remfunction(Function f, Function up)->Boolean
  set_setfunction(Function f, Function up)->Boolean

The function f is the function for which we wish to declare a user update function and up is the actual update procedure. The arguments of a user update procedures is the concatenation of argument and result tuples of f. For example, assume we have a function

  create function netincome(Employee e) -> Number
         as select income(e)-taxes(e);

Then we can define the following user update procedure:

  create function set_netincome(Employee e, Number i)-> Boolean
         as begin
               set taxes(e)= i*taxes(e)/income(e) + taxes(e);
               set income(e) = i*(1-taxes(e))/income(e) + income(e);
            end;

The following declaration makes netincome updatable with the set statement:

  set_setfunction(functionnamed("EMPLOYEE.NETINCOME->NUMBER"),
                  functionnamed("EMPLOYEE.NUMBER->BOOLEAN"));

Now one can update netincome with, e.g.:

  set netincome(p)=32000 from Person p where name(p)="Tore";

2.5.3 Dynamic updates

Sometimes it is necessary to be able to create objects whose types are not known until runtime. Similarly one may wish to update functions without knowing the name of the function until runtime. For this there are the following system procedures:

createobject(Type t)->Object
createobject(Charstring tpe)->Object
deleteobject(Object o)->Boolean
addfunction(Function f, Vector argl, Vector resl)->Boolean
remfunction(Function f, Vector argl, Vector resl)->Boolean
setfunction(Function f, Vector argl, Vector resl)->Boolean

createobject creates a surrogate object of the type specified by its argument.

deleteobject deletes a surrogate object.

The stored procedures registered by setfunction, addfunction, or remfunction updates a function given an argument list and a result tuple as vectors. Returns TRUE if the update succeeded.

2.6 Cursors

Cursors provide a way to iterate over the result of a query or a function call. The open-cursor-stmt and the fetch-cursor-stmt iterate over the result from a query or function call.

Syntax:

open-cursor-stmt ::=
        'open' cursor-name 'for' simple-select-stmt 

cursor-name ::= 
        variable 

fetch-cursor-stmt ::= 
        'fetch' cursor-name (into-clause | next-clause) 

next-clause ::= 
        'next' integer-constant 

close-cursor-stmt ::=
        'close' cursor-name
For example:
create person (name,age) instances :Viola ('Viola',38);
open :c1  for select p from Person p where name(p) = 'Viola'; 
fetch :c1 into :Viola1;
close :c1; 
name(:Viola1); 
--> "Viola";
A cursor is created by the open-cursor-stmt and is represented by a bag of result tuples containing objects with unknown types.

The result of the query is always materialized and stored in the cursor bag.

The fetch-cursor-stmt fetches the first result tuple from the cursor; i.e. the tuple is removed from the front of the cursor bag. nil is returned if there are no more result tuples left in the cursor.

If present in a fetch-cursor-stmt, the into clause will bind elements of the first result tuple to AmosQL interface variables. There must be one interface variable for each element in the result tuple.

If present in a fetch-cursor-stmt, the next-clause will display the specified number of result tuples and remove them from the cursor bag.

If neither a next nor an into clause is present in a fetch-cursor-stmt, a single result tuple is fetched and displayed.

The close-cursor-stmt deallocates the cursor bag.

It is sometimes useful to count the number of result tuples in a cursor bag:

create function ageofpersonnamed(Charstring nm)-> Integer a
        as select age(p) from Person p where name(p)=nm; 
open :c1 for select ageofpersonnamed('Eve');
count(:c1); 
1

3 Database procedures

A database procedure is an Amos II function defined as a sequence of AmosQL statements that may have side effects (i.e. database update statements or variable assignments). Procedures may return a stream of results by using a special result statement. Each time result is called another result item is emitted from the procedure. Procedures should not be used in queries (but this restriction is currently not enforced). Most, but not all, AmosQL statements are allowed in procedure bodies as can be seen by the syntax below.

Syntax: 

    procedure-body ::=
        block | procedure-stmt

procedure-stmt ::=
        create-object-stmt
        delete-object-stmt
        for-each-stmt |
        update-stmt |
        add-type-stmt |
        remove-type-stmt |
        set-local-variable-stmt
        query-stmt
        if-stmt |
commit-stmt |
abort-stmt

    block ::= 
        'begin' 
               ['declare' variable-declaration-commalist ';'] 
               procedure-body-semicolonlist 
        'end'        

    result-stmt ::= 
        'result' expr 

    for-each-stmt ::= 
        'for each' [for-each-option] variable-declaration-commalist 
                [where-clause] for-each-body 

for-each-option ::= 'distinct' | 'original'

for-each-body ::= procedure-body

    if-stmt ::= 
        'if' expr 
        'then' procedure-body 
        ['else' procedure-body

set-local-variable-stmt ::= 
        'set' local-variable '=' function-call
Examples: 
 create function creperson(Charstring nm,Integer inc) -> Person p 
                as 
                begin 
                  create Person instances p; 
                  set name(p)=nm; 
                  set income(p)=inc; 
                  result p; 
                end;

 set :p = creperson('Karl',3500); 

 create function makestudent(Object o,Integer sc) -> Boolean 
                as add type Student(score) to o (sc); 

 makestudent(:p,30); 

 create function flatten_incomes(Integer threshold) -> Boolean 
                as for each Person p where income(p) > threshold 
                     set income(p) = income(p) - 
                                     (income(p) - threshold) / 2;

 flatten_incomes(1000);
NOTICE: Queries and updates embedded in procedure bodies are optimized at compile time. The compiler saves the optimized query plans in the database so that dynamic query optimization is not needed when procedures are executed. 

Procedures may return (bags of) results as a streams. The result-stmt is used for this, where expr is returned as the result from the procedure.  If result is never called the result of the procedure is nil. Usually the result type of procedures not returning any value is Boolean.

3.1 Iterating over the result of a query

The for-each statement iterates over the result of the query specified by the variable declarations (variable-declaration-commalist) and the where clause executing the for-each-body for each result variable binding of the query. For example the following function adds inc to the incomes of all persons with salaries higher than limit and returns their old incomes: 

 create function increase_incomes(Integer inc,Integer limit)
                                         -> Integer oldinc 
            as for each Person p, Integer i 
               where i > limit 
                 and i = income(p) 
              begin 
                result i; 
                set income(p) = i + inc 
              end;
NOTICE: The result statement does not not abort the control flow (different from, e.g., return in C), but it only specifies that a value is to be emitted to the result bag (i.e. stream) of the function and then the procedure evaluation is continued as usual. The for-each-stmt does not return any value at all unless result-stmt is used within its body.  

The for-each-option specifies how to treat the result of the query iterated over. If it is omitted the system default is to first materialize a copy of the result of the query iterated over before applying the for-each-body on each element of the copy. If 'distinct' is specified the iteration is over a copy where duplicates in addition have been removed. If  the option is 'original' the code is applied  directly on the database contents in place.

NOTICE: Even though option 'original' can be very efficient and useful there is a problem sometimes. In the example, if 'original' had been specified and the stored function income(p) happen to have an ordered index sorted on the result the iteration would loop for ever! The reason is that the ordered index would be scanned to efficiently match the condition 'i > limit' and therefore the income updates will be made in income order, rather than person order. Thus each income updated a new row with the higher updated income will be inserted into the index after the current income and therefore the same person will be visited again. Therefore the iteration will loop forever.

On the other hand the space cost of the default copy option may be very high for a massive iteration over, e.g., an external very large wrapped relational table. In such cases the 'original' option can be used for better space efficiency.


 

4 Peer management

Using a basic Amos II peer communication system, distributed Amos II peers can be set up that communicate using TCP/IP. A federation of Amos II peers is managed by a name server which is an Amos II peer knowing addresses and names of other Amos II peers. Queries, AmosQL commands, and results can be shipped between peers. Once a federation is set up, multi-database facilities can be used for defining queries and views spanning several Amos II peers [RJK03]. Mediation  and reconcilitation primitives can be used for defining object-oriented multi-peer views [JR99a][JR99b].

4.1 Peer communication


The following AmosQL system functions and procedures are available for inter-peer communication:

   nameserver(Charstring name)->Charstring
Makes the current stand-alone database into a name server and registers there itself as a peer with the given name. If name is "" the name server will become anonymous and not registered as a peer.

   listen()
Starts the mediator server listening loop. Informs the name server that this peer is ready to receive incoming messages. The listening loop can be interrupted with CTRL-C and resumed again by calling listen(). The name server must be listening before any other peer can register.

   register(Charstring name)->Charstring
Registers in the name server the current stand-alone database as a peer with the given name. The name server must be running on the local host. The system will complain if the name is already registered in the name server.  The peer needs to be activated with listen() to be able to receive incoming requests.

   register(Charstring name, Charstring host)->Charstring
Registers the current database as a peer in the name server running on the given host. Signals error if peer with same name already registered in federation.

   reregister(Charstring name)->Charstring
   reregister(Charstring name, Charstring host)->Charstring

as register but unregisters other peer with same name rather than signalling error. Good when restarting peer registered in name server after crash.  

   this_amosid()->Charstring name
Returns the name of the peer where the call is issued. Returns "NIL" if issued in a not registered standalone Amos II system.

   other_peers()->Bag of Charstring name
Returns the names of the other peers in the federation managed by the name server.

   ship(Charstring peer, Charstring cmd)-> Bag of Vector
Ships the AmosQL command cmd for execution in the named peer. The result is shipped back to the caller as a set of tuples represented as vectors.  If an error happens at the other peer the error management is also shipped back.

   at "peer" select ...;
Skips the query to the specified peer for execution. Ships the result back as bag of tuples.

   send(Charstring peer, Charstring cmd)-> Charstring peer
Sends the AmosQL command cmd for asynchronous execution in the named peer without waiting for the result to be returned.

   broadcast(Charstring cmd)-> Bag of Charstring
Sends the AmosQL command cmd for asynchronous execution in all other peers. Returns the names of the receiving peers.

   hostname()->Charstring name
returns the name of the host where the current peer is running.

   kill_peer(Charstring name)->Boolean
Kills the specified peer. Don't kill the name server if you still want to be able to communicate with other peers!

4.1 Peer queries and views

Once you have established connections to Amos II peers you can define multi-database queries views of data from your peers. You first have to import the meta-data about selected types and functions from the peers. This is done by defining proxy types and proxy functions [RJK03] using the system function import_types:

   import_types(Vector of Charstring typenames,  Charstring p)-> Bag of Type pt
defines proxy types pt for types named typenames in peer p. Proxy functions are defined for the functions in p having the imported types as only argument. Inheritance among defined proxy types  is defined by the corresponding  inheritance relationships between imported types in the peer p.

Once the proxy types and functions are defined they can transparently be queried. Proxy types can be references using @ notation to reference types in other peers.
For example,
   select name(p) from Person@p1;
selects the name property of objects of type Person in peer p1.

import_types imports only those functions having one of typenames as single arguments. You can import other functions using system function import_func:

   import_func(Charstring fn, Charstring p)->Function pf
imports a function named fn from peer p returning proxy function pf.

On top of the imported types object-oriented multi-peer views can be defined, as described in [RJK03] consisting of derived types [JR99a] whose extents are derived through queries intersecting extents of other types, and IUTs [JR99b] whose extents reconciles unions of other type extents. Notice that the implementation of IUTs is limited. (In particular the system flag latebinding('OR'); must be set before IUTs can be used and this may cause other problems).


5 Wrapping external data sources

Amos II contains number of primitives for accessing different external data sources by defining wrappers for each kind external sources. The wrapper functionality in Amos II is greatly enhanced compared to earlier releases of Amos II. In particular wrappers allow specification of different capabilities for different kinds of data sources [Han03]. Capabilities provide hooks to define data source specific re-write rules and cost models.

Wrappers can furthermore be made updatable so that a wrapped existing data source appears to the mediator as an object-relational database. Wrappers for relational databases using JDBC or ODBC interfaces are predefined in Amos II. There are also wrappers for accessing other Amos II peers.

This chapter first describes multi-directional foreign functions [LR92], the basis for accessing external systems from Amos II queries and defining wrappers. Then some general types and functions used for defining meta-data of eternal sources are described. Finally we describe the relational database wrappers for JDBC and ODBC.

5.1 Multi-directional foreign functions

The basis for accessing external systems from Amos II is to define foreign functions. A foreign functions allows subroutines defined in C/C++ [Ris00a], Lisp [Ris00a] [Ris00b], or Java [ER00] to be called from Amos II queries. This allows access to external databases, storage managers, or computational libraries. A foreign function is defined by  the following function implementation:

foreign-body ::= simple-foreign-definition | multidirectional-definition

simple-foreign-body ::= 'foreign' [string-constant]

multidirectional-definition ::= 'multidirectional' capability-list

capability ::= (binding-pattern 'foreign'
string-constant ['cost' cost-spec]['rewriter' string-constant])

binding-pattern ::=
A string constant containing b:s and f:s

cost-number ::= integer-constant | real-constant

cost-spec ::= function-name | '{' cost-number ',' cost-number '}'

  For example:

   create function iota(Integer l, Integer u) -> Integer as foreign 'iota';
   create function sqroot(number x)-> Number r
       as multidirectional
          ("bf" foreign 'sqrt' cost {2,2})
          ("fb" foreign 'square' cost {1.2,1});
   create function bk_access(integer handle_id )-><vector  key, vector >
       /* Core cluster function */                    
       as multidirectional
          ('bff' foreign 'bk_scan' cost bk_cost rewriter "abkw")
          ('bbf' foreign 'bk_get' cost {1002, 1});
   create function myabs(real x)->real y
       as multidirectional
          ("bf" foreign "JAVA:Foreign/absbf" cost {2,1})
          ("fb" foreign "JAVA:Foreign/absfb" cost {4,2});

The syntax using multidirectional definition is the general one having several options for different capabilities of foreign functions. The simplified syntax using simple-foreign-body is mainly for quick implementations of functions without paying attention to expensive functions or scalable query execution.

A foreign function can implement either
  1. Filters which are predicates (boolean valued functions, e.g. '<') that succeed when certain conditions over its results are satisfied.
  2. Computations that produce a result given that the arguments are known, e.g. +..
  3. Generators that produce a stream of several  result tuples given the argument tuple, e.g. iota or a function sending SQL statements to a relational database for evaluation.
Amos II functions in general are multi-directional, i.e. they can be executed also when the result of a function is given and some corresponding argument values are sought. For example, if we have a function
    parents(person)-> bag of person 
we can ask these AmosQL queries:
    parents(:p);  /* Result is the bag of parents of :p */
select c from person c where parents(c) = :p;
                /* Result is bag of children of :p */ 
It is often desirable to make foreign Amos II functions multi-directional as well. For example, we may wish to ask these queries using the square root function sqroot above:
    sqroot(4.0);    /* Result is -2.0 and 2.0 */ 
select x from number x where sqroot(x)=4.0;
                /* result is 16.0 */
select sqroot(4.0)=2.0;
/* Is the square root of 4 = 2 */
With simple foreign functions only the first (forward) function call is possible. Multi-directional foreign functions permit also the second kind of queries.
The benefit of multi-directional foreign functions is that a larger class of queries calling the function is executable (safe), and that the system can make better query optimization. A multi-directional foreign function can have several capabilities implemented depending on the binding pattern of its arguments and results. The binding pattern is a string of b:s and f:s, indicating which arguments or results in a given situation are known or unknown, respectively.

For example, the square root (sqroot) actually has the following possible binding patterns:

(1) If we know x but not r, the binding pattern is "bf" and the implementation should return r as the square root of x.
(2) If we know r but not x, the binding pattern in "fb" and the implementation should return x as r**2.
(3) If we know both r and x then the binding pattern is "bb" and the implementation should check that x = r**2.
 
Case (1) and (2) are implemented by multi-directional foreign function sqroot above.
Case (3) need not be implemented as it is inferred by the system by first executing r**2 and then checking that the result is equal to x (see [LR92]).

To implement a multi-directional foreign function you first need to think of which binding patterns require implementations. In the sqroot case one implementation handles the square root and the other one handles the square. The binding patterns will be "bf" for the square root and "fb" for the square.

The following steps are required to define a foreign function:
  1. Implement each foreign function capability using the interface of the implementation language.
  2. In case the foreign code is statically linked (only in C/C++) the implementation must be linked to the kernel by make a driver program referencing it as an external function [Ris00a]. This is not needed for foreign functions in Java or Lisp that are dynamically loaded into the system.
  3. Statically linked code must also be assigned a symbolic name to be referenced in the foreign function capability definition (sqrt and square in the example) See [Ris00a].
  4. Finally the multidirectional foreign function needs to be defined through a foreign function definition in AmosQL as above. Here the implementor also associates a binding pattern and optional cost estimates with each capability. Normally the foreign function definition is done separate from the actual code implementing its capabilities, in an AmosQL script.
For further details on how to define multidirectional foreign functions for the different implementation languages see  [Ris00a][Ris00b].

A capability can also have an associated rewrite rule (keyword rewriter) implemented in Lisp. It is called by the query processor to generate several foreign function calls from an applied capability. This requires some knowledge about the Amos II internals. It is not further elaborated here.

5.1.1 Cost and result size estimates

Different capabilities of multi-directional foreign functions often have different execution costs. In the sqroot example the cost of computing the square root is higher than the cost of computing the square. When there are several alternative implementations of a multi-directional foreign function the cost-based query optimizer needs cost hints that help it choose the most efficient implementation. In the example we might want to indicate that the cost of executing a square root is double as large as the cost of executing a square.

Furthermore, the query optimization is also very much dependent on the expected size of the result from a function call. This is called the fanout (or selectivity of predicates) of the call for a given binding pattern. In the sqroot example the implementation sqrtbf usually has a fanout of 2, while sqrtfb has a fanout of 1.

Thus for good query optimization each foreign function capability should have associated costs and fanouts:
  • The cost is an estimate of how expensive it is to completely execute a foreign function for given set of arguments.
  • The fanout estimates the expected number of elements in the result stream, given the arguments.
The cost and fanout for a multidirectional foreign function implementation can be either specified as a constant vector of two numbers (as in sqroot above) or as an Amos II cost function returning the vector of cost and fanout for a given function call. The numbers normally need only be rough numbers, as they are used by the query optimizer to compare the costs of different possible execution plans to produce the optimal one. The number 1 for the cost of a foreign function should roughly be the cost to perform a cheap function call, such as '+' or '<'.

If the simplified syntax is used or no cost is specified the system tries to put reasonable defaults on foreign functions. It assumes they are cheap with low fanout. If you have expensive foreign functions you are strongly advised to specify cost hints. It is also advisable to specify a fanout for functions you know have less or equal to 1 or greater than 2. The system assumes aggregation functions are expensive to execute by default.

The cost function cfn is an Amos II function with signature
create function <cfn>(Function f, Vector bpat, Vector args)  
-> <Integer cost, Integer fanout> as ...;
e.g.

create function typesofcost(Function f, Vector bpat, Vector args)
-> <Integer cost, Integer fanout> as foreign ...;
The cost function is normally called at compile time when the optimizer needs the cost and fanout of a function call in some query. The arguments and results of the cost function are:

f       is the full name the called Amos II function.
bpat is the binding pattern of the call as a vector of strings ’b’ and ’f’, e.g. {"f","b"} indicating which arguments in the call are bound or free, respectively.
argl is a vector of actual variable names and constants used in the call.
cost   is the computed estimated cost to execute a call to f with the given binding pattern and argument list. The cost to access a tuple of a stored function (by hashing) is 2; other costs are calibrated accordingly.
fanout is the estimated fanout of the execution, i.e. how many results are returned from the execution.

If the cost hint function does not return anything it indicates that the function is not executable in the given context.

The costs and fanouts are normally specified in each capability specification of a multidirectional foreign function definition, as in the example. The costs can also be specified after the definition of a foreign function by using the following Amos II system procedure:
   costhint(Charstring fn,Charstring bpat,Vector ch)->Boolean  
e.g.
   costhint("number.sqrt->number","bf",{4,2}); 
costhint("number.sqrt->number","fb",{2,1});
fn is the full name of the resolvent.
bpat is the binding pattern string.
ch is a vector with two numbers where the first number is the estimated cost and the second is the estimated fanout.

A cost function cfn can be assigned to a capability with:
costhint(Charstring fn, Charstring bpat, Function cfn)  -> Boolean

To find out what cost estimates are associated with a function use:
   costhints(Function r)-> Bag of <Charstring bpat, Object q>
It returns the cost estimates for resolvent r and their associated binding patterns.

5.2 Generic wrapper functionality

Some wrapper functionality is completely data source independent while other functionality is specific for a particular kind of data source. Therefore, to share wrapper functionality Amos II contains a hierarchy of wrappers. In this section we describe common functionality used for defining any kind of wrapper.

5.2.2 Data sources

Types and objects of type Datasource describe properties of different kinds of data sources accessible through Amos II. Each kind of wrapped data source has a corresponding data source type, for example type Amos, Relational, or Jdbc to describe Amos II peers, relational databases, or relational data bases accessed through JDBC, respectively. These types are all subtypes of type Datasource. Each instance of a data source type represents a particular data source of that kind, e.g. a particular relational database accessed trough JDBC are instances of type Jdbc.

5.2.3 Mapped types

A mapped type is a type whose instances are identified by a key consisting of one or several other objects [FR97]. Mapped types are needed when proxy objects corresponding to external values from some data source are created in a mediator. For example, a wrapped relational database may have a table PERSON(SSN,NAME,AGE) where SSN is the key. One may then wish to define a mapped type named Pers in the mediator representing proxy objects for the persons in the table. The instances of the proxy objects are identified by a SSN. The type Pers should furthermore have the following property functions derived from the contents of the wrapped table PERSON:
     ssn(Pers)->Integer
     name(Pers)->Charstring
     age(Pers)->Integer


The instances and primary properties of a mapped type are defined through a Core Cluster Function that returns these as a set of tuples, one for each instance. In our example the core cluster function should return tuples of three elements <ssn,name,age>. The relational database wrapper will automatically generate such a core cluster function for table PERSON (type Pers) with signature:
   pers_cc()->Bag of <Integer ssn, Charstring name, Integer age> as ...

A mapped type is defined through a system procedure with signature:
   create_mapped_type(Charstring name, Vector keys, Vector attrs, Charstring ccfn)-> Mappedtype
where
   name is the name of the mapped type
   keys is a vector of the names of the keys identifying each instance of the mapped type.
   attrs is a vector of the names of the properties of the mapped type.
   ccfn is the name of the core cluster function.

In our example the relational database wrapper will automatically define the mapped type Pers through this call:
   create_mapped_type("Pers",{"ssn"},{"ssn","name","age"},"pers_cc");

Notice that the implementor of a mapped type must guarantee key uniqueness.

Once the mapped type is defined it can be queried as any other type, e.g.:
   select name(p) from Pers p where age(p)>45;

5.2.3 Type translation

Types in a specific data source are translated to corresponding types in the mediator using the following system functions:
  • amos_type(Datasource ds, Charstring native_type_name) -> Type;
  • for example
    amos_type(relational_named("IBDS"),"VARCHAR");
    amos_type returns the Amos II type corresponding to the a specific data source.
  • wrapped_type(Datasource ds, Type t) -> Charstring typename;
  • for example
    wrapped_type(relational_named("IBDS"),typenamed("CHARSTRING"));
    returns the data source type corresponding to an Amos II type. Since one external type may correspond to more than one Amos II type wrapped_type is not the inverse of amos_type
The most common relational types and their Amos II counterparts are provided by default. Both functions are stored functions that can be updated as desired for future wrappers.

5.3 Relational database wrappers

There are predefined wrappers for relational databases using either the ODBC or JDBC standard relational database interfaces. Everything that is above the ODBC/JDBC interface level (type conversion, translation etc) has been lifted to an abstract Relational Wrapper specified through a data source named Relational. The specific ODBC/JDBC wrappers override the abstract function interfaces of the general relational wrapper.

The ODBC wrapper is primarily tested with Microsoft MS Access and the JDBC wrapper is tested with FireBird's InterClient JDBC driver and Microsoft's SQLServer driver. 

5.3.1 The abstract wrapper Relational

An instance of type Relational represents a relational database and the functions represents interfaces and capabilities of a relational database in general. The Relational wrapper is an abstract wrapper in the sense that it does not implement a specific data source and thus has no instances, but rather it specifies the functionality that a general relational database interface must implement. In the type hierarchy below the abstract Relational wrapper there are specific implemented wrappers for the relational database interface standards ODBC and JDBC. The subtypes have one instance for each data source of those kinds that the system accesses. The following type hierarchy is used:

	Datasource
|
Relational
/ \
Jdbc Odbc

All interface functions for relational databases are defined for type Relational. Some relational database interface functions are defined as abstract functions. Abstract functions are implemented as foreign functions having a special implementation ... foreign 'abstract-function'. Abstract functions cannot be executed and a run-time error will be raised if they are forced to; they always have to be overridden by a corresponding resolvent for a subtype.

The use of abstract functions makes sure that Amos II will use dynamic type resolution to look for implementations of these functions in any subtype to Relational. The net effect is that the interface for any relational database wrapper is explicitly specified and must be followed. This is exactly what the JDBC and ODBC wrappers do. The JDBC and ODBC wrappers and implementation of these functions will be explained below, for now we will assume that the functions are implemented and work for some subtype of Relational.

5.3.1.1 Relational interface functions

The relational data source interface functions provide the general interface for accessing different kinds of relational databases. Some of the interface functions are abstract and need to be defined specifically for each subtype of Relational. Other data source specific functions, such as the system function import_table rely on this.
The basic relational interfaces functions provide a low level interface to a specific relational DBMS application programming interface.   The instances of relational data sources are created using data source type specific data source constructor functions that loads necessary drivers and looks up that such a data source exists. For example,
    jdbc("ibds","interbase.interclient.Driver"); 
creates a new JDBC data source instance named IBDS (capitalized) using the Interbase interclient driver.

Each relational database source requires the following data source interface functions:

  • relational_named(Charstring nm)-> Relational
    for example
    
            
  • relational_named("IBDS"); relational_named returns the relational data source instance with a given name. The example returns the relational data source named IBDS.
  • connect(Relational r, Charstring ds_identifier, 
    Charstring username, Charstring password) -> Relational
    for example
    
            
  • connect(relational_named("IBDS"),"EMPLOYEE.GDB","SYSDBA","masterkey"); connect opens a new connection to a relational data source. The function is abstract for type Relational and implemented for each subtype of Relational.
  • disconnect(Relational r) -> Boolean
  • disconnect is an abstract function that close the connection. Specific implementation for each subtype of Relational
  •  tables(Relational r) 
    -> Bag of <Charstring table, Charstring catalog,
    Charstring schema, Charstring owner>
  • For example
    tables(relational_named("IBDS"));
    tables returns a bag of table description tuples for a given relational database source. The relational wrappers use this function to find the tables of the relational database. The function is abstract for type Relational. It is called by the (non-abstract) generic wrapper function has_table defined below.
  • columns(Relational r, Charstring table_name) 
    -> Bag of <Charstring column_name, Charstring column_type>
  • for example
    columns(relational_named("IBDS"),"SALES");
    columns returns, for a given table table_name in a relational data source r, its columns along with their SQL data types. The function is abstract for type Relational.
  • primary_keys(relational r, charstring table_name) 
    -><charstring column_name, charstring constraint_name>
  • for example
    primary_keys(relational_named("IBDS"),"SALES");
    primary_keys returns, for a given relational table table_name, a bag of the columns constituting the primary key. constraint_name is the name of a primary key constraint in the relational database if available. The function is abstract for type Relational.
  • has_table(Relational r, Charstring table_name) -> Boolean
  • for example
    has_table(relational_named("IBDS"),"SALES");
    has_table returns TRUE if the table table_name exists in the relational database r.
  • cardinality(Relational r, Charstring table_name) -> Integer
  • for example
    cardinality(relational_named("IBDS"),"SALES");
    cardinality returns the number of tuple in table table_name in the relational database r. The function is abstract for type Relational.
  • import_table(Relational r, Charstring table_name)
  • for example
    import_table(relational_named("IBDS"),"SALES");
    import_table creates an object-oriented view of a relational table in the mediator peer. The view is represented by a mapped type which is a type whose extent is defined by the rows of the table. Each instance of the mapped type corresponds to a row in the table. For the mapping, the instances of a mapped type internally stores the key of the corresponding table row. The name of the mapped type is constructed by concatenating the table name, the character _ and the data source name, for example Person_ibds. Mapped type names are internally capitalized, as for other Amos II types.

For each columns in the table import_table will generate a corresponding derived wrapper function returning the column's value given an instance of the mapped type. For example, a table named person having the column ssn will have a function

     ssn(Person_ibds)->Integer 
returning the ssn of a person from the imported relational table.

  • import_table(Relational r, 
    Charstring table_name,
    Boolean updatable)
    -> Mappedtype mt
    for example
    import_table(relational_named("IBDS"),"COUNTRY",true);
    The extra argument allows the user to specify the wrapper to be updatable. This means that Amos II update statements to the wrapper functions are transparently translated into corresponding SQL update statements of the wrapped table and create statements for the mapped type is translated into SQL insert statements.

    Updatable wrappers define transparent object-oriented data abstractions over existing relational database tables.

    For example, if there are columns CURRENCY and COUNTRY in table country, a new rows in the table added with

     create Country_ibds(currency,country) instances ("Yen","Japan");
    The currency of Japan can be transparently updated too:
     set currency(c)= "Jen" from Country_ibds c where country(c)= "Japan";

The most general resolvent of import_table is:

  • import_table(Relational r, Charstring catalog_name,
    Charstring schema_name, Charstring table_name,
    Charstring typename, Boolean updatable,
    Vector supertypes) -> Mappedtype mt
  • The table resides in the given catalog and schema. If catalog is "", the table is assumed not to be in a catalog. If schema is "", the table is assumed not to be in a schema. typename is the desired name of the mapped type created, as alternative to the system generated concatenation of table and data source name. updatable gives an updatable mapped type. supertypes is a vector of either type names or type objects, or a mixture of both. The effect is that Amos II will perceive the mapped type as an immediate subtype of the types.
There are also two other variants of import_table to be used for relational databases where schema and catalog names need not be specified:
  • import_table(Relational r, 
    Charstring table_name,
    Charstring typename,
    Boolean updatable,
    Vector supertypes) -> Mappedtype mt

    import_table(Relational r,
    Charstring table_name,
    Charstring type_name,
    Boolean updatable) -> Mappedtype mt
  • drop_table(Relational r, Charstring name) -> Integer
  • for example
    drop_table(relational_named("IBDS"),"SALES");
    drop_table deletes the specified relational database table.

5.3.1.2 Importing many-to-many relationships

All tables in relational databases do not correspond to 'entities' in an ER diagram and therefore cannot be directly mapped to types. The most common case is tables representing many-to-many relationships between entities. Typically such tables have two columns, each of which is a foreign key imported from two other tables, expressing the many-to-many relationship between the two. Obviously, only entities are desired to be imported as types and special types should not be generated for such relationship tables. A many-to-many relationship in a relational database corresponds directly to a function returning a bag in AmosQL, and should be imported using import_relation rather than import_table:

  • import_relation(Relational r,
    Charstring table_name, Charstring argument_column_name,
    Charstring result_column_name, Charstring function_name,
    Boolean updatable)
    -> Function
    • table_name - the name of the table containing the relation.
    • argument_column_name - the name of the column which is argument of the function.
    • result_column_name - the name of the column which is result of the function.
    • function_name - the desired name of the function.
    • updatable - whether the function should be transparently updatable via set, add, and remove.

For example, assume we have two entities, person and telephone. Most of the time telephones are not regarded as entities in their own respect since nobody would care to know more about a telephone than its extension number. In this scenario however, the physical location of the telephone is also kept in the database, so telephones have a table of their own.

A person can be reached through several telephones, and every telephone may be answered by several person. The typical schema looks as follows:

person
ssn name ...



person_telephones
ssn ext_no



telephone
ext_no location ...



Now, the correct way to wrap this schema is the following way:
  1. import_table(my_relational, 'person');
  2. import_table(my_relational, 'telephone');
  3. import_relation(my_relational, 'telephone', 'ssn','ext_no','phones', false);
    or
    import_relation(my_relational, 'telephone', 'ext_no',ssn','phones', false);
    (a matter of taste)
  4. create function phones(person@my_relational p) -> telephone@my_relational t as select t where phones(ssn(p)) = ext_no(t);
    or, if the other direction was chosen:
    create function phones(telephone@my_relational t) -> person@my_relational p as select p where phones(ext_no(t)) = ssn(p);

As seen above, the user has the option of importing a many-to-many table as two different functions. The reason is simple: Whereas functions have directions, many-to-many relationships are direction neutral. Thus the additional information of desired direction must be expressed by the user.

Also note that only a function with a single argument and result can be created.

  • imported_keys(Jdbc j, Charstring fktable) 
    -> <Charstring pktable, Charstring pkcolumn, Charstring fkcolumn>
  • for example
    imported_keys(relational_named("IBDS"),"SALES");
    • pktable - The table referenced by this foreign key.
    • pkcolumn - The column referenced by this foreign key.
    • fkcolumn - The foreign key column in this table.

    imported_keys returns information about the exported keys of a relational table. Abstract in type Relational.

  • exported_keys(Jdbc j, Charstring pktable) 
    -> <Charstring pkcolumn, Charstring fktable, Charstring fkcolumn>
  • for example
    exported_keys(relational_named("IBDS"),"COUNTRY");
    • pkcolumn - The primary key column in this table.
    • fktable - The table whose foreign key references this table.
    • fkcolumn - The foreign key column in the table that references this table.
    exported_keys returns information about the exported keys of a table in a datasource. Abstract in type Relational.
  • referenced_table(Relational r, Charstring fktable, Charstring fkcolumn) 
    -> Charstring pktable
  • referenced_table returns the name of the table that column fkcolumn in table fktable references by its foreign key, if applicable.

5.3.1.2 SQL statement execution

The function executes an arbitrary sql statement as a string. It is implemented in all relational wrappers as it is used by the translator from AmosQL to SQL to generate SQL queries from AmosQL.

  • sql(Relational r, Charstring query, Vector arguments) -> Vector results
  • Abstract function in Relational. Executes the query query at the data source r and returns the result as a vector. The query string may contain question marks that will be bound to arguments in the vector arguments. A feature is that the query will be compiled and cached so that subsequent calls to sql with the same string (even though the arguments differ) will execute substantially faster.

  • sql(Relational r, Charstring query, Vector arguments) -> Vector results
  • Equivalent to sql(r, query, {}).
  • sqlu(Relational r, Charstring update) -> Integer
  • Abstract function in Relational. Executes the update update at the data source r and returns the number of altered rows in the table.

5.3.2 The JDBC Wrapper

The basic Amos II system contains a wrapper for JDBC relational data sources. It allows transparent for access to data in relational databases through OO views. JDBC data sources in Amos II are represented by the type Jdbc, subtype of Relational. Jdbc overrides all of the abstract functions of Relational and does not declare any functions of its own. 

5.3.2.1 JDBC Data Access Functions

  • jdbc(Charstring logname, Charstring driver) -> Jdbc

    Constructs a Jdbc object having the logical nick-name logname and attempts to load the JDBC driver.

  • connect(Jdbc j, Charstring url) -> Jdbc

    Calls the JDBC method DriverManager.getConnection(String url), which uses a default user and password

  • connect(Jdbc j, Charstring url, Charstring user_name, Charstring pass) -> Jdbc

    Calls the JDBC method DriverManager.getConnection(String url, String userName, String pass).

  • disconnect(Jdbc ds) -> Boolean

    Calls the JDBC method Connection.close(). This is also done automatically at system exit.

  • sql(Jdbc j, Charstring query, Vector params -> Bag of Vector res

    Execute an SQL statement by first checking of the string is cached, in which case there exists already a PreparedStatement where the parameters can be set to those in params. Otherwise a call to PrepareStatement is issued and the result cached along with the query string.

    Obviously, a lot can be gained in using identical strings for identical queries, and using as many ?'s as possible. Indeed, this is what the translator strives for at all times.

5.3.2.2 JDBC Data Meta-Data Functions

  • tables(Jdbc j) -> Bag of <Charstring table, Charstring catalog, Charstring schema, Charstring owner>

    Calls the method getTables and projects some of the information to Amos. See also section 4.2.1.1

  • columns(Jdbc j, Charstring table) -> Bag of < Charstring column_name, Charstring type_name>

    Calls the method getColumns and projects some of the information to Amos. See also section 4.2.1.1

  • primary_keys(Jdbc j, Charstring table) -> Bag of <Charstring column_name, Charstring constraint_name>

    Calls the method getColumns and projects some of the information to Amos. See also section 4.2.1.1

  • imported_keys(Jdbc j, Charstring fktable) -> <Charstring pktable, Charstring pkcolumn, Charstring fkcolumn>

    Calls the method getImportedKey and projects some of the information to Amos. See also section 4.2.1.1

  • exported_keys(jdbc j, charstring pktable)

    Calls the method getExporedKeys and projects some of the information to Amos. See also section 4.2.1.1

5.3.3 The ODBC wrapper

The basic Amos II system contains a wrapper for ODBC relational data sources. It allows transparent access to data in relational databases through OO views. ODBC data sources in Amos II are represented by the type Odbc, subtype of Relational. Odbcoverrides all of the abstract functions of Relational.

5.3.2.1 ODBC Data Access Functions

  • odbc(Charstring logname) -> Odbc

    Constructs an Odbc object having the logical nick-name logname. See also section 4.2.1.1

  • connect(Odbc o, Charstring url) -> Odbc

    See section 4.2.1.1

  • connect(Odbc o, Charstring url, Charstring user_name, Charstring pass) -> Odbc

    See section 4.2.1.1

  • disconnect(Odbc o) -> Boolean

    See section 4.2.1.1

  • sql(Odbc o, Charstring query, Vector params -> Bag of Vector res

    See section 4.2.1.1

5.3.2.2 ODBC Data Meta-Data Functions

  • tables(Odbc j) -> Bag of <Charstring table, Charstring catalog, Charstring schema, Charstring owner>

    See section 4.2.1.1

  • columns(Odbc j, Charstring table) -> Bag of < Charstring column_name, Charstring type_name>

    See section 4.2.1.1

  • primary_keys(Odbc j, Charstring table) -> Bag of <Charstring column_name, Charstring constraint_name>

    See section 4.2.1.1

  • imported_keys(Odbc j, Charstring fktable) -> <Charstring pktable, Charstring pkcolumn, Charstring fkcolumn>

    See section 4.2.1.1

  • exported_keys(odbc j, charstring pktable)

    See section 4.2.1.1

6 Physical database design

This section describes some AmosQL commands for database space and performance tuning. 

6.1 Indexing

The system supports indexing on any single argument or result of a stored function. Indexes can be unique or non-unique. A unique index disallows storing different values for the indexed argument or result. The cardinality constraint 'key' of stored functions (Cardinality Constraints) is implemented as unique indexes. Thus by default the system puts a unique index on the first argument of stored functions. That index can be made non-unique by suffixing the first argument declaration with the keyword 'nonkey' or to specify 'bag of' for the result, in which case a non-unique index is used instead.

For example, in the following function there can be only one name per person:

      create function name(Person)->Charstring as stored;
By contrast, names allow more than one name per person: 
      create function names(Person p nonkey)->Charstring nm as stored;
Alternative definition of names:
      create function names(Person p)->Bag of Charstring nm as stored;
Any other argument or result declaration can also be suffixed with the keyword 'key' to indicate the position of a unique index. For example, the following definition puts a unique index on nm to prohibit two persons to have the same name:
      create function name(Person p)->Charstring nm key as stored;
Indexes can also be explicitly created on any argument or result with a system procedure create_index:
      create_index(Charstring function, Charstring argname, Charstring index_type,
Charstring uniqueness)
For example:
      create_index("person.name->charstring", "nm", "hash", "unique");
create_index("names", "charstring", "mbtree", "mutiple");
The parameters of create_index are:

function: The name of a stored function. Use the resolvent name for overloaded functions.

argname: The name of the argument/result parameter to be indexed. When unambiguous, the names of types of arguments/results can also be used here.

index_type: Type kind of index to put on the argument/result. The supported index types are currently hash indexes (type hash) and ordered B-tree indexes (type mbtree). The default index for key/nonkey declarations is hash.

uniqueness: Index uniqueness indicated by unique for unique indexes and multiple for non-unique indexes.

Indexes are deleted by the system procedure:

      drop_index(Charstring functioname, Charstring argname); 
The meaning of the parameters are as for function create_index. There must always be at least one index left on each stored function and therefore the system will never delete the last remaining index on a stored function.

To save space it is possible to delete the default index on the first argument of a stored function. For example, the following stored function maps parts to unique identifiers through a unique hash index on the identifier:

      create type Part; 
create function partid(Part p)->Integer id as stored;
partid will have two indexes, one on p and one on id. To drop the index on p, do the following:
      drop_index('partid', 'p');

6.2 Clustering

Functions can be clustered by creating multiple result stored functions, and then each individual function can be defined as a derived function. 

For example, to cluster the properties name and address of persons one can define:  

create function personprops(Person p) -> 
        <Charstring name,Charstring address> as stored; 
create function name(Person p) -> Charstring nm 
        as select nm from Charstring a 
                where personprops(p) = <nm,a>; 
create function address(Person p) -> Charstring a 
        as select a from Charstring nm 
                where personprops(p) = <nm,a>;
Clustering does not improve the execution time performance significantly in a main-memory DBMS such as Amos II. However, clustering can decrease the database size considerably. 

7 System functions and commands

7.1 Comparison operators

The built-in, infix comparison operators are: 

=(Object x, Object y) -> Boolean   (infix operator =)
!=(Object x, Object y) -> Boolean  (infix operator !=) 
>(Object x, Object y) -> Boolean   (infix operator >) 
>=(Object x,Object y) -> Boolean   (infix operator >=)
<(Object x, Object y) -> Boolean   (infix operator <) 
<=(Object x,Object y) -> Boolean   (infix operator <=)
All objects can be compared. Strings are compared by characters, lists by elements, OIDs by identifier numbers. Equality between a bag and another object denotes set membership of that object. The comparison functions can, of course, be overloaded for user defined types. 

7.2 Arithmetic functions

abs(Number x) -> Number y
div(Number x, Number y)   -> Number z (infix operator /) 
max(Object x, Object y)   -> Object z 
min(Object x, Object y)   -> Object z 
minus(Number x, Number y) -> Number z (infix operator -) 
mod(Integer x, Integer y) -> Integer z
plus(Number x, Number y)  -> Number z (infix operator +) 
times(Number x, Number y) -> Number z (infix operator *) 
iota(Integer l, Integer u)-> Bag of Integer z 
sqrt(Number x) -> Number z
round(Number x) -> Integer i
log10(Number x) -> Real y
iota constructs a bag of integers between l and u. 
For example, to execute n times AmosQL statement stmt do: 
for each Integer i where i = iota(1,n) 
        stmt;

7.3 String functions

String concatenation is made using the '+' operator, e.g.
"ab" + "cd" + "ef"; returns "abcdef"

char_length(Charstring)->Integer
Count number of characters in string.

itoa(Integer)->Charstring
Convert integer to string.

lower(Charstring)->Charstring
Lowercase string.

like(Charstring string, Charstring pattern) -> Boolean
Test if string matches regular expression pattern where '*' matches sequence of characters and '?' matches single character. For example:
like("abc","??c") returns TRUE
like("ac","a*c") returns TRUE
like("ac","a?c") fails

like_i(Charstring string, Charstring pattern) -> Boolean
Case insensitive like.

substring(Charstring string,Integer start, Integer end)->Charstring
Extract substring from given character positions. First character has position 0.

upper(Charstring)->Charstring
Uppercase string.

7.4 Aggregation functions

Some of these system functions are described in Subqueries and Aggregation Operators

Number of objects in bag: 
count(Bag of Object o) -> Integer c
Extract elements of collections: 
in(Bag of Object b) -> Bag of Object o 
in(Vector v) -> Bag of Object o
Largest object in bag: 
maxagg(Bag of Object x) -> Object y
Smallest number in bag: 
minagg(Bag of Object x) -> Object y
Test if bag empty. Logical NOT EXISTS: 
notany(Bag of Object o) -> Boolean b
Test if there are any elements in bag. Logical EXISTS: 
some(Bag of Object x) -> Boolean b
Sum uniform bags of numbers:
sum(Bag of Number x) -> Number s 
Insert x between elements in bag b:
inject(Bag of Object b, Object x) -> Bag of Object r
Make a string of elements in bag b:
concatagg(Bag of Object b)-> Charstring s
for example
concatagg(bag("ab",2,"cd")); => "ab2cd" 
      
concatagg(inject(bag("ab",2,"cd"),",")); => "ab,2,cd"
      
      

7.5 Temporal functions

Amos II supports three data types for referencing Time, Timeval, and Date.
Type Timeval  is for specifying absolute time points including year, month, and time-of-day.
The type Date specifies just year and date, and type Time specifies time of day. A limitation is that the internal operating system representation is used for representing Timeval values, which means that one cannot specify value too far in the past or future.

Constants of type Timeval are written as |year-month-day/hour:minute:second|, e.g. |1995-11-15/12:51:32|.
Constants of type Time are written as |hour:minute:second|, e.g. |12:51:32|.
Constants of type Date are written as |year-month-day|, e.g. |1995-11-15|.

The following functions exist for types Timeval, Time, and Date:

now() -> Timeval
The current absolute time.

time() -> Time
The current time-of-day.

clock() -> Real
The number of seconds since the system was started.

date() -> Date
The current year and date.

timeval(Integer year,Integer month,Integer day,
        Integer hour,Integer minute,Integer second) -> Timeval

Construct Timeval.

time(Integer hour,Integer minute,Integer second) -> Time
Construct Time.

date(Integer year,Integer month,Integer day) -> Date
Construct Date.

time(Timeval) -> Time
Extract Time from Timeval.

date(Timeval) -> Date
Extract Date from Timeval.

date_time_to_timeval(Date, Time) -> Timeval
Combine Date and Time to Timeval.

year(Timeval) -> Integer
Extract year from Timeval.

month(Timeval) -> Integer
Extract month from Timeval.

day(Timeval) -> Integer
Extract day from Timeval.

hour(Timeval) -> Integer
Extract hour from Timeval.

minute(Timeval) -> Integer
Extract minute from Timeval.

second(Timeval) -> Integer
Extract second from Timeval.

year(Date) -> Integer
Extract year from Date.

month(Date) -> Integer
Extract month from Date.

day(Date) -> Integer
Extract day from Date.

hour(Time) -> Integer
Extract hour from Time.

minute(Time) -> Integer
Extract minute from Time.

second(Time) -> Integer
Extract second from Time.

timespan(Timeval, Timeval) -> <Time, Integer usec>
Compute difference in Time and microseconds between two time values


7.6 Sorting functions

Currently AmosQL has no special syntax for sorting. Instead there are several that can be used to sort bags or vectors.

Sort bags or vectors with a custom comparison function

This group of sort functions are useful to sort bags or vectors of either objects or tuples with a custom function supplied by the user. Either the function object or function named can be supplied. The comparison function must take two arguments with types compatible with the elements of the bag or the vector and return a boolean value.
sort(Vector v1, Function compfno) -> Vector
sort(Vector v, Charstring compfn) -> Vector
sort(Bag b, Function compfno) -> Vector
sort(Bag b, Charstring compfn) -> Vector
Example:
create function younger(Person p1, Person p2) -> Boolean as
select TRUE where age(p1) < age(p2);

/* Sort all persons sorted by their age */
sort((select p from Person p), 'YOUNGER');

Sort bags with a default comparison function

Whenever bags of literals (tuples of literals) are sorted for simplicity one may use the natural order of the literal types. Here the tuple positions are 1-based and order is one of 'inc' or 'dec'. Sort the bag b by the element at position pos in order specified by order.
sort(Bag b, Integer pos, Charstring order) -> Vector
Sort the bag of tuples b in default ascending order, by the element at position pos:
sort(Bag b, Integer pos) -> Vector
Sort all elements in a bag b by comparing the whole elements in alphabetic order.
sort_tuples(Bag b, Charstring order) -> Vector
Same as above, defaults to ascending order:
sort(Bag b) -> Vector
Note: Surrogate object will be sorted too, but by their OID which usually has no meaning.

Example:

sort((select i from Integer i where i = iota(0,10)), 1, 'dec');

7.7 Accessing the type system

allfunctions() -> Bag of Function f
allfunctions(Type t)-> Bag of Function f
allobjects () -> Bag of Object o
alltypes() -> Bag of Type t

returns all functions, objects, and types, respectively, in the database.

allfunctions(Type t)-> Bag of Function
returns all functions where some argument or result is of type t.

subtypes(Type t) -> Bag of Type s
supertypes(Type t) -> Bag of Type s

returns the types immediately below/above type t in the type hierarchy.

allsupertypes(Type t) -> Bag of Type s
returns all types above t in the type hierarchy.

typesof(Object o) -> Bag of Object t
returns the type set of an object.

typeof(Object o) -> Type t
returns the most specific type of an object.

functionnamed(Charstring nm) -> Function fn
returns the function named nm. Notice that function names are in upper case.

kindoffunction(Function f) -> Charstring knd
returns the kind of the function f as a string. The result can be one of 'stored', 'derived', 'foreign' or 'overloaded'.

name(Function fn) -> Charstring nm
returns the name of the function fn.

typenamed(Charstring nm) -> Type t
returns the type named nm. Notice that type names are in upper case.

name(Type t) -> Charstring nm
returns the name of the type t.

objectname(Object o, Charstring nm) -> Boolean
returns TRUE if the object o has the name nm.

usedwhere(Function f) -> Function c
returns the functions calling the function f.

useswhich(Function f) -> Function c
returns the functions called from the function f.

generic(Function f) -> Function g
returns the generic function of a resolvent.

resolvents(Function g) -> Bag of Function r
returns the resolvents of an overloaded function g.

attributes(Type t) -> Bag of Function g
returns the generic functions having a single argument of type t and a single result.

methods(Type t) -> Bag of Function r
returns the resolvents having a single argument of type t and a single result.

resolventtype(Function fn) -> Bag of Type t
returns the types of the first arguments of the resolvents of function resolvent fn.

arguments(Function r) -> Bag of Vector args
returns sequence (vector) describing arguments of signature of resolvent r. Each element in the vector is a triplet (vector) describing one of the arguments with structure {type,name,uniqueness} where type is the type of the argument, name is the identifier for the argument, and uniqueness is either "key" or "nonkey" depending on the declaration of the argument.. For example,
arguments(resolvents(functionnamed("TIMESPAN")))
 
--> {{#[OID 371 "TIMEVAL"],"TV1","nonkey"},{#[OID 371 "TIMEVAL"],"TV2","nonkey"}}

results(Function r) -> Bag of Vector res
Analogous to arguments for result (tuple) of function.

cardinality(Type t) -> Integer c
returns the number of object of type t and all its subtypes.

arity(Function f)-> Integer a
returns the number of arguments of function.

width(Function f) -> Integer w
returns the width of the result tuple of function f.

sourcecode(Function f) -> Bag of Charstring
returns the sourcecode of a function, if available. For generic functions the sources of all resolvents are returned.


 7.8 Extents

The local Amos II database can be regarded as a set of extents. There are two kinds of extents:
  • Type extents represent surrogate objects belonging to a particular type.

    • The deep extent of a type is defined as the set of all surrogate objects belonging to that type and to all its descendants in the type hierarchy. The deep extent of a type is retrieved with:
      deep_extent(Type t)->Bag of Object
      for example
      deep_extent(typenamed("OBJECT"))
      returns all surrogate objects stored in the local database.
    • The shallow extent of a type is defined as all surrogate objects belonging only to that type and not to any of its descendants. The shallow extent is retrieved with:
      shallow_extent(Type t)->Bag of Object
      for example
      shallow_extent(typenamed("OBJECT"));
      returns nothing since type Object has no own instances.

  • Function extents represent the state of stored functions. The extent of a function is the bag of tuples mapping its argument(s) to the corresponding result(s).
    extent(Function fn) -> Bag of Vector
    for example
    select f,extent(f) from function f where
    kindoffunction(f)="stored";
    Function extent returns the extent of the function fn. The extent tuples are returned as a bag of vectors. The function can be any kind of function.

    For stored functions the extent is directly stored in the local database. The example query thus returns the state of all stored functions. The state of the local database is this state plus the deep extent of type Object.

    The extent is always defined for stored functions and can also be computed for derived functions through their function definitions. The extent of a derived function may not be computable, unsafe, in which case the extent function returns nothing.

    The extent of a foreign function is always empty.

    The extent of a generic function is the union of the extents of its resolvents.

7.9 Query optimizer tuning


optmethod(Charstring new) -> Charstring old
sets the optimization method used for cost-based optimization in Amos II to the method named new.
Three optimization modes for AmosQL queries can be chosen:

"ranksort"
: (default) is fast but not always optimal. 
"exhaustive": is optimal for small queries only. It uses first dynamic programming for up to 5 seconds and then resorts to ranksort.
optmethod returns the old setting of the optimization method.
reoptimize(Function f) -> Boolean
reoptimize(Charstring f) -> Boolean
reoptimizes function named fn.

7.10 Unloading

The data stored in the local database can be unloaded to a text file as an unload script of AmosQL statements generated by the system. The state of the database can be restored by loading the unload script as any other AmosQL file. Unloading is useful for backups and for transporting data between different systems such as different Amos II versions or Amos II systems running on different computers. The unload script can be edited in case the schema has changed.

unload(Charstring filename)->Boolean
generates an AmosQL script that restores the current local database state if loaded into an Amos II peer with the same schema as the current one.

excluded_fns()->Bag of Function
set of functions that should not be unloaded. Can be udated by user.


7.11 Miscellaneous

eval(Charstring stmt) -> Object r
evaluates the AmosQL statement stmt.


error(Charstring msg) -> Boolean

prints an error message on the terminal and raises an exception. Transaction aborted.

print(Object x) -> Boolean
prints x. Always returns true. The printing is by default to the standard output (console) of the peer where print is executed but can be redirected to a file by the function openwritefile:

openwritefile(Charstring filename)->Boolean
openwritefile changes the output stream for print to the specified filename. The file is closed and output directed to standard output by calling closewritefile();.

amos_version() -> Charstring

returns string identifying the current version of Amos II.

quit;
quits Amos II.

exit;
returns to the program that called Amos II if the system is embedded in some other system.
Same as quit; for stand-alone Amos II.


goovi();
starts the multi-database browser GOOVI[CR01].
This works only under JavaAmos.

The redirect statement reads AmosQL statements from a file: 

redirect-stmt ::= '<' string-constant
For example 
< 'person.amosql';

load_AmosQL(Charstring filename)->Charstring
loads a file containing AmosQL statements.

getenv(Charstring var)->Charstring value
retrieves the value of OS environment variable var. Generates an error of variable not set.

The trace and untrace functions are used for tracing foreign function calls:

    trace(Function fno)->Bag of Function r
  trace(Charstring fn)->Bag of Function r
  untrace(Function fno)->Bag of Function r
  untrace(Charstring fn)->Bag of Function r


If an overloaded functions is (un)traced it means that all its resolvents are (un)traced. Results are the foreign functions (un)traced. For example:

Amos 2> trace("SQRT");
#[OID 114 "NUMBER.SQRT->NUMBER"]
Amos 2> sqrt(2.2);
>>#[OID 114 "NUMBER.SQRT->NUMBER"]#(2.2 *)
<<#[OID 114 "NUMBER.SQRT->NUMBER"]#(2.2 1.48324)
1.48324
<<#[OID 114 "NUMBER.SQRT->NUMBER"]#(2.2 -1.48324)
-1.48324
Amos 2>

dp(Object x, Number priority)-> Boolean
For debug printing in where clauses. Prints x on the console. Always returns true. The placement of dp in the execution plan is regulated with priority which must be positive numeric constant. The higher priority the earlier in the execution plan.

8 References

[CR01]K.Cassel and T.Risch: An Object-Oriented Multi-Mediator Browser. Presented at 2nd International Workshop on User Interfaces to Data Intensive Systems, Zürich, Switzerland, May 31 - June 1, 2001

[ER00] D.Elin and T. Risch: Amos II Java Interfaces,  Uppsala University, 2000.

[FR95] S. Flodin and T. Risch, Processing Object-Oriented Queries with Invertible Late Bound Functions, Proc. VLDB Conf., Zürich, Switzerland, 1995.

[FR97] G. Fahl and T. Risch: Query Processing over Object Views of Relational Data, The VLDB Journal , Vol. 6 No. 4, November 1997, pp 261-281.

[Han03] M.Hansson: Wrapping External Data by Query Transformations, Uppsala Master Theses in Computing Science 243, Dept. of Information Technology, Uppsala, Sweden, 2003.

[JR99a] V.Josifovski, T.Risch: Functional Query Optimization over Object-Oriented Views for Data Integration Journal of Intelligent Information Systems (JIIS), Vol. 12, No. 2-3, 1999.

[JR99b] V.Josifovski, T.Risch: Integrating Heterogeneous Overlapping Databases through Object-Oriented Transformations. In Proc. 25th Intl. Conf. On Very Large Databases, Edinburgh, Scotland, September 1999.

[JR02] V.Josifovski, T.Risch: Query Decomposition for a Distributed Object-Oriented Mediator System . Distributed and Parallel Databases J., Kluwer, May 2002.

[KJR03] T.Katchaounov, V.Josifovski, and T.Risch: Scalable View Expansion in a Peer Mediator System, Proc. 8th International Conference on Database Systems for Advanced Applications (DASFAA 2003), Kyoto, Japan, March 2003.

[LR92] W.Litwin and T.Risch: Main Memory Oriented Optimization of OO Queries Using Typed Datalog with Foreign Predicates, IEEE Transactions on Knowledge and Data Engineering, Vol. 4, No. 6, December 1992 ( http://user.it.uu.se/~udbl/publ/tkde92.pdf). 

[Ris00a] T.Risch: Amos II External Interfaces, Uppsala University, 2000.

[Ris00b]T.Risch: ALisp User's Guide, Uppsala University, 2000.

[RJK03] T.Risch, V.Josifovski, and T.Katchaounov: Functional Data Integration in a Distributed Mediator System, in P.Gray, L.Kerschberg, P.King, and A.Poulovassilis (eds.): Functional Approach to Data Management - Modeling, Analyzing and Integrating Heterogeneous Data, Springer, ISBN 3-540-00375-4, 2003.