|
1 Running Amos II
It is recommended that each user creates a private directory
for Amos II, <privdir>. You should copy the following
files to <privdir>:
amos2.exe
amos2.dmp amos.dll
Amos II is then ready to run in <privdir> by the
command:
amos2 [<db>]
where [<db>] is an optional name of an Amos II database
image (default is amos2.dmp which is an empty
database).
The system enters an Amos II top loop where it reads AmosQL statements,
executes them, and prints their results. You need not connect to any
particular database, but instead, if <db> is omitted,
the system enters an empty database (amos2.dmp), where only
the system objects are defined.
When the Amos II database is defined and populated, it can be
saved on disk with the AmosQL statement:
save "filename";
NOTICE: If you save using the name 'amos2.dmp' you overwrite
the system database.
In a later session you can connect to the saved database by
starting Amos II with:
amos2 filename
The prompter in the Amos II top loop is:
Amos n>
where n is a generation number. The generation number
is increased every time an AmosQL database update statement is executed
in the Amos II top loop. For example:
Amos 1> create type Person;
Amos 2> create type Student under Person;
Database changes can be undone by using
the rollback statement with a generation number as argument.
For example, the statement:
Amos 3> rollback 2;
will restore the database to the state it had at generation number 2.
It thus undoes the effect of the statement:
create type Student under Person;
After the rollback above, the type Student is removed from
the database, but not type Person.
A rollback without arguments undoes all database changes of the current
transaction.
The statement commit makes
changes non-undoable, i.e. all updates so far cannot be rolled back any
more and the generation numbering starts over from 1.
For example:
Amos 2> commit; Amos 1> ...
To shut down Amos II orderly first save the database and then
type:
Amos 1> quit;
JavaAmos is a version of the Amos II kernel connected
to
the Java virtual machine. With JavaAmos Amos II foreign functions can
be
written in Java (the callout interface) and Java functions can
call
Amos II functions and send AmosQL statements to Amos II for evaluation
(the callin interface) [ER00]. To start
JavaAmos use the script
JavaAmos
instead of amos2. It will enter a top loop reading and
evaluating AmosQL statements as amos2.
The multi-database browser GOOVI [CR01]
is
a graphical browser for Amos II written as a Java application. You can
start the GOOVI browser from the JavaAmos top loop by calling the Amos
II function
goovi();
It will start the browser in a separate thread.
2 AmosQL
This section describes the syntax of AmosQL and explains some semantic
details. For the syntax we use BNF notation
with the following special constructs:
A ::= B C: A consists of B followed by C.
A ::= B | C, alternatively (B | C): A consists of B or C.
A ::= [B]: A consists of B or nothing.
A ::= B-list: A consists of one or more Bs.
A ::= B-commalist: A consists of one or more Bs separated by commas.
'xxx': The string (keyword) xxx.
AmosQL statements are always terminated by a semicolon (;).
Comments
The comment statement can be placed anywhere outside identifiers and
constants.
Syntax:
comment ::= '/*' character-list '*/'
Identifiers
Identifiers have the syntax:
identifier ::=
('_' | letter) [identifier-character-list]
identifier-character ::= alphanumeric | '_'
E.g.: MySalary, x, x1234, x1234_b
Amos II keywords are case sensitive; they are always written
with lower case letters. Amos II identifiers are NOT case sensitive;
i.e. they are always internally capitalized.
Variables
Variables are of two kinds: local variables or interface
variables:
variable ::= local variable | interface-variable
- Local variables are identifiers for data values
inside AmosQL queries and functions. Local variables must be
declared in function signatures (Function
definitions), in from clauses (Query Statements),
or by the declare statement (Database
procedures). Notice that variables are not case sensitive.
Syntax:
local-variable ::= identifier
E.g. my_variable, MyVariable
- Interface variables hold temporary results
during a session. Interface variables cannot be referenced in
function bodies and are not considered as being parts of the
database.
Syntax:
interface-variable ::= ':' identifier
E.g. :my_interface_variable,
:MyInterfaceVariable
Interface variables
are by default untyped (of type Object). The user can declare an
interface variable to be of a particular type by the interface variable declare statement:
interface-variable-declare-stmt ::= 'declare'
interface-variable-declaration-commalist
interface-variable-declaration ::= type-spec
interface-variable
E.g. declare Integer :i, Real :x;
Interface variables can be assigned either by the
into-clause of
the select statement or by the interface variable assignment statement:
set-interface-variable-stmt ::= 'set' interface-variable '=' expr
E.g. set :i = 2 + sqrt(:x);
Constants
Constants can be integers, reals, strings, booleans, or nil.
Syntax:
constant ::= integer-constant | real-constant | boolean-constant | string-constant | 'nil'
integer-constant ::= ['-'] digit-list E.g. 123, -123 real-constant ::= decimal-constant | scientific-constant
decimal-constant ::= ['-'] digit-list '.' [digit-list]
scientific-constant ::= decimal-constant ['e' | 'E'] integer-constant E.g. 1.2, -1.0, 2.3E2, -2.4e-21 boolean-constant ::= 'true' | 'false'
string-constant ::= string-separator character-list string-separator
string-separator ::= ''' | '"' E.g. "A string", 'A string', 'A string with "', "A string with \" and '"
The surrounding string separators in string constants must be the same.
If the string separator is " then \ is escape character inside
the string always replacing the succeeding character. For example the
string 'ab"\' can also be written as "ab\"\\", and the string a'"b can
only be written as "a'\"b" since it contains both string separators.
simple-value ::= constant |
variable
E.g. :MyInterfaceVariable, MyLocalVariable, 123, "Hello
World"
A simple value is either a constant or a variable reference.
Expressions
expr ::= simple-value | function-call | collection
| casting
E.g. 1.23, 1+2, sqrt(:a), {1,2,3}, cast(:p as Student)
An expression is a either a simple value or a function call. An
expression can normally be computed to produce a result value. The
value of an expression is computed if the expression is entered to the
Amos II top loop, e.g.:
1+5*6;
=> 31
Entering such expressions is the simplest form of AmosQL queries.
Collections
collection ::= bag-expr | vector-expr
bag-expr ::= bag(expr-commalist)
vector-expr ::= '{' expr-comma-list '}'
E.g. bag(1,2,3), {1,2,3}, {{1,2},{3,4}},
{1,name(:p),1+sqrt(:a)}
Collections represent sets of object. Collections can be either Bags or
Vectors. Bags are unordered sets of objects with duplicates allowed,
while Vectors are sequences of objects. Bags are often implicit in the
system,
e.g. the value of functions are often bags without explicitly
specifying
them to be so.
Statements
Statements instruct Amos II to perform various kinds services. The
following statements can be entered to the Amos II top loop:
create-type-stmt | delete-type-stmt | create-object-stmt | delete-object-stmt | create-function-stmt | delete-function-stmt | query-stmt | update-stmt | add-type-stmt | remove-type-stmt | for-each-stmt | set-interface-variable-stmt | declare-interface-variable-stmt | commit_stmt | rollback_stmt | open-cursor-stmt | fetch-cursor-stmt | close-cursor-stmt | quit-stmt | exit-stmt | procedure-statement
2.1 Types
The create type statement creates a new type in the
local database.
Syntax:
create-type-stmt ::= 'create type' type-name ['under' type-name-commalist] ['properties' '(' attr-function-commalist ')']
type-spec ::= type-name | 'Bag of' type-spec | 'Vector of' type-spec
type-name ::= identifier
attr-function ::= generic-function-name type-spec ['key'] E.g. create type Person; create type Student under Person; create type Kid under Person properties (name Charstring key, attitude Integer);
Type names must be unique across all types.
Type names are not case sensitive and the type names are
always internally upper-cased. For clarity all type names used
in examples in this manual have the first letter capitalized.
The new type will be a subtype of all the supertypes in the 'under'
clause. If no supertypes are specified the new type becomes a subtype
of the system type named Userobject. Multiple inheritance
is specified through more than one supertype.
The attr-function-commalist clause is optional, and
provides a way to define attributes for the new type. The attributes
are functions having a single
argument and
a single result and are initially stored
functions in the local database (but can be later redefined as
other kinds of functions). The argument type of an attribute function
is the type being created and
the result type is specified by the type-spec. The result type
must be previously defined. In the above example the function name has the argument of type Person and result of type Charstring.
If 'key' is
specified for a property, it indicates that each value of the attribute
is unique.
2.1.1 Deleting types
The delete type statement deletes a type and all its
subtypes.
Syntax:
delete-type-stmt ::= 'delete type' type-name E.g. delete type Person;
Functions using the deleted type will be deleted as well.
2.2 Objects
The create object statement creates one or more objects and
makes the new object(s) instance(s) of a given user type and all
its supertypes.
Syntax:
create-object-stmt ::= 'create' type-name ['(' generic-function-name-commalist ')'] 'instances' initializer-commalist
initializer ::= variable | [variable] '(' expr-commalist ')'
Examples:
create Person (name,age) instances :adam ('Adam',26),:eve ('Eve',32);
create Person instances :olof;
create Person (parents) instances :tore (bag(:adam,:eve));
create Person (name,age) instances ("Kalle "+"Persson" , 30*1.5);
The new objects are assigned initial values for the specified
attributes. The attributes can be any updatable AmosQL functions of a
single argument and value.
One object will be created for each initializer. Each
initializer can have an optional variable name which will be bound to
the new object. The variable name can subsequently be used as a
reference to the object.
The initializer also contains a comma-separated list of
initial values
for the specified functions in generic-function-name-commalist.
Initial values are specified as constants or
variables.
The types of the initial values must match the declared result
types of the corresponding functions.
Bag valued functions are initialized using the keywords 'Bag
of' (syntax bag-value).
Vector result functions are normally initialized with a
comma-separated list of values enclosed in curly brackets (syntax vector-value).
It is possible to specify nil for a value when no
initialization is desired for the corresponding function.
2.2.1 Deleting objects
Objects are deleted from the database with the delete
statement.
Syntax:
delete-object-stmt ::= 'delete' variable
Referential integrity is maintained by the system which will
automatically delete references to the deleted object. It is thus also
removed from all stored functions where it is referenced.
Deleted objects are printed as
#[OID nnn *DELETED*]
The objects may be undeleted by rollback. An automatic
garbage collector physically removes the OIDs from the database only
when their creation has been rolled back or their deletion committed,
and they are not references from some variable or external system.
2.3 Functions
Functions can be either:
- Stored functions whose extents are explicitly stored
in the local Amos II database.
- Derived functions which are defined by a single query statement.
- Foreign functions
which are defined in a programming
language. There are foreign function interfaces to the programming
languages C/C++ [Ris00a], Java [ER00],
and Lisp [Ris00a] [Ris00b].
- Stored procedures
which are functions defined using procedural AmosQL statements. Makes
AmosQL computationally complete.
- Overloaded functions
which are functions having different implementations depending on the
argument types in a function call.
The create function
statement creates a new user function.
Syntax:
create-function-stmt ::=
'create function' generic-function-name
argument-spec '->' result-spec [fn-implementation]
generic-function-name ::= identifier
E.g. create function age(Person) -> Integer;
function-name ::= generic-function-name |
type-name-list '.' generic-function-name
'->' type-name-list
type-name-list ::= type-name | type-name
'.' type-name-list
E.g. plus, number.number.plus->number
argument-spec ::='(' [argument-declaration-commalist] ')'
argument-declaration ::= type-spec [local-variable] ('key' | 'nonkey')
result-spec ::= argument-spec | tuple-result-spec
tuple-result-spec ::= ['<'] argument-declaration-commalist
['>']
fn-implementation ::=
'as' (derived-body | procedure-body |
foreign-body | 'stored')
derived-body ::= simple-select-stmt
E.g. create function name(Person) -> Charstring as
stored;
create function friends(Person) -> Bag
of Person as stored;
The argument-spec and the result-spec
together specify the signature of the function, i.e. the types
and optional names of formal parameters and results.
Semantics:
- The types used in the signatures must be previously
defined. The name of an argument or result parameter can be left
unspecified if it is never referenced in the function implementation.
The names of the argument and result parameters for a given function
definition must be unique.
- Bag of specifications on a single result parameter
declares it to be a bag (set with tuples allowed). Derived functions
always have implicit Bag of results so for the result of
derived functions 'Bag of' need not be explicitly declared.
- Derived functions can also have arguments declared 'Bag
of' making them aggregation operators.
- AmosQL functions may also have restricted tuple valued
results. This is indicated by bracketing the result declarations (see
syntax for multiple-result-spec and example below).
Function names are not case sensitive and are internally
stored upper-cased.
2.3.2 Specifying cardinality constraints
A cardinality constraint is a system maintained restriction on
the number of allowed occurrences in the database of an argument or
result of a function. For example, a cardinality constraint can be that
there can be at most one salary and one name per person, while a person
may have any number of parents. The only cardinality constraint which
is
currently supported in AmosQL is to make a specified argument or result
of a stored function unique, by specifying it as a key.
For example:
create function name(Person key) -> Charstring key as stored;
In this case the system guarantees that there is a one-to-one
relationship between OIDs of type Person and their names.
If the 'key' cardinality constraint is violated by a database
update the following error message is printed:
Update would violate upper object participation (updating function ...)
The keyword 'nonkey' specifies that the parameter has no
cardinality constraint.
The default cardinality constraint for a stored function is key
for the first argument of a stored function and nonkey for
all other arguments and the result(s). This implies that stored
functions
are by default single valued.
Since the first argument of a stored function by default is
declared as key the function name could thus also
have been
written:
create function name(Person) -> Charstring key as stored;
Another example:
create function married(Person husband,Person wife key)->Boolean as stored;
Polygamous marriages are refused to be stored by the function married,
since the first argument husband has the default cardinality
constraint key.
The Bag of declaration on the result of a stored
function actually just overrides the default key declaration
of its 1st argument with nonkey. Thus the function parents
above can be written in two alternative ways:
create function parents(Person) -> Bag of Person as stored; create function parents(Person nonkey) -> Person as stored;
Cardinality constraints can also be specified for foreign functions.
These are important for optimizing queries involving foreign functions.
However, it is up to the foreign function implementor to guarantee that
specified cardinality constraints hold. Cardinality constraint
declarations are ignored for derived functions.
The extent
of a function can be regarded as a bag (or relation) of tuples mapping
its argument(s) to the corresponding result(s). The key
annotation for an argument or result specifies that the argument/result
is a key to this extent relation.
2.3.3 Deleting functions
Functions are deleted with the delete function statement.
Syntax:
delete-function-stmt ::= 'delete function' function-name
For example:
delete function married;
Deleting a function also deletes all functions using the deleted
function.
2.4 Query statements
Queries retrieve objects having specified properties. They are
specified using the query statement
denoting either variables, constants, function calls, or select
statements.
query-stmt ::= select-stmt | expr
The simplest and fastest kind of queries is to call a
Amos II function defined either by the system or the user, e.g.
upper("abc"); persoNamed("Kalle"); 1+2; date();
All Amos II functions are optimized when they are defined and simple
function calls like the ones above are going to be called
without any query optimization. This is called a fast path call
to a function. By contrast general select
statements require query optimization before they are executed and
this is much slower than fast path function calls. You are therefore
encouraged to always define queries as derived
functions that are called using the fast path interface.
2.4.1 Function calls
Syntax:
function-call ::= function-name '(' [parameter-value-commalist] ')' | expr infix-operator expr
infix-operator ::= '+' | '-' | '*' | '/'
E.g. sqrt(2.1), 1+2, "a" + "b"
parameter-value ::= expr | '(' simple-select-stmt ')'
The built-in functions +,-,*,/ have equivalent infix syntax
with the usual priorities.
For example:
(income(:eve) + income(:ulla)) * 0.5;
is equivalent to:
times(plus(income(:eve),income(:ulla)),0.5);
The '+' operator is defined for both numbers, strings, and vectors
denoting string and vector concatenation for types Charstring
and Vector, respectively. Resolution of such overloaded
functions is described in Overloaded
Functions and Late Binding .
In a function call, the types of the actual parameters and
results must be the same as, or subtypes of, the types of the
corresponding formal parameters or results.
2.4.2 The select statement
The select statement provides the most flexible way to specify
queries.
Syntax:
select-stmt ::= 'select' ['distinct'] expr-commalist [into-clause] [from-clause] [where-clause]
simple-select-stmt ::= 'select' ['distinct'] expr-commalist [from-clause] [where-clause]
into-clause ::= 'into' variable-commalist
from-clause ::= 'from' variable-declaration-commalist
variable-declaration ::= type-spec local-variable
where-clause ::= 'where' predicate-expression
For example: select name(p) from Person p where age(p)<2 and city(p)="Uppsala"; /* Return singleton values */ select income(x),income(father(x))+income(mother(x)) from Person x; /* Returns tuples */ select {income(x),income(father(x))+income(mother(x))} from Person x; /* Returns vectors */
The expr-commalist defines the object(s) to be retrieved.
The from-clause declares types of local variables
used
in the query. Query variables can be bound to bags on which aggregation functions can be applied. Sequences
(lists, arrays) of values is supported through type Vector.
The where-clause gives selection criteria for the
search. The details of the where clause is described below in Predicate expressions .
The result of a select statement is a bag of single result
tuples. Duplicates are removed only when the keyword 'distinct' is
specified, in which case a set is returned.
An into-clause is available for specifying variables
to be bound to the result. NOTICE that in case more than one result
tuple is returned, the variables will be bound only to the elements of
the first encountered tuple.
For example:
select p into :eve2 from Person p where name(p) = 'Eve'; name(:eve2);
This query retrieves into the environment variable :eve2 the
Person whose name is 'Eve'. If more than one person is named
Eve only the first one will be bound to :eve2.
2.4.3 Predicate
expressions
The general syntax of predicate expressions is:
predicate-expression ::= predicate-expression 'and' predicate-expression | predicate-expression 'or' predicate-expression | '(' predicate-expression ')' | simple-predicate
E.g. x < 5 or (x > 6 and 5 < y)
simple-predicate ::= expr | eq-compare-expr = eq-compare-expr | expr comparison expr
eq-compare-expr ::= expr | tuple-expr
tuple-expr ::= '<' expr-commalist '>'
comparison ::= < | > | <= | >= | !=
E.g. x > 2, "a" != s, 1+y <= sqrt(5.2), parents(p) = <m,f>, odd(:p)
Boolean functions returning type Boolean are
treated as predicate expressions. Boolean functions return true or
nothing (nil). The constant false is nil casted
to type Boolean.
The comparison operators (=, !=, <, <=, and >=) are
treated as binary boolean functions. They are all defined for objects
of
any type.
2.4.4 Overloaded functions, late binding, casting
Function names may be overloaded,
i.e., functions having the
same name may be defined differently for different argument types. This
allows generic functions to apply to several different object types.
Each specific implementation of an overloaded function is called a resolvent.
For example, assume the two following Amos II function
definitions:
create function less(Number i, Number j)->Boolean as select true where i < j; create function less(Charstring s,Charstring t)->Boolean as select true where s < t;
Its resolvents will have the signatures:
less(Number,Number) -> Boolean less(Charstring,Charstring) -> Boolean
Internally the system stores the resolvents under different function
names. The name of a resolvent is obtained by concatenating the type of
its arguments with the name of the overloaded function followed by
the symbol '->' and the type of the result.(Function definitions). The two
resolvents above will be given the names number.number.less->boolean
and charstring.charstring.less->boolean.
Overloaded function resolvents are allowed to differ on their
argument types and the result types. The query compiler resolves the
correct
resolvent to apply based on the types of the arguments; the type of
the result is not considered. If there is an ambiguity, i.e. several
resolvents qualify in a call, or if no resolvent qualify an error will
be generated
by the query compiler.
When overloaded function names are encountered in AmosQL
function bodies, the system will try to use local variable declarations
to choose the correct resolvent (early binding).
For example:
create function younger(Person p,Person q)->Boolean as select less(age(p),age(q));
will choose the resolvent number.number.less->boolean,
since age returns integers and the resolvent number.number.less->boolean
is applicable to integers by inheritance. The other function resolvent charstring.charstring.less->boolean
does not qualify since it is not legal to apply to arguments of type Integer.
On the other hand, this function:
create function nameordered(Person p,Person q)->Boolean as select less(name(p),name(q));
will choose the resolvent charstring.charstring.less->boolean
since the function name
returns a string. In both cases the type
resolution (selection of resolvent) will be done at compile
time.
Dynamic type resolution at run time, late binding, is
done for top loop function call s to choose the correct resolvent. For
example,
less(1,2);
will choose number.number.less->boolean
To avoid the overhead of late binding one may use casting. Alternatively the explicit resolvent
name notation can be used to explicitly call a specific resolvent:
number.number.less->boolean(1,2);
Function definitions and queries may also conatin late bound
overloaded functions. For example, suppose that managers are employees
whose incomes are the sum of the income as a regular employee plus some
manager bonus:
create type Employee under Person; create type Manager under Employee; create function mgrbonus(Manager)->Integer as stored; create function income(Employee)->Integer as stored; create function income(Manager m)->Integer i as select income(cast(m as Employee)) + mgrbonus(m);
Now, suppose that we need a function that returns the gross incomes of
all persons in the database, i.e. we use manager.income->integer
for managers and employee.income->integer for non-manager.
In Amos II such a function is defined as:
create function grossincomes()->Integer i as select income(p) from Employee p; /* income(p) late bound */
Since income is overloaded with resolvents employee.income->integer
and manager.income->integer and both qualify to apply to
employees, the resolution of income(p) will be done at run
time.
If sums of incomes as employees are sought the desired resolvent has to
be
specified as an explicit resolvent name employee.income->integer.
Since the detection of the necessity of dynamic resolution is
done at compile time, overloading a function name may lead to a
cascading recompilation of functions defined in terms of that function
name. This may take some time and the recompilation is therefore
deferred until a function
marked for recompilation is called. For a more detailed presentation of
the management of late bound functions see [FR95].
The type of an expression can be
explicitly defined using the casting statement:
casting ::= 'cast'(expr 'as' type-spec)
for example
create function income(Manager m)->Integer i as select income(cast(m as Employee)) + mgrbonus(m);
By using casting statements one can avoid late binding and
explicit resolvent names.
2.4.5 Subqueries and aggregation operators
Aggregation operators are defined as functions where one or several
arguments are declared as bags:
Bag of Type x
The following system aggregation operators are defined:
sum(Bag of Number x) -> Number count(Bag of Object x) -> Integer maxagg(Bag of Object x) -> Object minagg(Bag of Object x) -> Object some(Bag of Object x) -> Boolean notany(Bag of Object x) -> Boolean
Notice that sum must be applied only to 'uniform' bags of
numbers.
Aggregation operators are applied on subqueries. For example, the
following query counts how many functions there are currently defined
in the database:
count(select f from Function f);
If an aggregation operator is applied on the result of a function
it is first coerced to a subquery. The following to queries return the
same values:
count(select allfunctions());
count(allfunctions());
In the second call to count the call to allfunctions
is first coerced into a subquery that returns the bag whose elements
are counted.
Local variables in queries may be declared as bags that can
be used as arguments to aggregation operators. For example
totalincomes to compute the sum of all incomes can be
defined as:
create function totalincomes()->Integer
as select sum(select income(p) from Person P);
or
create function totalincomes()->integer as select sum(b) from Bag of Integer b where b = (select income(p) from Person p);
NOTICE: Stored functions cannot be aggregation operators
and variables declared 'Bag of' cannot be returned from select
statements.
2.5 Updates
Information stored in Amos II can be thought of as mappings from
function arguments to results. These mappings are either defined at
object creation time (Objects), or altered
by one of the
function update statements 'set', 'add', or 'remove'. Functions
have as extent a bag of tuples
mapping arguments to results. Updating a stored function means updating
its extent.
Syntax:
update-stmt ::= update-op update-item [from-clause] [where-clause]
update-op ::= 'set' | 'add' | 'remove'
update-item ::= function-name '(' expr-commalist ')' '=' expr
Not every function is updatable. Amos II defines a function to be
updatable if it is a stored function, or if it is derived from a single
updatable function without a join.
Semantics:
- set sets the value of an updatable function
given the arguments.
A boolean function can be set to either 'true' or'false'.
Setting the value of a boolean function to 'false' means that the truth
value is removed from the function.
- add adds the specified tuple(s) to the result
of an updatable function with bag result, analogous to set.
- remove removes the specified tuple(s) from the
result of an updatable function with bag result, analogous to set.
The update statements are not allowed to violate the cardinality
constraints ('key') specified by the create-type-stmt
or the create-function-stmt.
NOTICE that the user can declare explictit
update rules for derived functions.
2.5.1 Dynamically changing object types
The add-type-stmt changes the type of one or more objects to
the specified type.
Syntax:
add-type-stmt ::= 'add type' type-name ['(' [generic-function-name-commalist] ')'] 'to' variable-commalist
The updated objects may be assigned initial values for all the
specified property functions in the same manner as in the create object
statement.
The remove-type-stmt makes one or more objects no
longer belong to the specified type.
Syntax:
remove-type-stmt ::= 'remove type' type-name 'from' variable-commalist
Referential integrity is maintained so that all references to the
objects as instances of the specified type cease to exist. An
object will always be an instance of some type. If all user defined
types
have been removed, the object will still be member of userobject.
2.5.1 User update procedures
It is possible to register user defined user update procedures
for any function. The user update procedures are stored
procedures that are transparently invoked when update statements
are executed for the function.
set_addfunction(Function f, Function
up)->Boolean
set_remfunction(Function f, Function up)->Boolean
set_setfunction(Function f, Function up)->Boolean
The function f is the function for which we wish to declare a
user update function and up is the actual update procedure.
The arguments of a user update procedures is the concatenation of
argument and result tuples of f.
For example, assume we have a function
create function
netincome(Employee e) -> Number
as select
income(e)-taxes(e);
Then we can define the following user update procedure:
create function
set_netincome(Employee e, Number i)-> Boolean
as begin
set taxes(e)= i*taxes(e)/income(e) + taxes(e);
set income(e) = i*(1-taxes(e))/income(e) + income(e);
end;
The following declaration makes netincome updatable with the set statement:
set_setfunction(functionnamed("EMPLOYEE.NETINCOME->NUMBER"),
functionnamed("EMPLOYEE.NUMBER->BOOLEAN"));
Now one can update netincome
with, e.g.:
set
netincome(p)=32000 from Person p where name(p)="Tore";
2.5.3 Dynamic updates
Sometimes it is necessary to be able to create objects whose
types are not known until runtime. Similarly one may wish to update
functions without knowing the name of the function until runtime. For
this there
are the following system procedures:
createobject(Type t)->Object
createobject(Charstring tpe)->Object
deleteobject(Object o)->Boolean
addfunction(Function f, Vector argl, Vector resl)->Boolean
remfunction(Function f, Vector argl, Vector resl)->Boolean
setfunction(Function f, Vector argl, Vector resl)->Boolean
createobject creates a surrogate object of the type
specified by its argument.
deleteobject deletes a surrogate object.
The stored procedures registered by setfunction, addfunction,
or remfunction updates a function given an argument list and
a result tuple as vectors. Returns TRUE if the update
succeeded.
2.6 Cursors
Cursors provide a way to iterate over the result of a query or a
function call. The open-cursor-stmt and the
fetch-cursor-stmt
iterate over the result from a query or function call.
Syntax:
open-cursor-stmt ::= 'open' cursor-name 'for' simple-select-stmt
cursor-name ::= variable
fetch-cursor-stmt ::= 'fetch' cursor-name (into-clause | next-clause)
next-clause ::= 'next' integer-constant
close-cursor-stmt ::= 'close' cursor-name
For example:
create person (name,age) instances :Viola ('Viola',38); open :c1 for select p from Person p where name(p) = 'Viola'; fetch :c1 into :Viola1; close :c1; name(:Viola1); --> "Viola";
A cursor is created by the open-cursor-stmt and is
represented by a bag of result tuples containing objects with unknown
types.
The result of the query is always materialized and stored in
the cursor bag.
The fetch-cursor-stmt fetches the first result tuple
from the cursor; i.e. the tuple is removed from the front of the cursor
bag. nil is returned if there are no more result tuples left
in the cursor.
If present in a fetch-cursor-stmt, the into
clause will bind elements of the first result tuple to AmosQL interface
variables. There must be one interface variable for each element in the
result
tuple.
If present in a fetch-cursor-stmt, the next-clause
will display the specified number of result tuples and remove them from
the cursor bag.
If neither a next
nor an into clause is
present in a
fetch-cursor-stmt, a single result tuple is fetched and displayed.
The close-cursor-stmt deallocates the cursor bag.
It is sometimes useful to count the number of result tuples in
a cursor bag:
create function ageofpersonnamed(Charstring nm)-> Integer a as select age(p) from Person p where name(p)=nm; open :c1 for select ageofpersonnamed('Eve'); count(:c1); 1
3 Database procedures
A database procedure is an Amos II function defined as a sequence of
AmosQL statements that may have side effects (i.e. database update
statements or variable assignments). Procedures may return a stream of
results by using a special result statement. Each time result
is called another result item is emitted from the procedure. Procedures
should not be used in queries (but this restriction is currently not
enforced). Most, but not all, AmosQL statements are allowed in
procedure bodies as can be seen by the syntax below.
Syntax:
procedure-body ::= block | procedure-stmt procedure-stmt ::= create-object-stmt | delete-object-stmt | for-each-stmt | update-stmt | add-type-stmt | remove-type-stmt | set-local-variable-stmt | query-stmt | if-stmt | commit-stmt | abort-stmt
block ::= 'begin' ['declare' variable-declaration-commalist ';'] procedure-body-semicolonlist 'end'
result-stmt ::= 'result' expr
for-each-stmt ::= 'for each' [for-each-option] variable-declaration-commalist [where-clause] for-each-body
for-each-option ::= 'distinct' | 'original'
for-each-body ::= procedure-body
if-stmt ::= 'if' expr 'then' procedure-body ['else' procedure-body]
set-local-variable-stmt ::= 'set' local-variable '=' function-call
Examples:
create function creperson(Charstring nm,Integer inc) -> Person p as begin create Person instances p; set name(p)=nm; set income(p)=inc; result p; end;
set :p = creperson('Karl',3500);
create function makestudent(Object o,Integer sc) -> Boolean as add type Student(score) to o (sc);
makestudent(:p,30);
create function flatten_incomes(Integer threshold) -> Boolean as for each Person p where income(p) > threshold set income(p) = income(p) - (income(p) - threshold) / 2;
flatten_incomes(1000);
NOTICE: Queries and updates embedded in procedure bodies are
optimized at compile time. The compiler saves the optimized query plans
in the database so that dynamic query optimization is not needed when
procedures are executed.
Procedures may return (bags of) results as a streams. The
result-stmt is used for this, where expr is returned as
the result from the procedure. If result is never called the
result of the procedure is nil. Usually the result type of
procedures not returning any value is Boolean.
3.1 Iterating over the result
of a query
The for-each
statement
iterates over the result of the query specified by the variable
declarations
(variable-declaration-commalist)
and the where clause
executing
the for-each-body for each result variable binding of the
query. For example the following function adds inc to the
incomes of all persons with salaries higher than limit and
returns their old incomes:
create function increase_incomes(Integer inc,Integer limit) -> Integer oldinc as for each Person p, Integer i where i > limit and i = income(p) begin result i; set income(p) = i + inc end;
NOTICE: The result
statement does not not abort the control flow (different from, e.g.,
return in C), but it only specifies that
a value is to be emitted to the result bag (i.e. stream) of
the function and then the procedure evaluation is continued as usual.
The for-each-stmt does not return any
value at all unless result-stmt is used within
its body.
The for-each-option specifies how to treat the result of the
query
iterated over. If it is omitted the system default is to first
materialize
a copy of the result of the query iterated over before applying
the for-each-body on each element of the copy. If 'distinct'
is specified the iteration is over a copy where duplicates in addition
have
been removed. If the option is 'original' the code is
applied
directly on the database contents in place.
NOTICE: Even though option 'original' can be very efficient
and
useful there is a problem sometimes. In the example, if 'original'
had been specified and the stored function income(p) happen
to have an ordered index sorted on the result the iteration would loop
for
ever! The reason is that the ordered index would be scanned to
efficiently
match the condition 'i > limit' and therefore the income
updates
will be made in income order, rather than person order. Thus each
income
updated a new row with the higher updated income will be inserted into
the
index after the current income and therefore the same person
will
be visited again. Therefore the iteration will loop forever.
On the other hand the space cost of the default copy option may be very
high for a massive iteration over, e.g., an external very large
wrapped
relational table. In such cases the 'original' option can be
used
for better space efficiency.
4 Peer management
Using a basic Amos II peer communication system, distributed Amos II
peers can be set up that communicate using TCP/IP. A federation of Amos
II peers is managed by a name server
which is an Amos II peer knowing
addresses and names of other Amos II peers. Queries, AmosQL commands,
and results can be shipped between peers. Once a federation is set up,
multi-database facilities can be used for defining queries and views
spanning several Amos II peers [RJK03].
Mediation and reconcilitation primitives can be used for defining
object-oriented multi-peer views [JR99a][JR99b].
4.1 Peer communication
The following AmosQL system functions and procedures are available for
inter-peer communication:
nameserver(Charstring name)->Charstring
Makes the current stand-alone database into a name server and registers
there itself as a peer with the given name. If name is "" the
name server will become anonymous and not registered as a peer.
listen()
Starts the mediator server listening loop. Informs the name
server that this peer is ready to receive incoming messages. The
listening loop can be interrupted with CTRL-C and resumed again by
calling listen(). The name server must be listening before
any
other peer can register.
register(Charstring name)->Charstring
Registers in the name server the current stand-alone database as a peer
with the given name. The name server must be running on the
local host. The system will complain if
the name is already registered in the name server. The peer needs
to be activated with listen() to be able to receive incoming requests.
register(Charstring name, Charstring
host)->Charstring
Registers the current database as a peer in the name server running on
the given host. Signals error if peer with same name already
registered in federation.
reregister(Charstring name)->Charstring
reregister(Charstring name,
Charstring host)->Charstring
as register
but unregisters other peer with same name rather than signalling error.
Good when restarting peer registered in name server after crash.
this_amosid()->Charstring name
Returns the name of the peer where the call is issued.
Returns "NIL" if issued in a not registered standalone Amos
II system.
other_peers()->Bag of Charstring name
Returns the names of the other peers in the federation
managed by the name server.
ship(Charstring peer, Charstring cmd)->
Bag of
Vector
Ships the AmosQL command cmd for execution in the named peer.
The result is shipped back to the caller as a set of tuples represented
as vectors. If an error happens at the other peer the error
management is also shipped back.
at "peer" select ...;
Skips the query to the specified peer for execution. Ships the result
back as bag of tuples.
send(Charstring peer, Charstring cmd)->
Charstring peer
Sends the AmosQL command cmd for asynchronous execution in
the named peer without waiting for the result to be returned.
broadcast(Charstring cmd)-> Bag of
Charstring
Sends the AmosQL command cmd for asynchronous execution in
all other
peers. Returns the names of the receiving peers.
hostname()->Charstring name
returns the name of the host where the current peer is running.
kill_peer(Charstring name)->Boolean
Kills the specified peer. Don't kill the name server if you still want
to be able to communicate with other peers!
4.1 Peer queries and views
Once you have established connections to Amos II peers you can
define multi-database queries views of data from your peers. You first
have to import the meta-data about selected types and functions from
the
peers. This is done by defining proxy
types and proxy functions
[RJK03] using the system function import_types:
import_types(Vector of Charstring typenames, Charstring p)->
Bag of Type pt
defines proxy types pt
for types named typenames
in peer p. Proxy
functions are defined for the functions in p having the imported types as
only argument. Inheritance among defined proxy types is defined
by the corresponding inheritance relationships between imported
types in the peer p.
Once the proxy types and functions are defined they can
transparently be queried. Proxy types can be references using @ notation to reference types
in other peers.
For example,
select name(p)
from Person@p1;
selects the name property
of objects of type Person
in peer p1.
import_types
imports only those functions having one of typenames as
single
arguments. You can import other functions using system function import_func:
import_func(Charstring fn, Charstring p)->Function pf
imports a function named fn
from peer p returning
proxy function pf.
On top of the imported types object-oriented multi-peer views
can be defined, as described in [RJK03]
consisting of derived types [JR99a] whose extents are derived through queries
intersecting extents of other types, and IUTs [JR99b]
whose extents reconciles unions of other type extents. Notice that
the implementation of IUTs is limited. (In particular the system flag latebinding('OR'); must be set
before IUTs can be used and this may cause other problems).
5 Wrapping external data sources
Amos II contains number of primitives for accessing different external
data sources by defining wrappers for each kind external
sources. The wrapper functionality in Amos II is greatly enhanced
compared to earlier releases of Amos II. In particular wrappers allow
specification of different capabilities
for different kinds of data sources [Han03].
Capabilities provide hooks to define data source specific re-write
rules and cost models.
Wrappers can furthermore be made updatable so that a
wrapped existing data source appears to the mediator as an
object-relational database. Wrappers for relational
databases using JDBC or ODBC
interfaces are predefined in Amos II. There are also wrappers for
accessing other Amos II peers.
This chapter first describes multi-directional
foreign functions [LR92], the basis for
accessing external systems from Amos II queries and defining wrappers.
Then some general types and functions used for defining meta-data of
eternal sources are described. Finally we describe the relational
database wrappers for JDBC and ODBC.
5.1 Multi-directional foreign
functions
The basis for accessing external systems from Amos II is to define foreign functions. A foreign
functions allows subroutines defined in C/C++ [Ris00a], Lisp [Ris00a] [Ris00b], or Java [ER00] to be called from Amos II
queries. This allows access to external databases, storage managers, or
computational libraries. A foreign function is defined by the
following function implementation:
foreign-body ::=
simple-foreign-definition | multidirectional-definition
simple-foreign-body ::= 'foreign' [string-constant]
multidirectional-definition ::=
'multidirectional' capability-list
capability ::=
(binding-pattern 'foreign' string-constant
['cost'
cost-spec]['rewriter' string-constant])
binding-pattern ::= A string constant containing b:s
and f:s
cost-number ::= integer-constant | real-constant
cost-spec ::= function-name
| '{' cost-number ',' cost-number '}'
For example:
create function iota(Integer l, Integer u) -> Integer
as foreign 'iota';
create function sqroot(number x)-> Number r
as multidirectional
("bf" foreign 'sqrt'
cost {2,2})
("fb" foreign
'square'
cost {1.2,1});
create function bk_access(integer handle_id
)-><vector key, vector >
/* Core cluster function
*/
as multidirectional
('bff' foreign
'bk_scan' cost
bk_cost rewriter "abkw")
('bbf' foreign
'bk_get' cost
{1002, 1});
create function myabs(real x)->real y
as multidirectional
("bf" foreign
"JAVA:Foreign/absbf" cost {2,1})
("fb" foreign
"JAVA:Foreign/absfb" cost {4,2});
The syntax using multidirectional
definition is the general one having several
options for different capabilities of
foreign functions. The simplified syntax using simple-foreign-body is mainly for
quick implementations of functions without paying attention to
expensive functions or scalable query execution.
A foreign function can implement either
- Filters which are
predicates (boolean valued functions, e.g. '<') that succeed when
certain conditions over its results are satisfied.
- Computations that
produce a result given that the arguments are known, e.g. +..
- Generators that
produce a stream of several result tuples given the argument
tuple, e.g. iota or a function sending SQL
statements to a relational database for evaluation.
Amos II functions in general are multi-directional,
i.e. they can be executed also when the result of a function is given
and some corresponding argument values are sought. For example, if we
have a function
parents(person)-> bag of person
we can ask these AmosQL queries:
parents(:p); /* Result is the bag of parents of :p */ select c from person c where parents(c) = :p; /* Result is bag of children of :p */
It is often desirable to make foreign
Amos II functions multi-directional as well. For example, we may wish
to ask these queries using the square root function sqroot above:
sqroot(4.0); /* Result is -2.0 and 2.0 */ select x from number x where sqroot(x)=4.0; /* result is 16.0 */ select sqroot(4.0)=2.0; /* Is the square root of 4 = 2 */
With simple foreign functions only the
first (forward) function call is possible. Multi-directional foreign
functions permit also the second kind of queries.
The benefit of multi-directional foreign functions is that a larger
class of queries calling the function is executable (safe), and that
the system can make better query optimization. A multi-directional
foreign function can have several capabilities
implemented depending on the binding
pattern of its arguments and results. The binding pattern is a
string of b:s and f:s, indicating which arguments or results in a given
situation are known or unknown, respectively.
For example, the square root (sqroot)
actually has the following possible binding patterns:
(1) If we know x but not r, the binding pattern is "bf" and the implementation
should return r as the
square root of x.
(2) If we know r but not x, the binding pattern in "fb" and the implementation
should return x as r**2.
(3) If we know both r and
x then the binding
pattern is "bb" and the
implementation should check that x
= r**2.
Case (1) and (2) are implemented by multi-directional foreign function sqroot above.
Case (3) need not be implemented as it is inferred by the system by
first executing r**2 and
then checking that the result is equal to x (see [LR92]).
To implement a multi-directional foreign function you first need to
think of which binding patterns require implementations. In the sqroot case one implementation
handles the square root and the other one handles the square. The
binding patterns will be "bf"
for the square root and "fb"
for the square.
The following steps are required to define a foreign function:
- Implement each
foreign function capability using the interface of the implementation
language.
- In case the foreign code is statically linked (only in
C/C++) the implementation must be linked
to the kernel by make a driver program referencing it as an
external function [Ris00a]. This is not needed
for foreign functions in Java or Lisp that are dynamically loaded into
the system.
- Statically linked code must also be assigned a symbolic name to be referenced in
the foreign function capability definition (sqrt
and square in the
example) See [Ris00a].
- Finally the multidirectional foreign function needs to be defined through a
foreign function definition in AmosQL as above. Here the
implementor also associates a binding pattern and optional cost
estimates with each capability. Normally the foreign function
definition is done separate from the actual code implementing its
capabilities, in an AmosQL script.
For further details on how to define multidirectional foreign functions
for the different implementation languages see [Ris00a][Ris00b].
A capability can also have an associated rewrite rule (keyword rewriter)
implemented in Lisp. It is called by the query processor to
generate several foreign function calls from an applied capability.
This requires some knowledge about the Amos II internals. It is not
further elaborated here.
5.1.1 Cost and result size
estimates
Different capabilities of multi-directional foreign functions often
have different execution costs. In the sqroot example the cost of
computing the square root is higher than the cost of computing the
square. When there are several alternative implementations of a
multi-directional foreign function the cost-based query optimizer needs
cost hints that help it choose the most efficient implementation. In
the example we might want to indicate that the cost of executing a
square root is double as large as the cost of executing a square.
Furthermore, the query optimization is also very much dependent on the
expected size of the result from a function call. This is called the fanout (or selectivity of
predicates) of the call for a given binding pattern. In the sqroot
example the implementation sqrtbf
usually has a fanout of 2, while sqrtfb
has a fanout
of 1.
Thus for good query optimization each foreign function capability
should
have
associated costs and fanouts:
- The cost is an
estimate of how expensive it is to completely execute a foreign
function for given set of arguments.
- The fanout estimates
the expected number of elements in the result stream, given the
arguments.
The cost and fanout for a multidirectional foreign function
implementation can be either specified as a constant vector of two
numbers (as in sqroot above) or as an Amos II cost function returning the vector
of cost and fanout for a given
function call. The
numbers normally need only be rough numbers, as they are used by the
query optimizer to compare the costs of different possible execution
plans to produce the optimal one. The number 1 for the cost of a
foreign function should roughly be the cost to perform a cheap function
call, such as '+' or '<'.
If the simplified syntax is used or no
cost is specified the system
tries to put reasonable defaults on foreign functions. It assumes they
are cheap with low fanout. If you have expensive foreign functions you
are strongly advised to specify cost hints. It is also advisable to
specify a fanout for functions you know have less or equal to 1 or
greater than 2. The
system assumes aggregation functions are expensive to execute by
default.
The cost function cfn is
an Amos II function with signature
create function <cfn>(Function f, Vector bpat, Vector args) -> <Integer cost, Integer fanout> as ...; e.g.
create function typesofcost(Function f, Vector bpat, Vector args) -> <Integer cost, Integer fanout> as foreign ...;
The cost function is normally called at compile time when the
optimizer needs the cost and fanout of a function call in some query.
The arguments and results of the cost function are:
f
is the full name the called Amos II
function.
bpat is the binding
pattern of the call
as a vector of strings ’b’ and ’f’, e.g. {"f","b"} indicating which
arguments in the call are bound or free, respectively.
argl is a vector of
actual variable
names and constants used in the call.
cost is the computed estimated cost to execute
a call to f with the given binding pattern and argument list. The cost
to access a tuple of a stored function (by hashing) is 2; other costs
are calibrated accordingly.
fanout is the
estimated fanout of the
execution, i.e. how many results are returned from the execution.
If the cost hint function does not return anything it indicates that
the function is not executable in the given context.
The costs and fanouts are normally specified in each capability
specification of a multidirectional foreign function definition, as in
the example. The costs can also be specified after the definition of a
foreign function by using the following Amos II system
procedure:
costhint(Charstring fn,Charstring bpat,Vector ch)->Boolean
e.g.
costhint("number.sqrt->number","bf",{4,2}); costhint("number.sqrt->number","fb",{2,1});
fn is the full name
of the resolvent.
bpat is the binding
pattern string.
ch is a vector with
two numbers where the first number is the
estimated cost and the second is the estimated fanout.
A cost function cfn can
be assigned to a capability with:
costhint(Charstring fn,
Charstring bpat, Function cfn) -> Boolean
To find out what cost estimates are associated with a
function use:
costhints(Function r)-> Bag of <Charstring
bpat, Object
q>
It returns the cost estimates for resolvent r and their associated
binding
patterns.
5.2 Generic wrapper
functionality
Some wrapper functionality is completely data source independent while
other functionality is specific for a particular kind of data source.
Therefore, to share wrapper functionality Amos II contains a hierarchy
of wrappers. In this section we describe common functionality used for
defining any kind of wrapper.
5.2.2 Data sources
Types and objects of type Datasource describe properties of
different kinds of data sources accessible through Amos II. Each kind
of wrapped data source has a corresponding data source type,
for example type Amos, Relational, or Jdbc to
describe Amos II peers, relational databases, or relational data bases
accessed through JDBC, respectively. These types are all subtypes of
type Datasource. Each instance of a data source type
represents a particular data source of that kind, e.g. a particular
relational
database accessed trough JDBC are instances of type Jdbc.
5.2.3 Mapped types
A mapped type is a type whose instances are identified by a key
consisting of one or several other objects [FR97].
Mapped types are needed when proxy objects corresponding to external
values from some data source are created in a mediator. For example, a
wrapped relational database may have a table PERSON(SSN,NAME,AGE) where
SSN is
the key. One may then wish to define a mapped type named Pers
in
the mediator representing proxy objects for the persons in the table.
The
instances of the proxy objects are identified by a SSN. The type Pers
should furthermore have the following property functions derived from
the
contents of the wrapped table PERSON:
ssn(Pers)->Integer
name(Pers)->Charstring
age(Pers)->Integer
The instances and primary properties of a mapped type are defined
through a Core Cluster Function that returns these as a set of
tuples, one for each instance. In our example the core cluster function
should return tuples of three elements <ssn,name,age>. The
relational database wrapper will automatically generate such a core
cluster function for table PERSON (type Pers) with signature:
pers_cc()->Bag of <Integer ssn, Charstring
name, Integer age> as ...
A mapped type is defined through a system procedure with signature:
create_mapped_type(Charstring name, Vector keys,
Vector attrs, Charstring ccfn)-> Mappedtype
where
name is the name of the mapped type
keys is a vector of the names of the keys
identifying each instance of the mapped type.
attrs is a vector of the names of the properties
of the mapped type.
ccfn is the name of the core cluster function.
In our example the relational database wrapper will automatically
define the mapped type Pers through this call:
create_mapped_type("Pers",{"ssn"},{"ssn","name","age"},"pers_cc");
Notice that the implementor of a mapped type must guarantee key
uniqueness.
Once the mapped type is defined it can be queried as any other
type, e.g.:
select name(p) from Pers p where age(p)>45;
5.2.3 Type translation
Types in a specific data source are translated to corresponding types
in the mediator using the following system functions:
amos_type(relational_named("IBDS"),"VARCHAR");
amos_type returns the Amos II type corresponding to
the a specific data source.
wrapped_type(relational_named("IBDS"),typenamed("CHARSTRING"));
returns the data source type corresponding to an Amos II type. Since
one external type may correspond to more than one Amos II type wrapped_type
is not the inverse of amos_type.
The most common relational types and their Amos II counterparts are
provided by default. Both functions are stored functions that can be
updated as desired for future wrappers.
5.3 Relational database wrappers
There are predefined wrappers for relational databases using
either the ODBC or JDBC standard relational database interfaces.
Everything
that is above the ODBC/JDBC interface level (type conversion,
translation
etc) has been lifted to an abstract Relational Wrapper
specified
through a data source named Relational. The specific
ODBC/JDBC
wrappers override the abstract function interfaces of the general
relational
wrapper.
The ODBC wrapper is primarily tested with Microsoft MS Access
and the JDBC wrapper is tested with FireBird's InterClient
JDBC
driver and Microsoft's SQLServer driver.
5.3.1 The abstract wrapper Relational
An instance of type Relational represents a
relational database and the functions represents interfaces and
capabilities of a relational database in general. The Relational
wrapper is an abstract wrapper in the sense that it does not
implement a specific data source and thus has no instances, but rather
it specifies the functionality that a general relational database
interface must implement. In the type hierarchy below the abstract Relational
wrapper there are specific implemented wrappers for the
relational database interface standards ODBC and JDBC. The subtypes
have one instance for each data source of those kinds that the system
accesses. The following type hierarchy is used:
Datasource | Relational / \ Jdbc Odbc
All interface functions for relational databases are defined
for type Relational. Some relational database interface
functions are defined as abstract functions. Abstract functions
are implemented as foreign functions having a special implementation ...
foreign
'abstract-function'. Abstract
functions cannot be executed
and a run-time error will be raised if they are forced to; they always
have to be overridden by a corresponding resolvent for a subtype.
The use of abstract functions makes sure that Amos II will use
dynamic type resolution to look for implementations of these functions
in any subtype to Relational. The net effect is that the
interface
for any relational database wrapper is explicitly specified and must
be followed. This is exactly what the JDBC and ODBC wrappers do. The
JDBC and ODBC wrappers and implementation of these functions will be
explained
below, for now we will assume that the functions are implemented and
work
for some subtype of Relational.
5.3.1.1 Relational interface functions
The relational data source interface functions provide the general
interface for accessing different kinds of relational databases. Some
of the interface functions are abstract and need to be defined
specifically for each subtype of Relational. Other data
source specific functions, such as the system function import_table
rely on this.
The basic relational interfaces functions provide a low level
interface to a specific relational DBMS application programming
interface. The instances of relational data sources are
created using data source type specific data source constructor
functions that loads necessary drivers and looks up that such a
data source exists. For example,
jdbc("ibds","interbase.interclient.Driver");
creates a new JDBC data source instance named IBDS (capitalized) using
the Interbase interclient driver.
Each relational database source requires the following data
source interface functions:
-
connect(Relational r, Charstring ds_identifier, Charstring username, Charstring password) -> Relational
for example
connect(relational_named("IBDS"),"EMPLOYEE.GDB","SYSDBA","masterkey"); connect
opens a new connection to a relational data source. The function is
abstract for type Relational and implemented for each subtype
of Relational.
-
tables(Relational r) -> Bag of <Charstring table, Charstring catalog, Charstring schema, Charstring owner>
For example
tables(relational_named("IBDS"));
tables returns a bag of table description tuples for a
given relational database source. The relational wrappers use this
function to find the tables of the relational database.
The function is abstract for type Relational. It
is called by the (non-abstract) generic wrapper function has_table
defined below.
columns(relational_named("IBDS"),"SALES");
columns returns, for a given table table_name
in a relational data source r, its columns along with
their SQL data types. The function is abstract for type Relational.
primary_keys(relational_named("IBDS"),"SALES");
primary_keys returns, for a given relational table table_name,
a bag of the columns constituting the primary key. constraint_name
is the name of a primary key constraint in the relational database if
available. The function is abstract for type Relational.
has_table(relational_named("IBDS"),"SALES");
has_table returns TRUE if the table table_name
exists in the relational database r.
cardinality(relational_named("IBDS"),"SALES");
cardinality returns the number of tuple in table table_name
in the relational database r. The function is abstract
for type Relational.
import_table(relational_named("IBDS"),"SALES");
import_table creates an object-oriented view of a
relational table in the mediator peer. The view is represented by a mapped type which is a type whose extent is
defined by the rows of the table. Each instance of the mapped type
corresponds to a row in the table. For the mapping, the instances of a
mapped type internally stores the key of the corresponding table row.
The name of the mapped type is constructed by concatenating the table
name, the character _ and the data source name, for example Person_ibds.
Mapped type names are internally capitalized, as for other Amos II
types.
For each columns in the table import_table will
generate a corresponding derived wrapper function returning the
column's value given an instance of the mapped type. For example,
a table named person having the column ssn
will have a function
ssn(Person_ibds)->Integer
returning the ssn of a person from the
imported relational table.
-
import_table(Relational r, Charstring table_name, Boolean updatable) -> Mappedtype mt
for example
import_table(relational_named("IBDS"),"COUNTRY",true);
The extra argument allows the user to specify the wrapper to
be updatable. This means that Amos II update statements to the
wrapper functions are transparently translated into corresponding SQL
update statements of the wrapped table and create statements
for the mapped type is translated into SQL insert statements.
Updatable wrappers define transparent object-oriented
data abstractions
over existing relational database tables.
For example, if there are columns CURRENCY and COUNTRY
in table country, a new rows in the table added with
create Country_ibds(currency,country) instances ("Yen","Japan");
The currency of Japan can be transparently updated too:
set currency(c)= "Jen" from Country_ibds c where country(c)= "Japan";
The most general resolvent of import_table is:
-
import_table(Relational r, Charstring catalog_name, Charstring schema_name, Charstring table_name, Charstring typename, Boolean updatable, Vector supertypes) -> Mappedtype mt
The table resides in the given catalog and schema. If catalog
is "", the table is assumed not to be in a catalog. If schema is "",
the table is assumed not to be in a schema. typename is
the desired name of the mapped type created, as alternative to the
system generated concatenation of table and data source name. updatable
gives an updatable mapped type. supertypes is a vector of
either type names or type objects, or a mixture of both. The effect is
that Amos II will perceive the mapped type as an immediate subtype of
the types.
There are also two other variants of import_table to be used for
relational databases where schema and catalog names need not be
specified:
-
import_table(Relational r, Charstring table_name, Charstring typename, Boolean updatable, Vector supertypes) -> Mappedtype mt
import_table(Relational r, Charstring table_name, Charstring type_name, Boolean updatable) -> Mappedtype mt
drop_table(relational_named("IBDS"),"SALES");
drop_table deletes the specified relational database
table.
5.3.1.2 Importing many-to-many relationships
All tables in relational databases do not correspond to
'entities' in an ER diagram and therefore cannot be directly mapped to
types. The most common case is tables representing many-to-many
relationships between entities. Typically such tables have two columns,
each of which is a foreign key imported from two other tables,
expressing the many-to-many relationship between the two. Obviously,
only
entities are desired to be imported as types and special types
should not be generated for such relationship tables. A many-to-many
relationship in a relational database corresponds directly to a
function returning a bag in AmosQL, and should be imported
using import_relation rather than import_table:
-
import_relation(Relational r, Charstring table_name, Charstring argument_column_name, Charstring result_column_name, Charstring function_name, Boolean updatable) -> Function
table_name - the name of the table
containing the relation.
argument_column_name - the name of the
column which is argument of the function.
result_column_name - the name
of the column which is result of the function.
function_name - the desired name of the
function.
updatable - whether the function should be
transparently updatable via set, add, and remove.
For example, assume we have two entities, person
and telephone. Most of the time telephones are not
regarded as entities in their own respect since nobody would care to
know more about a telephone than its extension number. In this scenario
however, the physical location of the telephone is also kept in the
database, so telephones have a table of their own.
A person can be reached through several telephones, and every
telephone may be answered by several person. The typical schema looks
as follows:
person_telephones
ssn |
ext_no |
|
|
|
telephone
ext_no |
location |
... |
|
|
|
Now, the correct way to wrap this schema is the following way:
import_table(my_relational, 'person');
import_table(my_relational, 'telephone');
import_relation(my_relational, 'telephone',
'ssn','ext_no','phones', false);
or
import_relation(my_relational, 'telephone',
'ext_no',ssn','phones', false);
(a matter of taste)
create function phones(person@my_relational p) ->
telephone@my_relational t as select t where phones(ssn(p)) = ext_no(t);
or, if the other direction was chosen:
create function phones(telephone@my_relational t) ->
person@my_relational p as select p where phones(ext_no(t)) = ssn(p);
As seen above, the user has the option of importing a
many-to-many table as two different functions. The reason is simple:
Whereas functions have directions, many-to-many relationships are direction
neutral. Thus the additional information of desired direction must
be expressed by the user.
Also note that only a function with a single argument and
result can be created.
-
imported_keys(Jdbc j, Charstring fktable) -> <Charstring pktable, Charstring pkcolumn, Charstring fkcolumn>
for example
imported_keys(relational_named("IBDS"),"SALES");
pktable - The table referenced by this
foreign key.
pkcolumn - The column referenced by this
foreign key.
fkcolumn - The foreign key column in this
table.
imported_keys returns information about the
exported keys of a relational table. Abstract in type Relational.
-
exported_keys(Jdbc j, Charstring pktable) -> <Charstring pkcolumn, Charstring fktable, Charstring fkcolumn>
for example
exported_keys(relational_named("IBDS"),"COUNTRY");
pkcolumn - The primary key column in this
table.
fktable - The table whose foreign key
references this table.
fkcolumn - The foreign key column in the
table that references this table.
exported_keys returns information about the exported
keys of a table in a datasource. Abstract in type Relational.
5.3.1.2 SQL statement execution
The function executes an arbitrary sql statement as a string.
It is implemented in all relational wrappers as it is used by the translator
from AmosQL to SQL to generate SQL queries from
AmosQL.
-
sql(Relational r, Charstring query, Vector arguments) -> Vector results
Abstract function in Relational. Executes the
query query at the data source r and
returns
the result as a vector. The query string may contain question marks
that will be bound to arguments in the vector arguments.
A feature is that the query will be compiled and
cached so that subsequent calls to sql with the
same string (even though the arguments differ) will execute
substantially faster.
5.3.2 The JDBC Wrapper
The basic Amos II system contains a wrapper for JDBC
relational data sources. It allows transparent for access to data in
relational databases through OO views. JDBC data sources in Amos II are
represented by the type Jdbc, subtype of Relational.
Jdbc overrides all of the abstract functions of Relational
and does not declare any functions of its own.
5.3.2.1 JDBC Data Access Functions
jdbc(Charstring logname, Charstring driver) -> Jdbc
Constructs a Jdbc object having the logical
nick-name logname and attempts to load the JDBC driver.
connect(Jdbc j, Charstring url) -> Jdbc
Calls the JDBC method DriverManager.getConnection(String
url), which uses a default user and password
connect(Jdbc j, Charstring url, Charstring user_name,
Charstring pass) -> Jdbc
Calls the JDBC method DriverManager.getConnection(String
url, String userName, String pass).
-
disconnect(Jdbc ds) -> Boolean
Calls the JDBC method Connection.close().
This is also done automatically at system exit.
sql(Jdbc j, Charstring query, Vector params -> Bag
of Vector res
Execute an SQL statement by first checking of the string
is cached, in which case there exists already a PreparedStatement
where the parameters can be set to those in params.
Otherwise a call to PrepareStatement is issued and the
result cached along with the query string.
Obviously, a lot can be gained in using identical strings
for identical queries, and using as many ?'s as possible.
Indeed, this is what the translator strives
for at all times.
5.3.2.2 JDBC Data Meta-Data Functions
tables(Jdbc j) -> Bag of <Charstring table,
Charstring catalog, Charstring schema, Charstring owner>
Calls the method getTables and projects
some of the information to Amos. See also section 4.2.1.1
columns(Jdbc j, Charstring table)
-> Bag of < Charstring column_name, Charstring
type_name>
Calls the method getColumns and projects
some of the information to Amos. See also section 4.2.1.1
primary_keys(Jdbc j, Charstring table) -> Bag of
<Charstring column_name, Charstring constraint_name>
Calls the method getColumns and projects
some of the information to Amos. See also section 4.2.1.1
imported_keys(Jdbc j, Charstring fktable) ->
<Charstring pktable, Charstring pkcolumn,
Charstring fkcolumn>
Calls the method getImportedKey and
projects some of the information to Amos. See also section 4.2.1.1
exported_keys(jdbc j, charstring pktable)
Calls the method getExporedKeys and
projects some of the information to Amos. See also section 4.2.1.1
5.3.3 The ODBC wrapper
The basic Amos II system contains a wrapper for ODBC
relational data sources. It allows transparent access to data in
relational databases through OO views. ODBC data sources in Amos II are
represented by the type Odbc, subtype of Relational.
Odbcoverrides all of the abstract functions of Relational.
5.3.2.1 ODBC Data Access Functions
5.3.2.2 ODBC Data Meta-Data Functions
tables(Odbc j) -> Bag of <Charstring table,
Charstring catalog, Charstring schema, Charstring owner>
See section 4.2.1.1
columns(Odbc j, Charstring table)
-> Bag of < Charstring column_name, Charstring
type_name>
See section 4.2.1.1
primary_keys(Odbc j, Charstring table) -> Bag of
<Charstring column_name, Charstring constraint_name>
See section 4.2.1.1
imported_keys(Odbc j, Charstring fktable) ->
<Charstring pktable, Charstring pkcolumn,
Charstring fkcolumn>
See section 4.2.1.1
exported_keys(odbc j, charstring pktable)
See section 4.2.1.1
6 Physical database design
This section describes
some AmosQL commands for database space and performance tuning.
6.1 Indexing
The system supports indexing on any single argument or result of a
stored function. Indexes can be unique or non-unique. A
unique index disallows storing different values for the indexed
argument or result. The cardinality constraint 'key' of stored
functions (Cardinality Constraints)
is implemented as unique indexes. Thus by default the system puts a
unique index on the first argument of stored functions. That
index can be made non-unique by suffixing the first argument
declaration with the keyword 'nonkey' or to specify 'bag of' for the
result,
in which case a non-unique index is used instead.
For example, in the following function there can be only one
name per person:
create function name(Person)->Charstring as stored;
By contrast, names allow more than one name per person:
create function names(Person p nonkey)->Charstring nm as stored;
Alternative definition of names:
create function names(Person p)->Bag of Charstring nm as stored;
Any other argument or
result declaration can also be suffixed with the
keyword 'key' to indicate the position of a unique index. For
example, the following definition puts a unique
index on nm to prohibit two persons to have the same name:
create function name(Person p)->Charstring nm key as stored;
Indexes can also be explicitly created on any argument or result with a
system procedure create_index:
create_index(Charstring function, Charstring argname, Charstring index_type, Charstring uniqueness)
For example:
create_index("person.name->charstring", "nm", "hash", "unique"); create_index("names", "charstring", "mbtree", "mutiple");
The parameters of create_index are:
function: The name of a stored function. Use the
resolvent name for overloaded functions.
argname: The name of the argument/result parameter
to be indexed. When unambiguous, the names of types of
arguments/results
can also be used here.
index_type: Type kind of index to put on the
argument/result. The supported index types are currently hash indexes
(type hash) and ordered B-tree indexes (type mbtree).
The default index for key/nonkey declarations is hash.
uniqueness: Index uniqueness indicated by unique
for unique indexes and multiple for non-unique indexes.
Indexes are deleted by the system procedure:
drop_index(Charstring functioname, Charstring argname);
The meaning of the parameters are as for function create_index.
There must always be at least one index left on each
stored function and therefore the system will never delete the last
remaining index on a stored function.
To save space it is possible to delete the default index on
the first argument of a stored function. For example, the following
stored function maps parts to unique identifiers through a unique hash
index on the identifier:
create type Part; create function partid(Part p)->Integer id as stored;
partid will have two indexes, one on p and one
on id. To drop the index on p, do the following:
drop_index('partid', 'p');
6.2 Clustering
Functions can be clustered by creating multiple result stored
functions, and then each individual function can be defined as a
derived function.
For example, to cluster the properties name and address of
persons one can define:
create function personprops(Person p) -> <Charstring name,Charstring address> as stored; create function name(Person p) -> Charstring nm as select nm from Charstring a where personprops(p) = <nm,a>; create function address(Person p) -> Charstring a as select a from Charstring nm where personprops(p) = <nm,a>;
Clustering does not improve the execution time performance
significantly in a main-memory DBMS such as Amos II. However,
clustering can decrease the database size considerably.
7 System functions and commands
7.1 Comparison operators
The built-in, infix comparison operators
are:
=(Object x, Object y) -> Boolean (infix operator =) !=(Object x, Object y) -> Boolean (infix operator !=) >(Object x, Object y) -> Boolean (infix operator >) >=(Object x,Object y) -> Boolean (infix operator >=) <(Object x, Object y) -> Boolean (infix operator <) <=(Object x,Object y) -> Boolean (infix operator <=)
All objects can be compared. Strings are compared by characters, lists
by elements, OIDs by identifier numbers. Equality between a bag and
another object denotes set membership of that object. The comparison
functions can, of course, be overloaded for
user defined types.
7.2 Arithmetic functions
abs(Number x) -> Number y div(Number x, Number y) -> Number z (infix operator /) max(Object x, Object y) -> Object z min(Object x, Object y) -> Object z minus(Number x, Number y) -> Number z (infix operator -) mod(Integer x, Integer y) -> Integer z plus(Number x, Number y) -> Number z (infix operator +) times(Number x, Number y) -> Number z (infix operator *) iota(Integer l, Integer u)-> Bag of Integer z sqrt(Number x) -> Number z round(Number x) -> Integer i log10(Number x) -> Real y
iota constructs a bag of integers
between l and u.
For example, to execute n times AmosQL statement stmt do:
for each Integer i where i = iota(1,n) stmt;
7.3 String functions
String concatenation is made using
the '+' operator, e.g.
"ab" + "cd" + "ef";
returns "abcdef"
char_length(Charstring)->Integer
Count number of characters in string.
itoa(Integer)->Charstring
Convert integer to string.
lower(Charstring)->Charstring
Lowercase string.
like(Charstring string, Charstring pattern) -> Boolean
Test if string matches regular expression pattern where '*' matches
sequence of characters and '?' matches single character. For example:
like("abc","??c") returns TRUE
like("ac","a*c") returns TRUE
like("ac","a?c") fails
like_i(Charstring string, Charstring pattern) -> Boolean
Case insensitive like.
substring(Charstring string,Integer start, Integer
end)->Charstring
Extract substring from given character positions. First character has
position 0.
upper(Charstring)->Charstring
Uppercase string.
7.4 Aggregation functions
Some of these system functions are described in Subqueries
and
Aggregation Operators.
Number of objects in bag:
count(Bag of Object o) -> Integer c
Extract elements of collections:
in(Bag of Object b) -> Bag of Object o in(Vector v) -> Bag of Object o
Largest object in bag:
maxagg(Bag of Object x) -> Object y
Smallest number in bag:
minagg(Bag of Object x) -> Object y
Test if bag empty. Logical NOT EXISTS:
notany(Bag of Object o) -> Boolean b
Test if there are any elements in bag. Logical EXISTS:
some(Bag of Object x) -> Boolean b
Sum uniform bags of numbers:
sum(Bag of Number x) -> Number s
Insert x between elements in bag b:
inject(Bag of Object b, Object x) -> Bag of Object r
Make a string of elements in bag b:
concatagg(Bag of Object b)-> Charstring s
for example
concatagg(bag("ab",2,"cd")); => "ab2cd"
concatagg(inject(bag("ab",2,"cd"),",")); => "ab,2,cd"
7.5 Temporal functions
Amos II supports three data types for referencing Time, Timeval,
and Date.
Type Timeval is for specifying absolute time points
including year, month, and time-of-day.
The type Date specifies just year and date, and type Time
specifies time of day. A limitation is that the internal operating
system representation is used for representing Timeval values,
which means that one cannot specify value too far in the past or future.
Constants of type Timeval are written as |year-month-day/hour:minute:second|,
e.g. |1995-11-15/12:51:32|.
Constants of type Time are written as |hour:minute:second|,
e.g. |12:51:32|.
Constants of type Date are written as |year-month-day|,
e.g. |1995-11-15|.
The following functions exist for types Timeval, Time, and Date:
now() -> Timeval
The current absolute time.
time() -> Time
The current time-of-day.
clock() -> Real
The number of seconds since the system was started.
date() -> Date
The current year and date.
timeval(Integer year,Integer month,Integer day,
Integer hour,Integer minute,Integer second)
-> Timeval
Construct Timeval.
time(Integer hour,Integer minute,Integer second) -> Time
Construct Time.
date(Integer year,Integer month,Integer day) -> Date
Construct Date.
time(Timeval) -> Time
Extract Time from Timeval.
date(Timeval) -> Date
Extract Date from Timeval.
date_time_to_timeval(Date, Time) -> Timeval
Combine Date and Time to Timeval.
year(Timeval) -> Integer
Extract year from Timeval.
month(Timeval) -> Integer
Extract month from Timeval.
day(Timeval) -> Integer
Extract day from Timeval.
hour(Timeval) -> Integer
Extract hour from Timeval.
minute(Timeval) -> Integer
Extract minute from Timeval.
second(Timeval) -> Integer
Extract second from Timeval.
year(Date) -> Integer
Extract year from Date.
month(Date) -> Integer
Extract month from Date.
day(Date) -> Integer
Extract day from Date.
hour(Time) -> Integer
Extract hour from Time.
minute(Time) -> Integer
Extract minute from Time.
second(Time) -> Integer
Extract second from Time.
timespan(Timeval, Timeval) -> <Time, Integer usec>
Compute difference in Time and microseconds between
two time values
7.6 Sorting functions
Currently AmosQL has no special syntax for sorting. Instead there are
several that can be used to sort bags or vectors.
Sort bags or vectors with a custom comparison function
This group of sort functions are useful to sort bags or vectors of
either objects or tuples with a custom function supplied by the user.
Either the function object or function named can be supplied. The
comparison function must take two arguments with types compatible with
the elements of the bag or the vector and return a boolean value.
sort(Vector v1, Function compfno) -> Vector
sort(Vector v, Charstring compfn) -> Vector
sort(Bag b, Function compfno) -> Vector
sort(Bag b, Charstring compfn) -> Vector
Example:
create function younger(Person p1, Person p2) -> Boolean as select TRUE where age(p1) < age(p2);
/* Sort all persons sorted by their age */ sort((select p from Person p), 'YOUNGER');
Sort bags with a default comparison function
Whenever bags of literals (tuples of literals) are sorted for
simplicity one may use the natural order of the literal types. Here the
tuple positions are 1-based and order is one of 'inc' or
'dec'. Sort the bag b by the element at position pos
in order specified by order.
sort(Bag b, Integer pos, Charstring order) -> Vector
Sort the bag of tuples b in default ascending order, by the
element at position pos:
sort(Bag b, Integer pos) -> Vector
Sort all elements in a bag b by comparing the whole elements
in
alphabetic order.
sort_tuples(Bag b, Charstring order) -> Vector
Same as above, defaults to ascending order:
sort(Bag b) -> Vector
Note: Surrogate object will be sorted too, but by their
OID which usually has no meaning.
Example:
sort((select i from Integer i where i = iota(0,10)), 1, 'dec');
7.7 Accessing the type system
allfunctions() -> Bag of Function f
allfunctions(Type t)-> Bag of Function f
allobjects () -> Bag of Object o
alltypes() -> Bag of Type t
returns all functions, objects, and types, respectively, in the
database.
allfunctions(Type t)-> Bag of Function
returns all functions where some argument or result is of type t.
subtypes(Type t) -> Bag of Type s
supertypes(Type t) -> Bag of Type s
returns the types immediately below/above type t in the type
hierarchy.
allsupertypes(Type t) -> Bag of Type s
returns all types above t in the type hierarchy.
typesof(Object o) -> Bag of Object t
returns the type set of an object.
typeof(Object o) -> Type t
returns the most specific type of an object.
functionnamed(Charstring nm) -> Function fn
returns the function named nm. Notice that function names are
in upper case.
kindoffunction(Function f) -> Charstring knd
returns the kind of the function f as a string. The result
can be one of 'stored', 'derived', 'foreign' or 'overloaded'.
name(Function fn) -> Charstring nm
returns the name of the function fn.
typenamed(Charstring nm) -> Type t
returns the type named nm. Notice that type names are in
upper case.
name(Type t) -> Charstring nm
returns the name of the type t.
objectname(Object o, Charstring nm) -> Boolean
returns TRUE if the object o has the name nm.
usedwhere(Function f) -> Function c
returns the functions calling the function f.
useswhich(Function f) -> Function c
returns the functions called from the function f.
generic(Function f) -> Function g
returns the generic function of a resolvent.
resolvents(Function g) -> Bag of Function r
returns the resolvents of an overloaded function g.
attributes(Type t) -> Bag of Function g
returns the generic functions having a single argument of type t
and a single result.
methods(Type t) -> Bag of Function r
returns the resolvents having a single argument of type t and
a single result.
resolventtype(Function fn) -> Bag of Type t
returns the types of the first arguments of the resolvents of
function resolvent fn.
arguments(Function r) -> Bag of Vector args
returns sequence (vector) describing arguments of signature
of resolvent r. Each element in the vector is a triplet
(vector) describing one of the arguments with structure {type,name,uniqueness}
where type is the type of the argument, name
is the identifier for the argument, and uniqueness is either
"key" or "nonkey" depending on the declaration of the
argument.. For example,
arguments(resolvents(functionnamed("TIMESPAN")))
--> {{#[OID 371 "TIMEVAL"],"TV1","nonkey"},{#[OID 371
"TIMEVAL"],"TV2","nonkey"}}
results(Function r) -> Bag of Vector res
Analogous to arguments for result (tuple) of function.
cardinality(Type t) -> Integer c
returns the number of object of type t and all its subtypes.
arity(Function f)-> Integer a
returns the number of arguments of function.
width(Function f) -> Integer w
returns the width of the result tuple of function f.
sourcecode(Function f)
-> Bag of Charstring
returns the sourcecode of a function, if available. For generic
functions the sources of all resolvents are returned.
7.8 Extents
The local Amos II database can be regarded as a set of extents.
There are two kinds of extents:
- Type extents represent surrogate objects belonging
to a particular type.
- The deep extent of a type is defined as the set
of
all surrogate objects belonging to that type and to all its descendants
in the type hierarchy. The deep extent of a type is retrieved with:
deep_extent(Type t)->Bag of Object
for example
deep_extent(typenamed("OBJECT"))
returns all surrogate objects stored in the local database.
- The shallow extent of a type is defined as all
surrogate objects belonging only to that type and not to any of its
descendants. The shallow extent is retrieved with:
shallow_extent(Type t)->Bag of Object
for example
shallow_extent(typenamed("OBJECT"));
returns nothing since type Object has no own instances.
- Function extents
represent the state of stored functions. The extent of a function is
the bag of tuples mapping its argument(s) to the corresponding
result(s).
extent(Function fn) -> Bag of Vector
for example
select f,extent(f) from function f where kindoffunction(f)="stored";
Function extent returns the extent of the function fn.
The extent tuples are returned as a bag of vectors.
The function can be any kind of function.
For stored functions the extent is directly stored in the
local database. The example query thus returns the state of all stored
functions. The state of the local database is this state plus the deep
extent of type Object.
The extent is always defined for stored functions and can
also be computed for derived functions through their function
definitions. The extent of a derived function may not be computable, unsafe,
in which case the extent function returns nothing.
The extent of a foreign function is always empty.
The extent of a generic function is the union of the
extents of its resolvents.
7.9 Query optimizer tuning
optmethod(Charstring new) -> Charstring old sets the optimization method used for cost-based optimization in Amos II to the method named new. Three optimization modes for AmosQL queries can be chosen: "ranksort": (default) is fast but not always optimal. "exhaustive": is optimal for small queries only. It uses first dynamic programming for up to 5 seconds and then resorts to ranksort.
optmethod returns the old setting of the optimization
method.
reoptimize(Function f) -> Boolean reoptimize(Charstring f) -> Boolean reoptimizes function named fn.
7.10 Unloading
The data stored in the local database can be unloaded to a text file as
an unload script of AmosQL statements generated by the system.
The state of the database can be restored by loading the unload script
as any other AmosQL file. Unloading is useful for backups and for
transporting data between different systems such as different Amos II
versions
or Amos II systems running on different computers. The unload script
can
be edited in case the schema has changed.
unload(Charstring filename)->Boolean
generates an AmosQL script that restores the current local database
state if loaded into an Amos II peer with the same schema as
the current one.
excluded_fns()->Bag of Function
set of functions that should not be unloaded. Can be udated by
user.
7.11 Miscellaneous
eval(Charstring stmt) -> Object r
evaluates the AmosQL statement stmt.
error(Charstring msg) -> Boolean
prints an error message on the terminal and raises an exception.
Transaction aborted.
print(Object x) -> Boolean
prints x. Always returns true. The
printing is by default to the standard output (console) of
the peer where print is executed but can be redirected to
a file by the function openwritefile:
openwritefile(Charstring filename)->Boolean
openwritefile changes the output stream for print
to the specified filename. The file is closed and output directed to
standard output by calling closewritefile();.
amos_version() -> Charstring
returns string identifying the current version of Amos II.
quit;
quits Amos II.
exit;
returns to the program that called Amos II if the system is embedded in
some other system.
Same as quit; for stand-alone Amos
II.
goovi(); starts the multi-database browser GOOVI[CR01]. This works only under JavaAmos.
The redirect statement reads AmosQL statements from a
file:
redirect-stmt ::= '<' string-constant
For example
< 'person.amosql';
load_AmosQL(Charstring
filename)->Charstring
loads a file containing AmosQL statements.
getenv(Charstring
var)->Charstring value
retrieves the value of OS environment variable var. Generates an error of
variable not set.
The trace and untrace functions are used for
tracing foreign function calls:
trace(Function fno)->Bag of Function r
trace(Charstring fn)->Bag of Function r
untrace(Function fno)->Bag of Function
r
untrace(Charstring fn)->Bag of Function r
If an overloaded functions is (un)traced it means that all its
resolvents are (un)traced. Results are
the foreign functions (un)traced. For example:
Amos 2> trace("SQRT");
#[OID 114 "NUMBER.SQRT->NUMBER"]
Amos 2> sqrt(2.2);
>>#[OID 114 "NUMBER.SQRT->NUMBER"]#(2.2 *)
<<#[OID 114 "NUMBER.SQRT->NUMBER"]#(2.2 1.48324)
1.48324
<<#[OID 114 "NUMBER.SQRT->NUMBER"]#(2.2 -1.48324)
-1.48324
Amos 2>
dp(Object x, Number priority)-> Boolean
For debug printing in where clauses. Prints x on the console.
Always returns true. The placement of dp in the
execution plan is regulated with priority which must be
positive numeric constant. The higher priority the earlier in the
execution plan.
8 References
[CR01]K.Cassel and T.Risch: An
Object-Oriented Multi-Mediator Browser. Presented at 2nd International Workshop on
User Interfaces to Data Intensive Systems, Zürich,
Switzerland, May 31 - June 1, 2001
[ER00] D.Elin and T. Risch: Amos II Java
Interfaces, Uppsala University, 2000.
[FR95] S. Flodin and T. Risch, Processing
Object-Oriented Queries with Invertible Late Bound Functions, Proc.
VLDB Conf., Zürich, Switzerland, 1995.
[FR97] G. Fahl and T. Risch: Query Processing
over Object Views of Relational Data, The VLDB Journal , Vol. 6 No.
4, November 1997, pp 261-281.
[Han03] M.Hansson: Wrapping
External Data by Query Transformations, Uppsala Master Theses in
Computing Science 243, Dept. of Information Technology, Uppsala,
Sweden, 2003.
[JR99a] V.Josifovski, T.Risch: Functional
Query Optimization over Object-Oriented Views for Data Integration
Journal of Intelligent Information Systems (JIIS), Vol.
12, No. 2-3, 1999.
[JR99b] V.Josifovski, T.Risch: Integrating
Heterogeneous Overlapping Databases through Object-Oriented
Transformations. In Proc. 25th Intl. Conf. On Very Large
Databases, Edinburgh, Scotland, September 1999.
[JR02] V.Josifovski, T.Risch: Query
Decomposition for a Distributed Object-Oriented Mediator System . Distributed and
Parallel Databases J., Kluwer, May 2002.
[KJR03] T.Katchaounov, V.Josifovski, and
T.Risch: Scalable
View Expansion in a Peer Mediator System, Proc. 8th
International Conference on
Database Systems for Advanced Applications (DASFAA 2003), Kyoto,
Japan, March 2003.
[LR92] W.Litwin and T.Risch: Main Memory
Oriented Optimization of OO Queries Using Typed Datalog with Foreign
Predicates, IEEE Transactions on Knowledge and Data Engineering,
Vol. 4, No. 6, December 1992 (
http://user.it.uu.se/~udbl/publ/tkde92.pdf).
[Ris00a] T.Risch: Amos II
External Interfaces, Uppsala University, 2000.
[Ris00b]T.Risch: ALisp User's Guide,
Uppsala University, 2000.
[RJK03] T.Risch, V.Josifovski, and
T.Katchaounov: Functional
Data Integration in a Distributed Mediator System, in P.Gray,
L.Kerschberg, P.King, and A.Poulovassilis (eds.): Functional
Approach to Data Management - Modeling, Analyzing and Integrating
Heterogeneous Data, Springer, ISBN 3-540-00375-4, 2003.
|