What Remains Of The Relational Model If We Drop The Relvars?

20 August 2002


Table of Contents

Abstract
Preface
History
The Idea
Details
Mapping Data Models Onto Each Other
Relations And Predicates
Set Of Tuples = Tupset
Typle Type
The Whole Database And Catalog
Uniform Set Of Tuples
Projection
Hide And Seek With Predicates And Relations
Functional Dependencies, Unique Keys And Normalization
Derived Relations
DDL, DML
Questions Open
Expected Benefits
Possible Applications
Implementation Considerations
Performance And Maintenance
Choice Of Lock Objects For Concurrent Access
Future Work
Brief Glossary
References

This article discusses what will remain from the relational data model if we leave out individual relational variables.

The resulting hypothetical data model preserves some properties of the relational one and seems to feature simplified, more user-friendly, incremental data scheme design process and may probably be suitable for home applications and quick prototyping of classical relational databases.

The work is inspired by [C. J. Date].

The author has also been certainly influenced by Datalog query language.

All the statments made in this article about the proposed table-less data model are of a sketchy nature and implications are not clear yet. It is unknown if similar data model has already been discussed or prototyped. If not (or not enough) further research and work and building prototypes may be possible to find out it usefulness/uselessness. The author is more then willing to receive any possible feedback.

[C. J. Date] in Section 3.4 says approxmately

... a relation header represents a predicate

It used to be a grand problem for the author of this paper to understand this until finally he hadn't concluded that it would better be read this as

... a relation header defines a predicate type

(where predicate type is a relation header in application to a predicate)

But meditation over this puzzling abstract has also given burth to another idea.

Intepretation of a relational database assigns meaning to relational variables.

What if we assign meaning to relation headers?

In practice this means dropping relational variables at all and making the whole database one large typeless variable. Data will be stored in the form of tuples each tuple as before being a set of <attribute-name, attribute-value> pairs. attribute-names become global to the database as well as the attribute-name to attribute-domain mapping.

The whole database is just a single variable that has an almost arbitrary tupset as its value.

Just like a relational DBMS our table-less DBMS needs a catalong as a way to query and possibly modify metadata such as

Like a catalog of a relational database that is visible to users as a set of special “system” relvars a catalog in table-less database should probably be visible as a set of “system” attributes that are used to build “system tuples”.

Probably user shouldn't be allowed to store arbitrary tuples with “system” attributes to the database. Probably there should be a predefined fixed number of “system” typle types and “system” attributes shouldn't be allowed to be stored into the database as part of any other tuple types.

If a consistency constraints system able to enforce similar limitations over user-defined attributes will be built into the data model, then the limitaions imposed on “system” attributes would be visible (but not modifiable) via the catalog too.

On the other hand at the current time it is not clear to the author if such limitation should be imposed on system attributes at all. It may prove usefull to attach special meaning to certain tuple tupes but not limit the user in other usages of system attributes except the key uniquness consistency constraints.

Given some table-less database with N attributes defined we can build 2**N-1 [2] subsets of the whole attribute set and respectively build 2**N-1 projections of the contents of the database.

These are 2**N-1 relations that may be considered to be stored in the table-less database. They are a close analog of the relational variables from the relational database world.

These are also the 2**N-1 implicit predicates of our table-less database.

We shall call them stored relations and stored predicates further on.

These stored predicates/relations are naturally named and denoted by their attribute sets. The relation header (also called predicate type in this paper) coming from the relational data model has become the name of table-less relvars and their implicit predicates analog.

These relations/predicates are not independent however, the following rule holds:

{name_1=val_1, name_2=val_2,... name_n=val_n} {name_i_1=val_i_1, name_i_2=val_i_2,... name_i_m=val_i_m}

where {i_1,i_2,...i_m} is any subset of {1,2,...n}.

While in relational data model consistency constraints are associated to relvars in the table-less data model they may be associated to the stored relations/predicates.

All kinds of constraints known for relational databases should be possible, while additional kinds are also foreseen. For example, we may prohibit to store a tuple of the form

loves='John'

without storing the tuple

loves='John', loved='Jane'

These 2**N-1 implicit predicates of the database are also the entity that informal predicates may be associated to.

It should be possible to specify consistency constraints similar to unique key constraints in relational data model. Of course it may be done by a formal predicate of the following sort:

where x1, x2, y and z are variables (implicitly universally quantified) and pc1, pc2, v are constants denoting the names of attributes; this constraints says that in the {pc1,pc2,v} relation pc1,pc2 pair serves as a unique key.

But we may also go for games of another sort. Imagine we have system attributes named 'attr' and 'func_depends_upon' and 'mult_depends_upon'. Imagine we use these attributes to store functional and multiple dependecy meta-information directly into the database (this becomes part of the catalog). Let for example the following information about attributes name 'v1','v2' and 'v3' be stored:

{'attr'='v3','funct_depends_upon'='v2'}
{'attr'='v2','funct_depends_upon'='v1'}

That is we have a dependency chain: v1->v2->v3. Then, whenever user requests to insert into the database tuple of the form

{'v1'='a1', 'v2'='a2', 'v3'='a3'}

two tuples are inserted instead:

{'v1'='a1', 'v2'='a2'}
{'v2'='a2', 'v3'='a3'}

Similarly, the following “joining implication” is considered to be true (and used when evaluating any requests, etc):

(again, x, y, z are universally quantified variables.)

This is what may be called “automatic normalization”. As for any other part of this paper implications of the proposed ideas are note clear yet.

Any of these has not been designed yet for the model, but possibly the DML could have a Datalog style subset [3] [Ceri] for defining derived relations, consistency constraints, and probably database queries.

The author also feels that probably no special DDL will be needed if all the manipulations will be performable by updating the catalog of the database via regular DML operations.

As the original Datalog does not provide any update operations it has not yet been decided on what the whole DML for the model might look like. The question is wide open (as many others too).

Great many thins are unclear with the model. It is anticipated that it may have both underwater (unevident) and above-water (evedent) rocks to wreck any further advance.

What currently is mostly unclear is how to organize inserting and deleting tuples from the database. For example, if there is already a

{'n1'='v1', 'n2'='v2'}

tuple in the database, and a new

{'n1'='v1', 'n2'='v2', 'n3'='v3'}

tuple is being inserted, should the original one be elimiated? What if later the

{'n1'='v1', 'n2'='v2', 'n3'='v3'}

tuple is removed, should then the original tuple be restored?

It is possible that users will find it easier to design data schemes in the new data model because

It is possible that DBMS built according to this technology may be good for data management applications that

Here is a list of some (possibly overlapping) application domains each meeting several these criteria:

  • organizing home or personal archieves

  • catalogues of all kinds, for example annotated file archieves with structural search capabilities

  • taxonomies and classifications - in biology, chemestry, genetics

  • storing scientific experimental data

  • prototyping - one possible scenario is to design classical realtional database by careful examination of table-less database created and tested by application domain experts

This article will be published as https://tagunov.tripod.com/table-less/

Future actions may include:

  • finding out if similar ideas have aready been discussed

  • setting up a forum and/or mail list (where is it best to do so?)

  • public discussion

  • definition of insert/delete behavior

  • sketching a DML

  • setting up an open-source project if there is interest

  • developing a naive prototype

  • inventing a name better then “table-less” both for the data model and possible future prototype

the order of these actions is insignificant ;-)

Currently the author would like to thank everybody for the attention and ask to send him any feedback possible! :-)

Relational Database Terminlogy

After [C. J. Date]:

relation header
relation type

a set of <attribute-name, attribute-domain> pairs.

attribute-domain is INT, CHAR, user-defined-type, etc.

tuple

a set of <attribute-name, attribute-value> pairs.

relational value
relval

a combination of a relation header and a finit set of conforming tuples, called relation body

relational variable
relvar

a durable mutable database entity, such as SQL table, with an associated relation type that can be assigned a relational value of the same type

alternatively - a relational algebra expression (or an equivalent) over other relational variables, VIEW in SQL

three following entitites may be (or always are) associated with a relvar in a relational database:

- similarly named but deeply different in nature.

relational database

a set of relational variables

Additional Relational Database Terminology

This paper uses a couple of non-standard terms:

predicate type

same as relation header but in application to a predicate

implicit predicate

a logical function defines and is defined by the set of argument combinations for which it is true

predicate (as a logical function) and relation are dual views of the same entity

this paper calls the predicate implicitly associated with a relational variable its implicit predicate

this predicate changes as the value of the relvar changes

implicit predicate may be used to formulate formal predicats or to query database in a predicate caluculus-based DML [C. J. Date]

although not done in this paper, if needed one might also speak about an implicit predicate of relational value; then the implicit predicate of relational variable would be the implicit predicate of its current value



[1] SQL is known to have both relational algebra and predicate calculus features.

[2] 2**N-1 because we're not interested in the empty attribute set. The ** operation here denotes raising to power, notation probably coming from BASIC :-)

[3] Except that the author fills more comfortable with having the target on the right of the expression :-)

[4] Grouping attributes by tables is not only about the normalization: imagine that we have functional dependencies a->b, a->c, then we may group them either as

Table ab

ab

Table ac

ac

or as

Table abc

abc

[5] Dealing with “weakly structured data” here stands for storing diverse interrelated data entities each characterizable in a relational fashion but such that the set of attributes significantly changes from one to another and is generally unpredictable.