In research, a new topic sometimes rises, blooms, slows down, and
perhaps dies. I have worked for many years on two such topics, deductive
databases and object databases. These topics never died, but at some point
people would laugh when you submitted a paper on one of them. You felt like
a dinosaur coming straight from before the Web, i.e., from the Stone Age.
I was invited last year to give a talk at a Dagstuhl workshop on Relationships,
Objects, Roles, and Queries in Modern Programming Languages. I discovered a new
community interested in object databases. The success of systems such as DB4o
also demonstrates that object databases are back. I am not surprised: this was
a great idea. (Interestingly, I was not attending that workshop but another one
on workflow, because of some work on Active XML, a language in the Datalog
spirit.)
Deductive databases with Datalog were also a great idea. I am writing
about this here to answer a request from a friend (Dave Maier): “I'm working
with Todd Green on a contribution to the book for David Warren's symposium, on
the history of Datalog. One of the things we want to address is the reasons
behind the resurgence of Datalog. To set the stage for that, we probably need
to talk about why interest declined in Datalog and deductive databases after
the 1980s. We're asking around for insight…”
What caused the decline of Datalog? What is causing its revival?
Warning: I am not sure I am the right person to ask, since I never left
the boat. I have been a constant fan. Ask those who deserted why they stopped
caring about Datalog. Ask the new converts why they are discovering it now.
I can see 3 reasons:
1. The language is a scam.
2. The lack of killer applications.
3. The guru system guys shied away (because of 1-2?).
Let us elaborate on (1): the scam. This goes back to the supposed advantages
of “declarative programming”. The first scam was Prolog: the language is not
really declarative. The second scam was Datalog: it is declarative, but there
is not much you can do with it.
Datalog is simple and beautiful: Horn clauses. We theory guys had a
ball with it. There were beautiful results to obtain, even at the cost of
further simplifications (e.g., restricting to monadic programs so that
containment becomes decidable). But the scam is that if you want to do anything
serious beyond your stupid positive first-order queries, you need more.
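For readers who have not seen one, here is a minimal sketch of what a pure Datalog program of Horn clauses looks like and how it can be evaluated bottom-up. The predicate names `parent` and `ancestor`, the sample facts, and the naive fixpoint loop are illustrative assumptions, not taken from any of the systems discussed here.

```python
# Illustrative Datalog program (two Horn clauses):
#   ancestor(X, Y) :- parent(X, Y).
#   ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).

def ancestors(parent):
    """Naive bottom-up evaluation: apply the rules until nothing new is derived."""
    ancestor = set(parent)  # first rule: every parent fact is an ancestor fact
    while True:
        # second rule: join parent(X, Y) with ancestor(Y, Z) on Y
        derived = {(x, z)
                   for (x, y) in parent
                   for (y2, z) in ancestor
                   if y == y2}
        new = derived - ancestor
        if not new:          # fixpoint reached
            return ancestor
        ancestor |= new

# Hypothetical sample facts.
parent_facts = {("ann", "bob"), ("bob", "cid"), ("cid", "dee")}
result = ancestors(parent_facts)
print(sorted(result))
```

Everything here is positive: there is no negation, no update, no aggregation. That is exactly the simplicity, and the limitation, discussed above.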
There was no fix that I know of for Prolog. There were fixes for
Datalog: extend the language. And this was done over the last 30 years:
Updates [e.g. SA and Vianu], Skolem [e.g. Gottlob], Constraints [e.g. Revesz],
Time [e.g. Chomicki], Distribution and Trees [e.g. SA in ActiveXML],
Aggregations [e.g. Consens, Mendelzon], Delegation [e.g. SA in Webdamlog]. I am
sure I am missing some.
Now we get to (2): the lack of killer apps. The main argument for
Datalog was the computation of transitive closure. This was stupid. Transitive
closure could easily be expressed in supported versions of SQL. The odd thing
is that although the language was simplistic, the killer apps had to be
intense: they had to be such that they could not easily be supported by the
good old relational systems. The jury is still out, but we now have candidates:
Declarative networking [e.g. Loo, Hellerstein et al.], Data integration [e.g.
Clio, Orchestra], Program verification [e.g. Semmle], Data extraction from HTML
[e.g. Gottlob, Lixto], Knowledge representation [e.g. Gottlob], Business
artifacts and workflows [e.g. SA, ActiveXML], Web data management [e.g. SA,
Webdamlog]…
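To illustrate why transitive closure alone was a weak selling point, here is a minimal sketch that computes it in plain SQL with a recursive common table expression, run through Python's standard `sqlite3` module. The table name `edge`, the CTE name `reach`, and the sample rows are assumptions made for the example.

```python
import sqlite3

# In-memory database with a small, hypothetical edge relation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edge (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO edge VALUES (?, ?)",
    [("a", "b"), ("b", "c"), ("c", "d")],
)

# Transitive closure as a recursive CTE. UNION (rather than UNION ALL)
# removes duplicates, so the recursion also terminates on cyclic graphs.
rows = conn.execute(
    """
    WITH RECURSIVE reach(src, dst) AS (
        SELECT src, dst FROM edge
        UNION
        SELECT e.src, r.dst
        FROM edge AS e JOIN reach AS r ON e.dst = r.src
    )
    SELECT src, dst FROM reach ORDER BY src, dst
    """
).fetchall()
print(rows)
```

The query is the SQL counterpart of a two-rule Datalog program, which is the point: an ordinary relational engine handles this case, so a convincing killer app has to ask for more.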
Finally, let us now consider (3): the guru system guys. These guys were
often working or at least consulting for relational vendors. They were quick
to denigrate any break from the good old SQL engines. They did the same for
object databases. It is interesting to see that some of the renewed interest in
Datalog engines comes from the work of Hellerstein: a top system guy, who once
wrote with Stonebraker that Datalog was trash, is now developing a Datalog
system. This is nothing but Oedipus killing his father and bedding his mother.
Now, beyond the true pleasure that fans like me take in reading the mea culpa
of Hellerstein, it is important to observe that Joe Hellerstein (1) used many
known extensions of pure Datalog in his systems and (2) promoted his work with
beautiful applications such as networking, in the thesis of Boon Thau Loo.
In Webdamlog, we propose data management on the Web as the killer app. In
brief, the reasons for that:
1. The Web is a graph, so recursion is built in: you ask someone, who asks
someone, who asks you.
2. Web users don’t want to write in a programming language. Declarative
languages seem the right way to go.
But of course, Datalog is too simplistic. This is why I spent years
studying extensions of Datalog for Web data management.
Wouldn’t it be cool if Datalog (properly extended) were the data
language of the Web?
At the end of the article, the right link for Webdamlog is http://webdam.inria.fr/.
Thanks. Serge