Experimenting with eXist-DB on Docker

I’ve been using eXist-DB for some time in the Dicionário da Academia de Ciências de Lisboa project, in which the dictionary is being converted from PDF into TEI so that a new digital version can be released soon.

Recently I needed to update the server where eXist-DB was running, and decided to use a dockerized version of it. Although that may make things a little slower (I am not really sure), it makes the setup easier to replicate: I can now easily run the dictionary database on my laptop or on the server, using the same code.

I am using the default latest version of the eXist-DB Docker image. The only difference is that, because my XQuery code uses FunctX functions, I needed to install that module. Thus, my Dockerfile consists of:

FROM existdb/existdb:latest

ADD http://exist-db.org/exist/apps/public-repo/public/functx-1.0.1.xar /exist/autodeploy/

I have the data and the application in a Git repository, as exported by the eXist-DB backup tool. So, instead of baking that data into the Docker image, I decided to create a simple script to import it. My docker-compose.yml file consists of:

version: '3.3'
services:
  exist:
    build: ./dacl/docker
    container_name: exist
    ports:
        - 8080:8080
        - 8443:8443
    volumes:
        - ./data:/exist/data
        - ./config:/exist/config
        - ./dacl:/import
        - ./outdir:/export

The relevant parts:

  • build is the path to the folder containing the Dockerfile (dacl/docker)
  • Ports 8080 and 8443 are used by eXist-DB, and I am just forwarding them to the host
  • Four volumes are mounted: data stores the binary database files, config stores the configuration files, and the import and export volumes are used to import data and to export it for backup.

To import all the data into the database I use a shell script. The first five commands import the collections. The last two run auxiliary XQuery scripts: the first re-indexes the application data, and the second creates the proper groups and users and assigns passwords.

First, the import.sh script is:

#!/usr/bin/env bash

docker-compose exec -T exist java org.exist.start.Main backup -u admin -r /import/db/academia/__contents__.xml
docker-compose exec -T exist java org.exist.start.Main backup -u admin -r /import/db/apps/academia/__contents__.xml

docker-compose exec -T exist java org.exist.start.Main backup -u admin -r /import/db/academia-2001/__contents__.xml
docker-compose exec -T exist java org.exist.start.Main backup -u admin -r /import/db/apps/academia-2001/__contents__.xml

docker-compose exec -T exist java org.exist.start.Main backup -u admin -r /import/db/schemas/__contents__.xml

docker-compose exec -T exist java org.exist.start.Main client -u admin -F /import/xq/repair.xq
docker-compose exec -T exist java org.exist.start.Main client -u admin -F /import/xq/users.xq

Note that the XQuery scripts are kept in the same folder that is mounted as the import volume. Otherwise, you would not be able to access them from inside the container.

The repair XQuery script holds this code:

import module namespace repair="http://exist-db.org/xquery/repo/repair"
at "resource:org/exist/xquery/modules/expathrepo/repair.xql";
repair:clean-all(),
repair:repair()

And finally, the users XQuery script has the following code:

sm:passwd('admin','admin-password'),
sm:create-group('dacl'),
sm:create-account('ana','ana-password','dacl'),
sm:chgrp(xs:anyURI('/db/academia'), 'dacl'),
sm:chgrp(xs:anyURI('/db/academia-2001'), 'dacl'),
sm:chgrp(xs:anyURI('/db/apps/academia-2001'), 'dacl'),
sm:chgrp(xs:anyURI('/db/apps/academia'), 'dacl'),
sm:chmod(xs:anyURI('/db/academia-2001'), 'rwxrwx---'),
sm:chmod(xs:anyURI('/db/academia'), 'rwxrwx---'),
sm:chmod(xs:anyURI('/db/apps/academia-2001'), 'rwxrwx---'),
sm:chmod(xs:anyURI('/db/apps/academia'), 'rwxrwx---')

Also, in case it is useful, this is my backup.sh script:

docker-compose exec -T exist java org.exist.start.Main backup -u admin -p admin.entrada -b /db/academia -d /export
docker-compose exec -T exist java org.exist.start.Main backup -u admin -p admin.entrada -b /db/academia-2001 -d /export
docker-compose exec -T exist java org.exist.start.Main backup -u admin -p admin.entrada -b /db/apps/academia -d /export
docker-compose exec -T exist java org.exist.start.Main backup -u admin -p admin.entrada -b /db/apps/academia-2001 -d /export
docker-compose exec -T exist java org.exist.start.Main backup -u admin -p admin.entrada -b /db/schemas -d /export

rsync -aASPvz --delete-after outdir/db/ dacl/db/

DATE=$(date +%Y%m%d)
cd dacl && git commit -a -m "Backup $DATE" && git push origin v5

Of course this is not rocket science, and this approach might have a lot of problems, but on the other hand, it might come in handy for someone.

IGI Global: the clown of scientific publishing?

I am not sure how I agreed to write a chapter for a book to be published by IGI Global. Probably the fact that it is being edited by a friend, who personally invited me to send a proposal, made the difference.

I have my contribution ready, but I am starting to think about just forgetting it. Why? Because IGI Global is, surely, kidding me. They have a set of rules for their contributions, and somewhere in the middle, they say, and I quote:

LaTex. LaTex files are NOT accepted because they are not compatible with IGI Global’s typesetting program. As an alternative, we require that you use MathType (see “Equations” below).

First, dear IGI, when it is not possible to use the fancy form of LaTeX, the last X should be in uppercase. Second, if they are not compatible with your typesetting program, that is probably because you are using the wrong typesetting program. And, no, LaTeX is not useful only for math. Please learn what LaTeX is, try to use it, and then evaluate whether it is useful for your editorial requirements.

Third (or fourth, I think I will stop counting), look at other publishing houses. Who are your competitors? Springer, probably. Do you know they use LaTeX? Yeah, they do! And they produce good quality documents. Of course they do, they use LaTeX. And no, I have an IGI book, and no, your books do not have typesetting quality. I am sorry.

Finally, since I will have to lose some hours formatting the chapter, if you want us to use Microsoft Word, please create a template in Word. Do you know what that is? Do you know how it can be useful? Do you? I am sure you don’t.

JavaScript Reversi

Over the last two days I worked on a Reversi implementation, just to learn how the minimax algorithm works. To make it easier to share, and to avoid GUI toolkit dependencies, I used HTML (a simple page with an 8×8 table), three images (empty, black and white cells), a CSS file that styles empty and occupied cells and highlights the possible moves, and a few JavaScript files: jQuery, a reversi-board.js file that handles the board as an object, a reversi.js file that handles the interface between the board and the HTML page, and finally a minimax.js file that implements the minimax algorithm.

At the moment the game is playable, and not too slow. The code can be optimized to make it faster. In the next couple of days I might do that.

Also, the AI code can be improved. On the one hand, the minimax algorithm can look further ahead (only three moves at the moment); on the other, the board evaluation function can be improved as well.
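For readers who have not met minimax before, here is a minimal sketch of a depth-limited search for Reversi. It is not the code behind the game: the flat 64-cell board, the naive disc-difference evaluation, and every function name (initialBoard, flips, legalMoves, play, evaluate, minimax) are assumptions made just for this illustration.

```javascript
// Hypothetical sketch, not the code from minimax.js.
// Board: flat array of 64 cells; 0 = empty, 1 = black, -1 = white.
const SIZE = 8;
const DIRECTIONS = [[-1, -1], [-1, 0], [-1, 1], [0, -1], [0, 1], [1, -1], [1, 0], [1, 1]];

function initialBoard() {
  const board = new Array(SIZE * SIZE).fill(0);
  board[3 * SIZE + 3] = -1; board[4 * SIZE + 4] = -1; // white discs
  board[3 * SIZE + 4] = 1;  board[4 * SIZE + 3] = 1;  // black discs
  return board;
}

// Indices that would be flipped if `player` played at (row, col).
// An empty result means the move is illegal.
function flips(board, player, row, col) {
  if (board[row * SIZE + col] !== 0) return [];
  const flipped = [];
  for (const [dr, dc] of DIRECTIONS) {
    const line = [];
    let r = row + dr, c = col + dc;
    // Walk over a run of opponent discs.
    while (r >= 0 && r < SIZE && c >= 0 && c < SIZE && board[r * SIZE + c] === -player) {
      line.push(r * SIZE + c);
      r += dr; c += dc;
    }
    // The run only counts if it is closed by one of the player's own discs.
    const inBounds = r >= 0 && r < SIZE && c >= 0 && c < SIZE;
    if (line.length > 0 && inBounds && board[r * SIZE + c] === player) flipped.push(...line);
  }
  return flipped;
}

function legalMoves(board, player) {
  const moves = [];
  for (let i = 0; i < SIZE * SIZE; i++) {
    const flipped = flips(board, player, Math.floor(i / SIZE), i % SIZE);
    if (flipped.length > 0) moves.push({ index: i, flipped });
  }
  return moves;
}

// Returns a new board; the input board is left untouched.
function play(board, player, move) {
  const next = board.slice();
  next[move.index] = player;
  for (const i of move.flipped) next[i] = player;
  return next;
}

// Naive evaluation: disc difference from black's point of view.
function evaluate(board) {
  return board.reduce((sum, cell) => sum + cell, 0);
}

// Depth-limited minimax: black (1) maximizes, white (-1) minimizes.
function minimax(board, player, depth) {
  if (depth === 0) return { score: evaluate(board), move: null };
  const moves = legalMoves(board, player);
  if (moves.length === 0) {
    // Game over if the opponent cannot move either; otherwise the player passes.
    if (legalMoves(board, -player).length === 0) return { score: evaluate(board), move: null };
    return { score: minimax(board, -player, depth - 1).score, move: null };
  }
  let best = { score: player === 1 ? -Infinity : Infinity, move: null };
  for (const move of moves) {
    const { score } = minimax(play(board, player, move), -player, depth - 1);
    if (player === 1 ? score > best.score : score < best.score) best = { score, move };
  }
  return best;
}
```

Calling minimax(initialBoard(), 1, 3) looks three moves ahead for black, which is the depth mentioned above. The two improvements would plug in here: a larger depth argument (probably together with alpha-beta pruning to keep it fast), and a smarter evaluate function, for instance one that values corners more than other cells.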

If you wish to play, go ahead: http://eremita.di.uminho.pt/~ambs/reversi

Symposium on Languages, Applications and Technologies

Slate 2012

I am the Chair of the Symposium on Languages, Applications and Technologies. This is the first year the conference takes place, although some of its tracks have some history. There are three main tracks: Human-Human Languages processing (processing of natural languages), Human-Computer Languages processing (processing of programming languages) and Computer-Computer Languages processing (communication languages, serialization languages, etc.).

The conference proceedings will be published in OASIcs, and a printed version will be available for the participants. A keynote by Alexander Paar is already confirmed. His presentation is titled From Program Execution to Automatic Reasoning: Integrating Ontologies into Programming Languages.

The conference call for papers is still open (until March 28th).

Classification Systems

Yesterday I gave an invited class on Classification Systems. I prepared some slides that are available on Slideshare. It is an opinionated view on four different types of classification: folksonomies, taxonomies, thesauri and ontologies. The online version lost some quality, but you can download the PDF version. I hope this is interesting to someone.