Re: Searchable chess positions in a Postgress DB - Mailing list pgsql-general

From Cédric Villemain
Subject Re: Searchable chess positions in a Postgress DB
Date
Msg-id 201204111609.00047.cedric@2ndquadrant.com
Whole thread Raw
In response to Searchable chess positions in a Postgress DB  (Sidney Cadot <sidney@jigsaw.nl>)
List pgsql-general
Le mercredi 11 avril 2012 09:15:59, Sidney Cadot a écrit :
> Dear all,
>
> As a hobby project, I am toying around with a database containing
> about 5 million chess games. On average, these games have about 80
> positions (~ 40 moves by both black and white), which means there are
> about 400 million chess positions in there.
>
> I have written code to extract these positions, and now I want to put
> them into a Postgres database. Specifically, I want to do this in a
> way that allows *fast* lookups of positions, e.g. "give me all
> positions that have a White King on c4 and either a Black Bishop or
> White Knight on f7".
>
> Currently, my "Positions" table looks like this:
>
>       Column       |  Type   | Modifiers
> -------------------+---------+-----------
>  gameindex         | integer | not null
>  plyindex          | integer | not null
>  pseudofenboard    | text    | not null
>  fenside           | text    | not null
>  fencastling       | text    | not null
>  fenenpassant      | text    | not null
>  possiblemovecount | integer | not null
>  isincheck         | boolean | not null
> Indexes:
>     "positions_pkey" PRIMARY KEY, btree (gameindex, plyindex)
> Foreign-key constraints:
>     "positions_gameindex_fkey" FOREIGN KEY (gameindex) REFERENCES
> games(gameindex)
>
> The "PseudoFenBoard" field currently holds a string describing the
> position. For example, the starting position of chess looks like this:
>
> "rnbqkbnr/pppppppp/________/________/________/________/PPPPPPPP/RNBQKBNR"
>
> This design allows me to formulate the kind of positional queries that
> I want (by using regular expression matching), but executing them will
> involve a slow, linear traversal of the 400M table rows, which is not
> desirable.
>
> I am toying around with the ugly idea to make a "Positions" table that
> has a single field for each of the squares, e.g.
>
> CREATE TABLE Position2 (
>     GameIndex INTEGER NOT NULL,
>     PlyIndex  INTEGER NOT NULL,
>     a1        "char"  NOT NULL,
>     a2        "char"  NOT NULL,
>                                     -- (60 fields defs omitted)
>     h7        "char"  NOT NULL,
>     h8        "char"  NOT NULL
> );
>
> This would allow the creation of indices on each of the 64 fields
> separately, which should help to achieve near-instantaneous position
> query performance, especially after gathering proper statistics for
> all the field-specific indices.
>
> I realize that this design is quite ugly, so I would be interested to
> hear if there are nicer alternatives that can perform equally well.
>
> Also, above I use the 1-byte "char" type. Is this the only type in
> PostGres that is guaranteed to be just a single byte, or are there
> better alternatives? A 13-state enum would be best (listing the 6
> white pieces, 6 black pieces, and 'empty' states for every square on
> the board) but as I understand from the documentation, enums always up
> take 4 bytes per entry.
>
> Any ideas for improvement would be greatly appreciated.

I'll go test with BloomFiltering and multiple columns.
http://www.sai.msu.su/~megera/wiki/bloom

--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation

pgsql-general by date:

Previous
From: Kuba
Date:
Subject: OS authentication on Windows
Next
From: Janning Vygen
Date:
Subject: Re: PANIC: corrupted item pointer