Re: Bit datatype performance? - Mailing list pgsql-general

From Antonio Vieiro
Subject Re: Bit datatype performance?
Date
Msg-id 4E70CF42.1070703@antonioshome.net
Whole thread Raw
In response to Re: Bit datatype performance?  (Radosław Smogura <rsmogura@softperience.eu>)
Responses Re: Bit datatype performance?
Re: Bit datatype performance?
List pgsql-general
Hi again,

Thanks for the tip. In fact I was thinking of creating an index on the
bitmask, so I could use:

... where t.bits = :mymask

directly, avoiding a full table scan. I assume this is possible
(indexing bit and comparing bits), isn't it?

Thanks,
Antonio

El 14/09/11 15:58, Radosław Smogura escribió:
> On Wed, 14 Sep 2011 12:00:35 +0200, Antonio Vieiro wrote:
>> Hi all,
>>
>> One of my entities 'E' may be 'tagged' with an arbitrary set of 256
>> tags 'T'.
>>
>> A first approach could be to add a M:N relationship between 'E' and 'T'.
>>
>> A second way to do this could be to add a BIT(256) datatype to 'E',
>> setting bits to '1' if the entity is tagged with each one of the 256
>> tags (i.e. using a 'bitmask' on the set of tags).
>>
>> Since querying entities 'E' with a certain set of tags 'T' must be
>> very fast I was wondering if the second approach would be faster. What
>> do you think?
>>
>> Thanks for any hints,
>> Antonio
>
> I assume each entity may have one or more different tags.
>
> Actually performing test like
> ... where (t.bits & :mymask) = :mymask
> should be quite fast and faster then creating additional relations, but
> only if it's highly probable that your query will almost always scan
> whole table.
>
> The advantage of indexes is that the index is used 1st and tail (slower)
> parts of query will always get "subset" of table. In bitset, You will
> probably scan whole table.
>
> So I think, you should do some performance test for large number of
> data, and compare both ways. I think bitset will be fast for really
> small data, but M:N relations may be faster for really large data sets.
>
> You need to measure size of your database too, in M:N case with 256 tags
> it may be quite large.


pgsql-general by date:

Previous
From: Ivan Voras
Date:
Subject: Re: PostgreSQL benchmarked on XFS vs ZFS vs btrfs vs ext4
Next
From: Rich Shepard
Date:
Subject: Identifying Reason for Column Name Returned by SELECT