Thread: Has there been any discussion of custom dictionaries being defined in the database?
Has there been any discussion of custom dictionaries being defined in the database?
From
Morris de Oryx
Date:
I've been experimenting with the FTS features in Postgres over the past few days. Mind blown.
We're deployed on RDS, which does not give you any file system to access. I'd love to be able to create a custom thesaurus dictionary for our situation, which seems like it is impossible in a setup like ours.
Has there been any discussion of making dictionary configuration files accessible via a dictionary table instead of a physical, structured disk file? Or, alternatively, something that could be accessed remotely/externally as a URL or FDW?
Thanks for any comments.
Re: Has there been any discussion of custom dictionaries being defined in the database?
From
Tom Lane
Date:
Morris de Oryx <morrisdeoryx@gmail.com> writes:
> We're deployed on RDS, which does not give you any file system to access.
> I'd love to be able to create a custom thesaurus dictionary for our
> situation, which seems like it is impossible in a setup like ours.
> Has there been any discussion of making dictionary configuration files
> accessible via a dictionary table instead of a physical, structured disk
> file? Or, alternatively, something that could be accessed
> remotely/externally as a URL or FDW?

Nope.

TBH, I don't find this case terribly compelling. You should be beating up RDS for not letting you configure your DB the way you want.

regards, tom lane
Re: Has there been any discussion of custom dictionaries being defined in the database?
From
Morris de Oryx
Date:
Fair.
Given that Amazon is bragging this week about turning off Oracle, it seems like they could kick some resources towards contributing something to the Postgres project. With that in mind, is the idea of defining dictionaries within a table somehow meritless, or unexpectedly difficult?
Re: Has there been any discussion of custom dictionaries being defined in the database?
From
Tom Lane
Date:
Morris de Oryx <morrisdeoryx@gmail.com> writes:
> Given that Amazon is bragging this week about turning off Oracle, it seems
> like they could kick some resources towards contributing something to the
> Postgres project. With that in mind, is the idea of defining dictionaries
> within a table somehow meritless, or unexpectedly difficult?

Well, it'd just be totally different. I don't think anybody cares to provide two separate definitions of common dictionaries (which'd have to somehow be kept in sync).

As for why we did it with external text files in the first place --- for at least some of the dictionary types, the point is that you can drop in data files that are available from upstream sources, without any modification. Getting the same info into a table would require some nonzero amount of data transformation.

Having said that ... in the end a dictionary is really just a set of functions implementing the dictionary API; where they get their data from is their business. So in theory you could roll your own dictionary that gets its data out of a table. But the dictionary API would be pretty hard to implement except in C, and I bet RDS doesn't let you install your own C functions either :-(

regards, tom lane
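To make the "set of functions" point concrete, here is a rough Python sketch of the general shape: one step that loads data and one that maps a token to lexemes. It is only an illustration of the contract, not PostgreSQL code; `SynonymDict`, `init_from_rows`, `rows_from_file_text` and the sample words are hypothetical names, and a real dictionary would be written in C against the server's dictionary API.

```python
from typing import Dict, Iterable, List, Optional, Tuple


class SynonymDict:
    """Toy stand-in for a text search dictionary (hypothetical, not the C API)."""

    def __init__(self) -> None:
        self._map: Dict[str, str] = {}

    def init_from_rows(self, rows: Iterable[Tuple[str, str]]) -> None:
        # "init" step: rows may come from a file or from a table;
        # the lookup code below does not care which.
        for word, replacement in rows:
            self._map[word.lower()] = replacement.lower()

    def lexize(self, token: str) -> Optional[List[str]]:
        # "lexize" step: return the lexemes for a known token,
        # or None to signal that this dictionary does not recognize it.
        hit = self._map.get(token.lower())
        return [hit] if hit is not None else None


def rows_from_file_text(text: str) -> List[Tuple[str, str]]:
    # Parse whitespace-separated "word replacement" lines, skipping anything else.
    rows: List[Tuple[str, str]] = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            rows.append((parts[0], parts[1]))
    return rows
```

The same `lexize` works whether `init_from_rows` is fed parsed file lines or `(word, replacement)` tuples fetched from a table, which is the sense in which the data source is the dictionary's own business.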
Re: Has there been any discussion of custom dictionaries being defined in the database?
From
Morris de Oryx
Date:
Nope, no custom C installs. RDS is super convenient in many ways, but also limited. You can't, for example, run TimeScale, install RUM indexes (if those still work), or any novel plugins. And you can't do anything at all requiring a file reference. The backup features are outstanding. But, yeah, sometimes frustrating.
Re: Has there been any discussion of custom dictionaries being defined in the database?
From
Karsten Hilbert
Date:
On Thu, Oct 17, 2019 at 11:52:39AM +0200, Tom Lane wrote:
> Morris de Oryx <morrisdeoryx@gmail.com> writes:
> > Given that Amazon is bragging this week about turning off Oracle, it seems
> > like they could kick some resources towards contributing something to the
> > Postgres project. With that in mind, is the idea of defining dictionaries
> > within a table somehow meritless, or unexpectedly difficult?
>
> Well, it'd just be totally different. I don't think anybody cares to
> provide two separate definitions of common dictionaries (which'd have to
> somehow be kept in sync).

Might crafty use of server side

    COPY TO ... PROGRAM ...

enable OP to drop in dictionary data files as needed?

Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Re: Has there been any discussion of custom dictionaries being defined in the database?
From
Tomas Vondra
Date:
On Thu, Oct 17, 2019 at 11:52:39AM +0200, Tom Lane wrote:
>Morris de Oryx <morrisdeoryx@gmail.com> writes:
>> Given that Amazon is bragging this week about turning off Oracle, it seems
>> like they could kick some resources towards contributing something to the
>> Postgres project. With that in mind, is the idea of defining dictionaries
>> within a table somehow meritless, or unexpectedly difficult?
>
>Well, it'd just be totally different. I don't think anybody cares to
>provide two separate definitions of common dictionaries (which'd have to
>somehow be kept in sync).
>
>As for why we did it with external text files in the first place ---
>for at least some of the dictionary types, the point is that you can
>drop in data files that are available from upstream sources, without any
>modification. Getting the same info into a table would require some
>nonzero amount of data transformation.
>

IMHO being able to load dictionaries from a table would be quite useful, and not just because of RDS. For example, it's not entirely true we're just using the upstream dictionaries verbatim - it's quite common to add new words, particularly in specialized fields. That's way easier when you can do that through a table and not through a file.

>Having said that ... in the end a dictionary is really just a set of
>functions implementing the dictionary API; where they get their data
>from is their business. So in theory you could roll your own
>dictionary that gets its data out of a table. But the dictionary API
>would be pretty hard to implement except in C, and I bet RDS doesn't
>let you install your own C functions either :-(
>

Not sure. Of course, if we expect the dictionary to work just like the ispell one, with preprocessing the dictionary into shmem, then that requires C. I don't think that's entirely necessary, though - we could use the table directly. Yes, that would be slower, but maybe it'd be sufficient.

But I think the idea is ultimately that we'd implement a new dict type in core, and people would just specify which table to load data from.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
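For readers wondering what "use the table directly" might look like, the following is a minimal Python sketch under stated assumptions: an in-memory SQLite database stands in for the real server, and the `custom_dict` table, its columns, and the sample medical terms are all hypothetical. It does one lookup per token, so it reflects the slower-but-simple trade-off rather than any existing or proposed PostgreSQL dictionary type.

```python
import sqlite3
from typing import List


def make_demo_db() -> sqlite3.Connection:
    # In-memory SQLite is only a stand-in; table and column names are made up.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE custom_dict (word TEXT PRIMARY KEY, lexeme TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO custom_dict (word, lexeme) VALUES (?, ?)",
        [
            ("mi", "myocardial_infarction"),
            ("heart-attack", "myocardial_infarction"),
        ],
    )
    conn.commit()
    return conn


def normalize(conn: sqlite3.Connection, text: str) -> List[str]:
    # One query per token: slower than a preloaded in-memory structure,
    # but adding a specialized word is just an INSERT into the table.
    out: List[str] = []
    for token in text.lower().split():
        row = conn.execute(
            "SELECT lexeme FROM custom_dict WHERE word = ?", (token,)
        ).fetchone()
        out.append(row[0] if row else token)
    return out
```

With this layout, adding a new domain term means inserting a row, and the next call to `normalize` picks it up without touching any file on the server.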