Thread: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
3. The ability to embed PG to run in an automatic, quiet manner as part of something else. I know about SQLite, but it's extremely limited to the point of being virtually useless IMO, which is why I cannot use that for anything nontrivial
I can’t for the life of me imagine how you arrived at this. SQLite is very capable indeed. Its dialect of SQL is (deliberately) very similar to Postgres, featuring such niceties as recursive CTEs and window functions, and it can handle heavy use and multi-terabyte databases if you need (cf bedrockdb).
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
Regarding self-tuning, it was actually part of the original vision for relational databases that they would do that, but I’m not aware of any modern SQL database that does, although there are third-party tools that will eg offer index suggestions (I don’t know which is best, so I won’t propose any).
4. There is no built-in means to have PG manage (or even suggest) indexes on its own. Trying to figure out what indexes to create/delete/fine-tune, and determine all the extremely complex rules for this art (yes, I just called index management an *art*, because it is!), is just utterly hopeless to me. It never gets any easier. Not even after many years. It's the by far worst part of databases to me (combined with point five). Having to use third-party solutions ensures that it isn't done in practice, at least for me. I don't trust, nor do I want to deal with, external software and extensions in my databases. I still have nightmares from PostGIS, which I only keep around, angrily, out of absolute necessity. I fundamentally don't like third-party add-ons to things, but want the core product to properly support things. Besides, this (adding/managing indexes) is not even some niche/obscure use-case, but something which is crucial for basically any nontrivial database of any kind!
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
I can’t for the life of me imagine how you arrived at this. SQLite is very capable indeed. >Its dialect of SQL is (deliberately) very similar to Postgres, featuring such niceties as >recursive CTEs and window functions, and it can handle heavy use and multi-terabyte > databases if you need (cf bedrockdb). But it is still a single user database in the sense that concurrent access to a table is not handled well. I use it for single user embedded applications
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
tutiluren@tutanota.com writes: > 1. All non-ANSI characters are turned into "?"s for application_name. Yeah, that's hard to do much with unfortunately. We cannot assume that all databases in an installation share the same encoding, so for globally visible strings like application_name, the only safe solution is to restrict them to ASCII. On the other hand, the very same thing could be said of database names and role names, yet we have never worried much about whether those were encoding-safe when viewed from databases with different encodings, nor have there been many complaints about the theoretical unsafety. So maybe this is just overly anal-retentive and we should drop the restriction, or at least pass through data that doesn't appear to be invalidly encoded. > 2. pg_dump misinterprets non-ANSI values for the "--exclude-*" options > (at least the --exclude-table-data one, which is the one I've tested) on > Windows, resulting in it being impossible to make more "sophisticated" > backups of PostgreSQL databases; it's either all or nothing. TBH, I'm going to throw that back on you as probably pilot error. There's no reason for such cases not to work if you're running the terminal window in the same encoding that pg_dump thinks it's using. > I spent a lot of time and efforts experimenting with and asking about > this, but eventually gave up and concluded that it was yet another bug > in an open source project "only" on Windows with no real/pressing > interest in fixing it. Yeah, I remember that discussion. It being open source cuts two ways: problems get fixed by people who have the motivation to find a workable fix. Not being a Windows user, I don't particularly care about whether this case is broken or not, and even if I cared more, I do not have the resources to figure it out myself. There are other people around the project who do Windows, but you evidently have not managed to persuade them that this is something they should spend time on, either. > 3. The ability to embed PG to run in an automatic, quiet manner as part > of something else. That would be a packaging problem for the "something else" to figure out, no? There are enough platform-specific issues about "automatic" that I don't see it as very practical for the core Postgres project to take on, in any case. I'm aware of some packagings, such as Postgres.app for macOS (https://postgresapp.com), that might be closer to what you are looking for than the core project ever could be. > 4. There is no built-in means to have PG manage (or even suggest) > indexes on its own. Indeed, and that does not seem to me like something that ought to be built in. There is at least one "index advisor" plug-in floating around, which perhaps could help you. > 5. Ever since my early days with PG in the mid-2000s, I've tried > numerous times to read the manual, wikis and comments for the > configuration files, specifically the performance directives, and asked > many, many times for help about that, yet never been able to figure out > what they want me to enter for all the numerous options. At this point, > it isn't me being lazy/stupid; it's objectively very difficult to > understand all of that. We do the best we can as far as the documentation goes; if you have concrete suggestions about how to improve that, we'll surely listen. > The practical end result of this is that I've always gone back to using > the untouched default configuration file (except for the logging-related > options), which, especially in the past on FreeBSD, *severely* crippled > my PG database to not even come close to taking advantage of the full > power of the hardware. You seem to be wishing that the default configuration was designed to try to commandeer the whole machine. It's deliberate project policy that that not be so --- and given that a couple of bullet points up, you were arguing for Postgres to be easily usable as an invisible part of some other app, it doesn't seem like you really want it to be so either. There is a lot of info out there about Postgres tuning, and yes some of it is contradictory, because one size doesn't fit all. People have different goals about how they want the system to act, and the proper settings may vary across platforms or PG versions, and there's a fair amount of just plain different opinions. But I think you may be overcomplicating it. IMO there's not that much you have to adjust to start with --- maybe just increase shared_buffers and possibly work_mem --- and then tweak other settings only when you find yourself running into that limit. If you can figure out our logging settings then you can figure out the rest of this, too. regards, tom lane
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On Mon, Sep 14, 2020 at 05:39:57PM -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote: > tutiluren@tutanota.com writes: > > 1. All non-ANSI characters are turned into "?"s for application_name. > > Yeah, that's hard to do much with unfortunately. We cannot assume that > all databases in an installation share the same encoding, so for globally > visible strings like application_name, the only safe solution is to > restrict them to ASCII. > > On the other hand, the very same thing could be said of database names > and role names, yet we have never worried much about whether those were > encoding-safe when viewed from databases with different encodings, nor > have there been many complaints about the theoretical unsafety. So maybe > this is just overly anal-retentive and we should drop the restriction, > or at least pass through data that doesn't appear to be invalidly > encoded. Perhaps recode database/role names from the source database's encoding into utf8, and then recode from utf8 to the destination database's encoding? For "globally visible strings", maybe recode to the client_encoding setting, or a new encoding setting for this purpose since client_encoding seems to be linked to the database that the client is connected to. I'd've thought that the application name would arrive encoded as client_encoding (which defaults to the database encoding). Maybe globally visible strings need to be recoded on arrival from the client_encoding to utf8 (or a "server_encoding" or "global_encoding" setting) so they are always stored in a known encoding so they can be recoded as necessary when viewed via connections to other databases using a different encoding. Just some thoughts. If they don't make any sense, feel free to ignore them. :-) > > 5. Ever since my early days with PG in the mid-2000s, I've tried > > numerous times to read the manual, wikis and comments for the > > configuration files, specifically the performance directives, and asked > > many, many times for help about that, yet never been able to figure out > > what they want me to enter for all the numerous options. At this point, > > it isn't me being lazy/stupid; it's objectively very difficult to > > understand all of that. > > We do the best we can as far as the documentation goes; if you have > concrete suggestions about how to improve that, we'll surely listen. I thought the documentation on postgres performance tuning was fine. Here's what I got from it (for a dedicated database server host): shared_buffers = 1GB # Should be 1/4 of RAM work_mem = 16MB # Should be bigger to do more sorts in-memory but it's per sort per user so not too big maintenance_work_mem = 128MB # Should be <= 256MB to make vacuum fast without taking away too much RAM from other tasks min_wal_size = 480MB # Should be at least 3 * 10 * 16MB to avoid too many checkpoints slowing down writes checkpoint_completion_target = 0.9 # Should be 0.9 if checkpoint_segments was increased to spread out checkpoint writes random_page_cost = 4.0 # Should be 4.0 for HDD, 1.5-2.5 for SSD, 1.0-1.01 if db fits entirely in RAM effective_cache_size = 3GB # Should be 1/2 to 3/4 of RAM It's the "Should..." comments that matter. And it might be out of date... Actually, it is out of date. The comment for checkpoint_completion_target refers to checkpoint_segments which no longer exists (since 9.5) so disregard that. cheers, raf
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
raf <raf@raf.org> writes: > On Mon, Sep 14, 2020 at 05:39:57PM -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> On the other hand, the very same thing could be said of database names >> and role names, yet we have never worried much about whether those were >> encoding-safe when viewed from databases with different encodings, nor >> have there been many complaints about the theoretical unsafety. So maybe >> this is just overly anal-retentive and we should drop the restriction, >> or at least pass through data that doesn't appear to be invalidly >> encoded. > Perhaps recode database/role names from the source > database's encoding into utf8, and then recode from utf8 > to the destination database's encoding? A lot of people seem to believe that transcoding through utf8 is 100% safe. They're wrong :-( --- the Japanese, at least, have reason not to trust it, because of the existence of multiple incompatible conversion standards. And you're still left with the question of what to do when the destination encoding hasn't got the character. Moreover, this is all moderately expensive unless the encodings in question are already utf8 or latin1. So if we go this way I'd prefer to do it as I said above -- just drop or question-mark-ize any characters that don't pass validation in the recipient DB. That's fairly cheap and it will work perfectly in the typical case where the whole cluster is on one encoding anyway. regards, tom lane
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On Mon, Sep 14, 2020 at 05:39:57PM -0400, Tom Lane wrote: > tutiluren@tutanota.com writes: > > 1. All non-ANSI characters are turned into "?"s for application_name. > > Yeah, that's hard to do much with unfortunately. We cannot assume that > all databases in an installation share the same encoding, so for globally > visible strings like application_name, the only safe solution is to > restrict them to ASCII. > > On the other hand, the very same thing could be said of database names > and role names, yet we have never worried much about whether those were > encoding-safe when viewed from databases with different encodings, nor > have there been many complaints about the theoretical unsafety. So maybe > this is just overly anal-retentive and we should drop the restriction, > or at least pass through data that doesn't appear to be invalidly > encoded. I think the issue is that role and database names are controlled by privileged users, while application_name is not. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On Mon, Sep 14, 2020 at 10:22:31PM +0200, tutiluren@tutanota.com wrote: > 4. There is no built-in means to have PG manage (or even suggest) indexes on > its own. Trying to figure out what indexes to create/delete/fine-tune, and > determine all the extremely complex rules for this art (yes, I just called > index management an *art*, because it is!), is just utterly hopeless to me. It > never gets any easier. Not even after many years. It's the by far worst part of > databases to me (combined with point five). Having to use third-party solutions > ensures that it isn't done in practice, at least for me. I don't trust, nor do > I want to deal with, external software and extensions in my databases. I still > have nightmares from PostGIS, which I only keep around, angrily, out of > absolute necessity. I fundamentally don't like third-party add-ons to things, > but want the core product to properly support things. Besides, this (adding/ > managing indexes) is not even some niche/obscure use-case, but something which > is crucial for basically any nontrivial database of any kind! I think you did a very good job of explaining your issues. I think the underlying problem is that Postgres is targeting a wide market, and your use-case for a more limited or self-contained database doesn't fit many of those markets. Also, PostGIS is one of the most complex extensions, so adding simpler ones should not be as hard. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
Bruce Momjian <bruce@momjian.us> writes: > On Mon, Sep 14, 2020 at 05:39:57PM -0400, Tom Lane wrote: >> On the other hand, the very same thing could be said of database names >> and role names, yet we have never worried much about whether those were >> encoding-safe when viewed from databases with different encodings, nor >> have there been many complaints about the theoretical unsafety. So maybe >> this is just overly anal-retentive and we should drop the restriction, >> or at least pass through data that doesn't appear to be invalidly >> encoded. > I think the issue is that role and database names are controlled by > privileged users, while application_name is not. That's certainly an argument against a completely laissez-faire approach, but if we filtered invalidly-encoded data on the reading side, it seems like we would be in good enough shape. regards, tom lane
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
2. pg_dump misinterprets non-ANSI values for the "--exclude-*" options (at least the --exclude-table-data one, which is the one I've tested) on Windows, resulting in it being impossible to make more "sophisticated" backups of PostgreSQL databases; it's either all or nothing. Other programs, including my own test scripts and commands, are perfectly able to use any Unicode character sent from/through both cmd.exe and PHP CLI, but not pg_dump, so the idea that "Windows it at fault" here just doesn't seem true. (Although I don't doubt for a second that it often *is* the case... Microsoft is not a nice entity in any way.) I spent a lot of time and efforts experimenting with and asking about this, but eventually gave up and concluded that it was yet another bug in an open source project "only" on Windows with no real/pressing interest in fixing it. For me, this means that I lose a ton of fresh data every day, or have to make *gigantic* backups. (I have several huge "temporary debug log" tables whose data have zero long-term value but tons of short-term value.) It makes me feel crippled and excluded in an uncomfortable manner.
3. The ability to embed PG to run in an automatic, quiet manner as part of something else. I know about SQLite, but it's extremely limited to the point of being virtually useless IMO, which is why I cannot use that for anything nontrivial. I want my familiar PostgreSQL, only not require it to be manually and separately installed on the machine where it is to run as part of some "application". If I could just "embed" it, this would allow me to create a single EXE which I can simply put on a different machine to run my entire "system" which otherwise takes *tons* of tedious, error-prone manual labor to install, set up and maintain. Of course, this is probably much easier said than done, but I don't understand why PG's architecture necessarily dictates that PG must be a stand-alone, separate thing. Or rather, why some "glue" cannot enable it to be used just like SQLite from a *practical* perspective, even if it still is a "server-client model" underneath the hood. (Which doesn't matter at all to me, nor should it matter to anyone else.)
4. There is no built-in means to have PG manage (or even suggest) indexes on its own. Trying to figure out what indexes to create/delete/fine-tune, and determine all the extremely complex rules for this art (yes, I just called index management an *art*, because it is!), is just utterly hopeless to me. It never gets any easier. Not even after many years. It's the by far worst part of databases to me (combined with point five). Having to use third-party solutions ensures that it isn't done in practice, at least for me. I don't trust, nor do I want to deal with, external software and extensions in my databases. I still have nightmares from PostGIS, which I only keep around, angrily, out of absolute necessity. I fundamentally don't like third-party add-ons to things, but want the core product to properly support things. Besides, this (adding/managing indexes) is not even some niche/obscure use-case, but something which is crucial for basically any nontrivial database of any kind!
5. Ever since my early days with PG in the mid-2000s, I've tried numerous times to read the manual, wikis and comments for the configuration files, specifically the performance directives, and asked many, many times for help about that, yet never been able to figure out what they want me to enter for all the numerous options. At this point, it isn't me being lazy/stupid; it's objectively very difficult to understand all of that.
The practical end result of this is that I've always gone back to using the untouched default configuration file (except for the logging-related options), which, especially in the past on FreeBSD, *severely* crippled my PG database to not even come close to taking advantage of the full power of the hardware. Instead, it felt like I was using maybe 1% of the machine's power, even with a proper database design and indexes and all of that stuff, simply because the default config was so "conservative" and it couldn't be just set to "use whatever resources are available".
I wish so much for PG to have a mode where it self-tunes itself as needed, over time, based on the actual workload, or at least allowed some kind of abstract "performance mode" such as: "you are allowed to use significant system resources, PG", or: "You are one of my most important applications. Just use as much power as you currently need, but at least save about 10% for the rest of the system, will you?" Maybe this is also harder than it sounds to accomplish, but for somebody like me who has zero funding, I cannot hire some professional to sit down with me and fine-tune my system for $899/hour.
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On 21/09/2020 17:53, Joshua Drake wrote: > 3. The ability to embed PG to run in an automatic, quiet manner as part > of something else. I know about SQLite, but it's extremely limited to > the point of being virtually useless IMO, which is why I cannot use that > for anything nontrivial. I want my familiar PostgreSQL, only not require > it to be manually and separately installed on the machine where it is to > run as part of some "application". If I could just "embed" it, this > would allow me to create a single EXE which I can simply put on a > different machine to run my entire "system" which otherwise takes *tons* > of tedious, error-prone manual labor to install, set up and maintain. Of > course, this is probably much easier said than done, but I don't > understand why PG's architecture necessarily dictates that PG must be a > stand-alone, separate thing. Or rather, why some "glue" cannot enable it > to be used just like SQLite from a *practical* perspective, even if it > still is a "server-client model" underneath the hood. (Which doesn't > matter at all to me, nor should it matter to anyone else.) It depends what you mean by "embedded". If you want sqlite's linked library approach, where the database calls literally run your process' address space, then that's a no go, as postgres is a multi-user database server with its own process hierarchy. However, postgres also is a rather agnostic command that does not detach from the parent's terminal/stdio unless instructed to, so nothing stops your bespoke application from launching and managing its own postmaster as a directly managed child process - started as part of application initialisation - and having that listen on a local socket only available to the application itself; this is what we implemented in some of our installations where postgres is enslaved to the cluster control system - the configuration is re-generated at every restart (IIRC Patroni does something similar minus the unix socket part). A plethora of systems are built around the notion of programs calling other programs and managing the process' life cycle. The limiting factor to such architecture tends to be the OS's process control semantics and API (notoriously weaker or more contrived on non-unix-like OSs), but that's not postgres' fault. -- Regards Fabio Ugo Venchiarutti OSPCFC Network Engineering Dpt. Ocado Technology -- Notice: This email is confidential and may contain copyright material of members of the Ocado Group. Opinions and views expressed in this message may not necessarily reflect the opinions and views of the members of the Ocado Group. If you are not the intended recipient, please notify us immediately and delete all copies of this message. Please note that it is your responsibility to scan this message for viruses. References to the "Ocado Group" are to Ocado Group plc (registered in England and Wales with number 7098618) and its subsidiary undertakings (as that expression is defined in the Companies Act 2006) from time to time. The registered office of Ocado Group plc is Buildings One & Two, Trident Place, Mosquito Way, Hatfield, Hertfordshire, AL10 9UL.
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On 21/09/2020 17:53, Joshua Drake wrote:
> 3. The ability to embed PG to run in an automatic, quiet manner as part
> of something else. I know about SQLite, but it's extremely limited to
> the point of being virtually useless IMO, which is why I cannot use that
> for anything nontrivial. I want my familiar PostgreSQL, only not require
> it to be manually and separately installed on the machine where it is to
> run as part of some "application". If I could just "embed" it, this
> would allow me to create a single EXE which I can simply put on a
> different machine to run my entire "system" which otherwise takes *tons*
> of tedious, error-prone manual labor to install, set up and maintain. Of
> course, this is probably much easier said than done, but I don't
> understand why PG's architecture necessarily dictates that PG must be a
> stand-alone, separate thing. Or rather, why some "glue" cannot enable it
> to be used just like SQLite from a *practical* perspective, even if it
> still is a "server-client model" underneath the hood. (Which doesn't
> matter at all to me, nor should it matter to anyone else.)
It depends what you mean by "embedded".
If you want sqlite's linked library approach, where the database calls
literally run your process' address space, then that's a no go, as
postgres is a multi-user database server with its own process hierarchy.
However, postgres also is a rather agnostic command that does not detach
from the parent's terminal/stdio unless instructed to, so nothing stops
your bespoke application from launching and managing its own postmaster
as a directly managed child process - started as part of application
initialisation - and having that listen on a local socket only available
to the application itself; this is what we implemented in some of our
installations where postgres is enslaved to the cluster control system -
the configuration is re-generated at every restart (IIRC Patroni does
something similar minus the unix socket part).
A plethora of systems are built around the notion of programs calling
other programs and managing the process' life cycle. The limiting factor
to such architecture tends to be the OS's process control semantics and
API (notoriously weaker or more contrived on non-unix-like OSs), but
that's not postgres' fault.
--
Regards
Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology
--
Notice:
This email is confidential and may contain copyright material of
members of the Ocado Group. Opinions and views expressed in this message
may not necessarily reflect the opinions and views of the members of the
Ocado Group.
If you are not the intended recipient, please notify us
immediately and delete all copies of this message. Please note that it is
your responsibility to scan this message for viruses.
References to the
"Ocado Group" are to Ocado Group plc (registered in England and Wales with
number 7098618) and its subsidiary undertakings (as that expression is
defined in the Companies Act 2006) from time to time. The registered office
of Ocado Group plc is Buildings One & Two, Trident Place, Mosquito Way,
Hatfield, Hertfordshire, AL10 9UL.
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
I have to agree that pg_dump is largely a step child backup program. It has consistently been found over the years to be lacking in a number of areas. Unfortunately, working on pg_dump isn't sexy and it is difficult to get volunteers or even paid resources to do such a thing. The real solution for pg_dump is a complete refactor which includes pg_dumpall and it is not a small undertaking. It should be noted that it is also a less and less used program. On our team it is normally used for only very specific needs (grabbing a schema) and we use binary backups or logical replication to receive specific data.
3. The ability to embed PG to run in an automatic, quiet manner as part of something else. I know about SQLite, but it's extremely limited to the point of being virtually useless IMO, which is why I cannot use that for anything nontrivial. I want my familiar PostgreSQL, only not require it to be manually and separately installed on the machine where it is to run as part of some "application". If I could just "embed" it, this would allow me to create a single EXE which I can simply put on a different machine to run my entire "system" which otherwise takes *tons* of tedious, error-prone manual labor to install, set up and maintain. Of course, this is probably much easier said than done, but I don't understand why PG's architecture necessarily dictates that PG must be a stand-alone, separate thing. Or rather, why some "glue" cannot enable it to be used just like SQLite from a *practical* perspective, even if it still is a "server-client model" underneath the hood. (Which doesn't matter at all to me, nor should it matter to anyone else.)This is really using the wrong tool for the job type of issue. PG was never designed for such a scenario.
4. There is no built-in means to have PG manage (or even suggest) indexes on its own. Trying to figure out what indexes to create/delete/fine-tune, and determine all the extremely complex rules for this art (yes, I just called index management an *art*, because it is!), is just utterly hopeless to me. It never gets any easier. Not even after many years. It's the by far worst part of databases to me (combined with point five). Having to use third-party solutions ensures that it isn't done in practice, at least for me. I don't trust, nor do I want to deal with, external software and extensions in my databases. I still have nightmares from PostGIS, which I only keep around, angrily, out of absolute necessity. I fundamentally don't like third-party add-ons to things, but want the core product to properly support things. Besides, this (adding/managing indexes) is not even some niche/obscure use-case, but something which is crucial for basically any nontrivial database of any kind!I think you are looking at this from a very windows centric way. Open Source has its origins from the Unix paradigm where each tool was designed to solve one type of problem and you used multiple tools to create a "solution". Though we have strayed from that on some items due to the evolving nature of software needs, that is still at our core and for good reason. Having tools, flags etc... to do such things (including your point #3) creates complexity best left to "vendors" not the software project.
The practical end result of this is that I've always gone back to using the untouched default configuration file (except for the logging-related options), which, especially in the past on FreeBSD, *severely* crippled my PG database to not even come close to taking advantage of the full power of the hardware. Instead, it felt like I was using maybe 1% of the machine's power, even with a proper database design and indexes and all of that stuff, simply because the default config was so "conservative" and it couldn't be just set to "use whatever resources are available".Not to be unkind but this does seem lazy. There are literally hundreds of "how to make postgres go fast", "how to optimize postgres" if you take 15 minutes to Google. It is true that the project (outside of the wiki) doesn't have much information in the official documentation but that doesn't mean that the information is not available.
I wish so much for PG to have a mode where it self-tunes itself as needed, over time, based on the actual workload, or at least allowed some kind of abstract "performance mode" such as: "you are allowed to use significant system resources, PG", or: "You are one of my most important applications. Just use as much power as you currently need, but at least save about 10% for the rest of the system, will you?" Maybe this is also harder than it sounds to accomplish, but for somebody like me who has zero funding, I cannot hire some professional to sit down with me and fine-tune my system for $899/hour.See my comment about Google. The information is out there and easy to find.
Discord and Slack
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On 9/22/20 3:28 PM, tutiluren@tutanota.com wrote: > > Sep 21, 2020, 7:53 PM by jd@commandprompt.com: > > I have to agree that pg_dump is largely a step child backup program. > It has consistently been found over the years to be lacking in a > number of areas. Unfortunately, working on pg_dump isn't sexy and it > is difficult to get volunteers or even paid resources to do such a > thing. The real solution for pg_dump is a complete refactor which > includes pg_dumpall and it is not a small undertaking. It should be > noted that it is also a less and less used program. On our team it > is normally used for only very specific needs (grabbing a schema) > and we use binary backups or logical replication to receive specific > data. > > Huh? Are you saying that there is another, superior way to back up > PostgreSQL databases other than pg_dump? I re-read the manual on it just > now, but didn't see a single word about it being "legacy" or > "deprecated" or even that there's any other way to do it. What do you mean? It is not deprecated or legacy and it is still used by many. The issue is that people try to do 'partial' dumps without reading the docs on what those switches actually do and the potential conflicts therein. For large installations there it is often better to use some form of replication(binary and/or logical) to maintain a continuous backup. pg_dump can take a long time on very big databases, so it may not keep up. > > This is really using the wrong tool for the job type of issue. PG > was never designed for such a scenario. > > I hate the "wrong tool for the job" argument. It assumes that everyone > has infinite time, energy and brain capacity to learn endless redundant > tools just to "use the right tool for the job" rather than "use what you > actually know". I know PG. I don't know SQLite. They are very different. > So obviously, I want to use PG. This comes down to what your definition of embedded is? A matter of determining whether we are talking apples or oranges. > > What exactly makes PG unsuitable for this? I don't get it. But at the > same time, I also realize that it's not going to happen at this point. > The entire concept of a desktop computer appears to be phased out as we > speak... > > While I understand what you mean, and even agree in theory, in practice, > this always results in crappy third-party solutions which I don't want > to deal with. PostGIS, for example, forces me to use "postgis" for its > schema instead of "PostGIS" just because they arrogantly didn't > construct their internal queries properly. "Little" things like that. Huh? A schema is just a name space, why does it matter how the extension chooses to define it? I mean you could have number of permutations of postgis. > > The practical end result of this is that I've always gone back > to using the untouched default configuration file (except for > the logging-related options), which, especially in the past on > FreeBSD, *severely* crippled my PG database to not even come > close to taking advantage of the full power of the hardware. > Instead, it felt like I was using maybe 1% of the machine's > power, even with a proper database design and indexes and all of > that stuff, simply because the default config was so > "conservative" and it couldn't be just set to "use whatever > resources are available". > > > Not to be unkind but this does seem lazy. There are literally > hundreds of "how to make postgres go fast", "how to optimize > postgres" if you take 15 minutes to Google. It is true that the > project (outside of the wiki) doesn't have much information in the > official documentation but that doesn't mean that the information is > not available. > > Hundreds of crappy, outdated, confusing, badly written "web tutorialz" > are worth nothing. A couple of clear, unambiguous documentation > paragraphs are worth their (metaphorical) weight in gold. The problem is the 'unambiguous' part. There are so many ways Postgres is used, writing a configuration doc that suited everyone would have so many if/and/or/buts that it would define ambiguity. If you want that information post an email here with the information on your server specs and proposed usage. > > Claiming that "the information is out there" is just hand-waving. It's > shifting the burden to the user to actively hunt for information, and > very likely be misled by all the garbage articles out there. I learned > some horrible practices early on from "web tutz" and it took me many > years to unlearn that stuff. > > I know that writing documentation isn't fun, but it's necessary. Also, > my overall point was to not even have to deal with the specifics, but > just be able to tell PG with a single config option that it's allowed to > "use most of the machine's resources". That would entail building an AI into the code that would deal with all the possible OS(versions), Postgres(versions), hardware permutations. > > I wish so much for PG to have a mode where it self-tunes itself > as needed, over time, based on the actual workload, or at least > allowed some kind of abstract "performance mode" such as: "you > are allowed to use significant system resources, PG", or: "You > are one of my most important applications. Just use as much > power as you currently need, but at least save about 10% for the > rest of the system, will you?" Maybe this is also harder than it > sounds to accomplish, but for somebody like me who has zero > funding, I cannot hire some professional to sit down with me and > fine-tune my system for $899/hour. > > > See my comment about Google. The information is out there and easy > to find. > > I guess I'm the worst idiot in the world, then, who can't DuckDuckGo > (Google is evil) it even after 15 years. > > Seriously, I didn't type my feedback "for fun". It may be difficult for > very intelligent people to understand (as often is the case, because you > operate on a whole different level), but the performance-related > PostgreSQL configuration options are a *nightmare* to me and many > others. I spent *forever* reading about them and couldn't make any sense > of it all. Each time I tried, I would give up, frustrated and angry, > with no real clue what "magic numbers" it wanted. > > It's quite baffling to me how this can be so difficult for you all to > understand. Even if we disregard the sheer intelligence factor, it's > clear that users of PG don't have the same intimate knowledge of PG's > internals as the PG developers, nor could possibly be expected to. > > As mentioned, I kept going back to the default configuration over and > over again. Anyone who doesn't is either a genius or pretends/thinks > that they understand it. (Or I'm extremely dumb.) Does your server runs to your satisfaction with the default settings? If not what issues do you see? > > Very often, I get the feeling that things like that are the way they are > on purpose. Work security and whatnot. But it's very frustrating for > people like me who can't afford to buy help and don't have the enormous > brain capacity necessary to comprehend the complex relations between the > numerous performance-related config options. It really is that difficult. > > Discord and Slack > > Those modern services don't even let me load them. It's the same thing > with everything these days: "verify with phone", "we've detected > suspicious activity", "fake error message", etc. -- Adrian Klaver adrian.klaver@aklaver.com
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
I have to agree that pg_dump is largely a step child backup program.It has consistently been found over the years to be lacking in anumber of areas. Unfortunately, working on pg_dump isn't sexy and itis difficult to get volunteers or even paid resources to do such athing. The real solution for pg_dump is a complete refactor whichincludes pg_dumpall and it is not a small undertaking. It should benoted that it is also a less and less used program. On our team itis normally used for only very specific needs (grabbing a schema)and we use binary backups or logical replication to receive specificdata.Huh? Are you saying that there is another, superior way to back up PostgreSQL databases other than pg_dump? I re-read the manual on it just now, but didn't see a single word about it being "legacy" or "deprecated" or even that there's any other way to do it. What do you mean?It is not deprecated or legacy and it is still used by many. The issue is that people try to do 'partial' dumps without reading the docs on what those switches actually do and the potential conflicts therein. For large installations there it is often better to use some form of replication(binary and/or logical) to maintain a continuous backup. pg_dump can take a long time on very big databases, so it may not keep up.
This is really using the wrong tool for the job type of issue. PGwas never designed for such a scenario.I hate the "wrong tool for the job" argument. It assumes that everyone has infinite time, energy and brain capacity to learn endless redundant tools just to "use the right tool for the job" rather than "use what you actually know". I know PG. I don't know SQLite. They are very different. So obviously, I want to use PG.This comes down to what your definition of embedded is? A matter of determining whether we are talking apples or oranges.
While I understand what you mean, and even agree in theory, in practice, this always results in crappy third-party solutions which I don't want to deal with. PostGIS, for example, forces me to use "postgis" for its schema instead of "PostGIS" just because they arrogantly didn't construct their internal queries properly. "Little" things like that.Huh? A schema is just a name space, why does it matter how the extension chooses to define it? I mean you could have number of permutations of postgis.
I know that writing documentation isn't fun, but it's necessary. Also, my overall point was to not even have to deal with the specifics, but just be able to tell PG with a single config option that it's allowed to "use most of the machine's resources".That would entail building an AI into the code that would deal with all the possible OS(versions), Postgres(versions), hardware permutations.
Does your server runs to your satisfaction with the default settings?
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
> > > That would entail building an AI into the code that would deal with > all the possible OS(versions), Postgres(versions), hardware > permutations. > > I... guess. If "AI" means "a series of ifs". Which is what software... > is? I doubt that people who can make the world's most advanced open > source database cannot check the amount of RAM and see how fast the > CPU/disk is. But the very nature of "embedded" implies the db is subservient to something else, and that something else may need most of the resources. But I heartily agree that tuning a server can be difficult. It's highly dependent on the context and hardware resources is only a portion of that context. I've seen generous help on this list for specific configuration help.
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On 9/23/20 11:51 AM, tutiluren@tutanota.com wrote: > It is not deprecated or legacy and it is still used by many. The > issue is that people try to do 'partial' dumps without reading the > docs on what those switches actually do and the potential conflicts > therein. For large installations there it is often better to use > some form of replication(binary and/or logical) to maintain a > continuous backup. pg_dump can take a long time on very big > databases, so it may not keep up. > > Well, I've never done partial backups, frankly because I don't trust it > to actually work. Full backups may be more wasteful with disk space, but > at least "makes sense" in some abstract way. (Also, I try to keep as > little data as possible, so they are never *gigantic*.) Well not partial as in incremental. Instead dump only some portion of the schema with or without its associated data. > This comes down to what your definition of embedded is? A matter of > determining whether we are talking apples or oranges. > > Just what I said in my original question. Something which is bundled and > invisible to the user. That is going to be difficult with Postgres as it is its own process. It would take a good bit of tooling to hide that from the user. > Huh? A schema is just a name space, why does it matter how the > extension chooses to define it? I mean you could have number of > permutations of postgis. > > I'm saying that PostGIS has a bug due to incorrectly constructed > internal queries which makes it impossible to properly name the schema > where PostGIS is to reside, causing my database to look very ugly when > it has to say "postgis" instead of "PostGIS" for PostGIS's schema. And > that was an example of how sloppy/bad third-party things always are, and > is one reason why I don't like it when I have to rely on "extensions". If that is the sum of your issues with PostGIS then I really don't have much sympathy. They are extensions so you aren't required to use them and rely on their way of doing things. You have the choice of writing your own code/extension or do without completely. I personally find compromise on my part wrt to extensions greatly reduces the amount of coding/effort I have to do to move forward. Honestly, I don't see how it is different from the compromises that arise out of using the core project itself. > That would entail building an AI into the code that would deal with > all the possible OS(versions), Postgres(versions), hardware > permutations. > > I... guess. If "AI" means "a series of ifs". Which is what software... > is? I doubt that people who can make the world's most advanced open > source database cannot check the amount of RAM and see how fast the > CPU/disk is. It is more then that. It would have to take into account the behavior changes that happen in Postgres between major versions. It also would have to account for OS specific parameters and the changes that happen there between OS versions. It also would need to 'know' how the database was going to be used; readonly, heavy writes, etc. Also how the database should play with other programs on the same machine. Add to the mix containers, cloud instances and so on and you are outrunning the ability of 'ifs' to handle it. > > Does your server runs to your satisfaction with the default settings? > > Right now, yes, but that says very little as I'm the only user of it. > I've had many nightmares in the past, however, where even determining > whether the changes in the config did anything (good or bad) has been > impossible. I fundamentally don't like the idea that the config is so > "conservative" (crippled) with no obvious/easy way to "set a different > general mode". If you honestly think that the numerous > performance-related options are easy to understand, I don't know what to > say. The thing is 'general mode' is going to mean something different to someone running a database in the MB-low GB range vs. high GB vs. TB vs. PB. As to monitoring see: https://www.postgresql.org/docs/12/monitoring-stats.html https://www.postgresql.org/docs/12/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT I know it is an extension, still it is useful: https://www.postgresql.org/docs/12/pgstatstatements.html -- Adrian Klaver adrian.klaver@aklaver.com
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
> Huh? A schema is just a name space, why does it matter how the
On 9/23/20 11:51 AM, tutiluren@tutanota.com wrote:
> extension chooses to define it? I mean you could have number of
> permutations of postgis.
>
> I'm saying that PostGIS has a bug due to incorrectly constructed
> internal queries which makes it impossible to properly name the schema
> where PostGIS is to reside, causing my database to look very ugly when
> it has to say "postgis" instead of "PostGIS" for PostGIS's schema. And
> that was an example of how sloppy/bad third-party things always are, and
- Maturity
- Functionality
- Performance
- Cost
- Documentation
- Support (huge community)
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On 24/09/2020 18:13, Tony Shelver wrote: > > > On 9/23/20 11:51 AM, tutiluren@tutanota.com > <mailto:tutiluren@tutanota.com> wrote: > > > Huh? A schema is just a name space, why does it matter how the > > extension chooses to define it? I mean you could have number of > > permutations of postgis. > > > > I'm saying that PostGIS has a bug due to incorrectly constructed > > internal queries which makes it impossible to properly name the schema > > where PostGIS is to reside, causing my database to look very ugly when > > it has to say "postgis" instead of "PostGIS" for PostGIS's schema. And > > that was an example of how sloppy/bad third-party things always are, and > > is one reason why I don't like it when I have to rely on "extensions". > > All lowercase is good, as you don't have to remember which bits are capitalized. And besides, there are far more important issues to consider as Tony has covered in detail. It is definitely not a bug! [..] > Oracle's equivalent is probably the closest. Pity that installing > Oracle and their products as a whole is a nightmare, and rather > wallet-draining... > > Have a look at GIS / Mapping projects around the world, a majority are > implemented on PostGIS. Openstreetmap is probably the biggest (think > open source version of Google Maps), and it moved to PostGIS from > MySQL several years ago. > We did a lot of research into PostGIS, as GIS / tracking is a core > part of our business. > We didn't find a product that could compare on > > * Maturity > * Functionality > * Performance > * Cost > * Documentation > * Support (huge community) > > I remember going to free seminars promoting the Oracle database over 25 years ago, and the only thing I can remembers now is the pie charts saying how much revenue Oracle had versus all the others. Never how many transactions, size of databases, number of users, nor any really useful metric -- just how much Oracle was being paid! To be honest, I've heard many bad things about Oracle, and rarely anything good. There are many reasons for going with PostgreSQL and PostGIS, but you are free to use something else if you prefer. Cheers, Gavin
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On Sun, Sep 20, 2020 at 01:15:26PM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > On Mon, Sep 14, 2020 at 05:39:57PM -0400, Tom Lane wrote: > >> On the other hand, the very same thing could be said of database names > >> and role names, yet we have never worried much about whether those were > >> encoding-safe when viewed from databases with different encodings, nor > >> have there been many complaints about the theoretical unsafety. So maybe > >> this is just overly anal-retentive and we should drop the restriction, > >> or at least pass through data that doesn't appear to be invalidly > >> encoded. > > > I think the issue is that role and database names are controlled by > > privileged users, while application_name is not. > > That's certainly an argument against a completely laissez-faire approach, > but if we filtered invalidly-encoded data on the reading side, it seems > like we would be in good enough shape. Yes, if we want to filter, sure. I thought we were not 100% able to filter, but I guess if it safe, we can do it. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On Wed, Sep 23, 2020 at 12:28:14AM +0200, tutiluren@tutanota.com wrote: > Sep 21, 2020, 7:53 PM by jd@commandprompt.com: > See my comment about Google. The information is out there and easy to find. > > I guess I'm the worst idiot in the world, then, who can't DuckDuckGo (Google is > evil) it even after 15 years. > > Seriously, I didn't type my feedback "for fun". It may be difficult for very > intelligent people to understand (as often is the case, because you operate on > a whole different level), but the performance-related PostgreSQL configuration > options are a *nightmare* to me and many others. I spent *forever* reading > about them and couldn't make any sense of it all. Each time I tried, I would > give up, frustrated and angry, with no real clue what "magic numbers" it > wanted. > > It's quite baffling to me how this can be so difficult for you all to > understand. Even if we disregard the sheer intelligence factor, it's clear that > users of PG don't have the same intimate knowledge of PG's internals as the PG > developers, nor could possibly be expected to. > > As mentioned, I kept going back to the default configuration over and over > again. Anyone who doesn't is either a genius or pretends/thinks that they > understand it. (Or I'm extremely dumb.) I think there is a clear dependency that people reading the docs, particularly for performance purposes, must have an existing knowledge of a lot of low-level things --- this could be the cause of your frustration. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
> Huh? A schema is just a name space, why does it matter how the> extension chooses to define it? I mean you could have number of> permutations of postgis.>> I'm saying that PostGIS has a bug due to incorrectly constructed> internal queries which makes it impossible to properly name the schema> where PostGIS is to reside, causing my database to look very ugly when> it has to say "postgis" instead of "PostGIS" for PostGIS's schema. And> that was an example of how sloppy/bad third-party things always are, and> is one reason why I don't like it when I have to rely on "extensions".All lowercase is good
, as you don't have to remember which bits are capitalized.
And besides, there are far more important issues to consider as Tony has covered in detail.
It is definitely not a bug!
There are many reasons for going with PostgreSQL and PostGIS, but you are free to use something else if you prefer.
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On 9/24/20 6:20 PM, Bruce Momjian wrote: > On Wed, Sep 23, 2020 at 12:28:14AM +0200, tutiluren@tutanota.com wrote: >> Sep 21, 2020, 7:53 PM by jd@commandprompt.com: >> See my comment about Google. The information is out there and easy to find. >> >> I guess I'm the worst idiot in the world, then, who can't DuckDuckGo (Google is >> evil) it even after 15 years. >> >> Seriously, I didn't type my feedback "for fun". It may be difficult for very >> intelligent people to understand (as often is the case, because you operate on >> a whole different level), but the performance-related PostgreSQL configuration >> options are a *nightmare* to me and many others. I spent *forever* reading >> about them and couldn't make any sense of it all. Each time I tried, I would >> give up, frustrated and angry, with no real clue what "magic numbers" it >> wanted. >> >> It's quite baffling to me how this can be so difficult for you all to >> understand. Even if we disregard the sheer intelligence factor, it's clear that >> users of PG don't have the same intimate knowledge of PG's internals as the PG >> developers, nor could possibly be expected to. >> >> As mentioned, I kept going back to the default configuration over and over >> again. Anyone who doesn't is either a genius or pretends/thinks that they >> understand it. (Or I'm extremely dumb.) > I think there is a clear dependency that people reading the docs, > particularly for performance purposes, must have an existing knowledge > of a lot of low-level things --- this could be the cause of your > frustration. And that's a serious problem with the documentation. (Not that I know how to fix it in an OSS project.) -- Angular momentum makes the world go 'round.
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On 9/23/20 11:51 AM, tutiluren@tutanota.com wrote:> Huh? A schema is just a name space, why does it matter how the> extension chooses to define it? I mean you could have number of> permutations of postgis.>> I'm saying that PostGIS has a bug due to incorrectly constructed> internal queries which makes it impossible to properly name the schema> where PostGIS is to reside, causing my database to look very ugly when> it has to say "postgis" instead of "PostGIS" for PostGIS's schema. And> that was an example of how sloppy/bad third-party things always are, and> is one reason why I don't like it when I have to rely on "extensions".Ummmm? I have PostGIS installed in my core app schema, in part because at the time I didn't know what I was doing. Better to have been in public...You may also want to look at comparable ANSI (standards based) database products (Oracle for example) when it comes to the use of case in naming conventions. Different products handle things in different ways.You may want to google around the issue, for example https://postgis.net/2017/11/07/tip-move-postgis-schema/ for moving schemas.You may want to do some research on where PostGIS comes from. It could never have been developed as a core part of Postgres, so the fact that products like PostGIS are so domain specific.The fact that the Postgesql extension system is so flexible and robust was probably a key factor in the choice it's choice in the development of PostGIS.
My suggestion is 'live with it'.
Or, move to a product that suits your use cases / desires better
, But, good luck finding another open source "free" (or any) product with the functionality, robusiness and performance of PostGIS / Postgresql.
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
Well not partial as in incremental. Instead dump only some portion of the schema with or without its associated data.
I'm saying that PostGIS has a bug due to incorrectly constructed internal queries which makes it impossible to properly name the schema where PostGIS is to reside, causing my database to look very ugly when it has to say "postgis" instead of "PostGIS" for PostGIS's schema. And that was an example of how sloppy/bad third-party things always are, and is one reason why I don't like it when I have to rely on "extensions".If that is the sum of your issues with PostGIS then I really don't have much sympathy.
They are extensions so you aren't required to use them and rely on their way of doing things. You have the choice of writing your own code/extension or do without completely.
That would entail building an AI into the code that would deal withall the possible OS(versions), Postgres(versions), hardwarepermutations.I... guess. If "AI" means "a series of ifs". Which is what software... is? I doubt that people who can make the world's most advanced open source database cannot check the amount of RAM and see how fast the CPU/disk is.It is more then that. It would have to take into account the behavior changes that happen in Postgres between major versions. It also would have to account for OS specific parameters and the changes that happen there between OS versions. It also would need to 'know' how the database was going to be used; readonly, heavy writes, etc. Also how the database should play with other programs on the same machine. Add to the mix containers, cloud instances and so on and you are outrunning the ability of 'ifs' to handle it.
Does your server runs to your satisfaction with the default settings?Right now, yes, but that says very little as I'm the only user of it. I've had many nightmares in the past, however, where even determining whether the changes in the config did anything (good or bad) has been impossible. I fundamentally don't like the idea that the config is so "conservative" (crippled) with no obvious/easy way to "set a different general mode". If you honestly think that the numerous performance-related options are easy to understand, I don't know what to say.The thing is 'general mode' is going to mean something different to someone running a database in the MB-low GB range vs. high GB vs. TB vs. PB.
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
tutiluren@tutanota.com schrieb am 25.09.2020 um 06:41: > All lowercase is good > > That's your personal opinion -- not some kind of fact, and it > definitely goes against everything that I believe. You have to accept that every programming environment has its own best practices (in terms of coding style AND naming conventions). The recommendation to use all lower-case unquoted names is not Gavin's personal opinion, it's the opinion of the Postgres community. In Postgres and many other relational databases - with SQL Server and MySQL being the (only?) exception - using non-quoted identifiers is the recommended way. In Oracle, DB2 or Firebird this leads to all uppercase names, in Postgres to all lowercase names. If you don't follow the recommended best practices, you can't really blame those that do (e.g. the PostGIS project), for your problems. > The fact that there are a lot of sloppy, US-centric people who refuse > to use correctly named identifiers and cannot understand how there > can be anything besides a-z in an alphabet, doesn't change reality > and doesn't make a bug "right". I am not in the USA (and far from being US-centric as well) and I have been working with relational databases for over thirty years. I never had problems using unquoted ASCII names (using snake_case) for my database objects. Would it be nice if I could use special characters like öäü in the names of tables and columns (without the hassle of quotingthem)? Yes, absolutely. Does not using them, limit me in any way doing what I want to do? No, it doesn't. > Always the same thing. The slightest criticism, no matter how > warranted, always results in: "Fine. Go somewhere else. Use something > else." > Never: "Oh, right. Sorry, but we always used lowercase ourselves and > therefore didn't consider this. In retrospect, it's an embarrassing > mistake! We'll fix it in the next release. Thanks for pointing that > out." Changing this behaviour has been discussed on this list multiple times. The problem is, that currently no one sees a way to change this without breaking all (or most) existing code that relies on it (at least not with the resources the project hast). It's my understanding (as a user, not a developer) that the impact on the code base would be huge, and the community doesn't indeed really see a benefit in doing so. It has especially been discussed to implement a behaviour that complies with the SQL standard which *requires* to fold non-quoted names to uppercase! Would you be more happy with a standard compliant behaviour? I guess not. So, this is indeed one of the things that you either have to accept, or move on. *Every* piece of software has some quirks (or bugs as you see it) which are annoying and can't or won't be changed. If the number of quirks (or bugs) exceeds the benefit you get from the software, then you should indeed start looking for something different. But claiming that the people on this list are not open to criticism is simply wrong and doesn't do them justice. You might want to search the internet on the reaction to Uber's change from Postgres to MySQL - many of the criticism from their side have resulted in changes to Postgres. And there are many other examples (e.g. Windows port, replication, vacuum problems). You are unlucky to be offended by the best practices when it comes to naming database objects. I think having a "case preserving, case insensitive" option in Postgres would be nice indeed, but it's not a _functional_ problem, it's just an aesthetic one. Sticking to the recommended best practices doesn't limit you in any way in exploiting all features of the software. > The answer, of course, is that you have zero interest in helping but > 100% interest in making smug insults, The only person who is insulting other people is you. But you apparently have zero interest in accepting that different environments, need different approaches. And what works in "System A" does not work the same in "System B". Thomas
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On 9/25/20 1:04 AM, Thomas Kellerer wrote: > >> The fact that there are a lot of sloppy, US-centric people who refuse >> to use correctly named identifiers and cannot understand how there >> can be anything besides a-z in an alphabet, doesn't change reality >> and doesn't make a bug "right". > > I am not in the USA (and far from being US-centric as well) and I have > been working > with relational databases for over thirty years. I never had problems > using unquoted > ASCII names (using snake_case) for my database objects. > > Would it be nice if I could use special characters like öäü in the names > of tables and columns (without the hassle of quoting them)? > Yes, absolutely. > But you can use them without quoting: select version(); version ------------------------------------------------------------------------------------ PostgreSQL 12.3 on x86_64-pc-linux-gnu, compiled by gcc (SUSE Linux) 7.5.0, 64-bit create table öäü (id int , fld_1 varchar); insert into öäü values (1, 'test'); select * from öäü; id | fld_1 ----+------- 1 | test > > Thomas > > -- Adrian Klaver adrian.klaver@aklaver.com
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On 9/24/20 10:40 PM, tutiluren@tutanota.com wrote: > > Well not partial as in incremental. Instead dump only some portion > of the schema with or without its associated data. > > It's funny that you should bring that up, considering how it was one of > my points... See the point about pg_dump's bug on Windows. Yes, I read your bug report(https://www.postgresql.org/message-id/MCk38UV--3-2@tutanota.com) and you where just as resistant to advice there as here. > > I'm saying that PostGIS has a bug due to incorrectly constructed > internal queries which makes it impossible to properly name the > schema where PostGIS is to reside, causing my database to look > very ugly when it has to say "postgis" instead of "PostGIS" for > PostGIS's schema. And that was an example of how sloppy/bad > third-party things always are, and is one reason why I don't > like it when I have to rely on "extensions". > > > If that is the sum of your issues with PostGIS then I really don't > have much sympathy. > > Why does nobody understand that it was an *example* and not some kind of > full PostGIS review? > > They are extensions so you aren't required to use them and rely on > their way of doing things. You have the choice of writing your own > code/extension or do without completely. > > It sure is great to have such choices... I can't take it seriously when > people say things like this. It's similar to "it's open source so you > can easily vet it yourself". It's not taking reality into consideration > at all. > > As for doing without it, that would make it impossible to deal with GPS > coordinates/maps. So it's not really a choice at all. Read this issue from PostGIS: https://trac.osgeo.org/postgis/ticket/3496 " Then we can use the variable @extschema@ in lieu of the actual schema name. This will still allow users to do: CREATE EXTENSION postgis SCHEMA whereever_I_damn_want_you_to_be; " > It is more then that. It would have to take into account the > behavior changes that happen in Postgres between major versions. It > also would have to account for OS specific parameters and the > changes that happen there between OS versions. It also would need to > 'know' how the database was going to be used; readonly, heavy > writes, etc. Also how the database should play with other programs > on the same machine. Add to the mix containers, cloud instances and > so on and you are outrunning the ability of 'ifs' to handle it. > > If it changes that much, it's far, far worse than I even thought, and it > sounds like it will be pointless to even *try* to learn it as it keeps > changing between versions/OSes/other stuff. Life is not static. If you want to stay current you have to keep learning. > > I can't help but feel as if people just don't want to answer this and > other concerns I have. As if there's some silent agreement along the > lines of "securing PG DBAs' jobs". The reason people have not answered is you have not provided the information necessary to formulate an answer. For instance, OS & version, Postgres version, size of data, database use case, number of users. > The thing is 'general mode' is going to mean something different to > someone running a database in the MB-low GB range vs. high GB vs. TB > vs. PB. > > I don't mean this to sound rude, but it's like talking to a wall... What > I mean is that there are obviously technical means for software to know > whether they are exhausting the system they are running on or not, and > expecting people to understand all these intricate internal parameters > is just... bizarre. There ought to be some kind of "abstract" setting > for those of us who aren't able to (or even *wish* to) comprehend all > the PG internals, and just want an efficient database using (roughly) as > much of our machine as we want. > > This is not the first time I feel like I'm repeating myself over and > over in different ways but never getting through. It could be that you > are so familiar with PG's internals that it all is obvious to you, but > it could just as well be that you don't want to hear about this. No I am not all that familiar with the Postgres internals. I'm an end user for what qualifies as small databases. The configuration as is works for me, in that any impediments it might impose are hidden by other parts of the stack. I just know, from years on this list, that there are folks who would help you with configuration given some starting point information. -- Adrian Klaver adrian.klaver@aklaver.com
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
Adrian Klaver schrieb am 25.09.2020 um 17:02: >> Would it be nice if I could use special characters like öäü in the names of tables and columns (without the hassle ofquoting them)? >> Yes, absolutely. >> > But you can use them without quoting: > > select version(); > version > ------------------------------------------------------------------------------------ > PostgreSQL 12.3 on x86_64-pc-linux-gnu, compiled by gcc (SUSE Linux) 7.5.0, 64-bit > > create table öäü (id int , fld_1 varchar); > > insert into öäü values (1, 'test'); > > select * from öäü; > id | fld_1 > ----+------- > 1 | test Ah cool ;) I didn't know that, thanks.
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On 9/25/20 7:40 AM, tutiluren@tutanota.com wrote: > I don't mean this to sound rude, but it's like talking to a wall... What > I mean is that there are obviously technical means for software to know > whether they are exhausting the system they are running on or not, and > expecting people to understand all these intricate internal parameters > is just... bizarre. There ought to be some kind of "abstract" setting > for those of us who aren't able to (or even *wish* to) comprehend all > the PG internals, and just want an efficient database using (roughly) as > much of our machine as we want. I found pgtune [1] to be a very good start for tuning PG. There's also a tool available online [2]. [1] https://github.com/gregs1104/pgtune [2] https://pgtune.leopard.in.ua/
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On Thu, Sep 24, 2020 at 11:47:10PM -0500, Ron wrote: > On 9/24/20 6:20 PM, Bruce Momjian wrote: > > On Wed, Sep 23, 2020 at 12:28:14AM +0200, tutiluren@tutanota.com wrote: > > > Sep 21, 2020, 7:53 PM by jd@commandprompt.com: > > > See my comment about Google. The information is out there and easy to find. > > > > > > I guess I'm the worst idiot in the world, then, who can't DuckDuckGo (Google is > > > evil) it even after 15 years. > > > > > > Seriously, I didn't type my feedback "for fun". It may be difficult for very > > > intelligent people to understand (as often is the case, because you operate on > > > a whole different level), but the performance-related PostgreSQL configuration > > > options are a *nightmare* to me and many others. I spent *forever* reading > > > about them and couldn't make any sense of it all. Each time I tried, I would > > > give up, frustrated and angry, with no real clue what "magic numbers" it > > > wanted. > > > > > > It's quite baffling to me how this can be so difficult for you all to > > > understand. Even if we disregard the sheer intelligence factor, it's clear that > > > users of PG don't have the same intimate knowledge of PG's internals as the PG > > > developers, nor could possibly be expected to. > > > > > > As mentioned, I kept going back to the default configuration over and over > > > again. Anyone who doesn't is either a genius or pretends/thinks that they > > > understand it. (Or I'm extremely dumb.) > > I think there is a clear dependency that people reading the docs, > > particularly for performance purposes, must have an existing knowledge > > of a lot of low-level things --- this could be the cause of your > > frustration. > > And that's a serious problem with the documentation. (Not that I know how to > fix it in an OSS project.) We added a glossary in PG 13, so we could certainly have some kind of hardware terms primer which explains various OS features that affect Postgres. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On Fri, Sep 25, 2020 at 10:04:53AM +0200, Thomas Kellerer wrote: > I am not in the USA (and far from being US-centric as well) and I have been working > with relational databases for over thirty years. I never had problems using unquoted > ASCII names (using snake_case) for my database objects. > > Would it be nice if I could use special characters like öäü in the names of tables and columns (without the hassle of quotingthem)? > Yes, absolutely. > > Does not using them, limit me in any way doing what I want to do? > No, it doesn't. > > > Always the same thing. The slightest criticism, no matter how > > warranted, always results in: "Fine. Go somewhere else. Use something > > else." > > Never: "Oh, right. Sorry, but we always used lowercase ourselves and > > therefore didn't consider this. In retrospect, it's an embarrassing > > mistake! We'll fix it in the next release. Thanks for pointing that > > out." > > Changing this behaviour has been discussed on this list multiple times. > > The problem is, that currently no one sees a way to change this without > breaking all (or most) existing code that relies on it (at least not with > the resources the project hast). > > It's my understanding (as a user, not a developer) that the impact on the code base would be huge, and > the community doesn't indeed really see a benefit in doing so. > > It has especially been discussed to implement a behaviour that complies > with the SQL standard which *requires* to fold non-quoted names to uppercase! I did write a blog entry about case folding: https://momjian.us/main/blogs/pgblog/2020.html#June_26_2020 -- Bruce Momjian <bruce@momjian.us> https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
Well not partial as in incremental. Instead dump only some portion of the schema with or without its associated data.It's funny that you should bring that up, considering how it was one of my points... See the point about pg_dump's bug on Windows.
I'm saying that PostGIS has a bug due to incorrectly constructed internal queries which makes it impossible to properly name the schema where PostGIS is to reside, causing my database to look very ugly when it has to say "postgis" instead of "PostGIS" for PostGIS's schema. And that was an example of how sloppy/bad third-party things always are, and is one reason why I don't like it when I have to rely on "extensions".They are extensions so you aren't required to use them and rely on their way of doing things. You have the choice of writing your own code/extension or do without completely.It sure is great to have such choices... I can't take it seriously when people say things like this. It's similar to "it's open source so you can easily vet it yourself". It's not taking reality into consideration at all.As for doing without it, that would make it impossible to deal with GPS coordinates/maps. So it's not really a choice at all.
That would entail building an AI into the code that would deal withall the possible OS(versions), Postgres(versions), hardwarepermutations.I... guess. If "AI" means "a series of ifs". Which is what software... is? I doubt that people who can make the world's most advanced open source database cannot check the amount of RAM and see how fast the CPU/disk is.It is more then that. It would have to take into account the behavior changes that happen in Postgres between major versions. It also would have to account for OS specific parameters and the changes that happen there between OS versions. It also would need to 'know' how the database was going to be used; readonly, heavy writes, etc. Also how the database should play with other programs on the same machine. Add to the mix containers, cloud instances and so on and you are outrunning the ability of 'ifs' to handle it.If it changes that much, it's far, far worse than I even thought, and it sounds like it will be pointless to even *try* to learn it as it keeps changing between versions/OSes/other stuff.
I can't help but feel as if people just don't want to answer this and other concerns I have. As if there's some silent agreement along the lines of "securing PG DBAs' jobs".
The thing is 'general mode' is going to mean something different to someone running a database in the MB-low GB range vs. high GB vs. TB vs. PB.I don't mean this to sound rude, but it's like talking to a wall... What I mean is that there are obviously technical means for software to know whether they are exhausting the system they are running on or not, and expecting people to understand all these intricate internal parameters is just... bizarre. There ought to be some kind of "abstract" setting for those of us who aren't able to (or even *wish* to) comprehend all the PG internals, and just want an efficient database using (roughly) as much of our machine as we want.
"There's plenty of guides" and "the information is out there" doesn't help me and all the other people who have stuck with the default config and thus a massively restricted PG database for all these years. Just because it's easy to you doesn't mean it's easy to everyone else. Just dealing with composing efficient-enough SQL queries and designing an optimized database structure is (way) more than enough work for most of us. I don't have the luxury of some hired DBA who sits all day tuning the PG server. Besides, I've already explained the privacy issues with that even if I had the money...
There's my $0.02 (and then some).
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
> On Sep 22, 2020, at 5:28 PM, tutiluren@tutanota.com wrote: > > I hate the "wrong tool for the job" argument. It assumes that everyone has infinite time, energy and brain capacity tolearn endless redundant tools just to "use the right tool for the job" rather than "use what you actually know". I knowPG. I don't know SQLite. They are very different. So obviously, I want to use PG. https://duckdb.org/docs/why_duckdb Maybe this would provide something in between. I have not used it, but it sounds interesting. Based on SQLite engine anduses the Postgres SQL parser. Lots of built-in functions appear to be the same or very similar to Postgres. John DeSoi, Ph.D.
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On Thu, Sep 24, 2020 at 10:40 PM <tutiluren@tutanota.com> wrote:Well not partial as in incremental. Instead dump only some portion of the schema with or without its associated data.It's funny that you should bring that up, considering how it was one of my points... See the point about pg_dump's bug on Windows.And you seem to have ignored the fact that one of the core developers pointed out that it likely isn't a pg_dump bug - if your terminal is using the same locale as the database, it should have no difficulty dealing with the characters you are having trouble with. It seems likely that you simply need to learn how to get your terminal set up correctly for it to work.
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
> On Oct 1, 2020, at 16:08, tutiluren@tutanota.com wrote: > But of course I should be grateful no matter what because it doesn't cost money. No one is asking you to be grateful. However, you are asking for other people to do things that important to you, but notthem. They are not required to do so. If you cannot persuade them, and are not in a position to pay them, then that'sa reality you'll just have to accept. -- -- Christophe Pettus xof@thebuild.com
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On 10/1/20 4:08 PM, tutiluren@tutanota.com wrote: > > Yeah, this isn't rude or insulting at all... > > It would be refreshing to hear your be honest for once and just admit > that you *want* it to be difficult. You *like* that there's a high > threshold and it makes you feel superior to exclude "dumb" people who > can't figure out all these cryptic (and downright broken) things. I > truly believe that this is the reason for a lot of "weird" things which > seem to make no sense on the surface. Yes, if it makes you fell better we had a secret meeting where we decided to make your life and only your life a living hell when it came to dealing with Postgres. -- Adrian Klaver adrian.klaver@aklaver.com
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On 2020-09-23 13:38:19 -0700, Adrian Klaver wrote: > On 9/23/20 11:51 AM, tutiluren@tutanota.com wrote: > > This comes down to what your definition of embedded is? A matter of > > determining whether we are talking apples or oranges. > > > > Just what I said in my original question. Something which is bundled and > > invisible to the user. > > That is going to be difficult with Postgres as it is its own process. It > would take a good bit of tooling to hide that from the user. I guess that comes down to the definition of "invisible". PostgreSQL certainly won't be invisible to a user running «ps -e» on Linux or using the task manager on Windows. It also won't be invisible to someone who examines the file system and discovers a directory structure with PostgreSQL's rather distinctive file names (the latter is of course true for SQLite, too). But I don't see any great difficulty in bundling PostgreSQL with an application, such that it is automatically installed, configured, populated and managed by the application. The user might never notice that it's there at all. But this is the responsibility of the application's author: The author can put work into an installer or they can decide that it's better (for some definition of "better") to tell the user to install PostgreSQL themselves. It has very little to do with PostgreSQL. > It is more then that. It would have to take into account the behavior > changes that happen in Postgres between major versions. It also would have > to account for OS specific parameters and the changes that happen there > between OS versions. It also would need to 'know' how the database was going > to be used; This is IMHO the most important point: Memory, CPUs, OS versions: There is only so much variety. But how the database is used - that can be very different. For example, on our largest database server I have work_mem set to 1/8 of the total RAM. This is not something I would recommend generally, but it works well for our workload. And most importantly it isn't something that can be determined statically - it depends on the behaviour of the application. So either you already know that (which I, as the author of our application do (mostly), but a Postgres tuning tool can't) or you observe the application for some time (weeks, probably) and adjust parameters (this is something a tool could do, and maybe better than a human, but this is getting into AI territory). hp -- _ | Peter J. Holzer | Story must make more sense than reality. |_|_) | | | | | hjp@hjp.at | -- Charles Stross, "Creative writing __/ | http://www.hjp.at/ | challenge!"
Attachment
Re: Can I get some PostgreSQL developer feedback on these five general issues I have with PostgreSQL and its ecosystem?
On Thu, Sep 24, 2020 at 10:40 PM <tutiluren@tutanota.com> wrote:Well not partial as in incremental. Instead dump only some portion of the schema with or without its associated data.It's funny that you should bring that up, considering how it was one of my points... See the point about pg_dump's bug on Windows.And you seem to have ignored the fact that one of the core developers pointed out that it likely isn't a pg_dump bug - if your terminal is using the same locale as the database, it should have no difficulty dealing with the characters you are having trouble with. It seems likely that you simply need to learn how to get your terminal set up correctly for it to work.Yeah, this isn't rude or insulting at all...Funny how my "incorrectly set up terminal" works perfectly for all other programs and my own test scripts, but not for pg_dump specifically. And only when using "special" characters. As already pointed out multiple times, in great detail, to deaf ears. Very interesting how you can manage to twist and bend that into it still somehow being "my fault". Because of course it cannot be pg_dump's fault. Absolutely not. It is unthinkable. It's the "rude user"'s fault who had the audacity to point out yet another PG bug which more than likely won't *ever* be fixed, as it's not even recognized, much less cared about. Probably because they *want* PostgreSQL to be crippled on Windows, judging by the responses in the past and how incredibly broken the joke of an installer is.You should call it "Linux software with minimal pre-alpha Windows support" instead of pretending that it's cross-platform, and that goes for many FOSS projects as well which think exactly the same as you. The fact that I still use this garbage OS (Windows) speaks volumes of how incredibly crappy Linux is, which is utterly *unusable*.But of course I should be grateful no matter what because it doesn't cost money. Because my time and energy is worthless. And the competition is "even worse", so that means I cannot point out any fault, ever, no matter how serious or how easily it could be fixed. I should just shut up and thank everyone for insulting me through carelessness and words. Or "fix it myself", because that's obviously an option as I haven't done it so far...I did read the rest of your e-mail, but it would be pointless to reply to it as you clearly have the mentality that everyone should dedicate their lives to configuring a database and buying books instead of using it, because everyone should be core developers and everything must always be cryptic and difficult and blablabla. I'm sick of this attitude, and especially of being called "rude" by such rude-beyond-words people.It would be refreshing to hear your be honest for once and just admit that you *want* it to be difficult. You *like* that there's a high threshold and it makes you feel superior to exclude "dumb" people who can't figure out all these cryptic (and downright broken) things. I truly believe that this is the reason for a lot of "weird" things which seem to make no sense on the surface.
I'd say take your time, take some deep breaths and decide that's good for you. Back in 2004 and after 3 yrs of full production software with postgresql someone from the mailing list (he's also in this thread!!) called me a "newbie", and I immediately started looking for alternatives, only to find out simply that there was no better DB software in the market/world back then (and still as we speak). So I stayed with PGSQL and wrote what I believe the best non-tcp-ip DB replication solution for marine and shipping business (over satellite), which still thrives today as far as easiness, automation, completeness, correctness and cost are concerned.
+ I discover every day that I am still a newbie, after 20 yrs with postgresql. This is not personal, this is about being successful in the long run, one should weigh his options and act accordingly. It took me/us a long time before we spent a single penny on someone to write or fix code that would work for us, but this moment eventually came, there is a roof when going with community software. That roof came for us much much later since we begun using PostgreSQL. Value for money is so hard to beat. I have seen the code by our MS SQL partners , some interesting and serious things happening there but when they listen what stock free pgsql can do they just freak out (the ones who understand).
So my advice, tell your CEO's the true potential of this technology and maybe show them some stats, some results of others, some numbers. Put them side by side with the rest of serious solutions and then decide.