Thread: HELP all women were raped during the May riots in Jakarta
We Promise that we will throw a nuclear bomb to Indonesia to destroy this besterd Islam country, if they still persecute our chinese. B J Habibie, we will kill you if you don't settle this case. see what they do! http://lateline.muzi.net/topics/Indonesian_atrocity/ http://muzi.net ¦L¥§¬F©²email:dkiweb@indo.net.id ¦L¥§¥~¥æ³¡Home Page: http://www.deplu.go.id ¥»¦¸¨Æ¥óªººô¶: http://members.spree.com/expose/ http://dailynews.sinanet.com/focus/1998071402/index.html FORWARDED FROM MISSIONNET'S URGENT PRAYER NETWORK ___________________________________________________________ Source: Christian Leaders Association ¨Ó·½:¤Ñ¥D±Ð»â³S¨ó·| 21 June, 1998. Jakarta, Indonesia 1998¦~6¤ë21¤é,¦L¥§¶®¥[¹F Dear friends, (¿Ë·RªºªB¤Í) Here I submit a victim's account of being raped during the May riots here in Jakarta. Reference to Huaran Bulletin Board June 12, 1998. (³o¸Ì§Ú·Q§i¶D¤j®a¦³Ãö¤¤ë¼É°Ê®É¦b¶®¥[¹Fµo¥Íªº ±j¼É¨Æ¥ó,³o¬O§Ú±q¤µ¦~6¤ë12¤éªºHuaran§G§iª©¤¤±oª¾ªº) The purpose is to request your prayers for hundreds of similar victims. (§Æ±æ¤j®a¯à¬°³o¨ÇÄ묹ªÌ¬èë) "My name is Vivian, and I am 18 years old. I have a little sister and brother. As a family we live in what is supposed to be a "secure" apartment. ("§Ú¥sVivian,¤µ¦~18·³.§Ú¦³¤@Ó©f©f©M§Ì§Ì,¦í¦b¤@´É³Q»{ ¬°¬O¦w¥þªº¤½´J) At 9.15 am, May 14th, 1998 a huge crowd had gathered around our apartment. They screamed, "Let's butcher the Chinese!", "Let's eat pigs!", "Let's have a party!" We live of the 7th floor and we got a call from a family on the 3rd floor saying that the crowd had reached the 2nd floor. They even chased some occupants upstairs. We were all very frightened. In our fright we prayed and left everything in God's hands. (1998¦~5¤ë14¤é¤W¤È9ÂI15¤À,¤@¸s¤HÂô¶i§Ú̪º¤½´J ¥L̳۵Û"§ÚÌn±þ¤FµØ¤H!","§Ú̧â½Þ¦Y¤F!",§Ú̦í¦b ¤C¼Ó,¤T¼Óªº®a¤H§i¶D§Ų́º¸s¤H¤w¸g¨ì¤F¤G¼Ó,¥LÌ¬Æ ¦Ü°lµÛ¦í¤á¤W¨Ó,§Ú̳£³QÀ~Ãa¤F,¥u¯à¦b®£Äߤ¤¬èë,±N ¤@¤Á¥æµ¹¤W«Ò) Afterward we left our room and went upstairs to the top floor, as it was impossible to go downstairs and escape. We got to the 15th floor and stayed with some friends. Not long afterwards we were surprised because some of the crowd coming out of the elevators right before we entered the room. We hurried into the room and locked the door tightly. At that time we heard them knock at the other rooms loudly and there were some screams from women and girls. Our room was filled with fear. (¤§«á§ÚÌÂ÷¶}©Ð¶¡°k¨ì³»¼Ó,§¹¥þµLªk¤U¼Ó°k¨«,§ÚÌ©M ¤@¨ÇªB¤Í°k¨ì15¼Ó,¨S¦³¦h¤[,¦b§Ú̸ú¶i©Ð¸Ì¤§«e,¨º¸s ¤H´N½Ä¥X¹q±è,§ÚÌ»°§Ö¶i¤J©Ð¸ÌÂêºò©Ðªù,¥uÅ¥¨ìªù¥~ ¶Ç¨Ó½ðªùªºÁnµ,¥H¤Î¤@¨Ç°ü¤k©M¤k«Äªº¦y¥sÁn,¾ãÓ ©Ð¶¡¥Rº¡µÛ®£Äß) We realized that they would come to us. So we spread throughout the room hiding in the corners.We could hear girls of 10 to 12 years old screaming, "Mommy, ..mommy...mom...mom...it hurts" That time I didn't know that these little girls were being raped. After about half an hour the noise diminished and we had some guts to go out and check. It was indescribable. A lot, some of them youg girls, were lying on the floor. "Oh my God, what has happened?" Seeing all of this we screemed and my little sister Fenny, screamed histerically and hugged her father. (§Ṳ́F¸Ñ¨ì¥LÌ¿ð¦·|¶i¨Ó,©Ò¥H§ÚÌ·æÁYªº¸ú¦bÀ𨤠§ÚÌÅ¥¨ì¤j¬ù10¨ì12·³ªº¤k«Äªº¦y¥s"¶ý...¶ý...¶ý....¦nµh §â¦o¥á¨ì¨Fµo¤W,§Ú°¨¤W¤ÏÀ³¨ì¦oªº¦MÀI,©ó¬O¤j¥s, ¦ý¬O¤@ӼɥÁ¥´¤F§Ú¤@Ó¦Õ¥ú,§Úª¨ª¨¤]³Q¥LÌ¥Î¤ì ´Ò¥´©ü,§Ú¶ý¶ý¦bFenny³Q¥á¨ì¨Fµo¤W®É´N©üˤF,§Ú¥u ¯à¬èë,¥u¯à¬èë¨aÃø¤£n°Á{) Uncle Dodi kept trying to stop them by offering money. His efforts were fruitless. And in the end 5 people raped Fenny. 
Before beginning with the raping they always said "Allahu Akbar" (an islamic phrase in arabic meaning "God is great". They were ferocius and brutal. (Dodi¨û¨û¸ÕµÛ¥Î¿úÅý¥L̤£n¬I¼É,¦ý¬O¨S¦³¥Î,µM«á ¦³¤Ó¤H±j¼É¤FFenny,¨CÓ¦b±j¼É«e³£©ÀµÛ"Allahu Akbar" ¬O¥ì´µÄõ±Ðªºµu¥yªü©Ô§B¸Üªº·N«ä¬O"°¶¤jªº¯«",¥LÌ Åã±o´Ý¼É¦Ó¥B¹³³¥Ã~¤@¯ë) Not long afterward, around 9 men came to the room and dragged me. I also saw them forcing and dragging my Aunt Vera. But at that time I passed out and everything went blank. I became conscious at around 5 or 6 pm. My head hurted and I realized I had no clothing on my body. I cried and realized my family was still there. My father was hugging my mother and little bother Doni. I also saw uncle Dodi lying on the floor and Aunt Vera was crying over his body. I felt so weak and fainted again. (¨S¦³¦h¤[,¤j·§¦³¤EÓ¤H§â§Ú©ì¥X¥h,§Ú¬Ý¨ì¥L̤]§â VeraÂTÂT©ì¥X¨Ó,¦ý¬O§Ú©ü¤F¹L¥h,¤@¤Á³£Åܦ¨ªÅ¥Õ ¤j¬ù¤U¤È5ÂI¨ì6ÂIªº®ÉÔ,§Ú³vº¥ªº«ì´_·NÃÑ,§ÚªºÀY ³¡¨ü¤F¶Ë,¨¤W¤]¤@µ·¤£±¾,§Úú¤F¥X¨Ó,¨Ã¥Bµo²{§Ú ªº®a¤HÁÙ¦b,§Úª¨ª¨©êµÛ§Ú¶ý¶ý©M§Ì§Ì,Dodi¨û¨û˦b ¦a¤W¦ÓVeraÂTÂT©êµÛ¥Lµhú,§Ú·P¨ìµê®z¦Ó¤S·w¯t¹L¥h) The next day I was in the Pluit hospital. My father and mother were beside me. With all the pains on my body I asked, "Mom, why Fenny. Mom?" I felt a stinging pain as I said these words. My cheeks were swollen. My mother cried again and couldn't speak any words, while my father, holding back his tears, managed to smile at me. After 4 days in treatment, my condition has improved. With a sad look, my father told me then what had happened. After I fainted 7 people raped me. At that time my father still couldn't see well after beling hit with a piece of wood. They raped me repeatedly. Then my father said "Vivian, Fenny is gone..." (²Ä¤G¤Ñ§Ú³Q°e¨ìPluitÂå°|,§Úªºª¨¶ý¦b§Ú¨®Ç,§Ú§ÔµÛµh °Ý¥LÌ"¶ý,Fenny©O?¶ý"»¡¸ÜÅý§Ú·P¨ì°w¨ë¯ëªºµhW,§Ú ¶ý¶ýú¤F°_¨Ó,¤@¥y¸Ü³£»¡¤£¥X¨Ó,§Úª¨ª¨§Ô¦í²\¤ô,¹ï§Ú W¯º¤@¤U,¥|¤Ñ¤§«á,§Úªº±¡ªp¦n¤FÂI,§Úªº¤÷¿Ë¤@Áy¶Ë´d ªº§i¶D§Ú,·í®É§Ú©ü°g¤F¥H«á,¦³7Ó¤H±j¼É¤F§Ú,§Ú¤÷¿Ë«h ³Q¶Ã´Ò¼Þ¥´,¨º¨Ç¼É¥Á«½Æªº±j¼ÉµÛ§Ú.¶ý¶ý¦b¤@®Ç¶Ë¤ßªº »¡"Vivian,Fenny¦º¤F...") I was confused and cried out, "Why Dad?" My father couldn't answer. He told me to rest and went out of the room. I cried over and over again, feeling that my life had no meaning any more. A week ago, after I was released from the hospital I was told everything that had happend. (§Ú¸£¤¤¤@¤ù²V¶Ã,ú¤F°_¨Ó,"ª¨ª¨.¬°¤°»ò?"§Ú¤÷¿ËµLªk ¦^µª§Ú,¥L§i¶D§Ú¦n¦n¥ð®§,¨«¤F¥X¥h,§Ú¤£°±ªºú,§Úªº¤H ¥Í¤w¸g§¹¥þ¨S¦³¥ô¦ó·N¸q¤F,¤@Ó¬P´Á¹L¥h¤F,¦b§Ú¥X °|¤§«á,¤~ª¾¹D¨Æ±¡ªº¾ãÓ¸g¹L) When Fenny was raped she kept on fighting and so she was repeatedly slapped by her rapists. The last time she fought Fenny spitted on one of them. Offended, the man grabbed a knife and stabbed Fenny's stomach over and over again. Finally she died with blood over her whole body. (Fenny¦b³Q±j¼Éªº®ÉÔ¤£°±ªº¤Ï§Ü,©ó¬O¨º¨Ç¼É¥Á¤£Â_ ªº¥´¦o,³Ì«áFennyªº¤Ï§Ü·S¤õ¤F¨ä¤¤¤@ӼɥÁ,¥L§ì°_ ¤@§â¤M¤l¨ë¶iFennyªº¨{¤l,¤@¦¸¤S¤@¦¸ªº¨ë¶i¨ë¥X,³Ì «áFenny¥þ¨¬O¦åªº¦º¤F) My father told me that uncle Dodi had the same fate watched by aunt Vera who was also raped. "God...why should all of this happen? Where are you God? Are you still alive?" My aunt Vera now stays with her parents. She is in shock. Her face is blank and refuses to eat. Almost every hour my mother and I cry over all these happenings. I can never forget. These mobs of people are uncivilized monsters." (¤÷¿Ë§i¶D§Ú,Dodi¨û¨û¤]¬ÝµÛ¦Û¤vªº¤Ó¤Ó³Q±j¼É,"¤Ñ§r! 
¬°¤°»ò·|µo¥Í³oºØ¨Æ?¯«¦bþ¸Ì?Í¢ÁÙ¬¡µÛ¶Ü?"§ÚÂTÂT Vera²{¦b©M¥Lªº¤÷¥À¦í¦b¤@°_,¦o¨ü¨ìÄY«ªºÅåÀ~, ¦oªºÁy¤W¨S¦³¦å¦â¦Ó¥B©Úµ´¶i¹,§Ú©M¶ý¶ýµL®ÉµL¨è ¦]¬°³o³õ´c¹Ú¦Óúª_,§Ú¥Ã»·§Ñ¤£¤F,¨º¨Ç¼É¥Á´N¹³¬O ¨S¦³¶i¤Æªº©ÇÃ~) Additional comments from Bill Hekman: (¥H¤U¬OBill Hekmanªºªþµù) This is one of many victims. Hundreds of women and children were raped, mutilated and killed by muslim mobs. Some had their vaginas ripped apart, their bodies cut into pieces. (³o¥u¬O«Ü¦hªºÄ묹ªÌ¤¤ªº¤@Ó,¦³¼Æ¦Ê¦ì°ü¤k»P¤p«Ä ³Q¦^±Ð¼É®{±j¼É,±ÙÂ_¤â¸},±þ®`,¦o̪ºªº³±¹D³Q¼¹µõ ,¨Åé³Q¬å¦¨¦n´X¬q) Over 5000 of the Chinese Indonesian's shops were looted and burned down. A few days ago anther 63 shops were burned in Tegal, Central Java. The city of Solo is burned down. There is no protection and no justice in this country any more. (¶W¹L5000®a¥H¤Wªº¦L¥§µØ¤Hªº°Ó©±³Q±°¹Ü©MµI¿N, ´X¤Ñ«e¦b¤ö«zªºTegal,¥t¥~63®a°Ó©±³Q©ñ¤õ¿N±¼,¦L¥§ ²{¦b§¹¥þ¨S¦³«O»Ù,¥¿¸q¿ºµMµL¦s) Yesterday I was in the Kelapa Gading area and that area was spared from destruction. The police and military had guarded all the entry roads. The people there had collected large sums of money from door to door and paid for their protection. (¬Q¤Ñ§Ú¦bKelapa Gading°Ï,¨º¸Ì¥Ø«eÁÙ¨S¦³³Q¯}Ãa,ĵ¹î ©Mx¶¤¦b©Ò¦³¸ô¤fĵ§Ù,¨º¸Ìªº¤H̶°¦X¤F¤@¤jµ§¿ú¨Ó ¤ä¥I³o¨Ç«OÅ@) A similar situation took place in the Pondok Indah area.For the people who cannot pay millions to the armed forces there is no protection. Right now the hunderds of thousands of thugs,robbers, rapist, and killers live all around us. They are our neighbors. There is no punishment for the criminals and no justice for the victims. Yet, all Indonesians call themselves believers in God almighty. What a hipocracy. Shouting "God is great" when raping women andchildren is a blasphemy against a Holy God. (¦bPondok Indah°Ï¤]¬O¦P¼Ëªº±¡ªp,®³¤£¥X¿úªº¤H´N§¹¥þ ¨S¦³¥ô¦ó«O»Ù,²{¦b¦³¦¨¤d¤W¸Uªº¦^±Ð¨ë«È,±jµs,±j¼ÉªÌ©M ±þ¤HÅ]¦í¦b§Ú̪º¥|©P,¥L̬O§Ú̪º¾F©~,¹ï©ó¸o¥Ç¨S¦³ ³B»@,¹ï©ó¨ü®`ªÌ¤]¨S¦³¤½²z,¦ý¬O,©Ò¦³ªº¦L¥§¤H©I³êµÛ ¥LÌ©Ò«H¥õªº"¥þ¯àªº¯«:"¤Ó¿Ø¨ë¤F.¦b±j¼É°ü¤k©M¤p«Ä®É ©I³Û°¶¤jªº¯«,³o¬O¹ï¯«¸t¤W«Òªº«_Âp) Pray that God will annoint His preachers and missionaries throughout this nation with the power of the Holy Spirit to preach the message of repentance. God's word in 2 Chronicles 7:14 needs to be proclaimed boldly. There is no room for preachers filled with fear who think of evacuation and other selfish plans. Pray for Revival in all our churches. (¬èë¤W«Ò¦b³oÓ°ê®a¤¤¬°Í¢ªº¤l¥Á¶î¤Wªo»I,¥Î¯«¸tªº ¤O¶q¶Ç»¼®¬§ïªº°T®§,¬ù¸t¸g¾ú¥N§Ó²Ä7³¹²Ä14¸`À³¸Ó ³QÅã©úªº«Å§i, There is no room for preachers filled with fear who thinkof evacuation and other selfish plans.(³o¤@¥y¦b¤U½¤£¥X¨Ó¾A¤Áªº¦r¥y,±æ¦³ ¯àªÌ¸É¤W) Ä@§Ú̪º¸t·µ¬Ò¯à´_¿³ Some Christians are putting signs on their shops "Owned by Muslim". May God forgive them. Healing of this nation filled with crime and unjustice is bringing God's judgement and punishment. Healing and Salvation can only come with a nasional repentance at all levels in the government, armed forces and society. Then we need to share the Gospel of the Lord Jesus Christ. Christ is the One and Only Savior. No one will ever receive forgiveness and see heaven except through God's appointed Savior, the Lord Jesus Christ. Thank you for standing with us now. God bless you. (¦³ªº¤Ñ¥D±Ð®{ªº°Ó©±³Q¶K¤W"¦^±Ð®{©Ò¦³"ªº¼Ð»y,Ä@ ¤W«Ò¼e®¤¥LÌ,¥Î¤W«Òªº¼f§P»PÃg»@¨ÓÂåªv³oÓ¥Rº¡ ¸o´c»P¤£¸qªº°ê«×,°ß¦³¾ãÓ°ê®a-¦U¯Å¬F©²,x¶¤©MªÀ·| -ªº®¬§ï¤~¯à°±¤î¶Ëµh,§ÚÌn¶Ç¼½°ò·þªººÖµ,°ò·þ¬O°ß chaos in Indonesia last May 13-15. Many Chinese Indonesian citizens were abused, tortured and killed. 
Their houses and stores were looted and burnt. Hundreds of Chinese Indonesian girls/women (aged 10-55) were sexually harassed and gang raped brutally. Some victims were even raped in front of their family members or in front of inhuman cheering crowd. (½Ð±N³o«Ê"¶Àµ·±a"ªº«H¥ó±Hµ¹±zªºªB¤Í,¨Óªí¥Ü§Ú̹ï©ó ¦b¤¤ë13-15¤é©ó¦L¥§¼É°Ê¤¤¨ü®`ªºÄ묹ªÌ¤@Åé·P»P¦P±¡ ³\¦hµØ¸Çªº¦L¥§¤½¥Á³Q±j¼É,âh,±þ®`.¥L̪º©Ð¤l©M°Ó©± ³Q·m§T©MµI¿N,¦³ªº¨ü®`ªÌ´N¦b®a¤H»P¤@¸s¨S¦³¤H©Ê ÁÙÅw©Iªº¸s²³±«e³Q½ü¼É) Some of them were even thrown into the fire and burnt to death after being raped. Yet, not many actions seem to have been taken to investigate all this or to help the victims. And not very many people seem to know or care about what happened. Please help to spread the news and let the world know. (¨ä¤¤¦³¨Ç¤H¦b³Q±j¼É¤§«á,ÁÙ³Q¥á¤J¤õ¤¤µI¿N¦Ü¦º,¦ý¬O,¨Ã ¨S¦³³\¦h½Õ¬d©Î¨ó§Uªº¦æ°Ê¦b¶i¦æ,¦Ó¥B¤]ÁÙ¨S¦³«Ü¦h¤HÃö ¤ß©Î¬Oª¾¹D³o¨Ç¨Æ±¡,½Ð¨ó§U´²§G³o¨Ç®ø®§,Åý°ê»Ú¶¡ª¾¹D ³o¨Ç¨Æ) We need help to get more international attention to help Chinese Indonesians, who are now living in fear in Indonesia. Please pass this ribbon around as the symbol of campaign against human rights violations, injustice, and racism towards Chinese Indonesians. (§ÚÌ»Ýn¤Þ°_°ê»Ú¶¡ªº«µø,¥H«KÀ°§U¥Ø«e¥Í¬¡¦b®£Äߤ¤ ªº¦L¥§µØ¤H,½Ð±Nµ·±a§@¬°¤Ï¹ï¥[½Ñ©ó¦L¥§µØ¤Hªº¼É¤O, ¤£¥¿¸q¥H¤ÎºØ±Ú¥D¸qªº¼Ð»x) Show that we care and may God help us! (§i¶D¥@¤H§ÚÌÃö¤ß,Ä@¤W«Ò½ç»P§Ṳ́O¶q)
On Wed, 24 Nov 2004 19:26:41 +1100, Neil Conway <neilc@samurai.com> wrote: > On Mon, 2004-11-22 at 07:57 +0000, Paweł Niewiadomski wrote: > > I saw discussion about bitmap indexes a few weeks ago. I wonder if > > any of you is working on it (in secret)? > > For what it's worth, I don't know of anyone working on them. > > > I will be choosing the subject > > of my master's thesis and thought about implementing bitmap indexes. > > No objection here :) My promoter accepted the subject. I'm waiting for an official acceptance of the subject. Until then I want to get familiar with PostgreSQL internals and bitmap index implementations. I would appreciate any links to online papers or books that could help me. -- **Pawel Niewiadomski**, new()foo-baz.com, http://new.foo-baz.com/ Virtual Qmail (http://v-q.foo-baz.com), qmail-patches (http://q-p.foo-baz.com)
Hi, On Fri, Nov 26, 2004 at 10:25:41AM -0000, Pawel Niewiadomski wrote: > > My promoter accepted the subject. I'm waiting for an official acceptance > of the subject. Until then I want to get familiar with PostgreSQL > internals and bitmap index implementations. I would appreciate > any links to online papers or books that could help me. That sounds nice! I thought of implementing it as part of my master's thesis next year, so I already started reading about bitmap indexing and so on. A good place to start is probably Hector Garcia-Molina's "Database System Implementation". Regards, Yann
On Sat, 29 Jan 2005 18:46:44 +0000, Mike Rylander <mrylander@gmail.com> wrote: > For on-disk bitmap indexes, yes. I don't see any reason this couldn't > be done with GiST, perhaps even as a generalization of the index stuff > in the int_array contrib module. I was thinking about playing with the core source of PostgreSQL and creating patches; from what I've read, GiST has some limitations that I want to overcome. As for my idea - I was thinking about implementing on-disk indexes, not in-memory ones. I think having both of them would be great :-) -- **Pawel Niewiadomski**, new()foo-baz.com, http://new.foo-baz.com/ Virtual Qmail (http://v-q.foo-baz.com), qmail-patches (http://q-p.foo-baz.com)
Just ran into a fascinating edge case. One of our folks was building a stored function, and ran into an odd error when trying to COPY to stdout. Here's a characteristic example:

create or replace function build_table (integer) returns integer as '
begin
    execute ''copy foo to stdout;'';
    return 1;
end' language plpgsql;

The result:

/* cbbrowne@[local]/dba2 in=*/ select try_copy(1);
ERROR: unexpected error -2 in EXECUTE of query "copy dups to stdout;"
CONTEXT: PL/pgSQL function "try_copy" line 2 at execute statement

If I instead create

create or replace function build_table (integer) returns integer as '
    copy foo to stdout;
    select 1;
' language sql;

The latter works fine. There is evidently Something Strange about the state of stdout when it is referenced inside a stored procedure. We can work around this reasonably, but the "unexpected error -2" elicits some curiosity I'd like to satisfy. (I bounced it off one of the others that aren't off on vacation, and his reaction was exactly the same as mine, namely "Hmmm... I'm just not sure what to expect from that...") Is there a good reason/excuse to give as to why the pl/pgsql version of the COPY *shouldn't* work? -- (format nil "~S@~S" "cbbrowne" "acm.org") http://www.ntlug.org/~cbbrowne/sap.html Rules of the Evil Overlord #78. "I will not tell my Legions of Terror "And he must be taken alive!" The command will be: ``And try to take him alive if it is reasonably practical.''" <http://www.eviloverlord.com/>
On Tue, Jul 26, 2005 at 17:56:42 -0400, Chris Browne <cbbrowne@acm.org> wrote: > > There is evidently Something Strange about the state of stdout when it > is referenced inside a stored procedure. I suspect this is related to trusted PLs not being able to write files. It does seem like a problem that COPY has access to stdout in one case and not the other.
Chris Browne <cbbrowne@acm.org> writes:
> create or replace function build_table (integer) returns integer as '
> begin
>    execute ''copy foo to stdout;'';
>    return 1;
> end' language plpgsql;

> The result:
> /* cbbrowne@[local]/dba2 in=*/ select try_copy(1);
> ERROR: unexpected error -2 in EXECUTE of query "copy dups to stdout;"
> CONTEXT: PL/pgSQL function "try_copy" line 2 at execute statement

In CVS tip I get

ERROR: cannot COPY to/from client in PL/pgSQL
CONTEXT: PL/pgSQL function "build_table" line 2 at execute statement

There are some protocol-level reasons for not supporting this, IIRC. You're already inside a SELECT and the client library would probably get terribly confused.

regards, tom lane
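A workaround consistent with Tom's explanation is to COPY to a server-side file rather than to the client, since that path never touches the frontend protocol. A rough sketch, not taken from the thread (it assumes a superuser caller, a server-writable path, and 8.0-style dollar quoting; the function, table, and file names are only illustrative):

create or replace function dump_foo() returns integer as $$
begin
    -- COPY ... TO 'filename' runs entirely inside the server, so it does
    -- not hit the "cannot COPY to/from client" restriction that COPY TO
    -- STDOUT does when executed from PL/pgSQL.
    execute 'copy foo to ''/tmp/foo.copy''';
    return 1;
end
$$ language plpgsql;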
Per tonight's dinner discussion, it Sure Would Be Nice if psql had an additional option that allowed passing in a conninfo string. The notion: Plenty of libraries out there like Pg, DBI::Pg, and such make you specify connections in the form: "host=my.db.host.example.org port=5678 dbname=dumb_db user=chewie" Since those strings need to be kept around for the programs that are directly using conninfo, it Sure Would Be Nice if those strings could also be used as arguments to psql. Peter Eisentraut observed that "-D" is not yet used; that's certainly an option. Thus, supposing we kept conninfo strings in a series of files in /opt/scripts/conninfo, we might run a script via...

#!/bin/sh
CONNINFO=`cat /opt/scripts/conninfo/dumb_db.conninfo`
PERIOD='90 days'
QUERY="delete from my_table where trans_on < now() - '${PERIOD}'::interval;"
QUERY2="vacuum analyze my_table;"
psql -D "${CONNINFO}" -c "${QUERY}"
psql --pqconninfo="${CONNINFO}" -c "${QUERY2}"

I'm not sure those are forcibly the best names for options, but they're a thought... -- select 'cbbrowne' || '@' || 'gmail.com'; http://linuxdatabases.info/info/ Rules of the Evil Overlord #10. "I will not interrogate my enemies in the inner sanctum -- a small hotel well outside my borders will work just as well." <http://www.eviloverlord.com/>
Christopher Browne said: > Per tonight's dinner discussion, it Sure Would Be Nice if psql had an > additional option that allowed passing in a conninfo string. > > The notion: Plenty of libraries out there like Pg, DBI::Pg, and such > make you specify connections in the form: > "host=my.db.host.example.org port=5678 dbname=dumb_db user=chewie" > > Since those strings need to be kept around for the programs that are > directly using conninfo, it Sure Would Be Nice if those strings could > also be used as arguments to psql. > > Peter Eisentraut observed that "-D" is not yet used; that's certainly > an option. > > Thus, supposing we kept conninfo strings in a series of files in > /opt/scripts/conninfo, we might run a script via... > > #!/bin/sh > CONNINFO=`cat /opt/scripts/conninfo/dumb_db.conninfo` > PERIOD='90 days' > QUERY="delete from my_table where trans_on < now() - > '${PERIOD}'::interval;" QUERY2="vacuum analyze my_table;" > psql -D "${CONNINFO}" -c "${QUERY}" > psql --pqconninfo="${CONNINFO}" -c "${QUERY2}" > > I'm not sure those are forcibly the best names for options, but > they're a thought... Why is this better than using the service file? cheers andrew
On Fri, Jul 07, 2006 at 05:33:14AM -0500, Andrew Dunstan wrote: > Christopher Browne said: > > > > The notion: Plenty of libraries out there like Pg, DBI::Pg, and such > > make you specify connections in the form: > > "host=my.db.host.example.org port=5678 dbname=dumb_db user=chewie" > > Why is this better than using the service file? What Andrew asked. Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote!
david@fetter.org (David Fetter) writes: > On Fri, Jul 07, 2006 at 05:33:14AM -0500, Andrew Dunstan wrote: >> Christopher Browne said: >> > >> > The notion: Plenty of libraries out there like Pg, DBI::Pg, and such >> > make you specify connections in the form: >> > "host=my.db.host.example.org port=5678 dbname=dumb_db user=chewie" >> >> Why is this better than using the service file? > > What Andrew asked. It's not *better* than PGSERVICE; it's just that this does reflect the connection representation widely used by every sort of library out there that uses libpq. Let's do a quick "guess the language"... Which language is this?

conn = PGconn.open("host=my.db.host.example.org port=5678 dbname=dumb_db user=chewie");

How about this?

$conn = Pg::connectdb("host=my.db.host.example.org port=5678 dbname=dumb_db user=chewie");

Hmm... This one should be easy to guess...

import psycopg2
dbcon = psycopg2.connect('host=my.db.host.example.org port=5678 dbname=dumb_db user=chewie')

This one might conceivably be easy to mistake for another...

$dbconn = pg_connect("host=my.db.host.example.org port=5678 dbname=dumb_db user=chewie") or die('Could not connect: ' . pg_last_error());

How about this?

set conres [catch {set newdbc [pg_connect -conninfo "host=my.db.host.example.org port=5678 dbname=dumb_db user=chewie"]} msg]

This isn't as obvious as it may seem...

(setq P (pq-connectdb "host=my.db.host.example.org port=5678 dbname=dumb_db user=chewie"))

But this probably is...

(setf conn (postgresql::PQconnectdb "host=my.db.host.example.org port=5678 dbname=dumb_db user=chewie"))

Gavin would be extremely disappointed if I had left this one out :-)

let main () =
  let c = new connection ~conninfo:"host=my.db.host.example.org port=5678 dbname=dumb_db user=chewie" () in
  let _ = c#exec ~expect:[Copy_in] ("copy " ^ Sys.argv.(1) ^ " from stdin") in
  c#copy_in_channel stdin;
  c#finish

The common thing across all of these is that there's a single string passed in to request the connection. A string that libpq in effect requires, but which psql, somewhat oddly, does not allow you to directly specify. We've had cases where we needed to store pgconn "connection signatures", for reports, and then discovered we wanted to use them to describe psql connections. A "--pgconn" or "-D" option would allow doing this. I agree that pg_service.conf is probably cleaner, but it seems somewhat odd for The Poster Child libpq application, psql, *not* to have a way of passing in a conninfo value. -- let name="cbbrowne" and tld="acm.org" in name ^ "@" ^ tld;; http://cbbrowne.com/info/oses.html If a mute swears, does his mother wash his hands with soap?
On 7/7/06, Andrew Dunstan <andrew@dunslane.net> wrote:
> Why is this better than using the service file?
service file is not well described in manual. quick grep showed that we have PGSERVICE variable, and pg_service.conf.sample, which doesn't even include all possible parameter names.
and - comments in this file suggest it is to be used globally only: "Copy this to your sysconf directory (typically /usr/local/pgsql/etc) and...", and not "per user".
i would really love to see pg_service.conf extended in such a way that i could:
1. store it in $HOME/.pg_service.conf
2. use name of service on psql invocation (not as env variable). preferably using psql service_name. this will conflict with current behaviour (treating argument as dbname), but with some basic logic, i believe it can be done.
just my $0.02.
depesz
hubert depesz lubaczewski wrote: > On 7/7/06, *Andrew Dunstan* <andrew@dunslane.net> wrote: > > Why is this better than using the service file? > > > > service file is not well described in manual. quick grep showed that > we have PGSERVICE variable, and pg_service.conf.sample, which doesn't > even include all possible parameter names. > and - comments in this file suggest it is to be used globally only: > "Copy this to your sysconf directory (typically /usr/local/pgsql/etc) > and...", and not "per user". > i would really love to see pg_service.conf extended in such a way that > i could: > 1. store it in $HOME/.pg_service.conf > 2. use name of service on psql invocation (not as env variable). > preferably using psql service_name. this will conflict with current > behaviour (treating argument as dbname), but with some basic logic, i > believe it can be done. The service file location is runtime configurable. Just set PGSYSCONFDIR. cheers andrew
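To make that concrete, a per-user setup along those lines is already possible with the two knobs mentioned in the thread. A sketch, not from the thread itself (the directory, service name, and connection values are only examples, reusing the conninfo fields from earlier in the discussion):

#!/bin/sh
# Keep a private services file and point libpq at its directory.
mkdir -p $HOME/pgservice
cat > $HOME/pgservice/pg_service.conf <<'EOF'
[dumb_db]
host=my.db.host.example.org
port=5678
dbname=dumb_db
user=chewie
EOF

export PGSYSCONFDIR=$HOME/pgservice    # where libpq looks for pg_service.conf
PGSERVICE=dumb_db psql -c 'select 1;'  # pick the service by name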
Looks like gBorg has gone down... The Slony-I project does plan to move to pgFoundry, once 1.2 is released... <http://slony-wiki.dbitech.ca/index.php/Move_to_PgFoundry_Checklist> But we need to get to that point (1.2) first. Alas, gBorg being down today doesn't help :-(. -- (format nil "~S@~S" "cbbrowne" "ntlug.org") http://cbbrowne.com/info/sap.html "Success is something I will dress for when I get there, and not until." -- Unknown
I mentioned this on -general, I believe, but gborg is on one of our last FreeBSD 4.x servers right now ... I have to schedule moving it to the FreeBSD 6.x machines ... neptune had some problems early this morning, and the 4.x fsck took 9 @@#$@$@# hours to run ;( On Tue, 18 Jul 2006, Chris Browne wrote: > Looks like gBorg has gone down... > > The Slony-I project does plan to move to pgFoundry, once 1.2 is released... > <http://slony-wiki.dbitech.ca/index.php/Move_to_PgFoundry_Checklist> > > But we need to get to that point (1.2) first. Alas, gBorg being down > today doesn't help :-(. > -- > (format nil "~S@~S" "cbbrowne" "ntlug.org") > http://cbbrowne.com/info/sap.html > "Success is something I will dress for when I get there, and not > until." -- Unknown > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings > ---- Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email . scrappy@hub.org MSN . scrappy@hub.org Yahoo . yscrappy Skype: hub.org ICQ . 7615664
In support of PG 8.2, we need to have the log trigger function do the following:
- Save value of standards_conforming_string
- Set value of standards_conforming_string to FALSE
- proceed with saving data to sl_log_?
- Recover value of standards_conforming_string

The variable, standards_conforming_string, does not appear to be widely exported, but rather seems pretty localized to guc.c. This is the prime thing holding us from RC2. Tom, Peter, you have touched guc.c in the context of standards_conforming_string; perhaps you can suggest something? Darcy and I aren't seeing what to do... -- let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;; http://www3.sympatico.ca/cbbrowne/ Why do we drive on parkways and park on driveways?
Chris Browne wrote: > In support of PG 8.2, we need to have the log trigger function do the > following: > - Save value of standards_conforming_string > - Set value of standards_conforming_string to FALSE > - proceed with saving data to sl_log_? > - Recover value of standards_conforming_string Would SET LOCAL help you? -- Peter Eisentraut http://developer.postgresql.org/~petere/
Chris Browne <cbbrowne@acm.org> writes: > In support of PG 8.2, we need to have the log trigger function do the > following: > - Save value of standards_conforming_string > - Set value of standards_conforming_string to FALSE This is a good idea why exactly? What are you even expecting it to do? The backend doesn't use standard_conforming_strings to determine any output, other than in ruleutils.c which I didn't think Slony uses. regards, tom lane
peter_e@gmx.net (Peter Eisentraut) writes: > Chris Browne wrote: >> In support of PG 8.2, we need to have the log trigger function do the >> following: >> - Save value of standards_conforming_string >> - Set value of standards_conforming_string to FALSE >> - proceed with saving data to sl_log_? >> - Recover value of standards_conforming_string > > Would SET LOCAL help you? Not really. The log trigger function is an SPI function, and I don't think I want to be invoking an extra SQL request every time a tuple is updated. Consider our present handling of date localizations... Slony-I prefers to operate using ISO dates. So the log trigger function must force the datestyle correspondingly. So our code looks like... (eliding irrelevant code)

int OldDateStyle = DateStyle;
DateStyle = USE_ISO_DATES;

/* code that generates data to stow in sl_log_n */

DateStyle = OldDateStyle;   /* Retrieve user's local settings */

-------------------------------

At one point, I thought that we'd modify this to:

int OldDateStyle = DateStyle;
int OldSCS = standards_conforming_strings;
DateStyle = USE_ISO_DATES;
standards_conforming_strings = FALSE;

/* code that generates data to stow in sl_log_n */

DateStyle = OldDateStyle;   /* Retrieve user's local settings */
standards_conforming_strings = OldSCS;

Unfortunately (perhaps) standards_conforming_strings does not appear to be exported, so I'm not sure how to do this otherwise. -- (reverse (concatenate 'string "gro.mca" "@" "enworbbc")) http://www.ntlug.org/~cbbrowne/wp.html Editing is a rewording activity. -- Alan J. Perlis [And EMACS a rewording editor. Ed.]
Chris Browne <cbbrowne@acm.org> writes: > Unfortunately (perhaps) standards_conforming_strings does not appear > to be exported, so I'm not sure how to do this otherwise. Perhaps your problem is one of spelling? It's "standard_conforming_strings", and it's certainly a global variable. You still haven't explained why you need this, though. There are no datatype output functions that examine this variable. regards, tom lane
tgl@sss.pgh.pa.us (Tom Lane) writes: > Chris Browne <cbbrowne@acm.org> writes: >> Unfortunately (perhaps) standards_conforming_strings does not appear >> to be exported, so I'm not sure how to do this otherwise. > > Perhaps your problem is one of spelling? It's > "standard_conforming_strings", and it's certainly a global variable. > > You still haven't explained why you need this, though. There are > no datatype output functions that examine this variable. Apologies. I was reading too much into the problem report. After some discussion on IRC, I think we've "simplified" things back to what the problem is. Our logtrigger() function is already applying a "non-SCS" policy to quote backslashes. No need to declare anything there, as had been the case with a recent DateStyle issue. With DateStyle, we had to control the style in two places:

1. In logtrigger(), when PostgreSQL is asked to generate the string version of a datestamp. We do so by temporarily setting DateStyle.
2. When loading data, we need to make sure the connection uses a consistent DateStyle. We do so by setting the GUC variable using SET.

I was under the misapprehension that we needed something akin to that step 1; apparently we only need to worry about step 2, for which Peter's suggestion is entirely sufficient. Sorry about the confusion. -- let name="cbbrowne" and tld="ntlug.org" in name ^ "@" ^ tld;; http://cbbrowne.com/info/internet.html "Who is General Failure and why is he reading my hard disk?" -- <leitner@inf.fu-berlin.de>, Felix von Leitner
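Concretely, that second step is just ordinary GUC handling on the connection that applies the log data. A sketch of what Peter's SET LOCAL suggestion would look like there (illustrative only, not from the thread; a plain per-session SET works as well):

BEGIN;
-- Pin string-literal and date handling for the statements that replay
-- the captured sl_log_? data, regardless of the server's defaults.
-- SET LOCAL reverts automatically at COMMIT or ROLLBACK.
SET LOCAL standard_conforming_strings = off;
SET LOCAL DateStyle = ISO;
-- ... apply the captured log rows here ...
COMMIT;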
Chris Browne wrote: > Here's a patch to add in the material on replication recently > discussed on pgsql.docs. I'm not thrilled that there were only a few > comments made; I'd be happy to see "slicing and dicing" to see this > made more useful. The agreed-to process was

1. post information on pgsql-general
1.a. solicit comments
2. put information page on web site
3. link from documentation to web site

You seem to have short-circuited all that. I don't think this sort of material belongs directly into the PostgreSQL documentation. -- Peter Eisentraut http://developer.postgresql.org/~petere/
> > 1. post information on pgsql-general > 1.a. solicit comments > 2. put information page on web site > 3. link from documentation to web site > > You seem to have short-circuited all that. > > I don't think this sort of material belongs directly into the PostgreSQL > documentation. It might be interesting to have some links in the external projects area for replication, but a section of its own doesn't seem relevant. Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/
Joshua D. Drake wrote: > >I don't think this sort of material belongs directly into the PostgreSQL > >documentation. Why not? > It might be interesting to have some links in the external projects area > for replication, but a section of its own doesn't seem relevant. I disagree about "having some links". Maybe we should consider adding this as a section in the external projects chapter, instead of having a chapter of its own, but "some links" seems a little short on actual contents. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera wrote: > Joshua D. Drake wrote: > >>> I don't think this sort of material belongs directly into the PostgreSQL >>> documentation. > > Why not? Well Peter said that, not me :) > >> It might be interesting to have some links in the external projects area >> for replication, but a section of its own doesn't seem relevant. > > I disagree about "having some links". Maybe we should consider adding > this as a section in the external projects chapter, instead of having a > chapter of its own, but "some links" seems a little short on actual > contents. O.k. more specifically, I think that the content (even if it is a section) probably deserves discussion in the external projects section. Joshua D. Drake > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/
Joshua D. Drake wrote: > Alvaro Herrera wrote: > >Joshua D. Drake wrote: > > > >>>I don't think this sort of material belongs directly into the PostgreSQL > >>>documentation. > > > >Why not? > > Well Peter said that, not me :) I know, but I thought I'd post one message instead of two. (In fact I didn't even think about it -- I just assume it's clear.) > >>It might be interesting to have some links in the external projects area > >>for replication, but a section of its own doesn't seem relevant. > > > >I disagree about "having some links". Maybe we should consider adding > >this as a section in the external projects chapter, instead of having a > >chapter of its own, but "some links" seems a little short on actual > >contents. > > O.k. more specifically, I think that the content (even if it is a > section) probably deserves discussion in the external projects section. Sure, see my suggestion above. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote: > > >I don't think this sort of material belongs directly into the > > > PostgreSQL documentation. > > Why not? PostgreSQL documentation (or any product documentation) should be factual: describe what the software does and give advice on its use. This should be mostly independent of the external circumstances, because people will still read that documentation three or four years from now. The proposed text is, at least partially, journalistic: it evaluates competing ideas, gives historical and anecdotal information, reports on current events, and makes speculations about the future. That is the sort of material that is published in periodicals or other volatile media. At the summit, we resolved, for precisely these reasons, to keep the journalistic parts on the web site, for clear separation from the shipped product and for easier updates (and for easier reference as well, because the PostgreSQL documentation is not the single obvious place to look for it) and refer to it from the documentation. -- Peter Eisentraut http://developer.postgresql.org/~petere/
I was thinking of something similar to our encryption section: http://www.postgresql.org/docs/8.1/static/encryption-options.html The idea being to define issues like multi/single master, async vs. sync, and mention the projects which are in each category. --------------------------------------------------------------------------- Peter Eisentraut wrote: > Alvaro Herrera wrote: > > > >I don't think this sort of material belongs directly into the > > > > PostgreSQL documentation. > > > > Why not? > > PostgreSQL documentation (or any product documentation) should be > factual: describe what the software does and give advice on its use. > This should be mostly independent of the external circumstances, > because people will still read that documentation three or four years > from now. > > The proposed text is, at least partially, journalistic: it evaluates > competing ideas, gives historical and anecdotal information, reports on > current events, and makes speculations about the future. That is the > sort of material that is published in periodicals or other volatile > media. > > At the summit, we resolved, for precisely these reasons, to keep the > journalistic parts on the web site, for clear separation from the > shipped product and for easier updates (and for easier reference as > well, because the PostgreSQL documentation is not the single obvious > place to look for it) and refer to it from the documentation. > > -- > Peter Eisentraut > http://developer.postgresql.org/~petere/ > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Hi, Bruce Momjian wrote: > The idea being to define issues like multi/single master, async vs. > sync, and mention the projects which are in each category. You could even add shared-nothing vs. shared-disk nodes. Generally I'd say it makes sense to 'educate' people, but does it really make sense to explain all that if there is no replication solution for most of these combinations? I'd vote for an external (not in the documentation) information site about replication solutions. There we can put all the information we see fit (even 'journalistic' ones). I might change my mind once we have multiple replication solutions covering most situations. ;-) I like what and how Chris wrote [1] - an overview of existing and upcoming replication solutions. Regards Markus [1]: I can't find Chris' original message. My answer to it is in the archives, but not the original message. Why is that? (Thread view says 'message not available'). My answer contains Chris' text, though: http://archives.postgresql.org/pgsql-docs/2006-07/msg00019.php
Peter Eisentraut wrote: > Alvaro Herrera wrote: > > > >I don't think this sort of material belongs directly into the > > > > PostgreSQL documentation. > > > > Why not? > > PostgreSQL documentation (or any product documentation) should be > factual: describe what the software does and give advice on its use. > This should be mostly independent of the external circumstances, > because people will still read that documentation three or four years > from now. > > The proposed text is, at least partially, journalistic: it evaluates > competing ideas, gives historical and anecdotal information, reports on > current events, and makes speculations about the future. That is the > sort of material that is published in periodicals or other volatile > media. I can see value in documenting what replication systems are known to work (for some definition of work) with a given release in the documentation for that release. Five years down the road when I'm trying to implement replication for a client who's somehow locked into postgres 8.2 (for whatever reason), it would be very helpful to know that slony1.2 is an option. I don't know if this is sufficient justification. Including a separate page on the history of postgres replication to date also makes some sense, at least to me. It should be relatively easy to maintain. If we do talk about replication, then including a probably separate and presumably quite static page on the taxonomy of replication seems necessary. As Chris notes, the term replication by itself can mean quite a number of things. > At the summit, we resolved, for precisely these reasons, to keep the > journalistic parts on the web site, for clear separation from the > shipped product and for easier updates (and for easier reference as > well, because the PostgreSQL documentation is not the single obvious > place to look for it) and refer to it from the documentation. > > -- > Peter Eisentraut > http://developer.postgresql.org/~petere/ > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org
Hi, Andrew Hammond wrote: > I can see value in documenting what replication systems are known to > work (for some definition of work) with a given release in the > documentation for that release. Five years down the road when I'm > trying to implement replication for a client who's somehow locked into > postgres 8.2 (for whatever reason), it would be very helpful to know > that slony1.2 is an option. I don't know if this is sufficient > justification. Please keep in mind, that most replication solutions (that I know of) are quite independent from the PostgreSQL version used. Thus, documenting which version of PostgreSQL can be used with which version of a replication system should better be covered in the documentation of the replication system. Otherwise you would have to update the PostgreSQL documentation for new releases of your favorite replication system - which seems to lead to confusion. > Including a separate page on the history of postgres replication to > date also makes some sense, at least to me. It should be relatively > easy to maintain. I agree that having such a 'replication guide for users of PostgreSQL' is a good thing to have. But I think not much of that should be part of the official PostgreSQL documentation - mainly because the replication solutions are not part of PostgreSQL. Regards Markus
Markus Schiltknecht wrote: > Hi, > > Andrew Hammond wrote: > > I can see value in documenting what replication systems are known to > > work (for some definition of work) with a given release in the > > documentation for that release. Five years down the road when I'm > > trying to implement replication for a client who's somehow locked into > > postgres 8.2 (for whatever reason), it would be very helpful to know > > that slony1.2 is an option. I don't know if this is sufficient > > justification. > > Please keep in mind, that most replication solutions (that I know of) > are quite independent from the PostgreSQL version used. Thus, > documenting which version of PostgreSQL can be used with which version > of a replication system should better be covered in the documentation of > the replication system. I would agree to this with the caveat that there needs to be something in the postgres documentation that points people to the various replication systems available. > Otherwise you would have to update the > PostgreSQL documentation for new releases of your favorite replication > system - which seems to lead to confusion. Yeah, updating the docs based on other software releases would suck. How about "what works with a given release at the time of the release"? Perhaps this could be limited to a pointer to the docs for such replication systems, and maybe a very brief description (based on Chris' taxonomy)? > > Including a separate page on the history of postgres replication to > > date also makes some sense, at least to me. It should be relatively > > easy to maintain. > > I agree that having such a 'replication guide for users of PostgreSQL' > is a good thing to have. But I think not much of that should be part of > the official PostgreSQL documentation - mainly because the replication > solutions are not part of PostgreSQL. Arguably, neither are most of the procedural languages in the Server Programming section of the documentation, and yet they're included. I agree that it's important to keep the documentation from getting cluttered up with stuff that's "not part of PostgreSQL". However, I think the very fact so many people assume that there's no replication for PostgreSQL simply because it's not mentioned in the documentation shows that for many people replication is perceived as "part of" the dbms. Even a single page in the documentation which consists of something along the lines of the following would help these folks find what they're looking for. "There are a number of different approaches to solving the problem of replication, each with strengths and weaknesses. As a result, there are a number of different replication solutions available for PostgreSQL. To find out more, please refer to the website."
Andrew Hammond wrote: > How about "what works with a given release at the time of the > release"? We just threw that idea out in the context of the procedural language discussion because we do not have the resources to check what works. > Arguably, neither are most of the procedural languages in the Server > Programming section of the documentation, and yet they're included. That is false. The documentation documents exactly those pieces of code that we distribute. > "There are a number of different approaches to solving the problem of > replication, each with strengths and weaknesses. As a result, there > are a number of different replication solutions available for > PostgreSQL. To find out more, please refer to the website." Well, that's what I've been talking about all along, and it has also been the resolution at the Toronto meeting. -- Peter Eisentraut http://developer.postgresql.org/~petere/
> > "There are a number of different approaches to solving the problem of > > replication, each with strengths and weaknesses. As a result, there > > are a number of different replication solutions available for > > PostgreSQL. To find out more, please refer to the website." > > Well, that's what I've been talking about all along, and it has also > been the resolution at the Toronto meeting. Great. Is the above text sufficient for the documentation then, or does anyone have a suggestion on how to say this better? Drew
What's up there? It has been down all week. We're trying to get the Slony-I 1.2 release out, so we can then migrate over to pgFoundry. But that doesn't work terribly well when gBorg's down... -- let name="cbbrowne" and tld="acm.org" in String.concat "@" [name;tld];; http://www.ntlug.org/~cbbrowne/emacs.html "...Yet terrible as Unix addiction is, there are worse fates. If Unix is the heroin of operating systems, then VMS is barbiturate addiction, the Mac is MDMA, and MS-DOS is sniffing glue. (Windows is filling your sinuses with lucite and letting it set.) You owe the Oracle a twelve-step program." --The Usenet Oracle
Also people trying to download slony have to do some hunting to find things. The source only tar is not available on pgfoundry. one of them, elein On Thu, Aug 31, 2006 at 10:33:36AM -0400, Chris Browne wrote: > What's up there? It has been down all week. > > We're trying to get the Slony-I 1.2 release out, so we can then > migrate over to pgFoundry. But that doesn't working terribly well > when gBorg's down... > -- > let name="cbbrowne" and tld="acm.org" in String.concat "@" [name;tld];; > http://www.ntlug.org/~cbbrowne/emacs.html > "...Yet terrible as Unix addiction is, there are worse fates. If Unix > is the heroin of operating systems, then VMS is barbiturate addiction, > the Mac is MDMA, and MS-DOS is sniffing glue. (Windows is filling your > sinuses with lucite and letting it set.) You owe the Oracle a > twelve-step program." --The Usenet Oracle > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings >
elein@varlena.com (elein) writes: > Also people trying to download slony have to do some > hunting to find things. The source only tar is not > available on pgfoundry. The source tarball for version 1.1.5 is now in place: <http://pgfoundry.org/frs/download.php/1063/slony1-1.1.5.tar.bz2> We may as well have at least that bit of backup. I didn't bother putting up the documentation tarball; it is better to grab a newer version of the docs. -- "cbbrowne","@","ntlug.org" http://cbbrowne.com/info/slony.html You can lead a horse to water, but if you can get him to swim on his back, you've got something.
> Also people trying to download slony have to do some hunting > to find things. The source only tar is not available on pgfoundry. All gborg *downloads* are available on: http://www.postgresql.org/ftp/projects/gborg/ Seems Slony hasn't released files using the gborg file release system, perhaps? Because for some reason Slony stuff isn't there. But I figured it'd be a good idea to get that pointer in for people looking for anything else off gborg that didn't know we mirrored those. //Magnus
mha@sollentuna.net ("Magnus Hagander") writes: >> Also people trying to download slony have to do some hunting >> to find things. The source only tar is not available on pgfoundry. > > All gborg *downloads* are available on: > http://www.postgresql.org/ftp/projects/gborg/ > > Seems Slony hasn't released files using the gborg file release system, > perhaps? Because for some reason Slony stuff isn't there. But I figured > it'd be a good idea t oget that pointer in for people looking for > anything else off gborg that didn't know we mirrored those. In the past, binaries got hosted on Jan Wieck's "downloads area" which probably lives somewhere nearby there. Apparently it wasn't terribly convenient to add/drop files from the gBorg downloads area. For 1.2, I have been putting release candidate files over at pgFoundry, which is certainly still accessible. -- (format nil "~S@~S" "cbbrowne" "ntlug.org") http://www3.sympatico.ca/cbbrowne/finances.html "Recursion is the root of computation since it trades description for time." -- Alan J. Perlis
On Thu, Aug 31, 2006 at 10:33:36AM -0400, Chris Browne wrote: > What's up there? It has been down all week. > > We're trying to get the Slony-I 1.2 release out, so we can then > migrate over to pgFoundry. But that doesn't working terribly well > when gBorg's down... Speaking of which, what's the status of the migration tools? Is that ball still in Larry's court? -- Jim C. Nasby, Database Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
No, since my time is up in the air at the moment, I've bowed out for now. Once I get settled at Surgient, I might take it up again, but not right now. -----Original Message----- From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Jim C. Nasby Sent: Saturday, September 02, 2006 5:42 PM To: Chris Browne Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] gBorg status? On Thu, Aug 31, 2006 at 10:33:36AM -0400, Chris Browne wrote: > What's up there? It has been down all week. > > We're trying to get the Slony-I 1.2 release out, so we can then > migrate over to pgFoundry. But that doesn't working terribly well > when gBorg's down... Speaking of which, what's the status of the migration tools? Is that ball still in Larry's court? -- Jim C. Nasby, Database Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend
Oops! jim@nasby.net ("Jim C. Nasby") was seen spray-painting on a wall: > On Thu, Aug 31, 2006 at 10:33:36AM -0400, Chris Browne wrote: >> What's up there? It has been down all week. >> >> We're trying to get the Slony-I 1.2 release out, so we can then >> migrate over to pgFoundry. But that doesn't working terribly well >> when gBorg's down... > > Speaking of which, what's the status of the migration tools? Is that > ball still in Larry's court? Which migration tools? Were there migration tools specific to gBorg->pgFoundry? Or something else? -- let name="cbbrowne" and tld="linuxfinances.info" in name ^ "@" ^ tld;; http://cbbrowne.com/info/internet.html REALITY is an illusion that stays put.
Christopher Browne wrote: > Oops! jim@nasby.net ("Jim C. Nasby") was seen spray-painting on a wall: >> On Thu, Aug 31, 2006 at 10:33:36AM -0400, Chris Browne wrote: >>> What's up there? It has been down all week. >>> >>> We're trying to get the Slony-I 1.2 release out, so we can then >>> migrate over to pgFoundry. But that doesn't working terribly well >>> when gBorg's down... >> Speaking of which, what's the status of the migration tools? Is that >> ball still in Larry's court? > > Which migration tools? Were there migration tools specific to > gBorg->pgFoundry? Or something else? There was, kind of, sort of. AFAIK they are dead and gone and the current method of thought on Gborg is here: http://archives.postgresql.org/pgsql-general/2006-08/msg01167.php -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/
The world rejoiced as jd@commandprompt.com ("Joshua D. Drake") wrote: > Christopher Browne wrote: >> Oops! jim@nasby.net ("Jim C. Nasby") was seen spray-painting on a wall: >>> On Thu, Aug 31, 2006 at 10:33:36AM -0400, Chris Browne wrote: >>>> What's up there? It has been down all week. >>>> >>>> We're trying to get the Slony-I 1.2 release out, so we can then >>>> migrate over to pgFoundry. But that doesn't working terribly well >>>> when gBorg's down... >>> Speaking of which, what's the status of the migration tools? Is that >>> ball still in Larry's court? >> Which migration tools? Were there migration tools specific to >> gBorg->pgFoundry? Or something else? > > There was, kind of, sort of. AFAIK they are dead and gone and the > current method of thought on Gborg is here: > > http://archives.postgresql.org/pgsql-general/2006-08/msg01167.php That seems fine to me. We already had a plan set up for Slony-I that was not too dissimilar. To wit...

- There already is a project set up at pgFoundry, and fairly much all committers at gBorg are members of the project at pgFoundry
- Downloads are being handled from pgFoundry
- We figured we'd migrate outstanding bugs to pgFoundry by hand
- We figured we'd migrate copies of old mailing list archives
- We figured we'd announce the deaths of old lists, and suggest subscribing to the new ones
- We figured there would be some ability to copy CVS over fairly losslessly

The *big* steps are moving email and CVS. I really haven't heard a peep yet that hasn't been third-hand about the gBorg outage, after it has been out for over a week now. To say that's displeasing is something of an understatement. It certainly leaves me in no mood to want to keep any services I care to actually have running hosted on gBorg. The fact that it has been out for a week, without any public comment being made, certainly gives me pause. I *HOPE* that we can still recover CVS and email. Question about email: Do we need to migrate the old mailing list archives before setting up new lists on pgFoundry? Or could I set up a "slony1-discuss" at pgFoundry, fairly immediately, and migrate in old archives later? [Possibly helpful factor: It's now September, and the last gBorg traffic was dated in August, so we may not need to mix months...] -- (format nil "~S@~S" "cbbrowne" "gmail.com") http://linuxdatabases.info/info/x.html Who needs fault-tolerant computers when there's obviously an ample market of fault-tolerant users?
> The *big* steps are moving email and CVS. *big* but quite easy :) As long as we have an mbox file from the slony lists, we can import easily into pgfoundry. CVS is just an rsync ;) Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/
> -----Original Message----- > From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of > Christopher Browne > Sent: 04 September 2006 03:55 > To: pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] gBorg status? > > The fact that it has been out for a week, without any public comment > being made, certainly gives me pause. I *HOPE* that we can still > recover CVS and email. My understanding is that Gborg is being recovered from backup as I type. I also understand that the delay was not caused by lack of backups or anything similarly scary, but simply by other priorities. Regards, Dave.
Recently seen in ACM Operating Systems Review (this is the first time I've found as many as 1 interesting article in it in a while, and there were 3 things I found worthwhile...): NTT (of the recent "NTT Power Hour") have created a new filesystem: <http://www.nilfs.org/en/> NILFS is a log-structured file system developed for Linux. In effect, it provides the "moral equivalent" of MVCC for filesystems; overwrites are equivalent to delete/insert, and it requires a "Cleaner" process to reclaim formerly-used space. It ought to have two merits over journalling filesystems:
1. It doesn't need to write data twice, which should improve performance
2. It doesn't need to repetitively overwrite metadata, which should improve crash safety.
On the latter, per the paper: "... These journaling filesystems enable fast and consistent recovery of the file system after unexpected system freezes or power failures. However, they still allow the fatal destruction of the file system due to the characteristic that recovery is realized by overwriting meta data with their copies saved in a journal file. This recovery is guaranteed to work properly only if the write order of the on-disk data blocks and meta data blocks is physically conserved on the disk platters. Unfortunately, this constraint is often violated by the write optimizations performed by the block I/O subsystem and disk controllers." It's still at a somewhat early stage, as they haven't completed coding the Cleaner. (Probably should call it the Reaper... :-)) By the way, the Google SOC 2005 also produced one: <http://logfs.sourceforge.net/> NetBSD used to have an LFS; has that gone anywhere? Or been essentially dropped? -- let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];; http://linuxdatabases.info/info/emacs.html "I develop for Linux for a living, I used to develop for DOS. Going from DOS to Linux is like trading a glider for an F117." -- <entropy@world.std.com> Lawrence Foard
Chris Browne <cbbrowne@acm.org> writes: > NetBSD used to have a LFS; has that gone anywhere? Or been > essentially dropped? My reading over the last few years has indicated that LFSs tend to suffer bad performance degradation as data and metadata for a given file get scattered all over the disk. This tends to cancel out the performance gain from being able to cluster writes in a single area. For a heavily write-intensive workload, it might be a win, but no one seems to have demonstrated an advantage for "normal", mostly read-heavy usage. -Doug
On Tue, 2006-09-05 at 18:24 -0400, Chris Browne wrote: > Recently seen in ACM Operating Systems Review (this is the first time > I've found as many as 1 interesting article in it in a while, and > there were 3 things I found worthwhile...): > > NTT (of the recent "NTT Power Hour") have created a new filesystem: > <http://www.nilfs.org/en/> > > NILFS is a log-structured file system developed for Linux. > As I understand LFSs, they are not ideal for a database system. An LFS is optimized so that it writes sequentially. However, PostgreSQL already writes transactions sequentially in the WAL, and tries to optimize the cleaning of dirty data pages with the background writer. So I don't see the advantage of an LFS for a database. Also, LFSs assume a very effective read cache. Databases often hold much more than can fit in read cache, and so frequently require disk access for reads. An LFS scatters the data all over the disk, which destroys the sequential access that PostgreSQL depends on for efficient index and table scans. Do you see an advantage in using LFS for PostgreSQL? Did the quotation refer to people leaving write cache enabled on a journaling filesystem? Regards, Jeff Davis
On Tue, Sep 05, 2006 at 05:54:50PM -0700, Jeff Davis wrote: > On Tue, 2006-09-05 at 18:24 -0400, Chris Browne wrote: > > Recently seen in ACM Operating Systems Review (this is the first time > > I've found as many as 1 interesting article in it in a while, and > > there were 3 things I found worthwhile...): > > ... > > NILFS is a log-structured file system developed for Linux. > As I understand LFSs, they are not ideal for a database system. An LFS > is optimized so that it writes sequentially. However, PostgreSQL already > ... > Do you see an advantage in using LFS for PostgreSQL? Hey guys - I think the original poster only meant to suggest that it was *interesting*... :-) To me, applying database concepts to file systems is interesting, and practical. It's not a perfected science by any means, but the idea that a file system is a hierarchical database isn't new. :-) Applying any database on top of another database seems inefficient to me. That's one reason why I argue the opposite - PostgreSQL *should* have its own on-disk layout, and not be laid out on top of another generic system designed for purposes other than database storage. The reason it isn't pursued at present, and perhaps should not be pursued at present, is that PostgreSQL has other more important priorities in the short term. Cheers, mark -- mark@mielke.cc / markm@ncf.ca / markm@nortel.com | Neighbourhood Coder | Ottawa, Ontario, Canada | One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them... http://mark.mielke.cc/
On Tue, 2006-09-05 at 23:28 -0400, mark@mark.mielke.cc wrote: > On Tue, Sep 05, 2006 at 05:54:50PM -0700, Jeff Davis wrote: > > On Tue, 2006-09-05 at 18:24 -0400, Chris Browne wrote: > > > Recently seen in ACM Operating Systems Review (this is the first time > > > I've found as many as 1 interesting article in it in a while, and > > > there were 3 things I found worthwhile...): > > > ... > > > NILFS is a log-structured file system developed for Linux. > > As I understand LFSs, they are not ideal for a database system. An LFS > > is optimized so that it writes sequentially. However, PostgreSQL already > > ... > > Do you see an advantage in using LFS for PostgreSQL? > > Hey guys - I think the original poster only meant to suggest that it > was *interesting*... :-) > I see, my mistake. > Applying any database on top of another database seems inefficient to me. > That's one reason why I argue the opposite - PostgreSQL *should* have its > own on disk layout, and not being laid out on top of another generic > system designed for purposes other than database storage. The reason it > isn't pursued at present, and perhaps should not be pursued at present, > is that PostgreSQL has other more important priorities in the short term. > I think that it would be a higher priority if someone showed a substantial performance improvement. Some filesystems don't really cause much overhead that isn't needed by PostgreSQL. If someone did show a substantial improvement, I would be interested to see it. And if there is an improvement, shouldn't that be a project for something like Linux, where other databases could also benefit? It could just be implemented as a database-specific filesystem. Regards, Jeff Davis
pgsql@j-davis.com (Jeff Davis) writes: > On Tue, 2006-09-05 at 23:28 -0400, mark@mark.mielke.cc wrote: >> On Tue, Sep 05, 2006 at 05:54:50PM -0700, Jeff Davis wrote: >> > On Tue, 2006-09-05 at 18:24 -0400, Chris Browne wrote: >> > > Recently seen in ACM Operating Systems Review (this is the first time >> > > I've found as many as 1 interesting article in it in a while, and >> > > there were 3 things I found worthwhile...): >> > > ... >> > > NILFS is a log-structured file system developed for Linux. >> > As I understand LFSs, they are not ideal for a database system. An LFS >> > is optimized so that it writes sequentially. However, PostgreSQL already >> > ... >> > Do you see an advantage in using LFS for PostgreSQL? >> >> Hey guys - I think the original poster only meant to suggest that it >> was *interesting*... :-) >> > > I see, my mistake. From a reliability perspective, I can see some value to it... I have seen far too many databases corrupted by journalling gone bad in the past year... :-( >> Applying any database on top of another database seems inefficient >> to me. That's one reason why I argue the opposite - PostgreSQL >> *should* have its own on disk layout, and not being laid out on top >> of another generic system designed for purposes other than database >> storage. The reason it isn't pursued at present, and perhaps should >> not be pursued at present, is that PostgreSQL has other more >> important priorities in the short term. > > I think that it would be a higher priority if someone showed a > substantial performance improvement. Some filesystems don't really > cause much overhead that isn't needed by PostgreSQL. > > If someone did show a substantial improvement, I would be interested > to see it. > > And if there is an improvement, shouldn't that be a project for > something like Linux, where other databases could also benefit? It > could just be implemented as a database-specific filesystem. The classic problem with log structured filesystems is that sequential reads tend to be less efficient than in overwriting systems; perhaps if they can get "vacuuming" to be done frequently enough, that might change the shape of things. That would be a relevant lesson that _we_ have discovered that is potentially applicable to filesystem implementors. And I don't consider this purely of academic interest; the ability to:
a) Avoid the double writing of journalling, and
b) Avoid the risks of failures due to misordered writes
are both genuinely valuable. -- output = reverse("ofni.sesabatadxunil" "@" "enworbbc") http://cbbrowne.com/info/lisp.html All ITS machines now have hardware for a new machine instruction -- PFLT Prove Fermat's Last Theorem. Please update your programs.
On Wed, 2006-09-06 at 18:55 -0400, Chris Browne wrote: > pgsql@j-davis.com (Jeff Davis) writes: > >> > Do you see an advantage in using LFS for PostgreSQL? > >> > >> Hey guys - I think the original poster only meant to suggest that it > >> was *interesting*... :-) > >> > > > > I see, my mistake. > > From a reliability perspective, I can see some value to it... > > I have seen far too many databases corrupted by journalling gone bad > in the past year... :-( > Can you elaborate a little? Which filesystems have been problematic? Which filesystems are you more confident in? > > > > And if there is an improvement, shouldn't that be a project for > > something like Linux, where other databases could also benefit? It > > could just be implemented as a database-specific filesystem. > > The classic problem with log structured filesystems is that sequential > reads tend to be less efficient than in overwriting systems; perhaps > if they can get "vacuuming" to be done frequently enough, that might > change the shape of things. > > That would be a relevant lesson that _we_ have discovered that is > potentially applicable to filesystem implementors. > > And I don't consider this purely of academic interest; the ability to: > a) Avoid the double writing of journalling, and > b) Avoid the risks of failures due to misordered writes > are both genuinely valuable. Right, LFS is promising in a number of ways. I've read about it in the past, and it would be nice if this NILFS implementation sparks some new research in the area. Regards, Jeff Davis
On Mon, 4 Sep 2006, Dave Page wrote: > My understanding is that Gborg is being recovered from backup as I type. > I also understand that the delay was not caused by lack of backups or > anything similarly scary, but simply by other priorities. Yes, I have the backup uploaded right now, and doing 'recover procedures' as I type this ... it will be up before I go to bed tonight ... As for the 'other priorities' ... the delay was due to an insufficient Internet connection to upload the backup ... we were waiting for our Internet to be installed in our new location, and were using an underpowered temporary connection in the interim ... that one was going to tie up the connection for >40 hours ... once we were able to get the backup server onto its 'permanent connection', the upload took ~5hrs ... As I mentioned, am working on it right now ... will post a follow up once she's back up live and needing to be tested ... she's also moving to one of our 64bit servers, so should be a wee bit better performance wise ... ---- Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email . scrappy@hub.org MSN . scrappy@hub.org Yahoo . yscrappy Skype: hub.org ICQ . 7615664
pgsql@j-davis.com (Jeff Davis) wrote: > On Wed, 2006-09-06 at 18:55 -0400, Chris Browne wrote: >> pgsql@j-davis.com (Jeff Davis) writes: >> >> > Do you see an advantage in using LFS for PostgreSQL? >> >> >> >> Hey guys - I think the original poster only meant to suggest that it >> >> was *interesting*... :-) >> >> >> > >> > I see, my mistake. >> >> >From a reliability perspective, I can see some value to it... >> >> I have seen far too many databases corrupted by journalling gone bad >> in the past year... :-( > > Can you elaborate a little? Which filesystems have been problematic? > Which filesystems are you more confident in? Well, more or less *all* of them, on AMD-64/Linux. The "pulling the fibrechannel cable" test blew them all. XFS, ext3, JFS. ReiserFS was, if I recall correctly, marginally better, but only marginally. On AIX, we have seen JFS2 falling over when there were enough levels of buffering in the way on disk arrays. >> > And if there is an improvement, shouldn't that be a project for >> > something like Linux, where other databases could also benefit? >> > It could just be implemented as a database-specific filesystem. >> >> The classic problem with log structured filesystems is that >> sequential reads tend to be less efficient than in overwriting >> systems; perhaps if they can get "vacuuming" to be done frequently >> enough, that might change the shape of things. >> >> That would be a relevant lesson that _we_ have discovered that is >> potentially applicable to filesystem implementors. >> >> And I don't consider this purely of academic interest; the ability to: >> a) Avoid the double writing of journalling, and >> b) Avoid the risks of failures due to misordered writes >> are both genuinely valuable. > > Right, LFS is promising in a number of ways. I've read about it in > the past, and it would be nice if this NILFS implementation sparks > some new research in the area. Indeed. I don't see it being a "production-ready" answer yet, but yeah, I'd certainly like to see the research continue. A vital problem is in the area of vacuuming; there may be things to be learned in both directions. -- output = reverse("moc.liamg" "@" "enworbbc") http://linuxdatabases.info/info/fs.html Health is merely the slowest possible rate at which one can die.
everything should be back up and running ... vServer is now running on our newest, 64bit HP server with FreeBSD 6.x ... On Wed, 6 Sep 2006, Marc G. Fournier wrote: > On Mon, 4 Sep 2006, Dave Page wrote: > >> My understanding is that Gborg is being recovered from backup as I type. I >> also understand that the delay was not caused by lack of backups or >> anything similarly scary, but simply by other priorities. > > Yes, I have the backup uploaded right now, and doing 'recover procedures' as > I type this ... it will be up before I go to bed tonight ... > > As for the 'other priorities' ... the delay was due to an insufficient > Internet connection to upload the backup ... we were waiting for our Internet > to be installed in our new location, and were using an underpowered temporary > connection in the interim ... that one was going to tie up the connection for >> 40 hours ... once we were able to get the backup server onto its 'permanent > connection', the upload took ~5hrs ... > > As I mentioned, am working on it right now ... will post a follow up once > she's back up live and needing to be tested ... she's also moving to one of > our 64bit servers, so should be a wee bit better performance wise ... > > ---- > Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) > Email . scrappy@hub.org MSN . scrappy@hub.org > Yahoo . yscrappy Skype: hub.org ICQ . 7615664 > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org > ---- Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email . scrappy@hub.org MSN . scrappy@hub.org Yahoo . yscrappy Skype: hub.org ICQ . 7615664
On Wed, 2006-09-06 at 22:12 -0400, Christopher Browne wrote: > > Can you elaborate a little? Which filesystems have been problematic? > > Which filesystems are you more confident in? > > Well, more or less *all* of them, on AMD-64/Linux. > > The "pulling the fibrechannel cable" test blew them all. XFS, ext3, > JFS. ReiserFS was, if I recall correctly, marginally better, but only > marginally. > > On AIX, we have seen JFS2 falling over when there were enough levels > of buffering in the way on disk arrays. > Well, that's interesting. I suppose I can't count on the filesystem as much as I thought. Are you implying that the filesystems aren't ready on 64-bit? Is it more of a hardware issue (a controller lying about the security of the write)? Any comments on FreeBSD/UFS+SU? I would expect UFS+SU to have similar issues, since it depends on write ordering also. What do you do for better data security (aside from the obvious "don't pull cables")? Regards, Jeff Davis
pgsql@j-davis.com (Jeff Davis) writes: > On Wed, 2006-09-06 at 22:12 -0400, Christopher Browne wrote: > >> > Can you elaborate a little? Which filesystems have been problematic? >> > Which filesystems are you more confident in? >> >> Well, more or less *all* of them, on AMD-64/Linux. >> >> The "pulling the fibrechannel cable" test blew them all. XFS, ext3, >> JFS. ReiserFS was, if I recall correctly, marginally better, but only >> marginally. >> >> On AIX, we have seen JFS2 falling over when there were enough levels >> of buffering in the way on disk arrays. > > Well, that's interesting. I suppose I can't count on the filesystem > as much as I thought. Are you implying that the filesystems aren't > ready on 64-bit? I don't think this is necessarily a 64 bit issue; it's more that with the more esoteric, expensive disk array hardware, there are fewer people with the ability to test it, because you need $200K worth of hardware around to do the testing. > Is it more of a hardware issue (a controller lying about the > security of the write)? Any comments on FreeBSD/UFS+SU? I would > expect UFS+SU to have similar issues, since it depends on write > ordering also. > > What do you do for better data security (aside from the obvious > "don't pull cables")? The last time we looked, FreeBSD wasn't an option at all, because there wasn't any suitable FibreChannel support. That may have changed; I haven't researched it lately. The issue that the NILFS people pointed out seems a troublesome one, namely that the more levels of cacheing there are (even if battery-backed), the less certain you can be that the hardware isn't lying about write ordering. I haven't got an answer... -- let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];; http://linuxdatabases.info/info/multiplexor.html Jury -- Twelve people who determine which client has the better lawyer.
We're getting a bit of an anomaly relating to pg_stat_activity...

oxrstld=# SELECT * from pg_stat_activity where current_query <> '<IDLE>';
   datid    | datname | procpid | usesysid | usename | current_query |          query_start
------------+---------+---------+----------+---------+---------------+-------------------------------
 1347729970 | oxrstld |  893094 |      122 | tldepp  | commit        | 2006-12-16 19:34:08.583978+00
(1 row)

oxrstld=# select version();
                                   version
------------------------------------------------------------------------------
 PostgreSQL 7.4.12 on powerpc-ibm-aix5.3.0.0, compiled by GCC gcc (GCC) 4.0.1
(1 row)

That PID has been dead for several days, but this connection is still marked as being open, lo these many days later. This has *traditionally* been a sign that the stats collector has been clobbered, at which point pg_stat_activity becomes useless. That's not the case here; a less restricted query on pg_stat_activity shows other more recent data that keeps changing, seemingly consistent with system activity. I can track most of the lifecycle of that PID; the connection was established at 2006-12-16 18:46:38, and I see a few errors associated with the PID (violations of unique constraints; usual business stuff). I see nothing in the logs at around 19:34 which would suggest a reason for any strange behaviour. There are no records about that PID after 19:34... I'd like to get rid of this entry, if I can; it's blowing up tests that warn us about elderly transactions, causing a false positive... -- output = reverse("moc.enworbbc" "@" "enworbbc") http://www3.sympatico.ca/cbbrowne/ Rules of the Evil Overlord #212. "I will not send out battalions composed wholly of robots or skeletons against heroes who have qualms about killing living beings." <http://www.eviloverlord.com/>
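The "elderly transaction" check mentioned above can be approximated with a query along the following lines; this is only a sketch, using the 7.4-era column names (procpid, current_query) visible in the output above, and the one-hour cutoff is purely illustrative:

    -- Flag queries that have been running suspiciously long.
    -- Column names match the 7.4-era pg_stat_activity shown above.
    SELECT datname, procpid, usename,
           now() - query_start AS query_age,
           current_query
      FROM pg_stat_activity
     WHERE current_query <> '<IDLE>'
       AND now() - query_start > interval '1 hour'
     ORDER BY query_start;

A stale entry like the one described makes such a check fire even though no backend is actually holding anything open.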
Chris Browne <cbbrowne@acm.org> writes: > We're getting a bit of an anomaly relating to pg_stat_activity... > ... > That PID has been dead for several days, but this connection is marked > as being open, still, after lo many days. This probably just means that the "backend termination" stats message got dropped due to heavy load. That's expected behavior in all pre-8.2 releases: the stats system was never intended to provide guaranteed-exactly-correct status. PG 8.2 has reimplemented the pg_stat_activity view to make it more trustworthy. (The other stuff is still probabilistic, but being just event counters, message loss isn't so obvious.) regards, tom lane
tgl@sss.pgh.pa.us (Tom Lane) writes: > Chris Browne <cbbrowne@acm.org> writes: >> We're getting a bit of an anomaly relating to pg_stat_activity... >> ... >> That PID has been dead for several days, but this connection is marked >> as being open, still, after lo many days. > > This probably just means that the "backend termination" stats message > got dropped due to heavy load. That's expected behavior in all pre-8.2 > releases: the stats system was never intended to provide > guaranteed-exactly-correct status. PG 8.2 has reimplemented the > pg_stat_activity view to make it more trustworthy. (The other stuff is > still probabilistic, but being just event counters, message loss isn't > so obvious.) That seems a *bit* surprising; the system wasn't expected to be under particularly heavy load during the period in question; I would have expected "particularly light load." No matter; there may have been some brief heavy load to cause this. There isn't any way, short of restarting the postmaster, to get rid of that PID, is there? -- "cbbrowne","@","linuxdatabases.info" http://cbbrowne.com/info/linuxdistributions.html "High-level languages are a pretty good indicator that all else is seldom equal." - Tim Bradshaw, comp.lang.lisp
Chris Browne <cbbrowne@acm.org> writes: > There isn't any way, short of restarting the postmaster, to get rid of > that PID, is there? The entry will get overwritten when that BackendId slot gets re-used, so just starting enough concurrent backends should do it. (Since incoming backends always take the lowest free slot, the fact that the dead entry has persisted awhile means that it must have a number higher than your normal number of concurrent sessions ... which is evidence in favor of the idea that it happened during a load spike ...) regards, tom lane
A long time ago, in a galaxy far, far away, tgl@sss.pgh.pa.us (Tom Lane) wrote: > Chris Browne <cbbrowne@acm.org> writes: >> There isn't any way, short of restarting the postmaster, to get rid of >> that PID, is there? > > The entry will get overwritten when that BackendId slot gets re-used, > so just starting enough concurrent backends should do it. (Since > incoming backends always take the lowest free slot, the fact that the > dead entry has persisted awhile means that it must have a number higher > than your normal number of concurrent sessions ... which is evidence > in favor of the idea that it happened during a load spike ...) Cool. I started up a nice little bunch of psql sessions in the background, and then once they were all up, shut down my shell session, thereby eliminating them. And that did, indeed, clear out that pg_stat_activity entry. ... And five minutes later, Nagios sent me a message indicating that the node had recovered from having an "ancient" open connection. I'll re-add a few gratuitous details here in the hopes that that makes this easily findable if anyone else should search for the issue...

The Problem:
- pg_stat_activity was reporting an elderly transaction in progress
- that backend process wasn't running anymore
- pg_stat_activity *was* reporting other legitimate activity; this was not the scenario where it had gotten deranged (normally due to excessive load)
- Per Tom's comments, there evidently *was* some load spike where the closing of this particular connection did not get logged by the stats collector

The Solution:
- We needed to roll the stats collector through a bunch of its slots in order to clean out the apparently-still-populated entry.
- Ran, in a shell: for i in `seq 100`; do psql & done
  That left 100 psql sessions in the background, all connected to the database backend.
- Closed the shell. That then HUPped the 100 psql sessions.

That got the offending pg_stat_activity entry cleared out. -- let name="cbbrowne" and tld="gmail.com" in name ^ "@" ^ tld;; http://linuxdatabases.info/info/finances.html "Temporary tattoos are a CRITICAL ELEMENT of our security strategy. To suggest otherwise is sheer lunacy." -- Reid Fleming, cDc
In some of our applications, we have cases where it would be very nice if we could activate TOAST at some sort of lower threshold than the usual 2K that applies now. Let me note the current code that controls the threshold:

/*
 * These symbols control toaster activation.  If a tuple is larger than
 * TOAST_TUPLE_THRESHOLD, we will try to toast it down to no more than
 * TOAST_TUPLE_TARGET bytes.  Both numbers include all tuple header overhead
 * and between-fields alignment padding, but we do *not* consider any
 * end-of-tuple alignment padding; hence the values can be compared directly
 * to a tuple's t_len field.  We choose TOAST_TUPLE_THRESHOLD with the
 * knowledge that toast-table tuples will be exactly that size, and we'd
 * like to fit four of them per page with minimal space wastage.
 *
 * The numbers need not be the same, though they currently are.
 *
 * Note: sizeof(PageHeaderData) includes the first ItemId, but we have
 * to allow for 3 more, if we want to fit 4 tuples on a page.
 */
#define TOAST_TUPLE_THRESHOLD \
    MAXALIGN_DOWN((BLCKSZ - \
                   MAXALIGN(sizeof(PageHeaderData) + 3 * sizeof(ItemIdData))) \
                  / 4)

We have cases where we're storing XML message information which is near the 0.5K mark; that being the case, tuples virtually never get TOASTed.

somesystem=# select min(length(xml)), max(length(xml)), avg(length(xml)), stddev(length(xml)) from table_with_xml;
 min | max  |         avg          |      stddev
-----+------+----------------------+------------------
 244 | 2883 | 651.6900720788174376 | 191.602077911138
(1 row)

I can see four controls as being pretty plausible:

1. Compile time...

#define TOAST_DENOMINATOR 17
/* Use this as the divisor; current default behaviour falls from TOAST_DENOMINATOR = 4 */

#define TOAST_TUPLE_THRESHOLD \
    MAXALIGN_DOWN((BLCKSZ - \
                   MAXALIGN(sizeof(PageHeaderData) + 3 * sizeof(ItemIdData))) \
                  / TOAST_DENOMINATOR)

That's obviously cheapest to the DB engine. I just made this change to a checkout of CVS HEAD, and it readily survived a regression test.

2. GUC value for TOAST_DENOMINATOR

Do the above, but with the added detail that TOAST_DENOMINATOR refers to a GUC value. I think I could probably make this change; the principle remains much the same as with #1.

3. GUC value for TOAST_TUPLE_THRESHOLD

This probably has to get modified to the nearest feasible value, modulo alignment; it's not all that different from #1 or #2.

4. A different mechanism would be to add a fifth storage column strategy (the present four are PLAIN, EXTENDED, EXTERNAL, MAIN), let's say, TOAST.

At present, the 4 values are essentially advisory; columns get TOASTed if the column permits EXTENDED storage, but that only occurs if the size is greater than TOAST_TUPLE_THRESHOLD. If the new value was chosen, the column would *always* get stored as TOAST.

Presumably #1 or #2 could readily get into 8.3 as they're pretty easy; #3 is a bit trickier, whilst #4 is probably not "8.3-fittable".

Question: Which of these sounds preferable?

-- let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;; http://cbbrowne.com/info/finances.html Where do you *not* want to go today? "Confutatis maledictis, flammis acribus addictis" (<http://www.hex.net/~cbbrowne/msprobs.html>)
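As a quick way to see whether such a table's rows are being toasted at all, one can compare the size of the heap with the size of its TOAST table. This is just a sketch, reusing the example table name from above and assuming an 8.1-or-later server where the pg_relation_size() function is available:

    -- How much of table_with_xml actually lives in its TOAST table?
    -- (reltoastrelid is 0 if the table has no TOAST table at all.)
    SELECT c.relname,
           pg_relation_size(c.oid)           AS heap_bytes,
           pg_relation_size(c.reltoastrelid) AS toast_bytes
      FROM pg_class c
     WHERE c.relname = 'table_with_xml'
       AND c.reltoastrelid <> 0;

For the workload described above, the toast_bytes figure stays near zero because almost no tuple ever crosses the threshold.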
Chris Browne <cbbrowne@acm.org> writes: > #define TOAST_DENOMINATOR 17 > /* Use this as the divisor; current default behaviour falls from TOAST_DENOMINATOR = 4 */ > #define TOAST_TUPLE_THRESHOLD^I\ > ^IMAXALIGN_DOWN((BLCKSZ - \ > ^I^I^I^I MAXALIGN(sizeof(PageHeaderData) + 3 * sizeof(ItemIdData))) \ > ^I^I^I^I / TOAST_DENOMINATOR) Given that you are quoting code that was demonstrably broken since the original coding of TOAST up till a month or two back, "it passes regression" is not adequate proof of "it's right". In fact I think it's not right; you have not got the roundoff condition straight. > 4. A different mechanism would be to add a fifth storage column > strategy (the present four are PLAIN, EXTENDED, EXTERNAL, MAIN), let's > say, TOAST. Anything along this line would require invoking the toaster on every single tuple, since we'd always have to crawl through all the columns to see if toasting was supposed to happen. No thanks. > Which of these sounds preferable? It's a bit late in the cycle to be proposing any of these for 8.3. regards, tom lane
On 3/21/2007 2:05 PM, Tom Lane wrote: > Chris Browne <cbbrowne@acm.org> writes: >> #define TOAST_DENOMINATOR 17 >> /* Use this as the divisor; current default behaviour falls from TOAST_DENOMINATOR = 4 */ > >> #define TOAST_TUPLE_THRESHOLD^I\ >> ^IMAXALIGN_DOWN((BLCKSZ - \ >> ^I^I^I^I MAXALIGN(sizeof(PageHeaderData) + 3 * sizeof(ItemIdData))) \ >> ^I^I^I^I / TOAST_DENOMINATOR) > > Given that you are quoting code that was demonstrably broken since the > original coding of TOAST up till a month or two back, "it passes > regression" is not adequate proof of "it's right". In fact I think > it's not right; you have not got the roundoff condition straight. > >> 4. A different mechanism would be to add a fifth storage column >> strategy (the present four are PLAIN, EXTENDED, EXTERNAL, MAIN), let's >> say, TOAST. FORCE_COMPRESSION, FORCE_EXTERNAL and FORCE_EXTERNAL_UNCOMPRESSED. > > Anything along this line would require invoking the toaster on every > single tuple, since we'd always have to crawl through all the columns > to see if toasting was supposed to happen. No thanks. Not necessarily. A flag in Relation telling if the table has any column marked like that could be set while constructing the relcache entry. > >> Which of these sounds preferable? > > It's a bit late in the cycle to be proposing any of these for 8.3. Certainly. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Is this a TODO? --------------------------------------------------------------------------- Jan Wieck wrote: > On 3/21/2007 2:05 PM, Tom Lane wrote: > > Chris Browne <cbbrowne@acm.org> writes: > >> #define TOAST_DENOMINATOR 17 > >> /* Use this as the divisor; current default behaviour falls from TOAST_DENOMINATOR = 4 */ > > > >> #define TOAST_TUPLE_THRESHOLD^I\ > >> ^IMAXALIGN_DOWN((BLCKSZ - \ > >> ^I^I^I^I MAXALIGN(sizeof(PageHeaderData) + 3 * sizeof(ItemIdData))) \ > >> ^I^I^I^I / TOAST_DENOMINATOR) > > > > Given that you are quoting code that was demonstrably broken since the > > original coding of TOAST up till a month or two back, "it passes > > regression" is not adequate proof of "it's right". In fact I think > > it's not right; you have not got the roundoff condition straight. > > > >> 4. A different mechanism would be to add a fifth storage column > >> strategy (the present four are PLAIN, EXTENDED, EXTERNAL, MAIN), let's > >> say, TOAST. > > FORCE_COMPRESSION, FORCE_EXTERNAL and FORCE_EXTERNAL_UNCOMPRESSED. > > > > > Anything along this line would require invoking the toaster on every > > single tuple, since we'd always have to crawl through all the columns > > to see if toasting was supposed to happen. No thanks. > > Not necessarily. A flag in Relation telling if the table has any column > marked like that could be set while constructing the relcache entry. > > > > >> Which of these sounds preferable? > > > > It's a bit late in the cycle to be proposing any of these for 8.3. > > Certainly. > > > Jan > > -- > #======================================================================# > # It's easier to get forgiveness for being wrong than for being right. # > # Let's break this rule - forgive me. # > #================================================== JanWieck@Yahoo.com # > > ---------------------------(end of broadcast)--------------------------- > TIP 2: Don't 'kill -9' the postmaster -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
tgl@sss.pgh.pa.us (Tom Lane) writes: > Chris Browne <cbbrowne@acm.org> writes: >> #define TOAST_DENOMINATOR 17 >> /* Use this as the divisor; current default behaviour falls from TOAST_DENOMINATOR = 4 */ >> #define TOAST_TUPLE_THRESHOLD \ >> MAXALIGN_DOWN((BLCKSZ - \ >> MAXALIGN(sizeof(PageHeaderData) + 3 * sizeof(ItemIdData))) \ >> / TOAST_DENOMINATOR) > > Given that you are quoting code that was demonstrably broken since > the original coding of TOAST up till a month or two back, "it passes > regression" is not adequate proof of "it's right". In fact I think > it's not right; you have not got the roundoff condition straight. OK, then maybe some refinement was needed. That seemed too easy. Mind you, the problem seems to me to be that TOAST_TUPLE_THRESHOLD is not solely a threshold to compare things to (as done in heapam.c/toasting.c), but gets reused to calculate TOAST_MAX_CHUNK_SIZE. If the threshold was solely used as that, alignment wouldn't matter. FYI, I took a sample table and loaded it into the resulting 8.3 backend based on the use of the totally naive TOAST_DENOMINATOR; there may be something off in the sizing of the chunks, but that does not appear to have injured the fidelity of the tuples I stored. Vacuum output:

--- Production system (no TOASTing)
INFO: "xml_log_table": found 0 removable, 1731329 nonremovable row versions in 175870 pages
DETAIL: 0 dead row versions cannot be removed yet.

--- In the 8.3 instance that did toast things...
INFO: "xml_log_table": found 0 removable, 1730737 nonremovable row versions in 41120 pages
DETAIL: 0 dead row versions cannot be removed yet. There were 0 unused item pointers. 610 pages contain useful free space. 0 pages are entirely empty. CPU 1.08s/0.36u sec elapsed 14.94 sec.
INFO: vacuuming "pg_toast.pg_toast_49194"
INFO: index "pg_toast_49194_index" now contains 2303864 row versions in 6319 pages
DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.09s/0.03u sec elapsed 1.71 sec.
INFO: "pg_toast_49194": found 0 removable, 2303864 nonremovable row versions in 98191 pages
DETAIL: 0 dead row versions cannot be removed yet. There were 0 unused item pointers. 514 pages contain useful free space. 0 pages are entirely empty.

Problem with alignment of TOAST_MAX_CHUNK_SIZE or not, I seem to be getting the right results, and this nicely partitions the table into two pieces: the main table, with the non-XML data, occupying 41K pages, and the TOAST section storing the less-frequently-accessed columns. (There is a size difference; the production instance has more empty space since it sees active inserts + deletes.) In all ways except for "strict hygienic correctness of code," this accomplished what I was hoping. If someone could make a round-off-safe calculation of TOAST_TUPLE_THRESHOLD and TOAST_MAX_CHUNK_SIZE that exposed the denominator so it could be safely modified, that would be one step ahead... I generally try not to make changes to the core, so I'll try to avoid that...
I infer from that that the code starts by checking to see if the tuple size is > TOAST_TUPLE_THRESHOLD, and only starts rummaging through TOAST infrastructure if the tuple is big enough. In that case, "TOAST by default" becomes rather a nonstarter, I agree. In the application context I'm thinking of, one table out of ~80 is a "TOAST candidate;" making access to the other 79 slower would not be of benefit. (Aside: I'll link to Simon Riggs's related note, as well as to the item on the TODO list...) <http://archives.postgresql.org/pgsql-hackers/2007-02/msg00213.php> <http://www.postgresql.org/docs/faqs.TODO.html> -- let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;; http://cbbrowne.com/info/finances.html Where do you *not* want to go today? "Confutatis maledictis, flammis acribus addictis" (<http://www.hex.net/~cbbrowne/msprobs.html>)
On Wed, Mar 21, 2007 at 12:37:36PM -0400, Chris Browne wrote: > 4. A different mechanism would be to add a fifth storage column > strategy (the present four are PLAIN, EXTENDED, EXTERNAL, MAIN), let's > say, TOAST. > > At present, the 4 values are essentially advisory; columns get TOASTed > if the column permits EXTENDED storage, but that only occurs if the > size is greater than TOAST_TUPLE_THRESHOLD. > > If the new value was chosen, the column would *always* get stored as > TOAST. Rather than a hard and fast limit of 0, why not allow defining a size threshold? And while we're at it, how about a size threshold for when to try compressing, too? > Presumably #1 or #2 could readily get into 8.3 as they're pretty easy; > #3 is a bit trickier, whilst #4 is probably not "8.3-fittable". > > Question: > > Which of these sounds preferable? 1 and 2 (variations on how to set the denominator) sound completely ugly. Trying to minimize wasted space in a toast table is great for a default, but exposing something like that to the users via any kind of setting seems way too obtuse. #3 (GUC for number of bytes) may not make sense for performance reasons, as Tom mentioned. I'm hoping that it would be easy to check either pg_class or pg_attribute to see if a table/column has non-standard toast/compression limits. -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
I advocate the following: - Enable specification of TOAST policy on a per column basis As a first step, then: - Enable vertical partitioning of tables using per-column specification of storage policy. - Luke On 3/21/07 1:12 PM, "Bruce Momjian" <bruce@momjian.us> wrote: > > Is this a TODO? > > --------------------------------------------------------------------------- > > Jan Wieck wrote: >> On 3/21/2007 2:05 PM, Tom Lane wrote: >>> Chris Browne <cbbrowne@acm.org> writes: >>>> #define TOAST_DENOMINATOR 17 >>>> /* Use this as the divisor; current default behaviour falls from >>>> TOAST_DENOMINATOR = 4 */ >>> >>>> #define TOAST_TUPLE_THRESHOLD^I\ >>>> ^IMAXALIGN_DOWN((BLCKSZ - \ >>>> ^I^I^I^I MAXALIGN(sizeof(PageHeaderData) + 3 * sizeof(ItemIdData))) \ >>>> ^I^I^I^I / TOAST_DENOMINATOR) >>> >>> Given that you are quoting code that was demonstrably broken since the >>> original coding of TOAST up till a month or two back, "it passes >>> regression" is not adequate proof of "it's right". In fact I think >>> it's not right; you have not got the roundoff condition straight. >>> >>>> 4. A different mechanism would be to add a fifth storage column >>>> strategy (the present four are PLAIN, EXTENDED, EXTERNAL, MAIN), let's >>>> say, TOAST. >> >> FORCE_COMPRESSION, FORCE_EXTERNAL and FORCE_EXTERNAL_UNCOMPRESSED. >> >>> >>> Anything along this line would require invoking the toaster on every >>> single tuple, since we'd always have to crawl through all the columns >>> to see if toasting was supposed to happen. No thanks. >> >> Not necessarily. A flag in Relation telling if the table has any column >> marked like that could be set while constructing the relcache entry. >> >>> >>>> Which of these sounds preferable? >>> >>> It's a bit late in the cycle to be proposing any of these for 8.3. >> >> Certainly. >> >> >> Jan >> >> -- >> #======================================================================# >> # It's easier to get forgiveness for being wrong than for being right. # >> # Let's break this rule - forgive me. # >> #================================================== JanWieck@Yahoo.com # >> >> ---------------------------(end of broadcast)--------------------------- >> TIP 2: Don't 'kill -9' the postmaster
Luke Lonergan wrote: > I advocate the following: > > - Enable specification of TOAST policy on a per column basis > > As a first step, then: > > - Enable vertical partitioning of tables using per-column specification of > storage policy. > Wouldn't it be enough to enable having the toast table on a different table space? Regards, Andreas
Andreas, On 3/22/07 9:40 AM, "Andreas Pflug" <pgadmin@pse-consulting.de> wrote: > Wouldn't it be enough to enable having the toast table on a different > table space? Yes, but the ultimate goal would be to allow the allocation of a storage mechanism that is unique to each column. The most frequently used mechanism for our customers would likely be to have each column associated with its own internal relation (like a separate TOAST table), which puts each column into its own dense page storage. Besides the advantages of separating out key columns from data columns, compression and encryption approaches that use column-packed data are much more effective. As Tom points out there are complications WRT update, access, etc that need to be worked out to support this, but it's an important capability to have IMO. - Luke
llonergan@greenplum.com ("Luke Lonergan") writes: > Andreas, > > On 3/22/07 9:40 AM, "Andreas Pflug" <pgadmin@pse-consulting.de> wrote: > >> Wouldn't it be enough to enable having the toast table on a different >> table space? > > Yes, but the ultimate goal would allow the allocation of a storage mechanism > that is unique to each column. The most frequently used mechanism for our > customers would likely be to have each column associated with it's own > internal relation (like a separate TOAST table), which puts each column into > it's own dense page storage. > > Beside the advantages of separating out keys columns from data columns, > compression and encryption approaches that use column packed data are much > more effective. > > As Tom points out there are complications WRT update, access, etc that need > to be worked out to support this, but it's an important capability to have > IMO. Hmm. Are you trying to do something sort of like CStore? http://db.csail.mit.edu/projects/cstore/ That seems to have some relevant ideas... -- let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;; http://cbbrowne.com/info/finances.html Where do you *not* want to go today? "Confutatis maledictis, flammis acribus addictis" (<http://www.hex.net/~cbbrowne/msprobs.html>
Chris, > Hmm. Are you trying to do something sort of like CStore? > > http://db.csail.mit.edu/projects/cstore/ > > That seems to have some relevant ideas... I think "something like" is a good way to put it. As you know, CStore was a prototype for Vertica, and these are in the same class as SybaseIQ and SandDB. The huge drawback of the pure column approach is that update/insert while querying is difficult if not impossible. I think there are hybrid approaches that yield most, if not all, of the benefits of the column store approach without the disadvantages. For instance, a bitmap index with index only access in a row-store may outperform the column store on queries. Note the "index only access" part. The next advantage of a column store is deep compression, preserved through the executor access path - we can do this with selective vertical partitioning using a page-segmented WAH compression similar to what we did with the bitmap index. Lastly, vectorization of the operators in the executor can be implemented with vertical partitioning and an access path that does projection before feeding the columns into the executor - this can be done in Postgres with a cache-bypass method. Some of this requires working out answers to the visibility challenges inherent to MVCC, but it's all possible IMO. So - under the guise of "TOASTing smaller things", it seems relevant to think about vertical partitioning, perhaps making use of what's already in Postgres as baby steps toward more advanced features. - Luke
On one fine day, Thu, 2007-03-22 at 10:19, Luke Lonergan wrote: > Andreas, > > On 3/22/07 9:40 AM, "Andreas Pflug" <pgadmin@pse-consulting.de> wrote: > > > Wouldn't it be enough to enable having the toast table on a different > > table space? > > Yes, but the ultimate goal would allow the allocation of a storage mechanism > that is unique to each column. The most frequently used mechanism for our > customers would likely be to have each column associated with it's own > internal relation (like a separate TOAST table), which puts each column into > it's own dense page storage. Maybe monetdb ( http://monetdb.cwi.nl/ ) can be of some inspiration ? btw, it may be a good idea to have a version of bizgresMPP which has monetdb as partition db, if monetdb is as efficient as they say it is. > > Beside the advantages of separating out keys columns from data columns, > compression and encryption approaches that use column packed data are much > more effective. > > As Tom points out there are complications WRT update, access, etc that need > to be worked out to support this, but it's an important capability to have > IMO. > > - Luke > > > > ---------------------------(end of broadcast)--------------------------- > TIP 7: You can help support the PostgreSQL project by donating at > > http://www.postgresql.org/about/donate -- ---------------- Hannu Krosing Database Architect Skype Technologies OÜ Akadeemia tee 21 F, Tallinn, 12618, Estonia Skype me: callto:hkrosing Get Skype for free: http://www.skype.com
Hi Hannu, On 3/22/07 3:21 PM, "Hannu Krosing" <hannu@skype.net> wrote: > Maybe monetdb ( http://monetdb.cwi.nl/ ) can be of some inspiration ? > > btw, it may be a good idea to have a version of bizgresMPP which has > monetdb as partition db, if monetdb is as efficient as they say it is. Yep - I've talked this over with the MonetDB folks in the past. The major benefits they observe are those same things we talk about here: implementation of long loops for operators and de-abstraction of operations like compare() when appropriate, say comparing two INT columns in a sort. We can get many of those benefits by "vectorizing" the executor of PostgreSQL even without the full column partitioning. We're in the midst of working on some of those changes as we speak. Early indications are that we see large performance gains from this approach. Note that the actual instruction counts per row don't change, but the more effective use of L2 I and D cache and superscalar instruction units on the CPU create the big gains. The MonetDB people present some effective literature on this, but the important gains mostly come from the "vectorization", not the operator de-abstraction IMO, which is good news for us all. - Luke
Luke Lonergan wrote: > I advocate the following: > > - Enable specification of TOAST policy on a per column basis > > As a first step, then: > > - Enable vertical partitioning of tables using per-column specification of > storage policy. > How are these different from ALTER TABLE SET STORAGE? They need to be more specific. --------------------------------------------------------------------------- > - Luke > > > On 3/21/07 1:12 PM, "Bruce Momjian" <bruce@momjian.us> wrote: > > > > > Is this a TODO? > > > > --------------------------------------------------------------------------- > > > > Jan Wieck wrote: > >> On 3/21/2007 2:05 PM, Tom Lane wrote: > >>> Chris Browne <cbbrowne@acm.org> writes: > >>>> #define TOAST_DENOMINATOR 17 > >>>> /* Use this as the divisor; current default behaviour falls from > >>>> TOAST_DENOMINATOR = 4 */ > >>> > >>>> #define TOAST_TUPLE_THRESHOLD^I\ > >>>> ^IMAXALIGN_DOWN((BLCKSZ - \ > >>>> ^I^I^I^I MAXALIGN(sizeof(PageHeaderData) + 3 * sizeof(ItemIdData))) \ > >>>> ^I^I^I^I / TOAST_DENOMINATOR) > >>> > >>> Given that you are quoting code that was demonstrably broken since the > >>> original coding of TOAST up till a month or two back, "it passes > >>> regression" is not adequate proof of "it's right". In fact I think > >>> it's not right; you have not got the roundoff condition straight. > >>> > >>>> 4. A different mechanism would be to add a fifth storage column > >>>> strategy (the present four are PLAIN, EXTENDED, EXTERNAL, MAIN), let's > >>>> say, TOAST. > >> > >> FORCE_COMPRESSION, FORCE_EXTERNAL and FORCE_EXTERNAL_UNCOMPRESSED. > >> > >>> > >>> Anything along this line would require invoking the toaster on every > >>> single tuple, since we'd always have to crawl through all the columns > >>> to see if toasting was supposed to happen. No thanks. > >> > >> Not necessarily. A flag in Relation telling if the table has any column > >> marked like that could be set while constructing the relcache entry. > >> > >>> > >>>> Which of these sounds preferable? > >>> > >>> It's a bit late in the cycle to be proposing any of these for 8.3. > >> > >> Certainly. > >> > >> > >> Jan > >> > >> -- > >> #======================================================================# > >> # It's easier to get forgiveness for being wrong than for being right. # > >> # Let's break this rule - forgive me. # > >> #================================================== JanWieck@Yahoo.com # > >> > >> ---------------------------(end of broadcast)--------------------------- > >> TIP 2: Don't 'kill -9' the postmaster -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
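For reference, the per-column knob Bruce is referring to already selects one of the four storage strategies named earlier in the thread, though it does not change the size threshold at which the toaster kicks in. The table and column names below simply reuse the XML example from earlier and are illustrative:

    ALTER TABLE table_with_xml
        ALTER COLUMN xml SET STORAGE EXTERNAL;  -- store out of line, uncompressed
    ALTER TABLE table_with_xml
        ALTER COLUMN xml SET STORAGE MAIN;      -- prefer to keep inline, compressed

The proposals in this thread would add either a threshold setting or a "force out of line" strategy on top of that.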
Hi Bruce, How about these:
- Allow specification of TOAST size threshold in bytes on a per column basis
- Enable storage of columns in separate TOAST tables
- Enable use of multi-row compression method(s) for TOAST tables
- Luke
On 3/26/07 5:39 PM, "Bruce Momjian" <bruce@momjian.us> wrote: > Luke Lonergan wrote: >> I advocate the following: >> >> - Enable specification of TOAST policy on a per column basis >> >> As a first step, then: >> >> - Enable vertical partitioning of tables using per-column specification of >> storage policy. >> > > How are these different from ALTER TABLE SET STORAGE? They need to be > more specific.
Luke Lonergan wrote: > Hi Bruce, > > How about these: > > - Allow specification of TOAST size threshold in bytes on a per column basis > - Enable storage of columns in separate TOAST tables > - Enable use of multi-row compression method(s) for TOAST tables At this point I would be happy just to set the TOAST threshold to a value defined as optimal, rather than as the most minimal use of TOAST possible. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Hi Bruce, On 4/12/07 9:24 AM, "Bruce Momjian" <bruce@momjian.us> wrote: > Luke Lonergan wrote: >> Hi Bruce, >> >> How about these: >> >> - Allow specification of TOAST size threshold in bytes on a per column basis >> - Enable storage of columns in separate TOAST tables >> - Enable use of multi-row compression method(s) for TOAST tables > > At this point I would be happy just to set the TOAST threshold to a > value defined as optimal, rather than as the most minimal use of TOAST > possible. I agree that's a good starting point, I guess I was thinking that was already included in the work that Tom has been doing. If not, we can add a TODO like this as a precursor to the ones above: - Allow specification of TOAST size threshold (in bytes) on a per table basis - Luke
"Luke Lonergan" <llonergan@greenplum.com> writes: > On 4/12/07 9:24 AM, "Bruce Momjian" <bruce@momjian.us> wrote: >> At this point I would be happy just to set the TOAST threshold to a >> value defined as optimal, rather than as the most minimal use of TOAST >> possible. > I agree that's a good starting point, I guess I was thinking that was > already included in the work that Tom has been doing. No. I put in the code needed to decouple toast tuple size from toasting threshold, but I don't have the time or interest to run performance tests to see whether there are better default values than the historical quarter-page values. Someone should do that before 8.3 beta ... > If not, we can add a > TODO like this as a precursor to the ones above: > - Allow specification of TOAST size threshold (in bytes) on a per table > basis I would suggest that *all* of those TODOs are premature in the absence of experimental evidence about the effect of varying the parameters. If we end up finding out that the existing settings are about right anyway across a range of test cases, who needs more knobs? We've got too many mostly-useless knobs already. regards, tom lane
Hi Tom, On 4/12/07 1:40 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote: > I would suggest that *all* of those TODOs are premature in the absence > of experimental evidence about the effect of varying the parameters. > If we end up finding out that the existing settings are about right > anyway across a range of test cases, who needs more knobs? We've got > too many mostly-useless knobs already. This set of TODOs is really about a binary change in behavior that data warehouse users will employ to shift given columns into a separate storage mechanism while preserving their schema. By contrast, the knob you describe is about tuning the existing storage mechanism by offlining values that are too large. We can talk about these TODOs as a group using the name "vertical partitioning" if that suits. To demonstrate the effectiveness of vertical partitioning, we would write queries that use the partitioned columns independently or as groups that correlate with the storage mechanism. The other demonstration applies to the use of compression techniques that align with the column type(s) and operate across tuple boundaries within pages. Examples there include the segmentation of fixed width types and variable width types into separate page storage classes, which allows for the application of different compression and/or representations on pages. - Luke
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > I would suggest that *all* of those TODOs are premature in the absence > of experimental evidence about the effect of varying the parameters. > If we end up finding out that the existing settings are about right > anyway across a range of test cases, who needs more knobs? We've got > too many mostly-useless knobs already. Isn't it obvious that the "right" value is going to depend extraordinarily heavily on the precise usage pattern though? A typical table with 100-byte columns which are normally read with the rest of the columns, sometimes in sequential scans or updates, will find TOASTing them counter-productive as it t urns those all into additional random access i/o and extra inserts and deletes. Another system with a dozen 100-byte columns that are virtually never accessed and a handful of heavily-updated integer columns will benefit heavily from TOAST as it changes the table from an awfully performing 5-6 tuple/page table into a tremendously performing 100+ tuple/page table. We're not ever going to be able to predict when data is being stored what future usage pattern to expect, at least not realistically. The best we can hope for is to find the tipping point at which the cost if we guess wrong is some tolerable level of pain and set that to be the default and giving the tools to the user to obtain the benefit in the cases where he knows it'll help. Or perhaps TOAST is the wrong kind of vertical partitioning for this. Perhaps we should be using TOAST to deal with the "large datum" problem and have a different looking tool entirely for the "vertical partitioning rarely used columns" problem. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Gregory Stark <stark@enterprisedb.com> writes: > "Tom Lane" <tgl@sss.pgh.pa.us> writes: >> I would suggest that *all* of those TODOs are premature in the absence >> of experimental evidence about the effect of varying the parameters. > Isn't it obvious that the "right" value is going to depend extraordinarily > heavily on the precise usage pattern though? It's not yet "obvious" that there's any win to be had at all. AFAIK no one has done any playing around with alternative TOAST settings. It could be that the mechanism is simply not very sensitive to those values. > Or perhaps TOAST is the wrong kind of vertical partitioning for this. Exactly my point --- quoting anecdotes about wins from someone else's vertical partitioning implementation doesn't really prove a darn thing about how TOAST will behave. It's suggestive, but I'd like to see some actual experimental evidence before we start constructing a lot of infrastructure. "Infrastructure first, learn how to use it later" is the philosophy that's given us nigh-useless stuff like commit_delay. regards, tom lane
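For anyone who wants to gather the experimental evidence Tom asks for, a minimal starting point might look like the following sketch (table names, row counts and payload sizes are made up; the point is only to contrast in-line versus out-of-line storage and then time representative queries):

-- same data under two per-column storage strategies (hypothetical test tables)
create table t_inline (id int, payload text);
alter table t_inline alter column payload set storage plain;     -- keep the value in the heap tuple
create table t_toast (id int, payload text);
alter table t_toast alter column payload set storage external;   -- push it out of line, uncompressed

insert into t_inline select g, repeat('x', 4000) from generate_series(1, 50000) g;
insert into t_toast  select g, repeat('x', 4000) from generate_series(1, 50000) g;

-- compare main-heap size against total size (heap + TOAST + indexes)
select relname,
       pg_size_pretty(pg_relation_size(oid))       as heap,
       pg_size_pretty(pg_total_relation_size(oid)) as total
  from pg_class
 where relname in ('t_inline', 't_toast');

-- then time the access patterns that matter, e.g. a scan that ignores the wide column
explain analyze select count(*) from t_inline where id % 100 = 0;
explain analyze select count(*) from t_toast  where id % 100 = 0;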
I'm seeing some applications where it appears that there would be value in introducing asynchronous messaging, ala "message queueing." <http://en.wikipedia.org/wiki/Message_queue> The "granddaddy" of message queuing systems is IBM's MQ-Series, and I don't see particular value in replicating its functionality. On the other side, the "big names" these days are: a) The Java Messaging Service, which seems to implement *way* more options than I'm even vaguely interested in having (notably, lots that involve data stores or lack thereof that I do not care to use); b) AMQP <http://en.wikipedia.org/wiki/Advanced_Message_Queuing_Protocol> which again is a big, complex, protocol-oriented thing; c) There are lesser names, like isectd <http://isectd.sf.net> and the (infamous?) Spread Toolkit, which both implement memory-based messaging systems. There's a simple + free system called MQS that is implemented in Perl and uses XML-RPC <http://www.foo.be/mqs/>; that seems perhaps a little *too* primitive. FYI, here's its simple schema (filtered into PG):
create table mqname (
    name varchar (255) not null default '',
    id serial,
    priority integer not null default 10,
    primary key(id));
create table message (
    id serial,
    timestamp timestamptz NOT NULL,
    cid varchar(255),
    queue integer not null references mqname(id),
    body text not null,
    priority integer not null default 10,
    flag integer default 0,
    primary key(id) );
My bias would be to have something that can basically run as a thin set of stored procedures atop PostgreSQL :-). It would be trivial to extend that to support SOAP/XML-RPC, if desired. It would be nice to achieve 'higher availability' by having queues where you might replicate the contents (probably using the MQ system itself ;-)) to other servers. There tend to be varying semantics out there: - Some queues may represent "subscriptions" where a whole bunch of listeners want to get all the messages; - Sometimes you have the semantics where: - messages need to be delivered at least once - messages need to be delivered no more than once - messages need to be delivered exactly once Is there any existing work out there on this? Or should I maybe be looking at prototyping something? -- (format nil "~S@~S" "cbbrowne" "linuxfinances.info") http://linuxfinances.info/info/lsf.html Q: How many Newtons does it take to change a light bulb? A: Faux! There to eat lemons, axe gravy soup!
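To make the "thin set of stored procedures atop PostgreSQL" idea a bit more concrete, here is one naive sketch against the MQS-style schema above (the queue name, correlation id and locking approach are all illustrative assumptions; concurrent consumers will serialize on the row lock, and this ignores refinements such as skipping locked rows):

-- enqueue: just an insert into the message table from the schema above
insert into message ("timestamp", cid, queue, body, priority)
values (now(), 'corr-123',
        (select id from mqname where name = 'billing'),
        'hello, world', 10);

-- dequeue: atomically claim and remove the oldest highest-priority message
-- (DELETE ... RETURNING needs 8.2 or later)
delete from message
 where id = (select id
               from message
              where queue = (select id from mqname where name = 'billing')
              order by priority, id
              limit 1
              for update)
returning id, cid, body;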
On Jun 19, 2007, at 2:45 PM, Chris Browne wrote: > I'm seeing some applications where it appears that there would be > value in introducing asynchronous messaging, ala "message queueing." > <http://en.wikipedia.org/wiki/Message_queue> Me too. > My bias would be to have something that can basically run as a thin > set of stored procedures atop PostgreSQL :-). It would be trivial to > extend that to support SOAP/XML-RPC, if desired. > > It would be nice to achieve 'higher availability' by having queues > where you might replicate the contents (probably using the MQ system > itself ;-)) to other servers. > > There tend to be varying semantics out there: > > - Some queues may represent "subscriptions" where a whole bunch of > listeners want to get all the messages; > > - Sometimes you have the semantics where: > - messages need to be delivered at least once > - messages need to be delivered no more than once > - messages need to be delivered exactly once > > Is there any existing work out there on this? Or should I maybe be > looking at prototyping something? The skype tools have some sort of decent-looking publish/subscribe thing, PgQ, which they then layer their replication on top of. It's multi-consumer and multi-producer, with "delivered at least once" semantics. Looks nice. Cheers, Steve
steve@blighty.com (Steve Atkins) writes: >> Is there any existing work out there on this? Or should I maybe be >> looking at prototyping something? > > The skype tools have some sort of decent-looking publish/subscribe > thing, PgQ, then they layer their replication on top of. It's multi > consumer and producer, with "delivered at least once" semantics. > > Looks nice. I had not really noticed that - I need to take a look at their connection pooler too, so I guess that puts more "skype" items on my ToDo list ;-). Thanks for pointing it out... -- let name="cbbrowne" and tld="linuxdatabases.info" in String.concat "@" [name;tld];; http://cbbrowne.com/info/advocacy.html Signs of a Klingon Programmer #1: "Our users will know fear and cower before our software. Ship it! Ship it and let them flee like the dogs they are!"
On Wed, June 20, 2007 04:45, Chris Browne wrote: > I'm seeing some applications where it appears that there would be > value in introducing asynchronous messaging, ala "message queueing." > <http://en.wikipedia.org/wiki/Message_queue> > > The "granddaddy" of message queuing systems is IBM's MQ-Series, and I > don't see particular value in replicating its functionality. I'm quite interested in this. Maybe I'm thinking of something too complex, but I do think there are some "oh it'll need to do that too" pitfalls that are best considered up front. The big thing about MQ is that it participates as a resource manager in two-phase commits (and optionally a transaction manager as well). That means that you get atomic processing steps: application takes message off a queue, processes it, commits its changes to the database, replies to message. The queue manager then does a second-phase commit for all of those steps, and that's when the reply really goes out. If the application fails, none of this will have happened so you get ACID over the complete cycle. That's something we should have free software for. Perhaps the time is right for something new. A lot of the complexity inside MQ comes from data representation issues like encodings and fixed-length strings, as I recall, and things have changed since MQ was designed. I agree it could be useful (and probably not hard either) to have a transactional messaging system inside the database. It saves you from having to do two-phase commits. But it does tie everything to postgres to some extent, and you lose the interesting features (atomicity and assured, single delivery) as soon as anything in the chain does anything persistent that does not participate in the postgres transaction. Perhaps what we really need is more mature components, with a unified control layer on top. That's how a lot of successful free software grows. See below. > On the other side, the "big names" these days are: > > a) The Java Messaging Service, which seems to implement *way* more > options than I'm even vaguely interested in having (notably, lots > that involve data stores or lack thereof that I do not care to use); Far as I know, JMS is an API, not a product. You'd still slot some messaging middleware underneath, such as MQ. That is why MQSeries was renamed: it fits into the WebSphere suite as the implementing engine underneath the JMS API. From what I understand MQ is one of the "best-of-breed" products that JMS was designed around. (Sun's term, bit hypey for my taste). In one way, Java is easy: the last thing you want to get into is yet another marshaling standard. There are plenty of "standards" to choose from already, each married to one particular communications mechanism: RPC, EDI, CORBA, D-Bus, XMLRPC, what have you. Even postgres has its own. I'd say the most successful mechanism is TCP itself, because it isolates itself from content representation so effectively. It's hard not to get into marshaling: someone has to do it, and it's often a drag to do it in the application, but the way things stand now *any* choice limits the usefulness of what you're building. That's something I'd like to see change. Personally I'd love to see marshaling or low-level data representation isolated into a mature component that speaks multiple programming languages on the one hand and multiple data representation formats on the other. Something the implementers of some of these messaging standards would want to use to compose their messages, isolating their format definitions into plugins. 
Something that would make application writers stop composing messages in finicky ad-hoc code that fails with unexpected locales or has trouble with different line breaks. If we had a component like that, combining it with existing transactional variants of TCP and [S]HTTP might even be enough to build a usable messaging system. I haven't looked at them enough to know. Of course we'd need implementations of those protocols; see http://ttcplinux.sourceforge.net/ and http://www.csn.ul.ie/~heathclf/fyp/ for example. Another box of important tools, and I have no idea where we stand with this one, is transaction management. We have 2-phase commit in postgres now. But do we have interoperability with existing transaction managers? Is there a decent free, portable, everything-agnostic transaction manager? With those, the sphere of reliability of a database-driven messaging package could extend much further. A free XA-capable filesystem would be great too, but I guess I'm daydreaming. > There tend to be varying semantics out there: > > - Some queues may represent "subscriptions" where a whole bunch of > listeners want to get all the messages; The two simplest models that offer something more than TCP/UDP are 1:n reliable publish-subscribe without persistence, and 1:1 request-reply with persistent storage. D-Bus does them both; IIRC MQ does 1:1 and has add-ons on top for publish-subscribe. I could imagine variations such as persistent publish-subscribe, where you can come back once in a while and see if your subscriptions caught anything since your last visit. But such things probably get more complex and less useful as you add more ideas. On top of that goes communication model: symmetric or asymmetric, synchronous or asynchronous. Do you end up with a "remote procedure call" model like RPC, D-Bus, CORBA? Or do you stick with a pure message/event view of communication? Personally I think it's good not to intrude into the application's event loop too much, but others seem to feel the central event loop should not be part of application code. > - Sometimes you have the semantics where: > - messages need to be delivered at least once > - messages need to be delivered no more than once > - messages need to be delivered exactly once IMHO, if you're not doing "exactly once," or something very close to it, you might as well stay with ad-hoc code. You can ensure single delivery by having the sender re-send when in doubt, and keeping track of duplications in the recipient. > Is there any existing work out there on this? Or should I maybe be > looking at prototyping something? I've looked around a bit (not much) and not found anything very generally useful. I think it's an exciting area that probably needs work, so prototyping might be a good idea. If nothing else, I hope I've given you some examples of what you don't want to get yourself into. :-) Jeroen
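Since Jeroen mentions that postgres now has two-phase commit, the database-side building block an external transaction or queue manager would drive looks like this (a bare sketch; it assumes max_prepared_transactions is set above zero, the table is the queue table sketched earlier, and the global transaction id is invented):

begin;
insert into message ("timestamp", cid, queue, body)
values (now(), 'corr-1', 1, 'payload');
prepare transaction 'mq-delivery-0001';  -- phase one: the work is made durable but not yet committed

-- later, once every participant has prepared successfully:
commit prepared 'mq-delivery-0001';
-- or, if any participant failed to prepare:
-- rollback prepared 'mq-delivery-0001';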
Hi Chris, Chris Browne wrote: > I'm seeing some applications where it appears that there would be > value in introducing asynchronous messaging, ala "message queueing." > <http://en.wikipedia.org/wiki/Message_queue> ISTM that 'message queue' is a way too general term. There are hundreds of different queues at different levels on a standard server. So I'm somewhat unsure about what problem you want to solve. > c) There are lesser names, like isectd <http://isectd.sf.net> and the > (infamous?) Spread Toolkit which both implement memory-based messaging > systems. If a GCS is about what you're looking for, then you also might want to consider these: ensemble, appia or jGroups. There's a Java layer called jGCS, which supports even more, similar systems. Another commonly used term is 'reliable multicast', which guarantees that messages are delivered to a group of recipients. These algorithms often are the basis for a GCS. > My bias would be to have something that can basically run as a thin > set of stored procedures atop PostgreSQL :-). It would be trivial to > extend that to support SOAP/XML-RPC, if desired. Hm.. in Postgres-R I currently have (partial) support for ensemble and spread. Exporting that interface via stored procedures could be done, but you would probably need a manager process, as you certainly want your connections to persist across transactions (or not?). Together with that process, we already have half of what Postgres-R is: an additional process which connects to the GCS. Thus I'm questioning, if there's value for exporting the interface. Can you think of other use cases than database replication? Why do you want to do that via the database, then, and not directly with the GCS? > It would be nice to achieve 'higher availability' by having queues > where you might replicate the contents (probably using the MQ system > itself ;-)) to other servers. Uhm.. sorry, but I fail to see the big news here. Which replication solution does *not* work that way? Regards Markus
On 6/20/07, Jeroen T. Vermeulen <jtv@xs4all.nl> wrote: > On Wed, June 20, 2007 04:45, Chris Browne wrote: > > - Sometimes you have the semantics where: > > - messages need to be delivered at least once > > - messages need to be delivered no more than once > > - messages need to be delivered exactly once > > IMHO, if you're not doing "exactly once," or something very close to it, > you might as well stay with ad-hoc code. You can ensure single delivery > by having the sender re-send when in doubt, and keeping track of > duplications in the recipient. In the case of PGQ, the "at least once" semantics is related to the batch-based processing it does - in case of failure, the full batch is delivered again, so if the consumer had already managed to process some of the items, it gets them again. As it is responsible only for delivering events from the database, it has no way of guaranteeing "exactly once" behaviour; that needs to be built on top of PGQ. The simplest case is when the events are processed in the same database where the queue resides. Then you can just fetch, process, and close the batch in one transaction, and immediately you get "exactly once" behaviour. To achieve "exactly once" behaviour with different databases, look at the "pgq_ext" module for a sample. Basically it just requires storing the batch_id/event_id on the remote db and committing there first. Later it can be checked whether the batch/event is already processed. It's tricky only if you want to achieve full transactionality for event processing. As I understand, JMS does not have a concept of transactions, probably also other solutions mentioned before, so to use PgQ as backend for them should be much simpler... To Chris: you should like PgQ, it's just stored procs in the database, plus it's basically just a generalized Slony-I, with some optimizations, so it should be familiar territory ;)  -- marko
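Spelled out against PgQ's SQL-level interface, Marko's "fetch, process, close batch in one transaction" loop looks roughly like this (a sketch from memory, so treat the exact function signatures as an assumption; 'myqueue' and 'myconsumer' are made-up names, and <batch_id> stands for the value returned by pgq.next_batch):

-- one-time setup
select pgq.create_queue('myqueue');
select pgq.register_consumer('myqueue', 'myconsumer');

-- producer side: push an event into the queue
select pgq.insert_event('myqueue', 'row_change', 'some payload');

-- consumer side: one batch per transaction
begin;
select pgq.next_batch('myqueue', 'myconsumer');   -- returns a batch_id, or NULL when there is nothing new
select * from pgq.get_batch_events(<batch_id>);   -- process these events here, in the same transaction
select pgq.finish_batch(<batch_id>);
commit;  -- the events and their processing commit or roll back together: "exactly once"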
Marko Kreen wrote: > As I understand, JMS does not have a concept > of transactions, probably also other solutions mentioned before, > so to use PgQ as backend for them should be much simpler... JMS certainly does have the concept of transactions. Both distributed ones through XA and two-phase commit, and local involving just one JMS provider. I don't know about others, but would be surprised if they didn't. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On 6/20/07, Heikki Linnakangas <heikki@enterprisedb.com> wrote: > Marko Kreen wrote: > > As I understand, JMS does not have a concept > > of transactions, probably also other solutions mentioned before, > > so to use PgQ as backend for them should be much simpler... > > JMS certainly does have the concept of transactions. Both distributed > ones through XA and two-phase commit, and local involving just one JMS > provider. I don't know about others, but would be surprised if they didn't. Ah, sorry, my mistake then. Shouldn't trust hearsay :) -- marko
On Wed, June 20, 2007 18:18, Heikki Linnakangas wrote: > Marko Kreen wrote: >> As I understand, JMS does not have a concept >> of transactions, probably also other solutions mentioned before, >> so to use PgQ as backend for them should be much simpler... > > JMS certainly does have the concept of transactions. Both distributed > ones through XA and two-phase commit, and local involving just one JMS > provider. I don't know about others, but would be surprised if they > didn't. Wait... I thought XA did two-phase commit, and then there was XA+ for *distributed* two-phase commit, which is much harder? Jeroen
Jeroen T. Vermeulen wrote: > On Wed, June 20, 2007 18:18, Heikki Linnakangas wrote: >> Marko Kreen wrote: >>> As I understand, JMS does not have a concept >>> of transactions, probably also other solutions mentioned before, >>> so to use PgQ as backend for them should be much simpler... >> JMS certainly does have the concept of transactions. Both distributed >> ones through XA and two-phase commit, and local involving just one JMS >> provider. I don't know about others, but would be surprised if they >> didn't. > > Wait... I thought XA did two-phase commit, and then there was XA+ for > *distributed* two-phase commit, which is much harder? Well, I meant distributed as in one transaction manager, multiple resource managers, all participating in a single atomic transaction. I don't know what XA+ adds on top of that. To be precise, being a Java-thing, JMS actually supports two-phase commit through JTA (Java Transaction API), not XA. It's the same design and interface, just defined as Java interfaces instead of at native library level. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
markokr@gmail.com ("Marko Kreen") writes: > To Chris: you should like PgQ, its just stored procs in database, > plus it's basically just generalized Slony-I, with some optimizations, > so should be familiar territory ;) Looks interesting... Random ideas ============ - insert_event in C (way to get rid of plpython) Yeah, I'm with that... Ever tried building [foo] on AIX, where foo in ('perl', 'python', ...)??? :-( It seems rather excessive to add in a whole stored procedure language simply for one function... -- (format nil "~S@~S" "cbbrowne" "linuxdatabases.info") http://www3.sympatico.ca/cbbrowne/sgml.html I always try to do things in chronological order.
On 6/20/07, Chris Browne <cbbrowne@acm.org> wrote: > markokr@gmail.com ("Marko Kreen") writes: > > To Chris: you should like PgQ, its just stored procs in database, > > plus it's basically just generalized Slony-I, with some optimizations, > > so should be familiar territory ;) > > Looks interesting... Thanks :) > Random ideas > ============ > - insert_event in C (way to get rid of plpython) > > Yeah, I'm with that... Ever tried building [foo] on AIX, where foo in > ('perl', 'python', ...)??? :-( > > It seems rather excessive to add in a whole stored procedure language > simply for one function... Well, it's standard in our installations as we use it for other stuff too. It's much easier to prototype in PL/Python than in C... As it has not been performance problem I have not bothered to rewrite it. But now the interface has been stable some time, it could be done. -- marko
I recently had a chat with someone who was pretty intimate with Adabas for a number of years who's in the process of figuring things out about PostgreSQL. We poked at bits of the respective implementations, seeing some similarities and differences. He pointed out one aspect of index handling that could (in principle) be an interesting optimization. Evidently, in Adabas, index leaf nodes were not simply tuples, but lists where the index value would not be repeated. In PostgreSQL, if you have the index value 'abc', and there are 10 tuples with that value, then you'll have a page full of tuples of the following form: |abc|ptr[rec1]|abc|ptr[rec2]|abc|ptr[rec3]| ...and so forth... Now, the Adabas approach was rather different. It would only have the index value once, and then have the list of tuple pointers: |abc|ptr[rec1],ptr[rec2],ptr[rec3],...,ptr[rec10]| This could allow a fair bit of compression, for cases where the index value is not unique. There is a concomitant downside, that concurrent updates may fight over a page, and, since there would be a higher density, there would be more need to fight over pages. Does this seem pretty much like madness? Or is it a plausible "some day ToDo"? -- "cbbrowne","@","acm.org" http://linuxfinances.info/info/postgresql.html "I don't do drugs anymore 'cause I find I get the same effect just by standing up really fast." -- Jonathan Katz
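The overhead Chris describes is easy to observe directly (a throwaway illustration, not from the thread; on releases without B-tree deduplication both indexes come out at roughly the same size, because the key is stored again next to every tuple pointer):

create table dup_test (uniq int, dup int);
insert into dup_test select g, g % 10 from generate_series(1, 1000000) g;

create index dup_test_uniq_idx on dup_test (uniq);  -- a million distinct keys
create index dup_test_dup_idx  on dup_test (dup);   -- only ten distinct keys, endlessly repeated

select relname, pg_size_pretty(pg_relation_size(oid)) as size
  from pg_class
 where relname in ('dup_test_uniq_idx', 'dup_test_dup_idx');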
Isn't this what Grouped Index Tuples is? On Tue, Aug 14, 2007 at 05:21:16PM -0400, Chris Browne wrote: > I recently had a chat with someone who was pretty intimate with Adabas > for a number of years who's in the process of figuring things out > about PostgreSQL. We poked at bits of the respective implementations, > seeing some similarities and differences. He pointed out one aspect > of index handling that could (in principle) be an interesting > optimization. > > Evidently, in Adabas, index leaf nodes were not simply tuples, but > lists where the index value would not be repeated. > > In PostgreSQL, if you have the index value 'abc', and there are 10 > tuples with that value, then you'll have a page full of tuples of the > following form: > > |abc|ptr[rec1]|abc|ptr[rec2]|abc|ptr[rec3]| ...and so forth... > > Now, the Adabas approach was rather different. It would only have the > index value once, and then have the list of tuple pointers: > > |abc|ptr[rec1],ptr[rec2],ptr[rec3],...[ptr[rec10]| > > This could allow a fair bit of compression, for cases where the index > value is not unique. > > There is a concommitant downside, that concurrent updates may fight > over a page, and, since there would be a higher density, there would > be more need to fight over pages. > > Does this seem pretty much like madness? Or is it a plausible "some > day ToDo"? > -- > "cbbrowne","@","acm.org" > http://linuxfinances.info/info/postgresql.html > "I don't do drugs anymore 'cause I find I get the same effect just by > standing up really fast." -- Jonathan Katz -- Decibel!, aka Jim Nasby decibel@decibel.org EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
On Tue, 2007-08-14 at 16:27 -0500, Decibel! wrote: > Isn't this what Grouped Index Tuples is? > http://community.enterprisedb.com/git/git-readme.txt It looks like GIT is a little different. GIT actually stores a lower-bound key of a contiguous* range of keys that all point to the same page, and for each of those ranges stores a bitmap of page offsets. A search looks first for an exact match in the index, and failing that, looks to see if the previous index tuple happens to be one of these ranges. The algorithm Chris is talking about stores a set of tuple ids (which include page and offset) for each distinct key. Both could be helpful, although I don't think they can work together very well. GIT has the disadvantage that it's "lossy". It doesn't even store every key in the index, so it can't be sure that the match actually is a match. Thus, it returns candidate matches. That also has implications for enforcing UNIQUE (although it's not impossible, according to the readme). However, GIT can be used effectively on an index that happens to be unique. GIT also assumes a tree structure, and makes no sense for a hash index, and makes no sense for types without ordering. GIT's space savings is dependent on the clustering of the table. Chris's suggestion would work on a UNIQUE index, but would be no help at all, because there would be no duplicate keys to collapse. However, it could be used for non-tree indexes. The space savings for this strategy is dependent on how repetitive the keys are. I guess the ultimate deciding factor is which can save you more space. If you have lots of duplicates, Chris's suggestion might work better, because you don't have to try to maintain cluster order. If you have a wider distribution of data, GIT is probably better, although you have to keep some degree of clustering (HOT may help with that). Heikki is the authority on GIT, so I'm including him in the CC so he can correct me :) Regards, Jeff Davis *: I'm not 100% sure I'm using "contiguous" correctly, but the range of keys can contain gaps or duplicates, so long as every key in the range points to that same page. That is, if keys 1,1,2,3,5 all point to page P, they can be grouped into just "1" so long as there doesn't exist a key 4 that points to a page other than P.
Jeff Davis wrote: > On Tue, 2007-08-14 at 16:27 -0500, Decibel! wrote: >> Isn't this what Grouped Index Tuples is? > > http://community.enterprisedb.com/git/git-readme.txt > > It looks like GIT is a little different. > > GIT actually stores a lower-bound key of a contiguous* range of keys > that all point to the same page, and for each of those ranges stores a > bitmap of page offsets. A search searches first for an exact match in > the index, and failing that, looks to see if the previous index tuple > happens to be one of these ranges. > > The algorithm Chris is talking about stores a set of tuple ids (which > include page and offset) for each distinct key. Right. > Both could be helpful, although I don't think they can work together > very well. What Chris is suggesting is basically a special case of GIT, where all the heap tuples represented by an index tuple have the same key. I was actually thinking of adding a flag to index tuples to indicate that special case in GIT. We could effectively do both. > GIT has the disadvantage that it's "lossy". It doesn't even store every > key in the index, so it can't be sure that the match actually is a > match. Thus, it returns candidate matches. That also has implications > for enforcing UNIQUE (although it's not impossible, according to the > readme). However, GIT can be used effectively on an index that happens > to be unique. GIT also assumes a tree structure, and makes no sense for > a hash index, and makes no sense for a types without ordering. GIT's > space savings is dependent on the clustering of the table. > > Chris's suggestion would work on a UNIQUE index, but would be no help at > all, because there would be no duplicate keys to collapse. However, it > could be used for non-tree indexes. The space savings for this strategy > is dependent on how repetitive the keys are. Right. I wasn't concerned about the case of a lot of duplicate keys, because the bitmap index is more efficient for that. And also because HOT should reduce the number of index tuples with duplicate keys pointing to different versions of the same row. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Wed, 2007-08-15 at 06:51 +0100, Heikki Linnakangas wrote: > Jeff Davis wrote: > > On Tue, 2007-08-14 at 16:27 -0500, Decibel! wrote: > >> Isn't this what Grouped Index Tuples is? > > > > http://community.enterprisedb.com/git/git-readme.txt > > > > It looks like GIT is a little different. > > > > GIT actually stores a lower-bound key of a contiguous* range of keys > > that all point to the same page, and for each of those ranges stores a > > bitmap of page offsets. A search searches first for an exact match in > > the index, and failing that, looks to see if the previous index tuple > > happens to be one of these ranges. > > > > The algorithm Chris is talking about stores a set of tuple ids (which > > include page and offset) for each distinct key. > > Right. > > > Both could be helpful, although I don't think they can work together > > very well. > > What Chris is suggesting is basically a special case of GIT, where all > the heap tuples represented by an index tuple have the same key. I was > actually thinking of adding a flag to index tuples to indicate that > special case in GIT. We could effectively do both. A few additional thoughts... The approach suggested by Chris is also used by Teradata Non-Unique Secondary Indexes, known as NUSIs (but not named by me!). The following link is a public domain description that is detailed enough: http://teradata.uark.edu/research/wang/indexes.html Replicating this approach directly isn't that useful because it would interact badly with the way we handle updates. Thinking about the basic design pattern, however, we can envisage a type of index that changes the 1:1 mapping between index and heap tuple into the concept of a "tuple set index" where we have a 1:Many mapping between index and heap. In general, the tuple set index approach can significantly reduce index size. This won't be of interest to anyone unless your data overflows RAM and you need to do I/O to constantly page back in pieces of your data. If your database does fit in RAM, reducing the number of index blocks might just increase contention. This means that the tuple set approach is only useful when you have very large databases, but when you do it is very, very useful. GIT is a tuple set index with two important extensions: 1. GIT allows a range of values to be indexed, not just a single value, so this allows it to be useful for both Unique and Non-Unique cases. 2. GIT restricts the set of tuples to a small range of blocks within the table. Making the range of blocks = 1 means that GIT is designed to work well with HOT, which also stays on the same block. Keeping the range of blocks small means GIT degrades as slowly as possible in the face of "cold" UPDATEs. If the values are inserted in roughly ordered/clustered sequence then this doesn't increase index size at all, so the most common/highest volume use cases are covered. So from my perspective, GIT is very close to the optimal design for a tuple set index that addresses the need for high concurrency and unique/near-uniqueness with PostgreSQL. There are certainly many other options for a tuple set index, and bitmap indexes are simply another version of a tuple set index but with different behaviours tuned to a different use case. There may be other use cases that require more than two kinds of tuple set index... and we have discussed implementing the tuple set behaviour as part of the other main index types. 
As an aside, it turns out, after further research, that GIT is similar to a clustered index in SQLServer, as described by Louis Davidson, "Pro SQL Server 2005 Database Design and Optimization", p.405. SQLServer creates a clustered index by default on each PK, so it says. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
On 8/14/07, Chris Browne <cbbrowne@acm.org> wrote: > I recently had a chat with someone who was pretty intimate with Adabas > for a number of years who's in the process of figuring things out > about PostgreSQL. We poked at bits of the respective implementations, > seeing some similarities and differences. He pointed out one aspect > of index handling that could (in principle) be an interesting > optimization. > > Evidently, in Adabas, index leaf nodes were not simply tuples, but > lists where the index value would not be repeated. > > In PostgreSQL, if you have the index value 'abc', and there are 10 > tuples with that value, then you'll have a page full of tuples of the > following form: > > |abc|ptr[rec1]|abc|ptr[rec2]|abc|ptr[rec3]| ...and so forth... > > Now, the Adabas approach was rather different. It would only have the > index value once, and then have the list of tuple pointers: > > |abc|ptr[rec1],ptr[rec2],ptr[rec3],...[ptr[rec10]| > > This could allow a fair bit of compression, for cases where the index > value is not unique. Interesting. Some time ago I played a little with quite a big table which contained a path (file path) as a primary key. It made sense to have a structure (SELECTs were mostly ORDER BY path WHERE path > '/foo' LIMIT n). The actual index was only a little bit smaller than the table (there were maybe 4 or 5 columns there). Some time ago I had an idea that it might be possible to compress the index size, even if it is a unique index. Take the path example. My idea would be to split the indexed value into 8-byte chunks. For example: /var/lib/postgresql/8.2/main would be split into:
"/var/lib"
"/postgre"
"sql/8.2" -- these would be inserted into a tree as a "scaffold", and only vacuum should remove them..
"main" -- this would be a leaf node. It could be repeated in non-unique indexes.
[/etc/pas] -- scaffold-node
 |-"swd"   -- leaf node
[/etc/sha]
 |-"dow"
[/var/lib] -- a problematic mixed scaffold/leaf node.
 [/postgre]
  |-"sql"
  |-"sql/8.2"
 [sql/8.2/]
  |-"main"
  |-"foobar"
The scaffold nodes would be there to guarantee that there is some place to attach leaves to. They should not be removed by DELETE (unless we are sure no other node depends on them). Advantages? The repeated values (such as "/var/lib/postgresql/8.2") are not repeated -- they are embedded into the tree as a "scaffold", and the nodes that are significant (files, not directories, in my example) are put in as actual leaves. I guess it would reduce large index sizes and at the same time it could remove the limitation that a B-tree index cannot index values larger than 1/3 of a database page. 8-byte chunks were given as an example here; perhaps a larger value would be better. (Of course redesigning the schema to put directories separate from files would be useful, but it would not help with ORDER BY .. LIMIT queries -- they would have to be JOIN-ed and re-sorted in memory, I'm afraid). Regards, Dawid
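Purely to illustrate what Dawid's 8-byte chunking does to a key (this is not an index implementation, just the split, using the literal path from his example):

select i,
       substring('/var/lib/postgresql/8.2/main' from (i - 1) * 8 + 1 for 8) as chunk
  from generate_series(1, 4) as i;
-- chunks: '/var/lib', '/postgre', 'sql/8.2/', 'main'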
On Wed, 2007-08-15 at 06:51 +0100, Heikki Linnakangas wrote: > What Chris is suggesting is basically a special case of GIT, where all > the heap tuples represented by an index tuple have the same key. I was > actually thinking of adding a flag to index tuples to indicate that > special case in GIT. We could effectively do both. > The bigger difference that I see is that GIT doesn't just group together ranges of keys, it also groups by heap page number (or a small range of page numbers, as Simon pointed out). Regards, Jeff Davis
Dawid Kuroczko wrote: > Some time ago I've had an idea that it might be possible to compress > th index size, even if it is a unique index. Take the path example. > My idea would be to to split indexed value to 8-byte chunks. > For example: /var/lib/postgresql/8.2/main would be split into: > "/var/lib" > "/postgre" > "sql/8.2" -- these would be insertered into a tree as a "scaffold", > and only vacuum should remove them.. > "main" -- this would be a leaf node. It could be repeated in non-unique > indexes. That general approach of storing a common leading part just once is called prefix compression. Yeah, it helps a lot on long text fields. Tree structures like file paths in particular. It's been discussed before. One big problem is extracting the common leading part. You could only do it for text, but it should be done in a datatype neutral way. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
qnex42@gmail.com ("Dawid Kuroczko") writes: > On 8/14/07, Chris Browne <cbbrowne@acm.org> wrote: >> I recently had a chat with someone who was pretty intimate with Adabas >> for a number of years who's in the process of figuring things out >> about PostgreSQL. We poked at bits of the respective implementations, >> seeing some similarities and differences. He pointed out one aspect >> of index handling that could (in principle) be an interesting >> optimization. >> >> Evidently, in Adabas, index leaf nodes were not simply tuples, but >> lists where the index value would not be repeated. >> >> In PostgreSQL, if you have the index value 'abc', and there are 10 >> tuples with that value, then you'll have a page full of tuples of the >> following form: >> >> |abc|ptr[rec1]|abc|ptr[rec2]|abc|ptr[rec3]| ...and so forth... >> >> Now, the Adabas approach was rather different. It would only have the >> index value once, and then have the list of tuple pointers: >> >> |abc|ptr[rec1],ptr[rec2],ptr[rec3],...[ptr[rec10]| >> >> This could allow a fair bit of compression, for cases where the index >> value is not unique. > > Interesting. Some time ago I've played a little with quite a big table > which constained path (file path) as a primary key. It did have sense > to have a strucure (SELECTs were mostly ORDER BY path WHERE path > > '/foo' LIMIT n). > The actual index was only a little bit smaller than the table (there were > maybe 4 or 5 columns there). > > Some time ago I've had an idea that it might be possible to compress > th index size, even if it is a unique index. Take the path example. > My idea would be to to split indexed value to 8-byte chunks. > For example: /var/lib/postgresql/8.2/main would be split into: > "/var/lib" > "/postgre" > "sql/8.2" -- these would be insertered into a tree as a "scaffold", > and only vacuum should remove them.. > "main" -- this would be a leaf node. It could be repeated in non-unique > indexes. > > [/etc/pas] -- scaffold-node > |-"swd" -- leaf node > [/etc/sha] > |-"dow" > [/var/lib] -- a problematic mixed scaffold/leaf node. > [/postgre] > |-"sql" > |-"sql/8.2" > [sql/8.2/] > |-"main" > |-"foobar" > > The scaffold nodes would be there to guarantee that there is some > place to attach leafs to. They should not be removed by DELETE > (unless we are sure no other node depends on them). Note that there is a well-understood name for this; this is assortedly known as a "Radix tree" or a "Patricia trie". <http://en.wikipedia.org/wiki/Radix_tree> As you observe, the tree/trie edges consist not of individual characters, but rather of sequences of characters. > Advantages? The repeated values (as "/var/lib/postgresql/8.2") > are not repeated -- they are embedded into tree, as a "scaffold", > actual nodes that are significant (files, not directories, in my > example) are put as actual leafs. Certainly as compared to a traditional trie, this representation leads to there being a whole lot less nodes and a whole lot less branching. The Radix/Patricia tree compresses things two ways: - As you observe, there can be fewer, larger componets - By combining common prefixes together, this makes cases of strings that are highly patterned much, much, much cheaper,as the tree branches at (and only at) the places where they tend to differ. It could conceivably make it workable to have indexes on big, highly repeating things such as blobs of XML. (Although it *doesn't* get you the ability to search on substrings, which is probably what you'd also want...) 
> I guess it would reduce large indexes size and at the same time it > could remove limitation that B-tree index cannot index values larger > than 1/3 of the database page. 8-byte chunks was given as an > example here, perhaps larger value would be better. > > (Of course redesigning schema to put directories separate from files > woul be useful, but it would not help with ORDER BY .. LIMIT queries > -- they would have to be JOIN-ed and re-sorted in memory I'm > afraid). I'll gleefully ignore the nature of the example, as it's kind of beside the point. The point is to try to compress what's in the column. If it's being abused somewhat, and has more crud in it, that gives a potential for a *bigger* win. -- output = reverse("gro.mca" "@" "enworbbc") http://linuxdatabases.info/info/spiritual.html "Documentation wants to be obsolete." -- Bruce R. Lewis
"Heikki Linnakangas" <heikki@enterprisedb.com> writes: > That general approach of storing a common part leading part just once is > called prefix compression. Yeah, it helps a lot on long text fields. > Tree structures like file paths in particular. You kind of want to do avoid both the prefix and the suffix, no? > It's been discussed before. One big problem is extracting the common > leading part. You could only do it for text, Or for multi-column indexes I could see this being especially useful if you have some columns in the index key which are small and some that are quite large. So if you have an event table with an index on <userid,timestamp> you wouldn't have to store lots of timestamps on the upper level tree nodes. You would only store them for the leaf nodes. > but it should be done in a datatype neutral way. I wonder if there's an analogous operation for other data types though. Numeric could store the a value relative to the parent value. Arrays could store only the elements needed. bytea of course works just as well as text (or better in the face of i18n). -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Gregory Stark wrote: > "Heikki Linnakangas" <heikki@enterprisedb.com> writes: > >> That general approach of storing a common part leading part just once is >> called prefix compression. Yeah, it helps a lot on long text fields. >> Tree structures like file paths in particular. > > You kind of want to do avoid both the prefix and the suffix, no? You're much more likely to find common prefixes than suffixes in an index page, because of the ordering. I suppose compressing the suffix would be useful in some cases as well. You might be better off with some generic compression algorithm at that point, though. >> It's been discussed before. One big problem is extracting the common >> leading part. You could only do it for text, > > Or for multi-column indexes Oh yeah, that you could do more easily. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
"Heikki Linnakangas" <heikki@enterprisedb.com> writes: > Gregory Stark wrote: >> "Heikki Linnakangas" <heikki@enterprisedb.com> writes: >> >>> That general approach of storing a common part leading part just once is >>> called prefix compression. Yeah, it helps a lot on long text fields. >>> Tree structures like file paths in particular. >> >> You kind of want to do avoid both the prefix and the suffix, no? > > You're much more likely to find common prefixes than suffixes in an > index page, because of the ordering. I suppose compressing the suffix > would be useful in some cases as well. You might be better off with some > generic compression algorithm at that point, though. Sorry, by "suffix" I don't mean common sufixes, I mean the bits of the key following the point which discriminates between the left and right side of the tree. So for example if you're indexing a text field and have a tree structure like: Redhat Fedora Core 7 / \ Debian Etch (Unstable) Ubuntu hoary We don't really need the whole of "Redhat Fedora Core 7" in the index node. We could actually get by with just "R". Everything before "R" is on the left and everything after "R" is on the right. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
stark@enterprisedb.com (Gregory Stark) writes: > "Heikki Linnakangas" <heikki@enterprisedb.com> writes: > >> Gregory Stark wrote: >>> "Heikki Linnakangas" <heikki@enterprisedb.com> writes: >>> >>>> That general approach of storing a common part leading part just once is >>>> called prefix compression. Yeah, it helps a lot on long text fields. >>>> Tree structures like file paths in particular. >>> >>> You kind of want to do avoid both the prefix and the suffix, no? >> >> You're much more likely to find common prefixes than suffixes in an >> index page, because of the ordering. I suppose compressing the suffix >> would be useful in some cases as well. You might be better off with some >> generic compression algorithm at that point, though. > > Sorry, by "suffix" I don't mean common sufixes, I mean the bits of the key > following the point which discriminates between the left and right side of the > tree. > > So for example if you're indexing a text field and have a > tree structure like: > > Redhat Fedora Core 7 > / \ > Debian Etch (Unstable) Ubuntu hoary > > We don't really need the whole of "Redhat Fedora Core 7" in the index node. We > could actually get by with just "R". Everything before "R" is on the left and > everything after "R" is on the right. Right. The case where you get more characters than just "R" is when you introduce extra entries that have the same prefix. Say, "Redhat Fedora Core 4", "Redhat Fedora Core 5", "Redhat Fedora Core 6". And, for good measure, let's throw in "Redhat RHAS 3", "Redhat RHAS 4", and "Redhat RHAS 5". In that case, you'd have substrings: "R" "edhat " "Fedora Core " "RHAS " as discriminators. -- "cbbrowne","@","cbbrowne.com" http://www3.sympatico.ca/cbbrowne/spreadsheets.html If a mute swears, does his mother wash his hands with soap?
There's a new #include file that it turns out we need for Slony-I to reference, namely include/server/utils/snapmgr.h I tried adding an autoconf rule to Slony-I to check for its existence (goal then is to do a suitable #define so that we can #ifdef the #include, so that we #include this only with versions of PostgreSQL that have the file). If I use: AC_CHECK_HEADER(utils/snapmgr.h, HAVE_SNAPMGR=1) this turns out to fail. Apparently autoconf wants to compile the #include file to validate that it's an OK #include file. GCC barfs on it, thus: cbbrowne@dba2:~/Slony-I/CMD/slony1-HEAD> gcc -I/opt/OXRS/dbs/pgsql84-beta/include/server /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapmgr.h In file included from /opt/OXRS/dbs/pgsql84-beta/include/server/storage/itemptr.h:17, from /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:17, from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapshot.h:16, from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:18, from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapmgr.h:16: /opt/OXRS/dbs/pgsql84-beta/include/server/storage/block.h:31: error: expected '=', ',', ';', 'asm' or '__attribute__' before'BlockNumber' /opt/OXRS/dbs/pgsql84-beta/include/server/storage/block.h:55: error: expected specifier-qualifier-list before 'uint16' In file included from /opt/OXRS/dbs/pgsql84-beta/include/server/storage/off.h:17, from /opt/OXRS/dbs/pgsql84-beta/include/server/storage/itemptr.h:18, from /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:17, from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapshot.h:16, from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:18, from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapmgr.h:16: /opt/OXRS/dbs/pgsql84-beta/include/server/storage/itemid.h:46: error: expected '=', ',', ';', 'asm' or '__attribute__' before'ItemOffset' /opt/OXRS/dbs/pgsql84-beta/include/server/storage/itemid.h:47: error: expected '=', ',', ';', 'asm' or '__attribute__' before'ItemLength' In file included from /opt/OXRS/dbs/pgsql84-beta/include/server/storage/itemptr.h:18, from /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:17, from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapshot.h:16, from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:18, from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapmgr.h:16: /opt/OXRS/dbs/pgsql84-beta/include/server/storage/off.h:24: error: expected '=', ',', ';', 'asm' or '__attribute__' before'OffsetNumber' In file included from /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:17, from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapshot.h:16, from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:18, from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapmgr.h:16: /opt/OXRS/dbs/pgsql84-beta/include/server/storage/itemptr.h:40: error: expected specifier-qualifier-list before 'OffsetNumber' /opt/OXRS/dbs/pgsql84-beta/include/server/storage/itemptr.h:143: error: expected '=', ',', ';', 'asm' or '__attribute__'before 'ItemPointerEquals' /opt/OXRS/dbs/pgsql84-beta/include/server/storage/itemptr.h:144: error: expected '=', ',', ';', 'asm' or '__attribute__'before 'ItemPointerCompare' In file included from /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:18, from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapshot.h:16, from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:18, from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapmgr.h:16: /opt/OXRS/dbs/pgsql84-beta/include/server/storage/relfilenode.h:44: error: expected 
specifier-qualifier-list before 'Oid' In file included from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapshot.h:16, from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:18, from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapmgr.h:16: /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:106: error: expected specifier-qualifier-list before 'TransactionId' /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:118: error: expected specifier-qualifier-list before 'int32' /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:142: error: expected specifier-qualifier-list before 'uint16' /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:442: error: expected specifier-qualifier-list before 'uint32' /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:498: error: expected specifier-qualifier-list before 'uint32' /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:623: error: expected specifier-qualifier-list before 'uint16' /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:671: error: expected specifier-qualifier-list before 'BlockNumber' /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:684: error: expected specifier-qualifier-list before 'BlockNumber' /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:694: error: expected specifier-qualifier-list before 'TransactionId' /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:714: error: expected specifier-qualifier-list before 'BlockNumber' /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:722: error: expected '=', ',', ';', 'asm' or '__attribute__' before'HeapTupleHeaderGetCmin' /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:723: error: expected '=', ',', ';', 'asm' or '__attribute__' before'HeapTupleHeaderGetCmax' /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:725: error: expected declaration specifiers or '...' before 'CommandId' /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:726: error: expected declaration specifiers or '...' before 'bool' In file included from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:18, from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapmgr.h:16: /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapshot.h:30: error: expected declaration specifiers or '...' 
before '*'token /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapshot.h:31: error: 'bool' declared as function returning a function /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapshot.h:35: error: expected specifier-qualifier-list before 'SnapshotSatisfiesFunc' In file included from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapmgr.h:16: /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:22: error: expected '=', ',', ';', 'asm' or '__attribute__' before'SnapshotData' /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:23: error: expected '=', ',', ';', 'asm' or '__attribute__' before'SnapshotData' /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:24: error: expected '=', ',', ';', 'asm' or '__attribute__' before'SnapshotData' /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:25: error: expected '=', ',', ';', 'asm' or '__attribute__' before'SnapshotData' /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:69: error: 'HeapTupleSatisfiesMVCC' declared as function returninga function /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:71: error: 'HeapTupleSatisfiesNow' declared as function returninga function /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:73: error: 'HeapTupleSatisfiesSelf' declared as function returninga function /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:75: error: 'HeapTupleSatisfiesAny' declared as function returninga function /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:77: error: 'HeapTupleSatisfiesToast' declared as function returninga function /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:79: error: 'HeapTupleSatisfiesDirty' declared as function returninga function /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:83: error: expected declaration specifiers or '...' before 'CommandId' /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:85: error: expected declaration specifiers or '...' before 'TransactionId' /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:88: error: expected declaration specifiers or '...' before 'uint16' /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:88: error: expected declaration specifiers or '...' before 'TransactionId' /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapmgr.h:20: error: expected '=', ',', ';', 'asm' or '__attribute__' before'Snapshot' /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapmgr.h:21: error: expected '=', ',', ';', 'asm' or '__attribute__' before'Snapshot' /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapmgr.h:22: error: expected '=', ',', ';', 'asm' or '__attribute__' before'Snapshot' /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapmgr.h:24: error: expected '=', ',', ';', 'asm' or '__attribute__' before'TransactionXmin' /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapmgr.h:25: error: expected '=', ',', ';', 'asm' or '__attribute__' before'RecentXmin' /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapmgr.h:26: error: expected '=', ',', ';', 'asm' or '__attribute__' before'RecentGlobalXmin' Is there something in here that needs cleaning? Note: I pulled CVS HEAD earlier today, so this is the state of the code just a little while ago. -- "cbbrowne","@","linuxfinances.info" http://linuxfinances.info/info/advocacy.html Humpty Dumpty sat on the wall, Humpty Dumpty had a great fall! All the king's horses, And all the king's men, Had scrambled eggs for breakfast again!
Chris Browne wrote: > If I use: > AC_CHECK_HEADER(utils/snapmgr.h, HAVE_SNAPMGR=1) > > this turns out to fail. Apparently autoconf wants to compile the > #include file to validate that it's an OK #include file. > > GCC barfs on it, thus: > > cbbrowne@dba2:~/Slony-I/CMD/slony1-HEAD> gcc -I/opt/OXRS/dbs/pgsql84-beta/include/server /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapmgr.h > In file included from /opt/OXRS/dbs/pgsql84-beta/include/server/storage/itemptr.h:17, > from /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:17, > from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapshot.h:16, > from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:18, > from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapmgr.h:16: Hmm. It works for me if I forcefully include postgres.h: gcc -I/pgsql/install/00head/include/server -include postgres.h /pgsql/install/00head/include/server/utils/snapmgr.h Our header file rule says that a header must include any header it needs to compile, but never include postgres.h, which must be the first include in all the .c files. So I'm not sure the fix for this. What does Slony-I need snapmgr.h for, anyway? This code is in a state of a flux right now -- there are pending patches which are likely to change the horizon a bit. Perhaps it does not make sense for Slony to adjust to a state that's expected to be short-lived. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Chris Browne <cbbrowne@acm.org> writes: > I tried adding an autoconf rule to Slony-I to check for its existence > (goal then is to do a suitable #define so that we can #ifdef the > #include, so that we #include this only with versions of PostgreSQL > that have the file). The customary way of handling Postgres version differences at compilation time has been something like #include "catalog/catversion.h" #if CATALOG_VERSION_NO >= 200804201 ... new code ... #else ... old code ... #endif Seems to me that would do just fine and you don't need autoconf help. I concur with Alvaro's comment that you're probably wasting your time to do anything touching snapmgr.h right now, but if you must ... regards, tom lane
alvherre@commandprompt.com (Alvaro Herrera) writes:
> Chris Browne wrote:
>
>> If I use:
>>   AC_CHECK_HEADER(utils/snapmgr.h, HAVE_SNAPMGR=1)
>>
>> this turns out to fail.  Apparently autoconf wants to compile the
>> #include file to validate that it's an OK #include file.
>>
>> GCC barfs on it, thus:
>>
>> cbbrowne@dba2:~/Slony-I/CMD/slony1-HEAD> gcc -I/opt/OXRS/dbs/pgsql84-beta/include/server /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapmgr.h
>> In file included from /opt/OXRS/dbs/pgsql84-beta/include/server/storage/itemptr.h:17,
>>                  from /opt/OXRS/dbs/pgsql84-beta/include/server/access/htup.h:17,
>>                  from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapshot.h:16,
>>                  from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/tqual.h:18,
>>                  from /opt/OXRS/dbs/pgsql84-beta/include/server/utils/snapmgr.h:16:
>
> Hmm.  It works for me if I forcefully include postgres.h:
>
> gcc -I/pgsql/install/00head/include/server -include postgres.h /pgsql/install/00head/include/server/utils/snapmgr.h
>
> Our header file rule says that a header must include any header it needs
> to compile, but never include postgres.h, which must be the first
> include in all the .c files.
>
> So I'm not sure what the fix for this is.
>
> What does Slony-I need snapmgr.h for, anyway?  This code is in a state
> of flux right now -- there are pending patches which are likely to
> change the horizon a bit.  Perhaps it does not make sense for Slony to
> adjust to a state that's expected to be short-lived.

Well, one of the Sun guys observed this...
http://www.slony.info/bugzilla/show_bug.cgi?id=46

And I was trying to build against CVS HEAD (for both projects ;-)) and
observed it:

cbbrowne@dba2:CMD/slony1-HEAD/src/backend> make
gcc -g -O2 -Wall -Wmissing-prototypes -Wmissing-declarations -I../.. -fpic -I/opt/OXRS/dbs/pgsql84-beta/include/ -I/opt/OXRS/dbs/pgsql84-beta/include/server/ -c -o slony1_funcs.o slony1_funcs.c
slony1_funcs.c: In function '_Slony_I_createEvent':
slony1_funcs.c:142: error: 'SerializableSnapshot' undeclared (first use in this function)
slony1_funcs.c:142: error: (Each undeclared identifier is reported only once
slony1_funcs.c:142: error: for each function it appears in.)

The definition of SerializableSnapshot has moved from
server/utils/tqual.h to server/utils/snapmgr.h.

I agree that the code seems to be in flux; it seems quite likely that
there will be further changes between now and the release of 8.4.

That being said, it's useful to observe these problems *early*, and
have workarounds, so that we can validate that Slony-I is generally
compatible with 8.4 throughout its development cycle.

As a workaround, for now, I'll see if Tom's counsel on this works out
well; I expect so.

The point of the exercise wasn't so much to ask "How do I work around
this?" as it was to point out that there's *something* up with the
header file, in that the autoconf AC_CHECK_HEADER function barfs on it.
*THAT* (the fact that AC_CHECK_HEADER breaks) appears to be a bad
thing, irrespective of any Slony-I issues.

--
let name="cbbrowne" and tld="linuxdatabases.info" in name ^ "@" ^ tld;;
http://linuxdatabases.info/info/x.html
Signs of a Klingon Programmer - 1. "Defensive programming?  Never!
Klingon programs are always on the offense.  Yes, offensive programming
is what we do best."
Chris Browne wrote:

> And I was trying to build against CVS HEAD (for both projects ;-)) and
> observed it:
>
> cbbrowne@dba2:CMD/slony1-HEAD/src/backend> make
> gcc -g -O2 -Wall -Wmissing-prototypes -Wmissing-declarations -I../.. -fpic -I/opt/OXRS/dbs/pgsql84-beta/include/ -I/opt/OXRS/dbs/pgsql84-beta/include/server/ -c -o slony1_funcs.o slony1_funcs.c
> slony1_funcs.c: In function '_Slony_I_createEvent':
> slony1_funcs.c:142: error: 'SerializableSnapshot' undeclared (first use in this function)
> slony1_funcs.c:142: error: (Each undeclared identifier is reported only once
> slony1_funcs.c:142: error: for each function it appears in.)
>
> The definition of SerializableSnapshot has moved from
> server/utils/tqual.h to server/utils/snapmgr.h.

Well, considering that my next patch proposes removing
SerializableSnapshot altogether, I think Slony will need some
rejiggering.

What do you need SerializableSnapshot for?  Perhaps this is reason
enough not to remove it (but I doubt it).  Please point me to an online
copy of slony1_funcs.c.

--
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On 4/22/08, Alvaro Herrera <alvherre@commandprompt.com> wrote:
> Chris Browne wrote:
>
> > And I was trying to build against CVS HEAD (for both projects ;-)) and
> > observed it:
> >
> > cbbrowne@dba2:CMD/slony1-HEAD/src/backend> make
> > gcc -g -O2 -Wall -Wmissing-prototypes -Wmissing-declarations -I../.. -fpic -I/opt/OXRS/dbs/pgsql84-beta/include/ -I/opt/OXRS/dbs/pgsql84-beta/include/server/ -c -o slony1_funcs.o slony1_funcs.c
> > slony1_funcs.c: In function '_Slony_I_createEvent':
> > slony1_funcs.c:142: error: 'SerializableSnapshot' undeclared (first use in this function)
> > slony1_funcs.c:142: error: (Each undeclared identifier is reported only once
> > slony1_funcs.c:142: error: for each function it appears in.)
> >
> > The definition of SerializableSnapshot has moved from
> > server/utils/tqual.h to server/utils/snapmgr.h.
>
> Well, considering that my next patch proposes removing
> SerializableSnapshot altogether, I think Slony will need some
> rejiggering.
>
> What do you need SerializableSnapshot for?  Perhaps this is reason
> enough not to remove it (but I doubt it).  Please point me to an online
> copy of slony1_funcs.c.

You probably simply need to use ActiveSnapshot there.  The same thing
came up when txid landed in core - it had inherited SerializableSnapshot
from the xxid code.

If you really need exactly the same semantics, you need to simply
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE.  But if the whole
transaction is not already SERIALIZABLE, it's unlikely you need it.

--
marko
Marko Kreen wrote:

> You probably simply need to use ActiveSnapshot there.

Heh, ActiveSnapshot as a symbol is also gone in the new code ;-)  I
changed txid.c to use GetActiveSnapshot() instead.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
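Putting the two suggestions together, a version-tolerant sketch could look
like the following.  It is illustrative only, not the actual slony1_funcs.c
change: the catversion cutoff is again a placeholder, and whether
GetActiveSnapshot() really has the semantics Slony needs is exactly the
question Marko raises above.

    /* Sketch of a version-tolerant snapshot lookup; not actual Slony-I code. */
    #include "postgres.h"
    #include "catalog/catversion.h"

    #if CATALOG_VERSION_NO >= 200804201
    #include "utils/snapmgr.h"      /* GetActiveSnapshot() is declared here */
    #else
    #include "utils/tqual.h"        /* older releases: SerializableSnapshot */
    #endif

    static Snapshot
    get_snapshot_sketch(void)
    {
    #if CATALOG_VERSION_NO >= 200804201
        /* Newer tree: the ActiveSnapshot variable is gone; use the accessor. */
        return GetActiveSnapshot();
    #else
        /* Older releases exported the snapshot variable directly. */
        return SerializableSnapshot;
    #endif
    }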
I'm trying to start preparing buildfarm nodes for the upcoming Git
migration, and have run into a few issues.  I speculate that -hackers
is one of the better places for this to get discussed; if it should be
elsewhere, I'm sure Andrew Dunstan won't be shy about redirecting it :-).

What I was hoping to do was to run nodes for a little while against
the repo at git://github.com/oicu/pg-cvs-mirror.git to validate that
it all works against git.

The best instructions available thus far:
<http://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto>

Unfortunately, I'm finding some anomalies - differences between my
environments and what those instructions suggest.

1. git clone --mirror is apparently new in git version 1.6
   <http://kerneltrap.org/mailarchive/git/2008/8/2/2793244>

   My somewhat mouldy Ubuntu node is on git 1.5.4, which doesn't include
   this.  There's a workaround, as --mirror is just "syntactic sugar":

   $ git clone --mirror $URL

   may be replaced by

   $ git clone --bare $URL
   $ (cd $(basename $URL) && git remote add --mirror origin $URL)

   I see options:
   a) Require git 1.6 or so (latest stable is 1.7.1.1)
   b) Offer the alternative mirror approach

2. SCM.pl seems to expect the repository directory to be called "pgsql"
   (around line #373).

   I wound up hacking up the code to head to a specific directory, which
   isn't a terribly good hack, but apparently works for now:

   - chdir 'pgsql';
   + chdir '/opt/build-farm/pgsql.git';

   It feels as though something's not quite right about how the nearby
   Git repository is referenced.

3. Some problems checking status.

   i) Status Line: 491 bad ts parameter - [timestamp omitted] is in the future

      I know my clock's reasonable - ntp is reporting I'm within 0.25s of
      some stratum 2 nodes.  Is it possible that the buildfarm server is
      ill-synced?

   ii) Status Line: 460 script version too low

      I just pulled a buildfarm build last week, so I'm not sure what this
      would relate to.

That's where I've gotten thus far.  Node caracara isn't reporting in
just now because of this testing.  I hope other people will be doing
similar tryouts so that the revision effort to shift to Git doesn't
wait to *start* until the development repo shifts.

--
let name="cbbrowne" and tld="gmail.com" in String.concat "@" [name;tld];;
http://linuxfinances.info/info/linuxdistributions.html
Rules of the Evil Overlord #77.  "If I have a fit of temporary insanity
and decide to give the hero the chance to reject a job as my trusted
lieutenant, I will retain enough sanity to wait until my current trusted
lieutenant is out of earshot before making the offer."
<http://www.eviloverlord.com/>
Excerpts from Chris Browne's message of Mon Jul 05 12:33:49 -0400 2010:

> 3. Some problems checking status.
>
>    i) Status Line: 491 bad ts parameter - [timestamp omitted] is in the future
>
>       I know my clock's reasonable - ntp is reporting I'm within 0.25s of
>       some stratum 2 nodes.  Is it possible that the buildfarm server is
>       ill-synced?

I asked around and was told that the server is in sync currently.  It
has been known to be out of sync before; if this problem still comes up,
maybe there's a software bug or something.
Alvaro Herrera wrote:

> Excerpts from Chris Browne's message of Mon Jul 05 12:33:49 -0400 2010:
>
>> 3. Some problems checking status.
>>
>>    i) Status Line: 491 bad ts parameter - [timestamp omitted] is in the future
>>
>>       I know my clock's reasonable - ntp is reporting I'm within 0.25s of
>>       some stratum 2 nodes.  Is it possible that the buildfarm server is
>>       ill-synced?
>
> I asked around and was told that the server is in sync currently.  It
> has been known to be out of sync before; if this problem still comes up,
> maybe there's a software bug or something.

This discussion really belongs on pgbuildfarm-members, I think.

I have added a small fudge factor to the time test - let's see if that
changes things.

cheers

andrew