Thread: How to migrate column type from uuid to serial
I was integrating a payment gateway for my app when I noticed its maximum length of customer id string is 32. SIze of UUID is 36 (32 characters and 4 dashes). So I want to change the type of customer id to serial. The problem is by now, the column is being used at many places. How to migrate the column to serial without dropping the data?
On 10/7/20 5:48 AM, Hemil Ruparel wrote: > I was integrating a payment gateway for my app when I noticed its > maximum length of customer id string is 32. SIze of UUID is 36 (32 > characters and 4 dashes). So I want to change the type of customer id to > serial. The problem is by now, the column is being used at many places. > How to migrate the column to serial without dropping the data? Changing the size of the column is not an option? Your description of the customer id column above is somewhat confusing to me. Is the id actually stored as a UUID? Why is the max length an issue? If you where to migrate I would say create an independent serial/identity column. Then point the dependent objects at that. After all have been converted drop the old column. -- Adrian Klaver adrian.klaver@aklaver.com
On 10/7/20 6:58 AM, Hemil Ruparel wrote: Please reply to list also. Ccing list > Yes. The id is stored as uuid. Thanks for the suggestion. Should work > > > On Wed, Oct 7, 2020 at 7:27 PM Adrian Klaver <adrian.klaver@aklaver.com > <mailto:adrian.klaver@aklaver.com>> wrote: > > On 10/7/20 5:48 AM, Hemil Ruparel wrote: > > I was integrating a payment gateway for my app when I noticed its > > maximum length of customer id string is 32. SIze of UUID is 36 (32 > > characters and 4 dashes). So I want to change the type of > customer id to > > serial. The problem is by now, the column is being used at many > places. > > How to migrate the column to serial without dropping the data? > > Changing the size of the column is not an option? > > Your description of the customer id column above is somewhat confusing > to me. Is the id actually stored as a UUID? > > Why is the max length an issue? > > If you where to migrate I would say create an independent > serial/identity column. Then point the dependent objects at that. After > all have been converted drop the old column. > > > > > > -- > Adrian Klaver > adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com> > -- Adrian Klaver adrian.klaver@aklaver.com
Yes the id is stored as a uuid. Thanks for the suggestion. Should work
On Wed, Oct 7, 2020 at 7:29 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 10/7/20 6:58 AM, Hemil Ruparel wrote:
Please reply to list also.
Ccing list
> Yes. The id is stored as uuid. Thanks for the suggestion. Should work
>
>
> On Wed, Oct 7, 2020 at 7:27 PM Adrian Klaver <adrian.klaver@aklaver.com
> <mailto:adrian.klaver@aklaver.com>> wrote:
>
> On 10/7/20 5:48 AM, Hemil Ruparel wrote:
> > I was integrating a payment gateway for my app when I noticed its
> > maximum length of customer id string is 32. SIze of UUID is 36 (32
> > characters and 4 dashes). So I want to change the type of
> customer id to
> > serial. The problem is by now, the column is being used at many
> places.
> > How to migrate the column to serial without dropping the data?
>
> Changing the size of the column is not an option?
>
> Your description of the customer id column above is somewhat confusing
> to me. Is the id actually stored as a UUID?
>
> Why is the max length an issue?
>
> If you where to migrate I would say create an independent
> serial/identity column. Then point the dependent objects at that. After
> all have been converted drop the old column.
>
>
>
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>
--
Adrian Klaver
adrian.klaver@aklaver.com
Hemil Ruparel schrieb am 07.10.2020 um 16:02: > Yes the id is stored as a uuid. Then it should be declared with the data type uuid, which only needs 16 bytes.
umm it is declared as uuid. But how does it occupy only 16 bytes? Even if we remove those 4 dashes thats 32 bytes of text right? I am not concerned about the size at all. How do i send it as a string below 32 bytes?
On Wed, Oct 7, 2020 at 7:37 PM Thomas Kellerer <shammat@gmx.net> wrote:
Hemil Ruparel schrieb am 07.10.2020 um 16:02:
> Yes the id is stored as a uuid.
Then it should be declared with the data type uuid, which only needs 16 bytes.
Hemil Ruparel schrieb am 07.10.2020 um 16:21: > it is declared as uuid. But how does it occupy only 16 bytes? Because a UUID is internally simply a 128bit number - the dashes you see are just formatting. But if you can only send the text represnation, then yes 32 characters aren't enough.
Sorry if this is silly but if it is a 128 bit number, why do we need 32 characters to represent it? Isn't 8 bits one byte?
On Wed, Oct 7, 2020 at 8:08 PM Thomas Kellerer <shammat@gmx.net> wrote:
Hemil Ruparel schrieb am 07.10.2020 um 16:21:
> it is declared as uuid. But how does it occupy only 16 bytes?
Because a UUID is internally simply a 128bit number - the dashes you see are just formatting.
But if you can only send the text represnation, then yes 32 characters aren't enough.
>>> it is declared as uuid. But how does it occupy only 16 bytes? >> Because a UUID is internally simply a 128bit number - the dashes you see are just formatting. > Sorry if this is silly but if it is a 128 bit number, why do we need 32 characters to represent it? The 36 (or 32 without the dashes) characters are just the default hex representation. If you wanted to, you could convert it to a bigint (or a numeric). See e.g. here for an example: https://stackoverflow.com/a/27286610
Is it because they are hex characters and hence only need 4 bit to store per character but we display each of those 4 bits as a character as a hex value (0 to 9 and a-f) all of which in ASCII and UTF-8 require a byte to represent? Hence the length of 32 (or 36 with dashes)?
On Wed, Oct 7, 2020 at 8:10 PM Hemil Ruparel <hemilruparel2002@gmail.com> wrote:
Sorry if this is silly but if it is a 128 bit number, why do we need 32 characters to represent it? Isn't 8 bits one byte?On Wed, Oct 7, 2020 at 8:08 PM Thomas Kellerer <shammat@gmx.net> wrote:Hemil Ruparel schrieb am 07.10.2020 um 16:21:
> it is declared as uuid. But how does it occupy only 16 bytes?
Because a UUID is internally simply a 128bit number - the dashes you see are just formatting.
But if you can only send the text represnation, then yes 32 characters aren't enough.
Hemil: On Wed, Oct 7, 2020 at 2:49 PM Hemil Ruparel <hemilruparel2002@gmail.com> wrote: > I was integrating a payment gateway for my app when I noticed its maximum length of customer id string is 32. SIze of UUIDis 36 (32 characters and 4 dashes). So I want to change the type of customer id to serial. The problem is by now, thecolumn is being used at many places. How to migrate the column to serial without dropping the data? An uuid is just a 128 bit number, as noted by many. Your problem is the default encoding is 32 hex digits plus 4 hyphens. If your payment gateway uses a 32 CHARs string ( "maximum length of customer id string is 32" ) you can just use other encodings. Passing to integer and using decimal wont cut it ( you'll need 39 digits ), but using just hex (without hyphens) will drop the string representation to 32 characters ( you can encode with a subst and postgres accepts it without hyphens ). If you want to buy a little more space for your own purposes you can even fit 128 bits in 22 base64 chars with a couple pairs of bits to spare, and IIRC you can do it with a creative decode/encode step after killing the dashes. And if your payment gateway uses unicode codepoints instead of ASCII chars as units you could probably use more creative encodings ;-) , but probably using a "drop the dashes" subst in the interface will be your simpler option. Francisco Olarte.
I was thinking UUID was not a very good choice for id. Serial would be a better one because I don't have a billion customers. It is more like a thousand. So when I saw the customer ID of the payment gateway cannot accept more than 32 characters, I thought UUID is overkill. So I want to migrate to using a serial int instead as the primary key.,
On Wed, Oct 7, 2020 at 10:48 PM Francisco Olarte <folarte@peoplecall.com> wrote:
Hemil:
On Wed, Oct 7, 2020 at 2:49 PM Hemil Ruparel <hemilruparel2002@gmail.com> wrote:
> I was integrating a payment gateway for my app when I noticed its maximum length of customer id string is 32. SIze of UUID is 36 (32 characters and 4 dashes). So I want to change the type of customer id to serial. The problem is by now, the column is being used at many places. How to migrate the column to serial without dropping the data?
An uuid is just a 128 bit number, as noted by many. Your problem is
the default encoding is 32 hex digits plus 4 hyphens. If your payment
gateway uses a 32 CHARs string ( "maximum length of customer id
string is 32" ) you can just use other encodings. Passing to integer
and using decimal wont cut it ( you'll need 39 digits ), but using
just hex (without hyphens) will drop the string representation to 32
characters ( you can encode with a subst and postgres accepts it
without hyphens ).
If you want to buy a little more space for your own purposes you can
even fit 128 bits in 22 base64 chars with a couple pairs of bits to
spare, and IIRC you can do it with a creative decode/encode step after
killing the dashes.
And if your payment gateway uses unicode codepoints instead of ASCII
chars as units you could probably use more creative encodings ;-) ,
but probably using a "drop the dashes" subst in the interface will be
your simpler option.
Francisco Olarte.
On Thu, Oct 8, 2020 at 6:14 AM Hemil Ruparel <hemilruparel2002@gmail.com> wrote: > I was thinking UUID was not a very good choice for id. Serial would be a better one because I don't have a billion customers.It is more like a thousand. So when I saw the customer ID of the payment gateway cannot accept more than 32 characters,I thought UUID is overkill. So I want to migrate to using a serial int instead as the primary key., I do not like them. They are typically random 128 bit integers. If you think of it, random 16 bits ints are not a good idea for keys, even if the table has only a couple dozen keys. UUIDs are the same at a bigger scale, they look nicer and safer, but have got problems. The version 1-2, the one with mac address with theoretically guaranteed uniqueness were nice for their intended use, avoiding collisions without a central authority. But the namespace hash and random version are practically collision free, but not theoretically, which displeases me. When I need to present an obscured name to an outside service ( i.e., in your case, presenting the user id to the gateway directly leaks information, as they can estimate your consumer growth ) I just encrypt ( not hash, encrypt ) them. For these you can do something like treating a 64 bit number like an 8 byte array, encrypt it with a "secret" key and you send something which is random enough to the gateway, but is easy for you to manage in the database, and preserves the counter niceties, like knowing there are approximately 1000 customers between IDs 3000 and 4000 ( some may have been deleted /skiped ). Also, depending on your patterns, IDs typically correlate nicely with table order, which makes index access a bit faster, you do not jump around. And 4 byte ints make for a much denser index than 16 byte uuid. Francisco Olarte.
On 2020-10-07 20:10:34 +0530, Hemil Ruparel wrote: > Sorry if this is silly but if it is a 128 bit number, why do we need 32 > characters to represent it? Isn't 8 bits one byte? Yes, 8 bits are 1 byte. But that's 256 different values, so to display them in 1 character you would need 256 different characters. That's not possible in ASCII (ASCII has only 94 graphic characters), and even if you included accented characters and other alphabets (like Greek or Cyrillic) it would be hard to read. So the decision was to display each byte as a pair of two hexadecimal digits (because 16 * 16 = 256). They could also have used 3 decimal digits (000 - 255) for each byte, but that would have wasted even more space, or they could have used base 32 or 64 for the whole number, but that would make conversion harder. hp -- _ | Peter J. Holzer | Story must make more sense than reality. |_|_) | | | | | hjp@hjp.at | -- Charles Stross, "Creative writing __/ | http://www.hjp.at/ | challenge!"
Attachment
On 2020-10-10 11:31:23 +0200, Peter J. Holzer wrote: > On 2020-10-07 20:10:34 +0530, Hemil Ruparel wrote: > > Sorry if this is silly but if it is a 128 bit number, why do we need 32 > > characters to represent it? Isn't 8 bits one byte? > > Yes, 8 bits are 1 byte. But that's 256 different values, so to display > them in 1 character you would need 256 different characters. That's not > possible in ASCII (ASCII has only 94 graphic characters), and even if > you included accented characters and other alphabets (like Greek or > Cyrillic) it would be hard to read. I'm showing my European bias here. I should have thought of Korean. The Hangul script is syllabic with a very straightforward and easy to learn structure. Wikipedia tells me that they have 19 consonants and 21 vowels, so you could just pick 16 consonants and 16 vowels to construct 256 syllables. That would even make UUIDs pronounceable. hp -- _ | Peter J. Holzer | Story must make more sense than reality. |_|_) | | | | | hjp@hjp.at | -- Charles Stross, "Creative writing __/ | http://www.hjp.at/ | challenge!"
Attachment
oh. I get it now. Thanks
On Sat, Oct 10, 2020 at 3:41 PM Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
On 2020-10-10 11:31:23 +0200, Peter J. Holzer wrote:
> On 2020-10-07 20:10:34 +0530, Hemil Ruparel wrote:
> > Sorry if this is silly but if it is a 128 bit number, why do we need 32
> > characters to represent it? Isn't 8 bits one byte?
>
> Yes, 8 bits are 1 byte. But that's 256 different values, so to display
> them in 1 character you would need 256 different characters. That's not
> possible in ASCII (ASCII has only 94 graphic characters), and even if
> you included accented characters and other alphabets (like Greek or
> Cyrillic) it would be hard to read.
I'm showing my European bias here.
I should have thought of Korean. The Hangul script is syllabic with a
very straightforward and easy to learn structure. Wikipedia tells me
that they have 19 consonants and 21 vowels, so you could just pick 16
consonants and 16 vowels to construct 256 syllables. That would even
make UUIDs pronounceable.
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"