Thread: Comparing two URL strings

Comparing two URL strings

From
mahendrakar s
Date:
Hi,
I am having trouble comparing two URLs (below).

The comparison of the two strings fails because one contains '%40' where the other contains '@', even though both URLs are the same.
(gdb) p url
$1 = 0x55d35cfbd1f8 "https://domain.org/v1.0/users/test@user.org"
(gdb) p graph_url
$2 = 0x7ffd82777240 "https://domain.org/v1.0/users/test%40user.org"

Can you please let me know how to compare these two strings?
One way I could think of is to convert both strings to UTF-8, but I can't find a utility function to compare UTF-8 strings.

Thanks,
Mahendrakar.

Re: Comparing two URL strings

From
Alan Hodgson
Date:
On Thu, 2022-05-19 at 22:23 +0530, mahendrakar s wrote:
Can you please let me know how to compare these two strings?
One way I could think of is to convert both strings to UTF-8, but I can't find a utility function to compare UTF-8 strings.


They aren't multi-byte, so converting them to UTF-8 wouldn't change anything. As they stand, they are in fact different strings: a plain string comparison knows nothing about URLs or percent-encoding.

You probably need to urldecode (or encode) them in your programming language before storing them.
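
A minimal sketch of the decode-then-compare idea in C. The language is only a guess based on the gdb output above, and url_decode and urls_equal are hypothetical helper names, not functions from any library. It decodes every valid %XX escape in both strings and then compares the results byte for byte:

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical helper: return a malloc'd copy of src in which every
 * valid %XX escape is replaced by the byte it encodes. Malformed
 * escapes and %00 are copied through unchanged, so the result never
 * contains an embedded NUL. Returns NULL if allocation fails.
 */
char *url_decode(const char *src)
{
    char *out = malloc(strlen(src) + 1); /* decoding never lengthens the string */
    char *dst = out;

    if (out == NULL)
        return NULL;

    while (*src != '\0') {
        int hi = -1, lo = -1;

        if (src[0] == '%' &&
            (hi = hex_value(src[1])) >= 0 &&
            (lo = hex_value(src[2])) >= 0 &&
            (hi | lo) != 0) {
            *dst++ = (char)(hi * 16 + lo);
            src += 3;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
    return out;
}

/* Hypothetical helper: true if a and b are identical after decoding. */
bool urls_equal(const char *a, const char *b)
{
    char *da = url_decode(a);
    char *db = url_decode(b);
    bool same = da != NULL && db != NULL && strcmp(da, db) == 0;

    free(da);
    free(db);
    return same;
}
```

With these helpers, urls_equal(url, graph_url) would be true for the two strings shown in the gdb session. Note that decoding everything also treats '%2F' and '/' as equal, which may be too loose for some uses; a stricter comparison would decode only the characters known to be safe in the part of the URL where they appear.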

Re: Comparing two URL strings

From
mahendrakar s
Date:
Thanks Alan.

On Thu, 19 May 2022 at 22:35, Alan Hodgson <ahodgson@lists.simkin.ca> wrote:
They aren't multi-byte, so converting them to UTF-8 wouldn't change anything. As they stand, they are in fact different strings: a plain string comparison knows nothing about URLs or percent-encoding.

You probably need to urldecode (or encode) them in your programming language before storing them.