Thread: Inconsistent results postgresql
Hello all,
I've got two queries which should produce the same results but they don't for some reason, please consider these :
SELECT date_trunc('day', payments.created_at) "day", SUM("payments"."amount") AS sum_id
FROM "payments" INNER JOIN "users" ON "users"."id" = "payments"."user_id"
WHERE "payments"."currency" = 'gbp' AND (payments.refunded_date is null) AND
(users.deleted_at is null) AND
(users.subscribed_at between '2014-07-07 00:00:00.000000' and '2014-07-07 23:59:59.999999') AND
("payments"."created_at" BETWEEN '2014-07-07 00:00:00.000000' AND '2014-07-07 23:59:59.999999')
GROUP BY 1
ORDER by 1 asc
It produces this :
day | sum_id
------------------------------
"2014-07-07 00:00:00" | 1863.85
But when I try to group by day at looking at 30 days period with this query :
SELECT date_trunc('day', payments.created_at) "day", SUM("payments"."amount") AS sum_id
FROM "payments" INNER JOIN "users" ON "users"."id" = "payments"."user_id"
WHERE "payments"."currency" = 'gbp' AND (payments.refunded_date is null) AND
(users.deleted_at is null) AND
(users.subscribed_at between '2014-06-30 00:00:00.000000' and '2014-07-30 23:59:59.999999') AND
("payments"."created_at" BETWEEN '2014-06-30 00:00:00.000000' AND '2014-07-30 23:59:59.999999')
GROUP BY 1
ORDER by 1 asc
It produces this (truncated to include the important data only) :
day | sum_id
------------------------------
"2014-07-07 00:00:00" | 1898.84
So looking for same date using different date range I get different results, how is this even possible? Can I look at something else? I'm really stuck here
Thanks
Emir Ibrahimbegovic wrote > Hello all, > > I've got two queries which should produce the same results but they don't > for some reason, please consider these : > > SELECT date_trunc('day', payments.created_at) "day", > SUM("payments"."amount") AS sum_id FROM "payments" > INNER JOIN "users" ON "users"."id" = "payments"."user_id" WHERE > "payments"."currency" = 'gbp' AND (payments.refunded_date is null) > AND (users.deleted_at is null) > AND (users.subscribed_at between '2014-07-07 00:00:00.000000' and > '2014-07-07 23:59:59.999999') > AND ("payments"."created_at" BETWEEN '2014-07-07 00:00:00.000000' > AND '2014-07-07 23:59:59.999999')GROUP BY 1ORDER by 1 asc > > It produces this : > > day | > sum_id------------------------------"2014-07-07 00:00:00" | 1863.85 > > But when I try to group by day at looking at 30 days period with this > query > : > > SELECT date_trunc('day', payments.created_at) "day", > SUM("payments"."amount") AS sum_id FROM "payments" > INNER JOIN "users" ON "users"."id" = "payments"."user_id" WHERE > "payments"."currency" = 'gbp' AND (payments.refunded_date is null) > AND (users.deleted_at is null) > AND (users.subscribed_at between '2014-06-30 00:00:00.000000' and > '2014-07-30 23:59:59.999999') > AND ("payments"."created_at" BETWEEN '2014-06-30 00:00:00.000000' > AND '2014-07-30 23:59:59.999999')GROUP BY 1ORDER by 1 asc > > It produces this (truncated to include the important data only) : > > day | > sum_id------------------------------"2014-07-07 00:00:00" | 1898.84 > > So looking for same date using different date range I get different > results, how is this even possible? Can I look at something else? I'm > really stuck here > > Thanks There may be a timezone issue here - hard to tell without seeing the data types for subscribed_at and created_at. The first thing I would do is move the WHERE clause portion of the query into a CTE: WITH detail AS ( SELECT ) Then I would run the GROUP BY portion over the CTE SELECT * FROM ( SELECT date_trunc(...), sum(...) FROM detail WHERE date_trunc(...) GROUP BY 1 ) src WHERE date_trunc = '2014-07-07'::date Assuming you are still getting different sums you can now readily list the corresponding detail (i.e., remove the group by) and see which records show in the one set but not the other. You can add additional CTE queries to facilitate this if desired - i.e. put each where clause into its own detail query and, if you have some kind of ID to compare against, perform a FULL OUTER JOIN between the two CTE tables and exclude any items that appear in both. Ignoring indexes for the moment it would be much safer and clearer to write the WHERE clause using "BETWEEN date AND date" instead of using timestamps with subsecond precision. If you do insist on using timestamps I would then suggest at least explicitly casting them. '2014-07-07 00:00:00.000000'::timestamp David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Inconsistent-results-postgresql-tp5813375p5813377.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
Hello all,
I've got two queries which should produce the same results but they don't for some reason, please consider these :
SELECT date_trunc('day', payments.created_at) "day", SUM("payments"."amount") AS sum_id FROM "payments" INNER JOIN "users" ON "users"."id" = "payments"."user_id" WHERE "payments"."currency" = 'gbp' AND (payments.refunded_date is null) AND (users.deleted_at is null) AND (users.subscribed_at between '2014-07-07 00:00:00.000000' and '2014-07-07 23:59:59.999999') AND ("payments"."created_at" BETWEEN '2014-07-07 00:00:00.000000' AND '2014-07-07 23:59:59.999999') GROUP BY 1 ORDER by 1 asc
It produces this :
day | sum_id ------------------------------ "2014-07-07 00:00:00" | 1863.85
But when I try to group by day at looking at 30 days period with this query :
SELECT date_trunc('day', payments.created_at) "day", SUM("payments"."amount") AS sum_id FROM "payments" INNER JOIN "users" ON "users"."id" = "payments"."user_id" WHERE "payments"."currency" = 'gbp' AND (payments.refunded_date is null) AND (users.deleted_at is null) AND (users.subscribed_at between '2014-06-30 00:00:00.000000' and '2014-07-30 23:59:59.999999') AND ("payments"."created_at" BETWEEN '2014-06-30 00:00:00.000000' AND '2014-07-30 23:59:59.999999') GROUP BY 1 ORDER by 1 asc
It produces this (truncated to include the important data only) :
day | sum_id ------------------------------ "2014-07-07 00:00:00" | 1898.84
So looking for same date using different date range I get different results, how is this even possible? Can I look at something else? I'm really stuck here
Thanks
Remove the sum (just select "payments.amount") and the GROUP BY and run your queries. You'll see that you're getting different rows included than you think you are.On Wed, Jul 30, 2014 at 8:41 PM, Emir Ibrahimbegovic <emir.ibrahimbegovic@gmail.com> wrote:Hello all,
I've got two queries which should produce the same results but they don't for some reason, please consider these :
SELECT date_trunc('day', payments.created_at) "day", SUM("payments"."amount") AS sum_id FROM "payments" INNER JOIN "users" ON "users"."id" = "payments"."user_id" WHERE "payments"."currency" = 'gbp' AND (payments.refunded_date is null) AND (users.deleted_at is null) AND (users.subscribed_at between '2014-07-07 00:00:00.000000' and '2014-07-07 23:59:59.999999') AND ("payments"."created_at" BETWEEN '2014-07-07 00:00:00.000000' AND '2014-07-07 23:59:59.999999') GROUP BY 1 ORDER by 1 asc
It produces this :
day | sum_id ------------------------------ "2014-07-07 00:00:00" | 1863.85
But when I try to group by day at looking at 30 days period with this query :
SELECT date_trunc('day', payments.created_at) "day", SUM("payments"."amount") AS sum_id FROM "payments" INNER JOIN "users" ON "users"."id" = "payments"."user_id" WHERE "payments"."currency" = 'gbp' AND (payments.refunded_date is null) AND (users.deleted_at is null) AND (users.subscribed_at between '2014-06-30 00:00:00.000000' and '2014-07-30 23:59:59.999999') AND ("payments"."created_at" BETWEEN '2014-06-30 00:00:00.000000' AND '2014-07-30 23:59:59.999999') GROUP BY 1 ORDER by 1 asc
It produces this (truncated to include the important data only) :
day | sum_id ------------------------------ "2014-07-07 00:00:00" | 1898.84
So looking for same date using different date range I get different results, how is this even possible? Can I look at something else? I'm really stuck here
Thanks
--I asked the Internet how to train my cat, and the Internet told me to get a dog.
On 07/31/2014 05:44 AM, Emir Ibrahimbegovic wrote: > Thank you for your response guys. So, did you find the cause? -- Adrian Klaver adrian.klaver@aklaver.com
On 07/31/2014 05:44 AM, Emir Ibrahimbegovic wrote:So, did you find the cause?Thank you for your response guys.
--
Adrian Klaver
adrian.klaver@aklaver.com
Chris Curvey <chris@chriscurvey.com> wrote: > Emir Ibrahimbegovic <emir.ibrahimbegovic@gmail.com> wrote: >> So looking for same date using different date range I get >> different results, how is this even possible? Can I look at >> something else? I'm really stuck here > Remove the sum (just select "payments.amount") and the GROUP BY > and run your queries. You'll see that you're getting different > rows included than you think you are. Well, either that or he's getting them in a different order and the data type being summed is a floating point approximate type. With the approximate types the order that addition happens in can affect the sum. For the most accurate result you want to add from the numbers closest to zero to the ones farthest away from zero. If exact results are desired, the best solution is to not use an approximate data type. Try converting the column to numeric, which is not as vulnerable to rounding errors. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company