SQL query uses "wrong" join -
i have query gives me wrong result.
tables:
a +----+ | id | +----+ | 1 | | 2 | +----+ b +----+----+ | id | x | b.id = a.id +----+----+ | 1 | 1 | | 1 | 1 | | 1 | 0 | +----+----+ c +----+----+ | id | y | c.id = a.id +----+----+ | 1 | 1 | | 1 | 2 | +----+----+
what want do: select rows a. each row in count in b x value 1 , x value 0 b.id = a.id. each row in minimum y c c.id = a.id.
the result expecting is:
+----+------+--------+---------+ | id | min | count1 | count 2 | +----+------+--------+---------+ | 1 | 1 | 2 | 1 | | 2 | null | 0 | 0 | +----+------+--------+---------+
first try: doesn't work.
select a.id, min(c.y), sum(if(b.x = 1, 1, 0)), sum(if(b.x = 0, 1, 0)) left join b on ( a.id = b.id ) left join c on ( a.id = c.id ) group a.id +----+------+--------+---------+ | id | min | count1 | count 2 | +----+------+--------+---------+ | 1 | 1 | 4 | 2 | | 2 | null | 0 | 0 | +----+------+--------+---------+
second try: works sure has bad performance.
select a.id, min(c.y), b.x, b.y left join (select b.id, sum(if(b.x = 1, 1, 0)) x, sum(if(b.x = 0, 1, 0)) y b) b on ( a.id = b.id ) left join c on ( a.id = c.id ) group a.id +----+------+--------+---------+ | id | min | count1 | count 2 | +----+------+--------+---------+ | 1 | 1 | 2 | 1 | | 2 | null | 0 | 0 | +----+------+--------+---------+
last try: works too.
select x.*, sum(if(b.x = 1, 1, 0)), sum(if(b.x = 0, 1, 0)) (select a.id, min(c.y) left join c on ( a.id = c.id ) group a.id) x left join b on ( b.id = x.id ) group x.id
now question is: last 1 best choise or there way write query 1 select statement (like in first try)?
your joins doing cartesian products given value, because there multiple rows in each table.
you can fix using count(distinct)
rather sum()
:
select a.id, min(c.y), count(distinct (case when b.x = 1 b.id end)), count(distinct (case when b.x = 0 b.id end)) left join b on ( a.id = b.id ) left join c on ( a.id = c.id ) group a.id;
you can fix pre-aggregating b
(and/or c
). , need take approach if aggregation function sum of column in b
.
edit:
you correct. above query counts distinct values of b
, b
contains rows exact duplicates. (personally, think having column name id
has duplicates sign of poor design, issue.)
you solve having real id
in b
table, because count(distinct)
count correct values. can solve aggregating 2 tables before joining them in:
select a.id, c.y, x1, x0 left join (select b.id, sum(b.x = 1) x1, sum(b.x = 0) x0 b group b.id ) b on ( a.id = b.id ) left join (select c.id, min(c.y) y c group c.id ) c on ( a.id = c.id );
here sql fiddle problem.
edit ii:
you can in 1 statement, i'm not sure work on similar data. idea can count cases x = 1
, divide number of rows in c
table real distinct count:
select a.id, min(c.y), coalesce(sum(b.x = 1), 0) / count(distinct coalesce(c.y, -1)), coalesce(sum(b.x = 0), 0) / count(distinct coalesce(c.y, -1)) left join b on ( a.id = b.id ) left join c on ( a.id = c.id ) group a.id;
it little tricky, because have handle null
s right values. note counting y
value distinct count c
table. question re-enforces why idea have unique integer primary key in every table.
Comments
Post a Comment