SQL query uses "wrong" join -
i have query gives me wrong result.
tables:
a +----+ | id | +----+ | 1 | | 2 | +----+ b +----+----+ | id | x | b.id = a.id +----+----+ | 1 | 1 | | 1 | 1 | | 1 | 0 | +----+----+ c +----+----+ | id | y | c.id = a.id +----+----+ | 1 | 1 | | 1 | 2 | +----+----+ what want do: select rows a. each row in count in b x value 1 , x value 0 b.id = a.id. each row in minimum y c c.id = a.id.
the result expecting is:
+----+------+--------+---------+ | id | min | count1 | count 2 | +----+------+--------+---------+ | 1 | 1 | 2 | 1 | | 2 | null | 0 | 0 | +----+------+--------+---------+ first try: doesn't work.
select a.id, min(c.y), sum(if(b.x = 1, 1, 0)), sum(if(b.x = 0, 1, 0)) left join b on ( a.id = b.id ) left join c on ( a.id = c.id ) group a.id +----+------+--------+---------+ | id | min | count1 | count 2 | +----+------+--------+---------+ | 1 | 1 | 4 | 2 | | 2 | null | 0 | 0 | +----+------+--------+---------+ second try: works sure has bad performance.
select a.id, min(c.y), b.x, b.y left join (select b.id, sum(if(b.x = 1, 1, 0)) x, sum(if(b.x = 0, 1, 0)) y b) b on ( a.id = b.id ) left join c on ( a.id = c.id ) group a.id +----+------+--------+---------+ | id | min | count1 | count 2 | +----+------+--------+---------+ | 1 | 1 | 2 | 1 | | 2 | null | 0 | 0 | +----+------+--------+---------+ last try: works too.
select x.*, sum(if(b.x = 1, 1, 0)), sum(if(b.x = 0, 1, 0)) (select a.id, min(c.y) left join c on ( a.id = c.id ) group a.id) x left join b on ( b.id = x.id ) group x.id now question is: last 1 best choise or there way write query 1 select statement (like in first try)?
your joins doing cartesian products given value, because there multiple rows in each table.
you can fix using count(distinct) rather sum():
select a.id, min(c.y), count(distinct (case when b.x = 1 b.id end)), count(distinct (case when b.x = 0 b.id end)) left join b on ( a.id = b.id ) left join c on ( a.id = c.id ) group a.id; you can fix pre-aggregating b (and/or c). , need take approach if aggregation function sum of column in b.
edit:
you correct. above query counts distinct values of b, b contains rows exact duplicates. (personally, think having column name id has duplicates sign of poor design, issue.)
you solve having real id in b table, because count(distinct) count correct values. can solve aggregating 2 tables before joining them in:
select a.id, c.y, x1, x0 left join (select b.id, sum(b.x = 1) x1, sum(b.x = 0) x0 b group b.id ) b on ( a.id = b.id ) left join (select c.id, min(c.y) y c group c.id ) c on ( a.id = c.id ); here sql fiddle problem.
edit ii:
you can in 1 statement, i'm not sure work on similar data. idea can count cases x = 1 , divide number of rows in c table real distinct count:
select a.id, min(c.y), coalesce(sum(b.x = 1), 0) / count(distinct coalesce(c.y, -1)), coalesce(sum(b.x = 0), 0) / count(distinct coalesce(c.y, -1)) left join b on ( a.id = b.id ) left join c on ( a.id = c.id ) group a.id; it little tricky, because have handle nulls right values. note counting y value distinct count c table. question re-enforces why idea have unique integer primary key in every table.
Comments
Post a Comment