Monday, March 26, 2012

Help with an Outer Join

Hello I'm tying to create query that selects data from two tables depending on the Employee and the range of dates. I have not used outer joins before and I either receive no data or receive too much data.

SELECT qad.NEW_USERS, qad.TRANSACTIONS, CONVERT( VARCHAR, qad.DATA_DATE, 101) DATA_DATE,
qas.QA_SCORE
FROM QA_SCORES qas FULL OUTER JOIN QA_DATA qad
ON (qas.EMPLOYEE_ID = qad.EMPLOYEE_ID AND qas.QA_DATE >= @.startDate AND qas.QA_DATE < @.endDate)
WHERE qad.DATA_DATE >= @.startDate AND qad.DATA_DATE < @.endDate
ORDER BY qad.DATA_DATE

Here are the tables:

QA_DATA:

QA_DATA_ID

EMPLOYEE_ID

NEW_USERS

TRANSACTIONS

DATA_DATE

4

11

0

0

12/1/2006

5

11

9

14

12/2/2006

6

1

2

3

1/1/2006

7

1

2

3

12/1/2006

8

11

6

18

12/3/2006

9

11

14

17

12/4/2006

10

11

7

16

12/5/2006

11

12

8

15

12/1/2006

12

12

0

0

12/3/2006

13

12

8

21

12/4/2006

QA_SCORES:

QA_ID

QA_DATE

QA_SCORE

EMPLOYEE_ID

1

1/18/2007

85

11

2

1/9/2007

83.01

11

Sometimes I receive this:

NEW_USERS

TRANSACTIONS

DATA_DATE

QASCORE

0

0

12/1/2006

85

0

0

12/1/2006

83.01

9

14

12/2/2006

83.01

9

14

12/2/2006

85

6

18

12/3/2006

85

6

18

12/3/2006

83.01

14

17

12/4/2006

83.01

14

17

12/4/2006

85

7

16

12/5/2006

85

7

16

12/5/2006

83.01

If the startDate = 12/01/2006 and the endDate = 12/31/2006 it should look like this:

NEW_USERS

TRANSACTIONS

DATA_DATE

QASCORE

0

0

12/1/2006

9

14

12/2/2006

6

18

12/3/2006

14

17

12/4/2006

7

16

12/5/2006

Because there are no QA_SCOREs in 2006

And if the start date is 1/01/2007 and the endDate is 1/31/2007 it should look like this:

NEW_USERS

TRANSACTIONS

DATA_DATE

QASCORE

1/9/2007

85

1/18/2007

83.01

I have tried quite a few things and just can't figure this out.

Any help is appreciated.

.

Milla:

(1) It appears that your data has the 85 score and the 83 score switched with each other. (2) Also, it appears to me that you have an additional filter criteria that you are probably leaving out -- probably a filter based on "EMPLOYEE_ID". Based on this guess I came up with this query that seems to return the results that you are looking for. I have no idea whether this is close or not, but maybe it will help:


declare @.filter_id integer
set @.filter_id = 11

SELECT qad.NEW_USERS,
qad.TRANSACTIONS,
CONVERT( VARCHAR, coalesce (qad.DATA_DATE, qas.qa_date), 101)
DATA_DATE,
qas.QA_SCORE
FROM QA_SCORES qas
FULL OUTER JOIN QA_DATA qad
ON qas.EMPLOYEE_ID = qad.EMPLOYEE_ID
AND qad.DATA_DATE >= @.startDate AND qad.DATA_DATE < @.endDate
AND qas.QA_DATE >= @.startDate AND qas.QA_DATE < @.endDate
WHERE coalesce (qad.DATA_DATE, qas.qa_date) >= @.startDate
AND coalesce (qad.DATA_DATE, qas.qa_date) < @.endDate
AND coalesce (qas.employee_id, qad.employee_id) = @.filter_id
ORDER BY coalesce (qad.DATA_DATE, qas.qa_date)


-- - Sample Output for 12/1/6 through 12/31/6: -

-- NEW_USERS TRANSACTIONS DATA_DATE QA_SCORE
-- -- --
-- 0 0 12/01/2006 NULL
-- 9 14 12/02/2006 NULL
-- 6 18 12/03/2006 NULL
-- 14 17 12/04/2006 NULL
-- 7 16 12/05/2006 NULL


-- - Sample Output for 1/1/7 through 1/31/7: -

-- NEW_USERS TRANSACTIONS DATA_DATE QA_SCORE
-- -- --
-- NULL NULL 01/09/2007 83.01
-- NULL NULL 01/18/2007 85.00


Dave

|||

Milla:

(If this post is a duplicate, I am sorry; it seems that my post has been lost.)

It appears to me that your query is missing a filter -- probably based on EMPLOYEE_ID. Based on this guess, the following query seems to return the correct data:


declare @.filter_id integer
set @.filter_id = 11

SELECT qad.NEW_USERS,
qad.TRANSACTIONS,
CONVERT( VARCHAR, coalesce (qad.DATA_DATE, qas.qa_date), 101)
DATA_DATE,
qas.QA_SCORE
FROM QA_SCORES qas
FULL OUTER JOIN QA_DATA qad
ON qas.EMPLOYEE_ID = qad.EMPLOYEE_ID
AND qad.DATA_DATE >= @.startDate AND qad.DATA_DATE < @.endDate
AND qas.QA_DATE >= @.startDate AND qas.QA_DATE < @.endDate
WHERE coalesce (qad.DATA_DATE, qas.qa_date) >= @.startDate
AND coalesce (qad.DATA_DATE, qas.qa_date) < @.endDate
AND coalesce (qas.employee_id, qad.employee_id) = @.filter_id
ORDER BY coalesce (qad.DATA_DATE, qas.qa_date)


-- - Sample Output for 12/1/6 through 12/31/6: -

-- NEW_USERS TRANSACTIONS DATA_DATE QA_SCORE
-- -- --
-- 0 0 12/01/2006 NULL
-- 9 14 12/02/2006 NULL
-- 6 18 12/03/2006 NULL
-- 14 17 12/04/2006 NULL
-- 7 16 12/05/2006 NULL


-- - Sample Output for 1/1/7 through 1/31/7: -

-- NEW_USERS TRANSACTIONS DATA_DATE QA_SCORE
-- -- --
-- NULL NULL 01/09/2007 83.01
-- NULL NULL 01/18/2007 85.00

|||

Hi,

Can you please tell what you want to do in text?

By setting the WHERE clause like this, you are not making a FULL OUTER JOIN but a RIGHT OUTER JOIN.

Why? For all the records that are in QA_SCORES and that have no associated record in QA_DATA, the field QA_DATA.DATA_DATE will always be NULL. Therefor, these records can not pass the WHERE clause.

You can prevent this by changing the WHERE keyword to the AND keyword so that the condition is included in the JOIN.

Greetz,

Geert

Geert Verhoeven
Consultant @. Ausy Belgium

My Personal Blog

|||

Take a look at your WHERE clause:

SELECT qad.NEW_USERS, qad.TRANSACTIONS,
CONVERT( VARCHAR, qad.DATA_DATE, 101) DATA_DATE,
qas.QA_SCORE
FROM QA_SCORES qas
FULL OUTER JOIN QA_DATA qad
ON (qas.EMPLOYEE_ID = qad.EMPLOYEE_ID
AND qas.QA_DATE >= @.startDate AND qas.QA_DATE < @.endDate)
WHERE qad.DATA_DATE >= @.startDate AND qad.DATA_DATE < @.endDate
ORDER BY qad.DATA_DATE

This where clause will apply to all rows that are returned from the FROM clause. So you will not receive qas rows where the qad.DATA_DATE is not greater than @.startDate. So any null rows would be eliminated from the set. You can put the WHERE clause stuff up in the ON clause and this won't affect how rows match in the qas set.

|||Since you didn't tell us what you are trying to get as a result, use "LEFT OUTER JOIN" instead of "FULL OUTER JOIN", that will probably get you closer.|||

Thanks for the replies. What you are saying makes sence.

But I still can't seem to get it to work.

Here is what I have now:

declare @.startDate datetime
declare @.endDate datetime
set @.startDate = 1/1/2007
set @.endDate = 1/31/2007

SELECT emp.NAME NAME, qa.NEW_USERS NewUsers, qa.TRANSACTIONS Trans,
CONVERT( VARCHAR, coalesce (qa.DATA_DATE, qas.qa_date), 101) DATA_DATE, qas.QA_SCORE
FROM EMPLOYEE emp INNER JOIN
QA_DATA qa ON (emp.EMPLOYEE_ID = qa.EMPLOYEE_ID AND emp.REPORTABLE = 1) FULL OUTER JOIN QA_SCORES qas
ON (qas.EMPLOYEE_ID = qa.EMPLOYEE_ID
AND qa.DATA_DATE >= @.startDate AND qa.DATA_DATE < @.endDate
AND qas.QA_DATE >= @.startDate AND qas.QA_DATE < @.endDate)

No matter what dates I enter it returns all rows. I'm guessing it's because there is no where statement, but if I enter a where statement it returns no rows.

Here is what it is returning:

Allard

Rich

0

0

12/1/2006

NULL

Allard

Rich

9

14

12/2/2006

NULL

Allard

Rich

6

18

12/3/2006

NULL

Allard

Rich

14

17

12/4/2006

NULL

Allard

Rich

7

16

12/5/2006

NULL

Bass

Gary

8

15

12/1/2006

NULL

Bass

Gary

0

0

12/3/2006

NULL

Bass

Gary

8

21

12/4/2006

NULL

NULL

NULL

NULL

1/18/2007

85

NULL

NULL

NULL

1/9/2007

83.01

This is what it should look like:

NULL

NULL

NULL

1/18/2007

85

NULL

NULL

NULL

1/9/2007

83.01

And if I change the dates to 12/1/2007 and 12/31/2007 the result should be this:

Allard

Rich

0

0

12/1/2006

NULL

Allard

Rich

9

14

12/2/2006

NULL

Allard

Rich

6

18

12/3/2006

NULL

Allard

Rich

14

17

12/4/2006

NULL

Allard

Rich

7

16

12/5/2006

NULL

Bass

Gary

8

15

12/1/2006

NULL

Bass

Gary

0

0

12/3/2006

NULL

Bass

Gary

8

21

12/4/2006

NULL

I know this is probably an easy fix but I can't seem to figure out what I'm doing wrong.

Thanks in advance!

|||

Another thing I just noticed is it is returning :

NULLNULLNULL1/18/200785
NULLNULLNULL1/9/200783.01

and it should look like this:

Allard, Rich NULL NULL 1/18/2007 85
Allard, Rich NULL NULL 1/9/2007 83.01

|||

Milla:

I used the following data:

insert into QA_DATA values (4, 11, 0, 0, '12/1/2006' )
insert into QA_DATA values (5, 11, 9, 14, '12/2/2006' )
insert into QA_DATA values (6, 1, 2, 3, '1/1/2006' )
insert into QA_DATA values (7, 1, 2, 3, '12/1/2006' )
insert into QA_DATA values (8, 11, 6, 18, '12/3/2006' )
insert into QA_DATA values (9, 11, 14, 17, '12/4/2006' )
insert into QA_DATA values (10, 11, 7, 16, '12/5/2006' )
insert into QA_DATA values (11, 12, 8, 15, '12/1/2006' )
insert into QA_DATA values (12, 12, 0, 0, '12/3/2006' )
insert into QA_DATA values (13, 12, 8, 21, '12/4/2006' )

insert into dbo.QA_SCORES values (1, '1/18/2007', 85, 11 )
insert into dbo.QA_SCORES values (2, '1/9/2007', 83.01, 11 )

truncate table dbo.employee
insert into employee values (1, 'Employee #1')
insert into employee values (11, 'Allard, Rich')
insert into employee values (12, 'Bass, Gary')

with this query:

declare @.startDate datetime
declare @.endDate datetime
--set @.startDate = '12/1/6'
--set @.endDate = '12/31/6'
set @.startDate = '1/1/7'
set @.endDate = '1/31/7'

SELECT emp.[name],
qad.NEW_USERS,
qad.TRANSACTIONS,
CONVERT( VARCHAR, coalesce (qad.DATA_DATE, qas.qa_date), 101)
DATA_DATE,
qas.QA_SCORE
FROM QA_SCORES qas
FULL OUTER JOIN QA_DATA qad
ON qas.EMPLOYEE_ID = qad.EMPLOYEE_ID
AND qad.DATA_DATE >= @.startDate AND qad.DATA_DATE < @.endDate
AND qas.QA_DATE >= @.startDate AND qas.QA_DATE < @.endDate
inner join employee emp
on coalesce (qas.employee_id, qad.employee_id) = emp.employee_id
WHERE coalesce (qad.DATA_DATE, qas.qa_date) >= @.startDate
AND coalesce (qad.DATA_DATE, qas.qa_date) < @.endDate
ORDER BY EMP.[name], coalesce (qad.DATA_DATE, qas.qa_date)

to get these results:

name NEW_USERS TRANSACTIONS DATA_DATE QA_SCORE
-- --
Allard, Rich NULL NULL 01/09/2007 83.01
Allard, Rich NULL NULL 01/18/2007 85.00

name NEW_USERS TRANSACTIONS DATA_DATE QA_SCORE
-- --
Allard, Rich 0 0 12/01/2006 NULL
Allard, Rich 9 14 12/02/2006 NULL
Allard, Rich 6 18 12/03/2006 NULL
Allard, Rich 14 17 12/04/2006 NULL
Allard, Rich 7 16 12/05/2006 NULL
Bass, Gary 8 15 12/01/2006 NULL
Bass, Gary 0 0 12/03/2006 NULL
Bass, Gary 8 21 12/04/2006 NULL
Employee #1 2 3 12/01/2006 NULL

|||

Not sure what I did the first time that made it not work but the last one you entered is working for me.

Thanks Waldrop!

|||

You didn't do anything wrong the first time; my guess was wrong! Remember: I made an assertion that the employee_id was 11 -- and this was not correct. When you gave some more information it became apparent that my guess was not correct and I dropped this part from my version of the query. Glad it worked out!


Dave

No comments:

Post a Comment