Sunday, April 3, 2011

How do I best get the top 2 unique rows when a JOIN is involved?

I have this query:

select top(2)
    property_id_ref
    ,image_file
    ,property_name 
from property_master a 
inner join image_master b 
    on a.property_id=b.property_id_ref 
inner join customer_master c 
    on a.customer_id=c.customer_id

When I execute it, I get the following result:

512 ~/propertyimg/3954493 id_1.jpg Commercial Land 
512 ~/propertyimg/3954493.jpg Commercial Land

But I need the output distinct property_id_ref with random image_file from the property_id_ref like this:

512 ~/propertyimg/3954493 id_1.jpg Commercial Land 
513 ~/propertyimg/3119918 Id.jpg Residential Plot

For that I made a query like:

select top(2) 
    max(pm.property_name) as property_name
    ,max(im.property_id_ref) as property_id_ref
    ,CONVERT(varchar(5000),  max( CONVERT(binary, im.image_file))) as image_file 
from property_master pm
inner join image_master im
    on pm.property_id=im.property_id_ref 
inner join customer_master cm
    on pm.customer_id=cm.customer_id 
group by im.property_id_ref

So I got the same output as the one I expected. I want to know whether this is the right way to do it, or is there any other better way of doing the same thing?

I am using SQL Server 2005.

From stackoverflow
  • The way you do is the right one. An group by of the ID_Ref and a random member by max. It's completly OK and I see no reason why to change it.

    Blixt : I disagree... There are unnecessary aggregate functions and conversions going on in there. This can be done with `ROW_NUMBER()` in SQL Server 2005, or joining to a sub-query in earlier versions of SQL Server. I was looking for a third, better solution I can't remember, but I can't find it. Maybe someone else remembers it though.
  • If you really only have that query you posted in example, this will work fine:

    SELECT TOP (2)
        pm.property_id,
        pm.property_name,
        (SELECT TOP 1 image_file
         FROM image_master
         WHERE property_id_ref = pm.property_id) AS image_file
    FROM
        property_master pm
    -- This is only needed if it's possible that [image_file] can be NULL and you
    -- don't want to get those rows.
    WHERE
        EXISTS (SELECT * FROM image_master
                WHERE property_id_ref = pm.property_id)
    

    I assume your query is more complex than that though, but I can't give you a more specific query unless you post your real query.

0 comments:

Post a Comment