Friday, February 4, 2011

How to return a page of results from SQL?

Many applications have grids that display data from a database table one page at a time. Many of them also let the user pick the number of records per page, sort by any column, and navigate back and forth through the results.

What's a good algorithm to implement this pattern without bringing the entire table to the client and then filtering the data on the client. How do you bring just the records you want to display to the user?

Does LINQ simplify the solution?

  • On MS SQL Server 2005 and above, ROW_NUMBER() seems to work:

    http://blogs.x2line.com/al/archive/2005/11/18/1323.aspx

    DECLARE @PageNum AS INT;
    DECLARE @PageSize AS INT;
    SET @PageNum = 2;
    SET @PageSize = 10;
    
    WITH OrdersRN AS
    (
        SELECT ROW_NUMBER() OVER(ORDER BY OrderDate, OrderID) AS RowNum
              ,OrderID
              ,OrderDate
              ,CustomerID
              ,EmployeeID
          FROM dbo.Orders
    )
    
    SELECT * 
      FROM OrdersRN
     WHERE RowNum BETWEEN (@PageNum - 1) * @PageSize + 1 
                      AND @PageNum * @PageSize
     ORDER BY OrderDate
             ,OrderID;
    
  • Actually, LINQ has Skip and Take methods which can be combined to choose which records are fetched.

    Check those out.

    For DB: http://www.singingeels.com/Articles/Pagination_In_SQL_Server_2005.aspx

    From Vaibhav
  • Oracle Solution:

    select * from (
        select a.*, rownum rnum from (
            YOUR_QUERY_GOES_HERE -- including the order by
        ) a
        where rownum <= MAX_ROW
     ) where rnum >= MIN_ROW
    
  • I'd recommend either using LINQ, or try to copy what it does. I've got an app where I use the LINQ Take and Skip methods to retrieve paged data. The code looks something like this:

    MyDataContext db = new MyDataContext();
    var results = db.Products
        .Skip((pageNumber - 1) * pageSize)
        .Take(pageSize);
    

    Running SQL Server Profiler reveals that LINQ is converting this query into SQL similar to:

    SELECT [ProductId], [Name], [Cost], and so on...
    FROM (
        SELECT [ProductId], [Name], [Cost], [ROW_NUMBER]
        FROM (
           SELECT ROW_NUMBER() OVER (ORDER BY [Name]) AS [ROW_NUMBER], 
               [ProductId], [Name], [Cost]
           FROM [Products]
        )
        WHERE [ROW_NUMBER] BETWEEN 10 AND 20
    )
    ORDER BY [ROW_NUMBER]
    

    In plain English:
    1. Filter your rows and use the ROW_NUMBER function to add row numbers in the order you want.
    2. Filter (1) to return only the row numbers you want on your page.
    3. Sort (2) by the row number, which is the same as the order you wanted (in this case, by Name).

  • There is another identical post: http://stackoverflow.com/questions/2840/paging-sql-server-2005-results#9870

    Is there a way to merge posts in stackoverflow?

  • There are a few solutions which I use with MS SQL 2005.

    One of them is ROWNUMBER(). But, personally, I don't like ROWNUMBER() because it doesn't work for big results (DB which I work on is really big -- over 1TB data running thousands of queries in second -- you know -- big social networking site).

    Here are my favourite solution.

    I will use kind of pseudo code of T-SQL.

    Let's find 2nd page of users sorted by forename, surname, where each page has 10 records.

    @page = 2 -- input parameter
    @size = 10 -- can be optional input parameter
    
    if @page < 1 then begin
        @page = 1 -- check page number
    end
    @start = (@page-1) * @size + 1 -- @page starts at record no @start
    
    -- find the beginning of page @page
    SELECT TOP (@start)
        @forename = forename,
        @surname = surname
        @id = id
    FROM
        users
    ORDER BY
        forename,
        surname,
        id -- to keep correct order in case of have two John Smith.
    
    -- select @size records starting from @start
    SELECT TOP (@size)
        id,
        forename,
        surname
    FROM
        users
    WHERE
        (forename = @forename and surname = @surname and id >= @id) -- the same name and surname, but bigger id
        OR (forename = @forename and surname > @surname) -- the same name, but bigger surname, id doesn't matter
        OR (forename > @forename) -- bigger forename, the rest doesn't matter
    ORDER BY
        forename,
        surname,
        id
    
  • LINQ combined with lambda expressions and anonymous classes in .Net 3.5 hugely simplifies this sort of thing.

    Querying the database:

    var customers = from c in db.customers
                    join p in db.purchases on c.CustomerID equals p.CustomerID
                    where p.purchases > 5
                    select c;
    

    Number of records per page:

    customers = customers.Skip(pageNum * pageSize).Take(pageSize);
    

    Sorting by any column:

    customers = customers.OrderBy(c => c.LastName);
    

    Getting only selected fields from server:

    var customers = from c in db.customers
                    join p in db.purchases on c.CustomerID equals p.CustomerID
                    where p.purchases > 5
                    select new
                    {
                        CustomerID = c.CustomerID,
                        FirstName = c.FirstName,
                        LastName = c.LastName
                    };
    

    This creates a statically-typed anonymous class in which you can access its properties:

    var firstCustomer = customer.First();
    int id = firstCustomer.CustomerID;
    

    Results from queries are lazy-loaded by default, so you aren't talking to the database until you actually need the data. LINQ in .Net also greatly simplifies updates by keeping a datacontext of any changes you have made, and only updating the fields which you change.

    ScSub : The first statement seems to pull back everything from the DB and then the second statement gets a subset. What if you just want a subset in the first place? I have 90,000 rows and just want page 4 of 10 rows.
    Adam Lassek : @ScSub LINQ expressions are lazy-loaded, so the first call doesn't actually do anything at first. You could call `customers = customers.Skip(30).Take(10)` and it would only pull back what you want.
  • There is a discussion about this Here

    The technique gets page number 100,000 from a 150,000 line database in 78ms

    Using optimizer knowledge and SET ROWCOUNT, the first EmployeeID in the page that is requested is stored in a local variable for a starting point. Next, SET ROWCOUNT to the maximum number of records that is requested in @maximumRows. This allows paging the result set in a much more efficient manner. Using this method also takes advantage of pre-existing indexes on the table as it goes directly to the base table and not to a locally created table.

    I am afraid I am not able to judge if it is better than the current accepted answer.

    From Aidan

0 comments:

Post a Comment