Sunday, March 20, 2011

Date ranges in views - is this normal?

I recently started working at a company with an enormous "enterprisey" application. At my last job, I designed the database, but here we have a whole Database Architecture department that I'm not part of.

One of the stranger things in their database is that they have a bunch of views which, instead of having the user provide the date ranges they want to see, join with a (global temporary) table "TMP_PARM_RANG" with a start and end date. Every time the main app starts processing a request, the first thing it does it "DELETE FROM TMP_PARM_RANG;" then an insert into it.

This seems like a bizarre way of doing things, and not very safe, but everybody else here seems ok with it. Is this normal, or is my uneasiness valid?

Update I should mention that they use transactions and per-client locks, so it is guarded against most concurrency problems. Also, there are literally dozens if not hundreds of views that all depend on TMP_PARM_RANG.

From stackoverflow
  • Personally, I'm guessing that it would be a pretty strange occurance. And from what you are saying two methods calling the process at the same time could be very interesting.

    Typically date ranges are done as filters on a view, and not driven by outside values stored in other tables.

    The only justification I could see for this is if there was a multi-step process, that was only executed once at a time and the dates are needed for multiple operations, across multiple stored procedures.

  • Do they also add one -in the application- to generate the next unique value for the primary key?

    It seems that the concept of shared state eludes these folks, or the reason for the shared state eludes us.

    Paul Tomblin : I haven't discovered how they assign primary keys yet, but there isn't a sequence in the database like I expected.
  • That sounds like a pretty weird algorithm to me. I wonder how it handles concurrency - is it wrapped in a transaction?

    Sounds to me like someone just wasn't sure how to write their WHERE clause.

  • There must be some business reason for this table. I've seen views with dates hardcoded that were actually a partioned view and they were using dates as the partioning field. I've also seen joining on a table like when dealing with daylights saving times imagine a view that returned all activity which occured during DST. And none of these things would ever delete and insert into the table...that's just odd

    So either there is a deeper reason for this that needs to be dug out, or it's just something that at the time seemed like a good idea but why it was done that way has been lost as tribal knowledge.

  • If the database is oracle, it's possibly a global temporary table; every session sees its own version of the table and inserts/deletes won't affect other users.

  • Do I understand this correctly?

    There is a view like this:

    SELECT * FROM some_table, tmp_parm_rang
      WHERE some_table.date_column BETWEEN tmp_parm_rang.start_date AND tmp_parm_rang.end_date;
    

    Then in some frontend a user inputs a date range, and the application does the following:

    1. Deletes all existing rows from TMP_PARM_RANG
    2. Inserts a new row into TMP_PARM_RANG with the user's values
    3. Selects all rows from the view

    I wonder if the changes to TMP_PARM_RANG are committed or rolled back, and if so when? Is it a temporary table or a normal table? Basically, depending on the answers to these questions, the process may not be safe for multiple users to execute in parallel. One hopes that if this were the case they would have already discovered that and addressed it, but who knows?

    Even if it is done in a thread-safe way, making changes to the database for simple query operations doesn't make a lot of sense. These DELETEs and INSERTs are generating redo/undo (or whatever the equivalent is in a non-Oracle database) which is completely unnecessary.

    A simple and more normal way of accomplishing the same goal would be to execute this query, binding the user's inputs to the query parameters:

    SELECT * FROM some_table WHERE some_table.date_column BETWEEN ? AND ?;
    
    Paul Tomblin : Yes to your first question, although there are literally dozens if not hundreds of these views that all depend on tmp_parm_rang.
  • The views are probably used as temp tables. In SQL Server we can use a table variable or a temp table (# / ##) for this purpose. Although creating views are not recommended by experts, I have created lots of them for my SSRS projects because the tables I am working on do not reference one another (NO FK's, seriously!). I have to workaround deficiencies in the database design; that's why I am using views a lot.

  • I suppose it would let them support multiple ranges. For example, they can return all dates between 1/1/2008 and 1/1/2009 AND 1/1/2006 and 1/1/2007 to compare 2006 data to 2008 data. You couldn't do that with a single pair of bound parameters. Also, I don't know how Oracle does it's query plan caching for views, but perhaps it has something to do with that? With the date columns being checked as part of the view the server could cache a plan that always assumes the dates will be checked.

    Just throwing out some guesses here :)

    Also, you wrote:

    I should mention that they use transactions and per-client locks, so it is guarded against most concurrency problems.

    While that may guard against data consistency problems due to concurrency, it hurts when it comes to performance problems due to concurrency.

  • With the global temporary table GTT approach that you comment is being used here, the method is certainly safe with regard to a multiuser system, so no problem there. If this is Oracle then I'd want to check that the system either is using an appropriate level of dynamic sampling so that the GTT is joined appropriately, or that a call to DBMS_STATS is made to supply statistics on the GTT.

0 comments:

Post a Comment