I have this MySQL query:
SELECT DAYOFYEAR(`date`) AS d, COUNT(*)
FROM `orders`
WHERE `hasPaid` > 0
GROUP BY d
ORDER BY d
Which returns something like this:
d | COUNT(*) |
20 | 5 |
21 | 7 |
22 | 12 |
23 | 4 |
What I'd really like is another column on the end to show the running total:
d | COUNT(*) | ??? |
20 | 5 | 5 |
21 | 7 | 12 |
22 | 12 | 24 |
23 | 4 | 28 |
Is this possible?
-
I would say this is impossible, since every resulting row should be independent of the others. Use a programming language to compute these values.
Jarret Hardie : Given the nature of relational math, and the fact that you're using GROUP BY, even if MySQL has some hack to make this possible, it would be less convoluted to just do it in a programming language as Sergej suggests.
cdonner : I would disagree. Splitting the processing tasks between the database and the application layer is problematic from a reuse and maintenance perspective. If you want to use this data in different places, maybe on a report and on a screen, you'd have to duplicate the running totals logic.
nickf : +1 you're right: this would be easier and better overall in the programming logic - I was trying to see if there was some magic awesome function to do it.
Sam : When you have a considerable amount of data you have to compromise some purity. Also, in this case it really doesn't look like true "logic" to me; it could be seen as just a "visual aid". There is no real business logic in accumulating values.
Jarret Hardie : Agree with Sam. A report and a screen are both view code. The re-usable "logic" should be encapsulated in the view layer... application design notwithstanding.
-
SELECT DAYOFYEAR(O.`date`) AS d, COUNT(*),
  (SELECT COUNT(*)
   FROM `orders`
   WHERE DAYOFYEAR(`date`) <= d AND `hasPaid` > 0)
FROM `orders` AS O
WHERE O.`hasPaid` > 0
GROUP BY d
ORDER BY d
This may require some syntactical tuning (I don't have MySQL to test it), but it shows you the idea. The subquery has to go back and recount everything the outer query has already included, and it has to do that for every row.
Take a look at this question for how to use joins to accomplish the same.
To address concerns about performance degradation with growing data: Since there are max. 366 days in a year, and I assume that you are not running this query against multiple years, the subquery will get evaluated up to 366 times. With proper indices on the date and the hasPaid flag, you'll be ok.
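To make the per-row recount concrete, here is a minimal sketch of the same correlated-subquery idea. It assumes SQLite (through Python's `sqlite3`) as a stand-in for MySQL, so `strftime('%j', ...)` replaces `DAYOFYEAR`; the sample dates and the helper names `make_demo_db` and `running_totals_subquery` are made up for illustration.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory table shaped like the question's `orders` (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (date TEXT, hasPaid INTEGER)")
    paid_per_day = {"2009-01-20": 5, "2009-01-21": 7, "2009-01-22": 12, "2009-01-23": 4}
    rows = [(day, 1) for day, n in paid_per_day.items() for _ in range(n)]
    rows.append(("2009-01-21", 0))  # an unpaid order that must not be counted
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


def running_totals_subquery(conn):
    """Return (day, count, running_total) rows using one correlated subquery per day."""
    sql = """
        SELECT o.d,
               COUNT(*) AS c,
               (SELECT COUNT(*)
                  FROM orders AS i
                 WHERE i.hasPaid > 0
                   AND CAST(strftime('%j', i.date) AS INTEGER) <= o.d) AS rt
          FROM (SELECT CAST(strftime('%j', date) AS INTEGER) AS d
                  FROM orders
                 WHERE hasPaid > 0) AS o
         GROUP BY o.d
         ORDER BY o.d
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for row in running_totals_subquery(make_demo_db()):
        print(row)
```

The inner `SELECT COUNT(*)` runs once per output row, which is why the cost is bounded by the number of distinct days rather than by the number of orders.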
nickf : thanks - this works perfectly as is.
Sergej Andrejev : Be aware that this will be extremely slow on big, average and even some small databases, because it needs to run one additional query for every row in the result.
Jarret Hardie : Agree. I +1'd this answer because it is clever, and we've all used solutions like this when needed, but we are also all aware there is a cost. Depends on where you need the running count. For the business logic? Then maybe do this in the DB. For the view? Do it in code.
-
Unless you have no other option but to do this in SQL, I'd sum the results in the programming language that is making the query. A nested query like this will become very slow as the table grows.
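As a rough sketch of what summing in the calling code could look like, assuming Python is the client language and that the grouped query has already returned `(d, count)` pairs ordered by `d` (the function name `add_running_total` is made up here):

```python
from itertools import accumulate


def add_running_total(rows):
    """Append a running total to each (day, count) row in a single pass."""
    rows = list(rows)
    totals = accumulate(count for _, count in rows)
    return [(day, count, total) for (day, count), total in zip(rows, totals)]


if __name__ == "__main__":
    # Sample rows taken from the question's example output.
    for row in add_running_total([(20, 5), (21, 7), (22, 12), (23, 4)]):
        print(row)
```

This walks the result set once, so the database only has to run the original grouped query.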
-
You can hack this using a cross join or some self joins, but it will get slow with any large data set, so it is probably best done in a post-query processing step: either a cursor or client code.
-
This is one of the few places where cursors are faster than set-based queries. If performance is critical, I would either
- Do this outside of MySQL, or
- Use MySQL 5 cursors
-
Here is a perhaps simpler solution that saves the database from having to run a ton of queries. It executes just one query, then does a little math on the results in a single pass.
SET @runtot:=0;
SELECT
  q1.d,
  q1.c,
  (@runtot := @runtot + q1.c) AS rt
FROM
  (SELECT
     DAYOFYEAR(`date`) AS d,
     COUNT(*) AS c
   FROM `orders`
   WHERE `hasPaid` > 0
   GROUP BY d
   ORDER BY d) AS q1
This will give you an additional RT (running total) column. Don't miss the SET statement at the top to initialize the running total variable first, or you will just get a column of NULL values.
nickf : that works brilliantly! Looking at the `EXPLAIN` on this shows it to be much more efficient than the previously accepted answer