Tuesday, January 18, 2011

Avg. Disk Queue Length counter when using external RAID enclosure?

I realize that many people recommend looking at other counters, such as Avg. Disk sec/Read and Avg. Disk sec/Write, instead of Avg. Disk Queue Length. However, I have a question about this particular counter:

It's typically recommended that Avg. Disk Queue Length not be greater than 2. Just as often I'll see that it should not be greater than 2 + the number of spindles in the "physical disk." This is what I'm curious about. If I'm using an external RAID system, doesn't the OS see it as one physical spindle? Would I still factor in the number of physical spindles in the array when using this counter? Some insight into how this works would be helpful.

  • The OS doesn't do any fancy calculation. I'm speculating here, but given how performance counters work in general, I'd guess it simply increments the counter and the base counter when it posts an I/O, and decrements the counter when the I/O returns. Since the performance counter type is defined as 'Average', the performance tools and libraries compute the result from the raw values (counter, base counter) and the time between samples, and that result is the counter value you see.

    Nowhere in this process does the physical structure of the RAID array come into the picture. So when you evaluate the value, you must factor in the number of spindles yourself when deciding whether it is high or low. If the external RAID has 100 spindles, then an average of 200 pending requests is a good one: it means all 100 spindles have something to chew on. If it has 10 spindles, though, an average queue of 200 means that each spindle has 19 more requests pending after it is done with the current one, on average, so the I/O is a bottleneck.

    John Gardeniers : +1 Other than software RAID the OS knows nothing about the number of spindles. That's a user supplied piece of the puzzle. In this context an external RAID array is no different to an internal hardware RAID array.
    Boden : Thanks Remus, the explanation in your second paragraph was exactly what I needed. I think I understand now!
  • Divide the queue length by the number of spindles in use. Take hot spares and parity into account, depending on your RAID config.
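
The arithmetic in the answers above can be sketched in a few lines of Python. This is only an illustration, not anything Windows does for you: the function names are made up, the spindle counts are numbers you supply yourself, and the per-spindle limit of 2 is just the rule of thumb mentioned in the question. Whether parity disks should be discounted depends on the RAID layout, so that is left as an input rather than decided here.

```python
def active_spindles(total_disks, hot_spares=0, parity_disks=0):
    """Spindles assumed to be servicing I/O (all counts are user supplied)."""
    spindles = total_disks - hot_spares - parity_disks
    if spindles <= 0:
        raise ValueError("need at least one active spindle")
    return spindles


def queue_per_spindle(avg_disk_queue_length, spindles):
    """Spread the reported Avg. Disk Queue Length evenly across the spindles."""
    if spindles <= 0:
        raise ValueError("need at least one active spindle")
    return avg_disk_queue_length / spindles


def looks_saturated(avg_disk_queue_length, spindles, per_spindle_limit=2.0):
    """True when the per-spindle queue exceeds the assumed rule-of-thumb limit."""
    return queue_per_spindle(avg_disk_queue_length, spindles) > per_spindle_limit


if __name__ == "__main__":
    # The two cases from the answer: the same counter value of 200,
    # read against arrays of 100 and 10 spindles.
    for spindles in (100, 10):
        per_spindle = queue_per_spindle(200, spindles)
        print(spindles, "spindles:", per_spindle, "per spindle,",
              "saturated" if looks_saturated(200, spindles) else "ok")
```

With the same reported value of 200, the 100-spindle array works out to 2 requests per spindle and the 10-spindle array to 20, which is why the raw counter means little until you divide it by the spindle count.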
