Sunday, March 6, 2011

Envelope Algorithm Optimization -- Best Place to Put a Circle

I have to solve the following problem in an optimal way.

Input data is:

  • N points in a plane given as a (x, y) pair of integer coordinates
  • M points in the same plane given as a (x, y) pair of integer coordinates representing the center of a circle. All this circles have (0, 0) on their edge.

I need to find a way of isolating a number of circles that have the property than for any "good" circle no point from the first N points lays in the chosen circle or at the edge of the chosen circle.

The number of points and circles is in the order of 100,000. The obvious solution of checking every circle with every point has the complexity O(N * M) and with 100,000 circles and 100,000 points it takes about 15 seconds on a Core 2 Duo with 64 bit SSE3 single precision code. The reference implementation I compete against takes only about 0.1 seconds with the same data. I know the reference implementation is O(Nlog N + Mlog M).

I have thought of optimizing my algorithm in the following way. Make 2 copies of the point data and sort the copies in respect to x coordinate, respectively the y coordinate. Then use only points that lay in the square defined by [(xc - r, yc - r); (xc + r, yc + r)], where (xc, yc) is the center of the "current" circle, with radius r. I can find points in that interval using binary search because now I work with sorted data. The complexity of this approach should be O(Nlog N + Mlog^2 N), and indeed it is faster but still significantly slower than the reference.

I know more or less how the reference implementation works, but there are some steps that I don't understand. I will try to explain what I know so far:

The radius of a circle with coordinates (Xc, Yc) is:

  • Rc = sqrt(Xc * Xc + Yc * Yc) (1)

That's because (0, 0) is on the edge of a circle.

For a point P(x, y) to be outside of a circle, the following inegality must be true:

  • sqrt((Xc - x)^2 + (Yc - y)^2) > Rc (2)

Now if we substitute Rc from (1) into (2), then square the inegality after we make some simple calculations we get:

  • Yc < 1/2y * (x^2 + y^2) - Xc * x/y (3.1) for y > 0
  • Yc > 1/2y * (x^2 + y^2) - Xc * x/y (3.2) for y < 0

(3.1) and (3.2) must be true for a given circle C(Xc, Yc) for any (x, y) chosen from the input data.

For simplicity, let's make a few notations:

  • A(x, y) = 1/2y * (x^2 + y^2) (4.1)
  • B(x, y) = -x/y (4.2)
  • E(Xc) = 1/2y * (x^2 + y^2) - Xc * x/y = A(x, y) + Xc * B(x, y) (4.3)

We can see that for a given circle C(Xc, Yc), we can write (3) as:

  • Yc < MIN(E(Xc)) (5.1) for all points with y > 0
  • Yc > MAX(E(Xc)) (5.2) for all points with y < 0

We can see that E(Xc) is a linear function in respect to Xc with 2 paramaters -- A(x, y) and B(x, y). That means that basically E(Xc) represents in the Euclidean space a family of lines with 2 parameters.

Now here comes the part I don't understand. They say that because of the property stated in the above paragraph we can calculate MIN() and MAX() in O(1) amortized time instead of O(N) time using an Envelope algorithm. I don't know how the Envelope algorithm might work.

Any hints on how to implement the Envelope algorithm?

Thanks in advance!


Edit:

The question is not about what an envelope in the mathematical sense is -- I already know that. The question is how to determine the envelope in better time then O(n), apparently it could be done in amortized O(1).

I have the family of functions I need to calculate the envelope, and I have an array of all posible parameters. How do I solve the maximization problem in an optimal way?

Thanks again!

From stackoverflow
  • Here is the Wikipedia entry on envelopes. Here is a tutorial about the envelope theorem in optimization.

    Terminus : Thanks. I know about envelopes in the mathematical sense, and I know what it should mean in this case, but although I have read the 2nd link you posted I still can't think of a way of determining the envelope in O(1) or better then O(n).
  • I don't have the mathematical background, but I would approach this problem in three steps:

    • Throw away most points in N. This is the tricky part. Every pair of points "shadows" an area "behind" it when seen from the origin. This area is delimited by two beams starting from the points outward, pointing to the origin, and a circle intersection between the points. The calculation of this might be much simpler in polar coordinates. Start off with a random pair, then look at one new point at a time: if it is shadowed, throw it away; if not, find out if it shadows any point already in your set, then rebuild the enveloping set of curves. The tests for rebuilding the enveloping curve part should take almost constant time because the shadow set is unlikely to grow beyond a certain small number. The worst case for this seems to be O(NlogN). I cannot imagine any solution can be better than O(N) because you have to look at each point in any case.

    • Throw away most points in M. This is rather easy: if the point from M is farther from the origin than half the distance to the farthest point from the enveloping set, then it can be thrown out. This takes O(M)

    • Filter the remaining points in M through the actual check. It depends on the distribution of N and M how long this takes, but I think it is almost O(1) if both numbers are big and the distributions similar.

    Overall, it seems to be possible in O(N log(N) + M). No guarantees, though ;)

  • Consider some other aspects of your computations.

    For instance, you apparently compare a lot of distances. Each takes a call to SQRT. Why not compare the "squares of the distances" instead. SQRT is a costly computation.

    • Construct an R-Tree of all the points in the first set.
    • For each point in the second set, compute the bounding box of its circle and look up all the points in the R-Tree that fall inside that bounding box (O(n log n) with respect to number of points returned).
    • Check the distance between each returned point and the point currently under consideration; discard any that lie within the bounding box but outside the circle.
  • I think you can do it with Voronoi diagram:

    • Make a Voronoi Diagram of the {N points} union {[0,0]}
    • The centres of circles not touching the N points are exactly these lying inside the Voronoi cell of the point [0,0], which is a convex polygon
    • Filter the M points, one test should take O(log C)=O(log N) [where C is the number of vertices in the cell [0,0]

    Overall complexity should be O(N log N+M log N)

0 comments:

Post a Comment