Sunday, March 6, 2011

JavaScript / CSS / image reference paths

Hi,

I was wondering if anyone has a preference for how to reference image, CSS, or JavaScript files in their sites?

The reason I ask is that if a client wants to host a site we've written under a virtual directory, the site usually has to have its file references changed, right down to the url(...) image paths in CSS files, because we usually develop the site as if it were a root application. Sometimes the client doesn't know where they're going to host it until the day before the drop date!

Using ASP.NET we can get around the problem in our forms by referencing items with '~' on runat="server" components, or with the Url.Content method when using ASP.NET MVC...
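
For example, a quick sketch of the MVC approach (the controller name and CSS path here are just illustrative):

using System.Web.Mvc;

public class HomeController : Controller
{
    public ActionResult Index()
    {
        // Url.Content turns an application-relative path ("~/...") into the correct
        // absolute path whether the app runs at the site root or under a virtual directory.
        ViewData["CssPath"] = Url.Content("~/css/site.css");
        // at the root:             "/css/site.css"
        // under /ClientVirtualDir: "/ClientVirtualDir/css/site.css"
        return View();
    }
}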

So, are there any good solutions for keeping the file references generic, or are we faced with having to change the file references every time?

From stackoverflow
  • Images (e.g. in background-image: url(...)) in CSS are always referenced relative to the CSS file.

    Example:

    /img/a.gif
    /css/c.css
    

    To reference a.gif from the CSS file, you always reference it as ../img/a.gif, regardless of where the page is located or how its URL is rewritten.

    Kieron : Excellent, I didn't realise they were always relative to the CSS file...thanks!
  • Not quite sure I understand the problem: are your references not relative then? I have no such problem with virtual directories and relative paths...

  • You can still use relative addresses with "up" navigation:

    E.g. In /styles/main.css for /images/bg.png:

    background-image: url('../images/bg.png');
    

    Or in /path/to/this/page.html for /index.html

    <a href="../../../index.html">Home</a>
    

    It may not be the most appealing, but it should work.

  • I like to keep all of my style images in sub-directories of the style sheet location, for example:

    Site->Style->Common.css
    Site->Style->Images->Background.png

    This way you avoid the issues with confusing "up" directories in the CSS, and you can move the CSS easily, as it is all self-contained in a single directory.

    Frankly, mixing style images and content images together can get a bit messy in a large site with lots of media.

Checking if the file is already in use before opening (network drive, C#)

Does there exist any way in .Net to check before opening if a file (on a local network drive) is already in use?

From stackoverflow
  • You should try to access it, and if that fails, you either don't have the required permissions (which you can check with GetAccessControl) or it's locked by another process.

    But I don't think there is any reliable measure to distinguish a lock and a permission failure (since you might be able to read the file, but not able to check the permissions). Even Windows' error message says you either don't have permission or it's being used by another process.

    You can use the WMI CIM_DataFile class to query InUseCount for a specified data file (a rough sketch follows the comments below).

    If you're looking for a programmatic equivalent of the lsof utility on Linux, to find all files opened by a given local process, you could try using the Win32_Process WMI class through the System.Management namespace. You could issue a WMI query to look up the file name among all open files being used by all local processes to see if it's there or not. Alternatively, you could P/Invoke and use the NtQuerySystemInformation API directly to accomplish the same task.

    liggett78 : What if the file is already open with FileShare.ReadWrite? Then you can still open it without any error on your side, but it is already in use
    Nelson Reis : You could try to rename the file... if it was already in use, it would throw an exception. But that wouldn't be a good solution.
    Mehrdad Afshari : @Nelson, If you have just read permission for the file, can you do it?! Even if it's not locked, an exception will be thrown.
    Mehrdad Afshari : @liggett78, if you try to acquire an exclusive lock (open it exclusively), it'll fail even if it's already open with shared lock.
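
    For the WMI CIM_DataFile idea above, a minimal sketch might look like this (the file path is purely illustrative, and InUseCount may not be populated on every system):

    using System;
    using System.Management;   // add a reference to System.Management.dll

    class InUseCheck
    {
        static void Main()
        {
            // Backslashes have to be doubled inside WQL string literals.
            string query = @"SELECT InUseCount FROM CIM_DataFile WHERE Name = 'C:\\temp\\report.doc'";
            using (var searcher = new ManagementObjectSearcher(query))
            {
                foreach (ManagementObject file in searcher.Get())
                    Console.WriteLine("InUseCount: {0}", file["InUseCount"]);
            }
        }
    }
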
  • Use System.IO.File.OpenWrite(path). Surround it in a try/catch block, and if it is already open for writing somewhere else, you will get a System.UnauthorizedAccessException.

    liggett78 : You can open a file for write successfully if another process specified that the file be shared-for-write (FileShare.Write).
  • The following syntax will help:

    FileStream s2 = new FileStream(name, FileMode.Open, FileAccess.Read, FileShare.Read);

  • This will do. FileShare.None as mentioned in MSDN :

    None : Declines sharing of the current file. Any request to open the file (by this process or another process) will fail until the file is closed.

    File.Open(name, FileMode.Open, FileAccess.Read, FileShare.None);
    

    EDIT : Remember to wrap in a Try/Catch block, and using FileShare.None actually means you want to open the file exclusively.

  • bool CanReadAndWrite(string path)
    {
        var perm = new System.Security.Permissions.FileIOPermission(
             System.Security.Permissions.FileIOPermissionAccess.Write |
             System.Security.Permissions.FileIOPermissionAccess.Read,
             path);
        try
        {
             perm.Demand();
             return true;
        }
        catch 
        {
             return false;
        }
    }
    

    Checks to see if you can read and write to a file.

    Ash : Just remember that the file can still be locked after calling this function, but before you try to open it. So there is currently no 100% solution except for trying to Open the file and then handling the exception, if any.
  • In general, you should avoid relying on file accessibility checks. The reason is that the file might be accessible when you check, and that can change just a couple of cycles before you actually access it.

    Instead, just try to open it and see if you have any errors.
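
    A minimal sketch of that "just try it" approach (the path, access and sharing mode are illustrative; adjust them to whatever your application actually needs):

    using System;
    using System.IO;

    static class FileProbe
    {
        // Returns true if the file could be opened exclusively right now.
        // Note the answer is already stale by the time the caller sees it.
        public static bool TryOpenExclusive(string path)
        {
            try
            {
                // FileShare.None requests exclusive access, so this throws if anyone else has the file open.
                using (new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.None))
                {
                    return true;
                }
            }
            catch (IOException)                    // locked by another process (or another I/O problem)
            {
                return false;
            }
            catch (UnauthorizedAccessException)    // a permissions problem rather than a lock
            {
                return false;
            }
        }
    }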

Using a strongly typed ActionLink when the action method doesn't take a primitive type

Hi,

Does anyone know how I could go about doing something like :

Html.ActionLink(c => c.SomeAction(new MessageObject { Id = 1 } ))

This should output a link with the url of "/Controller/SomeAction/1", pointing at an ActionMethod along the lines of:

public class SomeController : Controller
{
  public ActionResult SomeMethod(MessageObject message)
  {
      // do something with the message
      return View();
  }
}

I've written something similar for generating forms, but that doesn't need to include the Id value on the end of the Url. Basically I want to do some sort of reverse lookup in my routes, but I can't find any documentation on how I might go about doing that. I have a ModelBinder setup that is able to build a MessageObject from GET and POST parameters, but I'm not sure how I can reverse the process.

Thanks, Matt

From stackoverflow
  • I'm not sure exactly what you are trying to do since your example URL doesn't match that required for the signature of your method. Typically if you use a method that requires a complex object, you pass the values to construct that object in the query string or as form parameters and the ModelBinder constructs the object from the data supplied in the parameters. If you want to pass just the id, then the method typically doesn't take any parameters, you extract the id from the RouteData, and look the object up in persistent storage (or in a cache). If you want to do the latter, your method should look like:

    public ActionResult SomeMethod()
    {
        object idValue;
        if (RouteData.Values.TryGetValue("id", out idValue))
        {
           int messageObjectID = Convert.ToInt32(idValue);
           ... get the object with the correct id and process it...
        }
        else
        {
           ... error processing because the id was not available...
        }
        return View();
    }
    
  • I'm not sure exactly what you are trying to do since your example URL doesn't match that required for the signature of your method. Typically if you use a method that requires a complex object, you pass the values to construct that object in the query string or as form parameters and the ModelBinder constructs the object from the data supplied in the parameters.

    LOL that's exactly what I'm trying to do :) That url works fine and maps to that method, the model binder is able to turn that URL into a route that maps to that action and works fine. (That route maps the "1" to a RouteValue named Id, which the model binder then assigns to the Id field of the message object).

    What I'm trying to do is go the other way, take a method call and turn it into a route.

  • In the end I ended up wrapping the following code in an HtmlHelper extension method. This would allow me to use something like Html.ActionLink(c => c.SomeAction(new MessageObject { Id = 1 } ))

    and have all the properties of the MessageObject created as RouteValues.

    public static RouteValueDictionary GetRouteValuesFromExpression<TController>(Expression<Action<TController>> action)
        where TController : Controller
    {
        Guard.Against<ArgumentNullException>(action == null, @"Action passed to GetRouteValuesFromExpression cannot be null.");
        MethodCallExpression methodCall = action.Body as MethodCallExpression;
        Guard.Against<InvalidOperationException>(methodCall == null, @"Action passed to GetRouteValuesFromExpression must be a method call");
        string controllerName = typeof(TController).Name;
        Guard.Against<InvalidOperationException>(!controllerName.EndsWith("Controller"), @"Controller passed to GetRouteValuesFromExpression is incorrect");

        RouteValueDictionary rvd = new RouteValueDictionary();
        rvd.Add("Controller", controllerName.Substring(0, controllerName.Length - "Controller".Length));
        rvd.Add("Action", methodCall.Method.Name);

        AddParameterValuesFromExpressionToDictionary(rvd, methodCall);
        return rvd;
    }

    /// <summary>
    /// Adds a route value for each parameter in the passed in expression.  If the parameter is primitive it just uses its name and value;
    /// if not, it creates a route value for each property on the object with the property's name and value.
    /// </summary>
    /// <param name="routeValues"></param>
    /// <param name="methodCall"></param>
    private static void AddParameterValuesFromExpressionToDictionary(RouteValueDictionary routeValues, MethodCallExpression methodCall)
    {
        ParameterInfo[] parameters = methodCall.Method.GetParameters();
        methodCall.Arguments.Each(argument =>
        {
            int index = methodCall.Arguments.IndexOf(argument);

            ConstantExpression constExpression = argument as ConstantExpression;
            if (constExpression != null)
            {
                object value = constExpression.Value;
                routeValues.Add(parameters[index].Name, value);
            }
            else
            {
                object actualArgument = argument;
                MemberInitExpression expression = argument as MemberInitExpression;
                if (expression != null)
                {
                    actualArgument = Expression.Lambda(argument).Compile().DynamicInvoke();
                }

                // create a route value for each property on the object
                foreach (PropertyDescriptor descriptor in TypeDescriptor.GetProperties(actualArgument))
                {
                    object obj2 = descriptor.GetValue(actualArgument);
                    routeValues.Add(descriptor.Name, obj2);
                }
            }
        });
    }
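
    For reference, a rough sketch of the wrapping extension method mentioned at the top of this answer (the names are illustrative; it assumes the GetRouteValuesFromExpression helper above lives in the same static class, and ASP.NET MVC 2 or later where RouteLink returns an MvcHtmlString):

    using System;
    using System.Linq.Expressions;
    using System.Web.Mvc;
    using System.Web.Mvc.Html;
    using System.Web.Routing;

    public static class StronglyTypedLinkExtensions
    {
        // Hypothetical wrapper: build the route values from the expression and
        // hand them to the standard RouteLink helper.
        public static MvcHtmlString ActionLink<TController>(this HtmlHelper html,
            Expression<Action<TController>> action, string linkText)
            where TController : Controller
        {
            RouteValueDictionary routeValues = GetRouteValuesFromExpression(action);
            return html.RouteLink(linkText, routeValues);
        }
    }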
    
  • If you don't mind adding a method alongside each action in your controller for which you want to generate URLs, you can proceed as follows. This has some downsides compared to your lambda expression approach, but some upsides too.

    Implementation:-

    Add this to your Controller for EACH action method for which you want strongly-typed url generation ...

    // This const is only needed if the route isn't already mapped 
    // by some more general purpose route (e.g. {controller}/{action}/{message}
    public const string SomeMethodUrl = "/Home/SomeMethod/{message}";
    
    // This method generates route values that match the SomeMethod method signature
    // You can add default values here too
    public static object SomeMethodRouteValues(MessageObject messageObject)
    {
       return new { controller = "Home", action = "SomeMethod", 
                    message = messageObject };
    } 
    

    You can use these in your route mapping code ...

    Routes.MapRoute ("SomeMethod", 
                      HomeController.SomeMethodUrl,
                      HomeController.SomeMethodRouteValues(null));
    

    And you can use them EVERYWHERE you need to generate a link to that action:- e.g.

    <%=Url.RouteUrl(HomeController.SomeMethodRouteValues(new MessageObject())) %>
    

    If you do it this way ...

    1) You have just one place in your code where the parameters to any action are defined

    2) There is just one way that those parameters get converted to routes because Html.RouteLink and Url.RouteUrl can both take HomeController.SomeMethodRouteValues(...) as a parameter.

    3) It's easy to set defaults for any optional route values.

    4) It's easy to refactor your code without breaking any urls. Suppose you need to add a parameter to SomeMethod. All you do is change both SomeMethodUrl and SomeMethodRouteValues() to match the new parameter list and then you go fix all the broken references whether in code or in Views. Try doing that with new {action="SomeMethod", ...} scattered all over your code.

    5) You get Intellisense support so you can SEE what parameters are needed to construct a link or url to any action. As far as being 'strongly-typed', this approach seems better than using lambda expressions where there is no compile-time or design-time error checking that your link generation parameters are valid.

    The downside is that you still have to keep these methods in sync with the actual action method (but they can be next to each other in the code, making that easy to see). Purists will no doubt object to this approach, but practically speaking it finds and fixes bugs that would otherwise require testing to catch, and it helps replace the strongly typed Page methods we used to have in our WebForms projects.

iPhone Apps for fun and profit?

I have been considering developing a couple iPhone/iTouch apps for fun and profit. I am curious if there are any iPhone developers out there that would like to give some feedback.

I consider myself a good developer. I have 20+ years experience, but I find it difficult to get exposure for apps I have written. iTunes seems like a great place to release apps and get a ton of exposure.

What advice can anyone give me on how to get started with this?

Update: Awesome feedback. Thanks guys.

From stackoverflow
  • From a technical perspective, start at http://developer.apple.com/iphone/, where Apple provides a huge amount of free info, including documentation, sample code, etc. If you haven't done Objective-C programming, consider Aaron Hillegass's book "Cocoa Programming for Mac OS X". Although it's written for Macs, most of it applies.

    You will need to get a Mac, if you don't have one. That might seem obvious, but many people don't seem to realize it.

    On non-technical issues, don't make the mistake of thinking that getting into the iPhone app store is an easy road to "a ton of exposure". The app store is already crowded, and just being listed there doesn't do much to promote an application. It's a convenient way to process orders but it's also extremely easy to be lost in the crowd if you don't take the time to market the software in other ways. Apple doesn't market your app for you, they provide a place for people to buy it. If I look through the iPhone apps I actually use on a regular basis, all of them are apps that I've heard about through other sources. I've tried just going to the store and searching but I haven't found it to be an effective approach.

    It's also gotten quite competitive, to the point where apps not in the top listings often end up in a race to the bottom on pricing, hoping that dropping the price again and again will bring in new customers (see "How to Price Your iPhone App out of Existence"). There have been numerous stories about the fortunes made by a few iPhone developers, but they're the exception rather than the rule.

    I focus on this aspect because it seems there are a lot of people who do think that getting into the app store is the end of their marketing efforts. In the first couple of months after the store opened this may have been enough, as apps could ride a wave of excitement that apps were finally available. That's mostly petered out now.

  • Simply putting your apps in the App Store is not enough. There are thousands of apps there vying for attention. Users are reluctant to spend more than 99 cents on any app, so you'll probably need a lot of volume if you want to make a living at it.

    You really need to be prepared to do some marketing outside the App Store if you want to make money without relying on pure luck.

    Newsweek has an article about some of the lucky iPhone developers: http://www.newsweek.com/id/174266

    FWIW, my 99-cent iPhone app has sold about 200 copies in the past three months. I'm not going to be quitting my day job any time soon.

  • On the technical side, I have found the pragmatic programmer's book http://pragprog.com/titles/amiphd/iphone-sdk-development, together with the sample code at the apple site, very useful.

    I managed to get a working app together with just this, despite no prior Mac, Cocoa or objective-C experience. I did come from a pretty good background in Java and C/C++ though.

    Without a decent book and some good examples to start from and adapt, it is a struggle though. Even with this, be prepared to spend a few days tearing your hair out over some bug. Of course, SO is another good resource to ask questions if you get really stuck.

    As other answers have pointed out, you'll need to sign up with the apple dev program, and get a mac. A mac mini or an entry level macbook is fine, but make sure you get at least 2G RAM. And, it may be pointing out the obvious, but having an iPhone and/or iPod touch to develop on helps. :-)

    You'll also need to figure out the code signing that the iPhone requires, both for app store submission and development itself...but I was able to get this going just by following the apple dev site instructions. You really need to RTFM on this one, because it is far from intuitive and a bit flaky.

  • I can make some development philosophy suggestions, based on my experience. For the technical issues of getting started, you can search around here or in Apple's excellent documentation.

    Make your first application free. I decided to do that, and it's worked out very well. A free project is a great way to become acclimated to the frameworks, tools, submission process, etc. Good free applications can also get you a lot of exposure and build goodwill. For example, my application has been downloaded over 340,000 times worldwide. Hopefully, I've made a favorable impression on many of those users, who may be more likely to purchase something I write in the future. Even better, open source your program and allow others to learn from your work. You'll get a tremendous number of hits to your blog or site just from other developers looking for code examples.

    You asked for how to have fun, well I can tell you that I've greatly enjoyed hearing from educators and scientists around the globe. By releasing the code for the application, I've had a chance to talk to a number of great Mac / iPhone developers and learn from them. I've had more fun doing Cocoa development on the Mac and iPhone than on any other platform I've worked on.

    The next thing I'd encourage is for you to challenge yourself in the development of your application. If you can write it in under two weeks, don't bother. Almost all of the low-hanging fruit has been plucked in the App Store, with decent free versions of most ideas. A talented developer can differentiate himself from the crowd by writing an application that's difficult to do a good free implementation of. In particular, I'd love to see more applications aimed at scientists, engineers, and even students.

    If you're worried about the limited exposure that a higher-priced, quality application gets now, I don't think that will be as much of an issue in the future. Significant improvements have been made to the App Store since launch, and I believe even bigger ones are to come that will make it much easier for users to pick out the quality applications from the garbage. Hopefully, the market will sort out in the way that it has on the Mac, where the polished, easy-to-use applications become the most popular.

Can I identify matched terms when searching with sphinx?

I am using sphinx to do full text search on a mysql database through thinking sphinx.

I would like to highlight the matched terms in the results I show to the user.

Sphinx is smart enough that searching for 'botulism' will match "i like to inject botulinum into my eyes".

How can I get it to tell me that 'botulinum' matches 'botulism'?

From stackoverflow
  • Hello,

    First, I'm heavily using Sphinx for one of my projects, but I'm not using Thinking Sphinx since the config file we use is quite complex; I'm using a customized act_as_sphinx plugin.

    To answer your question from a pure Sphinx point of view:

    Hope this helps and I highly encourage you to look at the sphinx documentation to fully use this very efficient indexer

    Manfred

Best Practices - Design before coding.

I'm curious how you people think (I mean, what your way of thinking is) about the design and architecture of your libraries, systems, frameworks, etc. before you start coding.

I recently find myself feeling pain at what I've done, and practically every time I want to start everything from scratch...

I do design beforehand, sketching some schemes on paper and imagining how it will work, but maybe I'm doing it the wrong way?

For example, how do you decide what interfaces you will need, or how everything will be connected in the best way?

(I had a problem some days ago: my friend asked me for a library I'd written some time ago, and instead of giving him just one file, I had to give him about 3-4 files, because they're connected in some way... but not in the right way, I think :) so it was a mistake in my design.)

From stackoverflow
  • For an object oriented approach, I find it is generally a good idea to step away from the user interface for a bit, and focus on what entities (objects) will exist in the system, and what properties and actions are appropriate.

    Drawing on a whiteboard or large piece of paper, using different colors to identify various characteristics is also a good idea. Post-it notes are also a nice way to visualise your model.

    Even if I spend a lot of time thinking through a design very carefully, I ALWAYS end up changing it as I go. So it's good to keep in mind that your design WILL change when deciding how much effort to put into documenting it.

  • Although not an entire answer to your question, I sometimes find the easiest way to get into gear for a project as it were is to find a small piece of isolated functionality to get started on and work on that whilst also thinking about the bigger picture.

    That way you don't get too hung up on every minute detail and are being productive and giving yourself breathing space to see what you need to do with clarity.

    Like I say, not an answer.

  • Open question; there will be nearly as many answers as posters. 1) Have a look at the many software engineering books. Some argue that with good design the rest is a snap; that's a straight lie. 2) See how intrusive diverse frameworks are; you pretty much have to use them the intended way, otherwise you are better off implementing the stuff again for your own needs. 3) A design needs constant change, like any piece of writing. You see the sentences but feel they do not fully fit, so you rewrite them. So every design must take into account that things turn out differently than they seemed when the design was written.

    Regards Friedrich

    Draemon : Could you rephrase this to make it clearer?
  • I mostly start with a box of cards, and write down the concepts of the domain I want to model. Sometimes I use a mindmap for that.

    Take a look at the stakeholders, and what they want to accomplish. It is important to start there, because it allows you to prioritize correctly (i.e. not on the most interesting technical part, but on the part with the most business value)

    The thinking about design is mostly written down in tests. Written before the code to implement it. Start with the stakeholders end-goals, work from there backwards to the beginning (Time-inversion). That ensures that you concentrate on what, and less on how. Interaction between objects is more important to get right than object attributes.

    The other part is mostly on the white board. A compact digital camera is standard equipment these days.

    Doing a framework before you have three implementations that need it is a bad practice. If you have to, be prepared for significant interface (and implementation) change.

    See:

  • The problem is that when you begin a new project, it tends to be new (isn't that obvious). And in general one doesn't understand new stuff off the bat, unless you're a very specialized consultant doing the exact same thing over and over again, which kinda sounds freaky...

    Because you're inevitably new to the system you're making, your design and code aren't going to be perfect the first time around, so you iterate on the design and refactor the code until it's ready.

  • I usually dedicate about 2 - 4 hours to do the design of my app and then write it down in a notebook.

    Then I start coding, and each time things get complicated (because something is not in the right place) I refactor: move one class to another package, extract a method, etc. When the "design" feels right, then I can move on.

    This way I avoid the "analysis paralysis" that used to happen to me quite a lot. Many times when I over-designed upfront, I ended up not using some of the artifacts.

    Thus I took this more "agile" approach.

    Sketch up the design very quickly and "refine" it (by refactoring) as you go.

    Of course this is possible for small apps ( 2 - 4 weeks long )

    I would suggest you read the article "Is Design Dead?" by Martin Fowler. It is a little lengthy but worth reading.

    EDIT

    Additional link (slightly off-topic): "How to Design a Good API and Why It Matters" by Joshua Bloch. It talks about the relevance of design when you have an audience for your API. Summary: start as private as possible and grow from there.

  • It has to be a balance.

    If you try to design everything up front with lots of pictures on a white board, then you'll probably miss some details when it actually comes round to coding it up.

    If you start hacking away at some small part of the system, you'll probably lose sight of the "big picture" and end up with a poor design.

    As you're looking at libraries, you want them to be re-usable as much as possible. So for the initial design, think of the most general cases you can among the intended uses of your library you already know about -- don't worry too much at this stage about hypothetical future uses that might well never happen. Code up that design with unit tests and refactor as you learn more and find problems with the design.

    Try to put as little specific knowledge about the library's users into the library itself. Then, with luck you will end up with something re-usable and won't automatically want to start from scratch next time.

  • I follow a very loose version of Rational Unified Process.

    First start off with a vision document which states clearly what you're trying to do, who you are doing it for, and maybe some key details about the proposed methods/algorithms. It doesn't need to be fancy and could even be a one-liner for a simple system ("A system for predicting winning lottery numbers, for personal use only, based on the latest research from Hogwarts school."); for a more complex system it could be about five pages, but no more. This is not the design but more like a wish list.

    Then do some use cases/stories. These should be plain natural language text descriptions of all interactions with the external world. Minimalism is the key here; the purpose of use cases is to identify all the required functionality and only the required functionality. Also I find this is where most of the creativity occurs.

    A use case might start off as:

      User presses the "magic" button.
      System displays next weeks winning number.
      User writes down number.
      User buys lottery ticket.
      Lottery draws winning number.
      User claims money
      lottery pays up.
    

    After much reworking it ends up as:

      User presses "magic" button
      System selects next weeks numbers.
      System logs on to lottery system.
      System enters winning numbers.
      Lottery selects winning numbers.
      Lottery transfers winnings to users account.
      User spends money.
    

    Once you have done your use cases you can fire up your development environment and the various classes and interactions will just fall into place.

    Stephan Eggermont : These are not use cases: what stakeholder goal is (partly) accomplished? -1
    James Anderson : Space is a bit limited for doing the full use cases; "stories" might have been a better term. The main point was to illustrate how drastically a design can change when specifying the use cases and how little these changes cost.
  • When doing any major project like game development, I tend to carefully structure my code and make non-functional dummy functions and classes right from the start. From the main code file I split everything into classes within other modules and sub-modules, if necessary. For example, my main code file would consist of nothing more than imports and calls to class initializers for my main modules. Here's an example of the init.py for a game I'm making in Python right now:

    from commonheaders import *

    if arguements.debug:
     debug.init()
    
    try:
     graphics.init()
     sound.init()
     physics.init()
     input.init()
     loop.start("scene1")
    except Exception, e:
     print "Error!"
     print e
    

    It's very minimal and easy to understand, but what I like most about it is that it gives very defined separation between the various main code aspects. If I get frustrated with something in the graphics class I can just go work in the sound class or something else to get other stuff done and keep graphics out of my mind for a while until the clouds pass. All the class init() calls have optional arguments, such as width/height for graphics, but I usually leave them out of there and either set them in the commonheaders.py file or through command line switches.

    I try not to let any of my modules get larger than about 100 lines before thinking about what I can cut out and place into its own module. It makes everything much cleaner and it's much easier to find what you are looking for when it's split up like that. I've worked with various open source projects before and seen single modules in the tens of thousands of lines...there was so much redundant code that could easily have been offloaded to helper functions. I use comments where necessary, but I try to keep my code structured in such a way that you don't really even need to read the comments.

    I really like having the non-functional framework built from the start as it makes it very easy to see what still needs to be done and what can be improved in terms of structure. It's better than rushing off to write thousands of lines of codes only to figure out that there was a more elegant and functional way of doing it that would require a total restructuring of the code.

  • I usually do enough analysis of the problem domain on paper/white board to get a good enough understanding of the problem domain to start writing code. I rarely draw implementation or class diagrams on paper. A key technique I've found to achieve better design is to not get too attached to the code you write. If I don't like it, I delete, rename, move and shuffle it around until it expresses a good enough solution to what I'm trying to solve. Sounds easy? Not at all! But with good "coding" tools, actually writing the code is not the major effort. Write some, refactor, delete, write again...

    Good design almost never starts out good. It evolves into good. Accepting this makes it easier to work in small steps without getting frustrated about why the design isn't "perfect". In order for this process to work, you have to possess good design skills, though. The point being, even excellent designers don't get it right the first time.

    Many times, I thought I understood the problem domain when I started, but I didn't. I then go back to the white board, talk to someone or read up on the problem domain if I realize I don't understand it well enough. I then go back to the code.

    It is a very iterative processes.

    An interesting question to ask when dealing with how programmers think, is how they developed their way of thinking. Personally, my way of thinking has evolved over the years, but a few events have had profound influence on the way I develop software. The most important among them have been to design software with people who are expert designers. Nothing has influenced me more than spending iterations with great designers. Another event that has, and still do, affect the way I think is going back and look at software I wrote some time back.

  • I prefer a bottom-up design. If there is not an application framework in place, I tend to assemble or build it first. A good application framework is portable and application-agnostic, largely addressing cross-cutting concerns. It will normally contain things such as logging, exception handling, validation helpers, extension methods, etc.

    From there, I like to go with a largely DDD (Domain Driven Design) approach. Interviewing business users, if needed, to construct the domain model. Once the domain model, and subsequent DAL (Data Access Layer) is taken care of, I move to the business logic layer (BLL).

    I personally try to stay as removed from front-end implementation as possible, since it is definitely not a strong point of mine. Definitely one reason that I love unit testing is that I can focus on core functionality, and be able to test that functionality without jumping into premature interface decisions.

  • Maybe you can also recommend some books? Articles?

    Nelson Reis : That would be great!!

Envelope Algorithm Optimization -- Best Place to Put a Circle

I have to solve the following problem in an optimal way.

Input data is:

  • N points in a plane given as (x, y) pairs of integer coordinates
  • M points in the same plane given as (x, y) pairs of integer coordinates, representing the centers of circles. All these circles have (0, 0) on their edge.

I need to find a way of isolating the "good" circles: those with the property that no point from the first N points lies inside the chosen circle or on its edge.

The number of points and circles is in the order of 100,000. The obvious solution of checking every circle with every point has the complexity O(N * M) and with 100,000 circles and 100,000 points it takes about 15 seconds on a Core 2 Duo with 64 bit SSE3 single precision code. The reference implementation I compete against takes only about 0.1 seconds with the same data. I know the reference implementation is O(Nlog N + Mlog M).

I have thought of optimizing my algorithm in the following way. Make 2 copies of the point data and sort one copy by the x coordinate and the other by the y coordinate. Then use only points that lie in the square defined by [(xc - r, yc - r); (xc + r, yc + r)], where (xc, yc) is the center of the "current" circle with radius r. I can find the points in that interval using binary search because now I work with sorted data. The complexity of this approach should be O(N log N + M log^2 N), and indeed it is faster, but still significantly slower than the reference.

I know more or less how the reference implementation works, but there are some steps that I don't understand. I will try to explain what I know so far:

The radius of a circle with coordinates (Xc, Yc) is:

  • Rc = sqrt(Xc * Xc + Yc * Yc) (1)

That's because (0, 0) is on the edge of a circle.

For a point P(x, y) to be outside of a circle, the following inequality must be true:

  • sqrt((Xc - x)^2 + (Yc - y)^2) > Rc (2)

Now if we substitute Rc from (1) into (2), square the inequality, and do some simple calculations, we get:

  • Yc < (x^2 + y^2) / (2y) - Xc * x/y (3.1) for y > 0
  • Yc > (x^2 + y^2) / (2y) - Xc * x/y (3.2) for y < 0

(3.1) and (3.2) must be true for a given circle C(Xc, Yc) for any (x, y) chosen from the input data.

For simplicity, let's make a few notations:

  • A(x, y) = (x^2 + y^2) / (2y) (4.1)
  • B(x, y) = -x/y (4.2)
  • E(Xc) = (x^2 + y^2) / (2y) - Xc * x/y = A(x, y) + Xc * B(x, y) (4.3)

We can see that for a given circle C(Xc, Yc), we can write (3) as:

  • Yc < MIN(E(Xc)) (5.1) for all points with y > 0
  • Yc > MAX(E(Xc)) (5.2) for all points with y < 0

We can see that E(Xc) is a linear function with respect to Xc with 2 parameters -- A(x, y) and B(x, y). That means that E(Xc) basically represents a family of lines in the Euclidean plane with 2 parameters.

Now here comes the part I don't understand. They say that because of the property stated in the above paragraph we can calculate MIN() and MAX() in O(1) amortized time instead of O(N) time using an Envelope algorithm. I don't know how the Envelope algorithm might work.

Any hints on how to implement the Envelope algorithm?

Thanks in advance!


Edit:

The question is not about what an envelope in the mathematical sense is -- I already know that. The question is how to determine the envelope in better time than O(n); apparently it can be done in amortized O(1).

I have the family of functions I need to calculate the envelope for, and I have an array of all possible parameters. How do I solve the maximization problem in an optimal way?

Thanks again!

From stackoverflow
  • Here is the Wikipedia entry on envelopes. Here is a tutorial about the envelope theorem in optimization.

    Terminus : Thanks. I know about envelopes in the mathematical sense, and I know what it should mean in this case, but although I have read the 2nd link you posted I still can't think of a way of determining the envelope in O(1) or better than O(n).
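
    For what it's worth, here is a rough sketch (not the reference implementation) of how the lower envelope of the lines E(Xc) = A + B * Xc from the question can be built and queried. Building it is O(N log N); each MIN query is then O(log N) by binary search, and if the circle centres are processed in order of increasing Xc a moving pointer gives the amortized O(1) the question mentions. The MAX envelope for the points with y < 0 is symmetric.

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Lower envelope ("convex hull trick") of lines y = A + B * x, answering MIN queries.
    sealed class LowerEnvelope
    {
        private readonly List<double> _a = new List<double>();      // intercepts A(x, y)
        private readonly List<double> _b = new List<double>();      // slopes     B(x, y)
        private readonly List<double> _breaks = new List<double>(); // x where the minimal line changes

        public LowerEnvelope(IEnumerable<Tuple<double, double>> lines)   // (A, B) pairs
        {
            // Sort by slope; for equal slopes only the smallest intercept can ever be minimal.
            foreach (var line in lines.OrderBy(l => l.Item2).ThenBy(l => l.Item1))
            {
                double a = line.Item1, b = line.Item2;
                if (_b.Count > 0 && _b[_b.Count - 1] == b) continue;     // dominated duplicate slope

                // Drop hull lines that can never be minimal once the new line is added.
                while (_a.Count >= 2 &&
                       IntersectX(_a.Count - 2, a, b) <= _breaks[_breaks.Count - 1])
                {
                    _a.RemoveAt(_a.Count - 1);
                    _b.RemoveAt(_b.Count - 1);
                    _breaks.RemoveAt(_breaks.Count - 1);
                }
                if (_a.Count > 0)
                    _breaks.Add(IntersectX(_a.Count - 1, a, b));
                _a.Add(a);
                _b.Add(b);
            }
        }

        // x coordinate where hull line i meets the line (a, b).
        private double IntersectX(int i, double a, double b)
        {
            return (_a[i] - a) / (b - _b[i]);
        }

        // Minimum over all lines at x = xc.
        public double Min(double xc)
        {
            int i = _breaks.BinarySearch(xc);
            if (i < 0) i = ~i;              // index of the first breakpoint >= xc
            return _a[i] + _b[i] * xc;      // that hull line is the minimal one at xc
        }
    }
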
  • I don't have the mathematical background, but I would approach this problem in three steps:

    • Throw away most points in N. This is the tricky part. Every pair of points "shadows" an area "behind" it when seen from the origin. This area is delimited by two beams starting from the points outward, pointing to the origin, and a circle intersection between the points. The calculation of this might be much simpler in polar coordinates. Start off with a random pair, then look at one new point at a time: if it is shadowed, throw it away; if not, find out if it shadows any point already in your set, then rebuild the enveloping set of curves. The tests for rebuilding the enveloping curve part should take almost constant time because the shadow set is unlikely to grow beyond a certain small number. The worst case for this seems to be O(NlogN). I cannot imagine any solution can be better than O(N) because you have to look at each point in any case.

    • Throw away most points in M. This is rather easy: if the point from M is farther from the origin than half the distance to the farthest point from the enveloping set, then it can be thrown out. This takes O(M)

    • Filter the remaining points in M through the actual check. It depends on the distribution of N and M how long this takes, but I think it is almost O(1) if both numbers are big and the distributions similar.

    Overall, it seems to be possible in O(N log(N) + M). No guarantees, though ;)

  • Consider some other aspects of your computations.

    For instance, you apparently compare a lot of distances. Each takes a call to SQRT. Why not compare the "squares of the distances" instead. SQRT is a costly computation.

    • Construct an R-Tree of all the points in the first set.
    • For each point in the second set, compute the bounding box of its circle and look up all the points in the R-Tree that fall inside that bounding box (O(n log n) with respect to number of points returned).
    • Check the distance between each returned point and the point currently under consideration; discard any that lie within the bounding box but outside the circle.
  • I think you can do it with Voronoi diagram:

    • Make a Voronoi Diagram of the {N points} union {[0,0]}
    • The centres of circles not touching the N points are exactly those lying inside the Voronoi cell of the point [0,0], which is a convex polygon
    • Filter the M points; one test should take O(log C) = O(log N) [where C is the number of vertices of the cell of [0,0]]

    Overall complexity should be O(N log N+M log N)

Using Windows DLL from Linux

We need to interface to a 3rd-party app, but the company behind the app doesn't disclose the message protocol and provides only a Windows DLL to interface with.

Our application is Linux-based, so I cannot use the DLL directly. I couldn't find any existing solution, so I'm considering writing a socket-based bridge between Linux and Windows; however, I'm sure it is not such a unique problem and somebody must have done it before.

Are you aware of any solution that allows calling Windows DLL functions from a C app on Linux? It can use Wine or a separate Windows PC - doesn't matter.

Many thanks in advance.

From stackoverflow
  • Calling the DLL's functions themselves is of course only the tip of the iceberg. What if the DLL calls Win32? Then you'd have a rather massive linking problem. I guess Wine could help you out there; not sure if they provide a solution.

  • I haven't done it, but you might be able to create a wrapper for it using MONO.

  • IMO, the best bet is to use Sockets. I have done this previously and it works like a charm.

    qrdl : That was my initial idea. It seems that's the way I have to go.
  • Any solution is going to need a TCP/IP-based "remoting" layer between the DLL which is running in a "windows-like" environment, and your linux app.

    You'll need to write a simple PC app to expose the DLL functions, either using a homebrew protocol, or maybe XML-RPC, SOAP or JSON; a rough sketch of such a bridge follows at the end of this answer. The RemObjects SDK might help you - but could be overkill.

    I'd stick with a 'real' or virtualized PC. If you use Wine, the DLL developers are unlikely to offer any support.

    MONO is also unlikely to be any help, because your DLL is probably NOT a .NET assembly.
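
    A rough sketch of such a bridge (everything here is illustrative: the DLL name "vendor.dll", the exported function "Process", the port, and the one-integer protocol are all made up; a real wrapper has to mirror whatever the vendor DLL actually exports):

    using System;
    using System.IO;
    using System.Net;
    using System.Net.Sockets;
    using System.Runtime.InteropServices;

    class DllBridge
    {
        // P/Invoke into the vendor's DLL (hypothetical export).
        [DllImport("vendor.dll", CallingConvention = CallingConvention.Cdecl)]
        static extern int Process(int input);

        static void Main()
        {
            var listener = new TcpListener(IPAddress.Any, 9000);
            listener.Start();
            Console.WriteLine("Bridge listening on port 9000...");
            while (true)
            {
                using (TcpClient client = listener.AcceptTcpClient())
                using (var reader = new BinaryReader(client.GetStream()))
                using (var writer = new BinaryWriter(client.GetStream()))
                {
                    int request = reader.ReadInt32();   // read one request from the Linux side
                    int result = Process(request);      // call into the vendor DLL
                    writer.Write(result);               // send the answer back
                }
            }
        }
    }

    The Linux C app then just opens a TCP connection to this program (running on a Windows box, a VM, or possibly under Wine) instead of loading the DLL.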

  • Sometimes it is better to pick a small vendor over a large vendor because the size of your business will give you more weight for them. We have certainly found this with AV engine vendors.

    If you are sufficiently important to them, they should provide either a documented, supported protocol, a Linux build of the library, or the source code to the library.

    Otherwise you'll have to run a Windows box in the loop using RPC as others have noted, which is likely to be very inconvenient, especially if the whole of the rest of your infrastructure runs Linux.

    Will the vendor support the use of their library within a Windows VM? If performance is not critical, you might be able to do that.

  • Get a sniffer and learn the protocol. That's what I'm doing.

Is there a small alternative editing tool for Access tables?

We have several legacy applications which use Access databases for storing data and/or configuration.

Sometimes we have to make small changes or corrections to our customers' databases (adding an index, modifying a data row, ...). In many cases Access is available on the customers' workstations, but sometimes it's not.

Is there any small tool for doing small maintenance operations on Access databases that does not need to be installed (i.e. can be started from a USB stick)?

I know of Squirrel SQL, but I'm hoping for something more lightweight.

From stackoverflow
  • MS Access databases can be opened via ODBC, so any DB tool on Windows that can use ODBC will work.

    The main problem with these tools is that many commercial ones use some kind of copy protection, for example a license key which is installed in the registry (for example, AQT). So these won't do.

    So OSS tools like Squirrel SQL are your best bet, since they don't come with artificial restrictions and it's simple to install them (along with Java) on a USB stick:

    1. Just install Java somewhere
    2. Copy the directory on your USB stick
    3. Unpack Squirrel SQL on the USB stick
    4. Create a small .BAT file in the home of Squirrel SQL:

      set DIR=%~dp0
      %DIR%..\java\bin\javaw.exe -jar squirrel.jar

    That's it.

  • I use VBScript for edits and updates of databases when Access is not available. Scripts can be written quite quickly and there are a number of ready-made scripts available on-line, such as for compacting a database.

    This example links a table.

    Dim cn
    Dim adoCat
    Dim adoTbl
    
    strLinkFile = "C:\Docs\DB1.mdb"
    strAccessFile = "C:\Docs\LTD.mdb"
    
    'Create Link...'
    Set cn = CreateObject("ADODB.Connection")
    cn.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
           "Data Source=" & strAccessFile & ";" & _
           "Persist Security Info=False"
    
    Set adoCat = CreateObject("ADOX.Catalog")
    Set adoCat.ActiveConnection = cn
    
    Set adoTbl = CreateObject("ADOX.Table")
    
    Set adoTbl.ParentCatalog = adoCat
    adoTbl.Name = "LinkTable"
    
    adoTbl.properties("Jet OLEDB:Link Datasource") = strLinkFile
    adoTbl.properties("Jet OLEDB:Link Provider String") = "MS Access"
    adoTbl.properties("Jet OLEDB:Remote Table Name") = "Table1"
    adoTbl.properties("Jet OLEDB:Create Link") = True
    
    'Append the table to the tables collection'
    adoCat.Tables.Append adoTbl
    
    David-W-Fenton : This works, of course, because the Jet db engine is installed by default on all copies of Windows, starting with Windows 2000.
  • Personally I'd try and avoid doing this altogether. You're masking the problem rather than solving it.

    If an index is worth adding to a single customer's database for example, it's probably worth adding to all customer databases. Otherwise the same issues will occur repeatedly going forward. I understand the need for rapid support, but having databases which are fundamentally different on different workstations is only going to cause more problems moving forward, for example in recreating bugs.

    It also adds a potential user "fiddle factor" once they learn how to do this and if the application is left on their machine (i.e. "I wonder what happens if I change this value?").

    Either modify the current application (legacy or not) to add the appropriate indexes on startup, or create a separate small "hotfix" program that adds the indexes and require your customers to run it. The suggestion above that these are written using VBScript is perfectly reasonable. The key is that the database changes are repeatable and that you can track what changes have been made where.

    If data itself needs modification, then why was this data written badly in the first place? Maybe the application can be appropriately modified so that this can be prevented in the first place? This would avoid the same issue happening with other databases.

    DR : Yes, you are right, but I'm not in any position to change those things.
    David-W-Fenton : I voted this back up because even though this answer doesn't help the original questioner, it's really good advice, nonetheless.

Invoking Java main method with parameters from Eclipse

During development (and for debugging) it is very useful to run a Java class' public static void main(String[] argv) method directly from inside Eclipse (using the Run As context menu).

Is there a similarly quick way to specify command line parameters for the run? What I do now is go to the "Run Dialog", click through the various settings to the tab where I can specify VM and program arguments and enter them there. Too many steps, plus I do not want to mix the more permanent runtime configuration settings with one-off invocation parameters. What I want instead is to check a box somewhere (or have a separate menu item "Run as Java application with command line") and then be prompted for the command line every time (with a nice history).

From stackoverflow
  • AFAIK there isn't a built-in mechanism in Eclipse for this.

    The closest you can get is to create a wrapper that prompts you for these values and invokes the (hardcoded) main. You then get your execution history as long as you don't clear terminated processes. Two variations on this are to use JUnit, or to use injection or a parameter so that your wrapper always connects to the correct class for its main.

  • I'm not sure what your uses are, but I find it convenient that usually I use no more than several command line parameters, so each of those scenarios gets one run configuration, and I just pick the one I want from the Run History.

    The feature you are suggesting seems a bit of an overkill, IMO.

  • Uri is wrong; there is a way to add parameters to the main method in Eclipse directly, although the parameters won't be very flexible (some dynamic parameters are allowed). Here's what you need to do:

    1. Run your class once as is.
    2. Go to Run -> Run configurations...
    3. From the lefthand list, select your class from the list under Java Application or by typing its name to filter box.
    4. Select Arguments tab and write your arguments to Program arguments box. Just in case it isn't clear, they're whitespace-separated so "a b c" (without quotes) would mean you'd pass arguments a, b and c to your program.
    5. Run your class again just like in step 1.

    I do, however, recommend using a JUnit/wrapper class as Uri suggested, since that way you get much better control over the actual parameters than by doing this.

    matt b : I think that Thilo/Uri are talking about a simple way to do this _without_ involving delving into the Run dialog box - as his question states.
  • Another idea:

    Place all your parameters in a properties file (one parameter = one property in this file), then in your main method load this file (using Properties.load(*fileInputStream*)). So if you want to modify one argument, you just need to edit your args.properties file and launch your application, with no further steps...

    Of course, this is only for development purposes, but it can be really helpful if you have to change your arguments often...

  • This answer is based on Eclipse 3.4, but should work in older versions of Eclipse.

    When selecting Run As..., go into the run configurations.

    On the Arguments tab of your Java run configuration, configure the variable ${string_prompt} to appear (you can click variables to get it, or copy that to set it directly).

    Every time you use that run configuration (name it well so you have it for later), you will be prompted for the command line arguments.

Smooth spectrum for Mandelbrot Set rendering

I'm currently writing a program to generate really enormous (65536x65536 pixels and above) Mandelbrot images, and I'd like to devise a spectrum and coloring scheme that does them justice. The wikipedia featured mandelbrot image seems like an excellent example, especially how the palette remains varied at all zoom levels of the sequence. I'm not sure if it's rotating the palette or doing some other trick to achieve this, though.

I'm familiar with the smooth coloring algorithm for the mandelbrot set, so I can avoid banding, but I still need a way to assign colors to output values from this algorithm.

The images I'm generating are pyramidal (eg, a series of images, each of which has half the dimensions of the previous one), so I can use a rotating palette of some sort, as long as the change in the palette between subsequent zoom levels isn't too obvious.

From stackoverflow
  • Seems simple to do by trial and error. Assume you can define HSV1 and HSV2 (hue, saturation, value) of the endpoint colors you wish to use (black and white; blue and yellow; dark red and light green; etc.), and assume you have an algorithm to assign a value P between 0.0 and 1.0 to each of your pixels. Then that pixel's color becomes

    (H2 - H1) * P + H1 = HP
    (S2 - S1) * P + S1 = SP
    (V2 - V1) * P + V1 = VP
    

    With that done, just observe the results and see how you like them. If the algorithm to assign P is continuous, then the gradient should be smooth as well.

    Nick Johnson : I can't guarantee a range of 0 to 1.0 for the values simply, though, since that results in the higher-resolution images lacking detail (they end up with all high-iteration values). Some sort of gradual palette rotation is needed.
    Paul Brinkley : That's okay. If the range is infinite in size, it can still have a period, even if that period is imposed upon it. Simply choose an appropriate value N, and map values to themselves mod N, and then scale that to the 0-1 range.
  • Use the smooth coloring algorithm to calculate all of the values within the viewport, then map your palette from the lowest to highest value. Thus, as you zoom in and the higher values are no longer visible, the palette will scale down as well. With the same constants for n and B you will end up with a range of 0.0 to 1.0 for a fully zoomed out set, but at deeper zooms the dynamic range will shrink, to say 0.0 to 0.1 at 200% zoom, 0.0 to 0.0001 at 20000% zoom, etc.
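
    A sketch of that per-viewport rescaling (assuming the smooth iteration values for the visible region are already computed into an array):

    using System.Linq;

    static class Remap
    {
        // Rescale the viewport's smooth values into 0..1 so the whole palette is used at any zoom.
        public static double[] Normalize(double[] smoothValues)
        {
            double min = smoothValues.Min();
            double max = smoothValues.Max();
            return smoothValues.Select(v => (v - min) / (max - min)).ToArray();
        }
    }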

  • My eventual solution was to create a nice looking (and fairly large) palette and store it as a constant array in the source, then interpolate between indexes in it using the smooth coloring algorithm. The palette wraps (and is designed to be continuous), but this doesn't appear to matter much.
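
    A minimal sketch of that palette interpolation (the colours below are placeholders; a real palette would be much larger and hand-tuned to be continuous where it wraps):

    using System;
    using System.Drawing;

    static class Palette
    {
        static readonly Color[] Colors =
        {
            Color.FromArgb(0, 7, 100),
            Color.FromArgb(32, 107, 203),
            Color.FromArgb(237, 255, 255),
            Color.FromArgb(255, 170, 0),
            Color.FromArgb(0, 2, 0),
        };

        // 'smooth' is the (non-negative) fractional iteration count from the smooth colouring algorithm.
        public static Color Lookup(double smooth)
        {
            double pos = smooth % Colors.Length;             // wrap around the palette
            int i = (int)pos;
            int j = (i + 1) % Colors.Length;
            double t = pos - i;                              // blend factor between the two entries
            return Color.FromArgb(
                (int)(Colors[i].R + (Colors[j].R - Colors[i].R) * t),
                (int)(Colors[i].G + (Colors[j].G - Colors[i].G) * t),
                (int)(Colors[i].B + (Colors[j].B - Colors[i].B) * t));
        }
    }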

  • This is the smooth color algorithm:

    Lets say you start with the complex number z0 and iterate n times until it escapes. Let the end point be zn.

    A smooth value would be

    nsmooth := n + 1 - Math.log(Math.log(zn.abs()))/Math.log(2)
    

    This only works for the Mandelbrot set; if you want to compute a smooth value for Julia sets, then use

    Complex z = new Complex(x,y);
    double smoothcolor = Math.exp(-z.abs());
    
    for(i=0;i<max_iter && z.abs() < 30;i++) {
        z = f(z);
        smoothcolor += Math.exp(-z.abs());
    }
    

    Then smoothcolor is in the interval (0,max_iter).

    Divide smoothcolor with max_iter to get a value between 0 and 1.

    To get a smooth color from the value:

    This can be passed to for example (in Java):

    Color.HSBtoRGB((float)(0.95f + 10 * smoothcolor), 0.6f, 1.0f);
    

    since the first value in HSB color parameters is used to define the color from the color circle.

Group repeated rows in TSQL

I have the following table and data in SQL Server 2005:

create table LogEntries (
  ID int identity,
  LogEntry varchar(100)
)

insert into LogEntries values ('beans')
insert into LogEntries values ('beans')
insert into LogEntries values ('beans')
insert into LogEntries values ('cabbage')
insert into LogEntries values ('cabbage')
insert into LogEntries values ('beans')
insert into LogEntries values ('beans')

I would like to group repeated LogEntries so that I have the following results:

LogEntry  EntryCount
beans     3
cabbage   2
beans     2

Can you think of any way to do this in TSQL outside of using a cursor?

From stackoverflow
  • SQL not exactly my strong point but won't

    SELECT LogEntry, COUNT(1) AS Counter FROM LogEntries GROUP BY LogEntry
    

    do it?

    Dheer : Nope this would result in beans - 5 Cabbage - 2
    Paul : Ah, fell it to the bear pit too. Oh well. At least I'm in good company :)
  • I don't think that you can do this with one query. In order to provide counts in a query, you need to group using the LogEntry column. However, this will just give you total counts per LogEntry, and not the counts of the numbers of entries in sequence that you are looking for. I think that a cursor is called for (or bring the whole dataset to your application, and use logic there to get the results that you want).

  • unless my brain has yet to boot up this morning

    SELECT LogEntry, COUNT(LogEntry) as EntryCount FROM LogEntries GROUP BY LogEntry

    kpollock : ah b*gger, fell into the same trap as Russ Cam...
    Dheer : Nope this would result in beans - 5 Cabbage - 2
    kpollock : worse, someone picked me up on it before I finished typing my retraction!!! :-)
  • Now I have looked at the actual question closely enough :-)

    Hmm, on reconsidering, why not just use a cursor? The performance isn't always worse than straight SQL, and it'd certainly be easy for other people to follow the code when they come to look at it. Wrap it in a stored proc or function and you'd be able to use it pretty much anywhere you might need it.

  • This is a set-based solution for the problem. The performance will probably suck, but it works :)

    CREATE TABLE #LogEntries (
      ID INT IDENTITY,
      LogEntry VARCHAR(100)
    )
    
    INSERT INTO #LogEntries VALUES ('beans')
    INSERT INTO #LogEntries VALUES ('beans')
    INSERT INTO #LogEntries VALUES ('beans')
    INSERT INTO #LogEntries VALUES ('cabbage')
    INSERT INTO #LogEntries VALUES ('cabbage')
    INSERT INTO #LogEntries VALUES ('carrots')
    INSERT INTO #LogEntries VALUES ('beans')
    INSERT INTO #LogEntries VALUES ('beans')
    INSERT INTO #LogEntries VALUES ('carrots')
    
    SELECT logentry, COUNT(*) FROM (
        SELECT logentry, 
        ISNULL((SELECT MAX(id) FROM #logentries l2 WHERE l1.logentry<>l2.logentry AND l2.id < l1.id), 0) AS id
        FROM #LogEntries l1
    ) AS a
    GROUP BY logentry, id
    
    
    DROP TABLE #logentries
    

    Results:

    beans   3
    cabbage 2
    carrots 1
    beans   2
    carrots 1
    

    The ISNULL() is required for the first set of beans.

    Lance Fisher : Brilliant! Thanks, this is a lot simpler than the cursor I wrote.
    kpollock : I am curious, how is the performance vs a cursor?
    kpollock : I am going senile too, as now I look at it, I have used this sort of set based solution myself in the past.
  • I think this will do it... didn't check too thoroughly though. The idea is that the difference between the overall row number and the per-value row number stays constant within each consecutive run of the same LogEntry, so grouping on that difference separates the runs:

    select 
        COUNT(*),subq.LogEntry 
    from 
    (
        select 
         ROW_NUMBER() OVER(ORDER BY id)-ROW_NUMBER() OVER(PARTITION BY logentry ORDER BY id) as t,*
        from 
         LogEntries
    ) subq 
    group by 
        subq.t,subq.LogEntry 
    order by 
        MIN(subq.ID)
    
    Lance Fisher : Thanks, it does work. I haven't used the partition keyword before. That's interesting.
    Jonas Lincoln : This is a very elegant and cunning use of the OVER() function. Nice!
    kpollock : NEAT! I don't know partitioning well enough... should do!

Taking a snapshot of optimized JVM runtime

I know that the JVM can do some pretty serious optimizations at runtime, especially in -server mode. Of course, it takes a little while for the JVM to settle down and reach peak performance. Is there any way to take a snapshot of those optimizations so they can be applied immediately the next time you run your app?

"Hey JVM! Great job optimizing my code. Could you write that down for me for later?"

From stackoverflow
  • Basically not yet with Sun's VM, but they have it in mind.

    See various postings/comments under here:

    http://blogs.sun.com/fatcatair/category/Java

    (Sorry: I can't find quite the right one about retaining stats over restart for immediate C1 compilation of known-hot-at-startup methods.)

    But I don't know where all this stuff is right now.

    Rgds

    Damon

    PS. Note that optimisations appropriate in steady-state may well not be appropriate at start-up and might indeed reduce start-up performance, and indeed two runs may not have the same hotspots...

  • Perhaps this might help: http://wikis.sun.com/display/HotSpotInternals/PrintAssembly.

Finding available LPT (parallel) ports and addresses in Delphi

Hi, I am doing direct I/O on a parallel port which is fine and necessary for speed. I would like to enumerate the available ports to offer the user a choice of ports at setup time rather than a tedious trawl through device manager to read the address manually. Does anyone know a means of doing this please? Many thanks, Brian

From stackoverflow
  • According to this Microsoft article, for Win2K and newer, you can find details of parallel-connected devices in the registry at HKLM\SYSTEM\CurrentControlSet\Enum\LPTENUM.

  • Many thanks Scott, I'll investigate. Brian.