Thursday, February 17, 2011

Help getting QTP to identify a control

We're trying to use QTP (QuickTest Professional) to auto-test a legacy C++ application.

However, the main window for the app is composed of several IDENTICAL panels. Each panel has a unique title.

If I view the window in Spy++ (comes with DevStudio), I see:

+ Window <hwnd> "Window Title" taskwindowclass
  + Window <hwnd> "Panel A" childwindowclass
    + Window <hwnd> "OK" Button
  + Window <hwnd> "Panel B" childwindowclass
    + Window <hwnd> "OK" Button

In QTP's Object Spy, however, the hierarchy shows as:

+ Window: Window Title
  + WinButton: OK

It doesn't even show that there is an intermediate panel.

Does anybody know a way to get the window "path" in order to identify each control? i.e. so the controls identify as:

Button A: "Window Title/Panel A/OK"
Button B: "Window Title/Panel B/OK"
From stackoverflow
  • You could use descriptive programming to bypass the object repository. QTP ignores panel objects by default. You could try to get a reference to the panel object dynamically, then search its ChildObjects collection to find the OK button (see the sketch after this list). Another option might be to simply add an ordinal identifier to the OK buttons:

    • Button A: "Window Title/OK index #1"
    • Button B: "Window Title/OK index #2"
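
    For illustration, a rough descriptive-programming sketch in QTP's VBScript (the window title and ordinals are assumptions, not tested against the real app):

    ' Address each OK button directly by description plus ordinal:
    Window("text:=Window Title").WinButton("text:=OK", "index:=0").Click ' Panel A's OK
    Window("text:=Window Title").WinButton("text:=OK", "index:=1").Click ' Panel B's OK

    ' Or collect all matching buttons with ChildObjects and pick one:
    Set okDesc = Description.Create()
    okDesc("micclass").Value = "WinButton"
    okDesc("text").Value = "OK"
    Set buttons = Window("text:=Window Title").ChildObjects(okDesc)
    buttons(1).Click ' second OK button in the collection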

How can I run Test projects with Visual Studio 2008 Standard Edition?

I'm currently running Visual Studio 2008 Standard Edition, and one of the items they cut out for that edition is the unit testing capability. As a result, when I open up example projects like the MVC commerce test, one of the projects just won't load since VS doesn't know how to open that type of project.

I know I can just exclude or remove the project and I am aware of TestDriven.NET, but is there a plugin for VS2008 Standard which will do the unit tests that come with VS2008 Pro? Will TestDriven.NET do this or are the tests different enough from NUnit testing that this won't work?

UPDATE: To clarify, I'm curious if there are any programs or plugins out there that can run Test projects within Visual Studio 2008. TestDriven.NET cannot load, run, or allow VS2008 to open Test projects, and Gallio does not run within the VS2008 IDE (although I appreciate the suggestions greatly).

From stackoverflow
  • TestDriven.NET can run NUnit, xUnit.NET and MSTest. It's not free, but it's well worth it. From http://www.testdriven.net/overview.aspx:

    Supports multiple unit testing frameworks including NUnit, MbUnit and MS Team System.

    Schnapple : Was TestDriven.NET ever free? Or did I imagine that?
  • Try Gallio, as explained best in this blog post by Richard Dingwall; for Visual Studio support, use TestDriven.Net to run it.

    There's some information available in this Google Groups post as well.


    From Gallio's website:

    At present Gallio can run tests from MbUnit versions 2 and 3, MSTest, NBehave, NUnit, xUnit.Net, and csUnit. Gallio provides tool support and integration with CCNet, MSBuild, NAnt, NCover, Pex, Powershell, Resharper, TestDriven.Net, TypeMock, and Visual Studio Team System.

Finding Unique Table/Column Combinations Across SQL Databases

I have 4 databases with similar schemas, and I'm trying to create a query to return just the (table, column) pairs that exist ONLY in database 1 and do not exist in databases 2, 3, or 4.

Currently I can return the symmetric difference between database 1 and 2 via the following query...

select table_name, column_name from (
     select table_name, column_name from [Database1].information_schema.columns
     union all
     select table_name, column_name from [Database2].information_schema.columns) as tmp
     group by table_name, column_name having count(*) = 1

However, in trying to isolate just those columns in database 1, and doing the same across all 4 databases, things are getting complicated. What is the cleanest solution for this query?

From stackoverflow
  • SELECT
         D1.table_name,
         D1.column_name
    FROM
         Database1.information_schema.columns D1
    LEFT OUTER JOIN Database2.information_schema.columns D2 ON
         D2.table_name = D1.table_name AND
         D2.column_name = D1.column_name
    LEFT OUTER JOIN Database3.information_schema.columns D3 ON
         D3.table_name = D1.table_name AND
         D3.column_name = D1.column_name
    LEFT OUTER JOIN Database4.information_schema.columns D4 ON
         D4.table_name = D1.table_name AND
         D4.column_name = D1.column_name
    WHERE
         D2.table_name IS NULL AND
         D3.table_name IS NULL AND
         D4.table_name IS NULL
    
    Blorgbeard : you have a trailing AND
    Tom H. : Thanks. I need to work on my copy-paste skills :)
  • Here's what I can come up with at first shot, which may be a head start. It hasn't been tested; just another alternative, maybe:

    select table_name, column_name
    from [Database1].information_schema.columns d1
    where table_name + '+' + column_name not in
          (select table_name + '+' + column_name from [Database2].information_schema.columns)
      and table_name + '+' + column_name not in
          (select table_name + '+' + column_name from [Database3].information_schema.columns)
      and table_name + '+' + column_name not in
          (select table_name + '+' + column_name from [Database4].information_schema.columns)
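
    If these databases are on SQL Server 2005 or later, EXCEPT is an arguably cleaner way to express the same thing (a sketch, untested here):

    select table_name, column_name from [Database1].information_schema.columns
    except
    select table_name, column_name from [Database2].information_schema.columns
    except
    select table_name, column_name from [Database3].information_schema.columns
    except
    select table_name, column_name from [Database4].information_schema.columns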

Is it considered bad design to do lengthy operations in a constructor?

I am implementing a class to compare directory trees (in C#). At first I implemented the actual comparison in the class's constructor. Like this:

DirectoryComparer c = new DirectoryComparer("C:\\Dir1", "C:\\Dir2");

But it doesn't feel "right" to do a possible lengthy operation in the constructor. An alternative way is to make the constructor private and add a static method like this:

DirectoryComparer c = DirectoryComparer.Compare("C:\\Dir1", "C:\\Dir2");

What do you think? Do you expect a constructor to be "quick"? Is the second example better or is it just complicating the usage of the class?

BTW:

I won't mark any answer as accepted because I don't think there is a correct answer, just preference and taste.

Edit:

Just to clarify my example a little: I'm not only interested in whether the directories differ, I'm also interested in how they differ (which files). So a simple int return value won't be enough. The answer by cdragon76.myopenid.com actually is pretty close to what I want (+1 to you).

From stackoverflow
  • I prefer the second one.

    I expect the constructor to instantiate the class. The Compare method does what it is designed to do.

  • You should never do anything that might fail in a constructor. You don't want to ever create invalid objects. While you could implement a "zombie" state where the object doesn't do much, it's much better to perform any complex logic in separate methods.

    Jon Skeet : It's perfectly okay for a constructor to fail, and throw an exception. There are plenty of examples of this in the framework. It should make sure that it doesn't leak a reference to itself during the constructor, but apart from that it's fine. Not that I like complex constructors, mind you.
    James : I don't think that just because it is in the framework it is OK. The framework is great, but it isn't flawless. It's great advice to say that you should never do things that may fail in a constructor. But yes, constructors should be allowed to throw exceptions in the case of invalid data, IMHO.
  • Yes, typically a constructor is something quick; it is designed to prepare the object for use, not to actually do operations. I like your second option, as it keeps it a one-line operation.

    You could also make it a bit easier by passing the two paths to the constructor, then having a Compare() method that actually does the processing.

  • I think an interface might be what you're after. I would create a class to represent a directory, and have that implement a DirectoryComparer interface. That interface would include the compare method. C# does have a comparable interface (IComparable), so you could also just implement that.

    In code, your call would be:

    D1 = new Directory("C:\");
    ..
    D1.compare(D2);
    
  • I like the second example because it explains exactly what is happening when you instantiate the object. Plus, I always use the constructor to initialize all of the global settings for the class.

  • I would think a combination of the two is the "right" choice, as I would expect the Compare method to return the comparison result, not the comparer itself.

    DirectoryComparer c = new DirectoryComparer();
    
    int equality = c.Compare("C:\\Dir1", "C:\\Dir2");
    

    ...and as Dana mentions, there is an IComparer interface in .Net that reflects this pattern.

    The IComparer.Compare method returns an int since the use of IComparer classes is primarily with sorting. The general pattern though fits the problem of the question in that:

    1. Constructor initializes an instance with (optionally) "configuring" parameters
    2. Compare method takes two "data" parameters, compares them and returns a "result"

    Now, the result can be an int, a bool, a collection of diffs. Whatever fits the need.

  • I think for a general-purpose comparer you may want to specify the files only at construction and compare later - this way you can also implement extended logic:

    • Compare again- what if the directories changed?
    • Change the files you are comparing by updating the members.

    Also, you may want to consider having your implementation receive messages from the OS when files change in the target directories, and optionally recompare.

    The point is- you are imposing limits by assuming that this class will only be used to compare once for a single instance of those files.

    Therefore, I prefer:

    DirectoryComparer c = new DirectoryComparer(dir1, dir2);

    c.Compare();

    Or

    DirectoryComparer c = new DirectoryComparer();

    c.Compare(dir1, dir2);

    Don Kirkby : I agree, and you can still have a single line by saying new DirectoryComparer(a, b).Compare()
    tloach : @Don: A former coworker called that "trying to do too many sexy things on one line". If the new fails for any reason you've just caused an exception that can be hard to track down.
  • If you are working with C#, you could use extension methods to create a comparison method for two directories that you would attach to the built-in DirectoryInfo class, so it would look something like:

    DirectoryInfo dir1 = new DirectoryInfo(@"C:\.....");
    DirectoryInfo dir2 = new DirectoryInfo(@"D:\.....");

    DirectoryCompare c = dir1.CompareTo(dir2);
    

    This would be a much clearer implementation. More on extension methods here.
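
    For reference, a rough sketch of what that extension method could look like (the DirectoryCompare type and the name-only comparison are assumptions for illustration):

    using System.Collections.Generic;
    using System.IO;
    using System.Linq;

    // Hypothetical result type; the answer leaves its exact shape open.
    public class DirectoryCompare
    {
        public List<string> OnlyInFirst { get; set; }
        public List<string> OnlyInSecond { get; set; }
    }

    public static class DirectoryInfoExtensions
    {
        // Extension method: attaches CompareTo to the built-in DirectoryInfo class.
        public static DirectoryCompare CompareTo(this DirectoryInfo first, DirectoryInfo second)
        {
            // Compares by file name only; a real comparer might also check sizes or timestamps.
            var firstNames = first.GetFiles().Select(f => f.Name).ToList();
            var secondNames = second.GetFiles().Select(f => f.Name).ToList();
            return new DirectoryCompare
            {
                OnlyInFirst = firstNames.Except(secondNames).ToList(),
                OnlyInSecond = secondNames.Except(firstNames).ToList()
            };
        }
    }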

  • If an operation may take an unknown amount of time, it is an operation you might want to export into a different thread, so your main thread won't block and can do other things, like showing a spinning progress indicator. Other apps may not want to do this and keep everything within a single thread (e.g. those that have no UI).

    Moving object creation to a separate thread is a bit awkward, IMHO. I'd prefer to create the object (quickly) in my current thread, let one of its methods run within another thread, and once the method has finished, grab its result in my current thread through another method of the object. I'm happy as soon as I know the result (or I keep a copy if the result involves more details I may have to consume one at a time).

  • I agree with the general sentiment of not doing lengthy operations inside constructors.

    Additionally, while on the subject of design, I'd consider changing your 2nd example so that the DirectoryComparer.Compare method returns something other than a DirectoryComparer object. (Perhaps a new class called DirectoryDifferences or DirectoryComparisonResult.) An object of type DirectoryComparer sounds like an object you would use to compare directories as opposed to an object that represents the differences between a pair of directories.

    Then if you want to define different ways of comparing directories (such as ignoring timestamps, readonly attributes, empty directories, etc.) you could make those parameters you pass to the DirectoryComparer class constructor. Or, if you always want DirectoryComparer to have the exact same rules for comparing directories, you could simply make DirectoryComparer a static class.

    For example:

    DirectoryComparer comparer = new DirectoryComparer(
        DirectoryComparerOptions.IgnoreDirectoryAttributes
    );
    DirectoryComparerResult result = comparer.Compare("C:\\Dir1", "C:\\Dir2");
    
  • If the arguments are just going to be processed once then I don't think they belong as either constructor arguments or instance state.

    If, however, the comparison service is going to support some kind of suspendable algorithm, or you want to notify listeners when the equality state of two directories changes based on filesystem events or something like that, then the directories are part of the instance state.

    In neither case is the constructor doing any work other than initializing an instance. In case two above, the algorithm is either driven by a client (just like an Iterator, for example) or by the event-listening thread.

    I generally try to do things like this: don't hold state in the instance if it can be passed as arguments to service methods. Try to design the object with immutable state. Defining attributes, like those used in equals and hashCode, should always be immutable.

    Conceptually, a constructor is a function mapping an object representation to the object it represents.

    By the definition above, Integer.valueOf(1) is actually more of a constructor than new Integer(1) because Integer.valueOf(1) == Integer.valueOf(1). In either case, this concept also means that all the constructor arguments, and only the constructor arguments, should define the equals behavior of an object.

  • I would definitely do the second.

    Long actions in a constructor are fine if they are actually building the object so it is usable.

    Now, one thing that I see people do in constructors is call virtual methods. This is BAD: in C++ the base class's version is called instead of the override, and in C# the override is called on an object whose constructor has not finished running yet. Either way you don't get what the derived class intended.
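
    A minimal C# sketch of the trap (class names hypothetical):

    class Base
    {
        public Base()
        {
            // Virtual call from a constructor: this dispatches to the override
            // below even though Derived's constructor body has not run yet.
            Describe();
        }
        protected virtual void Describe() { }
    }

    class Derived : Base
    {
        private string name;
        public Derived() { name = "Derived"; }

        protected override void Describe()
        {
            // Prints "<not initialized>": name is still null at this point.
            System.Console.WriteLine(name ?? "<not initialized>");
        }
    }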

  • I don't think that talking about abstract terms like "lengthy" has anything to do with the decision of whether to put something in a constructor or not.

    A constructor should be used to initialize an object; a method should be used to "do something", i.e. to perform a function.

  • I think it's not only okay for a constructor to take as much time as needed to construct a valid object, but the constructor is required to do so. Deferring object creation is very bad as you end up with potentially invalid objects. So, you will have to check an object every time before you touch it (this is how it is done in the MFC; you have bool IsValid() methods everywhere).

    I only see a slight difference in the two ways of creating the object. One can see the new operator as a static function of the class anyway. So, this all boils down to syntactic sugar.

    What does the DirectoryComparer class do? What is its responsibility? From my point of view (which is a C++ programmer's view) it looks like you'd be better off with just using a free function, but I don't think that you can have free functions in C#, can you? I guess you will collect the files which are different in the DirectoryComparer object. If so, you could better create something like an array of files or an equivalent class that's named accordingly.

Sql Server Management - Option to avoid scripting the collation?

I have databases with different collations. I want to be able to script out tables from one database and create them in another. However, when I script out the tables, it includes the collation in the column definitions.

Is there a way to exclude the collations from the generated table creation scripts?

From stackoverflow
  • Tools -> Options

    Under Scripting (in the Table/View section)

    Set Include Collation to False!

    Jim : Hmm. I don't see those options in Tools -> Options.
    Mitchel Sellers : You have to have the most current version of SSMS from what I can tell, older builds of the tool don't seem to have the options.
  • Pretty lame - but do a search and replace on the script after it has been generated??

  • Which version of SSMS are you using? The full edition, or Express edition?

    You'll need to apply at least SP2 for 2005 to get the scripting settings under Tools->Options.

    This doesn't need to be applied to the database servers, it can be applied to just the client tools on your admin workstation if required.

How can I switch on Intellisense and code hints in C++ Builder 2009?

I've just installed CodeGear Studio 2009 and I can't find where to turn on Intellisense and code hints.

And, by the way, in Visual Studio we can indent selected blocks of code all at once. How can I use similar functions in CodeGear?

Any ideas about that?

From stackoverflow
  • It's called Code Insight. I don't have 2009 yet but it's probably in the same place as 2007. Have a look in Tools > Options > Editor Options > Code Insight.

    By the way, it's nowhere near as good as Intellisense in VS.

  • The generic name is "code completion"; Intellisense and Insight are just catchy marketing names companies use.

  • Even after relaunching with every Code Insight option turned on, I still found no result.

Deleting from a database map

I have these 3 tables + data:

items: itemId, itemName
data: 1, my item one

categories: catId, catName
data: 1, my cat one. 2, my cat two

map: mapId, itemId, catId

When you include item "my item one" in category "my cat one", you insert [1, 1, 1] into the map. When you add "my item one" to "my cat two", you insert [2, 1, 2] into the map. Now let's say we change our mind and only want the item in "my cat two". This means we need to know what categories the item is no longer in and delete the associations from the map. What's the most efficient sequence of steps to take to do so? (I'm looking for a solution that will scale beyond this trivial example.)

From stackoverflow
  • Assuming you've already got the category ID for "my cat two" and the item ID for "my item 1":

    DELETE FROM MAP WHERE ItemID = @MyItem1Id
                    AND CatID <> @MyCat2Id
    

    If you have a set of categories you want to keep the item in, you could either:

    1. Delete everything from the map for that item and then re-add the set
    2. Use a query like the above but with "AND CatID NOT IN [ ... ]"
  • If you decide that an item should only be in the new category, the following should work:

    DELETE
         M
    FROM
         Map M
    WHERE
         M.itemid = @item_id AND
         M.catid <> @new_cat_id
    
  • It's not very clear what you mean by "scale beyond this trivial example".

    Sorry for the confusion. I mean with the same tables but with far more data.

    Tom H. : As long as you don't have hundreds of millions of rows and have normal indexes the amount of data shouldn't be an issue.

Is there a tool that will generate Visual Studio 2008 projects?

Does anyone know of an application that meets the following criteria?

I'd like to use a GUI tool to generate a batch of related Visual Studio 2008 projects. For example, I know that I'll need a solution file, a web project for my UI, a class library for my business objects, and a class library for my data access layer. I'd like to use a GUI application that will prompt me for the types of projects, their names, their location in the filesystem, and their default namespaces. When I'm ready, it will generate those as specified.

Icing on the cake would be the ability to specify what references to include in each project and perhaps some basic project items like empty classes, controls, or forms/pages.

I'd prefer free/open source software, but if there's a commercial tool that you feel is worth the money, I'd love to hear about it. For what it's worth, I think something like LLBLGen Pro or CodeSmith is probably overkill for what I'm looking for. If there's nothing out there that does what I want I'll look into writing my own, but I might as well not re-invent the wheel if I can find something acceptable.

From stackoverflow
  • My understanding of T4 is that it is supposed to help with that.

    http://msdn.microsoft.com/en-us/library/bb126445.aspx

  • At my last job, we generated Visual Studio 2003 and 2005 project files from our Makefiles. So you could do a "make" to build, or a "make debug" to launch the IDE with the newly generated project file. Doing a build from the IDE would invoke "make" instead of using the IDE's native facilities for building, so you didn't have to synchronize all the crazy compiler flags, defines, etc. in the IDE, 'cause it didn't use them. All you needed to do was make it point to the makefile properly, and have the list of files in the project. By the time I left, they had also generated solution files for executables, such that the solution would contain the project to build the exe, plus the projects to build all the libraries it depended on.

    I've since moved to a new company, where I haven't needed to generate Visual Studio projects, but I do generate project files for Cross Works IDE (for my ARM development) and the Silicon Labs IDE (for my 8051 development).

    If you've got a properly set-up build system, it shouldn't take a competent *nix geek more than a day to add project file generation. Easiest way is to make a project file template with tags in it like , , and then replace them (I use sed) with the actual value. In the case of , you can't just put a file name, but instead the IDE-specific string for the list of file names and locations.

    This has worked great at making a sane cross-platform build system (make, SCons, Jam, whatever) that can be user-friendly for people who are attached to their IDEs, especially windows users.

  • Are you aware of a freeware tool on Codeplex called TreeSurgeon?

    It does provide a way to "click'n'create" Visual Studio projects - I am uncertain as to how flexible it is in being adapted / tweaked to your particular needs, though.

    Certainly worth a look, I'd say!

    Marc

How do I make php wait for curl to finish before continuing?

It seems if I do something like

$file = fopen($filepath, "w");
$CR = curl_init();
curl_setopt($CR, CURLOPT_URL, $source_path);
curl_setopt($CR, CURLOPT_POST, 1);
curl_setopt($CR, CURLOPT_FAILONERROR, true);
curl_setopt($CR, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($CR, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($CR, CURLOPT_FILE, $file);
$result = curl_exec( $CR );
$error = curl_error( $CR );
print filesize($filepath);

I get a different result than if I just run

print filesize($filepath);

a second time. My guess is that curl is still downloading when I do a filesize().

From stackoverflow
  • I don't have an answer, but perhaps we can get closer to deducing what's going on here. cURL is supposed to be synchronous, so it's strange that you get a different size the second time. I would expect the file to be complete before curl_exec returns. Try calling this function immediately following your call to curl_exec:

    print_r(curl_getinfo($CR));

    Pay attention particularly to CURLINFO_SIZE_DOWNLOAD. Perhaps the OS is doing something funny with the file? What OS are you using, and is the file being saved locally or via a network-based connection?

  • Note that functions like filesize() cache their result; try adding a call to clearstatcache() above 'print filesize(...);'. Here is an example:

    $file = '/tmp/test12345';
    file_put_contents($file, 'hello');
    echo filesize($file), "\n";
    file_put_contents($file, 'hello world, this is a test');
    echo filesize($file), "\n";
    clearstatcache();
    echo filesize($file), "\n";
    

    See www.php.net/clearstatcache
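
    One more detail worth checking in the original snippet: PHP buffers writes made through CURLOPT_FILE, so closing the handle before stat'ing the file may also be needed. A sketch:

    $result = curl_exec($CR);
    fclose($file);       // flush buffered writes to disk
    clearstatcache();    // drop the cached stat result
    print filesize($filepath);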

PortalSiteMapProvider causing excessive SPRequest objects

We have a custom navigation webpart that uses the PortalSiteMapProvider of MOSS to build a menu navigation. It seems that the Provider is not managing its objects. Any idea on how to manage the objects that are being created in the Provider?

It is causing log errors like so:

Potentially excessive number of SPRequest objects (9) currently unreleased on thread 1. Ensure that this object or its parent (such as an SPWeb or SPSite) is being properly disposed. Allocation Id for this object: {56D66DBA-AE72-42DF-A70F-B45E05A60A08} Stack trace of current allocation:
at Microsoft.SharePoint.SPRequestManager.Add(SPRequest request, Boolean shareable)
at Microsoft.SharePoint.SPGlobal.CreateSPRequestAndSetIdentity(Boolean bNotGlobalAdminCode, String strUrl, Boolean bNotAddToContext, Byte[] UserToken, String userName, Boolean bIgnoreTokenTimeout, Boolean bAsAnonymous)
at Microsoft.SharePoint.SPWeb.InitializeSPRequest()
at Microsoft.SharePoint.SPWeb.EnsureSPRequest()
at Microsoft.SharePoint.SPWeb.get_Request()
at Microsoft.SharePoint.SPWeb.InitWebPublic()
at Microsoft.SharePoint.SPWeb.get_Exists()
at Microsoft.SharePoint.Publishing.CachedObjectFactory.CreateWebFromUrl(String url)
at Microsoft.SharePoint.Publishing.Navigation.SPNavigationSiteMapNode..ctor(PortalWebSiteMapNode webNode, SPNavigationNode node, PortalSiteMapNode parentNode, NodeTypes type, String url, String title, String description)
at Microsoft.SharePoint.Publishing.Navigation.SPNavigationSiteMapNode.CreateSPNavigationSiteMapNode(PortalWebSiteMapNode webNode, SPNavigationNode node, PortalSiteMapNode parentNode)
at Microsoft.SharePoint.Publishing.Navigation.PortalWebSiteMapNode.ProcessSPNavigationNode(SPNavigationNode node, SPNavigationNode previousSibling, PortalSiteMapNode parentNode)
at Microsoft.SharePoint.Publishing.Navigation.PortalWebSiteMapNode.PopulateNavigationChildren()
at Microsoft.SharePoint.Publishing.Navigation.PortalSiteMapNode.GetNavigationChildren(NodeTypes includedTypes, NodeTypes includedHiddenTypes, OrderingMethod ordering, AutomaticSortingMethod method, Boolean ascending, Int32 lcid)
at Microsoft.SharePoint.Publishing.Navigation.PortalSiteMapNode.GetNavigationChildren(NodeTypes includedHiddenTypes)
at Microsoft.SharePoint.Publishing.Navigation.PortalSiteMapProvider.GetChildNodes(PortalSiteMapNode node, NodeTypes includedHiddenTypes)
at Microsoft.SharePoint.Publishing.Navigation.PortalSiteMapProvider.GetChildNodes(SiteMapNode node)
at System.Web.SiteMapNode.get_ChildNodes()
at Microsoft.SharePoint.Publishing.Navigation.PortalHierarchicalEnumerable.System.Collections.IEnumerable.GetEnumerator()
at System.Web.UI.WebControls.Menu.DataBindRecursive(MenuItem node, IHierarchicalEnumerable enumerable)
at System.Web.UI.WebControls.Menu.DataBindRecursive(MenuItem node, IHierarchicalEnumerable enumerable)
at System.Web.UI.WebControls.Menu.DataBindItem(MenuItem item)
at System.Web.UI.WebControls.Menu.PerformDataBinding()
at System.Web.UI.WebControls.HierarchicalDataBoundControl.PerformSelect()
at System.Web.UI.WebControls.BaseDataBoundControl.DataBind()
at System.Web.UI.WebControls.Menu.DataBind()
at System.Web.UI.WebControls.BaseDataBoundControl.EnsureDataBound()
at System.Web.UI.WebControls.Menu.EnsureDataBound()
at System.Web.UI.WebControls.BaseDataBoundControl.OnPreRender(EventArgs e)
at System.Web.UI.WebControls.Menu.OnPreRender(EventArgs e, Boolean registerScript)
at System.Web.UI.WebControls.Menu.OnPreRender(EventArgs e)
at Microsoft.SharePoint.WebControls.AspMenu.OnPreRender(EventArgs e)
at System.Web.UI.Control.PreRenderRecursiveInternal()
at System.Web.UI.Control.PreRenderRecursiveInternal()
at System.Web.UI.Control.PreRenderRecursiveInternal()
at System.Web.UI.Control.PreRenderRecursiveInternal()
at System.Web.UI.Control.PreRenderRecursiveInternal()
at System.Web.UI.Control.PreRenderRecursiveInternal()
at System.Web.UI.Control.PreRenderRecursiveInternal()
at System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint)
at System.Web.UI.Page.ProcessRequest(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint)
at System.Web.UI.Page.ProcessRequest() at System.Web.UI.Page.ProcessRequestWithNoAssert(HttpContext context)
at System.Web.UI.Page.ProcessRequest(HttpContext context) at ASP.VIEWPAGE_ASPX_2077083467.ProcessRequest(HttpContext context)
at System.Web.HttpApplication.CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)
at System.Web.HttpApplication.ApplicationStepManager.ResumeSteps(Exception error)
at System.Web.HttpApplication.System.Web.IHttpAsyncHandler.BeginProcessRequest(HttpCont

From stackoverflow
  • Stefan Goßner's blog post seems to answer the question. The issue is not that the SPWeb objects are not being closed, but that once a certain threshold of allocations (8 by default) is hit, the warning is written to the log. Depending on your site structure, the number that will be created will vary. Given the nested nature of navigation, the objects are kept open while the subtree is traversed. So the objects are correctly disposed of; it is just that more than 8 (by default) are being held open at a given time.

    See this question

    webwires : Yes, thanks. I did eventually find this and you are right.

Java resource files

I'm writing a small GUI app that contains some "editor" functionality, and I'd like to let users open a few sample text files to test things out quickly. The easiest way of doing this would be to package a separate zip with the appropriate sample files and have them open them manually; I'd like to make things a little more user-friendly and allow them to pick the files from inside the application and then run them.

So... what do I use? I initially considered .properties but that doesn't seem terribly well suited for the job...

From stackoverflow
  • Your FileDialog can be given a FilenameFilter that filters files by any criteria you like. You can default-point it to a directory of sample files, have it ignore everything not named ".sample" or "MySampleXXXX.java", e.g.

    myDialog.setFilenameFilter(new FilenameFilter() {

        public boolean accept(File dir, String name) {
            return name.startsWith("FooBar");
        }
    });
    
  • You can include a resource file right in the jar and then open it as a resource stream in your app. If you're using Spring, you can inject the resource right into a bean. If not, check out Class.getResourceAsStream(). You just have to be careful about the path you use to get to the resource file.
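
    For instance, a minimal sketch (the /samples path inside the jar is hypothetical):

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;

    public class SampleLoader {
        // Reads a sample file bundled in the jar under /samples.
        public static String loadSample(String name) throws IOException {
            InputStream in = SampleLoader.class.getResourceAsStream("/samples/" + name);
            if (in == null) {
                throw new IOException("Sample not found: " + name);
            }
            BufferedReader reader = new BufferedReader(new InputStreamReader(in));
            StringBuilder text = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                text.append(line).append('\n');
            }
            reader.close();
            return text.toString();
        }
    }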

Could C++ have not obviated the pimpl idiom?

As I understand it, the pimpl idiom exists only because C++ forces you to place all the private class members in the header. If the header were to contain only the public interface, theoretically, any change in class implementation would not necessitate a recompile of the rest of the program.

What I want to know is why C++ is not designed to allow such a convenience. Why does it demand at all for the private parts of a class to be openly displayed in the header (no pun intended)?

From stackoverflow
  • Maybe because the size of the class is required when passing instances by value, aggregating them in other classes, etc.?

    If C++ did not support value semantics, it would have been fine, but it does.

  • Someone will have a much more verbose answer than I, but the quick response is two-fold: the compiler needs to know all the members of a struct to determine the storage space requirements, and the compiler needs to know the ordering of those members to generate offsets in a deterministic way.

    The language is already fairly complicated; I think a mechanism to split the definitions of structured data across the code would be a bit of a calamity.

    Typically, I've always seen policy classes used to define implementation behavior in a Pimpl-manner. I think there are some added benefits of using a policy pattern -- easier to interchange implementations, can easily combine multiple partial implementations into a single unit which allow you to break up the implementation code into functional, reusable units, etc.

  • This would be a nice feature; however, this has to do with the size of the object. When the .h file is read, the size of the object is known (based on all its contained elements).

    If the private elements are not known, then you would not know how large of an object to new.

    You can simulate your desired behavior by the following:

    class MyClass
    {
    public:
       // public stuff
    
    private:
    #include "MyClassPrivate.h"
    };
    

    This does not enforce the behavior, but it gets the private stuff out of the .h file. On the down side, this adds another file to maintain. Also, in visual studio, the intellisense does not work for the private members - this could be a plus or a minus.

    Frederick : But of course! What a boob I am to ask that question. Thanks everyone anyway.
    peterchen : Another downside is that a change to the private interface still does require a recompile of the clients.
    J.F. Sebastian : Why this answer is accepted? "MyClassPrivate.h" is as easily can be read as the original header. It still requires recompile. Size of the object is a minor issue. The true show-stoppers are efficiency and backward-compatibility with some C idioms.
    Edward Kmett : Unfortunately MyClassPrivate.h doesn't add any value from a link-time perspective. You still have to traverse them all. For private template members functions, it would be much nicer to not have to include them in the header at all.
    KJAWolf : My soul aches at the sight of that include statement.
  • Yes, but...

    You need to read Stroustrup's "Design and Evolution of C++" book. It would have inhibited the uptake of C++.

  • You're all ignoring the point of the question -

    Why must the developer type out the PIMPL code?

    For me, the best answer I can come up with is that we don't have a good way to express C++ code that allows you to operate on it. For instance, compile-time (or pre-processor, or whatever) reflection or a code DOM.

    C++ badly needs one or both of these to be available to a developer to do meta-programming.

    Then you could write something like this in your public MyClass.h:

    #pragma pimpl(MyClass_private.hpp)
    

    And then write your own, really quite trivial wrapper generator.

  • I think there is a confusion here. The problem is not about headers. Headers don't do anything (they are just ways to include common bits of source text among several source-code files).

    The problem, as much as there is one, is that class declarations in C++ have to define everything, public and private, that an instance needs to have in order to work. (The same is true of Java, but the way reference to externally-compiled classes works makes the use of anything like shared headers unnecessary.)

    It is in the nature of common Object-Oriented Technologies (not just the C++ one) that someone needs to know the concrete class that is used and how to use its constructor to deliver an implementation, even if you are using only the public parts. The device in (3, below) hides it. The practice in (1, below) separates the concerns, whether you do (3) or not.

    1. Use abstract classes that define only the public parts, mainly methods, and let the implementation class inherit from that abstract class. So, using the usual convention for headers, there is an abstract.hpp that is shared around. There is also an implementation.hpp that declares the inherited class and that is only passed around to the modules that implement methods of the implementation. The implementation.hpp file will #include "abstract.hpp" for use in the class declaration it makes, so that there is a single maintenance point for the declaration of the abstracted interface.

    2. Now, if you want to enforce hiding of the implementation class declaration, you need to have some way of requesting construction of a concrete instance without possessing the specific, complete class declaration: you can't use new and you can't use local instances. (You can delete though.) Introduction of helper functions (including methods on other classes that deliver references to class instances) is the substitute.

    3. Along with or as part of the header file that is used as the shared definition for the abstract class/interface, include function signatures for external helper functions. These function should be implemented in modules that are part of the specific class implementations (so they see the full class declaration and can exercise the constructor). The signature of the helper function is probably much like that of the constructor, but it returns an instance reference as a result (This constructor proxy can return a NULL pointer and it can even throw exceptions if you like that sort of thing). The helper function constructs a particular implementation instance and returns it cast as a reference to an instance of the abstract class.

    Mission accomplished.

    Oh, and recompilation and relinking should work the way you want, avoiding recompilation of calling modules when only the implementation changes (since the calling module no longer does any storage allocations for the implementations).
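
    A bare-bones sketch of (1)-(3), with hypothetical names:

    // widget.hpp - the shared header: abstract interface plus factory signature.
    class Widget {
    public:
        virtual ~Widget() {}
        virtual void Frob() = 0;
    };

    Widget* CreateWidget(int size); // the constructor-proxy helper

    // widget_impl.cpp - compiled with the implementation, never shipped to clients.
    #include "widget.hpp"

    class WidgetImpl : public Widget {
    public:
        explicit WidgetImpl(int size) : size_(size) {}
        virtual void Frob() { /* ... */ }
    private:
        int size_; // private state invisible to client translation units
    };

    Widget* CreateWidget(int size) { return new WidgetImpl(size); }

    Clients include only widget.hpp, so changing WidgetImpl's private members never triggers a client recompile.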

Strategies for moving to Team System

Does anyone have any strategies/tips/traps for moving to Team System? Should it be done in baby steps, or all at once? Should we migrate our SourceSafe library over, or draw a line in the sand and move forward? Is it worth bringing SharePoint into the mix? Any thoughts are appreciated.

From stackoverflow
  • I've never had to migrate to TFS, but have used it pretty extensively for the past couple of years.

    Regarding your question on Sharepoint, we've found it pretty useful in conjunction with TFS. We use it primarily for documentation management and for storing other "non-technical" artifacts related to the project. Some dev teams advocate keeping documentation in source control alongside source code, which is OK, but in my experience our project stakeholders have an easier time accessing relevant project documentation via the Sharepoint portal than they would having to interface with source control.

    I basically was able to distribute the URL to the sharepoint site associated with our TFS team project to the concerned non-technical team members and have been able to avoid constantly e-mailing documents around, so it's been great for us.

  • It may just be too much work to do it all at once. I feel that it is easier to divvy out projects to different people one at a time. That way they can move them across and ensure that each works okay before closing out the SourceSafe.

    You will always want a backup of the SourceSafe "database" around just in case.

    I do not know how to migrate from SourceSafe to TFS and keep the comments and versions. By far the easiest is to just add the projects in, but having migrated that way in the past, we always missed the ability to find out what others had done to particular files. If you find a way to migrate, I would go that way unless it is hideously expensive.

When to call the gang of four? [When to use design patterns?]

In The Guerilla Guide to Interviewing, Joel says that guys who want to get things done but are not smart will do stupid things, like using the Visitor design pattern where a simple array would be sufficient.

I find it hard to tell when a design pattern suggested by the Gang of Four should be applied.

Therefore, I would like some examples from your work experience:

  • When is a simple approach (fixed size array) sufficient?
  • What is the minimum size of a piece of software that justifies the use of the GoF patterns?
  • When to refactor from simple-minded to GoF? Can this be done in a sensible way?
From stackoverflow
  • This is similar to any other design decision. Ultimately, it depends. You should learn those patterns that are useful in your language (many GoF patterns aren't needed in Lisp or Smalltalk, for example), learn their advantages and disadvantages, understand the constraints of your system, and make the choice that best fits your needs.

    The best advice that I can give is to learn, learn, learn.

  • I often find that using test driven development helps guide me when faced with these questions.

    • When is a simple approach sufficient? It is always sufficient to use the simplest approach to get the next test to pass. But knowing when/how to refactor is the real art form.
    • What is the minimum size of a piece of software that justifies the use of the GoF patterns? A rule of thumb I once read is that when you code something once, fine, when you duplicate that code somewhere a second time, make a note and move on. When you find a need for the same code a third time, it's time to refactor to remove duplication and simplify, and often that involves moving to a design pattern.
    • When to refactor from simple-minded to GoF? I like what @anopres said - it's time when you feel the pain of not having the design pattern in place. The pain (or code "smell") may manifest itself in several ways. Code duplication is the most obvious. Refactoring books like Fowler's Refactoring or Kerievsky's Refactoring to Patterns list many such pain points/code stenches.
    • Can this [refactoring] be done in a sensible way? The trick to refactoring is to have a suite of unit tests in place which you have confidence in, and then to refactor without causing any of those tests to fail. Refactoring, by definition, does not change the functionality of your code. Therefore, if your tests continue to pass, you can have a pretty good feeling that you didn't break anything. Although it can be difficult, I actually enjoy this part of TDD, it's almost like a game to make changes without breaking any tests.

    In summary, I would say that TDD helps guide me to write the code that is sufficient at the time, and perhaps more importantly helps me to make the changes later when inevitably requirements change, more functionality is required, etc.

  • Switching from a simple approach to a formal design pattern is usually something that happens fairly naturally for me as a problem increases in complexity. The key is to be familiar enough with the patterns that you can recognize the tipping point and switch from the simple approach to a design pattern when it will bring the most benefit for current and future development.

    For a larger, more complex project, the tipping point should be fairly early on; in many cases, before you even start coding. For smaller projects, you can afford to wait before deciding to implement a pattern.

    One of the best things you can do to increase your ability to recognize when a pattern should be used is to take some time after completing a project to examine how complex your "simple" approach has become. If it would have taken you less time and effort to implement a pattern, or if the pattern would clarify what you were trying to do, you can file that knowledge away for the next time you encounter a similar problem.

  • Design patterns are a consequence, not an objective. You don't think "today I shall use the Strategy pattern", you just do it. Halfway through writing the third nearly identical class you stop, use a paper notebook to figure out the general case, and knock up a base class that describes the shared context. You refactor the first two classes to be descendants, and this gives you a reality check and quite a few changes to your base class. Then the next thirty are a walk in the park.

    It's only the next day at the team meeting that you save everyone thirty minutes of boredom by saying "I used the Strategy pattern. They all work the same, so there's only one test program; it takes parameters to change the test case."

    Intimate familiarity with patterns makes you use them reflexively, whenever the situation demands it. When people treat the use of patterns as an objective in its own right, you get stilted, ugly code that speaks of mechanism rather than purpose; the how rather than the why.

    alchemical : --bravo-- *
  • When you have a problem that one of the patterns solves. The GoF book has a section in each chapter that explains what types of scenarios each pattern is appropriate for. You should not analyze each problem you have, then go look up what pattern to use. You should become familiar with the patterns so that you learn to recognize what situations call for them.

  • Patterns are only tools and vocabulary. You write the code to be as simple, understandable and maintainable as you know how. By knowing patterns, you have more alternatives at your disposal, and you have a language to discuss pros and cons of the approach before implementing it.

    In either case you don't just "switch" to "using a pattern". You just keep doing what you always do: write the code the best way you know how.

  • One of the nice things about the original GoF book is that there is discussion of the scenarios where each pattern would best solve a problem. Reviewing these discussions can help you determine if "it's time".

    Another good place to start is with Head First Design Patterns. The exercises that illustrate the use of different design patterns are elaborate enough to offer a good learning experience. In addition the exercises are also grounded in real world scenarios, so it's never a stretch to see when the appropriate times to apply design patterns.

Calculating Months

I have an application where the user selects the dates of a first statement and a last statement. Example, first statement = 1/1/08, last statement = 12/1/08, should equal 12 statements.

However, when using the following code, the result is 11:

numPayments = DateDiff(DateInterval.Month, CDate(.FeeStartDate), CDate(.FeeEndDate))

Is there another way to calculate this, or do I have to be stuck with adding 1 to the result?

From stackoverflow
  • Add 1, as you write. ;)

    The difference between 1/1/2008 and 12/1/2008 is 11 months. No changing that. ;)

  • Well, the number of months between Jan 1st and Dec 1st is 11... what you're looking for is the difference of months +1. So just add one :)

  • Also, the DateDiff function you're using is a VB6 hold-over. TimeSpan has no notion of months, so better to express it via the date components:

    Dim startDate = Date.Parse(.FeeStartDate)
    Dim endDate = Date.Parse(.FeeEndDate)
    numPayments = (endDate.Year - startDate.Year) * 12 + (endDate.Month - startDate.Month) + 1
  • Yes, you'd always have to add one, though you could instead add a month to the end date or subtract one from the start date to get the same effect. Consider the case where the start and end dates are the same: their difference is 0, but you'd still want 1 statement to show.

Where is the best place to add methods to the Integer class in Rails?

Where is the best place to add a method to the integer class in Rails? I'd like to add a to_meters and to_miles methods.

From stackoverflow
  • Normally (and logically), integers can't be converted to miles or to meters. It sounds like you may want to create a new class like "Feet" or "Inches" that is initialized with an integer, then contains methods like size_in_miles or size_in_meters. For convenience those methods could return decimal or float values, but you might also want to write a Miles class or a Meters class.

    As an alternate method, you might want to create a static method in your new class with a signature like this:

    Float feetToMiles(Integer feet)

    that you would call as

    miles = Feet.feetToMiles(5280)

    and get miles = 1.0

  • Create your own module/library which you include into scope when you need it to perform this task.

    Such as: require 'unit_conversions'

    And chances are, somebody has already done this if you look hard enough :)

    However, DON'T try modifying the native core class; that will only end in misery.

    (Also, the class you want to extend is Numeric; that will apply to both Integers and Floats :) )

    In reply to "Not entirely clear why I shouldn't do this... Rails does this to the String class to great success" (see the comment below):

    Just because it can be done doesn't mean it should be done. "Monkey patching", as it is known, can have all sorts of odd side effects, and it can be an epic failure when done wrong.

    Do it when there is no good alternative.

    If you really wanted to do something daft, you could build an entire framework that did nothing but monkey patch the core classes.

    Just for example, flip databasing on its head.

    5.getArtist(); 
    10.getEvent(); 
    100.getTrack();
    

    etc., etc. There is no limit to how many bad ways there are to do that.

    "Bob".createUser();
    

    misery in a cup.

    If you want to do something practical, have a Convert class or function,

    convert(3, { :from => :miles, :to => :meters })
    

    at least you're not polluting the global namespace and core functions that way and it makes more coherent sense.

    Ben : Not entirely clear why I shouldn't do this... Rails does this to the string class to great success.
  • If you were going to do this, which you shouldn't, then you would put your code into:

    config/initializers/add_methods_that_are_naughty_to_numeric.rb
    

    Rails would automatically run these for you.

  • Why not just:

    class Feet
      def self.in_miles(feet)
        feet / 5280.0
      end
    end
    

    usage:

    Feet.in_miles 2313
    

    Or maybe look at it the other way:

    class Miles
      def self.from_feet(feet)
        feet / 5280.0
      end
    end
    
    Miles.from_feet 2313
    
  • I agree monkey patching should be used with care, but occasionally it just makes sense. I really like the helpers that allow you to type 5.days.ago, which are part of the active_support library.

    So some of the other answers might be better in this case, but if you are extending Ruby classes, we keep all our extensions in lib/extensions/class_name.rb (see the sketch below).

    This way, when working on a project, it is quick and easy to find anything that might be out of the ordinary with the standard classes.
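
    For the original question, such a file might look like this (a sketch that assumes the integer represents a length in feet):

    # lib/extensions/numeric.rb
    class Numeric
      FEET_PER_MILE   = 5280.0
      METERS_PER_FOOT = 0.3048

      def to_miles
        self / FEET_PER_MILE
      end

      def to_meters
        self * METERS_PER_FOOT
      end
    end

    # After requiring it (e.g. from an initializer):
    # 5280.to_miles # => 1.0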

  • If you have your heart set on mucking with the Numeric (or Integer, etc.) class to get unit conversion, then at least do it logically and with some real value.

    First, create a Unit class that stores the unit type (meters, feet, cubits, etc.) and the value on creation. Then add a bunch of methods to Numeric that correspond to the valid values unit can have: these methods will return a Unit object with its type recorded as the method name. The Unit class would support a bunch of to_* methods that convert to another unit type with the corresponding value. That way, you can do the following:

    >> x = 47.feet.to_meters
    => 14.3256
    >> x.inspect
    => #<Unit 0xb795efb8 @value=14.3256, @type=:meter>
    

    The best way to handle it would probably be a matrix of conversion types and expressions in the Unit class, then use method_missing to check if a given type can be converted to another type. In the numeric class, use method_missing to ask Unit if it supports the given method as a unit type, and if so, return a unit object of the requested type using the numeric as its value. You could then support adding units and conversions at runtime by adding a register_type and register_conversion class method to Unit that extended the conversion matrix and Numeric would "automagically" pick up the ability.

    As for where to put it, create a lib/units.rb file, which would also contain the monkey patch to Numeric, then initialize it in config/environment.rb by requiring the lib/units.rb file.
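
    A condensed sketch of that design (conversion factors illustrative, registration and error handling omitted):

    # lib/units.rb
    class Unit
      CONVERSIONS = {
        :feet   => { :meters => 0.3048 },
        :meters => { :feet   => 1 / 0.3048 }
      }

      attr_reader :value, :type

      def initialize(value, type)
        @value, @type = value, type
      end

      # Answers to_meters, to_feet, ... via the conversion matrix.
      def method_missing(name, *args)
        target = name.to_s.sub(/^to_/, '').to_sym
        factor = CONVERSIONS[@type] && CONVERSIONS[@type][target]
        factor ? Unit.new(@value * factor, target) : super
      end
    end

    class Numeric
      # 47.feet => Unit with value 47 and type :feet
      def method_missing(name, *args)
        Unit::CONVERSIONS.key?(name) ? Unit.new(self, name) : super
      end
    end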

    jonnii : The only thing I'd suggest doing differently is doing the initialize using the initializers directory instead of in environment.rb. Great post.

Difference between a LinkedList and a Binary Search Tree

What are the main differences between a linked list and a binary search tree? Is a BST just a way of maintaining a linked list? My instructor talked about linked lists and then BSTs, but didn't compare them or say when to prefer one over the other. This is probably a dumb question but I'm really confused. I would appreciate it if someone could clarify this in a simple manner.

From stackoverflow
  • I would say the MAIN difference is that a binary search tree is sorted. When you insert into a binary search tree, where those elements end up being stored in memory is a function of their value. With a linked list, elements are blindly added to the list regardless of their value.

    Right away you can see some trade-offs: linked lists preserve insertion order and inserting is less expensive; binary search trees are generally quicker to search.

  • In computer science, a binary search tree (BST) is a binary tree data structure which has the following properties:

    • each node (item in the tree) has a distinct value;
    • both the left and right subtrees must also be binary search trees;
    • the left subtree of a node contains only values less than the node's value;
    • the right subtree of a node contains only values greater than or equal to the node's value.

    In computer science, a linked list is one of the fundamental data structures, and can be used to implement other data structures.

    So a binary search tree is an abstract concept that may be implemented with a linked list or an array, while the linked list is a fundamental data structure.

    Harper Shelby : Binary Search Trees aren't just abstract. I had to implement one in my Algorithms and Data Structures class. I didn't use a linked list or an array in the implementation.
    VarunGupta : Harper Shelby, please divulge more details regarding your implementation?
    Harper Shelby : @VarunGupta - it's been a few years, and I doubt I could dig up the source at this point, but I created a simple node structure with a data pointer, a left (subtree) pointer, and a right (subtree) pointer. The overall BST was simply a head node pointer. I wrote functions for insert/delete/etc.
  • Linked lists and BSTs don't really have much in common, except that they're both data structures that act as containers. Linked lists basically allow you to insert and remove elements efficiently at any location in the list, while maintaining the ordering of the list. This list is implemented using pointers from one element to the next (and often the previous).

    A binary search tree, on the other hand, is a data structure of a higher abstraction (i.e. it's not specified how it is implemented internally) that allows for efficient searches (i.e. in order to find a specific element you don't have to look at all the elements).

    Notice that a linked list can be thought of as a degenerate binary tree, i.e. a tree where all nodes only have one child.
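
    For concreteness, a minimal Java sketch of the two node shapes (illustrative only, no balancing):

    // A linked-list node: one outgoing reference, no ordering constraint.
    class ListNode {
        int value;
        ListNode next;
    }

    // A BST node: two references, with the invariant that everything in the
    // left subtree is smaller and everything in the right subtree is not.
    class TreeNode {
        int value;
        TreeNode left, right;

        TreeNode(int value) { this.value = value; }

        void insert(int v) {
            if (v < value) {
                if (left == null) left = new TreeNode(v); else left.insert(v);
            } else {
                if (right == null) right = new TreeNode(v); else right.insert(v);
            }
        }

        // Each comparison discards one whole subtree, which is what makes
        // searching a balanced BST logarithmic rather than linear.
        boolean contains(int v) {
            if (v == value) return true;
            TreeNode child = (v < value) ? left : right;
            return child != null && child.contains(v);
        }
    }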

  • Linked List:

    Item(1) -> Item(2) -> Item(3) -> Item(4) -> Item(5) -> Item(6) -> Item(7)
    

    Binary tree:

                     Node(1)
                    /
                Node(2)
               /    \
              /      Node(3)
      RootNode(4)
              \      Node(5)
               \    /
                Node(6)
                    \
                     Node(7)
    

    In a linked list, the items are linked together through a single next pointer. In a binary tree, each node can have 0, 1 or 2 subnodes, where (in the case of a binary search tree) the key of the left node is less than the key of the node and the key of the right node is greater than or equal to it. As long as the tree is balanced, the search path to each item is a lot shorter than that in a linked list.

    Searchpaths:

    ------ ------ ------
    key    List   Tree
    ------ ------ ------
    1      1      3
    2      2      2
    3      3      3
    4      4      1
    5      5      3
    6      6      2
    7      7      3
    ------ ------ ------
    avg    4      2.43
    ------ ------ ------
    

    With larger structures the average search path becomes significantly smaller:

    ------ ------ ------
    items  List   Tree
    ------ ------ ------
         1      1   1
         3      2   1.67
         7      4   2.43
        15      8   3.29
        31     16   4.16
        63     32   5.09
    ------ ------ ------
    
    Don Kirkby : What does "almost a lot shorter" mean? Typo? Other than that, nice answer.
    Simucal : Nice tree representation! You have a future as an ascii artist!
    Robert : +1 for ASCII art. Or is it UTF8 art?
    Gamecat : Lol, I learned "ascii art" on the good old typewriters about 30 years ago. Although we used an other name back then.
    whatnick : +1 for little arrows and art
    Lazer : @Gamecat: linked `list:tree == tree:hashing` in some sense, correct? In hashing, the search path is just 1 (assuming everything balanced).
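
    As a quick check of those numbers, here is a small sketch (Java; the class and method names are made up for this example) that builds the balanced seven-node tree from the diagram and averages the search-path lengths:

    class Node {
        int key;
        Node left, right;
        Node(int key, Node left, Node right) {
            this.key = key; this.left = left; this.right = right;
        }
    }

    public class SearchPathDemo {
        // Counts the nodes visited while searching for 'key' (the search path length).
        static int pathLength(Node node, int key) {
            int steps = 0;
            while (node != null) {
                steps++;
                if (key == node.key) return steps;
                node = (key < node.key) ? node.left : node.right;
            }
            return steps; // key not present; steps = nodes examined
        }

        public static void main(String[] args) {
            // The balanced tree from the table: root 4, then 2/6, then leaves 1,3,5,7.
            Node root = new Node(4,
                    new Node(2, new Node(1, null, null), new Node(3, null, null)),
                    new Node(6, new Node(5, null, null), new Node(7, null, null)));

            int total = 0;
            for (int key = 1; key <= 7; key++) total += pathLength(root, key);
            System.out.println("average search path: " + (total / 7.0)); // prints ~2.43
        }
    }

    For the linked list, the same loop over Item(1)..Item(7) would visit (7+1)/2 = 4 nodes on average, matching the first table.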
  • A binary search tree can be implemented in any fashion; it doesn't need to use a linked list.

    A linked list is simply a structure which contains nodes and pointers/references to other nodes inside a node. Given the head node of a list, you may browse to any other node in a linked list. Doubly-linked lists have two pointers/references: the normal reference to the next node, but also a reference to the previous node. If the last node in a doubly-linked list references the first node in the list as the next node, and the first node references the last node as its previous node, it is said to be a circular list.

    A binary search tree is a tree that splits up its input into two roughly-equal halves based on a binary search comparison algorithm. Thus, it only needs a few comparisons to find an element. For instance, if you had a tree with 1-10 and you needed to search for three, first the element at the top would be checked, probably a 5 or 6. Three would be less than that, so only the first half of the tree would then be checked. If the next value is 3, you have it; otherwise, a comparison is done, and so on, until either it is not found or its data is returned. Thus the tree is fast for lookup, but not necessarily fast for insertion or deletion. These are very rough descriptions.

    Linked List from Wikipedia, and Binary Search Tree, also from Wikipedia.

  • A linked list is just that... a list. It's linear; each node has a reference to the next node (and the previous, if you're talking about a doubly-linked list). A tree branches: each node has references to various child nodes. A binary tree is a special case in which each node has at most two children. Thus, in a linked list, each node has a previous node and a next node, and in a binary tree, a node has a left child, right child, and parent.

    These relationships may be bi-directional or uni-directional, depending on how you need to be able to traverse the structure.
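
    The two node shapes just described might be sketched like this (Java; the class and field names are illustrative):

    // Doubly-linked list node: references in both directions.
    class ListNode {
        Object data;
        ListNode prev, next;
    }

    // Binary tree node: a parent plus left and right children.
    class TreeNode {
        Object data;
        TreeNode parent, left, right;
    }

    Dropping the prev (or parent) field gives the uni-directional variants.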

  • It's actually pretty simple. A linked list is just a bunch of items chained together, in no particular order. You can think of it as a really skinny tree that never branches:

    1 -> 2 -> 5 -> 3 -> 9 -> 12 -> |i. (that last is an ascii-art attempt at a terminating null)

    A Binary Search Tree is different in 2 ways: the binary part means that each node has up to 2 children, not just one, and the search part means that those children are arranged to speed up searches - only smaller items to the left, and only larger ones to the right:

        5
       / \
      3   9
     / \   \
    1   4   12
    

    9 has no left child, and 1, 4, and 12 are "leaves" - they have no branches.

    Make sense?

    For most "lookup" kinds of uses, a BST is better. But for just "keeping a list of things to deal with later First-In-First-Out or Last-In-First-Out" kinds of things, a linked list might work well.

  • A linked list is straight linear data, with adjacent nodes connected to each other, e.g. A->B->C. You can consider it a straight fence.

    A BST is a hierarchical structure, just like a tree with the main trunk connected to branches and those branches in turn connected to other branches and so on. The "binary" word here means each branch is connected to a maximum of two branches.

    You use a linked list to represent strictly linear data, with each item connected to at most one next item, whereas a BST lets you connect an item to two others. You could use a tree like this to represent data such as a family tree, but that becomes an n-ary tree rather than a binary one, since a person can have more than two children.

  • They are totally different data structures.

    A linked list is a sequence of element where each element is linked to the next one, and in the case of a doubly linked list, the previous one.

    A binary search tree is something totally different. It has a root node, the root node has up to two child nodes, and each child node can have up to two child nodes, and so on. It is a pretty clever data structure, but it would be somewhat tedious to explain it here. Check out the Wikipedia article on it.

  • The issue with a linked list is searching within it (whether for retrieval or insert).

    For a singly-linked list, you have to start at the head and search sequentially to find the desired element. To avoid the need to scan the whole list, you need additional references to nodes within the list, in which case it's no longer a simple linked list.

    A binary tree allows for more rapid searching and insertion by being inherently sorted and navigable.

    An alternative that I've used successfully in the past is a SkipList. This provides something akin to a linked list but with extra references to allow search performance comparable to a binary tree.
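
    For contrast with the tree- and skip-list-based approaches, here is what the plain sequential search looks like (a Java sketch with a made-up node type). If you want a ready-made skip-list structure, the JDK ships one in java.util.concurrent.ConcurrentSkipListMap:

    class Node {
        int value;
        Node next;
        Node(int value, Node next) { this.value = value; this.next = next; }
    }

    public class ListSearch {
        // O(n): every lookup starts at the head and walks node by node.
        static boolean contains(Node head, int target) {
            for (Node n = head; n != null; n = n.next) {
                if (n.value == target) return true;
            }
            return false;
        }

        public static void main(String[] args) {
            Node head = new Node(1, new Node(2, new Node(5, new Node(3, null))));
            System.out.println(contains(head, 5)); // true, after visiting 3 nodes
        }
    }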

  • A linked list is a sequence of "nodes" linked to each other, i.e.:

    public class LinkedListNode
    {
        Object Data;              // the payload stored in this node
        LinkedListNode NextNode;  // the following node, or null at the end of the list
    }
    

    A Binary Search Tree uses a similar node structure, but instead of linking to the next node, it links to two child nodes:

    public class BSTNode
    {
        Object Data;          // the payload stored in this node
        BSTNode LeftNode;     // root of the subtree holding smaller values
        BSTNode RightNode;    // root of the subtree holding larger values
    }
    

    By following specific rules when adding new nodes to a BST, you can create a data structure that is very fast to traverse. Other answers here have detailed these rules; I just wanted to show at the code level the difference between the node classes.

    It is important to note that if you insert sorted data into a BST, you'll end up with a linked list, and you lose the advantage of using a tree.

    Because of this, a linked list offers O(N) search, while a BST offers O(N) search in the worst case and O(log N) when the tree is balanced.
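
    To show those rules at the code level too, here is a minimal recursive insert (a sketch using int keys instead of Object so the comparison is explicit; the names are made up):

    class IntBstNode {
        int key;
        IntBstNode left, right;
        IntBstNode(int key) { this.key = key; }
    }

    public class BstInsert {
        // Inserts 'key' into the subtree rooted at 'node' and returns the
        // (possibly new) subtree root. Smaller keys go left, larger go right.
        static IntBstNode insert(IntBstNode node, int key) {
            if (node == null) return new IntBstNode(key); // empty spot found
            if (key < node.key) {
                node.left = insert(node.left, key);
            } else if (key > node.key) {
                node.right = insert(node.right, key);
            }
            // equal keys are ignored here; a real implementation must pick a policy
            return node;
        }
    }

    Feeding this insert already-sorted keys makes every comparison take the same branch, which is exactly how the linked-list-shaped degenerate tree mentioned above arises.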

  • A Binary Search Tree is a binary tree in which each internal node x stores an element such that the elements stored in the left subtree of x are less than or equal to x and the elements stored in the right subtree of x are greater than or equal to x.

    (image: diagram of a binary search tree)

    Now a Linked List consists of a sequence of nodes, each containing arbitrary values and one or two references pointing to the next and/or previous nodes.

    (image: diagram of a linked list)

  • They do have similarities, but the main difference is that a Binary Search Tree is designed to support efficient searching for an element, or "key".

    A binary search tree, like a doubly-linked list, has nodes that point to two other elements in the structure. However, when adding elements to the structure, rather than just appending them to the end of the list, the binary tree is organized so that elements linked to the "left" node are less than the current node and elements linked to the "right" node are greater than the current node.

    In a simple implementation, the new element is compared to the first element of the structure (the root of the tree). If it's less, the "left" branch is taken, otherwise the "right" branch is examined. This continues with each node, until a branch is found to be empty; the new element fills that position.

    With this simple approach, if elements are added in order, you end up with a linked list (with the same performance). Different algorithms exist for maintaining some measure of balance in the tree, by rearranging nodes. For example, AVL trees do the most work to keep the tree as balanced as possible, giving the best search times. Red-black trees don't keep the tree as balanced, resulting in slightly slower searches, but do less work on average as keys are inserted or removed.
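
    As a quick illustration of the degenerate case described above, this sketch (Java; the names are made up) inserts the same seven keys in sorted order and in a balanced order, then compares the resulting heights:

    class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    public class BalanceDemo {
        static Node insert(Node n, int key) {
            if (n == null) return new Node(key);
            if (key < n.key) n.left = insert(n.left, key);
            else if (key > n.key) n.right = insert(n.right, key);
            return n;
        }

        // Height = number of nodes on the longest root-to-leaf path.
        static int height(Node n) {
            return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
        }

        public static void main(String[] args) {
            Node sorted = null, balanced = null;
            for (int k : new int[] {1, 2, 3, 4, 5, 6, 7}) sorted = insert(sorted, k);
            for (int k : new int[] {4, 2, 6, 1, 3, 5, 7}) balanced = insert(balanced, k);
            System.out.println(height(sorted));   // 7 - a linked-list-shaped chain
            System.out.println(height(balanced)); // 3 - roughly what self-balancing trees maintain
        }
    }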