Code Question: 03/28/11

Monday, March 28, 2011

Need way to determine whether function has void return type in VC6 and VC7

The following C++ code compiles and runs correctly for GNU g++, LLVM and every other C++ compiler I threw at it except for Microsoft VC6 and VC7:

```
template<typename A, typename B> int HasVoidReturnType(A(*)(B)) { return 0; }
template<typename B> int HasVoidReturnType(void(*)(B)) { return 1; }
void f(double) {}
int foo() { return HasVoidReturnType(f); }
```

For VC6 and VC7, it fails to compile and gives the error:

f.cpp(4) : error C2667: 'HasVoidReturnType' : none of 2 overloads have a best conversion
    f.cpp(2): could be 'int HasVoidReturnType(void (__cdecl *)(B))'
    f.cpp(1): or       'int HasVoidReturnType(A (__cdecl *)(B))'
    while trying to match the argument list '(overloaded-function)'
f.cpp(4) : error C2668: 'HasVoidReturnType' : ambiguous call to overloaded function
    f.cpp(2): could be 'int HasVoidReturnType(void (__cdecl *)(B))'
    f.cpp(1): or       'int HasVoidReturnType(A (__cdecl *)(B))'
    while trying to match the argument list '(overloaded-function)'

Rather than arguing the merits of what compiler is right, how can I determine from a template function whether a function has a void return type using VC6 and VC7?

From stackoverflow

Instead of creating two templates, have you tried just using the first one and using template specialization to define the second?

FYI, this compiles on Microsoft Visual C++ 2008 Express Edition. (I would have liked to help, but I can't reproduce the problem on my compiler.)

Try this on for size:
```
template<typename FuncPtrType>
struct DecomposeFuncPtr;

template<typename ReturnType, typename ArgType>
struct DecomposeFuncPtr<ReturnType(*)(ArgType)> {
  typedef ReturnType return_type;
};

template<typename T>
struct is_void {
  enum { value = 0 };
};

template<>
struct is_void<void> {
  enum { value = 1 };
};

template<typename T>
int HasVoidReturnType(T dontcare) {
  return is_void< typename DecomposeFuncPtr<T>::return_type >::value;
}
```
It should avoid the overloading that is confusing VC6/7.

Hrmm. Sorry, I couldn't test it with VC6/7. I seem to recall running into issues using function pointers with templates before in VC, though. Since we know that deducing A and B works for the function in your original, I wonder if something like:
```
template<typename T>
struct is_void {
  enum { value = 0 };
};

template<>
struct is_void<void> {
  enum { value = 1 };
};

template<typename A, typename B>
int HasVoidReturnType(A(*)(B)) {
  return is_void<A>::value;
}
```
would work.

Johannes Schaub - litb : Try template<typename A, typename B> char (& HasVoidReturnType(A(*)(B)) )[is_void<A>::value + 1]; then sizeof(HasVoidReturnType(fun)) can tell you either 1 or 2 depending on whether it returns void. Maybe that works with VC6 too? It would work at compile time :)

Greg Rogers : The last way is how I would have written it in the first place - but probably because I don't grok overload resolution compared to template specialization.

As far as VC++ 6 is concerned, you are screwed, as it doesn't support partial template specialisation, which is what you need to solve this problem.

Andrew Medico : VC++ 6 can barely be called a "C++ compiler".

ChrisInEdmonton : VC++ 7.0 also sucks. VC++ 7.1, on the other hand, is a pretty decent C++ compiler. The version numbering system here is unfortunate.

In which situations is it advisable to opt for BSD systems instead of Linux?

For an everyday user with new hardware, Linux seems to me the natural choice if somebody is looking for an alternative to Windows. But when does it make sense to give the BSD variants a try?

From stackoverflow

I'm told that the BSDs are more... coherent than the Linuxes. I've had long conversations with my sysadmin friend on why/why not BSD/Linux. Here's a link:

http://www.over-yonder.net/~fullermd/rants/bsd4linux/bsd4linux1.php?dupe=with_honor

Having said that, I started using Debian in 2007, and I've never looked back! :)

Jason Baker : Debian is awesome. For the most part, I agree with the link you posted though.

One of the big areas where BSD has an advantage over Linux is licensing. Linux's GPL can make it difficult to use some differently-licensed features of other operating systems. The first one that springs to mind is ZFS.

Plus, BSD is a somewhat more mature operating system (being directly descended from AT&T's original Research UNIX).

The commonly cited wisdom is that BSD is more useful as a server OS and Linux is more useful as a desktop OS. But don't take that as gospel truth, as lots of people have successfully used Linux as a server OS and lots of people have used BSD as a desktop OS.

I've always found the BSDs to be more intuitive. There are some different philosophies in BSD than in Linux. For example, Linux prefers GNU commands, while BSD opts for either classic BSD commands (which are similar, but often have different options) or newly written ones, falling back to GNU when nothing else is available. Also, I find the BSD man pages to be more comprehensive and to contain more examples than GNU man pages, since GNU tends to prefer info pages (which I despise) for examples.

Many ISP sysadmins swear by BSD. They claim it holds up better under load, hasn't made as many compromises for the desktop, and that its networking stack is more efficient and less buggy. I don't know if those claims were, or still are, true, but this is what I've been told.

Also, OpenBSD has a reputation of focusing heavily on security, and they have historically had a very good track record when it comes to security. They take proactive measures (developing new C Runtime library routines, for instance) to prevent security flaws before they can be written.

NetBSD has a reputation of running on just about anything. They have a long list of platforms they actively support. Linux, to some extent, tries to do this as well, but typically only a small subset of these platforms is supported in the mainline.

Finally, it often just comes down to personal preference. Do the guys you have or are going to hire know BSD? Do you personally like it?

There are also some reasons NOT to run BSD. If you're primarily a desktop user, BSD may not be the best choice. Sure, you can install most of the same stuff on BSD as on Linux, but you won't find a "distro" similar to, say, Ubuntu, which focuses strictly on the desktop. Also, some device drivers aren't available on BSD because they were written under GPL-only licenses.

Jason Baker : Desktop oriented versions of BSD do exist, but I'd argue that they're not as good quality as Ubuntu. Google for PC-BSD or Desktop BSD if you want to check them out.

jandersson : +1 on the man pages. OpenBSD for example has man pages that are comprehensive and meticulously correct. Also man pages are available not only for commands, but for config files as well as general concepts.

clear buffer cache on Mac OS X

Is there a way to programmatically clear the buffer cache on the Mac, preferably in C?

Basically, I'm looking for the source of 10.5's purge command. EDIT: I now see this is part of the CHUD tools, for which it seems the source isn't directly available. However, I'm still looking for some code to do the same.

From stackoverflow

You could use sync(2) several times (as in the well-known idiom sync; sync; sync). I can't seem to find the purge source code but it may just be part of the man packages available in 10.5.6 code

Jason Coco : purge is part of CHUD, so that's why you can't find the source ;-)

Ben Alpert : So, does that mean the source is unavailable, or just hiding?

Ben Alpert : Keltia, that will only force writes to be written to disk; it doesn't actually clear the buffer cache.

Keltia : Apart from trying to disassemble it or run it under DTrace, I do not see a way short of gaining access to the source code then.

When you don't have the source code for the tool you wish to emulate (as is the case here), there are a number of ways to go about it.

1/ From your C code, simply call the tool with a system() function call. This works well as long as there's no visible effect (such as opening a graphical window). You could use system("/path/to/purge -purgargs >/dev/null 2>&1");, for example.

2/ Reverse-engineer the code to see how it's actually doing it. This is somewhat trickier since it will require knowledge of the assembler language, system calls and many other things.

3/ Contact the developers to obtain tips on how it was done. This doesn't have to be a "send me the code so I can rip it off and make money" question. You could phrase it as "I have an interest in using purge for development but I'm unsure exactly what it does" or "I have security issues with running the code, the powers that be won't let me run it unless we know exactly what it does". Then you code yours to do the same.

Me, I would just use option 1 if possible (I'm inherently lazy :-). If you're going to write a tool to compete with purge (and this'll be hard given it's free), option 2 is probably the best bet.

Ben Alpert : I'm guessing that Apple wouldn't be that helpful if I went and asked for some of their proprietary source code.

paxdiablo : No, I don't think they would be either, which is why I said to phrase it more subtly :-) Anyway, you don't need their code (and using it would possibly make your code a derivative work/copyright violation). You only need to know how to do it conceptually.

Wouldn't you be interested in turning off the cache for a file instead? Depending on what you are trying to achieve, it could be an alternative. Good summary here.

The UBC (unified buffer cache) can be cleared by running 'purge', which allocates a lot of memory to force the cache to clear.
```
fcntl(fd, F_GLOBAL_NOCACHE, 1)
```
can be used to turn caching off for a particular file. This can be done in any process, and the file can be closed afterwards.

lpfavreau : I'm just giving that as an alternative route, not knowing what you're doing exactly. Just ignore that if it doesn't fit the bill. It's just that _sometimes_, when you need to purge a cache often, it might be because it shouldn't be cached in the first place.

It seems that:

You can use /usr/bin/purge (type purge in the terminal) to flush the disk cache (inactive memory), or you can do many random reads from the hard disk to do the same thing.

Taken from a comment from user guns.

I've disassembled the function in question (_utilPurgeDiskBuffers) from the CHUD framework. The function doesn't seem to be very complex, but since I'm no MacOS programmer, the imports and called system APIs don't make much sense to me.

The first thing the API does is to call another function, namely _miscUtilsUserClientConnect_internal. This function seems to establish a connection to the CHUD kernel extension.
To do this, it calls _getCHUDUtilsKextService which tries to locate the CHUD kernel extension by enumerating all kexts using the IORegistryCreateIterator imported from the I/O kit. After the kext has been found, it is opened via _IOServiceOpen.

At this point we have a connection to the CHUD kext (at least that's my understanding from the disassembly listing).

Finally a call to IOConnectMethodStructureIStructureO is made, which I guess carries out the real magic.
Without knowing some internal details or the signature of this function the parameters don't make sense to me.

Here's the disassembly, though:
```
__text:4B0157A7 lea     eax, [ebp+var_1C]
__text:4B0157AA mov     dword ptr [esp+14h], 0
__text:4B0157B2 mov     [esp+10h], eax
__text:4B0157B6 mov     [esp+0Ch], eax
__text:4B0157BA mov     dword ptr [esp+8], 0
__text:4B0157C2 mov     dword ptr [esp+4], 0Eh
__text:4B0157CA mov     [esp], edx
__text:4B0157CD call    _IOConnectMethodStructureIStr
```
Note that var_1C has been zeroed out before.

Hopefully some of you can make more sense out of those syscalls. If you want more information, let me know.

Update:
To get you started, just take the AppleSamplePCIClient.c example from the IO kit SDK. This does basically what the purge application from the CHUD tools does.
The only thing you would have to change are the parameters to the final _IOConnectMethodStructureIStr call. Take them from the disassembly listing above. I cannot test all this stuff since I don't have a Mac.

How to externally populate a Django model?

What is the best idea to fill up data into a Django model from an external source?

E.g. I have a model Run, and runs data in an XML file, which changes weekly.

Should I create a view and call that view URL from a curl cronjob (with the advantage that that data can be read anytime, not only when the cronjob runs), or create a python script and install that script as a cron (with DJANGO_SETTINGS_MODULE variable setup before executing the script)?

From stackoverflow

You don't need to create a view, you should just trigger a python script with the appropriate Django environment settings configured. Then call your models directly the way you would if you were using a view, process your data, add it to your model, then .save() the model to the database.

Marius Ursache : I can do this from both sides, save it from the view or save it from the python script.

Carl Meyer : A custom management command is a better solution than munging the Django environment settings yourself. See Daevaorn's answer.

There is an excellent way to do maintenance-like jobs within the project environment: write a custom manage.py command. It takes care of all the environment configuration and lets you concentrate on the concrete task.

And of course you can call it directly from cron.

"create a python script and install that script as a cron (with DJANGO_SETTINGS_MODULE variable setup before executing the script)?"

First, be sure to declare your Forms in a separate module (e.g. forms.py)

Then, you can write batch loaders that look like this. (We have a LOT of these.)
```
from myapp.forms import MyObjectLoadForm
from myapp.models import MyObject
import xml.etree.ElementTree as ET

def xmlToDict( element ):
    return dict(
        field1= element.findtext('tag1'),
        field2= element.findtext('tag2'),
    )

def loadRow( aDict ):
     f= MyObjectLoadForm( aDict )
     if f.is_valid():
         f.save()

def parseAndLoad( someFile ):
    doc= ET.parse( someFile ).getroot()
    # Element.getiterator() was removed in Python 3.9; iter() is the replacement.
    for tag in doc.iter( "someTag" ):
        loadRow( xmlToDict(tag) )
```
Note that there is very little unique processing here -- it just uses the same Form and Model as your view functions.

We put these batch scripts in with our Django application, since it depends on the application's models.py and forms.py.

The only "interesting" part is transforming your XML row into a dictionary so that it works seamlessly with Django's forms. Other than that, this command-line program uses all the same Django components as your view.

You'll probably want to add options parsing and logging to make a complete command-line app out of this. You'll also notice that much of the logic is generic -- only the xmlToDict function is truly unique. We call these "Builders" and have a class hierarchy so that our Builders are all polymorphic mappings from our source documents to Python dictionaries.
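The XML-to-dictionary step described above can be tried on its own, without Django. The following is only a minimal sketch: the `run` element and its `date` and `distance` children are hypothetical stand-ins for whatever the real file contains, and the resulting dictionaries are what you would hand to a form such as `MyObjectLoadForm`.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document; the real tag names will differ.
SAMPLE_XML = """<runs>
  <run><date>2009-01-05</date><distance>5.2</distance></run>
  <run><date>2009-01-12</date><distance>10.0</distance></run>
</runs>"""

def xml_to_dict(element):
    """Map one XML row element to a plain dict keyed by (assumed) form field name."""
    return {
        "date": element.findtext("date"),
        "distance": element.findtext("distance"),
    }

def parse_rows(source, row_tag="run"):
    """Yield one dict per <row_tag> element found anywhere in the document."""
    root = ET.parse(source).getroot()
    for element in root.iter(row_tag):
        yield xml_to_dict(element)

# ET.parse accepts a filename or a file-like object.
rows = list(parse_rows(io.StringIO(SAMPLE_XML)))
```

All values come back as strings, which is convenient here because a Django form does its own type conversion and validation when it is bound to the dictionary.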

Carl Meyer : No reason not to implement this kind of script as a Django management command. It integrates with other commands in manage.py, and it takes care of things like argument and option parsing for you. More "Djangoic".

I've used cron to update my DB using both a script and a view. From cron's point of view it doesn't really matter which one you choose. As you've noted, though, it's hard to beat the simplicity of firing up a browser and hitting a URL if you ever want to update at a non-scheduled interval.

If you go the view route, it might be worth considering a view that accepts the XML file itself via an HTTP POST. If that makes sense for your data (you don't give much information about that XML file), it would still work from cron, but could also accept an upload from a browser -- potentially letting the person who produces the XML file update the DB by themselves. That's a big win if you're not the one making the XML file, which is usually the case in my experience.

Can I select multiple objects in a Linq query

I'm very new to Linq so bear with me. Can I return more than one item in a select? For instance I have a List of Fixtures (think football (or soccer for the yanks) fixtures). Each fixture contains a home and away team and a home and away score. I want to get all the teams that drew. I want to use something like

```
IEnumerable<Team> drew = from fixture in fixtures
                         where fixture.Played && (fixture.HomeScore == fixture.AwayScore)
                         select fixture.HomeTeam && fixture.AwayTeam;
```

I know this syntax is incorrect, what I don't know is if it's possible to do this. Would I need two queries and then concatenate them?

Edit: this is really a learning thing so it's not critical to achieve this in any particular way. Basically, at this stage all I want is a list of teams that have drawn. An example usage might be that for a given list of fixtures I can find all of the drawn teams so that I could update their standings in a table by 1 point (3 for a win, 0 for a loss).

Cheers James

From stackoverflow

101 LINQ Samples, namely Select - Anonymous Types 1
```
... select new { HomeTeam = fixture.HomeTeam, AwayTeam = fixture.AwayTeam };
```
Mike Powell : Not the answer he's looking for. He wants a list of Teams, not a list of anonymous types with hometeam and awayteam properties.

James Hay : This is true... i could get round it using anonymous types... just wondered if there was a way to get just a list of teams. If it's the only way it's the only way though

bendewey : I agree that this doesn't return a list of teams, but I think it's better for him to adapt his code to support handling this anon type. If James Hay could update his question to describe his usage, that might help.

Mike Powell : I think his question already describes his requirement perfectly: "I want to get a list of teams that drew." There are lots of reasons he might not want to use anonymous types here (needing to pass the list outside this method would be a common one).

Edit: Sorry, misunderstood your original question, so rewrote answer.

You could use the "SelectMany" operator to do what you want:

```
IEnumerable<Team> drew =
           (from fixture in fixtures
            where fixture.Played && (fixture.HomeScore == fixture.AwayScore)
            // A collection initializer lists the elements; List<Team> has no
            // HomeTeam/AwayTeam properties to assign.
            select new List<Team> { fixture.HomeTeam, fixture.AwayTeam }
           ).SelectMany(teams => teams);
```

This will return a flattened list of teams that drew.

Or you can define a type to hold all that data:

```
IEnumerable<TeamCluster> drew = from fixture in fixtures
                         where fixture.Played && (fixture.HomeScore == fixture.AwayScore)
                         select new TeamCluster {
                             Team1 = fixture.HomeTeam,
                             Team2 = fixture.AwayTeam,
                             Score1 = fixture.HomeScore,
                             Score2 = fixture.AwayScore
                         };

class TeamCluster {
    public Team Team1 { get; set; }
    public Team Team2 { get; set; }
    public int Score1 { get; set; }
    public int Score2 { get; set; }
}
```

I think you're looking for the Union method as follows:

```
IEnumerable<Team> drew = (from fixture in fixtures
                     where fixture.Played 
                        && (fixture.HomeScore == fixture.AwayScore)
                     select fixture.HomeTeam)
                     .Union(from fixture in fixtures
                     where fixture.Played 
                        && (fixture.HomeScore == fixture.AwayScore)
                     select fixture.AwayTeam);
```

An (independent) variation on John Price's solution...

```
IEnumerable<Team> drew =
    from fixture in fixtures
    where fixture.Played && (fixture.HomeScore == fixture.AwayScore)
    from team in new[]{fixture.AwayTeam, fixture.HomeTeam}
    select team;
```

You could consider adding "ParticipatingTeams" to the Fixture class to get:

```
IEnumerable<Team> drew =
    from fixture in fixtures
    where fixture.Played && (fixture.HomeScore == fixture.AwayScore)
    from team in fixture.ParticipatingTeams
    select team;
```

mattdekrey : +1 for your first query - doesn't require a contract change and is more efficient than the leading answer.

Taking a stab at this myself I came up with the same version as 'it depends'.

Using query comprehension syntax:

```
IEnumerable<Team> drew =
    from fixture in fixtures
    where fixture.Played && (fixture.HomeScore == fixture.AwayScore)
    from team in new[]{fixture.AwayTeam, fixture.HomeTeam}
    select team;
```

Using lambda with extension methods:

```
IEnumerable<Team> drew =
    fixtures.Where(f => f.Played && f.HomeScore == f.AwayScore)
    .SelectMany(f => new[]{f.HomeTeam, f.AwayTeam});
```

Edit: I don't know if a team could have possibly played and drawn more than once in your database, but if that's possible, then you might want to make use of the Distinct query operator:

```
IEnumerable<Team> drew =
    (from fixture in fixtures
     where fixture.Played && (fixture.HomeScore == fixture.AwayScore)
     from team in new[]{fixture.AwayTeam, fixture.HomeTeam}
     select team).Distinct();
```

or:

```
IEnumerable<Team> drew =
    fixtures.Where(f => f.Played && f.HomeScore == f.AwayScore)
    .SelectMany(f => new[]{f.HomeTeam, f.AwayTeam})
    .Distinct();
```

Associated programming languages to jobs

Hello,

I was thinking about all of the different programming languages and wondering which programming language is best suited for certain jobs. I am in web development and we were using ColdFusion but have now switched over to .NET. So for the web I would think a small list of the main languages would be PHP, ASP.NET, ColdFusion, and Perl. I would also think that gaming is usually C++. More computational science might require C. My question is what language would you associate with what job and, more specifically, does anyone still use C and what for?

From stackoverflow

I would associate PHP/Javascript with the web developer (my) role. Throw in a healthy dose of CSS too!

I've never written any C code in any professional capacity, so I shan't comment further.

In terms of web development, you could also add Ruby and Python as two increasingly popular languages.

Wasabi - Bug Tracking/Software Project Management

(someone had to say it)

To the question about C:

I use C for anything remotely low level, like directly interacting with binary formats. I use a scripting language to glue those smaller programs together.

To the broader question:

I think the most important point is to factor your skill with the language into the suitability of the language. I happen to use PHP as my general purpose glue/scripting language (and I don't mean web stuff). Not because it's the best choice out there, but because my familiarity with it at the moment makes it the most appropriate language for me to use.

I emphasize at the moment because the second most important part is to constantly be refining your tool set by learning new languages. Just like polyglots are more expressive in one spoken language because they know others, a programmer who knows many languages knows many approaches to a problem.

The traditional application domain for C has been system software and other programs where performance is so critical that it's worth the tradeoff in higher development and maintenance costs. If you were writing a production-quality virtual machine, you might consider C.

Some still argue that it has a place in environments constrained in other ways, such as limited-resource embedded systems.

If by "computational science" you mean computer science / computer engineering education, you may still find C in environments which want to expose students to "the bare metal". If you mean computing in support of scientific research, you might be surprised at how much FORTRAN code is around.

You might also be amazed at the amount of cycles still spent on COBOL.

I tend to think of the above as "taxicab" languages. They aren't glamorous by any stretch of the imagination, but are still found all over the place due to their workaday utility.

As for other languages, one often finds Java in the big-server corporate/enterprise environment, as well as in many open-source projects. The JVM-as-platform is host for a rapidly-growing variety of alternate languages, many of which leverage the wide range of available Java libraries to avoid reinventing the wheel. These alternative languages may be open-source projects, academic efforts, or individual efforts; examples include Scala, Fan, JRuby, Jython, Clojure, Groovy, and hundreds more.

All of Python, Ruby, Perl, Lisp, Squeak, etc. have active, enthusiastic user communities with applications all over the landscape, especially including web development.

Functional languages, such as Haskell, OCaml, Scala, Erlang, F#, etc. were primarily academic, research, or niche efforts in the past, but are seeing much more interest in a wide range of application areas due to the increasing interest in high reliability, high code re-use, and multi-core/parallel programming.

The list goes ever on...

Haskell, OCaml, and Python have all made big inroads in scientific computing.

Games, robots, and jet engines are written in C++.

"...Please don't assume Lisp is only useful for Animation and Graphics, AI, Bioinformatics, B2B and E-Commerce, Data Mining, EDA/Semiconductor applications, Expert Systems, Finance, Intelligent Agents, Knowledge Management, Mechanical CAD, Modeling and Simulation, Natural Language, Optimization, Research, Risk Analysis, Scheduling, Telecom, and Web Authoring just because these are the only things they happened to list." -- Kent Pitman

How to terminate script's process tree in Cygwin bash from bash script

I have a Cygwin bash script that I need to watch and terminate under certain conditions - specifically, after a certain file has been created. I'm having difficulty figuring out how exactly to terminate the script with the same level of completeness that Ctrl+C does, however.

Here's a simple script (called test1) that does little more than wait around to be terminated.

```
#!/bin/bash

test -f kill_me && rm kill_me

touch kill_me
tail -f kill_me
```

If this script is run in the foreground, Ctrl+C will terminate both the tail and the script itself. If the script is run in the background, a kill %1 (assuming it is job 1) will also terminate both tail and the script.

However, when I try to do the same thing from a script, I'm finding that only the bash process running the script is terminated, while tail hangs around disconnected from its parent. Here's one way I tried (test2):

```
#!/bin/bash

test -f kill_me && rm kill_me

(
    touch kill_me
    tail -f kill_me
) &

while true; do
    sleep 1
    test -f kill_me && {
        kill %1
        exit
    }
done
```

If this is run, the bash subshell running in the background is terminated OK, but tail still hangs around.

If I use an explicitly separate script, like this, it still doesn't work (test3):

```
#!/bin/bash

test -f kill_me && rm kill_me

# assuming test1 above is included in the same directory
./test1 &

while true; do
    sleep 1
    test -f kill_me && {
        kill %1
        exit
    }
done
```

tail is still hanging around after this script is run.

In my actual case, the process creating files is not particularly instrumentable, so I can't get it to terminate of its own accord; by finding out when it has created a particular file, however, I can at that point know that it's OK to terminate it. Unfortunately, I can't use a simple killall or equivalent, as there may be multiple instances running, and I only want to kill the specific instance.

From stackoverflow

This script looks like it'll do the job:

```
#!/bin/bash
# Author: Sunil Alankar

##
# recursive kill. kills the process tree down from the specified pid
#

# foreach child of pid, recursive call dokill
dokill() {
    local pid=$1
    local itsparent=""
    local aprocess=""
    local x=""
    # next line is a single line
    for x in `/bin/ps -f | sed -e '/UID/d;s/[a-zA-Z0-9_-]\{1,\} \{1,\}\([0-9]\{1,\}\) \{1,\}\([0-9]\{1,\}\) .*/\1 \2/g'`
    do
        if [ "$aprocess" = "" ]; then
            aprocess=$x
            itsparent=""
            continue
        else
            itsparent=$x
            if [ "$itsparent" = "$pid" ]; then
                dokill $aprocess
            fi
            aprocess=""
        fi
    done
    echo "killing $1"
    kill -9 $1 > /dev/null 2>&1
}

case $# in
1) PID=$1
        ;;
*) echo "usage: rekill <top pid to kill>";
        exit 1;
        ;;
esac

dokill $PID
```

Barry Kelly : The script doesn't work unmodified in Cygwin, but it was a starting point. Upvoted, but with a working script in my own answer.

Adam's link put me in a direction that will solve the problem, albeit not without some minor caveats.

The script doesn't work unmodified under Cygwin, so I rewrote it, and with a couple more options. Here's my version:

```
#!/bin/bash

function usage
{
    echo "usage: $(basename $0) [-c] [-<sigspec>] <pid>..."
    echo "Recursively kill the process tree(s) rooted by <pid>."
    echo "Options:"
    echo "  -c        Only kill children; don't kill root"
    echo "  <sigspec> Arbitrary argument to pass to kill, expected to be signal specification"
    exit 1
}

kill_parent=1
sig_spec=-9

function do_kill # <pid>...
{
    kill "$sig_spec" "$@"
}

function kill_children # pid
{
    local target=$1
    local pid=
    local ppid=
    local i
    # Returns alternating ids: first is pid, second is parent
    # (tail -n +2 skips the ps header; the older "tail +2" form is obsolete)
    for i in $(ps -f | tail -n +2 | cut -b 10-24); do
        if [ ! -n "$pid" ]; then
            # first in pair
            pid=$i
        else
            # second in pair
            ppid=$i
            (( ppid == target && pid != $$ )) && {
                kill_children $pid
                do_kill $pid
            }
            # reset pid for next pair
            pid=
        fi
    done

}

test -n "$1" || usage

while [ -n "$1" ]; do
    case "$1" in
        -c)
            kill_parent=0
            ;;

        -*)
            sig_spec="$1"
            ;;

        *)
            kill_children $1
            (( kill_parent )) && do_kill $1
            ;;
    esac
    shift
done
```

The only real downside is the somewhat ugly message that bash prints out when it receives a fatal signal, namely "Terminated", "Killed" or "Interrupted" (depending on what you send). However, I can live with that in batch scripts.

/bin/kill (the program, not the bash builtin) interprets a negative PID as “kill the process group” which will get all the children too.

Changing
```
kill %1
```
to
```
/bin/kill -- -$$
```
works for me.
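The same process-group idea can be driven from a supervising program rather than from inside the script. The sketch below is only an illustration in Python under stated assumptions: it assumes a POSIX environment where `os.killpg` is available (behaviour under Cygwin is not verified here), and the helper names `start_in_new_group` and `kill_group` are made up for this example. Instead of signalling its own group, the supervisor starts the child as leader of a fresh group, so that signalling the group cannot hit the supervisor itself.

```python
import os
import signal
import subprocess

def start_in_new_group(argv):
    """Start argv as the leader of a new session, and therefore a new process group."""
    return subprocess.Popen(argv, start_new_session=True)

def kill_group(proc, sig=signal.SIGTERM):
    """Send sig to every process in proc's process group (the shell and its children)."""
    os.killpg(os.getpgid(proc.pid), sig)

# The shell starts a background child and waits for it; both share one process group,
# much like the script and its tail process in the question.
proc = start_in_new_group(["sh", "-c", "sleep 30 & wait"])
kill_group(proc)
exit_code = proc.wait(timeout=5)  # negative value: terminated by that signal number
```

Because the signal goes to the whole group, the background `sleep` is terminated together with the shell rather than being left behind as an orphan.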

Barry Kelly : Thanks! It's odd that that is not documented in the Cygwin kill manpage. It does, however, work with the Cygwin version of kill.

Setting Culture / Language in RichTextBox WPF

Hi, is it possible to set a default language, or set a new one, in a RichTextBox? I want to set it to "es-PE" for spell-checking purposes.

Thanks!

From stackoverflow

Have you tried setting the current thread's culture to the one you want? Most stuff in .NET takes the culture from the thread.
```
// CultureInfo lives in the System.Globalization namespace.
Thread.CurrentThread.CurrentCulture = new CultureInfo( "es-PE" );
Thread.CurrentThread.CurrentUICulture = Thread.CurrentThread.CurrentCulture;
```
John : This would affect all of the RichTextBox controls in the application

Angel Escobedo : can't find the Culture namespace

...RichTextBox Language="es-PE" Name="mainRTB" AcceptsTab="True" SpellCheck.IsEnabled="True"...

Sorry, I was thinking of extending the control and using System.Windows.Forms.InputLanguage ...(fx 2.0)

It would seem to be possible by setting xml:lang, as in the code below:
```
   <StackPanel>
    <RichTextBox SpellCheck.IsEnabled="True"/>
    <RichTextBox SpellCheck.IsEnabled="True" xml:lang="es-PE"/>
</StackPanel>
```
The first box checks in the default culture and the second in the specified one ("es-PE").

The documentation also suggests you could add this attribute to a parent control such as a panel and it will be inherited by the child controls.

Asp.net E-commerce performance

I am developing an e-commerce project on ASP.NET 3.5 with C#. I am using a 3-tier (Data + Business + UI) structure to reach the data in the database (MS SQL Server 2005).

Everything goes through stored procedures (the CRUD methods).

There is a performance issue: the project runs very slowly. I couldn't find any problem in the transaction model.

Also, the project is running on shared hosting in an overseas country. The database server and web server run on different machines. The database server hosts nearly 1000 databases.

How can I test and find out where the problem is?

From stackoverflow

Since you're running on a shared hosting service, I would guess that's where your problem is. You're competing for server resources with every other website and database on those servers.

To make sure, I would set up a local environment that mimics your production environment. Then perform some standard stress tests to see how it performs. If it performs how you would expect, then it is probably your hosting solution.

With shared hosting solutions, you really do get what you pay for. If it's a system that requires a lot more speed than you're getting, you should look at a dedicated hosting solution.
Since there are upwards of 1000 databases sharing resources, I would take a stab that that might be your issue. If you connect to your database and it takes 5 seconds to run a simple query, then you can guess the problem.

I would add some stopwatch functionality to a "test page" that runs on your web server. This should give you the basic info to see if there is a bottleneck in waiting for the database to return your query. If the database turns out to respond quickly, then I would suspect your web server.

Your last option would be to set up a simple low-spec machine with the DB and web server on it and just test. Depending on how much traffic your site is getting, you should be able to get a pretty good idea of its response time.

Tools such as YSlow might also be of some help however these are usually used more for fine tuning.
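
The stopwatch idea above can be sketched in a few lines. This is only an illustration in Python, not ASP.NET: the helper `time_query` and the `products` table are hypothetical, and an in-memory SQLite database stands in for the remote database server so the example runs anywhere.

```python
import sqlite3
import time


def time_query(conn, sql, params=()):
    """Run one query and return (rows, elapsed_seconds). Hypothetical helper."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed = time.perf_counter() - start
    return rows, elapsed


# Stand-in database; in practice this would be a connection to the real server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO products (name) VALUES (?)",
    [("item %d" % i,) for i in range(1000)],
)

rows, elapsed = time_query(conn, "SELECT * FROM products WHERE id = ?", (42,))
print("query returned %d row(s) in %.6f s" % (len(rows), elapsed))
```

Timing the same query from the web server and from a machine next to the database helps separate network and hosting delays from slow SQL.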

Jack : thank you very much. I will try this suggestion.
I suggest you take a look at Tracing:

http://davidhayden.com/blog/dave/archive/2005/07/17/2396.aspx

This enables you to see a stack trace (The last picture in the article), and localize your performance bottlenecks.

Jack : thank you Martin !
A quick solution I developed to keep logs of performance on my web app may help you here. I have a web server and DB server running a similar-sounding app. I wrote a web service that runs a "benchmarking" stored procedure and returns the run time. I wrote a win app that runs on my development server that calls the web service, passes it the name of the stored procedure to run, and times how long the whole request takes. The win app writes the data to a log file and runs every 10 minutes as a scheduled task. Extra bells and whistles include automatic emails to team members when performance exceeds the specified threshold 3 consecutive times, fails to connect, and when it recovers to normal performance after a slow period.

This provides a general indication of how a user's experience on the website will be at any given time and serves as a warning bell for the team. Not exactly the best solution, but I wrote it in a couple of hours several months ago and have used the data it creates for troubleshooting purposes many times.
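
The alerting rule described above (warn after three consecutive slow runs, then report the recovery) could look roughly like the sketch below. The class name `SlowRunAlert` and its return values are assumptions made for illustration; the original tool's code is not shown in the answer.

```python
class SlowRunAlert:
    """Track benchmark timings and flag sustained slowness (hypothetical sketch)."""

    def __init__(self, threshold_seconds, consecutive=3):
        self.threshold_seconds = threshold_seconds
        self.consecutive = consecutive
        self.slow_count = 0
        self.alerting = False

    def record(self, elapsed):
        """Return 'alert', 'recovered', or None for one measured run."""
        if elapsed > self.threshold_seconds:
            self.slow_count += 1
            # Alert once, at the moment the streak reaches the limit.
            if self.slow_count == self.consecutive and not self.alerting:
                self.alerting = True
                return "alert"
            return None
        # A fast run ends any slow streak.
        self.slow_count = 0
        if self.alerting:
            self.alerting = False
            return "recovered"
        return None


monitor = SlowRunAlert(threshold_seconds=1.0)
events = [monitor.record(t) for t in (0.1, 2.0, 2.0, 2.0, 2.0, 0.1)]
print(events)
```

Requiring several slow runs in a row keeps a single network hiccup from paging the whole team.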

Why does Java Web Start not work with 64-bit Java environments?

Java Web Start does not come with 64-bit builds of the JDK. Why is this? What is lacking that keeps it from building and working?

From stackoverflow

Apparently, there is no reason, since it's in JRE 6u12. http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4626735
Thought you might want to know the new update is out: http://java.sun.com/javase/6/webnotes/6u12.html

64-Bit Browser Support for Java Plugin and Java Webstart This release supports the New Java Plugin and Java Webstart on AMD64 architecture, on Windows platforms. A Java offline installer (JRE and JDK) is provided. Note, if you use 32-bit and 64-bit browsers interchangeably, you will need to install both 32-bit and 64-bit JREs in order to have Java Plug-In for both browsers.
Mostly it's a lack of demand. You only really need a 64-bit version if you intend to use more than 1200 MB of memory for a Web Start client. Otherwise it doesn't make much difference.

Do you know of any examples of a web start application which uses this much memory?

skiphoppy : I'm pretty sure my company's product would enjoy having that much memory to run in. It's pretty memory-intensive. :)

Developer essentials - resources and projects to attempt

Hi, I've been working as a developer for about 2 years. I did a bit of a shitty IT degree and wish I had done a "proper" computer science degree, as I have come to realize I have massive gaps in my knowledge.

I work completely in C# with some front-end web development. There are things that I really want to learn, and I was wondering if people could point me to some good resources. I've basically been trying to work out a list of what I don't know and sort it by priority. Can anyone suggest example projects I could attempt for each item listed and resources to use (both web and book suggestions are welcome)?

networking, understanding the full ip stack etc.
HTTP protocol, is the O'Reilly book worth a look?
multi-threaded applications
low-level programming, assembly (currently using Programming from the Ground Up)
Data structures and algorithms.
Operating systems
anything else you think should be on this list!

I suppose I'm asking, "What should every developer know? And what projects should they attempt in their own time to ensure they understand the subject well?" I know my list is rather scattergun in approach, but I guess that's why I'm asking for some help with my direction.

Thanks in advance to anyone who takes the time to respond to this.

.. Bri

From stackoverflow

One great book to take a look at is "Operating System Concepts" by Silberschatz, Galvin & Gagne.

It walks you through every aspect, from threading and memory management to the design structures of an OS. The examples in the book are in C, so it is easy enough to get the hang of. It really fills in any gaps you may have about the OS as a whole.
Have you read Code Complete? That's a great starting point, as it highlights a ton of additional reading in a variety of subjects.
I voted your question down because it seems like you didn't do a search.

What should every programmer know?

What algorithms should every developer know?

asp.net frameworks and libraries every developer should know…and use

CSS tips which every beginning developer should know about?

Questions every good .NET developer should be able to answer?

What single URL should every web developer have bookmarked?

What skills should a programmer have nowadays?

I'm not trying to be rude. At least I'm pointing out WHY I downvoted.

Peter Morris : Well done Greg. I hate when people down vote an answer without a reason (not too bothered about down voting questions though), but at least you have good manners :-)
My must-have knowledge is:
- Dependency injection containers
- Unit testing
- Patterns (somewhat)
- Domain patterns (archetypes)
Peter Morris : Thanks for commenting on why you voted this down, I appreciate it.
One thing that you might try is to find some public syllabi for computer science classes. See what kind of projects they are doing and attempt them yourself. Consult books, forums, and SO as necessary when you need help.

I thought the most foundational CS courses I took in undergrad were theory of programming languages, computer organization, operating systems, and theory/discrete math classes.

Try taking a classic algorithm and implementing it in the 4 different types of languages. (try C#, LISP, Prolog, and C for example) Add some toy functionality to a local copy of the Linux kernel to see how it ticks. Write a distributed file system. These are some of the projects in school that taught me the most.
Definitely take a look at computational theory. A couple of great books are:
- http://www.amazon.com/Introduction-Theory-Computation-Michael-Sipser/dp/053494728X
- http://www.amazon.com/Computability-Complexity-Languages-Second-Fundamentals/dp/0122063821

I would also take a look at projecteuler.net and work out some of those problems. Avoid solving the problems using brute force... instead, look at the mathematics and try to learn common algorithms and concepts that would be discussed in a CS theory class.
Ok, to attack the list you've given (BTW, they don't sound scattershot at all - you sound like you're interested in computer systems, which is a mature and lively area of research).

I'll give you a resource and an example project for each area:

Data structures and algorithms.
- Resource: the canonical book, and one that I still think is the best, is Introduction to Algorithms by Cormen et al. It is not very easy going, but it is very rich and well explained. Back up what you can't quite grasp from there with wikipedia (whose DS+A pages are not in general bad) and perhaps the NIST dictionary.
- Project: implement as many as you like of your favourite algorithms in the language of your choice. If you are using ITA as above, I'd recommend Python only because it's the closest language to the pseudo-code they use, but do not worry too much about your choice of language. I'll repeat: don't get bogged down in choosing a language; there's plenty of time to learn them all :)
Networking.
- Resource: To learn how it all fits together, there are several good books. I like Tanenbaum's 'Computer Networks', but another good choice (although caveat emptor, I haven't read it thoroughly myself) which might suit you better is Kurose and Ross' Computer Networks: A Top Down Approach Using The Internet. This book might work well as you are more likely to hit concepts you find familiar early on.
  
  To put it all into practice, read the Linux or FreeBSD kernel source to see how the networking stack there is put together. There are good resources, either in print or on the net, to help you with this process. BTW, I really wouldn't worry too much about HTTP. It's an important protocol, but not really a very interesting one. TCP, IP, UDP, BGP and friends are much more interesting in my view!
- Project: Difficult to know what is a workable project. Answer the questions at the end of book chapters. Teach yourself user-space socket programming by writing a simple client-server program ("Hello world!" over the net isn't too hard to do!). Once you have this, you can probably come up with an extension yourself - perhaps you want to write a really simple web server.
Multi-threaded programming:
- Resource: This is a huge topic. If you just want to understand how multi-threaded primitives operate in your language of choice, find a tutorial on the net - there are loads for Python, Java and C at least. If you get interested in the theory, search for Herb Sutter's series in DDJ on concurrent programming (but this may be a bit advanced for you right now). Herlihy and Shavit's Art of Multiprocessor Programming is a fantastic book on concurrent programming from a very (very!) academic perspective on the practical, although the first edition needs a bunch of mistakes fixed.
- Project: Take your server from the networking project and serve each client in a new thread, so that you can accept many connections at once. Code up a solution to the dining philosophers problem :)
Operating systems:
- Resource: Several good introductory books exist. Again, I'm a fan of Tanenbaum's Modern Operating Systems, but the Silberschatz et al. book is good as well. You really want a book here, IMHO, as you want a generalist overview of what design choices are available before studying how a particular operating system works.
  
  When you do get to that point, however, I'd suggest reading the Linux source code again. There are good articles on the net, and some good books in print (I like Linux Kernel Development, but it is becoming necessarily a bit dated).
- Project: Install a Linux distribution in a virtual machine. Add a system call to the kernel, and test it from user space. Then the kernel is your oyster! Try hacking on the scheduler - start off by making it really dumb, then slowly add features back in.
  
  If you like, you can try writing an OS from scratch, but that is a large and potentially hugely frustrating experience. I'd suggest starting to work with an extant kernel - at least then when it breaks you know it was because of something you just did :)
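
To make the networking and threading projects above concrete, here is a minimal sketch of a "Hello world!" server that hands each client to its own thread. It is written in Python purely for brevity; the function names `handle_client` and `serve` and the fixed client count are assumptions made so the example terminates on its own.

```python
import socket
import threading


def handle_client(conn):
    """Greet one client, then close its connection."""
    with conn:
        conn.sendall(b"Hello world!\n")


def serve(server_sock, max_clients):
    """Accept max_clients connections, serving each in a new thread."""
    threads = []
    for _ in range(max_clients):
        conn, _addr = server_sock.accept()
        worker = threading.Thread(target=handle_client, args=(conn,))
        worker.start()
        threads.append(worker)
    for worker in threads:
        worker.join()


server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
server.listen()
port = server.getsockname()[1]

server_thread = threading.Thread(target=serve, args=(server, 2))
server_thread.start()

# Two clients connect in turn and read one line each.
replies = []
for _ in range(2):
    with socket.create_connection(("127.0.0.1", port)) as client:
        replies.append(client.makefile("rb").readline())

server_thread.join()
server.close()
print(replies)
```

A real server would loop forever and cap the number of threads; growing this sketch into a tiny web server is the extension suggested above.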

SQL "ON" Clause Optimization

Which query would run faster?

SELECT * FROM topic_info
    LEFT JOIN topic_data ON topic_info.id = topic_data.id
WHERE id = ?

SELECT * FROM topic_info
    LEFT JOIN topic_data ON topic_data.id = topic_info.id
WHERE id = ?

The difference is the order of expressions on the "ON" clause: the first query is checking topic_info.id against topic_data.id, the second topic_data.id against topic_info. Which query would generally run faster?

(I know either query won't parse because of the ambiguous "id" column, but let's ignore that)

From stackoverflow

I don't think it should make a difference. Pick a convention and stick with it.
Modern dbs are smart enough so that it makes no difference in execution performance.

Tomalak : I don't think that "smartness" and "modern DBs" have anything to do with it. It's more that "a = b" and "b = a" are mathematically/logically equivalent.

Learning : oh yes ... it does. If a = 1M records and b = 3M records , it does matter how you evaluate your joins. Earlier the sequence of joins did matter in queries ( you could see it in the execution plan changing). Now it does not.
It probably doesn't matter, but do test it out.

If you use MySQL, try 'explain select ...' -- it'll tell you what you need to know.
Won't they return slightly different results?

Because it's a left join, if there are rows in either table without linked records in the other, they'll not be retrieved when that table is on the right of the join clause.

Or have I misunderstood?

EDIT: I have misunderstood completely - just done some tests. Ta to the commenters for putting me straight.

Sören Kuklau : The join order doesn't change. `FROM topic_info LEFT JOIN topic_data` and `FROM topic_data LEFT JOIN topic_info` *would* return different results.

Bill Karwin : Right, the OP is changing only the order of terms in the ON clause. Equality is commutative, so the result would be the same. The only difference might be in the way the SQL engine chooses indexes.

Wayne Khan : As I mentioned...
Hi,

The other posters are correct, it does not matter which way round you express the join.

This is because the query optimizer will determine the most efficient method, and of course the query plan, to use for your query. This will happen irrespective of the ordering in your join statement.

Hope this clears things up for you however please feel free to request additional details.

Cheers, John
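
The "test it out" advice is easy to follow on a small scale. The sketch below uses Python's built-in `sqlite3` module as a stand-in engine; the `title` and `body` columns and the sample rows are assumptions, and other databases may print their plans differently. It runs both forms of the ON clause, compares the rows, and prints each query plan for inspection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE topic_info (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE topic_data (id INTEGER PRIMARY KEY, body TEXT);
    INSERT INTO topic_info VALUES (1, 'first'), (2, 'second');
    INSERT INTO topic_data VALUES (1, 'body of first');
""")

# The WHERE column is qualified here to avoid the ambiguity noted in the question.
QUERY = """
    SELECT * FROM topic_info
        LEFT JOIN topic_data ON {on_clause}
    WHERE topic_info.id = ?
"""
query_a = QUERY.format(on_clause="topic_info.id = topic_data.id")
query_b = QUERY.format(on_clause="topic_data.id = topic_info.id")

rows_a = conn.execute(query_a, (2,)).fetchall()
rows_b = conn.execute(query_b, (2,)).fetchall()
print("same rows:", rows_a == rows_b)

# Print the plan the engine chose for each form of the ON clause.
for sql in (query_a, query_b):
    for step in conn.execute("EXPLAIN QUERY PLAN " + sql, (2,)):
        print(step)
```

Topic 2 has no matching `topic_data` row, so the example also shows that the left join keeps it either way.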

Core Data vs SQLite 3

I am already quite familiar with relational databases and have used SQLite (and other databases) in the past. However, Core Data has a certain allure, so I am considering spending some time to learn it for use in my next application.

Is there much benefit to using Core Data over SQLite, or vice versa? What are the pros/cons of each?

I find it hard to justify the cost of learning Core Data when Apple doesn't use it for many of its flagship applications like Mail.app or iPhoto.app - instead opting for SQLite databases. SQLite is also used extensively on the iPhone.

Can those familiar with using both comment on their experience? Perhaps, as with most things, the question is deeper than just using one over the other?

From stackoverflow

SQLite is one of the database formats for Core Data. Using Core Data you get better integration with the rest of the Cocoa API.
Core Data isn't so much a database engine as it is an API that abstracts over the actual data store. You can tell Core Data to save as an sqlite database, a plist, a binary file, or even a custom data store type.

I would recommend learning Core Data, as it is an excellent resource that greatly accelerates many parts of Cocoa application development.
The SQLite DBMS has serious problems from a relational point of view, primarily being the mishandling of all columns as strings, and not being able to handle referential integrity correctly.

That being said, it's a brilliant product for learning SQL, I just wouldn't use it for anything important.

Core Data, on the other hand, is the "next layer up". It's an abstraction which can use various DBMSs as the next layer down, and it is a boon to Cocoa developers. You can start using it with various underlying data stores (including SQLite, I believe) and, hopefully, put a decent data store under it if you ever want to provide enterprise-quality behavior.
Although Core Data is a descendant of Apple's Enterprise Objects Framework, an object-relational mapper (ORM) that was/is tightly tied to a relational backend, Core Data is not an ORM. It is, in fact, an object graph management framework. It manages a potentially very large graph of object instances, allowing an app to work with a graph that would not entirely fit into memory by faulting objects in and out of memory as necessary. Core Data also manages constraints on properties and relationships and maintains referential integrity (e.g. keeping forward and backward links consistent when objects are added to or removed from a relationship). Core Data is thus an ideal framework for building the "model" component of an MVC architecture.

To implement its graph management, Core Data happens to use sqlite as a disk store. It could have been implemented using a different relational database or even a non-relational database such as CouchDB. As others have pointed out, Core Data can also use XML or a binary format or a user-written atomic format as a backend (though these options require that the entire object graph fit into memory). If you're interested in how Core Data is implemented on an sqlite backend, you might want to check out OmniGroup's OmniDataObjects framework, an open source implementation of a subset of the Core Data API. The BaseTen framework is also an implementation of the Core Data API using PostgreSQL as a backend.

Because Core Data is not intended to be an ORM for sqlite, it cannot read arbitrary sqlite schema. Conversely, you should not rely on being able to read Core Data's sqlite data stores with other sqlite tools; the schema is an implementation detail that may change.

Thus, there is not really any conflict between using Core Data or sqlite directly. If you want a relational database, use sqlite (directly or via one of the Objective-C wrappers such as FMDB), or a relational database server. However, you may still want to learn Core Data for use as an object graph management framework. In combination with Apple's controller classes and key-value binding compatible view widgets, you can implement a complete MVC architecture with very little code.

robottobor : Note that FMDB is not an ORM, just an Objective-C wrapper around the sqlite3 C API

Barry Wark : Thanks for the catch; I'll update the post.

Perspx : Great answer - thanks very much!
This might be of interest to you. By Brent Simmons, author of NetNewsWire.

On switching away from Core Data (to SQLite)

Search Web Page Content

How do you search a web page's source in Ruby? It's hard to explain, but here's the code for doing it in Python:

import urllib2, re
word = "How to ask"
source = urllib2.urlopen("http://stackoverflow.com").read()
if re.search(word,source):
     print "Found it "+word

From stackoverflow

A quick look at Google gave me this: http://snippets.dzone.com/posts/show/2430

Directly porting your code:

require 'net/http'
word = 'How to ask'
source = Net::HTTP.get(URI.parse('http://stackoverflow.com/'))
if source.match(word)
    puts "Found #{word}"
end

If you want to do things like follow redirects, you'll want to read the documentation.
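
For comparison, here is a sketch of the question's Python snippet updated for Python 3, where `urllib2` became `urllib.request`. The function names `fetch` and `contains_phrase` are made up for illustration. `urllib.request.urlopen` follows ordinary HTTP redirects on its own, and `re.escape` makes the phrase match literally instead of being read as a regular expression.

```python
import re
import urllib.request


def fetch(url):
    """Download a page and return its source as text (follows redirects)."""
    with urllib.request.urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def contains_phrase(source, phrase):
    """Return True if phrase occurs literally in source."""
    return re.search(re.escape(phrase), source) is not None


# Offline demonstration on a fixed snippet of HTML.
sample = "<html><body><a href='/help'>How to ask</a></body></html>"
print(contains_phrase(sample, "How to ask"))

# Against a live page, the same check would be:
#     contains_phrase(fetch("http://stackoverflow.com/"), "How to ask")
```

Keeping the download separate from the search makes the matching logic easy to test without a network connection.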

Best Resources to learn OO Design and Analysis

Hello,

I am looking for the best resources, videos, books, magazines(I like videos) to learn and master Object Oriented design and analysis. I would really like to know more about trusted and reputable methodologies for structuring your programs, designing classes, and dealing with databases in your programs. So, my question is what are the best resources?

thanks

From stackoverflow

Gotta read Uncle Bob Martin's columns at Object Mentor. He's been writing good things about object-oriented programming since C++ Report in the 90s. His SOLID ideas are language-agnostic.
Design Patterns by the Gang of Four. One reference book you will always need. It gives great detail on how to structure your code using OO design.

http://en.wikipedia.org/wiki/Design_Patterns
The 'Head First' books are very good:
- Object oriented analysis and design
- Design patterns
This might help: how-to-learn-good-software-design-architecture
I would definitely recommend the "Head First Design Patterns" book. My suggestion is to read through that book at least once. And once you get a feel for design patterns, use the "Gang of Four Design Patterns" book for quick reference/refresh.

And here are a few links from my bookmarks:
- http://sourcemaking.com/design-patterns-and-tips
- http://www.dofactory.com/Patterns/Patterns.aspx
Hope it helps.
You will learn this best on a university course, or at least a good one. You don't have to spend 2 years out to do this - if you can afford £400 - $500 I'd recommend this one.

It teaches you about state, and the other 4 concepts you can read about in a badly expressed way on wikipedia. I'm not convinced you will learn it properly from free resources online, I'd guess you'll just get patchy information.

You can be extremely brainy but the information out there isn't going to be that high calibre for a reason - the brightest minds in software pay for their university courses, lectures, assignments and exams, not just read it on the internet.

For analysis try the M256 course, which is about Object Oriented software development, UML and system design. It sounds dull but contains a lot of background information that you probably will never use but will want to know anyway.
