Monday, February 7, 2011

How Does The Debugging Option -g Change the Binary Executable?

When writing C/C++ code, the debug option must be enabled on the compiler/linker in order to debug the binary executable. In the case of GCC, the option is -g. When the debug option is enabled, how does that affect the binary executable? What additional data is stored in the file that allows the debugger to function as it does?

  • -g tells the compiler to store symbol table information in the executable. Among other things, this includes:

    • symbol names
    • type info for symbols
    • files and line numbers where the symbols came from

    Debuggers use this information to output meaningful names for symbols and to associate instructions with particular lines in the source.

    For some compilers, supplying -g will disable certain optimizations. For example, icc sets the default optimization level to -O0 with -g unless you explicitly indicate -O[123]. Also, even if you do supply -O[123], optimizations that prevent stack tracing will still be disabled (e.g. stripping frame pointers from stack frames). This has only a minor effect on performance.

    With some compilers, -g will disable optimizations that can confuse where symbols came from (instruction reordering, loop unrolling, inlining etc). If you want to debug with optimization, you can use -g3 with gcc to get around some of this. Extra debug info will be included about macros, expansions, and functions that may have been inlined. This can allow debuggers and performance tools to map optimized code to the original source, but it's best effort. Some optimizations really mangle the code.

    For more info, take a look at DWARF, the debugging format originally designed to go along with ELF (the binary format for Linux and other OS's).

    Mike : Just to add to this, it can also slow down the executable. I was testing some OpenMP code with the Sun Studio compiler, and with debugging information the code ran much slower. Just something to keep in mind.
    tgamblin : Unless the -g flag in the Sun compiler disables some optimizations, debug info should NOT slow down your code.
    Mike : This is OpenMP code, and it did slow it down. I was playing with fractals, and working on using the OpenMP compiler extensions. The code on a single thread, ran slower than the non OpenMP code on a single thread. I disabled debugging and the speed equalised.
    tgamblin : Noted. That's actually kind of interesting. Maybe it's putting extra stuff in there to tell the debugger about parallel regions... They say here (http://docs.sun.com/source/819-3683/OpenMP.html) that you can map the master thread back to source but not slaves, which seems odd, too.
    Mike : I think that's the case, doesn't affect GCC of course, certainly gave me a surprise when the single thread code went from 11secs to 22. :/ With debugging disabled and 4 threads (I have a Q6600) it dropped to about 3 secs.
    tgamblin : gcc4 actually supports OpenMP, so it's possible you'd see similar issues there. I hear the performance isn't that good to begin with, though.
    tgamblin : Just out of curiosity, did you supply additional optimization options when you compiled with -g (e.g. -g -O3) or did you just add -g without explicitly specifying -O[123]? The latter could drop you to -O0, at least on icc.
    Mike : I did have the project setup for maximum optimisation (with debugging), SSE, MMX, etc, when I get home I'll post the exact options, maybe there's something I missed.
    Mike : Ok, using -# to list what -fast expands to: cc -xopenmp -fast -# Expands to: -D__MATHERR_ERRNO_DONTCARE -fns -nofstore -fsimple=2 -fsingle -xalias_level=basic -xarch=ssse3 -xbuiltin=%all -xcache=32/64/8:4096/64/16 -xchip=core2 -xdepend -xlibmil -xlibmopt -xO5 -xopenmp -xregs=frameptr
    Mike : The -g flag is included as well, comments just don't give me the space to post the full line. :) Debug version: real time: 6.060s user time: 22.815s Release version: 3.774s user time: 13.902s (Both using 4 threads)
    From tgamblin
  • A symbol table is added to the executable which maps function/variable names to data locations, so that debuggers can report back meaningful information, rather than just pointers. This doesn't affect the speed of your program, and you can remove the symbol table with the 'strip' command.

  • There is some overlap with this question which covers the issue from the other side.

    From Rob Walker
  • Just as a matter of interest, you can crack open a hexeditor and take a look at an executable produced with -g and one without. You can see the symbols and things that are added. It may change the assembly (-S) too, but I'm not sure.

    From Bernard
  • -g adds debugging information to the executable, such as the names of variables, the names of functions, and line numbers. This allows a debugger, such as gdb, to step through code line by line, set breakpoints, and inspect the values of variables. Because of this additional information, using -g increases the size of the executable.

    Also, gcc allows -g to be used together with the -O flags, which turn on optimization. Debugging an optimized executable can be very tricky, because variables may be optimized away, or instructions may be executed in a different order. Generally, it is a good idea to turn off optimization when using -g, even though it results in much slower code.
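
    One way to see the added data for yourself is to build the same source with and without -g and compare the results. The following is a minimal sketch, assuming GCC and binutils on an ELF platform such as Linux; the file name debug_demo.cpp and the function sum_of_squares are made up for illustration.

    ```cpp
    // debug_demo.cpp (hypothetical file name)
    //
    // Build the same source twice and compare the object files:
    //   g++ -O0    -c debug_demo.cpp -o plain.o
    //   g++ -O0 -g -c debug_demo.cpp -o debug.o
    //   readelf -S debug.o | grep debug   # lists the .debug_* (DWARF) sections
    //   strip --strip-debug debug.o       # removes them again
    #include <cassert>

    // A named local and a loop give a debugger something to map back to the
    // source: line numbers, the name "total", and its type.
    int sum_of_squares(int n) {
        assert(n >= 0);
        int total = 0;
        for (int i = 1; i <= n; ++i) {
            total += i * i;  // with -g, a debugger can stop here and print total
        }
        return total;
    }
    ```

    The function computes the same result in both builds; only the extra sections (and therefore the file size) should differ.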

    From Dima
  • In addition to the debugging and symbol information
    Google DWARF (A Developer joke on ELF)

    By default most compiler optimizations are turned off when debugging is enabled.
    So the code is the pure translation of the source into Machine Code rather than the result of many highly specialized transformations that are applied to release binaries.

    But the most important difference (in my opinion)
    Memory in Debug builds is usually initialized to some compiler specific values to facilitate debugging. In release builds memory is not initialized unless explicitly done so by the application code.

    Check your compiler documentation for more information:
    But an example for DevStudio is:

    • 0xCDCDCDCD Allocated in heap, but not initialized
    • 0xDDDDDDDD Released heap memory.
    • 0xFDFDFDFD "NoMansLand" fences automatically placed at boundary of heap memory. Should never be overwritten. If you do overwrite one, you're probably walking off the end of an array.
    • 0xCCCCCCCC Allocated on stack, but not initialized
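
    When a crash dump or watch window shows one of these values, a small lookup can save a trip to the documentation. This is only a sketch built from the four patterns listed above; describe_debug_fill is a hypothetical helper, not part of any runtime library.

    ```cpp
    #include <cassert>
    #include <cstdint>
    #include <cstring>
    #include <string>

    // Reads four bytes at `address` and names the debug fill pattern, if any.
    // The patterns are the DevStudio ones quoted above; other compilers differ.
    std::string describe_debug_fill(const void* address) {
        assert(address != nullptr);
        std::uint32_t value = 0;
        std::memcpy(&value, address, sizeof value);  // safe for unaligned addresses
        switch (value) {
            case 0xCDCDCDCDu: return "heap memory, allocated but not initialized";
            case 0xDDDDDDDDu: return "heap memory, already released";
            case 0xFDFDFDFDu: return "NoMansLand fence at a heap block boundary";
            case 0xCCCCCCCCu: return "stack memory, not initialized";
            default:          return "no known debug fill pattern";
        }
    }
    ```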
  • Some operating systems (like z/OS) produce a "side file" that contains the debug symbols. This helps avoid bloating the executable with extra information.

    From Nighthawk

What is a Privileged instruction?

I have added some code which compiles cleanly and have just received this Windows error:

---------------------------
(MonTel Administrator) 2.12.7: MtAdmin.exe - Application Error
---------------------------
The exception Privileged instruction.

 (0xc0000096) occurred in the application at location 0x00486752.

I am about to go on a bug hunt, and I am expecting it to be something silly that I have done which just happens to produce this message. The code compiles cleanly with no errors or warnings. The size of the exe has grown to 1,454,132 bytes, and includes links to ODBC.lib, but is otherwise pure 'c' to the Win32 API, with DEBUG on (running on a P4 on Win2K).

  • I saw this with visual c++ 6.0 in the year 2000.

    The debug C++ library had calls to physical IO instructions in it, in an exception handler. IIRC it was dumping status to an IO port that used to be for DMA base registers, which I assume someone at M$ was using for a debugger card.

    Look for some error condition that might be latent causing diagnostics code to run.

    I was debugging, backtracked and read the disassembly. It was an exception while processing std::string, maybe indexing off the end.

    David L Morris : It is actually VC6. But not C++, though plenty of zero terminated strings. (I could use the new compilers but have heard rumors that VC6 is actually faster for C, rather than C++). I doubt it is a compiler bug though... (I always discover it is one of those - "what was I thinking" moments).
  • This sort of thing usually happens when using function pointers that point to invalid data. It can also happen if you have code that trashes your return stack. It can sometimes be quite tricky to track these sorts of bugs down because they are usually hard to reproduce.

    From Daniel
  • The first possibility I can think of is that you have a local array declared near the top of a function. If a missing or faulty bounds check lets a write run past the end of that array, it can overwrite the return address, which then points to some instruction that only the kernel is allowed to execute.

  • A privileged instruction is an IA-32 instruction that is only allowed to be executed in Ring-0 (i.e. kernel mode). If you're hitting this in userspace, you've either got a really old EXE, or a corrupted binary.

    David L Morris : The exe I compiled using VC6 about 10 seconds ago.... but it does contain links to ODBC.LIB and other libs that might be quite old. This app is not a driver or a service.
    From Paul Betts
  • The error location 0x00486752 seems really small to me, before where executable code usually lives. I agree with Daniel, it looks like a wild pointer to me.

    From Jeremy
  • To answer the question, a privileged instruction is a processor op-code (assembler instruction) which can only be executed in "supervisor" (or Ring-0) mode. These types of instructions tend to be used by the Windows kernel to access I/O devices and protected data structures.

    Regular programs execute in "user mode" (Ring-3?) which disallows direct access to I/O devices, etc...

    As others mentioned, the cause is probably a corrupted stack or a messed up function pointer call.

    From Benoit
  • As I suspected, it was something silly that I did. I think I solved this twice as fast because of some of the clues in the comments and messages above. Thanks especially to those who pointed to something early in the app overwriting the stack. I actually found several answers here more useful than the post I have marked as answering the question, as they clued and cued me as to where to look, though I think that post best sums up the answer.

    As it turned out I had just added a button that went over the maximum size of an array holding some tool bar button information (which was on the stack). I had forgotten that

    #define MAX_NUM_TOOBAR_BUTTONS  (24)
    

    even existed!
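
    A guard at the point where buttons are added would have turned this crash into an ordinary error return. The sketch below keeps the macro from the post; ToolbarButton, Toolbar and add_button are hypothetical stand-ins for the real code, which was C rather than C++.

    ```cpp
    #include <cassert>
    #include <cstddef>

    #define MAX_NUM_TOOBAR_BUTTONS  (24)   // spelling as in the post

    const std::size_t kMaxButtons = MAX_NUM_TOOBAR_BUTTONS;

    struct ToolbarButton { int command_id; };  // hypothetical stand-in

    struct Toolbar {
        ToolbarButton buttons[MAX_NUM_TOOBAR_BUTTONS];  // fixed size, on the stack
        std::size_t count = 0;
    };

    // Refuses the 25th button instead of writing past the end of the array,
    // which is what would otherwise trash the neighbouring stack contents.
    bool add_button(Toolbar& bar, int command_id) {
        if (bar.count >= kMaxButtons) {
            return false;
        }
        bar.buttons[bar.count].command_id = command_id;
        ++bar.count;
        assert(bar.count <= kMaxButtons);
        return true;
    }
    ```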

  • Most processors manufactured in the last 15 years have some special instructions which are very powerful. These Privileged Instructions are reserved for the Operating System Kernel and cannot be used by user written programs. This restricts the damage that a user written program can inflict upon the system and cuts down the number of times that the system actually crashes.

How do you specify a different port number in SQL Management Studio?

I am trying to connect to a Microsoft SQL 2005 server which is not on port 1433. How do I indicate a different port number when connecting to the server using SQL Management Studio?

Thank you,

Brett

  • 127.0.0.1,6283 add a comma between the ip and port

    jasonco : +1 dude, you saved my life! LOL
    Anders Fjeldstad : +1 spent an hour trying to track down what was wrong with my SQL Server-via-SSH tunnel setup. Turns out I had been using ip:port in Management Studio all along, not ip,port.
    From Nescio
  • You'll need the SQL Server Configuration Manager. Go to Sql Native Client Configuration, Select Client Protocols, Right Click on TCP/IP and set your default port there.

    From Mike
  • Another way is to set up an alias in Config Manager. Then simply type that alias name when you want to connect. This is preferable when you have to manage several servers/instances and/or servers on multiple ports and/or multiple protocols. Give them friendly names and it becomes much easier to remember them.

    From mattlant
  • If you're connecting to a named instance and UDP is not available when connecting to it then you may need to specify the protocol as well.

    Example:

    tcp:192.168.1.21\SQL2K5,1443

    Brettski : This can be done from the SQL Management studio logon dialog?
    From James
  • Using the client manager affects all connections or sets a client machine specific alias.

    Use the comma as above: this can be used in an app.config too

    It's probably needed if you have firewalls between you and the server too...
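
    If the address is assembled in code rather than typed into the dialog, the same host,port and tcp:host\instance,port forms from the answers above can be produced by a small helper. This is a sketch only; make_server_address is a hypothetical name, and the resulting string is what would go in the server name field or in the server part of a connection string.

    ```cpp
    #include <cassert>
    #include <string>

    // Builds "host,port", optionally with a "tcp:" prefix and "\instance" part.
    // Note the comma before the port: "host:port" is not accepted.
    std::string make_server_address(const std::string& host, unsigned port,
                                    const std::string& instance = "",
                                    bool force_tcp = false) {
        assert(port > 0 && port <= 65535);
        std::string address = force_tcp ? "tcp:" : "";
        address += host;
        if (!instance.empty()) {
            address += "\\" + instance;
        }
        address += "," + std::to_string(port);
        return address;
    }
    ```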

    From gbn

Free or open source IBM 3151 or aixterm emulators?

Does anyone know of any free or open source terminal emulators that will emulate an IBM 3151 terminal or an HFT terminal (aixterm)?

We have some offshore contractors that need access to some of our systems that need a 3151 or hft emulation, but are having issues transferring licenses of Hummingbird HostExplorer to India. For that matter, if we could save on US Hummingbird licenses it would be beneficial as well.

Thanks!

  • I doubt you'll find an open source or free emulator for this terminal type. While IBM has contributed to open source communities, they are also very interested in protecting their intellectual property. Hummingbird licenses are certainly expensive. We ran into issues with that when I worked for IBM!

    That said, I never needed a specific terminal type in order to access AIX systems, as we used OpenSSH (comes with AIX 5L). Is there some reason why you can't provide SSH access to these systems to your contractors?

    From jtimberman
  • Yes, it boils down to the function key emulation. Legacy code that is very Fkey laden. Thanks!

    TFD : please use comment for comments :-)
    From coleca

css help with box models

I need help with the CSS box model. I have a problem with my website ucare-ph.org . The logo (temporary) is covering the main copy. Now I know someone can tell me the code to fix this. But what I really need is a resource or resources in books or websites that can give me a better understanding of the box model. The syntax is very easy. But the relationship between the boxes is very difficult. And yes I have Firebug and Web Dev for Firefox. But it's still trial and error.

  • The specification is pretty clear.

    From RobbieGee
  • As to your site's logo, try setting clear: both on the content.

  • Something that helped me a lot is using Firebug

    Right-click an element on the page and choose 'Inspect Element'. Then choose the 'Layout' tab in the panel on the right. You can see a graphical representation of the element along with the measurement of each component of the box model. My life would be miserable without Firebug.

    From neonski
  • Note that Internet Explorer uses a broken box model unless you specify a doctype such as:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
    

    In the broken box model, the height and width values refer to the size of the whole box, including the padding and border (but excluding the margins), rather than to the content area alone.

    For example, if you have this:

    <div style="width:100px; height:100px; padding:10px; border:10px solid #000;"></div>
    

    In the correct box model it will display a box that is 140x140 because the padding adds 10px to each side of the content area (which is 100x100) and then the border adds another 10px to each side.

    However, in the broken IE box model it will display a box that is 100x100 and the border and padding will sit inside that space leaving you with a content area of only 60x60.
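
    The arithmetic for the two models can be written out as follows. This is just a sketch of the calculation for one dimension, with made-up names (BoxSize, standard_box, quirks_box); it assumes the padding and border are the same on both sides.

    ```cpp
    #include <cassert>

    struct BoxSize {
        int content;  // size of the content area, in px
        int total;    // size of the rendered box without margins, in px
    };

    // Correct (W3C) model: width is the content; padding and border are added.
    BoxSize standard_box(int width, int padding, int border) {
        assert(width >= 0 && padding >= 0 && border >= 0);
        return { width, width + 2 * padding + 2 * border };
    }

    // Broken IE model: width is the whole box; padding and border eat into it.
    BoxSize quirks_box(int width, int padding, int border) {
        assert(padding >= 0 && border >= 0);
        assert(width >= 2 * (padding + border));
        return { width - 2 * padding - 2 * border, width };
    }
    ```

    For the div above (width 100, padding 10, border 10), standard_box gives a 100px content area in a 140px box, and quirks_box gives a 60px content area in a 100px box.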

    From molasses

Best direction for a grade 12 student planning on going into Computer Sciences?

So, I have just started grade 12, and after I graduate I plan on going to University and major in Computer Sciences, and eventually make a career out of it. I am looking for advice on what direction I should be heading to make the most out of things. :]

Currently, I know some PHP, Java, and C++. Nothing too advanced for any of these, but I am not a beginner either. I haven't done anything outside of the console in C++, so I figure that should be my next step.

Anyways, I'm just looking for any advice on what I should be doing. Any suggestions on what language I should be working on improving (or learning), any projects that should help me get a bit of experience, etc.

Anything, really. :]
Thanks.

  • First of all you should do what you like.

    Take what they give you during classes and try to push it always further. Discovering new things is the key.

    From Robit
  • Some people think that programmers need to have some experience with C. Although it may not be necessary, it certainly does provide a foundation for understanding many other languages.

  • Languages are easy to pick up, and in 20 years the hot languages aren't likely to be any of the ones you're learning now, so don't sweat it.

    The most important thing is to work on good projects. Student projects. Internships. Open source. Doesn't matter. They will give you the most experience, and the best employers know to look for them.

    As for what KIND of projects to work on-- this doesn't matter either! Pick the one that gets you excited. That's the one that will be best for you.

    kooshmoose : I was going to respond, but Aaron's got my number. The only other thing I'd mention is that you'll end up spending more time with your fellow students than with your professor. Make sure that you surround yourself with good programmers and you'll only get better.
    From Aaron
  • Learn to use a debugger. You might start with Visual Studio's.

    From antik
  • I would say you should mostly practice and advance your practical skills in those languages you know (Java, C++, ...). It will come in handy when you are introduced to more advanced theory in college.

    From mrlinx
  • Get involved. Join an open source project. Experience is far more important than learning any given language, development model, or such.

  • Hey Coal,

    Consider getting involved with an open source project. You'll have an opportunity to learn a TON and communicate with other developers on a project (kind of like on-the-job experience).

    I write a lot of C# and think it is great but there are others too. I'd encourage you to learn about databases, at least at some level, because it can't hurt to know about them.

    There's lots of cool stuff out there in the open source community and that experience will be invaluable when you do go looking for a job.

    Best of luck!

    From itsmatt
  • Well you already will know more than a lot of your classmates I bet. I went into University knowing Turing. I didn't even know what a compiler was!

    You have the basics of programming down so I say your next step should involve looking into the theory behind computer science. This includes algorithm design and analysis, automata theory, Big-Oh notation, software engineering theory, etc.

    I am willing to bet your previous programming experience will give you a good headstart in your programming classes. So if you want to get ahead of the game now, I would start looking into theory and advanced mathematics.

  • Definitely get involved in a multi-person project of some sort. Learn about the software development process, version control tools, bug trackers, testing, etc. Those are the kinds of things that they don't spend a lot of time on in most Computer Science schools.

    From hallidave
  • Some stuff I've had success with.

    • Open Source projects
    • University Projects
    • Internships

    Depending on your location, you can find some professors at the local universities who will be willing to let you contribute to their projects. Sometimes you can get into REU (Research for Undergraduates) programs when you're still in high school, or when you graduate high school/in college. Open Source projects are great too, especially for learning how to work with different kinds of people all over the world.

    Just work on something, just for learning experience and it looks great to future employers that you've been at this since long before College.

  • A programming language is just a 'medium'. What you need is a good foundation in what programming is, e.g. conditionals, loops, functions, etc. If you understand these well, you can use any programming language. Beyond that, learn how to write good code and keep up with new technologies, since the field grows rapidly nowadays.

    From ChRoss
  • First and foremost, that is pretty badass for 12th grade. You have a serious leg up. This means you will totally fly through your college courses. You have the opportunity to use four whole years of free time to become an absolute beast of a programmer. Want to? Then:

    1. Learn a language that's fun and easy, and that you can use to put together really simple programs. I suggest Ruby, with Shoes for user interfaces.
    2. Start looking for practical problems that you can solve by programming. If you keep your eyes open, opportunities are everywhere. This will give you a constant supply of new things to build.

    You'll have a lot of fun, so you won't get sick of it. You'll have a lot of ideas, so you'll keep coding. So so so, you'll increase your skills crazy-fast.

    Thomas : Knowing languages does not automatically make one "totally fly through [one's] college courses". Being able to think in an abstract, structured manner is far more helpful.
  • So, you know enough to get by, and can probably do a few interesting, practical things with your current skillset. That's great. So think of college in these terms: what can you get there that you can't as easily pick up on your own?

    1. Concepts, algorithms, and (dare I say it) math. Especially discrete math.
    2. A chance to work with and learn from interesting, intelligent people. Your profs usually have really cool research work going, ask about it.
    3. A well-rounded education. I knew plenty of folks in CS who complained about every non-major class they had to take and talked about other subjects like they didn't matter. That's not the path to an interesting, rewarding career. Most programming jobs require domain expertise or the ability to learn about different subjects or industries quickly. You're in college to learn how to learn - and it really helps if you learn how to write and speak in front of groups, too.

    Hope this helps.

  • Computer science is a very large field, and it is much more than writing code. You should learn about its many sub-fields, such as theory of computation, algorithms, software engineering, operating systems, compilers, databases, computer vision, computer graphics, machine learning, etc. This way you can see what you may want to specialize in.

    As many people here have pointed out, knowing a particular language is far less important than the fundamentals, such as understanding different programming paradigms (functional, logical, object-oriented), being able to come up with good designs, and understanding algorithmic complexity. Having said that, it definitely does help to know a few languages really well, and to be able to pick up a new language quickly.

    Mathematics is also very important. You may get away without knowing calculus if you write business software, but to do really cool things, like graphics, computer vision, AI, robotics, or games math is absolutely essential. Some of the relevant topics are calculus, linear algebra, and probability.

    From Dima
  • If you know some programming and you like doing it, you are fine. You've got 4 years of concentrated CS instruction ahead of you, don't sweat it, they'll take care of you. Do what's interesting right now, that's the important bit.

    Here's some advice for being a CS undergrad student:

    1. Now and then, when you are at school, you will hate the sight of your laptop. That's normal and healthy; go out and kick a ball around, or whatever you do that's not in any way cerebral. Don't burn yourself out, you have a full career of doing this ahead of you.

    2. When your prof says, "start this project early", start that project early. All nighters are fun for a while, but they wear you out. Save that for the finals.

    3. Remember that programming is fun. Now and then, do stuff that's not for school, and not for any real reason.

    4. Get summer internships at companies to get practical experience. School will teach you computer science. There is a completely different skillset which is software engineering, and you will need both. You'll likely have a class on the latter, but a single class is not enough. Plus internships are a good way to feel out different companies.

    5. Have faith -- what they are teaching you is useful, even if it doesn't seem so. They will be teaching you stuff that seems lame, irrelevant, old, and/or boring. "But I am never going to write a compiler! But cloud computing is a paradigm shift! Who cares about 17 kinds of sort, I just say collection.sort! Lisp, wtf? Matrices?? How does that help me build Ruby on Rails websites with Ajax and AdSense?" ... etc, etc, etc. Well, trust me, if you stick with it for the long haul, and you are any good, a couple of years after you graduate, you will curse every time you uttered one of those sentences :-).

    Good luck, it's a lot of fun.

    From SquareCog
  • Make sure you choose an excellent college or university. The field is competitive, and if you are not well-educated, you will be less able to find a job doing what you want. You definitely need to look for a school that has achieved a regional accreditation; that is true for all academic disciplines. Some other criteria that may help you include:

    • Look for a computer science-specific accreditation agency in your region of the world. Consider its standards, and take that into account as you look at universities.
    • Read what smart, successful people like Joel Spolsky say about education and hiring. http://www.joelonsoftware.com/items/2007/12/03.html
    • Ask people working in the field what the best computer science schools are in your country or region, and check them out. For the United States, I again refer you to Joel.
    • Have realistic options in mind in case your top choices do not work out.
    • Do not forget to take into account your physical well-being, living conditions, and broadness of education as you look for a university. There is more to life than programming!

    It sounds like you have a sincere interest and that you have put in a lot of time studying already--so much so that you may want to spend less time on independent study of computer science topics and focus fully on finishing your high school career well and finding a great university that is right for you. Best wishes!

  • You will learn a lot of theory in school. Focus on experience outside of school. Make websites, make small games, write a program to help you be more productive. Seek out internships, especially as you get into your junior and senior year. If you are interested in a particular area, get to know the faculty that are doing research in that area. Read up on current trends and technology. Subscribe to RSS feeds from websites about programming, web development, and technology.

  • Programming is the easy part of Computer Science, and of professional development practice. What you need is

    • experience working with other talented developers,
    • experience reading and understanding other people's code, and
    • experience with a wide variety of "ways of thinking" about building useful abstractions.

    If you already know a language like Java, then learn Lisp or Prolog or Haskell (Python and Ruby both take many ideas from Lisp, but it's actually easier to see what they are good for if you know some Lisp). It will stretch your mind and make it much easier to learn the next new thing quickly when it comes along.

  • man, you guys all missed the most important aspect!!!

    you're in high school, so take ALL the math courses you can, and try to take the AP stuff so u don't have to do it in college. Calculus in a high school AP class will be WAAAAAYYYY easier than taking it in college from a Doctor who got his PhD in finding limits at infinity!

    and that will give you more time to take the cool robotics electives!!!

AssignProcessToJobObject fails with "Access Denied" error when running under the debugger

You do AssignProcessToJobObject and it fails with "access denied" but only when you are running in the debugger. Why is this?

  • This one puzzled me for about 30 minutes.

    First off, you probably need a UAC manifest embedded in your app (as suggested here). Something like this:

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
      <assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
        <!-- Identify the application security requirements. -->
        <trustInfo xmlns="urn:schemas-microsoft-com:asm.v3">
          <security>
            <requestedPrivileges>
              <requestedExecutionLevel
                level="asInvoker"
                uiAccess="false"/>
            </requestedPrivileges>
          </security>
        </trustInfo>
      </assembly>
    

    Secondly (and this is the bit I got stuck on), when you run your app under the debugger, the debugger creates your process inside a job object. Your child process needs to be able to break away from that job before you can assign it to your own job. So (duh), you need to specify CREATE_BREAKAWAY_FROM_JOB in the flags for CreateProcess.

    If you weren't running under the debugger, or your parent process were in the job, this wouldn't have happened.
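
    Put together, the sequence looks roughly like the sketch below. It assumes the job handle comes from your own CreateJobObject call and that the debugger's job permits breakaway (otherwise CreateProcess itself is expected to fail). child_creation_flags and launch_in_job are hypothetical helper names; the Win32 calls are guarded so that the flag logic still compiles on other platforms.

    ```cpp
    #include <cassert>
    #include <cstdint>

    #ifdef _WIN32
    #include <windows.h>
    #else
    // Stand-ins so the flag logic compiles off Windows (assumed to match
    // the values in the Windows SDK headers).
    typedef std::uint32_t DWORD;
    const DWORD CREATE_SUSPENDED          = 0x00000004;
    const DWORD CREATE_BREAKAWAY_FROM_JOB = 0x01000000;
    #endif

    // Flags for a child that is going to be assigned to our own job.
    DWORD child_creation_flags(bool parent_is_in_job) {
        DWORD flags = CREATE_SUSPENDED;  // keep the child parked until assigned
        if (parent_is_in_job) {
            flags |= CREATE_BREAKAWAY_FROM_JOB;  // leave the debugger's job
        }
        return flags;
    }

    #ifdef _WIN32
    // Starts command_line and assigns it to job. Returns false on any failure.
    bool launch_in_job(wchar_t* command_line, HANDLE job) {
        assert(command_line != nullptr && job != nullptr);
        BOOL in_job = FALSE;
        if (!IsProcessInJob(GetCurrentProcess(), nullptr, &in_job)) {
            return false;
        }
        STARTUPINFOW si = {};
        si.cb = sizeof si;
        PROCESS_INFORMATION pi = {};
        if (!CreateProcessW(nullptr, command_line, nullptr, nullptr, FALSE,
                            child_creation_flags(in_job != FALSE),
                            nullptr, nullptr, &si, &pi)) {
            return false;
        }
        bool ok = AssignProcessToJobObject(job, pi.hProcess) != FALSE;
        if (ok) {
            ResumeThread(pi.hThread);
        } else {
            TerminateProcess(pi.hProcess, 1);
        }
        CloseHandle(pi.hThread);
        CloseHandle(pi.hProcess);
        return ok;
    }
    #endif
    ```

    Creating the child suspended is a design choice of this sketch, not something the answer requires: it keeps the child from running (or spawning its own children) before it is inside the job.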

  • Shouldn't elevated apps be invoked with the manifest level set to requireAdministrator?

STL Alternative

I really hate using STL containers because they make the debug version of my code run really slowly. What do other people use instead of STL that has reasonable performance for debug builds?

I'm a game programmer and this has been a problem on many of the projects I've worked on. It's pretty hard to get 60 fps when you use STL container for everything.

I use MSVC for most of my work.

  • I'll bet your STL uses a checked implementation for debug. This is probably a good thing, as it will catch iterator overruns and such. If it's that much of a problem for you, there may be a compiler switch to turn it off. Check your docs.

  • STL containers should not run "really slowly" in debug or anywhere else. Perhaps you're misusing them. You're not running against something like ElectricFence or Valgrind in debug are you? They slow anything down that does lots of allocations.

    All the containers can use custom allocators, which some people use to improve performance - but I've never needed to use them myself.

    Tod : You've obviously never written a game with STL. The problem isn't the game's algorithms, it's the STL overhead. When STL isn't optimized it calls like 100 functions that would otherwise be completely inlined, and that overhead kills the game's framerate.
    MarkR : I've written several games with STL, but perhaps not on the same scale as you have; I've not found it to be a problem (but imagine it's possible)
    From MarkR
  • If you're using Visual C++, then you should have a look at this:

    http://channel9.msdn.com/shows/Going+Deep/STL-Iterator-Debugging-and-Secure-SCL/

    and the links from that page, which cover the various costs and options of all the debug-mode checking which the MS/Dinkumware STL does.

    If you're going to ask such a platform dependent question, it would be a good idea to mention your platform, too...

    From Will Dean
  • Check out EASTL.

    Torlack : I don't think EASTL is available to the public. But the document covers a lot of problems with current STL implementations.
  • Ultimate++ has its own set of containers - not sure if you can use them separately from the rest of the library: http://www.ultimatepp.org/

  • If you're running Visual Studio, you may want to consider the following:

    #define _SECURE_SCL 0
    #define _HAS_ITERATOR_DEBUGGING 0
    

    That's just for iterators. What type of STL operations are you performing? You may want to look at optimizing your memory operations, i.e. using resize() to insert several elements at once instead of using pop/push to insert elements one at a time.
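
    As a rough sketch of both suggestions, assuming nothing about the original poster's code: the two macros are placed ahead of every standard header, and the hypothetical helpers fill_with_push_back and fill_with_resize contrast element-by-element growth with sizing the vector once.

```cpp
// MSVC-specific switches from the answer above. They must appear before any
// standard header and be identical in every translation unit. Other
// compilers simply ignore them.
#define _SECURE_SCL 0
#define _HAS_ITERATOR_DEBUGGING 0

#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: grows the vector one element at a time, which may
// reallocate several times along the way.
std::vector<int> fill_with_push_back(std::size_t n) {
    std::vector<int> v;
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
    }
    assert(v.size() == n);
    return v;
}

// Hypothetical helper: sizes the vector once, then writes each element in
// place, so there is a single allocation.
std::vector<int> fill_with_resize(std::size_t n) {
    std::vector<int> v;
    v.resize(n);
    for (std::size_t i = 0; i < n; ++i) {
        v[i] = static_cast<int>(i);
    }
    assert(v.size() == n);
    return v;
}
```

    Both helpers produce the same contents. When the element type cannot be default-constructed, reserve() followed by push_back is the usual alternative to resize().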

    From rhinovirus
  • What about the ACE library? It's an open-source object-oriented framework for concurrent communication software, but it also has some container classes.

    Bklyn : To paraphrase jwz: if you think "I know, I'll use ACE", now you have two problems (see http://regex.info/blog/2006-09-15/247)
    From koschi
  • My experience is that well designed STL code runs slowly in debug builds because the optimizer is turned off. STL containers emit a lot of calls to constructors and operator= which (if they are lightweight) get inlined/removed in release builds.

    Also, Visual C++ 2005 and up has checking enabled for STL in both release and debug builds. It is a huge performance hog for STL-heavy software. It can be disabled by defining _SECURE_SCL=0 for all your compilation units. Please note that having different _SECURE_SCL status in different compilation units will almost certainly lead to disaster.

    You could create a third build configuration with checking turned off and use that to debug with performance. I recommend keeping a debug configuration with checking on though, since it's very helpful for catching erroneous array indices and stuff like that.

    From rasmusb
  • EASTL is a possibility, but still not perfect. Paul Pedriana of Electronic Arts did an investigation of various STL implementations with respect to performance in game applications, the summary of which is found here: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html

    Some of these adjustments are being reviewed for inclusion in the C++ standard.

    And note, even EASTL doesn't optimize for the non-optimized case. I had an excel file w/ some timing a while back but I think I've lost it, but for access it was something like:

           debug   release
    STL      100        10
    EASTL     10         3
    array[i]   3         1
    

    The most success I've had was rolling my own containers. You can get those down to near array[x] performance.
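
    To illustrate what "rolling your own" can look like, here is a minimal sketch of a fixed-capacity container. FixedVector is a hypothetical name, not something from EASTL or from my old code, and it assumes the element type is default-constructible and copy-assignable. Storage is a plain member array, so operator[] is one shallow call with no iterator bookkeeping behind it. I have not timed this particular version.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity container backed by a plain array.
// Bounds checks use assert, so they disappear when NDEBUG is defined.
template <typename T, std::size_t Capacity>
class FixedVector {
public:
    FixedVector() : size_(0) {}

    // Appends a copy of value; the caller must not exceed Capacity.
    void push_back(const T& value) {
        assert(size_ < Capacity);
        data_[size_++] = value;
    }

    T& operator[](std::size_t i) {
        assert(i < size_);
        return data_[i];
    }

    const T& operator[](std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

    std::size_t size() const { return size_; }

    // Raw pointers double as iterators, which keeps debug builds cheap.
    T* begin() { return data_; }
    T* end() { return data_ + size_; }

private:
    T data_[Capacity];
    std::size_t size_;
};
```

    The trade-off is that capacity is fixed at compile time and there is no allocator at all, which suits per-frame scratch data better than general-purpose storage.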

    paercebal : +1 for the Link.
    paercebal : Well, if I could add another +1, I would. The link http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html is very VERY enlightening.
    Viktor Sehr : Assuming the array is allocated on the heap, range checking is disabled, and the compiler is set to optimize, I don't believe that a C array has faster access times than a std::vector.
    From Jeff
  • For big, performance critical applications, building your own containers specifically tailored to your needs may be worth the time investment.

    I'm talking about real game development here.

    Tod : Yeah, this is actually what I've done for all the games where I didn't inherit the code. And since you control the allocators it's probably the best answer. I'm just hoping to find something better!
  • Check out Data Structures and Algorithms with Object-Oriented Design Patterns in C++ by Bruno Preiss: http://www.brpreiss.com/

  • MSVC uses a very heavyweight implementation of checked iterators in debug builds, which others have already discussed, so I won't repeat it (but start there)

    One other thing that might be of interest to you is that your "debug build" and "release build" probably involves changing (at least) 4 settings which are only loosely related.

    1. Generating a .pdb file (cl /Zi and link /DEBUG), which allows symbolic debugging. You may want to add /OPT:REF to the linker options; the linker drops unreferenced functions when not making a .pdb file, but with /DEBUG mode it keeps them all (since the debug symbols reference them) unless you add this explicitly.
    2. Using a debug version of the C runtime library (probably MSVCR*D.dll, but it depends on what runtime you're using). This boils down to /MD or /MDd (or /MT or /MTd if you link the static runtime rather than the DLL runtime).
    3. Turning off the compiler optimizations (/Od).
    4. Setting the preprocessor #defines DEBUG or NDEBUG.

    These can be switched independently. The first costs nothing in runtime performance, though it adds size. The second makes a number of functions more expensive, and has a huge impact on malloc and free; the debug runtime versions are careful to "poison" the memory they touch with values to make uninitialized data bugs clear. I believe with the MSVCP* STL implementations it also eliminates all the allocation pooling that is usually done, so that leaks show exactly the block you'd think and not some larger chunk of memory that it's been sub-allocating; that means it makes more calls to malloc on top of them being much slower. The third; well, that one does lots of things (this question has some good discussion of the subject). Unfortunately, it's needed if you want single-stepping to work smoothly. The fourth affects lots of libraries in various ways, but most notably it compiles in or eliminates assert() and friends.

    So you might consider making a build with some lesser combination of these selections. I make a lot of use of builds that have symbols (/Zi and link /DEBUG) and asserts (/DDEBUG), but are still optimized (/O1 or /O2 or whatever flags you use), with stack frame pointers kept for clear backtraces (/Oy-) and using the normal runtime library (/MT). This performs close to my release build and is semi-debuggable (backtraces are fine, single-stepping is a bit wacky at the source level; assembly level works fine of course). You can have however many configurations you want; just clone your release one and turn on whatever parts of the debugging seem useful.
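
    When juggling several such configurations it can help to have the binary report what it was built with. A small sketch, assuming only the standard NDEBUG macro and the _DEBUG macro that MSVC defines when a debug runtime is selected; describe_build is a hypothetical helper name:

```cpp
#include <string>

// Hypothetical helper: reports which of the independent "debug" settings
// this translation unit was compiled with.
std::string describe_build() {
    std::string s;
#ifdef NDEBUG
    s += "asserts=off";          // assert() expands to nothing
#else
    s += "asserts=on";
#endif
#ifdef _DEBUG
    s += " debug_runtime=yes";   // MSVC sets _DEBUG for /MTd and /MDd
#else
    s += " debug_runtime=no";    // also the answer on non-MSVC compilers
#endif
    return s;
}
```

    Logging this string at startup makes it obvious which configuration a given crash dump or timing run came from.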

    From puetzk
  • Qt has reimplemented most of the C++ standard library with different interfaces. It looks pretty good, but the commercially licensed version can be expensive.

Best Permalinking for Rails

What do you think is the best way to create SEO friendly URLs (dynamically) in Rails?

  • Override the to_param method in your model classes so that the default numeric ID is replaced with a meaningful string. For example, this very question uses best-permalinking-for-rails in the URL.

    Ryan Bates has a Railscast on this topic.

  • Check out the permalink_fu plugin (extracted from Mephisto)... the Git repository is located here.

    From silvertab
  • ActiveSupport has a new method in Rails to aid this - String#parameterize. The relevant commit is here; the example given in the commit message is "Donald E. Knuth".parameterize => "donald-e-knuth"

    In combination with the to_param override mentioned by John Topley, this makes friendlier URLs much easier.

  • rsl's stringex is pretty awesome, think of it as permalink_fu done right.

    From Ryan Bigg
  • I largely use to_param as suggested by John Topley.

    Remember to put indexes such that whatever you're using in to_param is quickly searchable, or you'll end up with a full table scan on each access. (Not a performance-enhancer!)

    A quick work-around is to put the ID somewhere in there, in which case ActiveRecord will ignore the rest of it as cruft and just search on the id. This is why you see a lot of Rails sites with URLs like http://example.com/someController/123-a-half-readable-title .
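
    The two ideas (a slug derived from the title, and a numeric ID in front of it) can be sketched independently of Rails. The code below is written in C++ only to stay consistent with the other examples on this page; it is not how Rails implements parameterize or to_param, it handles ASCII only, and slugify, make_param and leading_id are hypothetical names.

```cpp
#include <cctype>
#include <cstdlib>
#include <string>

// Lowercases ASCII letters and digits and collapses every other run of
// characters into a single '-', with no '-' at either end.
std::string slugify(const std::string& title) {
    std::string slug;
    for (std::string::size_type i = 0; i < title.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(title[i]);
        if (std::isalnum(c)) {
            slug += static_cast<char>(std::tolower(c));
        } else if (!slug.empty() && slug[slug.size() - 1] != '-') {
            slug += '-';
        }
    }
    if (!slug.empty() && slug[slug.size() - 1] == '-') {
        slug.erase(slug.size() - 1);
    }
    return slug;
}

// Builds the "123-a-half-readable-title" form: ID first, slug after.
std::string make_param(long id, const std::string& title) {
    return std::to_string(id) + "-" + slugify(title);
}

// Recovers the ID by reading only the leading digits and ignoring the rest,
// so the lookup can stay on the indexed primary key.
long leading_id(const std::string& param) {
    return std::strtol(param.c_str(), nullptr, 10);
}
```

    Because only the leading number is used for the lookup, the title part can change later without breaking old links.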

    For more detail on this and other SEO observations from my experience with Rails, you may find this page on my site useful.

  • For me friendly_id works fine. It can generate slugs too, so you don't need to worry about duplicate URLs, and scopes are also supported.

    From astropanic
  • I have made a small and simple gem which makes it easier to override the to_param method. It can be found here.

What's the best way to run Wordpress on the same domain as a Rails application?

I've got a standard Rails app with Nginx and Mongrel running at http://mydomain.com. I need to run a Wordpress blog at http://mydomain.com/blog. My preference would be to host the blog in Apache running on either the same server or a separate box, but I don't want the user to see a different server in the URL. Is that possible and if not, what would you recommend to accomplish the goal?

  • Seems to me that something like a rewrite manipulator would do what you want. Sorry I don't have any more details -- just thinking aloud :)

    From Ian P
  • Actually, since you're using Nginx, you're already in great shape and don't need Apache.

    You can run PHP through fastcgi (there are examples of how to do this in the Nginx wiki), and use a URL-matching pattern in your Nginx configuration to direct some URLs to Rails and others to PHP.

    Here's an example Nginx configuration for running a WordPress blog through PHP fastcgi (note I've also put in the Nginx equivalent of the WordPress .htaccess, so you will also have fancy URLs already working with this config):

    server {
        listen       example.com:80;
        server_name  example.com;
        charset      utf-8;
        error_log    /www/example.com/log/error.log;
        access_log   /www/example.com/log/access.log  main;
        root         /www/example.com/htdocs;
    
        include /www/etc/nginx/fastcgi.conf;
        fastcgi_index index.php;
    
        # Send *.php to PHP FastCGI on :9001
        location ~ \.php$ {
            fastcgi_pass 127.0.0.1:9001;
        }
    
        # You could put another "location" section here to match some URLs and send
        # them to Rails. Or do it the opposite way and have "/blog/*" go to PHP
        # first and then everything else go to Rails. Whatever regexes you feel like
        # putting into "location" sections!
    
        location / {
            index index.html index.php;
            # URLs that don't exist go to WordPress /index.php PHP FastCGI
            if (!-e $request_filename) {
                rewrite ^.* /index.php break;
                fastcgi_pass 127.0.0.1:9001;
            }
    
        }
    }
    

    Here's the fastcgi.conf file I'm including in the above config (I put it in a separate file so all of my virtual host config files can include it in the right place, but you don't have to do this):

    # joelhardi fastcgi.conf, see http://wiki.codemongers.com/NginxFcgiExample for source
    fastcgi_param  GATEWAY_INTERFACE  CGI/1.1;
    fastcgi_param  SERVER_SOFTWARE    nginx;
    
    fastcgi_param  QUERY_STRING       $query_string;
    fastcgi_param  REQUEST_METHOD     $request_method;
    fastcgi_param  CONTENT_TYPE       $content_type;
    fastcgi_param  CONTENT_LENGTH     $content_length;
    
    fastcgi_param  SCRIPT_FILENAME    $document_root$fastcgi_script_name;
    fastcgi_param  SCRIPT_NAME        $fastcgi_script_name;
    fastcgi_param  REQUEST_URI        $request_uri;
    fastcgi_param  DOCUMENT_URI       $document_uri;
    fastcgi_param  DOCUMENT_ROOT      $document_root;
    fastcgi_param  SERVER_PROTOCOL    $server_protocol;
    
    fastcgi_param  REMOTE_ADDR        $remote_addr;
    fastcgi_param  REMOTE_PORT        $remote_port;
    fastcgi_param  SERVER_ADDR        $server_addr;
    fastcgi_param  SERVER_PORT        $server_port;
    fastcgi_param  SERVER_NAME        $server_name;
    
    # PHP only, required if PHP was built with --enable-force-cgi-redirect
    #fastcgi_param  REDIRECT_STATUS    200;
    

    I also happen to do what the Nginx wiki suggests, and use spawn-fcgi from Lighttpd as my CGI-spawner (Lighttpd is a pretty fast compile w/o weird dependencies, so a quick and easy thing to install), but you can also use a short shell/Perl script for that.

    From joelhardi
  • I think joelhardi's solution is superior to the following. However, in my own application, I like to keep the blog on a separate VPS than the Rails site (separation of memory issues). To make the user see the same URL, you use the same proxy trick that you normally use for proxying to a mongrel cluster, except you proxy to port 80 (or whatever) on another box. Easy peasy. To the user it is as transparent as you proxying to mongrel -- they only "see" the NGINX responding on port 80 at your domain.

    upstream myBlogVPS {
            server 127.0.0.2:80;  #fix me to point to your blog VPS
    }
    
    server {
        listen       80;
    
    
        #You'll have plenty of things for Rails compatibility here
    
        #Make sure you don't accidentally step on this with the Rails config!
    
        location /blog {
            proxy_pass         http://myBlogVPS;
            proxy_redirect     off;
    
            proxy_set_header   Host             $host;
            proxy_set_header   X-Real-IP        $remote_addr;
            proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;
        }
    }
    

    You can use this trick to have Rails play along with ANY server technology you want, incidentally. Proxy directly to the appropriate server/port, and NGINX will hide it from the outside world. Additionally, since the URLs will all refer to the same domain, you can seamlessly integrate a PHP-based blog, Python-based tracking system, and Rails app -- as long as you write your URLs correctly.

    Brian Deterling : Agree, the fastcgi solution worked fine when keeping it on the same server, although I had to reinstall/update NGINX and php because fastcgi wasn't included initially. For a separate server, this solution seems to work well - if anyone knows of any gotchas, please add them.
    joelhardi : Proxying will work fine for you but it brings its own set of issues. One of them is that performance is a bit worse. Another is that you will have to change the code on your backend server to make it aware it's running behind a proxy and has to look at things like X-Forwarded-For (which is actually an ...
    joelhardi : ... array of IPs, not necessarily a single address b/c lots of users run behind corporate or ISP proxy servers, and you want to extract the user's actual IP if your app does something with it. Also X-Forwarded-For is a user-set value so not "trusted" -- see Apache mod_extract_forwarded for more ...
    joelhardi : ... info there. (Proxying is good to learn, you might use Varnish or Squid one day!) Another option besides HTTP proxy is to have FCGI listen on a port on an app box (I've set up an app cluster of PHP FCGI backends this way) -- Nginx just talks to 192.168.0.8:9001 instead of 127.0.0.1:9001. Cheers!
    Samnang : Could you help with the problem described here? http://serverfault.com/questions/116589/broken-images-css-javascript-uris-in-sub-uri-in-rails-that-using-nginx-for-revers
  • The answers above pretty much address your question.

    An alternative for FastCGI would be to use php-fpm. Docs are a tad sparse but it works well.

    From Jauder Ho
  • How does this work with SEO? Does Google treat this as the same site?

    From anon

How can I ban a whole company from my web site?

For reasons I won't go into, I wish to ban an entire company from accessing my web site. Checking the remote hostname in PHP using gethostbyaddr() works, but this slows down the page load too much. Large organizations (e.g. hp.com or microsoft.com) often have blocks of IP addresses. Is there any way I can get the full list, or am I stuck with the slow reverse-DNS lookup? If so, can I speed it up?

Edit: Okay, now I know I can use the .htaccess file to ban a range. Now, how can I figure out what that range should be for a given organization?

  • Take a look at .htaccess if you're using apache: .htaccess tutorial

    From mrlinx
  • How about an .htaccess:

    Deny from x.x.x.x
    

    if you need to deny a range say: 192.168.0.x then you would use

    Deny from 192.168.0
    

    and the same applies for hostnames:

    Deny from sub.domain.tld
    

    or if you want a PHP solution

    $ips = array('1.1.1.1', '2.2.2.2', '3.3.3.3');
    // in_array() needs the list to search as its second argument
    if (in_array($_SERVER['REMOTE_ADDR'], $ips)) { die(); }
    

    For more info on the htaccess method see this page.

    Determining the range is going to be hard: most companies (unless they are big corporations) are going to have a dynamic IP just like you and me.
    This is a problem I have had to deal with before, and the best thing is to ban either the hostname or the entire range. For example, if they are on 192.168.0.123, then ban 192.168.0.x. Unfortunately, you are going to block a few innocent people with either method.

    John Millikin : If I'm reading the OP right, he wants a way to ban based on DNS name (not IP address).
    Unkwntech : Yah I just updated it.
    joelhardi : Unkwntech's got it. I'll add that it's better performance-wise to figure out the IP-block to ban (no DNS lookups). And even faster to put a firewall (or iptables) block in, so they can't even hit Apache. If they're on a dynamic IP, you could have a cronjob nslookup their domain and update the rule.
    From Unkwntech
  • Do you have access to the actual server config? If so depending on the server you could do it in the configuration.

    See this thread for some information that may be helpful.

    From iros
  • Continue to use gethostbyaddr(), but behind a cache. You should only have to resolve it once per IP address, and then it would not be a significant performance issue. If you want, prime the cache from your server logs so returning users won't even hit the one-time slowdown.
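
    The caching idea is language-independent; here is a minimal sketch in C++ rather than PHP, to match the other examples on this page. HostnameCache is a hypothetical class, and the resolver is passed in as a function so that the slow reverse lookup (gethostbyaddr() in the question) stays outside the sketch. There is no expiry here, which a real cache would probably want.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical cache: calls the (slow) resolver at most once per IP address
// and serves every later request for that address from memory.
class HostnameCache {
public:
    typedef std::function<std::string(const std::string&)> Resolver;

    explicit HostnameCache(Resolver resolver)
        : resolver_(std::move(resolver)) {}

    // Returns the hostname for ip, resolving it only on the first request.
    const std::string& lookup(const std::string& ip) {
        std::map<std::string, std::string>::iterator it = cache_.find(ip);
        if (it == cache_.end()) {
            it = cache_.insert(std::make_pair(ip, resolver_(ip))).first;
        }
        return it->second;
    }

    // Lets the cache be primed, for example from existing server logs.
    void prime(const std::string& ip, const std::string& hostname) {
        cache_[ip] = hostname;
    }

private:
    Resolver resolver_;
    std::map<std::string, std::string> cache_;
};
```

    Priming from the logs means returning visitors never trigger a lookup at all.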

  • If you're practicing safe webhosting, then you have a firewall. Use it.

    Large companies have blocks of IP addresses, but even smaller companies rarely change their IP. So there's an easy way to do this without reducing your performance:

    Every month do a reverse lookup on all the IPs in your log and then put all the IPs used by that company in your firewall as deny.

    After a while you'll begin to see whether they have dynamic addresses or not. If they do, then you may have to do reverse lookups for each connection attempt, but unless they are a small company you shouldn't have to worry about it.

    From Adam Davis
  • First search for the company on whois.net. If you know they are just one domain, do a whois lookup. Otherwise, search for domains they own by keyword.

    You can find out the main IP ranges assigned to the company through whois queries, and then build your deny rule(s) accordingly.
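
    A deny rule for a range boils down to a prefix comparison. The sketch below shows that test for IPv4 in C++, assuming whois gave you a block in network/prefix-length form such as 192.168.0.0/24 (a made-up example reusing the private range from an earlier answer). parse_ipv4 and in_block are hypothetical helper names; a firewall or Apache's Deny directive does the same job for you.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Parses a dotted-quad IPv4 address into a 32-bit integer.
// Returns false for anything that is not exactly four numbers in 0..255.
bool parse_ipv4(const std::string& text, std::uint32_t& out) {
    unsigned a, b, c, d;
    char extra;
    if (std::sscanf(text.c_str(), "%u.%u.%u.%u%c", &a, &b, &c, &d, &extra) != 4) {
        return false;
    }
    if (a > 255 || b > 255 || c > 255 || d > 255) {
        return false;
    }
    out = (a << 24) | (b << 16) | (c << 8) | d;
    return true;
}

// True when ip lies inside network/prefix_len, e.g. 192.168.0.0/24.
bool in_block(const std::string& ip, const std::string& network,
              unsigned prefix_len) {
    std::uint32_t addr = 0;
    std::uint32_t net = 0;
    if (prefix_len > 32 || !parse_ipv4(ip, addr) || !parse_ipv4(network, net)) {
        return false;
    }
    // A /0 block matches everything; shifting a 32-bit value by 32 is
    // undefined, so that case is handled separately.
    std::uint32_t mask =
        prefix_len == 0 ? 0 : ~std::uint32_t(0) << (32 - prefix_len);
    return (addr & mask) == (net & mask);
}
```

    A company with several allocations just needs one in_block call per block.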

    From jtimberman
  • http://en.wikipedia.org/wiki/Rwhois

    telnet rwhois.arin.net 4321

    This used to work.

  • I know WikiScanner lets you search for a company or other organization, and then lists the IP address ranges belonging to them. Just as an example, here's all the IP addresses belonging to Google, at least according to WikiScanner.

    According to HowStuffWorks, they use something called "IP2Location".

  • If your goal in doing this is to make it slightly inconvenient for people from a company to access your site, follow the advice above. But you won't be able to completely ensure you're blocking every access because they could always be going through a proxy. And if it's accessible to the rest of the public, you'll have to worry about archive.org, search engine caches, etc.

    Probably not the answer you're looking for, but it's accurate.

  • The load shouldn't be put on the webserver; you should put it on the firewall.

  • Note that using the techniques above it will never be possible to completely ban the specific company from accessing your website. It will still be possible for them to use proxy servers or look at your site from home.

    If you absolutely want to control who has access, you should only allow authenticated and authorized users to access your site.