Monday, April 25, 2011

Is a symbol table in Ruby any different from a symbol table in other languages?

The Wikipedia entry on symbol tables is a good reference:

http://en.wikipedia.org/wiki/Symbol_table

But as I try to understand symbols in Ruby and how they are represented in the Array of Symbols (returned by the Symbol.all_symbols method), I'm wondering whether Ruby's approach to the symbol table differs in any important way from other languages.

From Stack Overflow
  • Ruby doesn't really have a "symbol table" in that sense. It has bindings and symbols (what lispers call atoms), but it doesn't use them the way that article describes.

    So in answer to your question: it isn't so much that ruby does the same thing differently, but rather that it does two different things (the :xxx notation, which maps names to unique ids, and bindings in scopes) and uses similar / overlapping terminology for them.

    To clarify:

    The article you link to gives the conventional definition of a symbol table, to wit

    where each identifier in a program's source code is associated with information relating to its declaration or appearance in the source, such as its type, scope level and sometimes its location

    But this isn't what ruby's symbol table does. It just provides a globally unique identity for a certain class of objects which can be written as :something in the source code, including things like :+ and :"Hi bob!" which aren't identifiers. Also, merely using an identifier will not create a corresponding symbol. And finally, none of the information listed in the passage above is stored in ruby's list of symbols.

    It's a coincidence of naming, and reading that article will not help you understand ruby's symbols.
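
    A minimal sketch of the point above, assuming a reasonably recent MRI Ruby; the variable names are only for illustration. It shows that symbols such as :+ and :"Hi bob!" are ordinary runtime values, and that Symbol.all_symbols hands back a plain Array of Symbol objects with no type, scope, or location information attached:

    ```ruby
    # Both of these are symbols, although neither is a valid identifier.
    plus  = :+
    greet = :"Hi bob!"

    puts plus.class   # Symbol
    puts greet.to_s   # Hi bob!

    # Symbol.all_symbols returns an Array whose elements are just Symbol objects.
    all = Symbol.all_symbols
    puts all.class                          # Array
    puts all.all? { |s| s.is_a?(Symbol) }   # true
    puts all.include?(:+)                   # true
    ```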

    Ken : Ruby symbols are not "what Lispers call atoms". Ruby symbols are closest to Lisp keywords, which are just a special kind of symbol. Fractions, vectors, and strings are all atoms in Lisp, for example, but nothing like Ruby symbols.
    Ken : Maybe you're thinking of X11. I think X11 atoms are fairly similar to Ruby and Lisp symbols.
    Ellis : So, when you call Symbol.all_symbols, you're not accessing the Symbol table?
    MarkusQ : @Ellis -- You're accessing a list of symbols, but it isn't a "Symbol Table" as the term is being used in the article you linked to.
    MarkusQ : @Ken No, I meant what I said. For example, when you add two integers in ruby you're sending the message :+ to the receiver. There's a lot of syntactic sugar on the top, and a lot of semantically neutral optimization underneath, but you can make the mapping work.
    Ellis : @MarkusQ - Ken seems to disagree - He says a symbol is "just a pointer in the symbol table".
    MarkusQ : @Ken (cont) That isn't to say that you _must_ look at it that way; there are other valid ways to map ruby onto lisp, it's mostly a matter of using the one that works best in the particular circumstances.
    Ken : @MarkusQ: That's true, but you make it sound like Lisp "atoms" are similar to Ruby's "symbols". In Lisp, "atom" just means "non-list" (or more specifically, non-cons-cell).
    MarkusQ : @Ken -- and in ruby any non-structured entity (including operators that are otherwise unnamed) has a corresponding symbol. It's just that, apart from things like messing with the message passing hierarchy in odd ways, you don't really see / think about it.
    MarkusQ : @Ken In any case, I wasn't trying to draw a sharp parallel but rather to highlight a distinction. Symbols in ruby are (like atoms in lisp) part of the value space and not (like the identifiers in a c compiler's symbol table) a compile time accounting trick. That was the thrust of the comparison.
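
    MarkusQ's remark that adding two integers sends the message :+ to the receiver can be checked directly. This is only a sketch, with illustrative variable names; public_send and Method#name are standard Ruby, and nothing here says how the interpreter optimizes the call underneath:

    ```ruby
    # Integer addition, first with the usual operator syntax,
    # then as an explicit message send named by the symbol :+.
    sum_sugar   = 1 + 2
    sum_message = 1.public_send(:+, 2)

    puts sum_sugar == sum_message   # true

    # The method object that answers the message reports its name as a Symbol.
    plus_name = 1.method(:+).name
    puts plus_name.inspect          # :+
    ```

    Here the symbol is a value passed around at run time, which is the distinction from a compile-time symbol table that the comment is drawing.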
  • The biggest difference is that (like Lisp) Ruby actually has a syntax for symbols, and it's easy to add/remove things at runtime yourself. If you say :balloon (or "balloon".intern) it will intern that for you. Even though you're referring to it by name in your source, internally it's just a pointer in the symbol table. If you compare symbols, it's just a pointer-compare, not a string-compare.

    Languages like C don't really have a way to say simply "create a new symbol for me" at runtime. You can do it implicitly at compile-time by defining a function, but that's really its only use. Since C has no syntax for symbols, if you want to be able to say Balloon in your program but be able to compare it with a single machine instruction, you use enums (or #defines).

    In Ruby, it takes only one character to make a symbol, so you can use it for all kinds of things (like hash keys).
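
    The observable part of this, that the same name always yields the same symbol object, can be sketched as follows. The pointer-level representation is an implementation detail that this example does not inspect, and the variable names are illustrative:

    ```ruby
    a = :balloon
    b = "balloon".intern   # String#intern is the same operation as String#to_sym

    # equal? tests object identity, not contents.
    puts a.equal?(b)                   # true
    puts a.object_id == b.object_id    # true

    # Two separately built strings with the same contents are equal
    # by value but remain distinct objects.
    s1 = "balloon".dup
    s2 = "balloon".dup
    puts s1 == s2        # true
    puts s1.equal?(s2)   # false
    ```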

    Ellis : According to a comment by MarkusQ above, "You're accessing a list of symbols, but it isn't a "Symbol Table" as the term is being used in the article you linked to." This differs from your point that "internally it's (a symbol is) just a pointer in the symbol table"
    MarkusQ : @Ellis -- Ken and I aren't disagreeing as much as you seem to think. The symbol table he's talking about pointing into is _not_ the same sort of compile-time structure you'd find in a language like C (which is what your linked article is about), a point he also makes.
    Ken : I think our statements are compatible. Ruby has a "symbol" type, which has the attributes of symbol-table symbols. But Symbol.all_symbols isn't returning the (entire) Ruby Symbol Table, and the addresses are hidden. Hopefully between our two viewpoints you can extrapolate some truth. :-)
  • Symbols in Ruby are used where other languages tend to use enums, defines, constants and the like. They're also often used for associative keys. Their use has little to do with a symbol table as discussed in that article, except that they obviously exist in one.
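
As a rough illustration of that last answer, here is a hypothetical traffic-light example (NEXT_STATE and advance are made-up names, not from the thread) in which symbols stand in for what would be an enum in C and also serve as hash keys:

```ruby
# Symbols play the role of enum values and of associative keys at once.
NEXT_STATE = { green: :yellow, yellow: :red, red: :green }.freeze

# Returns the state that follows the given one.
# Hash#fetch raises KeyError when the state is not a known key.
def advance(state)
  NEXT_STATE.fetch(state)
end

puts advance(:green)   # yellow
puts advance(:red)     # green
```

Using fetch rather than [] is a design choice for the sketch: a misspelled state fails loudly instead of silently returning nil.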
