Wednesday, January 13, 2010

Choose The Right Data Structure

This one is actually almost embarrassing to post. I've been working on an extension for TurboGears called tgext.menu. I'll not get into long explanations of why, or what I hope to accomplish with it (I'll leave that for when the first release happens, hopefully early next week).

I will say this, though: If you don't choose the right data structure, right off the bat, you are condemning yourself to a hellish experience coding.

In my case, I had a list that looked like this:

  • ExitApp
  • Foo Spot
  • Foo Spot || Bar
  • Foo Spot || Baz
  • Foo Spot || Foo
  • Foo Spot || Sub || Bar
  • Foo Spot || Sub || Baz
  • Foo Spot || Sub || Foo
  • Sub || Sub 1
  • Sub || Sub 1 || Nested 1
  • TestHome
I wanted that to become a nested unordered list in HTML, looking like this:
  • ExitApp
  • Foo Spot


    • Bar
    • Baz
    • Foo
    • Sub


      • Bar
      • Baz
      • Foo




  • Sub


    • Sub 1


      • Nested 1




  • TestHome
Looks really simple, doesn't it? All you have to do is loop over the list, split each item on the || symbols, generate openings, put your entries, and generate closings. Except you have to compare the current base of the tree with the next base of the tree and the previous base of the tree.

I can't count the number of methods I tried, and every one of them had a small, subtle, error that threw off all the output. Sometimes elements would not be closed properly. Sometimes they would not be opened properly. I tried iterating, I tried recursion. All of them failed with errors that I could not find a generic way to correct.

Saying this was frustrating is a major understatement. Last night, I finally changed my data structure that represented the output: Instead of a list, I went with a tree. Using a tree, I was able to write up (in less than a dozen lines) a simple recursive loop that wrote out the entire tree, with appropriate whitespace, CSS class tags, the work.

In one hour, I solved problems that had been plaguing me for two weeks.

Lesson learned: If I'm spending a lot of time trying to solve a specific problem, and my algorithms are always producing errors, I need to look at the underlying data structures. I'll probably save myself a lot of time that way.

No comments: