Batch vs craft, 80 vs 20
The decision of batch conversion vs hand-crafting comes up often in my work and play. In any version-change situation you have to decide how to handle your painfully acquired collection of data files. Mass-audience programs like Word and Excel generally offer internal conversions for old files, up to a point. In narrower contexts you're generally on your own.
Over the years I've learned when to batch'em and when to hand-craft'em; or in more subtle situations, how much of the conversion is worth batching. The decision often feels wrong at the time (as in JESUS! THIS IS A LOT OF FUCKING WORK!) but usually turns out to be right. The alternative would have been EVEN MORE FUCKING WORK!
These decisions are related to the well-known 80/20 effect, the observation that the first 80% of the work takes 20% of the time, while the last 20% takes 80% of the time. In short, the decision to automate or not is the ONE piece of programming that CAN'T be automated.
Now that the latest courseware project is finished, I decided to go back and sharpen some tools, which includes fixing some oddities and bugs in my batch converter. This is an extremely narrow context! I'm the sole user of this converter, which turns my old (nicely compact and orthogonal) courseware script into the modern jumbled mess of HTML/JS/SVG.
This image shows the flavor of the conversion. The left and right sides are the old and new codes for the same section of one page. It's not a direct mapping in any way; the underlying worldviews are entirely different.
Toward the end of the courseware project, QA decided to get fussy about HTML quirks. Can't blame them. It was running fine as is, but in the O'Brien world of HTML/JS/SVG you never know when Acceptable will suddenly become Intolerable. There were two constant oddities that appeared on most pages, and several variable oddities that had to be located more carefully.
Should I try to fix the converter and run everything again? Or just edit the bad pages directly? I decided it would be faster to craft'em in this case. Manually checking and fixing all 5500 HTML pages took 8 hours, distributed over 3 days.
Now that the deadlines are gone, I'm trying the opposite approach, fixing the converter itself to avoid the problem entirely in potential future projects. One of the 'constant' errors took only 30 seconds to fix; the other took 5 minutes. So far I've spent 8 hours TRYING to fix the 'variable' errors, and I'm nowhere near finding them, let alone fixing them.
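For anyone who wants the arithmetic spelled out, here's a rough Python sketch of the comparison. The numbers are the ones quoted above; the function names are made up for illustration, and the 'batch' total is only a lower bound, since the variable errors still aren't fixed.

```python
# Figures quoted in the post; names and structure are illustrative only.
PAGES = 5500
CRAFT_HOURS = 8.0                    # hand-checking and fixing every page
CONSTANT_FIX_SECONDS = 30 + 5 * 60   # the two 'constant' converter bugs
VARIABLE_FIX_HOURS_SO_FAR = 8.0      # still unfinished, so a lower bound


def seconds_per_page(hours: float, pages: int) -> float:
    """Average hand-crafting time per page, in seconds."""
    return hours * 3600 / pages


def batch_hours_so_far() -> float:
    """Time already sunk into fixing the converter, in hours."""
    return CONSTANT_FIX_SECONDS / 3600 + VARIABLE_FIX_HOURS_SO_FAR


if __name__ == "__main__":
    per_page = seconds_per_page(CRAFT_HOURS, PAGES)
    batch = batch_hours_so_far()
    print(f"craft: {CRAFT_HOURS:.1f} h total, {per_page:.1f} s per page")
    print(f"batch: at least {batch:.2f} h, and counting")
    print("craft wins" if CRAFT_HOURS < batch else "batch wins (so far)")
```

The constant errors alone would have been a bargain at five and a half minutes. It's the variable ones that blow the budget.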
So the decision to craft the HTMLs separately was the right decision. As usual.