Thursday, March 7, 2013

Better processing software

Data reduction is what bogged me down last time.  How can I improve on it?  Here's my thinking right now.

Parts that work:
  • Java.  While I considered Perl for a while, I now know at least enough about regular expressions in Java to be dangerous.
  • Breaking the problem down, and solving the easy parts first.
Parts that I can improve:
  • Premature data modelling.  I was focused on targeting a relational data model.  There will ultimately be a relational database, but concentrating on the schema I had specified and generating load files for it put the cart before the horse.  Better to derive an object model first, tweak it as needed, and only then decide how the data should sit relationally.
  • Lack of a scorecard.  I did process the data in order of increasing difficulty, but the rather ad hoc software was not designed to support that.  The program needs a serious redesign so it can filter the input and attack the problem more systematically.  Being able to measure progress is a side effect of a good design in that respect.
If I re-attack the software design with those criteria in mind, I should be able to sustain the effort to the end, and make something that is easier to return to after a break.
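
To make the scorecard idea concrete, here is a minimal sketch of how it might look in Java.  Everything in it is an assumption for illustration: the class names (ScorecardSketch, Rule, Scorecard), the two sample rules, and the sample input lines are hypothetical and not taken from the real data.  The idea is only that rules are tried easiest first, and that the tally of handled versus unhandled lines falls out of the design for free.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

class ScorecardSketch {

    /** One parsing rule: a label plus the regular expression it can handle. */
    static final class Rule {
        final String name;
        final Pattern pattern;

        Rule(String name, String regex) {
            this.name = name;
            this.pattern = Pattern.compile(regex);
        }
    }

    /** Tally of lines handled per rule, plus the lines that no rule matched. */
    static final class Scorecard {
        final Map<String, Integer> handled = new LinkedHashMap<>();
        final List<String> unhandled = new ArrayList<>();

        int total() {
            int sum = unhandled.size();
            for (int n : handled.values()) {
                sum += n;
            }
            return sum;
        }

        double percentDone() {
            int total = total();
            return total == 0 ? 0.0 : 100.0 * (total - unhandled.size()) / total;
        }
    }

    /** Tries the rules in list order (easiest first) and records which one claimed each line. */
    static Scorecard process(List<Rule> rules, List<String> lines) {
        Scorecard card = new Scorecard();
        for (Rule rule : rules) {
            card.handled.put(rule.name, 0);
        }
        for (String line : lines) {
            boolean matched = false;
            for (Rule rule : rules) {
                if (rule.pattern.matcher(line).matches()) {
                    card.handled.merge(rule.name, 1, Integer::sum);
                    matched = true;
                    break; // first matching rule wins
                }
            }
            if (!matched) {
                card.unhandled.add(line); // left over for a later, harder rule
            }
        }
        return card;
    }

    public static void main(String[] args) {
        // Hypothetical rules and input, ordered from easy to hard.
        List<Rule> rules = Arrays.asList(
                new Rule("iso-date", "\\d{4}-\\d{2}-\\d{2}"),
                new Rule("key-value", "\\w+=\\w+"));
        List<String> lines = Arrays.asList("2013-03-07", "status=ok", "free text, not handled yet");

        Scorecard card = process(rules, lines);
        System.out.println("Handled per rule: " + card.handled);
        System.out.println("Unhandled lines:  " + card.unhandled);
        System.out.printf("Progress: %.1f%%%n", card.percentDone());
    }
}
```

The unhandled list doubles as the to-do list: each new rule I add should shrink it, and the percentage shows whether the effort is converging.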
