Thursday, January 9, 2014

In The Data Trenches, Part One: The Limitations of Spreadsheets

The spreadsheet is central tool of business.  It allows users to easily, rapidly and accurately perform complex analyzes and calculations, providing insights vital to commercial success.  It is also a popular tool for inventories, contact lists, and even complex catalogs of witnesses and testimony or contracts and their terms.  As these catalogs grow, they become unwieldy, particularly if one wishes to cross reference between sets of data.  Spreadsheets are not designed to address such “relationships;” but relational databases are.  This series will discuss the fundamentals of databases and build a simple project tracking tool to illustrate the many benefits these programs can offer.

While there are a number of relational database programs on the market, Microsoft Access is probably the most widely known.  These articles will therefore use Access as their exemplar, but that is not intended as an endorsement of Access over any of its competitors.  Similarly, spreadsheet examples will be based on Microsoft Excel.
A correspondence file is a simple example of a spreadsheet used for record keeping might include the following fields:
  • Date
  • To
  • From
  • Regarding
  • Summary
  • Next Steps
 Technically known as a “flat file,” this file is simple to build and simple to use.  The column headings can be added in moments, and the list can be continued virtually forever, with new rows of entries added as needed.  But that simplicity belies the limitations inherent in this tool. 

A sample entry might be:

  • Date:              1/1/14
  • To:                  Ms. S. Smith
  • From:             Atty. Jones
  • Re:                  Black Acre
  • Summary:     Notice of alleged ground water contamination
  • Next Steps:  Organize meeting with regulators 

But imagine a file containing thousands of such records, on thousands of subjects.  Then imagine scouring the file for all correspondence regarding “Black Acre.” Experience spreadsheet users will recognize that the search can be completed with a simple “Find” (or “Sort") command; the less experienced might well become lost.  In addition, data needs to be keyed into each and every cell (even if one relies upon the "auto complete" features found in current spreadsheet programs).  “Ms. Smith” and “Black Acre” need to typed each time an item of correspondence is recorded.  Each  entry introduces opportunity for error.  “Black Acre” might become “Blak Arce,” “Blac Acre,” “Blk Acre,” or any other combination that tangled fingers might produce.  An electronic search for “Black Acre” will not produce records logged under “Blk Acre.” 

But there are other weaknesses:

  • This file does not offer any contact information.  One must look elsewhere for Ms. Smith’s address. 
  • Shipping fees are evidently captured elsewhere.
  • Each “Next Step” must be added manually to yet another separate record or tracking tool. 
To fully document that January 14 letter to Ms. Smith regarding Black Acre, one must consult this file, the firm’s address book, the billing system and the document management system, if one wants to see the letter itself.  Four searches for one document is hardly efficient.  More, those searches would not touch the underlying document - the notice of contamination mentioned in the "Summmary" field.

A relational database would provide a more robust tool, although it would require more set up.  It would contain separate files (to use the correct term, "tables") for contact information, projects, expenses, next steps and correspondence, tied together through “relationships,” and organized in such a way that each item of information only need be entered once.  After that, each item may be used again and again via drop-down menus.

Organizing these tables to ensure one time data entry is one of the keys to a successful database.  We will take up that question in our next installment.



No comments:

Post a Comment