Flat-File (Spreadsheet style) versus Relational Databases

I see the question coming up more and more of when to switch from something like Excel to something more like Access or a more fully fledged database server like MS SQL Server for data storage. So, here are a few things I find useful to consider when deciding between a relational database (RDB) driven versus a spreadsheet driven tracker:

  • Number of people trying to access the same data at the same time.
  • Size of the data to be stored.
  • How well you know the data structures you’ll be storing.
  • Integrity constraints needed by business requirements for the base data.
  • Complexity and number of calculations to be performed on the stored data.

If you’re going to have more than two people trying to coordinate update/delete access to the base data, an RDB starts looking better. The more people, the better it looks.

Setting up an RDB requires a lot more forethought than setting up a spreadsheet. If you know for sure everything you need to store and how it all logically relates, that makes the RDB a lot easier to set up, but still not as easy as a spreadsheet.

If you don’t know for sure how you need to store things, the spreadsheet allows you to more easily change the structure of the stored data.

That ease of changing things in a spreadsheet is a draw back if you need a high level of data integrity, because setting up data validations in Excel, for example, can be painful, whereas it’s just part of the design process with an RDB. Spreadsheets effectively require the users to handle the data validation checks.

Now, if you have some a few simple calculations to perform, that can easily be set up in a view/query. Access will bog down if the total number of calculations gets into the thousands (depending on the speed of the computer it’s running on), but an RDBMS like SQL Server can handle a pretty serious work load of simple calculations easily. Complex calculations, however, are where spreadsheets shine.

The good news is that even if it makes sense to store the data in an RDB, most modern spreadsheet programs can use that as a data source and still handle the number crunching for you. You just need to ensure that you’re giving the spreadsheet a view/query with all the base data it needs for the number crunching.