database-tables
 Rating: 1 Star2 Stars3 Stars4 Stars5 Stars
Loading...

Dangers of Adding fields to Database Tables

There are certain dangers if you don’t plan carefully before adding fields to your database. Though, it seems obvious but its not as simple as you might think. If you don’t plan properly you may miss something or ruin the current table which will consume lot of your time fixing. You must have come to a situation where things are looking so obvious but you aren’t getting a proper result. The cause behind the unexpected result can be careless addition to database fields. Lets check few scenarios where these things apply.

SELECT ALL

A SELECT * statement returns all field values from a table. While this is not only a lazy practice, returning more fields than needed will lead to performance problems in your application. In addition to potential performance problems, SELECT * statements can cause your application code to break when additional fields are added to your database table. Consider the following example. Database table Sales has four fields; SaleID, SalePrice, SaleDate and Product. A front-end application calls the following stored procedure to acquire sales for the last two days.

When this procedure is run, it returns all four fields from the Sales table. If the front-end application is coded in such a way so that it tries to handle only these four fields, returning an additional field may result in a broken front end. While something like this should be caught through testing, little situations like this may end up leaking through to production code which would result in a “broken” application.

The most simple method to correct this situation is to not use a SELECT * statement. If you always explicitly list the field names that you want your procedures to return then you’re much more likely to avert these types of problems as well as enjoy some performance gains from not returning all fields in the SELECT statement.

TABLE JOINS

Adding database fields to tables that participate in JOIN statements (probably most of your tables) can potentially cause problems. These problems can arise when the name of the field added also exists as a field name in the joined table and table aliases are not used.

To make this a bit clearer, consider the following example. The Products table has three fields; ProductID, ProductCode and ProductDescription. The Sales table has the following fields; SaleID, ProductID, SaleDate. Most queries join these tables in the form of:

Because most queries are only concerned with the ProductCode from the Products table, a developer as decided to add the ProductCode field to the Sales table to avoid the look-up. Adding the ProductCode to the Sales table will not only go against normalization rules (which is sometimes OK), but will immediately cause the query listed above to fail.

The reason is that no table aliases are used when listing the fields, so the optimizer will not know whether the query is specifying the ProductCode from the Products table or the Sales table. To fix this problem, any statement that looks at the ProductCode from the JOIN of these two tables and which does not use table aliases must be rewritten. To avoid this problem entirely, ALWAYS use table aliases when joining two or more tables.

Conclusion

Adding a field to a table should not cause your programs to crash. The problems encountered here could be very easily avoided if you include these types of situations in your database programming standards and ensure they are enforced. A little bit of looking forward can prevent many hours of reprogramming.

If you’ve got any thoughts, comments or suggestions for things we could add, leave a comment! Also please Subscribe to our RSS for latest tips, tricks and examples on cutting edge stuff.

Related Posts

How to create XML with PHP Mysql

How To Create XML With PHP MySQL

MySQL: foreign keys and referential integrity

Secure sensitive data in MySQL with few easy steps

Setting-Up a Relational Database in MySQL

  • Jice

    [QUOTE]
    If the front-end application is coded in such a way so that it tries to handle only these four fields…
    [/QUOTE]
    This seems to me a very bad coding way. Datas should be stored in structures where the fields names are specified.
    And in this case, if you add a new field in the select clause, * or not, the problem will still be there.

    I agree on the performance problems but a “select *” and good programming habits make database evolutions very easy to maintain : only the programs where the new field is needed should be modified.
    Here comes again the big question : should I give priority to maintenance or to performance.
    For me, maintenance should come first except if performance appears to be a problem…

  • Ture, when it comes to give priority then we should give priority to maintenance over performance but there are some critical cases where we can't neglect performance too.

    Suppose a large site like “Digg” where you have huge database and millions of people online doing database operations, performance becomes the key. Ideally, we should avoid the practice of using “Select *” in our queries.

  • Jice

    In this case, of course… For some huge databases, you can't lose time.

    But anyway the code should map the rows some way to avoid handling 4 fields or the 3rd field (as mentionned in the quoted part of my previous post)…
    It's better to have
    row[“field”]
    than
    row[3]

    Good habit is to have some MVC model and have database access objects where datas are retrieved and mapped in datastructures used anywhere else.

  • Jice

    In this case, of course… For some huge databases, you can't lose time.

    But anyway the code should map the rows some way to avoid handling 4 fields or the 3rd field (as mentionned in the quoted part of my previous post)…
    It's better to have
    row[“field”]
    than
    row[3]

    Good habit is to have some MVC model and have database access objects where datas are retrieved and mapped in datastructures used anywhere else.