UPDATING A DATABASE

The SQL problems studied in the posts on this site can become a serious trouble when updating a database; I mean update in terms of insert, delete and update as such a database.

There are several possibilities in technology at the time of updating a database, for example, we have the case of input data to an information system, when we are migrating a database to a new one that usually has a different structure, in the case of updating a distributed databases that are spread in different locations, when we are loading data to a Data Mart, and a number of other possibilities, in all of them we must be aware of the danger of inserting data into a database when we don’t know the weaknesses of SQL.

In this case the problems go beyond query results, we know that the only way to update a SQL database is by using the language SQL, in the case of an information system a simple interface could let us put data in it which could erroneous data in database, from there on we could never trust in the query results anymore The possibility of erroneous data in SQL databases is present with standard operations INSERT, UPDATE and DELETE.

These operations may cause incorrect data inserted only in the case of using the Table expressions in the query, I mean using the SELECT expression within the operations INSERT, UPDATE and DELETE.

Essential to know about the weaknesses SQL have.

A detailed explanation plus some other problems found with the SQL aggregate operators is found in my book:

Amazon.com: http://tinyurl.com/amnz-sqlbook

Lulu.com: http://stores.lulu.com/store.php?fAcctID=3161804

Problems with the SQL operator Average

What is the average of this set of values? 4, 8, 6

The average is 6, the operation is (4+8+6) / 3.

If we use the SQL operator AVG() with that set of values the result is the same six, unless at least one more value appears in the set, I mean the “value” null, then the set become: 4, 8, 6, null

The operation would be? (4+8+6+null) / 3.

Whatever null means in this context, the common sense tells us that this operation using SQL should not give us a numeric answer because of null, maybe some other kind of answer like for ex. “I don’t know”.

The problem is that when a SQL query with operator AVG() has this later set of values the answer is 6, it means that SQL doesn’t consider null as another value, and worse it doesn’t tell us anything else, with the consequence that the answer may be misinterpreted.

We have to be consious about this problem. We can have a database table with a hundred of tuples with one numeric attribute that has the values 4 and 2, and the rest just losts of nulls,

4, 2, null, null, …, null a hundred times

then in an SQL AVG() operation we obtain the result of 3.

Obviously the answer cannot be 3, it makes no sense. Well Ok, SQL calculates (4+2) / 2 = 3, but may SQL tell us that there are losts of nulls not being considered in the operation, but no, SQL doesn’t tell us anything else but the result three.

Some other problems about the operator sum vs. count can be found in my publication:

A detailed explanation plus some other problems found with the SQL aggregate operators is found in my book:

Amazon.com: http://tinyurl.com/amnz-sqlbook

Lulu.com: http://stores.lulu.com/store.php?fAcctID=3161804

Domain (Data Type) and null

E. F. Codd (The father of the Relational Model) defined a tree valued logic that has True, False and Unknown, this later logic value is represented in current DBMSs with the null mark that, in general, represents the concept of Missing Information. He also classified Unknown into two possible Missing Information interpretations, one is applicable and the other is inapplicable.

 

The only possible mark that we can use in current DBMSs is null which represents Missing applicable Information; it means that when we don’t have a value for an attribute, null is put in the attribute.

 

The problem is that as we have seen in my post http://dbtips.wordpress.com/2009/05/31/some-notes-about-null/, null can not only represent Missing Information but also inapplicable information, and maybe there are some other meanings this mark could have in a database context.

 

In Database literature the null mark is not considered part of an attribute domain, then null could be considered an external mark from the attribute domain, that has a meaning in the context of the tuple that is Missing Applicable Information or Missing Inapplicable Information. The point is that we have to be conscious about the troubles null can cause in order to avoid them at the time of query construction.

There is an alternative that had been raised, the especial values which had been considered instead of the null mark to deal with Missing Applicable Information and Missing Inapplicable Information, for which we can give a meaning depending on the semantics an attribute have in a database context. For example, we can have “U” to represent missing information and “I” to represent not applicable, and some other possibilities that may exists. The problem is that the DBMS will not treat these values with the meaning we are giving, so we have to consider in queries conditions that let us differentiate these values from others.

 After this analysis we could have the following two alternatives: 

 

  1. Be conscious about the problems null mark can cause, and build the queries accordingly, but with the problem that we cannot differentiate the meaning of one null from other.
  2. Use especial values with the advantage that those values have a concrete meaning for us, with the problem that we have to consider them in the query conditions.

 You can find a more complete analysis of the null mark in my book:

Amazon.com: http://tinyurl.com/amnz-sqlbook

Lulu.com: http://stores.lulu.com/store.php?fAcctID=3161804

When subquery is empty in EXISTS (subquery)

Take a look at the example: http://dbtips.wordpress.com/2009/04/26/a-more-complex-sql-weakness-with-the-conditions-not-in-vs-not-exists/

Let’s analyse the result of the second example in the mentioned link:

The query: “Get the Bank names which doesn’t have ATMs in Madrid”

The result of the EXISTS subquery is empty, I mean that the subquery generates no tuples because the only possible tuple is the one that has Madrid in the attribute City an null in the attribute BNK, and the condition NOT EXISTS (subquery) generates True, that’s why we get a result with all the Bank names from de table BANK.

Then, how does the condition EXISTS behave from the perspective of SQL? Is this behaviour the same as in Logic?.

A detailed explanation is found in my book: Lulu.com: http://stores.lulu.com/store.php?fAcctID=3161804

Amazon.com: http://tinyurl.com/amnz-sqlbook

In the case of NOT IN (null)

Take a look at the example: http://dbtips.wordpress.com/2009/04/26/a-more-complex-sql-weakness-with-the-conditions-not-in-vs-not-exists/

Let’s analyse the result of the first example in the mentioned link:

The query: “Get the Bank names which doesn’t have ATMs in Madrid”

We have that the subquery (SELECT B.BNK FROM ATM B WHERE B.CITY = ‘Madrid’) generate the result null, and can we say that A.BANK isn’t in the list where there exists only null?.

 

Ovbiously not, what we can say is that it’s not known if any of the Banks in the table BANK exists in the list generated by the subquery, which has the only element null.

  A detailed explanation is found in my book: Lulu.com: http://stores.lulu.com/store.php?fAcctID=3161804

Amazon.com: http://tinyurl.com/amnz-sqlbook

Some notes about null

Take a look at the examples: http://dbtips.wordpress.com/2009/04/26/a-simple-sql-weakness-with-the-intersect-operation/

The following are some practical issues related to what Mr. C.J. Date said about null. Let’s analyse the results in a couple of cases.Consider that we have a couple of attributes att1 and att2.

Case 1 (simple conditions in the example):

att1 = 3 and att2 = null

if we have the condition att1 = att2, the answer is the same as false. But, is the value 3 different from null?, not neccesarily, attibute att2 could be 3, then the correct answer should be unknown.

Case 2 (Intersect in the example):

The intersection of att1 = null in one table and the same att1 = null in the other table.

In this case of the Intersect, null is considered a value, that is why the intersection take place. But null is not a value is a symbol that represent unknown or some other meaning (soon an analysis of this aspect), that`s not correct, because the att1 value from one table  could or could not be the same as att1 from the other table.

A detailed explanation is found in my book:

Amazon.com: http://tinyurl.com/amnz-sqlbook

Lulu.com: http://stores.lulu.com/store.php?fAcctID=3161804

Some SQL weaknesses explained in a video clip

In the following video clip from youtube.com I explain a couple of the weaknnesses in a graphical way.

http://www.youtube.com/watch?v=FWrH5K6V6E0

A simple SQL weakness with the Intersect operation

sqlintersect

A more complex SQL weakness with the conditions NOT IN vs. NOT EXISTS

(If needed see the example database)
sqlnotin-exists

A simple SQL weaknesses with operator avg()

sqlavg

Follow

Get every new post delivered to your Inbox.