Wicked Number Formatting in T-SQL

One thing you are actually not supposed to do is to format your data in the database.  Databases are to store and manage data, and presentation is for the UI layer.

Life has its funny ways to make you do things which you should not under normal cirumstances, things which you have sworn by yourself not to.  I came to a situation where I needed to format a money number following “#,.####” within the body of a T-SQL stored  procedure.  I simply had to do it there.  The requirement came in a very late moment of the testing phase, when moving this part out was definitely not feasible, considering the approaching deadline.

T-SQL has some limited functions to format data into string.  In reference to numbers - float and money, you can use CONVERT.  For float, there is no sensible option for ‘#,.####‘.  For money, CONVERT gives three options:

0 (default) No commas every three digits to the left of the decimal point, and two digits to the right of the decimal point; for example, 4235.98.
1 Commas every three digits to the left of the decimal point, and two digits to the right of the decimal point; for example, 3,510.92.
2 No commas every three digits to the left of the decimal point, and four digits to the right of the decimal point; for example, 4235.9819.

As you can see, ‘#,.####’ is a mix of 1 and 2.

Poor man is devil, Bulgarians say.  What I did was to get the string, result of CONVERT with style 1, and concatenate it with the last two characters of the string, result of CONVERT with style 2:

UPDATE : 14th December 2009

 

There is a bug in the above code, which produces wrong result in case the last two digits are greater than 49.  The correct solution would be:

Posted by Nik on November 12th, 2009 1 Comment

Google is making us stupid

More often than not, when confronted with a problem, software developers rush to www.google.com, and start searching for samples and ready-to-use pieces of code.  The actual documentation of the language / framework / tool / whatever is often left untouched.  Google’s highly efficient search engine is really good in finding the best answer, much more relevant than the results of the built-in searching facilities of the documentation in question.  Plus, it reveals others’ experience.

As good as it gets, using Google too much has a dark side.  Developers often come to the point when they implement solutions with technologies they do not understand.  They are not troubled by the fact that they use code, written by someone else, and which they don’t understand.  If they run into a trouble with the sample solution they found, they will go to various forums to find a fix.  An old joke about how to write an algorithm to find a lion in the desert and lock it in a cage had it for Delphi that Delphi developers would go to all known forums and post “Does anyone have a component which finds a lion in the desert and locks it in a cage?”.  Well, I am sorry to inform you, smart Delph-haters out there, that is no longer a Delphi-thing only.  You all fell into the trap.

To be honest, that particular approach works quite often, and it works well.  Why am I making fuss about it, then?  Development goes with good speed, results are often nice.  Until the moment when Google doesn’t give an answer.  Or, to be presice, its answer is incomprehensible, because the developer has no understanding of the technologies involves whatsoever. 

The ease with which solutions are found in Google takes away the incentive to get into real understanding of how things actually work.  As a result, junior developers stay junior, and never learn basics which will take them to a higher level of skills.

That is the reason why I am no longer eager to give code samples.  People would never read them, just copy & paste it and even ask specific questions how to integrate them into their own code. I prefer explaining things, and give out only snippets of important code.  Who claims to be a developer should be capable of figuring the rest by oneself.

Posted by Nik on August 24th, 2009 No Comments

SQL Server: Shrink Database Log, Part II

Here is another approach to shrink an extensively large MS SQL Server log file:

Thanks to  Stilgar for helping me out with this one.

Posted by Nik on July 13th, 2009 2 Comments

SQL Server: Shrink Database Log

In Microsoft SQL Server sometimes you can observe the LOG file growing uncontrollably and unexpectedly to a size times the size of the DATA file.  I’ve witnessed  a database with data file of 200 MB and a log file of 4 GB.  Not only harddisk space is wasted, but the backup file becomes too inconvenient to drag around even heavily compressed.

Why does the LOG file grow that much is a different story, which I am not prepared to tell now.  The question is how to make the LOG file small again within the “normal” boundaries.  Using the “Shrink database” command does no good.  Backup and restore of the database gives you the same old large log file.

The only solution I know so far involves the following steps:

  1. Create a new, empty database.
  2. Script all the db structure, and use it to create an identical copy of your old one into the new database you just created.
  3. Use bulk copy to transfer all data from the original database to the new one.

Voila, the LOG file is back to normal.

Note that you might need to disable certain triggers and foreign keys before the copy operations - that depends completely on your database.

Posted by Nik on July 7th, 2009 No Comments

Coding Horrors: MySQL, Dates and Times

An excerpt from MySQL Reference Manual:

Illegal DATETIME, DATE, or TIMESTAMP values are converted to the “zero” value of the appropriate type (’0000-00-00 00:00:00′ or ‘0000-00-00′).

What the fuck.  Really, come on.  The purpose of a database is to store valid and integral data.  Again.  Valid.  And.  Integral.  Data.  It is valid, integral and meaningful that counts.  Not just any rubbish.  If I want rubbish, I can just read back random bytes from the harddisk.  If someone tries to write crap into a database, which the database cannot handle, the database must refuse the operation, and give error. 

Really, I wonder what the hell designers of MySQL were thinking when they implemented this specific feature.  Pardon me, the word “feature” is not appropriate.  The correct word would be “disaster”.  It is a disaster indeed, because it silently causes data loss.  One might argue that it is developers’ responsibility to validate their data, and make sure they provide the database with valid input.  While this is definitely true, it is no excuse for the database to accept graciously any garbage, and on top of that convert it to something which has nothing to do with what it is supposed to be.  That is a major fail in design, and major fail to achieve a primary goal.

The behavior above is true for MySQL versions from 4.1 to 6.0.

This little story only proves that MySQL is only good as a backend for small and simple websites.  If you mean serious business, please reconsider.  There are a bunch of free databases out there, all of them much, much better and more mature.  Some of them even open source.

And yes, I know there is a general setting in MySQL 5.02 and greater to make it give errors.  And yes, I don’t think it makes the fail lesser.

Posted by Nik on May 27th, 2009 2 Comments

Rant: Moving Targets

A trip back in time.  In 2000 we were starting a large project.  The architecture was pretty standard stuff – database server, application server with business objects, semi-thin desktop client, featuring bi-directional communication between the clients and the server.  All were Windows native applications.  During the design phase the question about the inter-tier communication technology came up.  Back then, Windows DNA1 was the super-dee-duper stuff out there, praised by Microsoft as the prime guide to making distributed applications.  We decided to follow its wisdom, and built the communication layer on top of Distributed COM and COM+.  Oh, why, why?

Read the rest of this entry »

Posted by Nik on May 25th, 2009 3 Comments

Virtual ListViews - Name for Speed

ListView.  Huh?  Come on, you all know it - the right pane of Windows Explorer.  That is, where contents of folders are displayed.  Whenever you see large icons, or small items listed in columns, or a list of items with details in other columns, it is always one and the same control - ListView.  Kind of a Listbox, but much, much better and powerful.  ListView control came into this world with Windows Explorer back in WIndows 95, and got numerous enhancements eversince.  ListView is very convenient when it comes to display a list of items.  It can even substitute grid controls for read-only data in many cases, as ListView can display multiple columns per data item.  Add support for icons, check boxes, grouping and different layouts, and you have a real beauty.  How to use ListViews is out of the scope of this post.  If you want to learn it - just google for it.  There are tons of tutorials out there.

Great as they are, ListView contols can be very slow with large number of items (like tens of thousands), especially while populating with data.  Furthermore, by default ListView stores all its items internally.  In case you populate the List View from an in-memory collection, you end up with doubled memory usage.  Another thing to be worried about when working with tons of data.

The solution to both the performance and memory-usage issues is to use Virtual ListViews.

Read the rest of this entry »

Posted by Nik on April 10th, 2009 4 Comments

Coding Horrors: C++ and Overloaded Methods

My tough feelings for C++ as a language are all but new.  Here is another example why this language is not suited for children, and young students should not take C++ as their first programming language, unless they are really smart and have solid common sense to protect them from the bad influence of this language.

Consider the following code:

Nothing special, right?  It looks absolutely logical that the call to myFunction in main() would invoke void myFunction(String, String, bool = false).  Well, it won’t.  The function that will get called is void myFunction(String, bool = false).

Well, I must admit that I myself would not figure out easily.  Further reading revealed the explanation - the compiler is using the more cost-efficient conversion of char * to bool, rather than creating a new instance of class String.  Further reading proved that the behaviour of the compiler is strictly according to the C++ Standards!

Feel free to call me any names you can think of, but for me the authors of C++ Standards live in a parallel, sick universe.  Guys, have you heard that code has to be readable, too?

P.S. Edit: fixing typos.

Posted by Nik on April 9th, 2009 3 Comments

Select Master Records with Multiple Child Records

A friend of mine asked this question:

I have two tables: table T1 with primary key field ID, and table T2 with field ID which is a foreign key to ID from T1, and field VAL.  How do I select all records from T1, for which there are records from T2 with values X and Y? 

For example, considering the following data, I want all records from T1, where exist records records from T2 with values 1 and 2.

T1 has records
1
2
3
T2 has records
1
1
1
2
2
1
3
1
3
2

Desired output is records 1 and 3 from table 1.

The question is actually interesing only if we want to select records from table T1 for which there are an arbitrary number of records with respective values in T2.

There are a few ways to implement this.  Here is my choice number one for an arbitrary number of values:

If we are considering a few number of values - like four or five, I would suggest an alternative, which might prove even more efficient:

 

Providing we have a foreign key on T2.ID to T1.ID, the JOIN-approach can be really efficient. 

Posted by Nik on February 3rd, 2009 No Comments

Lectures OOP w/ C#

I started teaching a course in Object-oriented programming with C# 3.0.  Here is the link to the slides of the lectures.

Posted by Nik on January 29th, 2009 4 Comments