Wednesday, September 21, 2005

I was just thinking, it's about time I talked about SQL Server on my blog LOL

Here are 5 tips for creating well designed tables in Microsoft SQL Server.  They are in no particular order and it's far from an exhaustive list, but it makes for a good standard I think.

1.  Keep your tables narrow:  I don't just mean about the number of columns per-table, but also the width of those columns.  Normalization is your friend here.  I won't go into it here yet, but shoot for 3nf by default. 

By keeping an eye on table width and keeping your rows as narrow as you can you will get more rows on a page which will help you in case a query misses and index and requires a table scan.


2.  Add a deleted bit column to your tables.  Design your tables with an extra column called "deleted" or something like that.  Instead of removing the row from the table if there is a delete, set this flag = True. 

Some reasons you should do this are:

  • It opens up more options for implementing an undelete function and it helps archiving/unarchiving old data.
  • It's non-destructive!
  • It helps index fragmentation.

 

3.  Maintain aggregate tables.  This one probabbly applies more to transactional system than others but by using maybe triggers or a scheduled job to maintain seperate tables that store common aggregates for your main tables you will for make your reports more efficient and can simplify queries and joins. 

Two things to remember: 

  • These tables may not be indexed the same as the transactional tables, but this is a good thing.  Choose your indexes on these tables to suit exactly the queries you need.
  • If you require the aggregate tables to be absolutely up-to-date with the main data tables then using triggers may be the best solution but don't over do it!  Triggers can be costly and have an adverse effect on transaction log throughput.

4.  Maintain created and modified timestamps on each row.  Design your tables with columns called created and deleted and store the timestamp of when the row was created and also the last time the row was modified.  Do this as well as an audit trail. 

This is the kind of gift that just keeps giving!  Some examples:

  • Once your little back room app has 'grown up' and now needs to feed the data warehouse?  No problems! 
  • Need to validate your aduit log or a point-in-time restore?  No problems.
  • Users always love to see these details on the screen, or give your app a history feature like a web browser has so users can backtrack to what they were doing before the phone rang?  No problems.

5.  Don’t allow null unless it actually valid.  Don't be scared of Null values, but only allow them where they make sense.  Remember Null doesn’t equal "" or 0 - it means undefined. 

Consider the mod timestamp (from point 4 above) on a newly created row.  It has never been modified so Null actually makes sense.  The worst thing to do to these kinds of fields is to use "magic" values to avoid null columns,  for example giving "01/01/1970" the special meaning of "unknown" is just a waste of effort and makes your database usable only by applications that know these specific facts, seriously harming interoperability.  Also consider a table that joins to itself to represent a hierarchy - null can be valid in the parent field.

That said, don't allow nulls anywhere that they don't make sense.  If you find yourself thinking about optional fields consider normalizing, or using an attribute-collection pattern instead!   The thing to avoid is creating tables that has a column for each of the potential fields.  The typical example is having one field for each of business phone, business direct line/ext, home phone, mobile phone, fax number - each of them  with the same validation rules, and most of the rows  don't fill all columns!

So there you go!  5 steps closer to peace in the database world!

Wednesday, September 21, 2005 3:58:44 PM (AUS Eastern Standard Time, UTC+10:00)  #    Disclaimer  |  Comments [0]  | 

Jesper Johansson has a good post on his blog this morning about what to do with the Built-In Adminstrator account.
I think this quote could become a new mantra: The built-in Administrator is basically a setup and disaster recovery account. Hallelujah!

Wednesday, September 21, 2005 10:23:36 AM (AUS Eastern Standard Time, UTC+10:00)  #    Disclaimer  |  Comments [0]  | 
 Thursday, September 15, 2005

I thought it time to recap some old database theory ahead of SQL Sevrer 2005 and the potential pain that CLR integration may bring to the DBA/Developer relationship.

Like all technology, if it's not used correctly it's just plain dumb.  CLR integration has been discussed between gurus whose shoes I am noth worth to clean ad nauseum, so we won't cover it here.  In a nutshell we are talking here about the integration of the Microsoft .NET Framework and runtime into Microsoft SQL Server.  This will allow you to write managed code in the database and a bunch of other cool things.

The thing to be mindful I think is when dealing with a .NET complex type.  Consider you're dealing with point data (i.e. x,y,z coordinates in space).  How would you represent this in relational data?  It may be convienient to store each point as some kind of string (NCHAR(5) in SQL Server types maybe) but that's just poor normalization.  What then happens if you want to process all the y values?

More correct may be three integer or float columns - one each for x, y and z.  This is getting there but what could be a scalar result from this?  It would have to be a concatination like:

SELECT cast([x] AS VARCHAR(3)) + ',' + cast([y] AS VARCHAR(3)) + ',' + cast([z] AS VARCHAR(3)) AS Point

to get

Point      
-----------
4,20,9

(1 row(s) affected)

...And that's ugly :)

So what's the bottom line here?  One of the most exciting things I am looking forward to in SQL 2005 CLR code is the idea of using a managed custom class to describe a point, and having it as a scalar result in a T-SQL operation.  That's hot if you ask me!

So what could be the problem?  Some peanut somewhere is probabbly going to use the new found CLR powers and code up a Tetris implementation and have it loaded into the CLR inside SQL 2005.  Sure you could do it, but all you are going to do is anger some DBA and give a bad name to CLR code, making it harder to adopt the correct implementations out in the field.

I'd love to give some credit where it is due for this post, but I can't remember whose writing it was that got me started thinking about this.  Whoever you are, thanks!

Thursday, September 15, 2005 1:20:07 PM (AUS Eastern Standard Time, UTC+10:00)  #    Disclaimer  |  Comments [0]  | 
 Tuesday, September 06, 2005

Ok, so part of this post is a plug for Greg Low's podcast, SQL Down Under -  just to get the caveats out of the way first! 

The guest in this week's show is Itzik Ben-Gan talking about the T-SQL language that is native to Microsoft SQL Server.  About 11 minutes in, his discusses how any problem you can face in T-SQL (or coding in general I'd suppose) is really a logic problem.

The example he gave (that I have illustrated below in a sample) is of the problem of counting the number of occurances of one string in another.  Instinctively, we sometimes start looking at a itterative process, to count each occurance as it occurs.

He proposes a deceptively simple solution:  Replace every occurance of the string being searched
for (@needle) with a zero-length string. 

You can then compare the length of the original text  (@haystack) with the modified text (@delta) and divide that by the length of the string being searched for (@needle) to reach the number of occurances of @needle in @haystack.

Again, the logic of this solution is Itzic's not mine :)

Here's a simple implementation of this logic that illustrates the point well, and ends up a much cleaner (and possibly faster) solution than the itterative one.

---------------------------------------------------------------------------

DECLARE @needle CHAR(4)
DECLARE @haystack CHAR(255)
DECLARE @delta CHAR(255)

SET @needle = 'pie'
SET @haystack = 'meat pie, shepherds pie, pork pie, mud pie, ' +
            'cutie pie, mushroom and beef pie, meat pie, ' +
            'shepherds pie, pork pie, mud pie, cutie pie, ' +
            'mushroom and beef pie, meat pie, shepherds pie, ' +
            'pork pie, mud pie, cutie pie, mushroom and beef pie, ' +
            'are there other kinds of pie?'


SET @delta = REPLACE(@haystack, @needle, '')

--NB: Extra brackets to make order of opperations clear.
SELECT (len(@haystack) - len(@delta))  /  len(@needle) AS NumOccurances

---------------------------------------------------------------------------
Tuesday, September 06, 2005 3:32:58 PM (AUS Eastern Standard Time, UTC+10:00)  #    Disclaimer  |  Comments [0]  | 
 Saturday, September 03, 2005

What a way to spend a week.  Major props to Jesper (http://blogs.technet.com/jesper%5Fjohansson/) for what could have been the best session of Tech.Ed on What no body told you about securing Microsoft SQL Server.

Saturday, September 03, 2005 12:57:56 PM (AUS Eastern Standard Time, UTC+10:00)  #    Disclaimer  |  Comments [0]  | 
 Tuesday, August 30, 2005

Well I'm soon off to the Gold Coast for Tech.Ed



See you on Friday
Tuesday, August 30, 2005 2:11:54 PM (AUS Eastern Standard Time, UTC+10:00)  #    Disclaimer  |  Comments [0]  | 
 Wednesday, July 20, 2005

Well, without having a free minute to write my own blog software I have finally gone and done it - installed someone else's.  I'll be moving all my old blog posts over to this blog... well um... sometime when I get around to it :)  -James

 

Wednesday, July 20, 2005 5:00:00 PM (AUS Eastern Standard Time, UTC+10:00)  #    Disclaimer  |  Comments [1]  |