Programming


It’s been  a while before i wrote anything due to focus on the ongoing’s of the political fiasco in ma mother land Kenya but that doesn’t mean that im out..

Anyway what brought me here is because of the neat features of SQL Server 2008 One being the intellisense bit which to me was long overdue and it was about time it came into action .I wonder if they bought this idea from Red Gate.The guys of SQL tool belt which i happen to use in the office time to time.

SQL 2008 Intellisense

 I’ve also noticed that in the query editor window one can group the sql code by clicking the elipse at the top of the editor as illustrated.Notice the yellow bar on the left and the hook at the first line of Code. Second image shows the “smart tag” rolled up.

SQL 2008 CTP

Smart Tag

More of it as i discover the new features of SQL Server 2008 and Visual Studio 2008

SQL Server Database administrators often have requirements such as importing multiple files to a table on a database. This article discusses how to upload multiple files, especially XML files, to the SQL Server database XML data type column.

Pre-requisite

a. Make sure xp_cmdshell is enabled. If it is not enabled, execute the following command as shown below.

use master
go
sp_configure 'show advanced options',1
go
reconfigure with override
go
sp_configure 'xp_cmdshell',1
go
reconfigure with override
go

b. Create a database and table dedicated for this import as shown below. You could also use an existing database.

USE [master]
GO
/****** Object:  Database [XMLTest]
 Script Date: 04/17/2007 01:49:43 ******/
IF  EXISTS (SELECT name FROM sys.databases WHERE name = N'XMLTest')
DROP DATABASE [XMLTest]
go
create database XMLTest
go
use XMLTest
go

c. Be sure to create a table with an XML data type. Also, make sure at least one column in the table can hold the value of the filename as shown below.

use [XMLTest]
/****** object:  table [dbo].[myimage]
 script date: 09/10/2006 21:55:46 ******/
if  exists (select * from sys.objects
   where object_id = object_id(N'[MYXML]') and type in (N'u'))
drop table [MYXML]
go
create table [MYXML] (
[id] int identity(1,1),
[XML File Name] varchar(100),
[Data] XML)
Go

d. Let us assume that we want to upload all of the .bmp files from the C:\XML [refer Fig 1.0 and 1.1] folder to the “MYXML” table in the “dbo” schema on the “XMLTest” database.

multixml_image001.jpg

Example content of Customer5.xml

<?xml version="1.0" standalone="yes"?>
<Customer>
  <CustomerLogInfo>
    <Date>2007-03-31T06:40:38.0000000-05:00</Date>
    <user>Rainbow.River</user>
    <Userid>1AE</Userid>
    <ls>A-Accessible</ls>
    <eqtid>761</eqtid>
    <es>Stopped</es>
    <tp>30</tp>
  </CustomerLogInfo>
  <CustomerLogInfo>
    <Date>2007-03-31T06:40:38.0000000-05:00</Date>
    <user>Rainbow.River</user>
    <Userid>1AE</Userid>
    <ls>Not-Accessible</ls>
    <eqtid>870</eqtid>
    <es>Stopped</es>
    <tp>30</tp>
  </CustomerLogInfo>
  <CustomerLogInfo>
    <Date>2007-03-31T06:40:38.0000000-05:00</Date>
    <user>Rainbow.River</user>
    <Userid>1AE</Userid>
    <ls>A-Accessible</ls>
    <eqtid>97F</eqtid>
    <es>Started</es>
    <tp>30</tp>
  </CustomerLogInfo>
</Customer>

e. Let us create a procedure, usp_uploadXMLFiles . This creates a stored procedure, usp_uploadXMLfiles, on the master database so that it can be executed and called for any database.

USE [master]
GO
/****** Object:  StoredProcedure [dbo].[usp_uploadXMLfiles]    Script Date: 09/10/2006 23:33:34 ******/
IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[usp_uploadXMLfiles]') AND type in (N'P', N'PC'))
DROP PROCEDURE [dbo].[usp_uploadXMLfiles]
go
set quoted_identifier off
go
create procedure usp_uploadXMLfiles
@databasename varchar(128),
@schemaname varchar(128),
@tablename varchar(128),
@FileNameColumn varchar(128),
@xmlcolumn varchar(128),
@path varchar(500),
@filetype varchar(10),
@printorexec varchar(5) = 'print'
as
set nocount on
declare @dircommand varchar(1500)
declare @insertquery varchar(2000)
declare @updatequery varchar(2000)
declare @count int
declare @maxcount int
declare @filename varchar(500)
set @count=1
set @dircommand = 'dir /b '+@path+@filetype
create table #dir (name varchar(1500))
insert #dir(name) exec master..xp_cmdshell @dircommand
delete from #dir where name is NULL
create table #dir2 (id int identity(1,1),name varchar(1500))
insert into #dir2 select name from #dir
--select * from #dir2
set @maxcount = ident_current('#dir2')
print 'set quoted_identifier off'
print 'go'
while @count <=@maxcount
begin
set @filename =(select name from #dir2 where id = @count)
set @insertquery = 'Insert into ['+@databasename+'].['+@schemaname+'].['+@tablename+']
 ([' +@filenamecolumn +']) values ("'+@filename+'")'
set @updatequery = 'update ['+@databasename+'].['+@schemaname+'].['+@tablename+']
 set ['+@xmlcolumn+'] = (SELECT *    FROM OPENROWSET(BULK "'+@path+@filename+'", SINGLE_BLOB)AS x )
 WHERE ['+@filenamecolumn +']="'+@filename+'"'
if @printorexec ='print'
begin
print @insertquery
print @updatequery
end
if @printorexec ='exec'
begin
set @insertquery='set quoted_identifier off' + char(10)+char(13)+@insertquery
set @updatequery='set quoted_identifier off' + char(10)+char(13)+@updatequery
exec (@insertquery)
exec (@updatequery)
end
set @count = @count +1
end
go

This procedure accepts the following parameters:

@databasename = Name of the database where the schema and table exist.

@schemaname = Schema of the database where the table exists

@tablename = Name of the table where files are going to be uploaded

@FileNameColumn = Name of the column in the table where the file name is going to be stored

@XMLcolumn = The actual XML Datatype column where the file is going to be stored as XML

@path = Path of all the files that are suppose to be uploaded. Example “C:\Windows\”

@filetype = Type of file you want to upload. Example “*.XML”

@printorexec = if “Print” is passed as a parameter it will generate and display the commands. If “Exec” is passed as a parameter, it will execute the command directly meaning upload all the files.

f. Let us execute the procedure with the parameter printorexec=’print’ as shown below.

Exec master..[usp_uploadXMLfiles]
@databasename ='XMLTest',
@schemaname ='dbo',
@tablename ='MYXML',
@FileNameColumn ='XML File Name',
@XMLcolumn = 'Data',
@path = 'c:XML',
@filetype ='*.xml',
@printorexec ='print'

This will generate all of the commands needed for creating a row for each file and updating the row with the proper file as shown below.

set quoted_identifier off
go
Insert into [XMLTest].[dbo].[MYXML] ([XML File Name]) values ("Customer1.xml")
update [XMLTest].[dbo].[MYXML] set [Data] = (SELECT *
 FROM OPENROWSET(BULK "c:XMLCustomer1.xml", SINGLE_BLOB)AS x )
 WHERE [XML File Name]="Customer1.xml"
Insert into [XMLTest].[dbo].[MYXML] ([XML File Name]) values ("Customer2.xml")
update [XMLTest].[dbo].[MYXML] set [Data] = (SELECT *
 FROM OPENROWSET(BULK "c:XMLCustomer2.xml", SINGLE_BLOB)AS x )
 WHERE [XML File Name]="Customer2.xml"
Insert into [XMLTest].[dbo].[MYXML] ([XML File Name]) values ("Customer3.xml")
update [XMLTest].[dbo].[MYXML] set [Data] = (SELECT *
 FROM OPENROWSET(BULK "c:XMLCustomer3.xml", SINGLE_BLOB)AS x )
 WHERE [XML File Name]="Customer3.xml"
Insert into [XMLTest].[dbo].[MYXML] ([XML File Name]) values ("Customer4.xml")
update [XMLTest].[dbo].[MYXML] set [Data] = (SELECT *
 FROM OPENROWSET(BULK "c:XMLCustomer4.xml", SINGLE_BLOB)AS x )
 WHERE [XML File Name]="Customer4.xml"
Insert into [XMLTest].[dbo].[MYXML] ([XML File Name]) values ("Customer5.xml")
update [XMLTest].[dbo].[MYXML] set [Data] = (SELECT *
 FROM OPENROWSET(BULK "c:XMLCustomer5.xml", SINGLE_BLOB)AS x )
 WHERE [XML File Name]="Customer5.xml"
Insert into [XMLTest].[dbo].[MYXML] ([XML File Name]) values ("Customer6.xml")
update [XMLTest].[dbo].[MYXML] set [Data] = (SELECT *
 FROM OPENROWSET(BULK "c:XMLCustomer6.xml", SINGLE_BLOB)AS x )
 WHERE [XML File Name]="Customer6.xml"
Insert into [XMLTest].[dbo].[MYXML] ([XML File Name]) values ("Customer7.xml")
update [XMLTest].[dbo].[MYXML] set [Data] = (SELECT *
 FROM OPENROWSET(BULK "c:XMLCustomer7.xml", SINGLE_BLOB)AS x )
 WHERE [XML File Name]="Customer7.xml"

g. Now let us execute the procedure with the parameter printorexec=’exec’ as shown below.

Exec master..[usp_uploadXMLfiles]
@databasename ='XMLTest',
@schemaname ='dbo',
@tablename ='MYXML',
@FileNameColumn ='XML File Name',
@XMLcolumn = 'Data',
@path = 'c:XML',
@filetype ='*.xml',
@printorexec ='exec'

This will upload all of the xml files.

h. Now let us query the table using the transact SQL statement below.

use XMLTest
go
select * from myxml
go

This produces the following results. [Refer Fig 1.1]

multixml_image002.jpg

i. Click on the XML data links to produce the result shown below.

<Customer>
  <CustomerLogInfo>
    <Date>2007-03-31T06:40:38.0000000-05:00</Date>
    <user>Dancing.Doll</user>
    <Userid>1AE</Userid>
    <ls>A-Accessible</ls>
    <eqtid>761</eqtid>
    <es>Stopped</es>
    <tp>30</tp>
  </CustomerLogInfo>
  <CustomerLogInfo>
    <Date>2007-03-31T06:40:38.0000000-05:00</Date>
    <user>Dancing.Doll</user>
    <Userid>1AE</Userid>
    <ls>Not-Accessible</ls>
    <eqtid>870</eqtid>
    <es>Stopped</es>
    <tp>30</tp>
  </CustomerLogInfo>
  <CustomerLogInfo>
    <Date>2007-03-31T06:40:38.0000000-05:00</Date>
    <user>Dancing.Doll</user>
    <Userid>1AE</Userid>
    <ls>A-Accessible</ls>
    <eqtid>97F</eqtid>
    <es>Started</es>
    <tp>30</tp>
  </CustomerLogInfo>
</Customer>

CONCLUSION

I believe now that now you can be able to upload your XML data comfortably.

In my day to day life im usually asking myself several questions on which T-SQL statement to use when it comes to using NOT IN() and <> statements and most of the time the answer i give myself and others is that “It Depends”

Looking at the to statements(NOT IN() and <> ) they turn out that they are actually the same query, and it should make absolutely no difference. This is because SQL is a declarative language, meaning: you tell the computer what you want, not how to get it. The query engine takes both of these queries and performs them with the exact same sequence of events. In actuality, one of the queries typically outperformes the other.

Let’s set off to find out if NOT IN is quicker than a “bunch of <>” and see a few methods that improve in their performance. Assuming this is the query we want to execute and notting the WHERE clause evaluating the inequality of a column and four integer values:

   s.Status_SV <> 214

   and

    s.Status_SV <> 215

   and

    s.Status_SV <> 216

   and

    s.Status_SV <> 217

Now that we know we’re dealing with integer data, we’ll set up a table with an integer column, then we’ll set up a timing and iteration framework for tracking our query times over multiple executions. Just to make our queries lengthy enough to be nicely measurable, I arbitrarily chose to insert 1,000,000. The test results were less conclusive at 10,000 rows (with one notable exception), but the trends we see at 1,000,000 rows are clearly discernable by the time we have 100,000 rows.

Let’s set up our table:

   PRINT 'Creating sample table...'

   CREATE TABLE [dbo].[tbl_IN_VS_AND]

   (filterCriterion_sv int identity(1,1) PRIMARY KEY CLUSTERED NOT NULL) 


   DECLARE @i int

   SET @i = 0

   PRINT 'Populating sample table (1.000.000 rows.  This could take several minutes)...'

   WHILE @i < 1000000

   BEGIN

      INSERT dbo.tbl_IN_VS_AND DEFAULT VALUES

      SET @i = @i + 1

   END

And our execution framework:

	SET NOCOUNT ON  


   DECLARE @startTime datetime, @endTime datetime

   DECLARE @results int, @iterations int

   DECLARE @i int


   SET @iterations = 100


   ---- setup is finished.  let's run some tests

   PRINT 'Beginning test run...'


   SET @i = 0


   SET @startTime = getDate()


   WHILE @i < @iterations

   BEGIN

   /*

   test queries go within this loop

   */

      SET @i = @i + 1

   END

   SET @endTime = getDate()


   PRINT 'Elapsed Time: ' + cast(datediff(ms, @startTime, @endTime) AS varchar) + ' ms'

Now we’ll write two queries to exercise our original question (NOT IN vs. <>). We’ll start with the NOT IN query, then follow up with the AND <> query:

   SELECT @results = count(filterCriterion_sv)

   FROM tbl_IN_VS_AND

   WHERE filterCriterion_sv NOT IN (214, 215, 216, 217) 


   SELECT @results = count(filterCriterion_sv)

   FROM tbl_IN_VS_AND

   WHERE filterCriterion_sv <> 214

   AND filterCriterion_sv <> 215

   AND filterCriterion_sv <> 216

   AND filterCriterion_sv <> 217

The NOT IN() certainly wins for conciseness. Before running a lot of iterations, let’s look at snippets of the query plan text so we know SQL is doing the same thing under the covers for each query.

NOT IN(): 


       |--Compute Scalar(DEFINE:([Expr1003]=CONVERT_IMPLICIT(int,[globalagg1005],0)))

            |--Stream Aggregate(DEFINE:([globalagg1005]=SUM([partialagg1004])))

                 |--Parallelism(Gather Streams)

                      |--Stream Aggregate(DEFINE:([partialagg1004]=Count(*)))

                           |--Clustered Index

Scan(OBJECT:([master].[dbo].[tbl_IN_VS_AND].[PK__tbl_IN_VS_AND__3B4BBA2E]),

 WHERE:([master].[dbo].[tbl_IN_VS_AND].[filterCriterion_sv]<>(214) AND

 [master].[dbo].[tbl_IN_VS_AND].[filterCriterion_sv]<>(215) AND [

AND <>:

       |--Compute Scalar(DEFINE:([Expr1003]=CONVERT_IMPLICIT(int,[globalagg1005],0)))

            |--Stream Aggregate(DEFINE:([globalagg1005]=SUM([partialagg1004])))

                 |--Parallelism(Gather Streams)

                      |--Stream Aggregate(DEFINE:([partialagg1004]=Count(*)))

                           |--Clustered Index

Scan(OBJECT:([master].[dbo].[tbl_IN_VS_AND].[PK__tbl_IN_VS_AND__3B4BBA2E]),

WHERE:([master].[dbo].[tbl_IN_VS_AND].[filterCriterion_sv]<>(214) AND

[master].[dbo].[tbl_IN_VS_AND].[filterCriterion_sv]<>(215) AND [

It turns out that SQL likes the AND <> so much, it converts the NOT IN() clause into a series of AND <> clauses. After 100 executions of each query, it seems that execution times for the AND <> query tend to be lower than those for the NOT IN query (conversion overhead, maybe?):

   Beginning first test run...

   "NOT IN" ET: 46170 ms

   Beginning second test run...

   "AND <>" ET: 42326 ms

I ran the same series of tests on another occasion and the NOT IN query consistently outperformed the AND <> query. The results regularly go back and forth, much like the heads and tails of a coin toss. So, I have managed to convince myself that, despite the two execution times listed above, these two queries are, indeed, the same query as far as SQL Server is concerned — at least on this day, on this server, for these queries (I still gravitate toward the comforting ambiguity of “it depends”).

I mentioned earlier that there were several queries that outperform our basic AND <> and NOT IN queries (on this server on this day). Let’s take a look at some of those queries and their execution results. The first alternative technique doesn’t use a WHERE clause to filter out our integer values. It places the integer values into a UNION query and does a LEFT OUTER JOIN against that to filter out unequal rows. Here is what that query looks like:

   SELECT @results = count(filterCriterion_sv)

   FROM tbl_IN_VS_AND

   LEFT OUTER JOIN (

         SELECT 214 AS filterValue_val UNION

         SELECT 215 UNION

         SELECT 216 UNION

         SELECT 217 ) AS tbl

      ON tbl_IN_VS_AND.filterCriterion_sv = tbl.filterValue_val

   WHERE tbl.filterValue_val IS NULL

It definitely feels odd, placing things you would normally put in a WHERE clause into a derived table then looking for absent values, but the performance benefit gives us a compelling reason to consider doing this. On this test run of 100 executions, this odd query was consistently outperforming the quicker of our original two queries by about 19%:

   Beginning fourth test run...

   "derived UNION table LEFT OUTER JOIN" ET: 34360 ms

Our last query happened to be our best performing (on this server, on this day). Like the previous query, this one uses a derived table. However, it takes it one step further and nests that inside an IF NOT EXISTS(). Let’s take a look at it:

   SELECT @results = count(filterCriterion_sv)

   FROM tbl_IN_VS_AND

   WHERE NOT EXISTS(SELECT * FROM

      (

         SELECT 214 AS filterValue_val UNION ALL

         SELECT 215 UNION ALL

         SELECT 216 UNION ALL

         SELECT 217 ) AS tbl

      WHERE tbl.filterValue_val = tbl_IN_VS_AND.filterCriterion_sv )

And here is the time it took for 100 executions:

   Beginning seventh test run...

   "NOT EXISTS from derived UNION table" ET: 27920 ms



I’ve been looking around the rollout of Windows Vista and Windows 2008 and im a bit suprised that they happen to be the last of 32 Bit operating system.Even the recently released Microsoft Exchange and it’s several updates is running on 64 Bit. So what does this mean to the african programmers and computer gurus?I think it’s about time we bought some stock in Intel or AMD. As more software moves to 64-bit only editions, there’s bound to be a nice upsurge in server replacements on this new platform. And think about how many desktops for developers and admins will likely need to be replaced with the upgrades to the version after Vista.On the other hand im looking at the perfomance of the 64 Bit’s and I wont be suprised if they turn out to be useless considering the way Exchange 2007 is faring in Africa.

This is the best control any developer on 2.0 platform can use because of the following features:-

1.You have your business logic away from the UI

2.You get to choose which methods to call from your business logic depending on the event you want.

Today i’ve had a very long day thanx to some bug that comes with .net 2005.It occurs if you are dynamically populating the dataview or gridview controls and you make a slight adjustment to the dataclass!rest assured you have to redo everything from scratch!Dont i hate simple technology?nway im still shocked by the driver of this matatu.ask me wat he was doing in the ditch n i’ll give you the whole story.