Showing posts with label SSIS. Show all posts
Showing posts with label SSIS. Show all posts

Tuesday, February 17, 2009

Analysis Services Partition Processing

Recently, while trawling the internet, I noticed that there are not a lot of SSIS examples detailing how to process Analysis Services cubes, especially if you are using Analysis Services partitioning. So I thought I would add an approach I have been using over the last few years.

While reading this you are probably wondering what all the fuss is about. Why not just use the Analysis Services Processing Task? Well, I guess this is definitely an option when processing dimensions and full cubes or measure groups with small data sets. This task is very useful and does provide a logging framework with a lot of detail. However, it is not dynamic. If the partitioning strategy defined is by month for example and the requirement is to process just those partitions with new data, then a more dynamic approach may be needed.

This is the scenario for which I am aiming this blog. This approach does add more code and therefore possibly more maintenance but may be a necessity when encountering larger data sets.

So if we look at the example mentioned above where we are partitioning by month. We could have a measure group with n monthly partitions and a need to process the last two as they could have new data. In this case, we may need a new Analysis Services partition or two which would represent these new months. (I am not going to add the code that creates new partitions in this blog to avoid complexity. I will save this for another blog!)

In this approach I am going to create the relevant XMLA scripts and then use the SSIS Analysis Services Execute DDL Task to execute them. I have seen many people attempt this approach by creating their own XMLA scripts. However, anyone who has tried this before will agree that the string manipulation required can be a real headache. So having a bit of a development background myself I prefer to allow the AMO (Analysis Management Objects) object model to do the hard work for me.

So onto the SSIS package…

First I declare a package variable: strXMLAScript (string)

Then add a script task and add the following code:

Imports System, System.Data, System.Math, Microsoft.SqlServer.Dts.Runtime
Imports AMO = Microsoft.AnalysisServices, Microsoft.SqlServer.Management
Imports Microsoft.AnalysisServices.QueryBinding

Note: You will need to add the relevant AMO reference to the script task before being able to use it. However, when adding a reference the relevant component i.e. Analysis Management Objects may not be available. This can be made available by coping the Microsoft.AnalysisServices.dll from “\Program Files\Microsoft SQL Server\90\sdk\Assemblies\Microsoft.AnalysisServices.DLL” into the relevant framework folder version e.g. “\windows\Microsoft.NET\Framework\v2.0.50727”.

Then declare script variables and set the task variable for event firing:

Dim amoServer As AMO.Server, amoMeasureGroup As AMO.MeasureGroup, amoPartition As AMO.Partition
Dim strXMLAScript As String, strTaskName As String, oVariables As Variables = Dts.Variables

Dts.VariableDispenser.LockOneForRead("System::TaskName", oVariables)
strTaskName = oVariables.Item("System::TaskName").Value.ToString


Then initialized the relevant Analysis Services objects:



amoServer = New AMO.Server()
amoServer.Connect("Data Source=LOCALHOST;Initial Catalog=Adventure Works DW;Provider=MSOLAP.3;Integrated Security=SSPI;Impersonation Level=Impersonate;")
amoMeasureGroup = amoServer.Databases.FindByName(amoServer.ConnectionInfo.Catalog.ToString).Cubes.FindByName("Adventure Works").MeasureGroups.FindByName("Reseller Sales")


Then set the server property CaptureXML. (This will make sure that all actions executed against the server during this session will not be implemented but rather that the XMLA for each action is made available.)


amoServer.CaptureXml = True

Once this is done the search criteria for finding the correct partitions will need to be implemented. This will obviously be site specific but I usually name the partitions by measure group and then yyyyMM. So the relevant partitions I am looking for are ResellerSales200901, ResellerSales200902.


' Loop through measure groups partitions
For Each amoPartition In amoMeasureGroup.Partitions
Select Case amoPartition.Name.ToUpper
Case amoMeasureGroup.Name.ToUpper & " " & DateAdd(DateInterval.Month, -1, Now.Date).ToString("yyyyMM"), _
amoMeasureGroup.Name.ToUpper & " " & Now.Date.ToString("yyyyMM")
amoPartition.Process(AMO.ProcessType.ProcessFull) ' Could be any process Type required

' Report the partions that have been added to the process list
Dts.Events.FireInformation(0, strTaskName, "Added partition - " & amoPartition.Name & _
" to process list.", String.Empty, 0, True)
End Select
Next



Note: Even when the loop is complete, no processing has been done. Only the XMLA script has been captured. Also note that I usually add an information event in the loop so I can easily highlight in my logging framework the partitions that have been added to the process list. I have found this information very useful when debugging.


We are now able to capture the XMLA script that the AMO object model has generated and set the ProcessInTransaction and ProcessInParallel properties as required. The package variable (strXMLAScript) is then populated with the XMLA string .


' Set in transaction and in parallel properties and capture
strXMLAScript = amoServer.ConcatenateCaptureLog(True, True)

' Write to package variable
Dts.VariableDispenser.LockOneForWrite("User::strXMLAString", oVariables)
oVariables.Item("User::strXMLAString").Value = strXMLAScript


The AMO server object does allow for the execution on the XMLA script once it has all been captured. This can be done with the following method:


amoServer.ExecuteCaptureLog(True, True)


However, this will require that you write a lot more code to capture all the possible errors and/or warnings that may occur. I prefer to allow the Analysis Services Execute DDL Task to do all this work for me. It also makes sure that all the relevant events that may need to be captured i.e. Onprogress, OnInformation, OnWarning and OnError are easily captured by your logging framework.


So all that is needed now is the Analysis Services Execute DDL Task source property is set to the variable strXMLA and we are ready to process.


The following shows the execution results with all the possible events that could be captured by your logging framework:


Package


This is clearly a very simple example but can easily extend to accommodate for most, if not all, of your partition processing needs.


Hope you find this helpful!


Happy Deving!


(I have added a zip file to SkyDrive with the sample package and xmla to add the required partitions to the Adventure Works DW Cube. It can be found here).

Wednesday, April 25, 2007

SSIS : Package Configurations

There are many advantages to setting up configurations for your SSIS packages. If you have configuration values set up throughout your packages you can avoid unnecessary retesting and deployment processes later on. It also makes the process of redeploying to multiple environments (Dev, UAT, SysTest, and Prod) a lot easier.

On a recent project my colleagues and I decided that it would be very useful to implement package configurations in our SSIS packages. After a little bit of investigation we realised that the out-of-the-box package configuration although useful would not provide us with what we required.

Our requirements were as follows:

• Multiple filtering options - Because a package can be used in multiple systems and multiple environments (Development, UAT, Production)
• System and Package variable - Some variables (e.g. warehouse connection string) maybe system wide variables whereas some may be unique for a specific package.
• Encryption of Sensitive Information- Some of the variables may contain sensitive information that we wanted to have encrypted in the database.
• Config Audit Log - An audit log of what configuration values were during the time the package executed.
• Config Audit Trail - An audit trail of who has changed the config values
• Simple implementation - an easy way to implement the config process throughout all existing and future packages.

Database
I am sure most SSIS implementation use an Audit database of some kind to capture package and task execution logs. (See a future blog on SSIS logging) This is a good place to store your configuration information as it goes hand-in-hand with your logging information.

As we are storing this information in a database which can be moved between servers with varying users it is advisable to have its connection string being set up using the out of the box SSIS configuration functionality.

The database contains configuration tables i.e. PackageConfig, PackageAudit, PackageConfigLog. The PackageConfig table contains information uniquely defining each package through it’s package GUID and each of the package variables, variable types and variable values along with a flag whether the value should be encrypted or not. You should also include multiple columns for filtering your configuration values by i.e. System, Environment and Warehouse Process etc. The PackageAudit table contains all the necessary columns from the PackageConfig table that need to be audited as well a column for the user who made the change and the time of the change. This table has a trigger that will add a new row to the audit table whenever the main table is added to or changed. The PackageConfigLog will contain the necessary columns from the PackageConfig table that will allow the administrator to see which configuration values were used during each execution. Trust me, without this table you can be testing and get unusual results, take many hours hunting down what you might have done wrong and in the end find out that one of you colleagues change a config value that was vital to your test. If you had checked this table first, life could have been a lot less painful.

SSIS
Inside you package you will need a task to overwrite the variable values as defined in the package. You may want to write a custom task for this or just use a script task. As this is a task that will be used in all you packages I would suggest using the custom task options as changes to the script once it has been implemented in 100s of packages can be a real nightmare.

I have included a simplified version of the code below


' open connection
oSQLConn = New SqlConnection(strConnectionString) : oSQLConn.Open()
oSQLComm
= New SqlCommand("uspGetConfigInfo", oSQLConn)
oSQLComm.CommandType
= CommandType.StoredProcedure

' Add parameters to SQL command
Call ParametersGetConfig()

' Execute and return dataset
oSQLReader = oSQLComm.ExecuteReader()

' Loop through reader
While oSQLReader.Read
If Dts.VariableDispenser.Contains(oSQLReader.GetString(0).ToString) Then
' Lock variable for write
Dts.VariableDispenser.LockOneForWrite(oSQLReader.GetString(0).ToString, oVariables)

' Check variable is the correct type
If oVariables.Item(oSQLReader.GetString(0).ToString).DataType.ToString = oSQLReader.GetString(2) Then
oVariables.Item(oSQLReader.GetString(
0).ToString).Value = Convert.ChangeType(oSQLReader.GetValue(1), _
oVariables.Item(oSQLReader.GetString(
0).ToString).DataType)

' Report information
Dts.Events.FireInformation(0, strComponent, "The variable - " & oSQLReader.GetString(0).ToString & " has been updated.", _
String.Empty, 0, True)

Else ' Report incorrect data type
Dts.Events.FireWarning(0, strComponent, "The variable - " & oSQLReader.GetString(0).ToString & " has a different data type then " & _
"that defined in the config database.", String.Empty, 0)
End If
Else ' Report missing variable
Dts.Events.FireWarning(0, strComponent, "The variable - " & oSQLReader.GetString(0).ToString & " is missing.", String.Empty, 0)
End If
End While

' Close SQL Data Reader
oSQLReader.Close()

A stored procedure (uspGetConfigInfo) will get the variable information from the table discussed earlier and find the relevant variables, verify that the variable type are correct and then replace the value in the package. The change will be reported to the package through the component event namespace and can be captured by your chosen logging mechanism in the OnInformation event handler. The config log information could be implemented using a similar process in which the valid variables (with the correct data type) are written away to the PackageConfigLog table.

Encryption:
Using SQL Server encryption you can encrypt variable values in your PackageConfig table and decrypt them in your GetConfigValues stored procedure. Ensure that these values are not written to the package config log as that would obviously ruin the whole purpose of the encryption process.

Monday, January 8, 2007

SSIS: Event Firing within Script Tasks

When using the Script component is SSIS, I have noticed that alot of people don't take advantage of the IDTSComponentEvents Interface in the DTS namespace. This interface allows for event firing. All the existing tasks use this interface to fire the events required during their execution. Standard events include OnError, OnWarning, OnInformation and OnProgress. The Script Task can use this interface to easily portray to the Progress Tab, SSIS Logging or the EventHandler exactly what is happening in the script. I include it in all my script tasks for error handling but it can be used alot more extensively depending on the complexity. For instances showing the progress as you loop through items in a collection.

The first thing to do when setting this up is to add the System variable TaskName to the ReadOnly variables collection list.



The Code will look as follows :

 

Imports System, System.Data, System.Math, Microsoft.SqlServer.Dts.Runtime

Public Class ScriptMain

Public Sub Main()

Dim SubComponent As String = String.Empty

Try
' Get local instances of variables collection
Dim oVars As Variables = Dts.Variables

' Get Task name
SubComponent = oVars.Item("System::TaskName").Value.ToString

' Loop abb 10 times
For i As Integer = 1 To 10
Dts.Events.FireProgress(
"My Message : Wow look at me go!", i * 10, 0, 0, SubComponent, True)
Next

' Express loop success
Dts.Events.FireInformation(0, SubComponent, "Looping Succeeded", String.Empty, 0, True)

' Show Error
Dts.Events.FireError(0, SubComponent, "This would be my error!", String.Empty, 0)

Dts.TaskResult
= Dts.Results.Success
Catch exDTS As DtsException
Dts.Events.FireError(
CInt(exDTS.ErrorCode), SubComponent, exDTS.Message.ToString, String.Empty, 0)
Catch ex As Exception
Dts.Events.FireError(
0, SubComponent, ex.Message.ToString, String.Empty, 0)
End Try

End Sub

End Class


SSIS shows the progress as :


These events can also be captured by SSIS logging or the SSIS event handler.

Saturday, December 16, 2006

SSIS Macros - Part 3 : Get Task Event Handlers

Anyone who has been using SSIS for a reasonable amount of time will probably be very familiar with SSIS event handlers. I have found this to be a very useful feature and it always one to avoid the spider web nightmare that can occur when you are trying to capture each task's error and end up having a failure constraint coming from each of them.

However when using events one has to be careful at which level you handle them. If you use the OnError event at the package level and the task level, and the task fails, you will notice that the event will fire twice due to the fact that events bubble up from each task to their container. This can easily be overcome by setting the task system variable propogate in the event handler to false. However whenever you are developing a complex task or have need to edit one, it is difficult without a lot of clicking to find out which tasks have event handler, which have progate set to false and which are being handled at the container or package level.

Once again, I thought this would be an ideal time to write a quick macro that would allow me to see the event handler information for more complex packages.

Here is what is displayed by the macro I use:



Below is a sample of the code I used to get this information:

Dim oSSISApp As New SSIS.Application, oPackage As SSIS.Package, oTaskHost As SSIS.TaskHost
Dim oExec As SSIS.Executable, oEventHandler As SSIS.DtsEventHandler, oContainer As Object, i As Integer, j As Integer

Try
' Get package object
oPackage = oSSISApp.LoadPackage(DTE.ActiveDocument.FullName, Nothing)
frm.gbPackage.Text
= oPackage.Name.ToString

' Loop through events in the package
For Each oEventHandler In oPackage.EventHandlers
frm.lvwEventHandler.Items.Add(frm.lvwEventHandler.Items.Count
+ 1)

i
= frm.lvwEventHandler.Items.Count - 1
frm.lvwEventHandler.Items(i).SubItems.Add(oPackage.GetType.Name.ToString)
frm.lvwEventHandler.Items(i).SubItems.Add(oPackage.Name.ToString)
frm.lvwEventHandler.Items(i).SubItems.Add(oPackage.EventHandlers.Count.ToString)
frm.lvwEventHandler.Items(i).SubItems.Add(oEventHandler.Name.ToString)
frm.lvwEventHandler.Items(i).SubItems.Add(oEventHandler.Variables.Item(
"Propagate").Value.ToString)
Next

' Loop through executables in package
For Each oExec In oPackage.Executables
' Check that the executable is a task
If oExec.GetType.Name = "TaskHost" Then

' Set the SSIS type
oTaskHost = CType(oExec, SSIS.TaskHost)

' If no events
If oTaskHost.EventHandlers.Count = 0 Then
frm.lvwEventHandler.Items.Add(frm.lvwEventHandler.Items.Count
+ 1)

i
= frm.lvwEventHandler.Items.Count - 1
frm.lvwEventHandler.Items(i).SubItems.Add(oExec.GetType.Name.ToString)
frm.lvwEventHandler.Items(i).SubItems.Add(oTaskHost.Name.ToString)
frm.lvwEventHandler.Items(i).SubItems.Add(oTaskHost.EventHandlers.Count.ToString)
End If

' Loop through event handlers
For Each oEventHandler In oTaskHost.EventHandlers
frm.lvwEventHandler.Items.Add(frm.lvwEventHandler.Items.Count
+ 1)

j
= frm.lvwEventHandler.Items.Count - 1
frm.lvwEventHandler.Items(j).SubItems.Add(oExec.GetType.Name.ToString)
frm.lvwEventHandler.Items(j).SubItems.Add(oTaskHost.Name.ToString)
frm.lvwEventHandler.Items(j).SubItems.Add(oTaskHost.EventHandlers.Count.ToString)
frm.lvwEventHandler.Items(j).SubItems.Add(oEventHandler.Name.ToString)
frm.lvwEventHandler.Items(j).SubItems.Add(oEventHandler.Variables.Item(
"Propagate").Value.ToString)
Next
End If
Next
Catch ex As Exception

End Try

End Sub


I hope some will find benefit from this, I definitely have.

Thursday, December 14, 2006

SSIS Macros - Part 2 : Get Tasks Without Descriptions

When using SSIS logging or the SSIS Event Handlers to get which tasks have fired and what their results were, I have found that it is very handy to have a good description for each task that is being used in each package.

The few SSIS documenting tools I have seen all take advantage of these descriptions and therefore the benefits of using them greatly outway the effort it takes to define them. However, once your package becomes more complex it can be tedous to have to go to each task click on it, then find the description property in the properties window to see if you remembered to edit the description or not. To ensure that I didn't pull anymore hair out of my head due to this this mundane process, I slapped together a simple macro that will run through a package and see if the Task and Component Descriptions are blank or are still set to the the default.

Here is a result of the macro :



Below is a sample of the code I used to get this information:

Private Sub GetObjectDescriptions(ByVal frm As frmObjectDescriptions)

Dim oSSISApp As New SSIS.Application, oPackage As SSIS.Package, oTaskHost As SSIS.TaskHost
Dim oExec As SSIS.Executable, oEventHandler As SSIS.DtsEventHandler
Dim oMainPipe As MainPipe

Try
' Get package object
oPackage = oSSISApp.LoadPackage(DTE.ActiveDocument.FullName, Nothing)

frm.gbPackage.Text
= oPackage.Name

'Loop through executables in package
For Each oExec In oPackage.Executables
If oExec.GetType.Name = "TaskHost" Then

' Set the SSIS type
oTaskHost = CType(oExec, SSIS.TaskHost)
Dim oNode As TreeNode = frm.tvwObjects.Nodes.Add("Task - " & oTaskHost.Name.ToString & " - " & _
CheckTaskDesc(oSSISApp, oTaskHost.Description.ToString))

' TODO : Find out how to check if it is a data flow or not
Try
oMainPipe
= CType(oTaskHost.InnerObject, MainPipe)

' Loop through components
For i As Integer = 0 To oMainPipe.ComponentMetaDataCollection.Count - 1
oNode.Nodes.Add(
"Dataflow - " & oMainPipe.ComponentMetaDataCollection.Item(i).Name.ToString & " - " & _
CheckPipeCompDesc(oSSISApp, oMainPipe.ComponentMetaDataCollection.Item(i).Description.ToString))
Next
Catch ex As Exception

End Try
End If
Next

Catch ex As Exception
MessageBox.Show(ex.Message.ToString)
Finally

End Try

End Sub



If you are like me and have seen the benefit of having these descriptions defined and want to avoid the headache this process can cause I would recommend take a bit of time to set this up. Your receding hairline will thank you as mine did.

SSIS Macros - Part 1 : Reset Task IDs

As a first blog and seeing that I have found myself doing alot of just SSIS development on my current project, I thought I would share a few nice-to-have visual studio 2005 macros I have been using that have made my SSIS development alot easier.

The macro for this post involves resetting Task/Container Guids in SSIS packages. For those who don't know, each Task or Container in an SSIS package will have an ID propery. This ID property is a unique GUID which uniquely identifies the object. So if you are using SSIS logging or some kind of logging through the event handlers you find that using these GUIDs can be very helpful in logging and auditing the tasks in all your packages.

However, one of the problems that may arise by using these IDs is that if you copy and paste a task into a different container (Loop or Sequence) or package the Task ID will remain the same as the source one that you copied. Therefore it will not uniquely identity that task in the package (if it exists in different containers) or in the packages of the SSIS project. This can be a real headache when you are trying to find which task caused an error or executed for a certain period of time.

If you are copying and pasting packages the same problem will occur. However, it is easy to overcome by using "Generate New ID" through the ID property of the package. This is not as easy when dealing with tasks.

The macro I have included in this post will reset all the task GUIDs in a package.
All it does is use the SSIS object model to loop through all the tasks inside a given package, find the tasks ID and then do a find and replace in the packages XML.

Note: Be aware that the macro will save the package after the IDs have been reset, so do not run it if you do not want other changes you have made to the package to be saved.

I have included sample code that will help you get started writing your own macros. This code will only reset IDs for tasks, so you will need to add your own code for containers.

Note: The object model sees each container (sequence and loop) or package as having a collection of tasks. Therefore if you want to reset all the guids you will have to write code to loop through the package, the containers and all nested containers.

Source Code:

Sub ResetGuids()

Dim oApp As New SSIS.Application
Dim oExec As SSIS.Executable, taskHost As SSIS.TaskHost
Dim aszTaskID(0) As String, gdNewGuid As Guid

Try
' Get package path
Dim oPackage As SSIS.Package = oApp.LoadPackage(DTE.ActiveDocument.FullName, Nothing)

' Loop through tasks
For Each oExec In oPackage.Executables
taskHost
= CType(oExec, SSIS.TaskHost)

' Store task Guids in an array
' Set value and grow array size
aszTaskID(aszTaskID.GetUpperBound(0)) = taskHost.ID
ReDim Preserve aszTaskID(aszTaskID.GetUpperBound(0) + 1)
Next

' Edit package XML
DTE.ExecuteCommand("View.ViewCode")

' Loop through task GUIDS in the array
For i As Integer = 0 To aszTaskID.GetUpperBound(0) - 1

DTE.Find.FindWhat
= aszTaskID(i).ToString
DTE.Find.ReplaceWith
= "{" & System.Guid.NewGuid.ToString.ToUpper & "}"

' Set find criteria
DTE.Find.Target = vsFindTarget.vsFindTargetCurrentDocument
DTE.Find.MatchCase
= False
DTE.Find.MatchWholeWord
= True
DTE.Find.MatchInHiddenText
= False
DTE.Find.Action
= vsFindAction.vsFindActionReplaceAll

If (DTE.Find.Execute() = vsFindResult.vsFindResultNotFound) Then
MsgBox("No values were found")
End If
Next

DTE.ActiveWindow.Close(vsSaveChanges.vsSaveChangesNo)
DTE.ActiveDocument.Save()
Catch ex As Exception
MsgBox(ex.Message.ToString)
End Try

End Sub

 


-- Updated on the 16 Feb 2007


The previous sample only looped through tasks in the package i.e. not tasks in sequence or loop containers. Here is a sample that will capture all IDs for all tasks and containers. This sample will recursively loop through all the package objects.



Public Sub ResetsEachExecutablesGuid()

Dim oApp As New SSIS.Application, aszTaskID(0) As String
Dim oExec As SSIS.Executable, taskHost As SSIS.TaskHost
Dim gdNewGuid As Guid, SSISObject As System.Object

Try
' Get package path
Dim oPackage As SSIS.Package = oApp.LoadPackage(DTE.ActiveDocument.FullName, Nothing)

' Loop through tasks
For Each oExec In oPackage.Executables
SSISObject
= oExec
If oExec.GetType.Name.ToLower <> "taskhost" Then
Call LoopThroughContainer(SSISObject, aszTaskID)
End If

' Capture Task IDs
aszTaskID(aszTaskID.GetUpperBound(0)) = SSISObject.ID
ReDim Preserve aszTaskID(aszTaskID.GetUpperBound(0) + 1)
Next

' Edit package XML
DTE.ExecuteCommand("View.ViewCode")

' Loop through task ids captured
For i As Integer = 0 To aszTaskID.GetUpperBound(0) - 1

DTE.Find.FindWhat
= aszTaskID(i).ToString
DTE.Find.ReplaceWith
= "{" & System.Guid.NewGuid.ToString.ToUpper & "}"

' Set find criteria
DTE.Find.Target = vsFindTarget.vsFindTargetCurrentDocument
DTE.Find.MatchCase
= False
DTE.Find.MatchWholeWord
= True
DTE.Find.MatchInHiddenText
= False
DTE.Find.Action
= vsFindAction.vsFindActionReplaceAll

If (DTE.Find.Execute() = vsFindResult.vsFindResultNotFound) Then
MsgBox("No values were found")
End If
Next

DTE.ActiveWindow.Close(vsSaveChanges.vsSaveChangesNo)
DTE.ActiveDocument.Save()
Catch ex As Exception
MsgBox(ex.Message.ToString)
End Try

End Sub

Private Function LoopThroughContainer(ByVal oContainer As System.Object, ByVal aszTaskID() As String)

Dim oExec As SSIS.Executable, SSISObject As System.Object

For Each oExec In oContainer.Executables
SSISObject
= oExec
If SSISObject.GetType.Name.ToLower <> "taskhost" Then
Call LoopThroughContainer(SSISObject, aszTaskID)
End If

' Capture Task IDs
aszTaskID(aszTaskID.GetUpperBound(0)) = SSISObject.ID
ReDim Preserve aszTaskID(aszTaskID.GetUpperBound(0) + 1)
Next

End Function