Thursday, December 29, 2011
I'm back!
I know I haven't been the most frequent blogger but I'm hoping the new year brings with it the new desire to blog with greater regularity.
I will also be moving the blogs from the other site back to this one...
See you soon!
Tuesday, October 18, 2011
SQL PASS 2011
This was my first PASS and I though the whole experience was top notch. The colleagues I went with, the old colleagues I met up with there as well as a pretty well organised event made it all a very enjoyable experience.
The slight downer would have to be the badly organised schedule with key sessions conflicting and the poorly planned room allocation. Key speakers with their key topics were often placed in smaller rooms which were then closed minutes before the start due to vast numbers and demand. This actually caused us to miss Marco Russo’s “Vertipaq vs OLAP: Change your data modelling approach”, which was very disappointing.
But besides that, I managed to attend some really good sessions. I will be going through these along with some of my personal findings with regard Denali (now called SQL Server 2012) in my next few blog posts.
Please pop in and have a read when they are available.
Thursday, May 26, 2011
SSIS Data Cleansing – Regular Expressions
Monday, May 16, 2011
Tuesday, May 10, 2011
Friday, May 8, 2009
Analysis Services Partition Creation in SSIS
On my return from a short holiday last week (South Africa had two public holidays in one week – Monday and Friday so taking 3 days off gave me the whole week ; ) ) I found an email from a blog reader (Ashkan) asking me to add a blog about SSAS partition creation in SSIS. I had promised that I would in my previous blog so I thought I better get to work.
So here it is…
For this example, I am going to discuss how I usually set up SSAS partitioning on a new cube. In most cases, this is at the stage when the warehouse is fully populated with history data and I need to create a large number of SSAS partitions to represent all this data. So what we need to start is a cube that has been deployed to a SSAS server. (This example will add partitions to the an existing cube on a server not partitions (metadata) to a cube in an Analysis Services project.) This deployed cube will have at least one measure group in it and just the default partition. What I would then need to do, is change the default partition’s query definition to point to my first data set, usually my first month in my warehouse (assuming we are partitioning by month) and then use the following code to create all the partitions that I need.
For this example, I am going to use the sample SSAS database Adventure Works and more specifically the default partition, Internet_Sales which is part of the Measure Group, Internet Sales in the cube, Adventure Works DW.
This partition’s source is set to table binding and is mapped to a specific source table in the cube’s DSV. For this example, I want to partition the measure group against a single fact table aiming each partition at a specific month in that table. Therefore I need to change the partition to query binding and set the source value to:
SELECT fis.[ProductKey],fis.[OrderDateKey],fis.[DueDateKey],fis.[ShipDateKey],fis.[CustomerKey],fis.[PromotionKey],fis.[CurrencyKey],fis.[SalesTerritoryKey],fis.[SalesOrderNumber],fis.[SalesOrderLineNumber],fis.[RevisionNumber],fis.[OrderQuantity],fis.[UnitPrice],fis.[ExtendedAmount],fis.[UnitPriceDiscountPct],fis.[DiscountAmount],fis.[ProductStandardCost],fis.[TotalProductCost],fis.[SalesAmount],fis.[TaxAmt],fis.[Freight],fis.[CarrierTrackingNumber],fis.[CustomerPONumber],
CONVERT ( CHAR ( 10 ), SalesOrderNumber ) + 'Line ' + CONVERT ( CHAR ( 4 ), SalesOrderLineNumber ) AS [SalesOrderDesc]
FROM [dbo].[FactInternetSales] fis
INNER JOIN dbo.DimTime dt
ON fis.OrderDateKey = dt.TimeKey
WHERE Cast(Convert(varchar(6), FullDateAlternateKey, 112) as int) = 200107
(I realise that this is not the best SQL query in the world but it illustrates the point)
The partition Internet_Sales should then be renamed to something that makes more sense. I usually name partitions according to the measure group to which they belong and the specific month they are assigned to. e.g. InternetSales200107
Once this has been done we have a complete partition that we can use to clone more… and you thought cloning was illegal! ; )
In my previous blog (Analysis Services Partition Processing), I went into a lot of detail explaining the script task and how I prefer to capture the XMLA and executing it against the server using an Analysis Services DDL Task. This SSIS package is very similar and therefore I am not going to repeat all that detail here but rather provide the code that is different and follow up with an explanation.
The Script Task’s code would look like this:
Imports System, System.Data, System.Math, Microsoft.SqlServer.Dts.Runtime
Imports AMO = Microsoft.AnalysisServices
Imports Microsoft.AnalysisServices.QueryBinding
Public Class ScriptMain
Public Sub Main()
Dim amoServer As AMO.Server, amoMeasureGroup As AMO.MeasureGroup
Dim amoPartition As AMO.Partition, oVariables As Variables = Dts.Variables
Dim strXMLAScript As String, strTaskName As String
Dim amoQueryBinding As AMO.QueryBinding, amoNewPartition As AMO.Partition
Try
Dts.VariableDispenser.LockOneForRead("System::TaskName", oVariables)
strTaskName = oVariables.Item("System::TaskName").Value.ToString
amoServer = New AMO.Server()
amoServer.Connect("Data Source=LOCALHOST;Initial Catalog=Adventure Works DW;" & _
"Provider=MSOLAP.3;Integrated Security=SSPI;Impersonation Level=Impersonate;")
amoMeasureGroup = amoServer.Databases.FindByName(amoServer.ConnectionInfo.Catalog _
.ToString).Cubes.FindByName("Adventure Works").MeasureGroups. _
FindByName("Internet Sales")
amoServer.CaptureXml = True
' Get first partition
amoPartition = amoMeasureGroup.Partitions(0)
' Loop through number of partitions needed
For iMonth As Integer = 200108 To 200112
amoNewPartition = amoPartition.Clone()
amoNewPartition.ID = "InternetSales " & iMonth
amoNewPartition.Name = "InternetSales " & iMonth
amoNewPartition.Slice = "" ' Add as needed
' Create new query binding
amoQueryBinding = New AMO.QueryBinding
' Set new query binding
amoQueryBinding.QueryDefinition = GetQueryBindingText(iMonth.ToString)
amoQueryBinding.DataSourceID = amoPartition.DataSource.ID
amoNewPartition.Source = amoQueryBinding
' Add to measure group and update parition
amoMeasureGroup.Partitions.Add(amoNewPartition)
amoNewPartition.Update()
' Report progress
Dts.Events.FireInformation(0, strTaskName, "Added partition - " & _
amoNewPartition.Name & " to create partition list.", String.Empty, _
0, True)
Next
' Set in transaction and in parallel properties and capture
strXMLAScript = amoServer.ConcatenateCaptureLog(True, False)
' Write to package variable
Dts.VariableDispenser.LockOneForWrite("User::strXMLAScript", oVariables)
oVariables.Item("User::strXMLAScript").Value = strXMLAScript
Dts.TaskResult = Dts.Results.Success
Catch ex As Exception
Dts.Events.FireError(0, strTaskName, ex.Message, String.Empty, 0)
End Try
End Sub
Private Function GetQueryBindingText(ByVal strMonth As String) As String
Return "SELECT fis.[ProductKey],fis.[OrderDateKey],fis.[DueDateKey],fis.[ShipDateKey]," & _
"fis.[CustomerKey],fis.[PromotionKey],fis.[CurrencyKey],fis.[SalesTerritoryKey]," & _
"fis.[SalesOrderNumber], fis.[SalesOrderLineNumber], fis.[RevisionNumber]," & _
"fis.[OrderQuantity],fis.[UnitPrice],fis.[ExtendedAmount]," & _
"fis.[UnitPriceDiscountPct], " & _
"fis.[DiscountAmount], fis.[ProductStandardCost], fis.[TotalProductCost], " & _
"fis.[SalesAmount], fis.[TaxAmt], fis.[Freight], fis.[CarrierTrackingNumber], " & _
"fis.[CustomerPONumber], " & _
"CONVERT ( CHAR ( 10 ), SalesOrderNumber ) + 'Line ' + CONVERT ( CHAR ( 4 ), " & _
"SalesOrderLineNumber ) AS [SalesOrderDesc] " & _
"FROM [dbo].[FactInternetSales] fis " & _
"INNER JOIN dbo.DimTime dt " & _
"ON fis.OrderDateKey = dt.TimeKey " & _
"WHERE Cast(Convert(varchar(6), FullDateAlternateKey, 112) as int) = " & strMonth
End Function
End Class
This code will loop through a list of your choice (I have simply chosen to loop through 5 months to prove the concept) creating new partitions. This is done by cloning the existing one (the one we edited earlier) and then overwriting the properties that need to be changed. These properties include the partition ID, Name and the query binding. If the cube is using a slicer then you could overwrite this property too. I haven’t included it in this example.
Please notice that I needed to change the parallel argument in the ConcatenateCaptureLog method to false from the example in my previous blog.
strXMLAScript = amoServer.ConcatenateCaptureLog(True, False)
Once this is complete, you can execute the package and it will create all the partitions you would like (or at least according to your loop). Or create new partitions (by making a few alterations) as and when they are needed at the end of your ETL but before a dimension and measure group OLAP process is done. This obviously will need to be approved by your SSAS DBA. I have on the odd occasion found that creating SSAS objects like partitions dynamically through code in a scheduled job, without letting your DBA know, can get you into a little bit of hot water…
Hope you find this helpful!
Happy Deving!
(I have added a zip file to SkyDrive with the sample package. It can be found here).
Monday, March 23, 2009
A very good cause!
Friday, February 20, 2009
Exporting to a text file from Reporting Services
(Personal Reference)
A client recently asked me how one could add an export to text file in Reporting Services. I won’t go through the business reasons for this but this is how I suggested they go about it.
Add the following text:
<Extension Name="TXT" Type="Microsoft.ReportingServices.Rendering.CsvRenderer.CsvReport,Microsoft.ReportingServices.CsvRendering">
<OverrideNames>
<Name Language="en-US">TXT (Pipe Delimited Text File)</Name>
</OverrideNames>
<Configuration>
<DeviceInfo>
<FieldDelimiter>|</FieldDelimiter>
<Extension>TXT</Extension>
<Encoding>ASCII</Encoding>
<NoHeader>true</NoHeader>
</DeviceInfo>
</Configuration>
</Extension>
to the SSRS config file, rsreportserver.config, usually located in:
C:\Program Files\Microsoft SQL Server\MSSQL.3\Reporting Services\ReportServer. The text will need to be included with the other renders in that file between the render tags.
Happy Deving!
Blogging about blogging…
Having recently decided to start blogging again I was on a hunt for a decent blog stat counter. I have used StatCounter in the past but it only allows the last 100 hits (without shelling out fun tokens) to be saved. This was not going to be sufficient to allow for analysis. I was told by a mate to try out Google Analytics and I must say I am very impressed! It allows for all types of analysis and reporting. Go check it out if you are blogging. It is really amazing!
Tuesday, February 17, 2009
Analysis Services Partition Processing
Recently, while trawling the internet, I noticed that there are not a lot of SSIS examples detailing how to process Analysis Services cubes, especially if you are using Analysis Services partitioning. So I thought I would add an approach I have been using over the last few years.
While reading this you are probably wondering what all the fuss is about. Why not just use the Analysis Services Processing Task? Well, I guess this is definitely an option when processing dimensions and full cubes or measure groups with small data sets. This task is very useful and does provide a logging framework with a lot of detail. However, it is not dynamic. If the partitioning strategy defined is by month for example and the requirement is to process just those partitions with new data, then a more dynamic approach may be needed.
This is the scenario for which I am aiming this blog. This approach does add more code and therefore possibly more maintenance but may be a necessity when encountering larger data sets.
So if we look at the example mentioned above where we are partitioning by month. We could have a measure group with n monthly partitions and a need to process the last two as they could have new data. In this case, we may need a new Analysis Services partition or two which would represent these new months. (I am not going to add the code that creates new partitions in this blog to avoid complexity. I will save this for another blog!)
In this approach I am going to create the relevant XMLA scripts and then use the SSIS Analysis Services Execute DDL Task to execute them. I have seen many people attempt this approach by creating their own XMLA scripts. However, anyone who has tried this before will agree that the string manipulation required can be a real headache. So having a bit of a development background myself I prefer to allow the AMO (Analysis Management Objects) object model to do the hard work for me.
So onto the SSIS package…
First I declare a package variable: strXMLAScript (string)
Then add a script task and add the following code:
Imports System, System.Data, System.Math, Microsoft.SqlServer.Dts.Runtime
Imports AMO = Microsoft.AnalysisServices, Microsoft.SqlServer.Management
Imports Microsoft.AnalysisServices.QueryBinding
Note: You will need to add the relevant AMO reference to the script task before being able to use it. However, when adding a reference the relevant component i.e. Analysis Management Objects may not be available. This can be made available by coping the Microsoft.AnalysisServices.dll from “\Program Files\Microsoft SQL Server\90\sdk\Assemblies\Microsoft.AnalysisServices.DLL” into the relevant framework folder version e.g. “\windows\Microsoft.NET\Framework\v2.0.50727”.
Then declare script variables and set the task variable for event firing:
Dim amoServer As AMO.Server, amoMeasureGroup As AMO.MeasureGroup, amoPartition As AMO.Partition
Dim strXMLAScript As String, strTaskName As String, oVariables As Variables = Dts.Variables
Dts.VariableDispenser.LockOneForRead("System::TaskName", oVariables)
strTaskName = oVariables.Item("System::TaskName").Value.ToString
Then initialized the relevant Analysis Services objects:
amoServer = New AMO.Server()
amoServer.Connect("Data Source=LOCALHOST;Initial Catalog=Adventure Works DW;Provider=MSOLAP.3;Integrated Security=SSPI;Impersonation Level=Impersonate;")
amoMeasureGroup = amoServer.Databases.FindByName(amoServer.ConnectionInfo.Catalog.ToString).Cubes.FindByName("Adventure Works").MeasureGroups.FindByName("Reseller Sales")
Then set the server property CaptureXML. (This will make sure that all actions executed against the server during this session will not be implemented but rather that the XMLA for each action is made available.)
amoServer.CaptureXml = True
Once this is done the search criteria for finding the correct partitions will need to be implemented. This will obviously be site specific but I usually name the partitions by measure group and then yyyyMM. So the relevant partitions I am looking for are ResellerSales200901, ResellerSales200902.
' Loop through measure groups partitions
For Each amoPartition In amoMeasureGroup.Partitions
Select Case amoPartition.Name.ToUpper
Case amoMeasureGroup.Name.ToUpper & " " & DateAdd(DateInterval.Month, -1, Now.Date).ToString("yyyyMM"), _
amoMeasureGroup.Name.ToUpper & " " & Now.Date.ToString("yyyyMM")
amoPartition.Process(AMO.ProcessType.ProcessFull) ' Could be any process Type required
' Report the partions that have been added to the process list
Dts.Events.FireInformation(0, strTaskName, "Added partition - " & amoPartition.Name & _
" to process list.", String.Empty, 0, True)
End Select
Next
Note: Even when the loop is complete, no processing has been done. Only the XMLA script has been captured. Also note that I usually add an information event in the loop so I can easily highlight in my logging framework the partitions that have been added to the process list. I have found this information very useful when debugging.
We are now able to capture the XMLA script that the AMO object model has generated and set the ProcessInTransaction and ProcessInParallel properties as required. The package variable (strXMLAScript) is then populated with the XMLA string .
' Set in transaction and in parallel properties and capture
strXMLAScript = amoServer.ConcatenateCaptureLog(True, True)
' Write to package variable
Dts.VariableDispenser.LockOneForWrite("User::strXMLAString", oVariables)
oVariables.Item("User::strXMLAString").Value = strXMLAScript
The AMO server object does allow for the execution on the XMLA script once it has all been captured. This can be done with the following method:
amoServer.ExecuteCaptureLog(True, True)
However, this will require that you write a lot more code to capture all the possible errors and/or warnings that may occur. I prefer to allow the Analysis Services Execute DDL Task to do all this work for me. It also makes sure that all the relevant events that may need to be captured i.e. Onprogress, OnInformation, OnWarning and OnError are easily captured by your logging framework.
So all that is needed now is the Analysis Services Execute DDL Task source property is set to the variable strXMLA and we are ready to process.
The following shows the execution results with all the possible events that could be captured by your logging framework:
This is clearly a very simple example but can easily extend to accommodate for most, if not all, of your partition processing needs.
Hope you find this helpful!
Happy Deving!
(I have added a zip file to SkyDrive with the sample package and xmla to add the required partitions to the Adventure Works DW Cube. It can be found here).