Tagged: ssis

SSIS – Change the Protection Level for all Packages in a Project

Here’s a quick one I found this morning while fixing an execution issue with one of my SSIS packages. I needed to change the Protection Level of all the packages in a project from Encrypt With User Key to Don’t Save Sensitive. The project contains over 30 packages, and I didn’t fancy opening each one, navigating to its properties and amending the Protection Level manually – too time-consuming and, frankly, boring. I initially figured I’d write a PowerShell or Python script to amend the relevant XML in each of the .dtsx files, but a quick Google came up with this MS Technet page, which details how to make the change via the command line, both for individual packages and for all packages within a project.

A quick amendment to the command to use the value for DontSaveSensitive (0) and to remove the password parameter was all that was required, so running this…

for %f in (*.dtsx) do dtutil.exe /file %f /encrypt file;%f;0

…and pressing Y to confirm each change as prompted resulted in the change being made to all packages. Result! Well, almost. After going back to Visual Studio and changing the Protection Level of the Project (it must match that of the Packages), I tried to build the solution. It returned an error stating that, for every Package, the Protection Level was incorrect and had to match the Project. But I’d just changed all of that! It turns out I had to open each package in Visual Studio (for some reason), after which it would build without any further changes. Odd.
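
Incidentally, the same loop could be done from PowerShell, which would also avoid having to confirm each package. A minimal sketch (untested; it assumes dtutil.exe is on the PATH, and uses dtutil’s /Q switch to suppress the confirmation prompts):

# Set every package in the current folder to DontSaveSensitive (0)
Get-ChildItem *.dtsx | ForEach-Object {
    & dtutil.exe /Q /File $_.Name /Encrypt "file;$($_.Name);0"
}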

While this is useful to know for future reference, I’m much more likely to be developing SSIS projects using BIML, which would make changes like this far easier, as it would be a simple change in the BIML template, re-generate the packages and job done. It’s the future…

Execute SSIS package as a Service Account

At my current client, I have a very specific set of parameters to work within regarding security. Chiefly, I need read permissions on the underlying database of a Microsoft Dynamics AX 2012 RS installation. I have a series of SSIS packages that extract data from AX on a daily basis, executed from SQL Server Agent. The AX database, however, is hosted on a remote server, and neither I nor the client team have permissions directly on that box. Our only option for connecting to the SQL Server is via a service account which has db_datareader permissions on the AX database. So how do we execute the Agent jobs as the service account? By using a Credential and a SQL Agent Proxy.

Credentials
The first step is to set up a security Credential on the SQL Server that executes the jobs. These can be found under Security > Credentials in SSMS (see Figure 1). The Credential identity should be set to the name of the Service Account (including the domain), and must include the password. To create the Credential in T-SQL:

CREATE CREDENTIAL <CredentialName>
	WITH IDENTITY = '<domain>\<service account>'
		,SECRET = '<password>';
GO
Figure 1: Credentials in SSMS

Proxies
Next, a Proxy for SSIS must be created in the SQL Agent. These are found under SQL Server Agent > Proxies > SSIS Package Execution (Figure 2). To create a proxy in T-SQL (note that subsystem_id 11 specifies that it is for SSIS Package Execution):

USE msdb;
GO

EXEC msdb.dbo.sp_add_proxy @proxy_name = N'<ProxyName>'
	,@credential_name = N'<CredentialName>'
	,@enabled = 1;
GO

EXEC msdb.dbo.sp_grant_proxy_to_subsystem @proxy_name = N'<ProxyName>'
	,@subsystem_id = 11;
GO

Figure 2: SSIS Package Execution Proxies

Agent Job
Finally, the Agent job needs to be set up to run as the Proxy. In the Run As drop-down in the relevant Job Step, select the newly created Proxy (see Figure 3).

Figure 3: Setting the Run As option on the Job Step
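
This step can also be scripted against msdb. Here’s a hedged sketch using Invoke-Sqlcmd (the server, job and proxy names are placeholders); note that if the job owner’s login isn’t a sysadmin, it also needs to be granted access to the proxy via sp_grant_login_to_proxy:

# Grant a login access to the proxy, then point an existing job step at it
Invoke-Sqlcmd -ServerInstance "<server>" -Database msdb -Query @"
EXEC dbo.sp_grant_login_to_proxy @login_name = N'<domain>\<login>', @proxy_name = N'<ProxyName>';

EXEC dbo.sp_update_jobstep @job_name = N'<JobName>', @step_id = 1,
    @subsystem = N'SSIS', @proxy_name = N'<ProxyName>';
"@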

References

Credentials
Create a SQL Agent Proxy
sp_add_proxy
sp_grant_proxy_to_subsystem

PowerShell Deployment Script Update

Since last week’s post on automated DB and ETL deployment builds with PowerShell, I’ve made a couple of small but useful changes to the PowerShell script.

The distinct parts of the script are now encapsulated in parameterised functions. The end goal is to be able to turn the main script into a PowerShell module that can be installed and used when required. So there are now three functions: GetHeadFromSVN, SyncDB and DeployPackages.
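
As a rough illustration of where this is heading (the file and module names here are hypothetical), the functions would live in a .psm1 file that exports them, and the build script would then just import the module:

# BIDeploy.psm1 (hypothetical) would contain the three functions, followed by:
Export-ModuleMember -Function GetHeadFromSVN, SyncDB, DeployPackages

# The build script then becomes:
Import-Module C:\Modules\BIDeploy\BIDeploy.psm1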

DeployPackages has been changed to make it quicker and more streamlined. It will now only deploy or re-deploy packages that have been changed in the last hour, rather than every package in the repository.

The functions are as follows:

# Get latest DB scripts and SSIS packages from SVN
function GetHeadFromSVN
{
    param($svnpath, $svnurl, $destinationfolder)

    Set-Location $svnpath
    svn checkout $svnurl $destinationfolder
}

# Use Red Gate SQL Compare to deploy schema changes to each DB
function SyncDB
{
    param($sqlcomparelocation, $argfile)

    Set-Location $sqlcomparelocation

    Invoke-Expression ".sqlcompare /ArgFile:$argfile"
}

# Deploy SSIS packages
function DeployPackages
{
    param($filepath, $ssisfolder, $server, $dtutilpath)

    $dtutil = $dtutilpath
    Set-Location $filepath

    # Only deploy packages that have been changed in the last hour
    $files = get-childitem *.dtsx | where {$_.LastWriteTime -gt (Get-Date).AddHours(-1)}

    if ($files -ne $null) {
        foreach ($file in $files) {
            $basename = $file.BaseName
            $deletearguments = " /delete /SQL $ssisfolder/$basename /SourceServer $server"
            $deployarguments = " /file $file /copy SQL``;$ssisfolder/$basename /destserver $server"

            Set-Location $dtutil

            Invoke-Expression "dtutil $deletearguments"
            Invoke-Expression "dtutil $deployarguments"
        }
    }
}

Here are some examples of how they would be called:

GetHeadFromSVN "C:Program FilesSlikSvnbin" http://<svnserver>/repos/bi/Database "C:NightlyBuildsBISVNDatabase"
GetHeadFromSVN "C:Program FilesSlikSvnbin" http://<svnserver>/repos/bi/ETL/branches/Phase2 "C:NightlyBuildsBISVNETL"

SyncDB "C:Program FilesRed GateSQL Compare 9" "C:NightlyBuildsBIConfigurationStagingCompare.xml"

DeployPackages "C:NightlyBuildsBISVNETLSourceToStage" "SourceToStage" "<target server>" "C:Program FilesMicrosoft SQL Server100DTSBinn"
DeployPackages "C:NightlyBuildsBISVNETLStageToWarehouse" "StageToWarehouse" "<target server>" "C:Program FilesMicrosoft SQL Server100DTSBinn"

The Codeplex project page has been updated with the latest code.

Automating DB and SSIS builds from source control

This week I’ve been working on a new branch of a data warehouse project. To make deployment to development environments easier I decided I’d attempt to get some sort of continuous integration/automated builds to our dev server up and running. Doing this sort of thing has never been particularly straightforward for database projects, but with the fantastic SQL Compare command line tools and SQL Source Control from Red Gate, this task is made so much easier.

I wanted something relatively simple (as always), something that didn’t mean having to install a framework. I did look at TeamCity, NAnt and MSBuild, but decided against using any of these, at least in the short term; my reasoning was that something light and fairly portable, with as few dependencies as possible, would be more useful at this stage.

The intention was to set up a nightly build job that would:

  1. Get the latest code from the development branch from SVN
  2. Deploy any schema changes to the development databases
  3. Delete and re-deploy the SSIS packages for the ETL

I used the excellent Continuous Integration article from Red Gate to get me going. I had started to implement this using a batch file (yuk!), but quickly moved on to using PowerShell, my new favourite toy on Windows! The examples in each section show the PowerShell commands used in the script.

Setup and Configuration

Working locally initially, I created the following folder structure on the C drive:

  • NightlyBuilds
    • BI (the project name)
      • Configuration (to hold any configuration files)
      • SVN (for the SVN checkout)
      • Output (for any output or report files)

Source Control

Our development team use dedicated local databases and push changes to the centralised Subversion (SVN) repository using Red Gate SQL Source Control. The first step in the build was to check out the latest SQL schema scripts and SSIS packages. After installing the SlikSVN client, I could use the following PowerShell commands to perform the checkout:

Set-Location "C:Program FilesSlikSVNbin"
svn checkout http:///repos/BI/Database "c:NightlyBuildsBISVNDatabase"
svn checkout http:///repos/BI/ETL/branches/Phase2 "c:NightlyBuildsBISVNETL"

Deploy Schema Changes

Next up was to use the SQL Compare command line tools to compare each of the databases on the development environment with the schema scripts.

Set-Location "c:Program FilesRed GateSQL Compare 9"
Invoke-Expression ".sqlcompare /ArgFile:C:NightlyBuildsBIConfigurationEDWCompare.xml"
Invoke-Expression ".sqlcompare /ArgFile:C:NightlyBuildsBIConfigurationStagingCompare.xml"

At the moment, I’m using XML argument files for each comparison (this may well change as this progresses); these have a format like:

<commandline>
    <scripts1>C:\NightlyBuilds\BI\SVN\Database\EDW\branches\Phase2</scripts1>
    <server2>server name</server2>
    <database2>EDW</database2>
    <options>default</options>
    <report>C:\NightlyBuilds\BI\Output\EDWSchemaDiffReport.html</report>
    <reporttype>Interactive</reporttype>
    <scriptfile>C:\NightlyBuilds\BI\Output\EDWSchemaSyncScript.sql</scriptfile>
    <exclude>user</exclude>
    <exclude>role</exclude>
</commandline>

Deploy SSIS Packages

With the databases up to date, the next task was to deploy the SSIS packages. Having experienced a couple of issues in the past with packages not being overwritten correctly, I decided that, to be on the safe side, I’d delete an existing package before re-deploying it. The command line tool dtutil.exe lets us perform these tasks and more, but it can be incredibly frustrating to use at times, with its inconsistent syntax and problems with differences between SQL Server versions and between the 32- and 64-bit versions. Using the following function, the path to the SSIS packages is specified and each one is deleted on development, then redeployed.

function DeployPackages
{
    param($filepath, $ssisfolder, $server)

    # Path to the folder containing dtutil.exe (hard-coded here; parameterised in a later update)
    $dtutil = "C:\Program Files\Microsoft SQL Server\100\DTS\Binn"

    Set-Location $filepath

    $files = get-childitem *.dtsx

    foreach ($file in $files) {
        $basename = $file.BaseName
        $deletearguments = " /delete /SQL $ssisfolder/$basename /SourceServer $server"
        $deployarguments = " /file $file /copy SQL``;$ssisfolder/$basename /destserver $server"

        Set-Location $dtutil

        Invoke-Expression "dtutil $deletearguments"
        Invoke-Expression "dtutil $deployarguments"
    }
}

DeployPackages "C:NightlyBuildsBISVNETLSourceToStage" "EDW/SourceToStage" ""
DeployPackages "C:NightlyBuildsBISVNETLStageToWarehouse" "EDW/StageToWarehouse" ""

Improvements

So that, in a nutshell, is the process I’m currently using. It works, which is nice, but I have quite a few improvements in mind:

  • Firstly, it seems very slow when running dtutil. I’m not sure if this is because I’m currently doing everything over a VPN, but I’ll be looking to improve the general performance. To be honest, it’s not a major issue, as it’s mainly designed for builds that happen overnight, but it would be nice if I could streamline it. I did try the /fdelete switch on dtutil to remove an entire folder, but that demands that the folder be empty.
  • Add more parameterisation so that it could be used for different projects on different environments
  • Use the SQL Data Compare command line tools to ensure any static/lookup data is up to date
  • Add an option to drop (if required) & create the databases and SSIS folders. This would allow the script to deploy to “virgin” machines and test the entire build.
  • Add job creation scripts
  • Build in SSRS report deployment. The project I’m working on does not use a great deal of SSRS reports (OBIEE is the end user tool), but given that report deployment can be a pain, it would be nice to include this too.
  • Add some reporting as to the success or failure of any component of the script.

Update: Looks like the VPN was indeed causing a slowdown. Running it connected locally to the network, 60+ SSIS packages were deployed in <3 minutes.

Code

I’ve started a project on Codeplex for this, really just as a place to keep the code. Any suggestions or comments are, as always, more than welcome.

Deleting from Excel in SSIS – a workaround

Here’s a quick how-to post, a little work-around for an issue with writing to Excel from SSIS.

I’ve been asked a few times how to use SSIS to delete data from an Excel spreadsheet, then write data into that blank spreadsheet, while maintaining the column headers in the first row.  There is no particularly straightforward way of doing so in SSIS, as attempting a delete statement on the Excel sheet will either remove the data in all rows (including the header) or throw an error as the DELETE statement is not supported.  There is a method that I use, which I will step through here.

The basic idea is to do the following on each execution of the package:

1. Delete the XLS file that is being written to

2. Re-create the XLS using a CREATE TABLE statement

3. Load the data into the empty XLS

In order to do this, start by creating the Data Flow for step 3 and ensure that this works as expected (Figure 1 is a simple example).  This will create the XLS that will be written to.  Now we have the structure, we can use it to re-create the XLS with a SQL statement.  To do so, drop an Excel Source onto the flow (pointing to the destination file) and then an Excel Destination, and link the output from source to destination.  Open the Excel Destination and click the New button next to ‘Name of the Excel sheet’ – this will pop up a window with the CREATE TABLE statement that will be used in step 2.  Copy this, then delete both the Excel Source and Destination as they have served their purpose for this exercise.

On the Control Flow, add an Execute SQL Task, with the ConnectionType set to EXCEL and the Connection set to the destination Excel connection manager.  Paste the CREATE TABLE script from the previous step into the SQLStatement property.  This will create the XLS at run time.
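
For illustration, the generated statement looks something like this (the sheet and column names here are made up; yours will reflect whatever is in your data flow):

CREATE TABLE `Excel Destination` (
    `CustomerID` INT,
    `CustomerName` NVARCHAR(255)
)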

Next, add a File System Task, with an Operation of Delete File and the Source Connection set to a new File Connection Manager, also pointing to the destination XLS.

Finally, set up the Control Flow to execute the delete, then execute the CREATE TABLE, then perform the Data Flow, as shown in Figure 2.

One issue I found with this method: the Execute SQL Task connected to an Excel Connection Manager did not work correctly if the Excel version was Microsoft Excel 2007 (.xlsx extension).  Changing this to Excel 97-2003 (.xls) resolved this issue.
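
My guess is that this is down to the underlying provider: .xlsx connections use the ACE OLE DB provider, while .xls uses Jet, and the Execute SQL Task appears happier with the latter. For reference, the two connection strings look roughly like this (the file paths are made up):

Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Reports\MyFile.xls;Extended Properties="Excel 8.0;HDR=YES";
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Reports\MyFile.xlsx;Extended Properties="Excel 12.0 XML;HDR=YES";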

SSIS 2008 Deployment Manifest error

I had an issue this morning whereby I was trying to deploy an SSIS 2008 project using the Deployment Manifest, and was hitting the following error:

Could not save the package “<Package Path>” to SQL Server “<SQL Server>”.

ADDITIONAL INFORMATION: The package failed to load due to error 0xC0011008 “Error loading from XML. No further detailed error information can be specified for this problem because no Events object was passed where detailed error information can be stored.”. This occurs when CPackage::LoadFromXML fails.

A quick Google search indicated that the issue was to do with a SQL Server 2005 installation that had been upgraded to 2008.  Right-clicking the Deployment Manifest file showed two options for the Installation Package Utility.  Even though both said 2005, one was actually pointing to the 2008 version.  Choosing this one corrected the problem.  The default location for the 2008 version is

C:\Program Files\Microsoft SQL Server\100\DTS\Binn\dtsinstall.exe

You can also run this via the command line:

"C:\Program Files\Microsoft SQL Server\100\DTS\Binn\dtsinstall.exe" "C:\<project path>\ProjectName.SSISDeploymentManifest"

SSIS and Tinyint datatype

I’ve had a couple of issues recently with using the tinyint datatype in SSIS.  In one case I have a Lookup Transform which joins to the incoming data flow path on a column defined as a tinyint in the database.  The column metadata in the data flow is DT_I4, a four-byte signed integer.  Dragging this across to the tinyint column in the Lookup gives me the following error:

The following columns cannot be mapped:
[DataFlowColumn, LookupColumn]
One or more columns do not have supported data types, or their data types do not match.

By process of elimination, I discovered the incoming column needs to be converted to a DT_UI1, a single-byte unsigned integer.  One to bear in mind for the future!
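
In practice, that means putting a Data Conversion transform (or a Derived Column) in front of the Lookup; the cast expression looks something like this (column name made up):

(DT_UI1)[MyTinyintColumn]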

Excel connections in SSIS 64-bit

This is a new one on me… when running SSIS in a 64-bit environment, Excel connections (and Access ones as well, I presume) do not work, as the OLE DB Provider for Jet is not supported.

At a current client, they are moving the server platform from Windows 2003 Server to Windows 2008 Server 64-bit.  I had developed some ETL packages for them, pulling data from Excel and CSV sources.  The packages get executed when the files have been unzipped and downloaded to a specified folder.  This is done via simple batch files, which are along the lines of the following:

dtexec /f "F:SSIS PackagesReportingImportClientCSV.dtsx"

When I moved all the packages to Windows 2008, the CSV packages ran no problem.  But when I tried to execute one that imported from Excel, I received the following error:

Code: 0xC00F9304
   Source: ImportClientXLS Connection manager "Excel Connection Manager"

   Description: SSIS Error Code DTS_E_OLEDB_EXCEL_NOT_SUPPORTED: The Excel Connection Manager is not supported in the 64-bit version of SSIS, as no OLE DB provider is available.

In order to use the OLE DB for Jet providers, the packages must be run in 32-bit mode.  This is done by using the x86 version of dtexec and specifying the "/X86" switch.  I amended the batch file in question to look similar to this:

"C:Program Files (x86)Microsoft SQL Server100DTSBinndtexec.exe" /f "F:SSIS PackagesReportingImportClientXLS.dtsx" /X86

Voila!  The package ran successfully.

This link from Microsoft details the limitations when running SSIS in 64-bit environments: 64-bit Considerations for Integration Services