With the all-important Women’s Euro Qualifier between Scotland and Iceland coming up on Friday, I thought it would be a good time to grab some of the data from the UEFA website and have a play with it in Power BI. I’ve imported the data for all teams; the report below is filtered to show the key statistics for the two teams of interest. When I get a bit more time to spare, I’ll detail the Power Query steps I took to get and transform the data.
There was good news for data developers last week, as Microsoft announced that SQL Server Developer Edition will be available as a free download from now on. It’s part of the Visual Studio Dev Essentials program, which I highly recommend signing up for (it’s free!). You can grab SQL 2014 right now, and when 2016 is released later this year, it will be available too.
I’m really happy to see Microsoft make this move and ensure that SQL Server is more readily available to developers. Not that it was expensive before, but removing any barriers to adoption can only help.
Two passions of mine that go hand in hand are sports and data. Fans nowadays can access a whole host of statistics for every sport imaginable. This data can be used for any number of reasons, including for gambling purposes and to help win fantasy leagues. Most premier sports teams now employ data analysts to look for something that will give them an edge over their opponents. (On a related note, I’m currently reading Big Data Baseball by Travis Sawchik – a really interesting insight into how the Pittsburgh Pirates used previously untapped data to improve their fortunes).
For the past year or so, I’ve been collecting, cleaning, transforming and analysing football results data, mainly for leagues in Scotland and England. I’ve been using a few of my favoured tools to do this – namely Power BI, PowerPivot and R. The ability to embed reports in a web page was added in a recent Power BI update, so I can now post an example of the kind of thing I’ve been building. Power Query is used to consume raw data from CSV files and scrape data from the web, then cleanse and shape it into a useable format. Relationships are then created in Power BI and I’ve added a set of useful measures and calculations using DAX. See below for my interactive Scottish Premiership report, showing a small subset of data going back to the 2000-2001 season.
This report currently allows a user to view league tables; to analyse shots, shots on target and goals; to view historical team matchups and to analyse monthly team form.
Please give it a go, have a play – any comments and suggestions are welcome. Let me know if you spot any errors! I’ll be adding more analysis to this over time, as well as adding in data from other leagues across Europe and lower league data.
Note: I’ve cross-posted this at my business blog over at FocusBI.co.uk.
Here’s a quick one I found this morning while fixing an execution issue with one of my SSIS packages. I needed to change the Protection Level of all the packages in a project from Encrypt With User Key to Don’t Save Sensitive. The project contains over 30 packages, and I didn’t really fancy opening each one, navigating to the properties and amending the Protection Level for each one – that’s too time-consuming and frankly, boring. I initially figured I’d write a PowerShell or Python script to amend the relevant XML in each of the .dtsx files. But a quick Google came up with this MS Technet page, which details how to make the change via the command line, for both individual packages and for all packages within a project.
A quick amendment to the command to use the value for DontSaveSensitive (0) and to remove the password parameter was all that was required, so running this…
for %f in (*.dtsx) do dtutil.exe /file %f /encrypt file;%f;0
…and pressing Y to confirm each change as prompted, resulted in the change being made to all packages. Result! Well almost. After going back to Visual Studio and changing the Protection Level of the Project (it must match that of the Packages), I tried to build the solution. It returned an error stating that for all the Packages, the Protection Level was incorrect and had to match the Project. But I’d just changed all of that! Turns out I had to open all the packages in Visual Studio (for some reason), and then it would build without making any further changes. Odd.
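Had I gone down the scripting route I first considered, a rough Python sketch might have looked something like the below. This is purely illustrative: it assumes the protection level is stored as a DTS:ProtectionLevel attribute on the package XML (the exact representation varies between SSIS versions, so check one of your own .dtsx files first), and the function and folder names are my own invention.

```python
import re
from pathlib import Path

# dtutil encryption level 0 corresponds to DontSaveSensitive
DONT_SAVE_SENSITIVE = "0"

def set_protection_level(xml_text, level=DONT_SAVE_SENSITIVE):
    """Replace the DTS:ProtectionLevel attribute value in package XML.

    Assumes the attribute appears as DTS:ProtectionLevel="..." - verify
    against your own .dtsx files, as the format differs across versions.
    """
    return re.sub(r'DTS:ProtectionLevel="[^"]*"',
                  'DTS:ProtectionLevel="%s"' % level,
                  xml_text)

def convert_folder(folder):
    """Rewrite every .dtsx file in a folder in place."""
    for path in Path(folder).glob("*.dtsx"):
        path.write_text(set_protection_level(path.read_text(encoding="utf-8")),
                        encoding="utf-8")
```

In the end the dtutil one-liner was less effort, but a script like this has the advantage of not prompting for confirmation on every package.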
While this is useful to know for future reference, I’m much more likely to be developing SSIS projects using BIML, which would make changes like this far easier, as it would be a simple change in the BIML template, re-generate the packages and job done. It’s the future…
At my current client, I have a very specific set of parameters to work within regarding security. Chiefly, I need read permissions on the underlying database for a Microsoft Dynamics AX 2012 R2 installation. I have a series of SSIS packages that extract data from AX on a daily basis, executed from SQL Server Agent. The AX database, however, is hosted on a remote server, and neither I nor the client team has permissions directly on that box. Our only option for connecting to the SQL Server is via a service account which has db_datareader permissions on the AX database. So how to execute the Agent jobs running as the service account? By using a Credential and a SQL Agent Proxy.
The first step is to set up a security Credential on the SQL Server that executes the jobs. These can be found under Security>Credentials in SSMS (see Figure 1). The Credential identity should be set to the name (including the domain) of the service account, and must include the password. To create the Credential in T-SQL:
CREATE CREDENTIAL <CredentialName>
WITH IDENTITY = '<domain>\<service account>',
     SECRET = '<password>';
GO
Next, a Proxy for SSIS must be created in the SQL Agent. These are found in SQL Server Agent>Proxies>SSIS Package Execution (Figure 2). To create a proxy in T-SQL (note that subsystem_id 11 specifies that it is for SSIS Package Execution):
USE msdb;
GO

EXEC msdb.dbo.sp_add_proxy
    @proxy_name = N'<ProxyName>',
    @credential_name = N'<CredentialName>',
    @enabled = 1;
GO

EXEC msdb.dbo.sp_grant_proxy_to_subsystem
    @proxy_name = N'<ProxyName>',
    @subsystem_id = 11;
GO
Finally, the Agent job needs to be set up to run as the Proxy. In the Run As drop-down in the relevant Job Step, select the newly created Proxy (see Figure 3).
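That last step can be scripted as well. A sketch using sp_add_jobstep is below – all the names are placeholders, and the @command value assumes file-based package deployment, so adjust it to suit your environment:

```sql
USE msdb;
GO

-- Placeholder names throughout: substitute your own job, step,
-- package path and proxy. @proxy_name is what makes the step
-- run under the proxy rather than the Agent service account.
EXEC msdb.dbo.sp_add_jobstep
    @job_name = N'<JobName>',
    @step_name = N'<StepName>',
    @subsystem = N'SSIS',
    @command = N'/FILE "<PackagePath>"',
    @proxy_name = N'<ProxyName>';
GO
```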
I’ve been branching out a bit in my spare time with regards to other tools and languages for manipulating and analysing data. The R language has interested me for a while, and I came across this free tutorial which serves as a pretty good introduction to its basic usage and syntax. It does so in a rather nice interactive web interface. Highly recommended if you have a spare hour or two to go through the course. As a plus, there is a link to a discount on the O’Reilly book Data Mashups In R once the course has been completed. R itself can be downloaded from here. Get stuck in, fellow data geeks!
I don’t consider 2012 to have been a great year personally; I’m quite glad to see the back of it. But I do have to join in with almost everyone else in the blogging world and post my highlights of the year. First up of course, is albums…
Records of the year
- Mid Air – Paul Buchanan (Mid Air on Youtube)
- The Devil’s Debt – Love & Money (The Devil’s Debt on Youtube)
- Now for Plan A – The Tragically Hip (Goodnight Attawapiskat on Youtube)
Books of the year (not necessarily published this year, but read by me in 2012)
- No Country for Old Men – Cormac McCarthy
- A is for Angelica – Iain Broome
- The Deep Dark Sleep – Craig Russell
Tech books of the year
- DBA Survivor – Thomas LaRock
- How to Become an Exceptional DBA – Brad McGehee
- Pro SQL Server 2012 Practices – a ton of great SQL folks (still reading this but it’s the business)
Beers of the year
- Hit The Lip – Cromarty Brewing Co
- Magic 8 Ball – Magic Rock Brewing
- A Face With No Name – Tempest Brewing Co
The 49ers beating the Saints in the playoffs courtesy of this last-gasp Alex Smith touchdown pass to Vernon Davis.
Personal highlights and goals
It’s September, it’s football time! The best time of year; it’s been a long time since February. Although the season kicked off on Wednesday night with the Cowboys beating the Super Bowl champion Giants, I decided I’d predict every regular season game and see who I think will be the eventual winners. The standings came out as follows:
- *NE – 12 and 4
- *BUF – 9 and 7
- NYJ – 7 and 9
- MIA – 5 and 11
No surprise that I think the Patriots will win this division handily. I fancy Buffalo to have a good season with some nice additions on defense, enough to see them sneak into the playoffs.
- *BAL – 11 and 5
- PIT – 9 and 7
- CIN – 6 and 10
- CLE – 3 and 13
Baltimore are the class of this division still. The Steelers and Bengals to take a slight backwards step. It’s the Bengals way!
- *HOU – 10 and 6
- TEN – 8 and 8
- JAX – 4 and 12
- IND – 4 and 12
I think the Texans will make the playoffs for the second time; they look easily the best team here. Andrew Luck will have a decent but ultimately frustrating losing season with the Colts.
- *KC – 11 and 5
- *SD – 10 and 6
- OAK – 7 and 9
- DEN – 7 and 9
Peyton Manning and the Broncos are making the headlines in the pre-season, but my bet is for the Chiefs to win this competitive division. The Chargers will make the playoffs via the wild card, leaving the Raiders and Broncos disappointed.
- *PHI – 12 and 4
- *NYG – 10 and 6
- DAL – 9 and 7
- WAS – 4 and 12
I’m picking the Eagles to finally live up to their potential in 2012. The Giants will sneak in again, while Tony Romo and RGIII stay at home in January.
- *GB – 13 and 3
- *CHI – 12 and 4
- DET – 8 and 8
- MIN – 4 and 12
Yet again the Packers are the class of the NFC, although the Bears will run them close. The Lions will struggle to keep up and miss out on another playoff trip.
- *ATL – 13 and 3
- CAR – 10 and 6
- TB – 7 and 9
- NO – 6 and 10
This is my big shock division. I think the Falcons will take off this season, and the Panthers will make a run at the playoffs, but miss out to the Giants on the tie-breakers. The Bucs will improve, but I think the Saints will struggle in the big games without Sean Payton and end up with a losing record.
- *SF – 9 and 7
- SEA – 8 and 8
- ARI – 5 and 11
- STL – 3 and 13
A horrendous schedule will see my Niners win four fewer games than last season, but still win a poor division. Seattle will be competitive, but the Cards and Rams will want to forget this one, the latter being rewarded with the first pick in the draft.
NYG at PHI – PHI win
CHI at SF – CHI win
BUF at KC – KC win
SD at HOU – HOU win
PHI at ATL – PHI win
CHI at GB – GB win
KC at BAL – BAL win
HOU at NE – NE win
PHI at GB – GB win
BAL at NE – BAL win
GB vs BAL – GB win
So, I have the Packers to win it all. Really can’t see past them this season. Let’s see how much I get wrong!
Ever tried to reseed an identity column in a table variable? No? Let’s give it a go…
Let’s declare a simple table variable, @country, and insert a few rows.
DECLARE @country TABLE
(
    CountryId INT IDENTITY(1, 1),
    CountryName VARCHAR(50) NOT NULL
);

INSERT INTO @country (CountryName)
VALUES ('Scotland'), ('England'), ('Wales'), ('Northern Ireland');

SELECT CountryId, CountryName
FROM @country;
Truncating a normal table will remove all the rows and reseed the identity value. The following syntax is not valid for a table variable however:
TRUNCATE TABLE @country;
The following error would be returned: Incorrect syntax near ‘@country’.
We can remove all the rows from the table by using DELETE rather than truncate, but this does not reset the identity value. Can we use DBCC CHECKIDENT, like we would on a physical table? No…
DBCC CHECKIDENT (@country, RESEED, 1);
will return Must declare the scalar variable “@country”., and
DBCC CHECKIDENT ([@country], RESEED, 1);
will return Cannot find a table or object with the name “@country”. Check the system catalog.
Table variables cannot be dropped, nor can they be re-declared. In fact, trying to do the latter returns this error: “The variable name ‘@country’ has already been declared. Variable names must be unique within a query batch or stored procedure.” The phrase “query batch” in this message is important to note. Bear this in mind while we try to declare the table variable in its own scope:
DECLARE @i INT = 1, @count INT = 5;

WHILE @i <= @count
BEGIN
    -- declare table variable in scope of the while loop
    DECLARE @country TABLE
    (
        CountryId INT IDENTITY(1, 1),
        CountryName VARCHAR(50) NOT NULL
    );

    INSERT INTO @country (CountryName)
    VALUES ('Scotland');

    SELECT CountryId, CountryName
    FROM @country;

    DELETE FROM @country;

    SET @i = @i + 1;
END
The output is as follows:
Note that even though the table variable was declared on every iteration of the loop (and therefore repeatedly within the same batch), no error was returned, and the identity column continued to increment.
So really, this is just an extremely long-winded way of saying…if you need to reseed an identity column and you are using a table variable, consider using a temporary table or a physical table instead!
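To illustrate that suggestion, here’s the equivalent with a temporary table, where TRUNCATE is valid and does reset the seed:

```sql
CREATE TABLE #country
(
    CountryId INT IDENTITY(1, 1),
    CountryName VARCHAR(50) NOT NULL
);

INSERT INTO #country (CountryName)
VALUES ('Scotland'), ('England'), ('Wales'), ('Northern Ireland');

-- Removes all rows AND resets the identity seed
TRUNCATE TABLE #country;

INSERT INTO #country (CountryName)
VALUES ('Scotland');

-- CountryId starts again at 1
SELECT CountryId, CountryName
FROM #country;

DROP TABLE #country;
```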
This month’s T-SQL Tuesday is hosted by Erin Stellato, and the theme is A Day In The Life. The premise is to track what you do over the course of a day as a SQL professional. My days can differ greatly, depending on the role I’m currently working in, and for what client. My current role is mainly database development and data migration work for an upgrade to an existing application. Last Wednesday went roughly something like this…
0900 – Getting set up for the day by preparing my to-do list. Then I caught up with work-related emails, sent a few replies back before switching Outlook off for a couple of hours – it’s a distraction. I then did the usual morning routine, checking out the latest code from Subversion and updating the database documentation. I also spent a couple of minutes on the SQL Server Central Question Of The Day – I find it leads me into work quite nicely, and I often learn something new.
0920 – Had a discussion with the lead .Net developer regarding some requested data model changes.
1000 – Started working on an SSIS package to migrate data from a legacy system into the main application database.
1130 – Updated system testing and user testing environments with new copies of static data.
1200 – The client Subversion repository had gone through some changes to separate development streams, so I moved all the database scripts to the relevant places. Red Gate Source Control…I can’t live without you!
1230 – Daily conference call to discuss progress and issues with the data migration.
1300 – Lunch (cottage pie – nice!), caught up on the day’s news and blogs.
1330 – Evaluating and designing the changes required for a new piece of functionality; then updating the relevant stored procedures before testing and documenting.
1600 – Made some logic changes to an existing stored procedure.
1700 – Started modelling some structural changes to a part of the database that was causing the developers some issues.
1715 – Load and performance testing conference call.
1745 – Home time!
I’m not sure that’s the most typical day for me – I’d say the work is more fragmented than normal. But it pretty much answers the question “What did you do all day?”.
Thanks to Erin for hosting T-SQL Tuesday this month!