2016-01-18

What seemed like a trivial task turned into a real nightmare when I had to read in some data from a PowerPivot model using Python. I believe I've researched this very well over the last couple of days but now I hit a brick wall and would appreciate some help from the Python/SSAS/ADO community.

Basically, all I want to do is programmatically access raw data stored in PowerPivot models - my idea was to connect to the underlying PowerPivot (i.e. MS Analysis Services) engine via one of the methods listed below, list the tables contained in the model, then extract the raw data from each table using a simple DAX query (something like EVALUATE (table_name)). Easy peasy, right? Well, maybe not.

0. Some Background Information

As you can see, I've tried several different approaches. I'll try to document everything as carefully as possible so that those uninitiated in PowerPivot functionality will have a good idea of what I'd like to do.

First of all, some background on programmatic access to Analysis Services engine (it says 2005 SQL Server, but all of it ought to still be applicable): SQL Server Data Mining Programmability and Data providers used for Analysis Services connections.

The sample Excel/PowerPivot file I'll be using in the example below can be found here: Microsoft PowerPivot for Excel 2010 and PowerPivot in Excel 2013 Samples.

Also, note that I'm using Excel 2010, so some of my code is version-specific. E.g. wb.Connections["PowerPivot Data"].OLEDBConnection.ADOConnection should be wb.Model.DataModelConnection.ModelConnection.ADOConnection if you're using Excel 2013.

The connection string I'll be using throughout this question is based on the information found here: Connect to PowerPivot engine with C#. Additionally, some of the methods apparently require some sort of initialization of the PowerPivot model prior to data retrieval. See here: Automating PowerPivot Refresh operation from VBA.

Finally, here's a couple of links showing that this should be achievable (note however, that these links mainly refer to C#, not Python):

Made connection to PowerPivot DataModel, how can I fill a dataset with it?

Connecting to PowerPivot with C#

2013 C# connection to PowerPivot DataModel

Connecting Tableau and PowerPivot. It just works. (showing that external apps can in fact read PowerPivot model data - note that the Tableau add-in installs Interop.ADODB.dll assembly, which I guess is what it uses to access the PowerPivot data)

1. Using ADOMD

Here, it appears the problem is that the PowerPivot model has not been initialized:

2. Using AMO

Same story, "the server is not running":

Note that AMO is technically not used for querying data, but I included it as one of the potential ways of connecting to the PowerPivot model.

3. Using ADO.NET

This is similar to What's the simplest way to access mssql with python or ironpython?. Unfortunately, this also doesn't work:

4. Using ADO via adodbapi module

Similar to Opposite Workings of OLEDB/ODBC between Python and MS Access VBA. The error I get is:

This is basically the same problem as with ADO.NET above.

5. Using ADO via Excel/win32com module

The idea for this approach came from this blog post that uses VBA: Export a table or DAX query from Power Pivot to CSV using VBA. Note that this approach uses an explicit Refresh command that initializes the model (i.e. "server"). Here's the error message:

It appears, however, that the ADO connection has been established:

type(Connection) returns instance

print(Connection) returns Provider=MSOLAP.5;Persist Security Info=True;Initial Catalog=Microsoft_SQLServer_AnalysisServices;Data Source=$Embedded$;MDX Compatibility=1;Safety Options=2;ConnectTo=11.0;MDX Missing Member Mode=Error;Subqueries=2;Optimize Response=3;Cell Error Mode=TextValue

It seems the problem lies in the creation of the ADODB.Recordset object.

6. Using ADO via Excel/win32com, direct use of ADODB.Connection

Similar to Connection to Access from Python [duplicate] and Query access using ADO in Win32 platform (Python recipe). Unfortunately, the error Python spits out is the same as in the two examples above:

7. Using ADO via Excel/win32com, direct use of ADODB.Connection plus model refresh

I was hoping I could initialize an instance of Excel, then initialize the PowerPivot model, and then create a connection using the internal connection string Excel uses for embedded PowerPivot data (similar to How do you copy the powerpivot data into the excel workbook as a table? - note that the connection string is different from the one I've used elsewhere). Unfortunately, this doesn't work and my guess is that Python starts the ADODB.Connection process in a separate instance (as I get the same error message when I execute the last three rows without first initializing Excel, etc.):

Show more