LGpart3


Hi, my name is Dave Wickert. Just kidding. That one no good?

My name is Lee Graber. I’m a senior developer on the Microsoft SQL Server Analysis Services team. My team works on PowerPivot for SharePoint integration.

Alright, we’re back on. There’s another important thing to recognize when interacting with a PowerPivot workbook: when you created that workbook, you actually retrieved data from some type of external source.

It could be SQL. You could’ve imported from Access. Maybe you got it from a text file, and I’d be remiss not to mention data feeds. There are lots of places you can actually import data from. And PowerPivot for Excel, just like Analysis Services, reformats that data into our own data structures for doing efficient queries in our MDX language or in a variant of SQL, and we eventually store it in an ABF.

Now, when Excel interacts, via slicers or any other action you’re taking, what is happening is that queries are being issued against our data cache, the ABF file. That is where we’re getting the results. We are not actually going back to whatever external data source originally supplied the data.

So if you change your relational database and then upload your workbook without re-importing, you’re still getting what we have stored in our ABF. And another thing to note is that along with our data cache, Excel has something they call a pivot cache. Say I take this workbook, update the contents of our ABF in some way, and save it. When I open it, what I see in the workbook does not reflect the updated ABF, because Excel has cached all the results it needs inside its own format.

Somewhere else in this zip file, they have their actual pivot cache. So we have multiple levels of where the data is actually stored. There’s the real original data in your relational database, Access, text files, or data feeds.

There is our imported, reformatted, enhanced version that we eventually store in our ABF, and then there’s another cache, the Excel pivot cache, which holds what Excel needs to visualize when you open the workbook.

So understanding that, there are a couple of key things you can learn about and settings you can change. One is on the Excel connections: you can set a flag called refresh on load. For our embedded connection, if this is set, then whenever the workbook is opened in Excel Services, Excel will ignore its pivot cache and immediately try to reload the data from our ABF.
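In the file format, that setting lands as a refreshOnLoad attribute on the connection inside the XLSX zip. The following is a sketch only: the function name and the simple attribute rewrite are my own illustration (the flag is normally set through Excel’s connection properties UI), but it shows where the setting lives:

```python
import re
import zipfile

def set_refresh_on_load(src_path: str, dst_path: str) -> None:
    """Copy an .xlsx, forcing refreshOnLoad="1" on every connection.

    Sketch only: assumes the workbook contains xl/connections.xml and
    that a plain attribute rewrite is sufficient for its XML.
    """
    with zipfile.ZipFile(src_path) as src, \
         zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "xl/connections.xml":
                xml = data.decode("utf-8")
                # Drop any existing refreshOnLoad attribute, then add our own.
                xml = re.sub(r'\srefreshOnLoad="[^"]*"', "", xml)
                xml = xml.replace("<connection ", '<connection refreshOnLoad="1" ')
                data = xml.encode("utf-8")
            dst.writestr(item, data)
```

An XLSX file really is just a zip, which is why this kind of rewrite is possible at all; the pivot cache mentioned above lives in other parts of the same archive.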

So you’ve essentially eliminated one of the levels of caching that could give you potentially confusing results. I’m not saying you should turn this on or not; I’m just saying it’s one of your options.

Now the other thing to understand is that in order to get new versions of this data, we have a feature we call data refresh. Apparently we couldn’t come up with a more original name, so we went with one that would make sense.

So when you upload the workbook to SharePoint and right-click on it, there’s actually an option that says manage data refresh. If you have not yet created a schedule, it’ll immediately take you to a page that will look very familiar to anybody who’s created schedules for anything before. It’s got daily, monthly, weekly.

I want it once. I’d like it to run right now. We also have a concept of after business hours, which works with an administrative configuration option that says business hours are between 8 a.m. and 5 p.m. So even if someone sets a schedule to run daily, it will always run after 5 p.m.
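The after-business-hours rule can be sketched in a few lines. This is a hypothetical illustration, not the actual scheduler logic; the 8 a.m.–5 p.m. window comes from the talk, and the function name is my own:

```python
from datetime import datetime, time

# Administrator-configured business hours, as described in the talk.
BUSINESS_START = time(8, 0)
BUSINESS_END = time(17, 0)

def next_run(now: datetime, after_business_hours: bool) -> datetime:
    """Return when a scheduled refresh may run.

    If the "after business hours" option is set and we are currently
    inside business hours, defer the run to 5 p.m. today; otherwise
    the schedule runs as requested.
    """
    if after_business_hours and BUSINESS_START <= now.time() < BUSINESS_END:
        return datetime.combine(now.date(), BUSINESS_END)
    return now
```

For example, a daily schedule evaluated at 9:30 a.m. with the option set would be deferred to 5:00 p.m. the same day.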

Along with all of that, we need security information, because here is what data refresh is going to do: via a SharePoint timer job, a call is made into our system service to look for refresh schedules.

So we’re going to get this call, and we’re going to look and say: is there anything that needs to be refreshed right now? We’ll find, oh, there’s workbook 1, this workbook that you just uploaded. Alright, we want to refresh it. But right now, the only credentials we have are those of the system service we’re running as, or potentially the farm account that the timer service runs as.

Neither of those are the accounts we actually want. So when you configure data refresh for us, you similarly have a couple of options. The first we’ll call none (it’s not actually called that), and it maps to an unattended account configured by an administrator, which says: when data refresh occurs, load this Windows username and password.

Log in as that account so that when we load the database and call process (not process full, actually; we call process on the various dimensions and tables that we need to), we have a valid Windows identity.

Another option is to manually enter a username and password. We’ll actually let you just type those in, and we will create a Secure Store target application that we can load later to retrieve the Windows identity and log in.

And the final option, similar to before, is an existing Secure Store Service target application. Now, this one is different from Excel Services. In the Excel Services world, there was a user who came through, and at the point Excel Services is talking to the data provider, it still knows who that user was.

In our case, we’re getting a call from the timer service, running as the farm account, into our service, which is running as some process account A. So we don’t have an original user. At this point, what we need to do is ask: what is the target application associated with this system service?
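The three credential options can be summarized as a simple dispatch. Everything here is a hypothetical stand-in: the option names, the config keys, and the Secure Store lookup are illustrative, not the actual PowerPivot configuration values:

```python
# Sketch of the three data-refresh credential options described above.
def resolve_refresh_credentials(option: str, config: dict) -> tuple:
    """Pick the Windows identity used for a data refresh run."""
    if option == "unattended":
        # Administrator-configured unattended account.
        return (config["unattended_user"], config["unattended_password"])
    if option == "stored":
        # Username/password the user typed in; PowerPivot keeps these in
        # a Secure Store target application created on the user's behalf.
        return (config["stored_user"], config["stored_password"])
    if option == "target_app":
        # An existing Secure Store Service target application.
        return lookup_secure_store(config["target_app_id"])
    raise ValueError("unknown credential option: " + option)

def lookup_secure_store(app_id: str) -> tuple:
    # Stand-in for a real Secure Store Service lookup.
    return ("user-for-" + app_id, "password-for-" + app_id)
```

The point of all three paths is the same: by the time we call process, there is a valid Windows identity to run as, even though no original user came through the timer service call.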

So do we need Kerberos constrained delegation? Here’s why it comes up: when Analysis Services loads the database and we actually call process, it’s going to make a call back into SQL, Access, data feeds, whatever the back-end data sources are, and re-retrieve the data. That would be hop one, hop two.

So typically we would need Kerberos constrained delegation configured. These two components, however, always exist on the same machine, so we’re able to eliminate that hop and make the call to the data source the first hop. By always guaranteeing that these two run on the same machine, and that the refresh call is made against the same Analysis Services server running the system service that is actually doing the refresh, we avoid the requirement for Kerberos constrained delegation.

Some data sources do not use Windows authentication. Maybe you have Oracle here; maybe you have SAP. In these cases, we have the ability to override the username and password on the connection string: you set them when you actually configure data refresh, and we update the connection string right before we make the call to those data sources.
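As a rough illustration of that connection-string override, here is a sketch that assumes a simple "Key=Value;Key=Value" format; real provider strings may quote or escape values differently, and the function name is my own:

```python
def override_credentials(conn_str: str, user: str, password: str) -> str:
    """Replace any credential keys in a connection string.

    Sketch only: drops existing User ID/Password (and common aliases)
    case-insensitively, then appends the configured credentials.
    """
    parts = [p for p in conn_str.split(";") if p.strip()]
    credential_keys = ("user id", "uid", "password", "pwd")
    kept = [p for p in parts
            if p.split("=", 1)[0].strip().lower() not in credential_keys]
    kept.append("User ID=" + user)
    kept.append("Password=" + password)
    return ";".join(kept)
```

The substitution happens just before the call to the back-end source, so the credentials stored in the workbook itself never have to be the ones used at refresh time.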

So when data refresh runs, we load the database in read/write mode, process it, resave it back into the XLSX, and upload that back into SharePoint. At that point, you have your updated workbook. You can interact with it and get all the latest data.
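The end-to-end flow just described can be sketched as a pipeline. Every function here is a hypothetical stand-in (recorded to a trace list so the flow is visible), not a real SharePoint or Analysis Services API:

```python
trace = []

def download_from_sharepoint(url):
    trace.append("download")                 # fetch the workbook (an XLSX zip)
    return {"url": url, "abf": "stale"}

def load_database(xlsx, mode):
    trace.append("load:" + mode)             # attach the embedded ABF
    return {"xlsx": xlsx}

def process(db):
    trace.append("process")                  # re-query the back-end sources
    db["xlsx"]["abf"] = "fresh"

def save_into_xlsx(db):
    trace.append("resave")                   # refreshed ABF back into the zip
    return db["xlsx"]

def upload_to_sharepoint(url, xlsx):
    trace.append("upload")                   # replace the workbook in SharePoint

def run_data_refresh(workbook_url):
    xlsx = download_from_sharepoint(workbook_url)
    db = load_database(xlsx, mode="readwrite")   # read/write, per the talk
    process(db)
    new_xlsx = save_into_xlsx(db)
    upload_to_sharepoint(workbook_url, new_xlsx)
```

Note that the Excel pivot cache is a separate layer: unless the refresh-on-load flag discussed earlier is set, the visuals shown on open still come from Excel’s own cached results rather than the freshly processed ABF.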