If you come from an MS SQL Server background, analyzing SSAS memory usage patterns can be confusing. Even if you had to wait for the SQL Server service to zero out the allocated memory (being sarcastic about slow AWE allocations years ago), in the end you would see in Task Manager that the memory had been allocated and would get the warm feeling that things were working as expected. Well, don’t use the same approach with SSAS or you will be disappointed.
SSAS is a file-based database engine. It reads data from different files, performs the joins, and services the requests. All file requests go through the System File Cache, so the overall used memory may not necessarily be found only in the SSAS process. I have already blogged about how file reading and memory usage work at the OS level. The same applies to SSAS – it makes a request to the OS for a file; the OS checks whether the file is in the Cache. If it’s not, the file is retrieved from Disk, placed in the Cache, and returned to SSAS (aka a hard fault). If it’s found in the Cache, it’s returned to the process (aka a soft fault). Hence, when analyzing SSAS memory usage, we need to consider the System File Cache. There is an excellent tool, “RAMMap.exe” from Sysinternals, which gives you a list of all cached files. You can use the tool to view which SSAS files are cached at the OS level. Note: if you run the exe with a text file name as a parameter (i.e. “RAMMap.exe output.txt”), the tool will export all the file paths into a text file; we considered embedding the tool and giving you this list in our app, but we gave up on the idea due to the size of the output file (300MB for a 450GB cube) and the limited value delivered by the report.
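You can see the hard-fault vs. soft-fault difference for yourself with a few lines of code. The sketch below (plain Python, standing in for SSAS file reads; the file name is made up) times two sequential reads of the same file – on a typical system the second read is served from the System File Cache, so it completes noticeably faster. Note that because we just wrote the file, even the first read may already be partly cached:

```python
import os
import tempfile
import time

def timed_read(path, chunk=1024 * 1024):
    """Read a file sequentially; return (bytes_read, seconds)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            total += len(data)
    return total, time.perf_counter() - start

# A scratch file standing in for an SSAS partition file (name is illustrative).
path = os.path.join(tempfile.gettempdir(), "fake_partition.data")
with open(path, "wb") as f:
    f.write(os.urandom(64 * 1024 * 1024))  # 64MB of data

cold_bytes, cold_s = timed_read(path)  # may be a hard fault (disk) if not cached
warm_bytes, warm_s = timed_read(path)  # almost certainly a soft fault (cache)
print(f"first read: {cold_s:.3f}s, second read: {warm_s:.3f}s")
os.remove(path)
```

The same Logical-vs-Physical IO pattern is what the charts later in this post visualize: high Logical IO with near-zero Physical IO means the cache is doing the work.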
I am going to use a server with 1TB of physical memory in the examples that follow; it makes it easier to see the differences in memory behavior when events happen. Speaking of the System File Cache, there is an option in the msmdsrv.ini file called “LimitSystemFileCacheSizeMB”, which controls how much of the SSAS file data will be cached; the cache limitation applies at the OS level and affects other processes. In the Companion snapshot below, we set the option to “0”, which means there is no limit; the OS will cache as much as it can (note: this may affect other processes; on the same server, there is a SQL instance running with 100/300GB configured for Max Memory, which is dynamically changed during the ETL).
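For reference, the setting lives in msmdsrv.ini, which is an XML file. A sketch of the relevant fragment is below (the exact element nesting can vary by SSAS version, so treat this as illustrative rather than copy-paste ready):

```xml
<!-- msmdsrv.ini fragment (illustrative; nesting may differ by version).
     0 = no limit: the OS caches as much SSAS file data as it can. -->
<ConfigurationSettings>
  <Memory>
    <LimitSystemFileCacheSizeMB>0</LimitSystemFileCacheSizeMB>
  </Memory>
</ConfigurationSettings>
```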
So let’s start with several common scenarios, displaying the memory usage and its variations in SSAS:
-
Min/Max Memory: The SSAS properties allow the configuration of the Min and Max memory settings. This is a bit confusing if you apply common sense. The Minimum Memory does NOT mean that this is the minimum amount of memory that will be used by SSAS. It means that SSAS will start trimming its caches and taking some other actions once this value is reached! In my experience, SSAS will stop accepting new connections when its memory consumption goes above this value. Compared to MS SQL, there is not YET a concept of an administrator connection in SSAS, so if the server decides not to take new connection requests, you are out of luck. (Note: inside our service, we maintain an open connection, which we call the Admin connection, so we can be sure that we can talk to the server when new connections are NOT accepted. This allows us to perform management tasks if needed.) So when the “LowMemoryLimit” is surpassed by the current usage, the server will stop accepting new connections; in other words, it will NOT talk to anyone new. It continues to execute its currently running queries, and as far as the outside world is concerned, the service is simply NOT responding – so the usual approach is: let’s kill the process and restart it! Note: our software handles this condition automatically by terminating the most expensive queries until the server becomes “responsive” – please see the blog on Resource Governor for SSAS, which we introduced as a concept back in 2004! When the “HighMemoryLimit” is reached, SSAS is supposed to start automatically terminating queries…but the exact details about the server behavior with this setting, as well as the “TotalMemoryLimit” one, are not very clear. Anyway, we have our software in production, and it successfully prevents the “non-responsive server condition” and handles it automatically!
Put differently, without our software, SSAS was restarted multiple times a DAY due to client complaints; with our software, the server is NEVER restarted due to non-responsiveness! So if your server is not responding, most likely it’s processing expensive queries…check your PerfMon counters.
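The “terminate the most expensive queries until the server is responsive again” idea can be sketched as plain selection logic. The code below is a hypothetical illustration, not our software’s actual implementation – real per-request memory attribution in SSAS is far harder than a single cost number, and the session ids and costs here are made up:

```python
def pick_requests_to_cancel(memory_used_gb, low_limit_gb, active_requests):
    """Choose requests to cancel to bring memory back under LowMemoryLimit.

    active_requests: list of (session_id, estimated_memory_gb) tuples.
    Cancels the most expensive requests first, until the projected usage
    drops below the limit. Purely illustrative.
    """
    to_cancel = []
    projected = memory_used_gb
    for session_id, cost_gb in sorted(active_requests, key=lambda r: -r[1]):
        if projected < low_limit_gb:
            break
        to_cancel.append(session_id)
        projected -= cost_gb
    return to_cancel

# Example: 950GB in use, LowMemoryLimit at 900GB.
requests = [("s1", 5.0), ("s2", 80.0), ("s3", 1.0)]
print(pick_requests_to_cancel(950.0, 900.0, requests))  # → ['s2']
```

Cancelling the single most expensive request (“s2”, 80GB) already brings the projected usage below the limit, so the smaller requests are left alone.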
Here is an example of the server becoming non-responsive when, in reality, it simply continues to process the current requests while refusing new work due to lack of resources. The “Active Requests” chart shows “bubbles”/requests, which represent the currently actively-processed requests, with the diameter representing their duration (i.e. the bigger the bubble, the longer the request has been running). The drop in Memory usage below the “Limit Low” line represents our software taking proactive action and terminating expensive requests until the server is back in a responsive state! Please notice that we are talking about 900GB of physical RAM; check also the “In Use” System Memory, the “Standby Cache”, and the “Free” Memory.
The High and Low memory settings can be changed dynamically without a server restart, and as you can see from the snapshot below, the changes take effect immediately:
-
Normal Memory with a Query: below is a snapshot of a single query and the IO footprint it causes. Let me point out the interesting facts: a) The CPU usage is high, which most likely means that data is being retrieved from the Storage Engine (since it’s multi-threaded data access); b) The “System Files/IO” chart shows high Logical IO but NO Physical IO, which means that the data is coming from the System File Cache and the process memory containing cached data!
-
Backup: Below is an example of Memory and IO usage while a Backup is taking place. There are several interesting observations to point out. Let’s start with the “Standby” (593GB) and “In Use” (293GB) memory in the System Memory chart, and a total of 193GB in "Shrinkable, Nonshrinkable, and Other" memory used by SSAS in the "Usage[GB]" chart. 100GB is allocated to MS SQL, so we get a good match between what’s reported at the OS level and at the SSAS level (293GB In Use ≈ 100GB SQL + 193GB SSAS). At this point, the logical question from an MS SQL professional would be: why only 193GB when the cube is close to 880GB? Check out the 593GB of Standby cache; further analysis with the “RAMMap” tool reveals that the OS has cached all the SSAS files, and the reported Standby cache is actually used for caching SSAS files! This is very helpful if you need to restart the SSAS instance or reprocess the SSAS files; the files are still cached (i.e. the 593GB) after the restart! Note: another interesting observation is the central chart “Logical IO by Process” and its neighboring chart “System IO”. The “Logical IO by Process” chart displays which processes are causing the Logical IO on the system; hovering the mouse over the chart will give you this dynamic information. In our case, SSAS is causing most of the IO (i.e. SQL is the other process on this server that may be responsible for high IO). The “System IO” chart allows us to compare the “Logical” vs. “Physical” vs. “Cached” IO on the system; in this case, we can see that the Logical and the Physical IO are overlapping, which is common when we are doing a Backup of the cubes. The Cached IO is of little use in this case.
-
The Famous Storage Engine (SE) vs. Formula Engine (FE) processing times: This is a common topic worth covering. Below are a couple of snapshots with SE and FE queries. Here is a brief review. The first chart represents an SE query. Characteristics: multi-threaded data retrieval from the storage engine (i.e. higher CPU utilization), high IO, and not much processing of the data (i.e. the calculated-cells count is low!). As you can see, the CPU is relatively high on a 64-CPU system! The Logical IO is also high (even though it’s coming from the Cache!). The second snapshot represents an FE query. Characteristics: single-threaded data processing (i.e. lower CPU utilization), lower IO, and a high “Calculated Cells” count (see Total Cells Calculated/sec)! What is important to point out here is the fact that the SSAS service is using very little physical memory (i.e. zero = 100GB Total – 100GB SQL usage) and yet there is NO Physical IO on the system! The data is coming almost entirely from the System File Cache!
Storage Engine Query:
Formula Engine Query:
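The SE-vs-FE distinction above can be turned into a rough rule of thumb over PerfMon-style samples. The classifier below is hypothetical: the thresholds and parameter names are made up for illustration and should be tuned against baselines from your own server, not treated as SSAS-defined values:

```python
def classify_query(cpu_percent, logical_io_per_sec, cells_calculated_per_sec):
    """Rough heuristic: SE-bound queries burn CPU on many threads and drive
    heavy (often cached) IO; FE-bound queries run single-threaded and
    calculate many cells. Thresholds are illustrative only."""
    se_score = (cpu_percent > 50) + (logical_io_per_sec > 10_000)
    fe_score = (cpu_percent <= 50) + (cells_calculated_per_sec > 100_000)
    if se_score > fe_score:
        return "storage-engine-bound"
    if fe_score > se_score:
        return "formula-engine-bound"
    return "mixed"

print(classify_query(cpu_percent=85, logical_io_per_sec=50_000,
                     cells_calculated_per_sec=1_000))
# → storage-engine-bound
print(classify_query(cpu_percent=3, logical_io_per_sec=500,
                     cells_calculated_per_sec=2_000_000))
# → formula-engine-bound
```

The two example calls mirror the two snapshots: high CPU plus high Logical IO with few calculated cells reads as SE time; low CPU with a huge Total Cells Calculated/sec reads as FE time.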
-
Processing Command and Memory Behavior: Below is a snapshot of the SSAS server behavior during multiple-partition processing. Here are some highlights: a) The SSAS process memory usage will drop, since most of the cached data and aggregations will be invalidated (note: the two Usage charts are placed next to each other to show the change in memory usage). b) The bottom-left "Raw Data" chart shows the connection between the number of rows retrieved from SQL and the number of rows processed by the SSAS server when processing the partitions – as you can see, they match exactly! When you issue a ProcessData/ProcessFull/ProcessDefault command against partitions, the number of SQL/data source records will match this counter. When the aggregations are processed, the number of records read from the cube will be shown in the chart to the right – "Aggregations". This will give you a sense of the processing command’s progress. Note: the software contains a “Partition Processing” node, which will give you visual progress on the raw data/record retrieval from the data source as well as estimated completion times. There will be a separate blog on the topic.
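The rows-read vs. rows-processed match is also what makes progress estimation possible. A minimal ETA sketch is below; the counter plumbing (how you obtain the row counts) is assumed, and only the arithmetic is shown:

```python
import time

def estimate_completion(rows_read, rows_expected, started_at, now=None):
    """Estimate (% complete, remaining seconds) for a partition-processing
    command, assuming rows arrive at a roughly constant rate."""
    now = now if now is not None else time.time()
    if rows_read <= 0 or rows_expected <= 0:
        return 0.0, None
    fraction = min(rows_read / rows_expected, 1.0)
    elapsed = now - started_at
    remaining = elapsed * (1.0 - fraction) / fraction
    return fraction * 100.0, remaining

# 25M of an expected 100M rows read after 10 minutes -> 25% done, ~30 min left.
pct, eta = estimate_completion(25_000_000, 100_000_000, started_at=0, now=600)
print(f"{pct:.0f}% done, ~{eta / 60:.0f} min remaining")  # → 25% done, ~30 min remaining
```

The constant-rate assumption is optimistic – partitions rarely stream rows at a perfectly steady pace – but it gives a usable first-order estimate.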
The above examples show some of my observations. How can you draw your own conclusions? We give you the tools to run your own tests. There are a couple of features of our software that can assist you with these tasks. The first one is the “Stress Test”; it allows you to select queries, parameterize them (optionally), and execute them in parallel against the server while watching its behavior (note: there is a separate blog on this task). The second is the “Cache Warmer”, which allows you to pre-warm the SSAS and System File Caches with commonly used data (note: there is a separate blog on this task as well). Both features give you the option to empty either the SSAS and/or the System File Cache, so you can analyze query behavior under different conditions and scenarios.
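If you want to roll your own quick stress test, the core of the idea is just parallel query execution. The sketch below simulates it with a stub: `execute_mdx` is a stand-in you would replace with a real SSAS call (e.g. via ADOMD.NET or an XMLA endpoint), and the query strings are placeholders:

```python
import concurrent.futures
import random
import time

def execute_mdx(query):
    """Stand-in for a real SSAS call; here we only simulate a query
    taking a variable amount of time."""
    time.sleep(random.uniform(0.01, 0.05))
    return f"done: {query}"

def stress_test(queries, concurrency=8):
    """Fire the given queries in parallel and collect the results –
    the core idea behind a 'Stress Test' style feature."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(execute_mdx, queries))

queries = [f"SELECT ... FROM [Cube] -- variant {i}" for i in range(16)]
results = stress_test(queries, concurrency=4)
print(len(results), "queries completed")
```

Run it while watching PerfMon (or your monitoring tool of choice) and you can reproduce the memory and IO patterns discussed throughout this post.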
Going back to the beginning to complete the full circle: SSAS relies on the OS to cache the files that are commonly used. To better understand SSAS memory needs, we need to monitor the Physical IO, the Logical IO, and the Cached Data.