Usually, in an OLAP cube, measures are numeric values and strings are used primarily as descriptions of dimension members.
In Analysis Services, strings are stored in separate structures – called “string stores” – because they have dynamic records size, and are less efficient – much less efficient – than numeric fields.
Furthermore, every string has an overhead of 12 additional bytes for management – so even a 1-byte string takes more room than an 8-byte numeric field.
For these two reasons alone we should limit the use of strings in our projects – that means we should use them only as “user friendly labels” for all dimension members – so, whenever possible, we should prefer integer values.
However, the worst thing about strings in Analysis Services – at least before SQL Server 2012 – is that the string store has an absolute maximum size of 4GB.
This limit applies to every single attribute of every dimension and, more precisely, to any of the following types of string file stores:
- *.ksstore – Used to store strings representing keys of attribute members
- *.asstore – Used to store the names of members
- *.bsstore – Used to store the blobs of members
When one of these files exceeds the 4GB size limit, processing will fail and you will receive – usually after a long time – a message like this:
This is a “physical” limit of the engine, hence there is no workaround and to avoid this problem, you can try one or more of these potential solutions:
- Do not use strings as key columns for attributes with a large number of members
- Shorten attributes names
- Shorten translations of attributes names
- Switch to a ROLAP dimension
What I am interested to underline here, is that we can avoid spending a lot of time processing a big dimension only to discover at the end that the engine is not able to complete the process because of this problem.
In fact, we can easily estimate the storage needed by our dimension using this simple calculation:
Total storage = regular storage + overhead storage
Regular storage = (members * characters * translations * 2)
Overhead storage = (members * translations * 12)
Members = number of members in our dimension
Characters = number of characters in the string
Translations = number of translations including the default language
2 = constant value for conversion to Unicode characters
12 = constant value for strings overhead
Here an example:
x 94 characters per member name
x 10 translations
x 2 for conversion to Unicode characters
x 10 translations
x 12 bytes overhead per string
(We can approximate this value to the maximum limit even if the real value is
1GB = 1,0243 bytes = (1,024 * 1,024 * 1,024) bytes = 1,073,741,824 bytes
4GB = 1,073,741,824 * 4 = 4,294,967,296 bytes)
From the above calculations, it is easy to understand that, with a member name that is 94 characters long and 10 translations, our dimension is limited to 2 million members.
Well, the formula is not so complicated, but surely, it would be easier to have a simple utility to calculate the estimated storage occupation of our dimension attributes.
That was exactly my thought a few days ago when I had to do it a lot of time in a very big project.
Therefore, I decided to develop a very simple application during my – very little indeed – spare time.
I called it “SSAS – String Storage Calculator” and you can download it here:
The tool is very simple and I do not want to offend anyone’s intelligence trying to explain what is obvious! 🙂
However, if anyone has any question about it, feel free to ask me.
Three more things:
1) I have checked the correctness of the above formula by creating a dimension – similar to the one of the example – with these values:
- 20 million members (instead of 2 million)
- 94 characters names
- 1 translation (instead of 10)
The estimated size is exactly 4GB and the real size of the *.asstore file was very close to this limit as you can see in this picture:
2) The 4GB limit also apply to DistinctCount measures
3) In SQL Server 2012 the 4GB limit has been removed *but* you have to keep worrying about it because, by default, the limit is still present, so you need to explicitly remove it by activating the “scalable string storage” option for every attribute that may exceed 4GB – and you should do it only for attributes that really need it.
Hence, the SSAS String Storage Calculator will be useful even developing solution in SSAS 2012!
In conclusion, this utility cannot solve the problem for you, but it can help you to save a lot of time identifying the problem before it happens.
I hope you enjoy it!