Ready Reference 3.0: Volatile Expressions
A volatile expression is one that must be recalculated each and every time there is a recalculation event within a workbook. A recalculation event will occur any time someone updates a cell; presses a recalculation shortcut; refreshes a table or query; or executes a macro that triggers recalculation. The needless use of volatile expressions is one of the things that can make a workbook cumbersome and slow.
The use of one or more volatile functions in an expression makes the expression volatile. All other expressions within the workbook that depend upon that expression and those that depend on them also become volatile.
A volatile function such as NOW and TODAY may have little impact on the overall calculation effort required by Excel to resolve the values in a workbook, but some, when used less judiciously, can have a significant impact on the recalculation times in a workbook, making it significantly slower than alternative calculations. Wherever possible, we would prefer a non-volatile solution to a problem if it can be implemented effectively.
Volatile Functions in Excel
- The CELL function returns a number of sets of information about a cell, or when the optional reference argument is omitted, the active cell. Since this information is dependent upon changes which may occur in the operating system, the function is volatile, to ensure that any changes (such as saving with a new file name) are trapped.
- The INDIRECT function allows the derivation of a dynamically calculated range, anywhere within a workbook. Its particular usefulness is that it allows an expression to refer to source data whose location varies not only as to column and row, but also importantly, by worksheet. This latter property is what makes it useful to know about, and allows this function to do something that is almost impossible any other way. However, it is slooooooooooowwwwww to recalculate and so should only be used sparingly when absolutely essential.
- Also, it should be used intelligently, to reduce the recalculation time involved. Things that will improve performance include making the ref_text argument a reference to a separate cell that pre-calculates the target address, reducing the calculation time, and where the row and column numbers vary, using R1C1 addresses instead of A1 addresses, and setting the optional a1 argument to FALSE.
- Alternative approaches which will be less volatile include: use of INDEX where the source sheet is not variable, use of CHOOSE, IFS or SWITCH where there are a relatively small number of finite sheet references to choose from.
- The INFO function returns information about the operating system. Since it depends upon information external to Excel, it has been implemented as a volatile function so that values displayed will be recalculated and displayed correctly each time a recalculation event occurs in Excel.
- The NOW function returns the date and time from the system clock as an Excel date and time number. There is no alternative to using this if that value is genuinely required in the logic of the workbook. If the date is the matter of concern, use TODAY instead, although this will not significantly change the calculation load.
- The OFFSET function allows the derivation of a dynamically calculated range within any one worksheet. It is widely used and causes significant calculation load as it is often used in such a way as to create masses of volatile calculations. Not only is every OFFSET function volatile, but all formulas that depend upon formulas using OFFSET become effectively volatile, as Excel traces the dependency chains of the volatile formulas.
- This is the most significant cause of needless recalculation in many modelling projects, and in our view, should be avoided as much as possible. Its use in calculation of dynamically derived source lists for data validation has a relatively low impact on model calculation speed, but its wide use in formulas within workbooks will have a very deleterious effect.
- As a general rule, we replace OFFSET functions with one or two INDEX functions of the reference type, as these are not volatile and significantly reduce the amount of calculation in workbooks (by orders of magnitude!). The use of the array form of the INDEX function is discouraged, as this becomes effectively volatile due to the vast precedence chains it creates in calculations.
- The RAND function derives a pseudo-random number for use in calculations where randomness is required, such as Monte Carlo type calculations. If this kind of randomness is required, there is no alternative but to accept the calculation impact of using the volatile function. In that kind of analysis, the impact of the randomness will be significant, as it will feed in at the lower end of most formula dependency chains and trigger a significant recaulction load. Better solutions for this would include the use of @RISK and CRYSTAL BALL add-ins for stochastic modelling.
- The RANDBETWEEN function derives a pseudo-random integer between bottom and top inclusively for use in calculations where randomness is required, such as Monte Carlo type calculations. The same comments apply as for RAND above.
- The TODAY function returns the current system date as an Excel date number. Used judiciously, its impact on calculation can be contained. We often use it to provide the seed value in validation lists where a user needs to select a date close to the current system date.
Conditionally Volatile Functions in Excel
Under some conditions functions that would normally not be volatile can be marked as volatile.
- It is possible to write a version of the SUMIF function where the range and the optional sum_range arguments have different dimensions, such as: =SUMIF(A2:A200,">0",B2). In this case, the sum_range is dynamically calculated as being of the same size as A2:A200, starting in B2. This can also happen if through a formula build error, there is a slight mismatch, such as =SUMIF(A2:A200,">0",B2:B201).
- Since Excel XP (2002) the SUMIF function will be marked as volatile in this situation, since the formula needs to be recaulcated if a cell in the dynamic sum_range changes, even though there is not a direct reference to that cell in the formula. To ensure any changes in data are caught, the formula is marked volatile. The point here is, ensure all SUMIF functions are built so that size of range and sum_range have the same dimensions.
Volatility from Nesting
Some formulas will be marked volatile since they contain a nested volatile function, and will be recalculated even though the formula is not using the result returned by the volatile function. For example, the expression =IF(1<2,TRUE,TODAY()) will be marked as volatile and will recalculate on every recalculation event, even though the logical_test argument will never return FALSE, and therefore the TODAY() function will in reality never be called.
Another example, using a newer formula in Excel, introduced in Excel 2016: =IFS(1=1,"Condition A",A1=2,RAND()). Since 1 will always equal 1, the second condition will never be evaluated, but since the second value, dependent on that condition is volatile, the expression is marked as volatile.
- James Clarkson