Mathematical function log10can be quite time consuming. If it is not available, for some languages such as VBScript, you have to compute it base on the fact.
and therefore, we can compute log10 base on natural log.
and since can only be computed once and stored for next computation, we can replace it with the constant. Since multiplication is faster than division in most cases, we can replace the whole division thing with the constant 0.4342944819.
Sometimes, we just want the integer part of the log10, so we can use a faster function base on the fact that it is only 10 possible integer values for unsigned 32-bit integers. The maximum value that can be represented by unsigned 32-bit integers (e.g. uint in C#, LongWord in Delphi) is 4294967295, the log10 value of which is 9. i.e. 4294967295 is larger than but less than .
So instead of computing directly the log10 value, we can easily check which range the given number falls into.
1 2 3 4 5 6 7 | static uint log10(uint v) { return (v >= 1000000000u) ? 9 : (v >= 100000000u) ? 8 : (v >= 10000000u) ? 7 : (v >= 1000000u) ? 6 : (v >= 100000u) ? 5 : (v >= 10000u) ? 4 : (v >= 1000u) ? 3 : (v >= 100u) ? 2 : (v >= 10u) ? 1u : 0u; } |
static uint log10(uint v) { return (v >= 1000000000u) ? 9 : (v >= 100000000u) ? 8 : (v >= 10000000u) ? 7 : (v >= 1000000u) ? 6 : (v >= 100000u) ? 5 : (v >= 10000u) ? 4 : (v >= 1000u) ? 3 : (v >= 100u) ? 2 : (v >= 10u) ? 1u : 0u; }
As shown in above C# code, the function log10 takes the unsigned 32-bit integer v and quickly checks which range it falls into, from a larger set of numbers to a smaller one. So if the input is uniform equally distributed, it will return values for the first few (e.g. 3) checks. i.e. The return value is zero when the input is less than 10 (10 possible inputs); the return value is one when the input is less than 100 (90 possible inputs) and so on. So we should check the largest group set first.
How does this perform? We time the performance in the C# .NET environment.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 | using System; using System.Diagnostics; namespace ConsoleApplication2 { class Program { static uint log10(uint v) { return (v >= 1000000000u) ? 9 : (v >= 100000000u) ? 8 : (v >= 10000000u) ? 7 : (v >= 1000000u) ? 6 : (v >= 100000u) ? 5 : (v >= 10000u) ? 4 : (v >= 1000u) ? 3 : (v >= 100u) ? 2 : (v >= 10u) ? 1u : 0u; } static void Main(string[] args) { Stopwatch w = new Stopwatch(); w.Start(); for (uint i = 0; i <= 200000000; i++) { uint x = log10(i); } w.Stop(); Console.WriteLine(w.ElapsedMilliseconds); w.Restart(); for (uint i = 0; i <= 200000000; i++) { uint y = (uint)Math.Log10(i); } w.Stop(); Console.WriteLine(w.ElapsedMilliseconds); Console.ReadLine(); } } } |
using System; using System.Diagnostics; namespace ConsoleApplication2 { class Program { static uint log10(uint v) { return (v >= 1000000000u) ? 9 : (v >= 100000000u) ? 8 : (v >= 10000000u) ? 7 : (v >= 1000000u) ? 6 : (v >= 100000u) ? 5 : (v >= 10000u) ? 4 : (v >= 1000u) ? 3 : (v >= 100u) ? 2 : (v >= 10u) ? 1u : 0u; } static void Main(string[] args) { Stopwatch w = new Stopwatch(); w.Start(); for (uint i = 0; i <= 200000000; i++) { uint x = log10(i); } w.Stop(); Console.WriteLine(w.ElapsedMilliseconds); w.Restart(); for (uint i = 0; i <= 200000000; i++) { uint y = (uint)Math.Log10(i); } w.Stop(); Console.WriteLine(w.ElapsedMilliseconds); Console.ReadLine(); } } }
It is no surprise that the log10 is roughly three times faster. 3507 milliseconds versus 10327 milliseconds (the traditional System.Math.Log10).
References:
1. http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10Obvious
–EOF (The Ultimate Computing & Technology Blog) —
loading...
Last Post: Access Violation Bug Migrating to 64-bit
Next Post: Add Formatted Text to Word from VBScript
my bench, compiled in release mode (thus probably the compiler is inlining the function) are 638 millisec for your version versus 6917 for the Math.Log version, that’s over 10x faster.
Which compiler are you using?
The default compiler that ships with Visual studio 2015 professional.
I’d rather use binary search at least 🙂
I read your comment and i too thought it was correct, but actually a binary search would be slower: integer having log10(x)>=9 are (4,294,967,296 – 999,999,999) = 3,294,967,297 that’s about 76% of the total, having log10(x)>=8 are 20%, having log10(x)>=7 are 2%, having 6 are 0,2%… so it make sense to do the test in that exact order as the autor.
Binary Search is only good for large N, for small N, it is faster without if-checks.
> faster
You can’t say that for sure without benchmarking. Both the binary search and chained-if methods suffer from branch misprediction, which is very far from ideal.
It’s exactly 1 branch misprediction, which is circa 16 cycles on x86_64. Hardly a big deal.
Untrue! Integers in a real program do not occur with a uniform distribution, and smaller numbers are more common.