[Here], I compared floating-point performance under different configurations for Delphi 2007 and XE3. Delphi 2007 produces quite good 32-bit floating-point code. Delphi XE3, however, when generating 64-bit code for single-precision arithmetic, emits a lot of redundant code that converts the operands to double and then converts the result back to single. This is time-consuming: the figures show that XE3 by default generates very slow single-precision 64-bit code, and that its double-type code is a lot faster than its single-type code. Why? Because it generates ‘cvtss2sd’ assembly instructions, which convert single to double. This is intended to keep as much precision as possible, because a single-type variable loses accuracy quickly over a long series of computations. The following shows the extra conversion code to double that is generated for each single-type operation.
Delphi XE3 has an undocumented compiler switch, EXCESSPRECISION. Its default is ‘ON’, which makes the compiler generate the conversion code, like above, during single-type computation. If it is turned off, the compiler generates faster code without the conversions. The following are the remarks for this compiler switch.
{$EXCESSPRECISION OFF} is specific to the x64 platform, and the code generated with it contains no conversion code, like below.
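As a rough illustration of where the directive might go in source, here is a minimal sketch. The unit name `FastSingleMath` and the `Dot` function are hypothetical and only there to give the switch some single-type arithmetic to act on. The `CPUX64` conditional is assumed to be the right guard so that the 32-bit compiler never sees the switch.

```delphi
unit FastSingleMath; // hypothetical unit name, for illustration only

// The switch is specific to x64, so guard it with the CPUX64 conditional.
{$IFDEF CPUX64}
  {$EXCESSPRECISION OFF}
{$ENDIF}

interface

// Dot product computed entirely in single precision.
function Dot(const A, B: array of Single): Single;

implementation

function Dot(const A, B: array of Single): Single;
var
  I: Integer;
begin
  Result := 0;
  // Assumes A and B have the same length.
  for I := 0 to High(A) do
    Result := Result + A[I] * B[I];
end;

end.
```

The directive is placed near the top of the file here so that it covers every routine in the unit.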
It is surely a lot faster, but how much faster? We time the program, as shown in here, this time with the option to turn EXCESSPRECISION off.
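If you want to repeat this kind of measurement yourself, a timing harness could look roughly like the sketch below. It is not the benchmark used for the figures in this post. The program name, the `Accumulate` loop, and the iteration count are all made up for illustration, and `TStopwatch` from `System.Diagnostics` is just one convenient way to take the timing.

```delphi
program SingleTiming; // hypothetical program name

{$APPTYPE CONSOLE}

{$IFDEF CPUX64}
  {$EXCESSPRECISION OFF} // comment this line out to time the default (ON)
{$ENDIF}

uses
  System.Diagnostics;

// A loop that does nothing but single-precision multiply and add.
function Accumulate(const N: Integer): Single;
var
  I: Integer;
  X, Step: Single;
begin
  Result := 0;
  X := 0.5;
  Step := 0.25;
  for I := 1 to N do
  begin
    Result := Result + X * Step;
    X := X + Step;
  end;
end;

var
  SW: TStopwatch;
  R: Single;
begin
  SW := TStopwatch.StartNew;
  R := Accumulate(100000000); // iteration count chosen arbitrarily
  SW.Stop;
  // Print the result so the computation is visibly used.
  Writeln('Result : ', R:0:2);
  Writeln('Elapsed: ', SW.ElapsedMilliseconds, ' ms');
end.
```

Build it for the 64-bit Windows target twice, once with the directive and once without, and compare the elapsed times in both DEBUG and RELEASE configurations.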
With EXCESSPRECISION turned off, the code is clearly faster in both DEBUG and RELEASE mode. The effect of INLINE, however, is less clear, and it is even worse in DEBUG mode. So when you move a single-type application to XE3 for 64-bit executables, it is recommended to turn EXCESSPRECISION off. The precision loss, in my opinion, can be ignored. I would suggest making OFF the default in a future release of the Delphi compiler.
–EOF (The Ultimate Computing & Technology Blog) —