Implementing Table-Lookup-Based Trig Functions


Implementing Table-Lookup-Based Trig Functions



For a videogame I'm implementing in my spare time, I've tried implementing my own versions of sinf(), cosf(), and atan2f(), using lookup tables. The intent is to have implementations that are faster, although with less accuracy.

My initial implementation is below. The functions work, and return good approximate values. The only problem is that they are slower than calling the standard sinf(), cosf(), and atan2f() functions.

So, what am I doing wrong?

// Geometry.h includes definitions of PI, TWO_PI, etc., as // well as the prototypes for the public functions #include "Geometry.h"  namespace {     // Number of entries in the sin/cos lookup table     const int SinTableCount = 512;      // Angle covered by each table entry     const float SinTableDelta = TWO_PI / (float)SinTableCount;      // Lookup table for Sin() results     float SinTable[SinTableCount];      // This object initializes the contents of the SinTable array exactly once     class SinTableInitializer {     public:         SinTableInitializer() {             for (int i = 0; i < SinTableCount; ++i) {                 SinTable[i] = sinf((float)i * SinTableDelta);             }         }     };     static SinTableInitializer sinTableInitializer;      // Number of entries in the atan lookup table     const int AtanTableCount = 512;      // Interval covered by each Atan table entry     const float AtanTableDelta = 1.0f / (float)AtanTableCount;      // Lookup table for Atan() results     float AtanTable[AtanTableCount];      // This object initializes the contents of the AtanTable array exactly once     class AtanTableInitializer {     public:         AtanTableInitializer() {             for (int i = 0; i < AtanTableCount; ++i) {                 AtanTable[i] = atanf((float)i * AtanTableDelta);             }         }     };     static AtanTableInitializer atanTableInitializer;      // Lookup result in table.     // Preconditions: y > 0, x > 0, y < x     static float AtanLookup2(float y, float x) {         assert(y > 0.0f);         assert(x > 0.0f);         assert(y < x);          const float ratio = y / x;         const int index = (int)(ratio / AtanTableDelta);         return AtanTable[index];         }  }  float Sin(float angle) {     // If angle is negative, reflect around X-axis and negate result     bool mustNegateResult = false;     if (angle < 0.0f) {         mustNegateResult = true;         angle = -angle;     }      // Normalize angle so that it is in the interval (0.0, PI)     while (angle >= TWO_PI) {         angle -= TWO_PI;     }      const int index = (int)(angle / SinTableDelta);     const float result = SinTable[index];      return mustNegateResult? (-result) : result; }  float Cos(float angle) {     return Sin(angle + PI_2); }  float Atan2(float y, float x) {     // Handle x == 0 or x == -0     // (See atan2(3) for specification of sign-bit handling.)     if (x == 0.0f) {         if (y > 0.0f) {             return PI_2;         }         else if (y < 0.0f) {             return -PI_2;         }         else if (signbit(x)) {             return signbit(y)? -PI : PI;         }         else {             return signbit(y)? -0.0f : 0.0f;         }     }      // Handle y == 0, x != 0     if (y == 0.0f) {         return (x > 0.0f)? 0.0f : PI;     }      // Handle y == x     if (y == x) {         return (x > 0.0f)? PI_4 : -(3.0f * PI_4);     }      // Handle y == -x     if (y == -x) {         return (x > 0.0f)? -PI_4 : (3.0f * PI_4);     }      // For other cases, determine quadrant and do appropriate lookup and calculation     bool right = (x > 0.0f);     bool top = (y > 0.0f);     if (right && top) {         // First quadrant         if (y < x) {             return AtanLookup2(y, x);         }         else {             return PI_2 - AtanLookup2(x, y);         }     }     else if (!right && top) {         // Second quadrant         const float posx = fabsf(x);         if (y < posx) {             return PI - AtanLookup2(y, posx);         }         else {             return PI_2 + AtanLookup2(posx, y);         }     }     else if (!right && !top) {         // Third quadrant         const float posx = fabsf(x);         const float posy = fabsf(y);         if (posy < posx) {             return -PI + AtanLookup2(posy, posx);         }         else {             return -PI_2 - AtanLookup2(posx, posy);         }     }     else { // right && !top         // Fourth quadrant         const float posy = fabsf(y);         if (posy < x) {             return -AtanLookup2(posy, x);         }         else {             return -PI_2 + AtanLookup2(x, posy);         }     }      return 0.0f; } 



Slow SQL update could use some help

1:



When a web applicattion generates excel and image files dynamically
"Premature optimization is the root of all evil" - Donald Knuth.
Why not always use psyco for Python code?
Nowadays compilers provide very efficient intrinsics for trigonometric functions that get the best from modern processors (SSE etc.), which explains why you can hardly beat the built-in functions.


how to use Memcache to speed up the PHP?
Don't lose too much time on these parts and instead concentrate on the real bottlenecks that you can spot with a profiler..
When do compilers inline C++ code?


How much memory does a Java object use when all its members are null?


I need the most recent record in a join (PostgresSQL)

2:



common pre-mature optimizations and micro-optimizations
Remember you have a co-processor ...

you would have seen an increase in speed if it were 1993 ...

however today you will struggle to beat native intrinsics.. Try viewing the disassebly to sinf..


3:


Someone has already benchmarked this, and it looks as though the Trig.Math functions are already optimized, and will be faster than any lookup table you can come up with:. http://www.tommti-systems.de/go.html?http://www.tommti-systems.de/main-Dateien/reviews/languages/benchmarks.html. (They didn't use anchors on the page so you have to scroll about 1/3 of the way down).


4:


I'm worried by this place:.
// Normalize angle so that it is in the interval (0.0, PI) while (angle >= TWO_PI) {     angle -= TWO_PI; } 
But you can: Add time-meters to all functions, write special performance tests, run performance tests, print report of time test..

I think you will know answer after this tests.

. Also you could use some profiling tools such as AQTime..


5:


The built-in functions are very well optimized already, so it's going to be REALLY tough to beat them.

Personally, I'd look elsewhere for places to gain performance.. That said, one optimization I can see in your code:.
// Normalize angle so that it is in the interval (0.0, PI) while (angle >= TWO_PI) {     angle -= TWO_PI; } 
Could be replaced with:.
angle = fmod(angle, TWO_PI); 



94 out of 100 based on 74 user ratings 874 reviews