Cire's build of TA3D
- Balthazar
- Moderator
- Posts: 2055
- Joined: Wed Nov 01, 2006 4:31 pm
- Location: Russian Federation
- Contact:
Cire++, if you manage to build an MSVC version of TA3D, could you send it to me or post a link so we can compare them?
Well, I got it to compile, but I'm still working on stabilizing it. As soon as it's somewhat stable I'll release it.
Now, Zuff gave me control of the networking, but as usual I like to get things working well first and move or recode the things I see as troublesome. Right now I'm attacking the vector and matrix classes and operations (the heart of what makes 3D). My goal is to rewrite these so we get 2x the operations out of them without breaking backwards compatibility.
I'm using templates, one of which will be a specialized class that uses enhanced CPU instruction sets (if present). The downside is that this enhanced class will probably only work on Windows at first, though it could easily be ported to other OSes. It will work by testing for enhanced CPU instruction sets at runtime; if they are present it will use the enhanced classes, otherwise it will fall back to the standard classes.
I've also been messing around with removing AllegroGL calls and translating them to direct OpenGL calls. In some areas I gained nearly 18 frames per second, in other areas only 3 to 8.
Another thing I'm looking at is forcing alignment in the areas that need it, especially once we start passing crap through the network tubes.
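A minimal sketch of what pinning down a layout for the network could look like. This is an illustration only, not code from TA3D: the message name `UnitMoveMsg`, its fields, and the helpers `WriteMsg`/`ReadMsg` are all hypothetical. The idea is to use fixed-width types, make the padding explicit, and let `static_assert` fail the build if a compiler disagrees.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical message layout; the name and fields are illustrative only.
// Fixed-width types plus an explicit padding byte leave the compiler
// nothing to pad on its own.
struct UnitMoveMsg
{
    std::uint16_t unit_id;  // offset 0
    std::uint8_t  type;     // offset 2
    std::uint8_t  pad;      // offset 3, explicit padding (send as 0)
    float         x, y, z;  // offsets 4, 8, 12
};

// Fail the build if a compiler lays the struct out differently.
static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");
static_assert(offsetof(UnitMoveMsg, x) == 4, "unexpected padding before x");
static_assert(sizeof(UnitMoveMsg) == 16, "unexpected struct size");

// Copy the message into a raw byte buffer of at least sizeof(UnitMoveMsg).
// Byte order is NOT converted here; both ends must share endianness.
inline void WriteMsg(const UnitMoveMsg &msg, unsigned char *out)
{
    std::memcpy(out, &msg, sizeof(msg));
}

// Rebuild a message from a raw byte buffer.
inline UnitMoveMsg ReadMsg(const unsigned char *in)
{
    UnitMoveMsg msg;
    std::memcpy(&msg, in, sizeof(msg));
    return msg;
}
```

Going through `memcpy` rather than casting the buffer pointer avoids alignment problems when the bytes arrive at an odd address.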
So in essence I'm still mucking about and familiarizing myself more and more with the engine. Overall I'm pleased with the progress that Zuff has made, but there are still a lot of areas that need to be addressed. LOL, the last time I stuck my nose in the source we ended up ripping apart much of the code and rewriting it. Progress on new stuff slowed down, but I'm sure Zuff will agree that the project benefited greatly from it, with the addition of a lot of new classes and code that made life a lot easier. Sadly we didn't finish that rewrite; certain key areas still need to be addressed, and I'm compiling a list of those now.
++Cire.
Females: impossible to live with, most powerful money reducing agent known to man, 99% of the time they drive us insane; yet somehow we desire to have as many as we can.
- zuzuf
- Administrateur - Site Admin
- Posts: 3281
- Joined: Mon Oct 30, 2006 8:49 pm
- Location: Toulouse, France
- Contact:
Yeah, there is still lots of work to do and lots of things to improve. But I'm not sure I understand how you want to optimize the vector & matrix classes. I tried to optimize them myself, but in the end my compiler still produces faster code!
As far as AllegroGL is concerned, the only calls to it within the renderer are for text drawing (which uses the FONT object from AllegroGL through the GLX_FONT class). I agree it's not very fast, but only a few lines of text are drawn per frame. Since the rest is pure OpenGL, how did you manage to get up to 18 fps more? Anyway, that's good news.
Currently, if you want to optimize something, you could also look at the unit rendering code, which is a bit slow: with fifty units on screen it can drop to ~40 fps, whereas with no units on screen in the same game I can get 200 fps!
I am going to commit some code to SVN, with some improvements to the path-following code and some bug fixes.
PS: since we now use a quick restart mechanism to update some config options such as fullscreen, screen size, color depth, FSAA, ... the binary is now called ta3d-bin.exe and is launched through ta3d.bat, so you can type ta3d.bat and it will start and monitor ta3d-bin.exe.
=>;-D Penguin Powered
I've already started looking at the unit engine, Zuff, and there are many, many areas we can target to improve efficiency, both in speed and in resources.
Now, with respect to the new vector/matrix code, I've used templates. They probably won't be 'super faster', but they are much easier to use and expand upon, and everything is wrapped up neatly inside the classes themselves, freeing external global cache for additional cache code.
Here's an early draft of the 2D and 3D vectors (not ready for use yet). The super class for SSE operations is not implemented yet.
You can't really write a better vector that runs faster just by doing small things, but if you look at CPU instructions you can use some of the operations that CPUs provide to improve performance. What I mean is that most CPUs now come with the SSE instruction set (implemented quite a while ago), and some of its instructions are designed for vector math. The same goes for the extensions on other CPUs. Most modern compilers can optimize for these instruction sets, but usually they don't; it's invoking these instructions that really sets things in motion.
It's true that it's damn hard to write machine code that will outperform the code a modern compiler generates, but that's not what you are doing when you use instruction sets. What you are doing is taking advantage of special instructions on the CPU that the compiler most of the time does not make use of. You'll see what I mean later.
Anyhow, here is a rough draft of the new vector code. Comments welcome.
Code:
#include <cmath> // ::sqrt, ::sin, ::cos, ::tan
template <class T>
class TA3D_Vector2
{
typedef TA3D_Vector2<T> MyType;
public:
T x;
T y;
__forceinline TA3D_Vector2()
{
return;
}
__forceinline ~TA3D_Vector2()
{
return;
}
__forceinline TA3D_Vector2( const MyType &Vec )
{
Set( Vec.x, Vec.y );
return;
}
__forceinline TA3D_Vector2( const T C1, const T C2)
{
Set( C1, C2 );
return;
}
__forceinline TA3D_Vector2( const T vAll )
{
Set( vAll, vAll );
return;
}
__forceinline void Set( const T v1, const T v2)
{
x = v1;
y = v2;
return;
}
// Begin Operators
// Assignment operators
__forceinline MyType &operator= (const MyType &rhs)
{
Set( rhs.x, rhs.y ); // Set() takes two components, not a vector
return (*this);
}
// Equality operators:
__forceinline bool operator== (const MyType &rhs)
{
return ((x == rhs.x) && (y == rhs.y));
}
friend __forceinline bool operator== (const MyType &lhs, const MyType &rhs)
{
return ((lhs.x == rhs.x) && (lhs.y == rhs.y));
}
__forceinline bool operator!= (const MyType &rhs)
{
return (!(*this == rhs));
}
friend __forceinline bool operator!= (const MyType &lhs, const MyType &rhs)
{
return (!(lhs == rhs));
}
// Addition operators:
friend __forceinline MyType operator+ (const MyType &lhs, const MyType &rhs)
{
return (MyType (lhs.x + rhs.x, lhs.y + rhs.y));
}
__forceinline MyType &operator+= (MyType &rhs)
{
x += rhs.x;
y += rhs.y;
return (*this);
}
// Subtraction operators:
friend __forceinline MyType operator- (const MyType &lhs, const MyType &rhs)
{
return (MyType (lhs.x - rhs.x, lhs.y - rhs.y));
}
__forceinline MyType &operator-= (MyType &rhs)
{
x -= rhs.x;
y -= rhs.y;
return (*this);
}
// Multiplication operators:
friend __forceinline MyType operator* (const MyType &lhs, const MyType &rhs)
{
return (MyType (lhs.x * rhs.x, lhs.y * rhs.y));
}
friend __forceinline MyType operator* (const MyType &lhs, const T rhs)
{
return (MyType (lhs.x * rhs, lhs.y * rhs));
}
friend __forceinline MyType operator* (const T lhs, const MyType &rhs)
{
return (MyType (lhs * rhs.x, lhs * rhs.y));
}
__forceinline MyType &operator*= (const MyType &rhs)
{
x *= rhs.x;
y *= rhs.y;
return (*this);
}
__forceinline MyType &operator*= (const T rhs)
{
x *= rhs;
y *= rhs;
return (*this);
}
// Division operators:
friend __forceinline MyType operator/ (const MyType &lhs, const MyType &rhs)
{
return (MyType (lhs.x / rhs.x, lhs.y / rhs.y));
}
friend __forceinline MyType operator/ (const MyType &lhs, const T rhs)
{
return (MyType (lhs.x / rhs, lhs.y / rhs));
}
__forceinline MyType &operator/= (const MyType &rhs)
{
x /= rhs.x;
y /= rhs.y;
return (*this);
}
__forceinline MyType &operator/= (const T rhs)
{
x /= rhs;
y /= rhs;
return (*this);
}
// Sum ( X + Y )
__forceinline T Sum (void)
{
return (x + y);
}
// Dot ( class * class ).Sum
__forceinline T Dot (const MyType &rhs)
{
return (((*this) * rhs).Sum());
}
// Magnitude the square root of our sum
__forceinline T Magnitude (void)
{
return (::sqrt (Dot (*this)));
}
// Normalize x and y divided by magnitude
__forceinline void Normalize (void)
{
*this /= Magnitude();
return;
}
// get the normalized result without modifying vars
__forceinline MyType GetNorm (void)
{
return (*this / Magnitude());
}
// GetProject along axis vector.
__forceinline MyType GetProjection( MyType &AxisVec )
{
return ((this->Dot(AxisVec) / AxisVec.Dot(AxisVec)) * AxisVec);
}
// Get projections along x and y axis(s)
__forceinline void GetProjections (MyType &AxisVec, MyType &OutX, MyType &OutY)
{
OutX = this->GetProjection (AxisVec);
OutY = (*this) - OutX;
return;
}
// Sin result based on another vector
friend __forceinline MyType sin (const MyType &Vec)
{
MyType Return;
Return.x = ::sin (Vec.x);
Return.y = ::sin (Vec.y);
return (Return);
}
// sin result of our vector
__forceinline void sin (void)
{
x = ::sin (x);
y = ::sin (y);
return;
}
// cos
friend __forceinline MyType cos (const MyType &Vec)
{
MyType Return;
Return.x = ::cos (Vec.x);
Return.y = ::cos (Vec.y);
return (Return);
}
__forceinline void cos (void)
{
x = ::cos (x);
y = ::cos (y);
return;
}
// tan
friend __forceinline MyType tan (const MyType &Vec)
{
MyType Return;
Return.x = ::tan (Vec.x);
Return.y = ::tan (Vec.y);
return (Return);
}
__forceinline void tan (void)
{
x = ::tan( x );
y = ::tan( y );
return;
}
// square root
friend __forceinline MyType sqrt (const MyType &Vec)
{
MyType Return;
Return.x = ::sqrt (Vec.x);
Return.y = ::sqrt (Vec.y);
return (Return);
}
__forceinline void sqrt (void)
{
x = ::sqrt( x );
y = ::sqrt( y );
return;
}
}; // template class<T> class TA3D_Vector2;
template <class SType > class TA3D_Vector4;
template <class SType > class TA3D_Matrix4;
template <class T>
class TA3D_Vector3 : public TA3D_Vector2<T> // public, so x and y stay accessible
{
public:
typedef TA3D_Vector3<T> MyType;
typedef TA3D_Vector2<T> MyPType;
T z;
// Constructors:
__forceinline TA3D_Vector3()
{
return;
}
__forceinline ~TA3D_Vector3()
{
return;
}
// Constructor passing another 3D Vector.
__forceinline TA3D_Vector3 (const MyType &Vec)
{
Set( Vec.x, Vec.y, Vec.z );
return;
}
// Constructor passing a 2D Vector
__forceinline TA3D_Vector3( const MyPType &Vec )
{
Set( Vec.x, Vec.y, 0 );
return;
}
/*
// Conversion to a 4D Vector
__forceinline operator TA3D_Vector4<T> ()
{
return (TA3D_Vector4<T> (x, y, z, 0));
}
*/
__forceinline TA3D_Vector3( T v1, T v2, T v3)
{
Set( v1, v2, v3 );
return;
}
__forceinline TA3D_Vector3( T vAll )
{
Set(vAll, vAll, vAll);
return;
}
// WARNING: does not zero the z coord
__forceinline void Set( T C1, T C2 )
{
return MyPType::Set( C1, C2 );
}
__forceinline void Set( T C1, T C2, T C3 )
{
x = C1;
y = C2;
z = C3;
return;
}
//Begin Operators:
// Assignment Operators:
__forceinline MyType &operator= (const MyType &rhs)
{
x = rhs.x;
y = rhs.y;
z = rhs.z;
return (*this);
}
__forceinline bool operator== (const MyType &rhs)
{
return ( (z == rhs.z) && (x == rhs.x) && (y == rhs.y) );
}
// allow vec3 == vec2 operator (compares x,y)
__forceinline bool operator== (const MyPType &rhs)
{
return MyPType::operator==(rhs);
}
__forceinline bool operator!= (const MyType &rhs)
{
return (!(*this == rhs));
}
// allow vec3 != vec2 operator
__forceinline bool operator!= (const MyPType &rhs)
{
return (!(*this == rhs));
}
// Addition Operators:
friend __forceinline MyType operator+ (MyType &lhs, MyType &rhs)
{
return (MyType (lhs.x + rhs.x, lhs.y + rhs.y, lhs.z + rhs.z));
}
friend __forceinline MyType operator+ (MyPType &lhs, MyType &rhs)
{
return (MyType (lhs.x + rhs.x, lhs.y + rhs.y, rhs.z));
}
friend __forceinline MyType operator+ (MyType &lhs, MyPType &rhs)
{
return (MyType (lhs.x + rhs.x, lhs.y + rhs.y, lhs.z ));
}
__forceinline MyType &operator+= (MyType &rhs)
{
x += rhs.x;
y += rhs.y;
z += rhs.z;
return (*this);
}
__forceinline MyType &operator+= (MyPType &rhs)
{
x += rhs.x;
y += rhs.y;
return (*this);
}
//Subtraction operators:
friend __forceinline MyType operator- (MyType &lhs, MyType &rhs)
{
return (MyType (lhs.x - rhs.x, lhs.y - rhs.y, lhs.z - rhs.z));
}
friend __forceinline MyType operator- (MyPType &lhs, MyType &rhs)
{
return (MyType (lhs.x - rhs.x, lhs.y - rhs.y, -rhs.z)); // 2D lhs has z == 0
}
friend __forceinline MyType operator- (MyType &lhs, MyPType &rhs)
{
return (MyType (lhs.x - rhs.x, lhs.y - rhs.y, lhs.z));
}
__forceinline MyType &operator-= (MyType &rhs)
{
x -= rhs.x;
y -= rhs.y;
z -= rhs.z;
return (*this);
}
__forceinline MyType &operator-= (MyPType &rhs)
{
x -= rhs.x;
y -= rhs.y;
return (*this);
}
// Multiplication operators:
friend __forceinline MyType operator* (MyType &lhs, MyType &rhs)
{
return (MyType (lhs.x * rhs.x, lhs.y * rhs.y, lhs.z * rhs.z));
}
friend __forceinline MyType operator* (MyType &lhs, T rhs)
{
return (MyType (lhs.x * rhs, lhs.y * rhs, lhs.z * rhs));
}
friend __forceinline MyType operator* (MyPType &lhs, MyType &rhs)
{
return (MyType (lhs.x * rhs.x, lhs.y * rhs.y, rhs.z));
}
friend __forceinline MyType operator* (MyType &lhs, MyPType &rhs)
{
return (MyType (lhs.x * rhs.x, lhs.y * rhs.y, lhs.z));
}
friend __forceinline MyType operator* (T lhs, MyType &rhs)
{
return (MyType (lhs * rhs.x, lhs * rhs.y, lhs * rhs.z));
}
__forceinline MyType &operator*= (MyType &rhs)
{
x *= rhs.x;
y *= rhs.y;
z *= rhs.z;
return (*this);
}
__forceinline MyType &operator*= (MyPType &rhs)
{
x *= rhs.x;
y *= rhs.y;
return (*this);
}
__forceinline MyType &operator*= ( T rhs )
{
z *= rhs;
MyPType::operator*=(rhs);
return (*this);
}
// Division operators:
friend __forceinline MyType operator/ (MyType &lhs, MyType &rhs)
{
return (MyType (lhs.x / rhs.x, lhs.y / rhs.y, lhs.z / rhs.z));
}
friend __forceinline MyType operator/ (MyType &lhs, T rhs)
{
return (MyType (lhs.x / rhs, lhs.y / rhs, lhs.z / rhs));
}
friend __forceinline MyType operator/ (MyPType &lhs, MyType &rhs)
{
return (MyType (lhs.x / rhs.x, lhs.y / rhs.y, rhs.z));
}
friend __forceinline MyType operator/ (MyType &lhs, MyPType &rhs)
{
return (MyType (lhs.x / rhs.x, lhs.y / rhs.y, lhs.z));
}
friend __forceinline MyType operator/ (T lhs, MyType &rhs)
{
return (MyType (lhs / rhs.x, lhs / rhs.y, lhs / rhs.z));
}
__forceinline MyType &operator/= (MyType &rhs)
{
x /= rhs.x;
y /= rhs.y;
z /= rhs.z;
return (*this);
}
__forceinline MyType &operator/= (MyPType &rhs)
{
x /= rhs.x;
y /= rhs.y;
return (*this);
}
__forceinline MyType &operator/= ( T rhs )
{
z /= rhs;
MyPType::operator/=(rhs); // divide x and y as well, not multiply
return (*this);
}
__forceinline T Sum( void )
{
return ( z + MyPType::Sum() );
}
__forceinline T Dot( MyType &rhs )
{
return (((*this) * rhs).Sum());
}
__forceinline T Magnitude (void)
{
return (::sqrt (Dot (*this)));
}
__forceinline void Normalize (void)
{
*this /= Magnitude();
return;
}
__forceinline MyType Cross3 (MyType &rhs)
{
return (MyType ((y * rhs.z) - (z * rhs.y),
(z * rhs.x) - (x * rhs.z),
(x * rhs.y) - (y * rhs.x)));
}
friend __forceinline MyType sin (MyType &Vec)
{
MyType Return( ::sin(Vec.x), ::sin(Vec.y), ::sin(Vec.z) );
return (Return);
}
__forceinline void sin (void)
{
x = ::sin (x);
y = ::sin (y);
z = ::sin (z);
return;
}
friend __forceinline MyType cos (MyType &Vec)
{
MyType Return( ::cos( Vec.x ), ::cos( Vec.y ), ::cos( Vec.z ) );
return (Return);
}
__forceinline void cos (void)
{
x = ::cos (x);
y = ::cos (y);
z = ::cos (z);
return;
}
friend __forceinline MyType tan (MyType &Vec)
{
MyType Return( ::tan( Vec.x ), ::tan( Vec.y ), ::tan( Vec.z ) );
return (Return);
}
__forceinline void tan (void)
{
x = ::tan (x);
y = ::tan (y);
z = ::tan (z);
return;
}
friend __forceinline MyType sqrt (MyType &Vec)
{
MyType Return( ::sqrt (Vec.x), ::sqrt (Vec.y), ::sqrt (Vec.z) );
return (Return);
}
__forceinline void sqrt (void)
{
x = ::sqrt (x);
y = ::sqrt (y);
z = ::sqrt (z);
return;
}
}; // Template class<T> TA3D_Vector3
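To make the SSE point from before the listing concrete, here is a small sketch. It is an illustration only, not the SSE super class mentioned above (which is not implemented yet). The names `Dot4` and `SKETCH_HAVE_SSE` are hypothetical. It shows a generic template with a `float` specialization that uses SSE intrinsics from `<xmmintrin.h>` when the build allows them, and the plain version otherwise.

```cpp
#include <cstddef>

// Only pull in the intrinsics when this build is allowed to use SSE.
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAVE_SSE 1
#else
#define SKETCH_HAVE_SSE 0
#endif

// Generic 4-component dot product: works for any arithmetic T.
template <class T>
struct Dot4
{
    static T Compute(const T *a, const T *b)
    {
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
    }
};

#if SKETCH_HAVE_SSE
// Specialization for float: one packed multiply, then a horizontal sum.
template <>
struct Dot4<float>
{
    static float Compute(const float *a, const float *b)
    {
        __m128 m = _mm_mul_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)); // 4 products at once
        __m128 shuf = _mm_shuffle_ps(m, m, _MM_SHUFFLE(2, 3, 0, 1)); // swap neighbours
        __m128 sums = _mm_add_ps(m, shuf);      // (m0+m1, m0+m1, m2+m3, m2+m3)
        shuf = _mm_movehl_ps(shuf, sums);       // bring m2+m3 down to lane 0
        sums = _mm_add_ss(sums, shuf);          // lane 0 = m0+m1+m2+m3
        return _mm_cvtss_f32(sums);
    }
};
#endif
```

Callers write `Dot4<float>::Compute(a, b)` either way. Note that this picks the path at compile time; choosing at runtime is a separate step.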
Indeed, SSE instruction sets improve speed greatly. It's one of the benefits Linux has over Windows, since Linux apps can be compiled to use the maximum the CPU supports.
But there's a downside to it. I can compile all my stuff with SSE4 instructions and have all the fancy new features they add, but if I run my program on a machine whose CPU doesn't support SSE4, my code will crash.
For example, Spring was originally compiled with SSE1 instructions, but one person complained loudly that his VIA CPU didn't support SSE1 and that it crashed for him as a result. Windows programs thus need to tone down their CPU-specific optimizations in order to be more portable.
Any program can use enhanced CPU instruction sets, under any OS, but you have to approach it intelligently.
What I do is test for CPU instruction sets at runtime: if they are present you use them, otherwise you fall back to the basic stuff.
As soon as I have the vector and matrix classes done I'll post some code to show how to do this. I'm not entirely sure it will work under Linux, though; it may need some alterations.
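In the meantime, a minimal sketch of such a runtime check with a fallback. This is an illustration, not the promised code. `CpuHasSSE`, `Add4Scalar`, `Add4SSE`, `SelectAdd4` and `SKETCH_CAN_BUILD_SSE` are hypothetical names. It assumes MSVC's `__cpuid` on Windows and the GCC/Clang builtin `__builtin_cpu_supports` elsewhere on x86; on any other platform it simply reports no SSE.

```cpp
#include <cstddef>

// Can this translation unit be compiled with SSE intrinsics at all?
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_CAN_BUILD_SSE 1
#else
#define SKETCH_CAN_BUILD_SSE 0
#endif

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
#endif

// Runtime test: does the CPU we are running on report SSE support?
inline bool CpuHasSSE()
{
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
    int regs[4] = {0, 0, 0, 0};
    __cpuid(regs, 1);                   // CPUID leaf 1: feature flags
    return (regs[3] & (1 << 25)) != 0;  // EDX bit 25 = SSE
#elif (defined(__GNUC__) || defined(__clang__)) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse") != 0;
#else
    return false;                       // unknown platform: assume no SSE
#endif
}

typedef void (*Add4Fn)(const float *a, const float *b, float *out);

// Standard path: plain C++, runs on any CPU.
inline void Add4Scalar(const float *a, const float *b, float *out)
{
    for (std::size_t i = 0; i < 4; ++i)
        out[i] = a[i] + b[i];
}

#if SKETCH_CAN_BUILD_SSE
// Enhanced path: one packed add for all four floats.
inline void Add4SSE(const float *a, const float *b, float *out)
{
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}
#endif

// Pick the implementation once, at startup, and keep the pointer around.
inline Add4Fn SelectAdd4()
{
#if SKETCH_CAN_BUILD_SSE
    if (CpuHasSSE())
        return &Add4SSE;
#endif
    return &Add4Scalar;
}
```

One caveat: for the fallback to be safe on a CPU without SSE, the rest of the program must be built without SSE enabled, and only the enhanced routines compiled with it (typically in their own source file).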
++Cire.
- zuzuf
- Administrateur - Site Admin
- Posts: 3281
- Joined: Mon Oct 30, 2006 8:49 pm
- Location: Toulouse, France
- Contact:
Sounds great. For now the win32 binary I build doesn't use SSE (to improve portability). I tested TA3D built with SSE and it's much faster (on an amd64/EM64T CPU in 64-bit mode GCC uses SSE by default, so I tested this on a 32-bit CPU).
Just one thing:
Currently TA3D uses operator% for the dot product and operator* for the cross product.
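For readers unfamiliar with that convention, a minimal sketch of what it means. The `Vec3` struct here is a hypothetical stand-in, not TA3D's actual vector class. Note that the draft posted above defines operator* as a component-wise multiply and provides Cross3 and Dot as named methods, so the two conventions would need to be reconciled.

```cpp
// Hypothetical stand-in type; TA3D's real vector class is not shown here.
struct Vec3
{
    float x, y, z;
};

// operator% as the dot product: a scalar result.
inline float operator%(const Vec3 &a, const Vec3 &b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// operator* as the cross product: a vector perpendicular to both inputs.
inline Vec3 operator*(const Vec3 &a, const Vec3 &b)
{
    Vec3 r = { a.y * b.z - a.z * b.y,
               a.z * b.x - a.x * b.z,
               a.x * b.y - a.y * b.x };
    return r;
}
```

With these, `X * Y` for the unit X and Y axes gives the unit Z axis, and `X % Y` is 0.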
If it needs some more work on Linux, we'll do it. The vector & matrix classes are very important, and if we can make them 1% faster then TA3D will run 1% faster.
OK, I'm just about done with the new classes, but now I want to go back through and redesign the thing because I made a lot of stupid key mistakes. The biggest thing I want to change is to store the components in an array instead of using members named x, y, z, and w.
I'll provide methods for accessing them as x, y, z, w but actually store them as an array. The main reason is that OpenGL likes arrays, so most graphics routines would execute faster if they could simply take the array rather than having to 'make' one each time we need it.
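A rough sketch of that array-backed layout, as an illustration only and not the redesigned class. `Vec3Array` and `SumOfThree` are hypothetical names. `SumOfThree` just stands in for any routine that takes a pointer to three floats, the way OpenGL's `glVertex3fv` does.

```cpp
#include <cstddef>

template <class T>
class Vec3Array
{
public:
    Vec3Array(T x, T y, T z)
    {
        v[0] = x;
        v[1] = y;
        v[2] = z;
    }

    // Named accessors keep the familiar x/y/z spelling.
    T &x() { return v[0]; }
    T &y() { return v[1]; }
    T &z() { return v[2]; }
    const T &x() const { return v[0]; }
    const T &y() const { return v[1]; }
    const T &z() const { return v[2]; }

    // Pointer to the contiguous components; nothing has to be copied
    // before handing the vector to an array-taking API.
    const T *data() const { return v; }

private:
    T v[3]; // actual storage
};

// Stand-in for an API that consumes three consecutive floats.
inline float SumOfThree(const float *p)
{
    return p[0] + p[1] + p[2];
}
```

A call such as `SumOfThree(pos.data())` passes the storage directly, which is the saving described above.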
I also wanna tweak some functions.
Hope to post new stuff soon.
++Cire.