Float4 being:
__declspec(align(16)) struct float4
{
    float r;
    float g;
    float b;
    float a;
};
But the more I think about it, all I really need is a float and some 8-bit colours. That would save memory, and as it stands I'm not using SSE for anything other than turning a float into a float4 to plug back into the master star map. Like so:
//-----------------------------------------
//Use SSE to do some conversion into float4
//-----------------------------------------
float4 f4Res = {0,0,0,0};
__m128 result = _mm_set1_ps(starLum); //copy starLum into all four lanes
_asm
{
    movaps xmm0, result //load the four lanes into xmm0
    movaps f4Res, xmm0  //aligned store, relies on float4 being 16-byte aligned
}
*_star = f4Res;
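As an aside, the same broadcast can probably be done without the inline asm block, which MSVC only accepts in 32-bit builds. What follows is a minimal sketch, not what the project currently does: splat_lum is a made-up helper name, and alignas(16) stands in for __declspec(align(16)) so the snippet compiles outside MSVC too.

```cpp
#include <cassert>
#include <cstdint>
#include <xmmintrin.h>  // SSE intrinsics: _mm_set1_ps, _mm_store_ps

// Portable stand-in for the __declspec(align(16)) float4 shown earlier.
struct alignas(16) float4
{
    float r, g, b, a;
};

// Broadcast one luminance value into all four channels of *dst.
// splat_lum is a hypothetical helper, not part of the original code.
inline void splat_lum(float4* dst, float starLum)
{
    // _mm_store_ps needs a 16-byte aligned address, just like movaps.
    assert(reinterpret_cast<std::uintptr_t>(dst) % 16 == 0);
    _mm_store_ps(&dst->r, _mm_set1_ps(starLum));
}
```

Calling splat_lum(_star, starLum) would then write straight into the star map entry and skip the f4Res temporary, assuming _star points at a 16-byte aligned float4.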
I may end up reworking things a bit more: use a single float for brightness and keep track of colour variance with UV, essentially making everything YUV in the back end. Then, once I'm ready to render to texture, convert it to a 128-bit RGBA DDS.
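To make the YUV idea a little more concrete, here is a rough sketch of the conversion step that would run just before rendering. It assumes the classic BT.601 analogue YUV coefficients and keeps U and V as floats; the names star_yuv, rgba_f and yuv_to_rgba are invented for illustration, and the real back end may well pick a different colour matrix or store U and V in 8 bits.

```cpp
// Hypothetical back-end star: brightness plus two colour-difference terms.
struct star_yuv
{
    float y;  // luminance
    float u;  // blue-difference chroma
    float v;  // red-difference chroma
};

// Hypothetical 4 x 32-bit pixel, matching a 128-bit RGBA texel.
struct rgba_f
{
    float r, g, b, a;
};

// Convert one star to RGBA using the BT.601 analogue YUV inverse matrix
// (an assumption, chosen only because it is the common textbook form).
constexpr rgba_f yuv_to_rgba(const star_yuv& s)
{
    return {
        s.y + 1.13983f * s.v,                    // red
        s.y - 0.39465f * s.u - 0.58060f * s.v,   // green
        s.y + 2.03211f * s.u,                    // blue
        1.0f                                     // alpha fixed to opaque
    };
}
```

With U and V both at zero the result is a neutral grey where r, g and b all equal the brightness, so a star with no colour variance costs nothing extra to describe.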
It's worth a try to speed things up a bit and save memory. After all, six 128-bit 1024x1024 arrays come to 96 MB (16 MB each), and working with them, or with larger ones, can be slow.
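For a sense of scale, here is a small sketch of the float-plus-8-bit-colours layout mentioned earlier, along with the memory arithmetic. The struct name star8, its field order and the map_bytes helper are all hypothetical; the only point is that such an element would be 8 bytes rather than 16.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical compact layout: one float of brightness plus 8-bit colour.
struct star8
{
    float        lum;  // brightness, full float precision
    std::uint8_t r;    // colour channels, 0-255
    std::uint8_t g;
    std::uint8_t b;
    std::uint8_t a;
};

// 4 bytes of float + 4 bytes of colour: half of a 16-byte float4.
static_assert(sizeof(star8) == 8, "star8 should pack into 8 bytes");

// Bytes needed for `maps` square maps of side x side elements,
// each element being `elem` bytes.
constexpr std::size_t map_bytes(std::size_t side, std::size_t elem,
                                std::size_t maps)
{
    return side * side * elem * maps;
}
```

Under these assumptions, map_bytes(1024, sizeof(star8), 6) works out to 48 MB, half of what six float4 maps of the same size need.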