Friday, December 9, 2011

HDR procedural starfield Pt4

I think I've settled on a format for my stars; it's somewhat unconventional, but much lighter than float4 (128 bits). It's a YRGB hybrid (56 bits): a 32-bit float for brightness plus three 8-bit colour channels.
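To make the size difference concrete, here is a minimal sketch of what such a layout could look like. The names `StarYRGB` and `Float4` are made up for illustration, and the `#pragma pack` is an assumption: without it most compilers pad the struct to 8 bytes.

```cpp
#include <cstdint>

// Hypothetical "YRGB" star: one HDR float plus three 8-bit colour channels.
#pragma pack(push, 1)          // no padding, so the struct is exactly 7 bytes
struct StarYRGB
{
    float        lum;          // HDR brightness (32 bits)
    std::uint8_t r;            // colour channels, 0-255 (8 bits each)
    std::uint8_t g;
    std::uint8_t b;
};
#pragma pack(pop)

// The float4 it replaces: four 32-bit floats.
struct Float4 { float r, g, b, a; };

static_assert(sizeof(StarYRGB) == 7,  "4 + 3 bytes = 56 bits");
static_assert(sizeof(Float4)   == 16, "4 x 4 bytes = 128 bits");
```

Packing trades alignment for size, so if access speed matters more than memory, leaving the struct at its natural 8 bytes is a reasonable alternative.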

After reading up on YUV (or YCrCb) a little more, I realized that to use it properly all three components (Y, U and V) should ideally be the same bit depth, and you need all three of them to do the conversion. So making Y a float and U and V unsigned chars would save me a bunch of memory during star generation, but it would create a huge headache for colourization.

So by making them all floats I'm not saving much, and by making them all uchars I'm losing my HDR. And because I'm NOT too worried about obscene optimizations, I figured there wasn't much of a saving between 2 uchars and 3 uchars. Thus I settled on keeping my stars HDR using a float, and keeping my colours RGB using uchars.

Once I'm ready to write to file then I can do all the fancy conversions :)

Tuesday, December 6, 2011

HDR procedural starfield Pt3

I've been thinking about how I'm generating my stars, and for some reason, once I decided float4 was the way to go, I stuck with it during my rewrite of the app. However, when I think about it, the only reasons I did so were colour variation, SSE, and the fact that it fits nicely into D3DFMT_A32B32G32R32F.

Float4 being:

__declspec(align(16)) struct float4
{
float r;
float g;
float b;
float a;
};

But the more I think about it, all I really need is a float and some 8-bit colours. It would save on memory, and as it is I'm not using SSE for anything other than turning a float into a float4 to plug back into the master star map. Like so:

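//-----------------------------------------
//Use SSE to do some conversion into float4
//-----------------------------------------
float4 f4Res = {0,0,0,0};
__m128 result = _mm_set1_ps(starLum); //copy starLum into all four lanes
//MSVC inline assembly; only available in 32-bit builds
_asm
{
movaps xmm0, result
movaps f4Res, xmm0
}
//movaps needs 16-byte alignment, which float4's align(16) provides
*_star = f4Res;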
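For comparison, the same broadcast can be sketched with intrinsics alone, which avoids the inline assembly and should build on compilers other than MSVC. This assumes an x86 target with SSE; `alignas(16)` stands in for `__declspec(align(16))`, and `splatLuminance` is a hypothetical helper name.

```cpp
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_set1_ps, _mm_store_ps

struct alignas(16) float4 { float r, g, b, a; };

// Write starLum into all four channels of *star.
inline void splatLuminance(float starLum, float4* star)
{
    __m128 result = _mm_set1_ps(starLum);  // (starLum, starLum, starLum, starLum)
    _mm_store_ps(&star->r, result);        // aligned 16-byte store into the struct
}
```

Calling `splatLuminance(starLum, _star)` would then replace the temporary, the `_asm` block and the final copy.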

I may end up doing a bit more reworking and start using a single float for brightness, then keeping track of colour variance with UV, essentially making everything YUV in the back end. Then, by the time I'm ready to render to texture, I'd convert it to a 128-bit RGBA DDS.

It's worth a try to speed things up a bit and save memory. After all, working with six 128-bit 1024x1024 (or larger) arrays can be slow.
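A quick back-of-the-envelope check of the potential saving, as a sketch. The 16-byte figure is the float4 above; the 6-byte figure assumes a 32-bit float Y plus 8-bit U and V, which is a guess at the layout rather than something the post pins down.

```cpp
#include <cstddef>

constexpr std::size_t kArrays = 6;            // six arrays, as in the post
constexpr std::size_t kTexels = 1024 * 1024;  // 1024x1024 entries each

// Total bytes across all six arrays for a given per-star size.
constexpr std::size_t totalBytes(std::size_t bytesPerStar)
{
    return kArrays * kTexels * bytesPerStar;
}

constexpr std::size_t kFloat4Total = totalBytes(16);  // 100,663,296 bytes = 96 MiB
constexpr std::size_t kYuvTotal    = totalBytes(6);   //  37,748,736 bytes = 36 MiB
```

Under those assumptions the working set drops from 96 MiB to 36 MiB, before any padding the compiler might add.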