One more HLSL trick


While implementing a fairly complex pixel shader for multi-cascade shadow maps on shader model 2.0, I got the usual error message, “Arithmetic instruction limit of 64 exceeded”, so once again I had to free up an instruction slot.

I had code that tries to find the best shadow map for the current pixel, something like:

//float2 map[] is a u/v pair; if it is "inside" the texture, I can use it
if (map[0].x < 0 || map[0].x > 1 ||
    map[0].y < 0 || map[0].y > 1)

This code was converted to:

//float2 map[] is a u/v pair; if it is "inside" the texture, I can use it
if( any (map[0]-saturate(map[0] ) ) )

This takes far fewer instructions, so now I have some free slots!
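To convince yourself that the two conditions agree, here is a small CPU-side sketch in Python (not HLSL). The helpers `saturate`, `outside_explicit` and `outside_saturate` are hypothetical stand-ins written for this illustration; they only mimic what the HLSL intrinsics `saturate` and `any` do on a float2.

```python
def saturate(uv):
    """Clamp every component to [0, 1], like the HLSL intrinsic."""
    return tuple(min(max(c, 0.0), 1.0) for c in uv)


def outside_explicit(uv):
    """The original four comparisons."""
    return uv[0] < 0 or uv[0] > 1 or uv[1] < 0 or uv[1] > 1


def outside_saturate(uv):
    """any(uv - saturate(uv)): true if any component was changed by the clamp."""
    return any(c - s != 0 for c, s in zip(uv, saturate(uv)))


samples = [(0.0, 0.0), (1.0, 1.0), (0.5, 0.25), (-0.1, 0.5),
           (0.5, 1.5), (2.0, -3.0), (1.0, 1.0001)]
for uv in samples:
    print(uv, outside_explicit(uv), outside_saturate(uv))
```

A component inside [0, 1] is left untouched by the clamp, so the difference is exactly zero; anything outside leaves a non-zero difference, which is what `any` detects.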




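Q: In Visual Studio, how do you set a conditional breakpoint on -1.#IND000000000000?

 variable == -1.#IND000000000000 //this cannot work, of course

A: -1.#IND is a NaN, and a NaN never compares equal to itself, so use this condition instead:

variable != variable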
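The breakpoint condition itself is a C++ expression, but the property it relies on comes from IEEE 754 and can be checked in any language. A minimal Python sketch (the helper name `is_nan` is made up for this illustration):

```python
import math


def is_nan(value):
    """A NaN is the only floating-point value that differs from itself."""
    return value != value


nan = float("nan")
print(nan == nan)           # comparing a NaN with == never succeeds
print(is_nan(nan))          # the self-comparison trick catches it
print(is_nan(1.5), is_nan(math.inf))
```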

More productive programming


(inspired by 10 best tricks of fooling myself to work and Create more productive environment at your desk)

  1. Drink a good tea constantly.
    (It helps you concentrate. I recommend Pu-erh or Oolong.)
  2. Drink your tea with no sugar.
    Not only does it taste much better (this is the only reasonable way to drink tea), but when it is accidentally spilled on a keyboard, the keyboard is recoverable and the keys do not stick. Thus you will be significantly more productive than somebody who drinks tea or coffee with sugar (or Pepsi, or whatever).
  3. Keep your browser closed; get updates using RSS readers.
    I used to spend a lot of time checking news sites / /. / HN / Merriam-Webster’s Word of the Day / German word of the day / Wikipedia watch list etc. each time “the code is compiling”, and then kept on surfing. Now RSS saves me a lot of time.
  4. Ask a co-worker for help if you are stuck on a problem for more than 30 minutes.
    Usually you will find the solution even before you finish explaining the issue. If not, you give your co-worker a chance to solve your problem and feel like a genius.
  5. Use Ginkgo biloba, if you still feel dumb.
    I don’t know whether its effect is fully psychosomatic or not, but it helps.

HLSL trick #2


Saving some expensive instructions in shader model 2…
(Trick #1 is described here)

Instead of, e.g.,

float getTotalDiffuse()
{
 float l1 = getDiffuse(g_light1);
 float l2 = getDiffuse(g_light2);
 float l3 = getDiffuse(g_light3);
 float l4 = getDiffuse(g_light4);
 return l1+l2+l3+l4;
}

write:

float getTotalDiffuse()
{
 float4 l = {getDiffuse(g_light1),
             getDiffuse(g_light2),
             getDiffuse(g_light3),
             getDiffuse(g_light4)};
 return dot (l, float4(1,1,1,1));
}

The trick is that when you want to sum some floats (or vector components), it may be cheaper to take a dot product with the vector (1, … , 1).
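The identity behind this is simply that every term of the dot product gets multiplied by one. A tiny Python sketch of the arithmetic (the `dot` helper and the sample values are made up for illustration; on the GPU the gain, if any, comes from the dot product being a single instruction):

```python
def dot(a, b):
    """Component-wise multiply and add, like the HLSL dot() intrinsic."""
    return sum(x * y for x, y in zip(a, b))


diffuse = (0.25, 0.5, 0.125, 1.0)   # four hypothetical per-light terms
ones = (1.0, 1.0, 1.0, 1.0)

print(dot(diffuse, ones))
print(diffuse[0] + diffuse[1] + diffuse[2] + diffuse[3])
```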

HLSL trick


If you write HLSL for Shader Model 2, you know you are limited to 32 constant registers. Here is a trick that helped me save one.

Suppose you want to write this in a pixel shader:

 float4x4 l_Color=
   {
      tex2D(samTex0, In.Tex0),
      tex2D(samTex1, In.Tex1),
      tex2D(samTex2, In.Tex2),
      tex2D(samTex3, In.Tex3)
   };
   l_Color[g_BlendLayer] = foo(l_Color[g_BlendLayer]); 
   //OOPS! illegal syntax
   out.Color = combine (l_Color);

But in Shader Model 2 you cannot write “array[index] = blah”.

Second try:

  float4 blended = foo(l_Color[g_BlendLayer]);
   switch (BlendLayer)
   {
      case 0: l_Color[0] = blended; break;
      case 1: l_Color[1] = blended; break;
      case 2: l_Color[2] = blended; break;
      case 3: l_Color[3] = blended; break;
   }
   //OOPS! illegal syntax

Of course, there is no switch/case in ShaderModel 2…

  float4 blended = foo(l_Color[g_BlendLayer]);
   switch (BlendLayer)
   {
      case 0: l_Color[0] = blended; break;
      case 1: l_Color[1] = blended; break;
      case 2: l_Color[2] = blended; break;
      case 3: l_Color[3] = blended; break;
   }
   //OOPS! illegal syntax

Same problem: “array[index] = blah“.

  float4 blended = foo(l_Color[g_BlendLayer]);
  if (0 == BlendLayer) l_Color[0] = blended; else
  if (1 == BlendLayer) l_Color[1] = blended; else
  if (2 == BlendLayer) l_Color[2] = blended; else
  if (3 == BlendLayer) l_Color[3] = blended; 
   //OOPS! Too bad

Although this code is syntactically correct, it is a bad idea: conditions are expensive, and nested ifs are even worse (just look at the disassembly). Too many instructions.

  float4 blended = foo(l_Color[g_BlendLayer]);
  if (0 == BlendLayer) l_Color[0] = blended; 
  if (1 == BlendLayer) l_Color[1] = blended; 
  if (2 == BlendLayer) l_Color[2] = blended; 
  if (3 == BlendLayer) l_Color[3] = blended; 
   //Much better!

This code (the same as the previous one, but without the “else”s) is much better. But still, I get “error X5589: Invalid const register num: 32. Max allowed is 31.”

  float4 blended = foo(l_Color[g_BlendLayer]);
  if (0 == BlendLayer--) l_Color[0] = blended; 
  if (0 == BlendLayer--) l_Color[1] = blended; 
  if (0 == BlendLayer--) l_Color[2] = blended; 
  if (0 == BlendLayer--) l_Color[3] = blended; 
   //It works!
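The decrementing version selects exactly the same layer as the version that compares against 0, 1, 2 and 3; it just compares against zero every time. A CPU-side Python sketch of that equivalence (both function names are hypothetical, and this only models the control flow, not the register allocation of the shader compiler):

```python
def select_with_constants(layer):
    """if (0 == layer) ... if (1 == layer) ... and so on."""
    return [layer == i for i in range(4)]


def select_with_decrement(layer):
    """if (0 == layer--) ... repeated four times."""
    hits = []
    for _ in range(4):
        hits.append(layer == 0)  # the comparison sees the old value...
        layer -= 1               # ...then the post-decrement is applied
    return hits


for layer in range(4):
    print(layer, select_with_decrement(layer))
```

Each row of the output has a single True, in the position of the selected layer.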

Learning the assembly code really helps.
Here is how you get .asm files from your HLSL shaders,
using Microsoft’s fxc tool:

fxc /Gfp /Zi /T ps_2_0 /Fc out.asm l:\efx\mini.fx /E PS_Test

or, to get nice HTML:

fxc /Gfp /Zi /T ps_2_0 /Cc /Fc out.asm.html l:\efx\mini.fx /E PS_Test



unordered facts.

Etymology of etymology: Greek ἐτυμολογία (etumologíā); from ἔτυμον (étumon), meaning “true sense”, and -λογία (-logía), meaning “study”; from λόγος (lógos), meaning “speech, account, reason.” (wikipedia)

Pronunciation of pronunciation is IPA: /pɹəˌnʌnsiˈeɪʃən/, SAMPA: /pr@%nVnsi”eIS@n/ (wiktionary)

Doublet for doublet is twin.

The term antonym is synonymous with opposite.

Antonym to antonym is synonym.

There is no synonym to synonym, AFAIK.

TLA is TLA. (Three-letter acronym)

FLAB is FLAB (Four-letter abbreviation)

Onomatopoeic is not onomatopoeic.

Awkward is an awkward word.

RAS syndrome is an example of RAS syndrome.

Portmanteau is Portmanteau.


Heterological is neither heterological nor autological.



Playing with JavaScript.

There are ten matchsticks. You must move them so that they form 5 crossed pairs. On each turn you can move one matchstick over exactly two other matchsticks (a crossed pair counts as two!).
10-matchsticks puzzle
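The puzzle is small enough to brute-force. Below is a minimal depth-first search sketch in Python, under two assumptions that the text does not spell out: the matchsticks start in a single row, and empty gaps left behind by moved matchsticks are ignored when counting. All function names are made up for this illustration.

```python
def moves_from(piles):
    """Yield (source, target, new_piles) for every legal move.

    piles lists the stacks left to right with gaps dropped:
    1 is a single matchstick, 2 is a crossed pair.
    """
    for src, size in enumerate(piles):
        if size != 1:
            continue  # only a single matchstick may be moved
        for step in (-1, 1):
            jumped = 0
            pos = src + step
            while 0 <= pos < len(piles) and jumped < 2:
                jumped += piles[pos]
                pos += step
            # pass over exactly two matchsticks, then land on a single one
            if jumped == 2 and 0 <= pos < len(piles) and piles[pos] == 1:
                new = list(piles)
                new[pos] = 2
                del new[src]
                yield src, pos, tuple(new)


def solve(piles=(1,) * 10):
    """Return a list of (source, target) moves, or None if stuck."""
    if all(size == 2 for size in piles):
        return []
    for src, dst, nxt in moves_from(piles):
        rest = solve(nxt)
        if rest is not None:
            return [(src, dst)] + rest
    return None


def replay(moves, piles=(1,) * 10):
    """Apply moves one by one, checking that each of them is legal."""
    for move in moves:
        piles = next(n for s, d, n in moves_from(piles) if (s, d) == move)
    return piles


print(solve())
```

The indices in each move are 0-based positions in the current row of stacks, not the original matchstick numbers.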




Let N be any number divisible by neither 2 nor 5.
Does there exist a k (for each such N) such that 10^k – 1 is divisible by N?
Or: for every N coprime with 10, is there a number of the form 99..9 that is divisible by N?


Yes. The smallest such k is the multiplicative order of 10 modulo N. The sequence can be found in The On-Line Encyclopedia of Integer Sequences.




-- all numbers that cannot be divided by 2 or 5
seq1 :: [Integer]
seq1 = filter (\a->(a `mod` 10) `elem` [1,3,7,9]) [1..]

-- find the length of the shortest 99..9 that can be divided by n
findNum :: Integer -> Integer
findNum n = head $ [x | x<-[1..], (10^x-1) `mod` n == 0]

-- prints the sequence
main :: IO ()
main = print $ take 100 $ map findNum seq1
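The search above recomputes a large power of ten for every candidate. As a hedged alternative, here is a Python sketch that keeps only the remainder of 10^k modulo N and multiplies it by 10 at each step; the function name `order_of_ten` is made up for this illustration.

```python
from math import gcd


def order_of_ten(n):
    """Smallest k >= 1 such that 10**k - 1 is divisible by n."""
    if n < 1 or gcd(n, 10) != 1:
        raise ValueError("n must be a positive integer coprime with 10")
    k = 1
    remainder = 10 % n
    while remainder != 1 % n:      # 1 % n is 0 only for n == 1
        remainder = (remainder * 10) % n
        k += 1
    return k


for n in (3, 7, 11, 13, 17, 37):
    k = order_of_ten(n)
    print(n, k, (10 ** k - 1) % n)
```

The loop always terminates for N coprime with 10, because the remainders can only cycle through values that are themselves coprime with N and must eventually return to 1.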

The Treachery of Computer Images


This is not a painting

Hello, World! program in HLSL


If anyone has been missing a real “Hello, World” program written in a shader language, here it is.

It is written in HLSL for shader model 3.0 (in an effects file); it is a pixel shader only, and it takes only u,v as input.

A more serious challenge would be to write a quine, but I have other things to do.

const int L[18]={
	0xad27,	0xa925,	0xED25,	0xa925,	0xaDB7,	0x0000,
	0x85eE,	0x852b,	0xad2b,	0xf92e,	0x712b,	0x51e9,	0x0000,
	0xC3CF,	0xc36f,	0xc326,	0xc360,	0xf3cf
};
float4 PS(float2 tex : TEXCOORD0):COLOR0
{
	float4 output = float4(0,tex.x,tex.y,1);
	int x= (1-tex.x)*16;
	int y= (1-tex.y)*18;
	//no bitwise operations on shader model 3 yet :(
	int mask=L[y]*2;
	for (int i = 0; i<16&& i <x; i++)
		mask*=0.5;
	output.r = ( frac(0.5*mask) < 0.1);
	return output;
}
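To see what the table encodes without a GPU, here is a Python sketch that prints each 16-bit row as ASCII art, and that also mimics the shader's way of testing a bit by repeated halving instead of bitwise operators. The names `ROWS`, `row_to_text` and `bit_by_halving` are made up for this illustration.

```python
ROWS = [
    0xad27, 0xa925, 0xED25, 0xa925, 0xaDB7, 0x0000,
    0x85eE, 0x852b, 0xad2b, 0xf92e, 0x712b, 0x51e9, 0x0000,
    0xC3CF, 0xc36f, 0xc326, 0xc360, 0xf3cf,
]


def row_to_text(row):
    """Render one 16-bit row, most significant bit first."""
    return format(row, "016b").replace("1", "#").replace("0", ".")


def bit_by_halving(row, x):
    """Mimic the shader: double once, halve x times, test the lowest bit."""
    mask = row * 2
    for _ in range(min(x, 16)):
        mask = int(mask * 0.5)   # integer truncation, like 'int mask *= 0.5'
    return mask % 2              # stands in for the frac(0.5 * mask) test


for row in ROWS:
    print(row_to_text(row))
```

Set bits form the letters; the shader turns the red channel on where the tested bit is zero, so the text shows up as the pixels without red.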