correct MMX-optimized variant of VP3 IDCT, with comments (thank you