For hackers ...
Mar. 25th, 2009 11:43 am
Intel has announced the instruction set for its new vector-supercomputer-disguised-as-a-graphics-card 'Larrabee'.
http://software.intel.com/en-us/articles/prototype-primitives-guide/ has a C++ implementation using the data types and intrinsic names which the real thing will use.
It has a full set of the instructions you would expect, including count-set-bits and find-first-set-bit; it has vector gather and scatter (finally!), it has the normal-for-Intel irritating omissions (add-with-carry for 32-bit numbers only?), and it has one or two really quite surprising instructions:
BITINTERLEAVE21_PI - 2:1 Bit-Interleave Int32 Vectors
Performs an element-by-element bitwise interleave, using a 2:1 pattern, between int32 vector v2 and int32 vector v3. The low 21 bits from elements in v2 are interleaved with the low 11 bits from elements in v3 to form a vector of 32-bit values. Bits alternate 2:1, so that source elements A and B combine bitwise this way (high to low):
A20 B10 A19 A18 B9 A17 A16 B8 … A5 A4 B2 A3 A2 B1 A1 A0 B0
I will buy a chocolate pudding at the Carlton next Thursday for the person who gives the least ludicrously contrived example in which this instruction might be useful. There is also a BITINTERLEAVE11_PI, which takes alternate bits from the two source elements.
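To make the 2:1 pattern concrete, here is a minimal scalar sketch of what one 32-bit lane of BITINTERLEAVE21_PI would compute, going only by the description quoted above. The function name `bitinterleave21` is made up for illustration; it is not Intel's intrinsic and not taken from the prototype library, and the real instruction operates on whole vectors rather than single elements.

```cpp
#include <cstdint>

// Hypothetical scalar model of one lane of the 2:1 bit interleave.
// Reading the quoted pattern from the low end (... A1 A0 B0), each group
// of three output bits k = 0..10 holds, from low to high:
//   bit 3k     = B[k]
//   bit 3k + 1 = A[2k]
//   bit 3k + 2 = A[2k + 1]
// The last group has no third bit: A20 lands in bit 31 and there is no A21.
constexpr std::uint32_t bitinterleave21(std::uint32_t a, std::uint32_t b) {
    std::uint32_t out = 0;
    for (int k = 0; k < 11; ++k) {
        out |= ((b >> k) & 1u) << (3 * k);                    // B[k]
        out |= ((a >> (2 * k)) & 1u) << (3 * k + 1);          // A[2k]
        if (2 * k + 1 < 21) {                                 // only A0..A20 are used
            out |= ((a >> (2 * k + 1)) & 1u) << (3 * k + 2);  // A[2k + 1]
        }
    }
    return out;
}

// Compile-time spot checks against the documented bit order.
static_assert(bitinterleave21(0u, 1u) == 1u, "B0 goes to bit 0");
static_assert(bitinterleave21(1u, 0u) == 2u, "A0 goes to bit 1");
static_assert(bitinterleave21(0x1FFFFFu, 0x7FFu) == 0xFFFFFFFFu,
              "21 + 11 source bits fill all 32 output bits");
```

Bits above the low 21 of A and the low 11 of B are simply ignored in this model, which matches the "low 21 bits" and "low 11 bits" wording of the description.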
no subject
Date: 2009-03-25 12:09 pm (UTC)
no subject
Date: 2009-03-25 12:19 pm (UTC)
(S)
no subject
Date: 2009-03-25 12:33 pm (UTC)
no subject
Date: 2009-03-25 02:01 pm (UTC)
no subject
Date: 2009-03-25 04:09 pm (UTC)
In a graphics card? I don't know, as I have a cellular background. There may be interesting graphics algorithms that can use such specialized instructions.
no subject
Date: 2009-03-25 04:31 pm (UTC)
no subject
Date: 2009-03-26 02:08 am (UTC)
See http://eprints.iisc.ernet.in/5698/ and http://ieeexplore.ieee.org/Xplore/login.jsp?url=http%3A%2F%2Fieeexplore.ieee.org%2Fiel5%2F4489030%2F4489031%2F04489200.pdf%3Farnumber%3D4489200&authDecision=-203 for examples of bitwise interleave uses. This one specifically can put the CRC in the high bits and a coded stream in the lower bits, so that ECC is transmitted along with the signal in a single DWORD.
no subject
Date: 2009-03-26 02:08 am (UTC)
no subject
Date: 2009-03-26 02:10 am (UTC)