What kind of CPU instruction would you like to have, and why?

Come up with any super useful, crazy or just silly stuff, but keep it sort of realistic.

Here is an example: an instruction that applies a mask to some data so that the result is all the masked bits gathered on the right (or left, I don't care).

0x01011101(data)

0x01100101(mask)

0x00001011(result)

See? It's the first, third, sixth and seventh bits of the data (counting from the right), gathered on the right (for the sake of it) in the result.
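For what it's worth, x86's BMI2 extension has an instruction with this behaviour, PEXT (parallel bits extract). The snippet below is only a rough software sketch of the idea, not how any hardware does it; `gather_bits` and `check_post_example` are made-up names, and the example values are the binary patterns from the post rewritten as real hex.

```c
#include <assert.h>
#include <stdint.h>

/* Software sketch of the proposed "gather masked bits" instruction.
 * Walks the mask from its lowest set bit upward and packs the matching
 * data bits contiguously at the right-hand (low) end of the result. */
static uint32_t gather_bits(uint32_t data, uint32_t mask)
{
    uint32_t result = 0;
    uint32_t out_bit = 1;                      /* next free result position */

    while (mask != 0) {
        uint32_t lowest = mask & (~mask + 1u); /* isolate lowest set mask bit */
        if (data & lowest)
            result |= out_bit;
        out_bit <<= 1;
        mask &= mask - 1;                      /* clear that mask bit */
    }
    return result;
}

/* The example from the post:
 * data 01011101 = 0x5D, mask 01100101 = 0x65, result 00001011 = 0x0B. */
static void check_post_example(void)
{
    assert(gather_bits(0x5Du, 0x65u) == 0x0Bu);
}
```

The loop runs once per set mask bit, which is exactly the cost a single instruction would hide.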

The problem is I don't know where or why to apply that yet.

fld dword st(0),eax

fst eax,dword st(0)

I would've killed for these.

one more index register on the 6510 could work wonders

The ARM CPU instruction set is just fine; it's the most elegant one to write assembly for...

http://en.wikipedia.org/wiki/ARM_architecture

You only need one instruction: http://en.wikipedia.org/wiki/One_instruction_set_computer
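To make the one-instruction idea concrete, here is a minimal sketch of an interpreter for subleq ("subtract and branch if less than or equal to zero"), one of the best-known single-instruction designs. The memory layout, the name `run_subleq`, the step limit and the "negative address halts" convention are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* subleq a, b, c: mem[b] -= mem[a]; if the result is <= 0, jump to c,
 * otherwise fall through to the next instruction (pc + 3).
 * A negative jump target halts (a common convention, assumed here).
 * Returns the number of instructions executed. */
static long run_subleq(long *mem, size_t len, long pc, long max_steps)
{
    long steps = 0;

    assert(mem != NULL);
    while (pc >= 0 && (size_t)pc + 2 < len && steps < max_steps) {
        long a = mem[pc], b = mem[pc + 1], c = mem[pc + 2];

        /* stop on out-of-range operands instead of touching bad memory */
        if (a < 0 || b < 0 || (size_t)a >= len || (size_t)b >= len)
            break;
        mem[b] -= mem[a];
        pc = (mem[b] <= 0) ? c : pc + 3;
        steps++;
    }
    return steps;
}
```

As a usage example, the 12-word program `{9, 11, 3, 11, 10, 6, 11, 11, -1, 7, 5, 0}` adds the value at address 9 to the value at address 10 in three instructions, using address 11 as scratch space.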

Sounds like this thread

Currently: a CORDIC instruction:

if (carry)

{

dest += arg1 >> arg2;

} else {

dest -= arg1 >> arg2;

}

That would help *lots* in converting complex numbers from rectangular to polar form and vice versa.

Could also do super fast fixed-point sin/cos/sqrt/atan2/log/exp.
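As a rough illustration of the sin/cos case, the sketch below emulates the proposed instruction in C and uses it three times per step in CORDIC rotation mode. Everything here is an assumption for the sake of the example: the names `cordic_op`, `cordic_sincos` and `CORDIC_K`, the 16.16 fixed-point format, and the choice of 17 steps. The table holds atan(2^-i) in 16.16 radians.

```c
#include <assert.h>
#include <stdint.h>

/* Emulation of the proposed instruction:
 * carry set -> dest += arg1 >> arg2, carry clear -> dest -= arg1 >> arg2.
 * Assumes >> on a negative int32_t is an arithmetic shift (true on common
 * compilers, but implementation-defined in C). */
static void cordic_op(int32_t *dest, int32_t arg1, unsigned arg2, int carry)
{
    assert(arg2 < 32);
    if (carry) *dest += arg1 >> arg2;
    else       *dest -= arg1 >> arg2;
}

/* atan(2^-i) in 16.16 radians for i = 0..16 */
static const int32_t atan_tab[17] = {
    0xc910, 0x76b2, 0x3eb7, 0x1fd6, 0x0ffb, 0x07ff, 0x0400, 0x0200, 0x0100,
    0x0080, 0x0040, 0x0020, 0x0010, 0x0008, 0x0004, 0x0002, 0x0001,
};

#define CORDIC_K 0x9b75 /* about 0.60725 in 16.16: 1 / gain of the steps */

/* Rotation mode: sine and cosine (16.16) of angle (16.16 radians).
 * Only converges for roughly |angle| <= 1.74 rad, so callers are assumed
 * to reduce the angle to [-pi/2, pi/2] first. */
static void cordic_sincos(int32_t angle, int32_t *sin_out, int32_t *cos_out)
{
    int32_t x = CORDIC_K, y = 0, z = angle;

    for (unsigned i = 0; i < 17; i++) {
        int pos = (z >= 0);                  /* plays the role of the carry */
        int32_t x_old = x;

        cordic_op(&x, y, i, !pos);           /* x -= y >> i when z >= 0 */
        cordic_op(&y, x_old, i, pos);        /* y += x >> i when z >= 0 */
        cordic_op(&z, atan_tab[i], 0, !pos); /* drive z toward zero */
    }
    *sin_out = y;
    *cos_out = x;
}
```

Called with an angle of 34315 (about pi/6 in 16.16), this should return a sine close to 32768, i.e. 0.5, give or take a few low bits of rounding error.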

The reason is that I execute lots of these in a tight loop:

Put coordinates (a complex number) in x, y; get sqrt(x*x+y*y) and atan2(y,x) back in r and theta.

**Code:**

```
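#include <stdint.h>

/* 16.16 fixed-point multiply. Not shown in the original post; this
   definition is an assumption. */
static int fixmul16 (int a, int b)
{
    return (int) (((int64_t) a * b) >> 16);
}

/* atan(2^-shift) in 16.16 radians. The 45-degree step appears three times,
   which extends the convergence range to cover the full circle. */
static const int atantab[19] =
{
    0x0000c910, 0x0000c910, 0x0000c910,
    0x000076b2, 0x00003eb7, 0x00001fd6, 0x00000ffb,
    0x000007ff, 0x00000400, 0x00000200, 0x00000100,
    0x00000080, 0x00000040, 0x00000020, 0x00000010,
    0x00000008, 0x00000004, 0x00000002, 0x00000001,
};
/* 1 / (total gain of the 19 steps) in 16.16, about 0.3036 */
#define scaleK 0x00004dba

/* Rectangular to polar: *r = sqrt(x*x+y*y), *theta = atan2(y,x) in 16.16
   radians. Assumes >> on a negative int is an arithmetic shift, and that
   the inputs are small enough for the gain of about 3.3 not to overflow. */
void cordicXY2RT ( int x, int y, int *r, int *theta)
{
    int i;
    int z = 0;
    for (i=0; i<16+3; i++)
    {
        int xa,ya;
        int angle = atantab[i];
        int shift = i-2;
        if (shift < 0) shift = 0;
        ya = y >> shift;
        xa = x >> shift;
        /* rotate towards the positive x axis, accumulating the angle in z */
        if (y < 0) { x -= ya; y += xa; z -= angle; }
        else { x += ya; y -= xa; z += angle; }
    }
    *r = fixmul16 (x, scaleK);
    *theta = z;
}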
```

NOP

MsK', but this time you aren't constrained to 8 bits. Go wild!

A single cycle NOP on 6502.

a "traverse kd-tree node"-instruction would be neat!

**Quote:**

a "traverse kd-tree node"-instruction would be neat!

Doc.K. told me about a funky instruction on the KC85 series called "PUSE", a mnemonic for "Punkt setzen" (set point, a.k.a. draw pixel: X, Y, Colour). That'd be a great instruction to have when dealing with bitplanes.
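To show what such an instruction would save on a bitplane display, here is a sketch of the work a software "set point" has to do. It is not the KC85 routine; the 320x200 four-plane layout, the leftmost-pixel-is-MSB convention and the names `planes` and `set_point` are all assumptions for this example.

```c
#include <assert.h>
#include <stdint.h>

enum { WIDTH = 320, HEIGHT = 200, PLANES = 4, ROW_BYTES = WIDTH / 8 };

/* One bit per pixel per plane; plane p holds bit p of every colour index. */
static uint8_t planes[PLANES][HEIGHT * ROW_BYTES];

/* Set the pixel at (x, y) to the given colour index (0..15). */
static void set_point(unsigned x, unsigned y, unsigned colour)
{
    assert(colour < (1u << PLANES));
    if (x >= WIDTH || y >= HEIGHT)
        return;                                  /* clip off-screen points */

    unsigned offset = y * ROW_BYTES + (x >> 3);  /* byte holding the pixel */
    uint8_t bit = (uint8_t)(0x80u >> (x & 7));   /* leftmost pixel = MSB */

    for (unsigned p = 0; p < PLANES; p++) {
        if (colour & (1u << p))
            planes[p][offset] |= bit;            /* set colour bit p */
        else
            planes[p][offset] &= (uint8_t)~bit;  /* clear colour bit p */
    }
}
```

Each pixel costs an address calculation, a mask lookup and one read-modify-write per plane, which is the part a dedicated instruction could fold into one step.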

What kind of sick fuck prefixes a binary number with 0x?

.. better than prefixing it with (______0______)

skurk, because I mix 'em up all the time. Is it 0000f, no, must be something else.

I'm with Gargaj :)

**Quote:**

fld dword st(0),eax

fst eax,dword st(0)

I'd be completely satisfied with fmov st(0), 0.4535902f

The CPU would have a 4x4 array of 8-bit registers and could do all logic and arithmetic on 8, 16, 24 or 32 bits, say A[0] = op(A[1], A[2]). In the 16- or 24-bit case, it would operate on any pair/triplet of 8-bit registers. And of course with conditional execution for each instruction, ARM style. That would allow some very dense routines.

What about a single-byte instruction that displays a cool demo? :)

cache levels got big. how about explicit process cache page/window and load/store instructions and "fire and forget" batch memory writeback operations. also fpu to sse and general register moves. yeah.

hardware GC

**Quote:**

Doc.K. told me about a funky instruction on the KC85 series called "PUSE", a mnemonic for "Punkt setzen" (set point, a.k.a. draw pixel: X, Y, Colour). That'd be a great instruction to have when dealing with bitplanes.

You might like a certain "bset" instruction too. Almost as simple to use as the PUSE instruction for drawing pixels. Emphasis on almost. :)

