Easy lookup/access to members in "an array" of different datatypes if possible

category: code [glöplog]

doomdoom: We're talking performance-wise here.

added on the 2012-02-13 16:01:41 by kbi

His solution isn't much, if any, slower than carrying pointers and type identification variables yourself. It is, however, more elegant.

(though for _simple_ Variant stuff I'd personally still go for type enum + union with atomic types + ptr)

added on the 2012-02-13 16:31:06 by superplek

doom: that should be possible with templates too...

added on the 2012-02-13 16:31:37 by raer

(there's the virtual function call hit)

added on the 2012-02-13 16:32:05 by superplek

raer: yeah indeed.

added on the 2012-02-13 16:33:35 by superplek

xTr1m: It's called a Tagged Union, and you're doing it wrong (as pointed out already).

Tagged unions can some times make sense (e.g for ASTs), but I tend to prefer to get rid of them if possible as well.

added on the 2012-02-13 16:39:04 by kusma

Polymorphism has a price, but this also isn't great, performance-wise:

Code:

struct DataEntry {
    union {
        float f;
        vector v;
        whatever w;
    } value;
    DataType type;
}

If a vector is three floats, and I'm betting it is much of the time, then every single-float entry gets 8 bytes of padding, just to accomodate vectors. If a "whatever" can be something larger, say a 4x4 matrix, then the union becomes hugely wasteful and very unkind to the caches. And either way you'll have to switch on the "type" member, which means a performance hit that's at least comparable to a virtual function call. So I'm really not seeing how this is so much more efficient than textbook polymorphism, but I do see it being much more difficult.

As for aligning the elements sequentially in memory (which xtr1m's method doesn't do anyway), heap allocation isn't as all-over-the-place as one might think, and nobody says you have to rely on default heap allocation anyway. E.g.

added on the 2012-02-13 16:40:27 by doomdoom

If you're aiming for a vector library, use Eigen and the std library. Should solve most of your problems.

added on the 2012-02-13 16:43:13 by raer

I had a similar problem when trying to create an interface for OGL uniforms. I chose an enum and a void * for storage, which might not be pretty, but works for all types of uniforms.
I TRIED using templates, but failed, to be honest...

added on the 2012-02-13 16:46:17 by raer

doom: but you'd be incorrectly assuming that a) the virtual table you have in each entry is the same size as the type value, and b) that the cost of the virtual call is the same as the switch.

but anyway. cant you pack all the elements of different types into their own separate buffers and index/order them more cleverly? :)

added on the 2012-02-13 17:14:46 by smash

Quote:

cant you pack all the elements of different types into their own separate buffers and index/order them more cleverly? :)

it's funny since early on i pointed out that it was an idea to take a few steps back and revise the plan. but as per usual things derail into a detailed discussion of implementing a, in this case most probably, fundamentally flawed approach :)

added on the 2012-02-13 17:36:56 by superplek

Quote:

And either way you'll have to switch on the "type" member, which means a performance hit that's at least comparable to a virtual function call.

smash said it already but for good measure: this is a completely bogus assumption.

added on the 2012-02-13 17:37:52 by superplek

So how about simply asking the right question back:

Rudi, your problem looks fishy. What do you need that array for? (and don't bother replying with "ints and floats". What purpose does the thing serve in the grander scheme of things?).

added on the 2012-02-13 17:49:20 by kb_

I am a sissy, and use "List<object>" in c#, which, for some reason, also allows for value types.

Fuck performance, we have quad cores :-)

added on the 2012-02-13 17:52:30 by Jcl

Yeah, and dozens of applications doing things the slow way in single threads :)

superplek: You got point there that the method I proposed is error-prone. However I would debate whether accessing virtual functions are that peformance-sexy when compared to using raw pointers.. In any case, tagged unions are a better solution in this scenario, I agree.

added on the 2012-02-13 17:56:36 by kbi

Jcl, the magic behind it is called "boxing" and is the very incarnation of "fuck performance", yes ;)

added on the 2012-02-13 17:56:40 by kb_

smash: yes! my final solution is to use separate buffers. it was all in front of my eyes and i couldnt see it. i saw that it wouldnt really care if they where ordered the way i first wanted than having separate buffers for each datatype/structure. i can update them variables in whatever order. because they're not used anywhere in the updating-code.

i have not yet implemented this. i will when i have the time i just had some problems with conversion of template type, the C2440 error, like this: '=' : cannot convert from 'Point<T> **' to 'Point<T> **'

even when if i write <char> or <int> for example, it will give an error if i have two separate point tables. like this;
Point<char> **point_c;
Point<int> **point_i;

Code:


	template <class T>void Add(T *v0, T v1, T v2)
	{
		if (typeid(*v0).name() == "char")
		{
			point_c = (Point<char> **)realloc(point_c, (numPoints_c + 1) * sizeof(Point<char>*));
			point_c[numPoints_c] = new Point<char>(v0, v1, v2);
			numPoints_c ++;
		} else
		if (typeid(*v0).name() == "int")
		{
			point_i = (Point<int> **)realloc(point_i, (numPoints_i + 1) * sizeof(Point<int>*));
			point_i[numPoints_i] = new Point<int>(v0, v1, v2);
			numPoints_i ++;
		}
	}

i get error C2664: 'Point<T>::Point(T *,T,T)' : cannot convert parameter 1 from 'char *' to 'int *'

dunno where my head is at. since i am using templates as parameter this shitty function wont work the way i wanted it to be.

I see why there is an error, but i hate stumbling into these problems when i wanted template parameters. (also this is just pure experiments with the typeid thingy).

added on the 2012-02-13 18:00:21 by rudi

ah, i see there is been written alot. maybe i should read through all the replies whe i have a more clear head.

added on the 2012-02-13 18:07:25 by rudi

smash: I didn't say the cost was the same, I said it was comparable. The difference is likely to be insignificant. Of course it depends on stuff, so who knows. I do know that the space overhead per polymorphic object is generally the size of a single pointer. But yeah, however you do it, mixing small datatypes in a collection like that is going to hold you back, performance-wise, so it's a little academic.

added on the 2012-02-13 18:10:09 by doomdoom

rudi: danger, you're doing templates wrong :)
you know you could just overload the "Add" function to take different types right?

added on the 2012-02-13 18:14:09 by smash

smash: yes, i allready know why causing that. one reason why i hate overloading!!: i have to write more functions!! :)
and i read just doomdoom's post, it looks nice i need to try out the ZoidBerg approach too.
but my head is really tired today. i will maybe look at the code tomorrow...

added on the 2012-02-13 18:18:57 by rudi

Quote:

smash said it already but for good measure: this is a completely bogus assumption.

Why? Virtual function call:

- read pointer to vtable from object
- read pointer to virtual function from vtable
- call function
- return from function

Switch method:

- fetch datatype variable
- read pointer from jump table
- jump to relevant case label
- jump to end of switch statement

Yes, a function call is more expensive than a jump, this is obvious, but I still fail to see how those two scenarios are not comparable, performance-wise.

added on the 2012-02-13 18:19:56 by doomdoom

kb_:

Quote:

So how about simply asking the right question back:

Rudi, your problem looks fishy. What do you need that array for? (and don't bother replying with "ints and floats". What purpose does the thing serve in the grander scheme of things?).

As I replied to smash that I don't think i really need one array, but i will use separate arrays for the types. and kb_: What do you mean by the grander scheme of things? I want to be able to update the datatypes (and own structs or objects) from the top-to-bottom if possible.

It's still just a big mess though.

added on the 2012-02-13 18:32:05 by rudi

i forgot to say. these are actually pointers to datatypes etc. but you allready knew that.

added on the 2012-02-13 18:34:02 by rudi

kb_: if i need to change any variable in the runtime, i must use a pointer. there is no way im building an array of crappy values that are a copy and to updated those in the runtime.

added on the 2012-02-13 18:42:06 by rudi

pouët.net

Easy lookup/access to members in "an array" of different datatypes if possible

login