c - Comparing numbers of bits per position in two lists of integers -


i have run problem can circumvent arranging algorithm differently, it's quite interesting , maybe 1 of has idea.

the situation follows: have 2 lists of unsigned long integers, both lists have same size, , if helpful can assume size power of two. size of these lists in range of several hundred. want compute integer has set bit in every position in first list has more set bits second list.

speed everything.

simplified example:  list1   list2  1010    0101 1111    0000 1100    0011 1010    0101  result: 1010 because of 4>0, 2<=2, 3>1, 1<=3 

edit: alternative arrangement of data result in bit vectors contain bits of position in several different vectors. in case use bit counting algorithm , compare, amount less 30 operations per 64 bits in both lists. have matrix of bits , can use bit vectors columns or rows.

additional structure: john willemse's comment made me realise calculate third list, these 3 lists complement each other bitwise. though don't see how helpful.

you can transposed counters - instead of having int each bit position of data, uint each bit position of count. don't need many bits..

you can addition/subtraction way defined on bitvectors, each "bit" being slice of bit position across counts.

perhaps sounds vague, let's jump right in: (not tested)

// add in item list2 carry0 = count0 & item2; count0 ^= item2; carry1 = count1 & carry0; count1 ^= carry0; .. etc many bits need in counters // subtract item list1 borrow0 = ~count0 & item1; count0 ^= item1; borrow1 = ~count1 & borrow0; count1 ^= borrow0; .. etc 

the result signs, last counter you're using.


or, different: maybe can use sub-fields of int, swar style. works if fields small or don't need many, because there isn't space. 4-bit items it's not bad, uint32_t offering 4 counters range -128 127, might enough (the final difference must in range, intermediate results can wrap safely)

anyway how work spread bits out either lookup table or pdep, (not tested)

uint32_t spread = _pdep_u32(item, 0x01010101); // or uint32_t table[] = {     0x00000000, 0x00000001, 0x00000100, 0x00000101,     0x00010000, 0x00010001, 0x00010100, 0x00000101,     0x01000000, 0x01000001, 0x01000100, 0x00000101,     0x01010000, 0x01010001, 0x01010100, 0x01010101 }; uint32_t spread = table[item]; 

then swar addition or subtraction, can optimized bit, because know they're increments or decrements or no change, (not tested)

// add in spread item 2 uint32_t h = 0x80808080; count = ((count &~h) + sp2) ^ (count & h); // subtract spread item 1 count = ((count | h) - sp1) ^ (~count & h); 

the result sign of every sub-field, easy extract annoying compress (unless have pext).


Comments

Popular posts from this blog

java - Run spring boot application error: Cannot instantiate interface org.springframework.context.ApplicationListener -

reactjs - React router and this.props.children - how to pass state to this.props.children -

Excel VBA "Microsoft Windows Common Controls 6.0 (SP6)" Location Changes -