Wednesday, September 15, 2010

...then it isn't

I continue my previous post with some more experiments and a roundup.

What was left out

In my last set of measurements, I left out the most important test. I wanted to see whether there was a threshold of the ratio of serial scans over random accesses where there would be a flip of the performance characteristics of my two implementations. Let me remind you here that the mixed test I performed previously gave a 1/7 probability of a serial scan (of random length and starting point) and the rest 6/7 to a random operation at a random position.

The test

The code this time was this:
for(int i = 0; i < RAND_ITERS; i++) {
size = theArray.size();
double op = rng.nextDouble();

if (op <= SERIAL_SCAN_PROB) {
int start = nextRandomIndex(dist, size);
int length = (int)(rng.nextInt(size - start));
for(int j = 0; j < length; j++) {
} else {
op = rng.nextDouble();
if (op < 0.2) {
theArray.getFirst(nextRandomIndex(dist, size));
} else if (op < 0.3) {
theArray.getSecond(nextRandomIndex(dist, size));
} else if (op < 0.4) {
theArray.setFirst(nextRandomIndex(dist, size), rng.nextInt());
} else if (op < 0.5) {
theArray.setSecond(nextRandomIndex(dist, size), rng.nextInt());
} else if (op < 0.7) {
theArray.add(rng.nextInt(), rng.nextInt());
} else {

The value of SERIAL_SCAN_PROB is what interests us here. Since I wanted to see the effects of caching, I settled to a specific size for the arrays that, based on my previous findings, was chosen as 5M elements. As before, all experiments where done with three access distributions, a ~Uni(0,size), a ~N(0.5, 0.1) and a ~N(0.5, 0.0001). RAND_ITERS was 1K and the whole test was run 20 times for each of the two structures. My hardware setup was the one I used last time.

The numbers

As before, no listing of boring data. If for any reason you want them, ask and you shall receive. The same goes for the code. Here are the graphs:

First off, the serial scans are the most time consuming of the operations, so it is normal the more are performed the more time is needed for each iteration of the test. Next, let's note that overall, the more localized the scans (in the order of the graphs), the more there is a difference in performance between the implementations. This is not because SingleArray does worse (note that its performance is more or less steady) but because DoubleArray is performing better. The more localized the operations, the better DoubleArray performs. However, there is no more random way to access an array other than a Uniform distribution, so only in this scenario can the single array implementation compete. Finally, more importantly but less interestingly, DoubleArray performs marginally better in all cases.


I think there are no more realistic tests to do here. There are marginal cases that I would like to test and find if the single array implementation dominates, but they are not important right now. What comes next is my library and, after all this, I will implement it with a double array backing store.


I did a similar experiment with a backing store of two parallel linked integer lists (not Integer objects). Even @1M elements, the random start serial reads test hadn't finished after 1.5h of 100% CPU. I killed it. 'nuff said.

No comments:

Post a Comment