15 Dec, 2011, plamzi wrote in the 21st comment:
Votes: 0
Twisol said:
(EDIT: Well, based on other answers, it looks like this isn't complete yet. From some debug output, I think it needs to be able to merge groups that are separated by more than one negative region… time to work on that a bit.)


It's clear you have a winning approach. I wouldn't worry about finishing it unless someone was paying me for it :)
15 Dec, 2011, Ssolvarain wrote in the 22nd comment:
Votes: 0
Tyche said:
Ssolvarain said:
By a show of hands, how many of you are actually robots posing as humans?

I used to be human, but then I took an arrow to a knee.


I am disappoint…
15 Dec, 2011, Runter wrote in the 23rd comment:
Votes: 0
You shouldn't trim the set unless you are certain there are positive numbers in the set. I'm not even sure the gains are worth it.

For example, take [-11, -7, -5, -25, -2]: if you trimmed this, you'd be left with a zero-length array. The largest sum here is a negative number, not zero: -2.
15 Dec, 2011, Twisol wrote in the 24th comment:
Votes: 0
Runter said:
You shouldn't trim the set unless you are certain there are positive numbers in the set. I'm not even sure the gains are worth it.

That's already handled. If there are no positives in the list at all, it gets compacted into a single negative entry, and that's returned before trimming.

Twisol said:
# If there's only one entry, return it, whether it's positive or negative.
return sets[0] if sets.length == 1
# Remove negatives from the front and back.
trim! sets

As for why I trim, it's to make the algorithm simpler. I want to always ensure that I analyze [pos, neg, pos] sequences, never [neg, pos, neg]. By removing negative sets, it's much simpler to guarantee that (and it also guarantees that I have (2n-1) sets, so I'm never left with an incomplete sequence).
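
For illustration, a tiny worked example (hypothetical input, using the partition and trim! helpers from the full listing in the next post):

partition([-1, 3, 1, -2, 5, -3])
#=> [[0...1, -1], [1...3, 4], [3...4, -2], [4...5, 5], [5...6, -3]]
# trim! drops the leading and trailing negative regions:
#=> [[1...3, 4], [3...4, -2], [4...5, 5]]
# 2 positive islands give 2*2 - 1 = 3 regions, always in [pos, neg, pos] order.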
15 Dec, 2011, Twisol wrote in the 25th comment:
Votes: 0
Updated to iteratively merge more distant regions together. I'm sure there's a better way to do this, but it does match up with Sharmair's answer. I basically overhauled merge(), and wrapped the loop in calculate() in one that increases the distance every time merge() hits a wall.

$ time ruby vector.rb
[58...570, 21494]

real 0m27.972s
user 0m28.226s
sys 0m0.044s

# Isolate regions of positive and negative numbers
def partition (v)
  sets = []

  left = 0
  i = 1
  v.each_cons(2) do |l, r|
    # are the signs different?
    if (l >= 0) != (r >= 0)
      sets.push [(left...i), v[left...i].reduce {|a, n| a + n}]
      left = i
    end
    i += 1
  end

  sets.push [(left...i), v[left...i].reduce {|a, n| a + n}]
end

# Remove negative regions from the front and back - they'll never be included.
def trim! (sets)
  sets.shift if sets.first[1] < 0
  sets.pop if sets.last[1] < 0
end

# Merge adjacent regions of positive over a lesser region of negative.
def merge! (sets, distance)
  distance = distance * 2 + 1
  original_length = sets.length

  i = 0
  while i + distance <= sets.length
    nums = sets[i...i+distance]
    sum = nums.reduce(0) {|a, n| a + n[1]}
    max = nums.reduce(0) {|a, n| n[1] > a ? n[1] : a}

    if sum >= max
      sets[i...i+distance] = [[(nums.first[0].first...nums.last[0].last), sum]]
    else
      i += 2
    end
  end

  sets.length != original_length
end

def calculate (v)
  # Partition the vector into regions by sign.
  sets = partition v

  # If there's only one entry, return it, whether it's positive or negative.
  return sets[0] if sets.length == 1
  # Remove negatives from the front and back.
  trim! sets
  # If there's only one now, it's the only positive island, so return it.
  return sets[0] if sets.length == 1

  # Iteratively merge adjacent regions to create larger regions
  distance = 1
  while distance*2 + 1 <= sets.length
    nil while merge!(sets, distance)
    distance += 1
  end

  # We have our maxima. Return the largest of our remaining regions.
  sets.reduce {|a, x| x[1] > a[1] ? x : a}
end
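
As a quick smoke test of the code above on a small hypothetical vector (not the original 5000-element input):

v = [3, -1, 4, -9, 2, 6, -2, 5]
p calculate(v)   #=> [4...8, 11], i.e. the region 2, 6, -2, 5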
15 Dec, 2011, Sharmair wrote in the 26th comment:
Votes: 0
Twisol said:
$ time ruby vector.rb
[58...570, 21494]

real 0m27.972s
user 0m28.226s
sys 0m0.044s


Ok, that is pretty close to mine, though one of us must be wrong
by one on the end range, I had 569. Though it is nice to finally
get a matching solution.
15 Dec, 2011, Twisol wrote in the 27th comment:
Votes: 0
58...570 translates to interval notation as [58, 570). Since we're dealing with integral indices, that's just [58, 569]. So I'm pretty sure it's all the same.
16 Dec, 2011, David Haley wrote in the 28th comment:
Votes: 0
Bah. Algorithms.
plamzi said:
Twisol said:
(EDIT: Well, based on other answers, it looks like this isn't complete yet. From some debug output, I think it needs to be able to merge groups that are separated by more than one negative region… time to work on that a bit.)


It's clear you have a winning approach. I wouldn't worry about finishing it unless someone was paying me for it :)

No, unfortunately, the optimal solution is much simpler. Twisol's approach is a much more intuitive one, perhaps, because the code follows the mental model that you might have, but it's not the most efficient one.

The 'well-known' solution takes 0.11s on my system (a macbook pro laptop). That's almost 3 times as fast. Yes, comparing across systems is sketchy, but it should be quite clear why the standard solution is much faster: it's doing far less stuff.

Oftentimes in algorithms, the obvious mental model map is inferior to something else. Twisol's approach is most certainly leagues better than the brute force method (which is to compute every possible subset, and then count sums and find the biggest). But this problem does not need you to construct sets and trim them and all kinds of things like that.
16 Dec, 2011, Twisol wrote in the 29th comment:
Votes: 0
David Haley said:
But this problem does not need you to construct sets and trim them and all kinds of things like that.

Eh. It's easy enough to see the answer if you're already aware of it. Mental models are typically the fastest way to get acquainted with a problem, though. For my part, I was solving for "when do you include a negative in the subset?" The answer I saw was "when it's smaller than the positives on either side", meaning you'd benefit from the expansion. The flaw there is that I didn't realize it might get stuck where you could suffer a small loss before a larger gain by expanding further at once.
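
A concrete instance of that trap, with hypothetical numbers: in [5, -2, 1, -2, 5], neither [5, -2, 1] nor [1, -2, 5] is worth merging on its own (each sums to 4, less than the 5 at either end), but merging all five at once gives 7, which beats any single region.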

I'll definitely be studying that wiki page later. It's still not obvious to me why or how it works….
16 Dec, 2011, Idealiad wrote in the 30th comment:
Votes: 0
I'm liking this thread, seeing some cool approaches to the problem.

For comparison with my first attempt I gave the wikipedia page algorithm that DH linked a shot:

(defn kadane [v]
  (loop [x v
         max-ending-here 0
         max-so-far 0]
    (if (seq x)
      ;; extend the running sum, resetting to 0 when it would go negative,
      ;; and fold the new running sum into the best seen so far
      (let [ending (max 0 (+ max-ending-here (first x)))]
        (recur (next x) ending (max max-so-far ending)))
      max-so-far)))


which gave an answer of 21494 and a time of 4.399442 msecs, now agreeing with Twisol and Sharmair, so something was pretty off with my first try :D.
16 Dec, 2011, Sharmair wrote in the 31st comment:
Votes: 0
David Haley said:
Oftentimes in algorithms, the obvious mental model map is inferior to something else.

Maybe. But in this case, my 'mental model' of the problem seems to still be the best method
I have thought of so far (I am also kinda happy it is still the fastest one shown here, though
the lang used probably has a lot to do with it being over 100 times faster than the next one so
far, and almost a million times faster than some). But to give more info on my method,
let me see if I can put into words the thinking I did…

From the start I knew intuitively that it should be able to be done in a single pass of the data.
While I was thinking of it, I was figuring how if you started at the first positive number and added
values to it, keeping track of the sum as you went. As long as that sum was positive that range of
elements could contribute to a total sum (and I would be keeping track of a peak max in that range).
If the sum ever dropped negative though, that range would never contribute to a later range (though
it might contain the overall max), so at that point I would reset the current start of the range to the
next element and reset the current sum to zero, as I would start looking at a new range. It turns out
that this same method would also take care of any leading negative numbers, so you could just use
this right from the start. Now, to reprint the code, and add comments to it to make it more clear:
int rstart = 0;//start index of max range
int rend = 0;//end index of max range
int maxsum = 0;//max sum found so far
int csum = 0;//current sum of the working range
int start = 0;//start index of working range
for(int current = 0; current < size; ++current){//scan through the array
    csum += data[current];//add current element to current sum
    if(csum < 1){//is the working range no longer contributing?
        start = current+1;//set new working range start to next element index
        csum = 0;//clear working sum
    }else if(csum > maxsum){//is this a new overall max?
        rstart = start;//set the new max range and sum
        rend = current;
        maxsum = csum;
    }
}
16 Dec, 2011, Rarva.Riendf wrote in the 32nd comment:
Votes: 0
Quote
From the start I knew intuitively that it should be able to be done in a single pass of the data.

Well that is basically in the problem definition (going from hours to some milliseconds ;p) nothing really intuitive about it (and also why I knew my algo was definitely not the best solution :P)
16 Dec, 2011, David Haley wrote in the 33rd comment:
Votes: 0
Twisol said:
Eh. It's easy enough to see the answer if you're already aware of it.

Well that's exactly what I'm saying. Your approach is the "obvious" one if you're modeling the problem in the intuitive sense: constructing sets of numbers and trimming them. But the point was that the "obvious" model is (usually) not the fastest one.

Twisol said:
I'll definitely be studying that wiki page later. It's still not obvious to me why or how it works….

It's actually not too different of an idea. Basically, you keep adding numbers until your set becomes negative. You know that there is no point continuing once the set is negative. And every time you get a sum, you record if the sum is the highest you've seen so far.

In other words, with a single scan, you know when there's no point continuing a subset. And you also know that you should always combine numbers if you can, unless you reach a negative sum, in which case you might as well start over.
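
To make that concrete, here's a minimal Ruby sketch of that single-scan idea (sum only; the function and variable names are mine, not from the wiki page):

def max_subarray_sum(v)
  best = 0      # highest sum seen so far (0 corresponds to the empty subarray)
  running = 0   # sum of the current subarray ending at this element
  v.each do |x|
    running += x
    running = 0 if running < 0   # a negative running sum can never help later, so start over
    best = running if running > best
  end
  best
end

(As Runter noted earlier, an all-negative input needs extra handling if you want the least-negative element rather than 0.)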

Sharmair said:
Maybe. But in this case, my 'mental model' of the problem seems to still be the best method I have thought of so far

Well, I'd argue that your solution isn't a very "obvious" mental model, but hey, sometimes people come up with the fast solution right away. :smile:

Sharmair said:
(I am also kinda happy it is still the fastest one shown here, though the lang used probably has a lot to do with it being over 100 times faster than the next one so far, and almost a million times faster than some).

Yes, it should be utterly obvious that a C program will outperform a Ruby or Python program.

What will be similar are things like growth rates as the input changes.

Incidentally, the solution you gave is (basically) the canonical solution, so people can read up on Wikipedia too for the same method.

Rarva said:
Well that is basically in the problem definition (going from hours to some milliseconds ;p) nothing really intuitive about it (and also why I knew my algo was definitely not the best solution :P)

It's not always obvious that you can do things in a single pass, and sometimes, you can't. There are still algorithms that go from hours to milliseconds without scanning the input in a single pass.
17 Dec, 2011, Tyche wrote in the 34th comment:
Votes: 0
program LCS;
#include( "stdlib.hhf" )
static
    data: int32[5000] := [336,502,-557,-371,507,…array elided…,213,42,-253,450];
begin LCS;
    mov(0,eax); // greatest sum so far
    mov(0,ebx); // temp sum
    mov(0,edi); // sum start index
    mov(0,esi); // array index
    mov(0,ecx); // final start index
    mov(0,edx); // final end index
top:
    add(data[esi*4],ebx);
    jns positive;
    mov(esi,edi);
    inc(edi);
    mov(0,ebx);
positive:
    cmp(ebx,eax);
    jng next;
    mov(edi,ecx);
    mov(esi,edx);
    mov(ebx,eax);
next:
    inc(esi);
    cmp(esi,5000);
    jb top;
done:
    stdout.puti32(ecx);
    stdout.put("-");
    stdout.puti32(edx);
    stdout.put(nl);
    stdout.puti32(eax);
    stdout.put(nl);
end LCS;


This is Sharmair's method, but it's around 10% faster than gcc 4.5.3 -O3.
Using HLA/FASM.

Actually the first test is < 0 rather than < 1. I don't believe Sharmair's correctly handles the array Runter posted.
This -> {10,-10,9,2}
17 Dec, 2011, plamzi wrote in the 35th comment:
Votes: 0
David Haley said:
Well that's exactly what I'm saying. Your approach is the "obvious" one if you're modeling the problem in the intuitive sense: constructing sets of numbers and trimming them. But the point was that the "obvious" model is (usually) not the fastest one.

Sharmair said:
(I am also kinda happy it is still the fastest one shown here, though the lang used probably has a lot to do with it being over 100 times faster than the next one so far, and almost a million times faster than some).

Yes, it should be utterly obvious that a C program will outperform a Ruby or Python program.



Incidentally, the solution you gave is (basically) the canonical solution, so people can read up on Wikipedia too for the same method.

Rarva said:
Well that is basically in the problem definition (going from hours to some milliseconds ;p)


In this thread, Twisol was the first one to go down to milliseconds, so in practical terms his solution may be on par with the canonical one. People with extremely mathematical minds (I used to know one) may see a faster solution right away, and there may be a hundred-times-faster solution than the canonical one that no human mind has puzzled out yet, but one that would take a genius and half a lifetime. The point is that the time frame for delivery should be part of evaluating how good a solution is, not just efficiency. So even setting aside the difference in languages, I wouldn't rush to discard Twisol's 'obvious' approach.
17 Dec, 2011, Twisol wrote in the 36th comment:
Votes: 0
plamzi said:
In this thread, Twisol was the first one to go down to milliseconds, so in practical terms his solution may be on par with the canonical one.

Except… it was incomplete because it only merged very close-by positive regions. The first answer I gave was incorrect, apparently. When I modified it to iteratively try larger distances after exhausting smaller ones, it jumped to 28 seconds (and gave the right answer).

David Haley said:
Twisol said:
Eh. It's easy enough to see the answer if you're already aware of it.

Well that's exactly what I'm saying. Your approach is the "obvious" one if you're modeling the problem in the intuitive sense: constructing sets of numbers and trimming them. But the point was that the "obvious" model is (usually) not the fastest one.

I meant that the optimal answer is obvious to you because you're already aware of it. I wasn't - and it took some considerable thought to come up with the model I derived. (It's obvious in hindsight… which is quite the point I'm making here.)

David Haley said:
Twisol said:
I'll definitely be studying that wiki page later. It's still not obvious to me why or how it works….

It's actually not too different of an idea. Basically, you keep adding numbers until your set becomes negative. You know that there is no point continuing once the set is negative. And every time you get a sum, you record if the sum is the highest you've seen so far.

I don't understand how you know you can stop when you hit a negative sum. If you continue past that point, you could find a sufficiently large entry to counterbalance that loss and actually gain some value.
17 Dec, 2011, Runter wrote in the 37th comment:
Votes: 0
Quote
there may be a hundred-times faster solution than the canonical one that no human mind has puzzled out yet


At best this is an overstatement. At worst, just wrong. You may speed it up significantly, but the only way you're going to do that is with processing power or more efficient execution. The algorithm itself is canonical for a reason. Decades of work from "math people", usually covered in entry-level CS classes, have 'solved' this one. You're not going to find a hidden solution hundreds, or even tens, or even 5 times faster from the algorithm side. At best you might eke out gains, only to find that, like Twisol, what sped you up was an underlying nuance or edge case you didn't account for.
17 Dec, 2011, Runter wrote in the 38th comment:
Votes: 0
@twisol

That's because the algorithm does this:
max_ending_here = max(0, max_ending_here + x)


In other words,

max_ending_here = 0 if (max_ending_here + x <= 0)

You know that beyond that point it won't add value if the negative number overcomes your running sum. So if the set is [1, 2, -4, 10], once you reach the -4 the running sum goes negative (1 + 2 - 4 = -1), so the algorithm knows to start a fresh sum there. And anything after that point is better off without the 1, 2, -4 prefix included.

So it's not all negative numbers.
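
Tracing that set step by step (using max_so_far for the running best, as in the Clojure version above):

x =  1  ->  max_ending_here = 1,  max_so_far = 1
x =  2  ->  max_ending_here = 3,  max_so_far = 3
x = -4  ->  max_ending_here = 0 (3 - 4 is negative, so start over), max_so_far = 3
x = 10  ->  max_ending_here = 10, max_so_far = 10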
17 Dec, 2011, Rarva.Riendf wrote in the 39th comment:
Votes: 0
David Haley said:
Rarva said:
Well that is basically in the problem definition (going from hours to some milliseconds ;p) nothing really intuitive about it (and also why I knew my algo was definitely not the best solution :P)

It's not always obvious that you can do things in a single pass, and sometimes, you can't. There are still algorithms that go from hours to milliseconds without scanning the input in a single pass.


If not a single pass, it would at least have been a fixed number of passes; otherwise the gain would not have been that great on an old machine. As it was not an implementation problem but an algorithmic one, and there is no mathematical theory to apply here, you need to calculate the sums whatever you do. That is why I said that in this particular case, with all the given info, the goal was to reduce passes.

With less info (not being told it was an algo problem), or no info on when it was solved (on newer machines multithreading can give a huge gain with these kinds of problems), I would not have said it was obvious.

As an example, with multithreading, getting the max of an array is way faster.
17 Dec, 2011, plamzi wrote in the 40th comment:
Votes: 0
Runter said:
Quote
there may be a hundred-times faster solution than the canonical one that no human mind has puzzled out yet


At best this is an overstatement. At worst, just wrong. You may speed it up significantly, but the only way you're going to do that is with processing power or more efficient execution. The algorithm itself is canonical for a reason. Decades of work from "math people", usually covered in entry-level CS classes, have 'solved' this one. You're not going to find a hidden solution hundreds, or even tens, or even 5 times faster from the algorithm side. At best you might eke out gains, only to find that, like Twisol, what sped you up was an underlying nuance or edge case you didn't account for.


I said 'there may be' because I wanted to make a point, one that I think every algorithm competition makes by introducing a time constraint. Even the most amazing solution is useless if it takes so much time and resources that history moves to obsolete it.

I agree with you to the extent that given that human beings created the rules (math), and that this particular problem is fairly simple, there may not be a more efficient solution than the canonical one. That said, I do want to put it out there that it's wrong to assume "math people" are the smartest thing in the universe, or that math itself abstracts reality in a perfect way.