Closed Bug 609212 Opened 14 years ago Closed 14 years ago

Peacekeeper's SHA1 hash algorithm is slow compared to Chrome due to profiling

Categories

(Core :: JavaScript Engine, defect)

x86
Windows XP
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- final+

People

(Reporter: developer, Assigned: billm)

References

(Blocks 3 open bugs)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(4 files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12
Build Identifier: Mozilla/5.0 (Windows NT 5.1; rv:2.0b8pre) Gecko/20101102 Firefox/4.0b8pre

In test 6 of Peacekeeper, it runs a SHA1 hash of an email over and over again.  FF4 is currently hurt because of peacekeeper's settimeout issue and firefox's "new Date" bug (603872) but once those are dealt with, FF4 is still slower than Chrome in the hash function.

On my box, I'm getting the numbers:

FF4: 1902ms
Chrome7: 1232ms

I'm attaching the test case to the bug.

Reproducible: Always

Steps to Reproduce:
1. Run the test5b.html attachment in FF4 and Chrome7
2. Compare the numbers
3.
Actual Results:  
Chrome is faster

Expected Results:  
FF4 should be as fast or faster
Attached file test5b.html
Blocks: peacekeeper
What numbers do you get if you set the "javascript.options.jitprofiling.content" preference to false?  What about leaving that true but setting "javascript.options.methodjit.content" to false?  Over here we end up faster than Chrome if I do either of those....
For what it's worth, under methodjit I bet bug 608823 is at least part of the problem.

The numbers I see are something like

Chrome        : 773ms
-j or -m -j   : 618ms
      -m      : 4016ms
      -m -j -p: 1003ms
Status: UNCONFIRMED → NEW
blocking2.0: --- → ?
Depends on: 608823, 580468, 467263
Ever confirmed: true
So the loops the profiler detects as not traceable are:

Line 175: maybeShortLoop
Line 89: maybeShortLoop
Line 80: maybeShortLoop
Line 43: maybeShortLoop
Line 19: has 6 inner looops, I think.  Note that once this is blacklisted, we
         won't unblacklist line 175 or line 43 unless we unblacklist this loop.
Line 205: has 7 inner loops.

We do unblacklist the loops on lines 80 and 89 when we decide to trace the loop on line 78.

For the line 205 loop, of the 7 inner loops we have already decided to trace 4, by the way... should we perhaps be excluding those 4 from the count?



I suspect the real big win here in the -j case comes from tracing the loop(s) on lines 205 and 19.  Especially the latter.  However if I change the profiler to not do the |numInnerLoops > 3| check I do see the loops on 19 and 205 traced, but that ends up _slower_ than trunk (!300ms or so).  Looks like we first decide to trace 19, then end up blacklisting it because we abort while trying to compile it, because we have blacklisted some of the other loops inside it due to them not executing enough times.  Could we perhaps exempt loops that have a traced (or being traced when profiling is on) parent from that blacklisting check?  It looks like the line 73 loop is the one most responsible for this part.  On the other hand, commenting out that Blacklist call in ExecuteTree doesn't seem to speed things up...
Blocks: 580468, 467263
No longer depends on: 467263, 580468
Some (other) issues that hurt JM a lot here: 

1) string concatenation (bug 608776 and bug 608782)
2) charCodeAt/String.fromCharCode (bug 579574) -- sad, these functions are used everywhere
3) <int>.toString(16) (> 2x slower than V8 and TM, I'll investigate and file a bug if needed)
Depends on: 579574, 608782
Summary: Peacekeeper's SHA1 hash algorithm is slow compared to Chrome → Peacekeeper's SHA1 hash algorithm is slow compared to Chrome due to profiling
Here's the results for the about:config changes:

"javascript.options.jitprofiling.content" -> false: 1225ms
"javascript.options.methodjit.content" -> false: 1246ms
(In reply to comment #5)
> 3) <int>.toString(16) (> 2x slower than V8 and TM, I'll investigate and file a
> bug if needed)

Filed bug 609296.
Depends on: 609296
Assignee: general → wmccloskey
(In reply to comment #3)
> The numbers I see are something like
> 
> Chrome        : 773ms
> -j or -m -j   : 618ms
>       -m      : 4016ms
>       -m -j -p: 1003ms

Hmm, weird, I get totally different numbers. Here's what I'm seeing (32-bit):
  Chrome: 669ms
  -j or -j -m: 3990ms
  -m: 989ms
  -m -j -p: 3178ms

It's possible that some stuff got checked in that affected this benchmark, but it's odd that the methodjit got faster and the tracer got slower. Could you re-run your numbers, Boris?
Using toda's build I get numbers like so (64-bit build):

-j      : 2850ms
   -m   : 1340ms
-j -m   : 2967ms
-j -m -p: 2650ms

There's a lot of noise, though.

So something _definitely_ slowed down -j here.
The numbers in comment 9 are on tm.  On m-c on the same machine, -j is 628ms.
Depends on: 610583
I filed bug 610583 on the -j regression.  Not much point digging into this until that's fixed, I guess.
With the patch in bug 610583, I see the following numbers over here:

v8: 700ms
-j or -j -m: 620ms
-m: 1200ms
-m -j -p: 960ms
It seems like there are a few problems here. The maybeShort problem should be fixed by patch in bug 606890 comment 8.

Another problems seems to be that we compile a trace and then blacklist it because it runs for too few iterations. Rather than blacklisting the loop, this patch runs the loop's next 5000 iterations in the method jit.

Another problem is the >3 inner loops issue. I'll post that next.
Attachment #489343 - Flags: review?(dmandelin)
This patch raises the number of inner loops that we allow to 7. This check used to be more important ones, but raising the limit doesn't cause any regressions. And we need to trace a lot of inner loops for the SHA1 benchmark.
Attachment #489347 - Flags: review?(dmandelin)
Attachment #489343 - Flags: review?(dmandelin) → review+
Attachment #489347 - Flags: review?(dmandelin) → review+
Whiteboard: fixed-in-tracemonkey
blocking2.0: ? → -
I assume fixed-in-tracemonkey?
(In reply to comment #17)
> I assume fixed-in-tracemonkey?

No, I had to back this out. I'll try to get it back in after beta8.

http://hg.mozilla.org/tracemonkey/rev/cb76b2d61096
Renominating for blocker2.0 because this blocks (or possibly is a dup of) bug 608867 which is a blocker.
Blocks: 608867
blocking2.0: - → ?
blocking2.0: ? → final+
I've landed the inner loops patch, which is low risk.

http://hg.mozilla.org/tracemonkey/rev/643454386bec

With this patch, I measure the time for this benchmark dropping from 825ms to 608ms with -mjp. This compares with 984ms for -m, 556ms for -j, and 543ms for -mj. All these numbers are on 32-bit Linux.

So we're in the ballpark now, at least.
Whiteboard: fixed-in-tracemonkey
Oh, I forgot to say that I marked this fixed-in-tracemonkey because we're now basically even with Chrome. Crankshaft runs this benchmark in 601ms.
(In reply to comment #21)
> Oh, I forgot to say that I marked this fixed-in-tracemonkey because we're now
> basically even with Chrome. Crankshaft runs this benchmark in 601ms.

As a regular user i have to ask.It´s fixed now in nighty?.Meaning if i download the latest 4.0b9pre then it's faster now in peacekeeper?

Or do i have to test some specific build?
(In reply to comment #22)
> As a regular user i have to ask.It´s fixed now in nighty?.Meaning if i download
> the latest 4.0b9pre then it's faster now in peacekeeper?

It should be fixed in the next tracemonkey nightly, which will be built tonight. If you test it, I'd appreciate it if you post the results. I was only testing in the shell, which can sometimes deceive you.
(In reply to comment #23)
> It should be fixed in the next tracemonkey nightly, which will be built
> tonight. If you test it, I'd appreciate it if you post the results. I was only
> testing in the shell, which can sometimes deceive you.

Ok.I will test the current build and then re run the test tomorrow.Any hints as to what subscore should increase or will the overall score increase?
Ok this was with 25.12.2010 build:
http://img98.imageshack.us/img98/5959/25122010.jpg

And this with 30.12.2010 build:
http://img12.imageshack.us/img12/8011/30122010.jpg

The Data subscore recieved a healthy 400 point increase and overall score improved by over 100 points too.
Blocks: 608733
http://hg.mozilla.org/mozilla-central/rev/643454386bec
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
This doesn't look like it was completely fixed.  I just ran the test case on Mozilla/5.0 (Windows NT 5.1; rv:2.0b10pre) Gecko/20110115 Firefox/4.0b10pre and got the results:

1430ms

While Chrome 8.0.552.237 has:

1217ms

So FF4 has sped up but it still isn't equal to Chrome (which didn't change that much).  I haven't tested a tracemonkey build though, but Comment #26 makes me think that it should be in the normal nightly.
Followup bug seems worth filing.

/be
I'll file a followup but this bug just isn't really fixed.  I'm still getting numbers like comment 13 in the shell, and the original browser testcase is showing an even bigger slowdown....
Blocks: 626165
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: