Wednesday, February 11, 2015

cpu speed - Why is Pentium 4 Prescott slow compared to Pentium M?

I was benchmarking whether it is worth putting the loop inside a callback function, so I tested a fourth-order Runge-Kutta solver on y'=y in C++. Everything was built with gcc 5.1 on Ubuntu, using the compilation command


g++ -std=c++11 -O3 -march=native --fast-math test.cpp

double dt=(t_end-t_0)/N;
auto y=y_0;
auto t=t_0;
for(size_t k=0;k<N;++k) {
auto k_1=f(k*dt, y);
auto k_2=f(k*dt + 0.5*dt, y + 0.5*dt*k_1);
auto k_3=f(k*dt + 0.5*dt, y + 0.5*dt*k_2);
auto k_4=f(k*dt + dt, y + dt*k_3);
y+=dt*(k_1 + 2*k_2 + 2*k_3 + k_4)/6.0;
}
return y;

Inlining was achieved by a template and a function object. For dynamic binding a function pointer was used.
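To make the two variants concrete, here is a minimal sketch of how they could be set up. It is not the original test.cpp: the names Rhs, rhs_function, rk4, solve_inlined, solve_pointer and diff_from_exp1 are hypothetical, and the initial value y_0=1 on the interval [0,1] is an assumption inferred from the comparison against exp(1). The volatile pointer is one possible way to keep the optimizer from resolving the function pointer at compile time.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Right-hand side of y' = y as a function object (hypothetical name).
// Its call operator is visible to the compiler, so a template
// instantiation can inline it into the integration loop.
struct Rhs {
    double operator()(double /*t*/, double y) const { return y; }
};

// The same right-hand side as a free function, called through a pointer.
double rhs_function(double /*t*/, double y) { return y; }

// Classical fourth-order Runge-Kutta with N fixed steps from t_0 to t_end.
// F is deduced from the argument: a function object type or a pointer type.
template <class F>
double rk4(F f, double t_0, double t_end, double y_0, std::size_t N) {
    assert(N > 0);
    const double dt = (t_end - t_0) / static_cast<double>(N);
    double y = y_0;
    for (std::size_t k = 0; k < N; ++k) {
        const double t = t_0 + static_cast<double>(k) * dt;
        const double k_1 = f(t, y);
        const double k_2 = f(t + 0.5 * dt, y + 0.5 * dt * k_1);
        const double k_3 = f(t + 0.5 * dt, y + 0.5 * dt * k_2);
        const double k_4 = f(t + dt, y + dt * k_3);
        y += dt * (k_1 + 2 * k_2 + 2 * k_3 + k_4) / 6.0;
    }
    return y;
}

// Inlined variant: F = Rhs, so the call target is known at compile time.
// Assumed problem setup: y(0) = 1, integrated over [0, 1].
double solve_inlined(std::size_t N) { return rk4(Rhs{}, 0.0, 1.0, 1.0, N); }

// Dynamically bound variant: the pointer is read through a volatile object,
// so the optimizer cannot prove which function it refers to and inline it.
double solve_pointer(std::size_t N) {
    double (*volatile slot)(double, double) = rhs_function;
    double (*f)(double, double) = slot;
    return rk4(f, 0.0, 1.0, 1.0, N);
}

// Difference from the exact solution exp(1), as in the "diff" column.
double diff_from_exp1(double y) { return y - std::exp(1.0); }
```

Calling solve_inlined(N) and solve_pointer(N) with the same N and timing each call would reproduce the kind of comparison reported below. Both go through the same rk4 template, so only the way f is bound differs.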


Machine 1 (Pentium M): specs as given by /proc/cpuinfo


cpu family      : 6
model           : 13
model name      : Intel(R) Pentium(R) M processor 1.73GHz
stepping        : 8
microcode       : 0x20

Frequency from sudo cpufreq-info


current policy: frequency should be within 800 MHz and 1.73 GHz.
The governor "userspace" may decide which speed to use
within this range.
current CPU frequency is 1.73 GHz (asserted by call to hardware).

Results


                  ODE solution       exp(1)             diff                   Execution time
Function pointer  2.718281828037378  2.718281828459045  -4.21667145644733e-10  53321972
Inlined call      2.718281828037378  2.718281828459045  -4.21667145644733e-10  19916460

Machine 2 (Pentium 4 Prescott): specs as given by /proc/cpuinfo


cpu family      : 15
model           : 4
model name      : Intel(R) Pentium(R) 4 CPU 3.40GHz
stepping        : 3
microcode       : 0x5

Frequency from sudo cpufreq-info


current policy: frequency should be within 2.80 GHz and 3.40 GHz.
The governor "userspace" may decide which speed to use
within this range.
current CPU frequency is 3.40 GHz.

Results


                  ODE solution       exp(1)             diff                   Execution time
Function pointer  2.718281828037378  2.718281828459045  -4.21667145644733e-10  70811683
Inlined call      2.718281828037378  2.718281828459045  -4.21667145644733e-10  19928642

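So the Prescott performs no better than the much lower clocked Pentium M. With the function pointer it is clearly worse (70811683 vs. 53321972), and the inlined version takes practically the same time on both machines (19928642 vs. 19916460) even though the Prescott runs at about twice the clock frequency. Sure, Prescott had a very long pipeline, but my code is highly predictable since N=2^30. So what makes the Prescott that slow despite its high clock frequency?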
