Monday, October 21, 2019

performance - Are networks now faster than disks?



This is a software design question



I used to work from the following rule of thumb for speed:




cache memory > memory > disk > network


With each step being 5-10 times the previous step (e.g. cache memory is 10 times faster than main memory).



Now it seems that gigabit Ethernet has lower latency than local disk. So maybe reads from a large remote in-memory DB are faster than local disk reads. This feels like heresy to an old-timer like me. (I just spent some time building a local on-disk cache to avoid network round trips, hence my question.)



Does anybody have any experience / numbers / advice in this area?



And yes I know that the only real way to find out is to build and measure, but I was wondering about the general rule.
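
For the "build and measure" route, a minimal timing sketch might look like the following. The helper names (`median_ns`, `make_file_reader`) are made up for illustration, and the temp-file read is only a stand-in: it will very likely be served from the OS page cache, so it does not show a true disk seek.

```python
import os
import statistics
import tempfile
import time


def median_ns(operation, repeats=50):
    """Run operation() `repeats` times; return the median wall-clock time in ns."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        operation()
        samples.append(time.perf_counter_ns() - start)
    return statistics.median(samples)


def make_file_reader(path, size):
    """Return a function that reads `size` bytes from the start of `path`."""
    def read_once():
        with open(path, "rb") as handle:
            return handle.read(size)
    return read_once


if __name__ == "__main__":
    payload = os.urandom(1024 * 1024)  # 1 MiB of throwaway test data
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    try:
        file_ns = median_ns(make_file_reader(tmp.name, len(payload)))
        # Caveat: a freshly written file is usually still in the page cache.
        print(f"local file read: {file_ns:,.0f} ns")
    finally:
        os.remove(tmp.name)
```

To compare against a remote store, pass `median_ns` a function that performs one request against it and look at the two medians side by side.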




edit:



This is the interesting data from the top answer:




  • Round trip within same datacenter 500,000 ns


  • Disk seek 10,000,000 ns





This is a shock for me; my mental model is that a network round trip is inherently slow. And it's not: by these numbers it's 20x faster than a disk 'round trip'.



Jeff Atwood posted a very good blog post on the topic: http://blog.codinghorror.com/the-infinite-space-between-words/


Answer



Here are some numbers that you are probably looking for, as quoted by Jeff Dean, a Google Fellow:






Numbers Everyone Should Know




L1 cache reference                           0.5 ns
Branch mispredict                              5 ns
L2 cache reference                             7 ns
Mutex lock/unlock                            100 ns (25)
Main memory reference                        100 ns
Compress 1K bytes with Zippy              10,000 ns (3,000)
Send 2K bytes over 1 Gbps network         20,000 ns
Read 1 MB sequentially from memory       250,000 ns
Round trip within same datacenter        500,000 ns
Disk seek                             10,000,000 ns
Read 1 MB sequentially from network   10,000,000 ns
Read 1 MB sequentially from disk      30,000,000 ns (20,000,000)
Send packet CA->Netherlands->CA      150,000,000 ns
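
To relate these numbers back to the question, here is a rough back-of-the-envelope sketch. The cost model is an assumption, not something from the talk: a remote in-memory read is taken as one datacenter round trip plus transfer at the table's network rate, and a local disk read as one seek plus transfer at the table's sequential disk rate. The function names are hypothetical.

```python
# Latency figures copied from the table above, in nanoseconds.
ROUND_TRIP_SAME_DATACENTER_NS = 500_000
DISK_SEEK_NS = 10_000_000
READ_1MB_NETWORK_NS = 10_000_000
READ_1MB_DISK_NS = 30_000_000

MB = 1_000_000  # assumption: "1 MB" is treated as 10**6 bytes


def remote_memory_read_ns(size_bytes):
    """Assumed model: one round trip, then transfer at the table's network rate."""
    return ROUND_TRIP_SAME_DATACENTER_NS + READ_1MB_NETWORK_NS * size_bytes / MB


def local_disk_read_ns(size_bytes):
    """Assumed model: one seek, then transfer at the table's sequential disk rate."""
    return DISK_SEEK_NS + READ_1MB_DISK_NS * size_bytes / MB


for size in (4_096, 65_536, 1_000_000):
    remote = remote_memory_read_ns(size)
    local = local_disk_read_ns(size)
    print(f"{size:>9,} bytes: remote {remote:>12,.0f} ns, "
          f"disk {local:>12,.0f} ns, disk/remote = {local / remote:.1f}x")
```

Under this model the remote read wins at every size, because both its fixed cost and its per-byte cost are lower in the table. The figures date from the 2009 talk and describe a disk that has to seek, so treat the result as a rule of thumb and measure on your own hardware.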





It's from his presentation titled Designs, Lessons and Advice from Building Large Distributed Systems and you can get it here:






The talk was given at Large-Scale Distributed Systems and Middleware (LADIS) 2009.



Other Info









It's said that gcc -O4 emails your code to Jeff Dean for a rewrite.




