Thursday, April 14, 2016

What is the best storage server infrastructure? DAS/NAS/SAN, or installing GlusterFS/Lustre/HDFS/RBDB

I am trying to design an infrastructure for the project I am working on. It will be a file-sharing/downloading service (like RapidShare), so I need large storage capacity and good scalability, and I want to be able to add new storage nodes as the project grows.



I have come up with a few candidate solutions for my project: Lustre, GlusterFS, HDFS, and RDBD.




To start, I would have 2 servers: one running the GlusterFS client + web server + DB server + a streaming server, and the other acting as the Gluster storage node. (After some time I would add more storage nodes and client servers; I don't know yet how many new client servers, I will see later.)



So I am thinking of going with GlusterFS. But I really wonder whether I need high-performance servers with large storage, or whether average/slow servers with large storage are enough. Or are NAS/DAS/SAN solutions better for GlusterFS storage nodes? I might buy a NAS and install GlusterFS on it. I would be happy to hear your recommendations for the server specs (for both clients and nodes). I really don't know whether the storage nodes need a lot of RAM and good CPUs. I am sure the client servers need them.



The files will be streamed as well, so automatic file replication is important. My system should work like a cloud: when traffic is high, the storage nodes should make extra copies of the most-requested files, so that I avoid scalability problems and my visitors can still stream/download those files.
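To make the replication idea concrete, here is a rough sketch of the kind of policy I have in mind: count requests per file and give popular files more copies. The function names, thresholds and the request log format are all made up for illustration; this is not a feature of any of the file systems mentioned, and whether a given one can set replica counts per file is a separate question.

```python
from collections import Counter

def replicas_for(request_count, base_replicas=2,
                 requests_per_replica=100, max_replicas=5):
    """Hypothetical policy: one extra copy per `requests_per_replica`
    requests, capped at `max_replicas`. All thresholds are assumptions."""
    extra = request_count // requests_per_replica
    return min(base_replicas + extra, max_replicas)

def plan_replication(request_log, **policy):
    """Map each requested file name to the number of copies it should have.

    `request_log` is assumed to be an iterable of file names,
    one entry per download/stream request.
    """
    counts = Counter(request_log)
    return {name: replicas_for(n, **policy) for name, n in counts.items()}

if __name__ == "__main__":
    # Made-up request log: one hot file, one cold file.
    log = ["movie.mp4"] * 250 + ["notes.txt"] * 3
    for name, copies in plan_replication(log).items():
        print(name, "->", copies, "copies")
```

A real setup would also need something to actually create and remove the copies and to route visitors to them; the sketch only covers the decision of how many copies a file should get.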



Also, I am open to your experiences/thoughts about any other good solution. Lustre, HDFS, and RBDB are the other options, and I would be happy to hear your thoughts on them. I would be very happy to hear back from anyone who can comment on anything I have mentioned here.



Thanks







Edit:



I know IOPS is the critical variable that I have to count on in every calculation of my network design; that's why I say random requests. But unfortunately, I don't have any statistics at all. That's why I am here :)



My project works like this: you enter a download URL on my website, my server downloads the file, and then you download it from my own server, like a proxy downloader.



So for now I have one server with a 100 Mbit connection and a 2 TB HDD. I am thinking of adding NAS servers. I really don't know whether I have to add duplicated storage nodes on the NAS. And is there a limit on how many NAS devices I can connect? I mean, can I connect at most 2 NAS servers to my main server?
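Since I have no real statistics, a back-of-the-envelope calculation on the two numbers I do have (100 Mbit link, 2 TB disk) may help frame the question. The per-stream bitrate below is purely an assumption for illustration, and the calculation ignores protocol overhead and disk IOPS.

```python
LINK_MBIT_PER_S = 100      # from the post: 100 Mbit connection
DISK_TB = 2                # from the post: 2 TB disk
STREAM_MBIT_PER_S = 2      # ASSUMPTION: bitrate of one video stream

def link_bytes_per_second(mbit_per_s):
    """Convert a link speed in Mbit/s to bytes/s (decimal units)."""
    return mbit_per_s * 1_000_000 / 8

def hours_to_transfer(total_bytes, bytes_per_s):
    """Time to move `total_bytes` over a link running flat out."""
    return total_bytes / bytes_per_s / 3600

def max_concurrent_streams(link_mbit, stream_mbit):
    """Upper bound on simultaneous streams the link can carry."""
    return int(link_mbit // stream_mbit)

if __name__ == "__main__":
    bps = link_bytes_per_second(LINK_MBIT_PER_S)
    disk_bytes = DISK_TB * 10**12
    print("link throughput (MB/s):", bps / 1_000_000)
    print("hours to push the full disk through the link:",
          round(hours_to_transfer(disk_bytes, bps), 1))
    print("max concurrent streams at the assumed bitrate:",
          max_concurrent_streams(LINK_MBIT_PER_S, STREAM_MBIT_PER_S))
```

Under these assumptions the link carries 12.5 MB/s at best, so the network connection deserves at least as much attention as the choice of storage nodes.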
