A few weeks back I was preparing for a meeting on Hadoop, and its impact on the network was supposed to be a large part of the conversation. Needless to say, it wasn’t, but during the preparation I went looking for material on designing and building networks specifically for Hadoop environments.

Since there aren’t that many articles out there on the subject, I figured I’d share what I was able to find.
Understanding Hadoop Clusters and the Network – by Brad Hedlund

Architecting a Network for Hadoop – by Stuart Miniman

Considering 10GE Hadoop clusters and the network – by Brad Hedlund

Big Data in the Enterprise: Network Design Considerations – by Cisco

Dell Force10 Hadoop Network – The Scale out Solution – by Dell (looks like more of Brad’s work)

Network Design Considerations for Hadoop ‘Big Data Clusters’ and the Hadoop File System – by Arista

Arista – Hadoop Cluster Applications – by Arista (I first found this one several months ago and couldn’t locate the link again, so the PDF is attached right below).
Attachment: aristahadoopapplication_tn.pdf (PDF, 910 KB)

Cloudera Hadoop Infrastructure with Cisco UCS Servers and Cisco Nexus Switches Powered by Intel Xeon Processors

For the other SDN aficionados out there, you may like this next one (it combines optical switching, SDN, and big data):
Programming Your Network at Run-time for Big Data Applications – by IBM T.J. Watson Research Center and Rice University. Sounds like Plexxi would be a good fit here.

This is not meant to be an exhaustive list of all the resources out there, just a starting point. I highly recommend reading Brad’s first, and then some general Hadoop whitepapers that aren’t focused on the network, to really understand the application itself. As you can see from the documents above, the vendors still recommend their usual switches for Hadoop environments, but it is important to be knowledgeable as a network person when interfacing with the application folks. You should always try to stay relevant and do your best to understand the applications that ride over the network. That’s why we are building the networks in the first place.

For those stepping into a meeting in the next hour who need a few high-level bullet points, here you go:

  • Understand the number of servers, the number per rack, and how many NICs per server will be used
  • Understand the expected overall growth
  • It’s common to deploy TOR switches – either 1 or 2 per rack based on server density and importance of the cluster to the business (just like in other parts of the DC)
  • Hadoop has no dependencies on Layer 2 connectivity; use Layer 3 whenever possible
  • Small deployments (just a few racks) can take advantage of L3 in the aggregation layer
  • Larger deployments will have L3 down to the TOR switch
  • 1GE is most common in today’s Hadoop environments, but 10GE is gaining traction
  • Hadoop environments are largely bare metal, with no virtualization
  • Build out to be non-blocking whenever possible; large amounts of data will be shuffled across the network if there is a “rack” failure, e.g. a single TOR switch fails
  • Take into consideration the traffic is bursty – use switches with optimized buffers 
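
If you want to put a rough number on the non-blocking bullet before the meeting, the usual back-of-the-envelope check is the oversubscription ratio of a TOR switch: total server-facing bandwidth divided by total uplink bandwidth. Here is a minimal Python sketch of that arithmetic. The function names and the rack sizes in the examples are made up for illustration and are not taken from any of the documents above.

```python
def oversubscription_ratio(servers_per_rack, nics_per_server, nic_gbps,
                           uplinks, uplink_gbps):
    """Return server-facing bandwidth divided by uplink bandwidth for one TOR switch."""
    downlink = servers_per_rack * nics_per_server * nic_gbps
    uplink = uplinks * uplink_gbps
    if uplink <= 0:
        raise ValueError("uplink capacity must be positive")
    return downlink / uplink


def is_non_blocking(ratio):
    """A ratio of 1.0 or lower means the uplinks can carry every server at line rate."""
    return ratio <= 1.0


# Hypothetical rack A: 20 servers, 2 x 1GE NICs each, 4 x 10GE uplinks
rack_a = oversubscription_ratio(20, 2, 1, 4, 10)    # 40 Gbps down / 40 Gbps up

# Hypothetical rack B: 40 servers, 1 x 10GE NIC each, 4 x 40GE uplinks
rack_b = oversubscription_ratio(40, 1, 10, 4, 40)   # 400 Gbps down / 160 Gbps up

for name, ratio in (("rack A", rack_a), ("rack B", rack_b)):
    print(f"{name}: {ratio:.1f}:1, non-blocking: {is_non_blocking(ratio)}")
```

This only looks at a single TOR switch with everything healthy. With two TOR switches per rack, or when planning for a failed uplink, you would rerun the numbers with the reduced uplink count.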

Regards,
Jason

Follow me on Twitter: @jedelman8


 

