Do you create crazy for loops to deal with lots of hosts? I don't. I like simple. My middle school math teacher once told me that the best mathematician is a lazy mathematician. I thought, wow, I should be the best at math then. Seriously though, I help lots of folks who are new to distributed computing. Standing over their shoulders, I watch them struggle with lots of concepts and best practices once things are spread out horizontally. Probably the very first question someone faces on a daily basis is “How do I interact with all these systems?” One answer, and the topic of a previous blog entry, is to use multiple terminals or tabbed terminals, but this only goes so far. For my money, pdsh has taken me a long way. I have no intention of describing everything it can do, but suffice it to say it's worth your time getting to know this tool and a related tool, pdcp. In short, these two tools let you run commands and copy files across groups of nodes, defined either in dedicated group files or with a simple regular-expression-style pattern at the command line. A simple Linux alias (alias mypdsh='pdsh -w nodes[1-4]') is always helpful too.
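For anyone who wants to see roughly what that looks like in practice, here is a minimal sketch. It assumes pdsh and pdcp are installed, that the hosts really are named nodes1 through nodes4 (borrowed from the alias above), and that you can already log in to them without a password prompt. The helper names run_all and copy_all are made up for this example and are not part of pdsh. I use shell functions rather than an alias because aliases are not expanded inside non-interactive scripts.

```shell
# Host range in pdsh's bracket syntax: nodes[1-4] means nodes1, nodes2, nodes3, nodes4.
# These host names are an assumption for illustration.
HOSTS='nodes[1-4]'

# run_all (hypothetical helper): run one command on every host in $HOSTS.
run_all() {
    pdsh -w "$HOSTS" "$@"
}

# copy_all (hypothetical helper): copy a local file to the given path on every host.
# pdcp needs to be installed on the remote hosts as well as locally.
copy_all() {
    pdcp -w "$HOSTS" "$1" "$2"
}

# Example usage (commented out so that sourcing this file does not touch any hosts):
#   run_all uptime
#   run_all uptime | dshbak -c    # dshbak -c groups hosts whose output is identical
#   copy_all /etc/hosts /etc/hosts
```

Compare that with a hand-rolled for loop over ssh and you can see why the lazy approach appeals to me.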
Yes, I love AAAnold too, but this is about terminals, not terminators. Do you use lots of tabbed terminals and end up flipping back and forth? Terminator can make your life much easier. I was able to install it on Ubuntu with just a few commands, but on Mac it took a little more doing in that I had to install Fink. Fink worked, whereas the equivalent install through MacPorts did not. In short, you can split a single terminal into multiple windows using a few simple commands. If you work on lots of systems simultaneously, this is a no-brainer. Together with pdsh, it can save you a lot of effort at the command line.
The 6th Annual Hadoop Summit is coming to San Jose on June 26th. There are several activities in town that week, including meetups, training and the show itself. Be the first one to find me at the show and tell me you saw this on my site, and I will buy you a beer (maybe two if I have already been drinking). I will be there doing everything in my power to stay out of the booth. If you haven't done so, I recommend getting signed up soon, as we expect record attendance.
Too busy, you say? Yeah, I wish I had time for all the stuff I want to do. The truth is the Hadoop market is white hot. The simplest use cases are enough to get people interested in doing POCs and getting installations up and running. I wish I had more time for meetups and conferences. People want Hadoop and need it. The market is awash with confusing doublespeak and alternate takes on how “what we do is like Hadoop but better!” Don't buy the hype. There are lots of great tools out there, and some may be great for visualization or tie in to long-running technology investments at your site, but my advice from the field is: don't mess with what works. Hadoop is intentionally and purposefully designed to be robust, and it works. Before you go trying to rearchitect what works at scale for Big Data (or worse, letting someone else talk you into something crazy), make sure you understand what you're getting into. There are all kinds of schemes out there for altering the original design with financial, performance or coolness factors in mind. I can tell you this as someone who is not a writer for a large IT news site but a doer: use what works and what is being used at scale today. Most folks who want to use Hadoop for a business purpose agree. Leave the science projects for people who work in labs. Simply having your data in one spot (for longer than 90 days) and being able to run queries on that data in less than 24 hours, without paying someone 10 million for a vertically scaled refrigerator, pays for itself fast.
Well, I thought it would be a good idea to start meeting folks in my local area who want to talk about Hadoop. Yes, the local HUG exists, and yes, I plan on going. Take time to stop by if you are in town. I also think there is another need, though. Lots of folks know they need a strategy for Big Data and want to talk about it with someone who understands it, in a non-threatening environment. I have found that bringing together lots of people from diverse environments is the best way to foster lively discussion around architecture and potential future plans for a business. Yes, I can talk nuts and bolts, but I think more people are interested in the back-of-the-napkin discussions and whiteboard sessions that need to take place to define future direction for enterprise-level systems. In a recent article I saw a thought leader in a technical field called an “alpha nerd”: someone who can speak technology with people in plain language (not *to* people, which is an important difference in that it involves active listening). Anyway, I like to think of myself this way, and I hope to foster some active discussions. My idea to start is to host at least a monthly meetup somewhere that people actually like, which usually means food and drink outside an office setting.
We’ll see how it goes, but for now I am planning on Thursday, Feb 21st 2012 at 6:30pm at Backyard Bistro in Raleigh.
Welcome to TECHtonka! This is my new site containing lots of technology and all the ramblings I can dream up. Yeah, I did think of Dances with Wolves when I named the site. I thought it was a good name for a site that contains large amounts of detail about various topics in technology. Buffalo are large and they run in herds, much like the technology we all use today. Welcome to TECHtonka: someday a huge site on technology, but for today it's my blog.