Hadoop! 8-node cluster

I’ve been following this blog:

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/

I’m using Mac Minis to build a Hadoop cluster.  Big Data, and all that.

After successfully figuring out how to image the Ubuntu Server install from one Mac Mini to many, I’ve got an 8-node cluster running tip-top.

The WordCount example takes ~22 minutes to parse a 20GB file.  That’s a nearly linear progression from the 2- and 4-node clusters.
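
For anyone who hasn’t seen WordCount, here is a rough single-machine sketch of the idea in Python. It is not the Java example that ships with Hadoop, and the function names are my own; it only mimics the map, shuffle and reduce steps that the cluster spreads across the nodes.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every whitespace-separated word."""
    return [(word, 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group the emitted counts by word."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_counts(grouped):
    """Reduce step: sum the grouped counts for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(lines):
    """Run map, shuffle and reduce in-process (no Hadoop involved)."""
    pairs = []
    for line in lines:
        pairs.extend(map_words(line))
    return reduce_counts(shuffle(pairs))


if __name__ == "__main__":
    # Made-up sample input, just to show the shape of the result.
    sample = ["big data and all that", "big cluster big data"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps run on different nodes and the framework handles the shuffle, which is why adding nodes can cut the run time.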

The recent breakthrough with the multi-node (more than 2 nodes) cluster is that each node’s /etc/hosts file needs to have ALL the nodes in it, not just the master and itself.
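
A small sketch of what that means in practice: generate one block of host entries covering every node, and put the same block in /etc/hosts on each machine. The hostnames (master, slave1, ...) and the 192.168.0.x addresses below are placeholders I made up for illustration, not my actual network.

```python
def make_nodes(slave_count, prefix="192.168.0."):
    """Build (hostname, ip) pairs: one master plus `slave_count` slaves.

    The naming scheme and address prefix are hypothetical placeholders.
    """
    nodes = [("master", prefix + "1")]
    for i in range(1, slave_count + 1):
        nodes.append((f"slave{i}", f"{prefix}{i + 1}"))
    return nodes


def hosts_block(nodes):
    """Return /etc/hosts lines listing every node in the cluster."""
    return "".join(f"{ip}\t{name}\n" for name, ip in nodes)


if __name__ == "__main__":
    # One master and seven slaves make an 8-node cluster.
    print(hosts_block(make_nodes(7)), end="")
```

The point is that the output is identical for every node, so the same block can be baked into the image before cloning.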

Once I did that, the instructions work and life is grand 🙂
Thanks to Michael Noll for putting this together AND maintaining it for those of us making the first inroads to Hadoop and Big Data.

I’m off to build an 81-node cluster, and start pushing some real data through it.

Cisco Security updates

It’s been an eventful few weeks on the Cisco security end of things.

First, we resolved an issue with our new BotNet Filter, allowing us to shun traffic from known “bad” sources.

Second, I implemented some new IPS policies that allow us to track new types of information.  I’m still working on correlating them together …  I might actually try using the BlackStratus appliance for that.

Third, our FW had a delay in it that we’ve been working through and my TAC engineer was able to duplicate it in his lab environment.  I should hear back next week about a solution to this annoying little problem 🙂

I’m also expecting a visit from Gary Halleen in the next few weeks to continue a conversation about our failed ISE installation – and the new 1.2 features.
While he’s here we’re also going to talk about VACL capture – in an effort to limit the traffic sent to our IPS.

Portal channel: Print Credits

This week we finished building and testing the new Print Credits portal channel.

We had initially designed this some time ago, but never finished it.

ASWOU asked for something nearly identical, so we finished and polished the app and showed it to them.  They were thrilled and had some other good ideas.

The channel should be released in the next few weeks, at the request of ASWOU.

NetApp nomination

This week I completed paperwork nominating Dave Diemer (and WOU) for a NetApp innovation award!

The snapshot feature (and the wouTV video describing it) are certainly innovative — but enough to show up on NetApp’s radar?
It will be interesting to see what happens.  Winners are flown to San Diego, put up in a hotel for 3 days and presented with an award …

Snapshots delivered to end-users are just one example of the many ways that UCS @ WOU is an agile, innovative organization.  We respond quickly to new challenges.  Our creative minds work together to utilize the infrastructure we already have to respond creatively when new needs arise.

I intend to continue to innovate, so long as I’m in this field.

Hadoop – update

Well, it’s been a while since I’ve given an update on the Hadoop cluster…

I built 1-, 2- and 4-node clusters.
I ran some basic files through them and here are the results:

In reality, I’m VERY pleased with the results of the WordCount algorithm (that was provided).  Going from 1 to 2 to 4 Minis isn’t linear (how could it be?), but it’s surprisingly close, so I’m thrilled.
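
To put a number on “surprisingly close”, one way is to compute speedup and parallel efficiency from the run times. This is only a sketch: the timings in the example are invented placeholders, not my measurements, and the function names are my own.

```python
def speedup(t_single, t_cluster):
    """How many times faster the cluster ran than a single node."""
    return t_single / t_cluster


def efficiency(t_single, t_cluster, nodes):
    """Speedup divided by node count; 1.0 would be perfectly linear scaling."""
    return speedup(t_single, t_cluster) / nodes


if __name__ == "__main__":
    # Hypothetical run times in minutes, keyed by node count (NOT real results).
    timings = {1: 100.0, 2: 55.0, 4: 30.0}
    for nodes, minutes in timings.items():
        s = speedup(timings[1], minutes)
        e = efficiency(timings[1], minutes, nodes)
        print(f"{nodes} node(s): speedup {s:.2f}x, efficiency {e:.0%}")
```

An efficiency that stays close to 100% as nodes are added is what “nearly linear” looks like; overhead from the shuffle and the network is what pulls it down.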

I was having some cluster stability issues, so I decided to try Hadoop 2.x and Ubuntu Server 12.04.1 LTS.
Hadoop 2.x is TOTALLY different, and their walkthrough is … sparse.
So, back to 1.0.4, but at least I know how it works.  I’m sticking with Ubuntu Server, as it’s cleaner, faster and easier to configure.

One little trick:  the Minis won’t boot (or reboot) without a monitor.  Solution?  Here:
http://gallery.nancyblenkhorn.com/main.php/v/Headless+Mac+Mini/
Off to Radio Shack 😉

At this point, I’ve built a new master, and a new slave.  Richard is going to image the slave, then we’ll try to make another one.  If that’s successful, we’ll image 6 more and build an 8-node cluster.  POWER!

Once I can get the cluster stable (running the same file over and over, running multiple tests with consistent success) — then we’ll image more and make larger clusters.

I can’t decide if I want to make an 8-node cluster AND a 16 AND a 32 …
Or if I want to try to add new Minis to an existing cluster.  Either way I’ll have to document it well and provide that information once it’s running.
I’m also interested in testing the performance of the cluster on a local switch, and distributed across the network to see what kind of latency a real network provides in this environment.

A little at a time and I’ll have a large, stable cluster.  Then off to algorithmics.  There are some bright MIS students here at WOU, and I’m sure one of them is familiar enough with Python (and parallel programming) to help me turn this pile of Minis into something quite powerful.

The ultimate goal, of course, is to actually crunch some data — not simply to learn about Hadoop.  Once I’ve read enough and played enough, I’m confident that I can process some Big Data from WOU and learn things we didn’t know before 🙂

Full Crew!

We officially hired Megan Eichler last month to begin in the Desktop Support/Telecom Tech position.

She is actively engaged in training for her new position, and is coming up to speed quickly.  She begins her LEA schooling soon, and will be working on that process for 3-4 years.

This once again gives us a full crew, so I intend to schedule a departmental meeting for later this month or early next. Much of that meeting will focus on the training opportunities and other staff updates.

NetApp award

Today I was planning to start into the documentation for a NetApp award we are applying for next week.  That didn’t happen so it’ll have to wait until next week 🙂

Monday will also be the official launch of DegreeWorks … and EarlyIQ.  And it’s the start of the ramp-up for the final Banner SIS upgrade happening in a few weeks.  Oh and little details like the start of the term and fee payment.  Pfft.

This week I started into some mods for wouTV and worked to try to meet with everyone to begin building a new task list for the term.

The beginning of the term (Monday) should also bring in a slew of BYOD, wireless, new and interesting questions, and … more meetings 🙂

Last post of the year

Well, the world didn’t end today (12-21-12)

Mayans.  Who can trust a people that doesn’t really exist anymore?
Did they know something we didn’t?
Did aliens come and get them?
Did they simply get tired of counting when they hit 2012?

The world may never know…

But on the lighter side of things, I intend to wake up tomorrow and eat lots of food and enjoy myself for the next week or so.  We have worked hard this term … really this whole year.  We have a full crew and I plan to schedule a departmental staff meeting next year. 

My plan for next year is to:

1. Check in with everyone for their current project status
2. Update and distribute the next 6 months of projects
3. Do staff evals.

Top projects:

1. WOU EDW
2. Long list from McDonald
3. Ramp-up for Banner upgrade

Early IQ

Single Sign-On just grows and grows and grows 🙂

Early next year, we’ll be bringing a tool called EarlyIQ into Portal’s SSO.
EarlyIQ has two sections: the student success center, and a staff area.  Both will be connected.

Ron has begun the initial technical configuration, and we’ll loop back around on it next year.  In a perfect world, we’ll get it working by the start-of-the-term — but we might not hit that.

So we’ll get that done, and link it from Portal.

Faculty WorkLoad – updated

The Faculty WorkLoad team met this week, going over updates provided by Nathan Brake.

With such an incredibly busy year (Banner migration, upgrades, etc…) I’m very pleased with the progress that Nathan has made.  We are still in Phase 1 (setup), but have a written plan with some very rough dates to complete that and move on to Phase 2 (reporting).  Once complete, we can focus on testing for Phase 3 (integration with HR/Payroll).

It will be great to have this project done, and I expect it to be completed in 2013.