Here is a sample job script I got running to test some Hadoop MapReduce jobs on our new cluster. You can put it in the same directory as the mapper/reducer files. The -file parameter packages up those files and sends them to the task nodes in the cluster, so you don't have to install them yourself.


I've started writing some Hadoop streaming jobs in Python, and there isn't much documentation about it out there. Things like: how do I pass environment variables, how do I pass along modules that my scripts might need, etc.

Here are a couple of quick tips. To pass environment variables to your task nodes, use this command-line parameter when launching a Hadoop job:
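On the task side, a Python streaming script simply reads the variable from its process environment. Here is a minimal sketch, assuming a hypothetical variable named WORD_PREFIX. Hadoop streaming documents a `-cmdenv NAME=VALUE` option for handing variables to the streaming commands; treating that as the mechanism here is my assumption, and the function and variable names are made up for illustration.

```python
import os
import sys


def map_lines(lines, environ=os.environ):
    """Emit one tab-separated "word<TAB>1" record per word.

    WORD_PREFIX is a hypothetical variable name; it is read from the
    environment mapping and prepended to every key.
    """
    prefix = environ.get("WORD_PREFIX", "")
    for line in lines:
        for word in line.split():
            yield f"{prefix}{word}\t1"


if __name__ == "__main__":
    # Demo on an in-memory sample; a real mapper would pass sys.stdin instead.
    sample = ["hello hadoop", "hello streaming"]
    for record in map_lines(sample):
        sys.stdout.write(record + "\n")
```

Taking the environment as a parameter keeps the mapper easy to test locally without launching a job.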

Exciting times at the new job. We're releasing a handful of super secret invites to the new site. Crazy days to come. Wish I could say more at this time.

After 5+ years at Panasonic, I've decided to accept a new position with Blue Rover Labs. It will be an exciting change back to startup life, and I'm looking forward to getting started. I bought a Civic Hybrid with carpool stickers this morning, so I'm all set for the commute :)

It was a pleasure to work with everyone at Panasonic. They really do take care of their employees and that really made the decision difficult.

We'll see what happens in the future!

I've gone ahead and added the createuserAction to my indexController to illustrate how we can reuse all the code previously written to validate and save a new user. If you want to see the code I added to support this, you can view the change log here: [url=][/url]

Here's what our createuserAction code looks like now:

I've gone ahead and added validation to our test project:

If you take a look at the indexController you'll now see I've added an updateAction that uses a Zend Form component.

The goal of the validation was to provide context-sensitive validation on a model. A model by itself might not know whether it's valid, so you'll want to apply different contexts depending on what the client operation is. Saving an order might have different validation needs than shipping an order. Instead of hacking together a bunch of if/else statements you can simply say
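To make the idea concrete, here is a rough sketch of context-sensitive validation, written in Python rather than PHP. Everything in it (Order, the "save" and "ship" contexts, the rule functions) is a hypothetical name chosen for illustration and not the test project's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Order:
    items: List[str] = field(default_factory=list)
    shipping_address: Optional[str] = None


# A rule inspects the model and returns an error message, or None if it passes.
Rule = Callable[[Order], Optional[str]]


def has_items(order: Order) -> Optional[str]:
    return None if order.items else "order has no items"


def has_address(order: Order) -> Optional[str]:
    return None if order.shipping_address else "shipping address is missing"


# Each context lists the rules that matter for one client operation.
CONTEXTS: Dict[str, List[Rule]] = {
    "save": [has_items],
    "ship": [has_items, has_address],
}


def validate(order: Order, context: str) -> List[str]:
    """Run only the rules registered for the given context and collect errors."""
    errors = []
    for rule in CONTEXTS[context]:
        message = rule(order)
        if message is not None:
            errors.append(message)
    return errors


if __name__ == "__main__":
    order = Order(items=["book"])
    print(validate(order, "save"))
    print(validate(order, "ship"))
```

The same order passes in the "save" context and fails in the "ship" context, and adding a new operation means adding one entry to the table instead of another if/else branch.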

As a follow-up to my previous post, I'm going to post some live examples of a model layer for Zend Framework based on Domain-Driven Design concepts. I've spent the past few days studying domain-driven design as well as a lot of Martin Fowler's work on domain modeling, and I think I have at least some code to start with that could get some conversations rolling. The issue is that Zend Framework doesn't have a formal model layer, and it shouldn't. Models are very much specific to the domain you're working on. As I mentioned in my previous post, sometimes you have complex business logic that needs to change rapidly and adapt to growing business needs. Active record can only get you so far when complexity expands.

I've posted a working Zend Framework project that has a basic working set of models so far. The current concept has a User who in turn owns a profile object. That profile object holds a nickname attribute that we can play around with. I have two tables, users and profiles, that are joined with a foreign key (user_id). Normally when you want profile information you'd just do a join and work with the result. What if that profile becomes context-sensitive? What if a user can have a different profile based on the location he's in? Maybe he's a CIA agent, and if he's overseas he needs a different profile to stay covert. You'd now have to start adding conditions to your logic to account for this. Having the user own a profile object that can be injected at runtime makes these contexts more manageable.
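A minimal sketch of the "user owns an injectable profile" idea, in Python rather than PHP. The class names mirror the concept above, but the profile_for selector, the location strings, and the nicknames are hypothetical and are not taken from the posted project.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    nickname: str


class User:
    """A user that owns a profile object instead of reading joined columns."""

    def __init__(self, user_id: int, profile: Profile) -> None:
        self.user_id = user_id
        self.profile = profile

    def greeting(self) -> str:
        return f"Hello, {self.profile.nickname}"


def profile_for(location: str) -> Profile:
    """Hypothetical selector: pick a profile based on where the user is."""
    if location == "overseas":
        return Profile(nickname="Mr. Smith")  # cover identity
    return Profile(nickname="Agent Jones")


if __name__ == "__main__":
    user = User(1, profile_for("home"))
    print(user.greeting())
    # Inject a different profile at runtime; User itself does not change.
    user.profile = profile_for("overseas")
    print(user.greeting())
```

The location condition lives in one place, and the code that uses the user never needs to know which profile it was given.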

What if your user is going to be used in a workflow engine, or in a piece of marketing business logic that operates on users? It would be nice to be able to pass around a complete User object with well-documented properties and APIs instead of forcing each business logic service to know the details of your database.

I'm currently in requirements mode for an upcoming project that should prove to be pretty complex. The current active record/table gateway patterns just aren't going to cut it for the complex business logic that's approaching. I'm starting to lean towards the domain model approach, which would increase the initial complexity of the design but allow flexibility for future changes and features. The issue is where to put your business logic. Most of the Rails and Zend Framework examples show an active-record-style pattern, which means that the business logic is tied to the persistence layer.

For example...

$user = new User();
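For contrast, the domain model direction I'm leaning towards keeps the business rules on the object and moves storage behind a separate class. A rough sketch in Python rather than PHP; the deactivate rule and the InMemoryUserRepository are hypothetical stand-ins, not code from the project.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class User:
    """Domain object: holds business rules and knows nothing about storage."""

    user_id: int
    email: str
    active: bool = True

    def deactivate(self) -> None:
        # Business rule lives on the model, not in the persistence layer.
        if not self.active:
            raise ValueError("user is already inactive")
        self.active = False


class InMemoryUserRepository:
    """Persistence lives here; a database-backed class could expose the same methods."""

    def __init__(self) -> None:
        self._rows: Dict[int, User] = {}

    def save(self, user: User) -> None:
        self._rows[user.user_id] = user

    def find(self, user_id: int) -> Optional[User]:
        return self._rows.get(user_id)


if __name__ == "__main__":
    repo = InMemoryUserRepository()
    user = User(user_id=1, email="someone@example.com")
    user.deactivate()
    repo.save(user)
    print(repo.find(1))
```

Swapping the repository for a real database implementation would not touch the User class, which is the flexibility the extra up-front design is meant to buy.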

As you may know, there are many storage engines in MySQL: MyISAM, InnoDB, Falcon, CSV, Blackhole, Archive, etc. One of the storage engines that comes with the MySQL Max Download is called the "Blackhole Engine". According to the documentation, it basically dumps its storage to /dev/null. A storage engine that doesn't store anything? What good could that be?

Well, if you run a high-volume production system where you may have one or more master databases for writes/updates/deletes and a whole farm of slaves reading the log from that master, then this may be of interest to you. The concept is pretty simple. You have a Master database that is in charge of all your inserts, deletes, and updates, which in turn has connections to all those slaves. That means network traffic, disk I/O, and CPU power are all being spent on serving the slaves, when you really want those resources for the Master's primary goal of collecting and maintaining data.

This is where the Blackhole Engine comes in. The logging of the SQL statements that hit the Master database, which is what the slaves consume, happens above the storage engine level, in the main MySQL server. So with the Blackhole Engine piping data to /dev/null, you can actually use it as a proxy to your slave databases without the need to duplicate the data on that machine (it could very well be on the same machine!). See below for an example image...

So I had a little wiggle room in the budget, and one of the things I wanted to do was improve the team's knowledge level. I contracted out training and consulting with several firms, the first of which is MySQL (Sun Microsystems). I wanted an onsite developer training course for a week. It seemed simple enough at the time. How hard could getting training be? I already had approval from the higher-ups to go ahead. Here is the process I had to go through...

1. Contact MySQL to put together a training package, work out costs, requirements, etc...

2. Get Statement of Work from MySQL to start the process on our side... ok everything is fine so far