Author Archives: joecrow
using jarjar to solve hive and pig antlr conflicts
Pig 0.9+ and Hive 0.7+ (and maybe older versions, too) both use ANTLR. Unfortunately, they use incompatible versions, which causes problems if you try to pull in both Pig and Hive via Ivy or Maven. Oozie has come up with … Continue reading
Posted in Uncategorized
Workflow Engines for Hadoop
Over the past 2 years, I’ve had the opportunity to work with two open-source workflow engines for Hadoop. I used and contributed to Azkaban, written and open-sourced by LinkedIn, for over a year while I worked at Adconion. Recently, I’ve … Continue reading
Posted in hadoop
4 Comments
Getting Started with Apache Hadoop 0.23.0
Hadoop 0.23.0 was released November 11, 2011. Being the future of the Hadoop platform, it’s worth checking out even though it is an alpha release. Note: Many of the instructions in this article came from trial and error, and there are … Continue reading
Posted in hadoop
15 Comments
Recap: Apache Flume (incubating) User Meetup, Hadoop World 2011 NYC Edition
The Apache Flume (incubating) User Meetup, Hadoop World 2011 NYC Edition was Wednesday, November 9. It was collocated with the Hive Meetup at Palantir’s awesome office space in the meatpacking district in Manhattan. The following are my notes from the two … Continue reading
Recap: April Puppet NYC Meetup
Last week, I attended my first Puppet NYC meetup, which was hosted at Gilt Groupe. As a fairly recent user of Puppet, it was great to meet some folks from the community in NYC who are using it on a … Continue reading
Silently broken Gmail
At work, we have Google Apps, which comes with several gigs of Gmail storage. For email, though, we use an Outlook server with a low quota. Rather than deleting email, I “archive” to Gmail via IMAP. One day, though, Gmail IMAP … Continue reading
Posted in Apple, Programming
1 Comment
two puppet tricks: combining arrays and local tests
Joining Arrays: I found myself wanting to join a bunch of arrays in my Puppet manifests. I had 3 lists of IP addresses, but wanted to join all 3 into a single list to provide all IPs to … Continue reading
Python setup.py bdist_rpm on CentOS 5.5
I recently learned that python setup.py can be used to build an RPM using the bdist_rpm command. Since we’re using Puppet to manage installed software, this makes it really easy to add Python modules to a bunch of servers. During the … Continue reading
Moving wordpress blog to lighttpd
I’ve moved my WordPress blog from a hosted account on godaddy.com to a server that’s running lighttpd on Ubuntu. The move was more complex than I expected, so I thought I’d share some details for others… I already had lighttpd … Continue reading
Posted in Linux
3 Comments
JAVA_HOME on Mac OS X
I was working on configuring HBase to run on my Mac OS X machine, and I ran into a hiccup setting up the JAVA_HOME environment variable. Eventually, I determined that there’s a “Home” directory inside of each Java Framework. So, … Continue reading
Posted in Apple