June 16, 2011

Hadoop competitor

Interesting – if it works and deploys well, HPCC is a good thing for the scalable processing space.

April 19, 2011

Favorite Portal 2 quote so far

Either:

Okay, look. We both said a lot of things that you are going to regret. But I think we can put our differences behind us. For science… you monster

or

Do you know the biggest lesson I learned from what you did? We’re a lot alike, you and I. You tested me, I tested you. You killed me, I – oh no, wait. I guess I haven’t killed you yet. Well … food for thought

April 12, 2011

Android app in 1.5 hours

I have to give great props to the Google Android team. I was able to go from “I want to build an app for my phone” to “My app is now published on the Android Market” in approximately 90 minutes of effort.

1) Download the SDK – that was easy.

2) Add the plugin to Eclipse – also easy

3) Set up the various platforms in the SDK “system”. This was confusing at first. Eventually I realized that one wants (typically) to go for ‘lowest common denominator’ – so Android 1.5 (API level 3)

4) Find a skeletal app to rework. I was writing a soundboard, so it turned out to be incredibly easy to repurpose.

5) Make my app changes – easy, because of the template.

6) Load the emulator and try out my changes. This took a little while to get right, because I didn’t have the target set up properly in my app.

7) Run the emulator and test my code – straightforward

8) Create a package-able version, and load it directly onto my phone

9) Use Astro file manager to install the app from the phone itself – easy

10) Verify that the app worked on my phone – easy, surprisingly so.

11) Create a signed package for publishing (this took a little while because the wizard was a little confusing, but not horrible)

12) Sign up as an Android developer. This cost me $25, but was very easy.

13) Upload my app. This took a little while because I had to rename it (there was a conflict), create screenshots, etc.

14) Publish.

Within minutes, the app was live on the Android Market.

February 16, 2011

Startups and Agile Teams

I have spent a long time working in software startups of various stripes, and I have also spent a long time working with Agile teams delivering software of various stripes.

It is difficult to sufficiently emphasize how similar those two models are. Think about it:

Startup | Agile
What’s the simplest thing that could possibly sell? | Working Software over Comprehensive Documentation
Hire smart, talented people, and most of the problems will take care of themselves | Individuals and interactions over processes and tools
Get something out quick so your customers can give you feedback | Customer collaboration over contract negotiation
What you end up building is almost never what you originally envisioned | Responding to change over following a plan

Read this fascinating article about Dropbox, and how it succeeded where its competitors did not. It should help you down this path.

January 4, 2011

random Jim Croce-inspired madness

If I could save Time in a Timesheet
if Jobs could make iPods call you
Google would cache every page til Eternity passes
and then it would trend them for you….

January 3, 2011

one of the most useful commands ever for j2ee development

at least for me:

find . -name \*.jar -exec jar -tf \{\} \; -print > ../alljars.txt

What this does:

  1. finds every file in or below this directory with an extension of .jar
  2. creates a listing of every file in that .jar file
  3. writes that listing, followed by the name of the originating jar file, into one big text file: alljars.txt

You can now search alljars.txt for the classes or resources you need – when you find one, scroll down to the next .jar file name, which is the jar that contains it.
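The same idea can be sketched in Python for cases where scrolling through one big listing gets tedious. This is only an illustration, not a replacement for the one-liner: it relies on the fact that a .jar file is a zip archive, uses the standard zipfile module, and returns the jar path next to each matching entry. The function name find_in_jars and its arguments are made up for this example.

```python
import zipfile
from pathlib import Path


def find_in_jars(root, needle):
    """Return (jar_path, entry_name) pairs whose entry name contains needle."""
    matches = []
    for jar in sorted(Path(root).rglob("*.jar")):
        if not jar.is_file():
            # Skip directories that merely happen to end in .jar.
            continue
        try:
            with zipfile.ZipFile(jar) as archive:
                for entry in archive.namelist():
                    if needle in entry:
                        matches.append((str(jar), entry))
        except zipfile.BadZipFile:
            # Skip files named .jar that are not valid archives.
            continue
    return matches
```

Calling find_in_jars(".", "StringUtils.class") from the top of a project would list every jar below the current directory that contains an entry with that name (StringUtils.class is just a sample search term).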

December 17, 2010

Interesting thoughts on distributed computing

But not for the faint of heart:

http://db.cs.berkeley.edu/jmh/calm-cidr-short.pdf

September 24, 2010

A fun poem

A physicist may be described
(to first approximation)
As a simple prolate spheroid
Of infectious obfuscation.
Attempts to oversimplify
Reveal their odd propensity
To speak of spheroid cattle
Which are uniform in density—
Their perfect planes are frictionless;
Collisions are elastic;
They’re rarely seen acknowledging
The random or stochastic.
The chaos of the world outside
May leave them full of fears;
Such terra incognita
Might be filled with… Engineers!

September 15, 2010

Big Ball of Mud is the “most popular” software architecture

I read this, and I am somewhat indignant:

Big Ball of Mud still seems to be the most popular way to design and architect software.

Just because something is ‘common’ doesn’t make it popular.   Your standard everyday cold is pretty common, but it is not popular.   Traffic jams are common, but I doubt anyone involved in them thinks that they are popular.  Wading through bureaucratic red tape is common, not popular.

Primarily, BBoM happens because cost-benefit analysis is time-consuming and difficult.   If programmers, architects and managers could measure and understand the longer-term cost of short-term poor decision making, we would get better decisions.    Remedy?  There’s no one magic bullet, but I suspect that a focus on code coverage and limited code complexity is a great place to start – I don’t know that I’ve ever seen a BBoM project with high code coverage and low complexity.     This may be correlation and not causation, but I think there’s a legitimate story for how forcing unit testing discipline and simplicity pays dividends in terms of architectural strength.

September 13, 2010

Farewell, Bloglines

I’ve been using Bloglines for a long time, since  2004 if I’m not mistaken.  It’s been a constant and welcome part of my online experience.

Alas, apparently, they could not find a way to make money off of it.

Which is unfortunate, because I always thought that it would have been a fabulous corporate knowledge-sharing tool – a “private” Bloglines, within a company, where you could add subscriptions that others could use to keep up with important events and thought leaders in your industry. People could save and rank specific posts so that they spread more widely, helping to identify competitive threats and potential strategic opportunities.

I realize that this model is not without challenges (“We don’t want our employees visiting the web!”), but I’m sure there would have been some organizations with enough strategic vision to see the opportunity inherent in such “corp-sourcing”. Enough, I would imagine, that they could have made enough money to keep the public site going. Alas.

Also, I see various mentions in the Bloglines obituaries that suggest that the day of the RSS reader is done – that we’re replacing it with social link-sharing like Twitter or Facebook. As someone who generally produces more of these links than I consume, I am puzzled. RSS readers allowed me to review a wide assortment of feeds at my leisure, while Twitter and Facebook are far more ephemeral and constrained to the strategies I use to follow people and to be followed in return.

September 10, 2010

This just in

Programmer who cares deeply about performance disagrees with claim that ‘Premature Optimization is the root of all evil’

September 3, 2010

Perspectives…

Glenn Alleman, who is often critical of some of the less structured aspects of Agile  (not in a nasty or spiteful way) mentions a project he is working on:

I’m working a moderate ($300M) Army program through January

Moderate?  That’s a jaw-droppingly large amount of money, and IMO, it explains a lot of the friction between his perspective and a more classical “agile” perspective – agile projects are (in my experience) 100-1000x less expensive, with a corresponding lack of scrutiny/accountability/oversight from management.  Many of the problems/issues that agile was designed to resolve would never happen on a project that large, because that’s way too much money to be sloshing around without high levels of management accountability.

Anyways, I am endlessly fascinated by all the different ways that people can build things.

August 27, 2010

Issues & Concerns w/Google Web Toolkit 2.0

If you’re considering using GWT to do web development, here are some of the issues we’ve encountered – sufficiently difficult and frustrating that the organization has decided to abandon GWT and return to Spring MVC as the web tier.

  1. Difficult to integrate into SEO – I didn’t see this directly (because I am not focused on SEO) but apparently all that javascript makes indexing the pages much, much harder
  2. Unit Testing – the design of GWT widgets makes it very difficult to write unit tests using anything other than the built-in GWTTestCase, which isn’t helpful for our situation (we need the output in a particular format for tracking purposes). So essentially we didn’t unit test our GWT widgets much at all
  3. Which means our code coverage was very low, and it requires a lot of “jumping through hoops” to get the code coverage to even modest (40%) levels.
  4. Most people can’t seem to get used to the programmatic paradigm – horizontalPanel.add( verticalPanel3 );    It’s very hard to debug programmatic panel creation, and it’s easy to get something wrong, and not realize it until you’ve compiled and built everything.   I know they’ve added some XML-based visual building recently, but unfortunately we’re working with legacy code
  5. Compilation time – it seems to take a _long_ time to build all the locales and browser variants, and going in and restricting the list of locales and browsers made it harder to track down bugs

This was a large organization, with a lot of resources and fairly smart people, and GWT just simply defeated them – they could not figure out a way to get it to behave in a way that their organization could absorb.    I’m sure all the GWT experts are sneering at our “pathetic lack of skillz” and that’s fine (whatever!), but for me, I don’t plan on recommending GWT to any of my clients.

July 22, 2010

JavaScript Native Interface & History Repeating

I’ve only recently started to look at the Google Web Toolkit.   I haven’t gotten far enough into the implementation and usage of it to make a firm decision, but I do like the philosophical concepts (which I will get into later).  Yeah, I’m probably late to the game on GWT, but I was early to the game on a bunch of other things, so it balances out, I think.

One thing that I was surprised by (pleasantly) is the “JSNI” – the way that the GWT allows the user to “drop in” javascript in situations where the GWT widgets just won’t do.

Here’s the first blog post I found about JSNI. I love the first comment.

JavaScript is king in the browser and GWT is for cowards.

Hee hee.  Go back 20 years or so, and you’d see the exact same argument, only with different names:

Assembly is king in graphics, and C is for cowards

Pretty much the same situation – a certain group of people have made their living from being experts at something cryptic and difficult.  Along comes something (in the older case, DirectX) that attempts to simplify that difficult thing, and those experts begin flinging poo at it.

This was back in ’07, of course. I wonder if those people still hate GWT and the leaky abstractions it represents.

July 21, 2010

Using Hadoop for Data Mining

I wrote a whitepaper on Hadoop, and how you can use it to perform Business Intelligence on data that’s too expensive to analyze using existing solutions, either because the data is too messy, too voluminous, or both.

There are other uses for Hadoop, but I think this is one of the most strategic.

Let me know your thoughts!

What is Hadoop

My recent post on Hadoop may leave people wondering “WTH is Hadoop?”.

Well first, if anything in the world can be called “Cloud Computing”, Hadoop can.

Hadoop is an open source software system that creates two things:

1) A highly scalable, fault-tolerant distributed file system (loosely based on the Google File System)

2) A highly scalable implementation of Google’s MapReduce algorithm

And it’s open source, and free, and has been in use at Facebook and Yahoo for several years now.

Your next question may be “What is MapReduce?”

MapReduce is a programming model that splits a large amount of data into smaller chunks that can be processed in parallel, and allows the data to be sorted and aggregated in various ways. It’s one of the cornerstones of Google’s massive software infrastructure – a system that lets Google process all the data that comes in about who is linking to whom, and which tags and text are being used, etc.
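To make the split, sort and aggregate idea concrete, here is a toy word count in plain Python. It is only a sketch of the concept running in a single process: it does not use the Hadoop API, and the function names (map_phase, shuffle, reduce_phase, word_count) and the choice of word counting as the task are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, in sorted key order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_phase(key, values):
    """Reduce: aggregate the values for one key (here, a simple sum)."""
    return key, sum(values)


def word_count(chunks):
    # In a real cluster each chunk could be mapped on a different machine;
    # here the chunks are simply processed one after another.
    emitted = [pair for chunk in chunks for pair in map_phase(chunk)]
    return dict(reduce_phase(k, v) for k, v in shuffle(emitted).items())
```

Because each chunk is mapped independently and each key is reduced independently, both steps can in principle be spread across many machines, which is where the scalability comes from.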

Essentially, Hadoop is a cloud-based data analysis tool – something that can scale very cost-effectively, and can chew through terabytes and/or petabytes of data using off-the-shelf computers with off-the-shelf operating systems and hardware.

Latest Strategic Hadoop News

Momentum continues to grow for Hadoop – its ability to perform large-scale data mining and data cleansing cost-effectively is a considerable draw.

June 21, 2010

My theory on software problems

We seem to be rapidly converging on three types of problems in the software domain:

A) Problems that can’t be solved easily by humans, but are trivial for computers, even at large scale

B) Problems that can’t be solved easily by humans or computers (primarily because of scale)

C) Problems that are easily solved by humans, but nigh-impossible for computers

Over time, we continue to see set (B) shrink and set (A) increase.    But we see very little improvement in (C).

Examples of (C) include:

  • Voice recognition
  • Establishing connections between pieces of data, based on semantics
  • Natural Language Processing
  • Monster AI (in games)
  • Forecasting & Predictions
  • Troubleshooting and Debugging
  • Developing Software

Over the last 20 years (or so), I have seen people predict confidently that any one of these problems would be easily solved in the next few years, and, without exception, they have been wrong.  Not a little wrong.  Not slightly wrong – spectacularly and utterly wrong.

I know this, because I remember the frustration associated with arguing with the “visionaries” about these problems – they would posit “X” – “We will see computers automatically connect semantic markup”.  I would object that this was a far more complicated problem than they thought, and they would sniff at me, and roll their eyes – I “just didn’t get it”.     Or they would predict that no-one would be writing software in 10 years, or that IT would disappear, etc, etc, etc.

Well, I’m tired of this disdain for the real world, and here’s the graph to show exactly how right I was.

The fact is – set (C) above is the set of things that require human-level AI to solve. But once we have human-level AI, all of these problems become trivial at the same time. (This is analogous to NP-complete problems, where an efficient solution to any one of them would yield an efficient solution to all of them.)

We chip away at the AI problem a little every year – computers get faster, algorithms get smarter, things that were essentially impossible become simply difficult, etc.   But until we get a near-complete AI model working reliably, I submit that we will not see the problems at the top of this post go away.

Think about it – each of these problems requires a rich understanding of human context.  Of judgement, of positing alternate universes in one’s head, in order to determine if alternate paths lead to success or failure.   And until a piece of software has the ability to do those things, it will not solve these problems.

So from now on, my answer to all pretenders to this throne will be: “The problem you describe requires human-level AI to solve. If you want to make progress on your problem, go solve the human-AI problem first.”

/rant off

April 27, 2010

I see your meta and meta it!

Apple says that you must write apps for the iPhone in C, C++ or Objective-C.

Surely the clever folks at Adobe could produce a Flash compiler that produces C code, not executable object code. Then, the developer could take said generated C code and compile it legitimately using approved compilers.

April 20, 2010

Google vs. Bing – round 2

I have been struggling for the last 24 hours or so to launch a child thread in my Spring/Hibernate/Struts2 J2EE application.    The challenge was that the child thread needed to be able to read/write the database and access services, etc.

I used Google, asked a bunch of questions, tried a bunch of different things, but was still stuck.

So in desperation, I turned to Bing, and asked one of the questions again.  And one of the first results on Bing was a clue that led me to the last piece of the puzzle, and success.

Having said that, most of the solution was described properly by Google, there was just one bit of information I hadn’t picked up.

So I’d say Google gets 80% of the credit, Bing gets 20%.    But that’s about 19% more than I would have credited Bing before.