Real-Time Analytics with Apache Storm

In the world of large-scale data analytics, there are two modes of processing: batch and real-time. Sometimes the former is fine, such as when log files need to be periodically analyzed using a framework such as MapReduce. In other applications, such as advertising, processing data in real time becomes paramount. This is where Apache Storm comes in.

I used Udacity’s Real-Time Analytics with Apache Storm course to work with Storm: setting up my own Topologies, going from a basic Word-count application to an application linked to the Twitter sample stream.

On testing Topologies:
Construct your topology gradually.
Start on one node to test, then distribute the topology.

Capacity planning:
Test a sample data set on a small setup, and see whether that setup has enough throughput capacity to handle the amount of data coming through. This is done by monitoring CPU usage over several days, to capture usage patterns during a day or over the course of a week. Importantly, these CPU profiles need to show some headroom to allow for load fluctuations.
At Twitter, a particular topology will be tested like this for a few days before going to production.

Computing aggregates
If you’d like to compute, say, a moving average, you can store the last few minutes of data in a bolt and periodically write the aggregate statistic to persistent key-value storage such as Redis.
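The idea can be sketched roughly as follows. This is only an illustration in plain JavaScript, not Storm's API: the class SlidingWindowAverage, the function flushAggregate and the key "avg:words" are hypothetical names, and a Map stands in for a key-value store such as Redis.

```javascript
// Sketch of a sliding-window average, as a bolt might keep it in memory.
// All names here are hypothetical; this is not Storm's API.
class SlidingWindowAverage {
  constructor(windowMs) {
    this.windowMs = windowMs;
    this.samples = []; // { time, value } pairs, oldest first
    this.sum = 0;
  }

  // Record a new value and drop samples that fell out of the window.
  add(value, now) {
    this.samples.push({ time: now, value });
    this.sum += value;
    this.evict(now);
  }

  // Remove samples that are at least windowMs old.
  evict(now) {
    while (this.samples.length > 0 && now - this.samples[0].time >= this.windowMs) {
      this.sum -= this.samples.shift().value;
    }
  }

  // Average over the samples currently in the window, or null if empty.
  average(now) {
    this.evict(now);
    return this.samples.length === 0 ? null : this.sum / this.samples.length;
  }
}

// Assumption: a Map standing in for persistent key-value storage.
const store = new Map();

// Meant to be called periodically (for example from a timer).
function flushAggregate(win, key, now) {
  store.set(key, win.average(now));
}

const window5m = new SlidingWindowAverage(5 * 60 * 1000);
window5m.add(10, 0);
window5m.add(20, 60000);
window5m.add(30, 120000);
flushAggregate(window5m, "avg:words", 120000);
console.log(store.get("avg:words")); // 20
```

Timestamps are passed in explicitly so the sketch is easy to test; a real bolt would more likely read the clock itself and write to the store through a client library.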

Peter Thiel’s Model for Innovation in the Next 20 Years

I almost started this post by claiming that Peter Thiel is “obviously” one of the most well-known figures in Silicon Valley. Then I realized that, though quite probably the case within the Valley, he’s unlikely to be as well known outside of this geographic region.

To a broader audience, Peter Thiel (born 1967) is a Frankfurt, Germany-born entrepreneur, investor and businessman whose track record includes two Stanford degrees, being an early investor in Facebook, co-founding the software company Palantir, and more.

This post is essentially about the way we think about the future. Specifically, Thiel posits that we should catalyze a renewed focus on technology and innovation in our society by learning from the ’50s and ’60s. We can do this by “look[ing] at all the science fiction book[s] that were written at the time, and [..] all the things that didn’t happen, and making a concerted effort to make that happen.” Full video here.

So what did ’50s and ’60s science fiction envision that has (partially) happened?

  • computer revolution
  • the Internet

So what did ’50s and ’60s science fiction envision that hasn’t happened?

  • space travel[1]
  • robots[2]
  • underwater cities
  • flying cars[3]
  • reforestation of the desert

I’m always looking for intelligent frameworks and methodologies for thinking about the future of technology itself, and its impact on humanity. Thiel’s proposal to take a lead from the ’50s and ’60s seems potentially quite fruitful. A quick search yields the following collection of books. Note: I’ve allowed myself to be partly biased by both Thiel’s mention of the late ’60s (for that decade) and titles I’ve heard described as (near-)classics before (for both decades).

  • 1953: “Childhood’s End” by Arthur C. Clarke
  • 1961: “Stranger in a Strange Land” by Robert A. Heinlein
  • 1950: “I, Robot” by Isaac Asimov
  • 1965: “Dune” by Frank Herbert
  • 1968: “Do Androids Dream of Electric Sheep?” by Philip K. Dick
  • 1966: “Flowers for Algernon” by Daniel Keyes
  • 1969: “Slaughterhouse-Five” by Kurt Vonnegut

I can’t wait to dive into at least one of these.

Note: Thiel mentions the phrase “Critical vectors of technological progress over the next 20 years.” At first glance, this seems like a phrase I’d like to keep in mind as a useful framework for thinking about the future of technology.

[1] SpaceX is on top of it. There is a nascent private space industry, but need MOAR.
[2] Watch this and this.
[3] Peter Thiel himself has said flying cars would be a “bad idea” since they’d never be allowed to take off.

What’s this all about, anyway?

Well, fundamentally: information. On the spectrum from theoretical to applied, this is about the nature of information and its relationship to physical reality and, on the more applied side, about viewing software as the creation and manipulation of information. A software engineer applies the principles of engineering to software. Some of these principles are inherited from engineering in the broader sense; some are specific to software.

From a mathematical perspective, I seek to build an intellectual framework of all the above by formulating precise definitions and relationships, and sets and subsets.

* Information is the only resource that grows as it is used.

Who do I write for?

Myself, to be honest.

Javascript: Lexicographic Sorting

In Javascript, what would you expect the following expression to produce?

[1, 10, 8].sort();

If you thought this would evaluate to [1, 8, 10], you would benefit from a better understanding of the way Javascript deals with arrays. The above expression actually evaluates to:

[1, 10, 8].sort(); // [1, 10, 8]

Why is this? The answer:

  • Javascript arrays can contain values of any type, including null, String, Number, Boolean or another Object;
  • therefore, when no compare function is supplied, the Array.prototype.sort() method converts each element to a string and sorts the results in lexicographic order (by UTF-16 code unit values, roughly alphabetical sorting), as opposed to numerical order.
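To sort numerically instead, you can pass a compare function to sort(). A minimal sketch; the variable names are just for illustration, and the arrays are copied first because sort() sorts in place:

```javascript
const values = [1, 10, 8];

// Default sort: elements are compared as strings, so "10" sorts before "8".
const lexicographic = [...values].sort();

// Numeric sort: the compare function returns a negative, zero, or positive number.
const ascending = [...values].sort((a, b) => a - b);
const descending = [...values].sort((a, b) => b - a);

console.log(lexicographic); // [ 1, 10, 8 ]
console.log(ascending);     // [ 1, 8, 10 ]
console.log(descending);    // [ 10, 8, 1 ]
```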

Wearable Tech Pioneer Predicts Google Glass Future

Georgia Tech Director of Contextual Computing Group Thad Starner discusses wearable computing and Google Glass technology on Bloomberg Television’s “Bloomberg West.”

In the interview, Thad Starner makes several interesting points:

  • The first version of a “Glass-like” device came to be in 1993, for which they coined the phrase “wearable computing”.
  • Starner sees the next major breakthrough coming: “a lot of it will be intelligence”, “smart assistants listening in on (your side of) conversations.” [..] “a computer that lives your life with you, learns about the human world.”[1]
  • Wearable technology infused and linked with intelligence will provide an “English butler experience for the average person”.
  • Smartphones are too slow to interact with, and don’t have access to your environment, the context in which you live and operate from minute to minute.
  • Google had independently decided to move into wearable computing, two months before hiring Mr. Starner to be one of the tech leads on the Glass project at Google[X].
  • Wearable devices have been around since the 1990s. The first were mp3 players, which are now ubiquitous. “We’ll see a constellation of devices around the body.”

[1] The recent movie “Her” offers an interesting vision of what it might be like to live with an “AI assistant” looking over our shoulder.

Stephen Wolfram Introduces ‘His Biggest Project Yet’

Stephen Wolfram, the genius behind Mathematica and Wolfram Alpha, announced a few weeks ago his latest and, according to him, most significant project yet: the Wolfram Language. Today, VentureBeat features an in-depth look at what they speculate might one day result in ‘Sentient Code.’

In Wolfram’s own words,

“Mathematica is this perfect precise computation engine, and WolframAlpha is general information about the world,” Wolfram told me. “Now we can combine the two.”

Now before you even think of comparing this technology to something as primitive as Google’s ‘Knowledge Graph’, think again. Wolfram:

“The knowledge graph is a vastly less ambitious project than what we’ve been doing at Wolfram Alpha,” Wolfram says quickly when I bring it up. “It’s just Wikipedia and other data.” […]
“Making the world computable is a much higher bar than being able to generate Wikipedia-style information … a very different thing. What we’ve tried to do is insanely more ambitious.”

Hyperbole: check. But seriously, even though Wolfram himself finds this project ‘the most horribly complicated to explain’ of all he’s ever done, he is able to give us a high-level description of the purpose of the Wolfram Language:

“In general, what we’re trying to do is so that as long as a person can describe what they want, our goal is to get that done. A human defines what the goal should be, and a computer does its best to figure out what that means, and does its best to do it,” Wolfram says.
This is only possible because the new Wolfram computational framework includes the complex and precise algorithms developed in over 20 years of Mathematica development, plus the knowledge engine built up inside WolframAlpha.

What particularly fascinated me were Wolfram’s reflections towards the end of the article.

“Today, there are probably 10-50 billion computers in the world today, depending on how you define them, and lots of devices have computers in them,” he told me. “In the near future, almost everything will be made of computers, even small objects. At that point, computation becomes even more important than it is today, and things are adaptable and modifiable at all levels.”
Then he started to slow down, thinking as he spoke. Wolfram is talking, perhaps, about the singularity, the point at which intelligence is the single defining factor of everything, and development accelerates at a pace we can’t currently begin to comprehend, and the world changes much more rapidly and much more profoundly than we can imagine.

The future he predicts is not too far-fetched:

  • Within the 2010s, projections indicate there will be up to 50 Billion devices connected to the Internet;
  • Within the 2010s, projections also indicate up to 5 Billion people, a majority of humanity, will become connected to the Internet;
  • Chips keep shrinking, making computing possible on an ever smaller scale and thereby more pervasive.

Technological components such as Wolfram Language and corresponding Cloud services strongly suggest Wolfram intends to have an impact on accelerating machine intelligence, and ultimately perhaps a ‘Singularity’.

Naturally, it is now inevitable that Wolfram, similar to other geniuses contemplating Artificial Intelligence, will use it to create some kind of alternative persona.

Fun detail: Wolfram Language apparently already has 11,000 pages of documentation.

Post Scriptum: it turns out the reference documentation for the Wolfram Language is available at

Glass Development Kit: Early Release

A sneak peek of Google’s Glass Development Kit (GDK) has been released. To give a sense of the state of the Glass development ecosystem: even the official Developer preview is ‘yet to come’.

The GDK is an addition to the Android SDK and allows developers to build ‘Glassware’ that runs directly on Glass, making use of its native hardware and software functionality, including Voice, Gesture Detection and ‘Cards’. The current documentation includes information on design patterns (ensuring a consistent user experience), and best practices regarding Cards, Menus, ‘Icons and Assets’ and Writing.

I can’t wait to start experimenting with Glass as soon as possible.

The Future of JavaScript: ECMAScript 6

JavaScript was developed by Brendan Eich and released in 1995. A year later it was submitted to Ecma International for standardization. The resulting standard, ECMAScript, was first published in 1997 and today is the foundation of not only JavaScript but also ActionScript, JScript and others.

Today’s version of JavaScript is based on ECMAScript 3 and its successor, ECMAScript 5 (published in 2009); ECMAScript 4 was abandoned before it was finalized. The next big thing, JavaScript 2.0, is ECMAScript 6, which will introduce a lot of new functionality, some of which has been available for some time in popular libraries such as Underscore.js.

ECMAScript 6 is code-named “Harmony”. It is meant to be a better language for the complex applications and libraries that thousands of developers are using JS for today.

Let’s look at some of these new features, and some in more detail.

Ways to use ECMA6 today, before the standard is finalized:

It appears that, shockingly, “ECMA6” support will vary among browsers.

  • Default function parameters (specification documentation):
      var myNameIs = function(firstName = "John", lastName = "Doe") {
        console.log("Hi, my name is " + firstName + " " + lastName);
      };
  • Sets
  • Maps
  • ‘Destructuring’ for Arrays and Objects
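A short sketch of what the features listed above look like in practice, assuming an engine that supports them. The function greet and the sample values are made up for illustration:

```javascript
// Default function parameters: used when an argument is omitted.
function greet(firstName = "John", lastName = "Doe") {
  return "Hi, my name is " + firstName + " " + lastName;
}

// Set: a collection of unique values; the duplicate "storm" is ignored.
const tags = new Set(["storm", "glass", "storm"]);

// Map: key-value pairs whose keys can be of any type.
const counts = new Map();
counts.set("storm", 2);
counts.set("glass", 1);

// Destructuring for arrays and objects.
const [first, second] = [1, 10, 8];
const { title, author } = { title: "Dune", author: "Frank Herbert" };

console.log(greet());             // Hi, my name is John Doe
console.log(greet("Jane"));       // Hi, my name is Jane Doe
console.log(tags.size);           // 2
console.log(counts.get("storm")); // 2
console.log(first, second);       // 1 10
console.log(title, author);       // Dune Frank Herbert
```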