(I'm) easy like sunday morning

Pumba, go fetch...

Cool Hacker

Hacking road traffic alerts.
http://www.youtube.com/watch?v=ZR0U1nyQlD0

Broadcasting an SMS to all the girls in a bar at the same time.
http://www.youtube.com/watch?v=7JpRQu5HQEQ

Playing with train station platform announcements.
http://www.youtube.com/watch?v=CxwPEQLGh9M

Remote desktop into a co-worker's computer in front of the boss.
http://www.youtube.com/watch?v=sllHBOERLKg

Changing LIVE TV autocue speed.
http://www.youtube.com/watch?v=0pAr7G9jbeo

Don't know if it's real, but it sure looks cool :)

Google, the Big Brother

(A)Typical Day?

It’s morning. You get to your computer and go check Gmail. Ah, there are some new photos on Picasa Web Albums from a friend, let’s check them out…
Then, what’s new today? Let’s check some RSS feeds in Google Reader. And by the way, how is the real world doing? Let’s check Google News. Hmm, this is interesting, let’s do some browsing on it. Open Google Search …
search, search, search …

Lunch time. Check Gmail again, then Gtalk pops up inside the browser with a friend asking how you’re doing, and you chat a little…
Let’s also take a peek at how the stock options are doing on Google Finance.
search, search, search …

It’s late afternoon and you have a dinner scheduled. Hmm, where exactly is that restaurant’s street? Let’s check Google Maps (or Google Earth for an even cooler experience). You find out exactly where the restaurant is, and even add a placemark…

You get back home from dinner, download your digital camera pictures onto the computer, upload them to Picasa Web Albums, and make a blog post (on Google Blogger) showing a picture from that night, adding some comments on where you were, what you did, and who you were with. Check your mail again: a friend says, check out this cool new YouTube video. Let’s check that out, and maybe 1, 2, 3… 15 more …

Besides all these, you might also use Google for making your home page, sharing documents, sharing a calendar, Google Analytics for collecting your web site usage statistics, Google Desktop Search for fast searches on your computer, Google Shopping, Google Images, etc, etc…

So overall, how many Google applications are you using? How much information does Google have, or potentially have, about you? Of course, even if in theory it looks possible to collect and analyze all this data, in practice it ends up being big, hard and complex work… anyway… we can only imagine …

But from a data mining perspective, this is very exciting. Imagine the knowledge contained in all the mail you get, the newsletters, the photos you take, the places in the world you look at, where you go, what images you search for, what you click on, what your interests are, what you blog about, what news you subscribe to, what videos you watch, maybe some information on what you work on, and maybe some information about your school, with Google Desktop indexing all your documents… it’s endless: your agenda, your shopping habits, etc etc etc etc.

Soon, Google will know more about you than you do. Imagine the ultimate Google application:

Me: Show me what’s new today!
Google: Top 3 news according to your interests:
  • New Apple MacBook Air.
  • Ruby 1.9 released.
  • Estonia weather, on 19 January: Sun is shining, with 24º ! [maybe hoax]

Me: Entertain me!
Google: Do you want to?
  • Play a Joe Satriani CD.
  • Watch a Seinfeld episode.
  • Check out a chocolate cake recipe…
  • Buy a chocolate cake …

Me: What is my favorite color, movie, drink?
Google: black, simpsons, coffee!

Google: Hey!!
Me: yes?
Google: Haven’t you forgotten to pay your water bill? And also, your grandma’s birthday is in 3 days!
Me: hmm … oops ….
Google: Maybe you want to check out this cookbook (link): 13 recipes contain ingredients your grandma likes... and shipping is free to her address area...

Visualizing Data

Brief

In order to learn a bit about data visualization, and also about simple data warehousing, I’ve tried out a small experiment where I set out to build a small visualization application.

In this post I’ll try to cut the boring details as much as possible, so I’ve simplified several things that are not overly important.

I planned the following:


The goal is to create a visualization showing the number of vegetarians around the world. Imagine looking at a world map, with each country showing its number of vegetarians in a graphical way. You should be able to zoom in on Europe, for example, to see which countries lead in vegetarian eating, or zoom out and see the whole world picture; click on a specific country and see its statistics over a whole month; choose a particular day of the month to visualize; and navigate the world map by dragging… this is what I decided to go for…

On the technical side, as I was interested in using the Processing framework, and because I am a Ruby addict, this turned out to be a good excuse to play with JRuby.

Part 1 - Aggregating data


Normally this process involves a lot of work, but I had an easy task: I could collect clean data from another database. I’m interested in the table with vegetarian people. But what to collect, what to summarize, what to calculate?

NOTE: Specify the goal of the visualization up front, as much as possible; it will influence the whole design.

The kind of data aggregation to do depends on the visualization… So I decided that, even though I have tons more data available, I just want to see the overall count of veggies by country by day.

So, in warehouse fashion, let’s choose the facts and dimensions:

Facts:
  • number of vegetarians.
Dimensions:
  • Time.
  • Location (country)
facts: generally numeric data that capture specific values.

dimensions: contain the reference information that gives each transaction its context. When dimensions are created, they should be enriched with as much information as possible (including calculated values).

The next step is to build the “warehouse”. For this I used a plain database, where I created 3 tables:



Country:

Initially I only had the 2-character ISO code identifying the country, but I enriched the dimension with all the other values.
I used the geonames.org web service to collect the other values. Especially important are the geo coordinates of the country bounding box, which were used to calculate a central latitude and a central longitude for each country, to be used in the visualization.
Things like continent, population and capital can be used later for summarizing data by continent, for showing the ratio of veggies to total population, the number of veggies per square meter, etc etc… think of the possibilities… :)
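
As a rough illustration of the bounding box step, here is a minimal Ruby sketch. The BoundingBox struct, its field names and the sample values are made up for the example (they are not the geonames response format); the only idea taken from the text is averaging the box edges to get a central point.

```ruby
# Hypothetical container for a country's bounding box, in degrees.
BoundingBox = Struct.new(:north, :south, :east, :west)

# Returns [latitude, longitude] of the midpoint of the box.
def bounding_box_center(box)
  lat = (box.north + box.south) / 2.0
  lon = (box.east + box.west) / 2.0
  [lat, lon]
end

# Made-up box, only to show the call.
example = BoundingBox.new(60.0, 50.0, 30.0, 20.0)
lat, lon = bounding_box_center(example)
puts "center: #{lat}, #{lon}"
```

A plain average like this is only a sketch: it gives a poor center for countries whose box crosses the 180º meridian, or whose territory is spread far apart.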

Time:

I did a “group by” day when collecting data from the database. I wanted a day as the finest level of granularity.
So from a day we can calculate: day, month, year, day of week, weekday?, day in year, day in month, quarter, weekday name, etc etc…
What is this useful for? Well, imagine you want to see the number of vegetarians on Wednesdays compared to Mondays, or the same for quarters, or months. Maybe getting close to the summer months, the number of veggies goes up a bit?
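
To make the list of calculated values concrete, here is a small sketch of how one row of the date dimension could be derived with Ruby’s standard Date class. The column names are my assumption for the example; the real dim_date layout is not shown in this post.

```ruby
require 'date'

# Builds one (hypothetical) date dimension row from a Date.
def date_dimension_row(date)
  {
    date:         date,
    day_in_month: date.day,
    month:        date.month,
    year:         date.year,
    day_of_week:  date.cwday,            # 1 = Monday .. 7 = Sunday
    weekday_name: date.strftime('%A'),
    is_weekday:   date.cwday <= 5,       # Monday to Friday
    day_in_year:  date.yday,
    quarter:      (date.month - 1) / 3 + 1
  }
end

row = date_dimension_row(Date.new(2008, 1, 19))
puts row.inspect
```

With columns like these, comparing Wednesdays to Mondays becomes a simple filter on weekday_name instead of a date calculation inside every query.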

Aggregating

With the basic schema laid out, it’s time for data collection. I used the ActiveRecord part of the Rails framework, on JRuby. It’s not the first time I’ve used ActiveRecord standalone and I like it a lot… it simplifies data access hugely, and because it’s all inside Ruby, a couple more lines of code and voilà, all the extra calculated columns get done as well.
These collected and calculated values are then inserted into a local MySQL database using the schema above: fact_vegetarian, dim_date and dim_country.
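
The aggregation itself boils down to counting rows per country per day. The sketch below shows that idea in plain Ruby on in-memory records, not the actual ActiveRecord code; the Person struct and its created_at field are assumptions made up for the example.

```ruby
require 'date'

# Hypothetical source record: one row per vegetarian.
Person = Struct.new(:country_code, :created_at)

# Counts people per [country_code, date], like a SQL
# "GROUP BY country, day" would.
def count_by_country_and_day(people)
  counts = Hash.new(0)
  people.each do |person|
    counts[[person.country_code, person.created_at.to_date]] += 1
  end
  counts
end

# Made-up sample rows.
people = [
  Person.new('PT', Time.new(2008, 1, 19, 9, 30)),
  Person.new('PT', Time.new(2008, 1, 19, 18, 0)),
  Person.new('EE', Time.new(2008, 1, 19, 12, 0)),
  Person.new('PT', Time.new(2008, 1, 20, 8, 15))
]

count_by_country_and_day(people).each do |(country, day), total|
  puts "#{country} #{day}: #{total}"
end
```

Each resulting entry corresponds to one fact row: the count, plus the keys pointing at the country and date dimensions.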

I’ve collected values for a whole month.

I ended up with 225 lines of code for the warehouse part, including some comments… but no repeated code.

Part 2 - Building a Visualizer


What the visualizer does, generically, is shoot some queries at the database, filtered by the current view and by some global vars like date and country, and show the results as bubbles with sizes proportional to the number of veggies in each country.
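
One way to turn a count into a bubble size is sketched below. It is an illustration, not the code from the application: it assumes the bubble area (not the radius) should be proportional to the count, and max_radius is a made-up tuning parameter.

```ruby
# Maps a count to a bubble radius in pixels so that the bubble AREA is
# proportional to the count (an assumption; scaling the radius linearly
# is the other common choice).
def bubble_radius(count, max_count, max_radius = 40.0)
  return 0.0 if max_count <= 0 || count <= 0
  max_radius * Math.sqrt(count.to_f / max_count)
end

[100, 25, 0].each do |count|
  puts "#{count} veggies -> radius #{bubble_radius(count, 100)}"
end
```

Scaling by area keeps a country with 4 times the veggies from looking 16 times bigger, which is what happens when the radius itself is scaled linearly.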

Application was divided into different drawing components:

  • Show World Data, the opening scenario, showing the whole world’s statistics for 1 month.
  • Show Country, used for showing a specific country’s stats.
  • Show Stats, a strip at the bottom showing a graph of the numbers over the month, where the x axis is the days of the month and the y axis is the number of veggies for a given day.
  • Show Buttons, the buttons used to control zoom, reset, etc…
(A refactoring would probably merge Show World Map and Show Country into a single drawing component, as they have a lot of repeated code.)

I created a separate module for each one, which were then mixed into the main class that inherits from Processing.Sketch, to avoid ending up with a big ball of spaghetti code :)
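
The mixin layout looks roughly like the sketch below. All module, class and method names are invented for the illustration, and the class is a plain Ruby object so the example runs on its own; in the real application it would inherit from the Processing sketch class.

```ruby
# One module per drawing component (names are hypothetical).
module ShowWorld
  def draw_world
    log "world view for month #{@active_month}"
  end
end

module ShowStats
  def draw_stats
    log "stats strip for #{@active_country || 'all countries'}"
  end
end

# Stand-in for the main sketch class.
class VeggieSketch
  include ShowWorld
  include ShowStats

  attr_reader :calls

  def initialize(active_month, active_country = nil)
    @active_month = active_month
    @active_country = active_country
    @calls = []
  end

  # Called once per frame in a real sketch; here it just runs each component.
  def draw
    draw_world
    draw_stats
  end

  private

  # Records what would have been drawn.
  def log(message)
    @calls << message
    puts message
  end
end

VeggieSketch.new(1, 'PT').draw
```

Each module can read the shared state (active month, active country) through instance variables, which is what makes the global filters available to every component.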

Defined some global vars, like:

  • mouse coordinates, for the draggable navigation.
  • zoom level, to know the current zoom level.
  • active month, filter for queries.
  • active country, filter for queries.
  • active day, filter for queries.

Made some stuff clickable, like the country codes displayed on top of the countries, so the user can filter by a single country and see its stats at the bottom. This is done by checking how far the mouse position is from the central point of a country.
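
The distance check can be sketched as follows. The Country struct, the pixel coordinates and the hit_radius threshold are assumptions made up for the example; only the idea of measuring the distance from the mouse to each country’s central point comes from the application.

```ruby
# Hypothetical on-screen country: code plus center position in pixels.
Country = Struct.new(:code, :x, :y)

# Returns the country whose center is nearest to the mouse, provided it is
# within hit_radius pixels; otherwise nil.
def country_at(mouse_x, mouse_y, countries, hit_radius = 15.0)
  nearest = countries.min_by { |c| Math.hypot(c.x - mouse_x, c.y - mouse_y) }
  return nil if nearest.nil?
  distance = Math.hypot(nearest.x - mouse_x, nearest.y - mouse_y)
  distance <= hit_radius ? nearest : nil
end

countries = [Country.new('PT', 100, 200), Country.new('EE', 300, 80)]
hit = country_at(104, 203, countries)
puts hit ? "clicked #{hit.code}" : "clicked nothing"
```

Picking the nearest center first avoids selecting the wrong country when two labels sit close together.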

Also at the bottom, the stats strip lets you click on a day of the month on the x axis, so the user can select a particular day, and that updates the world visualization to show the number of veggies on that day for the whole world.



And zoomed out, whole world view:



I ended up with 584 lines of code, with a big chunk of repeated code, and some comments…

Overall, making the visualization was a lot more work than the warehouse part, because I spent a lot of time fighting with coordinate positioning, getting a decent map, and keeping the country coordinates aligned with the map across zoom levels.
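
For reference, the coordinate problem can be sketched like this. It assumes an equirectangular world map image (longitude and latitude mapped linearly to x and y), which may not match the map actually used; the zoom and offset parameters are hypothetical stand-ins for the zoom level and drag position.

```ruby
# Converts latitude/longitude to pixel coordinates on an equirectangular
# map (longitude -180..180 maps to x, latitude 90..-90 maps to y), then
# applies zoom and the drag offset.
def to_screen(lat, lon, width:, height:, zoom: 1.0, offset_x: 0.0, offset_y: 0.0)
  x = (lon + 180.0) / 360.0 * width
  y = (90.0 - lat) / 180.0 * height
  [x * zoom + offset_x, y * zoom + offset_y]
end

x, y = to_screen(0.0, 0.0, width: 720, height: 360)
puts "equator / prime meridian at #{x}, #{y}"
```

Keeping the projection in one function means the bubbles, the country codes and the click detection all agree on where a country is after zooming or dragging.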

Using JRuby was mostly a nice experience. There are a couple of things to learn at first, for example how to include Java libraries. No biggie, but I also had some type conversion issues when I tried to refactor the code at some point. I guess it’s because the Java types, which the JRuby guys hide and convert automatically, can show up in some edge cases? … but then again, it might also be my inexperience with JRuby…

I used version 1.0 of JRuby. I think the JRuby guys have done great work, making all the millions of Java libraries out there accessible to the Ruby community. But of course, don’t expect to write 100% Ruby code like you do with plain old Ruby; sometimes there’s some Java lurking out of the JRuby box.

Processing is great and also has huge potential. I had trouble with 1 or 2 plugins I tried, but I ended up using the base distribution, and that works and feels 100%. It’s probably not intended for full applications, but more for sketches and small standalone visualizations, which is fine. I look forward to doing more stuff with it, it’s fun!

Textile vs Markdown vs Multimarkdown vs Maruku

A lightweight markup language is a markup language with a simple syntax, designed to be easy for a human to enter with a simple text editor, and easy to read in its raw form.
Wikipedia.

They are the language of choice for wikis, but can also be used for creating regular documents.

Textile, Markdown, MultiMarkdown and Maruku are the different flavors of lightweight markup languages that I’ve tried so far.

Textile

Nice: Has a lot of formatting features. Tables, colors, etc…

Could be better: The text editing experience ends up being almost like editing programming code. It does not feel like a text document. Too many strange characters around.

Markdown

Nice: Looks a lot better than Textile while editing text. Plain text for the most part.

Could be better: It’s too stripped down, no tables for example…

MultiMarkdown

Nice: It’s like an extended version of Markdown, keeping the basic ideas of Markdown and adding several features like tables and CSS formatting. Supports several output formats: doc, rtf, LaTeX (which can be easily converted to PDF). Supports math formulas.

Could be better: I've had (a little) trouble with older versions of the MultiMarkdown TextMate bundle, especially with the bundle preview command not showing images and, depending on how you include the CSS, not formatting it correctly. But the Markdown bundle behaves the same, so it does not look like a MultiMarkdown problem.

Update: It’s all working now. Anyway, editing the TextMate bundle commands is quite straightforward, so it’s not hard to change/update something.

I’d also like a single TextMate command for generating a full PDF, but it’s now only 2 commands away (first to LaTeX, then to PDF), so it’s not too bad.

Maruku

Nice: It’s like an extended Markdown with a so-called meta-data syntax, where you have a special syntax to add a table of contents, CSS classes and source code syntax formatting. Put in the head that you want numbered headers, and it does it automatically for you. Generates PDF natively. It’s all made in Ruby, thus touching my Ruby bias factor :)

Could be better: PDF output is not able to include images yet, and math generation is not ready (in testing/development). The downside of introducing the special tags is that it makes the text a tiny bit uglier. Also, I didn’t find a way to resize a picture, but there should be a way(?)

In summary: Textile is ugly, Markdown is too simple, Maruku is not there yet but is promising (especially because it’s made in Ruby, which will make it a lot easier for me to change when needed)…. So for now I use MultiMarkdown. I think MultiMarkdown is more mature and has more features.

MultiMarkdown Tips:

I use CSS a lot to format the text; it is very powerful.

Another thing I do is put in the head a path to a local CSS file, which I can then easily edit, and which can even reference external CSS paths.

For creating a PDF, the first step is to generate a LaTeX file. Although the MultiMarkdown TextMate bundle has this, I extended it a bit so that, besides generating the LaTeX file, it also saves it. After that you just have to do Apple+R, and voilà, a PDF doc.

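# Use the configured MultiMarkdown install, or the default location if unset
# ($HOME instead of ~, because ~ is not expanded inside double quotes)
cd "${TM_MULTIMARKDOWN_PATH:-$HOME/Library/Application Support/MultiMarkdown}/bin" || exit 1
# Save the LaTeX output next to the source file (.md -> .tex)
newfile2="${TM_FILEPATH/%.md/.tex}"
./multimarkdown2XHTML.pl | ./xhtml2latex.pl > "$newfile2"
mate "$newfile2"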

Do tinker with the XSLT templates. They can be quite powerful, and can be used to make different templates that generate completely different documents. For example, putting XHTML XSLT: xhtml-toc-h2.xslt in the header will make a linkable table of contents.

Maruku Tips:

Because I didn’t find any Maruku TextMate bundle, I made the following commands for TextMate:

TextMate commands code for Maruku

Maruku Validate Syntax

#!/opt/local/bin/ruby
# Parse the document passed on STDIN with Maruku
require 'rubygems'
require 'maruku'
doc = Maruku.new(STDIN.read)
doc.inspect

Maruku to HTML, for when I need the doc converted into an HTML fragment, to copy-paste somewhere.

#!/opt/local/bin/ruby
# Print the document passed on STDIN as an HTML fragment
require 'rubygems'
require 'maruku'
doc = Maruku.new(STDIN.read)
puts doc.to_html

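Maruku create complete HTML, mostly for previewing the doc in a browser.

#!/opt/local/bin/ruby
require 'rubygems'
require 'maruku'
# Swap only the file extension for .html (safe for paths containing other dots)
source_path = ENV['TM_FILEPATH']
file_path = source_path.chomp(File.extname(source_path)) + ".html"
doc = Maruku.new(STDIN.read)
# Write the complete HTML document; the block form closes the file
File.open(file_path, "w") { |f| f.puts doc.to_html_document }
# Open it in the default browser (argument form copes with spaces in the path)
system("open", file_path)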

Maruku create pdf

newfile2="${TM_FILEPATH/%.md/.pdf}"
maruku --pdf "${TM_FILEPATH}"
open "$newfile2"

Maruku cleanup, to clean up all files generated by previous commands

# Remove every file generated from the current .md document
for ext in pdf html log aux out tex; do
  rm -f "${TM_FILEPATH/%.md/.$ext}"
done

And by the way, a big thanks to all the creators and developers of these very useful languages.

Mac Essential Apps

  • TextMate, the ultimate text editor!
  • Quicksilver, universal access.
  • Skype, communication tool.
  • Growl, desktop notifier.
  • iGTD, for GTD/task management.
  • KeePass, password management.
  • FileZilla, FTP client.
  • Remote Desktop, access remote Windows machines.
  • Chicken of the VNC, access remote machines.
  • Safari, lightweight, fast, pretty, nice bookmark management, very nice search.
  • Firefox, development work, JavaScript debugging.
  • Charles, web sniffer.
  • Opera, fast, full of special features, a useful alternative.
  • Aqua Database client, a super database client.
  • [Deprecated, see comments] SQLite database browser, because Aqua does not support SQLite
  • Adobe PDF reader.
  • Chmox, for reading CHM files.
  • MacPorts (DarwinPorts), for getting all the Linux apps.
  • Excel, quick and dirty data summarizer.
  • FreeMind, free mind map application.
  • LiquidCD, for recording CDs and DVDs.
  • Spirited Away, hides windows I’m not using.
  • VLC, video and music playing.
  • Lingon, a graphical interface for creating launchd configuration files and controlling them through launchctl on Mac OS X Tiger.
  • ImageWell, for screenshot embellishments.
  • Autopano Pro.
  • Photoshop.
On the Web:
  • Gmail, personal email reader.
  • Google reader, news feeds reader, to keep up to date on the blog world.
  • Wikipedia, wonderful source of knowledge.

Textmate Love

I've tried a huge number of text editors and IDEs, especially in my Windows era, and none was so well balanced in features, limitations, looks and speed.

It’s praised by the Ruby community, especially because of the early Ruby on Rails screencasts made by David Heinemeier Hansson, but not only for that. To me this little application has the same magical effect as Ruby: simple is good (and powerful).
It’s an application I tend to use all day long, for: quick text editing; coding, running, documenting and validating Ruby; coding, validating, compressing and minifying JavaScript; svn interactions; previewing and validating HTML; and even writing docs, from blog entries to full PDF documents (check my upcoming post on markup languages).

Check it out at: http://en.wikipedia.org/wiki/TextMate


Plugins I use a lot:
  • Ruby
  • MultiMarkdown
  • Subversion
  • SVNMate, to show svn icons on files inside project drawer.
  • HTML
  • Javascript
  • Javascript tools
  • GetBundle
The plugins can be customized, and you can easily add your own commands, even in Ruby :)


A couple of tips I find useful:

  • Ctrl+S for a Firefox/Safari-like search with word highlighting.
  • Quicksilver can be used to access TextMate commands through a Quicksilver proxy. So if you don’t want to pick up the mouse and select a command from the menu, and you don’t remember the shortcut, you can always open the Quicksilver proxy and use its text search to find the command. For example, imagine you are editing an HTML page and you want to validate the syntax; you can do:
    1. Alt+Space (opens the Quicksilver proxy)
    2. type “w3c”, to find: “Validate syntax (w3c)”
    3. press Enter
  • Ctrl+Apple+T will give you the “Select Bundle Item” box, where you can do the same as in the example above.
  • Apple+T will open “Go To File”.
  • Alt + mouse text selection will do a vertical selection of text.
  • Debugging bundle commands: if you are making your own command, select Output: Create New Document, or Output: Show as Tool Tip. The result of the script execution is then written to a new document (or shown in a tool tip).