Thursday, December 04, 2014

Famous 0.3 on Tizen

Adventurous, heh? Playing with early versions of Famous on a platform that has barely launched any phones running it? Fear not!

If you try loading Famous 0.3 on Tizen, you'll notice it won't work. There's a bug we need to fix first.

In famous/core/Engine.js we'll need to add a line to handle window.requestAnimationFrame(). Tizen only cares about webkitRequestAnimationFrame(), so we'll shim it as follows:

Add this code around line 60.
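The original snippet isn't shown here, but a minimal sketch of the shim would look like this (the setTimeout fallback is my addition for safety; in Engine.js you'd reference `window` directly, the guard below only keeps the snippet runnable outside a browser):

```javascript
// Sketch of a requestAnimationFrame shim for Tizen's WebKit.
// Assumption: this mirrors the fix the post describes around line 60
// of famous/core/Engine.js.
var w = typeof window !== 'undefined' ? window : {};

w.requestAnimationFrame = w.requestAnimationFrame ||
    w.webkitRequestAnimationFrame ||   // the only variant Tizen exposes
    function (callback) {
        // last-resort fallback: tick at roughly 60fps with setTimeout
        return setTimeout(function () {
            callback(Date.now());
        }, 1000 / 60);
    };
```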

If you feel like cloning the whole project and playing with it, here's the link:

Monday, September 15, 2014

Udacity had a Progress Bar update

[Progress bar screenshot: September 2nd, 2014]

[Progress bar screenshot: September 15th, 2014]

The update set me back now that I'm at the end of the course, but the old bar wasn't reflecting my progress at all in the middle of it. A much-needed improvement!

Update on September 19th

Further... away! I've submitted my course assessment already, but that bar won't go green. Oh, well.

[Progress bar screenshot: September 21st]

Last update on September 23rd

By now I have received my certificate, but the progress bar is still far from filling all the way up. What I expected to be a bug turned out to be a series of progress bar iterations that I managed to capture.

Monday, July 28, 2014

Denormalization using the Google Pipeline API

I've been using Google's MapReduce for App Engine for some time. It uses the Pipeline API to connect its map, shuffle, and reduce phases. That gave me ideas.

I grabbed that Pipeline API and implemented a denormalizing pipeline. It would receive data from each table of a relational database and denormalize it into App Engine's (non-relational) Datastore. What the pipeline helped with was waiting for the missing tables so it could do the joins that completed the denormalization. It worked, but Datastore writes were soaring, quickly making my app hit its daily budget.
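To make the join step concrete, here is a plain-Python sketch of the kind of denormalization involved (the table names and merged-entity shape are hypothetical; the real code wrote the result to the Datastore via the pipeline):

```python
def denormalize(orders, customers):
    """Join two 'tables' (lists of dicts) into denormalized entities.

    Hypothetical example: the actual pipeline waited until both tables
    had arrived, performed the join, and saved the merged entities to
    the Datastore so no join is needed at read time.
    """
    by_id = {c['id']: c for c in customers}
    entities = []
    for order in orders:
        customer = by_id[order['customer_id']]
        # Embed the customer's fields directly in the order entity.
        entities.append({
            'order_id': order['id'],
            'amount': order['amount'],
            'customer_name': customer['name'],
        })
    return entities
```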

I decided to run some tests. Yes, I should've run them before coding the whole thing.

The first call to the run method writes 32 times to the Datastore. Summing up all the writes gives a total of 104. Each call to the run method that has a child writes around 30 times to the Datastore, and the last one, without a child, writes just 8 times.

Now the writes get down to business: 108 writes on the generator pipeline, 8 on each of the other calls to run without a child, 162 in total on the rest, summing it all up to 270. Ouch.

Checking the RPCs on that pricey one, here's what I see:

98 writes on a single put. Now what's in there? I clicked on evaluate and found a dp.put(entities_to_put), and in entities_to_put there are _SlotRecord entities, _PipelineRecord entities and _BarrierRecord entities: the whole pipeline pack.

This is what unleashed the writing frenzy. I had set up a generator that would start other generators, and they'd all feed data back up the pipeline.

Well. The Pipeline API does an amazing job for MapReduce, but it's clearly not useful at all for what I had in mind here. My plan ended up being overkill. I was a little deceived by the really simple samples in the getting-started guide: they just show how simple it is to actually set one up, and as you see in the gists above, it truly is. Denormalizing at the source is the way to go in this case: data comes in denormalized, Datastore saves it, done.

Friday, June 13, 2014

BigQuery CSV Export getting cached?

The only way around this is to randomize the CSV filename. You'll have to do some housekeeping once in a while to get rid of the flood of files.
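A sketch of the randomization (the bucket name and prefix are placeholders): generating a unique destination URI per export means every run lands on a fresh object that no cache has seen yet.

```python
import uuid


def unique_export_uri(bucket, prefix):
    """Build a one-off GCS destination for a BigQuery CSV export.

    `bucket` and `prefix` are hypothetical; pass the returned URI as the
    export destination so each run writes a brand-new, uncached file.
    """
    return 'gs://%s/%s-%s.csv' % (bucket, prefix, uuid.uuid4().hex)
```

The random hex suffix is what defeats the caching; the downside, as noted above, is the pile of stale files you'll need to clean up periodically.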

Creating a dummy empty CSV with no cache control on Google Cloud Storage before the export doesn't work either. Don't even try: BigQuery overwrites the file and resets its cache control to the default.