Monday, July 28, 2014

Denormalization using the Google Pipeline API

I've been using Google's MapReduce for App Engine for some time. It uses the Pipeline API to connect its map, shuffle, and reduce phases. It gave me ideas.

I grabbed that Pipeline API and implemented a denormalizing pipeline. It would receive data from each table of a relational database and denormalize it into App Engine's (non-relational) Datastore. What the pipeline helped with was waiting for the missing tables so it could do the joins that complete the denormalization. It worked, but Datastore writes were soaring, quickly pushing my app into its daily budget.
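The "wait for the missing table" part can be sketched as a buffer keyed on the join key that only emits a record once every table has reported in. This is a simplified, in-memory stand-in for what the pipeline barriers were doing for me; the class and field names are mine, not from the actual app:

```python
class JoinBuffer:
    """Accumulate partial rows per join key; emit only when complete.

    A simplified stand-in for the pipeline's barrier behavior: hold a
    record until every source table has contributed its part, then
    hand back the merged (denormalized) record.
    """

    def __init__(self, expected_tables):
        self.expected = set(expected_tables)
        self.partial = {}  # join key -> {table name -> row}

    def add(self, table, key, row):
        """Add one row; return the denormalized record if now complete."""
        pending = self.partial.setdefault(key, {})
        pending[table] = row
        if set(pending) == self.expected:
            merged = {}
            for part in self.partial.pop(key).values():
                merged.update(part)
            return merged
        return None  # still waiting on at least one table
```

Each merged record would then be a single Datastore put; the expensive part, as shown below, was all the bookkeeping the pipeline itself wrote while waiting.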

I decided to run some tests. Yes, I should've run them before coding the whole thing.



The first call to the run method writes 32 times to the Datastore. Summing up all the writes gives a total of 104. Each call to the run method that has a child writes around 30 times to the Datastore, and the last one, without a child, writes just 8 times.
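From those numbers, a rough back-of-the-envelope model of the bookkeeping overhead for a linear chain of stages; the constants are just the figures observed above, so treat them as illustrative, not exact:

```python
def pipeline_write_estimate(num_stages,
                            first_stage_writes=32,
                            middle_stage_writes=30,
                            leaf_writes=8):
    """Rough estimate of Datastore writes for a linear pipeline chain.

    The default constants come from one observed run; actual counts
    depend on slot/barrier fan-out, so this is only a ballpark.
    """
    if num_stages == 0:
        return 0
    if num_stages == 1:
        return first_stage_writes
    # first stage + intermediate stages with children + childless leaf
    return (first_stage_writes
            + middle_stage_writes * (num_stages - 2)
            + leaf_writes)
```

The point the model makes is that the overhead grows linearly with the number of stages, before a single byte of actual data is written.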



Now the writes get down to business: 108 on the generator pipeline, 8 on each of the other calls to run without a child, 162 in total for the rest, summing it all up to 270. Ouch.

Checking the RPCs on that pricey one, here's what I see:


98 writes in a single put. Now what's in there? I clicked on evaluate and found a dp.put(entities_to_put), and in entities_to_put there are _SlotRecord entities, _PipelineRecord entities, and _BarrierRecord entities: the whole pipeline pack.

This is what was letting the writing frenzy loose. I had set up a generator that would start other generators, and then they'd all feed data back up the pipeline.

Well, the Pipeline API does an amazing job for MapReduce, but it's clearly not useful at all for what I had in mind here. My plan ended up being overkill. I was a little misled by the really simple samples in the getting-started guide. They just show how simple it is to actually set one up, and as you see in the gists above, they truly are simple. Denormalizing at the source is the way to go in this case: data comes in denormalized, Datastore saves it, done.
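With the join pushed to the source, the app-side work collapses to one merge and one write per record. A sketch of the idea, with illustrative field names (not the actual schema):

```python
def build_entity(order, customer):
    """Merge already-joined source rows into one write-ready entity.

    The source sends the related rows together, so there is nothing to
    wait for: merge, put, done. One Datastore write per record instead
    of a pipeline's worth of bookkeeping.
    """
    entity = dict(order)  # copy so the source row stays untouched
    entity["customer_name"] = customer["name"]
    entity["customer_email"] = customer["email"]
    return entity
```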

Friday, June 13, 2014

BigQuery CSV Export getting cached?

The only way around this is randomizing the CSV filename. You'll have to do some house cleaning once in a while to get rid of the flood of files.
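Randomizing the filename is just a matter of salting the destination object name before kicking off the export; bucket and prefix below are illustrative:

```python
import uuid


def export_destination(bucket, prefix="bq-export"):
    """Build a randomized GCS destination URI for a BigQuery CSV export.

    Every export lands on a fresh object name, so no stale cached copy
    can ever be served. Remember the housecleaning: old objects pile up.
    """
    return "gs://%s/%s-%s.csv" % (bucket, prefix, uuid.uuid4().hex)
```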

Creating a dummy empty CSV on Google Cloud Storage before the export, with no cache control, doesn't work. Don't even try: BigQuery overwrites it and changes the cache control to the default one.

Friday, May 16, 2014

Streets are disappearing from Google Maps Brasil

This morning I noticed a huge hole where there used to be a street:

https://www.google.com/maps/@-23.5302594,-46.6958091,18z
the aerial photo makes it easier to see

Passing through other areas, more problems.

https://www.google.com/maps/@-23.5746074,-46.7035023,19z


In Google Map Maker they were there, published, but somehow deactivated. In fact, I think it's some integration problem with Map Maker. Reporting it and waiting for fixes.

Update 19/05/2014:

The second anomaly was fixed. The whole street was turned into a "major artery" and reconnected. Not by Map Maker users; there are no records of changes there. Apparently internal housecleaning.



Update: 22/05/2014: Continuing from the first image way up top, for now the situation is this:
screenshot from Map Maker itself
Update: 24/05/2014: Not many Map Maker users in the region to help... I found yet another hole, on Rua Tito; asking for help on the groups.


Update: 02/06/2014 - Almost there.



Update: 06/06/2014 - All the problems have been fixed, and not via Map Maker.

Update: 09/06/2014 - I found out where the fix came from. Right in Google Maps in the browser, in the lower-right corner: report a problem. Nothing related to Map Maker after all.



Friday, March 16, 2012

I9100UHKG4 sucks

Android 2.3.4 on modem version I9100UHKG4 sucked severely on my SGS2. My data connection would get stuck uploading data without ever downloading anything back; the upload arrow was the only one lighting up. I needed to turn the data connection OFF and then back ON to get the internet working again.

I upgraded Android to 2.3.6 and it came with modem version UHKE2. It worked like a charm, 100%: not a single time did I have the problem I was constantly having before.

Two days ago I upgraded the OS to the Polish version of 4.0.3 (ICS) and now I'm getting problems again. I'm getting "no service" after a whole day, so I'll downgrade just the modem to the previous version. The current one is XXLPQ.

Friday, November 18, 2011

The NET HD set-top box is an HDMI toaster

My Time Machine 2 has two HDMI ports on the back and one on the side. First, the HDMI port the NET set-top box was connected to stopped working. I switched it over to the second rear port, where the PS3 was plugged in. That one died too. I bought a new HDMI cable and connected it to the last port I had left: the side one. That one is gone as well.

The NET set-top box destroyed all the ports, roughly a month apart each. The technician came to my house and said the problem could only be the electrical wiring, because the set-top box has protection against that. He left another unit, and now I only have component ports available.

I couldn't find similar reports online, so I'm leaving mine here.

Thursday, July 08, 2010

Amazon, take my dead tree books!

Now that I have a Kindle, what I want most is to have all my books on it. All of them, including the ones I already own.

Amazon, take my dead-tree books and give me Kindle editions of them in return.

That would make me happy, kill my book lending (thus making people get their own copies), and make you a profit on recycling.