Monday, March 9, 2009

Indexing the Sensorbase and puzzled by the distribution (for now)

I'm following the SAX approach to database indexing and have pulled some data to run a distribution analysis. This is the current schema of the database I'm using:



What I am doing right now is running the "sensorbase crawler", which pulls all the project summaries available for a given user and then pulls every available chart for each project (and member). After that, the data gets normalized, transformed into the SAX representation, and stored in the database.
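To make the normalize-then-SAX step concrete, here is a minimal sketch of the textbook SAX transform: z-normalization, piecewise aggregate approximation (PAA), and discretization against Gaussian breakpoints. It is not the crawler's actual code; the function names, the segment count, and the alphabet size are all assumptions chosen for illustration.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit standard deviation."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A constant series carries no shape information.
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, segments):
    """Piecewise aggregate approximation: the mean of each segment.

    Assumes 1 <= segments <= len(series).
    """
    n = len(series)
    reduced = []
    for i in range(segments):
        start = i * n // segments
        end = (i + 1) * n // segments
        reduced.append(mean(series[start:end]))
    return reduced


def gaussian_breakpoints(alphabet_size):
    """Cut points that split the standard normal into equiprobable regions."""
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf(k / alphabet_size)
            for k in range(1, alphabet_size)]


def sax_word(series, segments, alphabet_size):
    """Turn a numeric series into a SAX word over the letters a, b, c, ..."""
    cuts = gaussian_breakpoints(alphabet_size)
    reduced = paa(znormalize(series), segments)
    return "".join(chr(ord("a") + bisect_right(cuts, value))
                   for value in reduced)


word = sax_word([1, 2, 3, 4, 5, 6, 7, 8], segments=4, alphabet_size=4)
print(word)  # a steadily rising series maps to "abcd"
```

The resulting word is what would be stored and indexed in place of the raw chart values.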

While I work on configuring the chart retrieval (those parameters), I am worried about the distribution of the data points from the Build and Devtime streams: it does not look normal, but rather exponential. SAX picks its breakpoints on the assumption that normalized values are roughly Gaussian, so this matters. I plan to pull other streams from the Sensorbase and do some more data distribution analysis; if they show the same shape, I guess the SAX schema will need to be corrected.
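One rough way to check whether the distribution matters is to count how often each SAX symbol occurs under the standard Gaussian breakpoints. The sketch below uses synthetic samples only (not Sensorbase data), and the helper name, sample size, seed, and alphabet size are assumptions for illustration. If the values were normal, each symbol should occur about equally often; a clearly uneven split suggests the breakpoints do not fit the data.

```python
import random
from bisect import bisect_right
from collections import Counter
from statistics import NormalDist, mean, pstdev


def symbol_frequencies(values, alphabet_size):
    """Share of z-normalized values falling into each Gaussian SAX region."""
    mu, sigma = mean(values), pstdev(values)
    cuts = [NormalDist().inv_cdf(k / alphabet_size)
            for k in range(1, alphabet_size)]
    counts = Counter(bisect_right(cuts, (v - mu) / sigma) for v in values)
    return [counts[i] / len(values) for i in range(alphabet_size)]


rng = random.Random(42)  # fixed seed so the run is repeatable

# Synthetic stand-ins: one normal sample, one exponential sample.
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(10000)]
exponential_sample = [rng.expovariate(1.0) for _ in range(10000)]

normal_freqs = symbol_frequencies(normal_sample, 4)
exponential_freqs = symbol_frequencies(exponential_sample, 4)

print("normal:     ", [round(f, 3) for f in normal_freqs])
print("exponential:", [round(f, 3) for f in exponential_freqs])
```

With four symbols, the normal sample should land close to a quarter per symbol, while the exponential sample piles up in the lower symbols. If real streams behave like the exponential case, one option would be to derive breakpoints from the empirical quantiles of the data instead of the Gaussian ones.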
