Thursday, May 28, 2009

Working on the Pilot data set.

As I work toward the dissertation proposal, I'm currently reworking the Trajectory code. The main change is that the data now comes from a single project and is categorized by user, instead of simply navigating "anonymous" streams as before. This change in the analysis flow pushed me to change the database schema, which now looks as follows:

The changes in the schema and analysis also forced me to rewrite a bunch of the iBATIS queries. The coolest query so far looks like this:
SELECT sm.id AS motif_id, sm.substring AS motif, sme.id AS entry_id,
  (SELECT COUNT(*) FROM sax_motif_offset
     JOIN sax_motif_entry ON sax_motif_offset.sax_motif_entry = sax_motif_entry.id
     JOIN sax_motif ON sax_motif.id = sax_motif_entry.sax_motif
   WHERE sax_motif.id = motif_id) AS entry_frequency
FROM sax_motif sm
  JOIN sax_motif_entry sme ON sme.sax_motif = sm.id
  JOIN chart ON chart.id = sme.chart
WHERE sm.sax_index = #value#
GROUP BY motif
HAVING entry_frequency > 1
ORDER BY entry_frequency DESC;
It retrieves all motifs for the given index that occur more than once, sorted by frequency in descending order.
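To make the idea behind the query easier to try out, here is a minimal sketch that runs a simplified version of it against an in-memory SQLite database. Several things here are assumptions and not part of the real setup: the table definitions model only the columns the query touches, the chart join and the entry_id column are left out, the correlated subquery is replaced by a plain join with COUNT, the iBATIS #value# parameter becomes a ? placeholder, and the helper names build_demo_db and frequent_motifs as well as the sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical minimal schema: only the columns the query touches are modeled.
SCHEMA = """
CREATE TABLE sax_motif (
    id INTEGER PRIMARY KEY, sax_index INTEGER, substring TEXT);
CREATE TABLE sax_motif_entry (
    id INTEGER PRIMARY KEY, sax_motif INTEGER REFERENCES sax_motif(id));
CREATE TABLE sax_motif_offset (
    id INTEGER PRIMARY KEY,
    sax_motif_entry INTEGER REFERENCES sax_motif_entry(id));
"""

# Simplified variant of the post's query: count offsets per motif with a
# join instead of a correlated subquery; chart join and entry_id omitted.
FREQUENT_MOTIFS = """
SELECT sm.id AS motif_id, sm.substring AS motif,
       COUNT(smo.id) AS entry_frequency
FROM sax_motif sm
  JOIN sax_motif_entry sme ON sme.sax_motif = sm.id
  JOIN sax_motif_offset smo ON smo.sax_motif_entry = sme.id
WHERE sm.sax_index = ?
GROUP BY sm.id, sm.substring
HAVING COUNT(smo.id) > 1
ORDER BY entry_frequency DESC
"""


def build_demo_db():
    """Create an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO sax_motif (id, sax_index, substring) VALUES (?, ?, ?)",
        [(1, 7, "abc"), (2, 7, "abd"), (3, 7, "acc"), (4, 8, "bbb")],
    )
    conn.executemany(
        "INSERT INTO sax_motif_entry (id, sax_motif) VALUES (?, ?)",
        [(10, 1), (11, 1), (12, 2), (13, 3), (14, 4)],
    )
    # Offsets per entry: 10 -> 3, 11 -> 1, 12 -> 2, 13 -> 1, 14 -> 5
    entries = [10] * 3 + [11] + [12] * 2 + [13] + [14] * 5
    conn.executemany(
        "INSERT INTO sax_motif_offset (sax_motif_entry) VALUES (?)",
        [(e,) for e in entries],
    )
    return conn


def frequent_motifs(conn, sax_index):
    """Return (motif_id, motif, entry_frequency) rows for one SAX index."""
    return conn.execute(FREQUENT_MOTIFS, (sax_index,)).fetchall()


if __name__ == "__main__":
    for row in frequent_motifs(build_demo_db(), 7):
        print(row)
```

With the sample rows above, index 7 yields the motif abc with frequency 4 followed by abd with frequency 2, while acc is dropped because it occurs only once.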
