Tuesday, February 17, 2009

Euclidean and DTW metrics based sequence (subsequence) matching

I just ran a small experiment in R using the Euclidean and DTW (dynamic time warping) metrics.
I had two synthetic timeseries on hand: one was a kind of devtime telemetry trend, which I used as the reference, and the second was just a "peak pattern". The purpose was to see whether this "peak pattern" would match anything in the reference timeseries and, if so, what exactly.

Figure 1 shows both timeseries and their normalized versions.



Figure 2 shows the same normalized timeseries at the top, the matches found for the "peak pattern" using Euclidean distance in the middle, and the Euclidean distances at the bottom (red bars mark the set of smallest distances found, 7 in total).
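
For readers who want to try the idea, here is a minimal sketch of sliding-window Euclidean matching. It is written in Python with NumPy, not the R code from this experiment, and everything in it is an assumption for illustration: the function names (znorm, sliding_euclidean, best_matches), the synthetic data, and the choice to z-normalize each window separately (the figures above normalize each series as a whole, which may give different distances).

```python
import numpy as np

def znorm(x):
    """Z-normalize to zero mean and unit variance (assumed normalization)."""
    x = np.asarray(x, dtype=float)
    sd = x.std()
    return (x - x.mean()) / sd if sd > 0 else x - x.mean()

def sliding_euclidean(reference, pattern):
    """Euclidean distance between the pattern and every same-length window."""
    reference = np.asarray(reference, dtype=float)
    pattern = znorm(pattern)
    m = len(pattern)
    n_windows = len(reference) - m + 1
    dists = np.empty(n_windows)
    for start in range(n_windows):
        # Assumption: each window is normalized on its own
        window = znorm(reference[start:start + m])
        dists[start] = np.linalg.norm(window - pattern)
    return dists

def best_matches(dists, k):
    """Start indices of the k smallest distances, best first."""
    return np.argsort(dists, kind="stable")[:k]

# Hypothetical synthetic data: low-level noise with one planted, scaled peak
rng = np.random.default_rng(0)
pattern = np.array([0.0, 1.0, 3.0, 1.0, 0.0])
reference = rng.normal(0.0, 0.1, 60)
reference[20:25] += 2 * pattern

dists = sliding_euclidean(reference, pattern)
top = best_matches(dists, 7)  # 7 mirrors the number of red bars in Figure 2
print(top)
```

Because neighbouring windows overlap, several of the smallest distances can belong to the same peak; whether to merge such overlapping hits is a separate design choice that this sketch leaves out.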


Figure 3 is very similar to Figure 2, but uses the DTW distance.
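
To make the difference from the Euclidean case concrete, here is a small sketch of the textbook DTW recurrence, again in Python rather than the R code used for the figures. The function name dtw_distance, the absolute-difference local cost, and the absence of any warping-window constraint are assumptions of this sketch; the experiment above may have used different settings.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-programming DTW with |a_i - b_j| as the local cost."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # cost[i, j] = cheapest alignment of a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # repeat b[j-1]
                                 cost[i, j - 1],      # repeat a[i-1]
                                 cost[i - 1, j - 1])  # advance both
    return cost[n, m]

# The same peak, shifted by one sample (hypothetical data)
early = [0, 1, 3, 1, 0, 0]
late = [0, 0, 1, 3, 1, 0]

dtw = dtw_distance(early, late)                         # 0.0: warping absorbs the shift
euc = np.linalg.norm(np.subtract(early, late))          # sqrt(10), about 3.16
print(dtw, euc)
```

The shifted peak costs nothing under DTW because the flat samples can be stretched to line the peaks up, while the Euclidean distance penalizes every misaligned sample. That is one reason the two metrics can pick different subsequences as matches.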


Figure 4 mimics Figure 3, but with the threshold lowered, so more matches are shown.


The R code is here. I will follow up with more details in the next post.

1 comment:

Unknown said...

Hi, I was wondering whether your R code for this post was still available?

I'm just starting to learn R and would like to know how you made some of those charts with multiple subsequences.

Thanks!