Semi-supervised clustering of time-dependent categorical sequences with application to discovering education-based life patterns