Abstract
Matrix decomposition is a widely used tool in machine learning, with applications such as dimension reduction and visualization. In this paper we consider decomposing X, a matrix of size $$n \times m$$, into a product WS, where we require that S, a matrix of size $$n \times k$$, has the consecutive ones property. More specifically, we require that each row of S is of the form $$0, \ldots , 0, 1, \ldots , 1, 0, \ldots , 0$$. Such decompositions are particularly meaningful if X is a matrix where each row represents a time series; in that case the ones in each row of S represent a time segment. We show that the optimization problem is inapproximable. To solve the problem we propose five different algorithms. The first two algorithms alternate between solving for S while keeping W fixed and solving for W while keeping S fixed. The next two algorithms greedily optimize a single row of S and the corresponding column of W. The last algorithm first finds the optimal decomposition with $$2k - 1$$ non-overlapping rows, and then greedily combines the rows until k rows remain. We compare the algorithms experimentally, focusing on the quality of the decomposition as well as the computational time. We show experimentally that our algorithms yield interpretable results in practical time.
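To make the constraint concrete, the following is a minimal sketch, not the paper's implementation. It assumes, for illustration only, that W has one column per row of S, that S is stored with one row per segment and one column per column of X (so that WS has the shape of X), and that the error is the squared Frobenius norm of X - WS; the abstract does not state the objective. The function names `segments_to_matrix`, `has_consecutive_ones_rows`, and `solve_w` are hypothetical. The sketch builds S from half-open index ranges, checks the row form, and solves for W with S held fixed, which corresponds to one half of the alternating scheme described above.

```python
import numpy as np


def segments_to_matrix(segments, m):
    """Build S with one row per segment: row i is 1 on columns start..end-1."""
    S = np.zeros((len(segments), m))
    for i, (start, end) in enumerate(segments):
        S[i, start:end] = 1.0
    return S


def has_consecutive_ones_rows(S):
    """Return True if every row is binary and its ones form a single run.

    Assumption: an all-zero row is accepted as a degenerate case.
    """
    for row in S:
        if np.any((row != 0) & (row != 1)):
            return False
        idx = np.flatnonzero(row)
        # A single run of ones spans exactly as many positions as it has ones.
        if idx.size and idx[-1] - idx[0] + 1 != idx.size:
            return False
    return True


def solve_w(X, S):
    """Least-squares W for fixed S (assumed objective: ||X - W S||_F^2)."""
    # X = W S is equivalent to X^T = S^T W^T, which lstsq solves column-wise.
    Wt, *_ = np.linalg.lstsq(S.T, X.T, rcond=None)
    return Wt.T


# Usage: two overlapping time segments over six time points.
S = segments_to_matrix([(0, 3), (2, 6)], 6)
W_true = np.array([[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]])
X = W_true @ S
W = solve_w(X, S)
print("rows valid:", has_consecutive_ones_rows(S))
print("squared error:", np.sum((X - W @ S) ** 2))
```

Solving for S with W fixed is the harder half of the alternation, since S is constrained to binary interval rows; this sketch does not attempt it.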
Publisher
Springer Science and Business Media LLC
Subject
Computer Networks and Communications, Computer Science Applications, Information Systems