We address the problem of designing and implementing a data structure for the Range Minimum Query problem. We show a surprising connection between this classical problem and the geometry of a properly defined set of points in the Cartesian plane. Building on this insight, we hinge upon a well-known result in Computational Geometry to introduce the first RMQ solution that exploits (i.e., learns) the distribution of such 2D-points via proper error-bounded linear approximations. Because of these features, we name the resulting data structure: Fully-Learned RMQ, shortly FL-RMQ. We prove theoretical bounds for its space usage and query time, covering both worst-case scenarios and average-case performance for uniformly distributed inputs. These bounds compare favorably with the ones achievable by the best-known indexing solutions (i.e., the ones that allow access to the indexed array), especially when the input data follow some geometric regularities that we characterize in the paper, thus providing principled evidence of FL-RMQ being a novel data-aware solution to the RMQ problem. We corroborate our theoretical findings with a wide set of experiments showing that FL-RMQ offers more robust space-time trade-offs than the other known practical indexing solutions on both artificial and real-world datasets. We believe that our novel approach to the RMQ problem is noteworthy not only for its interesting space-time trade-offs, but also because it is flexible enough to be applied easily to the encoding variant of RMQ (i.e., the one that does not allow access to the indexed array), and moreover, because it paves the way to research opportunities on possibly other problems.
FL-RMQ: A Learned Approach to Range Minimum Queries
Paolo Ferragina;
2025-01-01
Abstract
We address the problem of designing and implementing a data structure for the Range Minimum Query problem. We show a surprising connection between this classical problem and the geometry of a properly defined set of points in the Cartesian plane. Building on this insight, we hinge upon a well-known result in Computational Geometry to introduce the first RMQ solution that exploits (i.e., learns) the distribution of such 2D-points via proper error-bounded linear approximations. Because of these features, we name the resulting data structure: Fully-Learned RMQ, shortly FL-RMQ. We prove theoretical bounds for its space usage and query time, covering both worst-case scenarios and average-case performance for uniformly distributed inputs. These bounds compare favorably with the ones achievable by the best-known indexing solutions (i.e., the ones that allow access to the indexed array), especially when the input data follow some geometric regularities that we characterize in the paper, thus providing principled evidence of FL-RMQ being a novel data-aware solution to the RMQ problem. We corroborate our theoretical findings with a wide set of experiments showing that FL-RMQ offers more robust space-time trade-offs than the other known practical indexing solutions on both artificial and real-world datasets. We believe that our novel approach to the RMQ problem is noteworthy not only for its interesting space-time trade-offs, but also because it is flexible enough to be applied easily to the encoding variant of RMQ (i.e., the one that does not allow access to the indexed array), and moreover, because it paves the way to research opportunities on possibly other problems.File | Dimensione | Formato | |
---|---|---|---|
LIPIcs.CPM.2025.7.pdf
accesso aperto
Tipologia:
PDF Editoriale
Licenza:
Creative commons (selezionare)
Dimensione
1.91 MB
Formato
Adobe PDF
|
1.91 MB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.