Query evaluation

From Wikipedia, the free encyclopedia

In database theory, the query evaluation problem is the problem of determining the answers to a query on a database. Research in database theory aims at determining the computational complexity of answering different kinds of queries over databases, in particular over relational databases.

Formal definition[edit]

The query evaluation problem takes two inputs: the query to be answered, and the database on which to answer it. The output of the problem is the set of answers to the query on the database. If the queries are Boolean queries, i.e., queries have a yes or no answer (for example, Boolean conjunctive queries) then the query evaluation problem is a decision problem.

The query evaluation problem is usually posed for a specific class of queries and databases. For instance, one example of the query evaluation problem would be the problem of evaluating a conjunctive query on a relational database.

The computational complexity of the problem can be measured in different ways,[1] to account for the fact that the two inputs of the problem are different:

  • The combined complexity of the query evaluation problem is its computational complexity when measured as a function of the two inputs, i.e., the query and the database, as usual in computational complexity.
  • The data complexity is the computational complexity when the query is fixed and when the input is just the database. For instance, we say that query evaluation has polynomial-time data complexity for a class of queries if, for every fixed query Q in that class, given a database D, we can compute the answers to Q on D in polynomial time.
  • Less commonly, we can study the query complexity, which is the computational complexity when the database is fixed and when the input is just the query. This is also called expression complexity.[2]

Query classes[edit]

The complexity of query evaluation can be studied for several query classes, for instance acyclic queries, conjunctive queries, unions of conjunctive queries, Datalog, regular path queries, etc., up to logical formalisms like first-order logic or monadic second-order logic.

For instance, for Boolean conjunctive queries, the complexity of query evaluation is polynomial in data complexity: it even falls in the class AC0. By contrast, the query complexity and combined complexity are NP-complete[3] by a reduction from 3-colorability.[citation needed]

Boolean versus non-Boolean queries[edit]

The complexity of query evaluation can be studied for queries that return answers, or for Boolean queries (yes/no queries). However, we can often reduce to the case of Boolean queries. More specifically, if the number of answers to the query is always polynomial in the database size, and if we can rewrite the query to a Boolean query for each answer, then we can reduce query evaluation for a non-Boolean query to polynomially many Boolean query evaluation problems.[citation needed]

References[edit]

  1. ^ Vardi, Moshe Y. (1982-05-05). "The complexity of relational query languages (Extended Abstract)". Proceedings of the fourteenth annual ACM symposium on Theory of computing. STOC '82. New York, NY, USA: Association for Computing Machinery: 137–146. doi:10.1145/800070.802186. ISBN 978-0-89791-070-5.
  2. ^ Abiteboul, Serge; Hull, Richard; Vianu, Victor (1994-12-02). Foundations of Databases: The Logical Level (1st ed.). Reading, Mass.: Pearson. ISBN 978-0-201-53771-0.
  3. ^ Greco, Gianluigi; Scarcello, Francesco (2014-06-18). "Counting solutions to conjunctive queries: structural and hybrid tractability". Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems. PODS '14. New York, NY, USA: Association for Computing Machinery: 132–143. doi:10.1145/2594538.2594559. ISBN 978-1-4503-2375-8.