News: 0001528154

  ARM Give a man a fire and he's warm for a day, but set fire to him and he's warm for the rest of his life (Terry Pratchett, Jingo)

PostgreSQL Lands Self-Join Elimination Optimization

([Programming] 6 Hours Ago Self-Join Elimination)


More than seven years in the making, merged yesterday for PostgreSQL is a self-join elimination "SJE" feature as a performance optimization for some queries.

Discussions over this feature began back in 2018 with [1]this thread over removing unneeded self joins. PostgreSQL developer Alexander Kuzmenkov explained there:

"There is a join optimization we don't do -- removing inner join of a table with itself on a unique column. Such joins are generated by various ORMs, so from time to time our customers ask us to look into this. Most recently, it was discussed on the list in relation to an article comparing the optimizations that some DBMS make."

Over the past two years there have been [2]patches working out this Self-Join Elimination feature while yesterday was finally ready for landing in Git for what will become PostgreSQL 18.

Yesterday's [3]commit explains of Self-Join Elimination:

"The Self-Join Elimination (SJE) feature removes an inner join of a plain table to itself in the query tree if it is proven that the join can be replaced with a scan without impacting the query result. Self-join and inner relation get replaced with the outer in query, equivalence classes, and planner info structures. Also, the inner restrictlist moves to the outer one with the removal of duplicated clauses. Thus, this optimization reduces the length of the range table list (this especially makes sense for partitioned relations), reduces the number of restriction clauses and, in turn, selectivity estimations, and potentially improves total planner prediction for the query.

This feature is dedicated to avoiding redundancy, which can appear after pull-up transformations or the creation of an EquivalenceClass-derived clause like the below.

SELECT * FROM t1 WHERE x IN (SELECT t3.x FROM t1 t3);

SELECT * FROM t1 WHERE EXISTS (SELECT t3.x FROM t1 t3 WHERE t3.x = t1.x);

SELECT * FROM t1,t2, t1 t3 WHERE t1.x = t2.x AND t2.x = t3.x;

In the future, we could also reduce redundancy caused by subquery pull-up after unnecessary outer join removal in cases like the one below.

SELECT * FROM t1 WHERE x IN

(SELECT t3.x FROM t1 t3 LEFT JOIN t2 ON t2.x = t1.x);

Also, it can drastically help to join partitioned tables, removing entries even before their expansion."

PostgreSQL 18.0 should be out later this year with this new SJE feature and much more for this popular open-source database server.



[1] https://www.postgresql.org/message-id/flat/64486b0b-0404-e39e-322d-0801154901f3%40postgrespro.ru

[2] https://www.postgresql.org/message-id/E1qvafl-002FJE-8E%40gemulon.postgresql.org

[3] https://github.com/postgres/postgres/commit/fc069a3a6319b5bf40d2f0f1efceae1c9b7a68a8



phoronix

On the other hand, life can be an endless parade of TRANSSEXUAL
QUILTING BEES aboard a cruise ship to DISNEYWORLD if only we let it!!