From d1754ac0030c6f96600316f9dbcdf0386fee9ca7 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Tue, 8 Nov 2022 18:25:03 -0500 Subject: [PATCH 01/70] Doc: improve tutorial section about grouped aggregates. Commit fede15417 introduced FILTER by jamming it into the existing example introducing HAVING, which seems pedagogically poor to me; and it added no information about what the keyword actually does. Not to mention that the claimed output didn't match the sample data being used in this running example. Revert that and instead make an independent example using FILTER. To help drive home the point that it's a per-aggregate filter, we need to use two aggregates not just one; for consistency expand all the examples in this segment to do that. Also adjust the example using WHERE ... LIKE so that it'd produce nonempty output with this sample data, and show that output. Back-patch, as the previous patch was. (Sadly, v10 is now out of scope.) Discussion: https://postgr.es/m/166794307526.652.9073408178177444190@wrigleys.postgresql.org --- doc/src/sgml/query.sgml | 65 +++++++++++++++++++++++++++++++---------- 1 file changed, 49 insertions(+), 16 deletions(-) diff --git a/doc/src/sgml/query.sgml b/doc/src/sgml/query.sgml index 9046d7c9fbe..a864d146f02 100644 --- a/doc/src/sgml/query.sgml +++ b/doc/src/sgml/query.sgml @@ -706,40 +706,39 @@ SELECT city FROM weather HAVING Aggregates are also very useful in combination with GROUP - BY clauses. For example, we can get the maximum low - temperature observed in each city with: + BY clauses. For example, we can get the number of readings + and the maximum low temperature observed in each city with: -SELECT city, max(temp_lo) +SELECT city, count(*), max(temp_lo) FROM weather GROUP BY city; - city | max ----------------+----- - Hayward | 37 - San Francisco | 46 + city | count | max +---------------+-------+----- + Hayward | 1 | 37 + San Francisco | 2 | 46 (2 rows) which gives us one output row per city. 
Each aggregate result is computed over the table rows matching that city. We can filter these grouped - rows using HAVING and the output count using - FILTER: + rows using HAVING: -SELECT city, max(temp_lo), count(*) FILTER (WHERE temp_lo < 30) +SELECT city, count(*), max(temp_lo) FROM weather GROUP BY city HAVING max(temp_lo) < 40; - city | max | count ----------+-----+------- - Hayward | 37 | 5 + city | count | max +---------+-------+----- + Hayward | 1 | 37 (1 row) @@ -749,12 +748,18 @@ SELECT city, max(temp_lo), count(*) FILTER (WHERE temp_lo < 30) names begin with S, we might do: -SELECT city, max(temp_lo), count(*) FILTER (WHERE temp_lo < 30) +SELECT city, count(*), max(temp_lo) FROM weather WHERE city LIKE 'S%' -- - GROUP BY city - HAVING max(temp_lo) < 40; + GROUP BY city; + + + city | count | max +---------------+-------+----- + San Francisco | 2 | 46 +(1 row) + @@ -791,6 +796,34 @@ SELECT city, max(temp_lo), count(*) FILTER (WHERE temp_lo < 30) because we avoid doing the grouping and aggregate calculations for all rows that fail the WHERE check. + + + Another way to select the rows that go into an aggregate + computation is to use FILTER, which is a + per-aggregate option: + + +SELECT city, count(*) FILTER (WHERE temp_lo < 45), max(temp_lo) + FROM weather + GROUP BY city; + + + + city | count | max +---------------+-------+----- + Hayward | 1 | 37 + San Francisco | 1 | 46 +(2 rows) + + + FILTER is much like WHERE, + except that it removes rows only from the input of the particular + aggregate function that it is attached to. + Here, the count aggregate counts only + rows with temp_lo below 45; but the + max aggregate is still applied to all rows, + so it still finds the reading of 46. + From 8317a3f4570f2f5403c795035ddefd656ca4b6be Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Wed, 9 Nov 2022 11:08:52 -0500 Subject: [PATCH 02/70] Doc: add comments about PreventInTransactionBlock/IsInTransactionBlock. 
Add a little to the header comments for these functions to make it clearer what guarantees about commit behavior are provided to callers. (See commit f92944137 for context.) Although this is only a comment change, it's really documentation aimed at authors of extensions, so it seems appropriate to back-patch. Yugo Nagata and Tom Lane, per further discussion of bug #17434. Discussion: https://postgr.es/m/17434-d9f7a064ce2a88a3@postgresql.org --- src/backend/access/transam/xact.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c index ed655baf989..7ed0a6ce42a 100644 --- a/src/backend/access/transam/xact.c +++ b/src/backend/access/transam/xact.c @@ -4410,6 +4410,10 @@ AbortCurrentTransaction(void) * a transaction block, typically because they have non-rollback-able * side effects or do internal commits. * + * If this routine completes successfully, then the calling statement is + * guaranteed that if it completes without error, its results will be + * committed immediately. + * * If we have already started a transaction block, issue an error; also issue * an error if we appear to be running inside a user-defined function (which * could issue more commands and possibly cause a failure after the statement @@ -4535,6 +4539,10 @@ CheckTransactionBlock(bool isTopLevel, bool throwError, const char *stmtType) * a transaction block than when running as single commands. ANALYZE is * currently the only example. * + * If this routine returns "false", then the calling statement is + * guaranteed that if it completes without error, its results will be + * committed immediately. + * * isTopLevel: passed down from ProcessUtility to determine whether we are * inside a function. 
*/ From 5fa9f1b690e3564951ba9ac4a470de08f2203432 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Thu, 10 Nov 2022 16:33:55 +0900 Subject: [PATCH 03/70] Fix comment of SimpleLruInit() in slru.c sync_handler was not mentioned in the comment block of the function. Oversight in dee663f. Author: Aleksander Alekseev Discussion: https://postgr.es/m/CAJ7c6TPUd9BwNY47TtMxaijLHSbyHNdhu=kvbGnvO_bi+oC6_Q@mail.gmail.com Backpatch-through: 14 --- src/backend/access/transam/slru.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c index a1950fd0944..6da20d7531d 100644 --- a/src/backend/access/transam/slru.c +++ b/src/backend/access/transam/slru.c @@ -181,6 +181,7 @@ SimpleLruShmemSize(int nslots, int nlsns) * ctllock: LWLock to use to control access to the shared control structure. * subdir: PGDATA-relative subdirectory that will contain the files. * tranche_id: LWLock tranche ID to use for the SLRU's per-buffer LWLocks. + * sync_handler: which set of functions to use to handle sync requests */ void SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns, From e40d131ccb74462701080db17378a20c49bf5575 Mon Sep 17 00:00:00 2001 From: Amit Kapila Date: Thu, 10 Nov 2022 17:19:30 +0530 Subject: [PATCH 04/70] Fix comments atop ReorderBufferAddInvalidations. The comments atop seem to indicate that we always accumulate invalidation messages in a top-level transaction which is neither required nor matches with the code. 
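As an editor's illustration of the rule the corrected comment below describes — attach invalidation messages to the toplevel transaction when one is available, otherwise to the transaction at hand, while also queuing them as a change so replay can execute them at the right point — here is a rough Python sketch. All names here are invented for illustration; this is not the reorderbuffer code.

```python
from dataclasses import dataclass, field

@dataclass
class Txn:
    toptxn: object = None                              # set for subtransactions
    invalidations: list = field(default_factory=list)  # accumulated messages
    changes: list = field(default_factory=list)        # ordered change stream

def add_invalidations(txn, msgs):
    # Record the messages as a change, so replay can execute only the
    # invalidations needed at each point in the command stream ...
    txn.changes.append(("invalidations", list(msgs)))
    # ... and also accumulate them on the toplevel transaction if there is
    # one, else on txn itself, so they can all be executed together when
    # the whole transaction is skipped (cf. ReorderBufferForget).
    target = txn.toptxn if txn.toptxn is not None else txn
    target.invalidations.extend(msgs)

top = Txn()
sub = Txn(toptxn=top)
add_invalidations(sub, ["inval relcache rel=16384"])
add_invalidations(top, ["inval catcache id=50"])
```

After these two calls, both messages have landed on `top.invalidations` while each transaction still carries its own change entry, which is the dual bookkeeping the comment is describing.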
Author: Amit Kapila Reviewed by: Masahiko Sawada Backpatch-through: 14, where it was introduced in commit c55040ccd0 Discussion: https://postgr.es/m/CAA4eK1LxGgnUroPz8STb6OfjVU1yaHoSA+T63URwmGCLdMJ0LA@mail.gmail.com --- .../replication/logical/reorderbuffer.c | 22 ++++++++++--------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c index 271a74c6908..1d1649a22a8 100644 --- a/src/backend/replication/logical/reorderbuffer.c +++ b/src/backend/replication/logical/reorderbuffer.c @@ -3197,16 +3197,17 @@ ReorderBufferAddNewTupleCids(ReorderBuffer *rb, TransactionId xid, } /* - * Setup the invalidation of the toplevel transaction. + * Accumulate the invalidations for executing them later. * * This needs to be called for each XLOG_XACT_INVALIDATIONS message and - * accumulates all the invalidation messages in the toplevel transaction as - * well as in the form of change in reorder buffer. We require to record it in - * form of the change so that we can execute only the required invalidations - * instead of executing all the invalidations on each CommandId increment. We - * also need to accumulate these in the toplevel transaction because in some - * cases we skip processing the transaction (see ReorderBufferForget), we need - * to execute all the invalidations together. + * accumulates all the invalidation messages in the toplevel transaction, if + * available, otherwise in the current transaction, as well as in the form of + * change in reorder buffer. We require to record it in form of the change + * so that we can execute only the required invalidations instead of executing + * all the invalidations on each CommandId increment. We also need to + * accumulate these in the txn buffer because in some cases where we skip + * processing the transaction (see ReorderBufferForget), we need to execute + * all the invalidations together. 
*/ void ReorderBufferAddInvalidations(ReorderBuffer *rb, TransactionId xid, @@ -3222,8 +3223,9 @@ ReorderBufferAddInvalidations(ReorderBuffer *rb, TransactionId xid, oldcontext = MemoryContextSwitchTo(rb->context); /* - * Collect all the invalidations under the top transaction so that we can - * execute them all together. See comment atop this function + * Collect all the invalidations under the top transaction, if available, + * so that we can execute them all together. See comments atop this + * function. */ if (txn->toptxn) txn = txn->toptxn; From af0ba46ba7d1ecb9bc3669d78c3b45adc2e6520f Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Thu, 10 Nov 2022 10:23:49 -0500 Subject: [PATCH 05/70] Re-allow building on Microsoft Visual Studio 2013. In commit 450ee7012 I supposed that all platforms we now care about have snprintf(), since that's required by C99. Turns out that Microsoft did not get around to adding that until VS2015. We've dropped support for VS2013 as of HEAD (cf 6203583b7), but not in the back branches, so add a hack for this in the back branches only. There's no easy shortcut to an exact emulation of standard snprintf in VS2013, but fortunately we don't need one: this code was just fine with using sprintf before 450ee7012, so we can make it do so again on that platform (and any others where the problem might crop up). Per bug #17681 from Daisuke Higuchi. Back-patch to v12, like the previous patch. Discussion: https://postgr.es/m/17681-485ba2ec13e7f392@postgresql.org --- src/port/snprintf.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/src/port/snprintf.c b/src/port/snprintf.c index 87525663907..8306ab4f2b8 100644 --- a/src/port/snprintf.c +++ b/src/port/snprintf.c @@ -109,6 +109,16 @@ #undef vprintf #undef printf +/* + * We use the platform's native snprintf() for some machine-dependent cases. + * While that's required by C99, Microsoft Visual Studio lacks it before + * VS2015. 
Fortunately, we don't really need the length check in practice, + * so just fall back to native sprintf() on that platform. + */ +#if defined(_MSC_VER) && _MSC_VER < 1900 /* pre-VS2015 */ +#define snprintf(str,size,...) sprintf(str,__VA_ARGS__) +#endif + /* * Info about where the formatted output is going. * From 881c5c481ed72de0faf9a9ef281530b6120f0078 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Thu, 10 Nov 2022 17:24:26 -0500 Subject: [PATCH 06/70] Fix alter_table.sql test case to test what it claims to. The stanza "SET STORAGE may need to add a TOAST table" does not test what it's supposed to, and hasn't done so since we added the ability to store constant column default values as metadata. We need to use a non-constant default to get the expected table rewrite to actually happen. Fix that, and add the missing checks that would have exposed the problem to begin with. Noted while reviewing a patch that made changes in this test case. Back-patch to v11 where the problem came in. --- src/test/regress/expected/alter_table.out | 30 +++++++++++++++++------ src/test/regress/sql/alter_table.sql | 11 ++++++--- 2 files changed, 29 insertions(+), 12 deletions(-) diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out index 0cfa1b7464a..d29bcc0da6d 100644 --- a/src/test/regress/expected/alter_table.out +++ b/src/test/regress/expected/alter_table.out @@ -2329,12 +2329,26 @@ alter table recur1 alter column f2 type recur2; -- fails ERROR: composite type recur1 cannot be made a member of itself -- SET STORAGE may need to add a TOAST table create table test_storage (a text); +select reltoastrelid <> 0 as has_toast_table + from pg_class where oid = 'test_storage'::regclass; + has_toast_table +----------------- + t +(1 row) + alter table test_storage alter a set storage plain; -alter table test_storage add b int default 0; -- rewrite table to remove its TOAST table +-- rewrite table to remove its TOAST table; need a non-constant column 
default +alter table test_storage add b int default random()::int; +select reltoastrelid <> 0 as has_toast_table + from pg_class where oid = 'test_storage'::regclass; + has_toast_table +----------------- + f +(1 row) + alter table test_storage alter a set storage extended; -- re-add TOAST table select reltoastrelid <> 0 as has_toast_table -from pg_class -where oid = 'test_storage'::regclass; + from pg_class where oid = 'test_storage'::regclass; has_toast_table ----------------- t @@ -2344,11 +2358,11 @@ where oid = 'test_storage'::regclass; create index test_storage_idx on test_storage (b, a); alter table test_storage alter column a set storage external; \d+ test_storage - Table "public.test_storage" - Column | Type | Collation | Nullable | Default | Storage | Stats target | Description ---------+---------+-----------+----------+---------+----------+--------------+------------- - a | text | | | | external | | - b | integer | | | 0 | plain | | + Table "public.test_storage" + Column | Type | Collation | Nullable | Default | Storage | Stats target | Description +--------+---------+-----------+----------+-------------------+----------+--------------+------------- + a | text | | | | external | | + b | integer | | | random()::integer | plain | | Indexes: "test_storage_idx" btree (b, a) diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql index ed8941a890a..7ddf9f898a8 100644 --- a/src/test/regress/sql/alter_table.sql +++ b/src/test/regress/sql/alter_table.sql @@ -1567,13 +1567,16 @@ alter table recur1 alter column f2 type recur2; -- fails -- SET STORAGE may need to add a TOAST table create table test_storage (a text); +select reltoastrelid <> 0 as has_toast_table + from pg_class where oid = 'test_storage'::regclass; alter table test_storage alter a set storage plain; -alter table test_storage add b int default 0; -- rewrite table to remove its TOAST table +-- rewrite table to remove its TOAST table; need a non-constant column default 
+alter table test_storage add b int default random()::int; +select reltoastrelid <> 0 as has_toast_table + from pg_class where oid = 'test_storage'::regclass; alter table test_storage alter a set storage extended; -- re-add TOAST table - select reltoastrelid <> 0 as has_toast_table -from pg_class -where oid = 'test_storage'::regclass; + from pg_class where oid = 'test_storage'::regclass; -- test that SET STORAGE propagates to index correctly create index test_storage_idx on test_storage (b, a); From 8c5f592998f6fb32f78b98ee21d788f9b21e164c Mon Sep 17 00:00:00 2001 From: Jeff Davis Date: Thu, 10 Nov 2022 14:46:30 -0800 Subject: [PATCH 07/70] Fix theoretical torn page hazard. The original report was concerned with a possible inconsistency between the heap and the visibility map, which I was unable to confirm. The concern has been retracted. However, there did seem to be a torn page hazard when using checksums. By not setting the heap page LSN during redo, the protections of minRecoveryPoint were bypassed. Fixed, along with a misleading comment. It may have been impossible to hit this problem in practice, because it would require a page tear between the checksum and the flags, so I am marking this as a theoretical risk. But, as discussed, it did violate expectations about the page LSN, so it may have other consequences. Backpatch to all supported versions. 
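For readers unfamiliar with the minRecoveryPoint protection this fix restores, here is a toy model in Python. All names are invented and this is not the PostgreSQL buffer manager; it only sketches the invariant: a dirty page may be flushed once WAL up to its LSN is durable, and during recovery a flush also advances minRecoveryPoint, so a crash after the flush must replay at least that far, rewriting the page from a full-page image and healing any torn write.

```python
wal_durable_upto = 0
min_recovery_point = 0

def flush_during_recovery(page_lsn):
    """Flush a page while in recovery (toy model)."""
    global min_recovery_point
    # WAL-before-data rule: WAL through page_lsn must already be durable.
    assert page_lsn <= wal_durable_upto
    # Flushing during recovery pulls minRecoveryPoint up to the page's LSN.
    min_recovery_point = max(min_recovery_point, page_lsn)

wal_durable_upto = 800

# With the fix, redo stamps the page with the WAL record's LSN, so
# flushing it advances minRecoveryPoint to cover that record:
flush_during_recovery(page_lsn=500)
assert min_recovery_point == 500

# Without the fix the page kept its stale LSN (say 100), so a flush left
# minRecoveryPoint short of the record that changed the page -- recovery
# could be declared complete before the possibly-torn page was rewritten:
flush_during_recovery(page_lsn=100)
assert min_recovery_point == 500   # never advanced past the stale LSN
```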
Reported-by: Konstantin Knizhnik Reviewed-by: Konstantin Knizhnik Discussion: https://postgr.es/m/fed17dac-8cb8-4f5b-d462-1bb4908c029e@garret.ru Backpatch-through: 11 --- src/backend/access/heap/heapam.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index b398e307c97..70574f6db41 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -8912,8 +8912,7 @@ heap_xlog_visible(XLogReaderState *record) /* * We don't bump the LSN of the heap page when setting the visibility * map bit (unless checksums or wal_hint_bits is enabled, in which - * case we must), because that would generate an unworkable volume of - * full-page writes. This exposes us to torn page hazards, but since + * case we must). This exposes us to torn page hazards, but since * we're not inspecting the existing page contents in any way, we * don't care. * @@ -8927,6 +8926,9 @@ heap_xlog_visible(XLogReaderState *record) PageSetAllVisible(page); + if (XLogHintBitIsNeeded()) + PageSetLSN(page, lsn); + MarkBufferDirty(buffer); } else if (action == BLK_RESTORED) From d3a9bd7ec98f5cb8999fbe03f75742d00a7eb549 Mon Sep 17 00:00:00 2001 From: Amit Kapila Date: Mon, 14 Nov 2022 10:22:28 +0530 Subject: [PATCH 08/70] Fix cleanup lock acquisition in SPLIT_ALLOCATE_PAGE replay. During XLOG_HASH_SPLIT_ALLOCATE_PAGE replay, we were checking for a cleanup lock on the new bucket page after acquiring an exclusive lock on it and raising a PANIC error on failure. However, it is quite possible that checkpointer can acquire the pin on the same page before acquiring a lock on it, and then the replay will lead to an error. So instead, directly acquire the cleanup lock on the new bucket page during XLOG_HASH_SPLIT_ALLOCATE_PAGE replay operation. 
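The race described above can be sketched abstractly (names invented; not the real buffer-manager API). The old replay code took an exclusive lock and then merely *checked* the cleanup condition, PANICking if the checkpointer had already pinned the page; acquiring a cleanup lock instead waits for the other pin to go away.

```python
class Buffer:
    """Toy stand-in for a shared buffer: a pin count plus an exclusive lock."""
    def __init__(self):
        self.pins = 0
        self.locked = False

    def cleanup_ok(self):
        # Cleanup lock = holding the exclusive lock while ours is the only pin.
        return self.locked and self.pins == 1

def replay_old(buf):
    # Old code: pin and exclusive-lock, then merely *check* the condition.
    buf.pins += 1
    buf.locked = True
    if not buf.cleanup_ok():
        raise RuntimeError("PANIC: failed to acquire cleanup lock")

def replay_new(buf, wait_for_unpin):
    # New code: wait until every other pin is gone before proceeding,
    # as acquiring a cleanup lock does.
    buf.pins += 1
    buf.locked = True
    while not buf.cleanup_ok():
        wait_for_unpin(buf)   # in reality: sleep until the pin is dropped

# Checkpointer pins the page just before replay reaches it:
buf = Buffer()
buf.pins += 1                      # checkpointer's pin
panicked = False
try:
    replay_old(buf)
except RuntimeError:
    panicked = True                # the spurious PANIC the fix removes

buf2 = Buffer()
buf2.pins += 1                     # checkpointer's pin again
replay_new(buf2, lambda b: setattr(b, "pins", b.pins - 1))  # it soon unpins
```

In the sketch the old path panics as soon as the concurrent pin exists, while the new path simply outwaits it, matching the change from a post-hoc `IsBufferCleanupOK` check to `RBM_ZERO_AND_CLEANUP_LOCK`.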
Reported-by: Andres Freund Author: Robert Haas Reviewed-By: Amit Kapila, Andres Freund, Vignesh C Backpatch-through: 11 Discussion: https://postgr.es/m/20220810022617.fvjkjiauaykwrbse@awork3.anarazel.de --- src/backend/access/hash/hash_xlog.c | 5 ++--- src/backend/access/hash/hashpage.c | 10 +++++++--- 2 files changed, 9 insertions(+), 6 deletions(-) diff --git a/src/backend/access/hash/hash_xlog.c b/src/backend/access/hash/hash_xlog.c index af35a991fc3..6e2f0c39252 100644 --- a/src/backend/access/hash/hash_xlog.c +++ b/src/backend/access/hash/hash_xlog.c @@ -352,11 +352,10 @@ hash_xlog_split_allocate_page(XLogReaderState *record) } /* replay the record for new bucket */ - newbuf = XLogInitBufferForRedo(record, 1); + XLogReadBufferForRedoExtended(record, 1, RBM_ZERO_AND_CLEANUP_LOCK, true, + &newbuf); _hash_initbuf(newbuf, xlrec->new_bucket, xlrec->new_bucket, xlrec->new_bucket_flag, true); - if (!IsBufferCleanupOK(newbuf)) - elog(PANIC, "hash_xlog_split_allocate_page: failed to acquire cleanup lock"); MarkBufferDirty(newbuf); PageSetLSN(BufferGetPage(newbuf), lsn); diff --git a/src/backend/access/hash/hashpage.c b/src/backend/access/hash/hashpage.c index 7ee505c6bab..30800705340 100644 --- a/src/backend/access/hash/hashpage.c +++ b/src/backend/access/hash/hashpage.c @@ -804,9 +804,13 @@ _hash_expandtable(Relation rel, Buffer metabuf) /* * Physically allocate the new bucket's primary page. We want to do this * before changing the metapage's mapping info, in case we can't get the - * disk space. Ideally, we don't need to check for cleanup lock on new - * bucket as no other backend could find this bucket unless meta page is - * updated. However, it is good to be consistent with old bucket locking. + * disk space. + * + * XXX It doesn't make sense to call _hash_getnewbuf first, zeroing the + * buffer, and then only afterwards check whether we have a cleanup lock. 
+ * However, since no scan can be accessing the buffer yet, any concurrent + * accesses will just be from processes like the bgwriter or checkpointer + * which don't care about its contents, so it doesn't really matter. */ buf_nblkno = _hash_getnewbuf(rel, start_nblkno, MAIN_FORKNUM); if (!IsBufferCleanupOK(buf_nblkno)) From 514dc9093a2440d778db7a525b805681b63b9e44 Mon Sep 17 00:00:00 2001 From: Noah Misch Date: Thu, 17 Nov 2022 07:35:06 -0800 Subject: [PATCH 09/70] Account for IPC::Run::result() Windows behavior change. This restores compatibility with the not-yet-released successor of version 20220807.0. Back-patch to 9.4, which introduced this code. Reviewed by Andrew Dunstan. Discussion: https://postgr.es/m/20221117061805.GA4020280@rfd.leadboat.com --- src/test/perl/TestLib.pm | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm index 610050e1c4b..2a354ea4ef9 100644 --- a/src/test/perl/TestLib.pm +++ b/src/test/perl/TestLib.pm @@ -758,15 +758,11 @@ sub command_exit_is my $h = IPC::Run::start $cmd; $h->finish(); - # On Windows, the exit status of the process is returned directly as the - # process's exit code, while on Unix, it's returned in the high bits - # of the exit code (see WEXITSTATUS macro in the standard - # header file). IPC::Run's result function always returns exit code >> 8, - # assuming the Unix convention, which will always return 0 on Windows as - # long as the process was not terminated by an exception. To work around - # that, use $h->full_results on Windows instead. + # Normally, if the child called exit(N), IPC::Run::result() returns N. On + # Windows, with IPC::Run v20220807.0 and earlier, full_results() is the + # method that returns N (https://github.com/toddr/IPC-Run/issues/161). my $result = - ($Config{osname} eq "MSWin32") + ($Config{osname} eq "MSWin32" && $IPC::Run::VERSION <= 20220807.0) ? 
($h->full_results)[0] : $h->result(0); is($result, $expected, $test_name); From 371722f0c33f02c94f51b16c186e8a42db371cf7 Mon Sep 17 00:00:00 2001 From: Andrew Dunstan Date: Fri, 18 Nov 2022 08:38:26 -0500 Subject: [PATCH 10/70] Fix version comparison in Version.pm Version strings with unequal numbers of parts were being compared incorrectly. We cure this by treating a missing part in the shorter version as 0. per complaint from Jehan-Guillaume de Rorthais, but the fix is mine, not his. Discussion: https://postgr.es/m/20220628225325.53d97b8d@karst Backpatch to release 14 where this code was introduced. --- src/test/perl/PostgresVersion.pm | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/src/test/perl/PostgresVersion.pm b/src/test/perl/PostgresVersion.pm index 4e764c36a55..884d0e949b9 100644 --- a/src/test/perl/PostgresVersion.pm +++ b/src/test/perl/PostgresVersion.pm @@ -120,9 +120,12 @@ sub _version_cmp for (my $idx = 0;; $idx++) { - return 0 unless (defined $an->[$idx] && defined $bn->[$idx]); - return $an->[$idx] <=> $bn->[$idx] - if ($an->[$idx] <=> $bn->[$idx]); + return 0 + if ($idx >= @$an && $idx >= @$bn); + # treat a missing number as 0 + my ($anum, $bnum) = ($an->[$idx] || 0, $bn->[$idx] || 0); + return $anum <=> $bnum + if ($anum <=> $bnum); } } From eedb0f7b7023bac2429092b01bf76f05dd17a114 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Sat, 19 Nov 2022 12:00:27 -0500 Subject: [PATCH 11/70] pg_dump: avoid unsafe function calls in getPolicies(). getPolicies() had the same disease I fixed in other places in commit e3fcbbd62, i.e., it was calling pg_get_expr() for expressions on tables that we don't necessarily have lock on. To fix, restrict the query to only collect interesting rows, rather than doing the filtering on the client side. Back-patch of commit 3e6e86abc. That's been in v15/HEAD long enough to have some confidence about it, so now let's fix the problem in older branches. 
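Stepping back to the Version.pm fix above for a moment: the corrected rule — compare version strings part by part, treating a missing part as 0 — can be sketched in Python. This is a sketch of the rule for numeric parts only, not the Perl module itself, and the helper name is ours.

```python
def version_cmp(a: str, b: str) -> int:
    """Compare dotted numeric version strings; a missing part counts as 0.

    Returns -1, 0, or 1, mirroring the <=> semantics of the fixed loop.
    """
    an = [int(p) for p in a.split(".")]
    bn = [int(p) for p in b.split(".")]
    for idx in range(max(len(an), len(bn))):
        # Treat a missing number as 0, per the fix.
        x = an[idx] if idx < len(an) else 0
        y = bn[idx] if idx < len(bn) else 0
        if x != y:
            return 1 if x > y else -1
    return 0
```

The old loop bailed out with 0 as soon as either list ran out of parts, so "14" and "14.1" compared equal; with the missing minor treated as 0, "14.1" correctly sorts after "14".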
Discussion: https://postgr.es/m/2273648.1634764485@sss.pgh.pa.us Discussion: https://postgr.es/m/7d7eb6128f40401d81b3b7a898b6b4de@W2012-02.nidsa.loc Discussion: https://postgr.es/m/45c93d57-9973-248e-d2df-e02ca9af48d4@darold.net --- src/bin/pg_dump/pg_dump.c | 42 +++++++++++++++++++++++++++------------ 1 file changed, 29 insertions(+), 13 deletions(-) diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index 7ea694ce07d..4a8f4937605 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -4169,6 +4169,7 @@ void getPolicies(Archive *fout, TableInfo tblinfo[], int numTables) { PQExpBuffer query; + PQExpBuffer tbloids; PGresult *res; PolicyInfo *polinfo; int i_oid; @@ -4184,15 +4185,17 @@ getPolicies(Archive *fout, TableInfo tblinfo[], int numTables) j, ntups; + /* No policies before 9.5 */ if (fout->remoteVersion < 90500) return; query = createPQExpBuffer(); + tbloids = createPQExpBuffer(); /* - * First, check which tables have RLS enabled. We represent RLS being - * enabled on a table by creating a PolicyInfo object with null polname. + * Identify tables of interest, and check which ones have RLS enabled. */ + appendPQExpBufferChar(tbloids, '{'); for (i = 0; i < numTables; i++) { TableInfo *tbinfo = &tblinfo[i]; @@ -4201,9 +4204,23 @@ getPolicies(Archive *fout, TableInfo tblinfo[], int numTables) if (!(tbinfo->dobj.dump & DUMP_COMPONENT_POLICY)) continue; + /* It can't have RLS or policies if it's not a table */ + if (tbinfo->relkind != RELKIND_RELATION && + tbinfo->relkind != RELKIND_PARTITIONED_TABLE) + continue; + + /* Add it to the list of table OIDs to be probed below */ + if (tbloids->len > 1) /* do we have more than the '{'? */ + appendPQExpBufferChar(tbloids, ','); + appendPQExpBuffer(tbloids, "%u", tbinfo->dobj.catId.oid); + + /* Is RLS enabled? (That's separate from whether it has policies) */ if (tbinfo->rowsec) { /* + * We represent RLS being enabled on a table by creating a + * PolicyInfo object with null polname. 
+ * * Note: use tableoid 0 so that this object won't be mistaken for * something that pg_depend entries apply to. */ @@ -4223,15 +4240,18 @@ getPolicies(Archive *fout, TableInfo tblinfo[], int numTables) polinfo->polwithcheck = NULL; } } + appendPQExpBufferChar(tbloids, '}'); /* - * Now, read all RLS policies, and create PolicyInfo objects for all those - * that are of interest. + * Now, read all RLS policies belonging to the tables of interest, and + * create PolicyInfo objects for them. (Note that we must filter the + * results server-side not locally, because we dare not apply pg_get_expr + * to tables we don't have lock on.) */ pg_log_info("reading row-level security policies"); printfPQExpBuffer(query, - "SELECT oid, tableoid, pol.polrelid, pol.polname, pol.polcmd, "); + "SELECT pol.oid, pol.tableoid, pol.polrelid, pol.polname, pol.polcmd, "); if (fout->remoteVersion >= 100000) appendPQExpBuffer(query, "pol.polpermissive, "); else @@ -4241,7 +4261,9 @@ getPolicies(Archive *fout, TableInfo tblinfo[], int numTables) " pg_catalog.array_to_string(ARRAY(SELECT pg_catalog.quote_ident(rolname) from pg_catalog.pg_roles WHERE oid = ANY(pol.polroles)), ', ') END AS polroles, " "pg_catalog.pg_get_expr(pol.polqual, pol.polrelid) AS polqual, " "pg_catalog.pg_get_expr(pol.polwithcheck, pol.polrelid) AS polwithcheck " - "FROM pg_catalog.pg_policy pol"); + "FROM unnest('%s'::pg_catalog.oid[]) AS src(tbloid)\n" + "JOIN pg_catalog.pg_policy pol ON (src.tbloid = pol.polrelid)", + tbloids->data); res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK); @@ -4265,13 +4287,6 @@ getPolicies(Archive *fout, TableInfo tblinfo[], int numTables) Oid polrelid = atooid(PQgetvalue(res, j, i_polrelid)); TableInfo *tbinfo = findTableByOid(polrelid); - /* - * Ignore row security on tables not to be dumped. (This will - * result in some harmless wasted slots in polinfo[].) 
- */ - if (!(tbinfo->dobj.dump & DUMP_COMPONENT_POLICY)) - continue; - polinfo[j].dobj.objType = DO_POLICY; polinfo[j].dobj.catId.tableoid = atooid(PQgetvalue(res, j, i_tableoid)); @@ -4306,6 +4321,7 @@ getPolicies(Archive *fout, TableInfo tblinfo[], int numTables) PQclear(res); destroyPQExpBuffer(query); + destroyPQExpBuffer(tbloids); } /* From 66c744c366ba2b20e0770611a2a8b5901688fcb8 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Sat, 19 Nov 2022 13:09:14 -0500 Subject: [PATCH 12/70] Doc: sync src/tutorial/basics.source with SGML documentation. basics.source is supposed to be pretty closely in step with the examples in chapter 2 of the tutorial, but I forgot to update it in commit f05a5e000. Fix that, and adjust a couple of other discrepancies that had crept in over time. (I notice that advanced.source is nowhere near being in sync with chapter 3, but I lack the ambition to do something about that right now.) --- src/tutorial/basics.source | 27 ++++++++++++++++++++++----- 1 file changed, 22 insertions(+), 5 deletions(-) diff --git a/src/tutorial/basics.source b/src/tutorial/basics.source index 3e74d718ab0..d09ff5029bc 100644 --- a/src/tutorial/basics.source +++ b/src/tutorial/basics.source @@ -79,6 +79,11 @@ SELECT * WHERE city = 'San Francisco' AND prcp > 0.0; +-- You can request that the results of a query be returned in sorted order: + +SELECT * FROM weather + ORDER BY city, temp_lo; + -- Here is a more complicated one. Duplicates are removed when DISTINCT is -- specified. ORDER BY specifies the column to sort on. (Just to make sure the -- following won't confuse you, DISTINCT and ORDER BY can be used separately.) @@ -108,7 +113,8 @@ SELECT city, temp_lo, temp_hi, prcp, date, location -- table name. If you want to be clear, you can do the following. They give -- identical results, of course. 
-SELECT weather.city, weather.temp_lo, weather.temp_hi, weather.prcp, weather.date, cities.location +SELECT weather.city, weather.temp_lo, weather.temp_hi, + weather.prcp, weather.date, cities.location FROM weather JOIN cities ON weather.city = cities.name; -- Old join syntax @@ -125,8 +131,8 @@ SELECT * -- Suppose we want to find all the records that are in the temperature range -- of other records. w1 and w2 are aliases for weather. -SELECT w1.city, w1.temp_lo, w1.temp_hi, - w2.city, w2.temp_lo, w2.temp_hi +SELECT w1.city, w1.temp_lo AS low, w1.temp_hi AS high, + w2.city, w2.temp_lo AS low, w2.temp_hi AS high FROM weather w1 JOIN weather w2 ON w1.temp_lo < w2.temp_lo AND w1.temp_hi > w2.temp_hi; @@ -142,16 +148,27 @@ SELECT city FROM weather WHERE temp_lo = (SELECT max(temp_lo) FROM weather); -- Aggregate with GROUP BY -SELECT city, max(temp_lo) +SELECT city, count(*), max(temp_lo) FROM weather GROUP BY city; -- ... and HAVING -SELECT city, max(temp_lo) +SELECT city, count(*), max(temp_lo) FROM weather GROUP BY city HAVING max(temp_lo) < 40; +-- We can filter rows before aggregating them: +SELECT city, count(*), max(temp_lo) + FROM weather + WHERE city LIKE 'S%' + GROUP BY city; + +-- Another way is the FILTER clause, which operates per-aggregate: +SELECT city, count(*) FILTER (WHERE temp_lo < 45), max(temp_lo) + FROM weather + GROUP BY city; + ----------------------------- -- Updates: From 0f5d6d5faddbca5acd84b2ac12e3fb8604b906f6 Mon Sep 17 00:00:00 2001 From: Andres Freund Date: Wed, 16 Nov 2022 20:00:59 -0800 Subject: [PATCH 13/70] Fix mislabeling of PROC_QUEUE->links as PGPROC, fixing UBSan on 32bit ProcSleep() used a PGPROC* variable to point to PROC_QUEUE->links.next, because that does "the right thing" with SHMQueueInsertBefore(). While that largely works, it's certainly not correct, and it's unnecessary - we can just use SHM_QUEUE* to point to the insertion point. Noticed when testing a 32bit build of postgres with undefined behavior sanitizer. 
UBSan noticed that sometimes the supposed PGPROC wasn't sufficiently aligned (required since 46d6e5f5679, ensured indirectly, via ShmemAllocRaw() guaranteeing cacheline alignment). For now fix this by using a SHM_QUEUE* for the insertion point. Subsequently we should replace all the use of PROC_QUEUE and SHM_QUEUE with ilist.h, but that's a larger change that we don't want to backpatch. Backpatch to all supported versions - it's useful to be able to run postgres under UBSan. Reviewed-by: Tom Lane Discussion: https://postgr.es/m/20221117014230.op5kmgypdv2dtqsf@awork3.anarazel.de Backpatch: 11- --- src/backend/storage/lmgr/proc.c | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c index 37d917a1f3e..cb9d3e6d580 100644 --- a/src/backend/storage/lmgr/proc.c +++ b/src/backend/storage/lmgr/proc.c @@ -1320,13 +1320,13 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable) uint32 hashcode = locallock->hashcode; LWLock *partitionLock = LockHashPartitionLock(hashcode); PROC_QUEUE *waitQueue = &(lock->waitProcs); + SHM_QUEUE *waitQueuePos; LOCKMASK myHeldLocks = MyProc->heldLocks; TimestampTz standbyWaitStart = 0; bool early_deadlock = false; bool allow_autovacuum_cancel = true; bool logged_recovery_conflict = false; ProcWaitStatus myWaitStatus; - PGPROC *proc; PGPROC *leader = MyProc->lockGroupLeader; int i; @@ -1374,13 +1374,16 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable) * we are only considering the part of the wait queue before my insertion * point. 
*/ - if (myHeldLocks != 0) + if (myHeldLocks != 0 && waitQueue->size > 0) { LOCKMASK aheadRequests = 0; + SHM_QUEUE *proc_node; - proc = (PGPROC *) waitQueue->links.next; + proc_node = waitQueue->links.next; for (i = 0; i < waitQueue->size; i++) { + PGPROC *proc = (PGPROC *) proc_node; + /* * If we're part of the same locking group as this waiter, its * locks neither conflict with ours nor contribute to @@ -1388,7 +1391,7 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable) */ if (leader != NULL && leader == proc->lockGroupLeader) { - proc = (PGPROC *) proc->links.next; + proc_node = proc->links.next; continue; } /* Must he wait for me? */ @@ -1423,24 +1426,25 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable) } /* Nope, so advance to next waiter */ aheadRequests |= LOCKBIT_ON(proc->waitLockMode); - proc = (PGPROC *) proc->links.next; + proc_node = proc->links.next; } /* - * If we fall out of loop normally, proc points to waitQueue head, so - * we will insert at tail of queue as desired. + * If we iterated through the whole queue, proc_node points to the waitQueue + * head, so we will insert at tail of queue as desired. */ + waitQueuePos = proc_node; } else { /* I hold no locks, so I can't push in front of anyone. */ - proc = (PGPROC *) &(waitQueue->links); + waitQueuePos = &waitQueue->links; } /* - * Insert self into queue, ahead of the given proc (or at tail of queue). + * Insert self into queue, at the position determined above. */ - SHMQueueInsertBefore(&(proc->links), &(MyProc->links)); + SHMQueueInsertBefore(waitQueuePos, &MyProc->links); waitQueue->size++; lock->waitMask |= LOCKBIT_ON(lockmode); From 033b0fcda56e9179ae015f4aca6f46dec1181d9f Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Mon, 21 Nov 2022 10:50:50 -0500 Subject: [PATCH 14/70] Prevent instability in contrib/pageinspect's regression test.
pageinspect has occasionally failed on slow buildfarm members, with symptoms indicating that the expected effects of VACUUM FREEZE didn't happen. This is presumably because a background transaction such as auto-analyze was holding back global xmin. We can work around that by using a temp table in the test. Since commit a7212be8b, that will use an up-to-date cutoff xmin regardless of other processes. And pageinspect itself shouldn't really care whether the table is temp. Back-patch to v14. There would be no point in older branches without back-patching a7212be8b, which seems like more trouble than the problem is worth. Discussion: https://postgr.es/m/2892135.1668976646@sss.pgh.pa.us --- contrib/pageinspect/expected/page.out | 3 ++- contrib/pageinspect/sql/page.sql | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/contrib/pageinspect/expected/page.out b/contrib/pageinspect/expected/page.out index 8a303efc5f4..f5b6ad8593a 100644 --- a/contrib/pageinspect/expected/page.out +++ b/contrib/pageinspect/expected/page.out @@ -1,5 +1,6 @@ CREATE EXTENSION pageinspect; -CREATE TABLE test1 (a int, b int); +-- Use a temp table so that effects of VACUUM are predictable +CREATE TEMP TABLE test1 (a int, b int); INSERT INTO test1 VALUES (16777217, 131584); VACUUM (DISABLE_PAGE_SKIPPING) test1; -- set up FSM -- The page contents can vary, so just test that it can be read diff --git a/contrib/pageinspect/sql/page.sql b/contrib/pageinspect/sql/page.sql index b5c41cc8ac5..5bff568d3b5 100644 --- a/contrib/pageinspect/sql/page.sql +++ b/contrib/pageinspect/sql/page.sql @@ -1,6 +1,7 @@ CREATE EXTENSION pageinspect; -CREATE TABLE test1 (a int, b int); +-- Use a temp table so that effects of VACUUM are predictable +CREATE TEMP TABLE test1 (a int, b int); INSERT INTO test1 VALUES (16777217, 131584); VACUUM (DISABLE_PAGE_SKIPPING) test1; -- set up FSM From 9fa14706476fd1593ddceed8ff50c81407dc5a25 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Mon, 21 Nov 2022 15:37:48 -0500 
Subject: [PATCH 15/70] Revert "Prevent instability in contrib/pageinspect's regression test." This reverts commit 5cda142bb9d2bd7e7ed1c22ae89afe58abfa8d7b (in v14 only). It turns out that that fails under force_parallel_mode = regress, because pageinspect's disk-access functions are marked parallel safe, which they are not if you try to use them on a temp table. The cost of fixing that pre-v15 seems to exceed the value of making this test case fully stable, so we will just leave things as-is in v14. --- contrib/pageinspect/expected/page.out | 3 +-- contrib/pageinspect/sql/page.sql | 3 +-- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/contrib/pageinspect/expected/page.out b/contrib/pageinspect/expected/page.out index f5b6ad8593a..8a303efc5f4 100644 --- a/contrib/pageinspect/expected/page.out +++ b/contrib/pageinspect/expected/page.out @@ -1,6 +1,5 @@ CREATE EXTENSION pageinspect; --- Use a temp table so that effects of VACUUM are predictable -CREATE TEMP TABLE test1 (a int, b int); +CREATE TABLE test1 (a int, b int); INSERT INTO test1 VALUES (16777217, 131584); VACUUM (DISABLE_PAGE_SKIPPING) test1; -- set up FSM -- The page contents can vary, so just test that it can be read diff --git a/contrib/pageinspect/sql/page.sql b/contrib/pageinspect/sql/page.sql index 5bff568d3b5..b5c41cc8ac5 100644 --- a/contrib/pageinspect/sql/page.sql +++ b/contrib/pageinspect/sql/page.sql @@ -1,7 +1,6 @@ CREATE EXTENSION pageinspect; --- Use a temp table so that effects of VACUUM are predictable -CREATE TEMP TABLE test1 (a int, b int); +CREATE TABLE test1 (a int, b int); INSERT INTO test1 VALUES (16777217, 131584); VACUUM (DISABLE_PAGE_SKIPPING) test1; -- set up FSM From 98a1135b8ce98a64cd5a8e4df503217ec6b88483 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Mon, 21 Nov 2022 17:07:07 -0500 Subject: [PATCH 16/70] Add comments and a missing CHECK_FOR_INTERRUPTS in ts_headline. 
I just spent an annoying amount of time reverse-engineering the 100%-undocumented API between ts_headline and the text search parser's prsheadline function. Add some commentary about that while it's fresh in mind. Also remove some unused macros in wparser_def.c. While at it, I noticed that when commit 78e73e875 added a CHECK_FOR_INTERRUPTS call in TS_execute_recurse, it missed doing so in the parallel function TS_phrase_execute, which surely needs one just as much. Back-patch because of the missing CHECK_FOR_INTERRUPTS. Might as well back-patch the rest of this too. --- src/backend/tsearch/ts_parse.c | 13 ++++++ src/backend/tsearch/wparser_def.c | 4 -- src/backend/utils/adt/tsvector_op.c | 3 ++ src/include/tsearch/ts_public.h | 61 +++++++++++++++++++++-------- 4 files changed, 60 insertions(+), 21 deletions(-) diff --git a/src/backend/tsearch/ts_parse.c b/src/backend/tsearch/ts_parse.c index 92d95b4bd49..dbd5b176e1a 100644 --- a/src/backend/tsearch/ts_parse.c +++ b/src/backend/tsearch/ts_parse.c @@ -433,6 +433,8 @@ parsetext(Oid cfgId, ParsedText *prs, char *buf, int buflen) /* * Headline framework */ + +/* Add a word to prs->words[] */ static void hladdword(HeadlineParsedText *prs, char *buf, int buflen, int type) { @@ -449,6 +451,14 @@ hladdword(HeadlineParsedText *prs, char *buf, int buflen, int type) prs->curwords++; } +/* + * Add pos and matching-query-item data to the just-added word. + * Here, buf/buflen represent a processed lexeme, not raw token text. + * + * If the query contains more than one matching item, we replicate + * the last-added word so that each item can be pointed to. The + * duplicate entries are marked with repeated = 1. 
+ */ static void hlfinditem(HeadlineParsedText *prs, TSQuery query, int32 pos, char *buf, int buflen) { @@ -590,6 +600,9 @@ hlparsetext(Oid cfgId, HeadlineParsedText *prs, TSQuery query, char *buf, int bu FunctionCall1(&(prsobj->prsend), PointerGetDatum(prsdata)); } +/* + * Generate the headline, as a text object, from HeadlineParsedText. + */ text * generateHeadline(HeadlineParsedText *prs) { diff --git a/src/backend/tsearch/wparser_def.c b/src/backend/tsearch/wparser_def.c index 559dff63558..a3e5baf9782 100644 --- a/src/backend/tsearch/wparser_def.c +++ b/src/backend/tsearch/wparser_def.c @@ -1914,10 +1914,6 @@ prsd_end(PG_FUNCTION_ARGS) */ /* token type classification macros */ -#define LEAVETOKEN(x) ( (x)==SPACE ) -#define COMPLEXTOKEN(x) ( (x)==URL_T || (x)==NUMHWORD || (x)==ASCIIHWORD || (x)==HWORD ) -#define ENDPUNCTOKEN(x) ( (x)==SPACE ) - #define TS_IDIGNORE(x) ( (x)==TAG_T || (x)==PROTOCOL || (x)==SPACE || (x)==XMLENTITY ) #define HLIDREPLACE(x) ( (x)==TAG_T ) #define HLIDSKIP(x) ( (x)==URL_T || (x)==NUMHWORD || (x)==ASCIIHWORD || (x)==HWORD ) diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c index 9a4f0a64427..ca23d32d7b3 100644 --- a/src/backend/utils/adt/tsvector_op.c +++ b/src/backend/utils/adt/tsvector_op.c @@ -1625,6 +1625,9 @@ TS_phrase_execute(QueryItem *curitem, void *arg, uint32 flags, /* since this function recurses, it could be driven to stack overflow */ check_stack_depth(); + /* ... 
and let's check for query cancel while we're at it */ + CHECK_FOR_INTERRUPTS(); + if (curitem->type == QI_VAL) return chkcond(arg, (QueryOperand *) curitem, data); diff --git a/src/include/tsearch/ts_public.h b/src/include/tsearch/ts_public.h index adb9ae5fb9c..c9e5888bdd4 100644 --- a/src/include/tsearch/ts_public.h +++ b/src/include/tsearch/ts_public.h @@ -30,33 +30,60 @@ typedef struct } LexDescr; /* - * Interface to headline generator + * Interface to headline generator (tsparser's prsheadline function) + * + * HeadlineParsedText describes the text that is to be highlighted. + * Some fields are passed from the core code to the prsheadline function, + * while others are output from the prsheadline function. + * + * The principal data is words[], an array of HeadlineWordEntry, + * one entry per token, of length curwords. + * The fields of HeadlineWordEntry are: + * + * in, selected, replace, skip: these flags are initially zero + * and may be set by the prsheadline function. A consecutive group + * of tokens marked "in" form a "fragment" to be output. + * Such tokens may additionally be marked selected, replace, or skip + * to modify how they are shown. (If you set more than one of those + * bits, you get an unspecified one of those behaviors.) + * + * type, len, pos, word: filled by core code to describe the token. + * + * item: if the token matches any operand of the tsquery of interest, + * a pointer to such an operand. (If there are multiple matching + * operands, we generate extra copies of the HeadlineWordEntry to hold + * all the pointers. The extras are marked with repeated = 1 and should + * be ignored except for checking the item pointer.) 
*/ typedef struct { - uint32 selected:1, - in:1, - replace:1, - repeated:1, - skip:1, - unused:3, - type:8, - len:16; - WordEntryPos pos; - char *word; - QueryOperand *item; + uint32 selected:1, /* token is to be highlighted */ + in:1, /* token is part of headline */ + replace:1, /* token is to be replaced with a space */ + repeated:1, /* duplicate entry to hold item pointer */ + skip:1, /* token is to be skipped (not output) */ + unused:3, /* available bits */ + type:8, /* parser's token category */ + len:16; /* length of token */ + WordEntryPos pos; /* position of token */ + char *word; /* text of token (not null-terminated) */ + QueryOperand *item; /* a matching query operand, or NULL if none */ } HeadlineWordEntry; typedef struct { + /* Fields filled by core code before calling prsheadline function: */ HeadlineWordEntry *words; - int32 lenwords; - int32 curwords; - int32 vectorpos; /* positions a-la tsvector */ - char *startsel; + int32 lenwords; /* allocated length of words[] */ + int32 curwords; /* current number of valid entries */ + int32 vectorpos; /* used by ts_parse.c in filling pos fields */ + + /* The prsheadline function must fill these fields: */ + /* Strings for marking selected tokens and separating fragments: */ + char *startsel; /* palloc'd strings */ char *stopsel; char *fragdelim; - int16 startsellen; + int16 startsellen; /* lengths of strings */ int16 stopsellen; int16 fragdelimlen; } HeadlineParsedText; From 1d9a3b63d607e42bf9001cefffbd9ac7f369462b Mon Sep 17 00:00:00 2001 From: Daniel Gustafsson Date: Mon, 21 Nov 2022 23:25:48 +0100 Subject: [PATCH 17/70] Replace link to Hunspell with the current homepage The Hunspell project moved from Sourceforge to Github sometime in 2016, so update our links to match the new URL. Backpatch the doc changes to all supported versions. 
Discussion: https://postgr.es/m/DC9A662A-360D-4125-A453-5A6CB9C6C4B4@yesql.se Backpatch-through: v11 --- doc/src/sgml/textsearch.sgml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml index fbe049f0636..fb3cc34c303 100644 --- a/doc/src/sgml/textsearch.sgml +++ b/doc/src/sgml/textsearch.sgml @@ -2902,7 +2902,7 @@ SELECT plainto_tsquery('supernova star'); url="https://www.cs.hmc.edu/~geoff/ispell.html">Ispell. Also, some more modern dictionary file formats are supported — MySpell (OO < 2.0.1) - and Hunspell + and Hunspell (OO >= 2.0.2). A large list of dictionaries is available on the OpenOffice Wiki. From fac042d5ac035b0a3313793185fe8494491aec5d Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Tue, 22 Nov 2022 14:40:20 -0500 Subject: [PATCH 18/70] YA attempt at taming worst-case behavior of get_actual_variable_range. We've made multiple attempts at preventing get_actual_variable_range from taking an unreasonable amount of time (3ca930fc3, fccebe421). But there's still an issue for the very first planning attempt after deletion of a large number of extremal-valued tuples. While that planning attempt will set "killed" bits on the tuples it visits and thereby reduce effort for next time, there's still a lot of work it has to do to visit the heap and then set those bits. It's (usually?) not worth it to do that much work at plan time to have a slightly better estimate, especially in a context like this where the table contents are known to be mutating rapidly. Therefore, let's bound the amount of work to be done by giving up after we've visited 100 heap pages. Giving up just means we'll fall back on the extremal value recorded in pg_statistic, so it shouldn't mean that planner estimates suddenly become worthless. 
Note that this means we'll still gradually whittle down the problem by setting a few more index "killed" bits in each planning attempt; so eventually we'll reach a good state (barring further deletions), even in the absence of VACUUM. Simon Riggs, per a complaint from Jakub Wartak (with cosmetic adjustments by me). Back-patch to all supported branches. Discussion: https://postgr.es/m/CAKZiRmznOwi0oaV=4PHOCM4ygcH4MgSvt8=5cu_vNCfc8FSUug@mail.gmail.com --- src/backend/utils/adt/selfuncs.c | 45 ++++++++++++++++++++++++++++---- 1 file changed, 40 insertions(+), 5 deletions(-) diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c index 10017cb583a..1f042e7c631 100644 --- a/src/backend/utils/adt/selfuncs.c +++ b/src/backend/utils/adt/selfuncs.c @@ -6202,7 +6202,7 @@ get_stats_slot_range(AttStatsSlot *sslot, Oid opfuncoid, FmgrInfo *opproc, * and fetching its low and/or high values. * If successful, store values in *min and *max, and return true. * (Either pointer can be NULL if that endpoint isn't needed.) - * If no data available, return false. + * If unsuccessful, return false. * * sortop is the "<" comparison operator to use. * collation is the required collation. @@ -6331,11 +6331,11 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata, } else { - /* If min not requested, assume index is nonempty */ + /* If min not requested, still want to fetch max */ have_data = true; } - /* If max is requested, and we didn't find the index is empty */ + /* If max is requested, and we didn't already fail ... */ if (max && have_data) { /* scan in the opposite direction; all else is the same */ @@ -6369,7 +6369,7 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata, /* * Get one endpoint datum (min or max depending on indexscandir) from the - * specified index. Return true if successful, false if index is empty. + * specified index. Return true if successful, false if not. 
* On success, endpoint value is stored to *endpointDatum (and copied into * outercontext). * @@ -6379,6 +6379,9 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata, * to probe the heap. * (We could compute these values locally, but that would mean computing them * twice when get_actual_variable_range needs both the min and the max.) + * + * Failure occurs either when the index is empty, or we decide that it's + * taking too long to find a suitable tuple. */ static bool get_actual_variable_endpoint(Relation heapRel, @@ -6395,6 +6398,8 @@ get_actual_variable_endpoint(Relation heapRel, SnapshotData SnapshotNonVacuumable; IndexScanDesc index_scan; Buffer vmbuffer = InvalidBuffer; + BlockNumber last_heap_block = InvalidBlockNumber; + int n_visited_heap_pages = 0; ItemPointer tid; Datum values[INDEX_MAX_KEYS]; bool isnull[INDEX_MAX_KEYS]; @@ -6437,6 +6442,12 @@ get_actual_variable_endpoint(Relation heapRel, * might get a bogus answer that's not close to the index extremal value, * or could even be NULL. We avoid this hazard because we take the data * from the index entry not the heap. + * + * Despite all this care, there are situations where we might find many + * non-visible tuples near the end of the index. We don't want to expend + * a huge amount of time here, so we give up once we've read too many heap + * pages. When we fail for that reason, the caller will end up using + * whatever extremal value is recorded in pg_statistic. 
*/ InitNonVacuumableSnapshot(SnapshotNonVacuumable, GlobalVisTestFor(heapRel)); @@ -6451,13 +6462,37 @@ get_actual_variable_endpoint(Relation heapRel, /* Fetch first/next tuple in specified direction */ while ((tid = index_getnext_tid(index_scan, indexscandir)) != NULL) { + BlockNumber block = ItemPointerGetBlockNumber(tid); + if (!VM_ALL_VISIBLE(heapRel, - ItemPointerGetBlockNumber(tid), + block, &vmbuffer)) { /* Rats, we have to visit the heap to check visibility */ if (!index_fetch_heap(index_scan, tableslot)) + { + /* + * No visible tuple for this index entry, so we need to + * advance to the next entry. Before doing so, count heap + * page fetches and give up if we've done too many. + * + * We don't charge a page fetch if this is the same heap page + * as the previous tuple. This is on the conservative side, + * since other recently-accessed pages are probably still in + * buffers too; but it's good enough for this heuristic. + */ +#define VISITED_PAGES_LIMIT 100 + + if (block != last_heap_block) + { + last_heap_block = block; + n_visited_heap_pages++; + if (n_visited_heap_pages > VISITED_PAGES_LIMIT) + break; + } + continue; /* no visible tuple, try next index entry */ + } /* We don't actually need the heap tuple for anything */ ExecClearTuple(tableslot); From 6ff738902e1868aceb20dc8c272ca457f25dbd4f Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Thu, 24 Nov 2022 10:45:10 +0100 Subject: [PATCH 19/70] Make multixact error message more explicit There are recent reports involving a very old error message that we have no history of hitting -- perhaps a recently introduced bug. Improve the error message in an attempt to improve our chances of investigating the bug. Per reports from Dimos Stamatakis and Bob Krier. Backpatch to 11. 
Discussion: https://postgr.es/m/CO2PR0801MB2310579F65529380A4E5EDC0E20A9@CO2PR0801MB2310.namprd08.prod.outlook.com Discussion: https://postgr.es/m/17518-04e368df5ad7f2ee@postgresql.org --- src/backend/access/transam/multixact.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c index ea9e7815a6f..7d058217e99 100644 --- a/src/backend/access/transam/multixact.c +++ b/src/backend/access/transam/multixact.c @@ -799,7 +799,8 @@ MultiXactIdCreateFromMembers(int nmembers, MultiXactMember *members) if (ISUPDATE_from_mxstatus(members[i].status)) { if (has_update) - elog(ERROR, "new multixact has more than one updating member"); + elog(ERROR, "new multixact has more than one updating member: %s", + mxid_to_string(InvalidMultiXactId, nmembers, members)); has_update = true; } } From f849b598c863520c99e3c27256b9ee0778da33a0 Mon Sep 17 00:00:00 2001 From: Amit Kapila Date: Fri, 25 Nov 2022 09:25:50 +0530 Subject: [PATCH 20/70] Fix uninitialized access to InitialRunningXacts during decoding. In commit 272248a0c, we introduced an InitialRunningXacts array to remember transactions and subtransactions that were running when the xl_running_xacts record that we decoded was written. This array was allocated in the snapshot builder memory context after we restored a serialized snapshot, but we forgot to reset the array while freeing the builder memory context. So, the next time we start decoding in the same session without restoring any serialized snapshot, we end up using the uninitialized array, which can lead to unpredictable behavior. This problem doesn't exist in HEAD because, instead of using InitialRunningXacts, we added the list of transaction IDs and sub-transaction IDs that have modified catalogs and are running during snapshot serialization to the serialized snapshot (see commit 7f13ac8123).
Reported-by: Maxim Orlov Author: Masahiko Sawada Reviewed-by: Amit Kapila, Maxim Orlov Backpatch-through: 11 Discussion: https://postgr.es/m/CACG=ezZoz_KG+Ryh9MrU_g5e0HiVoHocEvqFF=NRrhrwKmEQJQ@mail.gmail.com --- src/backend/replication/logical/snapbuild.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c index 50df199f01f..1df82bc9ddc 100644 --- a/src/backend/replication/logical/snapbuild.c +++ b/src/backend/replication/logical/snapbuild.c @@ -343,6 +343,9 @@ AllocateSnapshotBuilder(ReorderBuffer *reorder, MemoryContextSwitchTo(oldcontext); + /* The initial running transactions array must be empty. */ + Assert(NInitialRunningXacts == 0 && InitialRunningXacts == NULL); + return builder; } @@ -363,6 +366,10 @@ FreeSnapshotBuilder(SnapBuild *builder) /* other resources are deallocated via memory context reset */ MemoryContextDelete(context); + + /* InitialRunningXacts is freed along with the context */ + NInitialRunningXacts = 0; + InitialRunningXacts = NULL; } /* From 44636b04c8b636939ce8f958c6b840462a75ffda Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Sat, 26 Nov 2022 10:30:31 -0500 Subject: [PATCH 21/70] Remove temporary portlock directory during make [dist]clean. Another oversight in 9b4eafcaf. --- GNUmakefile.in | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/GNUmakefile.in b/GNUmakefile.in index 623074305ce..70f635b16e7 100644 --- a/GNUmakefile.in +++ b/GNUmakefile.in @@ -115,7 +115,7 @@ $(call recurse,distprep coverage,doc src config contrib gpcontrib) # it's not built by default $(call recurse,clean,doc contrib gpcontrib src config) clean: - rm -rf tmp_install/ + rm -rf tmp_install/ portlock/ # Garbage from autoconf: @rm -rf autom4te.cache/ # leap over gpAux/Makefile into subdirectories to avoid circular dependency. 
@@ -132,7 +132,7 @@ distclean maintainer-clean: $(MAKE) -C config $@ $(MAKE) -C gpMgmt $@ $(MAKE) -C src $@ - rm -rf tmp_install/ + rm -rf tmp_install/ portlock/ # Garbage from autoconf: @rm -rf autom4te.cache/ rm -f config.cache config.log config.status GNUmakefile From d240b2636134d041f04686c88facc6b55fbdd239 Mon Sep 17 00:00:00 2001 From: Andrew Dunstan Date: Sun, 27 Nov 2022 09:03:22 -0500 Subject: [PATCH 22/70] Fix binary mismatch for MSVC plperl vs gcc built perl libs When loading plperl built against Strawberry perl or the msys2 ucrt perl, which have been built with gcc, a binary mismatch has been encountered that looks like this: loadable library and perl binaries are mismatched (got handshake key 0000000012800080, needed 0000000012900080) To cure this we bring the handshake keys into sync by adding NO_THREAD_SAFE_LOCALE to the defines used to build plperl. Discussion: https://postgr.es/m/20211005004334.tgjmro4kuachwiuc@alap3.anarazel.de Discussion: https://postgr.es/m/c2da86a0-2906-744c-923d-16da6047875e@dunslane.net Backpatch to all live branches. --- src/tools/msvc/Mkvcbuild.pm | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm index d3f32c53c84..3a9246e8578 100644 --- a/src/tools/msvc/Mkvcbuild.pm +++ b/src/tools/msvc/Mkvcbuild.pm @@ -608,6 +608,9 @@ sub mkvcbuild # hack to prevent duplicate definitions of uid_t/gid_t push(@perl_embed_ccflags, 'PLPERL_HAVE_UID_GID'); + # prevent binary mismatch between MSVC built plperl and + # Strawberry or msys ucrt perl libraries + push(@perl_embed_ccflags, 'NO_THREAD_SAFE_LOCALE'); # Windows offers several 32-bit ABIs. Perl is sensitive to # sizeof(time_t), one of the ABI dimensions. To get 32-bit time_t,
The JOIN_SEMI case Assert'ed that there are no PlaceHolderVars that need to be evaluated at the semijoin's RHS, which is wrong because there could be some in the semijoin's qual condition. However, there could not be any references further up than that, and within the qual there is not any way that such a PHV could have gone to null yet, so we don't really need the PHV and there is no need to avoid making the RHS-removal optimization. The upshot is that there's no actual bug in production code, and we ought to just remove this misguided Assert. While we're here, also drop the JOIN_RIGHT case, which is dead code because reduce_outer_joins() already got rid of JOIN_RIGHT. Per bug #17700 from Xin Wen. Uselessness of the JOIN_RIGHT case pointed out by Richard Guo. Back-patch to v12 where this code was added. Discussion: https://postgr.es/m/17700-2b5c10d917c30687@postgresql.org --- src/backend/optimizer/prep/prepjointree.c | 24 +++++++++-------------- src/test/regress/expected/join.out | 20 +++++++++++++++++++ src/test/regress/sql/join.sql | 10 ++++++++++ 3 files changed, 39 insertions(+), 15 deletions(-) diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c index eca7ac93a2d..eb4450a3632 100644 --- a/src/backend/optimizer/prep/prepjointree.c +++ b/src/backend/optimizer/prep/prepjointree.c @@ -3412,16 +3412,6 @@ remove_useless_results_recurse(PlannerInfo *root, Node *jtnode) jtnode = j->larg; } break; - case JOIN_RIGHT: - /* Mirror-image of the JOIN_LEFT case */ - if ((varno = get_result_relid(root, j->larg)) != 0 && - (j->quals == NULL || - !find_dependent_phvs(root, varno))) - { - remove_result_refs(root, varno, j->rarg); - jtnode = j->rarg; - } - break; case JOIN_SEMI: /* @@ -3430,14 +3420,17 @@ remove_useless_results_recurse(PlannerInfo *root, Node *jtnode) * LHS, since we should either return the LHS row or not. For * simplicity we inject the filter qual into a new FromExpr. 
* - * Unlike the LEFT/RIGHT cases, we just Assert that there are - * no PHVs that need to be evaluated at the semijoin's RHS, - * since the rest of the query couldn't reference any outputs - * of the semijoin's RHS. + * There is a fine point about PHVs that are supposed to be + * evaluated at the RHS. Such PHVs could only appear in the + * semijoin's qual, since the rest of the query cannot + * reference any outputs of the semijoin's RHS. Therefore, + * they can't actually go to null before being examined, and + * it'd be OK to just remove the PHV wrapping. We don't have + * infrastructure for that, but remove_result_refs() will + * relabel them as to be evaluated at the LHS, which is fine. */ if ((varno = get_result_relid(root, j->rarg)) != 0) { - Assert(!find_dependent_phvs(root, varno)); remove_result_refs(root, varno, j->larg); if (j->quals) jtnode = (Node *) @@ -3456,6 +3449,7 @@ remove_useless_results_recurse(PlannerInfo *root, Node *jtnode) break; default: + /* Note: JOIN_RIGHT should be gone at this point */ elog(ERROR, "unrecognized join type: %d", (int) j->jointype); break; diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out index 73d95559569..1e82018e357 100644 --- a/src/test/regress/expected/join.out +++ b/src/test/regress/expected/join.out @@ -3570,6 +3570,26 @@ where b; 0 | t | t (2 rows) +-- Test PHV in a semijoin qual, which confused useless-RTE removal (bug #17700) +explain (verbose, costs off) +with ctetable as not materialized ( select 1 as f1 ) +select * from ctetable c1 +where f1 in ( select c3.f1 from ctetable c2 full join ctetable c3 on true ); + QUERY PLAN +---------------------------- + Result + Output: 1 + One-Time Filter: (1 = 1) +(3 rows) + +with ctetable as not materialized ( select 1 as f1 ) +select * from ctetable c1 +where f1 in ( select c3.f1 from ctetable c2 full join ctetable c3 on true ); + f1 +---- + 1 +(1 row) + -- -- test inlining of immutable functions -- diff --git 
a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql index 1b994597d3c..9c6823a0889 100644 --- a/src/test/regress/sql/join.sql +++ b/src/test/regress/sql/join.sql @@ -1124,6 +1124,16 @@ select * from select a as b) as t3 where b; +-- Test PHV in a semijoin qual, which confused useless-RTE removal (bug #17700) +explain (verbose, costs off) +with ctetable as not materialized ( select 1 as f1 ) +select * from ctetable c1 +where f1 in ( select c3.f1 from ctetable c2 full join ctetable c3 on true ); + +with ctetable as not materialized ( select 1 as f1 ) +select * from ctetable c1 +where f1 in ( select c3.f1 from ctetable c2 full join ctetable c3 on true ); + -- -- test inlining of immutable functions -- From 305491acc24b7cb9edf2f19234ae38201410273e Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Tue, 29 Nov 2022 11:46:33 -0500 Subject: [PATCH 24/70] Prevent clobbering of utility statements in SQL function caches. This is an oversight in commit 7c337b6b5: I apparently didn't think about the possibility of a SQL function being executed multiple times within a query. In that case, functions.c's primitive caching mechanism allows the same utility parse tree to be presented for execution more than once. We have to tell ProcessUtility to make a working copy of the parse tree, or bad things happen. Normally I'd add a regression test, but I think the reported crasher is dependent on some rather random implementation choices that are nowhere near functions.c, so its usefulness as a long-lived test feels questionable. In any case, this fix is clearly correct given the design choices of 7c337b6b5. Per bug #17702 from Xin Wen. Thanks to Daniel Gustafsson for analysis. Back-patch to v14 where the faulty commit came in (before that, the responsibility for copying scribble-able utility parse trees lay elsewhere). 
Discussion: https://postgr.es/m/17702-ad24fdcdd1e9047a@postgresql.org --- src/backend/executor/functions.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c index 820d4c60cc0..41da04f652b 100644 --- a/src/backend/executor/functions.c +++ b/src/backend/executor/functions.c @@ -1013,7 +1013,7 @@ postquel_getnext(execution_state *es, SQLFunctionCachePtr fcache) { ProcessUtility(es->qd->plannedstmt, fcache->src, - false, + true, /* protect function cache's parsetree */ PROCESS_UTILITY_QUERY, es->qd->params, es->qd->queryEnv, From 33e71619d0e0a2f767d103ad6bb9d666045b38d4 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Tue, 29 Nov 2022 15:43:17 -0500 Subject: [PATCH 25/70] Improve heuristics for compressing the KnownAssignedXids array. Previously, we'd compress only when the active range of array entries reached Max(4 * PROCARRAY_MAXPROCS, 2 * pArray->numKnownAssignedXids). If max_connections is large, the first term could result in not compressing for a long time, resulting in much wastage of cycles in hot-standby backends scanning the array to take snapshots. Get rid of that term, and just bound it to 2 * pArray->numKnownAssignedXids. That however creates the opposite risk, that we might spend too much effort compressing. Hence, consider compressing only once every 128 commit records. (This frequency was chosen by benchmarking. While we only tried one benchmark scenario, the results seem stable over a fairly wide range of frequencies.) Also, force compression when processing RecoveryInfo WAL records (which should be infrequent); the old code could perform compression then, but would do so only after the same array-range check as for the transaction-commit path. Also, opportunistically run compression if the startup process is about to wait for WAL, though not oftener than once a second. 
This should prevent cases where we waste lots of time by leaving the array not-compressed for long intervals due to low WAL traffic. Lastly, add a simple check to keep us from uselessly compressing when the array storage is already compact. Back-patch, as the performance problem is worse in pre-v14 branches than in HEAD. Simon Riggs and Michail Nikolaev, with help from Tom Lane and Andres Freund. Discussion: https://postgr.es/m/CALdSSPgahNUD_=pB_j=1zSnDBaiOtqVfzo8Ejt5J_k7qZiU1Tw@mail.gmail.com --- src/backend/access/transam/xlog.c | 6 ++ src/backend/storage/ipc/procarray.c | 133 +++++++++++++++++++++------- src/include/storage/procarray.h | 1 + 3 files changed, 110 insertions(+), 30 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 1ec6b82938a..ce464494778 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -13503,6 +13503,9 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess, wait_time = wal_retrieve_retry_interval - TimestampDifferenceMilliseconds(last_fail_time, now); + /* Do background tasks that might benefit us later. */ + KnownAssignedTransactionIdsIdleMaintenance(); + (void) WaitLatch(&XLogCtl->recoveryWakeupLatch, WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH, @@ -13779,6 +13782,9 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess, streaming_reply_sent = true; } + /* Do any background tasks that might benefit us later. */ + KnownAssignedTransactionIdsIdleMaintenance(); + /* * Wait for more WAL to arrive. Time out after 5 seconds * to react to a trigger file promptly and to check if the diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c index 89fafdbedc1..3c7701e15b6 100644 --- a/src/backend/storage/ipc/procarray.c +++ b/src/backend/storage/ipc/procarray.c @@ -271,6 +271,17 @@ typedef enum GlobalVisHorizonKind VISHORIZON_TEMP } GlobalVisHorizonKind; +/* + * Reason codes for KnownAssignedXidsCompress(). 
+ */ +typedef enum KAXCompressReason +{ + KAX_NO_SPACE, /* need to free up space at array end */ + KAX_PRUNE, /* we just pruned old entries */ + KAX_TRANSACTION_END, /* we just committed/removed some XIDs */ + KAX_STARTUP_PROCESS_IDLE /* startup process is about to sleep */ +} KAXCompressReason; + static ProcArrayStruct *procArray; @@ -356,7 +367,7 @@ static bool HaveVirtualXIDsDelayingChkptGuts(VirtualTransactionId *vxids, int nvxids, int type); /* Primitives for KnownAssignedXids array handling for standby */ -static void KnownAssignedXidsCompress(bool force); +static void KnownAssignedXidsCompress(KAXCompressReason reason, bool haveLock); static void KnownAssignedXidsAdd(TransactionId from_xid, TransactionId to_xid, bool exclusive_lock); static bool KnownAssignedXidsSearch(TransactionId xid, bool remove); @@ -5576,6 +5587,17 @@ ExpireOldKnownAssignedTransactionIds(TransactionId xid) LWLockRelease(ProcArrayLock); } +/* + * KnownAssignedTransactionIdsIdleMaintenance + * Opportunistically do maintenance work when the startup process + * is about to go idle. + */ +void +KnownAssignedTransactionIdsIdleMaintenance(void) +{ + KnownAssignedXidsCompress(KAX_STARTUP_PROCESS_IDLE, false); +} + /* * Private module functions to manipulate KnownAssignedXids @@ -5658,7 +5680,9 @@ ExpireOldKnownAssignedTransactionIds(TransactionId xid) * so there is an optimal point for any workload mix. We use a heuristic to * decide when to compress the array, though trimming also helps reduce * frequency of compressing. The heuristic requires us to track the number of - * currently valid XIDs in the array. + * currently valid XIDs in the array (N). Except in special cases, we'll + * compress when S >= 2N. Bounding S at 2N in turn bounds the time for + * taking a snapshot to be O(N), which it would have to be anyway. 
*/ @@ -5666,42 +5690,91 @@ ExpireOldKnownAssignedTransactionIds(TransactionId xid) * Compress KnownAssignedXids by shifting valid data down to the start of the * array, removing any gaps. * - * A compression step is forced if "force" is true, otherwise we do it - * only if a heuristic indicates it's a good time to do it. + * A compression step is forced if "reason" is KAX_NO_SPACE, otherwise + * we do it only if a heuristic indicates it's a good time to do it. * - * Caller must hold ProcArrayLock in exclusive mode. + * Compression requires holding ProcArrayLock in exclusive mode. + * Caller must pass haveLock = true if it already holds the lock. */ static void -KnownAssignedXidsCompress(bool force) +KnownAssignedXidsCompress(KAXCompressReason reason, bool haveLock) { ProcArrayStruct *pArray = procArray; int head, - tail; + tail, + nelements; int compress_index; int i; - /* no spinlock required since we hold ProcArrayLock exclusively */ + /* Counters for compression heuristics */ + static unsigned int transactionEndsCounter; + static TimestampTz lastCompressTs; + + /* Tuning constants */ +#define KAX_COMPRESS_FREQUENCY 128 /* in transactions */ +#define KAX_COMPRESS_IDLE_INTERVAL 1000 /* in ms */ + + /* + * Since only the startup process modifies the head/tail pointers, we + * don't need a lock to read them here. + */ head = pArray->headKnownAssignedXids; tail = pArray->tailKnownAssignedXids; + nelements = head - tail; - if (!force) + /* + * If we can choose whether to compress, use a heuristic to avoid + * compressing too often or not often enough. "Compress" here simply + * means moving the values to the beginning of the array, so it is not as + * complex or costly as typical data compression algorithms. + */ + if (nelements == pArray->numKnownAssignedXids) { /* - * If we can choose how much to compress, use a heuristic to avoid - * compressing too often or not often enough. 
- * - * Heuristic is if we have a large enough current spread and less than - * 50% of the elements are currently in use, then compress. This - * should ensure we compress fairly infrequently. We could compress - * less often though the virtual array would spread out more and - * snapshots would become more expensive. + * When there are no gaps between head and tail, don't bother to + * compress, except in the KAX_NO_SPACE case where we must compress to + * create some space after the head. */ - int nelements = head - tail; + if (reason != KAX_NO_SPACE) + return; + } + else if (reason == KAX_TRANSACTION_END) + { + /* + * Consider compressing only once every so many commits. Frequency + * determined by benchmarks. + */ + if ((transactionEndsCounter++) % KAX_COMPRESS_FREQUENCY != 0) + return; - if (nelements < 4 * PROCARRAY_MAXPROCS || - nelements < 2 * pArray->numKnownAssignedXids) + /* + * Furthermore, compress only if the used part of the array is less + * than 50% full (see comments above). + */ + if (nelements < 2 * pArray->numKnownAssignedXids) return; } + else if (reason == KAX_STARTUP_PROCESS_IDLE) + { + /* + * We're about to go idle for lack of new WAL, so we might as well + * compress. But not too often, to avoid ProcArray lock contention + * with readers. + */ + if (lastCompressTs != 0) + { + TimestampTz compress_after; + + compress_after = TimestampTzPlusMilliseconds(lastCompressTs, + KAX_COMPRESS_IDLE_INTERVAL); + if (GetCurrentTimestamp() < compress_after) + return; + } + } + + /* Need to compress, so get the lock if we don't have it. 
*/ + if (!haveLock) + LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); /* * We compress the array by reading the valid values from tail to head, @@ -5717,9 +5790,16 @@ KnownAssignedXidsCompress(bool force) compress_index++; } } + Assert(compress_index == pArray->numKnownAssignedXids); pArray->tailKnownAssignedXids = 0; pArray->headKnownAssignedXids = compress_index; + + if (!haveLock) + LWLockRelease(ProcArrayLock); + + /* Update timestamp for maintenance. No need to hold lock for this. */ + lastCompressTs = GetCurrentTimestamp(); } /* @@ -5791,18 +5871,11 @@ KnownAssignedXidsAdd(TransactionId from_xid, TransactionId to_xid, */ if (head + nxids > pArray->maxKnownAssignedXids) { - /* must hold lock to compress */ - if (!exclusive_lock) - LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); - - KnownAssignedXidsCompress(true); + KnownAssignedXidsCompress(KAX_NO_SPACE, exclusive_lock); head = pArray->headKnownAssignedXids; /* note: we no longer care about the tail pointer */ - if (!exclusive_lock) - LWLockRelease(ProcArrayLock); - /* * If it still won't fit then we're out of memory */ @@ -5996,7 +6069,7 @@ KnownAssignedXidsRemoveTree(TransactionId xid, int nsubxids, KnownAssignedXidsRemove(subxids[i]); /* Opportunistically compress the array */ - KnownAssignedXidsCompress(false); + KnownAssignedXidsCompress(KAX_TRANSACTION_END, true); } /* @@ -6071,7 +6144,7 @@ KnownAssignedXidsRemovePreceding(TransactionId removeXid) } /* Opportunistically compress the array */ - KnownAssignedXidsCompress(false); + KnownAssignedXidsCompress(KAX_PRUNE, true); } /* diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h index 607834e46c4..dba7a6cf8a4 100644 --- a/src/include/storage/procarray.h +++ b/src/include/storage/procarray.h @@ -43,6 +43,7 @@ extern void ExpireTreeKnownAssignedTransactionIds(TransactionId xid, TransactionId max_xid); extern void ExpireAllKnownAssignedTransactionIds(void); extern void ExpireOldKnownAssignedTransactionIds(TransactionId xid); +extern void 
KnownAssignedTransactionIdsIdleMaintenance(void); extern int GetMaxSnapshotXidCount(void); extern int GetMaxSnapshotSubxidCount(void); From 59b2d607fdba8c39031facff493a6ebb20874e31 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Wed, 30 Nov 2022 08:38:30 +0900 Subject: [PATCH 26/70] Fix comment in fe-auth-scram.c The frontend-side routine in charge of building a SCRAM verifier mentioned that the restrictions applying to SASLprep on the password with the encoding are described at the top of fe-auth-scram.c, but this information is in auth-scram.c. This is wrong since 8f8b9be, so backpatch all the way down as this is an important documentation bit. Spotted while reviewing a different patch. Backpatch-through: 11 --- src/interfaces/libpq/fe-auth-scram.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/interfaces/libpq/fe-auth-scram.c b/src/interfaces/libpq/fe-auth-scram.c index 345f046f807..0c1c4cd7e53 100644 --- a/src/interfaces/libpq/fe-auth-scram.c +++ b/src/interfaces/libpq/fe-auth-scram.c @@ -886,7 +886,8 @@ pg_fe_scram_build_secret(const char *password) /* * Normalize the password with SASLprep. If that doesn't work, because * the password isn't valid UTF-8 or contains prohibited characters, just - * proceed with the original password. (See comments at top of file.) + * proceed with the original password. (See comments at the top of + * auth-scram.c.) */ rc = pg_saslprep(password, &prep_password); if (rc == SASLPREP_OOM) From f4800c6ead923f429d7db229819a23ba6cd55720 Mon Sep 17 00:00:00 2001 From: Bruce Momjian Date: Tue, 29 Nov 2022 20:49:52 -0500 Subject: [PATCH 27/70] doc: add transaction processing chapter with internals info This also adds references to this new chapter at relevant sections of our documentation. Previously many of these internal details were exposed to users, but not explained. This also updates RELEASE SAVEPOINT.
Discussion: https://postgr.es/m/CANbhV-E_iy9fmrErxrCh8TZTyenpfo72Hf_XD2HLDppva4dUNA@mail.gmail.com Author: Simon Riggs, Laurenz Albe Reviewed-by: Bruce Momjian Backpatch-through: 11 --- doc/src/sgml/catalogs.sgml | 7 +- doc/src/sgml/config.sgml | 6 +- doc/src/sgml/datatype.sgml | 3 +- doc/src/sgml/filelist.sgml | 1 + doc/src/sgml/func.sgml | 20 ++- doc/src/sgml/glossary.sgml | 3 +- doc/src/sgml/monitoring.sgml | 6 +- doc/src/sgml/pgrowlocks.sgml | 3 +- doc/src/sgml/postgres.sgml | 1 + doc/src/sgml/ref/release_savepoint.sgml | 62 ++++--- doc/src/sgml/ref/rollback.sgml | 8 +- doc/src/sgml/ref/rollback_to.sgml | 5 +- doc/src/sgml/wal.sgml | 6 +- doc/src/sgml/xact.sgml | 205 ++++++++++++++++++++++++ 14 files changed, 292 insertions(+), 44 deletions(-) create mode 100644 doc/src/sgml/xact.sgml diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index b8b82208a40..8031aa83e37 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -10612,7 +10612,8 @@ SCRAM-SHA-256$<iteration count>:&l Virtual ID of the transaction targeted by the lock, - or null if the target is not a virtual transaction ID + or null if the target is not a virtual transaction ID; see + @@ -10621,8 +10622,8 @@ SCRAM-SHA-256$<iteration count>:&l transactionid xid - ID of the transaction targeted by the lock, - or null if the target is not a transaction ID + ID of the transaction targeted by the lock, or null if the target + is not a transaction ID; diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 514cac29e60..f9337ebaf5a 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -7033,12 +7033,14 @@ local0.* /var/log/postgresql %v - Virtual transaction ID (backendID/localXID) + Virtual transaction ID (backendID/localXID); see + no %x - Transaction ID (0 if none is assigned) + Transaction ID (0 if none is assigned); see + no diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml index 0e89b768c5d..ad35ac7ec98 100644 --- 
a/doc/src/sgml/datatype.sgml +++ b/doc/src/sgml/datatype.sgml @@ -4936,7 +4936,8 @@ WHERE ... xmin and xmax. Transaction identifiers are 32-bit quantities. In some contexts, a 64-bit variant xid8 is used. Unlike xid values, xid8 values increase strictly - monotonically and cannot be reused in the lifetime of a database cluster. + monotonically and cannot be reused in the lifetime of a database + cluster. See for more details. diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml index 04d8b235752..c4bda90fe67 100644 --- a/doc/src/sgml/filelist.sgml +++ b/doc/src/sgml/filelist.sgml @@ -103,6 +103,7 @@ + diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 7960cc2a5aa..3bb3e6e0fee 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -24100,7 +24100,10 @@ SELECT collation for ('foo' COLLATE "de_DE"); Returns the current transaction's ID. It will assign a new one if the current transaction does not have one already (because it has not - performed any database updates). + performed any database updates); see for details. If executed in a + subtransaction, this will return the top-level transaction ID; + see for details. @@ -24117,6 +24120,8 @@ SELECT collation for ('foo' COLLATE "de_DE"); ID is assigned yet. (It's best to use this variant if the transaction might otherwise be read-only, to avoid unnecessary consumption of an XID.) + If executed in a subtransaction, this will return the top-level + transaction ID. @@ -24160,6 +24165,9 @@ SELECT collation for ('foo' COLLATE "de_DE"); Returns a current snapshot, a data structure showing which transaction IDs are now in-progress. + Only top-level transaction IDs are included in the snapshot; + subtransaction IDs are not shown; see + for details. @@ -24214,7 +24222,8 @@ SELECT collation for ('foo' COLLATE "de_DE"); Is the given transaction ID visible according to this snapshot (that is, was it completed before the snapshot was taken)? 
Note that this function will not give the correct answer for - a subtransaction ID. + a subtransaction ID (subxid); see for + details. @@ -24226,8 +24235,9 @@ SELECT collation for ('foo' COLLATE "de_DE"); wraps around every 4 billion transactions. However, the functions shown in use a 64-bit type xid8 that does not wrap around during the life - of an installation, and can be converted to xid by casting if - required. The data type pg_snapshot stores information about + of an installation and can be converted to xid by casting if + required; see for details. + The data type pg_snapshot stores information about transaction ID visibility at a particular moment in time. Its components are described in . pg_snapshot's textual representation is @@ -24273,7 +24283,7 @@ SELECT collation for ('foo' COLLATE "de_DE"); xmax and not in this list was already completed at the time of the snapshot, and thus is either visible or dead according to its commit status. This list does not include the transaction IDs of - subtransactions. + subtransactions (subxids). diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml index c8d0440e80f..de0f7852a1b 100644 --- a/doc/src/sgml/glossary.sgml +++ b/doc/src/sgml/glossary.sgml @@ -1644,7 +1644,8 @@ 3 (values under that are reserved) and the epoch value is incremented by one. In some contexts, the epoch and xid values are - considered together as a single 64-bit value. + considered together as a single 64-bit value; see for more details. For more information, see diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index ce6c34d6ea1..470f9fe080e 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -897,7 +897,8 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser backend_xid xid - Top-level transaction identifier of this backend, if any. + Top-level transaction identifier of this backend, if any; see + . @@ -1873,7 +1874,8 @@ postgres 27093 0.0 0.0 30096 2752 ? 
Ss 11:34 0:00 postgres: ser virtualxid - Waiting to acquire a virtual transaction ID lock. + Waiting to acquire a virtual transaction ID lock; see + . diff --git a/doc/src/sgml/pgrowlocks.sgml b/doc/src/sgml/pgrowlocks.sgml index 392d5f1f9a7..02f31601b06 100644 --- a/doc/src/sgml/pgrowlocks.sgml +++ b/doc/src/sgml/pgrowlocks.sgml @@ -57,7 +57,8 @@ pgrowlocks(text) returns setof record locker xid - Transaction ID of locker, or multixact ID if multitransaction + Transaction ID of locker, or multixact ID if + multitransaction; see multi diff --git a/doc/src/sgml/postgres.sgml b/doc/src/sgml/postgres.sgml index e7260e573c1..c7373d12c67 100644 --- a/doc/src/sgml/postgres.sgml +++ b/doc/src/sgml/postgres.sgml @@ -270,6 +270,7 @@ break is not needed in a wider output rendering. &brin; &hash; &storage; + &transaction; &bki; &planstats; &backup-manifest; diff --git a/doc/src/sgml/ref/release_savepoint.sgml b/doc/src/sgml/ref/release_savepoint.sgml index daf8eb9a436..e9fc6e5d1c8 100644 --- a/doc/src/sgml/ref/release_savepoint.sgml +++ b/doc/src/sgml/ref/release_savepoint.sgml @@ -21,7 +21,7 @@ PostgreSQL documentation RELEASE SAVEPOINT - destroy a previously defined savepoint + release a previously defined savepoint @@ -34,23 +34,13 @@ RELEASE [ SAVEPOINT ] savepoint_name Description - RELEASE SAVEPOINT destroys a savepoint previously defined - in the current transaction. - - - - Destroying a savepoint makes it unavailable as a rollback point, - but it has no other user visible behavior. It does not undo the - effects of commands executed after the savepoint was established. - (To do that, see .) - Destroying a savepoint when - it is no longer needed allows the system to reclaim some resources - earlier than transaction end. - - - - RELEASE SAVEPOINT also destroys all savepoints that were - established after the named savepoint was established. 
+ RELEASE SAVEPOINT releases the named savepoint and + all active savepoints that were created after the named savepoint, + and frees their resources. All changes made since the creation of + the savepoint that didn't already get rolled back are merged into + the transaction or savepoint that was active when the named savepoint + was created. Changes made after RELEASE SAVEPOINT + will also be part of this active transaction or savepoint. @@ -62,7 +52,7 @@ RELEASE [ SAVEPOINT ] savepoint_name savepoint_name - The name of the savepoint to destroy. + The name of the savepoint to release. @@ -78,7 +68,7 @@ RELEASE [ SAVEPOINT ] savepoint_name It is not possible to release a savepoint when the transaction is in - an aborted state. + an aborted state; to do that, use . @@ -93,7 +83,7 @@ RELEASE [ SAVEPOINT ] savepoint_name Examples - To establish and later destroy a savepoint: + To establish and later release a savepoint: BEGIN; INSERT INTO table1 VALUES (3); @@ -104,6 +94,36 @@ COMMIT; The above transaction will insert both 3 and 4. + + + A more complex example with multiple nested subtransactions: + +BEGIN; + INSERT INTO table1 VALUES (1); + SAVEPOINT sp1; + INSERT INTO table1 VALUES (2); + SAVEPOINT sp2; + INSERT INTO table1 VALUES (3); + RELEASE SAVEPOINT sp2; + INSERT INTO table1 VALUES (4))); -- generates an error + + In this example, the application requests the release of the savepoint + sp2, which inserted 3. This changes the insert's + transaction context to sp1. When the statement + attempting to insert value 4 generates an error, the insertions of 2 and + 4 are lost because they are in the same, now-rolled-back savepoint, + and value 3 is in the same transaction context.
The application can + now only choose one of these two commands, since all other commands + will be ignored: + + ROLLBACK; + ROLLBACK TO SAVEPOINT sp1; + + Choosing ROLLBACK will abort everything, including + value 1, whereas ROLLBACK TO SAVEPOINT sp1 will retain + value 1 and allow the transaction to continue. + + diff --git a/doc/src/sgml/ref/rollback.sgml b/doc/src/sgml/ref/rollback.sgml index 142f71e7742..7700547669f 100644 --- a/doc/src/sgml/ref/rollback.sgml +++ b/doc/src/sgml/ref/rollback.sgml @@ -56,10 +56,10 @@ ROLLBACK [ WORK | TRANSACTION ] [ AND [ NO ] CHAIN ] AND CHAIN - If AND CHAIN is specified, a new transaction is - immediately started with the same transaction characteristics (see ) as the just finished one. Otherwise, - no new transaction is started. + If AND CHAIN is specified, a new (not aborted) + transaction is immediately started with the same transaction + characteristics (see ) as the + just finished one. Otherwise, no new transaction is started. diff --git a/doc/src/sgml/ref/rollback_to.sgml b/doc/src/sgml/ref/rollback_to.sgml index 3d5a241e1aa..37e5ca8f12b 100644 --- a/doc/src/sgml/ref/rollback_to.sgml +++ b/doc/src/sgml/ref/rollback_to.sgml @@ -35,8 +35,9 @@ ROLLBACK [ WORK | TRANSACTION ] TO [ SAVEPOINT ] savepoint_name Roll back all commands that were executed after the savepoint was - established. The savepoint remains valid and can be rolled back to - again later, if needed. + established and then start a new subtransaction at the same transaction level. + The savepoint remains valid and can be rolled back to again later, + if needed. diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml index 24e1c89503c..6330a8d5865 100644 --- a/doc/src/sgml/wal.sgml +++ b/doc/src/sgml/wal.sgml @@ -4,8 +4,9 @@ Reliability and the Write-Ahead Log - This chapter explains how the Write-Ahead Log is used to obtain - efficient, reliable operation. 
+ This chapter explains how to control the reliability of + PostgreSQL, including details about the + Write-Ahead Log. @@ -895,4 +896,5 @@ seem to be a problem in practice. + diff --git a/doc/src/sgml/xact.sgml b/doc/src/sgml/xact.sgml new file mode 100644 index 00000000000..c0f4993a0df --- /dev/null +++ b/doc/src/sgml/xact.sgml @@ -0,0 +1,205 @@ + + + + + Transaction Processing + + + This chapter provides an overview of the internals of + PostgreSQL's transaction management system. + The word transaction is often abbreviated as xact. + + + + + Transactions and Identifiers + + + Transactions can be created explicitly using BEGIN + or START TRANSACTION and ended using + COMMIT or ROLLBACK. SQL + statements outside of explicit transactions automatically use + single-statement transactions. + + + + Every transaction is identified by a unique + VirtualTransactionId (also called + virtualXID or vxid), which + is comprised of a backend ID (or backendID) + and a sequentially-assigned number local to each backend, known as + localXID. For example, the virtual transaction + ID 4/12532 has a backendID + of 4 and a localXID of + 12532. + + + + Non-virtual TransactionIds (or xid), + e.g., 278394, are assigned sequentially to + transactions from a global counter used by all databases within + the PostgreSQL cluster. This assignment + happens when a transaction first writes to the database. This means + lower-numbered xids started writing before higher-numbered xids. + Note that the order in which transactions perform their first database + write might be different from the order in which the transactions + started, particularly if the transaction started with statements that + only performed database reads. + + + + The internal transaction ID type xid is 32 bits wide + and wraps around every + 4 billion transactions. A 32-bit epoch is incremented during each + wraparound. 
There is also a 64-bit type xid8 which + includes this epoch and therefore does not wrap around during the + life of an installation; it can be converted to xid by casting. + The functions in + return xid8 values. Xids are used as the + basis for PostgreSQL's MVCC concurrency mechanism and streaming + replication. + + + + When a top-level transaction with a (non-virtual) xid commits, + it is marked as committed in the pg_xact + directory. Additional information is recorded in the + pg_commit_ts directory if is enabled. + + + + In addition to vxid and xid, + prepared transactions are also assigned Global Transaction + Identifiers (GID). GIDs are string literals up + to 200 bytes long, which must be unique amongst other currently + prepared transactions. The mapping of GID to xid is shown in pg_prepared_xacts. + + + + + + Transactions and Locking + + + The transaction IDs of currently executing transactions are shown in + pg_locks + in columns virtualxid and + transactionid. Read-only transactions + will have virtualxids but NULL + transactionids, while both columns will be + set in read-write transactions. + + + + Some lock types wait on virtualxid, + while other types wait on transactionid. + Row-level read and write locks are recorded directly in the locked + rows and can be inspected using the + extension. Row-level read locks might also require the assignment + of multixact IDs (mxid; see ). + + + + + + Subtransactions + + + Subtransactions are started inside transactions, allowing large + transactions to be broken into smaller units. Subtransactions can + commit or abort without affecting their parent transactions, allowing + parent transactions to continue. This allows errors to be handled + more easily, which is a common application development pattern. + The word subtransaction is often abbreviated as + subxact. 
+ + + + Subtransactions can be started explicitly using the + SAVEPOINT command, but can also be started in + other ways, such as PL/pgSQL's EXCEPTION clause. + PL/Python and PL/TCL also support explicit subtransactions. + Subtransactions can also be started from other subtransactions. + The top-level transaction and its child subtransactions form a + hierarchy or tree, which is why we refer to the main transaction as + the top-level transaction. + + + + If a subtransaction is assigned a non-virtual transaction ID, + its transaction ID is referred to as a subxid. + Read-only subtransactions are not assigned subxids, but once they + attempt to write, they will be assigned one. This also causes all of + a subxid's parents, up to and including the top-level transaction, + to be assigned non-virtual transaction ids. We ensure that a parent + xid is always lower than any of its child subxids. + + + + The immediate parent xid of each subxid is recorded in the + pg_subtrans directory. No entry is made for + top-level xids since they do not have a parent, nor is an entry made + for read-only subtransactions. + + + + When a subtransaction commits, all of its committed child + subtransactions with subxids will also be considered subcommitted + in that transaction. When a subtransaction aborts, all of its child + subtransactions will also be considered aborted. + + + + When a top-level transaction with an xid commits, all of its + subcommitted child subtransactions are also persistently recorded + as committed in the pg_xact directory. If the + top-level transaction aborts, all its subtransactions are also aborted, + even if they were subcommitted. + + + + The more subtransactions each transaction keeps open (not + rolled back or released), the greater the transaction management + overhead. 
Up to 64 open subxids are cached in shared memory for + each backend; after that point, the storage I/O overhead increases + significantly due to additional lookups of subxid entries in + pg_subtrans. + + + + + + Two-Phase Transactions + + + PostgreSQL supports a two-phase commit (2PC) + protocol that allows multiple distributed systems to work together + in a transactional manner. The commands are PREPARE + TRANSACTION, COMMIT PREPARED and + ROLLBACK PREPARED. Two-phase transactions + are intended for use by external transaction management systems. + PostgreSQL follows the features and model + proposed by the X/Open XA standard, but does not implement some less + often used aspects. + + + + When the user executes PREPARE TRANSACTION, the + only possible next commands are COMMIT PREPARED + or ROLLBACK PREPARED. In general, this prepared + state is intended to be of very short duration, but external + availability issues might mean transactions stay in this state + for an extended interval. Short-lived prepared + transactions are stored only in shared memory and WAL. + Transactions that span checkpoints are recorded in the + pg_twophase directory. Transactions + that are currently prepared can be inspected using pg_prepared_xacts. + + + + From 19fb3c00b2ad55ea09afac04a537603fc477dd2f Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Wed, 30 Nov 2022 13:01:41 -0500 Subject: [PATCH 28/70] Reject missing database name in pg_regress and cohorts. Writing "pg_regress --dbname= ..." led to a crash, because we weren't expecting there to be no database name supplied. It doesn't seem like a great idea to run regression tests in whatever is the user's default database; so rather than supporting this case let's explicitly reject it. Per report from Xing Guo. Back-patch to all supported branches. 
Discussion: https://postgr.es/m/CACpMh+A8cRvtvtOWVAZsCM1DU81GK4DL26R83y6ugZ1osV=ifA@mail.gmail.com --- src/test/regress/pg_regress.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c index 9320cf0aeec..c479142222e 100644 --- a/src/test/regress/pg_regress.c +++ b/src/test/regress/pg_regress.c @@ -3087,6 +3087,17 @@ regression_main(int argc, char *argv[], optind++; } + /* + * We must have a database to run the tests in; either a default name, or + * one supplied by the --dbname switch. + */ + if (!(dblist && dblist->str && dblist->str[0])) + { + fprintf(stderr, _("%s: no database name was specified\n"), + progname); + exit(2); + } + if (config_auth_datadir) { #ifdef ENABLE_SSPI From accad053a8783ef52a290248961063628545ed96 Mon Sep 17 00:00:00 2001 From: Bruce Momjian Date: Thu, 1 Dec 2022 10:45:08 -0500 Subject: [PATCH 29/70] revert: add transaction processing chapter with internals info This doc patch (master hash 66bc9d2d3e) was decided to be too significant for backpatching, so reverted in all but master. Also fix SGML file header comment in master. 
Reported-by: Peter Eisentraut Discussion: https://postgr.es/m/c6304b19-6ff7-f3af-0148-cf7aa7e2fbfd@enterprisedb.com Backpatch-through: 11 --- doc/src/sgml/catalogs.sgml | 7 +- doc/src/sgml/config.sgml | 6 +- doc/src/sgml/datatype.sgml | 3 +- doc/src/sgml/filelist.sgml | 1 - doc/src/sgml/func.sgml | 20 +-- doc/src/sgml/glossary.sgml | 3 +- doc/src/sgml/monitoring.sgml | 6 +- doc/src/sgml/pgrowlocks.sgml | 3 +- doc/src/sgml/postgres.sgml | 1 - doc/src/sgml/ref/release_savepoint.sgml | 62 +++---- doc/src/sgml/ref/rollback.sgml | 8 +- doc/src/sgml/ref/rollback_to.sgml | 5 +- doc/src/sgml/wal.sgml | 6 +- doc/src/sgml/xact.sgml | 205 ------------------------ 14 files changed, 44 insertions(+), 292 deletions(-) delete mode 100644 doc/src/sgml/xact.sgml diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index 8031aa83e37..b8b82208a40 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -10612,8 +10612,7 @@ SCRAM-SHA-256$<iteration count>:&l Virtual ID of the transaction targeted by the lock, - or null if the target is not a virtual transaction ID; see - + or null if the target is not a virtual transaction ID @@ -10622,8 +10621,8 @@ SCRAM-SHA-256$<iteration count>:&l transactionid xid - ID of the transaction targeted by the lock, or null if the target - is not a transaction ID; + ID of the transaction targeted by the lock, + or null if the target is not a transaction ID diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index f9337ebaf5a..514cac29e60 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -7033,14 +7033,12 @@ local0.* /var/log/postgresql %v - Virtual transaction ID (backendID/localXID); see - + Virtual transaction ID (backendID/localXID) no %x - Transaction ID (0 if none is assigned); see - + Transaction ID (0 if none is assigned) no diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml index ad35ac7ec98..0e89b768c5d 100644 --- a/doc/src/sgml/datatype.sgml +++ 
b/doc/src/sgml/datatype.sgml @@ -4936,8 +4936,7 @@ WHERE ... xmin and xmax. Transaction identifiers are 32-bit quantities. In some contexts, a 64-bit variant xid8 is used. Unlike xid values, xid8 values increase strictly - monotonically and cannot be reused in the lifetime of a database - cluster. See for more details. + monotonically and cannot be reused in the lifetime of a database cluster. diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml index c4bda90fe67..04d8b235752 100644 --- a/doc/src/sgml/filelist.sgml +++ b/doc/src/sgml/filelist.sgml @@ -103,7 +103,6 @@ - diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 3bb3e6e0fee..7960cc2a5aa 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -24100,10 +24100,7 @@ SELECT collation for ('foo' COLLATE "de_DE"); Returns the current transaction's ID. It will assign a new one if the current transaction does not have one already (because it has not - performed any database updates); see for details. If executed in a - subtransaction, this will return the top-level transaction ID; - see for details. + performed any database updates). @@ -24120,8 +24117,6 @@ SELECT collation for ('foo' COLLATE "de_DE"); ID is assigned yet. (It's best to use this variant if the transaction might otherwise be read-only, to avoid unnecessary consumption of an XID.) - If executed in a subtransaction, this will return the top-level - transaction ID. @@ -24165,9 +24160,6 @@ SELECT collation for ('foo' COLLATE "de_DE"); Returns a current snapshot, a data structure showing which transaction IDs are now in-progress. - Only top-level transaction IDs are included in the snapshot; - subtransaction IDs are not shown; see - for details. @@ -24222,8 +24214,7 @@ SELECT collation for ('foo' COLLATE "de_DE"); Is the given transaction ID visible according to this snapshot (that is, was it completed before the snapshot was taken)? 
Note that this function will not give the correct answer for - a subtransaction ID (subxid); see for - details. + a subtransaction ID. @@ -24235,9 +24226,8 @@ SELECT collation for ('foo' COLLATE "de_DE"); wraps around every 4 billion transactions. However, the functions shown in use a 64-bit type xid8 that does not wrap around during the life - of an installation and can be converted to xid by casting if - required; see for details. - The data type pg_snapshot stores information about + of an installation, and can be converted to xid by casting if + required. The data type pg_snapshot stores information about transaction ID visibility at a particular moment in time. Its components are described in . pg_snapshot's textual representation is @@ -24283,7 +24273,7 @@ SELECT collation for ('foo' COLLATE "de_DE"); xmax and not in this list was already completed at the time of the snapshot, and thus is either visible or dead according to its commit status. This list does not include the transaction IDs of - subtransactions (subxids). + subtransactions. diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml index de0f7852a1b..c8d0440e80f 100644 --- a/doc/src/sgml/glossary.sgml +++ b/doc/src/sgml/glossary.sgml @@ -1644,8 +1644,7 @@ 3 (values under that are reserved) and the epoch value is incremented by one. In some contexts, the epoch and xid values are - considered together as a single 64-bit value; see for more details. + considered together as a single 64-bit value. For more information, see diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 470f9fe080e..ce6c34d6ea1 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -897,8 +897,7 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser backend_xid xid - Top-level transaction identifier of this backend, if any; see - . + Top-level transaction identifier of this backend, if any. @@ -1874,8 +1873,7 @@ postgres 27093 0.0 0.0 30096 2752 ? 
Ss 11:34 0:00 postgres: ser virtualxid - Waiting to acquire a virtual transaction ID lock; see - . + Waiting to acquire a virtual transaction ID lock. diff --git a/doc/src/sgml/pgrowlocks.sgml b/doc/src/sgml/pgrowlocks.sgml index 02f31601b06..392d5f1f9a7 100644 --- a/doc/src/sgml/pgrowlocks.sgml +++ b/doc/src/sgml/pgrowlocks.sgml @@ -57,8 +57,7 @@ pgrowlocks(text) returns setof record locker xid - Transaction ID of locker, or multixact ID if - multitransaction; see + Transaction ID of locker, or multixact ID if multitransaction multi diff --git a/doc/src/sgml/postgres.sgml b/doc/src/sgml/postgres.sgml index c7373d12c67..e7260e573c1 100644 --- a/doc/src/sgml/postgres.sgml +++ b/doc/src/sgml/postgres.sgml @@ -270,7 +270,6 @@ break is not needed in a wider output rendering. &brin; &hash; &storage; - &transaction; &bki; &planstats; &backup-manifest; diff --git a/doc/src/sgml/ref/release_savepoint.sgml b/doc/src/sgml/ref/release_savepoint.sgml index e9fc6e5d1c8..daf8eb9a436 100644 --- a/doc/src/sgml/ref/release_savepoint.sgml +++ b/doc/src/sgml/ref/release_savepoint.sgml @@ -21,7 +21,7 @@ PostgreSQL documentation RELEASE SAVEPOINT - release a previously defined savepoint + destroy a previously defined savepoint @@ -34,13 +34,23 @@ RELEASE [ SAVEPOINT ] savepoint_name Description - RELEASE SAVEPOINT releases the named savepoint and - all active savepoints that were created after the named savepoint, - and frees their resources. All changes made since the creation of - the savepoint that didn't already get rolled back are merged into - the transaction or savepoint that was active when the named savepoint - was created. Changes made after RELEASE SAVEPOINT - will also be part of this active transaction or savepoint. + RELEASE SAVEPOINT destroys a savepoint previously defined + in the current transaction. + + + + Destroying a savepoint makes it unavailable as a rollback point, + but it has no other user visible behavior. 
It does not undo the + effects of commands executed after the savepoint was established. + (To do that, see .) + Destroying a savepoint when + it is no longer needed allows the system to reclaim some resources + earlier than transaction end. + + + + RELEASE SAVEPOINT also destroys all savepoints that were + established after the named savepoint was established. @@ -52,7 +62,7 @@ RELEASE [ SAVEPOINT ] savepoint_name savepoint_name - The name of the savepoint to release. + The name of the savepoint to destroy. @@ -68,7 +78,7 @@ RELEASE [ SAVEPOINT ] savepoint_name It is not possible to release a savepoint when the transaction is in - an aborted state; to do that, use . + an aborted state. @@ -83,7 +93,7 @@ RELEASE [ SAVEPOINT ] savepoint_name Examples - To establish and later release a savepoint: + To establish and later destroy a savepoint: BEGIN; INSERT INTO table1 VALUES (3); @@ -94,36 +104,6 @@ COMMIT; The above transaction will insert both 3 and 4. - - - A more complex example with multiple nested subtransactions: - -BEGIN; - INSERT INTO table1 VALUES (1); - SAVEPOINT sp1; - INSERT INTO table1 VALUES (2); - SAVEPOINT sp2; - INSERT INTO table1 VALUES (3); - RELEASE SAVEPOINT sp2; - INSERT INTO table1 VALUES (4))); -- generates an error - - In this example, the application requests the release of the savepoint - sp2, which inserted 3. This changes the insert's - transaction context to sp1. When the statement - attempting to insert value 4 generates an error, the insertion of 2 and - 4 are lost because they are in the same, now-rolled back savepoint, - and value 3 is in the same transaction context. The application can - now only choose one of these two commands, since all other commands - will be ignored: - - ROLLBACK; - ROLLBACK TO SAVEPOINT sp1; - - Choosing ROLLBACK will abort everything, including - value 1, whereas ROLLBACK TO SAVEPOINT sp1 will retain - value 1 and allow the transaction to continue. 
- - diff --git a/doc/src/sgml/ref/rollback.sgml b/doc/src/sgml/ref/rollback.sgml index 7700547669f..142f71e7742 100644 --- a/doc/src/sgml/ref/rollback.sgml +++ b/doc/src/sgml/ref/rollback.sgml @@ -56,10 +56,10 @@ ROLLBACK [ WORK | TRANSACTION ] [ AND [ NO ] CHAIN ] AND CHAIN - If AND CHAIN is specified, a new (not aborted) - transaction is immediately started with the same transaction - characteristics (see ) as the - just finished one. Otherwise, no new transaction is started. + If AND CHAIN is specified, a new transaction is + immediately started with the same transaction characteristics (see ) as the just finished one. Otherwise, + no new transaction is started. diff --git a/doc/src/sgml/ref/rollback_to.sgml b/doc/src/sgml/ref/rollback_to.sgml index 37e5ca8f12b..3d5a241e1aa 100644 --- a/doc/src/sgml/ref/rollback_to.sgml +++ b/doc/src/sgml/ref/rollback_to.sgml @@ -35,9 +35,8 @@ ROLLBACK [ WORK | TRANSACTION ] TO [ SAVEPOINT ] savepoint_name Roll back all commands that were executed after the savepoint was - established and then start a new subtransaction at the same transaction level. - The savepoint remains valid and can be rolled back to again later, - if needed. + established. The savepoint remains valid and can be rolled back to + again later, if needed. diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml index 6330a8d5865..24e1c89503c 100644 --- a/doc/src/sgml/wal.sgml +++ b/doc/src/sgml/wal.sgml @@ -4,9 +4,8 @@ Reliability and the Write-Ahead Log - This chapter explains how to control the reliability of - PostgreSQL, including details about the - Write-Ahead Log. + This chapter explains how the Write-Ahead Log is used to obtain + efficient, reliable operation. @@ -896,5 +895,4 @@ seem to be a problem in practice. 
- diff --git a/doc/src/sgml/xact.sgml b/doc/src/sgml/xact.sgml deleted file mode 100644 index c0f4993a0df..00000000000 --- a/doc/src/sgml/xact.sgml +++ /dev/null @@ -1,205 +0,0 @@ - - - - - Transaction Processing - - - This chapter provides an overview of the internals of - PostgreSQL's transaction management system. - The word transaction is often abbreviated as xact. - - - - - Transactions and Identifiers - - - Transactions can be created explicitly using BEGIN - or START TRANSACTION and ended using - COMMIT or ROLLBACK. SQL - statements outside of explicit transactions automatically use - single-statement transactions. - - - - Every transaction is identified by a unique - VirtualTransactionId (also called - virtualXID or vxid), which - is comprised of a backend ID (or backendID) - and a sequentially-assigned number local to each backend, known as - localXID. For example, the virtual transaction - ID 4/12532 has a backendID - of 4 and a localXID of - 12532. - - - - Non-virtual TransactionIds (or xid), - e.g., 278394, are assigned sequentially to - transactions from a global counter used by all databases within - the PostgreSQL cluster. This assignment - happens when a transaction first writes to the database. This means - lower-numbered xids started writing before higher-numbered xids. - Note that the order in which transactions perform their first database - write might be different from the order in which the transactions - started, particularly if the transaction started with statements that - only performed database reads. - - - - The internal transaction ID type xid is 32 bits wide - and wraps around every - 4 billion transactions. A 32-bit epoch is incremented during each - wraparound. There is also a 64-bit type xid8 which - includes this epoch and therefore does not wrap around during the - life of an installation; it can be converted to xid by casting. - The functions in - return xid8 values. 
Xids are used as the - basis for PostgreSQL's MVCC concurrency mechanism and streaming - replication. - - - - When a top-level transaction with a (non-virtual) xid commits, - it is marked as committed in the pg_xact - directory. Additional information is recorded in the - pg_commit_ts directory if is enabled. - - - - In addition to vxid and xid, - prepared transactions are also assigned Global Transaction - Identifiers (GID). GIDs are string literals up - to 200 bytes long, which must be unique amongst other currently - prepared transactions. The mapping of GID to xid is shown in pg_prepared_xacts. - - - - - - Transactions and Locking - - - The transaction IDs of currently executing transactions are shown in - pg_locks - in columns virtualxid and - transactionid. Read-only transactions - will have virtualxids but NULL - transactionids, while both columns will be - set in read-write transactions. - - - - Some lock types wait on virtualxid, - while other types wait on transactionid. - Row-level read and write locks are recorded directly in the locked - rows and can be inspected using the - extension. Row-level read locks might also require the assignment - of multixact IDs (mxid; see ). - - - - - - Subtransactions - - - Subtransactions are started inside transactions, allowing large - transactions to be broken into smaller units. Subtransactions can - commit or abort without affecting their parent transactions, allowing - parent transactions to continue. This allows errors to be handled - more easily, which is a common application development pattern. - The word subtransaction is often abbreviated as - subxact. - - - - Subtransactions can be started explicitly using the - SAVEPOINT command, but can also be started in - other ways, such as PL/pgSQL's EXCEPTION clause. - PL/Python and PL/TCL also support explicit subtransactions. - Subtransactions can also be started from other subtransactions. 
- The top-level transaction and its child subtransactions form a - hierarchy or tree, which is why we refer to the main transaction as - the top-level transaction. - - - - If a subtransaction is assigned a non-virtual transaction ID, - its transaction ID is referred to as a subxid. - Read-only subtransactions are not assigned subxids, but once they - attempt to write, they will be assigned one. This also causes all of - a subxid's parents, up to and including the top-level transaction, - to be assigned non-virtual transaction ids. We ensure that a parent - xid is always lower than any of its child subxids. - - - - The immediate parent xid of each subxid is recorded in the - pg_subtrans directory. No entry is made for - top-level xids since they do not have a parent, nor is an entry made - for read-only subtransactions. - - - - When a subtransaction commits, all of its committed child - subtransactions with subxids will also be considered subcommitted - in that transaction. When a subtransaction aborts, all of its child - subtransactions will also be considered aborted. - - - - When a top-level transaction with an xid commits, all of its - subcommitted child subtransactions are also persistently recorded - as committed in the pg_xact directory. If the - top-level transaction aborts, all its subtransactions are also aborted, - even if they were subcommitted. - - - - The more subtransactions each transaction keeps open (not - rolled back or released), the greater the transaction management - overhead. Up to 64 open subxids are cached in shared memory for - each backend; after that point, the storage I/O overhead increases - significantly due to additional lookups of subxid entries in - pg_subtrans. - - - - - - Two-Phase Transactions - - - PostgreSQL supports a two-phase commit (2PC) - protocol that allows multiple distributed systems to work together - in a transactional manner. The commands are PREPARE - TRANSACTION, COMMIT PREPARED and - ROLLBACK PREPARED. 
Two-phase transactions - are intended for use by external transaction management systems. - PostgreSQL follows the features and model - proposed by the X/Open XA standard, but does not implement some less - often used aspects. - - - - When the user executes PREPARE TRANSACTION, the - only possible next commands are COMMIT PREPARED - or ROLLBACK PREPARED. In general, this prepared - state is intended to be of very short duration, but external - availability issues might mean transactions stay in this state - for an extended interval. Short-lived prepared - transactions are stored only in shared memory and WAL. - Transactions that span checkpoints are recorded in the - pg_twophase directory. Transactions - that are currently prepared can be inspected using pg_prepared_xacts. - - - - From 1088e9554e5fc8a7d0ded455192a6496c44d908f Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Thu, 1 Dec 2022 11:38:06 -0500 Subject: [PATCH 30/70] Fix under-parenthesized display of AT TIME ZONE constructs. In commit 40c24bfef, I forgot to use get_rule_expr_paren() for the arguments of AT TIME ZONE, resulting in possibly not printing parens for expressions that need it. But get_rule_expr_paren() wouldn't have gotten it right anyway, because isSimpleNode() hadn't been taught that COERCE_SQL_SYNTAX parent nodes don't guarantee sufficient parentheses. Improve all that. Also use this methodology for F_IS_NORMALIZED, so that we don't print useless parens for that. In passing, remove a comment that was obsoleted later. Per report from Duncan Sands. Back-patch to v14 where this code came in. (Before that, we didn't try to print AT TIME ZONE that way, so there was no bug just ugliness.) 
Discussion: https://postgr.es/m/f41566aa-a057-6628-4b7c-b48770ecb84a@deepbluecap.com --- src/backend/utils/adt/ruleutils.c | 28 +++++++++++------------ src/test/regress/expected/create_view.out | 8 ++++--- src/test/regress/sql/create_view.sql | 1 + 3 files changed, 20 insertions(+), 17 deletions(-) diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c index ea8156bebad..e1dca78d40d 100644 --- a/src/backend/utils/adt/ruleutils.c +++ b/src/backend/utils/adt/ruleutils.c @@ -8304,11 +8304,12 @@ isSimpleNode(Node *node, Node *parentNode, int prettyFlags) { case T_FuncExpr: { - /* special handling for casts */ + /* special handling for casts and COERCE_SQL_SYNTAX */ CoercionForm type = ((FuncExpr *) parentNode)->funcformat; if (type == COERCE_EXPLICIT_CAST || - type == COERCE_IMPLICIT_CAST) + type == COERCE_IMPLICIT_CAST || + type == COERCE_SQL_SYNTAX) return false; return true; /* own parentheses */ } @@ -8356,11 +8357,12 @@ isSimpleNode(Node *node, Node *parentNode, int prettyFlags) return false; case T_FuncExpr: { - /* special handling for casts */ + /* special handling for casts and COERCE_SQL_SYNTAX */ CoercionForm type = ((FuncExpr *) parentNode)->funcformat; if (type == COERCE_EXPLICIT_CAST || - type == COERCE_IMPLICIT_CAST) + type == COERCE_IMPLICIT_CAST || + type == COERCE_SQL_SYNTAX) return false; return true; /* own parentheses */ } @@ -10432,9 +10434,11 @@ get_func_sql_syntax(FuncExpr *expr, deparse_context *context) case F_TIMEZONE_TEXT_TIMETZ: /* AT TIME ZONE ... 
note reversed argument order */ appendStringInfoChar(buf, '('); - get_rule_expr((Node *) lsecond(expr->args), context, false); + get_rule_expr_paren((Node *) lsecond(expr->args), context, false, + (Node *) expr); appendStringInfoString(buf, " AT TIME ZONE "); - get_rule_expr((Node *) linitial(expr->args), context, false); + get_rule_expr_paren((Node *) linitial(expr->args), context, false, + (Node *) expr); appendStringInfoChar(buf, ')'); return true; @@ -10486,9 +10490,10 @@ get_func_sql_syntax(FuncExpr *expr, deparse_context *context) case F_IS_NORMALIZED: /* IS xxx NORMALIZED */ - appendStringInfoString(buf, "(("); - get_rule_expr((Node *) linitial(expr->args), context, false); - appendStringInfoString(buf, ") IS"); + appendStringInfoString(buf, "("); + get_rule_expr_paren((Node *) linitial(expr->args), context, false, + (Node *) expr); + appendStringInfoString(buf, " IS"); if (list_length(expr->args) == 2) { Const *con = (Const *) lsecond(expr->args); @@ -10509,11 +10514,6 @@ get_func_sql_syntax(FuncExpr *expr, deparse_context *context) appendStringInfoChar(buf, ')'); return true; - /* - * XXX EXTRACT, a/k/a date_part(), is intentionally not covered - * yet. Add it after we change the return type to numeric. 
- */ - case F_NORMALIZE: /* NORMALIZE() */ appendStringInfoString(buf, "NORMALIZE("); diff --git a/src/test/regress/expected/create_view.out b/src/test/regress/expected/create_view.out index fdb0657bb72..ac88c92f398 100644 --- a/src/test/regress/expected/create_view.out +++ b/src/test/regress/expected/create_view.out @@ -1797,6 +1797,7 @@ select pg_get_viewdef('tt20v', true); -- reverse-listing of various special function syntaxes required by SQL create view tt201v as select + ('2022-12-01'::date + '1 day'::interval) at time zone 'UTC' as atz, extract(day from now()) as extr, (now(), '1 day'::interval) overlaps (current_timestamp(2), '1 day'::interval) as o, @@ -1819,10 +1820,11 @@ select select pg_get_viewdef('tt201v', true); pg_get_viewdef ----------------------------------------------------------------------------------------------- - SELECT EXTRACT(day FROM now()) AS extr, + + SELECT (('12-01-2022'::date + '@ 1 day'::interval) AT TIME ZONE 'UTC'::text) AS atz, + + EXTRACT(day FROM now()) AS extr, + ((now(), '@ 1 day'::interval) OVERLAPS (CURRENT_TIMESTAMP(2), '@ 1 day'::interval)) AS o,+ - (('foo'::text) IS NORMALIZED) AS isn, + - (('foo'::text) IS NFKC NORMALIZED) AS isnn, + + ('foo'::text IS NORMALIZED) AS isn, + + ('foo'::text IS NFKC NORMALIZED) AS isnn, + NORMALIZE('foo'::text) AS n, + NORMALIZE('foo'::text, NFKD) AS nfkd, + OVERLAY('foo'::text PLACING 'bar'::text FROM 2) AS ovl, + diff --git a/src/test/regress/sql/create_view.sql b/src/test/regress/sql/create_view.sql index 2e7452ac9ea..f35364b8a23 100644 --- a/src/test/regress/sql/create_view.sql +++ b/src/test/regress/sql/create_view.sql @@ -615,6 +615,7 @@ select pg_get_viewdef('tt20v', true); create view tt201v as select + ('2022-12-01'::date + '1 day'::interval) at time zone 'UTC' as atz, extract(day from now()) as extr, (now(), '1 day'::interval) overlaps (current_timestamp(2), '1 day'::interval) as o, From cf47030c680bcbcb5c40927142414e509300b165 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Thu, 1 
Dec 2022 12:26:12 -0500 Subject: [PATCH 31/70] Doc: add example of round(v, s) with negative s. This has always worked, but you'd be unlikely to guess it from the documentation. Add an example showing it. Lack of docs noted by David Johnston. Back-patch to v13; the documentation layout we used before that was not very amenable to squeezing in multiple examples. Discussion: https://postgr.es/m/CAKFQuwZ4Vy1Xty0G5Ok+ot=NDrU3C6hzF1JwUk-FEkwe3V9_RA@mail.gmail.com --- doc/src/sgml/func.sgml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 7960cc2a5aa..ba8bccc135d 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -1598,6 +1598,10 @@ repeat('Pg', 4) PgPgPgPg round(42.4382, 2) 42.44 + + + round(1234.56, -1) + 1230 From 29ac3a5e4a33def0cd3f28a23f296e30d1447f70 Mon Sep 17 00:00:00 2001 From: Jeff Davis Date: Thu, 1 Dec 2022 11:26:32 -0800 Subject: [PATCH 32/70] Fix memory leak for hashing with nondeterministic collations. Backpatch through 12, where nondeterministic collations were introduced (5e1963fb76). 
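A sketch in SQL of a workload that reaches the leaky code path. The collation definition follows the standard ICU case-insensitive example from the PostgreSQL documentation; the table and column names are invented for illustration:

```sql
-- A nondeterministic ICU collation: equal values need not be bytewise
-- identical, so hashing must go through ICU sort keys -- the code path
-- patched here.
CREATE COLLATION case_insensitive
  (provider = icu, locale = 'und-u-ks-level2', deterministic = false);

CREATE TABLE t (s text COLLATE case_insensitive);

-- Hash aggregation invokes hashtext() once per row; before this fix the
-- intermediate UChar (UTF-16) buffer built for ucol_getSortKey() was
-- never pfree'd, so memory grew with the number of rows hashed.
SELECT s, count(*) FROM t GROUP BY s;
```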
Backpatch-through: 12 --- src/backend/access/hash/hashfunc.c | 2 ++ src/backend/utils/adt/varchar.c | 2 ++ 2 files changed, 4 insertions(+) diff --git a/src/backend/access/hash/hashfunc.c b/src/backend/access/hash/hashfunc.c index 242333920e7..2c00de50418 100644 --- a/src/backend/access/hash/hashfunc.c +++ b/src/backend/access/hash/hashfunc.c @@ -303,6 +303,7 @@ hashtext(PG_FUNCTION_ARGS) buf = palloc(bsize); ucol_getSortKey(mylocale->info.icu.ucol, uchar, ulen, buf, bsize); + pfree(uchar); result = hash_any(buf, bsize); @@ -360,6 +361,7 @@ hashtextextended(PG_FUNCTION_ARGS) buf = palloc(bsize); ucol_getSortKey(mylocale->info.icu.ucol, uchar, ulen, buf, bsize); + pfree(uchar); result = hash_any_extended(buf, bsize, PG_GETARG_INT64(1)); diff --git a/src/backend/utils/adt/varchar.c b/src/backend/utils/adt/varchar.c index f96c6508eb3..a5c651d629c 100644 --- a/src/backend/utils/adt/varchar.c +++ b/src/backend/utils/adt/varchar.c @@ -1020,6 +1020,7 @@ hashbpchar(PG_FUNCTION_ARGS) buf = palloc(bsize); ucol_getSortKey(mylocale->info.icu.ucol, uchar, ulen, buf, bsize); + pfree(uchar); result = hash_any(buf, bsize); @@ -1081,6 +1082,7 @@ hashbpcharextended(PG_FUNCTION_ARGS) buf = palloc(bsize); ucol_getSortKey(mylocale->info.icu.ucol, uchar, ulen, buf, bsize); + pfree(uchar); result = hash_any_extended(buf, bsize, PG_GETARG_INT64(1)); From 1fa66f8273753801658fc2e7d971f4b6e1703596 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Fri, 2 Dec 2022 14:24:44 -0500 Subject: [PATCH 33/70] Fix psql's \sf and \ef for new-style SQL functions. Some options of these commands need to be able to identify the start of the function body within the output of pg_get_functiondef(). It used to be that that always began with "AS", but since the introduction of new-style SQL functions, it might also start with "BEGIN" or "RETURN". Fix that on the psql side, and add some regression tests. Noted by me awhile ago, but I didn't do anything about it. Thanks to David Johnston for a nag. 
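The function-body forms that confused \sf and \ef can be produced as below; a minimal sketch with invented function names (pg_get_functiondef() output for these has no "AS" line -- the body starts with "RETURN" or "BEGIN ATOMIC" instead):

```sql
-- New-style SQL function, expression form: the deparsed body line
-- begins with "RETURN".
CREATE FUNCTION add_one(i int) RETURNS int
  LANGUAGE sql IMMUTABLE
  RETURN i + 1;

-- New-style SQL function, block form: the deparsed body line begins
-- with "BEGIN ATOMIC".
CREATE FUNCTION always_one() RETURNS int
  LANGUAGE sql
  BEGIN ATOMIC
    SELECT 1;
  END;
```

With this patch, `\sf+ add_one` in psql starts numbering at the RETURN line, just as it starts at the AS line for old-style functions.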
Discussion: https://postgr.es/m/AM9PR01MB8268D5CDABDF044EE9F42173FE8C9@AM9PR01MB8268.eurprd01.prod.exchangelabs.com --- src/backend/utils/adt/ruleutils.c | 4 +- src/bin/psql/command.c | 46 +++++++++++----------- src/test/regress/expected/psql.out | 62 ++++++++++++++++++++++++++++++ src/test/regress/sql/psql.sql | 8 ++++ 4 files changed, 94 insertions(+), 26 deletions(-) diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c index e1dca78d40d..95ae6c76018 100644 --- a/src/backend/utils/adt/ruleutils.c +++ b/src/backend/utils/adt/ruleutils.c @@ -2796,8 +2796,8 @@ pg_get_serial_sequence(PG_FUNCTION_ARGS) * * Note: if you change the output format of this function, be careful not * to break psql's rules (in \ef and \sf) for identifying the start of the - * function body. To wit: the function body starts on a line that begins - * with "AS ", and no preceding line will look like that. + * function body. To wit: the function body starts on a line that begins with + * "AS ", "BEGIN ", or "RETURN ", and no preceding line will look like that. */ Datum pg_get_functiondef(PG_FUNCTION_ARGS) diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 4215504dc11..269925342be 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -164,8 +164,7 @@ static bool get_create_object_cmd(EditableObjectType obj_type, Oid oid, PQExpBuffer buf); static int strip_lineno_from_objdesc(char *obj); static int count_lines_in_buf(PQExpBuffer buf); -static void print_with_linenumbers(FILE *output, char *lines, - const char *header_keyword); +static void print_with_linenumbers(FILE *output, char *lines, bool is_func); static void minimal_error_message(PGresult *res); static void printSSLInfo(void); @@ -1166,17 +1165,19 @@ exec_command_ef_ev(PsqlScanState scan_state, bool active_branch, /* * lineno "1" should correspond to the first line of the * function body. 
We expect that pg_get_functiondef() will - * emit that on a line beginning with "AS ", and that there - * can be no such line before the real start of the function - * body. Increment lineno by the number of lines before that - * line, so that it becomes relative to the first line of the - * function definition. + * emit that on a line beginning with "AS ", "BEGIN ", or + * "RETURN ", and that there can be no such line before the + * real start of the function body. Increment lineno by the + * number of lines before that line, so that it becomes + * relative to the first line of the function definition. */ const char *lines = query_buf->data; while (*lines != '\0') { - if (strncmp(lines, "AS ", 3) == 0) + if (strncmp(lines, "AS ", 3) == 0 || + strncmp(lines, "BEGIN ", 6) == 0 || + strncmp(lines, "RETURN ", 7) == 0) break; lineno++; /* find start of next line */ @@ -2454,15 +2455,8 @@ exec_command_sf_sv(PsqlScanState scan_state, bool active_branch, if (show_linenumbers) { - /* - * For functions, lineno "1" should correspond to the first - * line of the function body. We expect that - * pg_get_functiondef() will emit that on a line beginning - * with "AS ", and that there can be no such line before the - * real start of the function body. - */ - print_with_linenumbers(output, buf->data, - is_func ? "AS " : NULL); + /* add line numbers */ + print_with_linenumbers(output, buf->data, is_func); } else { @@ -5357,24 +5351,28 @@ count_lines_in_buf(PQExpBuffer buf) /* * Write text at *lines to output with line numbers. * - * If header_keyword isn't NULL, then line 1 should be the first line beginning - * with header_keyword; lines before that are unnumbered. + * For functions, lineno "1" should correspond to the first line of the + * function body; lines before that are unnumbered. 
We expect that + * pg_get_functiondef() will emit that on a line beginning with "AS ", + * "BEGIN ", or "RETURN ", and that there can be no such line before + * the real start of the function body. * * Caution: this scribbles on *lines. */ static void -print_with_linenumbers(FILE *output, char *lines, - const char *header_keyword) +print_with_linenumbers(FILE *output, char *lines, bool is_func) { - bool in_header = (header_keyword != NULL); - size_t header_sz = in_header ? strlen(header_keyword) : 0; + bool in_header = is_func; int lineno = 0; while (*lines != '\0') { char *eol; - if (in_header && strncmp(lines, header_keyword, header_sz) == 0) + if (in_header && + (strncmp(lines, "AS ", 3) == 0 || + strncmp(lines, "BEGIN ", 6) == 0 || + strncmp(lines, "RETURN ", 7) == 0)) in_header = false; /* increment lineno only for body's lines */ diff --git a/src/test/regress/expected/psql.out b/src/test/regress/expected/psql.out index a5951c0c6d1..2f06a903b57 100644 --- a/src/test/regress/expected/psql.out +++ b/src/test/regress/expected/psql.out @@ -5328,6 +5328,13 @@ List of access methods pg_catalog | bit_xor | smallint | smallint | agg (3 rows) +\df *._pg_expandarray + List of functions + Schema | Name | Result data type | Argument data types | Type +--------------------+-----------------+------------------+-------------------------------------------+------ + information_schema | _pg_expandarray | SETOF record | anyarray, OUT x anyelement, OUT n integer | func +(1 row) + \do - pg_catalog.int4 List of operators Schema | Name | Left arg type | Right arg type | Result type | Description @@ -5342,6 +5349,61 @@ List of access methods pg_catalog | && | anyarray | anyarray | boolean | overlaps (1 row) +-- check \sf +\sf information_schema._pg_expandarray +CREATE OR REPLACE FUNCTION information_schema._pg_expandarray(anyarray, OUT x anyelement, OUT n integer) + RETURNS SETOF record + LANGUAGE sql + IMMUTABLE PARALLEL SAFE STRICT +AS $function$select $1[s], + s 
operator(pg_catalog.-) pg_catalog.array_lower($1,1) operator(pg_catalog.+) 1 + from pg_catalog.generate_series(pg_catalog.array_lower($1,1), + pg_catalog.array_upper($1,1), + 1) as g(s)$function$ +\sf+ information_schema._pg_expandarray + CREATE OR REPLACE FUNCTION information_schema._pg_expandarray(anyarray, OUT x anyelement, OUT n integer) + RETURNS SETOF record + LANGUAGE sql + IMMUTABLE PARALLEL SAFE STRICT +1 AS $function$select $1[s], +2 s operator(pg_catalog.-) pg_catalog.array_lower($1,1) operator(pg_catalog.+) 1 +3 from pg_catalog.generate_series(pg_catalog.array_lower($1,1), +4 pg_catalog.array_upper($1,1), +5 1) as g(s)$function$ +\sf+ interval_pl_time + CREATE OR REPLACE FUNCTION pg_catalog.interval_pl_time(interval, time without time zone) + RETURNS time without time zone + LANGUAGE sql + IMMUTABLE PARALLEL SAFE STRICT COST 1 +1 RETURN ($2 + $1) +\sf ts_debug(text) +CREATE OR REPLACE FUNCTION pg_catalog.ts_debug(document text, OUT alias text, OUT description text, OUT token text, OUT dictionaries regdictionary[], OUT dictionary regdictionary, OUT lexemes text[]) + RETURNS SETOF record + LANGUAGE sql + STABLE PARALLEL SAFE STRICT +BEGIN ATOMIC + SELECT ts_debug.alias, + ts_debug.description, + ts_debug.token, + ts_debug.dictionaries, + ts_debug.dictionary, + ts_debug.lexemes + FROM ts_debug(get_current_ts_config(), ts_debug.document) ts_debug(alias, description, token, dictionaries, dictionary, lexemes); +END +\sf+ ts_debug(text) + CREATE OR REPLACE FUNCTION pg_catalog.ts_debug(document text, OUT alias text, OUT description text, OUT token text, OUT dictionaries regdictionary[], OUT dictionary regdictionary, OUT lexemes text[]) + RETURNS SETOF record + LANGUAGE sql + STABLE PARALLEL SAFE STRICT +1 BEGIN ATOMIC +2 SELECT ts_debug.alias, +3 ts_debug.description, +4 ts_debug.token, +5 ts_debug.dictionaries, +6 ts_debug.dictionary, +7 ts_debug.lexemes +8 FROM ts_debug(get_current_ts_config(), ts_debug.document) ts_debug(alias, description, token, 
dictionaries, dictionary, lexemes); +9 END -- check describing invalid multipart names \dA regression.heap improper qualified name (too many dotted names): regression.heap diff --git a/src/test/regress/sql/psql.sql b/src/test/regress/sql/psql.sql index 501579f22a8..467d0b42143 100644 --- a/src/test/regress/sql/psql.sql +++ b/src/test/regress/sql/psql.sql @@ -1252,9 +1252,17 @@ drop role regress_partitioning_role; \df has_database_privilege oid text \df has_database_privilege oid text - \dfa bit* small* +\df *._pg_expandarray \do - pg_catalog.int4 \do && anyarray * +-- check \sf +\sf information_schema._pg_expandarray +\sf+ information_schema._pg_expandarray +\sf+ interval_pl_time +\sf ts_debug(text) +\sf+ ts_debug(text) + -- check describing invalid multipart names \dA regression.heap \dA nonesuch.heap From f964fb751eb99936de7f75c980649a60ba1f127c Mon Sep 17 00:00:00 2001 From: Dean Rasheed Date: Sat, 3 Dec 2022 12:16:07 +0000 Subject: [PATCH 34/70] Fix DEFAULT handling for multi-row INSERT rules. When updating a relation with a rule whose action performed an INSERT from a multi-row VALUES list, the rewriter might skip processing the VALUES list, and therefore fail to replace any DEFAULTs in it. This would lead to an "unrecognized node type" error. The reason was that RewriteQuery() assumed that a query doing an INSERT from a multi-row VALUES list would necessarily only have one item in its fromlist, pointing to the VALUES RTE to read from. That assumption is correct for the original query, but not for product queries produced for rule actions. In such cases, there may be multiple items in the fromlist, possibly including multiple VALUES RTEs. What is required instead is for RewriteQuery() to skip any RTEs from the product query's originating query, which might include one or more already-processed VALUES RTEs. What's left should then include at most one VALUES RTE (from the rule action) to be processed. Patch by me. Thanks to Tom Lane for reviewing. 
Back-patch to all supported branches. Discussion: https://postgr.es/m/CAEZATCV39OOW7LAR_Xq4i%2BLc1Byux%3DeK3Q%3DHD_pF1o9LBt%3DphA%40mail.gmail.com --- src/backend/rewrite/rewriteHandler.c | 47 +++++++++++--- src/test/regress/expected/rules.out | 91 ++++++++++++++++------------ src/test/regress/sql/rules.sql | 15 +++-- 3 files changed, 99 insertions(+), 54 deletions(-) diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c index 48d1caace62..2938edf1c6a 100644 --- a/src/backend/rewrite/rewriteHandler.c +++ b/src/backend/rewrite/rewriteHandler.c @@ -468,6 +468,10 @@ rewriteRuleAction(Query *parsetree, * NOTE: because planner will destructively alter rtable, we must ensure * that rule action's rtable is separate and shares no substructure with * the main rtable. Hence do a deep copy here. + * + * Note also that RewriteQuery() relies on the fact that RT entries from + * the original query appear at the start of the expanded rtable, so + * beware of changing this. */ sub_action->rtable = list_concat(copyObject(parsetree->rtable), sub_action->rtable); @@ -3661,9 +3665,13 @@ rewriteTargetView(Query *parsetree, Relation view) * * rewrite_events is a list of open query-rewrite actions, so we can detect * infinite recursion. + * + * orig_rt_length is the length of the originating query's rtable, for product + * queries created by fireRules(), and 0 otherwise. This is used to skip any + * already-processed VALUES RTEs from the original query. 
*/ static List * -RewriteQuery(Query *parsetree, List *rewrite_events) +RewriteQuery(Query *parsetree, List *rewrite_events, int orig_rt_length) { CmdType event = parsetree->commandType; bool instead = false; @@ -3687,7 +3695,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events) if (ctequery->commandType == CMD_SELECT) continue; - newstuff = RewriteQuery(ctequery, rewrite_events); + newstuff = RewriteQuery(ctequery, rewrite_events, 0); /* * Currently we can only handle unconditional, single-statement DO @@ -3761,6 +3769,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events) RangeTblEntry *rt_entry; Relation rt_entry_relation; List *locks; + int product_orig_rt_length; List *product_queries; bool hasUpdate = false; int values_rte_index = 0; @@ -3782,23 +3791,30 @@ RewriteQuery(Query *parsetree, List *rewrite_events) */ if (event == CMD_INSERT) { + ListCell *lc2; RangeTblEntry *values_rte = NULL; /* - * If it's an INSERT ... VALUES (...), (...), ... there will be a - * single RTE for the VALUES targetlists. + * Test if it's a multi-row INSERT ... VALUES (...), (...), ... by + * looking for a VALUES RTE in the fromlist. For product queries, + * we must ignore any already-processed VALUES RTEs from the + * original query. These appear at the start of the rangetable. 
*/ - if (list_length(parsetree->jointree->fromlist) == 1) + foreach(lc2, parsetree->jointree->fromlist) { - RangeTblRef *rtr = (RangeTblRef *) linitial(parsetree->jointree->fromlist); + RangeTblRef *rtr = (RangeTblRef *) lfirst(lc2); - if (IsA(rtr, RangeTblRef)) + if (IsA(rtr, RangeTblRef) && rtr->rtindex > orig_rt_length) { RangeTblEntry *rte = rt_fetch(rtr->rtindex, parsetree->rtable); if (rte->rtekind == RTE_VALUES) { + /* should not find more than one VALUES RTE */ + if (values_rte != NULL) + elog(ERROR, "more than one VALUES RTE found"); + values_rte = rte; values_rte_index = rtr->rtindex; } @@ -3870,6 +3886,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events) locks = matchLocks(event, rt_entry_relation->rd_rules, result_relation, parsetree, &hasUpdate); + product_orig_rt_length = list_length(parsetree->rtable); product_queries = fireRules(parsetree, result_relation, event, @@ -4053,7 +4070,19 @@ RewriteQuery(Query *parsetree, List *rewrite_events) Query *pt = (Query *) lfirst(n); List *newstuff; - newstuff = RewriteQuery(pt, rewrite_events); + /* + * For an updatable view, pt might be the rewritten version of + * the original query, in which case we pass on orig_rt_length + * to finish processing any VALUES RTE it contained. + * + * Otherwise, we have a product query created by fireRules(). + * Any VALUES RTEs from the original query have been fully + * processed, and must be skipped when we recurse. + */ + newstuff = RewriteQuery(pt, rewrite_events, + pt == parsetree ? 
+ orig_rt_length : + product_orig_rt_length); rewritten = list_concat(rewritten, newstuff); } @@ -4205,7 +4234,7 @@ QueryRewrite(Query *parsetree) * * Apply all non-SELECT rules possibly getting 0 or many queries */ - querylist = RewriteQuery(parsetree, NIL); + querylist = RewriteQuery(parsetree, NIL, 0); /* * Step 2 diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out index 16a471245a9..a2922a0a9ec 100644 --- a/src/test/regress/expected/rules.out +++ b/src/test/regress/expected/rules.out @@ -3099,11 +3099,11 @@ select pg_get_viewdef('shoe'::regclass,0) as prettier; -- -- check multi-row VALUES in rules -- -create table rules_src(f1 int, f2 int); -create table rules_log(f1 int, f2 int, tag text); +create table rules_src(f1 int, f2 int default 0); +create table rules_log(f1 int, f2 int, tag text, id serial); insert into rules_src values(1,2), (11,12); create rule r1 as on update to rules_src do also - insert into rules_log values(old.*, 'old'), (new.*, 'new'); + insert into rules_log values(old.*, 'old', default), (new.*, 'new', default); update rules_src set f2 = f2 + 1; update rules_src set f2 = f2 * 10; select * from rules_src; @@ -3114,16 +3114,16 @@ select * from rules_src; (2 rows) select * from rules_log; - f1 | f2 | tag -----+-----+----- - 1 | 2 | old - 1 | 3 | new - 11 | 12 | old - 11 | 13 | new - 1 | 3 | old - 1 | 30 | new - 11 | 13 | old - 11 | 130 | new + f1 | f2 | tag | id +----+-----+-----+---- + 1 | 2 | old | 1 + 1 | 3 | new | 2 + 11 | 12 | old | 3 + 11 | 13 | new | 4 + 1 | 3 | old | 5 + 1 | 30 | new | 6 + 11 | 13 | old | 7 + 11 | 130 | new | 8 (8 rows) create rule r2 as on update to rules_src do also @@ -3137,71 +3137,84 @@ update rules_src set f2 = f2 / 10; 11 | 13 | new (4 rows) +create rule r3 as on insert to rules_src do also + insert into rules_log values(null, null, '-', default), (new.*, 'new', default); +insert into rules_src values(22,23), (33,default); select * from rules_src; f1 | f2 ----+---- 1 | 3 11 | 
13 -(2 rows) + 22 | 23 + 33 | 0 +(4 rows) select * from rules_log; - f1 | f2 | tag -----+-----+----- - 1 | 2 | old - 1 | 3 | new - 11 | 12 | old - 11 | 13 | new - 1 | 3 | old - 1 | 30 | new - 11 | 13 | old - 11 | 130 | new - 1 | 30 | old - 1 | 3 | new - 11 | 130 | old - 11 | 13 | new -(12 rows) - -create rule r3 as on delete to rules_src do notify rules_src_deletion; + f1 | f2 | tag | id +----+-----+-----+---- + 1 | 2 | old | 1 + 1 | 3 | new | 2 + 11 | 12 | old | 3 + 11 | 13 | new | 4 + 1 | 3 | old | 5 + 1 | 30 | new | 6 + 11 | 13 | old | 7 + 11 | 130 | new | 8 + 1 | 30 | old | 9 + 1 | 3 | new | 10 + 11 | 130 | old | 11 + 11 | 13 | new | 12 + | | - | 13 + 22 | 23 | new | 14 + | | - | 15 + 33 | 0 | new | 16 +(16 rows) + +create rule r4 as on delete to rules_src do notify rules_src_deletion; \d+ rules_src Table "public.rules_src" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- f1 | integer | | | | plain | | - f2 | integer | | | | plain | | + f2 | integer | | | 0 | plain | | Rules: r1 AS - ON UPDATE TO rules_src DO INSERT INTO rules_log (f1, f2, tag) VALUES (old.f1,old.f2,'old'::text), (new.f1,new.f2,'new'::text) + ON UPDATE TO rules_src DO INSERT INTO rules_log (f1, f2, tag, id) VALUES (old.f1,old.f2,'old'::text,DEFAULT), (new.f1,new.f2,'new'::text,DEFAULT) r2 AS ON UPDATE TO rules_src DO VALUES (old.f1,old.f2,'old'::text), (new.f1,new.f2,'new'::text) r3 AS + ON INSERT TO rules_src DO INSERT INTO rules_log (f1, f2, tag, id) VALUES (NULL::integer,NULL::integer,'-'::text,DEFAULT), (new.f1,new.f2,'new'::text,DEFAULT) + r4 AS ON DELETE TO rules_src DO NOTIFY rules_src_deletion -- -- Ensure an aliased target relation for insert is correctly deparsed. 
-- -create rule r4 as on insert to rules_src do instead insert into rules_log AS trgt SELECT NEW.* RETURNING trgt.f1, trgt.f2; -create rule r5 as on update to rules_src do instead UPDATE rules_log AS trgt SET tag = 'updated' WHERE trgt.f1 = new.f1; +create rule r5 as on insert to rules_src do instead insert into rules_log AS trgt SELECT NEW.* RETURNING trgt.f1, trgt.f2; +create rule r6 as on update to rules_src do instead UPDATE rules_log AS trgt SET tag = 'updated' WHERE trgt.f1 = new.f1; \d+ rules_src Table "public.rules_src" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- f1 | integer | | | | plain | | - f2 | integer | | | | plain | | + f2 | integer | | | 0 | plain | | Rules: r1 AS - ON UPDATE TO rules_src DO INSERT INTO rules_log (f1, f2, tag) VALUES (old.f1,old.f2,'old'::text), (new.f1,new.f2,'new'::text) + ON UPDATE TO rules_src DO INSERT INTO rules_log (f1, f2, tag, id) VALUES (old.f1,old.f2,'old'::text,DEFAULT), (new.f1,new.f2,'new'::text,DEFAULT) r2 AS ON UPDATE TO rules_src DO VALUES (old.f1,old.f2,'old'::text), (new.f1,new.f2,'new'::text) r3 AS + ON INSERT TO rules_src DO INSERT INTO rules_log (f1, f2, tag, id) VALUES (NULL::integer,NULL::integer,'-'::text,DEFAULT), (new.f1,new.f2,'new'::text,DEFAULT) + r4 AS ON DELETE TO rules_src DO NOTIFY rules_src_deletion - r4 AS + r5 AS ON INSERT TO rules_src DO INSTEAD INSERT INTO rules_log AS trgt (f1, f2) SELECT new.f1, new.f2 RETURNING trgt.f1, trgt.f2 - r5 AS + r6 AS ON UPDATE TO rules_src DO INSTEAD UPDATE rules_log trgt SET tag = 'updated'::text WHERE trgt.f1 = new.f1 diff --git a/src/test/regress/sql/rules.sql b/src/test/regress/sql/rules.sql index 078eabcd4cb..7b0cd28720c 100644 --- a/src/test/regress/sql/rules.sql +++ b/src/test/regress/sql/rules.sql @@ -1016,11 +1016,11 @@ select pg_get_viewdef('shoe'::regclass,0) as prettier; -- check multi-row VALUES in rules -- -create table 
rules_src(f1 int, f2 int); -create table rules_log(f1 int, f2 int, tag text); +create table rules_src(f1 int, f2 int default 0); +create table rules_log(f1 int, f2 int, tag text, id serial); insert into rules_src values(1,2), (11,12); create rule r1 as on update to rules_src do also - insert into rules_log values(old.*, 'old'), (new.*, 'new'); + insert into rules_log values(old.*, 'old', default), (new.*, 'new', default); update rules_src set f2 = f2 + 1; update rules_src set f2 = f2 * 10; select * from rules_src; @@ -1028,16 +1028,19 @@ select * from rules_log; create rule r2 as on update to rules_src do also values(old.*, 'old'), (new.*, 'new'); update rules_src set f2 = f2 / 10; +create rule r3 as on insert to rules_src do also + insert into rules_log values(null, null, '-', default), (new.*, 'new', default); +insert into rules_src values(22,23), (33,default); select * from rules_src; select * from rules_log; -create rule r3 as on delete to rules_src do notify rules_src_deletion; +create rule r4 as on delete to rules_src do notify rules_src_deletion; \d+ rules_src -- -- Ensure an aliased target relation for insert is correctly deparsed. -- -create rule r4 as on insert to rules_src do instead insert into rules_log AS trgt SELECT NEW.* RETURNING trgt.f1, trgt.f2; -create rule r5 as on update to rules_src do instead UPDATE rules_log AS trgt SET tag = 'updated' WHERE trgt.f1 = new.f1; +create rule r5 as on insert to rules_src do instead insert into rules_log AS trgt SELECT NEW.* RETURNING trgt.f1, trgt.f2; +create rule r6 as on update to rules_src do instead UPDATE rules_log AS trgt SET tag = 'updated' WHERE trgt.f1 = new.f1; \d+ rules_src -- From fffb3bc2719d2153fea713de70d7d9a12b6dd4a3 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Sun, 4 Dec 2022 13:17:18 -0500 Subject: [PATCH 35/70] Fix generate_partitionwise_join_paths() to tolerate failure. We might fail to generate a partitionwise join, because reparameterize_path_by_child() does not support all path types. 
This should not be a hard failure condition: we should just fall back to a non-partitioned join. However, generate_partitionwise_join_paths did not consider this possibility and would emit the (misleading) error "could not devise a query plan for the given query" if we'd failed to make any paths for a child join. Fix it to give up on partitionwise joining if so. (The accepted technique for giving up appears to be to set rel->nparts = 0, which I find pretty bizarre, but there you have it.) I have not added a test case because there'd be little point: any omissions of this sort that we identify would soon get fixed by extending reparameterize_path_by_child(), so the test would stop proving anything. However, right now there is a known test case based on failure to cover MaterialPath, and with that I've found that this is broken in all supported versions. Hence, patch all the way back. Original report and patch by me; thanks to Richard Guo for identifying a test case that works against committed versions. Discussion: https://postgr.es/m/1854233.1669949723@sss.pgh.pa.us --- src/backend/optimizer/path/allpaths.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index c02fcd4ea73..10e41de8cd9 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -4855,12 +4855,24 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel) if (child_rel == NULL) continue; - /* Add partitionwise join paths for partitioned child-joins. */ + /* Make partitionwise join paths for this partitioned child-join. */ generate_partitionwise_join_paths(root, child_rel); + /* If we failed to make any path for this child, we must give up. */ + if (child_rel->pathlist == NIL) + { + /* + * Mark the parent joinrel as unpartitioned so that later + * functions treat it correctly. 
+ */ + rel->nparts = 0; + return; + } + + /* Else, identify the cheapest path for it. */ set_cheapest(child_rel); - /* Dummy children will not be scanned, so ignore those. */ + /* Dummy children need not be scanned, so ignore those. */ if (IS_DUMMY_REL(child_rel)) continue; From dd5484112702d0daf5ab66a136871564f86e0f89 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Sun, 4 Dec 2022 13:48:12 -0500 Subject: [PATCH 36/70] Fix broken MemoizePath support in reparameterize_path(). It neglected to recurse to the subpath, meaning you'd get back a path identical to the input. This could produce wrong query results if the omission meant that the subpath fails to enforce some join clause it should be enforcing. We don't have a test case for this at the moment, but the code is obviously broken and the fix is equally obvious. Back-patch to v14 where Memoize was introduced. Richard Guo Discussion: https://postgr.es/m/CAMbWs4_R=ORpz=Lkn2q3ebPC5EuWyfZF+tmfCPVLBVK5W39mHA@mail.gmail.com --- src/backend/optimizer/util/pathnode.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c index 24b795dde17..b989d46d6de 100644 --- a/src/backend/optimizer/util/pathnode.c +++ b/src/backend/optimizer/util/pathnode.c @@ -6293,9 +6293,15 @@ reparameterize_path(PlannerInfo *root, Path *path, case T_Memoize: { MemoizePath *mpath = (MemoizePath *) path; + Path *spath = mpath->subpath; + spath = reparameterize_path(root, spath, + required_outer, + loop_count); + if (spath == NULL) + return NULL; return (Path *) create_memoize_path(root, rel, - mpath->subpath, + spath, mpath->param_exprs, mpath->hash_operators, mpath->singlerow, From 7d21f8b2808e97facf0193486f98fd04d02b4e3d Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Mon, 5 Dec 2022 11:23:30 +0900 Subject: [PATCH 37/70] doc: Add missing markups for developer GUCs Missing such markups makes it impossible to create links back to these GUCs, and all 
the other parameters have one already. Author: Ian Lawrence Barwick Discussion: https://postgr.es/m/CAB8KJ=jx=6dFB_EN3j0UkuvG3cPu5OmQiM-ZKRAz+fKvS+u8Ng@mail.gmail.com Backpatch-through: 11 --- doc/src/sgml/config.sgml | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 514cac29e60..5532e63740d 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -10734,7 +10734,7 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir' - + trace_locks (boolean) trace_locks configuration parameter @@ -10775,7 +10775,7 @@ LOG: CleanUpLock: deleting: lock(0xb7acd844) id(24688,24696,0,0,0,1) - + trace_lwlocks (boolean) trace_lwlocks configuration parameter @@ -10795,7 +10795,7 @@ LOG: CleanUpLock: deleting: lock(0xb7acd844) id(24688,24696,0,0,0,1) - + trace_userlocks (boolean) trace_userlocks configuration parameter @@ -10814,7 +10814,7 @@ LOG: CleanUpLock: deleting: lock(0xb7acd844) id(24688,24696,0,0,0,1) - + trace_lock_oidmin (integer) trace_lock_oidmin configuration parameter @@ -10833,7 +10833,7 @@ LOG: CleanUpLock: deleting: lock(0xb7acd844) id(24688,24696,0,0,0,1) - + trace_lock_table (integer) trace_lock_table configuration parameter @@ -10851,7 +10851,7 @@ LOG: CleanUpLock: deleting: lock(0xb7acd844) id(24688,24696,0,0,0,1) - + debug_deadlocks (boolean) debug_deadlocks configuration parameter @@ -10870,7 +10870,7 @@ LOG: CleanUpLock: deleting: lock(0xb7acd844) id(24688,24696,0,0,0,1) - + log_btree_build_stats (boolean) log_btree_build_stats configuration parameter From 600b6b9ea498cad9ea7e9b9b628873cef2853f17 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Mon, 5 Dec 2022 12:36:41 -0500 Subject: [PATCH 38/70] Fix Memoize to work with partitionwise joining. A couple of places weren't up to speed for this. By sheer good luck, we didn't fail but just selected a non-memoized join plan, at least in the test case we have. 
Nonetheless, it's a bug, and I'm not quite sure that it couldn't have worse consequences in other examples. So back-patch to v14 where Memoize came in. Richard Guo Discussion: https://postgr.es/m/CAMbWs48GkNom272sfp0-WeD6_0HSR19BJ4H1c9ZKSfbVnJsvRg@mail.gmail.com --- src/backend/optimizer/path/joinpath.c | 14 +++++++++- src/backend/optimizer/util/pathnode.c | 1 + src/include/nodes/pathnodes.h | 4 +-- src/test/regress/expected/memoize.out | 39 +++++++++++++++++++++++++++ src/test/regress/sql/memoize.sql | 19 +++++++++++++ 5 files changed, 74 insertions(+), 3 deletions(-) diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c index ec50c66104a..d4c2b793bb5 100644 --- a/src/backend/optimizer/path/joinpath.c +++ b/src/backend/optimizer/path/joinpath.c @@ -600,6 +600,7 @@ get_memoize_path(PlannerInfo *root, RelOptInfo *innerrel, Path *outer_path, JoinType jointype, JoinPathExtraData *extra) { + RelOptInfo *top_outerrel; List *param_exprs; List *hash_operators; ListCell *lc; @@ -689,10 +690,21 @@ get_memoize_path(PlannerInfo *root, RelOptInfo *innerrel, return NULL; } + /* + * When considering a partitionwise join, we have clauses that reference + * the outerrel's top parent not outerrel itself. 
+ */ + if (outerrel->reloptkind == RELOPT_OTHER_MEMBER_REL) + top_outerrel = find_base_rel(root, bms_singleton_member(outerrel->top_parent_relids)); + else if (outerrel->reloptkind == RELOPT_OTHER_JOINREL) + top_outerrel = find_join_rel(root, outerrel->top_parent_relids); + else + top_outerrel = outerrel; + /* Check if we have hash ops for each parameter to the path */ if (paraminfo_get_equal_hashops(root, inner_path->param_info, - outerrel, + top_outerrel, innerrel, ¶m_exprs, &hash_operators, diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c index b989d46d6de..4648a2f2719 100644 --- a/src/backend/optimizer/util/pathnode.c +++ b/src/backend/optimizer/util/pathnode.c @@ -6532,6 +6532,7 @@ do { \ FLAT_COPY_PATH(mpath, path, MemoizePath); REPARAMETERIZE_CHILD_PATH(mpath->subpath); + ADJUST_CHILD_ATTRS(mpath->param_exprs); new_path = (Path *) mpath; } break; diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h index b02a2ccf0c1..41220887050 100644 --- a/src/include/nodes/pathnodes.h +++ b/src/include/nodes/pathnodes.h @@ -1933,8 +1933,8 @@ typedef struct MemoizePath { Path path; Path *subpath; /* outerpath to cache tuples from */ - List *hash_operators; /* hash operators for each key */ - List *param_exprs; /* cache keys */ + List *hash_operators; /* OIDs of hash equality ops for cache keys */ + List *param_exprs; /* expressions that are cache keys */ bool singlerow; /* true if the cache entry is to be marked as * complete after caching the first record. 
*/ bool binary_mode; /* true when cache key should be compared bit diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out index 3dfc46118da..09b8b5fa719 100644 --- a/src/test/regress/expected/memoize.out +++ b/src/test/regress/expected/memoize.out @@ -209,6 +209,45 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.t >= s2.t;', false); (7 rows) DROP TABLE strtest; +-- Ensure memoize works with partitionwise join +SET enable_partitionwise_join TO on; +CREATE TABLE prt (a int) PARTITION BY RANGE(a); +CREATE TABLE prt_p1 PARTITION OF prt FOR VALUES FROM (0) TO (10); +CREATE TABLE prt_p2 PARTITION OF prt FOR VALUES FROM (10) TO (20); +INSERT INTO prt VALUES (0), (0), (0), (0); +INSERT INTO prt VALUES (10), (10), (10), (10); +CREATE INDEX iprt_p1_a ON prt_p1 (a); +CREATE INDEX iprt_p2_a ON prt_p2 (a); +ANALYZE prt; +SELECT explain_memoize(' +SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false); + explain_memoize +------------------------------------------------------------------------------------------ + Append (actual rows=32 loops=N) + -> Nested Loop (actual rows=16 loops=N) + -> Index Only Scan using iprt_p1_a on prt_p1 t1_1 (actual rows=4 loops=N) + Heap Fetches: N + -> Memoize (actual rows=4 loops=N) + Cache Key: t1_1.a + Cache Mode: logical + Hits: 3 Misses: 1 Evictions: Zero Overflows: 0 Memory Usage: NkB + -> Index Only Scan using iprt_p1_a on prt_p1 t2_1 (actual rows=4 loops=N) + Index Cond: (a = t1_1.a) + Heap Fetches: N + -> Nested Loop (actual rows=16 loops=N) + -> Index Only Scan using iprt_p2_a on prt_p2 t1_2 (actual rows=4 loops=N) + Heap Fetches: N + -> Memoize (actual rows=4 loops=N) + Cache Key: t1_2.a + Cache Mode: logical + Hits: 3 Misses: 1 Evictions: Zero Overflows: 0 Memory Usage: NkB + -> Index Only Scan using iprt_p2_a on prt_p2 t2_2 (actual rows=4 loops=N) + Index Cond: (a = t1_2.a) + Heap Fetches: N +(21 rows) + +DROP TABLE prt; +RESET enable_partitionwise_join; -- Exercise Memoize code that 
flushes the cache when a parameter changes which -- is not part of the cache key. -- Ensure we get a Memoize plan diff --git a/src/test/regress/sql/memoize.sql b/src/test/regress/sql/memoize.sql index 68ce217df1b..78ae7bb782c 100644 --- a/src/test/regress/sql/memoize.sql +++ b/src/test/regress/sql/memoize.sql @@ -110,6 +110,25 @@ SELECT * FROM strtest s1 INNER JOIN strtest s2 ON s1.t >= s2.t;', false); DROP TABLE strtest; +-- Ensure memoize works with partitionwise join +SET enable_partitionwise_join TO on; + +CREATE TABLE prt (a int) PARTITION BY RANGE(a); +CREATE TABLE prt_p1 PARTITION OF prt FOR VALUES FROM (0) TO (10); +CREATE TABLE prt_p2 PARTITION OF prt FOR VALUES FROM (10) TO (20); +INSERT INTO prt VALUES (0), (0), (0), (0); +INSERT INTO prt VALUES (10), (10), (10), (10); +CREATE INDEX iprt_p1_a ON prt_p1 (a); +CREATE INDEX iprt_p2_a ON prt_p2 (a); +ANALYZE prt; + +SELECT explain_memoize(' +SELECT * FROM prt t1 INNER JOIN prt t2 ON t1.a = t2.a;', false); + +DROP TABLE prt; + +RESET enable_partitionwise_join; + -- Exercise Memoize code that flushes the cache when a parameter changes which -- is not part of the cache key. From feaabfcb78efd9eb607fc1a10610b31ab39b8ee1 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Mon, 12 Dec 2022 16:17:49 -0500 Subject: [PATCH 39/70] Fix jsonb subscripting to cope with toasted subscript values. jsonb_get_element() was incautious enough to use VARDATA() and VARSIZE() directly on an arbitrary text Datum. That of course fails if the Datum is short-header, compressed, or out-of-line. The typical result would be failing to match any element of a jsonb object, though matching the wrong one seems possible as well. setPathObject() was slightly brighter, in that it used VARDATA_ANY and VARSIZE_ANY_EXHDR, but that only kept it out of trouble for short-header Datums. push_path() had the same issue. This could result in faulty subscripted insertions, though keys long enough to cause a problem are likely rare in the wild. 
Having seen these, I looked around for unsafe usages in the rest of the adt/json* files. There are a couple of places where it's not immediately obvious that the Datum can't be compressed or out-of-line, so I added pg_detoast_datum_packed() to cope if it is. Also, remove some other usages of VARDATA/VARSIZE on Datums we just extracted from a text array. Those aren't actively broken, but they will become so if we ever start allowing short-header array elements, which does not seem like a terribly unreasonable thing to do. In any case they are not great coding examples, and they could also do with comments pointing out that we're assuming we don't need pg_detoast_datum_packed. Per report from exe-dealer@yandex.ru. Patch by me, but thanks to David Johnston for initial investigation. Back-patch to v14 where jsonb subscripting was introduced. Discussion: https://postgr.es/m/205321670615953@mail.yandex.ru --- src/backend/utils/adt/jsonb_gin.c | 5 ++-- src/backend/utils/adt/jsonb_op.c | 10 ++++--- src/backend/utils/adt/jsonfuncs.c | 41 ++++++++++++++++++++--------- src/test/regress/expected/jsonb.out | 34 ++++++++++++++++++++++++ src/test/regress/sql/jsonb.sql | 18 +++++++++++++ 5 files changed, 90 insertions(+), 18 deletions(-) diff --git a/src/backend/utils/adt/jsonb_gin.c b/src/backend/utils/adt/jsonb_gin.c index 37499bc5622..a4d7d24509a 100644 --- a/src/backend/utils/adt/jsonb_gin.c +++ b/src/backend/utils/adt/jsonb_gin.c @@ -896,9 +896,10 @@ gin_extract_jsonb_query(PG_FUNCTION_ARGS) /* Nulls in the array are ignored */ if (key_nulls[i]) continue; + /* We rely on the array elements not being toasted */ entries[j++] = make_text_key(JGINFLAG_KEY, - VARDATA(key_datums[i]), - VARSIZE(key_datums[i]) - VARHDRSZ); + VARDATA_ANY(key_datums[i]), + VARSIZE_ANY_EXHDR(key_datums[i])); } *nentries = j; diff --git a/src/backend/utils/adt/jsonb_op.c b/src/backend/utils/adt/jsonb_op.c index 6e85e5c36b3..758fd7beaae 100644 --- a/src/backend/utils/adt/jsonb_op.c +++ 
b/src/backend/utils/adt/jsonb_op.c @@ -64,8 +64,9 @@ jsonb_exists_any(PG_FUNCTION_ARGS) continue; strVal.type = jbvString; - strVal.val.string.val = VARDATA(key_datums[i]); - strVal.val.string.len = VARSIZE(key_datums[i]) - VARHDRSZ; + /* We rely on the array elements not being toasted */ + strVal.val.string.val = VARDATA_ANY(key_datums[i]); + strVal.val.string.len = VARSIZE_ANY_EXHDR(key_datums[i]); if (findJsonbValueFromContainer(&jb->root, JB_FOBJECT | JB_FARRAY, @@ -97,8 +98,9 @@ jsonb_exists_all(PG_FUNCTION_ARGS) continue; strVal.type = jbvString; - strVal.val.string.val = VARDATA(key_datums[i]); - strVal.val.string.len = VARSIZE(key_datums[i]) - VARHDRSZ; + /* We rely on the array elements not being toasted */ + strVal.val.string.val = VARDATA_ANY(key_datums[i]); + strVal.val.string.len = VARSIZE_ANY_EXHDR(key_datums[i]); if (findJsonbValueFromContainer(&jb->root, JB_FOBJECT | JB_FARRAY, diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c index 0364765dc0e..b342c81f27b 100644 --- a/src/backend/utils/adt/jsonfuncs.c +++ b/src/backend/utils/adt/jsonfuncs.c @@ -516,6 +516,12 @@ pg_parse_json_or_ereport(JsonLexContext *lex, JsonSemAction *sem) JsonLexContext * makeJsonLexContext(text *json, bool need_escapes) { + /* + * Most callers pass a detoasted datum, but it's not clear that they all + * do. pg_detoast_datum_packed() is cheap insurance. 
+ */ + json = pg_detoast_datum_packed(json); + return makeJsonLexContextCstringLen(VARDATA_ANY(json), VARSIZE_ANY_EXHDR(json), GetDatabaseEncoding(), @@ -1518,9 +1524,11 @@ jsonb_get_element(Jsonb *jb, Datum *path, int npath, bool *isnull, bool as_text) { if (have_object) { + text *subscr = DatumGetTextPP(path[i]); + jbvp = getKeyJsonValueFromContainer(container, - VARDATA(path[i]), - VARSIZE(path[i]) - VARHDRSZ, + VARDATA_ANY(subscr), + VARSIZE_ANY_EXHDR(subscr), NULL); } else if (have_array) @@ -1693,8 +1701,8 @@ push_path(JsonbParseState **st, int level, Datum *path_elems, { /* text, an object is expected */ newkey.type = jbvString; - newkey.val.string.len = VARSIZE_ANY_EXHDR(path_elems[i]); - newkey.val.string.val = VARDATA_ANY(path_elems[i]); + newkey.val.string.val = c; + newkey.val.string.len = strlen(c); (void) pushJsonbValue(st, WJB_BEGIN_OBJECT, NULL); (void) pushJsonbValue(st, WJB_KEY, &newkey); @@ -4461,6 +4469,7 @@ jsonb_delete_array(PG_FUNCTION_ARGS) if (keys_nulls[i]) continue; + /* We rely on the array elements not being toasted */ keyptr = VARDATA_ANY(keys_elems[i]); keylen = VARSIZE_ANY_EXHDR(keys_elems[i]); if (keylen == v.val.string.len && @@ -4985,6 +4994,7 @@ setPathObject(JsonbIterator **it, Datum *path_elems, bool *path_nulls, int path_len, JsonbParseState **st, int level, JsonbValue *newval, uint32 npairs, int op_type) { + text *pathelem = NULL; int i; JsonbValue k, v; @@ -4992,6 +5002,11 @@ setPathObject(JsonbIterator **it, Datum *path_elems, bool *path_nulls, if (level >= path_len || path_nulls[level]) done = true; + else + { + /* The path Datum could be toasted, in which case we must detoast it */ + pathelem = DatumGetTextPP(path_elems[level]); + } /* empty object is a special case for create */ if ((npairs == 0) && (op_type & JB_PATH_CREATE_OR_INSERT) && @@ -5000,8 +5015,8 @@ setPathObject(JsonbIterator **it, Datum *path_elems, bool *path_nulls, JsonbValue newkey; newkey.type = jbvString; - newkey.val.string.len = 
VARSIZE_ANY_EXHDR(path_elems[level]); - newkey.val.string.val = VARDATA_ANY(path_elems[level]); + newkey.val.string.val = VARDATA_ANY(pathelem); + newkey.val.string.len = VARSIZE_ANY_EXHDR(pathelem); (void) pushJsonbValue(st, WJB_KEY, &newkey); (void) pushJsonbValue(st, WJB_VALUE, newval); @@ -5014,8 +5029,8 @@ setPathObject(JsonbIterator **it, Datum *path_elems, bool *path_nulls, Assert(r == WJB_KEY); if (!done && - k.val.string.len == VARSIZE_ANY_EXHDR(path_elems[level]) && - memcmp(k.val.string.val, VARDATA_ANY(path_elems[level]), + k.val.string.len == VARSIZE_ANY_EXHDR(pathelem) && + memcmp(k.val.string.val, VARDATA_ANY(pathelem), k.val.string.len) == 0) { done = true; @@ -5055,8 +5070,8 @@ setPathObject(JsonbIterator **it, Datum *path_elems, bool *path_nulls, JsonbValue newkey; newkey.type = jbvString; - newkey.val.string.len = VARSIZE_ANY_EXHDR(path_elems[level]); - newkey.val.string.val = VARDATA_ANY(path_elems[level]); + newkey.val.string.val = VARDATA_ANY(pathelem); + newkey.val.string.len = VARSIZE_ANY_EXHDR(pathelem); (void) pushJsonbValue(st, WJB_KEY, &newkey); (void) pushJsonbValue(st, WJB_VALUE, newval); @@ -5099,8 +5114,8 @@ setPathObject(JsonbIterator **it, Datum *path_elems, bool *path_nulls, JsonbValue newkey; newkey.type = jbvString; - newkey.val.string.len = VARSIZE_ANY_EXHDR(path_elems[level]); - newkey.val.string.val = VARDATA_ANY(path_elems[level]); + newkey.val.string.val = VARDATA_ANY(pathelem); + newkey.val.string.len = VARSIZE_ANY_EXHDR(pathelem); (void) pushJsonbValue(st, WJB_KEY, &newkey); (void) push_path(st, level, path_elems, path_nulls, @@ -5509,6 +5524,8 @@ transform_jsonb_string_values(Jsonb *jsonb, void *action_state, if ((type == WJB_VALUE || type == WJB_ELEM) && v.type == jbvString) { out = transform_action(action_state, v.val.string.val, v.val.string.len); + /* out is probably not toasted, but let's be sure */ + out = pg_detoast_datum_packed(out); v.val.string.val = VARDATA_ANY(out); v.val.string.len = VARSIZE_ANY_EXHDR(out); 
res = pushJsonbValue(&st, type, type < WJB_BEGIN_ARRAY ? &v : NULL); diff --git a/src/test/regress/expected/jsonb.out b/src/test/regress/expected/jsonb.out index 635016d4aad..aeb92caa06d 100644 --- a/src/test/regress/expected/jsonb.out +++ b/src/test/regress/expected/jsonb.out @@ -5207,6 +5207,40 @@ DETAIL: The path assumes key is a composite object, but it is a scalar value. update test_jsonb_subscript set test_json[0][0] = '1'; ERROR: cannot replace existing key DETAIL: The path assumes key is a composite object, but it is a scalar value. +-- try some things with short-header and toasted subscript values +drop table test_jsonb_subscript; +create temp table test_jsonb_subscript ( + id text, + test_json jsonb +); +insert into test_jsonb_subscript values('foo', '{"foo": "bar"}'); +insert into test_jsonb_subscript + select s, ('{"' || s || '": "bar"}')::jsonb from repeat('xyzzy', 500) s; +select length(id), test_json[id] from test_jsonb_subscript; + length | test_json +--------+----------- + 3 | "bar" + 2500 | "bar" +(2 rows) + +update test_jsonb_subscript set test_json[id] = '"baz"'; +select length(id), test_json[id] from test_jsonb_subscript; + length | test_json +--------+----------- + 3 | "baz" + 2500 | "baz" +(2 rows) + +\x +table test_jsonb_subscript; +-[ RECORD 1 
]-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- +id | foo +test_json | {"foo": "baz"} +-[ RECORD 2 ]--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ +id | 
xyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzy
xyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzy +test_json | {"xyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzz
yxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzyxyzzy": "baz"} + +\x -- jsonb to tsvector select to_tsvector('{"a": "aaa bbb ddd ccc", "b": ["eee fff ggg"], "c": {"d": "hhh iii"}}'::jsonb); to_tsvector diff --git a/src/test/regress/sql/jsonb.sql b/src/test/regress/sql/jsonb.sql index b1ce452a822..c48f9944070 100644 --- a/src/test/regress/sql/jsonb.sql +++ b/src/test/regress/sql/jsonb.sql @@ -1410,6 +1410,24 @@ insert into test_jsonb_subscript values (1, 'null'); update test_jsonb_subscript set test_json[0] = '1'; update test_jsonb_subscript set test_json[0][0] = '1'; +-- try some things with short-header and toasted subscript values + +drop table test_jsonb_subscript; +create temp table test_jsonb_subscript ( + id text, + test_json jsonb +); + +insert into test_jsonb_subscript values('foo', '{"foo": "bar"}'); +insert into test_jsonb_subscript + select s, ('{"' || s || '": "bar"}')::jsonb from repeat('xyzzy', 500) s; +select length(id), test_json[id] from test_jsonb_subscript; +update test_jsonb_subscript set 
test_json[id] = '"baz"'; +select length(id), test_json[id] from test_jsonb_subscript; +\x +table test_jsonb_subscript; +\x + -- jsonb to tsvector select to_tsvector('{"a": "aaa bbb ddd ccc", "b": ["eee fff ggg"], "c": {"d": "hhh iii"}}'::jsonb); From a48c1f6d62140584c29db46ce97ad8cd8db2d2ec Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Tue, 13 Dec 2022 14:23:59 -0500 Subject: [PATCH 40/70] Rethink handling of [Prevent|Is]InTransactionBlock in pipeline mode. Commits f92944137 et al. made IsInTransactionBlock() set the XACT_FLAGS_NEEDIMMEDIATECOMMIT flag before returning "false", on the grounds that that kept its API promises equivalent to those of PreventInTransactionBlock(). This turns out to be a bad idea though, because it allows an ANALYZE in a pipelined series of commands to cause an immediate commit, which is unexpected. Furthermore, if we return "false" then we have another issue, which is that ANALYZE will decide it's allowed to do internal commit-and-start-transaction sequences, thus possibly unexpectedly committing the effects of previous commands in the pipeline. To fix the latter situation, invent another transaction state flag XACT_FLAGS_PIPELINING, which explicitly records the fact that we have executed some extended-protocol command and not yet seen a commit for it. Then, require that flag to not be set before allowing InTransactionBlock() to return "false". Having done that, we can remove its setting of NEEDIMMEDIATECOMMIT without fear of causing problems. This means that the API guarantees of IsInTransactionBlock now diverge from PreventInTransactionBlock, which is mildly annoying, but it seems OK given the very limited usage of IsInTransactionBlock. (In any case, a caller preferring the old behavior could always set NEEDIMMEDIATECOMMIT for itself.) For consistency also require XACT_FLAGS_PIPELINING to not be set in PreventInTransactionBlock. 
This too is meant to prevent commands such as CREATE DATABASE from silently committing previous commands in a pipeline. Per report from Peter Eisentraut. As before, back-patch to all supported branches (which sadly no longer includes v10). Discussion: https://postgr.es/m/65a899dd-aebc-f667-1d0a-abb89ff3abf8@enterprisedb.com --- doc/src/sgml/libpq.sgml | 5 +++-- doc/src/sgml/protocol.sgml | 9 +++++---- src/backend/access/transam/xact.c | 28 ++++++++++++++++++---------- src/backend/tcop/postgres.c | 12 ++++++++++++ src/include/access/xact.h | 7 +++++++ 5 files changed, 45 insertions(+), 16 deletions(-) diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index fd230d9fd0e..efd039381d6 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -5024,10 +5024,11 @@ int PQflush(PGconn *conn); - While the pipeline API was introduced in + While libpq's pipeline API was introduced in PostgreSQL 14, it is a client-side feature which doesn't require special server support and works on any server - that supports the v3 extended query protocol. + that supports the v3 extended query protocol. For more information see + . diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml index cf1fadcda4b..7141f6c277a 100644 --- a/doc/src/sgml/protocol.sgml +++ b/doc/src/sgml/protocol.sgml @@ -1093,9 +1093,10 @@ SELCT 1/0; implicit ROLLBACK if they failed. However, there are a few DDL commands (such as CREATE DATABASE) that cannot be executed inside a transaction block. If one of - these is executed in a pipeline, it will, upon success, force an - immediate commit to preserve database consistency. - A Sync immediately following one of these has no effect except to + these is executed in a pipeline, it will fail unless it is the first + command in the pipeline. Furthermore, upon success it will force an + immediate commit to preserve database consistency. 
Thus a Sync + immediately following one of these commands has no effect except to respond with ReadyForQuery. @@ -1103,7 +1104,7 @@ SELCT 1/0; When using this method, completion of the pipeline must be determined by counting ReadyForQuery messages and waiting for that to reach the number of Syncs sent. Counting command completion responses is - unreliable, since some of the commands may not be executed and thus not + unreliable, since some of the commands may be skipped and thus not produce a completion message. diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c index 7ed0a6ce42a..d6b44fc8412 100644 --- a/src/backend/access/transam/xact.c +++ b/src/backend/access/transam/xact.c @@ -4450,6 +4450,16 @@ PreventInTransactionBlock(bool isTopLevel, const char *stmtType) errmsg("%s cannot run inside a subtransaction", stmtType))); + /* + * inside a pipeline that has started an implicit transaction? + */ + if (MyXactFlags & XACT_FLAGS_PIPELINING) + ereport(ERROR, + (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION), + /* translator: %s represents an SQL statement name */ + errmsg("%s cannot be executed within a pipeline", + stmtType))); + /* * inside a function call? */ @@ -4539,9 +4549,11 @@ CheckTransactionBlock(bool isTopLevel, bool throwError, const char *stmtType) * a transaction block than when running as single commands. ANALYZE is * currently the only example. * - * If this routine returns "false", then the calling statement is - * guaranteed that if it completes without error, its results will be - * committed immediately. + * If this routine returns "false", then the calling statement is allowed + * to perform internal transaction-commit-and-start cycles; there is not a + * risk of messing up any transaction already in progress. (Note that this + * is not the identical guarantee provided by PreventInTransactionBlock, + * since we will not force a post-statement commit.) 
* * isTopLevel: passed down from ProcessUtility to determine whether we are * inside a function. @@ -4559,6 +4571,9 @@ IsInTransactionBlock(bool isTopLevel) if (IsSubTransaction()) return true; + if (MyXactFlags & XACT_FLAGS_PIPELINING) + return true; + if (!isTopLevel) return true; @@ -4566,13 +4581,6 @@ IsInTransactionBlock(bool isTopLevel) CurrentTransactionState->blockState != TBLOCK_STARTED) return true; - /* - * If we tell the caller we're not in a transaction block, then inform - * postgres.c that it had better commit when the statement is done. - * Otherwise our report could be a lie. - */ - MyXactFlags |= XACT_FLAGS_NEEDIMMEDIATECOMMIT; - return false; } diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index 62ded58aafb..37cdcfba46a 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -3092,6 +3092,12 @@ exec_execute_message(const char *portal_name, int64 max_rows) */ CommandCounterIncrement(); + /* + * Set XACT_FLAGS_PIPELINING whenever we complete an Execute + * message without immediately committing the transaction. + */ + MyXactFlags |= XACT_FLAGS_PIPELINING; + /* * Disable statement timeout whenever we complete an Execute * message. The next protocol message will start a fresh timeout. @@ -3107,6 +3113,12 @@ exec_execute_message(const char *portal_name, int64 max_rows) /* Portal run not complete, so send PortalSuspended */ if (whereToSendOutput == DestRemote) pq_putemptymessage('s'); + + /* + * Set XACT_FLAGS_PIPELINING whenever we suspend an Execute message, + * too. + */ + MyXactFlags |= XACT_FLAGS_PIPELINING; } /* diff --git a/src/include/access/xact.h b/src/include/access/xact.h index 25f224b2c2b..dd7687a1246 100644 --- a/src/include/access/xact.h +++ b/src/include/access/xact.h @@ -121,6 +121,13 @@ extern int MyXactFlags; */ #define XACT_FLAGS_NEEDIMMEDIATECOMMIT (1U << 2) +/* + * XACT_FLAGS_PIPELINING - set when we complete an extended-query-protocol + * Execute message. 
This is useful for detecting that an implicit transaction + * block has been created via pipelining. + */ +#define XACT_FLAGS_PIPELINING (1U << 3) + /* * start- and end-of-transaction callbacks for dynamically loaded modules */ From d287d97d66c33641226b2a35d8dc8d272fe5a69a Mon Sep 17 00:00:00 2001 From: David Rowley Date: Fri, 16 Dec 2022 11:40:55 +1300 Subject: [PATCH 41/70] Re-adjust drop-index-concurrently-1 isolation test It seems that drop-index-concurrently-1 has started to forget what it was originally meant to be testing. d2d8a229b, which added incremental sorts changed the expected plan to be an Index Scan plan instead of a Seq Scan plan. This occurred as the primary key index of the table in question provided presorted input and, because that index happened to be the cheapest input path due to enable_seqscan being disabled, the incremental sort changes just added a Sort on top of that. It seems based on the name of the PREPAREd statement that the intention here is that the query produces a seqscan plan. The reason this test has become broken seems to be due to how the test was originally coded. The test was trying to force a seqscan plan by performing some casting to make it so the test_dc index couldn't be used to perform the required filtering. Trying to coax the planner into using a plan which has costed in a disable_cost seems like it's always going to be flakey as small changes in costs are drowned out by the large disable_cost combined with add_path's STD_FUZZ_FACTOR. Here we get rid of the casts that we're using to try to trick the planner into a seqscan and instead toggle enable_seqscan as and when required to get the desired plan. Additionally, rename a few things in the test and add some additional wording to the comments to try and make it more clear in the future what we expect this test to be doing. 
Discussion: https://postgr.es/m/CAApHDvrbDhObhLV+=U_K_-t+2Av2av1aL9d+2j_3AO-XndaviA@mail.gmail.com Backpatch-through: 13, where d2d8a229b changed the expected test output --- .../expected/drop-index-concurrently-1.out | 31 ++++++++++--------- .../expected/drop-index-concurrently-1_2.out | 31 ++++++++++--------- .../specs/drop-index-concurrently-1.spec | 22 +++++++------ 3 files changed, 44 insertions(+), 40 deletions(-) diff --git a/src/test/isolation/expected/drop-index-concurrently-1.out b/src/test/isolation/expected/drop-index-concurrently-1.out index e7b9272f455..c7f24252e34 100644 --- a/src/test/isolation/expected/drop-index-concurrently-1.out +++ b/src/test/isolation/expected/drop-index-concurrently-1.out @@ -1,17 +1,17 @@ Parsed test spec with 3 sessions -starting permutation: noseq chkiso prepi preps begin explaini explains select2 drop insert2 end2 selecti selects end -step noseq: SET enable_seqscan = false; +starting permutation: chkiso prepi preps begin disableseq explaini enableseq explains select2 drop insert2 end2 selecti selects end step chkiso: SELECT (setting in ('read committed','read uncommitted')) AS is_read_committed FROM pg_settings WHERE name = 'default_transaction_isolation'; is_read_committed ----------------- t (1 row) -step prepi: PREPARE getrow_idx AS SELECT * FROM test_dc WHERE data=34 ORDER BY id,data; -step preps: PREPARE getrow_seq AS SELECT * FROM test_dc WHERE data::text=34::text ORDER BY id,data; +step prepi: PREPARE getrow_idxscan AS SELECT * FROM test_dc WHERE data = 34 ORDER BY id,data; +step preps: PREPARE getrow_seqscan AS SELECT * FROM test_dc WHERE data = 34 ORDER BY id,data; step begin: BEGIN; -step explaini: EXPLAIN (COSTS OFF) EXECUTE getrow_idx; +step disableseq: SET enable_seqscan = false; +step explaini: EXPLAIN (COSTS OFF) EXECUTE getrow_idxscan; QUERY PLAN ---------------------------------------------- Sort @@ -21,17 +21,18 @@ Sort Optimizer: Postgres query optimizer (5 rows) -step explains: EXPLAIN (COSTS OFF) 
EXECUTE getrow_seq; -QUERY PLAN ----------------------------------------------- -Sort - Sort Key: id, data - -> Index Scan using test_dc_pkey on test_dc - Filter: ((data)::text = '34'::text) +step enableseq: SET enable_seqscan = true; +step explains: EXPLAIN (COSTS OFF) EXECUTE getrow_seqscan; +QUERY PLAN +--------------------------- +Sort + Sort Key: id + -> Seq Scan on test_dc + Filter: (data = 34) Optimizer: Postgres query optimizer (5 rows) -step select2: SELECT * FROM test_dc WHERE data=34 ORDER BY id,data; +step select2: SELECT * FROM test_dc WHERE data = 34 ORDER BY id,data; id|data --+---- 34| 34 @@ -40,14 +41,14 @@ id|data step drop: DROP INDEX CONCURRENTLY test_dc_data; step insert2: INSERT INTO test_dc(data) SELECT * FROM generate_series(1, 100); step end2: COMMIT; -step selecti: EXECUTE getrow_idx; +step selecti: EXECUTE getrow_idxscan; id|data ---+---- 34| 34 134| 34 (2 rows) -step selects: EXECUTE getrow_seq; +step selects: EXECUTE getrow_seqscan; id|data ---+---- 34| 34 diff --git a/src/test/isolation/expected/drop-index-concurrently-1_2.out b/src/test/isolation/expected/drop-index-concurrently-1_2.out index 04612d3cacc..266b0e4adac 100644 --- a/src/test/isolation/expected/drop-index-concurrently-1_2.out +++ b/src/test/isolation/expected/drop-index-concurrently-1_2.out @@ -1,17 +1,17 @@ Parsed test spec with 3 sessions -starting permutation: noseq chkiso prepi preps begin explaini explains select2 drop insert2 end2 selecti selects end -step noseq: SET enable_seqscan = false; +starting permutation: chkiso prepi preps begin disableseq explaini enableseq explains select2 drop insert2 end2 selecti selects end step chkiso: SELECT (setting in ('read committed','read uncommitted')) AS is_read_committed FROM pg_settings WHERE name = 'default_transaction_isolation'; is_read_committed ----------------- f (1 row) -step prepi: PREPARE getrow_idx AS SELECT * FROM test_dc WHERE data=34 ORDER BY id,data; -step preps: PREPARE getrow_seq AS SELECT * FROM test_dc 
WHERE data::text=34::text ORDER BY id,data; +step prepi: PREPARE getrow_idxscan AS SELECT * FROM test_dc WHERE data = 34 ORDER BY id,data; +step preps: PREPARE getrow_seqscan AS SELECT * FROM test_dc WHERE data = 34 ORDER BY id,data; step begin: BEGIN; -step explaini: EXPLAIN (COSTS OFF) EXECUTE getrow_idx; +step disableseq: SET enable_seqscan = false; +step explaini: EXPLAIN (COSTS OFF) EXECUTE getrow_idxscan; QUERY PLAN ---------------------------------------------- Sort @@ -20,16 +20,17 @@ Sort Index Cond: (data = 34) (4 rows) -step explains: EXPLAIN (COSTS OFF) EXECUTE getrow_seq; -QUERY PLAN ----------------------------------------------- -Sort - Sort Key: id, data - -> Index Scan using test_dc_pkey on test_dc - Filter: ((data)::text = '34'::text) +step enableseq: SET enable_seqscan = true; +step explains: EXPLAIN (COSTS OFF) EXECUTE getrow_seqscan; +QUERY PLAN +--------------------------- +Sort + Sort Key: id + -> Seq Scan on test_dc + Filter: (data = 34) (4 rows) -step select2: SELECT * FROM test_dc WHERE data=34 ORDER BY id,data; +step select2: SELECT * FROM test_dc WHERE data = 34 ORDER BY id,data; id|data --+---- 34| 34 @@ -38,13 +39,13 @@ id|data step drop: DROP INDEX CONCURRENTLY test_dc_data; step insert2: INSERT INTO test_dc(data) SELECT * FROM generate_series(1, 100); step end2: COMMIT; -step selecti: EXECUTE getrow_idx; +step selecti: EXECUTE getrow_idxscan; id|data --+---- 34| 34 (1 row) -step selects: EXECUTE getrow_seq; +step selects: EXECUTE getrow_seqscan; id|data --+---- 34| 34 diff --git a/src/test/isolation/specs/drop-index-concurrently-1.spec b/src/test/isolation/specs/drop-index-concurrently-1.spec index 812b5de2261..a57a02469d8 100644 --- a/src/test/isolation/specs/drop-index-concurrently-1.spec +++ b/src/test/isolation/specs/drop-index-concurrently-1.spec @@ -3,7 +3,8 @@ # This test shows that the concurrent write behaviour works correctly # with the expected output being 2 rows at the READ COMMITTED and READ # UNCOMMITTED transaction 
isolation levels, and 1 row at the other -# transaction isolation levels. +# transaction isolation levels. We ensure this is the case by checking +# the returned rows in an index scan plan and a seq scan plan. # setup { @@ -18,24 +19,25 @@ teardown } session s1 -step noseq { SET enable_seqscan = false; } step chkiso { SELECT (setting in ('read committed','read uncommitted')) AS is_read_committed FROM pg_settings WHERE name = 'default_transaction_isolation'; } -step prepi { PREPARE getrow_idx AS SELECT * FROM test_dc WHERE data=34 ORDER BY id,data; } -step preps { PREPARE getrow_seq AS SELECT * FROM test_dc WHERE data::text=34::text ORDER BY id,data; } +step prepi { PREPARE getrow_idxscan AS SELECT * FROM test_dc WHERE data = 34 ORDER BY id,data; } +step preps { PREPARE getrow_seqscan AS SELECT * FROM test_dc WHERE data = 34 ORDER BY id,data; } step begin { BEGIN; } -step explaini { EXPLAIN (COSTS OFF) EXECUTE getrow_idx; } -step explains { EXPLAIN (COSTS OFF) EXECUTE getrow_seq; } -step selecti { EXECUTE getrow_idx; } -step selects { EXECUTE getrow_seq; } +step disableseq { SET enable_seqscan = false; } +step explaini { EXPLAIN (COSTS OFF) EXECUTE getrow_idxscan; } +step enableseq { SET enable_seqscan = true; } +step explains { EXPLAIN (COSTS OFF) EXECUTE getrow_seqscan; } +step selecti { EXECUTE getrow_idxscan; } +step selects { EXECUTE getrow_seqscan; } step end { COMMIT; } session s2 setup { BEGIN; } -step select2 { SELECT * FROM test_dc WHERE data=34 ORDER BY id,data; } +step select2 { SELECT * FROM test_dc WHERE data = 34 ORDER BY id,data; } step insert2 { INSERT INTO test_dc(data) SELECT * FROM generate_series(1, 100); } step end2 { COMMIT; } session s3 step drop { DROP INDEX CONCURRENTLY test_dc_data; } -permutation noseq chkiso prepi preps begin explaini explains select2 drop insert2 end2 selecti selects end +permutation chkiso prepi preps begin disableseq explaini enableseq explains select2 drop insert2 end2 selecti selects end From 
852ef67871cbd8a999090f4c5ced0cdec8257b75 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Wed, 21 Dec 2022 17:51:50 -0500 Subject: [PATCH 42/70] Fix contrib/seg to be more wary of long input numbers. seg stores the number of significant digits in an input number in a "char" field. If char is signed, and the input is more than 127 digits long, the count can read out as negative causing seg_out() to print garbage (or, if you're really unlucky, even crash). To fix, clamp the digit count to be not more than FLT_DIG. (In theory this loses some information about what the original input was, but it doesn't seem like useful information; it would not survive dump/restore in any case.) Also, in case there are stored values of the seg type containing bad data, add a clamp in seg_out's restore() subroutine. Per bug #17725 from Robins Tharakan. It's been like this forever, so back-patch to all supported branches. Discussion: https://postgr.es/m/17725-0a09313b67fbe86e@postgresql.org --- contrib/seg/expected/seg.out | 7 +++++++ contrib/seg/seg.c | 8 ++++++-- contrib/seg/segparse.y | 22 +++++++++++++++++----- contrib/seg/sql/seg.sql | 3 +++ 4 files changed, 33 insertions(+), 7 deletions(-) diff --git a/contrib/seg/expected/seg.out b/contrib/seg/expected/seg.out index d20b7d60e15..b67e93f9c18 100644 --- a/contrib/seg/expected/seg.out +++ b/contrib/seg/expected/seg.out @@ -256,6 +256,13 @@ SELECT '12.34567890123456'::seg AS seg; 12.3457 (1 row) +-- Same, with a very long input +SELECT '12.3456789012345600000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000'::seg AS seg; + seg +--------- + 12.3457 +(1 row) + -- Numbers with certainty indicators SELECT '~6.5'::seg AS seg; seg diff --git a/contrib/seg/seg.c b/contrib/seg/seg.c index 4a8e2be3290..91b8a796004 100644 --- a/contrib/seg/seg.c +++ b/contrib/seg/seg.c @@ -927,9 +927,13 @@ restore(char *result, float val, int n) /* * Put a cap on the number of significant 
digits to avoid garbage in the - * output and ensure we don't overrun the result buffer. + * output and ensure we don't overrun the result buffer. (n should not be + * negative, but check to protect ourselves against corrupted data.) */ - n = Min(n, FLT_DIG); + if (n <= 0) + n = FLT_DIG; + else + n = Min(n, FLT_DIG); /* remember the sign */ sign = (val < 0 ? 1 : 0); diff --git a/contrib/seg/segparse.y b/contrib/seg/segparse.y index 040cab39041..3115b12ebd4 100644 --- a/contrib/seg/segparse.y +++ b/contrib/seg/segparse.y @@ -3,6 +3,7 @@ #include "postgres.h" +#include #include #include "fmgr.h" @@ -23,6 +24,8 @@ static float seg_atof(const char *value); +static int sig_digits(const char *value); + static char strbuf[25] = { '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', @@ -63,9 +66,9 @@ range: boundary PLUMIN deviation result->lower = $1.val - $3.val; result->upper = $1.val + $3.val; sprintf(strbuf, "%g", result->lower); - result->l_sigd = Max(Min(6, significant_digits(strbuf)), Max($1.sigd, $3.sigd)); + result->l_sigd = Max(sig_digits(strbuf), Max($1.sigd, $3.sigd)); sprintf(strbuf, "%g", result->upper); - result->u_sigd = Max(Min(6, significant_digits(strbuf)), Max($1.sigd, $3.sigd)); + result->u_sigd = Max(sig_digits(strbuf), Max($1.sigd, $3.sigd)); result->l_ext = '\0'; result->u_ext = '\0'; } @@ -122,7 +125,7 @@ boundary: SEGFLOAT float val = seg_atof($1); $$.ext = '\0'; - $$.sigd = significant_digits($1); + $$.sigd = sig_digits($1); $$.val = val; } | EXTENSION SEGFLOAT @@ -131,7 +134,7 @@ boundary: SEGFLOAT float val = seg_atof($2); $$.ext = $1[0]; - $$.sigd = significant_digits($2); + $$.sigd = sig_digits($2); $$.val = val; } ; @@ -142,7 +145,7 @@ deviation: SEGFLOAT float val = seg_atof($1); $$.ext = '\0'; - $$.sigd = significant_digits($1); + $$.sigd = sig_digits($1); $$.val = val; } ; @@ -159,5 +162,14 @@ seg_atof(const char *value) return DatumGetFloat4(datum); } +static int +sig_digits(const char *value) +{ + int n = significant_digits(value); + + /* 
Clamp, to ensure value will fit in sigd fields */ + return Min(n, FLT_DIG); +} + #include "segscan.c" diff --git a/contrib/seg/sql/seg.sql b/contrib/seg/sql/seg.sql index eb7c7138f82..8decf0937bf 100644 --- a/contrib/seg/sql/seg.sql +++ b/contrib/seg/sql/seg.sql @@ -60,6 +60,9 @@ SELECT '3.400e5'::seg AS seg; -- Digits truncated SELECT '12.34567890123456'::seg AS seg; +-- Same, with a very long input +SELECT '12.3456789012345600000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000'::seg AS seg; + -- Numbers with certainty indicators SELECT '~6.5'::seg AS seg; SELECT '<6.5'::seg AS seg; From 66a9673555904a5caf6f52865ee5d22873f9e737 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Thu, 22 Dec 2022 10:35:02 -0500 Subject: [PATCH 43/70] Add some recursion and looping defenses in prepjointree.c. Andrey Lepikhov demonstrated a case where we spend an unreasonable amount of time in pull_up_subqueries(). Not only is that recursing with no explicit check for stack overrun, but the code seems not interruptable by control-C. Let's stick a CHECK_FOR_INTERRUPTS there, along with sprinkling some stack depth checks. An actual fix for the excessive time consumption seems a bit risky to back-patch; but this isn't, so let's do so. 
Discussion: https://postgr.es/m/703c09a2-08f3-d2ec-b33d-dbecd62428b8@postgrespro.ru --- src/backend/optimizer/prep/prepjointree.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c index eb4450a3632..9d245c31e44 100644 --- a/src/backend/optimizer/prep/prepjointree.c +++ b/src/backend/optimizer/prep/prepjointree.c @@ -29,6 +29,7 @@ #include "catalog/pg_type.h" #include "funcapi.h" +#include "miscadmin.h" #include "nodes/makefuncs.h" #include "nodes/nodeFuncs.h" #include "optimizer/clauses.h" @@ -243,6 +244,9 @@ static Node * pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode, Relids *relids) { + /* Since this function recurses, it could be driven to stack overflow. */ + check_stack_depth(); + if (jtnode == NULL) { *relids = NULL; @@ -900,6 +904,11 @@ pull_up_subqueries_recurse(PlannerInfo *root, Node *jtnode, JoinExpr *lowest_nulling_outer_join, AppendRelInfo *containing_appendrel) { + /* Since this function recurses, it could be driven to stack overflow. */ + check_stack_depth(); + /* Also, since it's a bit expensive, let's check for query cancel. */ + CHECK_FOR_INTERRUPTS(); + Assert(jtnode != NULL); if (IsA(jtnode, RangeTblRef)) { @@ -2057,6 +2066,9 @@ is_simple_union_all(Query *subquery) static bool is_simple_union_all_recurse(Node *setOp, Query *setOpQuery, List *colTypes) { + /* Since this function recurses, it could be driven to stack overflow. 
*/ + check_stack_depth(); + if (IsA(setOp, RangeTblRef)) { RangeTblRef *rtr = (RangeTblRef *) setOp; From 0af45634c62e1a630e136d8898885ff8fe9fc087 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Fri, 23 Dec 2022 10:04:33 +0900 Subject: [PATCH 44/70] Fix some incorrect elog() messages in aclchk.c Three error strings used with cache lookup failures were referring to incorrect object types for ACL checks: - Schemas - Types - Foreign Servers These errors should never be triggered, but if they were, incorrect information would be reported. Author: Justin Pryzby Discussion: https://postgr.es/m/20221222153041.GN1153@telsasoft.com Backpatch-through: 11 --- src/backend/catalog/aclchk.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c index 219f64f5298..44011306df2 100644 --- a/src/backend/catalog/aclchk.c +++ b/src/backend/catalog/aclchk.c @@ -6402,7 +6402,7 @@ recordExtObjInitPriv(Oid objoid, Oid classoid) tuple = SearchSysCache1(FOREIGNSERVEROID, ObjectIdGetDatum(objoid)); if (!HeapTupleIsValid(tuple)) - elog(ERROR, "cache lookup failed for foreign data wrapper %u", + elog(ERROR, "cache lookup failed for foreign server %u", objoid); aclDatum = SysCacheGetAttr(FOREIGNSERVEROID, tuple, @@ -6488,7 +6488,7 @@ recordExtObjInitPriv(Oid objoid, Oid classoid) tuple = SearchSysCache1(NAMESPACEOID, ObjectIdGetDatum(objoid)); if (!HeapTupleIsValid(tuple)) - elog(ERROR, "cache lookup failed for function %u", objoid); + elog(ERROR, "cache lookup failed for schema %u", objoid); aclDatum = SysCacheGetAttr(NAMESPACEOID, tuple, Anum_pg_namespace_nspacl, &isNull); @@ -6530,7 +6530,7 @@ recordExtObjInitPriv(Oid objoid, Oid classoid) tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(objoid)); if (!HeapTupleIsValid(tuple)) - elog(ERROR, "cache lookup failed for function %u", objoid); + elog(ERROR, "cache lookup failed for type %u", objoid); aclDatum = SysCacheGetAttr(TYPEOID, tuple, Anum_pg_type_typacl, 
&isNull); From 46fff788bb1cbc120c81031edddfde50c4d28ec9 Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Fri, 23 Dec 2022 13:21:41 +0100 Subject: [PATCH 45/70] Fix event trigger example Commit 2f9661311b changed command tags from strings to numbers, but forgot to adjust the code in the event trigger example, which consequently failed to compile. While fixing that, improve the indentation to adhere to pgindent style. Backpatch to v13, where the change was introduced. Author: Laurenz Albe Discussion: https://postgr.es/m/81e36ac17dc80489e74dc5b6914afa6ccdb1a99d.camel@cybertec.at --- doc/src/sgml/event-trigger.sgml | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/doc/src/sgml/event-trigger.sgml b/doc/src/sgml/event-trigger.sgml index 9c66f97b0f6..a76e8ac09be 100644 --- a/doc/src/sgml/event-trigger.sgml +++ b/doc/src/sgml/event-trigger.sgml @@ -1194,8 +1194,9 @@ noddl(PG_FUNCTION_ARGS) trigdata = (EventTriggerData *) fcinfo->context; ereport(ERROR, - (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), - errmsg("command \"%s\" denied", trigdata->tag))); + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + errmsg("command \"%s\" denied", + GetCommandTagName(trigdata->tag)))); PG_RETURN_NULL(); } From a18396e5635cc4919381de7187f2777e0c8e4c00 Mon Sep 17 00:00:00 2001 From: Tomas Vondra Date: Fri, 30 Dec 2022 19:44:48 +0100 Subject: [PATCH 46/70] Fix assert in BRIN build_distances When brin_minmax_multi_union merges summaries, we may end up with just a single range after merge_overlapping_ranges. The summaries may contain just one range each, and they may overlap (or be exactly the same). With a single range there's no distance to calculate, but we happen to call build_distances anyway - which is fine, we don't calculate the distance in this case, except that with asserts this failed due to a check that there are at least two ranges. The assert is unnecessarily strict, so relax it a bit and bail out if there's just a single range. 
The relaxed assert would be enough, but this way we don't allocate unnecessary memory for distance. Backpatch to 14, where minmax-multi opclasses were introduced. Reported-by: Jaime Casanova Backpatch-through: 14 Discussion: https://postgr.es/m/YzVA55qS0hgz8P3r@ahch-to --- src/backend/access/brin/brin_minmax_multi.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/backend/access/brin/brin_minmax_multi.c b/src/backend/access/brin/brin_minmax_multi.c index 1e4afafe622..b343226a191 100644 --- a/src/backend/access/brin/brin_minmax_multi.c +++ b/src/backend/access/brin/brin_minmax_multi.c @@ -1335,7 +1335,11 @@ build_distances(FmgrInfo *distanceFn, Oid colloid, int ndistances; DistanceValue *distances; - Assert(neranges >= 2); + Assert(neranges > 0); + + /* If there's only a single range, there's no distance to calculate. */ + if (neranges == 1) + return NULL; ndistances = (neranges - 1); distances = (DistanceValue *) palloc0(sizeof(DistanceValue) * ndistances); From 6bb4abe3c00953cb5c58e6612162d464e061e8f6 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Mon, 2 Jan 2023 16:17:00 -0500 Subject: [PATCH 47/70] Avoid reference to nonexistent array element in ExecInitAgg(). When considering an empty grouping set, we fetched phasedata->eqfunctions[-1]. Because the eqfunctions array is palloc'd, that would always be an aset pointer in released versions, and thus the code accidentally failed to malfunction (since it would do nothing unless it found a null pointer). Nonetheless this seems like trouble waiting to happen, so add a check for length == 0. It's depressing that our valgrind testing did not catch this. Maybe we should reconsider the choice to not mark that word NOACCESS? 
Richard Guo Discussion: https://postgr.es/m/CAMbWs4-vZuuPOZsKOYnSAaPYGKhmacxhki+vpOKk0O7rymccXQ@mail.gmail.com --- src/backend/executor/nodeAgg.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c index 7513d11102f..09c8991178a 100644 --- a/src/backend/executor/nodeAgg.c +++ b/src/backend/executor/nodeAgg.c @@ -3867,6 +3867,11 @@ ExecInitAgg(Agg *node, EState *estate, int eflags) { int length = phasedata->gset_lengths[i]; + /* nothing to do for empty grouping set */ + if (length == 0) + continue; + + /* if we already had one of this length, it'll do */ if (phasedata->eqfunctions[length - 1] != NULL) continue; From 834d9cbf52bcda9cf32b5dbf2b329b7079d0ee37 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Tue, 3 Jan 2023 16:26:30 +0900 Subject: [PATCH 48/70] Fix typos in comments, code and documentation While on it, newlines are removed from the end of two elog() strings. The others are simple grammar mistakes. One comment in pg_upgrade referred incorrectly to sequences since a7e5457. 
Author: Justin Pryzby Discussion: https://postgr.es/m/20221230231257.GI1153@telsasoft.com Backpatch-through: 11 --- doc/src/sgml/parallel.sgml | 2 +- doc/src/sgml/ref/alter_table.sgml | 2 +- doc/src/sgml/sources.sgml | 4 ++-- src/backend/access/common/bufmask.c | 2 +- src/backend/access/spgist/spgutils.c | 2 +- src/backend/jit/llvm/llvmjit.c | 2 +- src/backend/optimizer/util/tlist.c | 2 +- src/backend/utils/adt/ruleutils.c | 2 +- src/bin/pg_upgrade/info.c | 9 ++++----- src/test/regress/expected/expressions.out | 2 +- src/test/regress/sql/expressions.sql | 2 +- 11 files changed, 15 insertions(+), 16 deletions(-) diff --git a/doc/src/sgml/parallel.sgml b/doc/src/sgml/parallel.sgml index 3e8326d46c8..2b2c8cbffd6 100644 --- a/doc/src/sgml/parallel.sgml +++ b/doc/src/sgml/parallel.sgml @@ -128,7 +128,7 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%'; In addition, the system must not be running in single-user mode. Since - the entire database system is running in single process in this situation, + the entire database system is running as a single process in this situation, no background workers will be available. diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml index cdcfbdb5b6b..ba7a077ae10 100644 --- a/doc/src/sgml/ref/alter_table.sgml +++ b/doc/src/sgml/ref/alter_table.sgml @@ -1050,7 +1050,7 @@ Where column_reference_storage_directive is: constraint. This does not work, however, if any of the partition keys is an expression and the partition does not accept NULL values. If attaching a list partition that will - not accept NULL values, also add + not accept NULL values, also add a NOT NULL constraint to the partition key column, unless it's an expression. 
diff --git a/doc/src/sgml/sources.sgml b/doc/src/sgml/sources.sgml index e6ae02f2af7..4ee99853b23 100644 --- a/doc/src/sgml/sources.sgml +++ b/doc/src/sgml/sources.sgml @@ -929,8 +929,8 @@ BETTER: unrecognized node type: 42 Function-Like Macros and Inline Functions - Both, macros with arguments and static inline - functions, may be used. The latter are preferable if there are + Both macros with arguments and static inline + functions may be used. The latter are preferable if there are multiple-evaluation hazards when written as a macro, as e.g., the case with diff --git a/src/backend/access/common/bufmask.c b/src/backend/access/common/bufmask.c index 003a0befb25..409acecf42a 100644 --- a/src/backend/access/common/bufmask.c +++ b/src/backend/access/common/bufmask.c @@ -78,7 +78,7 @@ mask_unused_space(Page page) if (pd_lower > pd_upper || pd_special < pd_upper || pd_lower < SizeOfPageHeaderData || pd_special > BLCKSZ) { - elog(ERROR, "invalid page pd_lower %u pd_upper %u pd_special %u\n", + elog(ERROR, "invalid page pd_lower %u pd_upper %u pd_special %u", pd_lower, pd_upper, pd_special); } diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c index e3e663a006e..44e7f7a1ba2 100644 --- a/src/backend/access/spgist/spgutils.c +++ b/src/backend/access/spgist/spgutils.c @@ -1301,7 +1301,7 @@ spgproperty(Oid index_oid, int attno, /* * Currently, SP-GiST distance-ordered scans require that there be a * distance operator in the opclass with the default types. So we assume - * that if such a operator exists, then there's a reason for it. + * that if such an operator exists, then there's a reason for it. */ /* First we need to know the column's opclass. 
*/ diff --git a/src/backend/jit/llvm/llvmjit.c b/src/backend/jit/llvm/llvmjit.c index 199fff4f773..e8eb83ebd38 100644 --- a/src/backend/jit/llvm/llvmjit.c +++ b/src/backend/jit/llvm/llvmjit.c @@ -816,7 +816,7 @@ llvm_session_initialize(void) if (LLVMGetTargetFromTriple(llvm_triple, &llvm_targetref, &error) != 0) { - elog(FATAL, "failed to query triple %s\n", error); + elog(FATAL, "failed to query triple %s", error); } /* diff --git a/src/backend/optimizer/util/tlist.c b/src/backend/optimizer/util/tlist.c index bb16a8d2c38..cfe9d3ea39d 100644 --- a/src/backend/optimizer/util/tlist.c +++ b/src/backend/optimizer/util/tlist.c @@ -1108,7 +1108,7 @@ apply_pathtarget_labeling_to_tlist(List *tlist, PathTarget *target) * * The outputs of this function are two parallel lists, one a list of * PathTargets and the other an integer list of bool flags indicating - * whether the corresponding PathTarget contains any evaluatable SRFs. + * whether the corresponding PathTarget contains any evaluable SRFs. * The lists are given in the order they'd need to be evaluated in, with * the "lowest" PathTarget first. So the last list entry is always the * originally given PathTarget, and any entries before it indicate evaluation diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c index 95ae6c76018..bf663afb3c5 100644 --- a/src/backend/utils/adt/ruleutils.c +++ b/src/backend/utils/adt/ruleutils.c @@ -11787,7 +11787,7 @@ get_opclass_name_for_distribution_key(Oid opclass, Oid actual_datatype, /* * generate_opclass_name - * Compute the name to display for a opclass specified by OID + * Compute the name to display for an opclass specified by OID * * The result includes all necessary quoting and schema-prefixing. 
*/ diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c index da2f3cb9e38..a1e96686793 100644 --- a/src/bin/pg_upgrade/info.c +++ b/src/bin/pg_upgrade/info.c @@ -462,11 +462,10 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo) query[0] = '\0'; /* initialize query string to empty */ /* - * Create a CTE that collects OIDs of regular user tables, including - * matviews and sequences, but excluding toast tables and indexes. We - * assume that relations with OIDs >= FirstNormalObjectId belong to the - * user. (That's probably redundant with the namespace-name exclusions, - * but let's be safe.) + * Create a CTE that collects OIDs of regular user tables and matviews, + * but excluding toast tables and indexes. We assume that relations with + * OIDs >= FirstNormalObjectId belong to the user. (That's probably + * redundant with the namespace-name exclusions, but let's be safe.) * * pg_largeobject contains user data that does not appear in pg_dump * output, so we have to copy that system table. It's easiest to do that diff --git a/src/test/regress/expected/expressions.out b/src/test/regress/expected/expressions.out index 4168cd55af6..56b8c9c6189 100644 --- a/src/test/regress/expected/expressions.out +++ b/src/test/regress/expected/expressions.out @@ -57,7 +57,7 @@ SELECT now()::timestamp::text = localtimestamp::text; t (1 row) --- current_role/user/user is tested in rolnames.sql +-- current_role/user/user is tested in rolenames.sql -- current database / catalog SELECT current_catalog = current_database(); ?column? 
diff --git a/src/test/regress/sql/expressions.sql b/src/test/regress/sql/expressions.sql index f9f9f97efa4..f87751f4ef5 100644 --- a/src/test/regress/sql/expressions.sql +++ b/src/test/regress/sql/expressions.sql @@ -24,7 +24,7 @@ SELECT length(current_timestamp::text) >= length(current_timestamp(0)::text); -- localtimestamp SELECT now()::timestamp::text = localtimestamp::text; --- current_role/user/user is tested in rolnames.sql +-- current_role/user/user is tested in rolenames.sql -- current database / catalog SELECT current_catalog = current_database(); From e1ba739674034fa36b9d7497521a3ed6a1710e87 Mon Sep 17 00:00:00 2001 From: Robert Haas Date: Tue, 3 Jan 2023 14:50:40 -0500 Subject: [PATCH 49/70] Improve documentation of the CREATEROLE attribute. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In user-manag.sgml, document precisely what privileges are conveyed by CREATEROLE. Make particular note of the fact that it allows changing passwords and granting access to high-privilege roles. Also remove the suggestion of using a user with CREATEROLE and CREATEDB instead of a superuser, as there is no real security advantage to this approach. Elsewhere in the documentation, adjust text that suggests that CREATEROLE only allows for role creation, and refer to the documentation in user-manag.sgml as appropriate. 
Patch by me, reviewed by Álvaro Herrera Discussion: http://postgr.es/m/CA+TgmoZBsPL8nPhvYecx7iGo5qpDRqa9k_AcaW1SbOjugAY1Ag@mail.gmail.com --- doc/src/sgml/ref/alter_role.sgml | 2 +- doc/src/sgml/ref/create_role.sgml | 10 +++---- doc/src/sgml/ref/createuser.sgml | 18 ++++++++---- doc/src/sgml/user-manag.sgml | 47 ++++++++++++++++++++++--------- 4 files changed, 52 insertions(+), 25 deletions(-) diff --git a/doc/src/sgml/ref/alter_role.sgml b/doc/src/sgml/ref/alter_role.sgml index c9047374eff..af2c1bec08e 100644 --- a/doc/src/sgml/ref/alter_role.sgml +++ b/doc/src/sgml/ref/alter_role.sgml @@ -341,7 +341,7 @@ ALTER ROLE fred VALID UNTIL 'infinity'; - Give a role the ability to create other roles and new databases: + Give a role the ability to manage other roles and create new databases: ALTER ROLE miriam CREATEROLE CREATEDB; diff --git a/doc/src/sgml/ref/create_role.sgml b/doc/src/sgml/ref/create_role.sgml index d6e06939a3a..8cd66949832 100644 --- a/doc/src/sgml/ref/create_role.sgml +++ b/doc/src/sgml/ref/create_role.sgml @@ -126,11 +126,11 @@ in sync when changing the above synopsis! These clauses determine whether a role will be permitted to - create new roles (that is, execute CREATE ROLE). - A role with CREATEROLE privilege can also alter - and drop other roles. - If not specified, - NOCREATEROLE is the default. + create, alter, drop, comment on, change the security label for, + and grant or revoke membership in other roles. + See for more details about what + capabilities are conferred by this privilege. + If not specified, NOCREATEROLE is the default. diff --git a/doc/src/sgml/ref/createuser.sgml b/doc/src/sgml/ref/createuser.sgml index 17579e50afb..0e1a39a3fe6 100644 --- a/doc/src/sgml/ref/createuser.sgml +++ b/doc/src/sgml/ref/createuser.sgml @@ -41,10 +41,14 @@ PostgreSQL documentation - If you wish to create a new superuser, you must connect as a - superuser, not merely with CREATEROLE privilege. 
+ If you wish to create a role with the SUPERUSER, + REPLICATION, or BYPASSRLS privilege, + you must connect as a superuser, not merely with + CREATEROLE privilege. Being a superuser implies the ability to bypass all access permission - checks within the database, so superuser access should not be granted lightly. + checks within the database, so superuser access should not be granted + lightly. CREATEROLE also conveys + very extensive privileges. @@ -221,8 +225,12 @@ PostgreSQL documentation - The new user will be allowed to create new roles (that is, - this user will have CREATEROLE privilege). + The new user will be allowed to create, alter, drop, comment on, + change the security label for, and grant or revoke membership in + other roles; that is, + this user will have CREATEROLE privilege. + See for more details about what + capabilities are conferred by this privilege. diff --git a/doc/src/sgml/user-manag.sgml b/doc/src/sgml/user-manag.sgml index 5420567896d..41b014205f7 100644 --- a/doc/src/sgml/user-manag.sgml +++ b/doc/src/sgml/user-manag.sgml @@ -191,7 +191,7 @@ CREATE USER name; - role creationroleprivilege to create + role creationroleprivilege to create A role must be explicitly given permission to create more roles @@ -200,9 +200,38 @@ CREATE USER name; name CREATEROLE. A role with CREATEROLE privilege can alter and drop other roles, too, as well as grant or revoke membership in them. - However, to create, alter, drop, or change membership of a - superuser role, superuser status is required; - CREATEROLE is insufficient for that. + Altering a role includes most changes that can be made using + ALTER ROLE, including, for example, changing + passwords. It also includes modifications to a role that can + be made using the COMMENT and + SECURITY LABEL commands. + + + However, CREATEROLE does not convey the ability to + create SUPERUSER roles, nor does it convey any + power over SUPERUSER roles that already exist. 
+ Furthermore, CREATEROLE does not convey the power + to create REPLICATION users, nor the ability to + grant or revoke the REPLICATION privilege, nor the + ability to modify the role properties of such users. However, it does + allow ALTER ROLE ... SET and + ALTER ROLE ... RENAME to be used on + REPLICATION roles, as well as the use of + COMMENT ON ROLE, + SECURITY LABEL ON ROLE, + and DROP ROLE. + Finally, CREATEROLE does not + confer the ability to grant or revoke the BYPASSRLS + privilege. + + + Because the CREATEROLE privilege allows a user + to grant or revoke membership even in roles to which it does not (yet) + have any access, a CREATEROLE user can obtain access + to the capabilities of every predefined role in the system, including + highly privileged roles such as + pg_execute_server_program and + pg_write_server_files. @@ -277,16 +306,6 @@ CREATE USER name; and commands for details. - - - It is good practice to create a role that has the CREATEDB - and CREATEROLE privileges, but is not a superuser, and then - use this role for all routine management of databases and roles. This - approach avoids the dangers of operating as a superuser for tasks that - do not really require it. - - - A role can also have role-specific defaults for many of the run-time configuration settings described in Date: Fri, 6 Jan 2023 16:38:46 +1300 Subject: [PATCH 50/70] Fix pg_truncate() on Windows. Commit 57faaf376 added pg_truncate(const char *path, off_t length), but "length" was ignored under WIN32 and the file was unconditionally truncated to 0. There was no live bug, since the only caller passes 0. Fix, and back-patch to 14 where the function arrived. 
Author: Justin Pryzby Discussion: https://postgr.es/m/20230106031652.GR3109%40telsasoft.com --- src/backend/storage/file/fd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c index e2953686b8e..f7f5de69ffb 100644 --- a/src/backend/storage/file/fd.c +++ b/src/backend/storage/file/fd.c @@ -668,7 +668,7 @@ pg_truncate(const char *path, off_t length) fd = OpenTransientFile(path, O_RDWR | PG_BINARY); if (fd >= 0) { - ret = ftruncate(fd, 0); + ret = ftruncate(fd, length); save_errno = errno; CloseTransientFile(fd); errno = save_errno; From 18709f3f88b7c2fe571aef6c5313be9f61164a80 Mon Sep 17 00:00:00 2001 From: Dean Rasheed Date: Fri, 6 Jan 2023 11:15:22 +0000 Subject: [PATCH 51/70] Fix tab completion of ALTER FUNCTION/PROCEDURE/ROUTINE ... SET SCHEMA. The ALTER DATABASE|FUNCTION|PROCEDURE|ROLE|ROUTINE|USER ... SET case in psql tab completion failed to exclude = "SCHEMA", which caused ALTER FUNCTION|PROCEDURE|ROUTINE ... SET SCHEMA to complete with "FROM CURRENT" and "TO", which won't work. Fix that, so that those cases now complete with the list of schemas, like other ALTER ... SET SCHEMA commands. Noticed while testing the recent patch to improve tab completion for ALTER FUNCTION/PROCEDURE/ROUTINE, but this is not directly related to that patch. Rather, this is a long-standing bug, so back-patch to all supported branches. Discussion: https://postgr.es/m/CALDaNm0s7GQmkLP_mx5Cvk=UzYMnjhPmXBxU8DsHEunFbC5sTg@mail.gmail.com --- src/bin/psql/tab-complete.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c index 3593b78577d..1bdb618da93 100644 --- a/src/bin/psql/tab-complete.c +++ b/src/bin/psql/tab-complete.c @@ -4023,11 +4023,12 @@ psql_completion(const char *text, int start, int end) COMPLETE_WITH("TO"); /* - * Complete ALTER DATABASE|FUNCTION||PROCEDURE|ROLE|ROUTINE|USER ... 
SET + * Complete ALTER DATABASE|FUNCTION|PROCEDURE|ROLE|ROUTINE|USER ... SET * */ else if (HeadMatches("ALTER", "DATABASE|FUNCTION|PROCEDURE|ROLE|ROUTINE|USER") && - TailMatches("SET", MatchAny)) + TailMatches("SET", MatchAny) && + !TailMatches("SCHEMA")) COMPLETE_WITH("FROM CURRENT", "TO"); /* From 6bc8292ff7e0c0aa5b13db7a77d1d7a4997cc799 Mon Sep 17 00:00:00 2001 From: Amit Kapila Date: Sat, 7 Jan 2023 11:52:41 +0530 Subject: [PATCH 52/70] Remove the streaming files for incomplete xacts after restart. After restart, we try to stream the changes for large transactions that were not sent before server crash and restart. However, we forget to send the abort message for such transactions. This leads to spurious streaming files on the subscriber which won't be cleaned till the apply worker or the subscriber server restarts. Reported-by: Dilip Kumar Author: Hou Zhijie Reviewed-by: Dilip Kumar and Amit Kapila Backpatch-through: 14 Discussion: https://postgr.es/m/OS0PR01MB5716A773F46768A1B75BE24394FB9@OS0PR01MB5716.jpnprd01.prod.outlook.com --- src/backend/replication/logical/reorderbuffer.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c index 1d1649a22a8..709365fc8c6 100644 --- a/src/backend/replication/logical/reorderbuffer.c +++ b/src/backend/replication/logical/reorderbuffer.c @@ -2898,6 +2898,10 @@ ReorderBufferAbortOld(ReorderBuffer *rb, TransactionId oldestRunningXid) { elog(DEBUG2, "aborting old transaction %u", txn->xid); + /* Notify the remote node about the crash/immediate restart. 
*/ + if (rbtxn_is_streamed(txn)) + rb->stream_abort(rb, txn, InvalidXLogRecPtr); + /* remove potential on-disk data, and deallocate this tx */ ReorderBufferCleanupTXN(rb, txn); } From a8cddbd803c3dece862cdb2ec060131cf7574b81 Mon Sep 17 00:00:00 2001 From: Alexander Korotkov Date: Thu, 12 Jan 2023 18:16:34 +0300 Subject: [PATCH 53/70] Fix jsonpath existence checking of missing variables The current jsonpath code assumes that the referenced variable always exists. It could only throw an error at value evaluation time. At the same time, existence checking assumes the variable is present without evaluation, and error suppression doesn't work for missing variables. This commit makes existence checking trigger an error for missing variables. This makes the overall behavior consistent. Backpatch to 12 where jsonpath was introduced. Reported-by: David G. Johnston Discussion: https://postgr.es/m/CAKFQuwbeytffJkVnEqDyLZ%3DrQsznoTh1OgDoOF3VmOMkxcTMjA%40mail.gmail.com Author: Alexander Korotkov, David G. Johnston Backpatch-through: 12 --- src/backend/utils/adt/jsonpath_exec.c | 8 +++-- src/test/regress/expected/jsonb_jsonpath.out | 32 ++++++++++++++++++++ src/test/regress/sql/jsonb_jsonpath.sql | 8 +++++ 3 files changed, 46 insertions(+), 2 deletions(-) diff --git a/src/backend/utils/adt/jsonpath_exec.c b/src/backend/utils/adt/jsonpath_exec.c index 078aaef5392..cd2ac04d89e 100644 --- a/src/backend/utils/adt/jsonpath_exec.c +++ b/src/backend/utils/adt/jsonpath_exec.c @@ -958,9 +958,13 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp, JsonbValue *v; bool hasNext = jspGetNext(jsp, &elem); - if (!hasNext && !found) + if (!hasNext && !found && jsp->type != jpiVariable) { - res = jperOk; /* skip evaluation */ + /* + * Skip evaluation, but not for variables. We must + * trigger an error for the missing variable. 
+ */ + res = jperOk; break; } diff --git a/src/test/regress/expected/jsonb_jsonpath.out b/src/test/regress/expected/jsonb_jsonpath.out index 508ddd797ed..328a6b39199 100644 --- a/src/test/regress/expected/jsonb_jsonpath.out +++ b/src/test/regress/expected/jsonb_jsonpath.out @@ -2212,6 +2212,14 @@ SELECT jsonb_path_query('[{"a": 1}, {"a": 2}]', '$[*] ? (@.a > 10)'); ------------------ (0 rows) +SELECT jsonb_path_query('[{"a": 1}]', '$undefined_var'); +ERROR: could not find jsonpath variable "undefined_var" +SELECT jsonb_path_query('[{"a": 1}]', 'false'); + jsonb_path_query +------------------ + false +(1 row) + SELECT jsonb_path_query_array('[{"a": 1}, {"a": 2}, {}]', 'strict $[*].a'); ERROR: JSON object does not contain key "a" SELECT jsonb_path_query_array('[{"a": 1}, {"a": 2}]', '$[*].a'); @@ -2282,6 +2290,14 @@ SELECT jsonb_path_query_first('[{"a": 1}, {"a": 2}, {"a": 3}, {"a": 5}]', '$[*]. (1 row) +SELECT jsonb_path_query_first('[{"a": 1}]', '$undefined_var'); +ERROR: could not find jsonpath variable "undefined_var" +SELECT jsonb_path_query_first('[{"a": 1}]', 'false'); + jsonb_path_query_first +------------------------ + false +(1 row) + SELECT jsonb '[{"a": 1}, {"a": 2}]' @? '$[*].a ? (@ > 1)'; ?column? ---------- @@ -2312,6 +2328,14 @@ SELECT jsonb_path_exists('[{"a": 1}, {"a": 2}, {"a": 3}, {"a": 5}]', '$[*] ? (@. 
f (1 row) +SELECT jsonb_path_exists('[{"a": 1}]', '$undefined_var'); +ERROR: could not find jsonpath variable "undefined_var" +SELECT jsonb_path_exists('[{"a": 1}]', 'false'); + jsonb_path_exists +------------------- + t +(1 row) + SELECT jsonb_path_match('true', '$', silent => false); jsonb_path_match ------------------ @@ -2374,6 +2398,14 @@ SELECT jsonb_path_match('[{"a": 1}, {"a": 2}]', '$[*].a > 1'); t (1 row) +SELECT jsonb_path_match('[{"a": 1}]', '$undefined_var'); +ERROR: could not find jsonpath variable "undefined_var" +SELECT jsonb_path_match('[{"a": 1}]', 'false'); + jsonb_path_match +------------------ + f +(1 row) + -- test string comparison (Unicode codepoint collation) WITH str(j, num) AS ( diff --git a/src/test/regress/sql/jsonb_jsonpath.sql b/src/test/regress/sql/jsonb_jsonpath.sql index 60f73cb0590..bd025077d52 100644 --- a/src/test/regress/sql/jsonb_jsonpath.sql +++ b/src/test/regress/sql/jsonb_jsonpath.sql @@ -532,6 +532,8 @@ set time zone default; SELECT jsonb_path_query('[{"a": 1}, {"a": 2}]', '$[*]'); SELECT jsonb_path_query('[{"a": 1}, {"a": 2}]', '$[*] ? (@.a > 10)'); +SELECT jsonb_path_query('[{"a": 1}]', '$undefined_var'); +SELECT jsonb_path_query('[{"a": 1}]', 'false'); SELECT jsonb_path_query_array('[{"a": 1}, {"a": 2}, {}]', 'strict $[*].a'); SELECT jsonb_path_query_array('[{"a": 1}, {"a": 2}]', '$[*].a'); @@ -547,12 +549,16 @@ SELECT jsonb_path_query_first('[{"a": 1}, {"a": 2}]', '$[*].a ? (@ == 1)'); SELECT jsonb_path_query_first('[{"a": 1}, {"a": 2}]', '$[*].a ? (@ > 10)'); SELECT jsonb_path_query_first('[{"a": 1}, {"a": 2}, {"a": 3}, {"a": 5}]', '$[*].a ? (@ > $min && @ < $max)', vars => '{"min": 1, "max": 4}'); SELECT jsonb_path_query_first('[{"a": 1}, {"a": 2}, {"a": 3}, {"a": 5}]', '$[*].a ? (@ > $min && @ < $max)', vars => '{"min": 3, "max": 4}'); +SELECT jsonb_path_query_first('[{"a": 1}]', '$undefined_var'); +SELECT jsonb_path_query_first('[{"a": 1}]', 'false'); SELECT jsonb '[{"a": 1}, {"a": 2}]' @? '$[*].a ? 
(@ > 1)'; SELECT jsonb '[{"a": 1}, {"a": 2}]' @? '$[*] ? (@.a > 2)'; SELECT jsonb_path_exists('[{"a": 1}, {"a": 2}]', '$[*].a ? (@ > 1)'); SELECT jsonb_path_exists('[{"a": 1}, {"a": 2}, {"a": 3}, {"a": 5}]', '$[*] ? (@.a > $min && @.a < $max)', vars => '{"min": 1, "max": 4}'); SELECT jsonb_path_exists('[{"a": 1}, {"a": 2}, {"a": 3}, {"a": 5}]', '$[*] ? (@.a > $min && @.a < $max)', vars => '{"min": 3, "max": 4}'); +SELECT jsonb_path_exists('[{"a": 1}]', '$undefined_var'); +SELECT jsonb_path_exists('[{"a": 1}]', 'false'); SELECT jsonb_path_match('true', '$', silent => false); SELECT jsonb_path_match('false', '$', silent => false); @@ -569,6 +575,8 @@ SELECT jsonb_path_match('[true, true]', '$[*]', silent => false); SELECT jsonb '[{"a": 1}, {"a": 2}]' @@ '$[*].a > 1'; SELECT jsonb '[{"a": 1}, {"a": 2}]' @@ '$[*].a > 2'; SELECT jsonb_path_match('[{"a": 1}, {"a": 2}]', '$[*].a > 1'); +SELECT jsonb_path_match('[{"a": 1}]', '$undefined_var'); +SELECT jsonb_path_match('[{"a": 1}]', 'false'); -- test string comparison (Unicode codepoint collation) WITH str(j, num) AS From 60836d4407110f9a5e5f5be78c85ba18d8256367 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Fri, 13 Jan 2023 10:40:52 +1300 Subject: [PATCH 54/70] Fix WaitEventSetWait() buffer overrun. The WAIT_USE_EPOLL and WAIT_USE_KQUEUE implementations of WaitEventSetWaitBlock() confused the size of their internal buffer with the size of the caller's output buffer, and could ask the kernel for too many events. In fact the set of events retrieved from the kernel needs to be able to fit in both buffers, so take the smaller of the two. The WAIT_USE_POLL and WAIT_USE_WIN32 implementations didn't have this confusion. This probably didn't come up before because we always used the same number in both places, but commit 7389aad6 calculates a dynamic size at construction time, while using MAXLISTEN for its output event buffer on the stack.
That seems like a reasonable thing to want to do, so consider this to be a pre-existing bug worth fixing. As discovered by valgrind on skink. Back-patch to all supported releases for epoll, and to release 13 for the kqueue part, which copied the incorrect epoll code. Reviewed-by: Andres Freund Discussion: https://postgr.es/m/901504.1673504836%40sss.pgh.pa.us --- src/backend/storage/ipc/latch.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c index 7ecd3afe1b9..208dfeea7bd 100644 --- a/src/backend/storage/ipc/latch.c +++ b/src/backend/storage/ipc/latch.c @@ -1548,7 +1548,7 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout, /* Sleep */ rc = epoll_wait(set->epoll_fd, set->epoll_ret_events, - nevents, cur_timeout); + Min(nevents, set->nevents_space), cur_timeout); /* Check return code */ if (rc < 0) @@ -1699,7 +1699,8 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout, /* Sleep */ rc = kevent(set->kqueue_fd, NULL, 0, - set->kqueue_ret_events, nevents, + set->kqueue_ret_events, + Min(nevents, set->nevents_space), timeout_p); /* Check return code */ From cdf736956d77ba749533a535f8a6308802782273 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Mon, 16 Jan 2023 09:20:44 +0100 Subject: [PATCH 55/70] Fix some BufFileRead() error reporting Remove "%m" from error messages where errno would be bogus. Add short read byte counts where appropriate. This is equivalent to what was done in 7897e3bb902c557412645b82120f4d95f7474906, but some code was apparently developed concurrently to that and not updated accordingly. 
Reviewed-by: Amit Kapila Discussion: https://www.postgresql.org/message-id/flat/f3501945-c591-8cc3-5ef0-b72a2e0eaa9c@enterprisedb.com --- src/backend/replication/backup_manifest.c | 3 ++- src/backend/replication/logical/worker.c | 33 +++++++++++++---------- 2 files changed, 21 insertions(+), 15 deletions(-) diff --git a/src/backend/replication/backup_manifest.c b/src/backend/replication/backup_manifest.c index 5add6672b2a..28646bdd334 100644 --- a/src/backend/replication/backup_manifest.c +++ b/src/backend/replication/backup_manifest.c @@ -377,7 +377,8 @@ SendBackupManifest(backup_manifest_info *manifest) if (rc != bytes_to_read) ereport(ERROR, (errcode_for_file_access(), - errmsg("could not read from temporary file: %m"))); + errmsg("could not read from temporary file: read only %zu of %zu bytes", + rc, bytes_to_read))); pq_putmessage('d', manifestbuf, bytes_to_read); manifest_bytes_done += bytes_to_read; } diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c index cb8f76d902a..67dd1a97a6f 100644 --- a/src/backend/replication/logical/worker.c +++ b/src/backend/replication/logical/worker.c @@ -1071,7 +1071,7 @@ apply_handle_stream_commit(StringInfo s) nchanges = 0; while (true) { - int nbytes; + size_t nbytes; int len; CHECK_FOR_INTERRUPTS(); @@ -1087,8 +1087,8 @@ apply_handle_stream_commit(StringInfo s) if (nbytes != sizeof(len)) ereport(ERROR, (errcode_for_file_access(), - errmsg("could not read from streaming transaction's changes file \"%s\": %m", - path))); + errmsg("could not read from streaming transaction's changes file \"%s\": read only %zu of %zu bytes", + path, nbytes, sizeof(len)))); if (len <= 0) elog(ERROR, "incorrect length %d in streaming transaction's changes file \"%s\"", @@ -1098,11 +1098,12 @@ apply_handle_stream_commit(StringInfo s) buffer = repalloc(buffer, len); /* and finally read the data into the buffer */ - if (BufFileRead(fd, buffer, len) != len) + nbytes = BufFileRead(fd, buffer, len); + if 
(nbytes != len) ereport(ERROR, (errcode_for_file_access(), - errmsg("could not read from streaming transaction's changes file \"%s\": %m", - path))); + errmsg("could not read from streaming transaction's changes file \"%s\": read only %zu of %zu bytes", + path, nbytes, (size_t) len))); /* copy the buffer to the stringinfo and call apply_dispatch */ resetStringInfo(&s2); @@ -2717,6 +2718,7 @@ static void subxact_info_read(Oid subid, TransactionId xid) { char path[MAXPGPATH]; + size_t nread; Size len; BufFile *fd; StreamXidHash *ent; @@ -2749,13 +2751,12 @@ subxact_info_read(Oid subid, TransactionId xid) fd = BufFileOpenShared(ent->subxact_fileset, path, O_RDONLY); /* read number of subxact items */ - if (BufFileRead(fd, &subxact_data.nsubxacts, - sizeof(subxact_data.nsubxacts)) != - sizeof(subxact_data.nsubxacts)) + nread = BufFileRead(fd, &subxact_data.nsubxacts, sizeof(subxact_data.nsubxacts)); + if (nread != sizeof(subxact_data.nsubxacts)) ereport(ERROR, (errcode_for_file_access(), - errmsg("could not read from streaming transaction's subxact file \"%s\": %m", - path))); + errmsg("could not read from streaming transaction's subxact file \"%s\": read only %zu of %zu bytes", + path, nread, sizeof(subxact_data.nsubxacts)))); len = sizeof(SubXactInfo) * subxact_data.nsubxacts; @@ -2773,11 +2774,15 @@ subxact_info_read(Oid subid, TransactionId xid) sizeof(SubXactInfo)); MemoryContextSwitchTo(oldctx); - if ((len > 0) && ((BufFileRead(fd, subxact_data.subxacts, len)) != len)) + if (len > 0) + { + nread = BufFileRead(fd, subxact_data.subxacts, len); + if (nread != len) ereport(ERROR, (errcode_for_file_access(), - errmsg("could not read from streaming transaction's subxact file \"%s\": %m", - path))); + errmsg("could not read from streaming transaction's subxact file \"%s\": read only %zu of %zu bytes", + path, nread, len))); + } BufFileClose(fd); } From 795a4cd16db36177f46092cf35601773e422f58d Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Thu, 19 Jan 2023 10:02:10 
+0900 Subject: [PATCH 56/70] Fix failure with perlcritic in psql's create_help.pl No buildfarm members have reported that yet, but a recently-refreshed Debian host did. Reviewed-by: Andrew Dunstan Discussion: https://postgr.es/m/Y8ey5z4Nav62g4/K@paquier.xyz Backpatch-through: 11 --- src/bin/psql/create_help.pl | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/bin/psql/create_help.pl b/src/bin/psql/create_help.pl index 83324239740..339884102ee 100644 --- a/src/bin/psql/create_help.pl +++ b/src/bin/psql/create_help.pl @@ -41,7 +41,7 @@ $define =~ tr/a-z/A-Z/; $define =~ s/\W/_/g; -opendir(DIR, $docdir) +opendir(my $dh, $docdir) or die "$0: could not open documentation source dir '$docdir': $!\n"; open(my $hfile_handle, '>', $hfile) or die "$0: could not open output file '$hfile': $!\n"; @@ -93,7 +93,7 @@ my %entries; -foreach my $file (sort readdir DIR) +foreach my $file (sort readdir $dh) { my ($cmdid, @cmdnames, $cmddesc, $cmdsynopsis); $file =~ /\.sgml$/ or next; @@ -216,4 +216,4 @@ close $cfile_handle; close $hfile_handle; -closedir DIR; +closedir $dh; From 591b0ddc01f4374942ee81073ef26fe286b7492d Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Thu, 19 Jan 2023 13:13:28 +0900 Subject: [PATCH 57/70] Add missing assign hook for GUC checkpoint_completion_target This is wrong since 88e9823, that has switched the WAL sizing configuration from checkpoint_segments to min_wal_size and max_wal_size. This missed the recalculation of the internal value of the internal "CheckPointSegments", that works as a mapping of the old GUC checkpoint_segments, on reload, for example, and it controls the timing of checkpoints depending on the volume of WAL generated. Most users tend to leave checkpoint_completion_target at 0.9 to smooth the I/O workload, which is why I guess this has gone unnoticed for so long, still it can be useful to tweak and reload the value dynamically in some cases to control the timing of checkpoints. 
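[Editor's note on PATCH 57/70] The bug class here is a setting whose change must also refresh a derived internal value; without an assign hook, the cached value silently goes stale on reload. A loose C model of that mechanism (the constants and the recalculation formula are invented for illustration, not the real CalculateCheckpointSegments() math):

```c
#include <assert.h>

/* Miniature of the GUC idea: a setting may carry an assign hook that
 * refreshes derived state whenever the value changes.  Forgetting to
 * install the hook (the bug) leaves the cached value stale. */
static double completion_target = 0.9;
static int max_wal_size_mb = 1024;
static int checkpoint_segments;     /* derived, like CheckPointSegments */

static void recalc_segments(void)
{
    /* Invented derivation standing in for the real recalculation. */
    checkpoint_segments = (int) (max_wal_size_mb /
                                 (16.0 * (1.0 + completion_target)));
    if (checkpoint_segments < 1)
        checkpoint_segments = 1;
}

typedef void (*assign_hook) (void);

static void set_completion_target(double v, assign_hook hook)
{
    completion_target = v;
    if (hook)
        hook();     /* the patch adds exactly this callback for the GUC */
}
```

Passing NULL for the hook reproduces the pre-patch behavior: the setting changes but the derived value keeps its old result until something else happens to recompute it.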
Author: Bharath Rupireddy Discussion: https://postgr.es/m/CALj2ACXgPPAm28mruojSBno+F_=9cTOOxHAywu_dfZPeBdybQw@mail.gmail.com Backpatch-through: 11 --- src/backend/utils/misc/guc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index cb3b1a1cbdd..168f6113ce2 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -3935,7 +3935,7 @@ static struct config_real ConfigureNamesReal[] = }, &CheckPointCompletionTarget, 0.9, 0.0, 1.0, - NULL, NULL, NULL + NULL, assign_checkpoint_completion_target, NULL }, { From bcbaf3559f300ba0dc090cf509dfe2b219196642 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Thu, 19 Jan 2023 12:23:20 -0500 Subject: [PATCH 58/70] Log the correct ending timestamp in recovery_target_xid mode. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When ending recovery based on recovery_target_xid matching with recovery_target_inclusive = off, we printed an incorrect timestamp (always 2000-01-01) in the "recovery stopping before ... transaction" log message. This is a consequence of sloppy refactoring in c945af80c: the code to fetch recordXtime out of the commit/abort record used to be executed unconditionally, but it was changed to get called only in the RECOVERY_TARGET_TIME case. We need only flip the order of operations to restore the intended behavior. Per report from Torsten Förtsch. Back-patch to all supported branches. 
Discussion: https://postgr.es/m/CAKkG4_kUevPqbmyOfLajx7opAQk6Cvwkvx0HRcFjSPfRPTXanA@mail.gmail.com --- src/backend/access/transam/xlog.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index ce464494778..17ebce7303e 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -5985,8 +5985,13 @@ recoveryStopsBefore(XLogReaderState *record) stopsHere = (recordXid == recoveryTargetXid); } - if (recoveryTarget == RECOVERY_TARGET_TIME && - getRecordTimestamp(record, &recordXtime)) + /* + * Note: we must fetch recordXtime regardless of recoveryTarget setting. + * We don't expect getRecordTimestamp ever to fail, since we already know + * this is a commit or abort record; but test its result anyway. + */ + if (getRecordTimestamp(record, &recordXtime) && + recoveryTarget == RECOVERY_TARGET_TIME) { /* * There can be many transactions that share the same commit time, so From b549299c3e90493e1957fc07525634772c752cac Mon Sep 17 00:00:00 2001 From: Noah Misch Date: Sat, 21 Jan 2023 06:08:00 -0800 Subject: [PATCH 59/70] Reject CancelRequestPacket having unexpected length. When the length was too short, the server read outside the allocation. That yielded the same log noise as sending the correct length with (backendPID,cancelAuthCode) matching nothing. Change to a message about the unexpected length. Given the attacker's lack of control over the memory layout and the general lack of diversity in memory layouts at the code in question, we doubt a would-be attacker could cause a segfault. Hence, while the report arrived via security@postgresql.org, this is not a vulnerability. Back-patch to v11 (all supported versions). Andrey Borodin, reviewed by Tom Lane. Reported by Andrey Borodin. 
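[Editor's note on PATCH 59/70] The fix applies the standard defense for fixed-size wire messages: verify the received length against the exact struct size before interpreting the buffer at all. A hedged sketch with an invented CancelPacket struct (the real CancelRequestPacket layout differs, and the real code reads the request code separately):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Invented stand-in for CancelRequestPacket. */
typedef struct
{
    uint32_t backend_pid;
    uint32_t cancel_auth_code;
} CancelPacket;

/* Reject any length other than the exact struct size before touching
 * the buffer, instead of reading past a too-short allocation. */
static int process_cancel(const char *buf, size_t len, CancelPacket *out)
{
    if (len != sizeof(CancelPacket))
        return -1;      /* "invalid length of startup packet" */
    memcpy(out, buf, sizeof(CancelPacket));
    return 0;
}
```

The point of the commit message is that the pre-patch code skipped this check and read sizeof(CancelRequestPacket) bytes out of a possibly smaller allocation.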
==== in CBDB, backported & resolved issues by Backported-by: reshke --- src/backend/postmaster/postmaster.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index 4500c711ff5..8f21a90f391 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -2368,6 +2368,13 @@ ProcessStartupPacket(Port *port, bool ssl_done, bool gss_done) if (proto == CANCEL_REQUEST_CODE || proto == FINISH_REQUEST_CODE) { + if (len != sizeof(CancelRequestPacket)) + { + ereport(COMMERROR, + (errcode(ERRCODE_PROTOCOL_VIOLATION), + errmsg("invalid length of startup packet"))); + return STATUS_ERROR; + } processCancelRequest(port, buf, proto); /* Not really an error, but we don't want to proceed further */ return STATUS_ERROR; From 553178aca2822d215cf1a6e2ec796f8098706798 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Sat, 21 Jan 2023 13:10:29 -0500 Subject: [PATCH 60/70] Allow REPLICA IDENTITY to be set on an index that's not (yet) valid. The motivation for this change is that when pg_dump dumps a partitioned index that's marked REPLICA IDENTITY, it generates a command sequence that applies REPLICA IDENTITY before the partitioned index has been marked valid, causing restore to fail. We could perhaps change pg_dump to not do it like that, but that would be difficult and would not fix existing dump files with the problem. There seems to be very little reason for the backend to disallow this anyway --- the code ignores indisreplident when the index isn't valid --- so instead let's fix it by allowing the case. Commit 9511fb37a previously expressed a concern that allowing indisreplident to be set on invalid indexes might allow us to wind up in a situation where a table could have indisreplident set on multiple indexes. 
I'm not sure I follow that concern exactly, but in any case the only way that could happen is because relation_mark_replica_identity is too trusting about the existing set of markings being valid. Let's just rip out its early-exit code path (which sure looks like premature optimization anyway; what are we doing expending code to make redundant ALTER TABLE ... REPLICA IDENTITY commands marginally faster and not-redundant ones marginally slower?) and fix it to positively guarantee that no more than one index is marked indisreplident. The pg_dump failure can be demonstrated in all supported branches, so back-patch all the way. I chose to back-patch 9511fb37a as well, just to keep indisreplident handling the same in all branches. Per bug #17756 from Sergey Belyashov. Discussion: https://postgr.es/m/17756-dd50e8e0c8dd4a40@postgresql.org --- src/backend/catalog/index.c | 5 +- src/backend/commands/tablecmds.c | 64 +++++++------------ .../regress/expected/replica_identity.out | 37 +++++++++++ src/test/regress/sql/replica_identity.sql | 20 ++++++ 4 files changed, 81 insertions(+), 45 deletions(-) diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index a775f32e2b0..24a82a29dca 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -3661,9 +3661,8 @@ index_set_state_flags(Oid indexId, IndexStateFlagsAction action) * CONCURRENTLY that failed partway through.) * * Note: the CLUSTER logic assumes that indisclustered cannot be - * set on any invalid index, so clear that flag too. Similarly, - * ALTER TABLE assumes that indisreplident cannot be set for - * invalid indexes. + * set on any invalid index, so clear that flag too. For + * cleanliness, also clear indisreplident. 
*/ indexForm->indisvalid = false; indexForm->indisclustered = false; diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 08ef3902a07..29a7a57e681 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -19622,7 +19622,10 @@ ATExecDropOf(Relation rel, LOCKMODE lockmode) * relation_mark_replica_identity: Update a table's replica identity * * Iff ri_type = REPLICA_IDENTITY_INDEX, indexOid must be the Oid of a suitable - * index. Otherwise, it should be InvalidOid. + * index. Otherwise, it must be InvalidOid. + * + * Caller had better hold an exclusive lock on the relation, as the results + * of running two of these concurrently wouldn't be pretty. */ static void relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid, @@ -19634,7 +19637,6 @@ relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid, HeapTuple pg_index_tuple; Form_pg_class pg_class_form; Form_pg_index pg_index_form; - ListCell *index; /* @@ -19656,29 +19658,7 @@ relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid, heap_freetuple(pg_class_tuple); /* - * Check whether the correct index is marked indisreplident; if so, we're - * done. - */ - if (OidIsValid(indexOid)) - { - Assert(ri_type == REPLICA_IDENTITY_INDEX); - - pg_index_tuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexOid)); - if (!HeapTupleIsValid(pg_index_tuple)) - elog(ERROR, "cache lookup failed for index %u", indexOid); - pg_index_form = (Form_pg_index) GETSTRUCT(pg_index_tuple); - - if (pg_index_form->indisreplident) - { - ReleaseSysCache(pg_index_tuple); - return; - } - ReleaseSysCache(pg_index_tuple); - } - - /* - * Clear the indisreplident flag from any index that had it previously, - * and set it for any index that should have it now. + * Update the per-index indisreplident flags correctly. 
*/ pg_index = table_open(IndexRelationId, RowExclusiveLock); foreach(index, RelationGetIndexList(rel)) @@ -19692,19 +19672,23 @@ relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid, elog(ERROR, "cache lookup failed for index %u", thisIndexOid); pg_index_form = (Form_pg_index) GETSTRUCT(pg_index_tuple); - /* - * Unset the bit if set. We know it's wrong because we checked this - * earlier. - */ - if (pg_index_form->indisreplident) + if (thisIndexOid == indexOid) { - dirty = true; - pg_index_form->indisreplident = false; + /* Set the bit if not already set. */ + if (!pg_index_form->indisreplident) + { + dirty = true; + pg_index_form->indisreplident = true; + } } - else if (thisIndexOid == indexOid) + else { - dirty = true; - pg_index_form->indisreplident = true; + /* Unset the bit if set. */ + if (pg_index_form->indisreplident) + { + dirty = true; + pg_index_form->indisreplident = false; + } } if (dirty) @@ -19715,7 +19699,9 @@ relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid, /* * Invalidate the relcache for the table, so that after we commit * all sessions will refresh the table's replica identity index - * before attempting any UPDATE or DELETE on the table. + * before attempting any UPDATE or DELETE on the table. (If we + * changed the table's pg_class row above, then a relcache inval + * is already queued due to that; but we might not have.) */ CacheInvalidateRelcache(rel); } @@ -19800,12 +19786,6 @@ ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKMODE lockmode (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("cannot use partial index \"%s\" as replica identity", RelationGetRelationName(indexRel)))); - /* And neither are invalid indexes. */ - if (!indexRel->rd_index->indisvalid) - ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - errmsg("cannot use invalid index \"%s\" as replica identity", - RelationGetRelationName(indexRel)))); /* Check index for nullable columns. 
*/ for (key = 0; key < IndexRelationGetNumberOfKeyAttributes(indexRel); key++) diff --git a/src/test/regress/expected/replica_identity.out b/src/test/regress/expected/replica_identity.out index c2fe3e476e3..58b4fca9926 100644 --- a/src/test/regress/expected/replica_identity.out +++ b/src/test/regress/expected/replica_identity.out @@ -239,7 +239,44 @@ Indexes: -- used as replica identity. ALTER TABLE test_replica_identity3 ALTER COLUMN id DROP NOT NULL; ERROR: column "id" is in index used as replica identity +-- +-- Test that replica identity can be set on an index that's not yet valid. +-- (This matches the way pg_dump will try to dump a partitioned table.) +-- +CREATE TABLE test_replica_identity4(id integer NOT NULL) PARTITION BY LIST (id); +CREATE TABLE test_replica_identity4_1(id integer NOT NULL); +ALTER TABLE ONLY test_replica_identity4 + ATTACH PARTITION test_replica_identity4_1 FOR VALUES IN (1); +ALTER TABLE ONLY test_replica_identity4 + ADD CONSTRAINT test_replica_identity4_pkey PRIMARY KEY (id); +ALTER TABLE ONLY test_replica_identity4 + REPLICA IDENTITY USING INDEX test_replica_identity4_pkey; +ALTER TABLE ONLY test_replica_identity4_1 + ADD CONSTRAINT test_replica_identity4_1_pkey PRIMARY KEY (id); +\d+ test_replica_identity4 + Partitioned table "public.test_replica_identity4" + Column | Type | Collation | Nullable | Default | Storage | Stats target | Description +--------+---------+-----------+----------+---------+---------+--------------+------------- + id | integer | | not null | | plain | | +Partition key: LIST (id) +Indexes: + "test_replica_identity4_pkey" PRIMARY KEY, btree (id) INVALID REPLICA IDENTITY +Partitions: test_replica_identity4_1 FOR VALUES IN (1) + +ALTER INDEX test_replica_identity4_pkey + ATTACH PARTITION test_replica_identity4_1_pkey; +\d+ test_replica_identity4 + Partitioned table "public.test_replica_identity4" + Column | Type | Collation | Nullable | Default | Storage | Stats target | Description 
+--------+---------+-----------+----------+---------+---------+--------------+------------- + id | integer | | not null | | plain | | +Partition key: LIST (id) +Indexes: + "test_replica_identity4_pkey" PRIMARY KEY, btree (id) REPLICA IDENTITY +Partitions: test_replica_identity4_1 FOR VALUES IN (1) + DROP TABLE test_replica_identity; DROP TABLE test_replica_identity2; DROP TABLE test_replica_identity3; +DROP TABLE test_replica_identity4; DROP TABLE test_replica_identity_othertable; diff --git a/src/test/regress/sql/replica_identity.sql b/src/test/regress/sql/replica_identity.sql index 11974cf25af..26ee7423614 100644 --- a/src/test/regress/sql/replica_identity.sql +++ b/src/test/regress/sql/replica_identity.sql @@ -106,7 +106,27 @@ ALTER TABLE test_replica_identity3 ALTER COLUMN id TYPE bigint; -- used as replica identity. ALTER TABLE test_replica_identity3 ALTER COLUMN id DROP NOT NULL; +-- +-- Test that replica identity can be set on an index that's not yet valid. +-- (This matches the way pg_dump will try to dump a partitioned table.) 
+-- +CREATE TABLE test_replica_identity4(id integer NOT NULL) PARTITION BY LIST (id); +CREATE TABLE test_replica_identity4_1(id integer NOT NULL); +ALTER TABLE ONLY test_replica_identity4 + ATTACH PARTITION test_replica_identity4_1 FOR VALUES IN (1); +ALTER TABLE ONLY test_replica_identity4 + ADD CONSTRAINT test_replica_identity4_pkey PRIMARY KEY (id); +ALTER TABLE ONLY test_replica_identity4 + REPLICA IDENTITY USING INDEX test_replica_identity4_pkey; +ALTER TABLE ONLY test_replica_identity4_1 + ADD CONSTRAINT test_replica_identity4_1_pkey PRIMARY KEY (id); +\d+ test_replica_identity4 +ALTER INDEX test_replica_identity4_pkey + ATTACH PARTITION test_replica_identity4_1_pkey; +\d+ test_replica_identity4 + DROP TABLE test_replica_identity; DROP TABLE test_replica_identity2; DROP TABLE test_replica_identity3; +DROP TABLE test_replica_identity4; DROP TABLE test_replica_identity_othertable; From 6f2b50434f4154b35a6b2e7b9a6b4ece42f63dff Mon Sep 17 00:00:00 2001 From: Andres Freund Date: Mon, 23 Jan 2023 18:04:02 -0800 Subject: [PATCH 61/70] Fix error handling in libpqrcv_connect() When libpqrcv_connect (also known as walrcv_connect()) failed, it leaked the libpq connection. In most paths that's fairly harmless, as the calling process will exit soon after. But e.g. CREATE SUBSCRIPTION could lead to a somewhat longer lived leak. Fix by releasing resources, including the libpq connection, on error. Add a test exercising the error code path. To make it reliable and safe, the test tries to connect to port=-1, which happens to fail during connection establishment, rather than during connection string parsing. 
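[Editor's note on PATCH 61/70] The libpqrcv_connect() fix converts scattered early returns into a single goto-based cleanup, the idiomatic C shape for releasing a half-built object on every failure path. A minimal sketch with invented conn_start()/conn_finish() helpers (not the real libpq calls), instrumented with a leak counter so the property is checkable:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical connection object standing in for WalReceiverConn. */
typedef struct
{
    int fd;
} Conn;

static int conns_leaked = 0;    /* instrumentation for this sketch */

static Conn *conn_start(int ok)
{
    Conn *c = malloc(sizeof *c);    /* assume allocation succeeds here */

    conns_leaked++;
    c->fd = ok;
    return c;
}

static void conn_finish(Conn *c)
{
    conns_leaked--;
    free(c);
}

/* The fixed shape: every failure path funnels through one cleanup
 * label, so the half-built connection is released instead of leaked. */
static Conn *connect_checked(int ok, const char **err)
{
    Conn *conn = conn_start(ok);

    if (conn->fd == 0)
        goto bad_connection;
    return conn;

bad_connection:
    *err = "connection failed";
    conn_finish(conn);
    return NULL;
}
```

The pre-patch code returned NULL from several spots without the conn_finish() step, which is exactly the leak the two labels (bad_connection_errmsg / bad_connection) eliminate in the real function.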
Reviewed-by: Noah Misch Discussion: https://postgr.es/m/20230121011237.q52apbvlarfv6jm6@awork3.anarazel.de Backpatch: 11- --- .../libpqwalreceiver/libpqwalreceiver.c | 26 +++++++++++-------- src/test/regress/expected/subscription.out | 10 ++++++- src/test/regress/sql/subscription.sql | 9 ++++++- 3 files changed, 32 insertions(+), 13 deletions(-) diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c index 4bcf0b322dc..4f1f7abd006 100644 --- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c +++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c @@ -170,10 +170,7 @@ libpqrcv_connect(const char *conninfo, bool logical, const char *appname, conn->streamConn = PQconnectStartParams(keys, vals, /* expand_dbname = */ true); if (PQstatus(conn->streamConn) == CONNECTION_BAD) - { - *err = pchomp(PQerrorMessage(conn->streamConn)); - return NULL; - } + goto bad_connection_errmsg; /* * Poll connection until we have OK or FAILED status. 
@@ -215,10 +212,7 @@ libpqrcv_connect(const char *conninfo, bool logical, const char *appname, } while (status != PGRES_POLLING_OK && status != PGRES_POLLING_FAILED); if (PQstatus(conn->streamConn) != CONNECTION_OK) - { - *err = pchomp(PQerrorMessage(conn->streamConn)); - return NULL; - } + goto bad_connection_errmsg; if (logical) { @@ -229,9 +223,9 @@ libpqrcv_connect(const char *conninfo, bool logical, const char *appname, if (PQresultStatus(res) != PGRES_TUPLES_OK) { PQclear(res); - ereport(ERROR, - (errmsg("could not clear search path: %s", - pchomp(PQerrorMessage(conn->streamConn))))); + *err = psprintf(_("could not clear search path: %s"), + pchomp(PQerrorMessage(conn->streamConn))); + goto bad_connection; } PQclear(res); } @@ -239,6 +233,16 @@ libpqrcv_connect(const char *conninfo, bool logical, const char *appname, conn->logical = logical; return conn; + + /* error path, using libpq's error message */ +bad_connection_errmsg: + *err = pchomp(PQerrorMessage(conn->streamConn)); + + /* error path, error already set */ +bad_connection: + PQfinish(conn->streamConn); + pfree(conn); + return NULL; } /* diff --git a/src/test/regress/expected/subscription.out b/src/test/regress/expected/subscription.out index e1bac50dfe4..7075b45a4f7 100644 --- a/src/test/regress/expected/subscription.out +++ b/src/test/regress/expected/subscription.out @@ -70,7 +70,15 @@ ERROR: cannot enable subscription that does not have a slot name ALTER SUBSCRIPTION regress_testsub3 REFRESH PUBLICATION; ERROR: ALTER SUBSCRIPTION ... 
REFRESH is not allowed for disabled subscriptions DROP SUBSCRIPTION regress_testsub3; --- fail - invalid connection string +-- fail, connection string does not parse +CREATE SUBSCRIPTION regress_testsub5 CONNECTION 'i_dont_exist=param' PUBLICATION testpub; +ERROR: invalid connection string syntax: invalid connection option "i_dont_exist" + +-- fail, connection string parses, but doesn't work (and does so without +-- connecting, so this is reliable and safe) +CREATE SUBSCRIPTION regress_testsub5 CONNECTION 'port=-1' PUBLICATION testpub; +ERROR: could not connect to the publisher: invalid port number: "-1" +-- fail - invalid connection string during ALTER ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar'; ERROR: invalid connection string syntax: missing "=" after "foobar" in connection info string \dRs+ diff --git a/src/test/regress/sql/subscription.sql b/src/test/regress/sql/subscription.sql index 855a341a3d9..c8a2d08e721 100644 --- a/src/test/regress/sql/subscription.sql +++ b/src/test/regress/sql/subscription.sql @@ -56,7 +56,14 @@ ALTER SUBSCRIPTION regress_testsub3 REFRESH PUBLICATION; DROP SUBSCRIPTION regress_testsub3; --- fail - invalid connection string +-- fail, connection string does not parse +CREATE SUBSCRIPTION regress_testsub5 CONNECTION 'i_dont_exist=param' PUBLICATION testpub; + +-- fail, connection string parses, but doesn't work (and does so without +-- connecting, so this is reliable and safe) +CREATE SUBSCRIPTION regress_testsub5 CONNECTION 'port=-1' PUBLICATION testpub; + +-- fail - invalid connection string during ALTER ALTER SUBSCRIPTION regress_testsub CONNECTION 'foobar'; \dRs+ From c9614344e64179d11b89f07c7292991857cfd5d0 Mon Sep 17 00:00:00 2001 From: Amit Kapila Date: Tue, 24 Jan 2023 08:55:55 +0530 Subject: [PATCH 62/70] Fix the Drop Database hang. 
The drop database command waits for the logical replication sync worker to accept ProcSignalBarrier and the worker's slot creation waits for the drop database to finish which leads to a deadlock. This happens because the tablesync worker holds interrupts while creating a slot. We prevent cancel/die interrupts while creating a slot in the table sync worker because it is possible that before the server finishes this command, a concurrent drop subscription happens which would complete without removing this slot and that leads to the slot existing until the end of walsender. However, the slot will eventually get dropped at the walsender exit time, so there is no danger of the dangling slot. This patch reallows cancel/die interrupts while creating a slot and modifies the test to wait for slots to become zero to prevent finding an ephemeral slot. The reported hang doesn't happen in PG14 as the drop database starts to wait for ProcSignalBarrier with PG15 (commits 4eb2176318 and e2f65f4255) but it is good to backpatch this till PG14 as it is not a good idea to prevent interrupts during a network call that could block indefinitely. Reported-by: Lakshmi Narayanan Sreethar Diagnosed-by: Andres Freund Author: Hou Zhijie Reviewed-by: Vignesh C, Amit Kapila Backpatch-through: 14, where it was introduced in commit 6b67d72b60 Discussion: https://postgr.es/m/CA+kvmZELXQ4ZD3U=XCXuG3KvFgkuPoN1QrEj8c-rMRodrLOnsg@mail.gmail.com --- src/backend/replication/logical/tablesync.c | 7 ------- src/test/subscription/t/004_sync.pl | 10 +++++++--- 2 files changed, 7 insertions(+), 10 deletions(-) diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c index d104ccf86e1..75df2dc2cc6 100644 --- a/src/backend/replication/logical/tablesync.c +++ b/src/backend/replication/logical/tablesync.c @@ -1076,16 +1076,9 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos) * Create a new permanent logical decoding slot. 
This slot will be used * for the catchup phase after COPY is done, so tell it to use the * snapshot to make the final data consistent. - * - * Prevent cancel/die interrupts while creating slot here because it is - * possible that before the server finishes this command, a concurrent - * drop subscription happens which would complete without removing this - * slot leading to a dangling slot on the server. */ - HOLD_INTERRUPTS(); walrcv_create_slot(LogRepWorkerWalRcvConn, slotname, false /* permanent */ , CRS_USE_SNAPSHOT, origin_startpos); - RESUME_INTERRUPTS(); /* * Setup replication origin tracking. The purpose of doing this before the diff --git a/src/test/subscription/t/004_sync.pl b/src/test/subscription/t/004_sync.pl index 959e47fad5e..545599b8f8d 100644 --- a/src/test/subscription/t/004_sync.pl +++ b/src/test/subscription/t/004_sync.pl @@ -163,9 +163,13 @@ # subscriber is stuck on data copy for constraint violation. $node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub"); -$result = $node_publisher->safe_psql('postgres', - "SELECT count(*) FROM pg_replication_slots"); -is($result, qq(0), +# When DROP SUBSCRIPTION tries to drop the tablesync slot, the slot may not +# have been created, which causes the slot to be created after the DROP +# SUSCRIPTION finishes. Such slots eventually get dropped at walsender exit +# time. So, to prevent being affected by such ephemeral tablesync slots, we +# wait until all the slots have been cleaned. 
+ok( $node_publisher->poll_query_until( + 'postgres', 'SELECT count(*) = 0 FROM pg_replication_slots'), 'DROP SUBSCRIPTION during error can clean up the slots on the publisher'); $node_subscriber->stop('fast'); From ae559a3de63cdaab9400bf796852b9e243840470 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Wed, 25 Jan 2023 20:00:46 +0900 Subject: [PATCH 63/70] doc: Fix network_ops -> inet_ops in SpGiST operator class list network_ops is an opclass family of SpGiST, and the opclass able to work on the inet type is named inet_ops. Oversight in 7a1cd52, that reworked the design of the table listing all the operators available. Reported-by: Laurence Parry Reviewed-by: Tom Lane, David G. Johnston Discussion: https://postgr.es/m/167458110639.2667300.14741268666497110766@wrigleys.postgresql.org Backpatch-through: 14 --- doc/src/sgml/spgist.sgml | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml index 00432512de9..102f8627bd0 100644 --- a/doc/src/sgml/spgist.sgml +++ b/doc/src/sgml/spgist.sgml @@ -91,18 +91,7 @@ |>> (box,box) - kd_point_ops - |>> (point,point) - <-> (point,point) - - << (point,point) - >> (point,point) - <<| (point,point) - ~= (point,point) - <@ (point,box) - - - network_ops + inet_ops << (inet,inet) @@ -117,6 +106,17 @@ >= (inet,inet) && (inet,inet) + + kd_point_ops + |>> (point,point) + <-> (point,point) + + << (point,point) + >> (point,point) + <<| (point,point) + ~= (point,point) + <@ (point,box) + poly_ops << (polygon,polygon) From 2acdb02d469422988e499ddc562ce701cbaa5c36 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Thu, 26 Jan 2023 14:50:07 +1300 Subject: [PATCH 64/70] Fix rare sharedtuplestore.c corruption. If the final chunk of an oversized tuple being written out to disk was exactly 32760 bytes, it would be corrupted due to a fencepost bug. Bug #17619. Back-patch to 11 where the code arrived. 
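The fencepost at issue is simple to model. A hypothetical Python sketch of the chunk-boundary test (the names echo sharedtuplestore.c, but this is only an illustration of the off-by-one, not the real C code):

```python
CHUNK_DATA_SIZE = 32760  # usable bytes in a chunk, per the bug report

def needs_new_chunk_buggy(write_pointer, size, write_end):
    # Pre-fix test: ">=" claims that data ending exactly at write_end does
    # not fit, so a final chunk of exactly 32760 bytes took the wrong path.
    return write_pointer + size >= write_end

def needs_new_chunk_fixed(write_pointer, size, write_end):
    # Post-fix test: only data that would run past write_end needs a new chunk.
    return write_pointer + size > write_end

# A final chunk of exactly CHUNK_DATA_SIZE bytes fills the buffer precisely.
assert needs_new_chunk_buggy(0, CHUNK_DATA_SIZE, CHUNK_DATA_SIZE)
assert not needs_new_chunk_fixed(0, CHUNK_DATA_SIZE, CHUNK_DATA_SIZE)
```

The exactly-full case is the only input on which the two predicates disagree, which is why the corruption was so rare.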
While testing that (see test module in archives), I (tmunro) noticed that the per-participant page counter was not initialized to zero as it should have been; that wasn't a live bug when it was written, since DSM memory was originally always zeroed, but since PG14 min_dynamic_shared_memory can be configured and it supplies non-zeroed memory, so that is also fixed here. Author: Dmitry Astapov Discussion: https://postgr.es/m/17619-0de62ceda812b8b5%40postgresql.org --- src/backend/utils/sort/sharedtuplestore.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/backend/utils/sort/sharedtuplestore.c b/src/backend/utils/sort/sharedtuplestore.c index 65e18eff8f2..f519ef838fe 100644 --- a/src/backend/utils/sort/sharedtuplestore.c +++ b/src/backend/utils/sort/sharedtuplestore.c @@ -158,6 +158,7 @@ sts_initialize(SharedTuplestore *sts, int participants, LWLockInitialize(&sts->participants[i].lock, LWTRANCHE_SHARED_TUPLESTORE); sts->participants[i].read_page = 0; + sts->participants[i].npages = 0; sts->participants[i].writing = false; } @@ -321,7 +322,7 @@ sts_puttuple(SharedTuplestoreAccessor *accessor, void *meta_data, /* Do we have space? */ size = accessor->sts->meta_data_size + tuple->t_len; - if (accessor->write_pointer + size >= accessor->write_end) + if (accessor->write_pointer + size > accessor->write_end) { if (accessor->write_chunk == NULL) { @@ -341,7 +342,7 @@ sts_puttuple(SharedTuplestoreAccessor *accessor, void *meta_data, } /* It may still not be enough in the case of a gigantic tuple.
*/ - if (accessor->write_pointer + size >= accessor->write_end) + if (accessor->write_pointer + size > accessor->write_end) { size_t written; From ca4c1729e7e70189508d99cf9bfe6f1dead101e8 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Tue, 31 Jan 2023 12:47:12 +0900 Subject: [PATCH 65/70] Remove recovery test 011_crash_recovery.pl This test was added as of 857ee8e, which introduced the SQL function txid_status(), with the purpose of checking that a transaction ID still in-progress during a crash is correctly marked as aborted after recovery finishes. This test is unstable, and some configuration scenarios may make that easier to reproduce (wal_level=minimal, wal_compression=on) because the WAL holding the information about the in-progress transaction ID may not have made it to disk yet, hence a post-crash recovery may cause the same XID to be reused, triggering a test failure. We have discussed a few approaches, like making this function force a WAL flush to make it reliable across crashes, but we don't want to pay a performance penalty in some scenarios either. The test could have been tweaked to enforce a checkpoint, but that actually breaks the promise of the test to rely on a stable result of txid_status() after a crash. This issue has been reported a few times across the past years, with an original report from Kyotaro Horiguchi. The buildfarm machines tanager, hachi and gokiburi enable wal_compression, and fail on this test periodically.
Discussion: https://postgr.es/m/3163112.1674762209@sss.pgh.pa.us Discussion: https://postgr.es/m/20210305.115011.558061052471425531.horikyota.ntt@gmail.com Backpatch-through: 11 --- src/test/recovery/t/011_crash_recovery.pl | 64 ----------------------- 1 file changed, 64 deletions(-) delete mode 100644 src/test/recovery/t/011_crash_recovery.pl diff --git a/src/test/recovery/t/011_crash_recovery.pl b/src/test/recovery/t/011_crash_recovery.pl deleted file mode 100644 index a26e99500b2..00000000000 --- a/src/test/recovery/t/011_crash_recovery.pl +++ /dev/null @@ -1,64 +0,0 @@ - -# Copyright (c) 2021, PostgreSQL Global Development Group - -# -# Tests relating to PostgreSQL crash recovery and redo -# -use strict; -use warnings; -use PostgresNode; -use TestLib; -use Test::More; -use Config; - -plan tests => 3; - -my $node = get_new_node('primary'); -$node->init(allows_streaming => 1); -$node->start; - -my ($stdin, $stdout, $stderr) = ('', '', ''); - -# Ensure that pg_xact_status reports 'aborted' for xacts -# that were in-progress during crash. To do that, we need -# an xact to be in-progress when we crash and we need to know -# its xid. 
-my $tx = IPC::Run::start( - [ - 'psql', '-X', '-qAt', '-v', 'ON_ERROR_STOP=1', '-f', '-', '-d', - $node->connstr('postgres') - ], - '<', - \$stdin, - '>', - \$stdout, - '2>', - \$stderr); -$stdin .= q[ -BEGIN; -CREATE TABLE mine(x integer); -SELECT pg_current_xact_id(); -]; -$tx->pump until $stdout =~ /[[:digit:]]+[\r\n]$/; - -# Status should be in-progress -my $xid = $stdout; -chomp($xid); - -is($node->safe_psql('postgres', qq[SELECT pg_xact_status('$xid');]), - 'in progress', 'own xid is in-progress'); - -# Crash and restart the postmaster -$node->stop('immediate'); -$node->start; - -# Make sure we really got a new xid -cmp_ok($node->safe_psql('postgres', 'SELECT pg_current_xact_id()'), - '>', $xid, 'new xid after restart is greater'); - -# and make sure we show the in-progress xact as aborted -is($node->safe_psql('postgres', qq[SELECT pg_xact_status('$xid');]), - 'aborted', 'xid is aborted after crash'); - -$stdin .= "\\q\n"; -$tx->finish; # wait for psql to quit gracefully From 1b141f9fc50c578f58e1776d32d9e844ff78fa19 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Tue, 31 Jan 2023 14:32:24 -0500 Subject: [PATCH 66/70] Doc: clarify use of NULL to drop comments and security labels. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This was only mentioned in the description of the text/label, which are marked as being in quotes in the synopsis, which can cause confusion (as witnessed on IRC). Also separate the literal and NULL cases in the parameter list, per suggestion from Tom Lane. Also add an example of dropping a security label. 
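The documented semantics can be modeled in a few lines (hypothetical Python sketch; the dictionary stands in for the catalog that stores comments or security labels, and the function name is illustrative, not a PostgreSQL API):

```python
# Model of the clarified rule: IS 'literal' stores a value, IS NULL
# removes the stored value entirely rather than storing a NULL string.
comments = {}

def comment_on(obj, text):
    if text is None:
        comments.pop(obj, None)   # COMMENT ... IS NULL drops the comment
    else:
        comments[obj] = text      # COMMENT ... IS 'literal' sets it

comment_on("mytable", "customer data")
comment_on("mytable", None)
assert "mytable" not in comments  # the comment is gone, not NULL-valued
```

The same shape applies to SECURITY LABEL ... IS NULL, which is why both reference pages now list the NULL case as a separate parameter.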
Dagfinn Ilmari Mannsåker, with some tweaks by me Discussion: https://postgr.es/m/87sffqk4zp.fsf@wibble.ilmari.org --- doc/src/sgml/ref/comment.sgml | 16 +++++++++++---- doc/src/sgml/ref/security_label.sgml | 29 +++++++++++++++++++++------- 2 files changed, 34 insertions(+), 11 deletions(-) diff --git a/doc/src/sgml/ref/comment.sgml b/doc/src/sgml/ref/comment.sgml index 850e5ba4afc..17f6b816365 100644 --- a/doc/src/sgml/ref/comment.sgml +++ b/doc/src/sgml/ref/comment.sgml @@ -67,7 +67,7 @@ COMMENT ON TRIGGER trigger_name ON table_name | TYPE object_name | VIEW object_name -} IS 'text' +} IS { string_literal | NULL } where aggregate_signature is: @@ -263,11 +263,19 @@ COMMENT ON - text + string_literal - The new comment, written as a string literal; or NULL - to drop the comment. + The new comment contents, written as a string literal. + + + + + + NULL + + + Write NULL to drop the comment. diff --git a/doc/src/sgml/ref/security_label.sgml b/doc/src/sgml/ref/security_label.sgml index 20a839ff0c3..5f96b7e1ded 100644 --- a/doc/src/sgml/ref/security_label.sgml +++ b/doc/src/sgml/ref/security_label.sgml @@ -44,7 +44,7 @@ SECURITY LABEL [ FOR provider ] ON TABLESPACE object_name | TYPE object_name | VIEW object_name -} IS 'label' +} IS { string_literal | NULL } where aggregate_signature is: @@ -178,11 +178,19 @@ SECURITY LABEL [ FOR provider ] ON - label + string_literal - The new security label, written as a string literal; or NULL - to drop the security label. + The new setting of the security label, written as a string literal. + + + + + + NULL + + + Write NULL to drop the security label. @@ -193,12 +201,19 @@ SECURITY LABEL [ FOR provider ] ON Examples - The following example shows how the security label of a table might - be changed. 
+ The following example shows how the security label of a table could + be set or changed: SECURITY LABEL FOR selinux ON TABLE mytable IS 'system_u:object_r:sepgsql_table_t:s0'; - + + + To remove the label: + + +SECURITY LABEL FOR selinux ON TABLE mytable IS NULL; + + From 3940366e5bed4d59227567ef10de4fb76ac30427 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Tue, 31 Jan 2023 17:36:55 -0500 Subject: [PATCH 67/70] Update time zone data files to tzdata release 2022g. DST law changes in Greenland and Mexico. Notably, a new timezone America/Ciudad_Juarez has been split off from America/Ojinaga. Historical corrections for northern Canada, Colombia, and Singapore. --- src/timezone/data/tzdata.zi | 47 +++++++++++++++++++++++-------------- 1 file changed, 29 insertions(+), 18 deletions(-) diff --git a/src/timezone/data/tzdata.zi b/src/timezone/data/tzdata.zi index a36449f46bc..3db1460e1be 100644 --- a/src/timezone/data/tzdata.zi +++ b/src/timezone/data/tzdata.zi @@ -1,4 +1,4 @@ -# version 2022f +# version 2022g # This zic input file is in the public domain. 
R d 1916 o - Jun 14 23s 1 S R d 1916 1919 - O Su>=1 23s 0 - @@ -1040,7 +1040,7 @@ Z Asia/Singapore 6:55:25 - LMT 1901 7:20 - +0720 1941 S 7:30 - +0730 1942 F 16 9 - +09 1945 S 12 -7:30 - +0730 1982 +7:30 - +0730 1981 D 31 16u 8 - +08 Z Asia/Colombo 5:19:24 - LMT 1880 5:19:32 - MMT 1906 @@ -1754,7 +1754,8 @@ Z America/Scoresbysund -1:27:52 - LMT 1916 Jul 28 -1 E -01/+00 Z America/Nuuk -3:26:56 - LMT 1916 Jul 28 -3 - -03 1980 Ap 6 2 --3 E -03/-02 +-3 E -03/-02 2023 Mar 25 22 +-2 - -02 Z America/Thule -4:35:8 - LMT 1916 Jul 28 -4 Th A%sT Z Europe/Tallinn 1:39 - LMT 1880 @@ -3044,16 +3045,11 @@ R Y 1919 o - N 1 0 0 S R Y 1942 o - F 9 2 1 W R Y 1945 o - Au 14 23u 1 P R Y 1945 o - S 30 2 0 S -R Y 1965 o - Ap lastSu 0 2 DD -R Y 1965 o - O lastSu 2 0 S -R Y 1980 1986 - Ap lastSu 2 1 D -R Y 1980 2006 - O lastSu 2 0 S +R Y 1972 1986 - Ap lastSu 2 1 D +R Y 1972 2006 - O lastSu 2 0 S R Y 1987 2006 - Ap Su>=1 2 1 D -Z America/Pangnirtung 0 - -00 1921 --4 Y A%sT 1995 Ap Su>=1 2 --5 C E%sT 1999 O 31 2 --6 C C%sT 2000 O 29 2 --5 C E%sT +R Yu 1965 o - Ap lastSu 0 2 DD +R Yu 1965 o - O lastSu 2 0 S Z America/Iqaluit 0 - -00 1942 Au -5 Y E%sT 1999 O 31 2 -6 C C%sT 2000 O 29 2 @@ -3082,13 +3078,15 @@ Z America/Inuvik 0 - -00 1953 -7 Y M%sT 1980 -7 C M%sT Z America/Whitehorse -9:0:12 - LMT 1900 Au 20 --9 Y Y%sT 1967 May 28 --8 Y P%sT 1980 +-9 Y Y%sT 1965 +-9 Yu Y%sT 1966 F 27 +-8 - PST 1980 -8 C P%sT 2020 N -7 - MST Z America/Dawson -9:17:40 - LMT 1900 Au 20 --9 Y Y%sT 1973 O 28 --8 Y P%sT 1980 +-9 Y Y%sT 1965 +-9 Yu Y%sT 1973 O 28 +-8 - PST 1980 -8 C P%sT 2020 N -7 - MST R m 1931 o - May 1 23 1 D @@ -3132,6 +3130,17 @@ Z America/Mexico_City -6:36:36 - LMT 1922 Ja 1 7u -6 m C%sT 2001 S 30 2 -6 - CST 2002 F 20 -6 m C%sT +Z America/Ciudad_Juarez -7:5:56 - LMT 1922 Ja 1 7u +-7 - MST 1927 Jun 10 23 +-6 - CST 1930 N 15 +-7 m M%sT 1932 Ap +-6 - CST 1996 +-6 m C%sT 1998 +-6 - CST 1998 Ap Su>=1 3 +-7 m M%sT 2010 +-7 u M%sT 2022 O 30 2 +-6 - CST 2022 N 30 +-7 u M%sT Z America/Ojinaga -6:57:40 - 
LMT 1922 Ja 1 7u -7 - MST 1927 Jun 10 23 -6 - CST 1930 N 15 @@ -3141,7 +3150,8 @@ Z America/Ojinaga -6:57:40 - LMT 1922 Ja 1 7u -6 - CST 1998 Ap Su>=1 3 -7 m M%sT 2010 -7 u M%sT 2022 O 30 2 --6 - CST +-6 - CST 2022 N 30 +-6 u C%sT Z America/Chihuahua -7:4:20 - LMT 1922 Ja 1 7u -7 - MST 1927 Jun 10 23 -6 - CST 1930 N 15 @@ -3771,7 +3781,7 @@ Z Antarctica/Palmer 0 - -00 1965 -4 x -04/-03 2016 D 4 -3 - -03 R CO 1992 o - May 3 0 1 - -R CO 1993 o - Ap 4 0 0 - +R CO 1993 o - F 6 24 0 - Z America/Bogota -4:56:16 - LMT 1884 Mar 13 -4:56:16 - BMT 1914 N 23 -5 CO -05/-04 @@ -4154,6 +4164,7 @@ L America/Tijuana America/Ensenada L America/Indiana/Indianapolis America/Fort_Wayne L America/Toronto America/Montreal L America/Toronto America/Nipigon +L America/Iqaluit America/Pangnirtung L America/Rio_Branco America/Porto_Acre L America/Winnipeg America/Rainy_River L America/Argentina/Cordoba America/Rosario From b8431d5a93315f38e02e1a28d4ab647f2c1c3455 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Thu, 2 Feb 2023 18:13:44 +1300 Subject: [PATCH 68/70] Doc: Abstract AF_UNIX sockets don't work on Windows. An early release of AF_UNIX in Windows apparently supported Linux-style "abstract" Unix sockets, but they do not seem to work in current Windows versions and there is no mention of any of this in the Winsock documentation. Remove the mention of Windows from the documentation. Back-patch to 14, where commit c9f0624b landed. Discussion: https://postgr.es/m/CA%2BhUKGKrYbSZhrk4NGfoQGT_3LQS5pC5KNE1g0tvE_pPBZ7uew%40mail.gmail.com --- doc/src/sgml/config.sgml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 5532e63740d..23f60cad528 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -756,7 +756,7 @@ include_dir 'conf.d' A value that starts with @ specifies that a Unix-domain socket in the abstract namespace should be created - (currently supported on Linux and Windows). 
In that case, this value + (currently supported on Linux only). In that case, this value does not specify a directory but a prefix from which the actual socket name is computed in the same manner as for the file-system namespace. While the abstract socket name prefix can be From e8ca834c616cf36c1bcd0781bf8b81b9114eb57f Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Fri, 3 Feb 2023 09:04:35 +0100 Subject: [PATCH 69/70] doc: Fix XML formatting that psql cannot handle Breaking over two lines is not handled by psql's create_help.pl. (It creates faulty \help output.) Undo the formatting change introduced by 9bdad1b5153e5d6b77a8f9c6e32286d6bafcd76d to fix this for now. --- doc/src/sgml/ref/fetch.sgml | 3 +-- doc/src/sgml/ref/move.sgml | 3 +-- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/doc/src/sgml/ref/fetch.sgml b/doc/src/sgml/ref/fetch.sgml index 83d58e54b9d..f0f3ac2a028 100644 --- a/doc/src/sgml/ref/fetch.sgml +++ b/doc/src/sgml/ref/fetch.sgml @@ -29,8 +29,7 @@ PostgreSQL documentation FETCH [ direction ] [ FROM | IN ] cursor_name -where direction can -be one of: +where direction can be one of: NEXT PRIOR diff --git a/doc/src/sgml/ref/move.sgml b/doc/src/sgml/ref/move.sgml index 8378439debb..89b5a241013 100644 --- a/doc/src/sgml/ref/move.sgml +++ b/doc/src/sgml/ref/move.sgml @@ -29,8 +29,7 @@ PostgreSQL documentation MOVE [ direction ] [ FROM | IN ] cursor_name -where direction can -be one of: +where direction can be one of: NEXT PRIOR From 50a00dc32ec808497a459c051c0487636857a38e Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Sun, 5 Feb 2023 16:22:32 -0500 Subject: [PATCH 70/70] Release notes for 15.2, 14.7, 13.10, 12.14, 11.19. 
--- doc/src/sgml/release-14.sgml | 1008 ++++++++++++++++++++++++++++++++++ 1 file changed, 1008 insertions(+) diff --git a/doc/src/sgml/release-14.sgml b/doc/src/sgml/release-14.sgml index 1b8092af963..f4d6d11ca4d 100644 --- a/doc/src/sgml/release-14.sgml +++ b/doc/src/sgml/release-14.sgml @@ -1,6 +1,1014 @@ + + Release 14.7 + + + Release date: + 2023-02-09 + + + + This release contains a variety of fixes from 14.6. + For information about new features in major release 14, see + . + + + + Migration to Version 14.7 + + + A dump/restore is not required for those running 14.X. + + + + However, if you are upgrading from a version earlier than 14.4, + see . + + + + + Changes + + + + + + + Fix calculation of which GENERATED columns need + to be updated in child tables during an UPDATE on + a partitioned table or inheritance tree (Amit Langote, Tom Lane) + + + + This fixes failure to update GENERATED columns + that do not exist in the parent table, or that have different + dependencies than are in the parent column's generation expression. + + + + + + + Allow a WITH RECURSIVE ... CYCLE CTE + to access its output column (Tom Lane) + + + + A reference to the SET column from within the CTE + would fail with cache lookup failed for type 0. + + + + + + + Fix handling of pending inserts when doing a bulk insertion to a + foreign table (Etsuro Fujita) + + + + In some cases pending insertions were not flushed to the FDW soon + enough, leading to logical inconsistencies, for + example BEFORE ROW triggers not seeing rows they + should be able to see. + + + + + + + Allow REPLICA IDENTITY + to be set on an index that's not (yet) valid (Tom Lane) + + + + When pg_dump dumps a partitioned index + that's marked REPLICA IDENTITY, it generates a + command sequence that applies REPLICA IDENTITY + before the partitioned index has been marked valid, causing restore + to fail. There seems no very good reason to prohibit doing it in + that order, so allow it. 
The marking will have no effect anyway + until the index becomes valid. + + + + + + + Fix handling of DEFAULT markers in rules that + perform an INSERT from a + multi-row VALUES list (Dean Rasheed) + + + + In some cases a DEFAULT marker would not get + replaced with the proper default-value expression, leading to + an unrecognized node type error. + + + + + + + Reject uses of undefined variables in jsonpath + existence checks (Alexander Korotkov, David G. Johnston) + + + + While jsonpath match operators threw an error for an + undefined variable in the path pattern, the existence operators + silently treated it as a match. + + + + + + + Fix jsonb subscripting to cope with toasted subscript + values (Tom Lane, David G. Johnston) + + + + Using a text value fetched directly from a table as + a jsonb subscript was likely to fail. + Fetches would usually not find any matching element. + Assignments could store the value with a garbage key, + although keys long enough to cause that problem are probably rare in + the field. + + + + + + + Fix edge-case data corruption in parallel hash joins (Dmitry Astapov) + + + + If the final chunk of a large tuple being written out to a temporary + file was exactly 32760 bytes, it would be corrupted due to a + fencepost bug. The query would typically fail later with + corrupted-data symptoms. + + + + + + + Honor non-default settings + of checkpoint_completion_target + (Bharath Rupireddy) + + + + Internal state was not updated after a change + in checkpoint_completion_target, possibly + resulting in performing checkpoint I/O faster or slower than + desired, especially if that setting was changed on-the-fly. + + + + + + + Log the correct ending timestamp + in recovery_target_xid mode (Tom Lane) + + + + When ending recovery based on the recovery_target_xid + setting with recovery_target_inclusive + = off, we printed an incorrect timestamp (always + 2000-01-01) in the recovery stopping before + ... transaction log message. 
+ + + + + + + Improve error reporting for some buffered file read failures + (Peter Eisentraut) + + + + Correctly report a short read, giving the numbers of bytes desired + and actually read, instead of reporting an irrelevant error code. + Most places got this right already, but some recently-written + replication logic did not. + + + + + + + In extended query protocol, avoid an immediate commit + after ANALYZE if we're running a pipeline + (Tom Lane) + + + + If there's not been an explicit BEGIN + TRANSACTION, ANALYZE would take it on + itself to commit, which should not happen within a pipelined series + of commands. + + + + + + + Reject cancel request packets having the wrong length + (Andrey Borodin) + + + + The server would process a cancel request even if its length word + was too small. This led to reading beyond the end of the allocated + buffer. In theory that could cause a segfault, but it seems quite + unlikely to happen in practice, since the buffer would have to be + very close to the end of memory. The more likely outcome was a bogus + log message about wrong backend PID or cancel code. Complain about + the wrong length, instead. + + + + + + + Add recursion and looping defenses in subquery pullup (Tom Lane) + + + + A contrived query can result in deep recursion and unreasonable + amounts of time spent trying to flatten subqueries. A proper fix + for that seems unduly invasive for a back-patch, but we can at least + add stack depth checks and an interrupt check to allow the query to + be cancelled. + + + + + + + Fix planner issues when combining Memoize nodes with partitionwise + joins or parameterized nestloops (Richard Guo) + + + + These errors could lead to not using Memoize in contexts where it + would be useful, or possibly to wrong query plans. + + + + + + + Fix partitionwise-join code to tolerate failure to produce a plan for + each partition (Tom Lane) + + + + This could result in could not devise a query plan for the + given query errors. 
+ + + + + + + Limit the amount of cleanup work done + by get_actual_variable_range (Simon Riggs) + + + + Planner runs occurring just after deletion of a large number of + tuples appearing at the end of an index could expend significant + amounts of work setting the killed bits for those + index entries. Limit the amount of work done in any one query by + giving up on this process after examining 100 heap pages. All the + cleanup will still happen eventually, but without so large a + performance hiccup. + + + + + + + Fix under-parenthesized display of AT TIME ZONE + constructs (Tom Lane) + + + + This could result in dump/restore failures for rules or views in + which an argument of AT TIME ZONE is itself an + expression. + + + + + + + Prevent clobbering of cached parsetrees for utility statements in + SQL functions (Tom Lane, Daniel Gustafsson) + + + + If a SQL-language function executes the same utility command more + than once within a single calling query, it could crash or report + strange errors such as unrecognized node type. + + + + + + + Ensure that execution of full-text-search queries can be cancelled + while they are performing phrase matches (Tom Lane) + + + + + + + Fix memory leak in hashing strings with nondeterministic collations + (Jeff Davis) + + + + + + + Fix deadlock between DROP DATABASE and logical + replication worker process (Hou Zhijie) + + + + This was caused by an ill-advised choice to block interrupts while + creating a logical replication slot in the worker. In version 15 + that could lead to an undetected deadlock. In version 14, no + deadlock has been observed, but it's still a bad idea to block + interrupts while waiting for network I/O. + + + + + + + Clean up the libpq connection object + after a failed replication connection attempt (Andres Freund) + + + + The previous coding leaked the connection object. In background + code paths that's pretty harmless because the calling process will + give up and exit. 
But in commands such as CREATE + SUBSCRIPTION, such a failure resulted in a small + session-lifespan memory leak. + + + + + + + In hot-standby servers, reduce processing effort for tracking XIDs + known to be active on the primary (Simon Riggs, Michail Nikolaev) + + + + Insufficiently-aggressive cleanup of the KnownAssignedXids array + could lead to poor performance, particularly + when max_connections is set to a large value on + the standby. + + + + + + + Ignore invalidated logical-replication slots while determining + oldest catalog xmin (Sirisha Chamarthi) + + + + A replication slot could prevent cleanup of dead tuples in the + system catalogs even after it becomes invalidated due to + exceeding max_slot_wal_keep_size. Thus, failure + of a replication consumer could lead to indefinitely-large catalog + bloat. + + + + + + + In logical decoding, notify the remote node when a transaction is + detected to have crashed (Hou Zhijie) + + + + After a server restart, we'll re-stream the changes for transactions + occurring shortly before the restart. Some of these transactions + probably never completed; when we realize that one didn't we throw + away the relevant decoding state locally, but we neglected to tell + the subscriber about it. That led to the subscriber keeping useless + streaming files until it's next restarted. + + + + + + + Fix uninitialized-memory usage in logical decoding (Masahiko Sawada) + + + + In certain cases, resumption of logical decoding could try to re-use + XID data that had already been freed, leading to unpredictable + behavior. + + + + + + + Avoid rare failed to acquire cleanup lock panic + during WAL replay of hash-index page split operations (Robert Haas) + + + + + + + Advance a heap page's LSN when setting its all-visible bit during + WAL replay (Jeff Davis) + + + + Failure to do this left the page possibly different on standby + servers than the primary, and violated some other expectations about + when the LSN changes. 
This seems only a theoretical hazard so + far as PostgreSQL itself is concerned, + but it could upset third-party tools. + + + + + + + Prevent unsafe usage of a relation cache + entry's rd_smgr pointer (Amul Sul) + + + + Remove various assumptions that rd_smgr + would stay valid over a series of operations, by wrapping all uses + of it in a function that will recompute it if needed. This prevents + bugs occurring when an unexpected cache flush occurs partway through + such a series. + + + + + + + Fix int64_div_fast_to_numeric() to work for a + wider range of inputs (Dean Rasheed) + + + + This function misbehaved with some values of its second argument. + No such usages exist in core PostgreSQL, + but it's clearly a hazard for external modules, so repair. + + + + + + + Fix latent buffer-overrun problem in WaitEventSet + logic (Thomas Munro) + + + + The epoll-based + and kqueue-based implementations could ask the + kernel for too many events if the size of their internal buffer was + different from the size of the caller's output buffer. That case is + not known to occur in released PostgreSQL + versions, but this error is a hazard for external modules and future + bug fixes. + + + + + + + Avoid nominally-undefined behavior when accessing shared memory in + 32-bit builds (Andres Freund) + + + + clang's undefined-behavior sanitizer complained about use of a + pointer that was less aligned than it should be. It's very unlikely + that this would cause a problem in non-debug builds, but it's worth + fixing for testing purposes. + + + + + + + Fix assertion failure in BRIN minmax-multi opclasses (Tomas Vondra) + + + + The assertion was overly strict, so this mistake was harmless in + non-assert builds. + + + + + + + Remove faulty assertion in useless-RESULT-RTE optimization logic + (Tom Lane) + + + + + + + Fix copy-and-paste errors in cache-lookup-failure messages for ACL + checks (Justin Pryzby) + + + + In principle these errors should never be reached. 
But if they are, + some of them reported the wrong type of object. + + + + + + + In pg_dump, + avoid calling unsafe server functions before we have locks on the + tables to be examined (Tom Lane, Gilles Darold) + + + + pg_dump uses certain server functions + that can fail if examining a table that gets dropped concurrently. + Avoid this type of failure by ensuring that we obtain access share + lock before inquiring too deeply into a table's properties, and that + we don't apply such functions to tables we don't intend to dump at + all. + + + + + + + Fix psql's \sf + and \ef commands to handle SQL-language functions + that have SQL-standard function bodies (Tom Lane) + + + + These commands misidentified the start of the function body when it + used new-style syntax. + + + + + + + Fix tab completion of ALTER + FUNCTION/PROCEDURE/ROUTINE ... SET + SCHEMA (Dean Rasheed) + + + + + + + Fix contrib/seg to not crash or print garbage + if an input number has more than 127 digits (Tom Lane) + + + + + + + Fix build on Microsoft Visual Studio 2013 (Tom Lane) + + + + A previous patch supposed that all platforms of interest + have snprintf(), but MSVC 2013 isn't quite + there yet. Revert to using sprintf() on that + platform. + + + + + + + Fix compile failure in building PL/Perl with MSVC when using + Strawberry Perl (Andrew Dunstan) + + + + + + + Fix mismatch of PL/Perl built with MSVC versus a Perl library built + with gcc (Andrew Dunstan) + + + + Such combinations could previously fail with loadable library + and perl binaries are mismatched errors. + + + + + + + Suppress compiler warnings from Perl's header files (Andres Freund) + + + + Our preferred compiler options provoke warnings about constructs + appearing in recent versions of Perl's header files. When using + gcc, we can suppress these warnings with + a pragma. 
+ + + + + + + Fix pg_waldump to build on compilers that + don't discard unused static-inline functions (Tom Lane) + + + + + + + Update time zone data files to tzdata + release 2022g for DST law changes in Greenland and Mexico, + plus historical corrections for northern Canada, Colombia, and + Singapore. + + + + Notably, a new timezone America/Ciudad_Juarez has been split off + from America/Ojinaga. + + + + + + + + Release 14.6