Who needs comments?My friend and colleague Enrico told me about one of those hidden features of
The option allows you to dump the database (or the part of it) without dumping any user defined comment, that is no comment on tables, data types, and nothing you placed with an explicit
This made me lough at firts: why should I don’t want comments on my dump? Are we still back in the ninenties where people thought that hiding information was a good strategy to ensure their job?
However, there are some cases I can think about where you don’t want comments. For example, some extensions use comments on objects to perform some magic, and
pgauditcomes to mind. But it is not always true that you need to replicate the same configuration on another database, hence you should strip off the comments.
Having a quick look at
pg_dump avoids comments
pg_dumpsource code, the function
dumpTableCommentrepresent a good introduction to how the comments are dumped or not. In particular, in the very beginning of the function, you can find something like:
/* do nothing, if --no-comments is supplied */ if (dopt->no_comments) return; /* Comments are SCHEMA not data */ if (dopt->dataOnly) return; /* Search for comments associated with relation, using table */ ncomments = findComments(fout, tbinfo->dobj.catId.tableoid, tbinfo->dobj.catId.oid, &comments);
--no-commentscommand line option is set (i.e.,
dopt->no_commentsis true), the function returns immediatly since there is nothing to do.
Interestingly, if the user wants to dump only the data for the database, and not its schema, the comments are not dumped too. That’s quite obvious if you think about.
findCommentsfunction is in charge of going to the storage to retrieve the comments, and it does in a strange way. It invokes, in turn,
collectComments, that executes a query like the following one:
appendPQExpBufferStr(query, "SELECT description, classoid, objoid, objsubid " "FROM pg_catalog.pg_description " "ORDER BY classoid, objoid, objsubid");
Do you see something strange there? There is no
WHEREclause in the query! It does mean that the function is going to get all the comments from all the objects in the database, as reported also by the function comments preamble:
/* * collectComments -- * * Construct a table of all comments available for database objects. * We used to do per-object queries for the comments, but it's much faster * to pull them all over at once, and on most databases the memory cost * isn't high. * * The table is sorted by classoid/objid/objsubid for speed in lookup. */
The idea is that all the comments are retrieved on a single pass, and then
findCommentsperforms a kind of binary search to find out the exact range of comments that match the object it is dumping at that moment (i.e., the table).