Creating many (really many) users in PostgreSQL

In his post Hans-Jürgen Schönig showed how to easily and quickly create a million users in PostgreSQL; taking inspiration from such post, I decided to stress one of my virtual machines with a little more complex user creation use-case.

Roles in Roles

One feature of PostgreSQL roles is that they can contain other roles, creating a hierarchy of roles. Therefore, I decided to write a simple plpgsql function to loop creating a chain of roles at each iteration. The function f_users accepts two integers:
  • deep is the number of roles within the single role inheritance chain;
  • how_many is the number of iterations.
As a result, the procedure will create ( 1 + deep ) x how_many roles. Each role name is made by a random string and the iteration number, therefore preventing as much as possible collisions. The function code is as follows:
CREATE OR REPLACE FUNCTION f_users( deep int, how_many int )
RETURNS VOID
AS
$BODY$
DECLARE
        main_role_name     text;
        current_role_name  text;
        current_level      int;
        iteration          int;
        query              text;
BEGIN

<<LP_MAIN>>
 FOR iteration IN 1..how_many LOOP
     -- main role
     main_role_name := 'role_test_' || md5( random()::text ) || '_' || iteration;
     RAISE DEBUG 'Main role is %', main_role_name;
     query := 'CREATE ROLE ' || main_role_name || ' WITH NOLOGIN CONNECTION LIMIT 0;';
     RAISE DEBUG '%', query;
     EXECUTE query;

     <<LP_DEEP>>
       FOR current_level IN 1..deep BY 1 LOOP
        current_role_name := main_role_name || '_' || current_level;
        RAISE DEBUG 'Level % -> role %', current_level, current_role_name;
        query := 'CREATE ROLE ' || current_role_name || ' WITH IN ROLE ' || main_role_name ||  ' NOLOGIN CONNECTION LIMIT 0;';
        RAISE DEBUG '%', query;
        EXECUTE query;
      END LOOP LP_DEEP;
 END LOOP LP_MAIN;

END;
$BODY$
LANGUAGE plpgsql;
Please note the above code can be optimized reducing the number of RAISE (that implies string concatenation). The connection limit 0 is for safety reasons: it is not desiderable to have such automatically created roles to be of any practical use.

Results

The first attempt was short and sweet: 5000 roles within 1000 groups.
# SELECT f_users( 5, 1000 );
 f_users
---------

(1 row)

Time: 965.479 ms
As readers can see, this took less than a second to perform. What about a 10x factor?
# SELECT f_users( 5, 10000 );
 f_users
---------

(1 row)

Time: 9118.100 ms
It seems time is growing linearly. Increase by a 5x factor:
# SELECT f_users( 5, 50000 );
 f_users
---------

(1 row)

Time: 104680.382 ms
To recap, the following is the timing of role creations:
Groups Levels ROLES TIME
1000 5 6000 1 sec
  2 3000 0.3 sec
10000 5 60000 10 sec
  2 30000 2.7 sec
50000 5 300000 105 sec
  2 150000 36 sec
for a total of 549000 roles in 155 secs. So time is not really increasing linearly, but as readers can see PostgreSQL can easily handle a half million roles in less than three minutes. What about the virtual machine? Well, it is a poor FreeBSD 11.1-RELEASE running PostgreSQL 9.6 with 512 MB of RAM without WAL archiving or any other replication active. I cannot hit one million roles in a single shoot in such machine because it starts swapping until the swap daemon freezes. In order to confirm such, let’s consider how many roles there are in my system:
# SELECT count(*) FROM pg_roles;
 count
--------
 552018
the final result is greater than what is expected because I had already a discrete amount of roles. Not so bad for a database!

The article Creating many (really many) users in PostgreSQL has been posted by Luca Ferrari on January 4, 2018