Using Replication

The --replication option of PgOSM Flex enables osm2pgsql-replication to provide an easy and quick way to keep your OpenStreetMap data refreshed.

PgOSM Flex's --replication mode wraps the osm2pgsql-replication tool included with osm2pgsql. The first import with --replication runs osm2pgsql normally, with --slim mode and without --drop. After osm2pgsql completes, osm2pgsql-replication init ... is run to set up the database for updates. This mode of operation results in a larger database because the intermediate osm2pgsql tables (--slim) must be left in the database (no --drop).

Important: The original --append option is now under --replication. The --append option was removed in PgOSM Flex 0.7.0. See #275 for context.

Use tagged version

When using replication, pin your process to a specific PgOSM Flex version in the docker run command. When upgrading to a new version, check the release notes for manual upgrade steps for --replication.
The release notes for PgOSM Flex 0.6.1 are one example. The upgrade steps discussed in the release notes have reference SQL scripts under the db/data-migration folder.

WARNING - Because layersets can be customized, these data-migration scripts need manual review and possibly manual adjustments for your specific database and process.
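Pinning can be made explicit in a wrapper script so that a floating tag never slips in by accident. The following is a minimal sketch, not part of PgOSM Flex: the function name pgosm_image is hypothetical, and the only detail taken from this guide is the image name rustprooflabs/pgosm-flex.

```shell
# Hypothetical helper (not shipped with PgOSM Flex): build the image
# reference from a pinned version and refuse an empty or floating tag.
pgosm_image() {
    if [ -z "$1" ] || [ "$1" = "latest" ]; then
        echo "Refusing unpinned PgOSM Flex version: '$1'" >&2
        return 1
    fi
    echo "rustprooflabs/pgosm-flex:$1"
}
```

The result can then be used as the image argument of docker run, for example by assigning IMAGE="$(pgosm_image 0.10.0)" first and stopping the script if the call fails.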

Not tested by `make`

The functionality exposed by --replication is not tested via PgOSM Flex's Makefile.

Max connections

The other important change when using replication is to increase Postgres' max_connections. See this discussion on osm2pgsql for why this is necessary.

If using the Docker-internal Postgres instance, this is done with -c max_connections=300 in the docker run command. When using an external database, update this setting in the appropriate postgresql.conf file.

export POSTGRES_USER=postgres
export POSTGRES_PASSWORD=mysecretpassword

docker run --name pgosm -d --rm \
    -v ~/pgosm-data:/app/output \
    -v /etc/localtime:/etc/localtime:ro \
    -e POSTGRES_USER=$POSTGRES_USER \
    -e POSTGRES_PASSWORD=$POSTGRES_PASSWORD \
    -p 5433:5432 \
    rustprooflabs/pgosm-flex:0.10.0 \
        -c max_connections=300
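For an external database, it can help to confirm what the configuration file actually sets before starting a replication run. The sketch below is an illustration only: the helper name conf_max_connections is hypothetical, the path to postgresql.conf is something you supply, and it assumes the value is written as a plain unquoted integer.

```shell
# Hypothetical helper: print the max_connections value set in a
# postgresql.conf file. Commented-out lines are ignored. If the setting
# appears more than once, the last entry is printed, since Postgres
# ignores all but the last entry for a parameter.
conf_max_connections() {
    sed -n -E 's/^[[:space:]]*max_connections[[:space:]]*=?[[:space:]]*([0-9]+).*/\1/p' \
        "$1" | tail -n 1
}
```

The file only shows the configured value. A running server can be asked directly with SHOW max_connections; from any SQL client, for example psql connected to port 5433 in the Docker setup above.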

Using `--replication`

Run the docker exec step with --replication.

docker exec -it \
    pgosm python3 docker/pgosm_flex.py \
    --ram=8 \
    --region=north-america/us \
    --subregion=district-of-columbia \
    --replication

Running the above command a second time will detect that the target database has osm2pgsql-replication set up and load data via the defined replication service.
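Because a repeated run applies updates, the same command can be wrapped for scheduled use. This is a rough sketch under stated assumptions rather than a supported PgOSM Flex feature: the function name run_replication and the PGOSM_CONTAINER and DRY_RUN variables are hypothetical, and -it is dropped on the assumption that a scheduler such as cron provides no TTY.

```shell
# Hypothetical wrapper around the docker exec command shown above.
# Arguments: RAM in GB, region, subregion.
run_replication() {
    ram="$1"; region="$2"; subregion="$3"
    # Rebuild the positional parameters as the full command line.
    # No -it here: non-interactive schedulers do not provide a TTY.
    set -- docker exec "${PGOSM_CONTAINER:-pgosm}" \
        python3 docker/pgosm_flex.py \
        "--ram=${ram}" "--region=${region}" "--subregion=${subregion}" \
        --replication
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"   # print the command instead of running it
    else
        "$@"
    fi
}
```

Calling run_replication 8 north-america/us district-of-columbia with DRY_RUN=1 prints the command without touching Docker, which is a convenient way to review it before scheduling.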

One replication source

Replication with PgOSM Flex is limited to one data source per database. While it is possible to load multiple regions, each into its own schema using --schema-name, replication via osm2pgsql-replication only supports a single source. See this issue for details. This ability may be supported in the future.

Resetting Replication

⚠️ WARNING! ⚠️ This section is only suitable for DEVELOPMENT databases. Do NOT USE on production databases!

PgOSM Flex's --replication is simply a wrapper around the osm2pgsql-replication tool. To reload a development database after using --replication, you must first remove the data from the public.osm2pgsql_properties table. If you do not remove this data, PgOSM Flex will detect the replication setup and attempt to update the data instead of loading it fresh.

DELETE FROM public.osm2pgsql_properties;

WARNING: This process works as a hack when the new import uses the same layerset as the previous import. If you use a layerset with fewer tables, the tables from the original layerset will persist and can cause confusion about what was loaded.
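Since the reset is destructive, a small guard can make it harder to run against the wrong database. The sketch below is only an illustration: the function name reset_replication and the CONFIRM_DEV_RESET variable are hypothetical, and the connection string is whatever points at your development database.

```shell
# Hypothetical guard around the DELETE statement shown above.
# Argument: a libpq connection string for a DEVELOPMENT database.
reset_replication() {
    if [ "${CONFIRM_DEV_RESET:-}" != "yes" ]; then
        echo "Set CONFIRM_DEV_RESET=yes to reset a development database." >&2
        return 1
    fi
    # Show what is about to be removed, then remove it.
    psql -d "$1" -c 'SELECT * FROM public.osm2pgsql_properties;' \
        && psql -d "$1" -c 'DELETE FROM public.osm2pgsql_properties;'
}
```

Without the confirmation variable the function returns an error before psql is ever invoked, so an accidental call does nothing.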
