psql: FATAL: Failed to get transaction id from GTM. Maybe GTM failed or lost connection

Question


I want to create a Postgres-XL cluster with 5 nodes: 1 GTM, 2 coordinators, and 2 datanodes. The details of each node are listed below.

    GTM:          hostname=localhost  nodename=gtm     IP=127.0.0.1  port=20001
    Coordinator1: hostname=localhost  nodename=coord1  IP=127.0.0.1  pooler_port=30011, port=30001
    Coordinator2: hostname=host2      nodename=coord2  IP=10.4.6.36  pooler_port=30012, port=30002
    Datanode1:    hostname=localhost  nodename=dn1     IP=127.0.0.1  pooler_port=40011, port=40001
    Datanode2:    hostname=host2      nodename=dn2     IP=10.4.6.36  pooler_port=40012, port=40002

I set up pgxc_ctl and added /usr/local/pgsql/bin to PATH for postgres. I configured SSH authentication so that pgxc_ctl does not need to enter a password. I edited postgresql.conf and pg_hba.conf on both nodes.

Then I built the cluster as follows:

    $ pgxc_ctl
    PGXC$ add gtm master gtm localhost 20001 $dataDirRoot/gtm
    PGXC$ add coordinator master coord1 localhost 30001 30011 $dataDirRoot/coord_master.1 none none
    PGXC$ add coordinator master coord2 10.4.6.36 30002 30012 $dataDirRoot/coord_master.2 none none

After adding coord2, I got the following error:

    psql: FATAL: Failed to get transaction id from GTM. Maybe GTM failed or lost connection

    PGXC$ add datanode master dn1 localhost 40001 40011 $dataDirRoot/dn_master.1 none none none
    PGXC$ add datanode master dn2 10.4.6.36 40002 40012 $dataDirRoot/dn_master.2 none none none

After adding dn2, I got the following error:

    ERROR: Failed to get pooled connections
    HINT: This may happen because one or more nodes are currently unreachable, either because of node or network failure. Its also possible that the target node may have hit the connection limit or the pooler is configured with low connections. Please check if all nodes are running fine and also review max_connections and max_pool_size configuration parameters

But when I monitor all the nodes, it shows:

    PGXC$ monitor all
    Running: gtm master
    Running: coordinator master coord1
    Running: coordinator master coord2
    Running: datanode master dn1
    Running: datanode master dn2

I could not connect to coord2 by running

  psql -h 10.4.6.36 -p 30002 -U user -d postgres

It shows:

    psql: FATAL: Failed to get transaction id from GTM. Maybe GTM failed or lost connection

But I could connect to coord1 by running

 psql -p 30001 -U user -d postgres

I can reach host2 from localhost without a password. How can I resolve the errors above? Any help is appreciated. Here is my configuration:

    pgxcInstallDir=$HOME/pgxc
    pgxcOwner=$USER
    pgxcUser=$pgxcOwner
    tmpDir=/tmp
    localTmpDir=$tmpDir
    configBackup=n
    configBackupHost=pgxc-linker
    configBackupDir=$HOME/pgxc
    configBackupFile=pgxc_ctl.bak
    dataDirRoot=$HOME/DATA/pgxl/nodes

    #---- Coordinators ----------------------------------------------------
    coordMasterDir=$dataDirRoot/coord_master
    coordSlaveDir=$HOME/coord_slave
    coordArchLogDir=$HOME/coord_archlog
    coordExtraConfig=coordExtraConfig
    cat > $coordExtraConfig <<EOF
    #================================================
    # Added to all the coordinator postgresql.conf
    # Original: $coordExtraConfig
    log_destination = 'stderr'
    logging_collector = on
    log_directory = 'pg_log'
    listen_addresses = '*'
    max_pool_size=300
    max_connections=200
    hot_standby = off
    EOF

    #---- Datanodes -------------------------------------------------------
    datanodeMasterDir=$dataDirRoot/dn_master
    datanodeSlaveDir=$dataDirRoot/dn_slave
    datanodeArchLogDir=$dataDirRoot/datanode_archlog
    datanodeExtraConfig=datanodeExtraConfig
    cat > $datanodeExtraConfig <<EOF
    #================================================
    # Added to all the datanode postgresql.conf
    # Original: $datanodeExtraConfig
    log_destination = 'stderr'
    logging_collector = on
    log_directory = 'pg_log'
    listen_addresses = '*'
    max_pool_size=300
    max_connections=200
    hot_standby = off
    EOF

    #---- GTM -------------------------------------------------------------
    gtmName=gtm
    gtmMasterServer=localhost
    gtmMasterPort=20001
    gtmMasterDir=$dataDirRoot/gtm

    coordNames=( coord1 coord2 )
    coordMasterServers=( localhost 10.4.6.36 )
    coordPorts=( 30001 30002 )
    poolerPorts=( 30011 30012 )
    coordMasterDirs=( $dataDirRoot/coord_master.1 $dataDirRoot/coord_master.2 )
    coordMaxWALSenders=( 5 5 )
    coordSlave=n
    coordSlaveServers=( none none )
    coordSlavePorts=( none none )
    coordSlavePoolerPorts=( none none )
    coordSlaveDirs=( none none )
    coordArchLogDirs=( none none )
    coordSpecificExtraConfig=( coordExtraConfig coordExtraConfig )
    coordSpecificExtraPgHba=( none none )

    datanodeNames=( dn1 dn2 )
    datanodeMasterServers=( localhost 10.4.6.36 )
    datanodePorts=( 40001 40002 )
    datanodePoolerPorts=( 40011 40012 )
    datanodeMasterDirs=( $dataDirRoot/dn_master.1 $dataDirRoot/dn_master.2 )
    datanodeMasterWALDirs=( none none )
    datanodeMaxWALSenders=( 5 5 )
    datanodeSpecificExtraConfig=( datanodeExtraConfig datanodeExtraConfig )
    datanodeSpecificExtraPgHba=( none none )


postgresql postgresql-9.5 postgres-xl

Pratheesh m Feb 13 '18 at 10:12


1 answer

tukan · Answer 1 · 2018-02-22T08:53:40+0000

Could you show us your configuration?

What are your max_connections and max_pool_size? What did initdb report for your kernel limits? I suspect that by the time you add datanode2 (dn2), you do not have enough connections.
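To answer the question about the current values, one option is to read them straight from a node's postgresql.conf. The sketch below is only an illustration: `conf_value` is a hypothetical helper (not part of pgxc_ctl or Postgres-XL), and it assumes plain `name = value` lines where the last uncommented occurrence takes effect, which is how PostgreSQL treats repeated settings in the same file. The demo file just mimics the values from the question's extra config.

```shell
#!/usr/bin/env bash
# conf_value FILE NAME  (hypothetical helper, not part of pgxc_ctl or Postgres-XL)
# Prints the value of the last uncommented "NAME = value" line in FILE.
conf_value() {
    local file=$1 name=$2
    sed -n -E "s/^[[:space:]]*${name}([[:space:]]*=[[:space:]]*|[[:space:]]+)([^#[:space:]]+).*/\2/p" "$file" | tail -n 1
}

# Demo on a throwaway file that mimics a postgresql.conf with the
# question's extra config appended at the end.
demo_conf=$(mktemp)
cat > "$demo_conf" <<'EOF'
max_connections = 100        # earlier default entry
listen_addresses = '*'
max_pool_size=300
max_connections=200
EOF
echo "max_connections = $(conf_value "$demo_conf" max_connections)"
echo "max_pool_size   = $(conf_value "$demo_conf" max_pool_size)"
rm -f "$demo_conf"
```

On a real node you would point the helper at the postgresql.conf inside the node's data directory, for example `$dataDirRoot/coord_master.2` on host2.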

You have:

"The cluster includes 5 nodes, 1 GTM, 2 coordinators and 2 Datanodes."

Postgres-XL:

    max_pool_size=300
    max_coordinators=2
    max_datanodes=2

In the case of the Coordinator (minimum settings):

    max_connections=100             # number of connections accepted from the application(s)
    max_prepared_transactions=100   # same as the number of connections

In the case of a Datanode (minimum settings):

    max_connections=200             # Coordinator max_connections times the number of Coordinators (2)
    max_prepared_transactions=2     # at least the total number of Coordinators in the cluster

Excerpt from the Postgres-XL documentation:

max_connections (integer)

Determines the maximum number of concurrent connections to the database server. The default is typically 100 connections, but might be less if your kernel settings will not support it (as determined during initdb). This parameter can only be set at server start.
When running a standby server, you must set this parameter to the same or higher value than on the master server. Otherwise, queries will not be allowed on the standby server.
In the case of the Coordinator, this parameter determines how many connections each Coordinator can accept.
In the case of the Datanode, the number of connections to each Datanode may become as large as max_connections multiplied by the number of Coordinators.

max_pool_size (integer)

Specifies the maximum size of the Coordinator's connection pool to the Datanodes. Because each transaction can involve all Datanodes, this parameter should be at least max_connections multiplied by the number of Datanodes.
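The two sizing rules in this excerpt are simple multiplications, so they can be checked quickly. The following is a minimal sketch, assuming the max_connections in the max_pool_size rule means the Coordinator's value; the function and variable names are made up for illustration, and the numbers are the ones from the question's configuration (max_connections=200, max_pool_size=300, 2 Coordinators, 2 Datanodes).

```shell
#!/usr/bin/env bash
# Hypothetical helpers that restate the two rules from the excerpt.

# Datanode max_connections >= Coordinator max_connections * number of Coordinators
min_datanode_connections() { echo $(( $1 * $2 )); }

# max_pool_size >= Coordinator max_connections * number of Datanodes (assumed reading)
min_pool_size() { echo $(( $1 * $2 )); }

# Values taken from the question's configuration.
coord_max_connections=200
pool_size=300
num_coordinators=2
num_datanodes=2

need_dn=$(min_datanode_connections "$coord_max_connections" "$num_coordinators")
need_pool=$(min_pool_size "$coord_max_connections" "$num_datanodes")

echo "Datanode max_connections should be at least $need_dn"
echo "max_pool_size should be at least $need_pool"
if [ "$pool_size" -lt "$need_pool" ]; then
    echo "max_pool_size=$pool_size is below that guideline"
fi
```

Under this reading, a Coordinator max_connections of 200 with two Datanodes calls for a pool of at least 400, while a Coordinator max_connections of 100 needs only 200.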

Edit - in response to the configuration added to the question

Try the following:

Coordinator
```
max_connections=100
max_pool_size=300
```
Datanode (you have defined 2 datanodes)
```
max_connections=200
max_pool_size=500
```
