Share the knowledge: vCenter Server Appliance: Troubleshooting full database partition

http://www.vcloudnine.de/vcenter-server-appliance-troubleshooting-full-database-partition/

A customer of mine had a full database partition on a VMware vCenter Server Appliance (vCSA) twice within six months. After the first outage, the customer increased the size of the partition mounted at /storage/db. A few months later, just some days ago, the vCSA became unresponsive again, once more because the database partition had filled up. The customer increased the size of the database partition again (~200 GB!), and today I had time to take a look at this nasty vCSA.

The situation

Within 2 days, the storage usage of the database partition increased from 75% to 77%. First, I checked the size of the database:

vcsa:/opt/vmware/vpostgres/current/bin # /opt/vmware/vpostgres/current/bin/psql -h localhost -U vc VCDB
psql.bin (9.0.17)
Type "help" for help.

VCDB=> SELECT pg_database.datname, pg_size_pretty(pg_database_size(pg_database.datname)) AS size FROM pg_database;
  datname  |  size
-----------+---------
 template1 | 5353 kB
 template0 | 5345 kB
 postgres  | 5449 kB
 VCDB      | 2007 MB
(4 rows)

VCDB=>

As you can see, the database was only about 2 GB in size. The pg_log directory was more interesting:

vcsa:/storage/db/vpostgres # du -shc /storage/db/vpostgres/*
4.0K    /storage/db/vpostgres/PG_VERSION
2.0G    /storage/db/vpostgres/base
704K    /storage/db/vpostgres/global
47M     /storage/db/vpostgres/pg_clog
4.0K    /storage/db/vpostgres/pg_hba.conf
4.0K    /storage/db/vpostgres/pg_ident.conf
141G    /storage/db/vpostgres/pg_log
252K    /storage/db/vpostgres/pg_multixact
12K     /storage/db/vpostgres/pg_notify
324K    /storage/db/vpostgres/pg_stat_tmp
20K     /storage/db/vpostgres/pg_subtrans
4.0K    /storage/db/vpostgres/pg_tblspc
4.0K    /storage/db/vpostgres/pg_twophase
81M     /storage/db/vpostgres/pg_xlog
20K     /storage/db/vpostgres/postgresql.conf
4.0K    /storage/db/vpostgres/postmaster.opts
4.0K    /storage/db/vpostgres/postmaster.pid
0       /storage/db/vpostgres/serverlog
143G    total
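
With this many entries, the big one is easy to miss. As a minimal sketch, the same check can be sorted so the largest entries come first. The function name `largest_entries` is hypothetical, not something shipped with the vCSA, and the sketch assumes `du`, `sort` and `head` are available on the appliance shell.

```shell
# Hypothetical helper: list the N largest entries directly under a directory.
# Sizes are reported in KiB (du -k) so that sort can compare them numerically.
largest_entries() {
    dir=$1
    count=${2:-5}   # default: show the five largest entries
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}
```

Called as `largest_entries /storage/db/vpostgres 3`, it should put pg_log at the top on an appliance in the state shown above.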

The directory was full of log files. The log files contained only one message:

vcsa:/storage/db/vpostgres/pg_log # more postgresql-2015-03-04_090525.log
 123462 tm:2015-03-04 09:05:25.488 UTC db:VCDB pid:1527 WARNING:  there is already a transaction in progress
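
Paging through files with `more` gets tedious when there are many of them. The following sketch counts, per log file, how many lines contain a given message compared with the total number of lines. The function name `count_message` is hypothetical, and the sketch assumes the log files end in `.log`, as in the listing above.

```shell
# Hypothetical helper: for every *.log file in a directory, print how many
# lines contain the given message and how many lines the file has in total.
count_message() {
    log_dir=$1
    message=$2
    for f in "$log_dir"/*.log; do
        [ -f "$f" ] || continue   # skip if the glob matched nothing
        # grep -c prints 0 and returns non-zero when nothing matches
        hits=$(grep -cF -- "$message" "$f" || true)
        total=$(wc -l < "$f" | tr -d ' ')
        printf '%s %s/%s\n' "$f" "$hits" "$total"
    done
}
```

For example, `count_message /storage/db/vpostgres/pg_log "there is already a transaction in progress"` prints one line per file in the form `file hits/total`. If both numbers match for a file, it contains nothing but that warning.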

The solution

This led me to VMware KB2092127 (After upgrading to vCenter Server Appliance 5.5 Update 2, pg_log file reports this error: WARNING: there is already a transaction in progress). And yes, this appliance had most likely been upgraded to U2. The workaround is described in KB2092127 and is really easy to implement. Please note that it is only a workaround: as mentioned in the article, there is currently no permanent solution.

Thursday, March 12, 2015
