Within six months, a customer of mine twice ran out of space on the
database partition of a VMware vCenter Server Appliance (vCSA). After the
first outage, the customer increased the size of the partition, which is
mounted at /storage/db. A few months later, just some days ago, the vCSA
became unresponsive again, once more because of a full database
partition. The customer increased the size of the database partition
again (~ 200 GB!), and today I had time to take a look at this nasty
vCSA.
The situation
Within 2 days, the storage usage of the database partition increased from 75% to 77%. First, I checked the size of the database:
vcsa:/opt/vmware/vpostgres/current/bin # /opt/vmware/vpostgres/current/bin/psql -h localhost -U vc VCDB
psql.bin (9.0.17)
Type "help" for help.
VCDB=> SELECT pg_database.datname, pg_size_pretty(pg_database_size(pg_database.datname)) AS size FROM pg_database;
datname | size
-----------+---------
template1 | 5353 kB
template0 | 5345 kB
postgres | 5449 kB
VCDB | 2007 MB
(4 rows)
VCDB=>
As you can see, the database itself was only about 2 GB in size. The pg_log directory was more interesting:
vcsa:/storage/db/vpostgres # du -shc /storage/db/vpostgres/*
4.0K /storage/db/vpostgres/PG_VERSION
2.0G /storage/db/vpostgres/base
704K /storage/db/vpostgres/global
47M /storage/db/vpostgres/pg_clog
4.0K /storage/db/vpostgres/pg_hba.conf
4.0K /storage/db/vpostgres/pg_ident.conf
141G /storage/db/vpostgres/pg_log
252K /storage/db/vpostgres/pg_multixact
12K /storage/db/vpostgres/pg_notify
324K /storage/db/vpostgres/pg_stat_tmp
20K /storage/db/vpostgres/pg_subtrans
4.0K /storage/db/vpostgres/pg_tblspc
4.0K /storage/db/vpostgres/pg_twophase
81M /storage/db/vpostgres/pg_xlog
20K /storage/db/vpostgres/postgresql.conf
4.0K /storage/db/vpostgres/postmaster.opts
4.0K /storage/db/vpostgres/postmaster.pid
0 /storage/db/vpostgres/serverlog
143G total
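When a directory has many entries, sorting the du output by size makes the outlier easier to spot. The following is only a minimal sketch: the function name largest_entries is hypothetical, and it assumes a du that supports -k (sizes in KiB), which lets sort compare the values numerically.

```shell
#!/bin/sh
# Rank the entries directly under a directory by disk usage, largest first.
# Usage: largest_entries DIR [COUNT]
# (hypothetical helper, not part of the vCSA or of PostgreSQL)
largest_entries() {
    dir=${1:-.}
    count=${2:-5}
    # du -sk prints one "SIZE_IN_KIB<TAB>PATH" line per entry;
    # sort -rn orders those lines numerically, largest first.
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}

# Example call: largest_entries /storage/db/vpostgres 3
```

On the appliance described here, such a call should list pg_log first, since it accounts for 141G of the 143G total.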
The pg_log directory was full of log files. The log files contained only one message:
vcsa:/storage/db/vpostgres/pg_log # more postgresql-2015-03-04_090525.log
123462 tm:2015-03-04 09:05:25.488 UTC db:VCDB pid:1527 WARNING: there is already a transaction in progress
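To check whether a set of log files really contains just one kind of message, the distinct messages can be tallied. This is a rough sketch under assumptions: the function name tally_log_messages is hypothetical, and the prefix handling assumes every line follows the format seen in the excerpt above, with the message text starting after the "pid:NNNN " field.

```shell
#!/bin/sh
# Count how often each distinct message occurs in the *.log files of a directory.
# Usage: tally_log_messages DIR
# (hypothetical helper; assumes the "... pid:NNNN MESSAGE" line format)
tally_log_messages() {
    dir=$1
    # Strip everything up to and including "pid:NNNN ", so that lines which
    # differ only in counter, timestamp or pid collapse into one message.
    cat "$dir"/*.log 2>/dev/null \
        | sed 's/^.*pid:[0-9][0-9]* //' \
        | sort | uniq -c | sort -rn
}

# Example call: tally_log_messages /path/to/a/few/copied/logs
```

The function reads and sorts every file it is given, so on a directory of this size it is probably wiser to run it against a handful of copied log files than against the whole of pg_log.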
The solution
This led me to VMware KB2092127
(After upgrading to vCenter Server Appliance 5.5 Update 2, pg_log file
reports this error: WARNING: there is already a transaction in
progress). And yes, this appliance had most likely been upgraded to
Update 2. The workaround is described in KB2092127 and is really easy
to implement. Please note that it is only a workaround: as the article
mentions, there is currently no permanent fix.