勉強日記

チラ裏

PostgreSQL 高度技術者育成テキスト ch3 (運用管理) (2/2)

LPI-Japan OSS-DB Gold 認定教材 PostgreSQL 高度技術者育成テキスト

LPI-Japan OSS-DB Gold 認定教材 PostgreSQL 高度技術者育成テキスト

  • 作者:河原 翔
  • 発売日: 2014/10/27
  • メディア: オンデマンド (ペーパーバック)


メンテナンス

VACUUM

  • FULLとそうでないのは別用途
    • FULL
      • 更新/削除された行を物理的に切り詰める
        • ページサイズ小さくなる
    • FULLでないの
  • 作りたてのテーブルに1000000行挿入
CREATE TABLE tbl (col int);
INSERT INTO tbl VALUES (generate_series(1,1000000));
SELECT relname,reltuples,relpages FROM pg_class WHERE relname='tbl';
-[ RECORD 1 ]----
relname   | tbl
reltuples | 1e+06
relpages  | 4425
  • pgstatuple
SELECT * FROM pgstattuple('tbl');
-[ RECORD 1 ]------+---------
table_len          | 36249600
tuple_count        | 1000000
tuple_len          | 28000000
tuple_percent      | 77.24
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 125700
free_percent       | 0.35
  • 半分削除して再度挿入
DELETE FROM tbl WHERE col%2=0;
INSERT INTO tbl VALUES(generate_series(2,1000000,2));

ANALYZE tbl;
SELECT relname,reltuples,relpages FROM pg_class WHERE relname='tbl';
-[ RECORD 1 ]----
relname   | tbl
reltuples | 1e+06
relpages  | 6638
  • pgstattuple
SELECT * FROM pgstattuple('tbl');
-[ RECORD 1 ]------+---------
table_len          | 54378496
tuple_count        | 1000000
tuple_len          | 28000000
tuple_percent      | 51.49
dead_tuple_count   | 500000
dead_tuple_len     | 14000000
dead_tuple_percent | 25.75
free_space         | 192632
free_percent       | 0.35
  • vacuumしてみる
VACUUM VERBOSE tbl;
INFO:  vacuuming "public.tbl"
INFO:  "tbl": found 0 removable, 38 nonremovable row versions in 1 out of 6638 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 523
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM
ANALYZE tbl;
SELECT relname,reltuples,relpages FROM pg_class WHERE relname='tbl';
  • FULLでないので物理的なページサイズは変わらず
-[ RECORD 1 ]----
relname   | tbl
reltuples | 1e+06
relpages  | 6638
  • dead tupleは回収された
SELECT * FROM pgstattuple('tbl');
-[ RECORD 1 ]------+---------
table_len          | 54378496
tuple_count        | 1000000
tuple_len          | 28000000
tuple_percent      | 51.49
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 16192632
free_percent       | 29.78
  • VACUUM FULL
VACUUM FULL VERBOSE tbl;
INFO:  vacuuming "public.tbl"
INFO:  "tbl": found 0 removable, 1000000 nonremovable row versions in 6638 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU: user: 0.17 s, system: 0.04 s, elapsed: 0.37 s.
VACUUM
SELECT relname,reltuples,relpages FROM pg_class WHERE relname='tbl';
  • 物理的なページ数も減少
-[ RECORD 1 ]----
relname   | tbl
reltuples | 1e+06
relpages  | 4425
SELECT * FROM pgstattuple('tbl');
-[ RECORD 1 ]------+---------
table_len          | 36249600
tuple_count        | 1000000
tuple_len          | 28000000
tuple_percent      | 77.24
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 125700
free_percent       | 0.35

トランザクションID周回問題

XIDs are compared using modulo-2^32 arithmetic.

  • トランザクションIDは32ビット整数なので、20億が新、20億が旧となる
  • ので、少なくとも20億トランザクションごとにfreezing処理を行わないと、タプルの可視性がおかしくなる
  • freezing処理はVACUUMで行われる
    • 手動
    • autovacuum
  • freezingの昔話
    • ~9.4まではFrozenTransactionId(2)を「どのトランザクションIDよりも古いトランザクションID」扱いとし、この値で行のxminを上書きしていた
    • 新しいバージョンでは、単にフラグを立てる
      • 元のxmin値は調査用(forensic use)に保存される
  • 各データベースの、最古のXIDから現在のXIDまでのトランザクション
SELECT datname,age(datfrozenxid) FROM pg_database;
  datname  | age 
-----------+-----
 postgres  | 132
 template1 | 132
 template0 | 132
(3 rows)
SELECT relname,age(relfrozenxid) FROM pg_class where relkind='r';
         relname         | age 
-------------------------+-----
 pg_statistic            | 132
 pg_type                 | 132
 tbl                     |  88
...
(71 rows)
  • 公式/pg_class
  • relkind = 'r'は「通常のテーブル」
  • VERBOSE FREEZEでfreezing処理を行う
VACUUM FREEZE VERBOSE;
...
INFO:  aggressively vacuuming "public.tbl"
INFO:  "tbl": found 0 removable, 176 nonremovable row versions in 1 out of 4425 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 612
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 4424 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
...
  • ageが0になる
SELECT datname,age(datfrozenxid) FROM pg_database;
  datname  | age 
-----------+-----
 postgres  |   0
 template1 | 132
 template0 | 132
(3 rows)
SELECT relname,age(relfrozenxid) FROM pg_class WHERE relkind='r';
         relname         | age 
-------------------------+-----
 pg_statistic            |   0
 pg_type                 |   0
 tbl                     |   0
...

遅延vacuum

postgresql.conf

# - Cost-Based Vacuum Delay -

#vacuum_cost_delay = 0          # 0-100 milliseconds (0 disables)
#vacuum_cost_page_hit = 1       # 0-10000 credits
#vacuum_cost_page_miss = 10     # 0-10000 credits
#vacuum_cost_page_dirty = 20        # 0-10000 credits
#vacuum_cost_limit = 200        # 1-10000 credits
  • 他のデータベース活動のI/Oとvacuumとが競合するのを避ける機能
  • vacuumのI/O操作をコストとして積算し、閾値を超えたらvacuumを一時中断(nap)

autovacuum

  • 前回VACUUM後のUPDATE/DELETE回数が閾値を超えたらVACUUMが自動実行される
    • cf. AUTOANALYZEはINSERTも

VACUUM/autovacuumの動作確認

DELETE FROM tbl WHERE col%2=0;
INSERT INTO tbl VALUES (generate_series(2,1000000,2));

TABLE pg_stat_user_tables;
-[ RECORD 1 ]-------+------------------------------
relid               | 16458
schemaname          | public
relname             | tbl
seq_scan            | 6
seq_tup_read        | 6500000
idx_scan            | 
idx_tup_fetch       | 
n_tup_ins           | 1500000
n_tup_upd           | 0
n_tup_del           | 500000
n_tup_hot_upd       | 0
n_live_tup          | 1000000
n_dead_tup          | 0
n_mod_since_analyze | 0
last_vacuum         | 2020-02-28 10:43:07.031521+00
last_autovacuum     | 2020-02-28 10:10:47.467538+00
last_analyze        | 2020-02-28 10:43:19.134225+00
last_autoanalyze    | 2020-02-28 10:10:47.596796+00
vacuum_count        | 6
autovacuum_count    | 1
analyze_count       | 5
autoanalyze_count   | 2
VACUUM VERBOSE tbl;
INFO:  vacuuming "public.tbl"
INFO:  "tbl": found 0 removable, 38 nonremovable row versions in 1 out of 6638 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 658
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 4424 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM
TABLE pg_stat_user_tables;
-[ RECORD 1 ]-------+------------------------------
relid               | 16458
schemaname          | public
relname             | tbl
seq_scan            | 7
seq_tup_read        | 7500000
idx_scan            | 
idx_tup_fetch       | 
n_tup_ins           | 2000000
n_tup_upd           | 0
n_tup_del           | 1000000
n_tup_hot_upd       | 0
n_live_tup          | 999887
n_dead_tup          | 0
n_mod_since_analyze | 0
last_vacuum         | 2020-02-28 10:56:19.6923+00
last_autovacuum     | 2020-02-28 10:55:19.445485+00
last_analyze        | 2020-02-28 10:43:19.134225+00
last_autoanalyze    | 2020-02-28 10:55:19.784912+00
vacuum_count        | 7
autovacuum_count    | 2
analyze_count       | 5
autoanalyze_count   | 3
  • n_dead_tupが0に
  • last_vacuumが更新された

ANALYZE

autoanalyze

  • 前回ANALYZE後のINSERT/UPDATE/DELETE回数が閾値を超えたらANALYZEが自動実行される

analyzeの動作確認

  • VACUUM同様、pg_stat_user_tablesで確認できる
    • last_analyze
    • last_autoanalyze
  • 統計情報の内容はpg_statsビューで確認できる
SELECT * FROM pg_stats WHERE tablename='tbl';
-[ RECORD 1 ]----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
schemaname             | public
tablename              | tbl
attname                | col
inherited              | f
null_frac              | 0
avg_width              | 4
n_distinct             | -1
most_common_vals       | 
most_common_freqs      | 
histogram_bounds       | {19,10102,20969,30202,40396,50128,59774,68357,77885,86540,96720,105907,117040,127045,137472,148207,157800,167979,177740,187833,197898,207900,216911,226639,236399,246664,256203,266022,275753,285657,295739,305197,315949,326191,337116,347317,357285,367682,377767,387847,397098,407678,417632,426761,436724,446741,456940,466834,476455,486075,496231,506432,516392,525813,535829,545462,555056,565962,575183,585422,594819,605429,616488,625835,635298,645977,655527,665398,675739,685726,695043,704075,714334,723537,733644,743675,754400,764724,774897,785568,796077,806288,816085,827060,837054,847974,858128,867579,878894,888263,898976,909578,920909,931088,940888,950540,960383,969776,979597,989158,999997}
correlation            | 0.50307155
most_common_elems      | 
most_common_elem_freqs | 
elem_count_histogram   | 
  • 半分重複させてみる
DELETE FROM tbl where col%2=1;
INSERT INTO tbl VALUES(generate_series(2,1000000,2));
ANALYZE VERBOSE tbl;
SELECT * FROM pg_stats WHERE tablename='tbl';
  • n_distinct-0.483089くらいになった
    • すべて同じ値だと1になる
-[ RECORD 1 ]----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
schemaname             | public
tablename              | tbl
attname                | col
inherited              | f
null_frac              | 0
avg_width              | 4
n_distinct             | -0.483089
most_common_vals       | 
most_common_freqs      | 
histogram_bounds       | {26,9890,19898,29156,40100,49796,60944,71946,81258,91518,102058,111782,121298,131442,141590,151562,161292,170440,180748,190446,201294,212782,223116,232946,242624,253124,262706,271792,282286,293080,303706,314286,324250,334808,345646,355838,367076,377294,387430,397838,407538,417268,427288,437266,447684,458166,467658,478280,487960,497870,507468,516416,527334,537138,546636,556092,566000,575662,584946,596124,605754,616234,625962,635204,644802,653914,664466,674872,685876,696172,706478,715658,725776,735222,744456,754098,763182,773792,783348,792702,802528,811422,821902,831464,841780,850256,860542,870374,879638,889338,899712,909398,919792,928878,938944,949642,960718,971236,980278,990056,999966}
correlation            | 0.50362015
most_common_elems      | 
most_common_elem_freqs | 
elem_count_histogram   | 
  • さらに半分nullにしてみる
UPDATE tbl SET col=NULL WHERE col%4 = 0;
UPDATE 500000
ANALYZE VERBOSE tbl;
SELECT * FROM  pg_stats WHERE tablename='tbl';
  • null_fracが0.5くらいになった
-[ RECORD 1 ]----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
schemaname             | public
tablename              | tbl
attname                | col
inherited              | f
null_frac              | 0.49976668
avg_width              | 4
n_distinct             | -0.250741
most_common_vals       | 
most_common_freqs      | 
histogram_bounds       | {54,11650,21694,32594,42594,51594,60842,71170,80226,90234,99242,108918,118618,127514,138226,148014,157494,167914,179554,187762,197670,207254,216886,228398,239698,249158,257918,268682,278310,289482,299546,309826,320706,330278,339650,349810,359334,368354,377998,387910,398234,407914,418358,429418,439782,450646,459646,470690,480318,489470,499570,509142,519122,529766,540122,551214,560758,571286,581142,591654,602454,611542,620406,630498,640286,649750,658370,668138,679082,689726,699530,709022,719130,729038,739838,749910,758998,770234,780942,790942,800190,810590,821378,830766,842154,852194,861906,870846,879310,889578,900118,909250,920126,931898,941630,950682,960702,970050,979846,990058,999898}
correlation            | 0.50730336
most_common_elems      | 
most_common_elem_freqs | 
elem_count_histogram   | 
  • 可変長カラムの場合
CREATE TABLE tbl2 (name text);
-- 126文字
INSERT INTO tbl2 VALUES ('zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz');
ANALYZE tbl2;
SELECT * FROM pg_stats WHERE tablename='tbl2';
  • widthは127
    • 126文字以下なのでヘッダ長が1バイト
-[ RECORD 1 ]----------+-------
schemaname             | public
tablename              | tbl2
attname                | name
inherited              | f
null_frac              | 0
avg_width              | 127
n_distinct             | -1
most_common_vals       | 
most_common_freqs      | 
histogram_bounds       | 
correlation            | 
most_common_elems      | 
most_common_elem_freqs | 
elem_count_histogram   | 
-- 127文字
UPDATE tbl2 SET name='zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz';
ANALYZE tbl2;
SELECT * FROM pg_stats WHERE tablename='tbl2';
  • widthは131
    • 127文字になるとヘッダ長が4バイトになるので
-[ RECORD 1 ]----------+-------
schemaname             | public
tablename              | tbl2
attname                | name
inherited              | f
null_frac              | 0
avg_width              | 131
n_distinct             | -1
most_common_vals       | 
most_common_freqs      | 
histogram_bounds       | 
correlation            | 
most_common_elems      | 
most_common_elem_freqs | 
elem_count_histogram   | 

REINDEX

-- DROP TABLE tbl;

CREATE TABLE tbl(id int primary key);
INSERT INTO tbl (id) VALUES(generate_series(1,1000000));
SELECT relname,reltuples,relpages FROM pg_class WHERE relname='tbl_pkey';
-[ RECORD 1 ]-------
relname   | tbl_pkey
reltuples | 1e+06
relpages  | 2745
  • UPDATEによりインデックスは断片化する
UPDATE tbl SET id=id WHERE id%2=0;
ANALYZE tbl;

SELECT relname,reltuples,relpages FROM pg_class WHERE relname='tbl_pkey';
-[ RECORD 1 ]-------
relname   | tbl_pkey
reltuples | 1e+06
relpages  | 5486
  • 有効データの密度が下がっている
SELECT * FROM pgstatindex('tbl_pkey');
-[ RECORD 1 ]------+---------
version            | 4
tree_level         | 2
index_size         | 44941312
root_block_no      | 412
internal_pages     | 20
leaf_pages         | 5465
empty_pages        | 0
deleted_pages      | 0
avg_leaf_density   | 45.19
leaf_fragmentation | 49.99
  • インデックスはvacuumできない
VACUUM tbl_pkey;
WARNING:  skipping "tbl_pkey" --- cannot vacuum non-tables or special system tables
VACUUM
  • 断片化したインデックスはREINDEXで再構成する
REINDEX (VERBOSE) INDEX tbl_pkey;
INFO:  index "tbl_pkey" was reindexed
DETAIL:  CPU: user: 0.18 s, system: 0.06 s, elapsed: 0.33 s
REINDEX
SELECT * FROM pgstatindex('tbl_pkey');
-[ RECORD 1 ]------+---------
version            | 4
tree_level         | 2
index_size         | 22487040
root_block_no      | 290
internal_pages     | 11
leaf_pages         | 2733
empty_pages        | 0
deleted_pages      | 0
avg_leaf_density   | 90.06
leaf_fragmentation | 0

サーバログ

  • どこに・いつ・何をログ出力する
  • どのようにログを保持する

どこに

# - Where to Log -

#log_destination = 'stderr'     # Valid values are combinations of
                    # stderr, csvlog, syslog, and eventlog,
                    # depending on platform.  csvlog
                    # requires logging_collector to be on.

# This is used when logging to stderr:
#logging_collector = off        # Enable capturing of stderr and csvlog
                    # into log files. Required to be on for
                    # csvlogs.
                    # (change requires restart)

# These are only used if logging_collector is on:
#log_directory = 'log'          # directory where log files are written,
                    # can be absolute or relative to PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'    # log file name pattern,
                    # can include strftime() escapes
#log_file_mode = 0600           # creation mode for log files,
                    # begin with 0 to use octal notation

...

# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'
#syslog_sequence_numbers = on
#syslog_split_messages = on

# This is only relevant when logging to eventlog (win32):
# (change requires restart)
#event_source = 'PostgreSQL'

いつ

# - When to Log -

#log_min_messages = warning     # values in order of decreasing detail:
                    #   debug5
                    #   debug4
                    #   debug3
                    #   debug2
                    #   debug1
                    #   info
                    #   notice
                    #   warning
                    #   error
                    #   log
                    #   fatal
                    #   panic

#log_min_error_statement = error    # values in order of decreasing detail:
                    #   debug5
                    #   debug4
                    #   debug3
                    #   debug2
                    #   debug1
                    #   info
                    #   notice
                    #   warning
                    #   error
                    #   log
                    #   fatal
                    #   panic (effectively off)

#log_min_duration_statement = -1    # -1 is disabled, 0 logs all statements
                    # and their durations, > 0 logs only
                    # statements running at least this number
                    # of milliseconds

#log_transaction_sample_rate = 0.0  # Fraction of transactions whose statements
                    # are logged regardless of their duration. 1.0 logs all
                    # statements from all transactions, 0.0 never logs.
  • クライアントにエラーを送信する設定
#------------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#------------------------------------------------------------------------------

# - Statement Behavior -

#client_min_messages = notice       # values in order of decreasing detail:
                    #   debug5
                    #   debug4
                    #   debug3
                    #   debug2
                    #   debug1
                    #   log
                    #   notice
                    #   warning
                    #   error
  • client側はlogのレベルが低い
    • cf. server側はlogのレベルがfatalに次いで高い
      • 設定値のほとんどでlogは出力するということ

何を

# - What to Log -

#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = on
#log_checkpoints = off
#log_connections = off
#log_disconnections = off
#log_duration = off
#log_error_verbosity = default      # terse, default, or verbose messages
#log_hostname = off
#log_line_prefix = '%m [%p] '       # special values:
                    #   %a = application name
                    #   %u = user name
                    #   %d = database name
                    #   %r = remote host and port
                    #   %h = remote host
                    #   %p = process ID
                    #   %t = timestamp without milliseconds
                    #   %m = timestamp with milliseconds
                    #   %n = timestamp with milliseconds (as a Unix epoch)
                    #   %i = command tag
                    #   %e = SQL state
                    #   %c = session ID
                    #   %l = session line number
                    #   %s = session start timestamp
                    #   %v = virtual transaction ID
                    #   %x = transaction ID (0 if none)
                    #   %q = stop here in non-session
                    #        processes
                    #   %% = '%'
                    # e.g. '<%u%%%d> '
#log_lock_waits = off           # log lock waits >= deadlock_timeout
#log_statement = 'none'         # none, ddl, mod, all
#log_replication_commands = off
#log_temp_files = -1            # log temporary files equal or larger
                    # than the specified size in kilobytes;
                    # -1 disables, 0 logs all temp files
log_timezone = 'Etc/UTC'

どのように保持する

  • ローテーション、追記or上書きの設定など
#log_truncate_on_rotation = off      # If on, an existing log file with the
                    # same name as the new log file will be
                    # truncated rather than appended to.
                    # But such truncation only occurs on
                    # time-driven rotation, not on restarts
                    # or size-driven rotation.  Default is
                    # off, meaning append to existing files
                    # in all cases.
#log_rotation_age = 1d          # Automatic rotation of logfiles will
                    # happen after that time.  0 disables.
#log_rotation_size = 10MB       # Automatic rotation of logfiles will
                    # happen after that much log output.
                    # 0 disables.

レプリケーションとホットスタンバイ

状態確認

  • 【master】WAL位置
SELECT * FROM pg_current_wal_lsn();
 pg_current_wal_lsn 
--------------------
 0/3013230
(1 row)
  • 【standby】WAL適用位置
SELECT * FROM pg_last_wal_receive_lsn();
 pg_last_wal_receive_lsn 
-------------------------
 0/3013230
(1 row)
  • 【standby】WAL適用日時
SELECT * FROM pg_last_xact_replay_timestamp()
 pg_last_xact_replay_timestamp 
-------------------------------
 2020-02-28 13:56:58.162382+00
(1 row)

コンフリクト軽減

postgresql.conf

#max_standby_archive_delay = 30s # max delay before canceling queries
                    # when reading WAL from archive;
                    # -1 allows indefinite delay
#max_standby_streaming_delay = 30s  # max delay before canceling queries
                    # when reading streaming WAL;
                    # -1 allows indefinite delay
#vacuum_defer_cleanup_age = 0   # number of xacts by which cleanup is delayed
  • スタンバイで、プライマリからWALを受け取ってディスクに書き込んだが、コンフリクトして適用は不可能な場合、クエリをキャンセルするまでの待ち時間
    • from archive
    • from streaming
  • vacuumせずに残しておくトランザクションの世代の数
    • デフォルトの0だと最新以外消す