PostgreSQL 高度技術者育成テキスト ch2 (基本整理)
LPI-Japan OSS-DB Gold 認定教材 PostgreSQL 高度技術者育成テキスト
- 作者:河原 翔
- 発売日: 2014/10/27
- メディア: オンデマンド (ペーパーバック)
プロセス構造
docker exec 2a459 ps auxfww | grep postgres
postgres 62 0.2 0.0 2472 1688 pts/0 Ss+ 13:00 0:00 /bin/sh postgres 1 0.0 0.0 213900 26180 ? Ss 13:00 0:00 postgres postgres 56 0.0 0.0 213900 4016 ? Ss 13:00 0:00 postgres: checkpointer postgres 57 0.0 0.0 213900 5776 ? Ss 13:00 0:00 postgres: background writer postgres 58 0.0 0.0 213900 10040 ? Ss 13:00 0:00 postgres: walwriter postgres 59 0.0 0.0 214440 6756 ? Ss 13:00 0:00 postgres: autovacuum launcher postgres 60 0.0 0.0 68280 4580 ? Ss 13:00 0:00 postgres: stats collector postgres 61 0.0 0.0 214324 6676 ? Ss 13:00 0:00 postgres: logical replication launcher
- loggerプロセスは
logging_collector
がon
のときのみ- ログ出力を行うプロセス
ALTER SYSTEM SET logging_collector='on'; -- then, restart
... postgres 26 0.0 0.0 68196 4712 ? Ss 13:07 0:00 postgres: logger ...
- archiverプロセスは
archive_mode
がon
のときのみ- 不要になったWALを別の領域に保存するプロセス
ALTER SYSTEM SET archive_mode='on'; -- then, restart
... postgres 32 0.0 0.0 68280 4764 ? Ss 13:11 0:00 postgres: archiver ...
- walsender/walreceiverはレプリケーション時のみ
docker exec streaming_replication_master_1 ps auxfww | grep postgres
... postgres 34 0.0 0.0 214712 9548 ? Ss 13:13 0:00 postgres: walsender replication_user 172.30.0.3(47138) streaming 0/3013180 ...
docker exec streaming_replication_standby1_1 ps auxfww | grep postgres
postgres 20 0.0 0.0 214580 10640 ? Ss 13:13 0:00 \_ postgres: walreceiver streaming 0/3013180
- logical replication launcher/workerはロジカルレプリケーション時のみ
docker exec logical_replication_sub_1 ps auxfww | grep postgres
postgres 110 0.0 0.0 214100 6756 ? Ss 13:15 0:00 postgres: logical replication launcher postgres 111 0.0 0.0 215604 14760 ? Ss 13:15 0:00 postgres: logical replication worker for subscription 16390
- バックエンドプロセス
- クライアント(フロントエンド)ごとに立ち上がる
postgres 5240 0.0 0.0 215348 11988 ? Ss 13:05 0:00 postgres: postgres postgres [local] idle
利用する主なメモリ
- 共有バッファ
- ダーティページをメモリ上に持っておくやつ
- WALバッファ
- WALを書き込むときのやつ
- ワークメモリ
- ソートやハッシュ操作などでバックエンドプロセスが使用
- 増やすとGroupAggregateがHashAggregateになったりする
- メンテナンスワークメモリ
- VACUUMやCREATE INDEXなどで使用
データベースクラスタの構造
/docker:postgres@2a45:/var/lib/postgresql/data $ ls -lA
total 104 -rw------- 1 postgres postgres 3 2020-02-27 22:00 PG_VERSION drwx------ 5 postgres postgres 4096 2020-02-27 22:00 base -rw------- 1 postgres postgres 44 2020-02-27 22:11 current_logfiles drwx------ 2 postgres postgres 4096 2020-02-27 22:11 global drwx------ 2 postgres postgres 4096 2020-02-27 22:11 log drwx------ 2 postgres postgres 4096 2020-02-27 22:00 pg_commit_ts drwx------ 2 postgres postgres 4096 2020-02-27 22:00 pg_dynshmem -rw------- 1 postgres postgres 4537 2020-02-27 22:00 pg_hba.conf -rw------- 1 postgres postgres 1636 2020-02-27 22:00 pg_ident.conf drwx------ 4 postgres postgres 4096 2020-02-27 22:11 pg_logical drwx------ 4 postgres postgres 4096 2020-02-27 22:00 pg_multixact drwx------ 2 postgres postgres 4096 2020-02-27 22:11 pg_notify drwx------ 2 postgres postgres 4096 2020-02-27 22:00 pg_replslot drwx------ 2 postgres postgres 4096 2020-02-27 22:00 pg_serial drwx------ 2 postgres postgres 4096 2020-02-27 22:00 pg_snapshots drwx------ 2 postgres postgres 4096 2020-02-27 22:07 pg_stat drwx------ 2 postgres postgres 4096 2020-02-27 22:22 pg_stat_tmp drwx------ 2 postgres postgres 4096 2020-02-27 22:00 pg_subtrans drwx------ 2 postgres postgres 4096 2020-02-27 22:00 pg_tblspc drwx------ 2 postgres postgres 4096 2020-02-27 22:00 pg_twophase drwx------ 3 postgres postgres 4096 2020-02-27 22:00 pg_wal drwx------ 2 postgres postgres 4096 2020-02-27 22:00 pg_xact -rw------- 1 postgres postgres 133 2020-02-27 22:10 postgresql.auto.conf -rw------- 1 postgres postgres 26588 2020-02-27 22:00 postgresql.conf -rw------- 1 postgres postgres 36 2020-02-27 22:11 postmaster.opts -rw------- 1 postgres postgres 94 2020-02-27 22:11 postmaster.pid
データの格納方法
base/
- テーブルファイル
/docker:postgres@2a45:/var/lib/postgresql/data $ ls -lA base/1
total 7801 -rw------- 1 postgres postgres 8192 2020-02-27 22:00 112 -rw------- 1 postgres postgres 8192 2020-02-27 22:00 113 -rw------- 1 postgres postgres 81920 2020-02-27 22:00 1247 -rw------- 1 postgres postgres 24576 2020-02-27 22:00 1247_fsm -rw------- 1 postgres postgres 8192 2020-02-27 22:00 1247_vm -rw------- 1 postgres postgres 442368 2020-02-27 22:00 1249 -rw------- 1 postgres postgres 24576 2020-02-27 22:00 1249_fsm -rw------- 1 postgres postgres 8192 2020-02-27 22:00 1249_vm ...
- ページサイズ単位
- デフォルト設定でPostgreSQLをビルドすると8192B
- fsm: free space map
- 再利用できる領域を覚えておくやつ
- vm: visibility map
- vacuumをサボるためのやつ
- インデックスオンリースキャンでも使う
- インデックスもページ単位
pg_wal/
- WALファイル領域
- パフォーマンスを確保しつつACIDのdurabilityを実現するためのやつ
pg_tblspc/
- テーブル空間へのsymlinkを格納する
- テーブル空間
base/
のほか、任意の領域をデータ保存先にできる- ユースケース
- 高価で高速な領域
- 安価で低速な領域
- バックアップ時に取得を忘れないよう注意
contribモジュール
- インストール済のEXTENSIONはpg_extension`ビューで確認可能
SELECT extname,extversion FROM pg_extension;
extname | extversion ---------+------------ plpgsql | 1.0 (1 row)
pgbench
- TPC-Bベース
- TPC: Transaction Processing Committee で策定されたベンチマーク
- 測定結果にその旨が書いてある
transaction type: <builtin: TPC-B (sort of)>
- つかってみる
createdb benchdb -- ベンチ用DB作る pgbench -i benchdb -- 初期化
dropping old tables... NOTICE: table "pgbench_accounts" does not exist, skipping NOTICE: table "pgbench_branches" does not exist, skipping NOTICE: table "pgbench_history" does not exist, skipping NOTICE: table "pgbench_tellers" does not exist, skipping creating tables... generating data... 100000 of 100000 tuples (100%) done (elapsed 0.06 s, remaining 0.00 s) vacuuming... creating primary keys... done.
pgbench benchdb -- 実施
starting vacuum...end. transaction type: <builtin: TPC-B (sort of)> scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 number of transactions per client: 10 number of transactions actually processed: 10/10 latency average = 2.321 ms tps = 430.859910 (including connections establishing) tps = 463.329766 (excluding connections establishing)
ベンチマーク用DBの内容
\d+
List of relations Schema | Name | Type | Owner --------+------------------+-------+---------- public | pgbench_accounts | table | postgres public | pgbench_branches | table | postgres public | pgbench_history | table | postgres public | pgbench_tellers | table | postgres (4 rows)
- 件数・fillfactor
SELECT relname,reltuples,reloptions FROM pg_class WHERE relname LIKE 'pgbench_%';
relname | reltuples | reloptions -----------------------+-----------+------------------ pgbench_accounts | 100000 | {fillfactor=100} pgbench_accounts_pkey | 100000 | pgbench_branches | 1 | {fillfactor=100} pgbench_branches_pkey | 1 | pgbench_history | 0 | pgbench_tellers | 10 | {fillfactor=100} pgbench_tellers_pkey | 10 | (7 rows)
- 初期化時のオプションで調整できる
pgbench -i -s 10 -F 70 benchdb
SELECT relname,reltuples,reloptions FROM pg_class WHERE relname LIKE 'pgbench_%';
relname | reltuples | reloptions -----------------------+-----------+----------------- pgbench_accounts | 1e+06 | {fillfactor=70} pgbench_accounts_pkey | 1e+06 | pgbench_branches | 10 | {fillfactor=70} pgbench_branches_pkey | 10 | pgbench_history | 0 | pgbench_tellers | 100 | {fillfactor=70} pgbench_tellers_pkey | 100 | (7 rows)
ベンチマーク処理の中身
ALTER SYSTEM SET log_statement='all';
pgbench benchdb
- ログ
2020-02-27 14:06:52.333 UTC [370] LOG: statement: BEGIN; 2020-02-27 14:06:52.333 UTC [370] LOG: statement: UPDATE pgbench_accounts SET abalance = abalance + -3533 WHERE aid = 40109; 2020-02-27 14:06:52.334 UTC [370] LOG: statement: SELECT abalance FROM pgbench_accounts WHERE aid = 40109; 2020-02-27 14:06:52.334 UTC [370] LOG: statement: UPDATE pgbench_tellers SET tbalance = tbalance + -3533 WHERE tid = 5; 2020-02-27 14:06:52.334 UTC [370] LOG: statement: UPDATE pgbench_branches SET bbalance = bbalance + -3533 WHERE bid = 1; 2020-02-27 14:06:52.334 UTC [370] LOG: statement: INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (5, 1, 40109, -3533, CURRENT_TIMESTAMP); 2020-02-27 14:06:52.335 UTC [370] LOG: statement: END;
- トランザクションで下記の処理を実施
-
pgbench_accounts
の1行更新 -
pgbench_accounts
の1行参照 -
pgbench_tellers
の1行更新 -
pgbench_branches
の1行更新 -
pgbench_history
に1行追加
-
- これをvacuum後に10回実施
2020-02-27 14:06:52.318 UTC [369] LOG: statement: vacuum pgbench_branches 2020-02-27 14:06:52.329 UTC [369] LOG: statement: vacuum pgbench_tellers
starting vacuum...end. transaction type: <builtin: TPC-B (sort of)> scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 number of transactions per client: 10 number of transactions actually processed: 10/10 latency average = 2.727 ms tps = 366.747718 (including connections establishing) tps = 393.761247 (excluding connections establishing)
負荷の調整
- 一部オプション抜粋
Options to select what to run: -b, --builtin=NAME[@W] add builtin script NAME weighted at W (default: 1) (use "-b list" to list available scripts) ... -S, --select-only perform SELECT-only transactions (same as "-b select-only") Benchmarking options: -c, --client=NUM number of concurrent database clients (default: 1) -C, --connect establish new connection for each transaction ... -j, --jobs=NUM number of threads (default: 1) -l, --log write transaction times to log file ... -t, --transactions=NUM number of transactions each client runs (default: 10) -T, --time=NUM duration of benchmark test in seconds
測定結果の見方
starting vacuum...end. transaction type: <builtin: TPC-B (sort of)> scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 number of transactions per client: 10 number of transactions actually processed: 10/10 latency average = 2.727 ms tps = 366.747718 (including connections establishing) tps = 393.761247 (excluding connections establishing)
- 設定パラメータ
transaction type: <builtin: TPC-B (sort of)> scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 number of transactions per client: 10
- トランザクション数
- 完了/予定
number of transactions actually processed: 10/10
- Transaction per Second
tps = 366.747718 (including connections establishing) tps = 393.761247 (excluding connections establishing)
-C, --connect
を指定すると、これらの差分から接続のオーバヘッドを確認できる
pgbench --connect
starting vacuum...end. transaction type: <builtin: TPC-B (sort of)> scaling factor: 10 query mode: simple number of clients: 1 number of threads: 1 number of transactions per client: 10 number of transactions actually processed: 10/10 latency average = 5.937 ms tps = 168.432116 (including connections establishing) tps = 256.256503 (excluding connections establishing)
oid2name
oidと名前一覧を出力
- 引数なしで呼び出し
oid2name
All databases: Oid Database Name Tablespace ---------------------------------- 16384 benchdb pg_default 13408 postgres pg_default 13407 template0 pg_default 1 template1 pg_default
テーブル空間一覧
mkdir -m 700 /var/lib/postgresql/tblspc/
CREATE TABLESPACE my_table_space LOCATION '/var/lib/postgresql/tblspc/';
oid2name -s # --tablespaces でも可
All tablespaces: Oid Tablespace Name ------------------------ 1663 pg_default 1664 pg_global 16482 my_table_space
指定のデータベースのデータベースオブジェクト一覧
postgres
データベースのテーブルのみ
oid2name -d postgres
From database "postgres": Filenode Table Name ---------------------------- 16420 pgbench_accounts 16421 pgbench_branches 16481 pgbench_history 16423 pgbench_tellers 16483 tbl
- インデックスも
oid2name -d postgres -i
From database "postgres": Filenode Table Name --------------------------------- 16420 pgbench_accounts 16428 pgbench_accounts_pkey 16421 pgbench_branches 16424 pgbench_branches_pkey 16481 pgbench_history 16423 pgbench_tellers 16426 pgbench_tellers_pkey 16483 tbl
- ファイルノード指定
oid2name -f 16483
From database "postgres": Filenode Table Name ---------------------- 16483 tbl
注意点
oid2name --help
oid2name helps examining the file structure used by PostgreSQL. ... Connection options: -d, --dbname=DBNAME database to connect to -h, --host=HOSTNAME database server host or socket directory -H same as -h, deprecated option -p, --port=PORT database server port number -U, --username=USERNAME connect as specified database user
-H
(deprecated)- 他のツール群では
-h
でホスト指定するところを、oid2name
は例外的に-H
だったらしい
- 他のツール群では
pgstattuple/pgstatindex
- 公式/pgstattuple
- ソースコードは
contrib/pgbench/
配下 CREATE EXTENSION
で利用
CREATE EXTENSION pgstattuple; SELECT extname,extversion FROM pg_extension;
extname | extversion -------------+------------ plpgsql | 1.0 pgstattuple | 1.5 (2 rows)
pgstattuple
SELECT * FROM pgstattuple('tbl');
-[ RECORD 1 ]------+------ table_len | 8192 tuple_count | 100 tuple_len | 2800 tuple_percent | 34.18 dead_tuple_count | 0 dead_tuple_len | 0 dead_tuple_percent | 0 free_space | 4564 free_percent | 55.71
- 1レコードあたりのサイズ等算出できる
- DELETE/INSERTしてdead tupleを増やしてみる
DELETE FROM tbl WHERE col % 2 = 0; INSERT INTO tbl VALUES (generate_series(2,100,2)); SELECT * FROM pgstattuple('tbl');
- dead tupleが50行増加
- free_space減少
-[ RECORD 1 ]------+------ table_len | 8192 tuple_count | 100 tuple_len | 2800 tuple_percent | 34.18 dead_tuple_count | 50 dead_tuple_len | 1400 dead_tuple_percent | 17.09 free_space | 2764 free_percent | 33.74
- もっとDELETE\INSERTしてみる
DELETE FROM tbl WHERE col % 2 = 0; INSERT INTO tbl VALUES (generate_series(2,100,2)); DELETE FROM tbl WHERE col % 2 = 0; INSERT INTO tbl VALUES (generate_series(2,100,2)); DELETE FROM tbl WHERE col % 2 = 0; INSERT INTO tbl VALUES (generate_series(2,100,2)); SELECT * FROM pgstattuple('tbl');
- 2ページ(2 x 8192B)になった
-[ RECORD 1 ]------+------ table_len | 16384 tuple_count | 100 tuple_len | 2800 tuple_percent | 17.09 dead_tuple_count | 50 dead_tuple_len | 1400 dead_tuple_percent | 8.54 free_space | 10328 free_percent | 63.04
- VACUUMして再度見てみる
VACUUM VERBOSE tbl;
INFO: vacuuming "public.tbl" INFO: "tbl": found 0 removable, 100 nonremovable row versions in 2 out of 2 pages DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 652 There were 200 unused item identifiers. Skipped 0 pages due to buffer pins, 0 frozen pages. 0 pages are entirely empty. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. VACUUM
SELECT * FROM pgstattuple('tbl');
- dead tupleはなくなった
- VACUUM FULLではないので物理的な領域は回収されない (2ページのまま)
-[ RECORD 1 ]------+------ table_len | 16384 tuple_count | 100 tuple_len | 2800 tuple_percent | 17.09 dead_tuple_count | 0 dead_tuple_len | 0 dead_tuple_percent | 0 free_space | 11928 free_percent | 72.8
- VACUUM FULLして再度見てみる
VACUUM FULL VERBOSE tbl;
INFO: vacuuming "public.tbl" INFO: "tbl": found 0 removable, 100 nonremovable row versions in 2 pages DETAIL: 0 dead row versions cannot be removed yet. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. VACUUM
SELECT * FROM pgstattuple('tbl');
- 物理的な領域も回収された
-[ RECORD 1 ]------+------ table_len | 8192 tuple_count | 100 tuple_len | 2800 tuple_percent | 34.18 dead_tuple_count | 0 dead_tuple_len | 0 dead_tuple_percent | 0 free_space | 4564 free_percent | 55.71
pgstatindex
CREATE INDEX ON tbl using BTREE (col); SELECT * FROM pgstatindex('tbl_col_idx');
-[ RECORD 1 ]------+------ version | 4 tree_level | 0 index_size | 16384 root_block_no | 1 internal_pages | 0 leaf_pages | 1 empty_pages | 0 deleted_pages | 0 avg_leaf_density | 24.58 leaf_fragmentation | 0
- 2ページある(2 x 8192B)のはメタページとrootページが1つずつあるから
root_block_no
が1なのは「1番目のページがroot」の意
- 木の深さ(
tree_level
)は0- rootがleafでもある(
leaf_pages
が1) - 中間ノードなし(
internal_pages
が0)
- rootがleafでもある(
注意点
- pgstattupleモジュールは、情報収集のためにリレーション全体を走査する
- cf. ANALYZEは全レコード走査はせずサンプリングする
- 本番環境で利用すると性能に影響を与える可能性があることに注意する