PostgreSQL
 sql >> база данни >  >> RDS >> PostgreSQL

PostgreSQL заявката работи по-бързо с индексно сканиране, но двигателят избира хеш присъединяване

Предполагам, че използвате random_page_cost = 4 по подразбиране , което е твърде високо, което прави индексното сканиране твърде скъпо.

Опитвам се да реконструирам 2-те таблици с този скрипт:

CREATE TABLE replays_game (
    id integer NOT NULL,
    PRIMARY KEY (id)
);

CREATE TABLE replays_playeringame (
    player_id integer NOT NULL,
    game_id integer NOT NULL,
    PRIMARY KEY (player_id, game_id),
    CONSTRAINT replays_playeringame_game_fkey
        FOREIGN KEY (game_id) REFERENCES replays_game (id)
);

CREATE INDEX ix_replays_playeringame_game_id
    ON replays_playeringame (game_id);

-- 150k games
INSERT INTO replays_game
SELECT generate_series(1, 150000);

-- ~150k players, ~2 games each
INSERT INTO replays_playeringame
select trunc(random() * 149999 + 1), generate_series(1, 150000);

INSERT INTO replays_playeringame
SELECT *
FROM
    (
        SELECT
            trunc(random() * 149999 + 1) as player_id,
            generate_series(1, 150000) as game_id
    ) AS t
WHERE
    NOT EXISTS (
        SELECT 1
        FROM replays_playeringame
        WHERE
            t.player_id = replays_playeringame.player_id
            AND t.game_id = replays_playeringame.game_id
    )
;

-- the heavy player with 3000 games
INSERT INTO replays_playeringame
select 999999, generate_series(1, 3000);

Със стойността по подразбиране 4:

game=# set random_page_cost = 4;
SET
game=# explain analyse SELECT "replays_game".*
FROM "replays_game"
INNER JOIN "replays_playeringame" ON "replays_game"."id" = "replays_playeringame"."game_id"
WHERE "replays_playeringame"."player_id" = 999999;
                                                                     QUERY PLAN                                                                      
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=1483.54..4802.54 rows=3000 width=4) (actual time=3.640..110.212 rows=3000 loops=1)
   Hash Cond: (replays_game.id = replays_playeringame.game_id)
   ->  Seq Scan on replays_game  (cost=0.00..2164.00 rows=150000 width=4) (actual time=0.012..34.261 rows=150000 loops=1)
   ->  Hash  (cost=1446.04..1446.04 rows=3000 width=4) (actual time=3.598..3.598 rows=3000 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 106kB
         ->  Bitmap Heap Scan on replays_playeringame  (cost=67.54..1446.04 rows=3000 width=4) (actual time=0.586..2.041 rows=3000 loops=1)
               Recheck Cond: (player_id = 999999)
               ->  Bitmap Index Scan on replays_playeringame_pkey  (cost=0.00..66.79 rows=3000 width=0) (actual time=0.560..0.560 rows=3000 loops=1)
                     Index Cond: (player_id = 999999)
 Total runtime: 110.621 ms

След като го намалите до 2:

game=# set random_page_cost = 2;
SET
game=# explain analyse SELECT "replays_game".*
FROM "replays_game"
INNER JOIN "replays_playeringame" ON "replays_game"."id" = "replays_playeringame"."game_id"
WHERE "replays_playeringame"."player_id" = 999999;
                                                                  QUERY PLAN                                                                   
-----------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=45.52..4444.86 rows=3000 width=4) (actual time=0.418..27.741 rows=3000 loops=1)
   ->  Bitmap Heap Scan on replays_playeringame  (cost=45.52..1424.02 rows=3000 width=4) (actual time=0.406..1.502 rows=3000 loops=1)
         Recheck Cond: (player_id = 999999)
         ->  Bitmap Index Scan on replays_playeringame_pkey  (cost=0.00..44.77 rows=3000 width=0) (actual time=0.388..0.388 rows=3000 loops=1)
               Index Cond: (player_id = 999999)
   ->  Index Scan using replays_game_pkey on replays_game  (cost=0.00..0.99 rows=1 width=4) (actual time=0.006..0.006 rows=1 loops=3000)
         Index Cond: (id = replays_playeringame.game_id)
 Total runtime: 28.542 ms
(8 rows)

Ако използвам SSD, бих го намалил допълнително до 1.1.

Що се отнася до последния ви въпрос, наистина мисля, че трябва да се придържате към postgresql. Имам опит с postgresql и mssql и трябва да положа тройни усилия в по-късния, за да работи наполовина по-добре от първия.



  1. Database
  2.   
  3. Mysql
  4.   
  5. Oracle
  6.   
  7. Sqlserver
  8.   
  9. PostgreSQL
  10.   
  11. Access
  12.   
  13. SQLite
  14.   
  15. MariaDB
  1. Превключването на Django проект от sqlite3 backend към postgresql се проваля при зареждане на datadump

  2. Как да имате пълна офлайн функционалност в уеб приложение с база данни PostgreSQL?

  3. Условен SQL брой

  4. Групиране по клаузи — GeneralBits на elein

  5. Postgresql DROP TABLE не работи