删除原来的表:
drop table part cascade;
drop table supplier cascade;
drop table customer cascade;
drop table dwdate cascade;
drop table lineorder cascade;
指定 sort keys
和distribution
, 重新创建表:
CREATE TABLE part (
p_partkey integer not null sortkey distkey,
p_name varchar(22) not null,
p_mfgr varchar(6) not null,
p_category varchar(7) not null,
p_brand1 varchar(9) not null,
p_color varchar(11) not null,
p_type varchar(25) not null,
p_size integer not null,
p_container varchar(10) not null
);
CREATE TABLE supplier (
s_suppkey integer not null sortkey,
s_name varchar(25) not null,
s_address varchar(25) not null,
s_city varchar(10) not null,
s_nation varchar(15) not null,
s_region varchar(12) not null,
s_phone varchar(15) not null)
diststyle all;
CREATE TABLE customer (
c_custkey integer not null sortkey,
c_name varchar(25) not null,
c_address varchar(25) not null,
c_city varchar(10) not null,
c_nation varchar(15) not null,
c_region varchar(12) not null,
c_phone varchar(15) not null,
c_mktsegment varchar(10) not null)
diststyle all;
CREATE TABLE dwdate (
d_datekey integer not null sortkey,
d_date varchar(19) not null,
d_dayofweek varchar(10) not null,
d_month varchar(10) not null,
d_year integer not null,
d_yearmonthnum integer not null,
d_yearmonth varchar(8) not null,
d_daynuminweek integer not null,
d_daynuminmonth integer not null,
d_daynuminyear integer not null,
d_monthnuminyear integer not null,
d_weeknuminyear integer not null,
d_sellingseason varchar(13) not null,
d_lastdayinweekfl varchar(1) not null,
d_lastdayinmonthfl varchar(1) not null,
d_holidayfl varchar(1) not null,
d_weekdayfl varchar(1) not null)
diststyle all;
CREATE TABLE lineorder (
lo_orderkey integer not null,
lo_linenumber integer not null,
lo_custkey integer not null,
lo_partkey integer not null distkey,
lo_suppkey integer not null,
lo_orderdate integer not null sortkey,
lo_orderpriority varchar(15) not null,
lo_shippriority varchar(1) not null,
lo_quantity integer not null,
lo_extendedprice integer not null,
lo_ordertotalprice integer not null,
lo_discount integer not null,
lo_revenue integer not null,
lo_supplycost integer not null,
lo_tax integer not null,
lo_commitdate integer not null,
lo_shipmode varchar(10) not null
);
在导入数据时,将原来的 compupdate off
删掉. 此时执行copy命令会启用自动压缩:
copy customer from 's3://awssampledbuswest2/ssbgz/customer'
iam_role 'arn:aws:iam::145197526627:role/redshift-access-s3'
gzip region 'us-west-2';
copy dwdate from 's3://awssampledbuswest2/ssbgz/dwdate'
iam_role 'arn:aws:iam::145197526627:role/redshift-access-s3'
gzip region 'us-west-2';
copy lineorder from 's3://awssampledbuswest2/ssbgz/lineorder'
iam_role 'arn:aws:iam::145197526627:role/redshift-access-s3'
gzip region 'us-west-2';
copy part from 's3://awssampledbuswest2/ssbgz/part'
iam_role 'arn:aws:iam::145197526627:role/redshift-access-s3'
gzip region 'us-west-2';
copy supplier from 's3://awssampledbuswest2/ssbgz/supplier'
iam_role 'arn:aws:iam::145197526627:role/redshift-access-s3'
gzip region 'us-west-2';
记录表的加载时间:
Benchmark | 优化前 | 优化后 |
---|---|---|
Load time (lineorder) | 15m 23s | 19m 27s |
Storage Use(MB) | ||
LINEORDER | 25136 | |
PART | 88 | |
CUSTOMER | 192 | |
DWDATE | 40 | |
SUPPLIER | 76 | |
Total storage | 25532 | |
Query execution time(ms) | ||
Query 1 | 1850 | |
Query 2 | 4642 | |
Query 3 | 2891 | |
Total execution time | 9283 |