How to Load TPC-H Data per Database

IT_Engineer/Data Base|2023. 10. 19. 10:54

# Reference pages

1. MySQL


2. Hive:


3. Oracle:

 

>  Load (data ingestion)
- Measuring TPC-H database performance: https://ycseo.tistory.com/21
- TPC-H #1. Preparation: https://www.haedongg.net/?p=355

> Verification
- List the tables in a tablespace (by owner):  
 
- Oracle tablespace how-to (query, create, drop) roundup:  

- Check tablespace size and related details (see the query sketch below):
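The checks above can be done with Oracle's standard dictionary views; a minimal sketch (run with DBA privileges; 'USERS' is an example tablespace name):

-- tables belonging to a tablespace, with owner
SELECT owner, table_name
  FROM dba_tables
 WHERE tablespace_name = 'USERS';

-- tablespace size vs. free space
SELECT df.tablespace_name,
       ROUND(SUM(df.bytes) / 1024 / 1024)              AS total_mb,
       ROUND(NVL(MAX(fs.free_bytes), 0) / 1024 / 1024) AS free_mb
  FROM dba_data_files df
  LEFT JOIN (SELECT tablespace_name, SUM(bytes) AS free_bytes
               FROM dba_free_space
              GROUP BY tablespace_name) fs
    ON fs.tablespace_name = df.tablespace_name
 GROUP BY df.tablespace_name;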

4. Sybase:

## Query guide

-- Query 21)
select s_name, n_name, count(*) as numwait
from supplier, lineitem l1, orders, nation
where s_suppkey = l1.l_suppkey
  and o_orderkey = l1.l_orderkey
  and o_orderstatus = 'F'
  and l1.l_receiptdate > l1.l_commitdate
  and exists (select *
              from lineitem l2
              where l2.l_orderkey = l1.l_orderkey
                and l2.l_suppkey <> l1.l_suppkey)
  and s_nationkey = n_nationkey
  and n_name = 'SAUDI ARABIA'
group by s_name, n_name

     
-- Query 22)
select cntrycode, count(*) as numcust, sum(c_acctbal) as totacctbal
from (select substring(c_phone, 1, 2) as cntrycode, c_acctbal
      from customer
      where substring(c_phone, 1, 2) in ('13', '31', '23', '29', '30', '18', '17')
        and c_acctbal > (select avg(c_acctbal)
                         from customer
                         where c_acctbal > 0.00
                           and substring(c_phone, 1, 2) in ('13', '31', '23', '29', '30', '18', '17'))
        and not exists (select *
                        from orders
                        where o_custkey = c_custkey)) as custsale
group by cntrycode
order by cntrycode
 

- Generate the .tbl files (dbgen -s 10) -> builds a ~10 GB dataset (see the sketch below)
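A minimal sketch of that step, assuming the official TPC-H dbgen tree (build details vary by platform):

# build dbgen first (copy makefile.suite to Makefile and set CC/DATABASE/MACHINE/WORKLOAD)
cd TPC-H_Tools_v3.0.0/dbgen
make

# scale factor 10 writes ~10 GB of pipe-delimited .tbl files into the current directory
./dbgen -s 10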

[oracle@oracle-server tpch]$ ls -al
total 10968908
drwxr-xr-x  2 root root        156 Feb 16 15:49 .
drwxrwxr-x+ 6 root root         56 Feb 16 15:47 ..
-rw-r--r--  1 root root  244847642 Feb 16 15:39 customer.tbl
-rw-r--r--  1 root root 7775727688 Feb 16 15:39 lineitem.tbl
-rw-r--r--  1 root root       2224 Feb 16 15:39 nation.tbl
-rw-r--r--  1 root root 1749195031 Feb 16 15:39 orders.tbl
-rw-r--r--  1 root root  243336157 Feb 16 15:39 part.tbl
-rw-r--r--  1 root root 1204850769 Feb 16 15:39 partsupp.tbl
-rw-r--r--  1 root root        389 Feb 16 15:39 region.tbl
-rw-r--r--  1 root root   14176368 Feb 16 15:39 supplier.tbl

[oracle@oracle-server tpch]$ sqlplus tpch/tpch
SQL*Plus: Release 11.2.0.1.0 Production on Wed Feb 16 15:52:23 2022
Copyright (c) 1982, 2009, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

DROP TABLE tpch.ext_part CASCADE CONSTRAINTS;
DROP TABLE tpch.ext_supplier CASCADE CONSTRAINTS;
DROP TABLE tpch.ext_partsupp CASCADE CONSTRAINTS;
DROP TABLE tpch.ext_customer CASCADE CONSTRAINTS;
DROP TABLE tpch.ext_orders CASCADE CONSTRAINTS;
DROP TABLE tpch.ext_lineitem CASCADE CONSTRAINTS;
DROP TABLE tpch.ext_nation CASCADE CONSTRAINTS;
DROP TABLE tpch.ext_region CASCADE CONSTRAINTS;

DROP TABLE tpch.part CASCADE CONSTRAINTS;
DROP TABLE tpch.supplier CASCADE CONSTRAINTS;
DROP TABLE tpch.partsupp CASCADE CONSTRAINTS;
DROP TABLE tpch.customer CASCADE CONSTRAINTS;
DROP TABLE tpch.orders CASCADE CONSTRAINTS;
DROP TABLE tpch.lineitem CASCADE CONSTRAINTS;
DROP TABLE tpch.nation CASCADE CONSTRAINTS;
DROP TABLE tpch.region CASCADE CONSTRAINTS;


CREATE OR REPLACE DIRECTORY TPCH_DIR AS '/home/oracle/tpch/tpch/TPC-H_Tools_v3.0.0/dbgen/oracle20g';
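One assumption worth stating: the loading account must be able to read the directory object (and the OS path must be readable by the oracle user). A sketch of the grant, run as a DBA:

GRANT READ, WRITE ON DIRECTORY tpch_dir TO tpch;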

CREATE TABLE tpch.ext_part
(
    p_partkey       NUMBER(10, 0),
    p_name          VARCHAR2(55),
    p_mfgr          CHAR(25),
    p_brand         CHAR(10),
    p_type          VARCHAR2(25),
    p_size          INTEGER,
    p_container     CHAR(10),
    p_retailprice   NUMBER,
    p_comment       VARCHAR2(23)
)
ORGANIZATION EXTERNAL
    (TYPE oracle_loader
          DEFAULT DIRECTORY tpch_dir
              ACCESS PARAMETERS (
                  FIELDS
                      TERMINATED BY '|'
                  MISSING FIELD VALUES ARE NULL
              )
          LOCATION('part.tbl'));


CREATE TABLE tpch.part
(
    p_partkey       NUMBER(10, 0) NOT NULL,
    p_name          VARCHAR2(55) NOT NULL,
    p_mfgr          CHAR(25) NOT NULL,
    p_brand         CHAR(10) NOT NULL,
    p_type          VARCHAR2(25) NOT NULL,
    p_size          INTEGER NOT NULL,
    p_container     CHAR(10) NOT NULL,
    p_retailprice   NUMBER NOT NULL,
    p_comment       VARCHAR2(23) NOT NULL
);


CREATE TABLE tpch.ext_supplier
(
    s_suppkey     NUMBER(10, 0),
    s_name        CHAR(25),
    s_address     VARCHAR2(40),
    s_nationkey   NUMBER(10, 0),
    s_phone       CHAR(15),
    s_acctbal     NUMBER,
    s_comment     VARCHAR2(101)
)
ORGANIZATION EXTERNAL
    (TYPE oracle_loader
          DEFAULT DIRECTORY tpch_dir
              ACCESS PARAMETERS (
                  FIELDS
                      TERMINATED BY '|'
                  MISSING FIELD VALUES ARE NULL
              )
          LOCATION('supplier.tbl'));


CREATE TABLE tpch.supplier
(
    s_suppkey     NUMBER(10, 0) NOT NULL,
    s_name        CHAR(25) NOT NULL,
    s_address     VARCHAR2(40) NOT NULL,
    s_nationkey   NUMBER(10, 0) NOT NULL,
    s_phone       CHAR(15) NOT NULL,
    s_acctbal     NUMBER NOT NULL,
    s_comment     VARCHAR2(101) NOT NULL
);

CREATE TABLE tpch.ext_partsupp
(
    ps_partkey      NUMBER(10, 0),
    ps_suppkey      NUMBER(10, 0),
    ps_availqty     INTEGER,
    ps_supplycost   NUMBER,
    ps_comment      VARCHAR2(199)
)
ORGANIZATION EXTERNAL
    (TYPE oracle_loader
          DEFAULT DIRECTORY tpch_dir
              ACCESS PARAMETERS (
                  FIELDS
                      TERMINATED BY '|'
                  MISSING FIELD VALUES ARE NULL
              )
          LOCATION('partsupp.tbl'));

CREATE TABLE tpch.partsupp
(
    ps_partkey      NUMBER(10, 0) NOT NULL,
    ps_suppkey      NUMBER(10, 0) NOT NULL,
    ps_availqty     INTEGER NOT NULL,
    ps_supplycost   NUMBER NOT NULL,
    ps_comment      VARCHAR2(199) NOT NULL
);


CREATE TABLE tpch.ext_customer
(
    c_custkey      NUMBER(10, 0),
    c_name         VARCHAR2(25),
    c_address      VARCHAR2(40),
    c_nationkey    NUMBER(10, 0),
    c_phone        CHAR(15),
    c_acctbal      NUMBER,
    c_mktsegment   CHAR(10),
    c_comment      VARCHAR2(117)
)
ORGANIZATION EXTERNAL
    (TYPE oracle_loader
          DEFAULT DIRECTORY tpch_dir
              ACCESS PARAMETERS (
                  FIELDS
                      TERMINATED BY '|'
                  MISSING FIELD VALUES ARE NULL
              )
          LOCATION('customer.tbl'));

CREATE TABLE tpch.customer
(
    c_custkey      NUMBER(10, 0) NOT NULL,
    c_name         VARCHAR2(25) NOT NULL,
    c_address      VARCHAR2(40) NOT NULL,
    c_nationkey    NUMBER(10, 0) NOT NULL,
    c_phone        CHAR(15) NOT NULL,
    c_acctbal      NUMBER NOT NULL,
    c_mktsegment   CHAR(10) NOT NULL,
    c_comment      VARCHAR2(117) NOT NULL
);

CREATE TABLE tpch.ext_orders
(
    o_orderkey        NUMBER(10, 0),
    o_custkey         NUMBER(10, 0),
    o_orderstatus     CHAR(1),
    o_totalprice      NUMBER,
    o_orderdate       CHAR(10),
    o_orderpriority   CHAR(15),
    o_clerk           CHAR(15),
    o_shippriority    INTEGER,
    o_comment         VARCHAR2(79)
)
ORGANIZATION EXTERNAL
    (TYPE oracle_loader
          DEFAULT DIRECTORY tpch_dir
              ACCESS PARAMETERS (
                  FIELDS
                      TERMINATED BY '|'
                  MISSING FIELD VALUES ARE NULL
              )
          LOCATION('orders.tbl'));

CREATE TABLE tpch.orders
(
    o_orderkey        NUMBER(10, 0) NOT NULL,
    o_custkey         NUMBER(10, 0) NOT NULL,
    o_orderstatus     CHAR(1) NOT NULL,
    o_totalprice      NUMBER NOT NULL,
    o_orderdate       DATE NOT NULL,
    o_orderpriority   CHAR(15) NOT NULL,
    o_clerk           CHAR(15) NOT NULL,
    o_shippriority    INTEGER NOT NULL,
    o_comment         VARCHAR2(79) NOT NULL
);

CREATE TABLE tpch.ext_lineitem
(
    l_orderkey        NUMBER(10, 0),
    l_partkey         NUMBER(10, 0),
    l_suppkey         NUMBER(10, 0),
    l_linenumber      INTEGER,
    l_quantity        NUMBER,
    l_extendedprice   NUMBER,
    l_discount        NUMBER,
    l_tax             NUMBER,
    l_returnflag      CHAR(1),
    l_linestatus      CHAR(1),
    l_shipdate        CHAR(10),
    l_commitdate      CHAR(10),
    l_receiptdate     CHAR(10),
    l_shipinstruct    CHAR(25),
    l_shipmode        CHAR(10),
    l_comment         VARCHAR2(44)
)
ORGANIZATION EXTERNAL
    (TYPE oracle_loader
          DEFAULT DIRECTORY tpch_dir
              ACCESS PARAMETERS (
                  FIELDS
                      TERMINATED BY '|'
                  MISSING FIELD VALUES ARE NULL
              )
          LOCATION('lineitem.tbl'));


CREATE TABLE tpch.lineitem
(
    l_orderkey        NUMBER(10, 0),
    l_partkey         NUMBER(10, 0),
    l_suppkey         NUMBER(10, 0),
    l_linenumber      INTEGER,
    l_quantity        NUMBER,
    l_extendedprice   NUMBER,
    l_discount        NUMBER,
    l_tax             NUMBER,
    l_returnflag      CHAR(1),
    l_linestatus      CHAR(1),
    l_shipdate        DATE,
    l_commitdate      DATE,
    l_receiptdate     DATE,
    l_shipinstruct    CHAR(25),
    l_shipmode        CHAR(10),
    l_comment         VARCHAR2(44)
);


CREATE TABLE tpch.ext_nation
(
    n_nationkey   NUMBER(10, 0),
    n_name        CHAR(25),
    n_regionkey   NUMBER(10, 0),
    n_comment     VARCHAR(152)
)
ORGANIZATION EXTERNAL
    (TYPE oracle_loader
          DEFAULT DIRECTORY tpch_dir
              ACCESS PARAMETERS (
                  FIELDS
                      TERMINATED BY '|'
                  MISSING FIELD VALUES ARE NULL
              )
          LOCATION('nation.tbl'));

CREATE TABLE tpch.nation
(
    n_nationkey   NUMBER(10, 0),
    n_name        CHAR(25),
    n_regionkey   NUMBER(10, 0),
    n_comment     VARCHAR(152)
);

CREATE TABLE tpch.ext_region
(
    r_regionkey   NUMBER(10, 0),
    r_name        CHAR(25),
    r_comment     VARCHAR(152)
)
ORGANIZATION EXTERNAL
    (TYPE oracle_loader
          DEFAULT DIRECTORY tpch_dir
              ACCESS PARAMETERS (
                  FIELDS
                      TERMINATED BY '|'
                  MISSING FIELD VALUES ARE NULL
              )
          LOCATION('region.tbl'));

CREATE TABLE tpch.region
(
    r_regionkey   NUMBER(10, 0),
    r_name        CHAR(25),
    r_comment     VARCHAR(152)
);


TRUNCATE TABLE tpch.part;
TRUNCATE TABLE tpch.supplier;
TRUNCATE TABLE tpch.partsupp;
TRUNCATE TABLE tpch.customer;
TRUNCATE TABLE tpch.orders;
TRUNCATE TABLE tpch.lineitem;
TRUNCATE TABLE tpch.nation;
TRUNCATE TABLE tpch.region;

ALTER SESSION SET nls_date_format='YYYY-MM-DD';

INSERT /*+ APPEND */ INTO tpch.part     SELECT * FROM tpch.ext_part;
INSERT /*+ APPEND */ INTO tpch.supplier SELECT * FROM tpch.ext_supplier;
INSERT /*+ APPEND */ INTO tpch.partsupp SELECT * FROM tpch.ext_partsupp;
INSERT /*+ APPEND */ INTO tpch.customer SELECT * FROM tpch.ext_customer;
INSERT /*+ APPEND */ INTO tpch.orders   SELECT * FROM tpch.ext_orders;
INSERT /*+ APPEND */ INTO tpch.lineitem SELECT * FROM tpch.ext_lineitem;
INSERT /*+ APPEND */ INTO tpch.nation   SELECT * FROM tpch.ext_nation;
INSERT /*+ APPEND */ INTO tpch.region   SELECT * FROM tpch.ext_region;

-- direct-path (APPEND) inserts keep each table locked until the transaction ends
COMMIT;
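A quick sanity check after the load; as a rough sketch, TPC-H row counts scale linearly with the scale factor (lineitem has 6,001,215 rows and orders 1,500,000 at SF=1):

SELECT COUNT(*) AS lineitem_rows FROM tpch.lineitem;  -- expect ~60,000,000 at SF=10
SELECT COUNT(*) AS orders_rows   FROM tpch.orders;    -- expect 15,000,000 at SF=10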


--- ORACLE 30G (same pattern at scale factor 30: schema g1, directory tpch30_dir)

CREATE TABLE g1.ext_part
(
    p_partkey       NUMBER(10, 0),
    p_name          VARCHAR2(55),
    p_mfgr          CHAR(25),
    p_brand         CHAR(10),
    p_type          VARCHAR2(25),
    p_size          INTEGER,
    p_container     CHAR(10),
    p_retailprice   NUMBER,
    p_comment       VARCHAR2(23)
)
ORGANIZATION EXTERNAL
    (TYPE oracle_loader
          DEFAULT DIRECTORY tpch30_dir
              ACCESS PARAMETERS (
                  FIELDS
                      TERMINATED BY '|'
                  MISSING FIELD VALUES ARE NULL
              )
          LOCATION('part.tbl'));

CREATE TABLE g1.part
(
    p_partkey       NUMBER(10, 0) NOT NULL,
    p_name          VARCHAR2(55) NOT NULL,
    p_mfgr          CHAR(25) NOT NULL,
    p_brand         CHAR(10) NOT NULL,
    p_type          VARCHAR2(25) NOT NULL,
    p_size          INTEGER NOT NULL,
    p_container     CHAR(10) NOT NULL,
    p_retailprice   NUMBER NOT NULL,
    p_comment       VARCHAR2(23) NOT NULL
);


CREATE TABLE g1.ext_supplier
(
    s_suppkey     NUMBER(10, 0),
    s_name        CHAR(25),
    s_address     VARCHAR2(40),
    s_nationkey   NUMBER(10, 0),
    s_phone       CHAR(15),
    s_acctbal     NUMBER,
    s_comment     VARCHAR2(101)
)
ORGANIZATION EXTERNAL
    (TYPE oracle_loader
          DEFAULT DIRECTORY tpch30_dir
              ACCESS PARAMETERS (
                  FIELDS
                      TERMINATED BY '|'
                  MISSING FIELD VALUES ARE NULL
              )
          LOCATION('supplier.tbl'));

CREATE TABLE g1.supplier
(
    s_suppkey     NUMBER(10, 0) NOT NULL,
    s_name        CHAR(25) NOT NULL,
    s_address     VARCHAR2(40) NOT NULL,
    s_nationkey   NUMBER(10, 0) NOT NULL,
    s_phone       CHAR(15) NOT NULL,
    s_acctbal     NUMBER NOT NULL,
    s_comment     VARCHAR2(101) NOT NULL
);

CREATE TABLE g1.ext_partsupp
(
    ps_partkey      NUMBER(10, 0),
    ps_suppkey      NUMBER(10, 0),
    ps_availqty     INTEGER,
    ps_supplycost   NUMBER,
    ps_comment      VARCHAR2(199)
)
ORGANIZATION EXTERNAL
    (TYPE oracle_loader
          DEFAULT DIRECTORY tpch30_dir
              ACCESS PARAMETERS (
                  FIELDS
                      TERMINATED BY '|'
                  MISSING FIELD VALUES ARE NULL
              )
          LOCATION('partsupp.tbl'));

CREATE TABLE g1.partsupp
(
    ps_partkey      NUMBER(10, 0) NOT NULL,
    ps_suppkey      NUMBER(10, 0) NOT NULL,
    ps_availqty     INTEGER NOT NULL,
    ps_supplycost   NUMBER NOT NULL,
    ps_comment      VARCHAR2(199) NOT NULL
);

CREATE TABLE g1.ext_customer
(
    c_custkey      NUMBER(10, 0),
    c_name         VARCHAR2(25),
    c_address      VARCHAR2(40),
    c_nationkey    NUMBER(10, 0),
    c_phone        CHAR(15),
    c_acctbal      NUMBER,
    c_mktsegment   CHAR(10),
    c_comment      VARCHAR2(117)
)
ORGANIZATION EXTERNAL
    (TYPE oracle_loader
          DEFAULT DIRECTORY tpch30_dir
              ACCESS PARAMETERS (
                  FIELDS
                      TERMINATED BY '|'
                  MISSING FIELD VALUES ARE NULL
              )
          LOCATION('customer.tbl'));

CREATE TABLE g1.customer
(
    c_custkey      NUMBER(10, 0) NOT NULL,
    c_name         VARCHAR2(25) NOT NULL,
    c_address      VARCHAR2(40) NOT NULL,
    c_nationkey    NUMBER(10, 0) NOT NULL,
    c_phone        CHAR(15) NOT NULL,
    c_acctbal      NUMBER NOT NULL,
    c_mktsegment   CHAR(10) NOT NULL,
    c_comment      VARCHAR2(117) NOT NULL
);

-- read date values as yyyy-mm-dd text

CREATE TABLE g1.ext_orders
(
    o_orderkey        NUMBER(10, 0),
    o_custkey         NUMBER(10, 0),
    o_orderstatus     CHAR(1),
    o_totalprice      NUMBER,
    o_orderdate       CHAR(10),
    o_orderpriority   CHAR(15),
    o_clerk           CHAR(15),
    o_shippriority    INTEGER,
    o_comment         VARCHAR2(79)
)
ORGANIZATION EXTERNAL
    (TYPE oracle_loader
          DEFAULT DIRECTORY tpch30_dir
              ACCESS PARAMETERS (
                  FIELDS
                      TERMINATED BY '|'
                  MISSING FIELD VALUES ARE NULL
              )
          LOCATION('orders.tbl'));

CREATE TABLE g1.orders
(
    o_orderkey        NUMBER(10, 0) NOT NULL,
    o_custkey         NUMBER(10, 0) NOT NULL,
    o_orderstatus     CHAR(1) NOT NULL,
    o_totalprice      NUMBER NOT NULL,
    o_orderdate       DATE NOT NULL,
    o_orderpriority   CHAR(15) NOT NULL,
    o_clerk           CHAR(15) NOT NULL,
    o_shippriority    INTEGER NOT NULL,
    o_comment         VARCHAR2(79) NOT NULL
);

-- read date values as yyyy-mm-dd text

CREATE TABLE g1.ext_lineitem
(
    l_orderkey        NUMBER(10, 0),
    l_partkey         NUMBER(10, 0),
    l_suppkey         NUMBER(10, 0),
    l_linenumber      INTEGER,
    l_quantity        NUMBER,
    l_extendedprice   NUMBER,
    l_discount        NUMBER,
    l_tax             NUMBER,
    l_returnflag      CHAR(1),
    l_linestatus      CHAR(1),
    l_shipdate        CHAR(10),
    l_commitdate      CHAR(10),
    l_receiptdate     CHAR(10),
    l_shipinstruct    CHAR(25),
    l_shipmode        CHAR(10),
    l_comment         VARCHAR2(44)
)
ORGANIZATION EXTERNAL
    (TYPE oracle_loader
          DEFAULT DIRECTORY tpch30_dir
              ACCESS PARAMETERS (
                  FIELDS
                      TERMINATED BY '|'
                  MISSING FIELD VALUES ARE NULL
              )
          LOCATION('lineitem.tbl'));

CREATE TABLE g1.lineitem
(
    l_orderkey        NUMBER(10, 0),
    l_partkey         NUMBER(10, 0),
    l_suppkey         NUMBER(10, 0),
    l_linenumber      INTEGER,
    l_quantity        NUMBER,
    l_extendedprice   NUMBER,
    l_discount        NUMBER,
    l_tax             NUMBER,
    l_returnflag      CHAR(1),
    l_linestatus      CHAR(1),
    l_shipdate        DATE,
    l_commitdate      DATE,
    l_receiptdate     DATE,
    l_shipinstruct    CHAR(25),
    l_shipmode        CHAR(10),
    l_comment         VARCHAR2(44)
);

CREATE TABLE g1.ext_nation
(
    n_nationkey   NUMBER(10, 0),
    n_name        CHAR(25),
    n_regionkey   NUMBER(10, 0),
    n_comment     VARCHAR(152)
)
ORGANIZATION EXTERNAL
    (TYPE oracle_loader
          DEFAULT DIRECTORY tpch30_dir
              ACCESS PARAMETERS (
                  FIELDS
                      TERMINATED BY '|'
                  MISSING FIELD VALUES ARE NULL
              )
          LOCATION('nation.tbl'));

CREATE TABLE g1.nation
(
    n_nationkey   NUMBER(10, 0),
    n_name        CHAR(25),
    n_regionkey   NUMBER(10, 0),
    n_comment     VARCHAR(152)
);

CREATE TABLE g1.ext_region
(
    r_regionkey   NUMBER(10, 0),
    r_name        CHAR(25),
    r_comment     VARCHAR(152)
)
ORGANIZATION EXTERNAL
    (TYPE oracle_loader
          DEFAULT DIRECTORY tpch30_dir
              ACCESS PARAMETERS (
                  FIELDS
                      TERMINATED BY '|'
                  MISSING FIELD VALUES ARE NULL
              )
          LOCATION('region.tbl'));

CREATE TABLE g1.region
(
    r_regionkey   NUMBER(10, 0),
    r_name        CHAR(25),
    r_comment     VARCHAR(152)
);


TRUNCATE TABLE g1.part;
TRUNCATE TABLE g1.supplier;
TRUNCATE TABLE g1.partsupp;
TRUNCATE TABLE g1.customer;
TRUNCATE TABLE g1.orders;
TRUNCATE TABLE g1.lineitem;
TRUNCATE TABLE g1.nation;
TRUNCATE TABLE g1.region;

ALTER SESSION SET nls_date_format='YYYY-MM-DD';

INSERT /*+ APPEND */ INTO g1.part     SELECT * FROM g1.ext_part;
INSERT /*+ APPEND */ INTO g1.supplier SELECT * FROM g1.ext_supplier;
INSERT /*+ APPEND */ INTO g1.partsupp SELECT * FROM g1.ext_partsupp;
INSERT /*+ APPEND */ INTO g1.customer SELECT * FROM g1.ext_customer;
INSERT /*+ APPEND */ INTO g1.orders   SELECT * FROM g1.ext_orders;
INSERT /*+ APPEND */ INTO g1.lineitem SELECT * FROM g1.ext_lineitem;
INSERT /*+ APPEND */ INTO g1.nation   SELECT * FROM g1.ext_nation;
INSERT /*+ APPEND */ INTO g1.region   SELECT * FROM g1.ext_region;

-- direct-path (APPEND) inserts keep each table locked until the transaction ends
COMMIT;


--- (Just in case) Reloading the data

ALTER TABLE tpch20.lineitem DROP CONSTRAINT chk_lineitem_ship_rcpt;      -- check(l_shipdate <= l_receiptdate)
ALTER TABLE tpch20.lineitem DROP CONSTRAINT chk_lineitem_discount;       -- check(l_discount >= 0.00 AND l_discount <= 1.00)
ALTER TABLE tpch20.lineitem DROP CONSTRAINT chk_lineitem_tax;            -- check(l_tax >= 0)
ALTER TABLE tpch20.lineitem DROP CONSTRAINT chk_lineitem_extendedprice;  -- check(l_extendedprice >= 0)
ALTER TABLE tpch20.lineitem DROP CONSTRAINT chk_lineitem_quantity;       -- check(l_quantity >= 0)
ALTER TABLE tpch20.orders   DROP CONSTRAINT chk_orders_totalprice;       -- check(o_totalprice >= 0)
ALTER TABLE tpch20.partsupp DROP CONSTRAINT chk_partsupp_supplycost;     -- check(ps_supplycost >= 0)
ALTER TABLE tpch20.partsupp DROP CONSTRAINT chk_partsupp_availqty;       -- check(ps_availqty >= 0)
ALTER TABLE tpch20.part     DROP CONSTRAINT chk_part_retailprice;        -- check(p_retailprice >= 0)
ALTER TABLE tpch20.part     DROP CONSTRAINT chk_part_size;               -- check(p_size >= 0)

ALTER TABLE tpch20.nation   DROP CONSTRAINT chk_nation_nationkey;        -- check(n_nationkey >= 0)
ALTER TABLE tpch20.region   DROP CONSTRAINT chk_region_regionkey;        -- check(r_regionkey >= 0)
ALTER TABLE tpch20.partsupp DROP CONSTRAINT chk_partsupp_partkey;        -- check(ps_partkey >= 0)
ALTER TABLE tpch20.customer DROP CONSTRAINT chk_customer_custkey;        -- check(c_custkey >= 0)
ALTER TABLE tpch20.supplier DROP CONSTRAINT chk_supplier_suppkey;        -- check(s_suppkey >= 0)
ALTER TABLE tpch20.part     DROP CONSTRAINT chk_part_partkey;            -- check(p_partkey >= 0)

ALTER TABLE tpch20.region   DROP CONSTRAINT pk_region;                   -- primary key(r_regionkey)
ALTER TABLE tpch20.nation   DROP CONSTRAINT pk_nation;                   -- primary key(n_nationkey)
ALTER TABLE tpch20.lineitem DROP CONSTRAINT pk_lineitem;                 -- primary key(l_linenumber, l_orderkey)
ALTER TABLE tpch20.orders   DROP CONSTRAINT pk_orders;                   -- primary key(o_orderkey)
ALTER TABLE tpch20.customer DROP CONSTRAINT pk_customer;                 -- primary key(c_custkey)
ALTER TABLE tpch20.partsupp DROP CONSTRAINT pk_partsupp;                 -- primary key(ps_partkey, ps_suppkey)
ALTER TABLE tpch20.supplier DROP CONSTRAINT pk_supplier;                 -- primary key(s_suppkey)
ALTER TABLE tpch20.part     DROP CONSTRAINT pk_part;                     -- primary key(p_partkey)


-- 1.4.2.3


ALTER TABLE tpch20.lineitem DROP CONSTRAINT fk_lineitem_partsupp;  -- FOREIGN KEY(l_partkey, l_suppkey) REFERENCES tpch20.partsupp(ps_partkey, ps_suppkey)
ALTER TABLE tpch20.lineitem DROP CONSTRAINT fk_lineitem_supplier;  -- FOREIGN KEY(l_suppkey) REFERENCES tpch20.supplier(s_suppkey)
ALTER TABLE tpch20.lineitem DROP CONSTRAINT fk_lineitem_part;      -- FOREIGN KEY(l_partkey) REFERENCES tpch20.part(p_partkey)
ALTER TABLE tpch20.lineitem DROP CONSTRAINT fk_lineitem_order;     -- FOREIGN KEY(l_orderkey) REFERENCES tpch20.orders(o_orderkey)
ALTER TABLE tpch20.orders   DROP CONSTRAINT fk_orders_customer;    -- FOREIGN KEY(o_custkey) REFERENCES tpch20.customer(c_custkey)
ALTER TABLE tpch20.customer DROP CONSTRAINT fk_customer_nation;    -- FOREIGN KEY(c_nationkey) REFERENCES tpch20.nation(n_nationkey)
ALTER TABLE tpch20.partsupp DROP CONSTRAINT fk_partsupp_supplier;  -- FOREIGN KEY(ps_suppkey) REFERENCES tpch20.supplier(s_suppkey)
ALTER TABLE tpch20.partsupp DROP CONSTRAINT fk_partsupp_part;      -- FOREIGN KEY(ps_partkey) REFERENCES tpch20.part(p_partkey)
 
 -----

INSERT /*+ APPEND */ INTO  tpch20.region   SELECT * FROM tpch20.ext_region;
INSERT /*+ APPEND */ INTO  tpch20.customer SELECT * FROM tpch20.ext_customer;


ALTER TABLE tpch20.part
    ADD CONSTRAINT pk_part PRIMARY KEY(p_partkey);

ALTER TABLE tpch20.supplier
    ADD CONSTRAINT pk_supplier PRIMARY KEY(s_suppkey);

ALTER TABLE tpch20.partsupp
    ADD CONSTRAINT pk_partsupp PRIMARY KEY(ps_partkey, ps_suppkey);



ALTER TABLE tpch20.customer
    ADD CONSTRAINT pk_customer PRIMARY KEY(c_custkey);

ALTER TABLE tpch20.orders
    ADD CONSTRAINT pk_orders PRIMARY KEY(o_orderkey);

ALTER TABLE tpch20.lineitem
    ADD CONSTRAINT pk_lineitem PRIMARY KEY(l_linenumber, l_orderkey);

ALTER TABLE tpch20.nation
    ADD CONSTRAINT pk_nation PRIMARY KEY(n_nationkey);

ALTER TABLE tpch20.region
    ADD CONSTRAINT pk_region PRIMARY KEY(r_regionkey);

 
 ALTER TABLE tpch20.partsupp
    ADD CONSTRAINT fk_partsupp_part FOREIGN KEY(ps_partkey) REFERENCES tpch20.part(p_partkey);

ALTER TABLE tpch20.partsupp
    ADD CONSTRAINT fk_partsupp_supplier FOREIGN KEY(ps_suppkey) REFERENCES tpch20.supplier(s_suppkey);

ALTER TABLE tpch20.customer
    ADD CONSTRAINT fk_customer_nation FOREIGN KEY(c_nationkey) REFERENCES tpch20.nation(n_nationkey);

ALTER TABLE tpch20.orders
    ADD CONSTRAINT fk_orders_customer FOREIGN KEY(o_custkey) REFERENCES tpch20.customer(c_custkey);

ALTER TABLE tpch20.lineitem
    ADD CONSTRAINT fk_lineitem_order FOREIGN KEY(l_orderkey) REFERENCES tpch20.orders(o_orderkey);

ALTER TABLE tpch20.lineitem
    ADD CONSTRAINT fk_lineitem_part FOREIGN KEY(l_partkey) REFERENCES tpch20.part(p_partkey);

ALTER TABLE tpch20.lineitem
    ADD CONSTRAINT fk_lineitem_supplier FOREIGN KEY(l_suppkey) REFERENCES tpch20.supplier(s_suppkey);

ALTER TABLE tpch20.lineitem
    ADD CONSTRAINT fk_lineitem_partsupp FOREIGN KEY(l_partkey, l_suppkey)
        REFERENCES tpch20.partsupp(ps_partkey, ps_suppkey);

-- 1.4.2.4 - 1

ALTER TABLE tpch20.part
    ADD CONSTRAINT chk_part_partkey CHECK(p_partkey >= 0);

ALTER TABLE tpch20.supplier
    ADD CONSTRAINT chk_supplier_suppkey CHECK(s_suppkey >= 0);

ALTER TABLE tpch20.customer
    ADD CONSTRAINT chk_customer_custkey CHECK(c_custkey >= 0);

ALTER TABLE tpch20.partsupp
    ADD CONSTRAINT chk_partsupp_partkey CHECK(ps_partkey >= 0);

ALTER TABLE tpch20.region
    ADD CONSTRAINT chk_region_regionkey CHECK(r_regionkey >= 0);

ALTER TABLE tpch20.nation
    ADD CONSTRAINT chk_nation_nationkey CHECK(n_nationkey >= 0);

-- 1.4.2.4 - 2

ALTER TABLE tpch20.part
    ADD CONSTRAINT chk_part_size CHECK(p_size >= 0);

ALTER TABLE tpch20.part
    ADD CONSTRAINT chk_part_retailprice CHECK(p_retailprice >= 0);

ALTER TABLE tpch20.partsupp
    ADD CONSTRAINT chk_partsupp_availqty CHECK(ps_availqty >= 0);

ALTER TABLE tpch20.partsupp
    ADD CONSTRAINT chk_partsupp_supplycost CHECK(ps_supplycost >= 0);

ALTER TABLE tpch20.orders
    ADD CONSTRAINT chk_orders_totalprice CHECK(o_totalprice >= 0);

ALTER TABLE tpch20.lineitem
    ADD CONSTRAINT chk_lineitem_quantity CHECK(l_quantity >= 0);

ALTER TABLE tpch20.lineitem
    ADD CONSTRAINT chk_lineitem_extendedprice CHECK(l_extendedprice >= 0);

ALTER TABLE tpch20.lineitem
    ADD CONSTRAINT chk_lineitem_tax CHECK(l_tax >= 0);

-- 1.4.2.4 - 3

ALTER TABLE tpch20.lineitem
    ADD CONSTRAINT chk_lineitem_discount CHECK(l_discount >= 0.00 AND l_discount <= 1.00);

-- 1.4.2.4 - 4

ALTER TABLE tpch20.lineitem
    ADD CONSTRAINT chk_lineitem_ship_rcpt CHECK(l_shipdate <= l_receiptdate);
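To confirm the reload left everything in place, a verification sketch against the data dictionary (assuming the same tpch20 schema; all constraints should be ENABLED):

SELECT constraint_name, constraint_type, status
  FROM all_constraints
 WHERE owner = 'TPCH20'
 ORDER BY constraint_type, constraint_name;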

 


[Data Processing] Spark + Airflow #1

IT_Engineer|2023. 2. 27. 09:43

[Background] AWS's simple processing architecture

  • Storage > Integration > Processing
    • AWS S3 > AWS Glue > Spark on AWS EMR
    • AWS S3: an object storage service offering scalability, data availability, security, and performance
    • AWS Glue: a scalable, serverless data-integration service that makes it easy to discover, prepare, move, and integrate data from multiple sources for analytics, machine learning (ML), and application development
    • AWS EMR: a cloud big-data solution for petabyte-scale data processing, interactive analytics, and machine learning, using open-source frameworks such as Apache Spark, Apache Hive, and Presto

Q. How is local Spark different from Spark on Kubernetes?


Typically, when you run a Spark job in client mode via YARN,
the driver runs on the master node and the executors run on the worker nodes.
In cluster mode, of course, the driver also runs somewhere on a worker node.

On Kubernetes, a driver comes up and that driver launches the executor pods. Run a Spark job in
client mode and the pod you ran it from becomes the driver, with new executor pods spun up.
Run it in cluster mode and a driver pod is created separately from the pod you submitted from,
and that new driver pod launches the executor pods.
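In spark-submit terms, the split is just the --deploy-mode flag; a generic sketch (the API-server address is a placeholder):

# client mode: the submitting process/pod itself acts as the driver
spark-submit --master k8s://https://<apiserver>:6443 --deploy-mode client  ...

# cluster mode: a separate driver pod is created, and it launches the executor pods
spark-submit --master k8s://https://<apiserver>:6443 --deploy-mode cluster ...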

 

Reference 1: https://blog.banksalad.com/tech/spark-on-kubernetes/

 


Reference 2: https://techblog.woowahan.com/10291/

 


 

 

[Setup] Setting up Spark on Kubernetes

  1. Spark binaries from 2.3 onward include the utilities for deploying to k8s.
  2. Build the Spark app into an image.
  3. Send the image to the k8s cluster with spark-submit, pointing it at the cluster.
  4. The driver and executors are then created automatically.

[Work script]

1. Download the binaries and build the image

  • Work on a server with Docker installed
# Download the binaries (Ozone is based on hadoop3, so pick v3.3.x)
$ wget https://dlcdn.apache.org/spark/spark-3.3.2/spark-3.3.2-bin-hadoop3.tgz

# Unpack it
$ tar -zxvf spark-3.3.2-bin-hadoop3.tgz

# Try the build
$ ./spark-3.3.2-bin-hadoop3/bin/docker-image-tool.sh -r {image-repo} -t v1.0 build
Sending build context to Docker daemon  337.7MB
Step 1/18 : ARG java_image_tag=11-jre-slim
Step 2/18 : FROM openjdk:${java_image_tag}
11-jre-slim: Pulling from library/openjdk

## snip ##

Step 18/18 : USER ${spark_uid}
 ---> Running in 23bbcf422aa4
Removing intermediate container 23bbcf422aa4
 ---> 9fcf542b4d4a
Successfully built 9fcf542b4d4a
Successfully tagged {image-repo}/spark:v1.0

# Verify the build
$ docker images | grep spark

# Try the push
$ ./spark-3.3.2-bin-hadoop3/bin/docker-image-tool.sh -r {image-repo} -t v1.0 push
The push refers to repository [{image-repo}/spark]
b9ca6c484aa7: Pushed 
f775e56e0fcd: Pushed 
d00b1260ea67: Pushed 
acad6e80a726: Pushed 
481ab0385b18: Pushed 
15a249d53e5c: Pushed 
0fb02ac6bd62: Pushed 
8543b7b8afe5: Pushed 
e501578f6db2: Pushed 
f309a4cd9a46: Pushed 
d7802b8508af: Pushed 
e3abdc2e9252: Pushed 
eafe6e032dbd: Pushed 
92a4e8a3140f: Pushed 
v1.0: digest: sha256:f2bf057b9d3c98036bc0b94dc9221faecdb334d6ad16c64b563f01f31847593e size: 3459
{image-repo}/spark-py:v1.0 image not found. Skipping push for this image.
{image-repo}/spark-r:v1.0 image not found. Skipping push for this image.

# Verify the push
$ curl {image-repo}/v2/_catalog?n=1000 | grep spark
$ curl {image-repo}/v2/spark/tags/list
{"name":"spark","tags":["v1.0","base","spark-crystal"]}

 

2. Write the submit shell script

## Pull the cluster info from the master
$ kubectl cluster-info
Kubernetes master is running at https://{k8sIP}:6443
KubeDNS is running at https://{k8sIP}:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Metrics-server is running at https://{k8sIP}:6443/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy

$ vi ./k8s_submit.sh
#!/bin/bash
IMAGE_REPO={image-repo}
export SPARK_HOME=./spark-3.3.2-bin-hadoop3

$SPARK_HOME/bin/spark-submit \
    --master k8s://https://{k8sIP}:6443 \
    --deploy-mode cluster \
    --name spark-example \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.executor.instances=3 \
    --conf spark.kubernetes.container.image=$IMAGE_REPO/spark:v1.0 \
    $SPARK_HOME/examples/jars/spark-examples_2.12-3.3.2.jar

## 1st attempt -> error  https://m.blog.naver.com/firstpcb/221762669146
$ ./k8s_submit.sh 
23/02/26 15:01:40 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
### snip ###  >> Looks like the Hadoop binaries are needed.
$ wget https://dlcdn.apache.org/hadoop/common/hadoop-3.3.4/hadoop-3.3.4.tar.gz
$ tar -zxvf hadoop-3.3.4.tar.gz
$ cp ./spark-3.3.2-bin-hadoop3/conf/spark-env.sh.template ./spark-3.3.2-bin-hadoop3/conf/spark-env.sh
$ vi ./spark-3.3.2-bin-hadoop3/conf/spark-env.sh
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/data/jhhan/spark/hadoop-3.3.4/lib/native # added

## 2nd attempt -> another error....
$ ./k8s_submit.sh
23/02/26 15:26:43 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
23/02/26 15:26:43 INFO KerberosConfDriverFeatureStep: You have not specified a krb5.conf file locally or via a ConfigMap. Make sure that you have the krb5.conf locally on the driver image.
Exception in thread "main" org.apache.spark.SparkException: Please specify spark.kubernetes.file.upload.path property.
    at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:330)
    at org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:276)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
    at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
    at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
    at scala.collection.TraversableLike.map(TraversableLike.scala:286)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
    at scala.collection.AbstractTraversable.map(Traversable.scala:108)
    at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275)
    at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:188)
    at scala.collection.immutable.List.foreach(List.scala:431)
    at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178)
    at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86)
    at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
    at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
    at scala.collection.immutable.List.foldLeft(List.scala:91)
    at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84)
    at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104)
    at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248)
    at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242)
    at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2764)
    at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242)
    at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
23/02/26 15:26:43 INFO ShutdownHookManager: Shutdown hook called
23/02/26 15:26:43 INFO ShutdownHookManager: Deleting directory /tmp/spark-7aa1d18a-8834-42d4-9831-15d80d9779af

## 3rd attempt
## In ./k8s_submit.sh, prefix the application jar path with local://

## Result
#!/bin/bash
IMAGE_REPO={image-repo}
export SPARK_HOME=/data/jhhan/spark/spark-3.3.2-bin-hadoop3

$SPARK_HOME/bin/spark-submit \
    --master k8s://https://{k8sIP}:6443 \
    --deploy-mode cluster \
    --name spark-example \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.executor.instances=3 \
    --conf spark.kubernetes.container.image=$IMAGE_REPO/spark:v1.0 \
    local://$SPARK_HOME/examples/jars/spark-examples_2.12-3.3.2.jar

## The driver gets deployed, but it ends in an error.
$ ./k8s_submit.sh 
23/02/26 15:46:38 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
23/02/26 15:46:38 INFO KerberosConfDriverFeatureStep: You have not specified a krb5.conf file locally or via a ConfigMap. Make sure that you have the krb5.conf locally on the driver image.
23/02/26 15:46:39 INFO KubernetesClientUtils: Spark configuration files loaded from Some(/data/jhhan/spark/spark-3.3.2-bin-hadoop3/conf) : spark-env.sh
23/02/26 15:46:39 INFO LoggingPodStatusWatcherImpl: State changed, new state: 
     pod name: spark-example-e4f249868c793199-driver
     namespace: default
     labels: spark-app-name -> spark-example, spark-app-selector -> spark-74dc1638e90149b893ef42bd776f3668, spark-role -> driver, spark-version -> 3.3.2
     pod uid: f82c4e16-3aea-4a48-8220-1bfe06e47d0b
     creation time: 2023-02-26T06:46:39Z
     service account name: default
     volumes: spark-local-dir-1, spark-conf-volume-driver, default-token-2dxpc
     node name: N/A
     start time: N/A
     phase: Pending
     container status: N/A
23/02/26 15:46:39 INFO LoggingPodStatusWatcherImpl: State changed, new state: 
     pod name: spark-example-e4f249868c793199-driver
     namespace: default
     labels: spark-app-name -> spark-example, spark-app-selector -> spark-74dc1638e90149b893ef42bd776f3668, spark-role -> driver, spark-version -> 3.3.2
     pod uid: f82c4e16-3aea-4a48-8220-1bfe06e47d0b
     creation time: 2023-02-26T06:46:39Z
     service account name: default
     volumes: spark-local-dir-1, spark-conf-volume-driver, default-token-2dxpc
     node name: N/A
     start time: N/A
     phase: Pending
     container status: N/A
23/02/26 15:46:39 INFO LoggingPodStatusWatcherImpl: Waiting for application spark-example with submission ID default:spark-example-e4f249868c793199-driver to finish...
23/02/26 15:46:39 INFO LoggingPodStatusWatcherImpl: State changed, new state: 
     pod name: spark-example-e4f249868c793199-driver
     namespace: default
     labels: spark-app-name -> spark-example, spark-app-selector -> spark-74dc1638e90149b893ef42bd776f3668, spark-role -> driver, spark-version -> 3.3.2
     pod uid: f82c4e16-3aea-4a48-8220-1bfe06e47d0b
     creation time: 2023-02-26T06:46:39Z
     service account name: default
     volumes: spark-local-dir-1, spark-conf-volume-driver, default-token-2dxpc
     node name: k8s-19-worker09
     start time: N/A
     phase: Pending
     container status: N/A
23/02/26 15:46:39 INFO LoggingPodStatusWatcherImpl: State changed, new state: 
     pod name: spark-example-e4f249868c793199-driver
     namespace: default
     labels: spark-app-name -> spark-example, spark-app-selector -> spark-74dc1638e90149b893ef42bd776f3668, spark-role -> driver, spark-version -> 3.3.2
     pod uid: f82c4e16-3aea-4a48-8220-1bfe06e47d0b
     creation time: 2023-02-26T06:46:39Z
     service account name: default
     volumes: spark-local-dir-1, spark-conf-volume-driver, default-token-2dxpc
     node name: k8s-19-worker09
     start time: 2023-02-26T06:46:40Z
     phase: Pending
     container status: 
         container name: spark-kubernetes-driver
         container image: {image-repo}/spark:v1.0
         container state: waiting
         pending reason: ContainerCreating
23/02/26 15:46:40 INFO LoggingPodStatusWatcherImpl: Application status for spark-74dc1638e90149b893ef42bd776f3668 (phase: Pending)
23/02/26 15:46:41 INFO LoggingPodStatusWatcherImpl: Application status for spark-74dc1638e90149b893ef42bd776f3668 (phase: Pending)
23/02/26 15:46:41 INFO LoggingPodStatusWatcherImpl: State changed, new state: 
     pod name: spark-example-e4f249868c793199-driver
     namespace: default
     labels: spark-app-name -> spark-example, spark-app-selector -> spark-74dc1638e90149b893ef42bd776f3668, spark-role -> driver, spark-version -> 3.3.2
     pod uid: f82c4e16-3aea-4a48-8220-1bfe06e47d0b
     creation time: 2023-02-26T06:46:39Z
     service account name: default
     volumes: spark-local-dir-1, spark-conf-volume-driver, default-token-2dxpc
     node name: k8s-19-worker09
     start time: 2023-02-26T06:46:40Z
     phase: Pending
     container status: 
         container name: spark-kubernetes-driver
         container image: {image-repo}/spark:v1.0
         container state: waiting
         pending reason: ContainerCreating
23/02/26 15:46:42 INFO LoggingPodStatusWatcherImpl: Application status for spark-74dc1638e90149b893ef42bd776f3668 (phase: Pending)
23/02/26 15:46:43 INFO LoggingPodStatusWatcherImpl: State changed, new state: 
     pod name: spark-example-e4f249868c793199-driver
     namespace: default
     labels: spark-app-name -> spark-example, spark-app-selector -> spark-74dc1638e90149b893ef42bd776f3668, spark-role -> driver, spark-version -> 3.3.2
     pod uid: f82c4e16-3aea-4a48-8220-1bfe06e47d0b
     creation time: 2023-02-26T06:46:39Z
     service account name: default
     volumes: spark-local-dir-1, spark-conf-volume-driver, default-token-2dxpc
     node name: k8s-19-worker09
     start time: 2023-02-26T06:46:40Z
     phase: Running
     container status: 
         container name: spark-kubernetes-driver
         container image: {image-repo}/spark:v1.0
         container state: running
         container started at: 2023-02-26T06:46:42Z
23/02/26 15:46:43 INFO LoggingPodStatusWatcherImpl: Application status for spark-74dc1638e90149b893ef42bd776f3668 (phase: Running)
23/02/26 15:46:44 INFO LoggingPodStatusWatcherImpl: State changed, new state: 
     pod name: spark-example-e4f249868c793199-driver
     namespace: default
     labels: spark-app-name -> spark-example, spark-app-selector -> spark-74dc1638e90149b893ef42bd776f3668, spark-role -> driver, spark-version -> 3.3.2
     pod uid: f82c4e16-3aea-4a48-8220-1bfe06e47d0b
     creation time: 2023-02-26T06:46:39Z
     service account name: default
     volumes: spark-local-dir-1, spark-conf-volume-driver, default-token-2dxpc
     node name: k8s-19-worker09
     start time: 2023-02-26T06:46:40Z
     phase: Failed
     container status: 
         container name: spark-kubernetes-driver
         container image: {image-repo}/spark:v1.0
         container state: terminated
         container started at: 2023-02-26T06:46:42Z
         container finished at: 2023-02-26T06:46:44Z
         exit code: 101
         termination reason: Error
23/02/26 15:46:44 INFO LoggingPodStatusWatcherImpl: Application status for spark-74dc1638e90149b893ef42bd776f3668 (phase: Failed)
23/02/26 15:46:44 INFO LoggingPodStatusWatcherImpl: Container final statuses:
     container name: spark-kubernetes-driver
     container image: {image-repo}/spark:v1.0
     container state: terminated
     container started at: 2023-02-26T06:46:42Z
     container finished at: 2023-02-26T06:46:44Z
     exit code: 101
     termination reason: Error
23/02/26 15:46:44 INFO LoggingPodStatusWatcherImpl: Application spark-example with submission ID default:spark-example-e4f249868c793199-driver finished
23/02/26 15:46:44 INFO ShutdownHookManager: Shutdown hook called
23/02/26 15:46:44 INFO ShutdownHookManager: Deleting directory /tmp/spark-86286d2a-0416-43ff-990a-6b1bcf5ccebc

### Dumping the pod log shows the following.
$ kubectl logs spark-example-e4f249868c793199-driver
## snip ##
+ CMD=("$SPARK_HOME/bin/spark-submit" --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client "$@")
+ exec /usr/bin/tini -s -- /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=10.10.192.30 --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.examples.SparkPi local:///data/jhhan/spark/spark-3.3.2-bin-hadoop3/examples/jars/spark-examples_2.12-3.3.2.jar
23/02/26 06:46:44 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
23/02/26 06:46:44 WARN DependencyUtils: Local jar /data/jhhan/spark/spark-3.3.2-bin-hadoop3/examples/jars/spark-examples_2.12-3.3.2.jar does not exist, skipping.
23/02/26 06:46:44 WARN DependencyUtils: Local jar /data/jhhan/spark/spark-3.3.2-bin-hadoop3/examples/jars/spark-examples_2.12-3.3.2.jar does not exist, skipping.
Error: Failed to load class org.apache.spark.examples.SparkPi.
23/02/26 06:46:44 INFO ShutdownHookManager: Shutdown hook called
23/02/26 06:46:44 INFO ShutdownHookManager: Deleting directory /tmp/spark-39d25f0b-a40d-4156-98d3-a734916afd9d

## Judging by the log, local:// seems to refer to a path inside the pod, not the machine the submit ran on or the k8s worker the pod landed on.
## https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/1473
## In other words, it would be convenient to either bake the app into the image, or collect the app jars on a PVC and run them through that.

## 4th attempt
$ vi k8s_submit.sh
#!/bin/bash
IMAGE_REPO={image-repo}
export SPARK_HOME=/data/jhhan/spark/spark-3.3.2-bin-hadoop3

$SPARK_HOME/bin/spark-submit \
    --master k8s://https://{k8sIP}:6443 \
    --deploy-mode cluster \
    --name spark-example \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.executor.instances=3 \
    --conf spark.kubernetes.container.image=$IMAGE_REPO/spark:v1.0 \
    local:///opt/spark/examples/jars/spark-examples_2.12-3.3.2.jar ## path changed (opening the Dockerfile used for the build above, it should exist there; see the excerpt below)

## cat /home/data/hanjh/spark/spark-3.3.2-bin-hadoop3/kubernetes/dockerfiles/spark/Dockerfile
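## Peeking at that Dockerfile explains it: the examples directory is copied into the
## image, which is why the in-image local:// path works. Quoted from memory, so treat
## the exact line as an assumption; it varies by Spark version:
##
##   # kubernetes/dockerfiles/spark/Dockerfile (excerpt)
##   COPY examples /opt/spark/examples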

$ ./k8s_submit.sh
## The driver is up, but from the log it looks like service(-account) access is failing.
$ kubectl logs pod/spark-example-086b82868cb4ec97-driver

23/02/26 07:51:58 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: External scheduler cannot be instantiated
    at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:3002)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:573)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2714)
    at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:953)
    at scala.Option.getOrElse(Option.scala:189)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:947)
    at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:30)
    at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.base/java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.reflect.InvocationTargetException
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
    at java.base/java.lang.reflect.Constructor.newInstance(Unknown Source)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.makeExecutorPodsAllocator(KubernetesClusterManager.scala:158)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:113)
    at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2996)
    ... 19 more
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://kubernetes.default.svc/api/v1/namespaces/default/pods/spark-example-086b82868cb4ec97-driver. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods "spark-example-086b82868cb4ec97-driver" is forbidden: User "system:serviceaccount:default:default" cannot get resource "pods" in API group "" in the namespace "default".
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:682)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:661)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:610)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:555)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:518)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:487)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:457)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:698)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:184)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:151)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:83)
    at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$driverPod$1(ExecutorPodsAllocator.scala:79)
    at scala.Option.map(Option.scala:230)
    at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.<init>(ExecutorPodsAllocator.scala:78)
    ... 26 more
23/02/26 07:51:58 INFO SparkUI: Stopped Spark web UI at http://spark-example-086b82868cb4ec97-driver-svc.default.svc:4040
23/02/26 07:51:58 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!

## 5th attempt
# https://stackoverflow.com/questions/55498702/how-to-fix-forbiddenconfigured-service-account-doesnt-have-access-with-spark
# Spark must be granted the permissions it needs to use k8s.
# Create the resources below, then point the conf at the new service account.
$ vi spark_base.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: spark
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: spark-cluster-role
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods"]
  verbs: ["get", "watch", "list", "create", "delete"]
- apiGroups: [""] # "" indicates the core API group
  resources: ["services"]
  verbs: ["get", "create", "delete"]
- apiGroups: [""] # "" indicates the core API group
  resources: ["configmaps"]
  verbs: ["get", "create", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: spark-cluster-role-binding
subjects:
- kind: ServiceAccount
  name: spark
  namespace: default
roleRef:
  kind: ClusterRole
  name: spark-cluster-role
  apiGroup: rbac.authorization.k8s.io

$ kubectl apply -f spark_base.yaml

$ vi k8s_submit.sh
## Add the config below
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark

$ ./k8s_submit.sh
## Success!!!!!!

## Checking after the run, the driver's status is now Completed.
NAME                                        READY   STATUS      RESTARTS   AGE     IP             NODE              NOMINATED NODE   READINESS GATES
pod/pod-1                                   1/1     Running     0          3d15h   10.10.220.66   k8s-19-master02   <none>           <none>
pod/spark-example-2a262386904c5997-driver   0/1     Completed   0          42s     10.10.192.33   k8s-19-worker09   <none>           <none>

NAME                                                TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)                      AGE    SELECTOR
service/kubernetes                                 ClusterIP   10.244.0.1      <none>        443/TCP                      332d   <none>
service/spark-example-2a262386904c5997-driver-svc   ClusterIP   None            <none>        7078/TCP,7079/TCP,4040/TCP   42s    spark-app-name=spark-example,spark-app-selector=spark-0f7944e3ad1f40758cdfcac6c1d9b2d9,spark-role=driver,spark-version=3.3.2
service/svc-1                                       NodePort    10.244.117.41   <none>        9000:31788/TCP               5d9h   app=pod
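
For reference, here is the final working k8s_submit.sh with all of the fixes above combined (registry address and API-server IP remain placeholders):

#!/bin/bash
IMAGE_REPO={image-repo}
export SPARK_HOME=/data/jhhan/spark/spark-3.3.2-bin-hadoop3

$SPARK_HOME/bin/spark-submit \
    --master k8s://https://{k8sIP}:6443 \
    --deploy-mode cluster \
    --name spark-example \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.executor.instances=3 \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
    --conf spark.kubernetes.container.image=$IMAGE_REPO/spark:v1.0 \
    local:///opt/spark/examples/jars/spark-examples_2.12-3.3.2.jar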

[Tool/NAS] Building a Synology NAS #3: NAS Setup, Part 1

Uncategorized|2023. 1. 30. 18:42

Once the hard disks and memory are installed in the NAS unit

and the LAN cable is plugged in, the setup wizard starts automatically on a PC connected to the same network.

 

If it doesn't start, visit the site below and find the device yourself!
>   find.synology.com

 

Once my Synology NAS shows up, click [Connect] and ta-da! Welcome!

Now let's start configuring!

First, install DiskStation Manager (DSM).
DSM is the operating system that runs on the Synology NAS; it makes managing and operating the various NAS applications intuitive.

 

Click [Install Now] and DSM installs, which takes a fair amount of time.

From formatting the system partition to finishing the DSM install took about 5-10 minutes.

When the install process ends, the Synology NAS restarts.

Once the bundled packages finish installing too, DSM 7.1 starts! Time to begin next-generation data management!

With DSM 7.1 installed, I signed up for a Synology Account.

Register your NAS device after signing up and it shows up as below,

and registration supposedly extends the warranty for extra after-sales service. In reality, though, Synology has no service branch in Korea,

so you have to pick another country to authenticate and proceed, which is genuinely hard.... Downside #1!!!!!!

Still, to be ready for the unexpected, I created a Synology account and registered my product.

 

Carrying straight on, let's set up the storage.

You don't have to copy my settings, but I did think them through, so they may be a useful reference.

1) Btrfs (recommended) > ext4

 - ext4 is the file system most PCs use, but Synology recommends btrfs on its NAS.

 - Bottom line: some Synology NAS applications only work on btrfs, so I'd also recommend going with btrfs.

 - For details I referred to this blog: https://www.ihee.com/144

2) For volume capacity, I allocated as much of the available space as possible to storage pool 1 (SHR), as shown above.

3) Run drive check < Skip drive check

 - Neither choice seems critical; I was working late at night and wanted to get to bed, so I skipped it!

4) RAID type: SHR / Storage pool description (optional): Synology Hybrid RAID (default)

 - I was slightly nervous, so I played it safe with a RAID setup.

 - I went with the Synology Hybrid RAID approach.

 - Select it and click [Next], and the drive configuration you chose is summarized as below.

Ta-da! Whip through the setup like this and your NAS's home screen appears.


[Tool/NAS] Building a Synology NAS #1: HW Configuration

LIFE/Review|2023. 1. 10. 18:32

First, a summary of the most important part: the hardware configuration.

Much of it was gratefully received as gifts, so rather than pre-purchase deliberation, my comments reflect what I learned after receiving and researching the parts.

From the reactions around me, plenty of people cut the disk capacity way down or skip the RAID setup entirely.

Even trimmed down like that, the base unit itself isn't cheap,
  so some people calculate that a paid MYBOX plan is the better deal. Look into it carefully,

and whatever you do, I recommend tailoring it to your own needs 👩‍🦰

 

1. Unit = Synology DiskStation NAS DS220+

 - Comments: solid all-around usage and configuration, and the range of companion app services is excellent.

                        It's also fairly easy to understand and operate.

   Official: https://www.synology.com/ko-kr/products/DS220+

   Lowest price: https://prod.danawa.com/info/?pcode=11349504&src=adwords&kw=GA0122944&gclid=CjwKCAiAk--dBhABEiwAchIwkS7NMJcrpMoeSEF_EMxqZZ1om__Rkwx7GI9sz-_25a-v2qHYqcgNKhoCNM0QAvD_BwE

 

Synology DS220+ (diskless), key specs from the listing: Intel Celeron J4025 (2 cores, 2.0 GHz) / 2GB DDR4 / 1 Gbps LAN x 2 / two 3.5" bays / 2-year warranty

 

 

 

2. Hard disks (RAID setup) = WD Red™ Plus NAS Hard Drive 3.5" x 2

    - 12TB / 256MB cache

   - Comments: unless, like me, you have a lot of data or love shooting photos and video, you won't need anywhere near 12TB.

                         Whether to build RAID against the odd disaster also splits people, so that one is your call.

 Official: https://www.westerndigital.com/products/internal-drives/wd-red-plus-sata-3-5-hdd#WD10EFRX

 Lowest price: https://search.danawa.com/dsearch.php?query=WD+Red%E2%84%A2+Plus+NAS+Hard+Drive+3.5+&tab=main

 

 

3. Memory = Samsung laptop DDR4 16GB PC4-2666V (PC4-21300S)

      - Comments: no standout features or advantages that I can tell; it just works fine.
       Lowest price: https://search.danawa.com/dsearch.php?query=%EC%82%BC%EC%84%B1%EC%A0%84%EC%9E%90+%EB%85%B8%ED%8A%B8%EB%B6%81%EC%9A%A9+DDR4+16GB+PC4-2666V&tab=main


[Tool/NAS] Building a Synology NAS #0

LIFE/Review|2022. 12. 21. 12:53

I love taking photos, shooting video, and traveling,

and on top of that I harbor ambitions of becoming a YouTuber,

so every day I struggled against cloud-service limits and the resource squeeze on all my IT devices.

Maybe because I kept telling everyone I'd become a YouTuber

(I really should live in daily gratitude to the people around me 😂

 I'll make it big and pay back the kindness !!!!!!!!!!!!!!!!!!!!)

my 2022 birthday gifts thankfully came packed with a GoPro, a gimbal, and even a NAS,

so I started building the NAS to chip away at the dream.

Over the past 2-3 days, studying an hour or two at a time, I went from installing the NAS through configuring it

to mounting it as a folder over the external network via WebDAV.

There's still a long way to go,

but for NAS newbies like I was, I'm going to write up the build record from here on.

 


[Tip] Enjoy Black Friday properly! Black Friday tips!

LIFE/TIP|2022. 11. 24. 18:19

1. What is Black Friday?

 - In the US, a massive shopping season runs each year from Thanksgiving (the fourth Thursday of November) through Christmas and New Year's.

   During this period, companies and retailers of every kind clear out the year's inventory,

   and Black Friday itself starts the day after Thanksgiving, i.e., that Friday.

 

 

2. Tips for buying cheap?

<Easy>

1) Package-forwarding service https://www.gobaesong.com/content/event/details/66
 - Forwarding address
 > In the US, most states add sales tax to most goods,
    but Delaware doesn't, which makes it one of the better states for a forwarding address,
    and this service has a Delaware location. Sign up with your Korean address,
    and you get a personal mailbox at the forwarding address; use that as the shipping address in the store.
    That's not quite the end: once your shopping is done, go back to the forwarding service and fill out a forwarding request (what you ordered and from where).
    When the package arrives they measure its weight and volume, bill you the forwarding fee, and ship it out once you pay.

 2) https://slickdeals.net/
 Deal-hunting page
  - Dedicated Black Friday page:
    https://blackfriday.slickdeals.net/?_ga=2.86393816.1922117932.1669266234-301030319.1669266233


 3) Getting cashback on Black Friday purchases
    https://www.befrugal.com/addon/installationcomplete-nonmember/
    Sign up and install the Chrome extension before shopping US stores, and points accrue automatically;
    later you can cash them out to your bank account via PayPal.
    Installing alone isn't enough: you have to stay logged in,
    and you need PayPal registered for the eventual payout.
    PayPal itself must have a card and bank account registered;
    proceed without them and the cashback evaporates...


 4) Stores - check out their Black Friday pages
   https://www.bhphotovideo.com/
   https://www.amazon.com/ref=nav_logo

========================================================================
<Hard>

 - Sign up for Costco or Walmart normally and even if a purchase goes through, they won't ship.
 - A forwarding address alone doesn't help either: shipping only works if you appear to be accessing from the US.
   Access the page through a VPN from the start, and create a US account over a US VPN.
 - How to appear US-based: VPN
 - No VPN? Install the proxy below as a Chrome extension:
   https://chrome.google.com/webstore/detail/windscribe-free-proxy-and/hnmpcagpplmpfojmgmnngilcnanddlhb

 - You'll also need a US phone number for verification / payment is PayPal only
    1) Create a Google account verified with a US number
    2) Sign up for the store over a US VPN


5) Walmart
   https://www.walmart.com/

6) Costco
   https://www.costco.com/

========================


[OS] Setting up a Windows environment for development convenience - Chocolatey

IT_Engineer/Unix & Linux (Bash)|2022. 11. 15. 12:20

A Linux environment would of course be the most comfortable for development,,,

but sometimes work convenience demands setting up a PC that runs Windows.

Or a non-development PC suddenly needs some light data work or IT tech tasks!!

 

1. What is Chocolatey (choco), the package manager for Windows?

< Chocolatey (a.k.a. choco)

 - Chocolatey is a package manager for Windows: it installs, updates, and removes the programs (packages) you need.

 - Think yum/apt(-get) on Linux or Homebrew on Mac.

- Official site: https://chocolatey.org/

 


 

2. Installation

1) Copy the install command

- Home page > Try It Now (top right) > under Step 2, "Now run the following command", copy the command (clipboard icon)

2) Run it in PowerShell

 - Windows search, or Shift + right-click in Explorer > run PowerShell as administrator > run the copied command

 - Paste the command as-is and hit Enter!!!!!  > close the PowerShell window

3) Using Chocolatey

3.1) Chocolatey home > Find Packages: look up the package you need!

3.2) Agree to the terms and disclaimer > I Understand

3.3) Type the package name into the search box > copy the install command and paste it straight into the PowerShell window, done! (example below)
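
For example, installing Git looks like this (standard choco commands; git is just an example package name):

# install a package without confirmation prompts
choco install git -y

# later on, upgrade everything choco manages
choco upgrade all -y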

 


[Kube] Kubeflow, WHO ARE YOU?

IT_Engineer/Cloud|2022. 11. 2. 16:46

 

1. Overview: what is KUBEFLOW?

  KUBEFLOW = a portmanteau of "Kube"rnetes + ML "FLOW"
  That is, a Kubernetes-based machine-learning workflow platform (or AI platform)

 

 

🔵 On top of Kubernetes, it automates the machine-learning

workflow, providing at each stage of the ML workflow

the tools and environment needed to develop ML models

→ ML models become easy to develop and deploy, and easy to scale and port

 

※ ML workflow: the entire process of developing a machine-learning model
   (data preprocessing, exploratory data analysis, data transformation, hyperparameter tuning, model training, model deployment)

 

2. Architecture and structure

 

 
