Big Data 37 min read

ODPS (MaxCompute) SQL Basics, Data Integration and Hologres Import Guide

This guide provides a comprehensive, beginner‑to‑advanced reference for ODPS (MaxCompute) SQL, covering table creation, DDL/DML commands, query syntax, join hints, MySQL‑to‑ODPS synchronization, one‑click and custom imports into Hologres, and scheduling variables for automated data pipelines.

DaTaobao Tech

Jul 8, 2024

ODPS (MaxCompute) SQL Basics, Data Integration and Hologres Import Guide

This article collects beginner and advanced knowledge of ODPS (MaxCompute) development, providing a mini‑encyclopedia for newcomers.

Basic Functions

It introduces the MaxCompute module and shows DDL statements such as creating tables, adding comments, partitions, clustering, transactional properties, and examples of ALTER statements for changing owner, comments, renaming, and dropping tables.

CREATE TABLE IF NOT EXISTS xxx.xxx_xxxx_xxxx_hh (
  xxxxx STRING COMMENT '商品',
  xxxxx STRING COMMENT '名字'
) COMMENT 'xxx表' PARTITIONED BY (ds STRING COMMENT 'yyyymmddhh') LIFECYCLE 7;

DML Operations

Typical DML commands are demonstrated, including INSERT/OVERWRITE, UPDATE, DELETE, MERGE, LOAD, UNLOAD and the use of ZORDER for data layout optimization.

INSERT INTO sale_detail PARTITION (sale_date='2013', region='china') VALUES ('s1','c1',100.1),('s2','c2',100.2);

MERGE INTO target_table t USING source_table s ON t.id = s.id
WHEN MATCHED THEN UPDATE SET t.col = s.col
WHEN NOT MATCHED AND s.event_type='I' THEN INSERT VALUES (s.*);

Query Language (DQL)

The guide covers SELECT syntax, WITH clause, subqueries, set operations (UNION, INTERSECT, EXCEPT), GROUP BY, HAVING, ORDER BY, DISTRIBUTE BY, SORT BY, and LIMIT.

SELECT region, SUM(price) FROM sale_detail GROUP BY region HAVING SUM(price) < 305 ORDER BY region LIMIT 10;

Joins and Hints

Various join types are explained (INNER, LEFT, RIGHT, FULL, NATURAL) and advanced hints such as MAPJOIN and SKEWJOIN for performance tuning.

SELECT /*+ MAPJOIN(a) */ a.*, b.* FROM large_table a JOIN small_table b ON a.id = b.id;

SELECT /*+ SKEWJOIN(a) */ * FROM T0 a JOIN T1 b ON a.c0 = b.c0;

Data Synchronization (MySQL → ODPS)

Step‑by‑step instructions describe how to configure a DataWorks synchronization job: select source MySQL table, generate target ODPS table, set lifecycle and partitioning, configure scheduling (daily/hourly), and run the job.

ODPS → Hologres Import

Two methods are provided: (1) one‑click visual import via DataStudio, and (2) custom SQL import. The process includes creating an external table that maps the MaxCompute source, defining the target Hologres table (including partitions and indexes), inserting data into temporary partitions, and attaching them to the parent table.

IMPORT FOREIGN SCHEMA public_data LIMIT TO (dwd_product_movie_basic_info) FROM SERVER odps_server INTO public OPTIONS(if_table_exist='update');

BEGIN;
CREATE TABLE public.holo_dwd_product_movie_basic_info (
  movie_name TEXT,
  director TEXT,
  ds TEXT
) PARTITION BY LIST (ds);
COMMIT;

Scheduling Configuration

The guide explains how to set scheduling parameters using built‑in variables (${bizdate}, $[yyyymmddhh24miss]), time attributes, resource groups, and dependency configuration. It also lists built‑in variables such as $bizdate, $cyctime, $gmtdate, $bizmonth, $jobid, $nodeid, $taskid.

SET odps.sql.allow.fullscan=true;
INSERT INTO tmp_holo_table_${bizdate} SELECT * FROM dwd_table WHERE ds='${bizdate}';

Overall, the article serves as a comprehensive reference for writing ODPS SQL, performing data integration, and automating data pipelines to Hologres.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

sql Hologres ETL Data Integration ODPS

Written by

DaTaobao Tech

Official account of DaTaobao Technology

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.