Beginner to Intermediate

Snowflake
SQL Learning Guide

A complete hands-on reference covering Security, Warehouses, Data Loading, Performance, Advanced Features, and Data Resilience — with annotated code examples and a searchable Privileges reference.

Modules

Lessons

100+

Code Blocks

Security & Access Control

Security 3 lessons · RBAC, Masking, Row Policies, Tags

Snowflake uses a layered security model. Role-Based Access Control (RBAC) governs who can access what, while Discretionary Access Control (DAC) lets object owners grant privileges. On top of that, Column Masking and Row Access Policies let you hide sensitive data from certain roles without changing the underlying table.

1.1Access Control — RBAC, DAC, Roles & Privileges

Snowflake has a hierarchy of system roles: ACCOUNTADMIN → SECURITYADMIN → SYSADMIN → PUBLIC. You should create custom roles (like ANALYST) and grant them the minimum privileges needed — principle of least privilege.

RBAC: privileges are granted to roles, roles are granted to users or other roles. DAC: every object has an owner who can grant access to others.

💡Custom roles should always be assigned to SYSADMIN (or higher) so that SYSADMIN can manage all objects owned by custom roles.

USE ROLE ACCOUNTADMIN;
SHOW ROLES;
SHOW GRANTS TO ROLE SECURITYADMIN;

USE ROLE SYSADMIN;
CREATE DATABASE FILMS_DB;
CREATE SCHEMA FILMS_SCHEMA;
CREATE TABLE FILMS_SYSADMIN (
  ID           STRING,
  TITLE        STRING,
  RELEASE_DATE DATE,
  RATING       INT
);

USE ROLE SECURITYADMIN;
CREATE ROLE ANALYST;
GRANT USAGE ON DATABASE FILMS_DB TO ROLE ANALYST;
GRANT USAGE, CREATE TABLE ON SCHEMA FILMS_DB.FILMS_SCHEMA TO ROLE ANALYST;
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE ANALYST;
GRANT ROLE ANALYST TO ROLE SYSADMIN;
GRANT ROLE ANALYST TO USER ADMIN;

GRANT USAGE ON FUTURE SCHEMAS IN DATABASE FILMS_DB TO ROLE ANALYST;
USE ROLE SYSADMIN;
CREATE SCHEMA MUSIC_SCHEMA;
CREATE SCHEMA BOOKS_SCHEMA;
USE ROLE USERADMIN;
CREATE USER Rajesh
  PASSWORD='temp'
  DEFAULT_ROLE = analyst
  DEFAULT_WAREHOUSE='COMPUTE_WH'
  MUST_CHANGE_PASSWORD=TRUE;
USE ROLE SECURITYADMIN;
GRANT ROLE analyst TO USER Rajesh;
USE ROLE SYSADMIN;
DROP DATABASE FILMS_DB;

RBACGRANTCREATE ROLEFUTURE GRANTSCREATE USERPrivilege Inheritance

1.2Column & Row Level Security — Masking & Row Access Policies

Column Masking Policies let you show different versions of a column to different roles. Row Access Policies filter which rows a role can see. Both are managed by a dedicated MASKING_ADMIN role.

💡A table can have only ONE row access policy, but the policy can use a mapping table for complex multi-role logic.

USE ROLE MASKING_ADMIN;
CREATE OR REPLACE MASKING POLICY EMAIL_MASK
  AS (VAL STRING) RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() IN ('ANALYST') THEN VAL
    ELSE REGEXP_REPLACE(VAL, '.+\@', '******@')
  END;
ALTER TABLE CUSTOMERS MODIFY COLUMN EMAIL
  SET MASKING POLICY EMAIL_MASK;

CREATE OR REPLACE ROW ACCESS POLICY CUSTOMER_POLICY
  AS (COUNTRY_CODE VARCHAR) RETURNS BOOLEAN ->
  'SYSADMIN' = CURRENT_ROLE()
    OR EXISTS (
      SELECT 1 FROM TITLE_COUNTRY_MAPPING
        WHERE TITLE = CURRENT_ROLE()
          AND COUNTRY_ISO_CODE = COUNTRY_CODE
    );
ALTER TABLE CUSTOMERS DROP ALL ROW ACCESS POLICIES;
ALTER TABLE CUSTOMERS ADD ROW ACCESS POLICY CUSTOMER_POLICY ON (COUNTRY_CODE);

MASKING POLICYROW ACCESS POLICYCURRENT_ROLE()Column Security

1.3Object Tagging — Classification & Governance

Tags are key-value labels you attach to Snowflake objects. They enable data governance — e.g., tracking which columns hold PII. Use ALLOWED_VALUES to restrict valid tag values.

CREATE OR REPLACE TAG business_unit
  ALLOWED_VALUES 'sales', 'HR', 'operations';
ALTER WAREHOUSE compute_wh
  SET TAG business_unit = 'sales';
CREATE DATABASE operations_db TAG (business_unit = 'operations');
ALTER TABLE customer
  ALTER COLUMN NAME
  SET TAG security_objects_db.security_objects_schema.protected_data = 'PII';

CREATE TAGALLOWED_VALUESTag InheritancePII Classification

Virtual Warehouses & Compute

Compute 3 lessons · Warehouses, Resource Monitors, Multi-cluster

A Virtual Warehouse is a cluster of compute resources that executes queries — completely separate from storage. Resource Monitors set spending caps to prevent runaway costs.

2.1Virtual Warehouse — Creation, Sizing & State Management

Warehouses range from X-Small to 6X-Large. Doubling size doubles performance and credit consumption. AUTO_SUSPEND and AUTO_RESUME control billing behavior.

CREATE WAREHOUSE DATA_ANALYSIS_WAREHOUSE
  WAREHOUSE_SIZE = 'SMALL'
  AUTO_SUSPEND   = 600
  AUTO_RESUME    = TRUE
  INITIALLY_SUSPENDED = TRUE;
ALTER WAREHOUSE DATA_ANALYSIS_WAREHOUSE RESUME;
ALTER WAREHOUSE DATA_ANALYSIS_WAREHOUSE SET WAREHOUSE_SIZE = LARGE;
SHOW WAREHOUSES LIKE 'DATA_ANALYSIS_WAREHOUSE';

CREATE WAREHOUSEAUTO_SUSPENDAUTO_RESUMEWAREHOUSE_SIZE

2.2Resource Monitors & Warehouse Cost Monitoring

A Resource Monitor tracks credit usage and fires triggers at thresholds. WAREHOUSE_METERING_HISTORY shows exact credit consumption per warehouse per hour.

CREATE RESOURCE MONITOR MY_MONITOR
  WITH CREDIT_QUOTA = 100
  TRIGGERS
    ON 50  PERCENT DO NOTIFY
    ON 90  PERCENT DO SUSPEND
    ON 100 PERCENT DO SUSPEND_IMMEDIATE;
ALTER WAREHOUSE COMPUTE_WH
  SET RESOURCE_MONITOR = "MY_MONITOR";

RESOURCE MONITORCREDIT_QUOTASUSPEND_IMMEDIATEWAREHOUSE_METERING_HISTORY

2.3Multi-cluster Warehouses — Handling Concurrency

A Multi-cluster Warehouse automatically adds extra clusters to serve concurrent users. STANDARD: adds at first queue. ECONOMY: waits for sustained load.

CREATE OR REPLACE WAREHOUSE MULTI_CLUSTER_WH
  WAREHOUSE_SIZE    = 'XSMALL'
  AUTO_SUSPEND      = 600
  AUTO_RESUME       = TRUE
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 3
  SCALING_POLICY    = 'STANDARD';

MIN_CLUSTER_COUNTMAX_CLUSTER_COUNTSCALING_POLICYConcurrency

Data Loading & Ingestion

Ingestion 4 lessons · INSERT, Stages, COPY INTO, Semi-structured

Data enters Snowflake through: INSERT statements for small data, Stages as landing zones, COPY INTO for bulk loads, and VARIANT columns for semi-structured JSON.

3.1INSERT Statement Variations

INSERT OVERWRITE atomically truncates and inserts in one statement — useful for full daily refreshes without duplication risk.

INSERT INTO FILMS VALUES ('fi3p9f8hu8', 'Parasite', DATE('2019-05-30'));
INSERT INTO FILMS VALUES
  ('9x3wnr0zit', 'Citizen Kane',  DATE('1942-01-24')),
  ('2wyaojnzfq', 'Old Boy',        DATE('2004-10-15'));
INSERT INTO FILMS_2000 SELECT * FROM FILMS
  WHERE RELEASE_DATE > DATE('2000-01-01');
INSERT OVERWRITE INTO FILMS_2000 SELECT * FROM FILMS;

INSERT VALUESINSERT SELECTINSERT OVERWRITE

3.2Stage Types — User, Table & Named Stages

Three internal stage types: User stage (@~), Table stage (@%TABLE), Named stage (@MY_STAGE — recommended for production).

LIST @~;
LS @%FILMS;
CREATE STAGE FILMS_STAGE;
CREATE FILE FORMAT CSV_FILE_FORMAT
  TYPE = CSV
  SKIP_HEADER = 1;
SELECT $1, $2, $3 FROM @~/films.csv;

⚠️

The PUT command (upload local files) only works from SnowSQL CLI — not from Snowsight.

@~ User Stage@%TABLE StageNamed StageFILE FORMATPUT Command

3.3COPY INTO Table — Bulk Loading & Transformations

COPY INTO is the primary bulk-load command. It tracks loaded files (preventing double-loading by default). Use VALIDATION_MODE to test without loading.

COPY INTO FILMS
FROM @FILMS_STAGE/films.csv
FILE_FORMAT = (TYPE='CSV' SKIP_HEADER=1);

-- Transformations during load
COPY INTO FILMS FROM
( SELECT $1, $2, to_date($3)
  FROM @%FILMS/films.csv)
FILE_FORMAT = CSV_FILE_FORMAT
FORCE = TRUE;

-- VALIDATION_MODE: preview rows without inserting
COPY INTO FILMS
FROM @FILMS_STAGE/films.csv
VALIDATION_MODE = 'RETURN_ROWS';

COPY INTO FILMS
FROM @FILMS_STAGE
FILE_FORMAT = CSV_FILE_FORMAT
PATTERN = '.*[.]csv'
ON_ERROR=CONTINUE
FORCE=true;

COPY INTOFORCEPATTERNVALIDATION_MODEON_ERROR

3.4Semi-structured Data — JSON, VARIANT & FLATTEN

Snowflake stores JSON natively in a VARIANT column. Navigate paths with dot notation (col:key). LATERAL FLATTEN turns JSON arrays into rows.

CREATE OR REPLACE TABLE FILMS_ELT (JSON_VARIANT VARIANT);
SELECT
  json_variant:id                      AS id,
  json_variant:title                   AS title,
  json_variant:release_date::date     AS release_date,
  json_variant:actors[0]                AS first_actor,
  json_variant:ratings.imdb_rating      AS imdb_rating
FROM FILMS_ELT;

-- FLATTEN arrays to rows
SELECT
  json_variant:title,
  L.value
FROM FILMS_ELT F,
LATERAL FLATTEN(INPUT => F.json_variant:ratings) L;

VARIANTJSON Dot NotationFLATTENLATERAL FLATTENMATCH_BY_COLUMN_NAME

Query Performance & Optimization

Performance 3 lessons · Query History, Caching, Clustering

Snowflake has three caching layers: Metadata, Results (24h), and Local Disk. Micro-partitions (~50–500MB) store min/max metadata for pruning. Clustering keys improve partition skipping.

4.1Query History & Query Profile

ACCOUNT_USAGE.QUERY_HISTORY: 365 days, ~45 min latency. INFORMATION_SCHEMA.QUERY_HISTORY(): real-time, last 7 days only.

SELECT
  QUERY_ID, QUERY_TEXT, USER_NAME,
  ROUND(TOTAL_ELAPSED_TIME / 1000, 2) AS ELAPSED_SEC
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE ELAPSED_SEC > 3
ORDER BY ELAPSED_SEC DESC
LIMIT 10;

QUERY_HISTORYACCOUNT_USAGEINFORMATION_SCHEMAQuery Profile

4.2Caching — Metadata, Results Cache & Local Storage

1. Metadata Cache: COUNT(*), SHOW, DESCRIBE — free. 2. Results Cache: exact same query, 24h TTL, shared across all users, no warehouse needed. 3. Local Disk Cache: warehouse SSD caches scanned data.

-- ① Metadata cache — no warehouse needed
SELECT COUNT(*) FROM CUSTOMER;

-- ② Results cache — run same query twice
SELECT C_CUSTKEY, C_NAME FROM CUSTOMER LIMIT 1000000;
SELECT C_CUSTKEY, C_NAME FROM CUSTOMER LIMIT 1000000;

-- Cache MISS — CURRENT_TIMESTAMP() busts cache
SELECT C_CUSTKEY, CURRENT_TIMESTAMP() FROM CUSTOMER LIMIT 1000000;

-- Disable/re-enable result caching
ALTER ACCOUNT SET USE_CACHED_RESULT = FALSE;
ALTER ACCOUNT SET USE_CACHED_RESULT = TRUE;

Metadata CacheResults CacheLocal Disk CacheUSE_CACHED_RESULT

4.3Clustering & Micro-partition Pruning

A Clustering Key re-sorts data to make pruning more effective. Use on large tables queried frequently on a date or ID column. Lower average_overlaps = better clustering health.

ALTER TABLE MY_TABLE CLUSTER BY (sale_date);
SELECT SYSTEM$CLUSTERING_INFORMATION('MY_TABLE', '(sale_date)');
SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.AUTOMATIC_CLUSTERING_HISTORY;

Micro-partitionsCLUSTER BYSYSTEM$CLUSTERING_INFORMATIONPartition Pruning

Advanced Features

Advanced 7 lessons · Objects, UDFs, SnowSQL, Sharing, Dynamic Tables, Parameters

Full breadth of Snowflake: object types, UDFs and Stored Procedures, SnowSQL, Secure Data Sharing, the Data Marketplace, Dynamic Tables, and Parameters.

5.1Databases, Schemas, Tables & Views

Table types: Permanent (full time-travel + fail-safe), Temporary (session-scoped), Transient (limited time-travel, no fail-safe). View types: Standard, Secure, Materialized.

CREATE TABLE PERMANENT_TABLE (NAME STRING, AGE INT);
CREATE TEMPORARY TABLE TEMP_TABLE (NAME STRING, AGE INT);
CREATE TRANSIENT TABLE TRANSIENT_TABLE (NAME STRING, AGE INT);
ALTER TABLE PERMANENT_TABLE SET DATA_RETENTION_TIME_IN_DAYS = 90;
CREATE SECURE VIEW SECURE_VIEW AS SELECT * FROM PERMANENT_TABLE;
CREATE MATERIALIZED VIEW MAT_VIEW AS SELECT * FROM PERMANENT_TABLE;

PERMANENT TABLETEMPORARY TABLETRANSIENT TABLESECURE VIEWMATERIALIZED VIEW

5.2UDFs, External Functions & Stored Procedures

UDFs extend SQL with custom logic — usable inline in SELECT. Stored Procedures run procedural logic but must be called with CALL, not used inline.

CREATE OR REPLACE FUNCTION DAY_NAME_ON(num_of_days int)
RETURNS STRING AS
$$
  SELECT 'In ' || CAST(num_of_days AS STRING) || ' days it will be a '
    || dayname(dateadd(day, num_of_days, current_date()))
$$;

CREATE OR REPLACE PROCEDURE TRUNCATE_TABLES(DB STRING, SCH STRING)
RETURNS STRING
LANGUAGE JAVASCRIPT
EXECUTE AS OWNER AS
$$
  var rs = snowflake.execute({sqlText: 'SHOW TABLES IN ' + DB + '.' + SCH});
  while (rs.next()){ snowflake.execute({sqlText: 'TRUNCATE TABLE ' + rs.getColumnValue(2)}); }
  return 'Done';
$$;
CALL TRUNCATE_TABLES('MY_DB', 'MY_SCHEMA');

CREATE FUNCTIONUDF OverloadingSTORED PROCEDURECALLsnowflake.execute()

5.3SnowSQL CLI

SnowSQL is the command-line client. Required for PUT and GET commands. Run queries with -q, execute files with -f.

snowsql -a <account> -u <user> -d MY_DB -s MY_SCHEMA -r SYSADMIN -w COMPUTE_WH
snowsql ... -q 'SELECT * FROM CUSTOMER LIMIT 5'
snowsql ... -f /path/to/script.sql

-- Inside interactive mode:
!set variable_substitution=true
!define table_name=customer
SELECT * FROM &table_name LIMIT 10;
PUT file:///local/data.csv @MY_STAGE;

SnowSQLPUT Command-q FlagVariable Substitution

5.4Secure Data Sharing

Share live data across Snowflake accounts — zero-copy, no ETL. Consumers query your data directly via a read-only database.

CREATE SHARE DEMO_SHARE;
GRANT USAGE ON DATABASE DEMO_DB TO SHARE DEMO_SHARE;
GRANT USAGE ON SCHEMA DEMO_DB.DEMO_SCHEMA TO SHARE DEMO_SHARE;
GRANT SELECT ON TABLE DEMO_DB.DEMO_SCHEMA.DEMO_TABLE TO SHARE DEMO_SHARE;
ALTER SHARE DEMO_SHARE ADD ACCOUNTS = <consumer_account>;

-- Consumer side
CREATE DATABASE SHARED_DATA FROM SHARE provider_acct.DEMO_SHARE;

CREATE SHAREGRANT TO SHAREZero-copy Sharing

5.5Data Marketplace

Import third-party live datasets from the Snowflake Marketplace — query them like any other table. Data always reflects the provider's latest version. No ETL pipeline needed.

✅

Marketplace data uses the same zero-copy sharing mechanism as D5.4 — you're directly querying the provider's Snowflake account.

Data MarketplaceLive DatasetZero-copy

5.6Dynamic Tables — Declarative Data Pipelines

Dynamic Tables replace Streams + Tasks. Define a SQL query, set a TARGET_LAG, and Snowflake auto-refreshes. Chain them to build declarative pipelines.

CREATE OR REPLACE DYNAMIC TABLE cleaned_transactions
  TARGET_LAG   = '1 minute'
  WAREHOUSE    = compute_wh
  REFRESH_MODE = AUTO
  INITIALIZE   = ON_CREATE
AS
  SELECT transaction_id, customer_id, amount,
    transaction_date::DATE AS transaction_date
  FROM raw_transactions
  WHERE status = 'completed';

ALTER DYNAMIC TABLE cleaned_transactions REFRESH;
ALTER DYNAMIC TABLE cleaned_transactions SUSPEND;
ALTER DYNAMIC TABLE cleaned_transactions RESUME;

DYNAMIC TABLETARGET_LAGREFRESH_MODEINITIALIZE ON_CREATE

5.7Parameters — Precedence Hierarchy

Parameter precedence (highest to lowest): Session → User → Account. Object parameters: Table overrides Schema overrides Database overrides Account.

ALTER ACCOUNT SET  DATE_OUTPUT_FORMAT = 'YYYY/MM/DD';
ALTER USER admin SET DATE_OUTPUT_FORMAT = 'DD-MM-YYYY';
ALTER SESSION SET DATE_OUTPUT_FORMAT = 'MM, DD, YYYY'; -- wins
SHOW PARAMETERS LIKE 'DATE_OUTPUT_FORMAT';
ALTER TABLE  MY_TABLE  SET DATA_RETENTION_TIME_IN_DAYS = 5;
ALTER SCHEMA MY_SCHEMA SET DATA_RETENTION_TIME_IN_DAYS = 10;

ALTER SESSION SETSHOW PARAMETERSDATA_RETENTION_TIMEParameter Precedence

Data Resilience — Time Travel & Cloning

Resilience 2 lessons · Time Travel, UNDROP, Zero-copy Cloning

Time Travel queries data as it existed up to 90 days in the past. Cloning is metadata-only (zero-copy) — a clone shares micro-partitions until either side writes new data.

6.1Time Travel — Query the Past & Recover Objects

Three Time Travel SQL extensions: AT(OFFSET) — relative in seconds. AT(TIMESTAMP) — absolute point. BEFORE(STATEMENT) — just before a specific query ran.

💡Get a query ID from Snowsight UI or with LAST_QUERY_ID() immediately after the statement.

SELECT * FROM MY_TABLE
AT(OFFSET => -60*5);  -- 5 minutes ago

SELECT * FROM MY_TABLE
AT(TIMESTAMP => DATEADD(minute, -15, current_timestamp()));

SELECT * FROM MY_TABLE
BEFORE(STATEMENT => '<query_id>');

UNDROP TABLE MY_TABLE;
UNDROP SCHEMA MY_SCHEMA;
UNDROP DATABASE MY_DB;

-- Recover: create table from historical state
CREATE TABLE MY_TABLE_RESTORED AS
  SELECT * FROM MY_TABLE
  BEFORE(STATEMENT => '<truncate_query_id>');

⚠️

Time Travel storage costs money. Use the lowest retention that meets your recovery requirements.

AT(OFFSET)AT(TIMESTAMP)BEFORE(STATEMENT)UNDROPDATA_RETENTION_TIME_IN_DAYS

6.2Cloning — Zero-copy Instant Copies

Cloning is metadata-only — no data files are copied. The clone shares micro-partitions until either side writes. Recursive for databases: cloning a database clones all schemas and tables inside.

CREATE TABLE MY_TABLE_CLONE CLONE MY_TABLE;
CREATE DATABASE MY_DB_CLONE CLONE MY_DB;

-- Clone to a point in the past (combine with Time Travel)
CREATE OR REPLACE TABLE MY_TABLE_SNAPSHOT
  CLONE MY_TABLE
  AT(OFFSET => -60*2);

ℹ️

Clones do NOT inherit grants from the source. Re-apply privileges after cloning if needed.

CLONEZero-copyCopy-on-writeRecursive CloneTime Travel Clone

Snowflake Privileges Reference

Every privilege Snowflake supports — what it does, which object type it applies to, and how to use GRANT. Search by privilege name, object type, or keyword.

★ GRANT — The Command Behind Every Privilege

GRANT is a SQL command, not a privilege itself — but it's the mechanism behind every privilege in Snowflake. The pattern is always the same: what privilege, on what object, to whom.

-- The universal GRANT pattern:
GRANT <privilege> ON <object_type> <object_name> TO ROLE <role_name>;

-- Real examples:
GRANT SELECT ON TABLE CUSTOMERS TO ROLE ANALYST;
GRANT USAGE ON DATABASE SALES_DB TO ROLE ANALYST;
GRANT USAGE ON SCHEMA SALES_DB.SALES TO ROLE ANALYST;
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE ANALYST;
GRANT INSERT, UPDATE ON TABLE ORDERS TO ROLE ANALYST;

-- GRANT ROLE (different — no ON clause)
GRANT ROLE ANALYST TO USER Rajesh;
GRANT ROLE ANALYST TO ROLE SYSADMIN;

-- See what a role has
SHOW GRANTS TO ROLE ANALYST;
SHOW GRANTS OF ROLE ANALYST;

GRANTREVOKEGRANT ROLESHOW GRANTS

Navigation — "can you see / enter this object?"

USAGE

DATABASESCHEMAWAREHOUSE

Read-only navigation access. Lets a role see that the object exists and enter it — but grants zero ability to read data inside. On a warehouse it also means the role can run queries. You always need USAGE before any other privilege on an object.

REFERENCE

SCHEMA

Like USAGE but read-only navigation without allowing object creation. Mostly relevant in share contexts.

Data Reading

SELECT

TABLEVIEW

Allows running SELECT queries on a table or view. Requires USAGE on the parent database and schema first. Without SELECT, the role can enter the schema but gets "Insufficient privileges" on any query.

REFERENCES

TABLE

Allows a role to reference this table in foreign key constraints and view its column metadata — without being able to read actual row data.

Data Writing

INSERT

TABLE

Allows adding new rows to a table via INSERT statements.

UPDATE

TABLE

Allows modifying existing row values. Does not allow adding or deleting rows.

DELETE

TABLE

Allows removing specific rows from a table. Does not allow dropping the table itself.

TRUNCATE

TABLE

Allows deleting ALL rows at once with TRUNCATE TABLE. Faster than DELETE but cannot be filtered — it wipes everything.

Schema-level DDL — Creating Objects

CREATE TABLE

SCHEMA

Allows creating new tables inside a schema. Granted ON SCHEMA (not ON TABLE) because the table doesn't exist yet when you grant it.

CREATE VIEW

SCHEMA

Allows creating views inside a schema.

CREATE MASKING POLICY

SCHEMA

Allows defining new masking policies. Does not allow attaching them to columns — that requires APPLY MASKING POLICY separately.

CREATE ROW ACCESS POLICY

SCHEMA

Allows defining new row access policies. Defining and applying are always two separate privileges in Snowflake.

CREATE STAGE

SCHEMA

Allows creating internal or external stages for loading/unloading data files.

Policy Application — Attaching Policies to Objects

APPLY MASKING POLICY

ACCOUNT

Allows attaching (and detaching) a masking policy to a column anywhere in the account. Without this, a role can CREATE the policy but cannot activate it on any column.

APPLY ROW ACCESS POLICY

ACCOUNT

Allows attaching (and detaching) a row access policy to a table anywhere in the account. Same pattern as masking — CREATE and APPLY are always separate privileges.

APPLY TAG

ACCOUNT

Allows assigning classification tags to objects (columns, tables, warehouses). Used in data classification and governance workflows.

Ownership & Admin

OWNERSHIP

TABLESCHEMADATABASE

The highest privilege on an object. The owner can do anything to it — drop it, grant privileges on it to others, and transfer ownership. Every object has exactly one owner role.

MANAGE GRANTS

ACCOUNT

Allows granting any privilege to any role — even privileges the grantor doesn't personally hold. Extremely powerful; typically reserved for SECURITYADMIN.

MONITOR

WAREHOUSEDATABASE

Allows viewing query history, credit usage, and execution metrics — without being able to run queries or read data.

OPERATE

WAREHOUSE

Allows starting, stopping, suspending, and resizing a warehouse. Does not allow modifying warehouse config (that needs MODIFY).

MODIFY

WAREHOUSE

Allows changing warehouse properties like size, auto-suspend settings, and max cluster count.

Column-level Security

SELECT (column)

COLUMN

In column-level security, you can GRANT SELECT on specific columns only. The role can query those columns but gets NULL for any ungranted column. More granular than table-level SELECT.

SECTION 01

Architecture, Features & Object Model

Multi-cluster shared data · Layers · Editions · Object model · Billing · AI Data Cloud

Multi-Cluster Shared Data Architecture

☁ Cloud Services Layer (brain)

Query ParserOptimizerMetadata MgrAuth & Access ControlTransaction MgrSecurity

↕ decoupled

⚡ Query Processing Layer — Virtual Warehouses

WH-1 (XS)WH-2 (M)WH-3 (XL)Independent clustersMPP execution

↕ decoupled

🗄 Storage Layer — Centralized, Columnar, Compressed

Micro-partitionsColumnar formatAES-256 encryptedCloud object store (S3/Azure/GCS)

ℹStorage, compute, and services are fully decoupled. Multiple warehouses read the same data with zero contention. You pay for each independently.

Storage — Micro-Partitions

Default size50–500 MB (compressed)

FormatColumnar, compressed, immutable

Metadata storedMIN/MAX per col, distinct count, NULL count

PruningAutomatic via metadata — no indexes needed

Overlap %Lower = better clustering health

DML effectCreates new partitions (old retired via Time Travel)

Cloud Services Layer

Billing threshold10% of daily compute credits FREE

Charged forDDL, SHOW, DESCRIBE, metadata queries

Result cache TTL24 hours, shared across all users/warehouses

Result cache invalidatedOn any DML to underlying table

Transaction mgmtACID-compliant, MVCC-based

Snowflake Editions

Edition	Key Features	Multi-cluster	Time Travel	HIPAA
Standard	Core SQL, basic security	✗	0–1 day	✗
Enterprise E	Multi-cluster WH, materialized views, column masking	✓	0–90 days	✗
Business Critical BC	All Enterprise + HIPAA, PCI, private link	✓	0–90 days	✓
VPS VPS	Dedicated environment, highest isolation	✓	0–90 days	✓

Object Hierarchy

ORGANIZATIONMultiple accounts, ORGADMIN

ACCOUNTSingle cloud region

DATABASELogical namespace

SCHEMAContainer for objects

TABLE / VIEW / STAGE / PIPE / STREAM / TASK / SP / UDF / FILE FORMAT / TAG / POLICY

⚠Always fully qualify: db.schema.object. USAGE on db AND schema required before touching any object inside.

Billing Model

Compute (WH)Credits/second, 60s minimum per start

Cloud ServicesFirst 10% of daily compute is FREE

Storage$/TB/month (compressed). Includes TT + Fail-Safe.

Data TransferSame-region same-cloud = free. Cross = charged.

SnowpipeServerless credits for file processing

Materialized ViewsCredits for background refresh (serverless)

SECTION 02

Account Access & Security

RBAC · Authentication · Network Policies · Encryption · Column/Row Security · Secure Views

System-Defined Roles — Hierarchy

ORGADMINOrg-level: create/manage accounts

ACCOUNTADMINTop account role — encapsulates SYSADMIN + SECURITYADMIN

SECURITYADMINMANAGE GRANTS globally. Create/manage users & roles.

USERADMINCREATE USER + CREATE ROLE only.

SYSADMINCREATE WAREHOUSE, DATABASE, SCHEMA, TABLE. Recommended owner of all data objects.

PUBLICAuto-granted to every user/role. No default data access.

⚠Exam critical: ACCOUNTADMIN is NOT a superuser. Masking policies apply to ALL roles including ACCOUNTADMIN unless explicitly named in the WHEN clause.

Column Masking

CREATE OR REPLACE MASKING POLICY email_mask AS (val STRING) RETURNS STRING -> CASE WHEN CURRENT_ROLE() IN ('ANALYST') THEN val ELSE SHA2(val) END; ALTER TABLE users MODIFY COLUMN email SET MASKING POLICY email_mask;

Runtime evaluation — evaluated at query time via CURRENT_ROLE()
One policy per column
ACCOUNTADMIN is NOT exempt unless named in WHEN

Row Access Policy

CREATE OR REPLACE ROW ACCESS POLICY rap AS (region STRING) RETURNS BOOLEAN -> CURRENT_ROLE() = 'ACCOUNTADMIN' OR EXISTS ( SELECT 1 FROM user_region_map m WHERE m.region = region AND m.username = CURRENT_USER() ); ALTER TABLE sales ADD ROW ACCESS POLICY rap ON (region);

One RAP per table
Returns TRUE = row visible, FALSE = filtered
User doesn't know rows are hidden

Authentication

PasswordDefault, min 8 chars

MFADuo-based, TOTP

Key PairRSA 2048/4096, service accounts

OAuthExternal / Snowflake OAuth

SSO / SAMLOkta, Azure AD, ADFS

Encryption

At restAES-256 (always on)

In transitTLS 1.2+

Tri-Secret SecureBC Customer + Snowflake + Cloud key

BYOKCustomer managed via KMS

Key rotationAutomatic (30 days) or manual

Network Policies

LevelAccount-level or User-level

User-levelOverrides account-level

Allowed listIP CIDRs that CAN connect

Blocked listIPs explicitly denied

ApplyALTER ACCOUNT SET NETWORK_POLICY=p;

SECTION 03

Performance: Virtual Warehouses

Sizes · Caching · Multi-cluster · Scaling policies · Resource monitors

Warehouse Sizes & Credits

X-Small1 credit/hr

Small2 credits/hr

Medium4 credits/hr

Large8 credits/hr

X-Large16 credits/hr

2X-Large32 credits/hr

3X-Large64 credits/hr

4X-Large128 credits/hr

BillingPer-second, 60s minimum

Three Cache Layers

1. Metadata cacheCloud Services. FREE. COUNT(*), SHOW.

2. Result cache24h TTL. Shared. No warehouse needed.

Result invalidatedAny DML on underlying table

3. Local disk cachePer-warehouse SSD. Stores column data.

Local cache tipDon't suspend between similar queries

Multi-Cluster E+

SolvesConcurrency. NOT slow queries.

Scale UPResize → faster single queries

Scale OUTAdd clusters → more concurrent users

STANDARD policyScale after ~20s queue

ECONOMY policyScale after 6+ min sustained load

Maximized modeMIN = MAX (all always running)

SECTION 04

Performance: Query Optimization

Clustering · Search optimization · Materialized views · Query profile · Best practices

Clustering Keys

ALTER TABLE sales CLUSTER BY (sale_date); SELECT SYSTEM$CLUSTERING_INFORMATION('sales', '(sale_date)'); -- average_overlaps → 0 = perfectly clustered

Good candidate: large tables, frequent WHERE/JOIN on same col
Max 4 columns in a clustering key
Automatic reclustering runs in background

Search Optimization

ALTER TABLE customers ADD SEARCH OPTIMIZATION ON EQUALITY(email, phone); SELECT SYSTEM$ESTIMATE_SEARCH_OPTIMIZATION_COSTS('db.sch.t');

Best for: equality / IN predicates on high-cardinality cols
NOT good for range scans (use clustering for that)
Works on strings, numbers, VARIANT, geo

Query Optimization Checklist

Avoid SELECT * — columnar: pay per column scanned
Filter early — WHERE before GROUP BY
Avoid functions on filter cols: use ranges not YEAR(date_col)=2024
Use LIMIT during development to avoid full scans

Check Query Profile — look for disk spills, Cartesian products
Use QUALIFY instead of subquery for window function filtering
Use SAMPLE (10) for exploration
Result cache: avoid CURRENT_TIMESTAMP() if you want cache hits

SECTION 05

Data Loading & Unloading

Stages · COPY INTO · Snowpipe · File formats · GET/PUT · Unloading

Stage Types

User stage@~

Table stage@%table_name

Named internal@my_stage

Named externalS3 / Azure Blob / GCS

ListLIST @my_stage;

RemoveREMOVE @my_stage/f.csv;

COPY INTO Options

DeduplicationTracks loaded files for 64 days

FORCE=TRUEBypass dedup, reload files

ON_ERRORCONTINUE / SKIP_FILE / ABORT

VALIDATION_MODETest without loading

Optimal file size100MB–250MB compressed

ParallelismOne file per thread — more files = faster

Snowpipe

CREATE PIPE sales_pipe AUTO_INGEST = TRUE AS COPY INTO sales FROM @s3_stage;

Serverless — no warehouse, billed separately
AUTO_INGEST=TRUE → S3 event trigger
Latency typically < 1 minute

SECTION 06

Data Transformations

Streams · Tasks · Dynamic Tables · Stored Procedures · UDFs · Semi-structured

Streams — CDC

CREATE STREAM sales_stream ON TABLE sales APPEND_ONLY = FALSE; -- METADATA$ACTION: INSERT | DELETE -- METADATA$ISUPDATE: TRUE if UPDATE pair SELECT SYSTEM$STREAM_HAS_DATA('sales_stream');

Consumed atomically — reading in transaction clears it
APPEND_ONLY — only INSERTs, cheaper on large tables
Data expires with table's Time Travel window

Tasks

CREATE TASK my_task WAREHOUSE = etl_wh SCHEDULE = 'USING CRON 0 2 * * * UTC' WHEN SYSTEM$STREAM_HAS_DATA('s') AS INSERT INTO tgt SELECT * FROM s; ALTER TASK my_task RESUME;

Tasks start SUSPENDED — always RESUME after create
Child tasks use AFTER parent (DAG pattern)
Manual: EXECUTE TASK my_task;

SECTION 07

Storage, Data Protection & Sharing

Time Travel · Fail-Safe · Cloning · Data Sharing · Marketplace · External Tables

Time Travel

Standard0–1 day

Enterprise+0–90 days

SELECT * FROM t AT(OFFSET => -3600); UNDROP TABLE t; UNDROP DATABASE db;

Fail-Safe

Duration7 days (all editions)

Starts afterTime Travel window ends

AccessSnowflake Support ONLY

Temp/TransientNO Fail-Safe (saves cost)

Total protectionUp to 97 days total

Zero-Copy Cloning

CREATE TABLE dev CLONE prod; CREATE DATABASE clone CLONE prod; CREATE TABLE snap CLONE t AT(OFFSET => -3600);

Instant — metadata only, no data copy
Cost only for changes after cloning
Clones do NOT inherit grants

SECTION 08

Snowpark

Python / Java / Scala · DataFrame API · UDFs · ML

DataFrame Basics

from snowflake.snowpark import Session import snowflake.snowpark.functions as F session = Session.builder.configs({...}).create() df = session.table("sales") result = (df .filter(F.col("region") == "US") .group_by("product") .agg(F.sum("amount").alias("total")) .sort(F.col("total").desc()) ) result.show() # action — executes query result.to_pandas() # bring to local pandas DF

Snowpark Quick Reference

Operation	Snowpark	SQL
Read table	session.table("t")	FROM t
Filter	df.filter(col("x")>10)	WHERE x>10
Select	df.select("a","b")	SELECT a,b
Group/Agg	df.group_by("x").agg(sum("y"))	GROUP BY/SUM
Join	df1.join(df2,"id")	JOIN ON id
Write	df.write.save_as_table("t")	CTAS
To pandas	df.to_pandas()	— client side

💡Push-down execution: Snowpark code runs INSIDE Snowflake — no data moves to the client. Lazy evaluation means nothing runs until an action (collect(), show(), count()) is called.

Snowflake SQL

The Golden Mental Model:
Think in Layers

Before writing any query, position yourself at the right layer. Most engineers jump straight to performance tuning when they should be at Layer 1. The layers are ordered by priority — correctness always wins.

LAYER 1

Correctness First

Always — no exceptions

LAYER 2

Scalability

When data is large

LAYER 3

Cost / Performance

When queries are slow or expensive

LAYER 4

Security

When exposing to users/roles

"Filter early, aggregate late, sort last, select only what you need."

THE UNIVERSAL RULE — solves 90% of query problems

LAYER 1

Correctness — Every Single Query

These rules apply always, no exceptions. The #1 source of production bugs in SQL is misunderstanding the execution order.

SQL Execution Order (Top → Bottom)

FROM

①

→

JOIN

②

→

WHERE

③

→

GROUP BY

④

→

HAVING

⑤

→

SELECT

⑥

→

DISTINCT

⑦

→

ORDER BY

⑧

→

LIMIT

⑨

⚠️Consequences of Execution Order

❌

You cannot use a SELECT alias in WHERE

WHERE runs before SELECT — the alias doesn't exist yet. Use a CTE or subquery instead.

🔍

WHERE filters rows before aggregation; HAVING filters after

Filter individual rows in WHERE. Filter aggregated results (like SUM > 1000) in HAVING.

🔗

JOIN happens before WHERE

So filter in WHERE, not just in the ON clause, when you want to restrict the final result set.

⚡Always Filter Early

-- ❌ BAD: Joins everything then filters SELECT * FROM orders o JOIN customers c ON o.cust_id = c.id WHERE c.country = 'CA'; -- ✅ GOOD: Pre-filter in CTE, then join only CA customers WITH ca_customers AS ( SELECT id FROM customers WHERE country = 'CA' ) SELECT * FROM orders o JOIN ca_customers c ON o.cust_id = c.id;

✅

Be explicit with JOIN types — never assume implicit behavior

Always write INNER JOIN, LEFT JOIN, etc. Never rely on an implicit join that produces a Cartesian product.

LAYER 2

Scalability — When Tables > A Few Million Rows

Only think about these when your data volume justifies it. Let the Query Profile guide you to the right concern.

📊Scalability Concerns at a Glance

🗂️

Clustering — Are you filtering on the same column repeatedly?

Consider adding a cluster key on that column. Snowflake will auto-maintain it in the background.

✂️

Partition Pruning — Does your WHERE align with how the table is clustered?

A WHERE on a clustered column allows Snowflake to skip entire micro-partitions. Misaligned filters scan everything.

💾

Spilling — The Spilling Rule

Only ORDER BY at the very last step, and only the columns you truly need sorted. Sorting mid-query kills performance.

🔎

Search Optimization — Point lookups on non-clustered columns?

Enable Search Optimization Service (SOQ) for high-cardinality equality lookups like user ID or email.

🚫The Spilling Rule — Sort Only at the End

-- ❌ BAD: Sorting before aggregation (forces sort + materialize mid-query) SELECT dept, COUNT(*) FROM (SELECT * FROM emp ORDER BY salary) GROUP BY dept; -- ✅ GOOD: Sort only the final output, if needed at all SELECT dept, COUNT(*) FROM emp GROUP BY dept ORDER BY COUNT(*) DESC;

LAYER 3

Performance Tuning — When Queries Are Slow

Only when you have a problem to solve. Check Query Profile first — it tells you exactly which node is expensive.

🔬Performance Tuning Checklist

📈

Check Query Profile in Snowflake UI → which node takes the most time?

Watch for: disk spills, inefficient joins, large scans, Cartesian products.

💥

Watch for Cartesian products — a JOIN with no proper key = disaster

A missing JOIN condition multiplies every row in table A by every row in table B. Always verify your ON clause.

🎯

Avoid SELECT * — pull only needed columns

Columnar storage means you pay per column scanned. Select only what you actually use.

🧪

Use LIMIT during dev/testing to avoid full scans

Add SAMPLE (10) or LIMIT 1000 during exploration. Remove before production.

🧠The Rows → Groups → Result Mental Model

-- Think in this order: -- 1. Reduce ROWS first → WHERE, JOIN filters -- 2. Then GROUP them → GROUP BY -- 3. Then shape RESULT → SELECT, ORDER BY, LIMIT -- ❌ BAD: Functions on filter columns prevent pruning SELECT * FROM sales WHERE YEAR(sale_date) = 2024; -- scans ALL partitions -- ✅ GOOD: Range filter allows micro-partition pruning SELECT * FROM sales WHERE sale_date BETWEEN '2024-01-01' AND '2024-12-31';

LAYER 4

Security — When Exposing Data to Users/Roles

These are infrastructure concerns — set once at the table/policy level, not per query. Never implement security logic inside application WHERE clauses.

🔐Security Infrastructure Patterns

👁️

Use Secure Views when sharing views externally or across accounts

Regular views expose the query definition. SECURE VIEW hides business logic from consumers.

📏

Apply Row-level security via policies, not WHERE clauses in app code

App-level WHERE clauses can be bypassed. Row Access Policies are enforced at the engine level, transparently.

🎭

Apply Column masking on PII columns at the table policy level

Set masking once at the column level. It applies to every query against that column, for all roles not explicitly exempted.

CHECKLIST

5 Questions Before Running Any Query

Do I know what JOIN type I need and why?
INNER vs LEFT vs CROSS — be explicit, never guess.

Am I filtering as early as possible?
WHERE before GROUP BY. CTEs to pre-filter large tables.

Am I selecting only the columns I need?
No SELECT *. Name every column. Pay only for what you scan.

Am I ORDER BY-ing only when truly necessary?
Sorting is expensive at scale. Skip it unless output order matters.

Have I checked for fan-out risk?
1-to-many joins can silently double (or 10x) your row count. Verify cardinality before aggregating.

💡

For 90% of queries:
Knowing SQL execution order + the 5 questions above = no bugs. Snowflake-specific tuning only when Query Profile shows a problem.

SnowflakeSQL Learning Guide

Snowflake Privileges Reference

The Golden Mental Model:Think in Layers

Correctness — Every Single Query

Scalability — When Tables > A Few Million Rows

Performance Tuning — When Queries Are Slow

Security — When Exposing Data to Users/Roles

5 Questions Before Running Any Query

Snowflake
SQL Learning Guide

The Golden Mental Model:
Think in Layers