Topics
Beginner to Intermediate

Snowflake
SQL Learning Guide

A complete hands-on reference covering Security, Warehouses, Data Loading, Performance, Advanced Features, and Data Resilience — with annotated code examples and a searchable Privileges reference.

6
Modules
23
Lessons
100+
Code Blocks
D1
Security & Access Control
Security 3 lessons · RBAC, Masking, Row Policies, Tags
Snowflake uses a layered security model. Role-Based Access Control (RBAC) governs who can access what, while Discretionary Access Control (DAC) lets object owners grant privileges. On top of that, Column Masking and Row Access Policies let you hide sensitive data from certain roles without changing the underlying table.
1.1Access Control — RBAC, DAC, Roles & Privileges

Snowflake has a hierarchy of system roles: ACCOUNTADMIN → SECURITYADMIN → SYSADMIN → PUBLIC. You should create custom roles (like ANALYST) and grant them the minimum privileges needed — principle of least privilege.

RBAC: privileges are granted to roles, roles are granted to users or other roles. DAC: every object has an owner who can grant access to others.

💡Custom roles should always be assigned to SYSADMIN (or higher) so that SYSADMIN can manage all objects owned by custom roles.
Setup Context & Roles
USE ROLE ACCOUNTADMIN;
SHOW ROLES;
SHOW GRANTS TO ROLE SECURITYADMIN;
Create Database, Schema & Table as SYSADMIN
USE ROLE SYSADMIN;
CREATE DATABASE FILMS_DB;
CREATE SCHEMA FILMS_SCHEMA;
CREATE TABLE FILMS_SYSADMIN (
  ID           STRING,
  TITLE        STRING,
  RELEASE_DATE DATE,
  RATING       INT
);
Create Custom Role & Grant Privileges
USE ROLE SECURITYADMIN;
CREATE ROLE ANALYST;
GRANT USAGE ON DATABASE FILMS_DB TO ROLE ANALYST;
GRANT USAGE, CREATE TABLE ON SCHEMA FILMS_DB.FILMS_SCHEMA TO ROLE ANALYST;
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE ANALYST;
GRANT ROLE ANALYST TO ROLE SYSADMIN;
GRANT ROLE ANALYST TO USER ADMIN;
Future Grants & Create User
GRANT USAGE ON FUTURE SCHEMAS IN DATABASE FILMS_DB TO ROLE ANALYST;
USE ROLE SYSADMIN;
CREATE SCHEMA MUSIC_SCHEMA;
CREATE SCHEMA BOOKS_SCHEMA;
USE ROLE USERADMIN;
CREATE USER Rajesh
  PASSWORD='temp'
  DEFAULT_ROLE = analyst
  DEFAULT_WAREHOUSE='COMPUTE_WH'
  MUST_CHANGE_PASSWORD=TRUE;
USE ROLE SECURITYADMIN;
GRANT ROLE analyst TO USER Rajesh;
USE ROLE SYSADMIN;
DROP DATABASE FILMS_DB;
RBACGRANTCREATE ROLEFUTURE GRANTSCREATE USERPrivilege Inheritance
1.2Column & Row Level Security — Masking & Row Access Policies

Column Masking Policies let you show different versions of a column to different roles. Row Access Policies filter which rows a role can see. Both are managed by a dedicated MASKING_ADMIN role.

💡A table can have only ONE row access policy, but the policy can use a mapping table for complex multi-role logic.
Column Masking Policy
USE ROLE MASKING_ADMIN;
CREATE OR REPLACE MASKING POLICY EMAIL_MASK
  AS (VAL STRING) RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() IN ('ANALYST') THEN VAL
    ELSE REGEXP_REPLACE(VAL, '.+\@', '******@')
  END;
ALTER TABLE CUSTOMERS MODIFY COLUMN EMAIL
  SET MASKING POLICY EMAIL_MASK;
Row Access Policy — Dynamic Country-Based Filtering
CREATE OR REPLACE ROW ACCESS POLICY CUSTOMER_POLICY
  AS (COUNTRY_CODE VARCHAR) RETURNS BOOLEAN ->
  'SYSADMIN' = CURRENT_ROLE()
    OR EXISTS (
      SELECT 1 FROM TITLE_COUNTRY_MAPPING
        WHERE TITLE = CURRENT_ROLE()
          AND COUNTRY_ISO_CODE = COUNTRY_CODE
    );
ALTER TABLE CUSTOMERS DROP ALL ROW ACCESS POLICIES;
ALTER TABLE CUSTOMERS ADD ROW ACCESS POLICY CUSTOMER_POLICY ON (COUNTRY_CODE);
MASKING POLICYROW ACCESS POLICYCURRENT_ROLE()Column Security
1.3Object Tagging — Classification & Governance

Tags are key-value labels you attach to Snowflake objects. They enable data governance — e.g., tracking which columns hold PII. Use ALLOWED_VALUES to restrict valid tag values.

Create Tags with Allowed Values
CREATE OR REPLACE TAG business_unit
  ALLOWED_VALUES 'sales', 'HR', 'operations';
ALTER WAREHOUSE compute_wh
  SET TAG business_unit = 'sales';
CREATE DATABASE operations_db TAG (business_unit = 'operations');
ALTER TABLE customer
  ALTER COLUMN NAME
  SET TAG security_objects_db.security_objects_schema.protected_data = 'PII';
CREATE TAGALLOWED_VALUESTag InheritancePII Classification
D2
Virtual Warehouses & Compute
Compute 3 lessons · Warehouses, Resource Monitors, Multi-cluster
A Virtual Warehouse is a cluster of compute resources that executes queries — completely separate from storage. Resource Monitors set spending caps to prevent runaway costs.
2.1Virtual Warehouse — Creation, Sizing & State Management

Warehouses range from X-Small to 6X-Large. Doubling size doubles performance and credit consumption. AUTO_SUSPEND and AUTO_RESUME control billing behavior.

Create & Manage a Virtual Warehouse
CREATE WAREHOUSE DATA_ANALYSIS_WAREHOUSE
  WAREHOUSE_SIZE = 'SMALL'
  AUTO_SUSPEND   = 600
  AUTO_RESUME    = TRUE
  INITIALLY_SUSPENDED = TRUE;
ALTER WAREHOUSE DATA_ANALYSIS_WAREHOUSE RESUME;
ALTER WAREHOUSE DATA_ANALYSIS_WAREHOUSE SET WAREHOUSE_SIZE = LARGE;
SHOW WAREHOUSES LIKE 'DATA_ANALYSIS_WAREHOUSE';
CREATE WAREHOUSEAUTO_SUSPENDAUTO_RESUMEWAREHOUSE_SIZE
2.2Resource Monitors & Warehouse Cost Monitoring

A Resource Monitor tracks credit usage and fires triggers at thresholds. WAREHOUSE_METERING_HISTORY shows exact credit consumption per warehouse per hour.

Create Resource Monitor
CREATE RESOURCE MONITOR MY_MONITOR
  WITH CREDIT_QUOTA = 100
  TRIGGERS
    ON 50  PERCENT DO NOTIFY
    ON 90  PERCENT DO SUSPEND
    ON 100 PERCENT DO SUSPEND_IMMEDIATE;
ALTER WAREHOUSE COMPUTE_WH
  SET RESOURCE_MONITOR = "MY_MONITOR";
RESOURCE MONITORCREDIT_QUOTASUSPEND_IMMEDIATEWAREHOUSE_METERING_HISTORY
2.3Multi-cluster Warehouses — Handling Concurrency

A Multi-cluster Warehouse automatically adds extra clusters to serve concurrent users. STANDARD: adds at first queue. ECONOMY: waits for sustained load.

Create Multi-cluster Warehouse
CREATE OR REPLACE WAREHOUSE MULTI_CLUSTER_WH
  WAREHOUSE_SIZE    = 'XSMALL'
  AUTO_SUSPEND      = 600
  AUTO_RESUME       = TRUE
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 3
  SCALING_POLICY    = 'STANDARD';
MIN_CLUSTER_COUNTMAX_CLUSTER_COUNTSCALING_POLICYConcurrency
D3
Data Loading & Ingestion
Ingestion 4 lessons · INSERT, Stages, COPY INTO, Semi-structured
Data enters Snowflake through: INSERT statements for small data, Stages as landing zones, COPY INTO for bulk loads, and VARIANT columns for semi-structured JSON.
3.1INSERT Statement Variations

INSERT OVERWRITE atomically truncates and inserts in one statement — useful for full daily refreshes without duplication risk.

INSERT Variations
INSERT INTO FILMS VALUES ('fi3p9f8hu8', 'Parasite', DATE('2019-05-30'));
INSERT INTO FILMS VALUES
  ('9x3wnr0zit', 'Citizen Kane',  DATE('1942-01-24')),
  ('2wyaojnzfq', 'Old Boy',        DATE('2004-10-15'));
INSERT INTO FILMS_2000 SELECT * FROM FILMS
  WHERE RELEASE_DATE > DATE('2000-01-01');
INSERT OVERWRITE INTO FILMS_2000 SELECT * FROM FILMS;
INSERT VALUESINSERT SELECTINSERT OVERWRITE
3.2Stage Types — User, Table & Named Stages

Three internal stage types: User stage (@~), Table stage (@%TABLE), Named stage (@MY_STAGE — recommended for production).

Stage Types & Querying Staged Files
LIST @~;
LS @%FILMS;
CREATE STAGE FILMS_STAGE;
CREATE FILE FORMAT CSV_FILE_FORMAT
  TYPE = CSV
  SKIP_HEADER = 1;
SELECT $1, $2, $3 FROM @~/films.csv;
⚠️
The PUT command (upload local files) only works from SnowSQL CLI — not from Snowsight.
@~ User Stage@%TABLE StageNamed StageFILE FORMATPUT Command
3.3COPY INTO Table — Bulk Loading & Transformations

COPY INTO is the primary bulk-load command. It tracks loaded files (preventing double-loading by default). Use VALIDATION_MODE to test without loading.

COPY INTO — Basics & Options
COPY INTO FILMS
FROM @FILMS_STAGE/films.csv
FILE_FORMAT = (TYPE='CSV' SKIP_HEADER=1);

-- Transformations during load
COPY INTO FILMS FROM
( SELECT $1, $2, to_date($3)
  FROM @%FILMS/films.csv)
FILE_FORMAT = CSV_FILE_FORMAT
FORCE = TRUE;

-- VALIDATION_MODE: preview rows without inserting
COPY INTO FILMS
FROM @FILMS_STAGE/films.csv
VALIDATION_MODE = 'RETURN_ROWS';

COPY INTO FILMS
FROM @FILMS_STAGE
FILE_FORMAT = CSV_FILE_FORMAT
PATTERN = '.*[.]csv'
ON_ERROR=CONTINUE
FORCE=true;
COPY INTOFORCEPATTERNVALIDATION_MODEON_ERROR
3.4Semi-structured Data — JSON, VARIANT & FLATTEN

Snowflake stores JSON natively in a VARIANT column. Navigate paths with dot notation (col:key). LATERAL FLATTEN turns JSON arrays into rows.

ELT — Load Raw JSON & Query
CREATE OR REPLACE TABLE FILMS_ELT (JSON_VARIANT VARIANT);
SELECT
  json_variant:id                      AS id,
  json_variant:title                   AS title,
  json_variant:release_date::date     AS release_date,
  json_variant:actors[0]                AS first_actor,
  json_variant:ratings.imdb_rating      AS imdb_rating
FROM FILMS_ELT;

-- FLATTEN arrays to rows
SELECT
  json_variant:title,
  L.value
FROM FILMS_ELT F,
LATERAL FLATTEN(INPUT => F.json_variant:ratings) L;
VARIANTJSON Dot NotationFLATTENLATERAL FLATTENMATCH_BY_COLUMN_NAME
D4
Query Performance & Optimization
Performance 3 lessons · Query History, Caching, Clustering
Snowflake has three caching layers: Metadata, Results (24h), and Local Disk. Micro-partitions (~50–500MB) store min/max metadata for pruning. Clustering keys improve partition skipping.
4.1Query History & Query Profile

ACCOUNT_USAGE.QUERY_HISTORY: 365 days, ~45 min latency. INFORMATION_SCHEMA.QUERY_HISTORY(): real-time, last 7 days only.

Query History Views
SELECT
  QUERY_ID, QUERY_TEXT, USER_NAME,
  ROUND(TOTAL_ELAPSED_TIME / 1000, 2) AS ELAPSED_SEC
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE ELAPSED_SEC > 3
ORDER BY ELAPSED_SEC DESC
LIMIT 10;
QUERY_HISTORYACCOUNT_USAGEINFORMATION_SCHEMAQuery Profile
4.2Caching — Metadata, Results Cache & Local Storage

1. Metadata Cache: COUNT(*), SHOW, DESCRIBE — free. 2. Results Cache: exact same query, 24h TTL, shared across all users, no warehouse needed. 3. Local Disk Cache: warehouse SSD caches scanned data.

Demonstrating Cache Layers
-- ① Metadata cache — no warehouse needed
SELECT COUNT(*) FROM CUSTOMER;

-- ② Results cache — run same query twice
SELECT C_CUSTKEY, C_NAME FROM CUSTOMER LIMIT 1000000;
SELECT C_CUSTKEY, C_NAME FROM CUSTOMER LIMIT 1000000;

-- Cache MISS — CURRENT_TIMESTAMP() busts cache
SELECT C_CUSTKEY, CURRENT_TIMESTAMP() FROM CUSTOMER LIMIT 1000000;

-- Disable/re-enable result caching
ALTER ACCOUNT SET USE_CACHED_RESULT = FALSE;
ALTER ACCOUNT SET USE_CACHED_RESULT = TRUE;
Metadata CacheResults CacheLocal Disk CacheUSE_CACHED_RESULT
4.3Clustering & Micro-partition Pruning

A Clustering Key re-sorts data to make pruning more effective. Use on large tables queried frequently on a date or ID column. Lower average_overlaps = better clustering health.

Clustering
ALTER TABLE MY_TABLE CLUSTER BY (sale_date);
SELECT SYSTEM$CLUSTERING_INFORMATION('MY_TABLE', '(sale_date)');
SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.AUTOMATIC_CLUSTERING_HISTORY;
Micro-partitionsCLUSTER BYSYSTEM$CLUSTERING_INFORMATIONPartition Pruning
D5
Advanced Features
Advanced 7 lessons · Objects, UDFs, SnowSQL, Sharing, Dynamic Tables, Parameters
Full breadth of Snowflake: object types, UDFs and Stored Procedures, SnowSQL, Secure Data Sharing, the Data Marketplace, Dynamic Tables, and Parameters.
5.1Databases, Schemas, Tables & Views

Table types: Permanent (full time-travel + fail-safe), Temporary (session-scoped), Transient (limited time-travel, no fail-safe). View types: Standard, Secure, Materialized.

Table & View Types
CREATE TABLE PERMANENT_TABLE (NAME STRING, AGE INT);
CREATE TEMPORARY TABLE TEMP_TABLE (NAME STRING, AGE INT);
CREATE TRANSIENT TABLE TRANSIENT_TABLE (NAME STRING, AGE INT);
ALTER TABLE PERMANENT_TABLE SET DATA_RETENTION_TIME_IN_DAYS = 90;
CREATE SECURE VIEW SECURE_VIEW AS SELECT * FROM PERMANENT_TABLE;
CREATE MATERIALIZED VIEW MAT_VIEW AS SELECT * FROM PERMANENT_TABLE;
PERMANENT TABLETEMPORARY TABLETRANSIENT TABLESECURE VIEWMATERIALIZED VIEW
5.2UDFs, External Functions & Stored Procedures

UDFs extend SQL with custom logic — usable inline in SELECT. Stored Procedures run procedural logic but must be called with CALL, not used inline.

SQL UDF & Stored Procedure
CREATE OR REPLACE FUNCTION DAY_NAME_ON(num_of_days int)
RETURNS STRING AS
$$
  SELECT 'In ' || CAST(num_of_days AS STRING) || ' days it will be a '
    || dayname(dateadd(day, num_of_days, current_date()))
$$;

CREATE OR REPLACE PROCEDURE TRUNCATE_TABLES(DB STRING, SCH STRING)
RETURNS STRING
LANGUAGE JAVASCRIPT
EXECUTE AS OWNER AS
$$
  var rs = snowflake.execute({sqlText: 'SHOW TABLES IN ' + DB + '.' + SCH});
  while (rs.next()){ snowflake.execute({sqlText: 'TRUNCATE TABLE ' + rs.getColumnValue(2)}); }
  return 'Done';
$$;
CALL TRUNCATE_TABLES('MY_DB', 'MY_SCHEMA');
CREATE FUNCTIONUDF OverloadingSTORED PROCEDURECALLsnowflake.execute()
5.3SnowSQL CLI

SnowSQL is the command-line client. Required for PUT and GET commands. Run queries with -q, execute files with -f.

SnowSQL Usage
snowsql -a <account> -u <user> -d MY_DB -s MY_SCHEMA -r SYSADMIN -w COMPUTE_WH
snowsql ... -q 'SELECT * FROM CUSTOMER LIMIT 5'
snowsql ... -f /path/to/script.sql

-- Inside interactive mode:
!set variable_substitution=true
!define table_name=customer
SELECT * FROM &table_name LIMIT 10;
PUT file:///local/data.csv @MY_STAGE;
SnowSQLPUT Command-q FlagVariable Substitution
5.4Secure Data Sharing

Share live data across Snowflake accounts — zero-copy, no ETL. Consumers query your data directly via a read-only database.

Create Share (Provider Side)
CREATE SHARE DEMO_SHARE;
GRANT USAGE ON DATABASE DEMO_DB TO SHARE DEMO_SHARE;
GRANT USAGE ON SCHEMA DEMO_DB.DEMO_SCHEMA TO SHARE DEMO_SHARE;
GRANT SELECT ON TABLE DEMO_DB.DEMO_SCHEMA.DEMO_TABLE TO SHARE DEMO_SHARE;
ALTER SHARE DEMO_SHARE ADD ACCOUNTS = <consumer_account>;

-- Consumer side
CREATE DATABASE SHARED_DATA FROM SHARE provider_acct.DEMO_SHARE;
CREATE SHAREGRANT TO SHAREZero-copy Sharing
5.5Data Marketplace

Import third-party live datasets from the Snowflake Marketplace — query them like any other table. Data always reflects the provider's latest version. No ETL pipeline needed.

Marketplace data uses the same zero-copy sharing mechanism as D5.4 — you're directly querying the provider's Snowflake account.
Data MarketplaceLive DatasetZero-copy
5.6Dynamic Tables — Declarative Data Pipelines

Dynamic Tables replace Streams + Tasks. Define a SQL query, set a TARGET_LAG, and Snowflake auto-refreshes. Chain them to build declarative pipelines.

Dynamic Table Pipeline
CREATE OR REPLACE DYNAMIC TABLE cleaned_transactions
  TARGET_LAG   = '1 minute'
  WAREHOUSE    = compute_wh
  REFRESH_MODE = AUTO
  INITIALIZE   = ON_CREATE
AS
  SELECT transaction_id, customer_id, amount,
    transaction_date::DATE AS transaction_date
  FROM raw_transactions
  WHERE status = 'completed';

ALTER DYNAMIC TABLE cleaned_transactions REFRESH;
ALTER DYNAMIC TABLE cleaned_transactions SUSPEND;
ALTER DYNAMIC TABLE cleaned_transactions RESUME;
DYNAMIC TABLETARGET_LAGREFRESH_MODEINITIALIZE ON_CREATE
5.7Parameters — Precedence Hierarchy

Parameter precedence (highest to lowest): Session → User → Account. Object parameters: Table overrides Schema overrides Database overrides Account.

Parameter Precedence Demo
ALTER ACCOUNT SET  DATE_OUTPUT_FORMAT = 'YYYY/MM/DD';
ALTER USER admin SET DATE_OUTPUT_FORMAT = 'DD-MM-YYYY';
ALTER SESSION SET DATE_OUTPUT_FORMAT = 'MM, DD, YYYY'; -- wins
SHOW PARAMETERS LIKE 'DATE_OUTPUT_FORMAT';
ALTER TABLE  MY_TABLE  SET DATA_RETENTION_TIME_IN_DAYS = 5;
ALTER SCHEMA MY_SCHEMA SET DATA_RETENTION_TIME_IN_DAYS = 10;
ALTER SESSION SETSHOW PARAMETERSDATA_RETENTION_TIMEParameter Precedence
D6
Data Resilience — Time Travel & Cloning
Resilience 2 lessons · Time Travel, UNDROP, Zero-copy Cloning
Time Travel queries data as it existed up to 90 days in the past. Cloning is metadata-only (zero-copy) — a clone shares micro-partitions until either side writes new data.
6.1Time Travel — Query the Past & Recover Objects

Three Time Travel SQL extensions: AT(OFFSET) — relative in seconds. AT(TIMESTAMP) — absolute point. BEFORE(STATEMENT) — just before a specific query ran.

💡Get a query ID from Snowsight UI or with LAST_QUERY_ID() immediately after the statement.
Time Travel SQL Extensions
SELECT * FROM MY_TABLE
AT(OFFSET => -60*5);  -- 5 minutes ago

SELECT * FROM MY_TABLE
AT(TIMESTAMP => DATEADD(minute, -15, current_timestamp()));

SELECT * FROM MY_TABLE
BEFORE(STATEMENT => '<query_id>');

UNDROP TABLE MY_TABLE;
UNDROP SCHEMA MY_SCHEMA;
UNDROP DATABASE MY_DB;

-- Recover: create table from historical state
CREATE TABLE MY_TABLE_RESTORED AS
  SELECT * FROM MY_TABLE
  BEFORE(STATEMENT => '<truncate_query_id>');
⚠️
Time Travel storage costs money. Use the lowest retention that meets your recovery requirements.
AT(OFFSET)AT(TIMESTAMP)BEFORE(STATEMENT)UNDROPDATA_RETENTION_TIME_IN_DAYS
6.2Cloning — Zero-copy Instant Copies

Cloning is metadata-only — no data files are copied. The clone shares micro-partitions until either side writes. Recursive for databases: cloning a database clones all schemas and tables inside.

Cloning
CREATE TABLE MY_TABLE_CLONE CLONE MY_TABLE;
CREATE DATABASE MY_DB_CLONE CLONE MY_DB;

-- Clone to a point in the past (combine with Time Travel)
CREATE OR REPLACE TABLE MY_TABLE_SNAPSHOT
  CLONE MY_TABLE
  AT(OFFSET => -60*2);
ℹ️
Clones do NOT inherit grants from the source. Re-apply privileges after cloning if needed.
CLONEZero-copyCopy-on-writeRecursive CloneTime Travel Clone

Snowflake Privileges Reference

Every privilege Snowflake supports — what it does, which object type it applies to, and how to use GRANT. Search by privilege name, object type, or keyword.

GRANT — The Command Behind Every Privilege

GRANT is a SQL command, not a privilege itself — but it's the mechanism behind every privilege in Snowflake. The pattern is always the same: what privilege, on what object, to whom.

GRANT syntax — the universal pattern
-- The universal GRANT pattern:
GRANT <privilege> ON <object_type> <object_name> TO ROLE <role_name>;

-- Real examples:
GRANT SELECT ON TABLE CUSTOMERS TO ROLE ANALYST;
GRANT USAGE ON DATABASE SALES_DB TO ROLE ANALYST;
GRANT USAGE ON SCHEMA SALES_DB.SALES TO ROLE ANALYST;
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE ANALYST;
GRANT INSERT, UPDATE ON TABLE ORDERS TO ROLE ANALYST;

-- GRANT ROLE (different — no ON clause)
GRANT ROLE ANALYST TO USER Rajesh;
GRANT ROLE ANALYST TO ROLE SYSADMIN;

-- See what a role has
SHOW GRANTS TO ROLE ANALYST;
SHOW GRANTS OF ROLE ANALYST;
GRANTREVOKEGRANT ROLESHOW GRANTS
Navigation — "can you see / enter this object?"
USAGE
DATABASESCHEMAWAREHOUSE
Read-only navigation access. Lets a role see that the object exists and enter it — but grants zero ability to read data inside. On a warehouse it also means the role can run queries. You always need USAGE before any other privilege on an object.
REFERENCE
SCHEMA
Like USAGE but read-only navigation without allowing object creation. Mostly relevant in share contexts.
Data Reading
SELECT
TABLEVIEW
Allows running SELECT queries on a table or view. Requires USAGE on the parent database and schema first. Without SELECT, the role can enter the schema but gets "Insufficient privileges" on any query.
REFERENCES
TABLE
Allows a role to reference this table in foreign key constraints and view its column metadata — without being able to read actual row data.
Data Writing
INSERT
TABLE
Allows adding new rows to a table via INSERT statements.
UPDATE
TABLE
Allows modifying existing row values. Does not allow adding or deleting rows.
DELETE
TABLE
Allows removing specific rows from a table. Does not allow dropping the table itself.
TRUNCATE
TABLE
Allows deleting ALL rows at once with TRUNCATE TABLE. Faster than DELETE but cannot be filtered — it wipes everything.
Schema-level DDL — Creating Objects
CREATE TABLE
SCHEMA
Allows creating new tables inside a schema. Granted ON SCHEMA (not ON TABLE) because the table doesn't exist yet when you grant it.
CREATE VIEW
SCHEMA
Allows creating views inside a schema.
CREATE MASKING POLICY
SCHEMA
Allows defining new masking policies. Does not allow attaching them to columns — that requires APPLY MASKING POLICY separately.
CREATE ROW ACCESS POLICY
SCHEMA
Allows defining new row access policies. Defining and applying are always two separate privileges in Snowflake.
CREATE STAGE
SCHEMA
Allows creating internal or external stages for loading/unloading data files.
Policy Application — Attaching Policies to Objects
APPLY MASKING POLICY
ACCOUNT
Allows attaching (and detaching) a masking policy to a column anywhere in the account. Without this, a role can CREATE the policy but cannot activate it on any column.
APPLY ROW ACCESS POLICY
ACCOUNT
Allows attaching (and detaching) a row access policy to a table anywhere in the account. Same pattern as masking — CREATE and APPLY are always separate privileges.
APPLY TAG
ACCOUNT
Allows assigning classification tags to objects (columns, tables, warehouses). Used in data classification and governance workflows.
Ownership & Admin
OWNERSHIP
TABLESCHEMADATABASE
The highest privilege on an object. The owner can do anything to it — drop it, grant privileges on it to others, and transfer ownership. Every object has exactly one owner role.
MANAGE GRANTS
ACCOUNT
Allows granting any privilege to any role — even privileges the grantor doesn't personally hold. Extremely powerful; typically reserved for SECURITYADMIN.
MONITOR
WAREHOUSEDATABASE
Allows viewing query history, credit usage, and execution metrics — without being able to run queries or read data.
OPERATE
WAREHOUSE
Allows starting, stopping, suspending, and resizing a warehouse. Does not allow modifying warehouse config (that needs MODIFY).
MODIFY
WAREHOUSE
Allows changing warehouse properties like size, auto-suspend settings, and max cluster count.
Column-level Security
SELECT (column)
COLUMN
In column-level security, you can GRANT SELECT on specific columns only. The role can query those columns but gets NULL for any ungranted column. More granular than table-level SELECT.
SECTION 01
Architecture, Features & Object Model
Multi-cluster shared data · Layers · Editions · Object model · Billing · AI Data Cloud
Multi-Cluster Shared Data Architecture
☁ Cloud Services Layer (brain)
Query ParserOptimizerMetadata MgrAuth & Access ControlTransaction MgrSecurity
↕ decoupled
⚡ Query Processing Layer — Virtual Warehouses
WH-1 (XS)WH-2 (M)WH-3 (XL)Independent clustersMPP execution
↕ decoupled
🗄 Storage Layer — Centralized, Columnar, Compressed
Micro-partitionsColumnar formatAES-256 encryptedCloud object store (S3/Azure/GCS)
Storage, compute, and services are fully decoupled. Multiple warehouses read the same data with zero contention. You pay for each independently.
Storage — Micro-Partitions
Default size50–500 MB (compressed)
FormatColumnar, compressed, immutable
Metadata storedMIN/MAX per col, distinct count, NULL count
PruningAutomatic via metadata — no indexes needed
Overlap %Lower = better clustering health
DML effectCreates new partitions (old retired via Time Travel)
Cloud Services Layer
Billing threshold10% of daily compute credits FREE
Charged forDDL, SHOW, DESCRIBE, metadata queries
Result cache TTL24 hours, shared across all users/warehouses
Result cache invalidatedOn any DML to underlying table
Transaction mgmtACID-compliant, MVCC-based
Snowflake Editions
EditionKey FeaturesMulti-clusterTime TravelHIPAA
StandardCore SQL, basic security0–1 day
Enterprise EMulti-cluster WH, materialized views, column masking0–90 days
Business Critical BCAll Enterprise + HIPAA, PCI, private link0–90 days
VPS VPSDedicated environment, highest isolation0–90 days
Object Hierarchy
ORGANIZATIONMultiple accounts, ORGADMIN
ACCOUNTSingle cloud region
DATABASELogical namespace
SCHEMAContainer for objects
TABLE / VIEW / STAGE / PIPE / STREAM / TASK / SP / UDF / FILE FORMAT / TAG / POLICY
Always fully qualify: db.schema.object. USAGE on db AND schema required before touching any object inside.
Billing Model
Compute (WH)Credits/second, 60s minimum per start
Cloud ServicesFirst 10% of daily compute is FREE
Storage$/TB/month (compressed). Includes TT + Fail-Safe.
Data TransferSame-region same-cloud = free. Cross = charged.
SnowpipeServerless credits for file processing
Materialized ViewsCredits for background refresh (serverless)
SECTION 02
Account Access & Security
RBAC · Authentication · Network Policies · Encryption · Column/Row Security · Secure Views
System-Defined Roles — Hierarchy
ORGADMINOrg-level: create/manage accounts
ACCOUNTADMINTop account role — encapsulates SYSADMIN + SECURITYADMIN
SECURITYADMINMANAGE GRANTS globally. Create/manage users & roles.
USERADMINCREATE USER + CREATE ROLE only.
SYSADMINCREATE WAREHOUSE, DATABASE, SCHEMA, TABLE. Recommended owner of all data objects.
PUBLICAuto-granted to every user/role. No default data access.
Exam critical: ACCOUNTADMIN is NOT a superuser. Masking policies apply to ALL roles including ACCOUNTADMIN unless explicitly named in the WHEN clause.
Column Masking
CREATE OR REPLACE MASKING POLICY email_mask AS (val STRING) RETURNS STRING -> CASE WHEN CURRENT_ROLE() IN ('ANALYST') THEN val ELSE SHA2(val) END; ALTER TABLE users MODIFY COLUMN email SET MASKING POLICY email_mask;
  • Runtime evaluation — evaluated at query time via CURRENT_ROLE()
  • One policy per column
  • ACCOUNTADMIN is NOT exempt unless named in WHEN
Row Access Policy
CREATE OR REPLACE ROW ACCESS POLICY rap AS (region STRING) RETURNS BOOLEAN -> CURRENT_ROLE() = 'ACCOUNTADMIN' OR EXISTS ( SELECT 1 FROM user_region_map m WHERE m.region = region AND m.username = CURRENT_USER() ); ALTER TABLE sales ADD ROW ACCESS POLICY rap ON (region);
  • One RAP per table
  • Returns TRUE = row visible, FALSE = filtered
  • User doesn't know rows are hidden
Authentication
PasswordDefault, min 8 chars
MFADuo-based, TOTP
Key PairRSA 2048/4096, service accounts
OAuthExternal / Snowflake OAuth
SSO / SAMLOkta, Azure AD, ADFS
Encryption
At restAES-256 (always on)
In transitTLS 1.2+
Tri-Secret SecureBC Customer + Snowflake + Cloud key
BYOKCustomer managed via KMS
Key rotationAutomatic (30 days) or manual
Network Policies
LevelAccount-level or User-level
User-levelOverrides account-level
Allowed listIP CIDRs that CAN connect
Blocked listIPs explicitly denied
ApplyALTER ACCOUNT SET NETWORK_POLICY=p;
SECTION 03
Performance: Virtual Warehouses
Sizes · Caching · Multi-cluster · Scaling policies · Resource monitors
Warehouse Sizes & Credits
X-Small1 credit/hr
Small2 credits/hr
Medium4 credits/hr
Large8 credits/hr
X-Large16 credits/hr
2X-Large32 credits/hr
3X-Large64 credits/hr
4X-Large128 credits/hr
BillingPer-second, 60s minimum
Three Cache Layers
1. Metadata cacheCloud Services. FREE. COUNT(*), SHOW.
2. Result cache24h TTL. Shared. No warehouse needed.
Result invalidatedAny DML on underlying table
3. Local disk cachePer-warehouse SSD. Stores column data.
Local cache tipDon't suspend between similar queries
Multi-Cluster E+
SolvesConcurrency. NOT slow queries.
Scale UPResize → faster single queries
Scale OUTAdd clusters → more concurrent users
STANDARD policyScale after ~20s queue
ECONOMY policyScale after 6+ min sustained load
Maximized modeMIN = MAX (all always running)
SECTION 04
Performance: Query Optimization
Clustering · Search optimization · Materialized views · Query profile · Best practices
Clustering Keys
ALTER TABLE sales CLUSTER BY (sale_date); SELECT SYSTEM$CLUSTERING_INFORMATION('sales', '(sale_date)'); -- average_overlaps → 0 = perfectly clustered
  • Good candidate: large tables, frequent WHERE/JOIN on same col
  • Max 4 columns in a clustering key
  • Automatic reclustering runs in background
Search Optimization
ALTER TABLE customers ADD SEARCH OPTIMIZATION ON EQUALITY(email, phone); SELECT SYSTEM$ESTIMATE_SEARCH_OPTIMIZATION_COSTS('db.sch.t');
  • Best for: equality / IN predicates on high-cardinality cols
  • NOT good for range scans (use clustering for that)
  • Works on strings, numbers, VARIANT, geo
Query Optimization Checklist
  • Avoid SELECT * — columnar: pay per column scanned
  • Filter early — WHERE before GROUP BY
  • Avoid functions on filter cols: use ranges not YEAR(date_col)=2024
  • Use LIMIT during development to avoid full scans
  • Check Query Profile — look for disk spills, Cartesian products
  • Use QUALIFY instead of subquery for window function filtering
  • Use SAMPLE (10) for exploration
  • Result cache: avoid CURRENT_TIMESTAMP() if you want cache hits
SECTION 05
Data Loading & Unloading
Stages · COPY INTO · Snowpipe · File formats · GET/PUT · Unloading
Stage Types
User stage@~
Table stage@%table_name
Named internal@my_stage
Named externalS3 / Azure Blob / GCS
ListLIST @my_stage;
RemoveREMOVE @my_stage/f.csv;
COPY INTO Options
DeduplicationTracks loaded files for 64 days
FORCE=TRUEBypass dedup, reload files
ON_ERRORCONTINUE / SKIP_FILE / ABORT
VALIDATION_MODETest without loading
Optimal file size100MB–250MB compressed
ParallelismOne file per thread — more files = faster
Snowpipe
CREATE PIPE sales_pipe AUTO_INGEST = TRUE AS COPY INTO sales FROM @s3_stage;
  • Serverless — no warehouse, billed separately
  • AUTO_INGEST=TRUE → S3 event trigger
  • Latency typically < 1 minute
SECTION 06
Data Transformations
Streams · Tasks · Dynamic Tables · Stored Procedures · UDFs · Semi-structured
Streams — CDC
CREATE STREAM sales_stream ON TABLE sales APPEND_ONLY = FALSE; -- METADATA$ACTION: INSERT | DELETE -- METADATA$ISUPDATE: TRUE if UPDATE pair SELECT SYSTEM$STREAM_HAS_DATA('sales_stream');
  • Consumed atomically — reading in transaction clears it
  • APPEND_ONLY — only INSERTs, cheaper on large tables
  • Data expires with table's Time Travel window
Tasks
CREATE TASK my_task WAREHOUSE = etl_wh SCHEDULE = 'USING CRON 0 2 * * * UTC' WHEN SYSTEM$STREAM_HAS_DATA('s') AS INSERT INTO tgt SELECT * FROM s; ALTER TASK my_task RESUME;
  • Tasks start SUSPENDED — always RESUME after create
  • Child tasks use AFTER parent (DAG pattern)
  • Manual: EXECUTE TASK my_task;
SECTION 07
Storage, Data Protection & Sharing
Time Travel · Fail-Safe · Cloning · Data Sharing · Marketplace · External Tables
Time Travel
Standard0–1 day
Enterprise+0–90 days
SELECT * FROM t AT(OFFSET => -3600); UNDROP TABLE t; UNDROP DATABASE db;
Fail-Safe
Duration7 days (all editions)
Starts afterTime Travel window ends
AccessSnowflake Support ONLY
Temp/TransientNO Fail-Safe (saves cost)
Total protectionUp to 97 days total
Zero-Copy Cloning
CREATE TABLE dev CLONE prod; CREATE DATABASE clone CLONE prod; CREATE TABLE snap CLONE t AT(OFFSET => -3600);
  • Instant — metadata only, no data copy
  • Cost only for changes after cloning
  • Clones do NOT inherit grants
SECTION 08
Snowpark
Python / Java / Scala · DataFrame API · UDFs · ML
DataFrame Basics
from snowflake.snowpark import Session import snowflake.snowpark.functions as F session = Session.builder.configs({...}).create() df = session.table("sales") result = (df .filter(F.col("region") == "US") .group_by("product") .agg(F.sum("amount").alias("total")) .sort(F.col("total").desc()) ) result.show() # action — executes query result.to_pandas() # bring to local pandas DF
Snowpark Quick Reference
OperationSnowparkSQL
Read tablesession.table("t")FROM t
Filterdf.filter(col("x")>10)WHERE x>10
Selectdf.select("a","b")SELECT a,b
Group/Aggdf.group_by("x").agg(sum("y"))GROUP BY/SUM
Joindf1.join(df2,"id")JOIN ON id
Writedf.write.save_as_table("t")CTAS
To pandasdf.to_pandas()— client side
💡Push-down execution: Snowpark code runs INSIDE Snowflake — no data moves to the client. Lazy evaluation means nothing runs until an action (collect(), show(), count()) is called.
Snowflake SQL

The Golden Mental Model:
Think in Layers

Before writing any query, position yourself at the right layer. Most engineers jump straight to performance tuning when they should be at Layer 1. The layers are ordered by priority — correctness always wins.

LAYER 1
Correctness First
Always — no exceptions
LAYER 2
Scalability
When data is large
LAYER 3
Cost / Performance
When queries are slow or expensive
LAYER 4
Security
When exposing to users/roles
"Filter early, aggregate late, sort last, select only what you need."
THE UNIVERSAL RULE — solves 90% of query problems
LAYER 1

Correctness — Every Single Query

These rules apply always, no exceptions. The #1 source of production bugs in SQL is misunderstanding the execution order.

SQL Execution Order (Top → Bottom)
FROM
JOIN
WHERE
GROUP BY
HAVING
SELECT
DISTINCT
ORDER BY
LIMIT
⚠️Consequences of Execution Order
You cannot use a SELECT alias in WHERE
WHERE runs before SELECT — the alias doesn't exist yet. Use a CTE or subquery instead.
🔍
WHERE filters rows before aggregation; HAVING filters after
Filter individual rows in WHERE. Filter aggregated results (like SUM > 1000) in HAVING.
🔗
JOIN happens before WHERE
So filter in WHERE, not just in the ON clause, when you want to restrict the final result set.
Always Filter Early
-- ❌ BAD: Joins everything then filters SELECT * FROM orders o JOIN customers c ON o.cust_id = c.id WHERE c.country = 'CA'; -- ✅ GOOD: Pre-filter in CTE, then join only CA customers WITH ca_customers AS ( SELECT id FROM customers WHERE country = 'CA' ) SELECT * FROM orders o JOIN ca_customers c ON o.cust_id = c.id;
Be explicit with JOIN types — never assume implicit behavior
Always write INNER JOIN, LEFT JOIN, etc. Never rely on an implicit join that produces a Cartesian product.
LAYER 2

Scalability — When Tables > A Few Million Rows

Only think about these when your data volume justifies it. Let the Query Profile guide you to the right concern.

📊Scalability Concerns at a Glance
🗂️
Clustering — Are you filtering on the same column repeatedly?
Consider adding a cluster key on that column. Snowflake will auto-maintain it in the background.
✂️
Partition Pruning — Does your WHERE align with how the table is clustered?
A WHERE on a clustered column allows Snowflake to skip entire micro-partitions. Misaligned filters scan everything.
💾
Spilling — The Spilling Rule
Only ORDER BY at the very last step, and only the columns you truly need sorted. Sorting mid-query kills performance.
🔎
Search Optimization — Point lookups on non-clustered columns?
Enable Search Optimization Service (SOQ) for high-cardinality equality lookups like user ID or email.
🚫The Spilling Rule — Sort Only at the End
-- ❌ BAD: Sorting before aggregation (forces sort + materialize mid-query) SELECT dept, COUNT(*) FROM (SELECT * FROM emp ORDER BY salary) GROUP BY dept; -- ✅ GOOD: Sort only the final output, if needed at all SELECT dept, COUNT(*) FROM emp GROUP BY dept ORDER BY COUNT(*) DESC;
LAYER 3

Performance Tuning — When Queries Are Slow

Only when you have a problem to solve. Check Query Profile first — it tells you exactly which node is expensive.

🔬Performance Tuning Checklist
📈
Check Query Profile in Snowflake UI → which node takes the most time?
Watch for: disk spills, inefficient joins, large scans, Cartesian products.
💥
Watch for Cartesian products — a JOIN with no proper key = disaster
A missing JOIN condition multiplies every row in table A by every row in table B. Always verify your ON clause.
🎯
Avoid SELECT * — pull only needed columns
Columnar storage means you pay per column scanned. Select only what you actually use.
🧪
Use LIMIT during dev/testing to avoid full scans
Add SAMPLE (10) or LIMIT 1000 during exploration. Remove before production.
🧠The Rows → Groups → Result Mental Model
-- Think in this order: -- 1. Reduce ROWS first → WHERE, JOIN filters -- 2. Then GROUP them → GROUP BY -- 3. Then shape RESULT → SELECT, ORDER BY, LIMIT -- ❌ BAD: Functions on filter columns prevent pruning SELECT * FROM sales WHERE YEAR(sale_date) = 2024; -- scans ALL partitions -- ✅ GOOD: Range filter allows micro-partition pruning SELECT * FROM sales WHERE sale_date BETWEEN '2024-01-01' AND '2024-12-31';
LAYER 4

Security — When Exposing Data to Users/Roles

These are infrastructure concerns — set once at the table/policy level, not per query. Never implement security logic inside application WHERE clauses.

🔐Security Infrastructure Patterns
👁️
Use Secure Views when sharing views externally or across accounts
Regular views expose the query definition. SECURE VIEW hides business logic from consumers.
📏
Apply Row-level security via policies, not WHERE clauses in app code
App-level WHERE clauses can be bypassed. Row Access Policies are enforced at the engine level, transparently.
🎭
Apply Column masking on PII columns at the table policy level
Set masking once at the column level. It applies to every query against that column, for all roles not explicitly exempted.
CHECKLIST

5 Questions Before Running Any Query

Q1
Do I know what JOIN type I need and why?
INNER vs LEFT vs CROSS — be explicit, never guess.
Q2
Am I filtering as early as possible?
WHERE before GROUP BY. CTEs to pre-filter large tables.
Q3
Am I selecting only the columns I need?
No SELECT *. Name every column. Pay only for what you scan.
Q4
Am I ORDER BY-ing only when truly necessary?
Sorting is expensive at scale. Skip it unless output order matters.
Q5
Have I checked for fan-out risk?
1-to-many joins can silently double (or 10x) your row count. Verify cardinality before aggregating.
💡
For 90% of queries:
Knowing SQL execution order + the 5 questions above = no bugs. Snowflake-specific tuning only when Query Profile shows a problem.