Select Page

Data Engineering and MLOps specialist: Streamlining EDW & Data Pipelines for ML & AI products.

Any relational database for example ( MYSQL, POSTGRESQL)  can only support structured data in the forms of rows and columns. Snowflake, however, can support multiple types of data loads.

Snowflake Supported Formats

Supported File Encoding
Delimited files (CSV, TSV, etc.) UTF-8
JSON UTF-8
Avro UTF-8
ORC UTF-8
Parquet UTF-8
XML

Concept of stages

Internal Stages

Snowflake has the following stage types. Data can be uploaded to any of the internal stage type from local file system using PUT command.

Snowflake Internal Stages

Stage Description
User A user stage is allocated to each user for storing files. This stage type is designed to store files that are staged and managed by a single user but can be loaded into multiple tables. User stages cannot be altered or dropped.
Table A table stage is available for each table created in Snowflake. This stage type is designed to store files that are staged and managed by one or more users but only loaded into a single table. Table stages cannot be altered or dropped.
Named A named internal stage is a database object created in a schema. This stage type can store files that are staged and managed by one or more users and loaded into one or more tables. Because named stages are database objects, the ability to create, modify, use, or drop them can be controlled using security access control privileges. Create stages using the CREATE STAGE command.

External Stages

Snowflake supports direct data load from cloud platforms. The table below summarizes the supported platforms.

Snowflake Platforms for Data Load

Supported Platform Loading Stages
Local environment Files are first staged in a Snowflake stage, then loaded into a table.
Amazon S3 Files can be loaded directly from any user-supplied S3 bucket.
Google Cloud Storage Files can be loaded directly from any user-supplied Cloud Storage container.
Microsoft Azure Files can be loaded directly from any user-supplied Azure container.

Loading data from Amazon AWS

Creating an S3 Stage

An S3(external)  stage is a reference to a  location in the snowflake platform where data files are stored before they are loaded into a table.

Loading all files within the folder:

If the path ends with /, all of the objects in the S3 folders are loaded in the stage

Loading data from a file

If specific file path is given, it loads data from file into the stage. The example below loads CSV data into an S3 stage.

To create a stage, Select the  database > Click on Stages Tab -> Click on Create

This loads a create stage interface where we can specify file information to load data from

 

CREATE STAGE “ANALYTICS_DEMO_DATA_2″.”PUBLIC”.My_First_External_Stage URL = ‘s3://world-happiness-data/world-happiness-report-2021.csv’ CREDENTIALS = (AWS_KEY_ID = ‘AKIAZFJYSWSXK7Y6BAEG’ AWS_SECRET_KEY = ‘****************************************’);

Viewing Stages:

We can view available stages in our database using the following command.

Show stages;

This shows the stage we created in previous step

Copying data from  Stage

Now that the data is loaded in the Snowflake stage. We can load it on our table.

copy into “ANALYTICS_DEMO_DATA_2″.”PUBLIC”.”S3DATA” from @MY_First_External_Stage;

A select statement to confirm loading shows that data is successfully loaded from the stage into our table.

Bulk vs continuous loading

External tables (Data Lake)