Any relational database for example ( MYSQL, POSTGRESQL) can only support structured data in the forms of rows and columns. Snowflake, however, can support multiple types of data loads.
Snowflake Supported Formats
Supported File | Encoding |
---|---|
Delimited files (CSV, TSV, etc.) | UTF-8 |
JSON | UTF-8 |
Avro | UTF-8 |
ORC | UTF-8 |
Parquet | UTF-8 |
XML |
Concept of stages
Internal Stages
Snowflake has the following stage types. Data can be uploaded to any of the internal stage type from local file system using PUT command.
Snowflake Internal Stages
Stage | Description |
---|---|
User | A user stage is allocated to each user for storing files. This stage type is designed to store files that are staged and managed by a single user but can be loaded into multiple tables. User stages cannot be altered or dropped. |
Table | A table stage is available for each table created in Snowflake. This stage type is designed to store files that are staged and managed by one or more users but only loaded into a single table. Table stages cannot be altered or dropped. |
Named | A named internal stage is a database object created in a schema. This stage type can store files that are staged and managed by one or more users and loaded into one or more tables. Because named stages are database objects, the ability to create, modify, use, or drop them can be controlled using security access control privileges. Create stages using the CREATE STAGE command. |
External Stages
Snowflake supports direct data load from cloud platforms. The table below summarizes the supported platforms.
Snowflake Platforms for Data Load
Supported Platform | Loading Stages |
---|---|
Local environment | Files are first staged in a Snowflake stage, then loaded into a table. |
Amazon S3 | Files can be loaded directly from any user-supplied S3 bucket. |
Google Cloud Storage | Files can be loaded directly from any user-supplied Cloud Storage container. |
Microsoft Azure | Files can be loaded directly from any user-supplied Azure container. |
Loading data from Amazon AWS
Creating an S3 Stage
An S3(external) stage is a reference to a location in the snowflake platform where data files are stored before they are loaded into a table.
Loading all files within the folder:
If the path ends with /, all of the objects in the S3 folders are loaded in the stage
Loading data from a file
If specific file path is given, it loads data from file into the stage. The example below loads CSV data into an S3 stage.
To create a stage, Select the database > Click on Stages Tab -> Click on Create
This loads a create stage interface where we can specify file information to load data from
CREATE STAGE “ANALYTICS_DEMO_DATA_2″.”PUBLIC”.My_First_External_Stage URL = ‘s3://world-happiness-data/world-happiness-report-2021.csv’ CREDENTIALS = (AWS_KEY_ID = ‘AKIAZFJYSWSXK7Y6BAEG’ AWS_SECRET_KEY = ‘****************************************’);
Viewing Stages:
We can view available stages in our database using the following command.
Show stages;
This shows the stage we created in previous step
Copying data from Stage
Now that the data is loaded in the Snowflake stage. We can load it on our table.
copy into “ANALYTICS_DEMO_DATA_2″.”PUBLIC”.”S3DATA” from @MY_First_External_Stage;
A select statement to confirm loading shows that data is successfully loaded from the stage into our table.