What is the purpose of an External Function?
To call code that executes outside of Snowflake
To run a function in another Snowflake database
To share data in Snowflake with external parties
To ingest data from on-premises data sources
The purpose of an External Function in Snowflake is to call code that executes outside of the Snowflake environment. This allows Snowflake to interact with external services and leverage functionality that is not natively available within Snowflake, such as calling APIs or running custom code hosted on cloud services.
https://docs.snowflake.com/en/sql-reference/external-functions.html
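For illustration, the following sketch shows how an external function is typically wired up in Snowflake SQL; the integration name, AWS role ARN, endpoint URLs, and table name are all hypothetical placeholders.
    -- API integration describing the proxy service (hypothetical AWS API Gateway endpoint)
    CREATE OR REPLACE API INTEGRATION demo_api_integration
      API_PROVIDER = aws_api_gateway
      API_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake_ext_func_role'
      API_ALLOWED_PREFIXES = ('https://abc123.execute-api.us-east-1.amazonaws.com/prod/')
      ENABLED = TRUE;
    -- external function whose logic runs outside Snowflake, behind the endpoint above
    CREATE OR REPLACE EXTERNAL FUNCTION remote_sentiment(text_input VARCHAR)
      RETURNS VARIANT
      API_INTEGRATION = demo_api_integration
      AS 'https://abc123.execute-api.us-east-1.amazonaws.com/prod/sentiment';
    -- called like any other SQL function
    SELECT remote_sentiment(review_text) FROM product_reviews;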
True or False: Reader Accounts are able to extract data from shared data objects for use outside of Snowflake.
True
False
Reader accounts in Snowflake are designed to allow users to read data shared with them but do not have the capability to extract data for use outside of Snowflake. They are intended for consuming shared data within the Snowflake environment only.
What is a key feature of Snowflake architecture?
Zero-copy cloning creates a mirror copy of a database that updates with the original
Software updates are automatically applied on a quarterly basis
Snowflake eliminates resource contention with its virtual warehouse implementation
Multi-cluster warehouses allow users to run a query that spans across multiple clusters
Snowflake automatically sorts DATE columns during ingest for fast retrieval by date
One of the key features of Snowflake’s architecture is its unique approach to eliminating resource contention through the use of virtual warehouses. This is achieved by separating storage and compute resources, allowing multiple virtual warehouses to operate independently on the same data without affecting each other. This means that different workloads, such as loading data, running queries, or performing complex analytics, can be processed simultaneously without any performance degradation due to resource contention.
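As a small illustration of this separation (warehouse and table names are hypothetical), independent warehouses can serve different workloads against the same data:
    CREATE WAREHOUSE IF NOT EXISTS load_wh WITH WAREHOUSE_SIZE = 'SMALL'  AUTO_SUSPEND = 60 AUTO_RESUME = TRUE;
    CREATE WAREHOUSE IF NOT EXISTS bi_wh   WITH WAREHOUSE_SIZE = 'MEDIUM' AUTO_SUSPEND = 60 AUTO_RESUME = TRUE;
    -- an ETL session loads data on LOAD_WH while dashboards query the same tables on BI_WH,
    -- so neither workload competes with the other for compute resources
    USE WAREHOUSE bi_wh;
    SELECT COUNT(*) FROM sales.public.orders;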
What is the MOST performant file format for loading data in Snowflake?
CSV (Unzipped)
Parquet
CSV (Gzipped)
ORC
Parquet is a columnar storage file format that is optimized for performance in Snowflake. It is designed to be efficient for both storage and query performance, particularly for complex queries on large datasets. Parquet files support efficient compression and encoding schemes, which can lead to significant savings in storage and speed in query processing, making it the most performant file format for loading data into Snowflake.
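A minimal loading sketch, assuming a stage named @my_ext_stage and a target table SALES_STAGING already exist (both hypothetical):
    CREATE OR REPLACE FILE FORMAT my_parquet_format TYPE = 'PARQUET';
    COPY INTO sales_staging
      FROM @my_ext_stage/sales/
      FILE_FORMAT = (FORMAT_NAME = 'my_parquet_format')
      MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;  -- maps Parquet columns to table columns by name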
What file formats does Snowflake support for loading semi-structured data? (Choose three.)
TSV
JSON
Avro
Parquet
JPEG
Snowflake supports several semi-structured data formats for loading data. The supported formats include JSON, Avro, and Parquet. These formats allow for efficient storage and querying of data that does not conform to a traditional relational database schema.
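For example, JSON files are commonly loaded into a VARIANT column via a named file format; the stage and table names below are hypothetical:
    CREATE OR REPLACE FILE FORMAT my_json_format TYPE = 'JSON' STRIP_OUTER_ARRAY = TRUE;
    CREATE OR REPLACE TABLE raw_json (v VARIANT);
    COPY INTO raw_json
      FROM @my_stage/events/
      FILE_FORMAT = (FORMAT_NAME = 'my_json_format');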
Which of the following objects can be shared through secure data sharing?
Masking policy
Stored procedure
Task
External table
Secure data sharing in Snowflake allows users to share various objects between Snowflake accounts without physically copying the data, thus not consuming additional storage. Among the options provided, external tables can be shared through secure data sharing. External tables are used to query data directly from files in a stage without loading the data into Snowflake tables, making them suitable for sharing across different Snowflake accounts.
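A hedged sketch of sharing an external table (share, database, schema, table, and consumer account names are hypothetical):
    CREATE SHARE partner_share;
    GRANT USAGE ON DATABASE analytics_db TO SHARE partner_share;
    GRANT USAGE ON SCHEMA analytics_db.public TO SHARE partner_share;
    GRANT SELECT ON EXTERNAL TABLE analytics_db.public.ext_sales TO SHARE partner_share;
    ALTER SHARE partner_share ADD ACCOUNTS = consumer_org.consumer_account;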
What is the recommended file sizing for data loading using Snowpipe?
A compressed file size greater than 100 MB, and up to 250 MB
A compressed file size greater than 100 GB, and up to 250 GB
A compressed file size greater than 10 MB, and up to 100 MB
A compressed file size greater than 1 GB, and up to 2 GB
For data loading using Snowpipe, the recommended file size is a compressed file greater than 10 MB and up to 100 MB. This size range is optimal for Snowpipe’s continuous, micro-batch loading process, allowing for efficient and timely data ingestion without overwhelming the system with files that are too large or too small.
A user has an application that writes a new file to a cloud storage location every 5 minutes.
What would be the MOST efficient way to get the files into Snowflake?
Create a task that runs a COPY INTO operation from an external stage every 5 minutes
Create a task that puts the files in an internal stage and automate the data loading wizard
Create a task that runs a GET operation to intermittently check for new files
Set up cloud provider notifications on the file location and use Snowpipe with auto-ingest
The most efficient way to get files into Snowflake, especially when new files are being written to a cloud storage location at frequent intervals, is to use Snowpipe with auto-ingest. Snowpipe is Snowflake’s continuous data ingestion service that loads data as soon as it becomes available in a cloud storage location. By setting up cloud provider notifications, Snowpipe can be triggered automatically whenever new files are written to the storage location, ensuring that the data is loaded into Snowflake with minimal latency and without the need for manual intervention or scheduling frequent tasks.
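A minimal Snowpipe definition with auto-ingest, assuming an external stage @sales_ext_stage and a target table SALES_RAW already exist (names are placeholders):
    CREATE OR REPLACE PIPE sales_pipe
      AUTO_INGEST = TRUE
      AS
      COPY INTO sales_raw
      FROM @sales_ext_stage
      FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);
    -- SHOW PIPES exposes the notification channel used to configure the cloud provider's event notifications
    SHOW PIPES LIKE 'sales_pipe';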
Where would a Snowflake user find information about query activity from 90 days ago?
account_usage.query_history view
account_usage.query_history_archive view
information_schema.query_history view
information_schema.query_history_by_session view
To find information about query activity from 90 days ago, a Snowflake user should use the account_usage.query_history_archive view. The ACCOUNT_USAGE views provide access to historical query data well beyond the short retention window of the INFORMATION_SCHEMA query history, retaining activity for up to 365 days after the date of execution, which covers the 90-day period mentioned.
Which feature is only available in the Enterprise or higher editions of Snowflake?
Column-level security
SOC 2 type II certification
Multi-factor Authentication (MFA)
Object-level access control
Column-level security allows fine-grained control over access to specific columns within a table, typically implemented through Dynamic Data Masking or External Tokenization policies. This is particularly useful for managing sensitive data and ensuring that only authorized users can view or manipulate certain pieces of information. Column-level security requires the Enterprise Edition or higher, whereas SOC 2 Type II certification, Multi-factor Authentication, and object-level access control are available in all Snowflake editions.
https://docs.snowflake.com/en/user-guide/intro-editions.html
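As an illustration of column-level security (policy, role, table, and column names are hypothetical), a masking policy is created and attached to a column:
    CREATE OR REPLACE MASKING POLICY email_mask AS (val STRING) RETURNS STRING ->
      CASE WHEN CURRENT_ROLE() IN ('PII_ADMIN') THEN val ELSE '*** MASKED ***' END;
    ALTER TABLE customers MODIFY COLUMN email SET MASKING POLICY email_mask;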
Which Snowflake objects track DML changes made to tables, like inserts, updates, and deletes?
Pipes
Streams
Tasks
Procedures
In Snowflake, Streams are the objects that track Data Manipulation Language (DML) changes made to tables, such as inserts, updates, and deletes. Streams record these changes along with metadata about each change, enabling actions to be taken using the changed data. This process is known as change data capture (CDC).
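A brief sketch of change tracking with a stream (table names are hypothetical; the target table is assumed to match the stream's output columns):
    CREATE OR REPLACE STREAM orders_stream ON TABLE orders;
    -- each change row carries METADATA$ACTION, METADATA$ISUPDATE, and METADATA$ROW_ID
    SELECT * FROM orders_stream;
    -- consuming the stream in a DML statement advances its offset
    INSERT INTO orders_changes SELECT * FROM orders_stream;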
What happens when a cloned table is replicated to a secondary database? (Select TWO)
A read-only copy of the cloned tables is stored.
The replication will not be successful.
The physical data is replicated
Additional costs for storage are charged to a secondary account
Metadata pointers to cloned tables are replicated
When a cloned table is replicated to a secondary database in Snowflake, the physical data is replicated: the clone is materialized as a full, independent copy in the secondary database rather than as metadata pointers to the original table's micro-partitions. As a result, that data consumes storage in the secondary account and additional storage costs are charged there. It is also worth noting that the secondary database is read-only and cannot be used for write operations.
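A rough sketch of database replication (organization, account, and database names are hypothetical):
    -- on the primary account
    ALTER DATABASE sales_db ENABLE REPLICATION TO ACCOUNTS myorg.secondary_acct;
    -- on the secondary account
    CREATE DATABASE sales_db AS REPLICA OF myorg.primary_acct.sales_db;
    ALTER DATABASE sales_db REFRESH;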
Which of the following Snowflake capabilities are available in all Snowflake editions? (Select TWO)
Customer-managed encryption keys through Tri-Secret Secure
Automatic encryption of all data
Up to 90 days of data recovery through Time Travel
Object-level access control
Column-level security to apply data masking policies to tables and views
In all Snowflake editions, two key capabilities are universally available: automatic encryption of all data (at rest and in transit) and object-level access control through Snowflake's role-based access control model.
These features are part of Snowflake’s commitment to security and governance, and they are included in every edition of the Snowflake Data Cloud.
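For example, object-level access control is exercised through ordinary grants in any edition (database, schema, table, and role names are hypothetical):
    GRANT USAGE ON DATABASE sales_db TO ROLE analyst;
    GRANT USAGE ON SCHEMA sales_db.public TO ROLE analyst;
    GRANT SELECT ON TABLE sales_db.public.orders TO ROLE analyst;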
What are value types that a VARIANT column can store? (Select TWO)
STRUCT
OBJECT
BINARY
ARRAY
CLOB
A VARIANT column in Snowflake can store semi-structured value types, including OBJECT and ARRAY, alongside scalar values such as strings and numbers.
The VARIANT data type is specifically designed to handle semi-structured data like JSON, Avro, ORC, Parquet, or XML, allowing for the storage of nested and complex data structures.
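A small sketch showing OBJECT and ARRAY values inside a VARIANT column (table and field names are hypothetical):
    CREATE OR REPLACE TABLE raw_events (payload VARIANT);
    INSERT INTO raw_events
      SELECT PARSE_JSON('{"user": "alice", "tags": ["new", "mobile"]}');
    SELECT TYPEOF(payload)          AS outer_type,   -- OBJECT
           TYPEOF(payload:tags)     AS nested_type,  -- ARRAY
           payload:tags[0]::STRING  AS first_tag
    FROM raw_events;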
True or False: When you create a custom role, it is a best practice to immediately grant that role to ACCOUNTADMIN.
True
False
The ACCOUNTADMIN role is the most powerful role in Snowflake and should be limited to a select number of users within an organization. It is responsible for account-level configuration and should not be used for day-to-day object creation or management. Granting a custom role to ACCOUNTADMIN could inadvertently give broad access to users with this role, which is not a recommended security practice. Instead, custom roles should be granted to SYSADMIN (directly or through a role hierarchy) so that system administrators retain control over the objects those roles create.
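A minimal example of the recommended pattern (the role name is hypothetical):
    CREATE ROLE reporting_role;
    -- attach custom roles to the hierarchy under SYSADMIN rather than ACCOUNTADMIN
    GRANT ROLE reporting_role TO ROLE SYSADMIN;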
In which scenarios would a user have to pay Cloud Services costs? (Select TWO).
Compute Credits = 50, Cloud Services Credits = 10
Compute Credits = 80, Cloud Services Credits = 5
Compute Credits = 10, Cloud Services Credits = 9
Compute Credits = 120, Cloud Services Credits = 10
Compute Credits = 200, Cloud Services Credits = 26
In Snowflake, Cloud Services costs are incurred only when Cloud Services usage exceeds 10% of the compute usage (measured in credits). Therefore, the scenarios with 50 compute credits / 10 Cloud Services credits and 200 compute credits / 26 Cloud Services credits would result in Cloud Services charges, because the Cloud Services usage is more than 10% of the compute credits used.
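As a rough illustration of the 10% adjustment applied to the scenarios named above (treating each as a daily total):
    10% of 50 compute credits = 5 Cloud Services credits covered; 10 were used, so the excess 5 credits are billed.
    10% of 200 compute credits = 20 Cloud Services credits covered; 26 were used, so the excess 6 credits are billed.
    10% of 80 compute credits = 8 Cloud Services credits covered; only 5 were used, so nothing is billed.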
True or False: Loading data into Snowflake requires that source data files be no larger than 16MB.
True
False
Snowflake does not require source data files to be no larger than 16MB. In fact, Snowflake recommends that for optimal load performance, data files should be roughly 100-250 MB in size when compressed. However, it is not recommended to load very large files (e.g., 100 GB or larger) due to potential delays and wasted credits if errors occur. Smaller files should be aggregated to minimize processing overhead, and larger files should be split to distribute the load among compute resources in an active warehouse.
References: Preparing your data files | Snowflake Documentation
A Snowflake user executed a query and received the results. Another user executed the same query 4 hours later. The data had not changed.
What will occur?
No virtual warehouse will be used, data will be read from the result cache.
No virtual warehouse will be used, data will be read from the local disk cache.
The default virtual warehouse will be used to read all data.
The virtual warehouse that is defined at the session level will be used to read all data.
Snowflake maintains a result cache that stores the results of every query for 24 hours. If the same query is executed again within this time frame and the data has not changed, Snowflake will retrieve the data from the result cache instead of using a virtual warehouse to recompute the results.
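A short sketch of this behavior (table and column names are hypothetical); result reuse is governed by the USE_CACHED_RESULT session parameter, which defaults to TRUE:
    ALTER SESSION SET USE_CACHED_RESULT = TRUE;
    SELECT region, SUM(amount) FROM sales GROUP BY region;  -- first run executes on a warehouse
    SELECT region, SUM(amount) FROM sales GROUP BY region;  -- identical re-run within 24 hours is served from the result cache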
Which of the following can be executed/called with Snowpipe?
A User Defined Function (UDF)
A stored procedure
A single COPY INTO statement
A single INSERT INTO statement
Snowpipe is used for continuous, automated data loading into Snowflake. It executes a single COPY INTO statement, defined in the pipe object, to load data from staged files as soon as they become available; it cannot call UDFs, stored procedures, or INSERT statements.