Getting to know Snowflake is a full-on experience. The company is expanding prolifically on a number of levels and this year hosted its first ‘in real life’ user, customer and partner conference since 2019, in the shape of Snowflake Summit 2022.
Snowflake CEO Frank Slootman kicked off the event keynote with a customary welcome and, quite naturally, a paused note of reflection to consider the global events that have played out over the last couple of years.
Having started with just under 1,000 employees in 2019, Snowflake today is almost at the 4,000 mark. Slootman also pointed to the company’s revenue, which has grown at a similar rate.
“We grew from what was basically a popsicle stand to what is now a revenue position of $1.2 billion [£993 million] in January 2022. This kind of foundation gives us the power to weather whatever [technology and economic] storms we may face on the road ahead,” said Slootman.
Talking about how his firm has driven its platform development, the Snowflake CEO explained how the company strategy has been focused on addressing data workloads that have historically struggled to work effectively inside businesses across all verticals.
What the company is aspiring to be now revolves around a ‘mission alignment’ process that sees the firm double down as an end-to-end data infrastructure provider.
Fundamentals and Forward Actions
Dutch-born Slootman explained that the firm’s customer conversations have changed markedly in the last year or so.
It’s now not just about bringing legacy workloads to the cloud and handling database migrations (work he is sure the company will still be doing twenty years from now); there is now also a wider move to address issues related to digital supply chains and the need to be data-driven throughout the entire wingspan of any business.
Any organisation operating with a lot of data, logically, needs to operate a lot of workloads. Snowflake says its approach to data workloads is all about ‘enablement’ and therefore allowing enterprises to do things with their data in multi-cloud and cross-cloud environments.
Snowflake used to be thought of as a database with all of the programmability sitting on the outside of the platform. The company (through the acquisition of Streamlit and other in-house development) has worked to bring more and more programmability inside the platform and therefore give its customers more direct control.
Gettin’ Slippery with Streamlit
Streamlit is a software framework built to simplify and accelerate the creation of data applications. This gives users the power to create web apps with the Python programming language faster and more easily. Data scientists are able to go from data and models to deployed apps (in a matter of hours) with only a handful of lines of code.
As explained by Streamlit founder Adrien Treuille (now head of Streamlit inside Snowflake), Streamlit enables data scientists and engineers to build data applications and visualisations using Python.
These apps can be iterated on, built upon and used as a communications channel in and of themselves. With over ten million downloads and more than 50,000 active developers, Streamlit has now been updated to offer a new native integration with Snowflake that taps into the core Snowflake engine.
Apps created can be shared throughout an organisation to put data and machine learning (ML) at the heart of a business team’s decision-making processes.
Although data scientists know how to use data model training and inference technologies, those functions are only ever really valuable if they can get into the hands of business users, which is the reason Treuille says Streamlit exists.
Snowflake’s six-tier vision for its own stack breaks down as: monetisation, marketplace, application development, workload execution, live data and infrastructure.
As a second-stage element of the Snowflake Summit keynote, Benoit Dageville, co-founder and president of product, took the audience through what he sees as the seven pillars of innovation.
#1 All data
Pillar #1 is all about putting access to all data at the fingertips of all users who need it. As we now build a world of ‘data applications’ this kind of process will be crucial for aspects like ML, which is only ever as smart as the data we feed to it.
By all data, Dageville means application data, tables, documents, images, videos and any type of unstructured data as well. Additionally, all data also means data from any origin and at any size – i.e. up to petabyte scale – at lightning speed.
#2 All workloads
When Dageville talks about all workloads, this is a reference to ML models, running data pipelines, client-facing applications and more… all in an environment where there is no limit to the number of workloads that can be supported.
Compute resources can be spun up and down in Snowflake so there is true cloud methodology efficiency here with workloads capable of being run concurrently.
#3 Global architecture
Snowflake’s global architecture approach works around the world with the three major cloud hyperscalers (AWS, Google and Azure). The Snowgrid communication framework enables teams and customers to connect as a global platform with various self-service features for all users to access.
#4 Self-managed
Snowflake works as one single product where all capabilities are engineered to work together with one set of APIs. This makes the platform capable of offering self-management capabilities.
#5 Fully programmable
The Snowflake platform is designed so that data applications run inside the Snowflake Data Cloud. It spans programming languages including Python, SQL, Java and Scala, all of which can be used with Snowflake along with a data developer’s preferred code libraries.
#6 Marketplace
Anyone can be a data application provider, promises Dageville. The Snowflake Marketplace is designed to enable users, partners and customers to share data directly without moving (copying, duplicating, etc.) it. Some 260 providers and 1,300+ listings are currently on the Snowflake Marketplace.
Unmesh Jagtap, principal product manager, explained how companies can use the Snowflake Marketplace native applications function to gain a serverless deployment model (for deployment flexibility), access to Snowflake’s customer base and in-app monetisation functions.
These native applications are built on Snowflake’s engine with the ability to combine an ML model and a Streamlit object into one place to build a resulting data-rich application template.
An enterprise user picks their chosen pricing model for an application – it could be a company advertising spend optimiser for example – that other users can purchase and be able to plug their own datasets into.
Where one company recognizes that its data use case is mirrored by the application example on offer, it can shortcut to get faster control of its data for any given use.
The provider gets to hold onto their IP and the customer does not get hold of the application’s source code. The provider can also choose whether to expose just the application itself or to accompany it with an anonymised, obfuscated offering of datasets – the power of choice is in the provider’s hands.
#7 Governance and compliance
Snowflake says it provides enforced security and governance throughout its Data Cloud offering. All data is encrypted at rest and in motion.
Also speaking at this keynote was Christian Kleinerman, senior vice president of product at Snowflake. Kleinerman said that his company’s job was all about enabling users to forget about upgrade cycles, maintenance cycles and other chores related to data infrastructure management.
He explained that the company has built the Snowflake platform to now offer 30% improved storage compression, 50% reduced data ingest latency and a 55% reduction in replication data.
Data Array: Variants, Tests and Geospatial
For data-intensive workloads, Snowflake is working to make sure it can work with more types of data more quickly.
Recent updates have seen the firm more competently straddle variant data (in the sense of outliers and anomalies that will typically develop in any given data distribution), test data and geospatial data, the latter of which will clearly have a wide variety of ecologically related use cases.
To cover the entire breadth of the Snowflake platform here would probably be overly circuitous and too granular for this type of summary analysis. Suffice it to say that these types of enterprise technology events generally leave a somewhat undefinable cerebral feel that conveys what kind of ‘personality’ any given vendor has.
If there is a persona conveyed by Snowflake, it is a friendly and approachable one – this is not Meta. But further, it’s hard not to get the feel of a company in its ascendancy – Snowflake is the cool kid in the gang and (almost) everyone wants to be friends (partners) with it right now.
Best-of-Breed by Design
Even though Snowflake is building an end-to-end stack, it’s still an end-to-end stack focused on providing infrastructure tools. The company is deliberately not building out (for example) a business intelligence play, in a move to stay best-of-breed by design and instead work with partners to provide those additional layers. Well, for now at least.