


Snowflake's mission revolves around making data accessible, usable, and valuable to everyone. One of the pillars that has made Snowflake an industry leader is its unwavering commitment to being easy to use and turnkey. A testament to this is a quote from Snowflake's CEO, Frank Slootman, from the Q2 '23 earnings call:

Where you see huge differences is in the total cost of ownership, and that does not include the cost of compute and storage. In other words, what is the cost to run that technology? This is where Snowflake has a huge advantage, and our customers know that. It's just reduced skills, fewer people, and not having to touch the complexity of the underlying platforms. We're more descendants of the Apple and Tesla than being the descendants of Hadoop, like some other people are in the marketplace, right? So we abstract the complexity. That's what yields the TCO advantage. But the raw cost of computing and storage, there's not that much opportunity to be had there.

Building on this strategy, Snowflake continues to innovate and streamline even the most complex of tasks. Looking ahead, we can expect the following items to be shaped by Snowflake's signature user-friendly approach:


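  • Low management: Reduced operational overhead.


  • DocumentAI: Advanced document processing and insights.


  • ML SQL functions: ML capabilities embedded in SQL.


  • AI with NVDA: Collaboration on cutting-edge AI tools.

  • Microsoft: Partnership to bring Microsoft AI directly to the Data Cloud.

  • LLMs on company data: Expanding data reach and utility.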


  • Streamlit

  • Native App Framework: Seamless integration for app development.


ML SQL Functions

Snowflake's ML SQL functions, currently in open preview, are changing the way we view SQL and ML. The first three are:

1.     Forecasting: Predict future values based on past data. Ideal for sales forecasting, stock trends, and more.

2.     Anomaly Detection: Identify unusual patterns in data that don't conform to expected behavior. Useful in fraud detection, system health monitoring, etc.

3.     Contribution Explorer: Understand the factors contributing to a particular outcome. It's like asking a "why" for every "what".

Requirements & Limitations

As with any tool in development, there are requirements and limitations. The following are the current constraints for these functions:

  • A maximum of 500,000 rows for model training.

  • A minimum of 12 rows for model training.

  • 1-second minimum granularity.

  • Seasonal components have a 1-minute minimum granularity.

  • Timestamps must have fixed intervals.

  • Season length of autoregressive features is tied to the input frequency.

  • Existing models cannot be updated; a new one must be trained.

  • Outliers can affect the algorithm. Users may need to remove them if they are not wanted.

  • Models cannot be cloned across accounts.
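
Before training, it can help to check a dataset against the row-count and fixed-interval constraints above. The query below is a minimal sketch, assuming a hypothetical table named stock_prices with columns ticker, trade_date (a DATE), and close_price; those names are illustrative and not part of Snowflake.

```sql
-- Hypothetical source table: stock_prices(ticker, trade_date DATE, close_price)
WITH gaps AS (
    SELECT
        ticker,
        -- Seconds between each row and the previous row of the same ticker
        DATEDIFF(
            'second',
            LAG(trade_date::TIMESTAMP_NTZ) OVER (PARTITION BY ticker ORDER BY trade_date),
            trade_date::TIMESTAMP_NTZ
        ) AS gap_seconds
    FROM stock_prices
)
SELECT
    ticker,
    COUNT(*)                         AS row_count,           -- compare against the 500,000-row maximum
    COUNT(*) >= 12                   AS meets_min_rows,      -- at least 12 rows are required
    COUNT(DISTINCT gap_seconds) <= 1 AS has_fixed_interval,  -- a single distinct gap means even spacing
    MIN(gap_seconds)                 AS smallest_gap_seconds -- should be at least 1 second
FROM gaps
GROUP BY ticker
ORDER BY ticker;
```

Any ticker where meets_min_rows or has_fixed_interval comes back FALSE needs attention during data preparation. For daily stock data, missing weekends and holidays will typically make has_fixed_interval FALSE.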

Getting Started with ML SQL Functions

Diving into these functions involves a systematic process:

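  1. Prepare data: Organize and clean your data to ensure it is ready.

    • The most important step

  2. Create model: Set up the foundation for your machine learning model.

  3. Train model: Use your data to train and refine the model.

  4. Get data: Extract insights and results.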


I have a dataset with the closing price data for all the stocks in the Nasdaq & Dow. I want to run predictive analysis over the dataset for the next 2 months, and I want to train the model on data beginning on 1/1/2019.


In this step, views are your friend. This is where you further prepare the data for ML and shape it to meet the requirements. For this stock dataset, there are a few things to handle:

  1. There are tickers with fewer than 12 rows (a new IPO, or a stock that came off the market within 12 days of the start date).

    • Exclude these records through a view

  2. There is a date column, but it needs to be a timestamp data type.

    • Change the data type to a timestamp in the view

  3. Weekend and holiday data does not exist. The FIXED interval requirement has to be met by mocking up data for those dates.

    • Have missing data show as the previous close price through a view

  4. When training on larger sets, it's important that the final view be ordered by the TIMESTAMP column.
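
The four items above can be handled in a single view. What follows is a sketch under stated assumptions, not the exact view used for this dataset: the table stock_prices, its columns ticker, trade_date, and close_price, and the view name stock_prices_ml_v are all hypothetical.

```sql
-- Hypothetical source table: stock_prices(ticker, trade_date DATE, close_price)
CREATE OR REPLACE VIEW stock_prices_ml_v AS
WITH eligible AS (
    -- Item 1: keep only tickers with at least 12 rows in the training window
    SELECT ticker, trade_date, close_price
    FROM stock_prices
    WHERE trade_date >= '2019-01-01'
    QUALIFY COUNT(*) OVER (PARTITION BY ticker) >= 12
),
bounds AS (
    -- First and last trading day per ticker
    SELECT ticker, MIN(trade_date) AS first_date, MAX(trade_date) AS last_date
    FROM eligible
    GROUP BY ticker
),
calendar AS (
    -- One row per calendar day from 2019-01-01; 3650 days is an arbitrary upper bound
    SELECT DATEADD('day', ROW_NUMBER() OVER (ORDER BY SEQ4()) - 1, '2019-01-01'::DATE) AS cal_date
    FROM TABLE(GENERATOR(ROWCOUNT => 3650))
),
spine AS (
    -- Every calendar day between each ticker's first and last trading day
    SELECT b.ticker, c.cal_date
    FROM bounds b
    JOIN calendar c
      ON c.cal_date BETWEEN b.first_date AND b.last_date
)
SELECT
    s.ticker,
    s.cal_date::TIMESTAMP_NTZ AS ts,            -- Item 2: date cast to a timestamp
    LAST_VALUE(e.close_price) IGNORE NULLS OVER (
        PARTITION BY s.ticker
        ORDER BY s.cal_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS close_price                            -- Item 3: gaps carry the previous close
FROM spine s
LEFT JOIN eligible e
  ON e.ticker = s.ticker
 AND e.trade_date = s.cal_date
ORDER BY ts;                                    -- Item 4: ordered by the timestamp column
```

The calendar spine is what turns irregular trading days into a fixed daily interval; the LEFT JOIN leaves weekends and holidays as NULL, and LAST_VALUE with IGNORE NULLS fills them with the most recent close.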


Now the hard work is done. Let's create the model.
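
A minimal sketch of model creation, assuming the prepared view is called stock_prices_ml_v with columns ticker, ts, and close_price. The view, column, and model names are hypothetical, and because these functions are in preview the syntax may differ from what is shown here.

```sql
-- Model, view, and column names are illustrative assumptions
CREATE OR REPLACE SNOWFLAKE.ML.FORECAST stock_forecast_model(
    INPUT_DATA        => SYSTEM$REFERENCE('VIEW', 'stock_prices_ml_v'),
    SERIES_COLNAME    => 'ticker',       -- one time series per ticker
    TIMESTAMP_COLNAME => 'ts',           -- must be evenly spaced
    TARGET_COLNAME    => 'close_price'   -- the value to forecast
);
```

Passing SERIES_COLNAME lets one model cover every ticker at once rather than requiring a model per stock.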


Train the model for 60 forecast periods. This step can take a long time, but scaling up the warehouse can reduce that time.
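
As a sketch of this step, assuming a model named stock_forecast_model and a warehouse named my_wh (both hypothetical names), the warehouse can be resized first and the 60 periods requested afterwards. With daily timestamps, 60 periods corresponds to roughly two months.

```sql
-- Optional: scale up the (hypothetical) warehouse to shorten the run
ALTER WAREHOUSE my_wh SET WAREHOUSE_SIZE = 'LARGE';

-- Request 60 forecast periods from the (hypothetical) model
CALL stock_forecast_model!FORECAST(FORECASTING_PERIODS => 60);
```

Remember to size the warehouse back down afterwards if the larger size is not needed for other work.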


If using straight SQL, use the RESULT_SCAN function to put the results from the previous step into a table for further analysis.
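
A minimal sketch, assuming the forecast call was the most recent statement run in the same session; the table name stock_forecast_results is hypothetical.

```sql
-- Persist the output of the previous statement into a table
CREATE OR REPLACE TABLE stock_forecast_results AS
SELECT *
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));
```

If other statements have run in between, pass the saved query ID of the forecast call to RESULT_SCAN instead of LAST_QUERY_ID().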


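Snowflake continues to shape the future of data analysis and machine learning by introducing powerful yet user-friendly tools. We look forward to further innovations and improvements, and it is clear that with Snowflake, machine learning truly is for everyone.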

Dive in, explore, and harness the power of data like never before!


