
ClickHouse Cloud Now Runs Python
TL;DR: ClickHouse Cloud has launched executable user-defined functions (UDFs) in public beta. This feature allows developers and data engineers to write functions in Python, upload them to a cluster, and call them directly from SQL queries, simplifying complex data processing and machine learning workflows.
Key facts
- Category: Database
- Impact: High
- Published
- Source: ClickHouse Blog
Full summary
ClickHouse Cloud now lets you run Python functions directly within your SQL queries, simplifying complex data processing and machine learning workflows.
ClickHouse Cloud has introduced executable user-defined functions (UDFs) in public beta. The feature lets users write custom functions in external programming languages, starting with Python, and run them directly within SQL queries. Developers package their Python code, upload it to their ClickHouse cluster, and then invoke the function like any built-in SQL function. The platform manages the execution environment and handles running the external code securely and efficiently. The initial release focuses on Python, a dominant language in data science and machine learning, so logic that previously lived in external scripts can now run in-database for users of the cloud service.
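To make the "write, upload, call from SQL" flow more concrete, here is a minimal sketch of what the Python side of such a function could look like. It assumes the convention used by executable UDFs in open-source ClickHouse, where the database streams argument rows to the script on standard input (one row per line in a text format such as TabSeparated) and reads one result per line from standard output. Whether the Cloud beta uses exactly this contract, and how the script is packaged and registered, is not stated in the source, so treat those parts as assumptions. The function name `word_count` is hypothetical.

```python
#!/usr/bin/env python3
"""Hypothetical executable UDF: counts whitespace-separated words in a string."""
import sys


def word_count(text: str) -> int:
    """Return the number of whitespace-separated tokens in one input value."""
    return len(text.split())


def process_stream(lines, out) -> None:
    """Read one argument per line and write one result per line.

    Assumption: a single String argument arrives per line. This sketch does
    not decode TabSeparated escape sequences such as a literal backslash-t.
    """
    for line in lines:
        value = line.rstrip("\n")
        out.write(f"{word_count(value)}\n")
    # Flush so the caller is not left waiting on Python's output buffering.
    out.flush()


if __name__ == "__main__":
    process_stream(sys.stdin, sys.stdout)
```

Once a script like this is registered under a function name, a query along the lines of `SELECT word_count(comment) FROM reviews` would invoke it per row; the table and column names here are illustrative only.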
The introduction of executable UDFs is significant for developers and data teams as it streamlines complex data processing and machine learning tasks. Previously, performing advanced transformations or running inference with a machine learning model often required extracting data from ClickHouse, processing it in an external application, and then loading the results back. This multi-step process adds latency and complexity. By allowing Python code to run directly inside the database, teams can now perform these operations within a single query. This simplifies architectures, reduces data movement, and enables faster, more integrated analytical pipelines, making ClickHouse a more powerful and self-contained platform.
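As an illustration of the inference case, the sketch below scores rows with a tiny logistic model inside the function itself, so no data has to leave the database. Everything specific here is an assumption for the sake of the example: the coefficients in `WEIGHTS` and `BIAS` are made up, the three-feature layout is hypothetical, and rows are assumed to arrive as tab-separated numeric fields. A real deployment would load a trained model once at startup instead of hard-coding values.

```python
import math
from typing import Iterable, Iterator, Sequence

# Hypothetical coefficients standing in for a trained model.
WEIGHTS: Sequence[float] = (0.8, -1.2, 0.5)
BIAS: float = -0.3


def predict(features: Sequence[float]) -> float:
    """Return the logistic-regression probability for one feature vector."""
    if len(features) != len(WEIGHTS):
        raise ValueError(f"expected {len(WEIGHTS)} features, got {len(features)}")
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    # Numerically stable sigmoid: never exponentiates a large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def score_rows(lines: Iterable[str]) -> Iterator[str]:
    """Parse tab-separated feature rows and yield one formatted score per row."""
    for line in lines:
        features = [float(field) for field in line.rstrip("\n").split("\t")]
        yield f"{predict(features):.6f}"
```

Keeping `predict` separate from the row parsing means the model logic can be unit-tested outside the database before it is uploaded.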
Why it matters
This feature simplifies data and ML workflows by allowing developers to run Python code directly within SQL queries, reducing the need for external processing systems and data movement.
Business impact
Enabling in-database Python execution can lower operational costs and complexity for data teams. It accelerates development cycles for data-intensive applications and allows businesses to derive insights from their data faster by integrating analytics and ML models directly into their data warehouse.
Primary source: ClickHouse Blog