Gradient Flow

Pipe SQL: Google’s Answer to SQL’s Limitations

As a long-time SQL user, I still turn to it whenever I need to work with tabular data, preferring it over dataframe libraries like Pandas. SQL’s syntax is easier to recall and aligns well with set theory concepts, making it intuitive for those familiar with basic mathematical logic. The query optimizer adds to the appeal: it translates declarative queries into efficient execution plans, so users can focus on what they want rather than how to achieve it. However, SQL can become cumbersome, especially for complex queries that involve subqueries and temporary tables. For more intricate data structures, like property graphs, I find Cypher to be a more natural choice.

Surprisingly, even highly skilled developers often struggle with SQL. Although it’s the standard language for database querying, SQL can impede productivity, especially in data-intensive fields like AI. The problem affects a wide range of users, from developers to data scientists, who rely on SQL for their daily tasks. When fluency is hard to achieve, users cannot fully leverage the language’s capabilities, and that can hinder innovation.

A new paper from a team at Google aims to enhance SQL by introducing a pipe-structured data flow syntax, known as Google Pipe Syntax, to make SQL more flexible, extensible, and easier to use. The approach targets SQL’s rigid clause structure, which makes it hard to express operations in any order other than the prescribed one without resorting to subqueries or other workarounds. With pipe syntax, users can compose operations in any order, which increases flexibility, simplifies the user experience, and allows the language to be extended cleanly. Here’s an example from the paper:

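As a separate illustration (not taken from the paper), here is a minimal sketch of the ordering problem. It assumes a hypothetical orders table and uses Python's built-in sqlite3 module to run the standard SQL version, which needs a subquery to aggregate twice. The pipe version is my own rendering based on the operators the paper describes; it is shown as a string for comparison only, since SQLite does not parse pipe syntax.

```python
import sqlite3

# Hypothetical sample data: one row per order.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount INTEGER);
    INSERT INTO orders VALUES
        ('ann', 10), ('ann', 20),
        ('bob', 5),
        ('cat', 7), ('cat', 8), ('cat', 9),
        ('dan', 1), ('dan', 2);
""")

# Standard SQL: "how many customers placed N orders?" aggregates twice,
# so the first aggregation has to be wrapped in a subquery and the
# query reads from the inside out.
STANDARD_SQL = """
    SELECT order_count, COUNT(*) AS num_customers
    FROM (
        SELECT customer, COUNT(*) AS order_count
        FROM orders
        GROUP BY customer
    ) AS per_customer
    GROUP BY order_count
    ORDER BY order_count
"""

# Pipe-style rendering of the same logic (an assumption, not executed here):
# each step consumes the output of the previous one, top to bottom.
PIPE_SQL = """
    FROM orders
    |> AGGREGATE COUNT(*) AS order_count GROUP BY customer
    |> AGGREGATE COUNT(*) AS num_customers GROUP BY order_count
    |> ORDER BY order_count
"""

rows = conn.execute(STANDARD_SQL).fetchall()
print(rows)  # [(1, 1), (2, 2), (3, 1)]
```

In the pipe version the second aggregation is simply appended as another step, so no nesting is needed and the text follows the order in which the data is processed.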

Adopting Pipe Syntax for SQL could bring about several practical benefits:

Exploring SQL Alternatives

A number of alternatives have been developed to address SQL’s limitations, each with distinct strengths and challenges. PRQL offers composable relational operations but struggles with adoption due to its unfamiliar syntax and divergence from traditional SQL. SQL++ extends SQL to better handle structured data types like JSON but doesn’t resolve the core syntax issues that make SQL cumbersome for complex queries.

Python DataFrames are popular for data manipulation but lack the declarative power and optimization essential for large-scale processing, making them less effective for AI tasks. Tools like KQL and Apache Beam also attempt to improve on SQL but face adoption challenges due to their specialized use cases and steep learning curves.

These alternatives highlight the difficulty in finding a balance between enhancing SQL’s usability and ensuring seamless integration with existing systems. None have yet fully succeeded in overcoming SQL’s limitations while maintaining the broad compatibility and ease of use that SQL offers.

Pipe SQL: From Concept to Widespread Use at Google

In contrast to these less successful attempts, Pipe SQL appears to have gained significant traction at Google. After an initial implementation phase involving a small group of early users, Google stabilized the Pipe SQL language and made the pipe syntax widely available. Over the following six months, adoption steadily increased. The first spikes followed announcements on a SQL users mailing list and the removal of opt-in settings, which made the pipe syntax the default. Usage continued to grow as more users incorporated pipe syntax into their daily work, and there was significant uptake after a SQL workshop at a user conference, where a 40-minute tutorial on the syntax generated excitement and further adoption.

Pipe SQL: Pros, Cons, and Future Prospects

The introduction of Pipe SQL is a step forward in making SQL more adaptable and user-friendly, especially for complex data processing tasks. However, it is not without its limitations. The potential for parsing ambiguities, the complexity of tree-like query structures, and the current lack of IDE support suggest that while Pipe SQL holds promise, there is still work to be done. Additionally, there will be an adjustment period as users familiarize themselves with the new syntax.

While Pipe SQL is an intriguing development that could make SQL more accessible and powerful, I’m adopting a wait-and-see approach. I’ll reserve judgment until I see how quickly it gets adopted by the broader community, particularly in Postgres, which remains my favorite database system. Let’s see if Pipe SQL and similar variants can succeed where Esperanto didn’t, by staying close enough to the familiar to actually catch on.
