Tuesday, 28 August 2018

Announcing Databricks Runtime 4.3

I’m pleased to announce the release of Databricks Runtime 4.3, powered by Apache Spark. We’ve packed this release with new features, performance improvements, and quality improvements to the platform. We recommend moving to Databricks Runtime 4.3 to take advantage of them.

As part of our ongoing effort to improve the platform’s performance, Databricks Runtime 4.3 delivers substantial performance gains over previous versions of the Databricks Runtime. In benchmarks using TPC-DS at 1 terabyte scale, we measured:

  • 16% performance improvement on AWS: This is a result of improvements to data skipping and optimal shuffle placement.
  • 55% performance improvement on Azure: This is largely due to enabling caching and internal performance optimizations.

In addition to the performance improvements, we’ve also added new functionality to Databricks Delta:

  • Truncate Table: With Delta, you can delete all rows in a table using truncate. Note that deleting specific partitions is not supported. Refer to the documentation for more information: Truncate Table
  • Alter Table Replace Columns: Replace the columns of a Databricks Delta table. This includes changing the comment of a column and reordering multiple columns. Refer to the documentation for more information: Alter Table
  • FSCK Repair Table: This command removes, from the transaction log of a Databricks Delta table, the entries for files that can no longer be found in the underlying file system. This can happen when those files have been manually deleted. Refer to the documentation for more information: Repair Table
  • Scaling “Merge” Operations: This release comes with experimental support for larger source tables with “Merge” operations. Please contact support if you would like to try out this feature.
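
As a rough illustration of the first three commands above, here is a minimal Python sketch that only assembles the SQL text. The table name `events`, the column definitions, and the helper `delta_maintenance_statements` are hypothetical and not part of the release; the statement shapes follow the Databricks SQL syntax as I understand it, so check them against the linked documentation before running anything.

```python
def delta_maintenance_statements(table, columns):
    """Build example SQL text for the Delta commands described above.

    `table` and `columns` are hypothetical inputs. `columns` is a list of
    (name, type, comment) tuples in the desired column order; a comment of
    None is omitted. Comments are not escaped, so this is a sketch only.
    """
    column_defs = ", ".join(
        f"{name} {dtype} COMMENT '{comment}'" if comment else f"{name} {dtype}"
        for name, dtype, comment in columns
    )
    return [
        # Delete all rows in the table (not individual partitions).
        f"TRUNCATE TABLE {table}",
        # Replace the column list, which can reorder columns and set comments.
        f"ALTER TABLE {table} REPLACE COLUMNS ({column_defs})",
        # Drop transaction log entries for files missing from storage.
        f"FSCK REPAIR TABLE {table}",
    ]


statements = delta_maintenance_statements(
    "events",
    [("event_id", "BIGINT", "unique event key"), ("event_date", "DATE", None)],
)
for statement in statements:
    print(statement)
```

The three statements are independent examples, not a workflow to run in sequence. In a notebook attached to a cluster, each string could be passed to `spark.sql(...)`, assuming `events` is an existing Databricks Delta table.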

We’ve added some improvements to Structured Streaming that I’d also like to highlight:

To read more about the above new features and to see the full list of improvements included in Databricks Runtime 4.3, please refer to the release notes in the following locations:

Azure: Databricks Runtime 4.3 release notes

The post Announcing Databricks Runtime 4.3 appeared first on Databricks.