Self-service exploratory analytics is one of the most common use cases we see among our customers running Cloudera’s Data Warehouse solution.
With the recent release of Cloudera 6.2, we continue to improve the end-user query experience with Hue, focusing on easier SQL query troubleshooting and increased compatibility with Hive. Read on to learn more and try it out with one click at demo.gethue.com.
Easier Self-Service Query Troubleshooting
Hue already helps users find tables in the Data Catalog and write better queries: the smart autocomplete recommends popular values and warns about dangerous operations. Once a query is executing, however, it can be difficult to understand why it is slow.
A new feature in 6.2 introduces a prettier display of the SQL Query Profile, which helps identify where the query bottlenecks are, why they occur, and how to optimize the query.
Please read more about this feature in this complete self-troubleshooting scenario.
Additionally, one of the most requested fixes was implemented: releasing query resources after the query has finished and they are no longer needed.
On the Apache Impala side, the query execution status now properly reports whether the query is actively running (“processing” data) or just “open but finished” (meaning it is only “keeping” the results without using resources). In addition, the new parameter NUM_ROWS_PRODUCED_LIMIT tells Impala to truncate query execution as soon as this maximum number of result rows has been returned. This releases resources early on large SELECT operations where only the first few rows are actually displayed (which is the primary use case in Hue).
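As a rough illustration of how such a limit could be applied from a client, here is a minimal Python sketch that builds the statements to send to Impala. It only relies on Impala's `SET option=value;` syntax for query options. The helper names `num_rows_limit_statement` and `with_row_limit`, the table name `web_logs`, and the treatment of 0 as "no limit" are assumptions made for this example, not part of Hue or Impala.

```python
from typing import List


def num_rows_limit_statement(max_rows: int) -> str:
    """Build the SET statement that caps the rows a query may produce.

    Hypothetical helper: assumes 0 means "no limit" and rejects negatives.
    """
    if isinstance(max_rows, bool) or not isinstance(max_rows, int):
        raise TypeError("max_rows must be an integer")
    if max_rows < 0:
        raise ValueError("max_rows must be >= 0")
    return f"SET NUM_ROWS_PRODUCED_LIMIT={max_rows};"


def with_row_limit(query: str, max_rows: int) -> List[str]:
    """Return the statements to run, in order: the SET, then the query."""
    # Normalize the query so it ends with exactly one semicolon.
    normalized = query.strip().rstrip(";") + ";"
    return [num_rows_limit_statement(max_rows), normalized]


if __name__ == "__main__":
    # "web_logs" is a made-up table name used only for illustration.
    for statement in with_row_limit("SELECT * FROM web_logs", 1000):
        print(statement)
```

The two statements would then be run in the same session, for example from impala-shell or an editor session, so that the limit applies to the SELECT that follows it.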
Better compatibility with Hive in HDP
Apache Hive has typically been very innovative in the Hortonworks distribution. Upstream, the support for Hive on Tez and Hive LLAP was improved. Now:
- The jobs will show up in Job Browser
- The query ID is printed
- The progress is displayed
You can read more details in the Hue and Hive 3 integration improvements post.
Note that currently Hue is not officially supported in HDP. However, if you want to experiment, you can learn how to configure Hue in HDP and set it up on your own, or get help from Cloudera Professional Services to do it for you.
More than 80 bugs were fixed to improve the supportability and stability of Hue. The full list is in the release notes but here are the top ones:
- HUE-7474 [core] Add ability to enable/disable Hue data/file “download” options globally
- HUE-7128 [core] Apply config ENABLE_DOWNLOAD to search dashboard
- HUE-8680 [core] Fill in Impalad WEBUI username passwords automatically
- HUE-8585 [useradmin] Bubbling up errors for Add Sync Ldap Users
- HUE-8690 [backend] Fix Hue allows unsigned SAML assertions
- HUE-8140 [editor] Improve multi-statement execution
- HUE-8662 [core] Fix missing static URLs
In addition, the Hue Docker image was simplified, so that it is easier to get started quickly and try out the latest features.
Last but not least, the upstream and downstream documentation just got the first pass of a revamp, with a better table of contents, restyling, and updated instructions. In particular, on the upstream docs, reporting issues or sending a suggestion is one click away via GitHub, so feel free to send some pull requests!
Thank you to everybody who uses the product and who contributed to this release. Now off to the next one!
The post What’s new in the Hue Data Warehouse Editor in Cloudera 6.2 appeared first on Cloudera Engineering Blog.