Azure Batch (HPC): Introducing extreme data processing with Apache Arrow in Azure | When more is needed | Jan 11
Microsoft Fabric: Sentiment Analysis from Speech Files with SynapseML in Spark | Unstructured data processing in OneLake. | Dec 18, 2024
Lakehousing: Navigating in AI/ML and other types of non-deterministic transformations when… | Considerations around data processing including AI and ML | Dec 11, 2024
Microsoft Fabric: Shortcutting to a firewall-protected Azure Data Lake, distilled. | Less is more | Oct 2, 2024
Microsoft Fabric: Diving into Lakehouse access from local machines and other remotes with Delta-RS | Sometimes local compute outperforms | Sep 15, 2024
Data Architecture: Data capture time and event time in medallion architecture. | Data arrangement has the most significant impact on your solution. | Aug 25, 2024
Spark performance: Let cache() or persist() handle your temporary data when possible | Let Spark do the work | Jul 19, 2024
Microsoft Fabric: Utilize Shared SparkSessions fully with mssparkutils.notebook.run and runMultiple | There is isolated and then there is a little less isolated. | Jun 6, 2024
Spark SQL: Why the choice of language doesn’t impact performance | Exploring the language-agnostic power of Apache Spark SQL | May 24, 2024
Microsoft Fabric and Databricks: The low-level challenge of enforcing primary keys and foreign… | Key enforcement requires performance | May 18, 2024