This article covers performance optimization on Azure Data Lake. It assumes that readers have a general familiarity with ADL, U-SQL, and querying. Before we move on, let’s understand the key terms that determine query performance: Analytic Unit, Stages in a job, Job Read More …
Category: ADLA
U-SQL (Intro)
U-SQL is a language that unifies the benefits of SQL with the expressive power of your own code to process all data at any scale. U-SQL’s scalable distributed query capability enables you to efficiently analyze data in the store and Read More …
Including File Properties and Metadata in a U-SQL Script
U-SQL adds support for computed file property columns on the EXTRACT statement. Sometimes customers would like to get information about the files that they process, such as the full URI path or information about size, creation, or modification dates. Likewise, customers would Read More …
U-SQL Table
Azure Data Lake Analytics (U-SQL) originates from the world of Big Data, in which data is processed in a scale-out manner by using multiple nodes. These nodes can access the data in several formats, from flat files to U-SQL tables. Read More …
T-SQL TO U-SQL DATA TYPE CONVERSION
When working with code-generated solutions, we often need to convert datasets from SQL Server (T-SQL) data types to Azure Data Lake Analytics (U-SQL) data types. As you probably know, U-SQL has a hybrid syntax of T-SQL and C# which Read More …
Part 9: Extending U-SQL
There are five kinds of user-defined entities in U-SQL: User-Defined Functions (UDFs), User-Defined Types (UDTs), User-Defined Aggregators (UDAggs), User-Defined Operators (UDOs), and User-Defined Appliers. All of them are defined by .NET code. C# is not required; any .NET language will work. User-Defined Functions: User-defined functions are normal static methods on a .NET class. Read More …
Part 8: Set operations and Joins
Set operations merge rowsets together based on set-theoretic operations such as union (UNION), intersection (INTERSECT), and complement (EXCEPT). Sample data: Let’s define two rowsets, @a and @b. Notice that both rowsets have duplicate rows. UNION: UNION combines two rowsets. UNION Read More …
Part 7: Window Functions
Window functions were introduced to the ISO/ANSI SQL Standard in 2003. U-SQL adopts a subset of window functions as defined by the ANSI SQL Standard. Window functions are used to do computation within sets of rows called windows. Windows are defined Read More …
Part 6: The U-SQL Catalog and Assemblies
The U-SQL Catalog is the way U-SQL organizes data and code for re-use and sharing. Catalog organization: Every ADLA account has a single U-SQL catalog. The catalog cannot be deleted. Each U-SQL catalog contains one or more U-SQL databases. Every catalog has Read More …
Part 5: Working with FileSets
Reading and Writing Files: Built-in Extractors. U-SQL has three built-in extractors that handle text: Extractors.Csv() reads comma-separated value (CSV) files, Extractors.Tsv() reads tab-separated value (TSV) files, and Extractors.Text() reads delimited text files. Extractors.Csv and Extractors.Tsv are the same as Extractors.Text, but they default to a specific delimiter appropriate for the format they support. Read More …