Contoso is a gaming company that creates games for multiple platforms: game consoles, handheld devices, and personal computers (PCs). These games produce a large volume of logs, and Contoso's goal is to collect and analyze them to gain insights into customer preferences, demographics, and usage behavior. These insights help Contoso identify up-sell and cross-sell opportunities, develop compelling new features to drive business growth, and provide a better experience to customers.

This sample evaluates the effectiveness of a marketing campaign that Contoso recently launched. It collects sample logs, processes and enriches them with reference data, and transforms the data by using the following three pipelines:

  1. The PartitionGameLogsPipeline reads the raw game events from blob storage and partitions them by year, month, and day.
  2. The EnrichGameLogsPipeline joins the partitioned game events with geo-code reference data and enriches the data by mapping IP addresses to the corresponding geo-locations (a sketch of this pipeline's Hive activity follows the list).
  3. The AnalyzeMarketingCampaignPipeline combines the enriched data with advertising data to produce the final output, which measures the effectiveness of the marketing campaign.
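
For illustration, the EnrichGameLogsPipeline can be sketched as an Azure Data Factory (v1) pipeline JSON that contains a single HDInsightHive activity. The dataset names, linked service names, Hive script path, and active period below are illustrative assumptions, not the exact names and values used by the sample.

```json
{
  "name": "EnrichGameLogsPipeline",
  "properties": {
    "description": "Illustrative sketch: joins partitioned game events with geo-code reference data.",
    "activities": [
      {
        "name": "EnrichGameLogsHiveActivity",
        "type": "HDInsightHive",
        "linkedServiceName": "HDInsightOnDemandLinkedService",
        "inputs": [
          { "name": "PartitionedGameEvents" },
          { "name": "RefGeoCodeDictionary" }
        ],
        "outputs": [
          { "name": "EnrichedGameEvents" }
        ],
        "typeProperties": {
          "scriptPath": "adfsample/scripts/enrichlogs.hql",
          "scriptLinkedService": "StorageLinkedService"
        },
        "scheduler": { "frequency": "Day", "interval": 1 },
        "policy": { "timeout": "01:00:00", "retry": 1 }
      }
    ],
    "start": "2016-01-01T00:00:00Z",
    "end": "2016-01-08T00:00:00Z"
  }
}
```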

The sample showcases how you can use the Azure Data Factory service to compose data integration workflows that copy or move data by using the Copy Activity, and that process data by running Pig or Hive scripts on an Azure HDInsight cluster by using the HDInsight Activity.
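
As a rough sketch of what a Copy Activity looks like in Azure Data Factory (v1) JSON, the following hypothetical activity copies enriched blob data into an Azure SQL table. The activity name, dataset names, and sink settings are assumptions for illustration; the sample's actual definitions may differ.

```json
{
  "name": "CopyEnrichedDataToSql",
  "type": "Copy",
  "inputs": [ { "name": "EnrichedGameEvents" } ],
  "outputs": [ { "name": "MarketingCampaignEffectivenessSqlTable" } ],
  "typeProperties": {
    "source": { "type": "BlobSource" },
    "sink": { "type": "SqlSink", "writeBatchSize": 10000, "writeBatchTimeout": "00:10:00" }
  },
  "scheduler": { "frequency": "Day", "interval": 1 },
  "policy": { "timeout": "01:00:00", "retry": 1 }
}
```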

To deploy the sample:

  • Select the storage account that you want to use with the sample from the drop-down list.
  • Select the database server and the database that you want to use with the sample.
  • Enter the user name and password for the database.
  • Click the Create button.

The deployment process does the following:

  • Uploads sample data to your Azure storage account.
  • Creates a table in the Azure SQL database.
  • Deploys the linked services, tables, and pipelines that run the sample (an illustrative sketch of two of these definitions follows this list).
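
As an example of the kind of definitions the deployment creates, here is a hedged sketch of an Azure Storage linked service and a blob table (dataset) in Azure Data Factory (v1) JSON. The storage account placeholders, folder path, and dataset name are illustrative assumptions rather than the sample's actual values.

```json
{
  "name": "StorageLinkedService",
  "properties": {
    "type": "AzureStorage",
    "typeProperties": {
      "connectionString": "DefaultEndpointsProtocol=https;AccountName=<your storage account>;AccountKey=<your key>"
    }
  }
}
```

```json
{
  "name": "RawGameEvents",
  "properties": {
    "type": "AzureBlob",
    "linkedServiceName": "StorageLinkedService",
    "typeProperties": {
      "folderPath": "gamelogs/rawevents/",
      "format": { "type": "TextFormat", "columnDelimiter": "," }
    },
    "external": true,
    "availability": { "frequency": "Day", "interval": 1 }
  }
}
```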

This sample uses an on-demand HDInsight linked service, which creates a one-node HDInsight cluster on demand to run the Pig and Hive scripts and deletes the cluster after processing is complete.
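
A minimal sketch of an on-demand HDInsight linked service in Azure Data Factory (v1) JSON is shown below. The cluster size, time-to-live, and referenced storage linked service name are illustrative assumptions, not the sample's exact settings.

```json
{
  "name": "HDInsightOnDemandLinkedService",
  "properties": {
    "type": "HDInsightOnDemand",
    "typeProperties": {
      "clusterSize": 1,
      "timeToLive": "00:30:00",
      "linkedServiceName": "StorageLinkedService"
    }
  }
}
```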

After the deployment is complete, you can monitor the end-to-end data integration workflow in the Diagram view, and use the monitoring features of the Microsoft Azure portal to monitor datasets and pipelines.

NOTE: There are costs associated with transferring the data and with processing it on an on-demand HDInsight cluster. See HDInsight Pricing and Data Transfer Pricing for details.

For more details about this sample, see this tutorial on Azure.com.