Planning a Custom Import Task

This topic discusses the first two steps in the process of importing custom data into the Data Warehouse.

To Import Custom Data

A running example is used to explain this process. The example involves an ice cream company with a large volume of user data in log files. The company wants to extract information about customers who registered on its Web site over the last year. As part of registration, users answered a questionnaire about their favorite flavor of ice cream. To enhance local promotions, the company wants to analyze the regional distribution of favorite flavors.

The log files contain the data in the following table.

Data property name     Example values
Title                  "Mr" "Ms" "Dr"
FullName               "Kevin Verboort"
Email_Address          "someone@microsoft.com"
TelNum                 "800 123-4567x123"
FaxNum                 "800 123-4567x321"
AddressLine1           "Public's Animal Shelter"
AddressLine2           "223 Normal Coast Way, #85"
City                   "Everytown"
State                  "WA"
ZipCode                "98434-3424"
Fav_Flavor_Code        34
Fav_Flavor_Comments    "Can never get enough chocolate!"

The goal is to analyze the regional distribution of favorite flavors. This will require, at minimum, the data from the State and Fav_Flavor_Code properties. Because the company wants a more precise distribution, the data from the City property must also be included.
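As a rough illustration of the analysis these three properties support, the following Python sketch counts favorite-flavor codes per state and city. The record dictionaries and the function name flavor_distribution are hypothetical; only the property names come from the log data.

```python
from collections import Counter

def flavor_distribution(records):
    """Count how often each flavor code occurs in each (State, City) region."""
    counts = Counter()
    for rec in records:
        region = (rec["State"], rec["City"])
        counts[(region, rec["Fav_Flavor_Code"])] += 1
    return counts

# Hypothetical sample records, shaped like the log data.
sample = [
    {"State": "WA", "City": "Everytown", "Fav_Flavor_Code": 34},
    {"State": "WA", "City": "Everytown", "Fav_Flavor_Code": 34},
    {"State": "WA", "City": "Everytown", "Fav_Flavor_Code": 12},
]
distribution = flavor_distribution(sample)
```

Keying on State alone would give the coarser distribution; adding City is what makes the result more precise.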

To promote the product, the company also needs the titles, names, and e-mail addresses of the registered users. This will enable the company to send customized direct mailings to its customers.

The following table contains the list of necessary data.

Data property name    Data type
Title                 String
FullName              String
Email_Address         String
City                  String
State                 String
Fav_Flavor_Code       Integer (unsigned)
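The selection above could be sketched in Python as follows. This is a minimal sketch that assumes each log entry has already been parsed into a dictionary of strings; the names REQUIRED_FIELDS and select_required are hypothetical and not part of the Data Warehouse.

```python
# Required properties and their data types. Property names are spelled
# as they appear in the log data.
REQUIRED_FIELDS = {
    "Title": str,
    "FullName": str,
    "Email_Address": str,
    "City": str,
    "State": str,
    "Fav_Flavor_Code": int,
}

def select_required(raw):
    """Keep only the required properties, converted to their listed types."""
    selected = {}
    for name, convert in REQUIRED_FIELDS.items():
        selected[name] = convert(raw[name])
    code = selected["Fav_Flavor_Code"]
    if code < 0:
        # The flavor code is an unsigned integer, so reject negative values.
        raise ValueError("Fav_Flavor_Code must be unsigned, got %d" % code)
    return selected

# Hypothetical parsed log entry (assumed to arrive as strings).
raw_record = {
    "Title": "Mr",
    "FullName": "Kevin Verboort",
    "Email_Address": "someone@microsoft.com",
    "TelNum": "800 123-4567x123",
    "City": "Everytown",
    "State": "WA",
    "ZipCode": "98434-3424",
    "Fav_Flavor_Code": "34",
}
required = select_required(raw_record)
```

Properties that are not needed for the analysis or the mailings, such as TelNum and ZipCode, are dropped.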

This completes the first step, identifying the data to import.

The second step involves finding a structure in the Data Warehouse schema that fits this data. The RegisteredUser structure in the Profiles category of the Data Warehouse logical schema is the closest match.

The RegisteredUser class contains data members for title (UserTitle), name (FirstName, LastName), and e-mail address (Email). It does not contain data members for city, state, or favorite ice cream flavor. The following table shows the match between the log data and the Data Warehouse structure.

Source data                  RegisteredUser data member
Unique user ID               UserID
Log data: Title              UserTitle
Log data: FullName           FirstName, LastName
Log data: Email_Address      Email
Log data: City               New Data Member: City
Log data: State              New Data Member: State
Log data: Fav_Flavor_Code    New Data Member: Fav_Flavor_Code
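The mapping in the table can be sketched in Python as shown below. The function name to_registered_user_fields is hypothetical, and the rule for splitting FullName (everything before the first space is the first name) is an assumption made for illustration; real name data may need a more careful rule. UserID does not come from the log data and is left out of this sketch.

```python
def to_registered_user_fields(rec):
    """Map one log record to RegisteredUser data member names."""
    # Assumption: the first space separates the first name from the last name.
    first, _, last = rec["FullName"].strip().partition(" ")
    return {
        "UserTitle": rec["Title"],
        "FirstName": first,
        "LastName": last.strip(),
        "Email": rec["Email_Address"],
        # The three members below are new data members, not existing ones.
        "City": rec["City"],
        "State": rec["State"],
        "Fav_Flavor_Code": rec["Fav_Flavor_Code"],
    }

# Hypothetical log record containing only the required properties.
log_record = {
    "Title": "Mr",
    "FullName": "Kevin Verboort",
    "Email_Address": "someone@microsoft.com",
    "City": "Everytown",
    "State": "WA",
    "Fav_Flavor_Code": 34,
}
user_fields = to_registered_user_fields(log_record)
```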

The final required data member is the unique key for the RegisteredUser class. This key is the UserID data member, which is a universally unique identifier (UUID). To put custom data into the Data Warehouse, this key must be correctly assigned. With the key included, the data required to create valid RegisteredUser objects that contain the necessary information is complete.
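One way to generate such keys is sketched below in Python, using the standard uuid module. The function name assign_user_ids is hypothetical, and the plain string form of the UUID is an assumption; the exact representation the Data Warehouse expects for this key is not covered here.

```python
import uuid

def assign_user_ids(users):
    """Return copies of the user dictionaries, each with a unique UserID."""
    keyed = []
    for user in users:
        with_key = dict(user)  # copy, so the input record is left unchanged
        # Assumption: a random (version 4) UUID in string form is acceptable.
        with_key["UserID"] = str(uuid.uuid4())
        keyed.append(with_key)
    return keyed

# Hypothetical user records without keys.
users = [
    {"FirstName": "Kevin", "LastName": "Verboort"},
    {"FirstName": "Some", "LastName": "One"},
]
keyed_users = assign_user_ids(users)
```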

For the next step, see Preparing the Data Warehouse.

See Also

Other Resources

Importing Custom Data into the Data Warehouse