Attributes Table

This table indicates the properties of each attribute used in the model, including the distribution type for the attribute and whether the attribute is used to build the model or is to be predicted by the model. Entries in the attributes table are required only when the default behavior needs to be overridden. By default, each attribute is both used as an input for prediction and predicted by the model, and its distribution is Autodetect. A blank attributes table is created if one is not specified.

Note

  • There is no table named Attributes. Each model configuration specifies the name of its attribute table as an entry in the PredictorDataTables table.
Column Name | Type | Description | Required?
PropID | DBTYPE_UI4 | Unique key (Property ID). | Yes.
ParentID | DBTYPE_UI4 | ID of parent (0 = root). | Yes.
Name | DBTYPE_WSTR | Name of attribute, hierarchy element, or Pivot Column. | Yes.
TableName | DBTYPE_WSTR | Not used. | No.
ColumnName | DBTYPE_WSTR | Not used. | No.
Distribution | DBTYPE_UI2 | Distribution type. In the values below, B means Model As Binary and NB means Not Model As Binary. Valid values are: 0 - Discrete, NB; 1 - Continuous, Normal, NB; 2 - Continuous, LogNormal, NB; 3 - Invalid; 4 - Discrete, Autodetect, B; 5 - Continuous, Normal, B; 6 - Continuous, LogNormal, B. A description of the different distribution types follows this table. | No. NULL implies Autodetect. Must be NULL if the row describes a hierarchy element.
Predict | DBTYPE_BOOL | Indicates whether this is an output property. Valid values are: True (Default) - Build a model that can predict this property; False - Do not predict this property. | No. NULL implies True. Must be NULL if the row describes a hierarchy element.
UseToPredict | DBTYPE_BOOL | Indicates whether this is an input property. Valid values are: True (Default) - Use as input property to predict other properties; False - Do not use this property to predict other properties. | No. NULL implies True. Must be NULL if the row describes a hierarchy element.
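
As a rough illustration of the values and defaults in the table, the following Python sketch maps a Distribution value to its label and applies the NULL defaults for Predict and UseToPredict. The names describe_distribution and effective_flags are hypothetical and are not part of any product API; None stands in for a database NULL, and the labels follow the Distribution row.

```python
# Labels follow the Distribution row of the table; 3 is documented as invalid.
DISTRIBUTION_LABELS = {
    0: "Discrete, NB",
    1: "Continuous, Normal, NB",
    2: "Continuous, LogNormal, NB",
    4: "Discrete, Autodetect, B",
    5: "Continuous, Normal, B",
    6: "Continuous, LogNormal, B",
}


def describe_distribution(code):
    """Return a readable label for a Distribution value (hypothetical helper).

    None represents NULL, which the table says implies Autodetect.
    """
    if code is None:
        return "Autodetect"
    if code not in DISTRIBUTION_LABELS:
        # 3 is listed as invalid; any other value is simply not documented.
        raise ValueError(f"invalid or undocumented Distribution value: {code}")
    return DISTRIBUTION_LABELS[code]


def effective_flags(predict=None, use_to_predict=None):
    """Apply the documented defaults: NULL implies True for both columns."""
    return (
        True if predict is None else bool(predict),
        True if use_to_predict is None else bool(use_to_predict),
    )
```

For example, a row that leaves all three optional columns NULL resolves to Autodetect with both flags True, which matches the default behavior described at the top of this topic.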

Distribution Attribute

  • Discrete vs. Continuous. Discrete means that only certain data values are legal and there is no specific relationship between sequential values. For example, the two-letter state abbreviations are discrete. All non-numeric attributes are treated as discrete. Numerical attributes may or may not be discrete.

  • Normal vs. LogNormal. Certain continuous data, such as Income, is better represented as a Normal (Gaussian) distribution, while other data, such as the number of products purchased, better fits a LogNormal distribution, which is skewed towards 0. An attribute has a LogNormal distribution when the logarithm of the attribute has a Normal distribution.

  • Autodetect. An algorithm can be used to detect the above properties automatically. For discrete attributes, the algorithm also eliminates those attributes that have only one distinct value or too many distinct values (over 500) to be useful. Note that "missing" counts as a distinct value.

  • Model As Binary. For a property that has the Model As Binary attribute, the model is only concerned with whether the property exists, not the value of the property. For example, for an attribute that corresponds to the number of products purchased, it may be more useful to model whether the product was purchased or not than to model the quantity purchased.
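
The ideas in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algorithm the product uses: the function names are hypothetical, None represents a missing value, and treating every non-missing value as "exists" is an assumption that the text does not settle (for example, whether a quantity of 0 counts as present).

```python
import math
import random
import statistics

MAX_DISTINCT = 500  # threshold quoted in the Autodetect item


def keep_discrete_attribute(values):
    """Return True if a discrete attribute has a useful number of distinct values.

    None (missing) counts as a distinct value, as the Autodetect item notes.
    Attributes with one distinct value or more than 500 are dropped.
    """
    distinct = len(set(values))
    return 1 < distinct <= MAX_DISTINCT


def model_as_binary(values):
    """Collapse values to presence flags (assumption: None means absent)."""
    return [v is not None for v in values]


def log_transform(values):
    """Take natural logs; LogNormal data becomes Normal after this step.

    All values must be strictly positive.
    """
    return [math.log(v) for v in values]


def demo_log_transform(seed=0, n=10000):
    """Draw LogNormal(0, 1) samples and return (mean, stdev) of their logs."""
    rng = random.Random(seed)
    logs = log_transform([rng.lognormvariate(0.0, 1.0) for _ in range(n)])
    return statistics.mean(logs), statistics.stdev(logs)
```

In demo_log_transform, the mean and standard deviation of the logged samples should land close to the 0 and 1 used to generate them, which is the sense in which the logarithm of a LogNormal attribute has a Normal distribution.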

