DataSource PIM Feed User Guide
1. DataSource PIM Feed
1.1 Summary
A growing number of DataSource customers need to ingest DataSource's structured content into enterprise Master Data Management (MDM) or Product Information Management (PIM) solutions. These solutions support effective content ingestion and alignment, and publish a single "reference" view of a product's technical, marketing, or multimedia information.
The classic DataSource product data feed has limitations that made it challenging to import and manage in PIMs:
- Not all product characteristics were delivered in their most granular/atomic form
- DataSource Specifications are the result of a publishing process that aggregates multiple atomic attributes, so they could not abide by PIM best practices. Moreover, their underlying keys are not the same across languages and may change over time.
- Limited adherence to international unit-of-measure standards
- Lack of attribute ordering information
The DataSource PIM Feed provides all product technical characteristics in "granular atomic format", i.e. the lowest granularity at which some characteristic can be captured. This document describes the format of the DataSource PIM Feed option and compares it to the standard DataSource export.
1.2 PIM Feed vs. Standard DataSource Feed
The 1WorldSync Standard DataSource feed (aka Download.zip) differentiates between Searchable Attributes and Publishing Components:
- Searchable Attributes: A subset of atomic attributes selected by 1WorldSync for each of the categories and delivered separately in atomic format. Searchable Attributes are intended to power parametric search, product comparison and custom categorization.
- Publishing Components: The Standardized Description, Short Product Name, Main Specifications, and Extended Specifications contain ready-for-publishing product specifications built by aggregating specific atomic data elements. Most of the atomic technical attributes captured for a product are included in the Extended Specifications.
All Publishing Components are generated out of atomic technical attributes:
Example: Lenovo ThinkPad X1 Carbon 20A7 - Ultrabook - Core i7 4600U / 2.1 GHz - Windows 8.1 Pro 64-bit - 8 GB RAM - 180 GB SSD - no optical drive - 14" touchscreen 2560 x 1440 ( WQHD ) - Intel HD Graphics 4400
Structure: [Product Line] [Model] - [Notebook Type] - [Processor Type] [Processor Number] / [Clock Speed] - [Operating System] - [RAM Size] RAM - [HDD Capacity] SSD - [Optical Drive] - [Diagonal Size] [Touchscreen] [Display Resolution] ( [Resolution Abbreviation] ) - [Graphics Processor]
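The relationship between the structure and the example can be illustrated with a minimal sketch that fills the bracketed placeholders from a dictionary of atomic attributes. The attribute values, and in particular the split between [Product Line] and [Model], are assumptions made for illustration; the actual 1WorldSync publishing rules are not described in this guide.

```python
import re

# The structure string from the example above, used verbatim as a template.
TEMPLATE = (
    "[Product Line] [Model] - [Notebook Type] - [Processor Type] "
    "[Processor Number] / [Clock Speed] - [Operating System] - [RAM Size] RAM - "
    "[HDD Capacity] SSD - [Optical Drive] - [Diagonal Size] [Touchscreen] "
    "[Display Resolution] ( [Resolution Abbreviation] ) - [Graphics Processor]"
)

# Hypothetical atomic attribute values for the notebook in the example.
ATTRS = {
    "Product Line": "Lenovo ThinkPad X1 Carbon",  # assumed split
    "Model": "20A7",                              # assumed split
    "Notebook Type": "Ultrabook",
    "Processor Type": "Core i7",
    "Processor Number": "4600U",
    "Clock Speed": "2.1 GHz",
    "Operating System": "Windows 8.1 Pro 64-bit",
    "RAM Size": "8 GB",
    "HDD Capacity": "180 GB",
    "Optical Drive": "no optical drive",
    "Diagonal Size": '14"',
    "Touchscreen": "touchscreen",
    "Display Resolution": "2560 x 1440",
    "Resolution Abbreviation": "WQHD",
    "Graphics Processor": "Intel HD Graphics 4400",
}


def compose(template, attrs):
    """Replace each [Attribute Name] placeholder with its atomic value."""
    return re.sub(r"\[([^\]]+)\]", lambda m: attrs[m.group(1)], template)


description = compose(TEMPLATE, ATTRS)
```

With these inputs, `description` reproduces the example string shown above.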
DataSource with the PIM Feed option is upwardly compatible with the standard DataSource export.
The PIM feed is a variation of the DataSource feed in which all attributes are delivered as they were initially captured by 1WorldSync, augmented with additional model information that helps answer common questions:
- Is it an integer attribute?
- Can it have multiple values?
- What is this unit's relationship to existing standards?
- In which order should I display a product's attributes?
- How can I compose higher level attributes from granular attributes?
The PIM feed also gives access to new attributes that DataSource could not deliver in the past. For example, a TV's connectivity information used to be shipped in the specifications (3 x USB 2.0) but not entirely as attributes. This is now possible with the definition of new "secondary" attributes, in this case "USB Ports Qty".
The table below highlights the main differences between the Standard and PIM feeds:
Area | PIM Feed | Standard DataSource Feed |
---|---|---|
Publishing Components | DataSource customers might choose to use 1WorldSync-generated product specifications or create specification display rules on their own. | DataSource feed contains generated technical specifications (the Publishing Components described earlier in this section). |
Searchable Attributes | All product attributes are delivered in atomic/granular format. | Only a small subset (up to 25%) of attributes is delivered in atomic/granular format. The rest of the attributes are delivered as part of technical specifications. |
Data Model Maintenance | High data model maintenance cost for the customer, as each DataSource attribute needs to be associated with an attribute in the customer's taxonomy. PIM feed contains full model metadata and attribute life-cycle information. | Low data model maintenance cost for the customer: only key attributes need to be mapped (mainly for search facets and categorization). DataSource specifications are intended to be used as indivisible, ready-for-publishing data components. |
1.3 Downsides of PIM Feed
The DataSource Product Data Model (PDM) is constantly evolving. The 1WorldSync team keeps adding new attributes and values, deactivating old or obsolete PDM elements, replacing value lists, and so on.
A relatively small subset of Managed attributes (mainly searchable ones) changes only after advance notice. Any unmanaged attribute, however, can change at any time.
Depending on the level and complexity of the integration and the ecommerce website, any DataSource customer using the PIM Feed (or the older Legacy Atomic Feed) needs to track all PDM changes and update their integration rules accordingly. Otherwise, some custom categorization, product search, or attribute display rules might deteriorate over time.
Bottom line: the PIM Feed provides extra flexibility around data structure and format customization, but it requires constant monitoring and maintenance from the DataSource customers who choose to take advantage of it.
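One way to track such changes is to compare two consecutive deliveries of a full (non-incremental) metadata table. The sketch below assumes each snapshot has already been loaded into a dictionary mapping AtrID to the IsDisabled flag from cds_PIM_atr_order; the function name and the attribute IDs are hypothetical.

```python
def diff_model(old, new):
    """Compare two snapshots of {AtrID: IsDisabled} and report model changes."""
    shared = old.keys() & new.keys()
    return {
        # Attributes present only in the newer delivery.
        "added": sorted(new.keys() - old.keys()),
        # Attributes that no longer ship at all.
        "removed": sorted(old.keys() - new.keys()),
        # Attributes that were active before and are flagged as disabled now.
        "deactivated": sorted(a for a in shared if not old[a] and new[a]),
    }


# Hypothetical snapshots from two consecutive deliveries.
previous = {"A00001": 0, "A00002": 0, "A00003": 0}
current = {"A00001": 0, "A00002": 1, "A00004": 0}
changes = diff_model(previous, current)
```

Each reported attribute can then be checked against the integration rules (categorization, search facets, display rules) that reference it.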
1.4 PIM Feed vs. Legacy Atomic Feed
Area | PIM Feed | Legacy Atomic Feed |
---|---|---|
Attribute Display | PIM Feed provides ordering for Attribute Groups and for Attributes within each group. This order can be used for technical specifications display, side-by-side product comparison, and parametric search. | Atomic feed does not contain attribute order, which makes it difficult for DataSource customers to determine the logical display order for the attributes in product specifications. |
Attribute Groups | Attribute names and attribute group names are provided separately. | Attribute names include attribute group names, which requires additional logic on the reseller side to cleanse and simplify them. |
Composable Attributes | PIM Feed contains both Composable attributes and the Composed attributes derived from them. Therefore, integrating Composable attributes is no longer required. | Most PIM systems cannot ingest Composable attributes (repeating attribute sets with a SetNo field). |
Special Attributes | PIM Feed clearly identifies the attributes to be displayed in product specifications. | Some service attributes are not intended to be displayed in product specs. They might confuse end users if included in the specs. |
Duplicated Attributes | PIM Feed clearly identifies the attributes to be displayed in product specifications. | Some characteristics are duplicated in two or three different attributes intended for different purposes: parametric search, product title, or specs. |
Deactivated Attributes | PIM Feed clearly differentiates deactivated attributes. | Some attributes in the feed are deactivated. Atomic Feed does not differentiate active and deactivated/legacy attributes. |
1.5 Tables
cds_PIM_atr | Main table with all product attributes in atomic format for each SKU (product) in the data feed. |
cds_PIM_atr_composable | Additional table containing just Composable attributes with set number (SetNo) field. |
cds_PIM_atr_voc | Vocabulary table with all attribute group names, attribute names, values and units. |
cds_PIM_atr_standard_units | Table with conversion rules from DataSource units to the units from various international standards. |
cds_PIM_atr_order | Meta-data table with attribute order and attribute groups for each category. |
cds_PIM_atr_model | Meta-data table with data model structure: attribute types, attribute grouping and value list format. |
1.6 Data Model Elements
Category | E.g. “Notebooks” |
Attribute Group | E.g. “Processor” |
Attribute | E.g. “Clock Speed” |
Value | E.g. “2.7” |
Unit | E.g. “GHz” |
Unit List | E.g. “Frequency Units” |
NNV Value | The NNV (normalized numerical value) is generated for numeric (integer or float) attributes. If an attribute does not have a unit list, the NNV is the same as the attribute value, only without comma thousand separators. If an attribute has a unit list, the NNV is expressed in the base unit of that list (the lowest unit denominator). E.g. “2700000000”, expressed in the Hz base unit, for 2.7 GHz. |
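The NNV rule can be sketched as follows. This is an illustration only, not the way 1WorldSync computes the value: the function name and the `base_multiplier` parameter are hypothetical, and the factor of 1,000,000,000 is simply the number of Hz in one GHz.

```python
from decimal import Decimal


def nnv(value, base_multiplier=None):
    """Normalize a displayed numeric value.

    value           -- attribute value as delivered, e.g. "61,380" or "2.7"
    base_multiplier -- factor that converts the value's unit to the base
                       unit of its unit list; None when there is no unit list
    """
    number = Decimal(value.replace(",", ""))  # drop comma thousand separators
    if base_multiplier is not None:
        number *= Decimal(base_multiplier)    # express in the base unit
    return number


clock_speed_hz = nnv("2.7", "1000000000")  # 2.7 GHz expressed in Hz
plain_count = nnv("61,380")                # no unit list: separators removed
```

`Decimal` is used here instead of `float` so that values such as 2.7 are scaled without binary rounding artifacts.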
1.7 Attribute Types
Single-value | The Attribute can have only one value for any given SKU. |
Multi-value | The Attribute can have multiple values for the same SKU. |
Composable | The Attribute belongs to a Composable Attribute Group that is repeated multiple times for the same SKU. For example, the Interfaces Attribute Group has multiple attributes: Type (USB 3.0), Qty (3), Location (in front). The entire attribute set is then repeated for the headphones port, the HDMI port, etc. |
Composed | Composed attributes are text multi-value attributes generated from Composable attributes. |
1.8 Attribute Format
Text | Text attributes always have controlled, standardized value lists. The length of the values is limited to 255 characters. |
Integer | Some integer attributes have comma thousand separators for display purposes. E.g. “5000”, “14000”, “61,380”, “921,600”. |
Float | Float attributes use a period as the decimal separator. Some float attributes have comma thousand separators. E.g. “44.1”, “28.8”, “2,646.00”, “2,000.00”. |
Boolean | Boolean attributes typically have “yes” or “no” values (consistently translated to supported languages). Some Boolean attributes have only one value, “yes”, and are left empty if the feature does not exist or the attribute is not applicable. In addition, there are a few service Boolean attributes with “true” and “false” values. |
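A consumer typically needs to turn these display strings into typed values. The following is a minimal sketch under two assumptions: the format is passed as a plain string name (the feed's actual ValueType codes are not listed here), and Boolean values are in English, whereas the feed translates “yes” and “no” per language.

```python
def parse_value(raw, fmt):
    """Convert a delivered attribute value into a typed Python value."""
    if fmt == "integer":
        return int(raw.replace(",", ""))    # "921,600" -> 921600
    if fmt == "float":
        return float(raw.replace(",", ""))  # "2,646.00" -> 2646.0
    if fmt == "boolean":
        text = raw.strip().lower()
        if text == "":
            return None                     # empty: feature absent or not applicable
        if text in ("yes", "true"):
            return True
        if text in ("no", "false"):
            return False
        raise ValueError(f"unexpected Boolean value: {raw!r}")
    return raw                              # text values are kept as delivered
```

Returning `None` for an empty Boolean keeps "not applicable" distinct from an explicit “no”, which matters for the attributes that only ever carry “yes”.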
2. Export Schema
3. Feed Format
3.1 Feed Encoding
All tables are available in UTF-8, UTF-16, and Western European Windows code page encodings.
If a customer is subscribed to the Windows encoding, Asian and Eastern European translations will not be included in the vocabulary file.
3.2 cds_PIM_atr
Column | Type | DB | Description |
---|---|---|---|
ProdID | Varchar(40) | PK | Product ID |
AtrID | Varchar(10) | PK | Attribute ID |
ValID | Varchar(10) | PK | Value ID |
UnitID | Varchar(10) | NULL | Unit ID |
NNV | Float | NULL | Normalized numerical value, expressed in the attribute's base unit |
This table is incremental and synchronized with the cds_Prod increment.
3.3 cds_PIM_atr_composable
Column | Type | DB | Description |
---|---|---|---|
ProdID | Varchar(40) | PK | Product ID |
AtrID | Varchar(10) | PK | Attribute ID of Composable Attributes |
ValID | Varchar(10) | PK | Value ID |
UnitID | Varchar(10) | NULL | Unit ID |
NNV | Float | NULL | Normalized numerical value, expressed in the attribute's base unit |
SetNo | Integer | PK | Number of attribute set in a Composable Attribute Group |
This table is incremental and synchronized with the cds_Prod increment.
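The role of SetNo can be illustrated by grouping Composable rows into sets and rendering each set as one text value, similar to how Composed attributes present a whole attribute group. The sketch assumes that attribute and value IDs have already been resolved to names, and the "Qty x Type (Location)" display rule is an assumption; the actual composition rules are defined by 1WorldSync.

```python
from collections import defaultdict


def compose_sets(rows):
    """Group (SetNo, attribute name, value) rows and render one text per set."""
    sets = defaultdict(dict)
    for set_no, attr_name, value in rows:
        sets[set_no][attr_name] = value

    composed = []
    for set_no in sorted(sets):
        attrs = sets[set_no]
        text = f"{attrs['Qty']} x {attrs['Type']}"  # assumed display rule
        if "Location" in attrs:
            text += f" ({attrs['Location']})"
        composed.append(text)
    return composed


# Hypothetical Interfaces group for one SKU: two attribute sets.
interface_rows = [
    (1, "Type", "USB 3.0"), (1, "Qty", "3"), (1, "Location", "in front"),
    (2, "Type", "HDMI"), (2, "Qty", "1"),
]
interfaces = compose_sets(interface_rows)
```

Here `interfaces` holds one entry per set, which is the multi-value shape that most PIM systems can ingest directly.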
3.4 cds_PIM_atr_voc
Column | Type | DB | Description |
---|---|---|---|
ID | Varchar(10) | PK | ID of attribute group, attribute, value, unit list or unit |
LanguageCode | Varchar(20) | PK | Same codes as we use in digital content and ACC. This field refers to the languages defined in Cds_Languages table. |
Text | Nvarchar(2000) | | Textual value for attribute name, value, unit-group name or unit name |
ShortText | Nvarchar(255) | NULL | Short value when it exists |
This table is incremental and synchronized with the cds_Prod increment.
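Because the vocabulary table keys every attribute, value, and unit by a single ID, product rows can be turned into readable text by joining cds_PIM_atr to cds_PIM_atr_voc once per ID column. The sketch below uses an in-memory SQLite database; the sample IDs, the language code "en", and the simplified column types are assumptions for illustration.

```python
import sqlite3


def build_demo_db():
    """Create the two tables with simplified types and one sample product row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE cds_PIM_atr (
            ProdID TEXT, AtrID TEXT, ValID TEXT, UnitID TEXT, NNV REAL,
            PRIMARY KEY (ProdID, AtrID, ValID)
        );
        CREATE TABLE cds_PIM_atr_voc (
            ID TEXT, LanguageCode TEXT, "Text" TEXT, ShortText TEXT,
            PRIMARY KEY (ID, LanguageCode)
        );
        """
    )
    conn.execute(
        "INSERT INTO cds_PIM_atr VALUES ('S1', 'A1', 'V1', 'U1', 2700000000)"
    )
    conn.executemany(
        "INSERT INTO cds_PIM_atr_voc VALUES (?, ?, ?, ?)",
        [
            ("A1", "en", "Clock Speed", None),
            ("V1", "en", "2.7", None),
            ("U1", "en", "GHz", None),
        ],
    )
    return conn


def resolve(conn, prod_id, lang):
    """Return (attribute name, value, unit) rows for one product and language."""
    sql = """
        SELECT an."Text", vn."Text", un."Text"
        FROM cds_PIM_atr AS p
        JOIN cds_PIM_atr_voc AS an ON an.ID = p.AtrID AND an.LanguageCode = :lang
        JOIN cds_PIM_atr_voc AS vn ON vn.ID = p.ValID AND vn.LanguageCode = :lang
        LEFT JOIN cds_PIM_atr_voc AS un ON un.ID = p.UnitID AND un.LanguageCode = :lang
        WHERE p.ProdID = :prod
    """
    return conn.execute(sql, {"lang": lang, "prod": prod_id}).fetchall()
```

The unit join is a LEFT JOIN because UnitID is nullable; attributes without a unit still return a row with an empty unit.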
3.5 cds_PIM_atr_standard_units
This table maps DataSource units to standards for units of measure (ISO, UNECE). Note that several DataSource Unit IDs might be mapped to a single standard record, because 1WorldSync creates units based on a capture context. For example, length units for information captured on an appliance (fridge) belong to different unit lists than those captured for a cable or laptop.
Column | Type | DB | Description |
---|---|---|---|
UnitID | Varchar(10) | PK | DataSource Unit ID |
UnitListID | Varchar(10) | | DataSource Unit List ID |
BaseUnitID | Varchar(10) | | Base DataSource Unit ID |
StandardTitle | Varchar(200) | PK | Title of a standard or a system of units |
StandardUnitName | Varchar(100) | NULL | Standard unit name |
StandartUnitCode | Varchar(50) | NULL | Standard unit code |
Description | Varchar(2000) | NULL | English description of the unit |
ConversionMultiplier | Float | | Conversion factor between the unit and its corresponding base-unit |
ConversionIsDivision | Bit | | Conversion requires dividing ConversionMultiplier by the attribute value. E.g. conversion of car fuel consumption between l/100km and mpg. |
ConversionAddition | Float | NULL | Constant that needs to be added to converted value. E.g. conversion from F to C. |
This table is non-incremental: always delivered in full.
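The three conversion columns can be combined as in the sketch below. The order of operations follows the column descriptions (multiply or divide first, then add the constant); the direction of conversion (from the unit to its base unit) and the sample factors are assumptions, since the actual values shipped in the feed are not shown here.

```python
def convert(value, multiplier, is_division=False, addition=None):
    """Apply ConversionMultiplier, ConversionIsDivision and ConversionAddition."""
    if is_division:
        converted = multiplier / value  # e.g. l/100km <-> mpg
    else:
        converted = value * multiplier  # e.g. GHz -> Hz
    if addition is not None:
        converted += addition           # e.g. the offset in F -> C
    return converted


# Illustrative factors (assumed, not taken from the feed):
hertz = convert(2.7, 1e9)                              # 2.7 GHz in Hz
celsius = convert(212, 5 / 9, addition=-160 / 9)       # 212 F in C, about 100
us_mpg = convert(10, 235.215, is_division=True)        # 10 l/100km, about 23.5 mpg
```

The Fahrenheit case shows why the addition is applied after the multiplication: (F - 32) x 5/9 is rewritten as F x 5/9 - 160/9.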
3.6 cds_PIM_atr_order
Column | Type | DB | Description |
---|---|---|---|
CattID | Char(2) | PK | DataSource 2-letter category code (classic categories) |
AtrID | Varchar(10) | PK | Attribute ID |
GroupID | Varchar(10) | | Attribute Group ID |
GroupOrder | Integer | | Display order for attribute group |
AtrOrder | Integer | | Display order for attribute within group |
DisplayInTitle | Bit | | When 1, the attribute is recommended for display in the product title; 0 otherwise. This field is based on inclusion of the attribute in the Standardized Description. |
DisplayInMainSpecs | Bit | | When 1, the attribute is recommended for display in Main Specifications; 0 otherwise. |
DisplayInExtendedSpecs | Bit | | When 1, the attribute is recommended for display in Extended Specifications; 0 otherwise. |
IsManaged | Bit | | When 1, this attribute will be part of model change communications when introduced or deactivated. An attribute is managed when it is searchable. |
IsDisabled | Bit | | When 1, the attribute has been disabled and may not ship in the future. 0 for active attributes. |
This table is non-incremental: always delivered in full.
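A display order for one category can be derived from this table by filtering on a display flag and sorting by GroupOrder, then AtrOrder. The sketch assumes the rows have been loaded as dictionaries keyed by column name; skipping disabled attributes is a design choice made here, not a rule stated by the feed, and the sample IDs are hypothetical.

```python
def display_order(order_rows, category, flag="DisplayInMainSpecs"):
    """Return (GroupID, AtrID) pairs for a category in recommended order."""
    rows = [
        r for r in order_rows
        if r["CattID"] == category and r[flag] and not r["IsDisabled"]
    ]
    rows.sort(key=lambda r: (r["GroupOrder"], r["AtrOrder"]))
    return [(r["GroupID"], r["AtrID"]) for r in rows]


def _row(catt, atr, group, group_order, atr_order, main=1, disabled=0):
    """Helper to build one hypothetical cds_PIM_atr_order row."""
    return {
        "CattID": catt, "AtrID": atr, "GroupID": group,
        "GroupOrder": group_order, "AtrOrder": atr_order,
        "DisplayInMainSpecs": main, "IsDisabled": disabled,
    }


SAMPLE_ROWS = [
    _row("AB", "A3", "G2", 2, 1),
    _row("AB", "A2", "G1", 1, 2),
    _row("AB", "A1", "G1", 1, 1),
    _row("AB", "A4", "G1", 1, 3, disabled=1),  # disabled: skipped
    _row("AB", "A5", "G2", 2, 2, main=0),      # not recommended for main specs
    _row("XY", "A1", "G1", 1, 1),              # different category
]
ordered = display_order(SAMPLE_ROWS, "AB")
```

The same function can be pointed at DisplayInTitle or DisplayInExtendedSpecs by passing a different `flag`, provided those keys are present in the loaded rows.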
3.7 cds_PIM_atr_model
Column | Type | DB | Description |
---|---|---|---|
AtrID | Varchar(10) | PK | Attribute ID |
GroupID | Varchar(10) | | Attribute Group ID |
AttributeType | Varchar(50) | | Attribute type: single-value or multi-value. This field is generated based on attribute type. All Composable attributes are single-value. All Composed attributes are multi-value. |
ValueType | Integer | | Attribute value format (integer, float, text, Boolean) |
IsComposable | Bit | | Composable attributes (aka repeating attributes) are the attributes from repeating attribute groups used to generate Composed attributes. |
IsComposed | Bit | | Composed attributes are synthetic attributes generated based on Composable Attribute Group display rules in Extended Specifications, when an entire attribute group is presented as one multi-value attribute. |
IsLocalized | Bit | | Indicates whether the attribute is localized. If 0, all values for this attribute translate the same way. If at least one of the attribute's values or units is translatable, the attribute is localized. |
This table is non-incremental: always delivered in full.
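The AttributeType column can drive how a product's values are shaped for a target PIM: a scalar for single-value attributes and a list for multi-value ones. The sketch below assumes the type is stored as the literal strings "single-value" and "multi-value" (the exact stored text is not specified here) and that values have already been resolved to text.

```python
def shape_values(product_rows, attribute_types):
    """Build {AtrID: value or list of values} for one product.

    product_rows    -- iterable of (AtrID, value) pairs for a single ProdID
    attribute_types -- {AtrID: AttributeType} loaded from cds_PIM_atr_model
    """
    shaped = {}
    for atr_id, value in product_rows:
        if attribute_types.get(atr_id) == "multi-value":
            shaped.setdefault(atr_id, []).append(value)
        elif atr_id in shaped:
            # A second value for a single-value attribute signals a data
            # or model mismatch that is better surfaced than hidden.
            raise ValueError(f"multiple values for single-value attribute {atr_id}")
        else:
            shaped[atr_id] = value
    return shaped


# Hypothetical attribute IDs and values.
types = {"A1": "single-value", "A2": "multi-value"}
rows = [("A1", "2.7"), ("A2", "USB 3.0"), ("A2", "HDMI")]
shaped = shape_values(rows, types)
```

Attributes missing from the model are treated as single-value in this sketch; a stricter integration might reject them instead.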