
Source file having duplicate records, need to remove them through a TI process

Posted: Tue May 11, 2021 1:54 am
by SBK88
Hi All,

I need your help.
I have a source CSV file (containing a few duplicate records) that needs to be processed in TM1.
Before that, I want to remove all duplicates through a TI process.
How can I achieve this?

My source file has approximately 10 columns:
C1, C2, C3 ...... C10

While checking manually I found that a few records are duplicated, so I want to automate the removal of duplicate records through a TI process before loading the final file into my TM1 system.

Need urgent help on this.

Thanks in advance.

Re: Source file having duplicate records, need to remove them through a TI process

Posted: Tue May 11, 2021 3:46 am
by Alan Kirk
SBK88 wrote: Tue May 11, 2021 1:54 am I have a source CSV file (containing a few duplicate records) that needs to be processed in TM1.
Before that, I want to remove all duplicates through a TI process.
How can I achieve this?

My source file has approximately 10 columns:
C1, C2, C3 ...... C10

While checking manually I found that a few records are duplicated, so I want to automate the removal of duplicate records through a TI process before loading the final file into my TM1 system.

Need urgent help on this.
This question lacks a few obvious pieces of information, such as:
(a) Does the source data have a unique key, and if not, why not? There are various ways that you could track the keys to see whether they have already been loaded (see the sketch after this list). More importantly...
(b) Are you aggregating the data? If not, then it makes no difference whether the row is duplicated, since you'd be writing the same values anyway. But the ultimate "don't raise the bridge, lower the river" issue...
(c) Have you looked at having the data source corrected so that it no longer contains duplicate data, as a result of which the entire issue becomes academic?
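
As a rough sketch of (a), assuming a tracking dimension 'zLoadedKeys' that you create and maintain yourself, the data tab of the load could look something like this (C1 and C2 are placeholder field names from your example):

# Build a key from the fields that identify a record
vKey = C1 | '~' | C2 ;
# Skip the record if this key has already been loaded
IF ( DIMIX ( 'zLoadedKeys', vKey ) > 0 ) ;
   ItemSkip ;
ENDIF ;
# Direct insert so that DIMIX sees the new key immediately in this run
DimensionElementInsertDirect ( 'zLoadedKeys', '', vKey, 'N' ) ;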

Re: Source file having duplicate records, need to remove them through a TI process

Posted: Tue May 11, 2021 8:32 am
by lotsaram
SBK88 wrote: Tue May 11, 2021 1:54 am I have a source CSV file (containing a few duplicate records) that needs to be processed in TM1.
Before that, I want to remove all duplicates through a TI process.
How can I achieve this?
Well, honestly, with great difficulty.
This is really not what TurboIntegrator is for. Source files need to be cleaned up and put into the correct tabular format before being passed to TI.
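
If the clean-up must be triggered from TM1 anyway, one option is to de-duplicate the file before TI reads it, for example by shelling out in the prolog. A sketch, assuming a Windows TM1 server and placeholder file paths:

# Prolog: remove duplicate lines before the file is used as a data source.
# Note: Sort-Object also reorders all lines, including any header row.
ExecuteCommand ( 'powershell -command "Get-Content C:\Temp\source.csv | Sort-Object -Unique | Set-Content C:\Temp\source_clean.csv"', 1 ) ;

The second argument (1) makes TI wait for the command to finish before continuing.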

However, are duplicate records even a problem at all? If you are "single point loading" with CellPutN then you're just overwriting the same data, so... no problem. If you are aggregating the data with CellIncrementN then you have a problem and need to clean up the source file.
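
To make the difference concrete, a minimal sketch (cube and variable names are hypothetical):

# CellPutN overwrites: loading the same record twice leaves the same
# value in the cell, so duplicates are harmless.
CellPutN ( vAmount, 'Sales', vYear, vMonth, vAccount ) ;

# CellIncrementN adds to the existing value: the same record loaded
# twice is double-counted, so duplicates corrupt the result.
CellIncrementN ( vAmount, 'Sales', vYear, vMonth, vAccount ) ;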

Re: Source file having duplicate records, need to remove them through a TI process

Posted: Tue May 11, 2021 6:14 pm
by Adam
SBK88 wrote: Tue May 11, 2021 1:54 am Before that, I want to remove all duplicates through a TI process.
One additional thought, SBK88 - please describe what the expected outcome is. Perhaps through the TI process you're adding cost centers, materials, or other master data from your source system. In TM1 land, the outcome of that is adding elements to a dimension. You can check for dupes using DIMIX in that case.
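
A minimal sketch of that DIMIX check, assuming a metadata tab that adds cost centers to a hypothetical 'Cost Center' dimension:

# Only insert the element if it does not already exist in the dimension.
IF ( DIMIX ( 'Cost Center', vCostCenter ) = 0 ) ;
   DimensionElementInsert ( 'Cost Center', '', vCostCenter, 'N' ) ;
ENDIF ;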

Re: Source file having duplicate records, need to remove them through a TI process

Posted: Thu May 13, 2021 6:59 am
by SBK88
Thanks everyone for your replies.

My actual issue is this:
I have a cube (view screenshot attached below).
[screenshot of the cube view]

I have exported the data to a CSV file using the following code.
To remove the duplicates in the extracted file, I tried to make sure consolidated-level data is suppressed in the PROLOG tab, but I have no such option for the MEASURE dimension, as the consolidated-level data is 0 in most of the parent elements.

Below is my code.

====PROLOG CODE====

# cSourceCube, cView, cSubset and pDebug are defined earlier in the
# prolog (or passed in as parameters) and are omitted here for brevity.
cDim = 'Year & Annual & Scenario & Company & MeasureActualAlloc' ;
cProcess = 'Bedrock.Cube.ViewAndSubsets.Create' ;

# Build a zero-suppressed, leaf-only view over the source cube.
ExecuteProcess ( cProcess,
'pCube', cSourceCube,
'pSuppressZero', 1,
'pSuppressConsol', 1,
'pSuppressRules', 1,
'pDimensions', cDim,
'pDelimiter', '&',
'pView', cView,
'pSubset', cSubset,
'pDebug', pDebug
) ;


# Restrict each temp subset to the single slice being exported.
cDimName = 'Year' ;
SubsetElementInsert ( cDimName, cSubset, pYear, 1 ) ;

cDimName = 'Annual' ;
SubsetElementInsert ( cDimName, cSubset, pMonth, 1 ) ;

cDimName = 'Company' ;
SubsetElementInsert ( cDimName, cSubset, pCompany, 1 ) ;

cDimName = 'Scenario' ;
SubsetElementInsert ( cDimName, cSubset, 'ACTUAL', 1 ) ;

cDimName = 'MeasureActualAlloc' ;
SubsetElementInsert ( cDimName, cSubset, 'Source_Share_in_%', 1 ) ;
#SubsetElementInsert ( cDimName, cSubset, 'Source (from BX72)', 1 ) ;




====DATA TAB CODE====

# Write the header row once: cHeader is initialised in the prolog and
# decremented so this block only runs for the first record.
IF ( cHeader > 0 ) ;

ASCIIOUTPUT ( cFileName, 'Source_GL_Account', 'Source_Share_In_%', 'Source_CostCenter', 'Source_PG',
'Map_BaseLineAccount', 'Map_Channel', 'Map_Customer', 'Map_SoldTo', 'Map_ProfitCentre', 'Map_PG', 'Map_Source',
'Target_Channel', 'Target_PG', 'Target_Customer', 'Target_SoldTo', 'Target_ProfitCentre', 'Target_Source' ) ;

cHeader = cHeader - 1 ;
ENDIF ;



# vLine is the element from the remaining, unrestricted dimension of the
# source view (the record driver). Note: '%' is not a valid character in
# a TI variable name, so the share value is read into Source_Share_In_Pct.
Source_GL_Account = CellGetS ( cSourceCube, pYear, pMonth, pScenario, pCompany, vLine, 'Source_GLAccount' ) ;
Source_Share_In_Pct = CellGetN ( cSourceCube, pYear, pMonth, pScenario, pCompany, vLine, 'Source_Share_In_%' ) ;
Source_CostCenter = CellGetS ( cSourceCube, pYear, pMonth, pScenario, pCompany, vLine, 'Source_CostCenter' ) ;
Source_PG = CellGetS ( cSourceCube, pYear, pMonth, pScenario, pCompany, vLine, 'Source_PG' ) ;

Map_BaseLineAccount = CellGetS ( cSourceCube, pYear, pMonth, pScenario, pCompany, vLine, 'Map_BaseLineAccount' ) ;
Map_Channel = CellGetS ( cSourceCube, pYear, pMonth, pScenario, pCompany, vLine, 'Map_Channel' ) ;
Map_Customer = CellGetS ( cSourceCube, pYear, pMonth, pScenario, pCompany, vLine, 'Map_Customer' ) ;
Map_SoldTo = CellGetS ( cSourceCube, pYear, pMonth, pScenario, pCompany, vLine, 'Map_SoldTo' ) ;
Map_ProfitCentre = CellGetS ( cSourceCube, pYear, pMonth, pScenario, pCompany, vLine, 'Map_ProfitCentre' ) ;
Map_PG = CellGetS ( cSourceCube, pYear, pMonth, pScenario, pCompany, vLine, 'Map_PG' ) ;
Map_Source = CellGetS ( cSourceCube, pYear, pMonth, pScenario, pCompany, vLine, 'Map_Source' ) ;

Target_Channel = CellGetS ( cSourceCube, pYear, pMonth, pScenario, pCompany, vLine, 'Target_Channel' ) ;
Target_PG = CellGetS ( cSourceCube, pYear, pMonth, pScenario, pCompany, vLine, 'Target_PG' ) ;
Target_Customer = CellGetS ( cSourceCube, pYear, pMonth, pScenario, pCompany, vLine, 'Target_Customer' ) ;
Target_SoldTo = CellGetS ( cSourceCube, pYear, pMonth, pScenario, pCompany, vLine, 'Target_SoldTo' ) ;
Target_ProfitCentre = CellGetS ( cSourceCube, pYear, pMonth, pScenario, pCompany, vLine, 'Target_ProfitCentre' ) ;
Target_Source = CellGetS ( cSourceCube, pYear, pMonth, pScenario, pCompany, vLine, 'Target_Source' ) ;



######### Extracting cube data into CSV format #########

ASCIIOUTPUT ( cFileName, Source_GL_Account, NumberToString ( Source_Share_In_Pct ), Source_CostCenter, Source_PG,
Map_BaseLineAccount, Map_Channel, Map_Customer, Map_SoldTo, Map_ProfitCentre, Map_PG, Map_Source,
Target_Channel, Target_PG, Target_Customer, Target_SoldTo, Target_ProfitCentre, Target_Source ) ;




I am getting a few DUPLICATE records, so I was thinking of creating another sub-TI inside this one to remove them.

Please guide me on how to change the SOURCE VIEW or the DATA tab code so that only UNIQUE records are output.
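
The kind of check I was imagining at the top of the DATA tab is something like this (just a sketch, reusing the DIMIX idea mentioned earlier in the thread; 'zSeenKeys' would be a scratch dimension created in the prolog and destroyed in the epilog):

# Build a key from the fields that define a unique record
vKey = Source_GL_Account | '~' | Source_CostCenter | '~' | Source_PG ;
# Skip this record if the same key has already been written
IF ( DIMIX ( 'zSeenKeys', vKey ) > 0 ) ;
   ItemSkip ;
ENDIF ;
# Direct insert so DIMIX finds the key on the very next record
DimensionElementInsertDirect ( 'zSeenKeys', '', vKey, 'N' ) ;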

Many Thanks in Advance.

Re: Source file having duplicate records, need to remove them through a TI process

Posted: Thu May 13, 2021 10:13 am
by Wim Gielis
What is your definition of unique / duplicate records?
What is the expected outcome, and which records should not be exported, based on the screenshot?