Hi,
I am using Cognos TM1 10.1.1 Fix Pack 2, 64-bit.
1. In TM1 Architect, right-click Processes and select Create New Process.
2. The Turbo Integrator dialog opens. Click ODBC.
3. There is a "Use Unicode" checkbox. What does this checkbox mean?
a) That the source data from the ODBC data source is in Unicode?
b) That the target data in the TM1 server (e.g. dimensions, members, attributes) is going to be stored as Unicode?
c) Something else?
I have always left this checkbox checked, but our company is talking about multi-language support (including additional alphabets such as Russian Cyrillic) in all of our solutions, so I am wondering what this checkbox actually does.
Additional question: what does "Unicode" mean in this dialog? Unicode has several encodings, such as UTF-8, UTF-16 and UTF-32. There are also variants such as "UTF-8 with BOM", "UTF-16 Big Endian" (with and without BOM), "UTF-16 Little Endian", etc.
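To make the difference between these variants concrete, here is a minimal Python sketch (not TM1 code) that prints the bytes of a single Cyrillic letter in several Unicode encodings, together with the BOM each encoding may put at the start of a file. The sample letter and the helper name hex_bytes are chosen only for illustration.

```python
import codecs

ch = "Ж"  # U+0416 CYRILLIC CAPITAL LETTER ZHE, an arbitrary sample letter


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated upper-case hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


# The same code point becomes a different byte sequence in each encoding.
encoded = {
    "UTF-8": ch.encode("utf-8"),
    "UTF-16 LE": ch.encode("utf-16-le"),
    "UTF-16 BE": ch.encode("utf-16-be"),
    "UTF-32 LE": ch.encode("utf-32-le"),
}

# Optional byte order marks that may precede the text in a file.
boms = {
    "UTF-8": codecs.BOM_UTF8,
    "UTF-16 LE": codecs.BOM_UTF16_LE,
    "UTF-16 BE": codecs.BOM_UTF16_BE,
    "UTF-32 LE": codecs.BOM_UTF32_LE,
}

for name, data in encoded.items():
    print(f"{name:10} letter: {hex_bytes(data):12} BOM: {hex_bytes(boms[name])}")
```

The point is only that "Unicode" alone does not say which bytes end up in a file or a database column; the encoding and the presence of a BOM do.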
Can you point me to an additional source where I can read the details?
Thanks
What does the "Use Unicode" checkbox mean in TM1 Architect Turbo Integrator?
-
- Posts: 31
- Joined: Tue Aug 20, 2013 5:53 am
- OLAP Product: TM1
- Version: 10.1.1
- Excel Version: -
- jim wood
- Site Admin
- Posts: 3958
- Joined: Wed May 14, 2008 1:51 pm
- OLAP Product: TM1
- Version: PA 2.0.7
- Excel Version: Office 365
- Location: 37 East 18th Street New York
- Contact:
Re: What does the "Use Unicode" checkbox mean in TM1 Architect Turbo Integrator?
Unicode is a way of storing text data. Some databases are set up as Unicode, some are not. TM1 became Unicode at version 9.4 (if memory serves). For more details go here:
https://en.wikipedia.org/wiki/Unicode
Struggling through the quagmire of life to reach the other side of who knows where.
Shop at Amazon
Jimbo PC Builds on YouTube
OS: Mac OS 11 PA Version: 2.0.7
Re: What does the "Use Unicode" checkbox mean in TM1 Architect Turbo Integrator?
Thank you for the answer. One additional related question: how can I specify in TM1 Architect that a source text file is encoded in UTF-8?
I created a simple text file containing a Cyrillic letter, and from the sample it is clear that the source file is interpreted as Windows-1250 (the default Windows server setting) instead of UTF-8. I can tell because a single two-byte UTF-8 character is shown in TM1 as two single-byte characters. See the attached picture. Is there a way to specify the encoding of the source file?
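The symptom described above can be reproduced outside TM1. The following Python sketch (an illustration only, with an arbitrary sample word) encodes Cyrillic text as UTF-8 and then decodes the same bytes as Windows-1250, which is roughly what a reader that assumes the system code page would do.

```python
text = "Привет"  # arbitrary Cyrillic sample, 6 letters

# Each of these Cyrillic letters takes two bytes in UTF-8.
raw = text.encode("utf-8")

# Decoding the bytes with the wrong single-byte code page turns every
# byte into its own character. errors="replace" guards against bytes
# that Windows-1250 leaves undefined.
wrong = raw.decode("cp1250", errors="replace")

# Decoding with the correct encoding restores the original text.
right = raw.decode("utf-8")

print(len(text), len(raw), len(wrong))
print(wrong)
print(right)
```

Every letter shows up as two unrelated characters in the wrongly decoded string, which matches the "two single-byte characters" seen in the preview.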
- Attachment: cyrillic.png (64.85 KiB)
-
- MVP
- Posts: 3223
- Joined: Mon Dec 29, 2008 6:26 pm
- OLAP Product: TM1, Jedox
- Version: PAL 2.1.5
- Excel Version: Microsoft 365
- Location: Brussels, Belgium
- Contact:
Re: What does the "Use Unicode" checkbox mean in TM1 Architect Turbo Integrator?
Hello,
I have never used it myself, but have you had a look at the TI function 'SetInputCharacterSet'?
Best regards,
Wim Gielis
IBM Champion 2024-2025
Excel Most Valuable Professional, 2011-2014
https://www.wimgielis.com ==> 121 TM1 articles and a lot of custom code
Newest blog article: Deleting elements quickly
Re: What does the "Use Unicode" checkbox mean in TM1 Architect Turbo Integrator?
@Wim Gielis, thanks for the help.
According to the TM1 Reference Guide: "For formats lacking a valid byte-order-mark, the characters must be converted from some other encoding to UTF-8. The SetInputCharacterSet function lets you specify the character set used in a TurboIntegrator data source."
I can confirm the above statement from the documentation with two tests.
Test 1:
1. I opened the UTF-8 file with Notepad and just saved it. When saving as UTF-8, Notepad adds a BOM (byte order mark), i.e. three special bytes, at the very beginning of the file.
2. After pressing the Preview button in the Turbo Integrator window, the UTF-8 characters are displayed correctly.
3. In Turbo Integrator I created a process that creates a dimension with members, and ran it.
4. I opened the dimension, and the Cyrillic characters were loaded correctly. Problem solved.
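The Notepad step in Test 1 can also be scripted. Below is a minimal Python sketch, not part of TM1, that prepends the three-byte UTF-8 BOM to a file if it is missing. The function name ensure_utf8_bom and the sample file name are hypothetical, and the sketch assumes the file is already valid UTF-8.

```python
import codecs
import os
import tempfile


def ensure_utf8_bom(path: str) -> bool:
    """Prepend the UTF-8 BOM (EF BB BF) if absent. Return True if the file changed."""
    with open(path, "rb") as f:
        data = f.read()
    if data.startswith(codecs.BOM_UTF8):
        return False  # BOM already present, nothing to do
    data.decode("utf-8")  # raises UnicodeDecodeError if the file is not valid UTF-8
    with open(path, "wb") as f:
        f.write(codecs.BOM_UTF8 + data)
    return True


# Demonstration on a throwaway file holding one Cyrillic letter.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "cyrillic.txt")
    with open(sample, "wb") as f:
        f.write("Ж\n".encode("utf-8"))
    first = ensure_utf8_bom(sample)   # file had no BOM, so it is rewritten
    second = ensure_utf8_bom(sample)  # BOM now present, so no change
    with open(sample, "rb") as f:
        result = f.read()

print(first, second, result[:3].hex())
```

Running the function twice is harmless, since the second call detects the existing BOM and leaves the file untouched.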
Test 2:
1. Back to my original UTF-8 file without a BOM at the beginning. If it is loaded in exactly the same way as in Test 1, the characters are loaded into TM1 incorrectly, because TM1 uses the default system locale.
2. To work around this problem, as Wim suggested, I added the following code at the top of the Prolog tab: SetInputCharacterSet('TM1CS_UTF8');
and the characters are loaded correctly as dimension members.
Note: In the Turbo Integrator "Data Source" window the characters are still displayed incorrectly, most probably because SetInputCharacterSet only takes effect when the process is run. This is not a problem, though, because the UTF-8 characters are imported correctly into the dimension members.
Re: What does the "Use Unicode" checkbox mean in TM1 Architect Turbo Integrator?
I did an additional test with a DB2 database defined as UTF-8, storing data in the varchar data type (stored as UTF-8) and in the vargraphic data type (stored as UCS-2).
I am still wondering which Unicode encoding TM1 uses internally. Is it UTF-8, UTF-16 or something else when Use Unicode is checked in Turbo Integrator? Is there any way I can check which Unicode encoding TM1 uses internally? I would like to know this because of the reporting that will be built on TM1 data.
Below is the test I created with DB2, which I would like to share with you.
Test 3:
- Attachment: cyrillic_db2.png (74.03 KiB)