OnDemand Users Group

Support Forums => CMOD for Multiplatforms => Topic started by: PaulV on March 09, 2016, 03:14:00 PM

Title: CMOD Case Sensitiveness - DB2 Case Insensitive?
Post by: PaulV on March 09, 2016, 03:14:00 PM
Our customers have a requirement to search case-insensitive in a field that holds a mixed-case value. We have been investigating the possibilities and came to the following two alternatives:
1.   Copy the mixed-case value to a case-insensitive (uppercase) field (in the load action, with a additional index).
2.   Make the DB2 database case insensitive, as in the article http://www.ibm.com/developerworks/data/library/techarticle/0203adamache/0203adamache.html (http://www.ibm.com/developerworks/data/library/techarticle/0203adamache/0203adamache.html)

The latter currently seems the cleanest solution, but does if IBM supports a "Case insensitive DB2 database"?

Environment: CMOD v9.0, DB2 v10.1 both on AIX. DB2 database size 171GB.
Title: Re: CMOD Case Sensitiveness - DB2 Case Insensitive?
Post by: Justin Derrick on March 09, 2016, 04:54:21 PM
There's an article on Content Manager OnDemand and Case Sensitivity in the wiki:

https://cmod.wiki/index.php/Case_Insensitive_Search

Although, I'm not sure it'll provide you with anything you don't already know.

This is actually not the first time I've seen this question asked, it's probably suitable for an Enhancement Request:

http://ODUG.net/index.php?board=2.20

-JD.
Title: Re: CMOD Case Sensitiveness - DB2 Case Insensitive?
Post by: PaulV on March 10, 2016, 10:27:09 AM
Thank you for your reply. What I am looking for is a confirmation (or denial) is CMOD v9.x supports a DB2 database with the following configuration, making is case-insensitive.

From http://www.ibm.com/developerworks/data/library/techarticle/0203adamache/0203adamache.html (http://www.ibm.com/developerworks/data/library/techarticle/0203adamache/0203adamache.html), side bar "New capabilities with DB2 9.5 fixpack 1":

To create a database with an English collation that is case insensitive, use the following statement:
CREATE DATABASE testdb
USING CODESET UTF-8 TERRITORY US
COLLATE USING UCA500R1_LEN_S2
Selecting from tables in this database ignores case.
SELECT lastname FROM employee
ORDER BY lastname

LASTNAME
-----------------
abrams
Adams
allison
Alomar

4 record(s) selected.
Case insensitivity applies to all operations in the database.
SELECT lastname FROM employee
WHERE lastname LIKE 'AL%'
ORDER BY lastname

LASTNAME
-----------------
allison
Alomar

2 record(s) selected.


If anyone has experience with this configuration, please let me know! We are planning a test and I will keep you updated.
Title: Re: CMOD Case Sensitiveness - DB2 Case Insensitive?
Post by: Justin Derrick on March 10, 2016, 03:39:21 PM
Are you in a region that uses accented characters?  I ask because switching to unicode means some characters are 'double-byte' and you'll need to do some tricks to expand the fields in the database to accommodate them.
Title: Re: CMOD Case Sensitiveness - DB2 Case Insensitive?
Post by: PaulV on March 10, 2016, 09:37:27 PM
Yes, we reside in the Netherlands. So we do use �, �, �, �, etc. If you can tell, how to account for the "double byteness", please do.
Title: Re: CMOD Case Sensitiveness - DB2 Case Insensitive?
Post by: ewirtz on March 11, 2016, 07:06:11 AM
Hi,
I think a second upper field is the best choice. You can search for the upper Field which is supported by CMOD and show the original field.

It's not only a question of searching mixed characters but also a question of finding them quickly. An improvement of CMOD could use the following technique internally.
create an index like upper(field) and searching with 'where field = upper(smith)' . This would fulfill for the requirement.

regards

Egon
Title: Re: CMOD Case Sensitiveness - DB2 Case Insensitive?
Post by: Alessandro Perucchi on March 11, 2016, 10:27:59 AM
Quote from: PaulV on March 10, 2016, 09:37:27 PM
Yes, we reside in the Netherlands. So we do use �, �, �, �, etc. If you can tell, how to account for the "double byteness", please do.

I live in Switzerland and we have also such characters �, �����, etc...

The only answer we received from IBM was the following: If you know that all your string will be max 40 chars (for example), that means it is only for "ASCII" chars. So you should add to the 40 something like 10-20% more chars in order to account the double-byteness...
so something like 48 in our case. But of course, this vary from language to languages, and these 10-20% can become the double. so if you want to store 40 symbols, then you must in the worse case define in CMOD string length field 80.

Not found any other solutions for that... :-/
Title: Re: CMOD Case Sensitiveness - DB2 Case Insensitive?
Post by: ewirtz on March 11, 2016, 10:56:41 AM
Hi Alessandro,

the CMOD indexers (OS390 and ACIF)  do not honor utf-8. So if you need to index without problems utf-8 is not a good choice.

The upper problem can be solved without using UTF-8 like discussed before.

regards

Egon
Title: Re: CMOD Case Sensitiveness - DB2 Case Insensitive?
Post by: Justin Derrick on March 11, 2016, 03:09:42 PM
Quote from: PaulV on March 10, 2016, 09:37:27 PM
If you can tell, how to account for the "double byteness", please do.

I did a data migration for a customer (new hardware, OS, software, etc.) and during the migration I just doubled the size of all of the 'string' fields with an SQL command on the arsagfld table.  Now, I did this before loading 20 years of data into the new system, so there weren't any problems.  I'm not sure how you could do it with already-existing tables -- I don't know if you're allowed to change the length of fields in an 'alter table' SQL command.  (And if you can, you should immediately do a 'reorg table' on the modified table.

Good luck!
Title: Re: CMOD Case Sensitiveness - DB2 Case Insensitive?
Post by: PaulV on April 13, 2016, 07:16:52 AM
Update
We have registered a PMR with IBM, asking whether the collation option would be supported. Short answer: no, since this has not been tested.

Longer answer (suggestion of the support desk): since we use a custom build webservice (thin wrapper of ODWEK), we a investigating the option to perform a direct query of the database. Starting the investigation, we tried the simplest approach, with the command line utility ARSDOC.

You van perform a database query with ARSDOC:
arsdoc query -h archive -G LEVENnew -f LEVEN -i WHERE CMD_INDEXS13='87470' AND UPPER(CMD_INDEXS09)=UPPER('Veenma') -v

To our surprise, this returned a result, so case insensitive search  is possible.

Next step: using ODWEK to do the same.

I will keep you informed.
Title: Re: CMOD Case Sensitiveness - DB2 Case Insensitive?
Post by: Justin Derrick on April 13, 2016, 08:36:47 AM
Quote from: PaulV on April 13, 2016, 07:16:52 AM
To our surprise, this returned a result, so case insensitive search  is possible.

Yup, this sounds like it should be an enhancement request -- the ability to do case-insensitive searches, defined at the folder level.  I wouldn't want to be the one building test cases for that.

The only thing I can forsee as a problem is performance -- you've suddenly just gone from a very straightforward and fast search to performing an upper() or lower() function on each and every value, possibly across many tables...  Although it should be possible to build an index based on the case-insensitive (UPPER or lower) value of a field in the database, making the end result faster.

-JD.
Title: Re: CMOD Case Sensitiveness - DB2 Case Insensitive?
Post by: ewirtz on April 21, 2016, 07:56:21 AM
Hi Justin,
as mentioned before this works with good performance if an upper index is created. If this technique would be supported by CMOD a solution is possible with good performance:
1. searching with UPPER(field)=UPPER('constant')
2. defining an upper index

regards

Egon
Title: Re: CMOD Case Sensitiveness - DB2 Case Insensitive?
Post by: PaulV on April 26, 2016, 08:45:31 AM
So we tested with ODWEK's "TcSQLSearch", as documented at http://www.ibm.com/support/knowledgecenter/SSEPCD_9.0.0/com.ibm.ondemand.odwek.doc/ars5y00056.htm%23searchingafolderusingsql

Usage: java TcSQLSearch <server> <port> <userid> <password> <folder> <SQL> <MaxHits (-1 to use default)> <opt begin Date> <opt end Date> <opt Date Format> <opt ApplGrpName>"

This works with the field to search wrapped in UPPER().

Next step, to determine how to abstract this call within our Webservice wrapper and mitigate the risks such as performance en opening up to much functionality. That you can not use this on a folder (over multiple itemtypes) is a bummer, but I am counting our blessings (...). We might even wrapped this with a Stored Procedure and an UPPER index.
Title: Re: CMOD Case Sensitiveness - DB2 Case Insensitive?
Post by: Justin Derrick on May 30, 2016, 01:15:45 PM
I'm actually working on a small project to convert an AG defined with a mixed-case string field into an UPPER field.  So both the AG definition needs to change, and the contents of the database need to be converted to uppercase.  I'll post instructions after it passes pre-production testing.