Navigation: Data Warehouse Help > Using Data Warehouse > Installing and Configuring Data Warehouse New Installations |
Scroll Prev Top Next More |
Important: Upgrading from version 4.0 is supported.
Installation of Data Warehouse creates the Data Warehouse target database schema and makes a few schema changes to the source database.
Note: Installation is primarily concerned with schema changes and does not populate the target database with data, except for tables WH_CATEGORIES and WH_CURRENCIES. For more information about these two tables, see How the ETL Job Works. Data for all other Data Warehouse tables are populated in the separate task Refreshing the Data Warehouse.
The following tools are used for Data Warehouse:
•Pentaho—Enterprise reporting, analysis, dashboard, data mining, and workflow application with ETL capabilities for business intelligence needs. For details, see:
www.pentaho.com/products/data_integration
•Spoon—A GUI application that provides the design of transformations and jobs which can be run with other Kettle tools such as Pan (transformations) and Kitchen (jobs).
Prior to installing Pentaho tools and configuring Data Warehouse, it is recommended that you shut down TeamConnect.
Note: TeamConnect Data Warehouse does not support customization of its scripts, functionality, or target database schema.
1.From the installation media, open directory TeamConnect_Datawarehouse and copy the contents to a target directory on your hard drive. (For purposes of this documentation, we will presume that your target directory is named datawarehouse and it is a subdirectory of your hard drive's root directory.) datawarehouse/.kettle contains several files, including two that must be edited: okettle.properties oshared.xml 2.(Linux only) Navigate to the datawarehouse directory and create the appropriate permissions on the files by running the chmod +x *.sh command. You must also edit the javapath.sh file to contain the correct path to your JRE, as described in System Requirements. 3.Open the kettle.properties file and verify the information that is shown in the listing below. You must edit that information and replace the existing values with those that are specific to your own databases. jdbc.source_TC_hostname=10.0.2.23 jdbc.source_TC_dbname=QDB12O9I jdbc.source_TC_portnumber=1521 jdbc.source_TC_username=TeamConnect jdbc.source_TC_userpassword=TeamConnect jdbc.source_TC_dbtype=ORACLE jdbc.target_TC_hostname=10.0.2.23 jdbc.target_TC_dbname=QDB12O9I jdbc.target_TC_portnumber=1521 jdbc.target_TC_username=DataWarehouse jdbc.target_TC_userpassword=DataWarehouse jdbc.target_TC_dbtype=ORACLE The example above shows Oracle values. Here is an example of some Microsoft SQL Server values: jdbc.target_TC_hostname=WEBISQL2002 jdbc.target_TC_dbname=TC24 jdbc.target_TC_portnumber=1433 jdbc.target_TC_username=sa jdbc.target_TC_userpassword=password jdbc.target_TC_dbtype=MSSQL If your source and target databases run on SQL Server, and you wish to use NTLM authentication to connect to them, leave empty the values of jdbc.source_TC_username, jdbc.source_TC_userpassword , jdbc.target_TC_username and jdbc.target_TC_userpassword . The WH_COMMIT_SIZE=5000 field sets the number of records committed during the ETL process. This value defaults to 5000. For large data warehouse setups suffering slow runtimes, it may be recommended to increase this value. You will also need to modify file shared.xml, as described in the next step. The examples above show database passwords listed in plain text. Optionally, you can omit the password lines entirely, or leave the value after the = sign blank, and instead modify the shared.xml file to use an encrypted password as described in step 8. If your Data Warehouse application is behind a firewall, you may need to add an SSL specification in the TC_dbname property, as in the example line below. jdbc.target_TC_dbname=TC24;ssl=request 4.For SQL Server and NTLM authentication only, add two new attributes to each affected connection element in file shared.xml. See the highlighted text in the example below. This example deals with the source database connection, but NTLM can be used on source and/or target database connections. <connection> <name>Source_TeamConnect_data</name> <server>${jdbc.source_TC_hostname}</server> <type>MSSQLNATIVE</type> <access>Native</access> <database>${jdbc.source_TC_dbname}</database> <port>${jdbc.source_TC_portnumber}</port> <username>${jdbc.source_TC_username}</username> <password>${jdbc.source_TC_userpassword}</password> <servername/> <data_tablespace/> <index_tablespace/> <attributes> <attribute><code>EXTRA_OPTION_MSSQL.domain</code> <attribute>DOMAIN_NAME</attribute> </attribute> </attributes> </connection> For DOMAIN_NAME, substitute the actual Windows domain name that will be doing the authentication. 5.The kettle.properties file also has a parameter, WH_LOOKUP_POPULATION, that must be evaluated now, before installation. There are three possible parameter values. The parameter value affects what will be placed in Data Warehouse tables that relate to custom fields that contain lookup table values. For example, let us presume that a particular custom field contains LLC, from a hierarchy of lookup table values: Vendor, then Corporation, then LLC. oParameter value NAME would cause this lookup table value to appear in Data Warehouse as LLC. oParameter value FULL_NAME would cause this lookup table value to appear in Data Warehouse as Vendor:Corporation:LLC. oParameter value TREE_POSITION would cause this lookup table value to appear in Data Warehouse as VEND_CORP_LLC1. In this case the report designer would then need to create a join to the lookup table to retrieve the text description. Important: If this parameter is changed after you have already begun using Data Warehouse, your data will be inconsistent. The only option, in that case, would be to remove and reinstall Data Warehouse. Therefore you must carefully evaluate which of the three possible values should be used for this parameter. 6.The kettle.properties file also has a section that determines how you receive email notifications. If you want to receive email notifications when the job fails or succeeds, set the value of ENABLE_EMAIL_NOTIFICATION to YES. If you want to receive email notifications only when the job runs successfully, set the value of ENABLE_SUCCESS_EMAIL_NOTIFICATION to YES. Supply specific information about your email contact and server in the other properties if you set either of these to YES. If you do not want to use email notifications, edit this section to set both enable values to NO, and ignore the other properties. #Email notification settings ENABLE_EMAIL_NOTIFICATION=YES ENABLE_SUCCESS_EMAIL_NOTIFICATION=YES DESTINATION_ADDRESS= SENDER_NAME= SENDER_ADDRESS= REPLY_TO_ADDRESS= CONTACT_PERSON= CONTACT_PHONE= #Email server settings SMTP_SERVER= SMTP_PORT= AUTHENTICATION_USER= AUTHENTICATION_PASSWORD= 7.Open the shared.xml file and verify the contents that are shown in the listing below. Note that you cannot use shared.xml exactly as it appears. You must edit the <type> element inside the <connection> element to supply information specific to your own database servers ("MSSQLNATIVE" or "ORACLE"). Note: The default value listed below is for Oracle users. <connection> <name>Target_TeamConnect_Warehouse</name> <server>${jdbc.target_TC_hostname}</server> <type>ORACLE</type> <access>Native</access> <database>${jdbc.target_TC_dbname}</database> <port>${jdbc.target_TC_portnumber}</port> <username>${jdbc.target_TC_username}</username> <password>${jdbc.target_TC_userpassword}</password> <servername/> <data_tablespace/> <index_tablespace/> <attributes> <attribute><code>PORT_NUMBER</code><attribute>${jdbc.target_TC_portnumber}</attribute></attribute> <!-- ONLY INCLUDE THIS LINE IF USING SQL SERVER <attribute><code>MSSQLUseIntegratedSecurity</code><attribute>false</attribute></attribute> --> </attributes> </connection>
<connection> <name>Source_TeamConnect_data</name> <server>${jdbc.source_TC_hostname}</server> <type>ORACLE</type> <access>Native</access> <database>${jdbc.source_TC_dbname}</database> <port>${jdbc.source_TC_portnumber}</port> <username>${jdbc.source_TC_username}</username> <password>${jdbc.source_TC_userpassword}</password> <servername/> <data_tablespace/> <index_tablespace/> <attributes> <attribute><code>PORT_NUMBER</code><attribute>${jdbc.source_TC_portnumber}</attribute></attribute> <attribute><code>SQL_CONNECT</code><attribute>SET TRANSACTION ISOLATION LEVEL READ COMMITTED;</attribute></attribute> <!-- INCLUDE THIS LINE IF USING SQL SERVER <attribute><code>MSSQLUseIntegratedSecurity</code><attribute>false</attribute></attribute> --> </attributes> </connection> 8.The example in step 6 shows passwords that use variables. In addition to this method, passwords that are used in shared.xml and kettle.properties can be encrypted. To encrypt a password, run one of the following files from the main Data Warehouse directory: oEncr.bat (for Windows) o./encr.sh (for Linux) When prompted, enter the plain text of your database password. The encrypted password string will be displayed, such as Encrypted FFA50T6REL88P7SQ. Copy the entire string, including the word "Encrypted" and the space, to the clipboard. In shared.xml, in the password value that should be encrypted, paste the encrypted password string to replace the previous value. Repeat these encryption instructions for each connection that you wish to encrypt. Save and close the file. An example of using encrypted passwords is shown below: <connection> <name>Target_TeamConnect_data</name> <server>${jdbc.source_TC_hostname}</server> <type>ORACLE</type> <access>Native</access> <database>${jdbc.source_TC_dbname}</database> <port>${jdbc.source_TC_portnumber}</port> <username>${jdbc.source_TC_username}</password> <password>Encrypted FFA50T6REL88P7SQ</password> <servername/> <data_tablespace/> <index_tablespace/> <attributes> <attribute><code>PORT_NUMBER</code> <attribute>$ jdbc.source_TC_portnumber}</attribute <attributes> </connection> <connection> <name>Target_TeamConnect_Warehouse</name> <server>${jdbc.target_TC_hostname}</server> <type>ORACLE</type> <access>Native</access> <database>${jdbc.target_TC_dbname}</database> <port>${jdbc.target_portnumber}</database> <username>${jdbc.target_TC_username}</username> <password>Encrypted FFA50T6REL88P7SQ</password> <servername/> <data_tablespace/> <index_tablespace/> <attributes> <attribute><code>PORT_NUMBER</code><attribute>$ {jdbc.target_TC_portnumber}</attribute> </attributes> </connection> 9.Navigate to the datawarehouse installation directory and run one of the following files: oFor Windows: WH_install.bat oFor Linux: ./WH_install.sh IMPORTANT: Users who are on a 4.x version of Data Warehouse must see the Upgrading from Earlier Versions step before running these scripts. Logs for the command output will be located in the ./TCWH_logs subdirectory. The log file name has the format WH_install_date_time.log, where date and time are numbers representing the time when the job was started. Your source and target databases must be each capable of opening at least 110 simultaneous connections while this command is running. If the TeamConnect source database has a very large number of custom objects, the database script that creates the Data Warehouse schema may be quite large. If so, be sure to maximize the amount of Java heap space allocated to this job. 10.If you plan to use the Security and User Access feature (see Security and User Access for more information), you will need to run an additional script. Navigate to the ./TCWH/scripts directory. For Oracle, run ORACLE_WH_TC_SECURITY.SQL. For SQL Server, run MSSQL_WH_TC_SECURITY.SQL. |