This section answers frequently asked questions and provides pre-emptive troubleshooting tips for the Data Warehouse release.
Installation and Initial Load
Do I need to upgrade to TeamConnect 5.0 before installing Data Warehouse 5.0?
Yes, you must be on TeamConnect 5.0 in order to install Data Warehouse 5.0. The installation mainly upgrades your warehouse database; it makes only minor changes to TeamConnect, such as registering the current version of the warehouse application.
I already have an earlier version of the Data Warehouse installed. Can I upgrade without repeating the initial load?
Yes. The installation scripts migrate your current schema, so you can skip the Initial Loading steps and proceed directly to Refreshing the Data Warehouse.
Can I restore a copy of the warehouse database into production rather than running the initial load there?
Yes, but do not run any Refresh jobs in production until the warehouse database has been restored. After restoring to production, run a Refresh to bring the warehouse up to date. See the online documentation under Initial Loading Times for steps.
Why must I run a Refresh job after the initial load completes?
The initial load collects only records whose timestamps are earlier than the maximum modifiedOn timestamp, and that maximum is computed by scanning all records when the initial load begins. You must therefore run a Refresh job after the initial load to collect records that were added later. Expect this first refresh to take a while, because it collects not only new records but also any records that changed after the initial load began.
What should I do if the initial load is interrupted before it finishes?
Run the initial load job again, using the same command (TeamConnect_Warehouse_initial.sh or TeamConnect_Warehouse_initial.bat). It restarts the load, skipping over entities that have already completed.
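For example, on Linux or UNIX (the install path below is illustrative; use the directory where the warehouse scripts actually reside):

    cd /opt/teamconnect_warehouse/bin    # hypothetical install location
    ./TeamConnect_Warehouse_initial.sh   # restarts the load, skipping completed entities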
How do I discard the warehouse and start over?
Run the WH_REMOVE.bat or WH_REMOVE.sh script to delete the warehouse database, then repeat the installation and upgrade steps.
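For example, on Linux or UNIX (paths are illustrative):

    cd /opt/teamconnect_warehouse/bin   # hypothetical install location
    ./WH_REMOVE.sh                      # deletes the warehouse database (irreversible)
    ./WH_INSTALL.sh                     # then repeat the installation and any upgrade steps your environment requires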
Refresh Jobs
Why do records changed shortly before a refresh sometimes miss that refresh?
It has to do with time boundaries. The warehouse applies an offset when calculating the upper time boundary of each refresh. The default is zero (0) seconds; we recommend raising it in production environments, or anywhere data is actively changing. An offset of 300 means that when a job starts, it subtracts 5 minutes from the job start time, so a job starting at 15:00 includes only records that were modified earlier than 14:55.

The offset is a safety measure: a record that a TeamConnect user saves at, say, 14:59 might not be committed to the database until 15:01. If the boundaries were based strictly on job start times, such last-minute updates could be missed permanently. With the offset in this example, the job collects only records with timestamps between 13:55 and 14:55 (assuming the previous refresh started at 14:00). Records updated after 14:55 are collected in the next refresh: a bit later, but not forgotten.
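A minimal sketch of the boundary arithmetic, assuming GNU date on Linux (the variable names are illustrative, not taken from the shipped scripts):

    OFFSET_SECONDS=300                       # recommended nonzero offset for production
    JOB_START=$(date +%s)                    # job start time, e.g. 15:00:00
    UPPER=$((JOB_START - OFFSET_SECONDS))    # upper boundary, e.g. 14:55:00
    # Only records with modifiedOn earlier than the boundary are collected:
    echo "Collecting records modified before $(date -d @$UPPER)"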
Why was a record that was modified while a refresh was running excluded from that refresh?
As explained above, the upper boundary (the cut-off time for changes) is the start time of the current refresh job minus the offset. If a record was created or updated before the job started, and then modified again after the job started, its modifiedOn timestamp is now greater than the upper boundary. If its data has not been collected yet, it is excluded from the current refresh. The changes appear in the following refresh, unless the same pattern recurs.
How do I make the warehouse re-collect records that a refresh missed?
Edit the WH_LAST_REFRESH table. On the row that begins with "ALL", change the UPPER_REFRESH_DATE value to a time earlier than the timestamps of the missed records. This value determines the lower time boundary for the next refresh: any records whose timestamps are equal to or later than it will be re-collected and updated during the next refresh.
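For example, assuming an Oracle warehouse database reached through sqlplus (the connection string, key column name, and timestamp are illustrative; verify the actual WH_LAST_REFRESH column names in your schema):

    sqlplus -s wh_owner/password@WHDB <<'SQL'
    UPDATE WH_LAST_REFRESH
       SET UPPER_REFRESH_DATE = TO_DATE('2024-01-15 09:00:00', 'YYYY-MM-DD HH24:MI:SS')
     WHERE ENTITY_NAME = 'ALL';  -- hypothetical key column for the row that begins with "ALL"
    COMMIT;
    SQL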
How frequently should I schedule refresh jobs?
It depends on how long a typical refresh takes to complete. Most users will want to ensure that one refresh finishes before the next one starts. As of Data Warehouse 4.1, a refresh is prevented from starting while another is still in progress: the new job is skipped, and any accumulated changes do not appear in the warehouse until the following refresh. We suggest scheduling refreshes no closer than one hour apart until you are confident of job duration. If your refreshes complete in well under an hour, you can experiment with scheduling them more frequently. Scheduled jobs should use the refresh scripts provided (.bat or .sh files), which contain additional code to check for previous jobs that are still running.
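For example, a crontab entry that launches the refresh at the top of every hour (the install path is an assumption; widen the interval if your refreshes run long):

    # min hour day month weekday  command
    0 * * * * /opt/teamconnect_warehouse/bin/TeamConnect_Warehouse.sh >> /var/log/wh_refresh.log 2>&1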
What does the message about a skipped refresh job mean?
It means that the previous refresh job was still running. To prevent a collision, the new refresh job was skipped (not launched).
Can a refresh job wait (sleep) until the previous job finishes instead of being skipped?
The out-of-the-box warehouse refresh commands do not provide sleep functionality. However, you can implement your own wrapper to accomplish it, for example by monitoring for the existence of a lock file or the process ID it references. The OOTB scripts create lock files during a refresh job and delete them when the job terminates.
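A minimal sketch of one such wrapper, assuming the lock file's name and location (check what your scripts actually create before relying on this):

    #!/bin/sh
    # Wait for any in-progress refresh to finish, then start a new one.
    LOCK_FILE=/opt/teamconnect_warehouse/refresh.lock   # hypothetical path
    while [ -f "$LOCK_FILE" ]; do
        echo "Previous refresh still running; sleeping 60 seconds..."
        sleep 60
    done
    /opt/teamconnect_warehouse/bin/TeamConnect_Warehouse.sh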
A refresh job failed. What should I do?
If users cannot wait until the next scheduled refresh, run the resume script; it may take less time, since it skips over entities that have already completed. If there is no urgency, you can let the regular TeamConnect_Warehouse.bat (or .sh) script run at the next scheduled interval; it will collect any records that were added or changed since the previous successful refresh.
Do I need to do anything to the warehouse after adding a custom object in TeamConnect?
As in earlier releases, you must re-run the WH_INSTALL[.sh|.bat] script whenever you add a custom object in TeamConnect. The same applies when you rename a category.
Auditing
How can I tell which refresh jobs took the longest?
The refresh audit table (AUDIT_TBL) shows the number of records, the starting and ending times, and the elapsed time for each job. Scroll to the bottom of the table to locate the rows that took the most time. If you need more detail, see the logs.
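For example, to list the slowest jobs first (the column name is illustrative; check the actual AUDIT_TBL definition in your warehouse schema):

    sqlplus -s wh_owner/password@WHDB <<'SQL'
    SELECT *
      FROM AUDIT_TBL
     ORDER BY ELAPSED_TIME DESC;  -- hypothetical column name
    SQL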
How do I verify that warehouse record counts match TeamConnect?
Run the overall audit count comparison utility (audit[.sh|.bat]).
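For example, on Linux or UNIX (the install path is an assumption):

    cd /opt/teamconnect_warehouse/bin   # hypothetical install location
    ./audit.sh                          # compares warehouse counts against TeamConnect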
Why does the audit count show minor discrepancies?
The audit count can have minor inaccuracies due to changes made in the database while it runs: for example, records added after the script begins, or records added and then deleted since the last refresh (see Known Issue DWE-2303).
Does the audit utility compare the actual data in each record?
No, it only compares counts.
Does the audit utility check sub-entities?
No, it only compares main (top-level) entities. We are considering adding counts for sub-entities in a future release.
Other
Does this release improve performance?
Testing to date indicates a performance improvement for the initial load but not for refresh. However, we expect refresh performance to be better for large deltas, because we have enabled much larger database commit sizes. Also, only data changed since the last refresh window is now collected; formerly, the application re-collected up to a day's worth of previous changes, even when few records had changed.