Trafodion Manageability - Apache Trafodion - Apache Software Foundation
This project has retired. For details please refer to its
Attic
page.
DUE TO SPAM, SIGN-UP IS DISABLED. Goto
Selfserve wiki signup
and request an account.
Apache Trafodion
Pages
Space shortcuts
How-to articles
Page tree
Browse pages
tachments (0)
Page History
Resolved comments
Page Information
View in Hierarchy
View Source
Export to PDF
Export to Word
Copy Page Tree
Jira links
Trafodion Manageability
Created by
Atanu Mishra
, last modified by
Gunnar Tapper
on
Apr 04, 2016
Introduction
The Trafodion instance repository is a data warehouse for manageability data collected from a Trafodion instance. Each instance in a Trafodion cluster contains a repository schema ("_REPOS_") that hosts the repository tables. To retrieve data from the repository, DBAs or end users can query the repository tables directly using the Trafodion Command Interface (TrafCI) or any third-party tool like DbVisualizer or SQuirreL SQL Client. Performance metrics such as query information, statistics, and session information are stored in repository tables.
See the overall
Process Architecture
diagram, which shows data being published to repository tables and events being written to log files via log4ccp or log4j.
Repository Structure and Features
Components on the Trafodion platform generate performance data for database objects, events data such as error messages or state changes, and a variety of statistical data. A component transmits this data using publications, which are data structures that define fields to hold the transmitted, or published, data. The data from these publications are written to repository tables with corresponding columns. See the diagram below.
Repository Schema and Table Definitions
All tables in the repository reside in the "_REPOS_" schema. The repository currently includes these tables:
Table Name
Table Description
METRIC_QUERY_AGGR_TABLE
This table contains the statistics of short-running queries, which have been aggregated into one record and written to the repository at the end of each statistics aggregation interval. For descriptions of the fields in the table, see
METRIC_QUERY_AGGR_TABLE (Release 1.1)
. If you are using Trafodion Release 1.0, see
METRIC_QUERY_AGGR_TABLE (Release 1.0)
METRIC_QUERY_TABLE
This table stores query statistics information gathered during the compilation and execution of queries that run longer than the configured interval. For descriptions of the fields in the table, see
METRIC_QUERY_TABLE (Release 1.1)
. If you are using Trafodion Release 1.0, see
METRIC_QUERY_TABLE (Release 1.0)
METRIC_SESSION_TABLE
This table provides access to ODBC and JDBC session statistics, which are gathered when clients connect to a Trafodion instance. These statistics include the data that is collected at the start and end of each client session. For descriptions of the fields in the table, see
METRIC_SESSION_TABLE (Release 1.1)
. If you are using Trafodion Release 1.0, see
METRIC_SESSION_TABLE (Release 1.0)
METRIC_TEXT_TABLE
This table is reserved for future use. For descriptions of the fields in the table, see
METRIC_TEXT_TABLE (Release 1.1)
. This table does not exist in Trafodion Release 1.0.
For additional repository field descriptions, see:
Header Columns
, which are columns that appear in each of the repository tables. If you are using Trafodion Release 1.0, see
Header Columns (Release 1.0)
STATEMENT_TYPE
, which is a field in
METRIC_QUERY_TABLE
. If you are using Trafodion Release 1.0, see
STATEMENT_TYPE (Release 1.0)
Getting Started With the Repository
Installation
The repository is automatically installed on the platform by the Trafodion installer.
Character Set Support
All repository character data is encoded using UTF8.
Enabling or Disabling Publication to the Repository
Publication of statistics is enabled by default. You can disable publication of statistics by setting the DCS server property,
dcs.server.user.program.statistics.enabled
, to
'false'
in the
dcs-site.xml
file and then restarting the DCS servers. For more information, see the
Trafodion Database Connectivity Services Reference Guide
Configuring the Repository
Statistics published to the repository can be:
Session statistics
, which are published when session ends.
Aggregation statistics
, which are aggregated and published at a specified interval during the session. The default interval is 60 seconds.
Query statistics
, which are published when the query starts and are updated when query ends. By default, query statistics are published only for queries that run longer than 60 seconds.
You can configure the type of statistics that are written to the repository by setting the property,
dcs.server.user.program.statistics.type
, in the
dcs-site.xml
file to one of the following statistics types and then restarting the DCS servers:
Statistics Type
Description
'aggregated'
(default)
This setting causes session statistics and aggregation statistics to be published in the repository for all queries. Query statistics are published only when a query executes longer than the specified statistics limit, which is 60 seconds by default, but can be configured in the
dcs-site.xml
file.
'session'
This setting causes only session statistics to be published in the repository. Aggregation and query statistics are not published.
For more information about setting this and other statistics properties, see the
Trafodion Database Connectivity Services Reference Guide
NOTE:
Trafodion Release 1.0 supports the
'query'
setting, which causes session statistics and query statistics to be published in the repository for
all
queries and which prevents aggregation statistics from being published. This setting is no longer supported in Trafodion Release 1.1 and subsequent releases.
Upgrading the Repository
Repository tables are re-created on every reinstallation of Trafodion or if you execute the 'initialize trafodion, upgrade' command.
Data Aging for the Repository
As part of system management, you should include maintenance of repository tables to "age out" old data. Because these kinds of operations are resource-intensive, integrating regular repository maintenance schedules into your enterprise-level operations will optimize the process.
NOTE: Aging out repository data is currently not supported but will be part of a future release.
Examples and Guidelines for Creating Repository Queries
See
Examples and Guidelines for Creating Repository Queries
for some simple queries that you can run against the available repository tables. That page also provides suggestions for writing repository queries.
Logging Events
Trafodion components use log4cpp and log4j to log error or event messages. By default, all information is logged to files. Log files contain:
Messages from SQL subcomponents and the master executor.
Information collected from the Transaction Managers (TMs).
General information about potential system situations; for example, SQL errors, configuration file errors from a component.
Information specifically addressing SQL errors; for example, the ID of the query producing the error, a description of the error, or the name of the catalog, schema, and table that produced the error.
Location of Configuration Files
Trafodion configuration files are located in the
$MY_SQROOT/conf
folder. Starting in Trafodion Release 1.1, there is a configuration file for each subcomponent of Trafodion that currently logs events.
Config File Name
Used by Component
log4cpp.trafodion.masterexe.config
SQL master executor/compiler/ESPs
log4cpp.trafodion.ssmp.config
SQL ssmp
log4cpp.trafodion.sscp.config
SQL sscp
log4cpp.trafodion.lob.config
SQL lobserver
log4cpp.trafodion.udr.config
SQL udr
log4cpp.monitor.mon.config
Monitor
log4cpp.monitor.mon.snmp.config
Monitor Critical
log4cpp.monitor.pstartd.config
Process Starup Daemon
log4cpp.monitor.wdg.config
Watchdog process messages
log4cpp.monitor.wdg.snmp.config
Watchdog process critical
log4cpp.tm
.config
Transaction Manager Client
log4j.dtm.config
Transactional Server
log4j.hdfs.config
Hadoop/HDFS/Hbase layer
NOTE:
In Trafodion Release 1.0, all configuration files use the default log level of 'ERROR.'
Location of Log Files
Trafodion log files are located in the
$MY_SQROOT/logs
folder. A log file is generated for every master executor process. The format of the log file and the message is as follows:
master_exec__.log
:, , , , , , , , ,
Searching the Log Files
The log file for each master executor process is placed in the node in which the process is executing. Thus, there will be a different set of master_exec_*.log files in every node of the cluster. To troubleshoot a problem that could have occurred anywhere on the cluster, you will need to search the log files on every node. To facilitate this, Trafodion has implemented a table-valued function that provides an SQL interface to the contents of these log files. You can invoke this function in an SQL query. It reads all the log files across the cluster and produces an output that can be treated as an SQL table. Each event is a separate row. The columns of this table depend on the options provided.
The SQL syntax to invoke this function is:
select * from udf(event_log_reader( [options] ));
The optional [options] argument is a character constant. It accepts the value:
f: add file name output columns (see below)
Returned columns:
log_ts timestamp(6),
severity char(10 bytes) character set utf8,
component char(24 bytes) character set utf8,
node_number integer,
cpu integer,
pin integer,
process_name char(12 bytes) character set utf8,
sql_code integer,
query_id varchar(200 bytes) character set utf8,
message varchar(4000 bytes) character set utf8

If option "f" was specified, we have four more columns:

log_file_node integer not null,
log_file_name varchar(200 bytes) character set utf8 not null,
log_file_line integer not null,
parse_status char(2 bytes) character set utf8 not null

(log_file_node, log_file_name, log_file_line) form a unique key
in the result table. parse_status indicates whether there were
any errors reading the information:

' ' (two blanks): no errors
'E' (as first or second character): parse error
'T' (as first or second character): truncation or over/underflow
occurred
'C' (as first or second character): character conversion error.
Sample Queries for Capturing Event Information
To see events reporting unique constraint violations that occurred during a specific time interval, in time sequence, use this query:
select log_ts, process_name, cast(message as char(100))
from udf(event_log_reader())
where sql_code = 8102 and log_ts between timestamp '2015-01-04 12:00:00'
and timestamp '2015-01-04 13:00:00'
order by log_ts;
This query reports the number of execution errors (with sql_code between 8000 and 9000) that occurred during a specific day:
select sql_code, count(*)
from udf(event_log_reader())
where log_ts between timestamp '2015-01-04 00:00:00'
and timestamp '2015-01-04 23:59:59'
and sql_code between 8000 and 9000
group by sql_code
order by sql_code;
This query reports the number of compiler processes started by each master executor:
select process_name, count(*)
from udf(event_log_reader())
where message like '%A compiler process is launched%' and severity = 'INFO'
group by process_name ;
This query reports all the missing statistics warnings generated during a specified interval. This output can be used to issue UPDATE STATISTICS commands to improve plan quality for future runs of the same workload.
select message
from udf(event_log_reader())
where log_ts between timestamp '2015-01-27 00:00:00'
and timestamp '2015-01-27 23:59:59'
and sql_code in (6007, 6008) ;
No labels
Overview
Content Tools
Atlassian Confluence Open Source Project License
granted to Apache Software Foundation.
Evaluate Confluence today
Atlassian Confluence
8.5.31
Printed by Atlassian Confluence 8.5.31
Report a bug
Atlassian News
Atlassian
{"serverDuration": 109, "requestCorrelationId": "58e49466cce87226"}