Frequently Asked Questions

System Information
System Deployment
System Usage
System Security

System Information

Does GeneTegra perform federated queries of multiple heterogeneous data sources, without creating a physical data warehouse?

Yes. GeneTegra is a novel data integration application that allows users to generate models of heterogeneous data sources and to create integrated views of these models on demand. Users can then use these integrated views to build federated queries that are processed by GeneTegra’s query engine in order to retrieve live, current data from the underlying data sources. GeneTegra is capable of performing these queries against multiple data sources of disparate formats, including relational databases, XML information sets, Excel files, and RDF data stores.
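
The idea of a federated query can be illustrated with a minimal sketch. This is not GeneTegra's query engine; it only shows, under simple assumptions, how rows from two heterogeneous sources can be read live and combined at query time without first copying them into a warehouse. The table, column, and function names are hypothetical, and an in-memory SQLite database and an in-memory CSV string stand in for real data sources.

```python
import csv
import io
import sqlite3

# Source 1: a relational database (in-memory SQLite stands in for a real server).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE patient (patient_id INTEGER, name TEXT)")
db.executemany("INSERT INTO patient VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])

# Source 2: a delimited file (an in-memory string stands in for a file on disk).
samples_csv = io.StringIO("patient_id,tissue\n1,blood\n2,saliva\n1,tumor\n")


def federated_query(database, csv_file):
    """Join live rows from both sources at query time; nothing is persisted."""
    # Read the current patient rows straight from the relational source.
    names = dict(database.execute("SELECT patient_id, name FROM patient"))
    rows = []
    # Stream the delimited source and match each record on the shared key.
    for record in csv.DictReader(csv_file):
        pid = int(record["patient_id"])
        if pid in names:
            rows.append((names[pid], record["tissue"]))
    return rows


result = federated_query(db, samples_csv)
print(result)
```

Because both sources are read when the query runs, a change in either one is reflected in the next result without any reload step.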

Is GeneTegra platform independent?

GeneTegra is built using Java technology. It does not impose restrictions on the selection of hardware other than the ability to run the Java Runtime Environment (JRE) version 6 or higher.

What platforms have been tested with GeneTegra?

GeneTegra has undergone extensive testing under Windows Server 2008 R2, Windows Server 2003 R2, Windows 7 32-bit, Windows Vista 32-bit and Windows XP Service Pack 3.
As GeneTegra uses Java technology, it is designed to work with any operating system that supports the JRE version 6 or higher. The GeneTegra client application has been thoroughly tested on a Mac OS X system.

Does GeneTegra work with server virtualization technology?

GeneTegra has been thoroughly tested within Citrix XenApp Server 6.0. All mandatory tests from Citrix have been passed and the software has been declared a Citrix-ready product.

Does the GeneTegra system use Semantic Web technology?

GeneTegra is based on Semantic Web technologies; it uses ontology models in RDF/OWL in order to present users with views of data sources. It can interact directly with SPARQL endpoints, RDF Stores, RDF files and OWL files using the ontology models that are natively available from each of these types of data sources.
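
As a rough illustration of what interacting with a SPARQL endpoint involves, the sketch below builds (but does not send) an HTTP GET request in the style of the SPARQL 1.1 Protocol, where the query text travels in a "query" URL parameter. The endpoint URL and the function name are hypothetical, and this is not a description of how GeneTegra itself issues requests.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request


def build_sparql_request(endpoint, query):
    """Build a GET request that carries a SPARQL query in the 'query' parameter."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialization of SPARQL SELECT results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


query_text = "SELECT * WHERE { ?s ?p ?o } LIMIT 5"
request = build_sparql_request("http://example.org/sparql", query_text)

# Decode the URL again to confirm the query survives encoding unchanged.
decoded = parse_qs(urlsplit(request.full_url).query)["query"][0]
print(request.get_method(), request.full_url)
```

Passing the request object to urllib.request.urlopen would send it; that step is left out so the sketch runs without network access.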

What technology standards are employed by GeneTegra?

Consult the GeneTegra Technology page to learn about the technology standards used in the system, which include Java, RMI, RDF, OWL, SPARQL, JDBC, SQL, XML, HTTP, SOAP, and WSDL.

System Deployment

Can GeneTegra be used within a “test environment” whereby an institution can test the software prior to going into production?

In coordination with Roswell Park Cancer Institute (RPCI), a pilot test of GeneTegra was performed to assess the integration of data from multiple SQL-based data sources owned by RPCI. During testing, feature improvements to GeneTegra were developed as requested, and assistance was provided to resolve technical issues. After the pilot test was validated by RPCI, the system was licensed for live use.

Can changes made in the test or development environment be transferred to the live environment?

Yes. Models describing data sources and queries created within a test or development environment can be seamlessly transferred to a live environment through the User Interface.

How do client and server environments remain in sync?

The client environment always consults the management server in order to validate logins and verify permissions. Because the authentication and authorization information is kept on the management server, there is no need to sync. For this reason, the GeneTegra system requires that a network connection (wired or wireless) be present at all times between client and server, as the client needs to communicate with the server to perform security authentication and authorization.

Is the system designed or desired to be used with wireless access?

The GeneTegra client may communicate with the server through wireless networks, but it is not specifically designed for wireless access and performs best on high-speed wired connections.

What are the workstation requirements?

Minimum requirements:

Dual-core processor or better with minimum 2MB L2 cache

Minimum 2GB RAM

Minimum 100MB free disk space

Windows Server 2003/2008, Windows Vista/7 or Mac OS X (10.6 or higher) operating system

What is the GeneTegra licensing model?

The licensing model is as follows:

Software use license for server software, per installation

Software use license for client software, per installation

Per-database connection license (dependent on type of database)

The software use license provided is perpetual, without expiration, for the version of the system supplied, and for any bug fixes and minor upgrades provided throughout the maintenance and support period.

How does the GeneTegra system remain current with updates?

The GeneTegra client includes a feature that will check for updates automatically. Also, all updates will be available on the GeneTegra web site.

System Usage

What is involved in the GeneTegra installation process?

GeneTegra will be delivered as a download from the website in a pre-packaged binary that will self-extract and present the user with a configuration wizard allowing users to make changes from the default configuration. INFOTECH Soft is accessible by phone and email if assistance is needed during the installation process.

Does GeneTegra provide annotation support and support for data element definitions?

GeneTegra provides a mechanism for users to define and encode data element definitions as well as apply internally developed data dictionaries to models generated within GeneTegra.

How does GeneTegra manage data conversions and what kinds of data can be converted?

GeneTegra uses SPARQL, RDF and OWL to model data sources and execute queries against them. GeneTegra’s data source interfaces handle translation of SPARQL queries to the native query languages of data sources being queried and formatting of the results produced in the SPARQL result format. GeneTegra also allows for the import of data files of various delimited formats; the user has the option to convert the imported data into an SQL compatible format or an RDF format. Moreover, query results may be exported in various formats such as Excel, OWL/RDF, and delimited text (comma-, tab-, or pipe-delimited).
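
The delimited-text export options mentioned above can be pictured with a short sketch. It is only an illustration, assuming query results are available as a header plus a list of rows; the function name, the delimiter keys, and the sample data are hypothetical and do not reflect GeneTegra's own export code.

```python
import csv
import io

# The three delimited-text variants named in the answer above.
DELIMITERS = {"comma": ",", "tab": "\t", "pipe": "|"}


def export_delimited(header, rows, kind):
    """Serialize a result set as delimited text using the chosen delimiter."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=DELIMITERS[kind], lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


header = ["gene", "chromosome"]
rows = [["BRCA1", "17"], ["TP53", "17"]]

pipe_text = export_delimited(header, rows, "pipe")
tab_text = export_delimited(header, rows, "tab")
print(pipe_text)
```

Using the csv module rather than joining strings by hand means values that contain the delimiter are quoted automatically.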

What type of database does GeneTegra use?

GeneTegra uses MySQL databases (version 5.5.15 and above) for storing authentication and authorization information and audit logs. MySQL may be replaced by SQL Server or Oracle if desired. INFOTECH Soft does not supply any database management systems; it only provides the scripts needed to build the databases.

Are data transactions recoverable in the event of hardware or application system failure?

GeneTegra does not persist any queried data. It uses a caching mechanism during a session and automatically releases the data once the session terminates. GeneTegra allows the user to export data to external formats such as delimited files or Excel files. If the exported data needs to be recoverable, the user must save it in a location that is periodically backed up.

Does GeneTegra alter or update the data on data sources?

GeneTegra does not update or write to databases, except for authentication and authorization information and audit logs, which are written solely through a management console. Only one management console is allowed per server. Audit logs are generated at the server.

Can GeneTegra users customize a data model?

Yes. Models generated within GeneTegra will include all information available from the data sources. GeneTegra also offers the user the ability to augment any model by adding custom relationships, creating custom calculated fields and creating links with data element definitions.

Does GeneTegra use a graphical user interface to create, save and execute queries?

Yes. The GeneTegra Query Environment contains a tree structure representing the model being queried. This tree is used to build queries via a drag and drop mechanism. The user can either use the tree or the graphical nodes to add constraints to the query. Saving and executing queries is performed through the interface.

Can queries saved using GeneTegra be published, thus allowing authorized users to access them?

GeneTegra allows users with appropriate permissions to publish queries in order to share with others. Provided that a user has the necessary permissions, he/she can load a published query and modify it according to his/her needs.

Does GeneTegra support the use of SQL aggregate and scalar functions?

GeneTegra currently supports the following SPARQL aggregate functions that are equivalent to the corresponding SQL aggregate functions: ‘COUNT’, ‘AVG’, ‘MAX’, ‘MIN’, ‘SUM’. Two additional aggregate functions are offered by SPARQL: ‘GROUP_CONCAT’ and ‘SAMPLE’.
The SQL scalar functions supported are UCASE, LCASE, LEN and NOW. GeneTegra also includes a wider range of additional functions that can be applied to SQL databases.
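
To make the correspondence between the five shared aggregates concrete, the sketch below runs them in SQL against an in-memory SQLite table and keeps an equivalent SPARQL 1.1 query as a string for side-by-side reading. The table, the ex: prefix, and the property name are hypothetical, and the SPARQL text is shown for comparison only; it is not executed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (patient TEXT, volume REAL)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?)",
    [("p1", 2.0), ("p1", 4.0), ("p2", 6.0)],
)

# The five aggregates expressed in SQL.
sql_query = """
SELECT COUNT(volume), AVG(volume), MAX(volume), MIN(volume), SUM(volume)
FROM sample
"""
totals = conn.execute(sql_query).fetchone()

# The same aggregates in SPARQL 1.1 syntax (hypothetical vocabulary, not run).
sparql_query = """
PREFIX ex: <http://example.org/>
SELECT (COUNT(?v) AS ?n) (AVG(?v) AS ?avg) (MAX(?v) AS ?max)
       (MIN(?v) AS ?min) (SUM(?v) AS ?sum)
WHERE { ?s ex:volume ?v }
"""

print(totals)
```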

Can GeneTegra use joins and relationships between tables when constructing queries?

GeneTegra’s Query Environment provides a drag-and-drop implementation for linking/joining different tables. Two types of links exist: Joins and Relationships. Relationships are foreign key relations typically extracted directly from the databases when creating models. Joins are links embedded in the model or defined when building a query. While Relationships only support equality relations, inequalities can also be used to define Joins. Joins created in GeneTegra can also be more complex because they can incorporate scalar functions, which are evaluated as part of the link at query time.
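
The difference between the two link types can be sketched in plain SQL. The example below uses an in-memory SQLite database with hypothetical tables; it shows an equality link of the kind a foreign key Relationship expresses, a Join that adds an inequality, and a Join whose condition applies a scalar function. It illustrates the concepts only and is not the SQL that GeneTegra generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient (id INTEGER, code TEXT, enrolled INTEGER);
CREATE TABLE visit (patient_id INTEGER, code TEXT, day INTEGER);
INSERT INTO patient VALUES (1, 'ab', 10), (2, 'cd', 20);
INSERT INTO visit VALUES (1, 'AB', 5), (1, 'AB', 15), (2, 'CD', 25);
""")

# Relationship-style link: equality on the foreign key only.
relationship = conn.execute("""
SELECT p.id, v.day FROM patient p
JOIN visit v ON v.patient_id = p.id
ORDER BY p.id, v.day
""").fetchall()

# Join with an inequality: keep only visits after the enrollment day.
inequality_join = conn.execute("""
SELECT p.id, v.day FROM patient p
JOIN visit v ON v.patient_id = p.id AND v.day > p.enrolled
ORDER BY p.id, v.day
""").fetchall()

# Join with a scalar function: reconcile letter case before comparing codes.
function_join = conn.execute("""
SELECT p.id, v.day FROM patient p
JOIN visit v ON UPPER(p.code) = v.code
ORDER BY p.id, v.day
""").fetchall()

print(relationship, inequality_join, function_join)
```

The last query shows why scalar functions matter in joins: the two code columns disagree on case, so a plain equality on them would match nothing.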

Does GeneTegra offer features to assist the user in the query building process?

GeneTegra can sample data at different levels of granularity while a query is being built: a user may inspect attributes (fields within a database table) for the distribution of their distinct values, or preview the first few records of a table. This option allows the user to verify the formats and values of data points while building queries. To resolve format disagreements, GeneTegra offers a wide range of functions that may be used in the joins between tables. GeneTegra also includes a query reuse mechanism that lets the user incorporate published queries or query modules into new queries being developed. A user may test portions of a query while still working on it. GeneTegra employs parallel processing whenever possible so that snapshots of queries may be executed while the queries are still being developed. GeneTegra also guides the user in the application of filters and joins by verifying the data types of data points.
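
The two sampling operations described above, a distribution of distinct values and a preview of the first few records, correspond to simple queries. The sketch below expresses them against an in-memory SQLite table; the table, its contents, and the helper names are hypothetical, and it does not show how GeneTegra performs sampling internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE specimen (id INTEGER, tissue TEXT)")
conn.executemany(
    "INSERT INTO specimen VALUES (?, ?)",
    [(1, "blood"), (2, "tumor"), (3, "blood"), (4, "saliva"), (5, "blood")],
)


def value_distribution(connection, table, column):
    """Count how often each distinct value of a column occurs."""
    # Identifiers are interpolated directly, so only pass trusted names.
    query = (
        f"SELECT {column}, COUNT(*) FROM {table} "
        f"GROUP BY {column} ORDER BY COUNT(*) DESC, {column}"
    )
    return connection.execute(query).fetchall()


def preview(connection, table, limit=3):
    """Return the first few records of a table in insertion (rowid) order."""
    query = f"SELECT * FROM {table} ORDER BY rowid LIMIT ?"
    return connection.execute(query, (limit,)).fetchall()


distribution = value_distribution(conn, "specimen", "tissue")
first_rows = preview(conn, "specimen")
print(distribution)
print(first_rows)
```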

What options does GeneTegra offer for viewing and saving query results?

GeneTegra offers two means of viewing query results: in a graphical format or in a tabular format. In the tabular mode, the user is able to manipulate the results by showing and hiding columns and by displaying or hiding the name of the concept (table) containing the column attributes. Results can be filtered for a particular entry of text through the use of lexical matching or regular expressions. The filter process only applies to the columns that are searchable. The user can remove or add columns from the searchable list. Aliases may also be assigned to columns prior to exporting the result data. GeneTegra also enables a copy mechanism that allows the user to copy a set of columns, rows or cells. The copied data is comma delimited and it may be pasted into external applications. Results can be exported to OWL, Excel, or delimited-text files. In graphical mode, a user is able to navigate the query results in a similar fashion to the way in which the query was expressed. Additionally, the features available in tabular mode are carried over.
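
The result filter described above, plain text matching or regular expressions applied only to searchable columns, can be sketched as follows. Rows are assumed to be dictionaries keyed by column name; the function name, the case-insensitive matching, and the sample data are assumptions made for illustration rather than documented GeneTegra behavior.

```python
import re


def filter_rows(rows, searchable, pattern, use_regex=False):
    """Keep rows in which at least one searchable column matches the pattern."""
    if use_regex:
        matcher = re.compile(pattern, re.IGNORECASE).search
    else:
        needle = pattern.lower()

        def matcher(text):
            # Plain lexical match: case-insensitive substring test.
            return needle in text.lower()

    return [
        row for row in rows
        if any(matcher(str(row[column])) for column in searchable)
    ]


rows = [
    {"gene": "BRCA1", "note": "breast cancer"},
    {"gene": "TP53", "note": "tumor suppressor"},
    {"gene": "EGFR", "note": "brca-related pathway"},
]

# Only the "gene" column is searchable, so the note on EGFR is ignored.
plain_hits = filter_rows(rows, ["gene"], "brca")

# Both columns are searchable and the pattern is a regular expression.
regex_hits = filter_rows(rows, ["gene", "note"], r"^tp\d+|pathway$", use_regex=True)

print([row["gene"] for row in plain_hits])
print([row["gene"] for row in regex_hits])
```

Removing a column from the searchable list changes the outcome: with "note" searchable, the plain search for "brca" would also return the EGFR row.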

Does GeneTegra set a limit on the number of results that can be retrieved at one time?

Since GeneTegra uses a caching mechanism, the number of results that can be retrieved is limited by the amount of physical hard drive space available on the machine.

Does GeneTegra provide support for the utilization of standard ontologies?

Standard ontologies can be opened directly within GeneTegra. An existing ontology may be explored, edited and augmented using one of several graphical user interfaces.

System Security

What type of activity logs are created by GeneTegra?

The system logs all queries executed through the User Interface. These logs are stored in a local database and can be reviewed by users with the appropriate privileges. With security enabled, all activities that require permission are logged, including login and logout. This data can be extracted and reviewed by a system administrator.

How does GeneTegra securely transmit data and user authentication over the network?

Within GeneTegra, sensitive data is encrypted. Data transmissions are protected using TLS/SSL communication.

Can GeneTegra use an existing active directory for authentication?

Yes. GeneTegra also supports LDAP authentication.

What rights does a GeneTegra user account have at the database level?

GeneTegra does not override the security at the database level. A user will only be able to access a database using the credentials provided to him/her by the administrators of each database. GeneTegra’s administrator may add further restrictions by disallowing a user from accessing a database through GeneTegra even though he/she is granted access by the database itself. When accessing a database, GeneTegra prompts the user for the expected user name and password unless integrated security is used. The login information is requested every time unless the user chooses to let GeneTegra remember it. The stored login information is encrypted to ensure security.

Does GeneTegra use role-based controls limiting access to fields and data accordingly?

GeneTegra can support group-based controls. These can represent roles, but the system administrator needs to define how the roles interact. Besides the database level security put in place by the corresponding database administrators, the models generated within GeneTegra dictate the fields and data that are accessible. Limiting access to fields and data is performed by disallowing access to the models that include them. GeneTegra gives an administrator the capability of generating an unlimited number of views of the same database; these views may include or exclude any combination of fields.
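
One way to picture this arrangement is as several views over the same database, each exposing a different set of fields, with groups granted access to whole views. The sketch below is a simplified model under that assumption; the view, group, user, and field names are all hypothetical, and it does not represent GeneTegra's actual security implementation.

```python
# Two views of the same database; the second leaves out the "name" field.
VIEWS = {
    "clinical_full": {"patient_id", "name", "diagnosis", "stage"},
    "clinical_deid": {"patient_id", "diagnosis", "stage"},
}

# Access is granted per view (model), not per individual field.
GROUP_VIEWS = {
    "clinicians": {"clinical_full"},
    "researchers": {"clinical_deid"},
}

USER_GROUPS = {
    "alice": {"clinicians"},
    "bob": {"researchers"},
    "carol": set(),
}


def accessible_fields(user):
    """Collect every field exposed by the views granted to the user's groups."""
    fields = set()
    for group in USER_GROUPS.get(user, ()):
        for view in GROUP_VIEWS.get(group, ()):
            fields |= VIEWS[view]
    return fields


print(sorted(accessible_fields("alice")))
print(sorted(accessible_fields("bob")))
```

In this model, hiding a field from a group means giving that group a view that omits the field, which mirrors the point that access is limited by disallowing access to the models that include it.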

Can GeneTegra prevent users from one area of the organization from having access to patient information in other areas?

The security infrastructure within GeneTegra goes down to the database level. Restrictions within each database are expected to be established by each database administrator.

How does GeneTegra secure access to the data models, data stores, and queries?

The security model included in GeneTegra allows an administrator user to restrict other users from creating, accessing and modifying data models. The restrictions also extend to queries where a user can be prevented from creating, opening and executing queries. Query results can only be exported by users having the proper permissions. Database access restrictions are governed by the licensing infrastructure put in place. An administrator may select a list of databases that are authorized to be used within GeneTegra.