The Make2D-DB II Package

version: 2.00.1 (December 2005)



The Make2D-DB II Site - For up-to-date documentation, news and FAQ



Read-Me: Main


Introduction


The Make2D-DB II is an environment to create, convert, interconnect and keep up-to-date 2-DE databases. Being open source (cf. license), it is distributed totally free of charge.

The tool converts federated 2-DE databases from text format (flat file format), as well as spreadsheets (e.g. Excel reports) or simple text reports, into a relational format. It can also create new relational databases from scratch. It runs on any UNIX-type operating system (Linux, Solaris/SunOS, IRIX). The tool is under continuous development and evolves in concert with the current Proteomics Standards Initiative.

By design, the Make2D-DB II ensures a very high consistency of data. It allows dynamic interconnection between any number of remote databases and offers many other features, including automatic data updates from external data resources, dynamic cross-references to similar databases, and an intuitive search engine with data visualization and exports in various formats.

With this tool, one can easily convert any existing personal 2-DE data (including databases built with the first make2ddb package or following the SWISS-2DPAGE conventions, cf. http://www.expasy.org/ch2d/ ) into a more reliable format.

In addition to those features, the tool extends the concept of 2-DE databases to cover a more elaborate description of proteomic experiments and analyses. Currently, it can integrate various annotation documents (e.g. sample and gel preparation), as well as many common analysis file formats (e.g. mass spectrometry, mzData, etc.). Users can also define their own types of annotations and easily integrate them into their data representation. The appearance of the Web interface itself can be personalized to suit individual taste.


Features:

The tool can be run without any specific technical knowledge.

In its current version (2.0), it thoroughly checks the syntax and consistency of the data to be converted (if available), points out any ambiguity or error it encounters, and in many cases suggests corrections. If the maps are annotated with the Melanie / ImageMaster™ 2D Platinum software, its XML or text report exports are read directly; otherwise, simple text files (referred to in this document as reports), spreadsheets (tab-delimited exports from Excel) or SWISS-2DPAGE-like flat files are adequate.

Once the maps and spot annotations are captured, the data is converted into a consistent relational database schema composed of four distinct parts: one holds the 'core' data (including not-yet-public data, as well as all data management operations), a second holds the publicly accessible data, a third keeps track of all modifications applied to the database throughout its history, and a fourth holds the common procedures needed by both the 'core' and public schemas.

Once the database is installed, queries can be addressed through the provided Web interface, which is easily configurable and adaptable to personal tastes and needs. A graphical viewer is also included to display maps and spot annotations. The interface can query several local or remote databases simultaneously and can also 'talk' to other similar interfaces, so a single query may be addressed to several data resources at once. Results are then consolidated before being displayed to the user.

The interface also makes it possible to extract, in real time, many types of data or search results in different formats, so that they can be imported into other programs or simply viewed and integrated as is in other viewers or in separate Web pages. In addition, an optional URI-like syntax lets you extract objects or launch search queries in a very intuitive way. Several attributes can be given to define the data representation (e.g. the output format) and its purpose (e.g. extracting results for an export or simply viewing them graphically).

A main feature of the tool is that it can gather, integrate and update data from external Web resources in a completely transparent manner. Dynamic cross-references to similar data resources are also provided automatically. This data can be managed and kept up to date continuously and easily.

An additional update interface will be provided in the near future to cover the full management of local data. If you register and install the tool, you will be notified of all major changes and updates. Forthcoming versions of the tool should always allow you to convert a previous installation to an up-to-date one in an almost transparent manner.


Some comments:

You should be aware that the pre-defined queries of this version do not take full advantage of the new data model. Suggestions from users for pre-defined queries are more than welcome.

Because we aim to convert already existing data into a newly designed relational format, some flexibility has been intentionally adopted to accommodate the commonly used federated SWISS-2DPAGE syntax. The goal is to make the conversion as smooth as possible and to spare users from rewriting their data, even where more precision would have been desirable. Subsequent versions of the tool should gradually move the presentation of data from an 'entry-like' view to a 'spot-like' one.

The graphical interface has been tested with a range of configurations, using different browsers (Explorer, Mozilla, Netscape, Firefox, ...) on different operating systems (Linux, Windows, Mac, ...). We would nevertheless highly appreciate your remarks concerning any problematic behavior you may encounter.


Some facts about the WEB query interface:

The Web query interface is intended to query any number of databases built with this tool. It is not limited to a single database and can manage as many databases simultaneously as you wish. Those databases do not need to be installed on the same server: the interface can address remote databases as long as their Web addresses or connection parameters are known and accessible. This means that you can use this tool to host an interface that queries several remote databases, even without building a database of your own. To use the tool for this purpose, you only need to generate a server configuration file, as described later, and to launch the tool with the appropriate server installation option.
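
The idea of addressing one query to several databases and consolidating the results can be illustrated with a minimal Python sketch. It does not reproduce the tool's actual implementation: the function name query_all, the per-database search callables and the record fields are all hypothetical, chosen only to show the principle.

```python
def query_all(databases, term):
    """Send one search term to several databases and merge the answers.

    `databases` maps a database name to a callable taking the search term
    and returning a list of records (dicts). Both are illustrative stand-ins
    for local or remote databases; they are not part of the package.
    """
    consolidated = []
    failures = {}
    for name, search in databases.items():
        try:
            records = search(term)
        except Exception as exc:
            # An unreachable database should not prevent the others from answering.
            failures[name] = str(exc)
            continue
        for record in records:
            # Tag each record with the database it came from.
            consolidated.append({"source": name, **record})
    return consolidated, failures
```

In this sketch, each record is tagged with its source so that the consolidated list still shows which database answered, and a failing database is reported separately instead of aborting the whole query.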




Installation


1.1   BEFORE USING THE TOOL, YOU SHOULD:

* Many of the following tasks are usually performed by the system administrator.

- A few versions of the DBI module did not work properly with the query interface. If you encounter any problem at this level, upgrade to a recent version of DBI (e.g. 1.48 or higher).

- A recent addition to the package is the Phenyx module InSilicoSpectro, which is used to read mass spectrometry files in a range of formats. A special version has been arranged to work with this package with a minimal set of installation requirements. In some cases, you will probably need to install the Compress-Zlib module from CPAN if you want to read and include data from your MS files. A recent copy of this module is included in this package distribution (libs/Compress-Zlib.tar.gz).



1.2   THE CONVERSION PROCESS AND THE SERVER SET UP:

 

All output and error messages from your last check or conversion operation are stored in two separate files, last_STDOUT.log and last_STDERR.log, found in the main directory, so you can inspect them at any time. If you encounter a problem, please always include both files with your message when contacting us. The last_STDOUT.log file describes the check and installation process in detail, and it also contains important messages and warnings relevant to your data conversion. It is always a good idea to look at this file after any installation, even if everything seems to have run correctly, and to check for potential errors or warnings, for example by searching the file for the two words 'error' and 'warning'.
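
Searching the two log files for 'error' and 'warning' can be automated. The following Python sketch is one possible way to do it and is not part of the package. It assumes it is run from the main directory, and the function names scan_log and scan_logs are hypothetical.

```python
from pathlib import Path

KEYWORDS = ("error", "warning")


def scan_log(path, keywords=KEYWORDS):
    """Return (line_number, text) pairs for lines mentioning any keyword."""
    hits = []
    with open(path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            lowered = line.lower()  # case-insensitive match
            if any(word in lowered for word in keywords):
                hits.append((number, line.rstrip("\n")))
    return hits


def scan_logs(directory=".", names=("last_STDOUT.log", "last_STDERR.log")):
    """Scan the log files that exist in `directory`; missing files are skipped."""
    report = {}
    for name in names:
        path = Path(directory) / name
        if path.is_file():
            report[name] = scan_log(path)
    return report


if __name__ == "__main__":
    for name, hits in scan_logs().items():
        for number, text in hits:
            print(f"{name}:{number}: {text}")
```

The match is a plain case-insensitive substring search, so it also reports lines such as "0 errors found"; the output is a starting point for reading the logs, not a replacement for it.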




1.3   QUERYING THE NEWLY CONVERTED DATABASE:

 

If you have used the '-m transform' option, you should now be able to query your converted database. Assuming that you have set your '$web_server_ref_name' variable to '2d' (cf. Readme: Configuration for more details), type the following address in your browser to navigate through the query interface:
 - http://your_domain_name/2d/admin

You will then be asked to enter your postgres login_id and password (the postgres account used to create the database). You will also be asked for a database name if your interface is managing several databases simultaneously. Your login information expires after 20 minutes of inactivity.



1.4   THE DATABASE CROSS-REFERENCE LINKS:


A default file listing URL links (Web addresses) for different database cross-references (including those for the DR lines) is provided with this package, in the "text" directory. The file name is 'links.txt'. When you install a new database and allow external data retrieval, an extra file, 'DbCrossRefs.txt', is automatically downloaded from the ExPASy server and added to the same directory. If this file has been downloaded, it is used instead of the default links.txt to define your cross-reference links. You can let the tool use this file as it is, or edit it yourself to update or add URLs.

You should not edit 'DbCrossRefs.txt' directly inside the "text" directory: your changes would apply to subsequent installations, but the file itself may be replaced by a newer version from the ExPASy server (if the file is more than a few days old). The best way to make your changes permanent for a specific database is to edit 'links.txt', once your installation is complete, in the Web server directory to which it has been copied (by default, the copy should be found in '/www/var/cgi-bin/2d/inc/links.txt' or a similar path).

The easiest way to edit this file is with a simple text editor. Another way is to use the Perl module 'DbCrossRefs.pm' provided with this package (this module is located in the "lib2d" directory and is copied to your '/www/var/cgi-bin/2d/inc' directory during the server installation process). Type 'perldoc DbCrossRefs.pm' for a brief description of the module, or read the DbCrossRefs.html file for more details.
 

Those files (links.txt, or the updated DbCrossRefs.txt) list the database links, one link per line. The format of each line is:

database_name   database_URL

The database_name is the name of the database itself, while the database_URL is the URL address of the displayer of this database. e.g.

SWISS-2DPAGE   http://www.expasy.org/cgi-bin/nice2dpage.pl?{0}

The '{0}' is a placeholder that the tool replaces with the primary accession number for this database.
If the database displayer requires more than one parameter, represent the subsequent parameters by {1}, {2}, and so on.

e.g. DATABASE_NAME http://genome-www5.stanford.edu/cgi-bin/SMD/source/sourceResult?criteria={0}&choice=Gene&option=Symbol&organism={1}
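
The line format and the placeholder substitution described above can be sketched in a few lines of Python. This is only an illustration and not the code of the DbCrossRefs.pm module: the function names parse_links and build_url are hypothetical, and the sketch assumes that the database name contains no whitespace and is separated from the URL by spaces or tabs.

```python
def parse_links(text):
    """Parse 'database_name   database_URL' lines into a dictionary."""
    links = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip empty lines
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) != 2:
            continue  # skip lines that carry no URL
        name, url = parts
        links[name] = url.strip()
    return links


def build_url(template, *params):
    """Replace {0}, {1}, ... in the template with the given parameters."""
    url = template
    for index, value in enumerate(params):
        url = url.replace("{%d}" % index, str(value))
    return url
```

For example, with the SWISS-2DPAGE line shown above and a made-up accession number 'P12345', build_url(links["SWISS-2DPAGE"], "P12345") gives http://www.expasy.org/cgi-bin/nice2dpage.pl?P12345. Plain string replacement is used rather than a formatting function so that any other characters in the URL are left untouched.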

Each time you use the external data update interface, an additional file called 'links.txt.uniprot' is refreshed and placed in the server /2d/data/tmp directory. This file is the primary source for the DR cross-reference lines displayed for entries, as long as the variable '$EXTERNAL::update_from_SwissProt' in your basic_include.pl configuration file is activated (not set to 0).

Note: whenever a cross-reference database is listed neither in links.txt nor in links.txt.uniprot, the query interface will try to extract it from a table called 'XrefDB' within the PostgreSQL database.
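
The lookup order described above (links.txt.uniprot first when the update variable is active, then links.txt, then the 'XrefDB' table) can be summarized by the following Python sketch. It reflects one reading of this section and not the interface's actual code. The function name resolve_template is hypothetical, the two link files are represented as dictionaries, and xref_lookup stands for whatever query is run against the 'XrefDB' table.

```python
def resolve_template(db_name, uniprot_links, local_links, xref_lookup,
                     update_from_swissprot=True):
    """Return the URL template for db_name, or None if it is unknown.

    uniprot_links: entries read from links.txt.uniprot
    local_links:   entries read from links.txt
    xref_lookup:   hypothetical callable querying the 'XrefDB' table
    """
    # links.txt.uniprot is the primary source only while the update flag is active.
    if update_from_swissprot and db_name in uniprot_links:
        return uniprot_links[db_name]
    if db_name in local_links:
        return local_links[db_name]
    # Last resort: ask the database itself.
    return xref_lookup(db_name)
```

The update_from_swissprot argument plays the role of '$EXTERNAL::update_from_SwissProt': passing False corresponds to setting the variable to 0, in which case links.txt.uniprot is ignored.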




1.5   PACKAGE CONTENTS:


Important: This section lists the files included in this package. Even if you are not interested in the technical aspects of the tool, it is highly recommended that you read both the data_test and data_test/examples sub-sections, as they give details on the formats provided as examples for data preparation.
The main directory
The lib2d directory
The pgsql directory

The text directory

The src directory


The data_test/examples directory

 

As the tool understands different types and formats of data input, this directory contains a set of examples showing how to prepare data for the installation process. The format you are likely to adopt depends on the data you have available and on the granularity of the annotations you would like to include in your database.


The http_server/divers directory



The http_server/icons directory

The lib directory
This directory contains some of the public libraries that you may need to install on your system, or information on how to get and install them when needed. The contents of this directory do not follow the general license terms stated in the License.txt file in the main directory.


The readme directory
This is where the readme and HTML files are located.

A sub-directory called 'examples' holds some files used by the readme files.

Another sub-directory, called 'database_schema', includes five sets of documentation files describing the structure and functions of the four database schemas.
This sub-directory is not present in distributions marked as 'light'.

The first set of files, prefixed by 'all', covers the whole database structure. The other four sets each cover the structure of one schema (common, core, log and public).

The temp directory

This directory will be used by the tool to store temporary files.
This directory will also contain dumps (backups) of your previous databases, since any existing database is 'dumped' by the tool whenever a new installation replaces it.
You can delete anything from this directory.
Note: depending on your use of the tool, additional files will be generated or downloaded into your package directories.




1.6   DATABASE SCHEMA:


Relational PostgreSQL database schema in HTML summary:




1.7   TO DO LIST:


1.8   CONTACT:

- For up-to-date documentation, news and FAQ, as well as the latest version, visit the Make2D-DB II Site

- For more information, visit the SWISS-2DPAGE site at http://www.expasy.org/ch2d/

- For any question or comment, please feel free to send an e-mail to: khaled.mostaguir@isb-sib.ch

