The Make2D-DB II Package
version: 2.00.1 (December 2005)
Read-Me: Main
Related documents:
Introduction
Make2D-DB II is an
environment to create, convert, interconnect and keep up-to-date 2-DE
databases. Being open source (cf.
license), it is distributed free of
charge.
The tool converts federated
2-DE databases from text format (flat file format),
as well as spreadsheets
(e.g. excel reports) or simple text
reports, into a
relational format. It also creates new relational databases from
scratch.
It runs on any UNIX-type operating system (Linux, Solaris/SunOS, IRIX).
Being
continuously developed, the tool is evolving in concert with
the current Proteomics
Standards Initiative.
By
design, Make2D-DB II ensures a very high consistency of data.
It allows dynamic interconnection between
any number of remote
databases and offers many other features. Those features include
automatic data
updates related to external data resources, dynamic cross-references to
similar databases, an intuitive
search engine with data visualization and exports in various
formats.
With this tool, one can
easily convert any existing personal 2-DE data
(including databases
built with the first
make2ddb package or following the SWISS-2DPAGE
conventions, cf. http://www.expasy.org/ch2d/) into a more reliable format.
In addition to those
features, the tool extends the concept of 2-DE
databases to cover a more elaborate description of
proteomic experiments and analysis. Currently, it can integrate various
annotation documents (e.g. sample and gel preparation), as well as many
common analysis
file formats (e.g. mass spectrometry files, mzData, etc.). Users can also
define their own types of annotations and easily integrate them within
their data representation. The appearance of the Web interface itself
can be customized to suit individual preferences.
Features:
The tool can easily run
without any specific technical knowledge.
In its current version
(2.0), it thoroughly checks the syntax
and consistency of the data to be converted (if available), points
out any ambiguity or error it encounters,
and suggests corrections in many cases. If the maps are
annotated
with the Melanie / ImageMaster™
2D Platinum software, XML or text report exports from this software
are
read directly;
otherwise simple text files (referred
to in this document as reports), spreadsheets (tab-delimited exports from Excel) or
SWISS-2DPAGE-like flat files are adequate.
Once the maps and spots'
annotations
are captured, the data is converted into a consistent relational
database schema. This schema is composed of four distinct parts: one to
hold the 'core' data (including the not-yet-public data, as well as all
operations concerning data management), a second for publicly
accessible
data, a third that keeps track of all modifications applied to the
database throughout its history, and a fourth to hold
common procedures needed by both the 'core' and public schemas.
Once the tool is installed, queries can
be addressed using the provided Web
interface, which is easily configurable and adaptable to
personal tastes and needs.
A graphical viewer is also included to display maps and
spot
annotations. The provided interface can query
several local or remote databases at once and is
also able to 'talk' with other similar interfaces. This
means that queries may be addressed to several data resources
simultaneously. Results are then consolidated before being displayed to
the
user.
The interface offers an
additional feature: the possibility to
extract (in real time) many types of data or search results in
different formats, so they may be imported into other programs, or
simply be viewed and integrated as is in other viewers or in
separate Web pages. In addition, there is an optional URI-like
syntax meant to extract objects or launch search queries in a very
intuitive way. Several attributes can then be given to define the data
representation (e.g. the output format) and what it is needed for (e.g.
extracting results for an export or just viewing them graphically).
A main feature of the tool
is that it can gather,
integrate and update
data
related to external
information from the Web in a completely transparent
manner.
Dynamic cross-references to similar data resources are also
automatically
provided. This data can continuously and easily be managed and kept
up-to-date.
An additional update
interface will be provided in the near
future to cover all the management of local data. If you
register
and install the tool, you will be notified
of
all major changes and updates. Forthcoming versions of the tool
should always
allow you to easily convert from a previous installation to an
up-to-date
one in an almost transparent manner.
Some comments:
You should be aware that the
pre-defined queries of this version do
not
take full advantage of all the strength given by the new data model.
Any suggestions from users for
pre-defined
queries are more than welcome.
Because we aim
to convert already existing data
into a newly designed relational format, some flexibility has been
intentionally
adopted to suit the commonly used federated SWISS-2DPAGE
syntax. The goal is to make the conversion as smooth as
possible and to spare users from re-writing their data wherever
more precision
would have been desirable. Subsequent versions of the tool should
gradually move the data presentation from an 'entry-like' view to a 'spot-like' one.
The graphic interface has
been tested with a range of different
configurations, with different browsers (Explorer, Mozilla,
Netscape, Firefox, ...) on
different
operating systems (Linux, Windows, Mac, ...). In the meantime, we would
highly appreciate your remarks concerning any problematic behavior
you may
encounter.
Some facts about the WEB
query interface:
The Web query interface is
intended to query any number of databases
built with
this tool.
It is not limited to querying a single database, as it can hold
as many databases as one may wish at the same time. Those
databases
do not need to be installed on the same server. The interface can address remote
databases as long as their Web addresses or connection parameters
are known and
accessible. This
means
that you can still use this tool to host an interface that queries
several remote databases, even without building your
own.
To use the tool for this purpose, you only
need to generate a server configuration file, as described later, and
launch the tool with the appropriate server installation option.
Installation
1.1
BEFORE USING THE TOOL, YOU SHOULD:
* many of the following tasks are usually
to be performed by the system administrator
- Install Perl version 5.6
or
higher (Perl/ActivePerl 5.8 recommended), make
sure you also include the perl DBI
module and the PostgreSQL driver DBD::Pg
(check for DBD::Pg that it has been compiled with the POSTGRES_LIB
variable pointing to the correct version of your postgreSQL libraries)
- If you use Perl 5.005: you will have to check for the 'libwww-perl'
perl library in order to retrieve external information related to
your
data via the Web. If the library is not present, a
recent
version is included within the package; you can also check the CPAN server
(http://www.cpan.org) for the most
recent
version and how to install it.
- A few versions of
the DBI module did not work properly with the query interface. If
you encounter any problem at this level, upgrade to a recent version of
DBI (e.g. 1.48 or higher)
- A recent addition to the package is the
Phenyx module
InSilicoSpectro,
which is used to read mass spectrometry files from a range of various
formats. A special version has been specifically arranged to work with
this package with a minimum set of installation requirements. In some
cases, you will probably need to install the
Compress-Zlib
module from CPAN if you want to read and include data from your
MS files. A recent copy of this module is to be found within this
package distribution (
libs/Compress-Zlib.tar.gz
).
- Install and configure an HTTP server (Apache
is highly recommended, and required if you plan to use the mod_rewrite
redirection rules)
- Install and run PostgreSQL
PostgreSQL is an open source object-relational database
management
system that can be found at: http://www.postgresql.org
- download and install postgreSQL version 7.4 or higher
(version 8.0 has been fully tested, and only recently version
8.1) for your
system.
- To use postgreSQL, the owner of the postgres server daemon
(typically the database superuser,
generally called 'postgres' or 'pgsql'; check with your system
administrator) should
have run and checked the following commands (or similar ones,
assuming
the bin files are in the /usr/local/pgsql/bin directory
and
the lib files in /usr/local/pgsql/lib):
- Verify that the pgsql binaries are added to your PATH
variable (e.g.
by setting: PATH=$PATH:/usr/local/pgsql/bin)
- Export the environment variable "PGDATA" pointing to the
directory
where the database is to be located, as well as the environment
variable
"PGLIB" pointing to where the postgreSQL libraries are to be found.
e.g. (depending on your system)
$ export PGDATA=/a_path_to_store_your_data/postgresql (this path will be
referred to as the [new_database_path])
$ export PGLIB=/usr/local/pgsql/lib
- Initialize the database cluster (only once)
/usr/local/pgsql/bin/initdb -D [new_database_path]
(-D "new_database_path" can be omitted if "PGDATA" is properly defined)
As you will need to allow non-local connections, you will
probably have to add more
"host" records in the generated configuration file pg_hba.conf
created in the
directory [new_database_path].
You should at least allow the "localhost" connections by making sure
the line:
"host all all 127.0.0.1 255.255.255.255 trust"
is not commented (by removing the '#' character at its beginning). You
may keep or modify the trust parameter.
Also, remember TCP/IP connections are only enabled if you do not
disable "tcpip_socket" in the file postgresql.conf
- Start the postmaster process (the database server)
by:
nohup /usr/local/pgsql/bin/postmaster -i -d 1 -o -e -D [new_database_path] > [new_database_path]/server.log 2>&1 &
The "-i" option is necessary to use postgreSQL with
networking.
The use of the switch "-d [1-5]" is to include debug information in
the server.log file; debug information is redirected with " >
path/server.log
2>&1". Both this switch and the redirection can be omitted if
not desired.
- Add the PL/pgSQL procedural language to the
created database (to be done only once):
/usr/local/pgsql/bin/createlang plpgsql template1 --pglib "/usr/local/pgsql/lib/"
(--pglib "path" can be omitted if "PGLIB" is defined)
Tip:
Another simple way to start / restart / stop
the
postmaster
server is to use pg_ctl:
nohup pg_ctl -o "-i" -D [new_database_path] [start|restart|stop] &
(e.g. nohup pg_ctl -o "-i" -D /path/my_database -m smart restart &)
If postgreSQL binaries and libs are not in /usr/local/pgsql/,
then modify the previous paths.
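As a quick sanity check of the pg_hba.conf note above, the following shell sketch tests whether an uncommented "host all all 127.0.0.1" record is present. The function name localhost_allowed is hypothetical; it is not part of this package or of postgreSQL. The pattern assumes the record layout quoted above (or the equivalent address/mask form).

```sh
# Hypothetical helper (not part of Make2D-DB II or postgreSQL).
# Succeeds if the given pg_hba.conf contains an uncommented
# "host all all 127.0.0.1 ..." record.
localhost_allowed() {
    conf=$1
    grep -E -q '^[[:space:]]*host[[:space:]]+all[[:space:]]+all[[:space:]]+127\.0\.0\.1([[:space:]]|/)' "$conf"
}

# Usage (assumes PGDATA points to [new_database_path]):
#   localhost_allowed "$PGDATA/pg_hba.conf" || echo "localhost record missing or commented out"
```

The check only looks at the record itself; it does not verify the authentication method (e.g. trust), which you may keep or modify as described above.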
Important:
The database holder (the superuser) will now own the
postmaster
process and the new databases to be created.
Although this is possible, you will not want this general postgres
superuser to directly edit your specific new 2-DE database.
Instead, let the superuser (who may be you or your system
administrator) create another new user
(named after the postgreSQL account you plan to use with the 2-DE
database, possibly your own personal account).
Make sure this new user has the right to create both databases and
other new users.
To create a new user that will own the 2-DE database, the
superuser
should execute the following SQL command:
CREATE USER newUser_loginID CREATEDB CREATEUSER
If a password is desired, then state it by:
CREATE USER newUser_loginID WITH PASSWORD 'newUser_loginPassword' CREATEDB CREATEUSER
The Make2D-DB II tool will automatically create an additional new
postgreSQL user called (by default)
'select2d' who has no special rights except performing
select queries on the 2-DE database public schema.
If, as a postgreSQL user, you have not been granted the right to create
other
users, then ask the postmaster process owner (the superuser) to create
a user called 'select2d'
(by typing the system command: "createuser select2d", or simply by
executing
the SQL command: "CREATE USER select2d" within a psql session).
For more details, you can check the following link: PostgreSQL
Technical Documentation (8.1).
1.2
THE
CONVERSION PROCESS AND THE SERVER SET UP:
- Go to the directory containing the Make2D-DB II scripts
and
generate/configure
the two configuration files:
- include.cfg (configuration
file for data conversion process)
- 2d_include.pl
(configuration
file for the server WEB interface)
as well as the maps' file
by typing the command:
perl make2db.pl -m config
as described in the Readme:
Configuration
file
There is a third file named basic_include.pl
located in the config
directory. This configuration file contains some
extra definitions that you may occasionally want to change. If some of
those definitions do not suit you, then feel free to change them by
editing this file using any text editor (however, do not change any
definition coming after the section labeled 'DO
NOT EDIT OR CHANGE ANYTHING
STARTING FROM HERE!!').
- You are now ready to launch the conversion process
Go to the Make2D-DB II
directory and type the command:
perl make2db.pl -m [option] (for
more details, type: perldoc make2db.pl or perl
make2db.pl
-help)
where option is one of:
config
-> Set up the configuration files for both the database installation
and the
Web server pages
check
-> Check the syntax and the consistency of your data to be
converted,
no further operation is launched. The process is interrupted when a
major
error is encountered
check-report
-> Same as 'check', except it does not stop on major errors (check
the last_STDOUT.log file from
the main directory for a summary check report)
create
-> Create the relational schema for a new database from scratch (no
data), no data conversion is done
transform
-> Combine the 'check' and 'create' options (as well as the 'server'
option, see later): check the
database
syntax,
create the relational schema and load data into the new database.
This also prepares the files for the Web server.
update
-> For further updates of the schema structure and related server
scripts (not implemented yet)
server
-> This option can be used independently if you wish to host an
interface
to query other remote databases without even having your own database.
It is also to be used when the 'transform' option cannot fully set up
the HTTP Web server pages due to invalid
permissions. The script can then
be *re-executed* with this option to only set up the HTTP Web server
files (probably as a 'sudo' command).
e.g: type 'perl
make2db.pl -m transform'
to
convert
your existing database into the new relational format
All output messages and error messages of
your last check or conversion operation are
stored in two separate files, so you may inspect them
at any moment [last_STDOUT.log
and last_STDERR.log, to be found in the main directory]. For
any problem you encounter, please always include those two files with
your message when contacting us. The last_STDOUT.log file describes in
detail the check and installation process, and it also contains
important messages and warnings relevant to your data conversion. It is
always a good idea to have a look at this file after any installation,
even if everything seems to have been executed correctly, and look
for any potential errors or warning messages, for example by
searching the file for the two words 'error' and 'warning'.
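The search suggested above can be done from the command line. The sketch below is only an illustration: the function name scan_log is hypothetical, and the case-insensitive match is an assumption, since the exact wording of the log messages is not described here.

```sh
# Hypothetical helper: list, with line numbers, every line of a log file
# that mentions 'error' or 'warning' (case-insensitive).
# Defaults to last_STDOUT.log in the current directory.
scan_log() {
    log=${1:-last_STDOUT.log}
    if [ ! -f "$log" ]; then
        echo "log file not found: $log" >&2
        return 2
    fi
    grep -n -i -E 'error|warning' "$log"
}

# Usage, from the Make2D-DB II main directory:
#   scan_log                  # inspects last_STDOUT.log
#   scan_log last_STDERR.log  # inspects the error output file
```

A non-zero exit status with no output means that no matching line was found.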
To check your data without building any database yet,
use the '-m check' option. If
the tool encounters any fatal error, it simply halts the process,
pointing you to the source of the error. When many errors seem to be present,
you should rather use the '-m
check-report' option
to check all your data without interruption. You should then open and
inspect
the generated 'last_STDOUT.log' file and correct all the
described
sources of error within your data.
The option to convert a database is '-m transform'. This combines
the 'check', 'create' and
'server' options.
Type:
perl
make2db.pl -m transform
The
process
starts by extracting the maps and spots annotations from the maps /
list files.
If not provided, an intermediate flat file (similar to the SWISS-2DPAGE
text file) is automatically generated. This file is then thoroughly
checked, ambiguities are pointed out and
errors
described. You will have to correct major errors yourself, as the
syntax
checker will stop the process and display the source of the error
(sometimes
with a proposed correction). For some minor
errors/ambiguities,
you will be asked to choose between accepting some behavior or
stopping the process and correcting them manually.
Once the syntax checking part is successful, the tool connects to
the
Internet to gather some additional external data related to your
entries
(if you have not disabled this function in your configuration files).
It is highly recommended that you provide accession numbers similar to
those of Swiss-Prot/TrEMBL (cf. UniProtKB user manual)
or at least provide cross-references to Swiss-Prot/TrEMBL (by providing
your own
flat file or by editing the one generated by the tool) to get the
maximum use of this external data gathering feature.
After
that step, the database schema is constructed (schema and tables
created,
functions,
rules and triggers loaded). Please, note that you should not run
any
postgreSQL/psql
session on your database (if it already exists) during this phase.
The data to be converted is then loaded into the relational
tables.
The 'view' functions and tables are then constructed
(auxiliary
tables dedicated to preparing static views that accelerate query
execution).
This operation may take relatively long, depending on the size of your
database. The database is completely converted once some additional
technical
information has been added to it.
- The Web Server settings
The final step will be the preparation of the Web server
interface.
If you have full write permissions on the HTTP server directories,
things
will
go smoothly and you will not have to care about the following
notes.
Otherwise, you will be notified that you do not have the required
permissions
for some of the operations. To complete those operations,
you should wait until the script has finished and then re-execute the
script
with
a user ID that has write permissions in the HTTP server directories
(check with your system administrator or use a 'sudo'
command)*. This
time, you should use the option 'server'
with the script:
perl make2db.pl -m server
This operation should take only a few seconds, as it skips all the
previous
operations already completed, except those concerning the server
settings.
(Another option
would be to create the needed
directories
manually and give them the right permissions. As this may be quite a
complicated task, you may contact us for more details)
If, for some reason, some
default
directories (icons, tmp, html) are
not
recognized by your server, you should declare them in the server
configuration
file. For Apache, edit the httpd.conf file by adding blocks of the
format
(see with your system administrator):
Alias /icons/ "/www/var/new_directory/"
<Directory "/www/var/new_directory">
Options Indexes MultiViews
AllowOverride None
Order allow,deny
Allow from all
</Directory>
the path
'/www/var/new_directory' could be different on your
system.
Once done, you should then
restart the HTTP server for the changes
to take effect.
End of
Installation:
You will now find the server configuration file "2d_include.pl"
in the newly created server directory '/var/www/cgi-bin/2d/inc/'*.
By
editing
this file at any moment, you will be able to change the behavior of
your
Web interfaces, e.g: background color, logo image name, displayed name
of the database, copyright message, but also more technical parameters
for the databases to connect to.
Another configuration file, called "basic_include.pl", is to be
found in the directory '/var/www/cgi-bin/2d/inc/'. This file
controls very specific behaviors of your interface, including how to
deal with external data updates and which other remote
interfaces to link with your own. In general, you will seldom need to
edit this
file.
*("/var/www/" is, in many systems, the
default path to the Apache HTTP server directories. In some
installations it may be located at another path)
1.3
QUERYING
THE NEWLY CONVERTED DATABASE:
If you have used the '-m transform'
option, you should now be able to
query your converted database. Supposing that you have configured
your '
$web_server_ref_name'
variable to '2d' (
cf.
Readme: Configuration for more
details), then type the following address in your browser
to navigate through the query interface:
- http://your_domain_name/2d/ ( http://your_domain_name/html/2d/
on some systems )
(If you have set the server scripts and documents manually, then
you
should rather type: http://your_domain_name/cgi-bin/2d/2d.cgi)
Where your_domain_name is the name or IP address of your
domain.
To access the management interface as the database administrator, you
should type:
-
http://your_domain_name/2d/admin
You will then be asked to enter your postgres
login_id and password (the postgres account used to create the
database). You will also be asked for a database name if your interface
is managing several databases simultaneously. Your login information will
expire after 20 minutes of inactivity.
Remark: If you are working on the same machine that hosts your HTTP
server, you may also replace your_domain_name by localhost:
- http://localhost/2d/
Finally, make sure your browser is configured to accept cookies.
1.4
THE
DATABASE CROSS-REFERENCE LINKS:
A default file listing some URL links
(Web addresses) to different database
cross-references (including those for the DR lines) is provided within
this package (within the "text"
directory).
The file name is 'links.txt'.
When you install a new database and allow external data retrieval, an
extra file
is then automatically downloaded from the ExPASy server and is added to
this same
directory
(the file name is 'DbCrossRefs.txt').
If this file has been downloaded, then it will
be used instead of the default
links.txt to define your
cross-reference
links.
You can let the tool use this file as it is, or choose to edit it
yourself to update or add more URLs.
You should not update this 'DbCrossRefs.txt' file directly
from inside the "text"
directory:
the changes would apply
to your subsequent installations, but the file itself may be
replaced
again by a newer version extracted from the ExPASy server (if the file
is more than a few days old).
The best way to make your changes permanent for a specific database is
to edit the file called 'links.txt',
after your installation is complete, in your Web
server directory where it has been copied (by
default the copy of this file should be found in '/var/www/cgi-bin/2d/inc/links.txt'
or a similar path).
The easiest way to edit this file is to
use a simple
text
editor. Another way is to use the Perl module 'DbCrossRefs.pm'
provided
with this package (this module is located in the "lib2d" directory,
and
is copied to your '/var/www/cgi-bin/2d/inc/' directory during the
server
installation process). Type 'perldoc
DbCrossRefs.pm' for a brief description of the module, or simply
read the DbCrossRefs.html file
for more details.
Those files (links.txt, or the updated
DbCrossRefs.txt) list the
database links, one link per line. The
format
of each line is:
database_name
database_URL
The database_name is the name of
the database itself, while
the
database_URL
is the URL address of the displayer of this database. e.g.
SWISS-2DPAGE
http://www.expasy.org/cgi-bin/nice2dpage.pl?{0}
The '{0}' is a pattern
that the tool will replace by the
given
primary accession number related to this database.
If the database displayer requires more than one parameter, replace
the subsequent parameters by {1}, {2}, and so on,
e.g: DATABASE_NAME
http://genome-www5.stanford.edu/cgi-bin/SMD/source/sourceResult?criteria={0}&choice=Gene&option=Symbol&organism={1}
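To make the placeholder convention concrete, here is a minimal shell sketch of the substitution. It is not the tool's own code (the package performs this step itself); the function name expand_link and the sample values P02768, BRCA1 and Hs are made up for illustration.

```sh
# Hypothetical helper: expand_link URL_TEMPLATE VALUE0 [VALUE1 ...]
# Replaces every literal {0}, {1}, ... in the template with the
# corresponding argument and prints the resulting URL.
expand_link() {
    url=$1
    shift
    i=0
    for value in "$@"; do
        placeholder="{$i}"
        out=
        rest=$url
        # Literal (non-regex) replacement of all occurrences of the placeholder.
        while :; do
            case $rest in
                *"$placeholder"*)
                    head=${rest%%"$placeholder"*}   # text before the first match
                    rest=${rest#*"$placeholder"}    # text after the first match
                    out=$out$head$value
                    ;;
                *)
                    break
                    ;;
            esac
        done
        url=$out$rest
        i=$((i + 1))
    done
    printf '%s\n' "$url"
}

# Usage (sample values are illustrative only):
#   expand_link 'http://www.expasy.org/cgi-bin/nice2dpage.pl?{0}' P02768
#   expand_link 'http://genome-www5.stanford.edu/cgi-bin/SMD/source/sourceResult?criteria={0}&choice=Gene&option=Symbol&organism={1}' BRCA1 Hs
```

The replacement is done with plain string operations rather than sed, so characters such as '&' and '/' in the URL need no escaping.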
Each time you use the external data
update interface, an additional file, called 'links.txt.uniprot' will
be refreshed and placed in the server /2d/data/tmp
directory. This file will be the primary source for your displayed DR
cross-reference lines for entries as long as a variable called
'$EXTERNAL::update_from_SwissProt'
within your basic_include.pl
configuration file
is activated (not set to 0).
Note: whenever a cross-reference
database is listed neither in
links.txt nor in links.txt.uniprot, the query interface will
try to extract it from a table
called 'XrefDB' within the postgreSQL database.
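The lookup order described in this section (links.txt.uniprot first when the external update is activated, then links.txt, and finally the 'XrefDB' table) can be pictured with the shell sketch below. It is not the interface's actual implementation. The function name lookup_link is hypothetical, and the sketch assumes that the database name and its URL are separated by whitespace and that the name contains no spaces.

```sh
# Hypothetical helper: lookup_link DATABASE_NAME FILE [FILE ...]
# Searches the given link files in order and prints the URL template of
# the first matching database name. Fails if no file lists the database.
lookup_link() {
    name=$1
    shift
    for file in "$@"; do
        [ -f "$file" ] || continue
        url=$(awk -v name="$name" '$1 == name { print $2; exit }' "$file")
        if [ -n "$url" ]; then
            printf '%s\n' "$url"
            return 0
        fi
    done
    return 1   # not found: the interface would then consult the XrefDB table
}

# Usage (paths are illustrative and depend on your installation):
#   lookup_link SWISS-2DPAGE links.txt.uniprot links.txt
```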
1.5
PACKAGE CONTENTS:
Important: This section
lists the files included within this package. Even if you are not
interested in the technical aspects of the tool, it is highly
recommended that you read both the data_test
and data_test/examples
sub-sections, as they give details on the formats
provided as examples for data preparation.
The main
directory
This directory contains the main script to run the tool:
- make2db.pl (the perl main script to run the tool)
- License.txt (copyright and license terms)
- version (current version of the tool)
- last_STDOUT.log (the detailed output log file of the last
operation
launched
by the main script / present only after first installation)
- last_STDERR.log (the error output file of the last operation
launched
by
the main script / present only after first installation)
The lib2d directory
This directory contains the different perl scripts and modules used by
the main script, it also contains a copy of the configuration file
'basic_include.pl' named 'basicIncludeInitial.pl':
- make2db_CFG.pm (a module to set up the different configuration
files)
- make2db_ASCII_TABLES.pm (the data syntax checker and converter to
generate
the intermediate
ASCII tables)
- make2db_POSTGRES_DB.pm (builds the relational database structure
and
loads data
into
it)
- make2db_util.pl (more routines and utilities for both the
installation process and the Web interface external connections)
- DbCrossRefs.pm (a module to manage the
cross-reference
databases' URLs)
- MelanieXMLParser.pm (a module to parse Melanie XML exports)
- miscellaneous.pl (special routines, includes the routine to
generate the Apache mod_rewrite rules and functions)
- CRC64.pm (calculates the CRC64 of a string - courtesy of the
Swiss-Knife
developers)
- convert_file (a small perl script to convert DOS files into
UNIX ones)
The pgsql directory
This directory contains all the postgreSQL commands, rules, triggers
and functions to build the new relational database:
- make2db_tables.pgsql (postgreSQL commands to create the
relational
tables)
- make2db_functions.pgsql (postgreSQL PL/pgSQL scripts to set
up the
database server-side functions)
- make2db_triggers.pgsql (postgreSQL PL/pgSQL scripts to set up
the
database
inner triggers)
- make2db_update_internal.pgsql (postgreSQL PL/pgSQL scripts to
update internal data relative to the external resources)
- make2db_final_views.pgsql (postgreSQL PL/pgSQL scripts to
build the
auxiliary
tables, the static views and the main update functions)
The text directory
This directory contains the text files used when creating and
converting your new database:
- links.txt (a text file listing URL addresses to
query/cross-reference different servers
and databases)
- tisslist_initial.txt (a prepared list of different tissues based
on the Swiss-Prot tissue list
-
courtesy of the
Swiss-Prot
group, Geneva, SIB)
- tisslist_aliases.txt (aliases for the tissue names listed in the
tissue list)
- tisslist.txt (an up-to-date list of tissues extracted from ExPASy
- present only after first external connection)
- DbCrossRefs.txt (an up-to-date list of URLs for cross-references
to use instead of links.txt- present only after first external
connection)
The src directory
This directory contains the source code for independent utilities :
- <InSilicoSpectro> (this is the directory containing the
Make2D-DB II modified version of the Phenyx module 'InSilicoSpectro',
used to read and convert MS data)
- <graphApplet> (source code of the java applet to display
mass spectra)
The config directory
This directory will contain your latest configuration files, which are
generated by executing the main script with the -m config option.
The two last generated files are named (by default):
- basic_include.pl (this is the non-generated basic configuration
file that you may edit manually)
- config.cfg (your last generated personal configuration file for
the database
creation/conversion process)
- 2d_include.pl (your last generated personal configuration file
for the Web
interface behavior and appearance)
- existing.maps (a tab-delimited file listing the gel names, their
dimension and related annotations; such a file is generated
interactively during the configuration process)
- test.dat (a flat file for a test database)
- PLASMA.png (a graphic image of the map called PLASMA,
to be
displayed
within the Web interface)
- PLASMA2.png (a graphic image of the map called PLASMA2,
to be
displayed
within the Web interface)
- small_PLASMA.png (a small graphic image of the PLASMA gel to be
displayed as
a
thumbnail within the Web interface)
- small_PLASMA2.png (a small graphic image of the PLASMA2 gel to be
displayed as
a
thumbnail within the Web interface)
- PLASMA.txt (a tab-delimited text report - excel export -
listing the spots' coordinates on the
PLASMA gel as well as the identified proteins,
it
also includes some other required, optional or free-text annotations -
headers are required in this case)
- PLASMA2.txt (a tab-delimited text report - excel export -
listing the spots' annotations on the
PLASMA2 gel, this report emphasizes the inclusion of MS files and
external documents to be integrated within the database)
- Export.xml (a Melanie XML export for both the PLASMA and
PLASMA2 maps << generated from "Melanie IV / ImageMaster" by: File -> Export -> Gel Data to XML...
>>, the name of the file is irrelevant as it only needs to have
the '.xml' extension to be recognized)
- hiddenEntries.txt (the list of the accession numbers to be hidden
from public access, only privileged users would access those entries)
- hiddenGels.txt (the list of the gels to be hidden from public
access, only privileged users would access those gels)
- hiddenSpot.txt (this list defines which spots within an entry are
to be hidden from public access; it also controls which identification
results, e.g. MS data, are not to be shown)
- make2DDBLogo.gif (a graphic logo to be displayed by the Web
interface)
In this case, data need only be extracted for the two maps of
interest (PLASMA and PLASMA2) from only one of the three alternative
formats present here (either the tab-delimited reports, the flat file
or the Melanie XML export). This is to be defined within the
configuration files.
Two additional directories <gel_doc>
and <ms_files> contain
the documents declared in the existing.maps file (gel annotation
documents) and in the PLASMA2.txt (different MS files in pkl, dta, mgf
and mzData format, as well as a Mascot search report in dat format).
Those directories are not
part of the package: You may define the location of your
documents and mass spectra files to be anywhere.
The data_test/examples directory
As the tool understands different types and formats of data
input, this directory contains a set of examples showing how to
prepare data for the installation process. The format you are likely to
adopt for your data depends on the availability of the data you have
and on the granularity of the annotations you would like to include in
your database.
- PLASMA.excel_example.txt, PLASMA2.excel_example.txt,
PLASMA3.excel_example.txt (tab-delimited text reports - Excel
export - listing the spot coordinates on three different gels; there
are various types of annotations in each of those files - headers are
required)
- PLASMA3.excel_example_noheaders.txt (also a tab-delimited text
report, except that it does not contain special/personal annotations,
it follows a default structure that will be described later - no
headers are required)
- Melanie_XML_example.xml (a Melanie XML export for both the PLASMA
and PLASMA2 maps, generated from "Melanie IV / ImageMaster" by:
File -> Export -> Gel Data to XML...)
- test.dat (a SWISS-2DPAGE-like flat file of the test database)
- PLASMA.report_example.txt and PLASMA2.report_example.txt (two
text reports listing the spot coordinates for the
PLASMA and PLASMA2 gels; they
also include relative OD and volume values. Those files are to be used
when a simple flat file is the source of annotations and no Melanie
export is present - headers are not required)
- existing.maps.detailed.example (a tab-delimited file listing the
gel names, their dimensions and related annotations; such a file is
generated interactively during the configuration process)
- existing.maps.noSpecies.example (like the previous one, except that
it is manually written and contains very few annotations - this is the
old format used with the previous versions of the tool and is still
accepted for historical reasons)
- example.include.cfg and 2d_include.cfg (an example of the two
interactively generated configuration files)
- hiddenEntries.txt, hiddenGels.txt and hiddenSpots.txt (the files
controlling which entries, gels, spots and spot annotations are to be
hidden from public access)
- PLASMA.png, PLASMA2.png and PLASMA3.png (the graphic images
of the three gels)
- small_PLASMA.png, small_PLASMA2.png and small_PLASMA3.png
(thumbnails for the three gels)
- existing.maps (a file listing the gel names and the X,Y
dimensions of their corresponding images)
- PLASMA.mel (a Melanie annotated image of a gel named
'PLASMA' - this is deprecated, as direct extraction of annotations from
Melanie images has been abandoned)
- 2d.cgi (the main Web script to query the database)
- 2d_view_map.cgi (the graphic map viewer)
- inc/2d_util.pl (a collection of routines and functions
called by the previous two scripts)
- inc/2d_util.core.pl (a collection of routines and functions
called by the management cloned version of the main script)
The http_server/divers directory
This directory contains other programs needed by the Web interface:
- GraphApplet.jar (the compiled Java applet to display mass spectra)
The http_server/icons directory
This directory contains all the icons needed by the Web interface:
- 2d_menu* (the clickable images for the graphic menus)
- cross* (a collection of markers used by the map viewer to
show the spots)
- some logo images, etc.
The lib directory
This directory contains some of the public libraries that you may need
to install on your system, or information on how to get and install them
when needed. The contents of this directory do not follow the general
license terms stated in the License.txt file in the main directory.
The readme directory
This is where the readme and HTML files
are located:
- 1.Readme_main.html (the main readme file: introduction
and installation)
- 2.Readme_preparation.html (how to prepare your data to be
converted)
- 3.Readme_configuration.html (description of how to create
and set the configuration files)
- 4.Readme_interface.html (details on how to use the query
interface, how to set up the redirection rules using 'mod_rewrite'
and how to maintain and update data)
- changes.txt (the most recent changes)
- DbCrossRefs.html (manual for the DbCrossRefs.pm
module)
- perldoc.css (stylesheet for DbCrossRefs.html)
- manch2d.html (a recent version of the SWISS-2DPAGE user
manual)
- Copyright_for_external_data.txt (copyright notes concerning data
extracted from the Web)
- hiddenEntries.txt, hiddenGels.txt and hiddenSpots.txt
(self-explanatory files on how to make some data private)
A sub-directory called 'examples' holds some files used by the readme
files.
Another sub-directory, called 'database_schema', includes
documentation files, in 3 formats, describing the structure and
functions of the four database schemas.
This sub-directory is not present in distributions marked as 'light'.
The first set of files, prefixed by 'all', contains the whole database
structure. The other 4 sets each contain the structure of one schema
(common, core, log and public schema):
- *.html: 5 HTML files
- *.dia: 5 dia files (can be graphically viewed by the
'dia' program)
- *.xml: 5 XML files
The temp directory
This directory will be used by the tool to store temporary files.
It will also contain dumps (backups) of your previous databases, as
any already existing database will be 'dumped' by the tool whenever a
new installation replaces it.
You can delete anything from this directory.
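Before deleting, it may help to see which files in the temp directory are the oldest. The following is a minimal Python sketch, not part of the package: the function name `list_by_age` is hypothetical and, since the naming of the dump files is not specified here, the sketch simply lists every regular file.

```python
import os


def list_by_age(directory):
    """Return (name, mtime) pairs for regular files, oldest first."""
    entries = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            entries.append((name, os.path.getmtime(path)))
    # Sort by modification time, then by name to keep the order stable.
    return sorted(entries, key=lambda item: (item[1], item[0]))
```

The sketch only reports files; any deletion is left to the user.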
Note: depending on your use of the tool, some additional files will
be generated or downloaded into your package directories.
1.6 DATABASE SCHEMA:
Relational PostgreSQL database schema in HTML summary:
1.7 TO DO LIST:
- Distribute a patch in the very near future to replace the
current graphApplet MS viewer by a new one running on the server side
- Continue to integrate the standardization recommendations to
be issued from the PSI proposals
(protocols, MS data representation...)
- Extend dynamic links with external 2D data resources based on the
maps' species and tissues
- Propose a related XML schema based on a Gel / Spot approach,
and generate exports from the database in this format
- Extend the interface to interactively update local data
- Include HPLC management
1.8 CONTACT:
For any question, suggestion or comment, please contact us.