DW's notes

These are some notes about the organization of the terravivo website for internal consumption.

general

Starting with Jeff's .txt files, I built some HTML and pseudo-CGIs to see how it might look in a browser. Most of the orientation in Jeff's stuff is encapsulated here (e.g., manager information), but I added a few things (e.g., the "Complete genomes" section), simplified others (e.g., a unified configuration page), and fleshed a lot out. It was an educational experience! The example state is a server which started recently (dates and times are all bogus) and is serving ENZYME (monthly update), PDB and GenBank (daily), SWISS-PROT and TrEMBL (weekly), and updating itself daily. I'll try to explain the putative lessons in the following notes. What do you guys think?

conclusions

This could be really, really neat!
Simplification is critically important.
There's a lot of it, so automation is also critically important.
It will be important to develop a control language (like makefiles?) corresponding to a consistent directory structure for internal files.

the name "terravivo"

We've been using "terravivo" as the working name for a unified pillage and vnfs product. We should decide on the real name soon. "terravivo" is fine with me. Some advantages: it hints at the scale of what we're up to ("earth life"), it's available as a domain (it would be nice to get www.terravivo.com and ftp.terravivo.com for the terravivo server), the initials are easy to remember (root directory /tv), and it's generally distinctive (sort of scientific, not too English-oriented). I personally like "Global Pillage" but it's perhaps a bit flippant. Other ideas?

Here, I've continued to use terravivo and /tv.

appearance

It would be nice to have a distinctive and consistent look to this stuff. To this end, I used "metaphorics paper" BACKGROUND with white BGCOLOR and tables in most places. The background that we previously used (sand.gif) was a bit dark, making it hard to read black text, so I lightened it up and used that (sandy.gif).

sand.gif

sandy.gif

header

All pages have a logical title in the HEAD section and the same text visible as a centered title, followed by a short description of what the page is about. This is good for maintaining orientation and supporting bookmarking.
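
A rough sketch of how this header convention could be enforced if the pages end up being generated rather than hand-written. The render_page function and its parameters are hypothetical, not existing code; the BACKGROUND and BGCOLOR values are the ones mentioned under "appearance", and the trailer is simply passed in since its content isn't settled.

```python
from html import escape


def render_page(title, description, body_html, trailer_html):
    """Build one page with the same title in HEAD and as a centered heading.

    body_html and trailer_html are assumed to be trusted, already-formed HTML;
    title and description are plain text and get escaped here.
    """
    safe_title = escape(title)
    return (
        "<HTML>\n"
        f"<HEAD><TITLE>{safe_title}</TITLE></HEAD>\n"
        '<BODY BACKGROUND="sandy.gif" BGCOLOR="#FFFFFF">\n'
        f"<CENTER><H1>{safe_title}</H1></CENTER>\n"
        f"<P>{escape(description)}</P>\n"
        f"{body_html}\n"
        f"{trailer_html}\n"
        "</BODY>\n"
        "</HTML>\n"
    )
```

Generating the title once and using it in both places means the bookmark text and the visible heading can't drift apart.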

trailer

A "standard Metaphorics trailer" appears at the bottom of each page. I think it's important to have a consistent trailer so people know they can get back to us from all our pages (not just the tv ones). I'm not married to the specifics of this particular trailer, though it seems OK to me.

organization of pages

I started out with the main pages being "status" and "configuration". By the time I had entered all the databases that Jeff picked, and imagined that this number might grow significantly, it became apparent that a more "star-like" top-level organization works better. In turn, this makes it handy to have a site map, but I don't think a big site map on the first page (à la www.daylight.com) is called for. Perhaps a more graphical site map than the example here would seem less clunky and more professional.

server identification

Are we ever going to have more than one Terravivo server per network? Or for that matter, per host? Assuming it's possible, we need to identify the one we're talking about. I've used the IP number throughout (e.g., 207.225.60.9) but a string indicating host:service name would be better (e.g., "origin:terravivo2").
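
To make the host:service idea concrete, here is a minimal sketch. ServerId and parse_server_id are hypothetical names, and I'm assuming the first colon separates the host from the service name.

```python
from typing import NamedTuple


class ServerId(NamedTuple):
    """Hypothetical identifier for one Terravivo server on a network."""
    host: str
    service: str

    def __str__(self) -> str:
        return f"{self.host}:{self.service}"


def parse_server_id(text: str) -> ServerId:
    """Split a 'host:service' string at the first colon."""
    host, sep, service = text.strip().partition(":")
    if not sep or not host or not service:
        raise ValueError(f"expected 'host:service', got {text!r}")
    return ServerId(host, service)
```

With this, parse_server_id("origin:terravivo2") yields host "origin" and service "terravivo2", while a bare IP number such as "207.225.60.9" is rejected because it names no service.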

Terravivo home page

This is the page entitled just "Terravivo". It's Terravivo's home on the local network, not the global home page at Metaphorics' site. Is the use of the phrase "home page" too confusing? If so, what else can we call it?

The home page provides a server synopsis, manager information including a local message (i.e., information about the manager for the benefit of users; is this what you intended, Jeff?), news from us to everyone (probably the 10 most recent messages, with a link to the rest if there are more?), and links to the rest of the site. Only the first link (25-Oct-1998) is live in this example.

It would be nice to keep this first page short and sweet.

Terravivo status page

This is a read-only page describing all possible Terravivo resources. My gut feeling is that it's better to have a comprehensive list which is consistent with the configuration page rather than to separate "current" and "potential" resources. If necessary, the categories could appear on separate pages.

I like the idea of an overall condition synopsis, if we can make it clear what it means to the users, e.g., "healthy" (working as it should), "unreliable" (down more than 50% of the time), "dysfunctional" (not doing anything useful), "misconfigured", "brand new" (not configured), etc. This might be linked to a more long-winded explanation of any problems.
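
One way the synopsis could be derived, as a sketch only. The inputs (configured, config_valid, serving, downtime_fraction) and the order in which the conditions are checked are my assumptions; only the labels and the 50% threshold come from the list above.

```python
def overall_condition(configured, config_valid, serving, downtime_fraction):
    """Pick a single condition label for the server.

    Checks run from most to least fundamental (an assumed precedence):
    an unconfigured server is "brand new" no matter what else is true.
    """
    if not configured:
        return "brand new"
    if not config_valid:
        return "misconfigured"
    if not serving:
        return "dysfunctional"
    if downtime_fraction > 0.5:  # down more than 50% of the time
        return "unreliable"
    return "healthy"
```

Returning exactly one label keeps the synopsis short; the long-winded explanation page can list everything that is wrong.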

The main part of the status page is the table of resources. Each resource name is linked to a page describing the resource and its status. (I created preliminary pages for all the ones here.) Users don't get to change anything; they just get to see the description of the resource and its status, and look at any resource-specific messages (e.g., the hypothetical warning about GenPept).

message from manager

This seems like a nice feature.

resource categories

It seems useful to separate resources into categories. I suggest "Protein databases", "Nucleic acid databases", "Complete genomes" and "Software". These will be soft (like everything else) so we will be able to add/split categories in the future (e.g., add "Small molecule databases", or split "Software" into "Public software" and "Metaphorics software").
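
To show what "soft" could mean in practice, here is a sketch that keeps categories as plain data, so adding or splitting one is a data change rather than a page rewrite. The split_category function is hypothetical, and the software names in the usage note are placeholders.

```python
def split_category(categories, old, parts):
    """Return a new mapping with category `old` replaced by those in `parts`.

    `categories` maps category name -> list of resource names.
    `parts` maps each new category name -> the resources it takes over;
    every resource of `old` must be reassigned exactly once.
    """
    moved = [r for names in parts.values() for r in names]
    if sorted(moved) != sorted(categories[old]):
        raise ValueError("split must reassign each resource exactly once")
    result = {}
    for name, resources in categories.items():
        if name == old:
            # New categories take the old one's position in the listing.
            result.update({new: list(names) for new, names in parts.items()})
        else:
            result[name] = list(resources)
    return result
```

For example, starting from {"Protein databases": ["SWISS-PROT", "TrEMBL"], "Software": ["tool-a", "tool-b"]}, splitting "Software" into {"Public software": ["tool-a"], "Metaphorics software": ["tool-b"]} leaves the protein category untouched and puts the two new categories where "Software" used to be.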

Given the eventual orientation of this stuff, it seems appropriate that we have a "Complete genomes" category. Splitting it out by species is sensible, but this list might get quite long quite soon and merit its own page. Perhaps by then the "Complete genomes" resources will be implemented as Biothor genomic universe databases.

Resource descriptions

These are linked to both user (Terravivo status) and manager (Configure Terravivo) pages. The information shown on these pages seems appropriate, though the content did evolve a bit as I compiled them (abandoning "author:" and "reference:" items in favor of "home:"). A link to the resource-specific log would be nice. Other stuff we might need: copyright/licensing messages, original format, a list of original files and their URLs?

IMO, the resources should be organized logically at a relatively high granularity, i.e., from the user's POV, resources are not broken up into their components (e.g., sub-databases or individual programs).

Does it make sense to have links to the original sources here, or to maintain local copies of documentation, or both?

Configure Terravivo (entry) page

Here's an approach that might work for management security. All routes to the configuration page lead here unless access is from the console (in which case, the manager's password can be changed). This page asks for a password and generates a cookie (timestamped hidden field) for the real "Configure Terravivo" page.
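
A sketch of the timestamped hidden field, covering only the token part (password checking is left out). Signing the timestamp with HMAC is my addition so the field can't be forged; SECRET_KEY, TOKEN_LIFETIME, and both function names are assumptions.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"replace-with-a-per-server-secret"  # assumption: stored on the server
TOKEN_LIFETIME = 15 * 60  # assumption: token is good for 15 minutes (seconds)


def _sign(timestamp: str) -> str:
    return hmac.new(SECRET_KEY, timestamp.encode(), hashlib.sha256).hexdigest()


def issue_token(now=None) -> str:
    """Make the hidden-field value 'timestamp:signature' after a good password."""
    timestamp = str(int(time.time() if now is None else now))
    return f"{timestamp}:{_sign(timestamp)}"


def token_is_valid(token: str, now=None) -> bool:
    """Accept only tokens we signed that have not yet expired."""
    timestamp, sep, signature = token.partition(":")
    if not sep or not timestamp.isdigit():
        return False
    if not hmac.compare_digest(signature.encode(), _sign(timestamp).encode()):
        return False
    current = time.time() if now is None else now
    return 0 <= current - int(timestamp) <= TOKEN_LIFETIME
```

The real "Configure Terravivo" page would call token_is_valid on the hidden field and bounce back to the entry page if it fails.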

For this HTML-files-only example, the "for now, click here" link moves you on.

Configure Terravivo page

This is a CGI FORM which looks as much like the status page as possible, except that optional fields can be changed. The "Submit changes" button on the bottom takes the manager to a confirmation page. There is also a "Reset factory defaults" button which seems OK but might be better split into "Restore" and "Factory defaults".

We need to individualize these choices for each resource via our updating. E.g., recommended update frequencies and servers may vary, and there is no "never" for terravivo updates.

Re update frequency, settings such as "daily" and "weekly" seem better than specific days and times of day. Even if we use the crontab format, having the manager specify general frequencies allows terravivo to pick the appropriate time of day (perhaps adaptively?).
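
A sketch of how general frequencies could map onto the standard five crontab fields (minute, hour, day of month, month, day of week). The function name, the default of 03:00, and the choice of Sunday and the 1st for weekly and monthly runs are all assumptions; terravivo would supply its own hour and minute.

```python
# Templates for the five crontab time fields.
CRON_TEMPLATES = {
    "daily": "{minute} {hour} * * *",
    "weekly": "{minute} {hour} * * 0",   # assumption: run on Sunday
    "monthly": "{minute} {hour} 1 * *",  # assumption: run on the 1st
}


def crontab_schedule(frequency, hour=3, minute=0):
    """Return the crontab time fields for a frequency, or None for "never"."""
    if frequency == "never":
        return None
    if frequency not in CRON_TEMPLATES:
        raise ValueError(f"unknown frequency: {frequency!r}")
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    return CRON_TEMPLATES[frequency].format(minute=minute, hour=hour)
```

Because the manager only picks the frequency, the hour and minute stay free for terravivo to choose, for instance to spread load or avoid a server's busy period.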

One choice for the initial state would have everything set to "never" and require the manager to configure things. Or we could ship live (but slightly obsolete) data and have the initial state reflect that.

The ftp servers you see here are real ones for the specific resources. (Jeff, weren't you going to send me your list?!) I put the list together based on intuition and a look-see, so it's not authoritative. The general idea is that a manager can pick a single server only (in which case only that one is used) or pick primary and secondary servers (in which case, if they both fail, terravivo can use other ones). The ftp service is the CGI choice VALUE, the domain/location is what appears on the form, e.g. "ftp.ndbserver.rutgers.edu" appears as "rutgers.edu". Since this page is autogenerated from our "configurables" data, the values don't really matter as long as the choices are unambiguous.
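
The server-choice rules can be sketched as follows. Both function names are hypothetical, and taking the last two labels of the host name as the form label is only a guess at the rule behind the "rutgers.edu" example; it would need a lookup table for hosts where two labels aren't enough.

```python
def display_label(ftp_host):
    """Form label for a server: the last two labels of its host name."""
    return ".".join(ftp_host.split(".")[-2:])


def servers_to_try(known_hosts, primary, secondary=None):
    """Order in which to try servers for one resource.

    Single choice: only that server is used.
    Primary plus secondary: those two first, then any other known servers.
    """
    if secondary is None:
        return [primary]
    others = [h for h in known_hosts if h not in (primary, secondary)]
    return [primary, secondary] + others
```

For instance, display_label("ftp.ndbserver.rutgers.edu") gives "rutgers.edu", which is what the manager sees, while the full host name stays in the CGI VALUE.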

Confirm Terravivo reconfiguration page

This is a page to allow us to do consistency checks and make the manager confirm what was specified on the previous page. It seems prudent to require that the manager re-enter the password.
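
A sketch of what the confirmation step might compute: the list of changes to show the manager, plus any consistency problems. The nested-dict layout (resource -> field -> value), the "frequency" field, and the "terravivo" resource key are assumptions; the one rule checked is that terravivo's own updates can't be set to "never".

```python
def find_changes(current, proposed):
    """List (resource, field, old_value, new_value) for every changed setting."""
    changes = []
    for resource in sorted(proposed):
        old_settings = current.get(resource, {})
        for field, new_value in sorted(proposed[resource].items()):
            old_value = old_settings.get(field)
            if old_value != new_value:
                changes.append((resource, field, old_value, new_value))
    return changes


def find_problems(proposed):
    """Return human-readable messages for settings that must be rejected."""
    problems = []
    if proposed.get("terravivo", {}).get("frequency") == "never":
        problems.append("terravivo updates cannot be set to 'never'")
    return problems
```

If find_problems returns anything, the page would show the messages and send the manager back; otherwise it lists the output of find_changes and asks for the password again.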

Notice of Terravivo reconfiguration page

This page reports the success (or not) of Terravivo configuration changes. It might offer to move on to the status page and remind the manager to shift-reload to see changes.

Terravivo performance page

Aside from availability and capacity plots, I find it useful to think of Terravivo as a box that acquires information from various data producers and delivers it to data consumers. The "Data gathering" and "Data delivery" tables reflect this POV (numbers are totally bogus but they add up correctly). I think "last week, last month, to date" is adequate granularity in a table, but a line plot might be better.
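
For what it's worth, the three table columns could be computed from a simple transfer log along these lines. The log format (a list of (date, amount) pairs) and the 7-day and 30-day window definitions are assumptions.

```python
from datetime import date, timedelta


def window_totals(transfers, today):
    """Sum transfer amounts for the last week, last month, and to date.

    `transfers` is an iterable of (day, amount) pairs, `day` being a date.
    "Last week" means the 7 days ending today; "last month" the last 30.
    """
    week_start = today - timedelta(days=7)
    month_start = today - timedelta(days=30)
    totals = {"last week": 0, "last month": 0, "to date": 0}
    for day, amount in transfers:
        totals["to date"] += amount
        if day > month_start:
            totals["last month"] += amount
        if day > week_start:
            totals["last week"] += amount
    return totals
```

The same function would serve both the "Data gathering" and "Data delivery" tables, called once per producer or consumer; a line plot would need the per-day amounts instead of these totals.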