-
How does OPUS release 5.0 differ from
the previous release?
-
Several enhancements and bug fixes were made to both the OAPI (the
OPUS Application Programming Interface) and the Java Managers.
These modifications have made the pipeline environment more robust,
more configurable, and more maintainable.
From the user's perspective, several of the modifications were
suggested by people using OPUS around the world, and we hope the
current release addresses the most important of their concerns.
-
What is a PR?
-
A PR is a Problem Report. The problem reporting system covers all components
at the Space Telescope Science Institute and includes enhancements, change
requests, and documentation updates as well as bug reports. Not all of the
50,000+ problem reports have been filed against OPUS!
-
48298:
sys_rename_file_dev.c does not handle wildcards correctly
-
The SHARE routine sys_rename_file_dev could rename a single file,
but when given a wildcarded file specification, the rename of the
second file in the list failed. This problem has been fixed.
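The behavior expected of a wildcarded rename can be sketched as follows.
This is an illustration in Python, not the C source of sys_rename_file_dev;
the function name rename_matching and its arguments are hypothetical. The
point it shows is that the wildcard is expanded into a complete list first,
and every entry in that list is then renamed, not just the first.

```python
import glob
import os
import tempfile


def rename_matching(pattern, new_extension):
    """Rename every file matching `pattern`, replacing its extension.

    Hypothetical helper for illustration only. Returns the new paths
    in sorted order.
    """
    renamed = []
    # Expand the wildcard once, up front, so that the renames themselves
    # cannot disturb the list being walked.
    for old_path in sorted(glob.glob(pattern)):
        root, _ = os.path.splitext(old_path)
        new_path = root + new_extension
        os.rename(old_path, new_path)
        renamed.append(new_path)
    return renamed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        for name in ("a.dat", "b.dat", "c.dat"):
            open(os.path.join(workdir, name), "w").close()
        moved = rename_matching(os.path.join(workdir, "*.dat"), ".done")
        print([os.path.basename(p) for p in moved])
```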
49601:
OSF persistent store problems
-
The problem described in this PR occurred because the bb server got into
an internally inconsistent state, and then didn't handle the situation
very well. The inconsistent state was that it held OSFs in memory
that it could not persistently store to disk because the dataset
names had spaces in them. Such names are fine in memory but not on
disk. The situation spiraled from there.
This type of situation is now prevented by the addition of code to
check the value of any OSF before it is posted to the blackboard.
The check ensures that all the characters in an OSF/PSTAT name are
legal filename characters, for blackboards which are cached to disk.
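A check of this kind might look like the sketch below. It is written in
Python for illustration and is not the OPUS code: the set of characters
treated as legal (LEGAL_CHARS) and the names illegal_characters and
check_before_post are assumptions, since the PR does not list the exact
characters OPUS accepts.

```python
import string

# Assumption: a conservative set of characters that are safe in filenames.
# The actual set accepted by OPUS is not given in the PR.
LEGAL_CHARS = frozenset(string.ascii_letters + string.digits + "_-.")


def illegal_characters(name):
    """Return the distinct characters in `name` outside LEGAL_CHARS, in order."""
    found = []
    for ch in name:
        if ch not in LEGAL_CHARS and ch not in found:
            found.append(ch)
    return found


def check_before_post(name, cached_to_disk=True):
    """Raise ValueError if `name` could not be stored on disk as a filename.

    Blackboards that live only in memory skip the check.
    """
    if not cached_to_disk:
        return
    if not name:
        raise ValueError("empty name")
    bad = illegal_characters(name)
    if bad:
        raise ValueError(f"illegal characters in {name!r}: {bad}")
```

Rejecting the name before it is posted keeps the in-memory state and the
on-disk state from diverging in the first place.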
49783:
Complete the OPUS-Share port to AIX
-
The project to port OPUS-Share to AIX5.1 (PR 47450) was left uncompleted
in the Spring of '03 (see details in PR). At that time, the only available
AIX machine was returned to IBM. Currently, however, OPUS users at NOAA
are providing DSB with access to one of their AIX5.1 servers for the
purpose of continuing the port.
This PR covers the continued development of the port to AIX. The
result will be the ability to run through the Sample Pipeline on AIX.
Other minor changes were also made under this PR, such as making the
server-killing Perl scripts more general and robust, for use on platforms
where Perl is installed in different locations.
Note that file pollers have problems (on AIX) with paths which contain
multiple consecutive slashes (e.g. "/home//mydir"). This is easily
worked around in the user's path file. Note also that with OPUS 5.0 on
AIX, users must choose the default manager installation path ("~/OPUSMgrs")
in order for the ~/OMG and ~/PMG launch scripts to run correctly.
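One way to check a path file entry for this problem before using it on AIX
is to collapse repeated slashes. The sketch below is in Python and the
helper name collapse_slashes is hypothetical; it is not part of OPUS.

```python
import re


def collapse_slashes(path):
    """Replace every run of two or more slashes in `path` with one slash."""
    return re.sub(r"/{2,}", "/", path)


if __name__ == "__main__":
    print(collapse_slashes("/home//mydir"))
```

A regular expression is used here rather than os.path.normpath because
normpath preserves a leading double slash on POSIX systems.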
49788:
PMG and PSTAT-server hang after temporary network disconnect
-
The PSTAT server (as well as its corresponding opus_event_service) had
unrecoverable problems when there was a network disconnect while a remote
PMG was communicating with it. This only seemed to occur during CORBA
communication (e.g. pstat updates), since a quick disconnect that occurred
during a 'still' pipeline did not seem to cause the same problem, and may
have even gone unnoticed.
Analysis of the stack trace of the problem thread (thank you, TotalView!)
shows that during a network disconnection, the code in opus_event_service
that pushes an event to either the OMG or PMG tries to raise a
communication failure exception. For the case of a timeout (errno == ETIME),
TAO 1.2.1 handles this correctly, but for any other type of error, TAO 1.2.1
enters a section of code containing comments which state that the
situation is handled poorly, and that destructors *may* be called in
a dangerous order. In our case, where the disconnect gives errno == ENOENT,
TAO pointers are used after they are freed, and this bad behavior can
result in a core dump from opus_event_service.
The most current stable release of ACE/TAO is ACE5.3.1/TAO1.3.1.
Examination of the same sections of the TAO 1.3.1 source shows that the
disconnection handling code has been entirely revamped. This new
version, now in use on Solaris, fixes the problem described in this PR.
-
49861:
Update OPUS on Linux to use ACE 5.3
-
When this PR was filed, the latest stable release of ACE/TAO was
ACE-5.3.1+TAO-1.3.1, while OPUS on Linux used ACE-5.2.1+TAO-1.2.1. The
x.3.1 version contains many important fixes, such as the ability to
handle network disconnections gracefully.
OPUS on Linux has been upgraded to use the x.3.1 version.
-
49862:
Update OPUS on AIX to use ACE 5.3
-
When this PR was filed, the latest stable release of ACE/TAO was
ACE-5.3.1+TAO-1.3.1, while OPUS on AIX used ACE-5.2.1+TAO-1.2.1. The
x.3.1 version contains many important fixes, such as the ability to
handle network disconnections gracefully.
OPUS on AIX has been upgraded to use the x.3.1 version.
49967:
PSTAT bb server corruption with > 256 OPUS processes
-
With the arrival of OPUS 4.5 in operations, it was discovered that the
number of OPUS processes a single user could run and manage with their
PSTAT blackboard server had a lower upper limit than expected.
Operations found that their PMG would hang, and that the cause was a
corrupt PSTAT bb server. Further analysis of the server log file
showed that the server process's internal limit on file descriptors
had been reached, producing "Too many open files" messages.
OPUS has been corrected to make sure that it does not falsely complain
that it cannot write its persistent store to disk. As a result, the
PSTAT blackboard server process should be able to handle considerably
more than 256 OPUS processes. Unit testing on Solaris reached 400
*inactive* processes controlled by a single 32-bit PSTAT server
without error.
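The per-process file descriptor limit that produces "Too many open files"
can be inspected on Unix systems. The sketch below uses Python's standard
resource module for illustration; the helper names descriptor_limits and
can_hold are hypothetical, and the PR does not state how many descriptors
each OPUS process costs the server, so the figure passed to can_hold is
only an example.

```python
import resource


def descriptor_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def can_hold(descriptors_needed, soft_limit):
    """Return True if `descriptors_needed` fits under `soft_limit`.

    A soft limit of RLIM_INFINITY means the process has no cap.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return descriptors_needed <= soft_limit


if __name__ == "__main__":
    soft, hard = descriptor_limits()
    print("soft limit:", soft, "hard limit:", hard)
    # Example figure only: would 400 descriptors fit under the soft limit?
    print("room for 400 descriptors:", can_hold(400, soft))
```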
-
50185:
OPUS supports a recently outdated J2SSH
-
PR 49367 (OPUS 4.5) brought OPUS the use of SFTP via J2SSH, v0.2.2,
which is no longer available for download. The most current version,
0.2.7, is (of course) incompatible with code built with 0.2.2.
The OPUS Managers are now compatible with J2SSH version 0.2.7, for SFTP.