Project Number: 1007 (RE)
Project Title: Multimedia European Research Conferencing Integration (MERCI)
Deliverable Type: (PU/LI/RP)* PU
Deliverable Number: D13.2
Contractual Date of Delivery: 30 November 1997
Title of Deliverable: Commercial Trials with the MERCI Conferencing Tools
Work-Packages contributing to the Deliverable: 13
Nature of the Deliverable: (PR/RE/SP/TO/OT)** RE
Author: Knut Bahr, GMD
This report describes two field tests that were done in collaboration with employees of the German Telekom (Deutsche Telekom AG) in Germany during the first year of project MERCI.
The trials involved different groups of people within Deutsche Telekom and were completely independent from each other.
The report covers the installation and testing phases. Routine use was not achieved due to organisational delays by the partner and due to some technical difficulties. It was planned to achieve routine use during the second year of the project and there was supporting interest amongst the users, with whom a specific workplan was discussed.
User management was reluctant to commit itself due to major restructuring within Deutsche Telekom, and the subsequent trials did not take place.
Keywords: audio, commercial trials, Mbone, multicast, multimedia conferencing, video
A. Assessment of the suitability of MERCI/Mbone conferencing tools for everyday use
1. Overview of the tool functionality
2. Point to point LAN connections between SGI and SUN platforms
3. LAN conferencing with 3 to 4 participants
4. Point to point connection via ISDN (2 x 64 Kbps)
5. Overall assessment
B. Installation and test of Mbone tools in a multimedia meeting room
1. Audio
2. Video
3. Shared Workspace and Conference Set-up
4. Conclusion
We report on two field tests that have been done in collaboration with employees of the German Telekom (Deutsche Telekom AG) in Germany during the first year of project MERCI. The two cases involve different groups of people within Deutsche Telekom and are completely independent from each other. They are:
A. Assessment of the suitability of MERCI/Mbone conferencing tools for everyday use
Conferencing tools have been installed in the environment of a research and development team which is distributed over two locations. The objective was to make the tools available to them for their everyday use.
B. Installation and test of Mbone conferencing tools in a multimedia meeting room
Conferencing tools have been installed in the environment of a multimedia meeting room for the support of group work. The objective was to make the tools available for demonstration and presentation purposes as well as for practical conferencing use in virtual meetings.
For either situation, we report here on the phases of installation and testing. A phase of routine use has not yet been reached, largely due to organisational delays on the side of our partner and due to some technical difficulties. We plan to get there during the second year of project MERCI, and there is supporting interest among the users involved, with whom a corresponding workplan has already been discussed. The management is reluctant to commit itself, due to a current major restructuring within Deutsche Telekom.
A. Assessment of the suitability of MERCI/Mbone conferencing tools for everyday use
The group whose experience with Mbone conferencing is reported here belongs to Deutsche Telekom. They are part of a research division on audio-visual communication, with some members working in Berlin and others in Darmstadt. Much of their work is basic research on advanced coding and visualisation techniques and as such related to videoconferencing technology; yet they do not regard themselves as experts in desktop conferencing, conferencing software or even Internet applications. The people of the group meet every few weeks to jointly discuss documents or program developments. The sessions were typically called for discussions rather than for jointly and synchronously editing documents (due to the group's organisation of work, there is no need for the latter).
Audio is the most important medium, a shared whiteboard is also important, and video is of secondary interest; because people know each other, they can manage with relatively low-quality, thumbnail-sized or near-still pictures.
At the beginning of this trial period the group had no practical experience in computer-supported collaboration. They had no documentation on the tools, and the online help was insufficient. As a result, there were some initial misunderstandings and tool operating errors.
The following workstation platforms were used:
- SGI Indigo 2; Irix 5.3
- SUN Sparc 20; Solaris 2.5
1. Overview of the tool functionality
Audio-visual conferences were established with the help of several mutually independent components: Vic was used to establish a video connection, vat was used for the audio connection and wb (white board) for textual and graphical communication. For the purpose of co-ordinating these components, and for connection set-up and release, there was a session manager.
Attention during the tests was mainly directed at the quality and user handling of the audio/video tools. Wb, which was mainly used for live sketching, is quite usable. It appeared on a par with comparable tools known from other products, and there is nothing special to report.
Start-up was usually done by starting the tools individually. In addition, mmcc was used for point to point connections, but it was never tried for multicasting. Since mmcc served the purpose well, there was no need to acquire and try sd or sdr.
The fact that communication can take place across different platforms is regarded as a significant gain in quality compared to commercial software (cf. INPERSON by SGI, ShowMe by SUN, ProShare by Intel etc.).
The A/V tools stand out in their flexibility in adapting to the given hardware. Vic, for example, makes it possible to adapt the required bandwidth to the existing network and to adapt the frame rate to the workstation power. This makes it possible to have vic run at low load beside other applications. While the user is not in any way limited in reducing the frame rate down to near-still picture transmission, there are serious upward restrictions, which are caused not only by the transmission medium but also by the performance demands of the software encoders.
2. Point to point LAN connections between SGI and SUN platforms
The quality of voice transmission was sufficiently good. The standard microphones which come with SGI workstations were used. Some experiments were done with the full duplex mode and the "net mutes mike" mode. Full duplex mode best matches the human way of communicating. For this mode, however, headphones are recommended, because otherwise bad echo effects would impair communication. The recommended "net mutes mike" mode is something users have to get used to. It caused problems in controlling the microphone level. Interfering noises (street noise, background talk, computer fan) have a crucial influence. As a result, the participants with the loudest environment blocked the others. The effect may be cleared by adjusting levels, but this would have to be re-done frequently and therefore gets in the way of establishing communication. The need to adjust the mike before talking proved to be a hindrance for occasional or ad-hoc use. Also, having to control oneself continuously is a bit of a nuisance. The tools should reflect everyday behaviour and not force new habits onto people.
For a frame sequence to appear reasonably fluid, one would need at least 4 to 5 frames per second, and even then no more than global gestures could be transmitted. Unfortunately, at higher frame rates the SGI encoder used up so much CPU power that other tasks on the same workstation could no longer execute normally.
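The blocking effect described above can be illustrated with a toy model. The sketch below is not vat's implementation; it only assumes that a station transmits whenever its ambient level exceeds a silence threshold and that incoming audio mutes the local microphone. The site names, noise levels and thresholds are hypothetical.

```python
def transmitting(levels, threshold):
    """Stations whose ambient noise alone exceeds the silence threshold."""
    return {name for name, level in levels.items() if level > threshold}


def mike_open(name, levels, threshold):
    """In this toy 'net mutes mike' model a mike is open only while
    no other station is transmitting."""
    return not (transmitting(levels, threshold) - {name})


# Hypothetical ambient noise levels (arbitrary units).
ambient = {"berlin": 10, "darmstadt": 35}

for threshold in (20, 40):
    state = {name: mike_open(name, ambient, threshold) for name in ambient}
    print(threshold, state)
```

With the lower threshold the noisier site keeps the other one muted; raising the threshold clears the effect, which corresponds to the level adjustment mentioned above.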
Based on the above assumption on the frame rate, the following settings of the SGI Indigo 2 decoder were determined as necessary to achieve that rate. Settings marked with an asterisk (*) are those for scenes with rather limited motion.
Encoder   fps    Kbps      fps      Kbps
h261      6      128       4 - 5    100 *
          6      180       4 - 5    150
nv        6      500       4 - 5    400 *
          6      700       4 - 5    600
nvdct     6      300       4 - 5    200 *
          6      400       4 - 5    300
cellb     6      400       4 - 5    300 *
          6      400       4 - 5    300
In essence, this says that setting the sending station to about 6 - 7 fps yields a throughput of about 4 - 5 fps.
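As a rough aid for reading the table, the data rate and frame rate of an entry can be combined into an average bit budget per frame. The sketch below is only illustrative arithmetic; it assumes that Kbps means 1000 bits per second, and the function name is made up for this example.

```python
def kbit_per_frame(kbps, fps):
    """Average number of kilobits available for one video frame."""
    return kbps / fps


# (fps, Kbps) pairs at 6 fps taken from the table above.
entries = {
    "h261": [(6, 128), (6, 180)],
    "nv": [(6, 500), (6, 700)],
    "nvdct": [(6, 300), (6, 400)],
    "cellb": [(6, 400), (6, 400)],
}

for encoder, pairs in entries.items():
    budgets = [round(kbit_per_frame(kbps, fps), 1) for fps, kbps in pairs]
    print(encoder, budgets)
```

On these figures h261 gets by with the smallest budget per frame and nv needs the largest.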
In case of encoder mode CellB, rate variations for varying degrees of motion in pictures were hardly noticeable. Nevertheless, it gave the poorest overall video impression due to high noise in the pictures.
The LAN based point to point communication was sufficiently supported by vic and vat. However when other tasks were running on the same SGI workstation, e.g. a background compilation, one had to live with severe restrictions due to load problems. It seemed appropriate, as a rule, to stop these tasks before starting a conference session or to work in "small mode". Here, picture quality may be improved significantly by reducing the frame size to QCIF. In most cases however, the simplest thing to do was to completely stop video transmission. Vic video with a high frame rate produced a CPU load of over 90%. In any case, video was the main load generator. Comparable SUN SPARC workstations did not show the effects to such a degree.
For voice communication, it was impossible to detect any quality differences between the individual coder modes. It has to be noted that the coder mode LPC4 did not seem to work.
Despite the various possible settings mentioned above, a default setting of h261 at 128 Kbps and 8 fps was used during those test operations that were supposed to reflect everyday user situations.
Long-term audio tests showed that gaps during the audio transmission were rare.
The tool interfaces were considered unnecessarily complicated. For example, the indication of input/output levels in vat is rather neat for adjusting levels at the beginning of a session, but is not needed during sessions. Also, one window per tool should be enough. The setting and adjusting of all the windows and parameters at the beginning is elaborate, e.g. for vic. In comparison, the Inperson GUI was regarded as clear and well integrated.
3. LAN conferencing with 3 to 4 participants
Tests of this type were spontaneous connections among colleagues. Only a few have been carried out. There is currently no real application scenario for frequent use.
A realistic audio communication requires full duplex mode. However, it appeared appropriate to use headphones because of echo problems. The bandwidth per participant was only 40 - 60 Kbps on average, leading to 3 - 4 fps. This means the video connection was more like a slide show and played a secondary role. There were problems in finding a suitable audio mode. Moreover, there were occasional gaps in voice transmission, such that some participants could understand others only incompletely. This produced frequent requests for repetition of entire sentences. Perhaps the network was not in good shape when the tests were run, so they should probably be repeated some time.
4. Point to point connection via ISDN (2 x 64 Kbps)
While the previous cases involved workstations connected to LANs (with inter- or intra-LAN connections), this case had two SGI workstations, disconnected from the LAN (for security reasons) and interconnected by an exclusive 2 x 64 Kbps ISDN dial-up connection. This was certainly not a configuration for routine use, because one would have to call the partner beforehand and tell him to re-connect his workstation.
This case showed the strength of the programs to keep operating even at low bandwidth on the transmission medium. Audio communication was sufficient, but video communication was constrained to a still image presentation with updates. Due to hardware problems with the ISDN interfaces, these tests could not be carried out to the extent desired, so that the assessment has to be deferred.
5. Overall assessment
Problems in starting vic on SGI workstations were disturbing; one often had to give it a second try. In terms of reliability of operation, the SGI platforms especially left something to be desired. The window system crashed from time to time. SUN platforms appeared more reliable. For this reason, all connections with SGI workstations were set up for test purposes only and not, as originally planned, for supporting ongoing work. This led to a limited number of test applications.
It is considered impractical that some of the parameter settings users have to make (e.g. PORT Indycam) need to be re-done each time the workstation is restarted.
In addition, the manageability of the tools must be regarded critically. For instance, adjustments need to be done in three different windows in order to accept an audio-visual call and get it going. This does not count possible adjustments of the microphone. It would be desirable if the buttons "keep audio" and "talk" were normally and by default "on".
Judged positively are the possibilities to adapt the bandwidth to the available network capacity and the frame rate to the available CPU power. In doing so, one of course has to consider the resulting changes in quality.
B. Installation and test of Mbone tools in a multimedia meeting room
A new presentation building, called FutureLab, has been set up by the Deutsche Telekom (DTAG) in their Technology Centre (TZ) in Darmstadt for the purpose of presenting applications based on ATM network technology and gaining experience with innovative ATM based software technologies. One of the application areas demonstrated in the FutureLab is groupwork in a virtual meeting room. A multimedia meeting room was set up for this purpose. Here, novel techniques for multimedia session support are tested, tried in typical real-life application scenarios and demonstrated to the interested public. The two GMD institutes in Darmstadt, IPSI and TKT, have provided consultation and support to the TZ of DTAG in designing, planning and setting up the multimedia meeting room. GMD contributed the multimedia conferencing infrastructure based on MICE and MERCI results and DOLPHIN, a hypermedia shared application for group communications.
The electronic group working room contains:
- an interactive electronic whiteboard (LiveBoard) with wireless pens to write on,
- desktop shared workspace systems for the individual conferees,
- a multimedia workstation for presentation of prepared documents and other conference material (e.g. PowerPoint slides, multimedia software),
- analogue and digital audio and video technology,
- audio and video conferencing Mbone tools (from project MERCI),
- a hypermedia meeting support system (DOLPHIN),
- software support to call confidential conferences on the Internet (from project MERCI),
- facilities to control multimedia conferences.
The electronic meeting room at TZ is connected with similar meeting rooms at GMD and with single desktop systems at DTAG and GMD via an IP over ATM network (using ATM switches by FORE). Both a unicast and a multicast network infrastructure have been installed. The multicast connection between TZ and GMD is realised by means of two multicast routers running on two workstations. Local multicasts (between workstations connected to an ATM switch inside either company) are handled through the pertinent proprietary protocol by FORE (FORE-IP).
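For readers unfamiliar with IP multicast, the following sketch shows how a receiving application typically joins a multicast group through the standard socket interface. It is a generic illustration, not code from the MERCI tools; the group address and port are placeholders chosen for this example.

```python
import socket

GROUP = "224.2.0.1"   # placeholder multicast group address
PORT = 5004           # placeholder UDP port


def membership_request(group, interface="0.0.0.0"):
    """Build the 8-byte ip_mreq value: group address + local interface."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_receiver(group=GROUP, port=PORT):
    """Create a UDP socket bound to the port and joined to the group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock
```

Whether packets sent to the group actually reach such a receiver depends on multicast routing between the sites, which in the set-up described here was provided by the two multicast routers.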
User acceptance was tested in configurations such as
- a local group just using the in-room media technology,
- a local group connected with individual remote desktop partners,
- several groups distributed over coupled electronic group working rooms and individual desktop partners to form a virtual meeting room scenario.
Among the Mbone tools used were
- vat, rat for audio,
- vic for video,
- wb for shared workspace,
- sdr for conference announcement.
1. Audio
The room as a whole is served by a single audio/video server (SUN workstation). There is a single room speaker and individual microphones at the seats of the participating group members. Speaker and microphones are connected to the server through an audio crossbar switch, a mixer and an amplifier.
Vat was run from the a/v server in conference mode. Transmission speed for audio was about 48 Kbps, most of the time.
The individual conferees in the room do not have control over the vat interface; rather it is handled by an operator. Initially, the vat interface was judged to be difficult to handle in its details - in particular by users who ran it on their individual desktop workstations. Users needed time to get acquainted.
Audio quality was felt to be fair to good from the beginning. With the proper adjustments of the mixer/amplifier and a test installation of a digital echo canceller, audio quality was judged to be very good to excellent.
Besides vat, rat was used occasionally. There was little difference to report. The results regarding the interface were the same as for vat, especially for the options window. The users did not know anything about the provided compression methods.
2. Video
For video transmission from the room, a single operator-controlled panning and zooming camera has been installed. This camera is connected to the a/v server by a video crossbar switch. As it turned out in later sessions, a single camera covering a group sitting around a round table is not an optimum solution: only 75% are covered by the camera, and the image quality is not as good as required. The installed camera will therefore soon be replaced by a better one.
Pictures received from remote sites are sent to a large-screen back projection system in the room, situated face to face to the LiveBoard. It is hooked to the a/v server via a Parallax board. This permits showing either the full server screen or part of it, such as a single video window.
Still image quality was judged to be good.
Nevertheless, video communication was a problem in the initial set-up. A speed of only 5 - 7 frames/sec was considered to be too slow. The bottleneck was a weak workstation (SUN SPARC 10). Therefore the configuration has been upgraded to a SUN Ultrasparc for encoding plus the SPARC 10 for decoding purposes only. This made it possible to send and receive about 25 - 30 frames/sec at CIF-resolution. This rate was considered sufficient by conferees in the room, but in general users want still more (i.e. higher resolution images). Videos received remotely from sessions in the FutureLab's meeting room were considered to be quite adequate.
Experiments were done with different compression schemes (CellB, M-JPEG, H.261) and data rates. The best results were obtained with H.261 at 500 Kbps on average (1 Mbps max) over ATM links.
Users felt that it takes some time for them to get used to vic's interface, just as for vat.
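To put these figures in perspective, the average data rate can be related to the picture size. The sketch below assumes that the average data rate quoted above and the 25 - 30 frames/sec at CIF resolution refer to the same configuration, which the report does not state explicitly, and that the rate is counted in units of 1000 bits per second. CIF is taken as 352 x 288 pixels.

```python
CIF_WIDTH, CIF_HEIGHT = 352, 288  # CIF picture size in pixels


def bits_per_pixel(kbps, fps, width=CIF_WIDTH, height=CIF_HEIGHT):
    """Average number of coded bits per pixel and frame."""
    bits_per_frame = kbps * 1000 / fps
    return bits_per_frame / (width * height)


for fps in (25, 30):
    print(fps, round(bits_per_pixel(500, fps), 3))
```

Under these assumptions the coder has on average roughly 0.16 to 0.2 bits per pixel available.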
Although video is viewed more as a background medium in virtual meeting activities - compared to audio and shared workspaces - it is nonetheless considered essential, as conferees want to be able to see their partners.
3. Shared Workspace and Conference Set-up
Wb was only used in the set-up phase for the exchange of control messages between DTAG and GMD. Later on, wb was replaced by DOLPHIN, a GMD development for Virtual Meeting Support (VMS), which is not Mbone based, but has a client/ server architecture. DOLPHIN supports interactive work on the LiveBoard in the meeting room. The system is based on a common hypermedia data model for private and public workspaces. This model supports the simple construction of emerging structures, the representation of dependencies between documents created in and between meetings, and the transition between workspaces.
Sdr was used for announcing conferences (pre-announced conferences) as well as for issuing invitations to ad-hoc conferences. Starting DOLPHIN from sdr requires an extension to sdr which has not yet been implemented. At the moment DOLPHIN is started manually by each participant.
4. Conclusion
The conferencing technology for a multimedia meeting room, including network base, media technology and software tools, has been set up, tested and tried. The set-up is working, stable, in good shape and ready to be used. At the opening of the FutureLab, DTAG management was impressed; they want similar rooms in other locations as well (e.g. Bonn, Berlin). The room will be used for demonstrations starting at the beginning of 1997. Real users have yet to make real-life use of it.
For the meeting room scenarios, audio is considered the most important part, shared workspace (DOLPHIN in this case) is second. Video is more a background medium, but is nonetheless essential, as conferees want to be able to see each other. Interestingly, though, users want high-quality pictures.
The response of the DTAG personnel to setting up, testing and using the Mbone tools is positive overall. Audio quality was judged to be good from the beginning, and with the proper adjustments very good to excellent. Still image quality was also judged to be good. For moving pictures, the workstations' CPU power was a bottleneck. Some effort was required to achieve 25 - 30 frames/sec, but users would prefer larger and higher resolution pictures at that speed. The user interfaces of vat, rat and vic were judged too complicated and difficult to handle. Much time is required for users to get used to them.