CS519 Final Report

 

Group number and name: G1

Team members: Yu Zhao & Di Yan

Task number and name: Team 8: Multi-party Audio Conference + White Board

 

  1. What did we set out to do?

    There are basically two parts to our project: audio conferencing and the whiteboard. For the audio conference part, we were to design and implement a conference control application that allows participants to join and leave conferences. This application should also allow telephone users to have an audio-only connection to a conference. The whiteboard allows participants to share a graphic workspace. The details of the conference application and the whiteboard application are described below.

     

     

    Our Multiparty Audio Conferencing and Whiteboard (MACW) application should allow telephone and computer users to hold conferences over the Internet. Telephone users would be able to dial in to a conference by dialing a special conference code, followed by the conference number and password. Computer users would be able to join a conference by running our MACW application. Once a user joins a conference, whether by telephone or by computer, he/she can listen to the other people in the group and speak to them, and his/her voice is delivered to the other members of the conference. Our MACW application should allow a speaker to cede control to the other speakers in the conference. This can be accomplished by implementing a token-ring-like protocol, where each speaker in the conference will eventually be able to speak. This approach has some drawbacks, summarized as follows:

    1. How does the application know when a person has finished speaking, so that the token can be passed to the next person?
    2. How does the application know who wants to speak next?
    3. If the application requires users to pass the token manually when they finish speaking, a user could hold the token forever.
    4. This implementation is unfair to the phone users.

    Another approach is that when one user speaks, the other users' voices are dropped. This approach is better than the previous one, but it still has drawbacks; for example, if two people talk over each other in the conference, this approach would be unfair to one of them. Yet another solution would be to average the voices spoken simultaneously by two or more users, but this approach is beyond the scope of our project, and it depends on whether the Data Team is willing (and has time) to add this feature to the system.

     

     

    The whiteboard application gives computer users a visual workspace in addition to the voice conference. This shared workspace allows users to draw, type, and import files so that the display is shared with the other users in the conference. The whiteboard should run on a multicast group so that every computer user in the conference can share the workspace.

     

  2. What did we actually accomplish?

 

 

 

We were able to get all the coding done by the time this report was written. We ran as many tests of the applications as we could, but we were still not able to test them fully. Our MACW application works as follows: when a registered user (one who has paid money to our Billing Team) runs MACW on a desktop computer, it pops up a window asking the user to enter the conference number, user name, and password. If the user wants to set up a new conference, he/she can do so by selecting File/Setup New Conference… Again, he/she must be registered in the directory service before using this application.

When the user wants to leave the conference, he/she can simply click on "Hang up". The list on the right side of the main window shows the names of all the computer users. It also (in theory) shows the telephone users by displaying "Phone User 1", "Phone User 2", etc. However, at the time this report is written, this feature has not been tested because some of the components provided by the core teams that handle phone connections are not complete. Therefore, we have only managed to let computer users join a conference. The user list refreshes itself every 5 seconds; the user can also set this refresh interval as he/she wishes.

 

The whiteboard is very easy to use. Users can start up the whiteboard directly from the DOS prompt by giving it a multicast address and a port.

 

 

Our MACW application is implemented as follows: when the user starts up the application, he/she can either join an existing conference or create a new one. If he/she creates a new conference, we add a conference record to the directory service; if the user joins an existing conference, we add the user's name to that conference record. A refresh thread is started so that every five seconds (by default) the list of users in the conference is updated to reflect the most recent membership; we obtain this information from the directory service as well. When a user finishes and hangs up, his/her name is removed from the directory service, and if the last user is removed from a conference record, the conference record is removed as well.
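As an illustration, this bookkeeping reduces to a handful of directory-service calls. The sketch below is not our actual code: it uses only the DirectoryStubForNT calls listed in section 6, the Digits and Name types belong to the Directory Service Team, and the "lastUser" test stands in for the query we use to read the conference record back (which is not part of that list).

    // Sketch only: Digits, Name and the lastUser test are placeholders.
    class ConferenceBookkeeping {
        private final DirectoryStubForNT dir = new DirectoryStubForNT();

        void join(Digits confNum, Name user, boolean createNew) {
            dir.DeclareIdentity();
            if (createNew) {
                dir.AddConference(confNum);          // File/Setup New Conference...
            }
            dir.AddUserToConference(confNum, user);  // appear in everyone's user list
        }

        void hangUp(Digits confNum, Name user, boolean lastUser) {
            dir.DeleteUserFromConference(confNum, user);
            if (lastUser) {
                dir.DeleteConference(confNum);       // last user out removes the record
            }
        }
    }

The refresh thread simply re-reads the conference record on this interval and repaints the user list.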

When the user joins the conference, a thread is started that connects him/her to the multicast group 224.2.2.2, with the port number equal to the conference number plus 10000. For example, in the previous figure the conference number is 123, and the user joins the multicast group 224.2.2.2/10123. The Data Team provides us with a CommunicationsFacade() class that contains an AudioPipe() and a DatagramPipe() for transferring voice data over the Internet. AudioPipe() captures/plays voice from the microphone/speaker, and DatagramPipe() writes/reads the voice data to/from the multicast group. The CommunicationsFacade() encapsulates all the lower-level implementation. The Data Team has done a good job encapsulating the lower level of the implementation. However, this also limited our ability to access the packets. Therefore, we were not able to do things like mixing voices or dropping packets.
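A rough sketch of how this thread wires the Data Team's classes together is shown below, using only the constructors listed in section 6. Whether the socket handed to DatagramPipe() should already have joined the multicast group, and the order of the two Pipe arguments to CommunicationsFacade(), are our assumptions.

    // Sketch only: the exact wiring expected by CommunicationsFacade is assumed.
    class AudioSetupSketch {
        static CommunicationsFacade joinVoice(int confNum) throws java.io.IOException {
            int port = 10000 + confNum;                  // e.g. conference 123 -> port 10123
            java.net.InetAddress group = java.net.InetAddress.getByName("224.2.2.2");
            java.net.MulticastSocket sock = new java.net.MulticastSocket(port);
            sock.joinGroup(group);                       // MulticastSocket is a DatagramSocket

            AudioPipe audio = new AudioPipe();                       // microphone / speaker end
            DatagramPipe net = new DatagramPipe(sock, group, port);  // multicast group end
            CommunicationsFacade facade = new CommunicationsFacade(audio, net);
            facade.activate();                           // voice starts flowing both ways
            return facade;                               // caller invokes stop() on hang-up
        }
    }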

 

For the whiteboard application, we downloaded Wbd from the link provided on the course homepage and incorporated it into our MACW application. Basically, a user using the whiteboard uses the fixed multicast address 224.4.4.4, with the same port number as the voice.

 

Besides the MACW and whiteboard applications, we also implemented a conference server to take care of the telephone users. Because a telephone cannot join a multicast group, we have to forward the voice to each telephone user manually, and the conference server is implemented to do this. The conference server should run on the same machine as the DesktopSignaling component provided by the Signaling Team, which stores information about which telephones are in use and what conference number each of them has dialed. The Signaling Team's DesktopSignaling also provides us with the InetAddress and port corresponding to the connection to each phone in the conference, from which we can construct a datagram socket and send data to it.

The DesktopSignaling component contains two arrays of Connectors that we are interested in: the incoming[] array stores the records of all the telephone users who have dialed in to a conference, and connections[] stores the records of all the users that have been accepted. In fact, every telephone user who dials in to a conference is accepted, so once our conference server discovers that a user has dialed in, it calls AcceptCall(), provided by the Signaling Team, to move this record from incoming[] to connections[]. The name of this user ("Phone User #") is then added to the conference record stored in the directory service. A thread is started to read the voice packets from this telephone user and forward them to every other telephone user in the same conference, as well as to the corresponding port of the conference's multicast group so that the computer users receive them.

Once the user finishes, he/she simply hangs up the phone, and this is reflected in the connections[] array: the record of this telephone user disappears from it. When our server detects this, it stops the thread that is handling this user and removes his/her name ("Phone User #") from the directory service.
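In outline, the conference server looks something like the sketch below. Only the DesktopSignaling fields and AcceptCall() listed in section 6 are real; how the InetAddress/port of each phone connection and its conference number are read from the Connectors is not shown, so the arguments to forward() are stand-ins for that information.

    // Outline only: everything passed to forward() is a stand-in for the
    // address/port information obtained from the Signaling Team's Connectors.
    class ConferenceServerSketch {
        // Accept every phone user who has dialed in to a conference.
        void pollSignaling(DesktopSignaling sig) {
            Connector[] dialedIn = sig.incomming.connections;   // field spelled "incomming"
            for (int i = 0; i < dialedIn.length; i++) {
                if (dialedIn[i] != null) {
                    sig.AcceptCall(dialedIn[i].name);   // moves the record to sig.connections
                    // add "Phone User #" to the directory service and start a
                    // forwarding thread for this phone (body shown below)
                }
            }
        }

        // Body of one forwarding thread: read voice from one phone and resend it
        // to every other phone in the same conference and to the conference's
        // multicast group so that the computer users hear it too.
        void forward(java.net.DatagramSocket phoneSock,
                     java.net.InetAddress[] otherPhones, int[] otherPorts,
                     java.net.InetAddress group, int confPort) throws java.io.IOException {
            byte[] buf = new byte[1024];
            java.net.DatagramPacket in = new java.net.DatagramPacket(buf, buf.length);
            while (true) {
                phoneSock.receive(in);                  // voice packet from this phone
                for (int i = 0; i < otherPhones.length; i++) {
                    phoneSock.send(new java.net.DatagramPacket(
                            in.getData(), in.getLength(), otherPhones[i], otherPorts[i]));
                }
                phoneSock.send(new java.net.DatagramPacket(
                        in.getData(), in.getLength(), group, confPort));
            }
        }
    }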

 

Just in case the voice does not work, we added another feature to the MACW application: the users can chat by typing. The chat feature uses the multicast address 224.3.3.3 with the same port number as the voice. Like the audio conference, only registered users can chat in the conference.
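The chat path is just a plain multicast socket. A minimal sketch follows; the message framing ("name: text") is our own choice here for illustration, not taken from the actual implementation.

    // Sketch of the typed-chat path on 224.3.3.3 (same port as the voice group).
    class ChatSketch {
        private final java.net.MulticastSocket sock;
        private final java.net.InetAddress group;
        private final int port;

        ChatSketch(int confPort) throws java.io.IOException {
            group = java.net.InetAddress.getByName("224.3.3.3");
            port = confPort;                          // same port number as the voice
            sock = new java.net.MulticastSocket(port);
            sock.joinGroup(group);
        }

        void send(String user, String text) throws java.io.IOException {
            byte[] data = (user + ": " + text).getBytes();
            sock.send(new java.net.DatagramPacket(data, data.length, group, port));
        }

        String receive() throws java.io.IOException {
            byte[] buf = new byte[512];
            java.net.DatagramPacket pkt = new java.net.DatagramPacket(buf, buf.length);
            sock.receive(pkt);                        // blocks until a chat line arrives
            return new String(pkt.getData(), 0, pkt.getLength());
        }
    }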

 

 

Our first approach was a bridge approach (for details see report 1). Our application was going to be a desktop application, but it could also be used as a server/moderator. The first user to join a channel would be the person who distributes the messages, the "bridge", and would be the administrator of that channel. The administrator of a channel would have the privilege to declare whether the channel is public or private and, if the channel is private, who can join it. When the administrator left the channel, he would have to appoint another person in the channel as the new administrator, who would then serve as the bridge. If too many people joined the channel, sub-bridges would be created. A sub-bridge would serve as a message collector/distributor but would have none of the administrator's privileges. We soon discovered that this was not the best idea. First, as more and more users join the conference, the users suffer serious delay introduced at the administrator, and if sub-bridges were created, the delay would be even greater. Secondly, the locations of the sub-bridges depend on the locations of the users; a bridge with the shortest path to each user it serves would be ideal, but this is complicated to implement and a likely place to introduce bugs. To avoid this complexity of implementation and these drawbacks, we found a better solution: multicasting.

 

Our second approach was to use multicast for the conferences. By the time we wrote our second report, we had a better idea of what we would be doing, but we were still not clear on what interfaces would be provided by the core teams. So we made some assumptions (which turned out to be incorrect) about the interfaces that we needed (for details please see report 2, "interfaces need from other teams"). One example was that we assumed the Data Team would provide functions that convert voice to IP packets and vice versa, and that we would multicast the IP packets to each computer user; when a user received IP packets, the application would just call another function provided by the Data Team to convert them back to voice and play them out. It turned out that the Data Team provided us with a set of completely different interfaces from what we had expected, and we had to erase all of our code and start from scratch. The Data Team, instead of letting us handle the IP packets, encapsulated all the lower-level implementation that handles the packets and voice into a class CommunicationsFacade(), so we do not have any access to the IP packets. Later we had a meeting to discuss what the Signaling Team would provide when a telephone user wants to join a conference. After the meeting, we decided to implement a conference server to handle the telephones. This time, we were more careful about the interfaces provided by the Signaling Team. The conference server basically joins every multicast group that corresponds to a conference with phone users involved (as explained in the "Implementations" section).

 

Our third and final approach is what we have now. We first implemented it under UNIX on babbage; then we realized that we have to run our application on a desktop computer where a full-duplex sound card is available. Since our group machine runs Linux on a PC, we made a Linux version of the MACW. But later we were informed that the Data Team's code would run on NT machines only, so we had to modify our code again so that it would work in the NT environment.

 

 

We encountered several problems while trying to run wb/wbd. First, the UNIX version of wb requires that participating users listen on the same port. This means different users must use different machines, so we were unable to start a conference on babbage alone. We then downloaded the Windows version of wb, i.e., wbd. We tried to install it on the NT machines, but the installation software would not run, so we tried to copy an installed version of wbd to the NT machines. It turns out that wbd needs to create a directory called "c:\wbdtmp"; since we do not have permission to write to the c: drive, wbd could not create this directory and would then exit. We asked the administrator to create this directory, but wbd still refused to work because it first tries to remove the "c:\wbdtmp" directory and then recreate it. Finally, we used a hex editor to replace one of the occurrences of "wbdtmp" with "tmp/wb" in wbd.exe, and now it works.

 

  3. Problems:

    We had a lot of trouble with wbd, as described above, but we were able to get it fixed and running. However, we ran into many other serious problems during the past month. First, we only learned yesterday (12/15) that in order to play sound correctly, we must run our program on the Windows NT platform on one of the five CS519 machines, because the code provided by the Data Team only works correctly on those machines. In order to test our code, we need at least two machines. Two of those five machines are running Linux, and the other three are always occupied by people from other groups (they need the machines for testing, too). In addition, we cannot log into those machines unless the Management Team is there, so it is very hard for us to test our code. Also, the signaling part was not complete until two days before the demo, and the interfaces originally provided to us have been changed many times. This also made it harder for us to test our code, and we had to modify our code every time the interfaces changed. In addition, the only times we can test our conference server are when the PBX is reserved for our team and the signaling for conferences works; at the time this report is written, that situation has not yet occurred. Therefore, most of our code dealing with conferencing has not been tested so far.

    Besides these major problems, we had some other problems having to do with Java. We first wrote our code in Developer Studio. It compiled fine on some of the machines but gave errors about the MulticastSocket class on others. After struggling for two days, we finally ported our code to babbage, where it now compiles fine.

    Another problem is that the Data Team did not provide us with the interfaces that we asked for (in report 2); instead, they encapsulated all the low-level implementation, and we only have the high-level Pipe interfaces. This prevented us from accessing the IP packets. Therefore, for the computer users the echo problem is not solved, and if two people try to speak at the same time, we cannot prevent their voice packets from mixing. If we had access to the voice packets, our MACW application would know where each packet came from; if a packet came from the local host, the application could simply drop it, which would prevent echoing. Having access to the packets would also let us detect several packets arriving at about the same time from different sources, in which case we could drop some of them so that only one speaker's sound is played. But none of this can be accomplished with the Data Team's encapsulation. (For the telephone users the situation is better, because we do have access to the packets that we forward to each telephone.)
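    For illustration, the check we would have added if the raw packets were exposed might look like the following. This is hypothetical; the CommunicationsFacade does not give us this hook.

        // Hypothetical filter, not part of our actual code: drop packets whose
        // source is the local host so a speaker never hears his/her own echo.
        class EchoFilterSketch {
            static boolean shouldPlay(java.net.DatagramPacket pkt)
                    throws java.net.UnknownHostException {
                java.net.InetAddress self = java.net.InetAddress.getLocalHost();
                return !pkt.getAddress().equals(self);   // play only other people's voice
            }
        }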

    Finally, when we were trying to integrate the parts provided by other teams into our program, we encountered a major problem. The Data Team used JDirect in order to play sound on the NT platform, so their code requires Jview, the Java virtual machine from Microsoft, in order to run. The Directory Service Team, on the other hand, used Java RMI. It turned out that Jview does not support RMI, while the Java virtual machine from Sun does not support JDirect. This caused a big problem for us, since we could not integrate these two components into our project. We tried to download the most recent versions of Jview from Microsoft and the Java SDK from Sun, but neither of them worked with both RMI and JDirect. The Data Team decided to rewrite part of their code to make it compatible with the component from the Directory Service Team; however, they were not able to complete this in the short amount of time they had. So we had to demo our project in two parts: one with the directory service component only and one with the data component only.

     

  4. What we learnt:

 

We have learned a lot through this project.

 

5. What we would do differently next time:

 

First, we would make sure that we have a thorough understanding of the project before we start working on it. This time, we first planned to use bridges but later found them difficult to implement; after we did more research, we found that multicast is easier to implement, so we decided to use it. Next time we would get a clear idea of what we will be doing before actually writing the code. We made three attempts at implementing the conferencing, but for the first two we did not think about the project carefully enough, so we had to erase everything and restart from scratch. After the meetings with the other teams, we got a much better idea of the conferencing. If we had done the research more thoroughly, we would have used multicast from the beginning and wasted less time. And if we had communicated with the other teams and the course staff more often, we would have had a clearer view of our role in the project.

Second, we would try to communicate more with the other teams in our group. Our group did not have many meetings, and we did not contact the other teams very often. Besides, few of us had a very clear idea of what the whole project would be like until most of the coding was done. By the time we started to communicate with each other, it was already too late to make any major changes in the core teams. This caused some of the interfaces to be incompatible, and we had to modify much of our code to make it compatible with the interfaces provided by the core teams. We learned the importance of communication.

Third, we would request more hardware access in our earlier reports. We now need at least two computers with the NT platform and a full-duplex sound card installed. We did not specifically mention these details in our earlier reports, and we are having trouble accessing machines for testing our code.

Last, we would help out some of the core teams so that they could finish their parts earlier. This time they did not leave us much time for testing and debugging our code.

 

6. Interface that our team will provide to other teams or use:

 

Since we are the application team, we do not provide any interfaces to the other teams. However, we do need to use some interfaces from the core teams, especially the Data Team and the Signaling Team. The interfaces are listed below:

From the Data Team:

AudioPipe.AudioPipe()

DatagramPipe.DatagramPipe(DatagramSocket sock, InetAddress addr, int port);

CommunicationsFacade.CommunicationsFacade(Pipe s, Pipe d);

CommunicationsFacade.activate();

CommunicationsFacade.stop();

 

From the Signaling Team:

DesktopSignaling.DesktopSignaling(String s, String pass);

DesktopSignaling.AcceptCall(String name);

Connections DesktopSignaling.incomming;

Connections DesktopSignaling.connections;

Connector [] Connections.connections;

String Connector.name;

Connector.setLocalPort(int p);

Connector.setRemotePort(int p);

 

From the Directory Service Team:

DirectoryStubForNT.DirectoryStubForNT();

DirectoryStubForNT.DeclareIdentity();

DirectoryStubForNT.AddConference(Digits confNum);

DirectoryStubForNT.AddUserToConference(Digits confNum, Name newUser);

DirectoryStubForNT.DeleteUserFromConference(Digits confNum, Name diedUser);

DirectoryStubForNT.DeleteConference(Digits confNum);

 

7. Advice for the course staff: What mistakes did we make in running this project?

 

We have some thoughts about the project:

 

8. References: