Your task in this project is to extend your networking package with reliable delivery.
Project 2 developed an unreliable networking layer, with ports as endpoints. Data sent from one host to another could be lost in the network. The applications would not know if their data made it across, or whether it got lost. Needless to say, this is not a good programming model for most applications. Your task in this project is to add reliable delivery to your networking stack.
For this assignment, correctly implementing reliable sends is sufficient and you don't need to worry too much about performance. For instance, it is ok to build a system that has only one packet in transit at any given time on any given port. Since you need to implement a synchronous I/O package, the sending thread should be idle until an ack arrives or until a timeout expires. However, it is not ok to have the thread wait for the timeout event to occur if the ack has already arrived.
For reliable delivery in the face of lost packets, you will have to retransmit data packets if you don't get an acknowledgement from the remote side within a given timeout period. Note that if the ACK is lost, the receiver may get multiple copies of the data packet. Consequently, the receiver will have to keep track of which date packets it has seen, and suppress duplicates. Also note that the sender might get duplicate ACKs, especially if some acks get delayed in the network.
Your protocol does not need to be TCP-friendly, but it should behave well in the face of congestion. In particular, you should increase your timeout periods exponentially. Set the initial timeout period to 100 ms., and double it after every retransmission. The calling thread should give up on trying to send a packet if it spends more than 12.7 seconds trying to send it. Once a packet is acknowledged, the timeout period should get reset to the initial timeout value (of 100 ms.) and the whole process should start from scratch.
Do not change the minithreads interface. minimsg_send and minimsg_receive ought to have the same signature as in project 2, but perform delivery reliably. Changing the signature will break all of our standardized testing code and will be penalized accordingly.
You can assume that ports are used one-to-one, i.e. only one port will be sending packets to only one other port. This simplifies your sequence number management. Adding a handshake protocol and ensuring this property is extra credit; we will not test your code with ports A and B sending data to a port C.
You cannot assume that only one thread will use a given port. Multiple threads may try to perform sends or receives on the same port. The semantics are that only one thread performs a given send or receive at any time, while the others wait (i.e. entry to minimsg routines should be protected by a mutex where necessary). This means that the packets sent by multiple threads may be interleaved in any arbitrary order. Similarly, multiple threads performing receives may see the data interleaved in any arbitrary order. The end-to-end argument states that any application that uses ports in a multithreaded style should be smart enough to do the demultiplexing on its own (e.g. such an application may have redundant data in the packets, or all packets may have length descriptors, or they may contain complementary information that goes into a shared data structure regardless of which handler thread receives it). Enforcing some data synchronization, or implementing a technique to recreate packet boundaries on the receving side, would reduce the performance of the common case. The semantics dictated by the E2E argument are easier to implement as well.
Your code should not have a maximum packet size limitation. If a thread performs a large minimsg_send, your minimsg layer should chop it into smaller (around 8K, but in any event less than 64K) blocks.
If a thread performs a partial receive, e.g. the packet has 2000 bytes but the receiver thread reads only 500 bytes out of it, the rest of the data in the packet must not be discarded.
minimsg_send should return the number of bytes successfully delivered to the remote side. If someone attempts a 20M send and gets only 800K across when a packet times out, send ought to return 800K.
Note that project 2 was message-oriented, and project 3 will be stream-oriented. A stream abstraction simplifies your implementation in many ways. For instance, if you chop a large packet into multiple smaller pieces, you don't need to wait for all of the pieces to arrive on the receiving side before you wake up the receiving thread.
A common error is to wait for a timeout before waking up a sending thread, even though an ack may have arrived. Make sure you build your network layer so that the sending thread is awakened immediately if the ack arrives, and that it is awakened eventually by the timeout mechanism if the ack does not arrive.
Another common error is to mismanage the timeout periods. Make sure that the initial timeout period is 100 ms. for the first try for any given packet, and that it doubles with every subsequent retry.
It's crucial that systems code be correct and robust. You must test your code with reasonable and unreasonable test cases, and ensure that it behaves correctly.
You should test your code by randomly dropping packets and seeing if your networking stack continues to deliver messages. If you randomly drop packets with 10% probability (i.e. pick a random number between 0 and 9, drop if you got 1), you would not expect to see any delivery failures unless you send around 10^8 packets (why?). So any delivery failures when you have low packet loss indicate a problem.
Follow the following steps to submit.
Add a message window of fixed size (WS). Measure the bandwidth your code achieves with WS set to 1 and WS set to 4.
Dynamically adjust the window size. Increase it by one if you are getting all the packets acked, and divide it by half when a packet is not acked.