P2P using TCP & Ruby

31 Aug

This is going to be a fairly technical post about the technologies behind Machsend and using P2P over TCP rather than UDP. I’ve written a lot of context to the problem, so if you want to skip to the juicy protocol details, look for the set of bullet points.

We recently launched Machsend, which enables people to send unlimited amounts of data from inside the browser. You literally drag a file onto the site, send the recipient a link, and they can download it straight away. Machsend uses TCP P2P connections to transfer data between clients.

Most traffic on the internet is carried over TCP, which provides reliable, ordered delivery of a stream of bytes between two endpoints. The alternative is UDP, which is much lower level than TCP: information isn’t guaranteed to reach its destination, or to arrive in order. Most P2P implementations use UDP to communicate (for reasons we’ll come to later), but I’m going to show you how to get the flow control and reliability of TCP in P2P connections.

Routers (combined with firewalls) are both critical to the stability of the internet, and an enemy to P2P connections. Networks can have lots of computers behind a single IP using a router, and the router makes sure the right packets get to the right computers, AKA NAT. However, routers weren’t designed with P2P in mind, so most require a bit of coaxing/trickery to set up P2P connections.

With UDP it’s fairly straightforward, which is why it’s so commonly used in P2P software.
Client A sends a packet to Client B, opening a hole in Client A’s firewall. That packet will be discarded by Client B’s firewall, but that doesn’t matter. Client B then does the reverse, sending a packet back to Client A. Now there’s a hole in both firewalls and a P2P connection. I’m glossing over some of the more technical details, such as STUN and port prediction, but that’s the general gist. Here’s a good article on how Skype uses UDP P2P techniques to traverse firewalls.
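
As a rough illustration of that exchange, here is a minimal sketch in Ruby. The `udp_punch` helper, its parameters and the "punch" payload are assumptions made for this example, and it presumes both sides already know each other’s public IP and port (for instance from a STUN-style server):

```ruby
require "socket"

# Hypothetical helper: send packets to the peer (opening a hole in our own
# firewall) and wait for the peer's packet to come back through that hole.
def udp_punch(sock, peer_host, peer_port, attempts: 5, timeout: 0.5)
  attempts.times do
    # The outgoing packet creates the mapping in our NAT/firewall.
    sock.send("punch", 0, peer_host, peer_port)
    # Wait briefly for a packet from the other side.
    if IO.select([sock], nil, nil, timeout)
      data, addr = sock.recvfrom(1024)
      # addr is ["AF_INET", port, hostname, ip]
      return [data, addr[3], addr[1]]
    end
  end
  nil # no packet arrived; the hole was probably not opened
end
```

Both clients would run the same call against each other at roughly the same time; the first packet in each direction may be dropped, which is why the sketch retries a few times.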

UDP is perfect for Skype, for example, since you don’t care that much about reliability when delivering audio & video – but rather responsiveness. The data is very time dependent and needs to get to its destination quickly. However, with most other data transfers we care that all the data reaches the destination properly. If you’re transferring a picture to someone, you want to be sure all of it reaches the recipient.

I have actually created a UDP protocol for data transfer that’s as reliable as TCP, yet is also faster. That, however, is the subject of another post.

While routers don’t meddle with UDP connections too much, that’s not the case for TCP ones. Routers will enforce TCP standards and prevent you from accepting random incoming TCP connections. On top of that, most operating systems require root access to create raw sockets, which you need to manipulate TCP streams at the packet level.

So, it’s a bit of a pickle – on one hand you’ve got proper P2P connections with UDP, but you have to implement your own flow control and reliability algorithms. On the other hand, TCP P2P connections seem very tricky to set up and aren’t supported on many routers.

Most people presented with this problem will either use UPnP, a protocol that lets programs configure routers, or tell people to configure their routers manually.
Obviously this is far from ideal. Both approaches have major caveats, but unfortunately they’re going to be the status quo for a while.

I was in the aforementioned situation, until I found a paper titled Characterization and Measurement of TCP Traversal through NATs and Firewalls. The students tested about 7 methods of TCP P2P traversals, and advocated a fairly simple one that works with about 80% of the firewalls they tested.

To save you reading the paper, I’ll elaborate. TCP starts with a three-way handshake of SYN and ACK packets. Although it’s an unlikely combination, most routers will let an incoming SYN packet through if an outgoing SYN packet has already been sent to exactly the same IP and port combination. So, to sum up:

  • Client A sends a SYN packet to Client B
  • Client A starts listening on exactly the same port the previous SYN packet was sent on
  • Client B then creates a normal TCP connection to Client A, connecting to exactly the local port Client A sent its SYN from, and binding locally to exactly the port that Client A sent the SYN packet to.

And that’s all there is to TCP P2P – those three steps.

To send a SYN packet without using raw sockets you just open a socket to an address, and then immediately close it.

To open a socket on a particular local port, since it’s usually chosen for you, you just need to set some options. If you enable SO_REUSEADDR and SO_REUSEPORT the operating system shouldn’t complain.

The documentation for Ruby’s Socket class is fairly sparse, so here’s an abstraction over it:
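
What follows is a minimal sketch of what such an abstraction could look like, not necessarily Machsend’s actual code. The `P2PSocket` module and its method names are assumptions made for illustration; the socket calls themselves are standard Ruby `Socket` methods. `SO_REUSEPORT` isn’t defined on every platform, so it is guarded:

```ruby
require "socket"

# Hypothetical wrapper; the names are illustrative only.
module P2PSocket
  module_function

  # TCP socket bound to a chosen local port, with address/port reuse enabled
  # so the same port can be used for the SYN, the listener and the connect.
  def bound_socket(local_port, local_host = "0.0.0.0")
    sock = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
    sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_REUSEADDR, true)
    if defined?(Socket::SO_REUSEPORT)
      sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_REUSEPORT, true)
    end
    sock.bind(Socket.pack_sockaddr_in(local_port, local_host))
    sock
  end

  # Send a SYN without raw sockets: start a non-blocking connect from
  # local_port, then close the socket straight away.
  # Returns the local port the attempt was made from.
  def send_syn(local_port, peer_host, peer_port)
    sock = bound_socket(local_port)
    begin
      sock.connect_nonblock(Socket.pack_sockaddr_in(peer_port, peer_host))
    rescue SystemCallError
      # EINPROGRESS, ECONNREFUSED and friends are expected here;
      # we only care that the connection attempt was made.
    end
    sock.local_address.ip_port
  ensure
    sock.close if sock
  end
end
```

For example, `P2PSocket.send_syn(5000, peer_ip, peer_port)` would fire a SYN from local port 5000, where `peer_ip` and `peer_port` are the peer’s public endpoint.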

To accept connections from Client B, Client A sends a SYN packet, and then listens:
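
A hedged sketch of Client A’s side, written to stand on its own. The helper names `reusable_socket` and `accept_from_peer` and the timeout are assumptions for this example, and the peer’s public IP and port are presumed to be known already:

```ruby
require "socket"

# Hypothetical helper: TCP socket bound to local_port with reuse options set.
def reusable_socket(local_port)
  sock = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
  sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_REUSEADDR, true)
  if defined?(Socket::SO_REUSEPORT)
    sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_REUSEPORT, true)
  end
  sock.bind(Socket.pack_sockaddr_in(local_port, "0.0.0.0"))
  sock
end

# Client A: send a SYN to the peer, then listen on the same local port and
# return the accepted connection (or nil if nobody connects in time).
def accept_from_peer(local_port, peer_host, peer_port, timeout: 10)
  # Step 1: the SYN, via a non-blocking connect that is closed immediately.
  syn = reusable_socket(local_port)
  begin
    syn.connect_nonblock(Socket.pack_sockaddr_in(peer_port, peer_host))
  rescue SystemCallError
    # Expected: the connect is either in progress or refused.
  end
  syn.close

  # Step 2: listen on exactly the port the SYN was sent from.
  server = reusable_socket(local_port)
  server.listen(1)
  return nil unless IO.select([server], nil, nil, timeout)
  conn, _peer_addr = server.accept
  conn
ensure
  server.close if server
end
```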

Client B’s connection to Client A looks like this:
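
A hedged sketch of Client B’s side. `connect_to_peer` is a hypothetical name; `local_port` is the port Client A sent its SYN to, and `peer_host`/`peer_port` are Client A’s public address and the port it is listening on:

```ruby
require "socket"

# Client B: an ordinary TCP connect, except that the socket is first bound
# to a specific local port. Returns the connected socket, or nil on failure.
def connect_to_peer(local_port, peer_host, peer_port, timeout: 10)
  sock = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
  sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_REUSEADDR, true)
  if defined?(Socket::SO_REUSEPORT)
    sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_REUSEPORT, true)
  end
  sock.bind(Socket.pack_sockaddr_in(local_port, "0.0.0.0"))

  remote = Socket.pack_sockaddr_in(peer_port, peer_host)
  begin
    sock.connect_nonblock(remote)
  rescue IO::WaitWritable
    # Connect in progress: wait until it finishes or the timeout expires.
    unless IO.select(nil, [sock], nil, timeout)
      sock.close
      return nil
    end
    begin
      sock.connect_nonblock(remote) # surfaces the outcome of the attempt
    rescue Errno::EISCONN
      # Already connected, which is the success case.
    rescue SystemCallError
      sock.close
      return nil
    end
  rescue SystemCallError
    sock.close
    return nil
  end
  sock
end
```

The non-blocking connect is only there to get a timeout; a plain blocking `connect` would do if waiting for the operating system’s default timeout is acceptable.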

If the connection fails, try reversing the roles of Client A and Client B, so that Client B sends the SYN and listens while Client A makes the connection – this usually works.

So, you might be asking yourself – how does Client A know Client B’s IP & port, and vice versa? Who’s controlling it? Well, for that you need a third-party server known to both endpoints – usually called a STUN server or, with TCP traversal, STUNT. For Machsend, we built a JSON RPC server using EventMachine.
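
To make the third-party server’s role concrete, here is a minimal sketch. It is not Machsend’s EventMachine server and it does not speak the real STUN protocol: it swaps in a plain `TCPServer` from the standard library that tells each client, as one line of JSON, which IP and port its connection appears to come from. The function names and the JSON keys are assumptions for this example:

```ruby
require "socket"
require "json"

# Server side: accept one client and report the endpoint we observe for it
# (behind NAT, this is the router's public IP and mapped port).
def serve_observed_endpoint(server)
  client = server.accept
  _family, port, _host, ip = client.peeraddr(false) # false: no reverse DNS lookup
  client.puts(JSON.generate("ip" => ip, "port" => port))
ensure
  client.close if client
end

# Client side: ask the server which endpoint it sees for us.
def observed_endpoint(server_host, server_port)
  TCPSocket.open(server_host, server_port) do |sock|
    JSON.parse(sock.gets)
  end
end
```

A real server would also pass each client’s observed endpoint on to the other client. The client would make this request from the same local port it plans to use for the P2P connection, since that is the port the observed mapping applies to.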

At the moment we haven’t implemented any port prediction, since I’ve noticed most routers keep the same ports that the computers choose. However, it could certainly be implemented to improve reliability.

As well as the previously linked paper, there’s a good article on the subject here.

Filed under: Machsend

14 Comments

  1. Charles L
    1:20 am on September 10th, 2009

    This seems cool, but it didn’t work for me – rubyInterpretor.evalError trying to connect. Does this work behind http proxies?

  2. Alex
    10:35 am on September 10th, 2009

    Charles L:
    Not sure about the proxy – which OS are you on? Can you look at the log file (details under “Logging from your Service” – http://browserplus.yahoo.com/developer/service/ruby/)

  3. roger
    4:00 pm on September 11th, 2009

    Error: rubyInterpretor.evalError: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. – connect(2)

    probably proxy-y [that's when using mach]

  4. roger
    4:06 pm on September 11th, 2009

    you should create something that keeps lists of the files around…and requires users to have passwords, etc. kind of like allpeers used to do :)
    -r

  5. Alex
    7:59 pm on September 11th, 2009

    roger: Yes, that would be the proxy. I wonder if there’s a cross platform way to find the proxy details from the system preferences.
    Agree with you about the password etc, there’s a lot of scope for improvement.

  6. AJ ONeal
    9:02 pm on January 15th, 2010

    Have you released your STUN server code anywhere?

    I’m writing my own because I can’t find one.
