There’s a high chance you’ve seen “SOCKS proxies” as a default option in browsers and specific applications when going through your proxy settings. Yet the default is not necessarily the best option. In this article, we will explain what SOCKS5 is and why SOCKS5 proxies might not be the best choice from a security perspective.
Before going in depth about SOCKS5, it helps to go over the basics.
To understand where SOCKS5 proxies come from, let’s start from the beginning. Most traffic on the internet relies on three protocols that sit on top of the Internet Protocol (IP):
Internet Control Message Protocol (ICMP)
Transmission Control Protocol (TCP)
User Datagram Protocol (UDP)
ICMP is a control protocol. This means that it was designed not to carry application data, but rather information about the status of the network itself. The best-known example of ICMP in practice is the “ping” utility.
The protocols that are important in our case are TCP and UDP.
Both TCP and UDP are transport protocols meant to carry data. The difference is that TCP is connection-oriented: it retransmits lost segments, checks for errors, and delivers data to the application in the order it was sent.
UDP, by contrast, is a connectionless protocol. Datagrams can arrive out of order or not arrive at all. This kind of traffic is typically used in real-time communication, where delivery speed matters more than receiving every piece of data.
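The difference between the two transports can be seen with a minimal sketch that uses Python's standard `socket` module on the loopback interface. The helper names `tcp_roundtrip` and `udp_roundtrip` are made up for this illustration. On loopback the UDP datagram will normally arrive, but nothing in the protocol guarantees that on a real network.

```python
import socket

LOOPBACK = ("127.0.0.1", 0)  # port 0 asks the OS for any free port


def tcp_roundtrip(payload: bytes) -> bytes:
    """Send payload over a loopback TCP connection and return what was received."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.settimeout(2.0)
        server.bind(LOOPBACK)
        server.listen(1)
        # TCP needs an established connection before any data can be sent.
        with socket.create_connection(server.getsockname(), timeout=2.0) as client:
            conn, _ = server.accept()
            with conn:
                conn.settimeout(2.0)
                client.sendall(payload)
                client.shutdown(socket.SHUT_WR)  # signal the end of the stream
                received = b""
                while chunk := conn.recv(4096):
                    received += chunk
                return received


def udp_roundtrip(payload: bytes) -> bytes:
    """Send payload as a single UDP datagram on loopback and return it."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.settimeout(2.0)  # a lost datagram would otherwise block forever
        receiver.bind(LOOPBACK)
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            # No connection setup: the datagram is simply addressed and sent.
            sender.sendto(payload, receiver.getsockname())
        data, _ = receiver.recvfrom(4096)
        return data


if __name__ == "__main__":
    print(tcp_roundtrip(b"hello over TCP"))
    print(udp_roundtrip(b"hello over UDP"))
```

Note the timeout on the UDP receiver: because delivery is not guaranteed, code that waits for a datagram has to decide for itself how long to wait.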
SOCKS is an internet protocol that allows one device to send data to another via a third device. This third device is called a SOCKS server or a SOCKS proxy.
So what does the SOCKS proxy server do? On behalf of a client, typically one that sits behind a firewall, it opens a connection to the destination server and relays network packets between the client and that server.
SOCKS proxies are usually needed where a direct connection is blocked by a firewall, or where traffic that an HTTP(S) proxy cannot carry, such as UDP, has to be relayed. Sadly, in some cases such connections are used for illegal purposes, such as sharing copyrighted material over torrents. However, using SOCKS5 proxies is not in itself illegal, as they are just a tool that provides a specific way to connect to the internet.
SOCKS5 is the latest version of the SOCKS protocol. Compared with older versions, it adds authentication methods, which improves security, and support for UDP traffic.
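To make the protocol a little more concrete, here is a minimal sketch of the first two messages a SOCKS5 client sends, following the byte layout in RFC 1928. The function names `build_greeting` and `build_request` are hypothetical helpers for this illustration, and the code only builds the bytes. It does not open a connection or parse the server's replies.

```python
import ipaddress
import struct

SOCKS_VERSION = 0x05
METHOD_NO_AUTH = 0x00
METHOD_USERNAME_PASSWORD = 0x02
CMD_CONNECT = 0x01        # relay a TCP connection
CMD_UDP_ASSOCIATE = 0x03  # set up a UDP relay (added in SOCKS5)
ATYP_IPV4 = 0x01
ATYP_DOMAIN = 0x03


def build_greeting(methods=(METHOD_NO_AUTH,)) -> bytes:
    """Client greeting: version, number of auth methods, then the methods."""
    return bytes([SOCKS_VERSION, len(methods), *methods])


def build_request(host: str, port: int, command: int = CMD_CONNECT) -> bytes:
    """Client request: version, command, reserved byte, address, port."""
    try:
        # A literal IPv4 address is sent as 4 raw bytes.
        address = bytes([ATYP_IPV4]) + ipaddress.IPv4Address(host).packed
    except ipaddress.AddressValueError:
        # Anything else is sent as a length-prefixed domain name,
        # which the proxy resolves on the client's behalf.
        name = host.encode("idna")
        address = bytes([ATYP_DOMAIN, len(name)]) + name
    # The port is a 16-bit unsigned integer in network byte order.
    return bytes([SOCKS_VERSION, command, 0x00]) + address + struct.pack("!H", port)


if __name__ == "__main__":
    print(build_greeting((METHOD_NO_AUTH, METHOD_USERNAME_PASSWORD)).hex())
    print(build_request("example.com", 443).hex())
```

A real client would send the greeting, read the proxy's chosen authentication method, and only then send the request. Passing `CMD_UDP_ASSOCIATE` as the command asks the proxy for a UDP relay instead of a TCP connection.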
There are many possible ways to use SOCKS proxies. Tech enthusiasts often find new and innovative ways to use these proxies, but most of these cases are very niche and rarely used by businesses.
SOCKS5 proxies are often used for live calls or streaming. Real-time services commonly use the User Datagram Protocol (UDP) to send data, and for now SOCKS5 proxies are the main type of proxy that can relay a UDP session.
To put it simply, if you think that HTTP(S) traffic won’t be enough for you, and you need a proxy for non-TCP protocols, then SOCKS5 proxies are the way to go. Take note that in almost all cases, HTTP(S) traffic won’t be blocked by a firewall, and you might not need a SOCKS5 proxy.
This is the short version of what SOCKS5 is and how it is used.
The short answer: not really.
The long answer: it depends on what kind of data you need to scrape. An HTTP(S) proxy is more than enough for most scraping jobs, unless you need to do something more traffic-intensive (like video streaming).
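For a typical scraping job, routing requests through an HTTP(S) proxy can be sketched with Python's standard `urllib.request` module. The helper name `build_proxy_opener` and the proxy address `proxy.example.com:8080` are placeholders for this illustration, not a real endpoint.

```python
import urllib.request


def build_proxy_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Return an opener that sends HTTP and HTTPS requests via proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


if __name__ == "__main__":
    # Placeholder proxy address: replace with a proxy you are allowed to use.
    opener = build_proxy_opener("http://proxy.example.com:8080")
    # Uncomment to fetch a page through the proxy:
    # with opener.open("https://example.com/", timeout=10) as response:
    #     print(response.status)
```

For HTTPS targets, the proxy only relays the encrypted connection, so the content of the request stays protected by TLS between the scraper and the target site.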
The main problem with SOCKS5 proxies is that the protocol has no standard tunnel encryption. With username/password authentication, the SOCKS5 authentication request carries your password in cleartext, so it is not recommended for situations where “sniffing” is likely to occur. Likewise, any data that is not already encrypted by the application can be read by anyone who intercepts the traffic to the proxy.
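The cleartext problem is easy to see in the message itself. The sketch below builds a username/password authentication request following the layout in RFC 1929. The function name `build_userpass_auth` and the credentials are invented for this illustration.

```python
def build_userpass_auth(username: str, password: str) -> bytes:
    """Build an RFC 1929 request: version, name length, name, password length, password."""
    user = username.encode("utf-8")
    secret = password.encode("utf-8")
    # Each field is prefixed by a single length byte, so 1 to 255 bytes are allowed.
    if not (1 <= len(user) <= 255 and 1 <= len(secret) <= 255):
        raise ValueError("username and password must each be 1 to 255 bytes")
    return bytes([0x01, len(user)]) + user + bytes([len(secret)]) + secret


if __name__ == "__main__":
    message = build_userpass_auth("alice", "s3cret")
    # The password appears unmodified in the bytes that go over the wire.
    print(message)
    print(b"s3cret" in message)
```

Nothing in this message is hashed or encrypted, so anyone who can capture the packets between the client and the proxy can read the credentials directly.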
Armed with the knowledge of what a SOCKS5 proxy is, deciding whether to use one or an HTTP(S) proxy should be easier. To make the decision even smoother, we will point out the benefits of both HTTP(S) proxies and SOCKS proxies.
You’ll be able to manage more requests/second with HTTP(S).
Most scraping jobs can be handled on an HTTP(S) connection.
You’ll have more advanced security and encryption while scraping.
Useful when traffic-intensive scraping is required.