Learning HTTP is that easy!

Why do we want to learn HTTP?

Most web applications are built on HTTP (Hypertext Transfer Protocol), which is the protocol we use to transmit data on the web.

In short, HTTP defines the communication format between a client (usually a web browser) and a server.

HTTP was created so that documents could link to one another and eventually form "hypertext", making information easier to share. HTTP is the foundation of web communication, so it is something we must learn and understand thoroughly.

The basic concept of HTTP

When we learn about computer networking, we usually divide the network stack into five (or more) layers.

One might ask: why do we need to split the network into different layers?
Well, communication between two computers is very complicated, and layering breaks one difficult problem into several simpler ones. Also, once the layers are separated, we only need to care about the layers that are relevant to us, not the details inside every other layer.

HTTP sits on the topmost layer, also referred to as the application layer, which is the layer we developers care about the most.

The basic HTTP communication process

We know that HTTP lives in the application layer. Naturally, the web communication process also involves protocols other than HTTP.

When we visit a webpage, we usually type its domain name, e.g. www.domain.com, into the browser's URL bar. When we hit the Enter key, DNS (Domain Name System) translates the domain name into an IP address, and the browser then communicates with the server at that address.

When we load the website or perform actions on it, we simply make an HTTP request to the server, and the server responds with the requested content.

(Figure: sample HTTP request)
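To make the request and response cycle concrete, here is a minimal sketch that never touches the network. The names `PAGES`, `build_request` and `server_respond` are made up for illustration, and a real server does far more than this.

```python
# Hypothetical in-memory "site": path -> page content.
PAGES = {"/hello.html": "<h1>Hello</h1>"}


def build_request(path, host):
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "",  # blank line that ends the header section
        "",
    ]
    return "\r\n".join(lines)


def server_respond(request_text):
    """Pretend to be the server: read the request line, return a response."""
    request_line = request_text.split("\r\n", 1)[0]
    method, path, _version = request_line.split(" ")
    if method == "GET" and path in PAGES:
        status, body = "200 OK", PAGES[path]
    else:
        status, body = "404 Not Found", ""
    return f"HTTP/1.1 {status}\r\nContent-Length: {len(body)}\r\n\r\n{body}"


request = build_request("/hello.html", "www.domain.com")
response = server_respond(request)
```

The client only has to describe what it wants in text, and the server answers in text with a status line, headers and a body.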

And that's HTTP. Next, we have TCP, which splits the HTTP data into segments and makes sure they are delivered reliably. TCP uses a three-way handshake to establish a connection: the client sends a SYN segment to the server, the server responds with a SYN-ACK segment, and once the client receives it, it sends an ACK segment back to the server. After that, every piece of data sent is acknowledged by the receiver and retransmitted if it gets lost. That way, the data transmission is reliable.

Next up is IP. IP routes the resulting packets to the destination IP address. Because a machine's IP address might change over time, while the MAC address (the fixed physical address of a network card) normally does not, ARP is used to map an IP address to the MAC address of the machine that currently holds it.

The next layers down are hardware related: the data link layer and the physical layer. As that goes well beyond what we intend to learn here, I will stop right here.

And that is the whole process when we request a web page.
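The three-way handshake can be sketched as a tiny simulation. This is not real TCP: `three_way_handshake` is a made-up function, the segments are plain dictionaries, and the initial sequence numbers are passed in by hand, whereas a real stack picks them itself.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three segments exchanged to open a TCP connection.

    client_isn / server_isn are the initial sequence numbers each side picks.
    """
    # 1. Client -> server: "I want to connect, my sequence starts here."
    syn = {"from": "client", "flags": "SYN", "seq": client_isn}
    # 2. Server -> client: acknowledges the client's SYN and sends its own.
    syn_ack = {
        "from": "server",
        "flags": "SYN+ACK",
        "seq": server_isn,
        "ack": syn["seq"] + 1,
    }
    # 3. Client -> server: acknowledges the server's SYN.
    ack = {
        "from": "client",
        "flags": "ACK",
        "seq": syn["seq"] + 1,
        "ack": syn_ack["seq"] + 1,
    }
    return [syn, syn_ack, ack]


segments = three_way_handshake(client_isn=100, server_isn=300)
```

Each side acknowledges the other's sequence number plus one, which is how both ends confirm that the other can send and receive.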

Intention of request

The most common request methods that we use in web development are POST and GET.

GET is usually used to 'get' data from the server, while POST is used to 'post' data to it.

As a matter of fact, HTTP also supports other request methods, such as HEAD, PUT, DELETE, OPTIONS and so on.

The reason HTTP provides multiple request methods is to let the server know what the client wants to do. For example, when the request method is OPTIONS, the server usually returns the list of methods it supports, in the Allow response header.

The RESTful architectural style, which is now prevalent, relies on these HTTP methods as well.
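Here is a rough sketch of how a server might branch on the request method. `RESOURCES`, `ALLOWED` and `handle` are hypothetical names, and the set of supported methods is just an assumption for the example.

```python
# Hypothetical in-memory resource store: path -> content.
RESOURCES = {"/hello.html": "<h1>Hello</h1>"}
# Methods this toy server chooses to support (an assumption for the sketch).
ALLOWED = ("GET", "HEAD", "PUT", "DELETE", "OPTIONS")


def handle(method, path, body=""):
    """Return (status code, headers, body) for a request."""
    allow = {"Allow": ", ".join(ALLOWED)}
    if method not in ALLOWED:
        return 405, allow, ""  # Method Not Allowed
    if method == "OPTIONS":
        return 204, allow, ""  # tell the client which methods are supported
    if method == "PUT":
        created = path not in RESOURCES
        RESOURCES[path] = body
        return (201 if created else 204), {}, ""
    if path not in RESOURCES:
        return 404, {}, ""
    if method == "DELETE":
        del RESOURCES[path]
        return 204, {}, ""
    content = RESOURCES[path]
    headers = {"Content-Length": str(len(content))}
    # HEAD returns the same headers as GET, but no body.
    return 200, headers, (content if method == "GET" else "")
```

The same path gives a different result depending on the method, which is exactly how the method conveys the client's intention.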

HTTP is stateless

When we say that HTTP is stateless, we mean that the protocol itself keeps no state between requests: each request is handled independently, and the server doesn't know who sent the previous one. The purpose of this design is to keep HTTP simple, so that a server can process a large number of requests quickly.

Although HTTP is stateless, we often use cookies to keep track of state. For example, the server can send a cookie to the client to remember who the client is. When the client visits the server again, the browser automatically attaches the cookie to the request, and the server can easily identify the client.
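The cookie round trip can be sketched with Python's standard `http.cookies` module. The three helper functions and the cookie name `session_id` are assumptions made up for this example, and the "browser" here is only a stand-in for what a real browser does.

```python
from http.cookies import SimpleCookie


def server_issue_cookie(session_id):
    """Server side: build the value of a Set-Cookie response header."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["path"] = "/"
    return cookie["session_id"].OutputString()


def browser_cookie_header(set_cookie_value):
    """Browser side: store the cookie, then build the Cookie request header."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


def server_identify(cookie_header):
    """Server side, next visit: read the session id back, if there is one."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    return morsel.value if morsel else None


set_cookie = server_issue_cookie("abc123")
next_request_cookie = browser_cookie_header(set_cookie)
who = server_identify(next_request_cookie)
```

HTTP itself stays stateless here. The state travels inside the headers of each request.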

Continuous connection

In HTTP/1.0, the TCP connection is closed as soon as a request is done. That is fine if we only request a small file. Things are different if we request a webpage that is full of resources such as images. Loading each image is a separate HTTP request to the server (depending on where the image is hosted). While loading the webpage, the browser repeatedly builds a TCP connection, obtains one image and tears the connection down again.

Such behavior consumes far too many resources for a single page. In HTTP/1.1, multiple requests can be sent over one TCP connection. This is what we usually refer to as a persistent connection, HTTP keep-alive, or HTTP connection reuse. On top of that, HTTP/1.1 pipelining allows the client to send the second request without waiting for the first response, although browsers rarely enable it in practice.
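We can observe connection reuse with a small experiment on the local machine. This sketch assumes the loopback interface is available; `EchoPortHandler` and `fetch_twice_on_one_connection` are made-up names. The toy server replies with the client's TCP port, so two requests that report the same port travelled over the same connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoPortHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 lets http.server keep the connection open.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # Reply with the client's TCP port number as the body.
        body = str(self.client_address[1]).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_twice_on_one_connection():
    """Send two GET requests over a single HTTPConnection."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoPortHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection(
        "127.0.0.1", server.server_address[1], timeout=5
    )
    try:
        ports = []
        for path in ("/a.png", "/b.png"):
            conn.request("GET", path)
            ports.append(conn.getresponse().read().decode())
        return ports
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
```

Calling `fetch_twice_on_one_connection()` returns two port numbers. With keep-alive they are identical, because the TCP connection was built only once.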

Increase data transmission efficiency

Before we get into that, we need to understand what an HTTP entity is.

HTTP entity - to be added.

Commonly used HTTP status code

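Category	Code	Description
Success	200	OK: request processed successfully
Success	204	No Content: request processed successfully, but there is no body to return (the page does not reload)
Success	206	Partial Content: only part of the resource is returned (the range is given by Content-Range)
Redirect	301	Moved Permanently: the resource has been assigned a new URL for good, so the URL should be updated
Redirect	302	Found: the resource is temporarily available at another URL, so the original URL stays valid
Redirect	303	See Other: same as 302, but explicitly requires the client to use GET to retrieve the resource
Redirect	304	Not Modified: the condition of a conditional request was not met (the resource is unchanged), so the client can use its cached copy
Redirect	307	Temporary Redirect: same as 302, except the client must not change a POST request to a GET request
Client Error	400	Bad Request: the request message has a syntax error
Client Error	401	Unauthorized: the request requires authentication
Client Error	403	Forbidden: access to the resource is refused
Client Error	404	Not Found: the requested resource does not exist on the server
Server Error	500	Internal Server Error: the server failed while processing the request
Server Error	503	Service Unavailable: the server is too busy or down for maintenance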
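The first digit of a status code tells us its category, and Python's standard `http.HTTPStatus` enum knows the official reason phrases. The `CATEGORIES` mapping and the `describe` helper below are just an illustration, not part of any library.

```python
from http import HTTPStatus

# First digit of the status code -> category name.
CATEGORIES = {
    1: "Informational",
    2: "Success",
    3: "Redirect",
    4: "Client Error",
    5: "Server Error",
}


def describe(code):
    """Return 'category code phrase' for a standard status code."""
    status = HTTPStatus(code)  # raises ValueError for unknown codes
    category = CATEGORIES[code // 100]
    return f"{category} {code} {status.phrase}"


summary = [describe(code) for code in (200, 304, 404, 503)]
```

For example, `describe(404)` gives "Client Error 404 Not Found".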

Application program between server and client

An HTTP server can host multiple sites, which means it can be configured to support multiple virtual hosts. When users visit these different sites, they are actually sending requests to the same HTTP server.

There are also some applications that sit between the client and the server.

A proxy is a special network service that allows one network endpoint (the client) to reach another endpoint (usually a server) without establishing a direct connection. Network devices such as routers often provide this functionality. A server that provides such a service is referred to as a proxy server. Proxies are commonly used to protect privacy and security.

A proxy works like this: the client first establishes a connection with the proxy server, then the proxy server connects to the actual target server to obtain the resource, such as a file. The proxy may store the resource in its local cache before returning it to the client.
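The proxy flow can be sketched in a few lines. `CachingProxy` and `fake_origin` are hypothetical: the "origin server" is just a function, and a real proxy would also respect cache expiry and the caching headers sent by the server.

```python
class CachingProxy:
    """Toy proxy: fetch from the origin once, then answer from the cache."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # function that talks to the target server
        self._cache = {}                 # url -> resource
        self.origin_hits = 0             # how often we really contacted the origin

    def get(self, url):
        if url not in self._cache:
            self.origin_hits += 1
            self._cache[url] = self._fetch(url)
        return self._cache[url]


def fake_origin(url):
    """Stand-in for the actual target server."""
    return f"content of {url}"


proxy = CachingProxy(fake_origin)
first = proxy.get("http://example.com/a.zip")
second = proxy.get("http://example.com/a.zip")
```

The client only ever talks to `proxy`, and the second request never reaches the origin.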

Gateway - to be added.

Tunnel - to be added.

HTTP Header

An HTTP message consists of the following parts:

Part	Example
Request Start Line	GET /hello.html HTTP/1.1
Request Header	Host: www.example.com User-Agent: Mozilla/4.0 (compatible; MSIE5.01; Windows NT) Accept: text/html Accept-Language: en-us Accept-Encoding: gzip, deflate
General Header	Connection: keep-alive Upgrade-Insecure-Requests: 1
Entity Header	Content-Type: application/x-www-form-urlencoded Content-Length: 19
Blank Line
Message Body	id=12345&value=true
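The parts of a message can be pulled apart with plain string operations, because the blank line is what separates the headers from the body. `split_message` and the sample request are made up for this sketch, and a real parser has to handle many more edge cases.

```python
def split_message(raw):
    """Split a raw HTTP message into (start line, headers, body)."""
    # The first blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# Hypothetical sample request, not taken from a real capture.
raw_request = (
    "POST /submit HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 19\r\n"
    "\r\n"
    "id=12345&value=true"
)

start_line, headers, body = split_message(raw_request)
```

Note that Content-Length matches the number of bytes in the body, which is how the receiver knows where the message ends.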

HTTP response header examples:

Header	Description
Location: http://example.com	server tells browser to redirect to that webpage
Server: apache tomcat	server tells browser what the web server software is
Content-Encoding: gzip	server tells browser which compression format the data uses
Content-Length: 80	server tells browser the length of the returned data in bytes
Content-Language: en-us	server tells browser the language of the returned content
Content-Type: text/html	server tells browser the type of the returned data
Last-Modified: Thu, 11 Jul 2019	server tells browser when the data was last updated
Refresh: 1; url=http://example.com	server tells browser to load that URL after 1 second
Content-Disposition: attachment; filename=a.zip	server tells browser to download the data as a file named a.zip
Transfer-Encoding: chunked	server tells browser that the data is sent in chunks
Set-Cookie: SS=Q0=5Lb_nQ; path=/search	server tells browser to save a cookie
Expires: -1	server tells browser the content is already expired, so it should not be cached
Cache-Control: no-cache	server tells browser to revalidate with the server before reusing a cached copy
Pragma: no-cache	HTTP/1.0 equivalent of Cache-Control: no-cache
Connection: close/Keep-Alive	server tells browser whether the connection will be closed or kept open
Date: Thu, 11 Jul 2019 18:23:41 GMT	server tells browser when the response was generated
more to be added...
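HTTP headers use the same "Name: value" layout as email headers, so Python's standard `email` package can read them. This is only a sketch on a hand-written header block (the values are sample data), not how a browser does it.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Sample response headers, written by hand for this example.
raw_headers = (
    "Content-Type: text/html\r\n"
    "Content-Disposition: attachment; filename=a.zip\r\n"
    "Date: Thu, 11 Jul 2019 18:23:41 GMT\r\n"
    "Cache-Control: no-cache\r\n"
    "\r\n"
)

msg = message_from_string(raw_headers)

content_type = msg.get_content_type()         # from Content-Type
filename = msg.get_filename()                 # from Content-Disposition
sent_at = parsedate_to_datetime(msg["Date"])  # HTTP date -> datetime
must_revalidate = msg["cache-control"] == "no-cache"  # lookup ignores case
```

Once the headers are parsed, each one answers a different question: what the data is, what to do with it, when it was sent, and whether it may be cached.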

HTTPS in brief

HTTP is not safe by nature, as the content is not encrypted at all. It doesn't verify the identity of either the server or the client. It doesn't guarantee data integrity either (data can be altered by a third party before reaching the recipient).

There are tools that can easily capture HTTP traffic. Even if you encrypt the HTTP message body yourself, only the content is encrypted. Someone who intercepts the message can still tamper with it even if they can't decrypt it.

The best way to establish a secure HTTP connection is to run HTTP over SSL/TLS, which we usually refer to as HTTPS (the S stands for secure). We will talk more about HTTPS in the next post.

As for identity, HTTPS relies on third-party certificate authorities (CAs) to issue valid certificates. From the certificate a server presents, we can tell whether the server is legitimate or not.

I hope this post is somewhat useful. Some parts remain to be completed soon. That's all for now, see ya!
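In Python, the certificate check is what the standard `ssl` module sets up by default for clients. The sketch below only builds the configuration and does not connect anywhere; `make_client_context` is a made-up helper name.

```python
import ssl


def make_client_context():
    """Build a TLS client configuration with the library's secure defaults."""
    # create_default_context() trusts the system's CA certificates,
    # requires the server to present a valid certificate, and checks
    # that the certificate matches the host name we connect to.
    return ssl.create_default_context()


context = make_client_context()
requires_certificate = context.verify_mode == ssl.CERT_REQUIRED
checks_hostname = context.check_hostname
```

A client that connects with such a context rejects a server whose certificate was not issued by a trusted CA or does not match the domain.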

Post was published on 8 Apr 2019, last updated on 8 Apr 2019.
