https://google.com is a Uniform Resource Locator (URL). It can be divided into two parts:
https -> Hypertext Transfer Protocol Secure. It is an application-layer protocol (a set of rules) used to communicate on the internet, by default on port 443. HTTPS is HTTP carried over Transport Layer Security (TLS), a cryptographic protocol that encrypts the data exchanged with the server. TLS is the successor to the older Secure Sockets Layer (SSL), which is why the two names are often used interchangeably.
google.com -> Domain name. This maps to an IP address that belongs to Google.
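As a rough illustration of this split, the sketch below uses Python's standard urllib.parse module to pull the scheme and host out of the URL. The describe_url helper and the DEFAULT_PORTS table are names made up for this example; a browser's own URL parser is far more involved.

```python
from urllib.parse import urlsplit

# Well-known default ports, used when the URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_url(url: str) -> dict:
    """Split a URL into the pieces a client needs before it can connect."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        # parts.port is None when the URL has no explicit ":port" suffix.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        # An empty path means the root of the site.
        "path": parts.path or "/",
    }


info = describe_url("https://google.com")
print(info)
```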
Servers are addressed by IP address, so the browser must translate google.com into one. It first checks its own cache to see whether it has recently connected to google.com. If no match is found, the operating system's local hosts file is consulted (/etc/hosts on Linux), and then the operating system's DNS cache. If there is still no match, a query is sent to a Domain Name System (DNS) server, usually the one provided by the internet service provider (ISP), over the User Datagram Protocol (UDP) on port 53, asking for the IP address of google.com.
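The lookup order described above can be sketched as a chain of fallbacks. This is only a model: the caches are plain dictionaries, fake_dns stands in for the real DNS query, and 192.0.2.10 is a reserved documentation address, not Google's. In real Python code the whole chain is hidden behind socket.getaddrinfo.

```python
from typing import Callable, Dict


def resolve(
    name: str,
    browser_cache: Dict[str, str],
    hosts_file: Dict[str, str],
    os_cache: Dict[str, str],
    query_dns: Callable[[str], str],
) -> str:
    """Return an IP address for name, trying the cheapest source first."""
    for source in (browser_cache, hosts_file, os_cache):
        address = source.get(name)
        if address is not None:
            return address
    # Nothing local matched, so ask the DNS server (UDP port 53 in practice).
    address = query_dns(name)
    # Remember the answer so the next lookup is served locally.
    os_cache[name] = address
    browser_cache[name] = address
    return address


def fake_dns(name: str) -> str:
    """Hypothetical stand-in for a real DNS query."""
    return "192.0.2.10"


browser_cache: Dict[str, str] = {}
os_cache: Dict[str, str] = {}
hosts_file = {"localhost": "127.0.0.1"}

first = resolve("google.com", browser_cache, hosts_file, os_cache, fake_dns)
local = resolve("localhost", browser_cache, hosts_file, os_cache, fake_dns)
print(first, local, browser_cache)
```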
Now the browser has:
Protocol -> https port 443
Domain Name -> google.com
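IP address -> the address returned by the DNS lookup (8.8.8.8 is only a placeholder here; that address actually belongs to Google's public DNS service, not to its web servers)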
path -> /
Any firewall placed in front of the servers must be configured to allow traffic on port 443; otherwise the browser cannot make a TCP connection to the server.
Now the browser connects to the server using the Transmission Control Protocol (TCP), a communications standard that lets application programs and computing devices exchange messages over a network. TCP provides reliable, ordered delivery: lost packets are detected and retransmitted.
The browser uses the three-way handshake to establish a TCP connection:
Synchronize (SYN) -> The browser sends a SYN to the server.
Synchronize-acknowledge (SYN-ACK) -> The server replies with a SYN-ACK to confirm that it received the browser's SYN.
Acknowledge (ACK) -> The browser sends an ACK back to the server.
Now the connection between the browser and the server is established.
TCP three-way handshake
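Application code never sends SYN or ACK packets itself; the operating system performs the handshake when a program asks to connect. The sketch below shows this with Python's socket module. To stay runnable without internet access it connects to a throwaway listener on 127.0.0.1 instead of google.com, and open_tcp_connection is a helper name invented for this example.

```python
import socket


def open_tcp_connection(host: str, port: int, timeout: float = 5.0) -> socket.socket:
    """Open a TCP connection.

    create_connection() returns only after the kernel has completed the
    SYN, SYN-ACK, ACK exchange with the remote side.
    """
    return socket.create_connection((host, port), timeout=timeout)


# A local listener plays the role of the server; port 0 lets the OS pick a free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

client = open_tcp_connection("127.0.0.1", port)
server_side, client_address = listener.accept()

connected_to = client.getpeername()   # (server IP, server port)
local_address = client.getsockname()  # (client IP, client ephemeral port)

for sock in (client, server_side, listener):
    sock.close()

print("client", local_address, "is connected to", connected_to)
```

For the real case the call would be open_tcp_connection("google.com", 443), which also triggers the DNS lookup described earlier.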
Next, the browser and the server perform a TLS handshake. The server presents its TLS certificate, and the browser verifies it: the certificate must be issued for google.com, be within its validity period, and chain to a certificate authority the browser trusts. Both sides then derive the same session keys, which are used to encrypt the traffic exchanged between them.
SSL TLS process
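A minimal sketch of the client side of this step, using Python's standard ssl module. The function names make_client_context and open_tls_connection are invented for the example, and the network call is only defined, not executed, so the snippet runs offline. It shows that certificate verification and hostname checking are what the default client context enforces.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a TLS context with the library's recommended client defaults.

    The default context loads the system's trusted certificate authorities,
    requires a valid server certificate, and checks that it matches the host.
    """
    return ssl.create_default_context()


def open_tls_connection(host: str, port: int = 443, timeout: float = 5.0) -> ssl.SSLSocket:
    """TCP connect, then run the TLS handshake on top of the connection."""
    context = make_client_context()
    raw = socket.create_connection((host, port), timeout=timeout)
    try:
        # wrap_socket() performs the handshake: certificate check and key agreement.
        return context.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise


context = make_client_context()
print(context.verify_mode, context.check_hostname)
```

With network access, open_tls_connection("google.com") would return a socket whose traffic is encrypted with the negotiated session keys; a certificate that fails verification raises ssl.SSLCertVerificationError instead.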
The browser now sends an HTTP GET request, which consists of headers only and has no body, for the root path of https://google.com/.
The server, in this case a load balancer, receives the request and uses the session keys to decrypt it. Using one of the load-balancing algorithms, e.g. round robin, it forwards the request to one of the web servers for processing.
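For illustration, this is roughly what such a request looks like on the wire before TLS encrypts it. The build_get_request helper is a made-up name, and the header set is a bare minimum; a real browser sends many more headers (User-Agent, Accept-Language, cookies, and so on).

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request: a request line plus headers, no body."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",          # required in HTTP/1.1
        "Accept: text/html",
        "Connection: close",
        "",                       # blank line marks the end of the headers
        "",
    ]
    # HTTP uses CRLF line endings.
    return "\r\n".join(lines).encode("ascii")


request = build_get_request("google.com")
print(request.decode("ascii"))
```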
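Round robin simply hands each new request to the next server in a fixed rotation. A small sketch, where the class name and the backend names web-01 to web-03 are placeholders for this example:

```python
from itertools import cycle
from typing import List


class RoundRobinBalancer:
    """Pick backends in a fixed, repeating order."""

    def __init__(self, backends: List[str]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        # cycle() yields the backends in order and starts over at the end.
        self._rotation = cycle(backends)

    def pick(self) -> str:
        """Return the backend that should receive the next request."""
        return next(self._rotation)


balancer = RoundRobinBalancer(["web-01", "web-02", "web-03"])
picks = [balancer.pick() for _ in range(5)]
print(picks)  # ['web-01', 'web-02', 'web-03', 'web-01', 'web-02']
```

Real load balancers usually combine this with health checks so that a failed server is skipped, which this sketch leaves out.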
The web server identifies the request as /, which is the root, and most probably, this will be index.html or index.php.
If this is a static HTML page being requested, the web server will retrieve the static HTML (index.html) from the static files and send it back to the load balancer.
If the requested content is dynamic, the web server forwards the request to the application server.
The application server connects to the database and retrieves the required information according to the business logic for that request. It then populates the contents of the index page and sends the result back to the web server.
Now the web server will send back the requested page with the required information to the load balancer.
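The static versus dynamic decision above can be sketched as a single function. Everything here is hypothetical: static_files is an in-memory stand-in for the files on disk, and render_dynamic stands in for the application server and its database query.

```python
from typing import Callable, Dict, Optional, Tuple


def handle_request(
    path: str,
    static_files: Dict[str, str],
    render_dynamic: Callable[[str], Optional[str]],
) -> Tuple[int, str]:
    """Return (status code, body) for a requested path."""
    # Serve a static file directly when one exists for this path.
    if path == "/" and "/index.html" in static_files:
        return 200, static_files["/index.html"]
    if path in static_files:
        return 200, static_files[path]
    # Otherwise hand the request to the application server.
    body = render_dynamic(path)
    if body is None:
        return 404, "Not Found"
    return 200, body


def fake_app_server(path: str) -> Optional[str]:
    """Hypothetical application server: builds a page from stored data."""
    users = {"/users/1": "Ada"}  # stands in for a database lookup
    name = users.get(path)
    return None if name is None else f"<h1>Hello, {name}</h1>"


static_site = {"/index.html": "<h1>Home</h1>"}

root = handle_request("/", static_site, fake_app_server)
dynamic = handle_request("/users/1", static_site, fake_app_server)
missing = handle_request("/nope", static_site, fake_app_server)
print(root, dynamic, missing)
```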
The load balancer now encrypts the response and sends it back to the browser with a status code of 200, meaning success. Because the body is an HTML document, the response carries the header Content-Type: text/html.
If the data were being sent back as JSON instead, the header would be Content-Type: application/json.
Now, the browser has received the response with a status code of 200, meaning all went well. It will decrypt the data received and then check the content type. If it is HTML, it will parse it and display it to the user as a web page.
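To make the last step concrete, the sketch below parses an already decrypted response and chooses how to interpret the body from its Content-Type header. The function names and the sample responses are invented for the example, and the parser ignores details such as chunked encoding that a real browser must handle.

```python
import json
from typing import Any, Dict, Tuple


def parse_response(raw: bytes) -> Tuple[int, Dict[str, str], bytes]:
    """Split a raw HTTP response into status code, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    # Status line looks like: HTTP/1.1 200 OK
    status = int(status_line.split(" ", 2)[1])
    headers: Dict[str, str] = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return status, headers, body


def interpret_body(headers: Dict[str, str], body: bytes) -> Any:
    """Decode the body according to the Content-Type header."""
    media_type = headers.get("content-type", "").split(";")[0].strip().lower()
    if media_type == "application/json":
        return json.loads(body)
    if media_type == "text/html":
        return body.decode("utf-8")  # a browser would now parse and render this
    return body


html_raw = b"HTTP/1.1 200 OK\r\nContent-Type: text/html; charset=utf-8\r\n\r\n<h1>Hello</h1>"
json_raw = b'HTTP/1.1 200 OK\r\nContent-Type: application/json\r\n\r\n{"ok": true}'

status, headers, body = parse_response(html_raw)
page = interpret_body(headers, body)
_, json_headers, json_body = parse_response(json_raw)
data = interpret_body(json_headers, json_body)
print(status, page, data)
```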