Cloud Computing for Beginners
In this blog we will learn about cloud computing by setting up a secure connection with a remote system, far far away, and using it to process our code.
By the end of this article you will understand terms like SSH, vCPU, and public keys, and you'll be perfectly comfortable with setting up your own servers.
Specifically, we will use a powerful cloud infrastructure named Vultr and we will explore it step by step. So whether you're training Large Language Models, hosting a website or just curious to learn some new skills - this blog is perfect for you!
Video Tutorial
If you prefer watching this blog rather than reading it, I've created a very nice video tutorial that covers the same content. Otherwise - please scroll down and enjoy the article version.
1. What is Cloud Computing?
Cloud computing means running tasks on a remote computer system. So if we live in Vancouver, our processor, GPU or storage may live in London, New York or anywhere other than home.
But why on Earth would we want to do that?
Sometimes our program needs more processing power than what we actually have. So instead of upgrading our system components, we can just rent them.
And it’s also a matter of scalability. Today you have a small blog with 8 readers, but tomorrow it might grow into a giant community with millions of contributors. Our needs might change over time, and cloud computing can help with that.
2. Connect your Computer to a Cloud Service
Our first cloud-related task is to connect our computer to the cloud infrastructure we would like to use. For this we will need something called an SSH Key, or Secure Shell Key.
2.2 SSH Key
In simple terms, this key represents login credentials that allow secure communication between 2 computers. In our case, that means communication between our personal computer, which we will call the host (SSH documentation usually calls it the client), and a remote computer far far away, also known as the server.
2.3 Cloud Service
To demonstrate how this communication is established, we will use a cloud infrastructure named Vultr. The same principles, however, apply to other cloud services as well.
If you'd like to follow along using Vultr, I got you $250 off on your first 30 days. Please create your account using the following link to enjoy this nice bonus:
2.4 Generate SSH Key
Before we can securely connect to a remote system, we will need to generate a Secure Shell Key on the host computer (our local system). If you're using Windows, I highly recommend doing so via WSL, also known as the Windows Subsystem for Linux.
2.4.1 Install WSL
To install the current version of WSL, please open your Command Prompt terminal as administrator and type:
wsl --install
Please note, once WSL is installed, you'll need to restart your computer.
2.4.2 Use WSL to Generate SSH Key
From the Start menu, we will open the WSL console and type:
ssh-keygen -t ed25519 -C "me@email.com"
ssh-keygen: means "generate an SSH key"
-t: is a flag that specifies the key type, meaning the algorithm used to generate our key. An algorithm means "a set of instructions", and we will discuss the term "encryption" shortly.
ed25519: is the name of the key algorithm we would like to use.
-C: is a flag that indicates a comment.
"me@email.com": please replace the email address within the quotes with the email address of your cloud account.
We will then hit ENTER to save our key in the default location, and we can even set a passphrase (a password that protects the key) for extra security, which will come in handy later on.
If you successfully performed all the tasks specified above, you should get the following terminal output (with slight variation, depending on your email and WSL configurations):
2.4.3 Navigate to SSH Key Location
Please refer to the terminal output highlighted in the screenshot above, in my case:
>> Your public key has been saved in /root/.ssh/id_ed25519.pub
And let's navigate there from the new Linux drive in your file system (it was automatically generated when you installed WSL), in my case:
\\wsl.localhost\Ubuntu\root\.ssh
If you struggle with finding your Linux drive, you can either type \\wsl.localhost in any File Explorer window, or you can watch my video tutorial where I demonstrate it step by step.
Inside the .ssh directory, we can find a file with a .pub extension. This file represents something called a public key.
2.4.4 What is a Public Key?
In general terms, a public key is something we use to encrypt a message. Which means:
we take information from one computer.
we transform it into a secret cipher.
and then we can safely share our message on the internet.
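We often use different kinds of algorithms to convert a piece of text into absolute gibberish and vice versa. In our case, we generated an ed25519 key pair, which SSH uses to prove our identity to the server; the encryption of the session itself is negotiated by SSH when we connect.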
The beauty of encryption is that in case our secret message is received by the wrong computer - let’s say a black hat tries to snatch it somewhere along the way - they may get our message, but they won’t be able to read it.
However, if the message is received by the correct computer - only then can it be transformed back into readable text. We call this process decryption, which is the opposite of encryption.
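To make the encrypt/decrypt round trip more tangible, here is a toy sketch in Python. Please treat it as an illustration only: it uses a simple repeating-key XOR with a single shared key, which is an assumption made for brevity and is not what SSH or ed25519 actually do (real public key systems use a pair of different keys). The names xor_bytes and secret_key are made up for this example.

```python
# Toy illustration only: a repeating-key XOR cipher, NOT the cryptography SSH uses.
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

message = "hello from Vancouver".encode("utf-8")
secret_key = b"not-a-real-key"  # hypothetical key, for illustration only

ciphertext = xor_bytes(message, secret_key)    # encryption: readable -> gibberish
recovered = xor_bytes(ciphertext, secret_key)  # decryption: gibberish -> readable

print(ciphertext)                 # unreadable bytes
print(recovered.decode("utf-8"))  # the original message again
```

Anyone who intercepts ciphertext without the key only sees the gibberish, while the computer holding the key can turn it back into the original text.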
2.4.5 Connect Public Key to Cloud Service
Once we have a basic understanding of the role of public keys, we will open our public key file with some text editor, in my case:
\\wsl.localhost\Ubuntu\root\.ssh\id_ed25519.pub
And we will copy the entire content of this file.
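If you are curious about what you just copied, a public key file is a single line of text with three space-separated parts: the key type, the key itself, and the comment we passed with -C. Below is a minimal Python sketch that splits such a line apart. The function name parse_public_key and the sample line are made up for illustration, and the key body is a placeholder rather than a real key.

```python
def parse_public_key(line: str) -> dict:
    """Split one line of an OpenSSH .pub file into its parts."""
    parts = line.strip().split(None, 2)  # at most 3 parts; the comment may contain spaces
    if len(parts) < 2:
        raise ValueError("expected at least a key type and a key body")
    return {
        "key_type": parts[0],
        "key_body": parts[1],
        "comment": parts[2] if len(parts) == 3 else "",
    }

# Made-up sample line; the key body is a placeholder, not a real key.
sample = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5PLACEHOLDER me@email.com"
info = parse_public_key(sample)
print(info["key_type"])
print(info["comment"])
```

When pasting the key into your cloud service, all three parts should be included, exactly as they appear in the file.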
Then, we will navigate to our cloud service in the browser. In the case of Vultr, we will click on "account" then "SSH Keys" and then we will press on the "Add SSH Key" button.
We will then choose a name for our host computer, we will paste the entire content of our public key file into the second text box and click on "Add SSH Key".
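Once we do so, we have officially connected our computer to the Vultr cloud! From now on, whoever holds the private key that matches this public key (the id_ed25519 file without the .pub extension, which should never be shared) is a trusted entity and everyone else is not.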
3. Deploy a Remote Server
To run our code on the cloud, we will need a server, which is a computer system that “serves” information from a piece of software to whoever is using it. We can choose between two categories of servers: shared and dedicated.
3.1 Shared Server
A shared server means that multiple people can use parts of the same computing system. For example: if the system has 1 TB of storage, 500 GB are reserved for me and the other 500 are reserved for Batman. We are sharing remote system resources with others, which translates to a lower cost.
3.2 Dedicated Server
A dedicated server means that only you can access a remote computing system, so in the case of our example, the entire TB is mine. This usually translates to better performance but also a higher cost.
So unless you really need a dedicated server - it might be a good idea to use a shared one instead.
3.3 Deploy a Shared Server
So let’s start by deploying a very simple shared server, choosing "Deploy +" and then "Deploy New Server".
Then, we will go for the "Cloud Compute - Shared CPU" option. And right underneath, we will begin specifying details such as server location, server image and server size.
3.3.1 Server Location
Server location relates to a physical location where we’d like to place our server.
The rule of thumb is, you want it close to where your users live. As the closer the server - the faster your software will load.
There are other considerations of course, such as state and country laws, local taxation and legal aspects that may make one location more appealing than the other. (Thank you Tobs for mentioning it during the live chat of the video premiere).
3.3.2 Server Image
A server image can either be an operating system, or a specific development environment, such as Anaconda, Docker, or WordPress.
If you choose an operating system, you'll need to install your own environment. However, if you choose a pre-installed software environment as your server image, you may not have control over the kind of operating system that comes with it.
3.3.3 Server Size
The size of your server depends on the requirements of your software, which you may not know in advance. To estimate it, we will consider several factors.
3.3.3.1 Server Storage
The challenge with storage is that the size of our source code doesn’t always translate to the size of storage we might need. In addition to your code, you'll usually need libraries, GUI frameworks, expanding databases and so on.
For example, the source code of my Random Recipe Picker, that we've built together in a previous tutorial, is almost 5 MB in size.
But its executable version on a Windows machine (featured in another tutorial) is 31.5 MB, which is more than 6 times bigger than its actual code.
3.3.3.2 Number of vCPUs
CPUs or Central Processing Units are physical pieces of hardware that, in general terms, represent computer brains. Where, for example, I have a 12th Gen Intel i9 CPU that has 16 processing cores. 8 of those cores are used for performance, the other 8 are used for efficiency.
The main difference between the two kinds of cores, is that performance cores can run 2 different tasks all at once, while efficiency cores can only run 1 task at a time.
As a result, my physical CPU can run a total of 24 tasks in parallel. So we say that it has a total of 24 threads.
A vCPU or Virtual CPU, on the other hand, only represents a single thread. In the case of my processor, a single vCPU has 1/24th of its processing capabilities.
So if your software includes multithreading components - please make sure to reserve enough vCPUs to accommodate it. Otherwise, 1 or 2 vCPUs are perfectly fine, as the number of threads may not have a drastic effect on your processing speed (you will see an example in section 7.2).
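The thread arithmetic above can be sketched in a few lines of Python. This is only a rough illustration under the article's own assumptions (2 threads per performance core, 1 thread per efficiency core); the function name total_threads is made up for this example.

```python
import os

def total_threads(performance_cores: int, efficiency_cores: int) -> int:
    """Count threads, assuming 2 per performance core and 1 per efficiency core."""
    return performance_cores * 2 + efficiency_cores * 1

threads = total_threads(performance_cores=8, efficiency_cores=8)
print(threads)      # 24
print(1 / threads)  # the share of the whole CPU behind a single vCPU

# Number of logical CPUs (threads) visible on the machine running this script.
# It may print None if Python cannot determine it.
print(os.cpu_count())
```

Running the last line on your own computer, and later on your server, is a quick way to compare how many threads each of them offers.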
3.3.3.3 Server Memory
Memory requirements mostly depend on the complexity of tasks that your software performs or the number of users that interact with your server at the same time.
Generally speaking, simple low traffic applications can get away with 2 GB of RAM, also known as Random Access Memory or simply Memory, while simple high traffic applications will need 4 GB at least.
The more complex or popular your application is - the more memory it will require. And if you don’t have enough memory - your app will crash or be extremely slow.
3.3.3.4 Server Bandwidth
Please keep in mind that when we measure bandwidth with TB or GB we are talking about space (an amount of data), but when we measure it with Mbps we are talking about speed.
In the case of Vultr, you'd be looking at the space factor, or a monthly allowance of data that can be transferred between the app and its users.
For example, if I request an image of a cat and the server provides it, it will take up a few MB of bandwidth. But if I download a game like Cyberpunk from my server, it would take tens of GB instead. So the size of bandwidth depends on how much data you're planning to exchange with your users.
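A back-of-the-envelope estimate of the monthly allowance can be sketched as follows. The numbers (a 3 MB response and 2,000 requests per day) are made-up assumptions for illustration, and monthly_transfer_gb is a hypothetical helper, not something provided by Vultr.

```python
def monthly_transfer_gb(avg_response_mb: float, requests_per_day: int, days: int = 30) -> float:
    """Estimate monthly data transfer in GB, using decimal units (1 GB = 1000 MB)."""
    return avg_response_mb * requests_per_day * days / 1000

# Hypothetical traffic: 2,000 cat images per day at roughly 3 MB each.
usage = monthly_transfer_gb(avg_response_mb=3, requests_per_day=2000)
print(usage)  # 180.0 (GB per month)
```

If the estimate lands close to the allowance of the plan you are considering, it is probably worth picking the next size up.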
3.3.4 Server SSH Key Specification
When deploying your server, you'll see a Secure Shell section, where you can simply choose your SSH key from earlier. In many cases you'd have the option to choose from several keys associated with several physical systems you own.
4. Control Remote Server
Once we deploy our server, we will wait a minute or two for our server to be created at the specified location. Then, we can start controlling it from our local computer terminal.
4.1 Establish SSH Connection via WSL Terminal
First, we will copy the IP address of our newly generated server from Vultr. We will then navigate back to our WSL terminal and we will type:
ssh root@155.138.245.163
Where you'll need to replace 155.138.245.163 with your own IP address.
Then you'll be prompted for the passphrase of your key, which you selected earlier in section 2.4.2.
Enter passphrase for key '/root/.ssh/id_ed25519':
However, if you haven’t set this password or if you perhaps forgot it, we can just skip the first prompt with ENTER, and then we can provide the password of our server instead to the following prompt.
root@155.138.245.163's password:
4.1.1 Server Root Password
To find the root password of our server, we will expand the server information on Vultr.
Then right under the root username, we will copy the server password and paste it back in our WSL terminal.
4.1.2 Verify SSH Connection with Server
How do we know that our connection was successful? First, you will notice that the prompt of our terminal changed to root@myapp, where myapp represents the name you've chosen for your server.
In addition, we can verify that the specifications we chose for our server size were indeed implemented by our cloud provider by typing:
sudo lshw
This runs a tool called lshw ("list hardware") with administrative privileges (sudo), and it returns detailed configurations of our system such as CPU make, model, number of threads, number of cores, size of memory, etc.
Please note, since root@myapp already represents the administrator of the server, you may not need to add the sudo portion of the command. However, I included it here just in case you run into some issues with privileges.
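If you prefer a much shorter summary than the full hardware report, the same two numbers (threads and memory) can be read from Python as well. The sketch below assumes a Linux server, where memory details live in the /proc/meminfo file; mem_total_gb is a made-up helper name.

```python
import os

def mem_total_gb(meminfo_text: str) -> float:
    """Return the MemTotal value from /proc/meminfo-style text, in GB (1 GB = 1024**2 kB)."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kilobytes = int(line.split()[1])  # e.g. "MemTotal:  2097152 kB"
            return kilobytes / 1024**2
    raise ValueError("MemTotal not found")

print("logical CPUs:", os.cpu_count())

if os.path.exists("/proc/meminfo"):  # only present on Linux systems
    with open("/proc/meminfo") as f:
        print("memory (GB):", round(mem_total_gb(f.read()), 2))
```

The reported memory is usually slightly lower than the plan you paid for, since part of it is reserved by the system.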
4.2 Run Python on Remote Server
The most simple approach is to run our Python commands directly in the terminal with:
python3
Which allows us to pass our Python code line by line, for example:
print(sorted([4,3,2]))
Will immediately return:
[2, 3, 4]
However, typing our code directly in the console is probably not the best idea. So how about if we try to copy a Python file from our host computer and store it on the remote server?
Let's do that in the next section, but we will first need to exit our in-console Python interactions with:
exit()
In addition, we'll need to leave our SSH connection and go back to our WSL environment with:
exit
5. Secure Copy
5.1 Copy a File from Host to Server
For this, I’ve created a very nice file that performs a CPU speed test. You can find it on my GitHub in the following repository: https://github.com/MariyaSha/cloud_speedtests
To copy this file into our remote server, we will create a new folder at the root of our Linux drive. In my case it's: \\wsl.localhost\Ubuntu\root, however on your end, it may be stored at \\wsl.localhost\Ubuntu\home\username, depending on the user configured in your WSL.
We will call this folder cloud_computing and we will paste CPU_speedtest.py inside it.
You can verify that your folder was indeed created at the root directory of your Linux drive by typing the following command in your WSL terminal:
ls
Where ls represents the word "list" and returns a list of all files and directories found at the current location of your terminal. If you can find your cloud_computing directory inside the output - you are in the correct location. Otherwise - you'll need to navigate to the folder that stores your new cloud_computing directory with:
cd path/to/the/correct/location
Once we are in the correct location, we will type:
scp cloud_computing/CPU_speedtest.py root@155.138.245.163:./
scp: means "secure copy"
cloud_computing/CPU_speedtest.py: represents the relative path of the local file we would like to copy from our computer.
root@155.138.245.163: represents the address of our server. You'll need to replace the IP portion with your own server's IP of course.
./: represents the home directory of the root user on your server, or where you'd like to store the file we just copied.
If you'd like to replace ./ with something along the lines of ./my_new_directory, you'll need to create this directory first on the server with:
ssh root@155.138.245.163
mkdir my_new_directory
And only then you can copy your file into it with:
scp cloud_computing/CPU_speedtest.py root@155.138.245.163:./my_new_directory
If our file was copied successfully, you will get the following output:
5.2 Copy a Folder from Host to Server
In case you’d like to copy an entire directory instead of a single file, we can just add the -r flag right after secure copy and delete the /CPU_speedtest.py portion of the local path:
scp -r cloud_computing root@155.138.245.163:./
-r: is a flag that indicates you would like to include the entire content of the folder specified next. That way, you can copy an entire application from your local computer to your server in a single command.
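If you find yourself copying files often, the same scp commands can be assembled from a small Python script. This is just a sketch: build_scp_command is a hypothetical helper, and the line that would actually run the command is commented out, since it requires scp to be installed and a reachable server.

```python
import subprocess

def build_scp_command(local_path, server_ip, remote_dir="./", user="root", recursive=False):
    """Assemble an scp command as a list of arguments."""
    command = ["scp"]
    if recursive:
        command.append("-r")  # copy a whole directory, including its content
    command += [local_path, f"{user}@{server_ip}:{remote_dir}"]
    return command

file_cmd = build_scp_command("cloud_computing/CPU_speedtest.py", "155.138.245.163")
folder_cmd = build_scp_command("cloud_computing", "155.138.245.163", recursive=True)

print(" ".join(file_cmd))
print(" ".join(folder_cmd))

# Uncomment to actually run the copy (replace the IP with your own server's IP first):
# subprocess.run(folder_cmd, check=True)
```

Passing the command as a list, rather than one long string, avoids surprises with spaces in file names.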
5.3 Verify Files were Copied to the Server
We will make a connection once again by pressing the up arrow a few times and fetching our SSH connection command.
ssh root@155.138.245.163
Then we will list the files in the current directory:
ls
If you followed both of the steps above, you should see the following output:
6. Handling a Server Crash
Let's say we picked a server that does not satisfy the requirements of our software. Our code is more complex than we planned, and when we tried running CPU_speedtest.py, the execution of our code could not be completed due to a server crash.
But how do we know what made it crash? First, we know for sure that it wasn't an error in our code, as we would have seen an indication of it inside the WSL terminal. But how do we deal with errors that our WSL terminal cannot display?
6.1 Server Log Console
We will simply navigate to our Server Console that stores a log of all the communications between hosts and servers. We can find our server console when expanding the server details.
And in my case, the last log indicates that my server ran out of memory and it of course represents the reason behind the crash.
6.2 Upgrade Server
If our software does not have enough memory, we will either need to upgrade our server to accommodate it or revise our code to consume fewer resources (if possible).
In my case, I simply destroyed my existing server and created a new one that has more memory. When I ran my CPU_speedtest.py on the new server, my software finished executing without any issues.
Alternatively, you could reduce the size of the matrix generated within the code from:
x = [[random.random() for col in range(10000)] for row in range(10000)]
To something along the lines of:
x = [[random.random() for col in range(5000)] for row in range(5000)]
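To get a feeling for why the larger matrix can overwhelm a small server, we can roughly estimate how much memory a list of lists of floats takes. The sketch below is an approximation under stated assumptions: it counts one float object plus one 8-byte reference per cell (typical for 64-bit Python) and ignores the small overhead of the list objects themselves. estimate_matrix_bytes is a made-up helper name.

```python
import sys

def estimate_matrix_bytes(rows: int, cols: int) -> int:
    """Rough memory estimate for a rows x cols list of lists filled with floats."""
    float_size = sys.getsizeof(1.0)  # size of a single Python float object
    pointer_size = 8                 # assumed: one 8-byte reference per list slot (64-bit)
    return rows * cols * (float_size + pointer_size)

for n in (10000, 5000):
    gigabytes = estimate_matrix_bytes(n, n) / 1e9
    print(f"{n} x {n}: roughly {gigabytes:.1f} GB")
```

On a typical 64-bit Python, where a float object takes 24 bytes, this works out to roughly 3.2 GB for the 10,000 x 10,000 matrix and about 0.8 GB for the 5,000 x 5,000 one. Halving each dimension cuts the memory to a quarter, which is why the smaller matrix has a much better chance of fitting into 2GB.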
6.3 Destroy Server
Destroying your server when it is no longer required is very important! Even if you are not interacting with it, you are still reserving remote system resources from your cloud provider. As long as your server exists, you will be charged for it. Therefore, please make sure to destroy your servers when you are done using them.
7. Speed Tests
7.1 Shared vs. Dedicated
I ran my CPU speed test on shared and dedicated systems with 1 vCPU and 2GB of memory.
shared: as shown in the screenshot of section 6, the results of the first speed run were:
run 0: | speed: 16.25 seconds
dedicated: the first speed run was almost twice as fast:
run 0: | speed: 8.79 seconds
Please note, in both cases 2GB of memory were not enough to perform all 3 speed runs on matrices of 10,000 x 10,000.
7.2 Single vCPU vs 12 vCPUs
In addition, I ran another speed test to see how the number of threads affects the execution of our code. I compared two shared servers, one with a single vCPU and the other with 12 of them.
1 CPU thread: as shown in the screenshot of section 6, the results of the first speed run were:
run 0: | speed: 16.25 seconds
12 CPU threads: as shown in the screenshot of section 6.2, the results of the speed run were:
run 0: | speed: 11.3 seconds
run 1: | speed: 13.87 seconds
run 2: | speed: 12.91 seconds
Please note, even though both servers are shared, they vary in their size of memory. Often, as you increase the number of vCPUs, the size of memory will grow with it. In the case of the servers specified above, we are comparing a system with 2GB of memory to one with 24GB. Therefore please keep in mind that this comparison is not as fair as in the previous section.
Despite the unfair memory advantage that the server with 12 vCPUs has - the execution speed of our software is only slightly faster and it averages around 12.7 seconds. Since it is nowhere near 12 times faster than 16.25 seconds - the number of vCPUs does not directly translate to an increase in speed.
In fact, simply switching to a dedicated server with a single vCPU and 2GB of RAM (that costs $28 per month on Vultr) executed our CPU speed test in 8.79 seconds, which is faster than a shared server with 12 vCPUs and 24GB of RAM (that would cost you $144 per month on Vultr).
So if execution speed is what you seek - your best bet is to go for a dedicated system.
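The comparison above boils down to a bit of arithmetic, which can be sketched in Python using the timings reported in this section. The variable names and the speedup helper are made up for this example.

```python
# Timings (in seconds) reported in sections 7.1 and 7.2
shared_1_vcpu = 16.25
dedicated_1_vcpu = 8.79
shared_12_vcpu_runs = [11.3, 13.87, 12.91]

shared_12_avg = sum(shared_12_vcpu_runs) / len(shared_12_vcpu_runs)

def speedup(baseline_seconds: float, new_seconds: float) -> float:
    """How many times faster the new timing is compared to the baseline."""
    return baseline_seconds / new_seconds

print(round(shared_12_avg, 2))                             # 12.69
print(round(speedup(shared_1_vcpu, shared_12_avg), 2))     # 1.28
print(round(speedup(shared_1_vcpu, dedicated_1_vcpu), 2))  # 1.85
```

In other words, 12 times the vCPUs bought us roughly 1.3 times the speed, while moving to a dedicated vCPU bought us roughly 1.85 times the speed.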
7.3 Physical System vs Remote System
Lastly, we will compare the execution speed of a local system to its remote equivalent.
Luckily, I have a new laptop that has 32 CPU threads and 64GB of memory which I will compare to a dedicated remote system with the same specs.
Local system:
run 0: | speed: 4.56 seconds
run 1: | speed: 5.2 seconds
run 2: | speed: 5.19 seconds
Remote system:
run 0: | speed: 5.17 seconds
run 1: | speed: 5.88 seconds
run 2: | speed: 5.86 seconds
Which showcases that my local system has a slight advantage.
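To put a number on that advantage, we can average the three runs of each system with Python's built-in statistics module. The variable names below are made up for this example; the timings are the ones listed above.

```python
from statistics import mean

# Timings (in seconds) of the three speed runs on each system
local_runs = [4.56, 5.2, 5.19]
remote_runs = [5.17, 5.88, 5.86]

local_avg = mean(local_runs)
remote_avg = mean(remote_runs)

# How much less time the local system needed, relative to the remote one
gap_percent = (remote_avg - local_avg) / remote_avg * 100

print(round(local_avg, 2))    # 4.98
print(round(remote_avg, 2))   # 5.64
print(round(gap_percent, 1))  # 11.6
```

So on average, the local system finished each run in about 12% less time than the remote one.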
However, please keep in mind that the CPUs of both of these systems are not only different in model, but they are also different in make. While my local Legion 9i system has a 13th generation Intel i9-13980HX CPU, the remote system has an AMD EPYC-Milan CPU, so once again, please keep in mind that our comparison is not 100% fair.
But I hope it gives you a nice insight into what processing speeds you should expect from your remote computing system.
8. Resources for Teachers
If you are teaching the topic of cloud computing and would like to use the slides I created for my video, please refer to the PDF file I included below:
9. Credits
Icons from Slides and Graphics: https://www.flaticon.com/
Cyberpunk Image: https://www.cyberpunk.net/
License of Cyberpunk Image: https://creativecommons.org/licenses/by-nc-nd/4.0/
ISO 2010 Systems and software engineering — Vocabulary: https://www.cse.msu.edu/~cse435/Handouts/Standards/IEEE24765.pdf
10. Connect with Me
Github: https://github.com/mariyasha
Twitter/X: https://twitter.com/mariyasha888
LinkedIn: https://ca.linkedin.com/in/mariyasha888