Day 2 of Python: Inner Workings of Python

Hello everyone, welcome back to my blog! I am excited today and want to thank Hitesh Sir very much. I learned a new concept about the inner workings of Python. I initially went to Hitesh Sir's channel to revise Python, but I learned a lot from the first couple of videos.

So, now I'm going to write this article. If you want to watch Hitesh Sir's video for a more visual explanation and to strengthen your understanding, you can follow this link.

https://www.youtube.com/watch?v=3HTKc-ZgZbg&list=PLu71SKxNbfoBsMugTFALhdLlZ5VOqCg2s&index=3

Let's get started!

Python is an object-oriented programming language, similar to Java or C++. It is commonly described as an interpreted language. Instead of using a single long list of instructions, Python programs are organized into reusable modules. The standard implementation of Python is called "CPython", which is the default and most widely used implementation.

You might be wondering, what exactly is CPython?

What is CPython?

CPython is a Python runtime that compiles Python code into bytecode and then executes it. It is written in C and Python and is the most widely used implementation of Python. It acts as both a compiler and an interpreter. The compiler takes the source code, checks the syntax, and translates it into bytecode. Bytecode is a binary representation that is executed by a virtual machine, not directly by the CPU.

CPython uses reference counting as its main method of memory management, backed by a cyclic garbage collector for objects that refer to each other. Each object in Python has a reference count, which is the number of references to that object. When the reference count of an object reaches zero, the object is removed from memory. Python was created by Guido van Rossum; its first public release came out in 1991, and version 1.0 followed in 1994.
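
To see reference counting in action, here is a small sketch using sys.getrefcount and weakref from the standard library. The Node class and the variable names are made up for this illustration, and the exact numbers are a CPython detail, so only the differences between them matter.

```python
import sys
import weakref


class Node:
    """Hypothetical empty class, used only for this illustration."""


obj = Node()
base = sys.getrefcount(obj)           # includes a temporary reference made by the call

alias = obj                           # bind a second name to the same object
with_alias = sys.getrefcount(obj)     # one more than base

del alias                             # remove the second name again
without_alias = sys.getrefcount(obj)  # back to base

watcher = weakref.ref(obj)            # a weak reference does not raise the count
del obj                               # the last strong reference is gone
freed = watcher() is None             # True once the object has been freed

print(with_alias - base, without_alias - base, freed)
```

The weak reference lets us check whether the object still exists without keeping it alive ourselves.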

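Besides CPython, other implementations and related tools are also available for Python programming: Jython, IronPython, PyPy, Stackless Python, Nuitka, PyJS, and ActivePython. Not all of these are interpreters in the strict sense. For example, Nuitka is a compiler, and ActivePython is a commercial distribution built on CPython.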

Internal working of Python

Python does not directly convert its code into machine code that hardware can understand. Instead, it first compiles the code into something called bytecode. Because bytecode is not tied to a particular CPU, Python is a platform-independent language, much like Java. This compilation happens inside Python itself, and it does not produce machine language. The output is bytecode, which for imported modules is cached in files with the .pyc extension (Python versions before 3.5 also used .pyo for optimized bytecode). The CPU cannot understand bytecode directly, so to execute it we need an interpreter known as the Python Virtual Machine.

Let's learn some concepts

Bytecode: Bytecode is a low-level set of instructions that Python source code is compiled into. Each instruction specifies an action the interpreter should take. The bytecode used by the CPython interpreter is known as CPython bytecode; it is specific to CPython and can change between Python versions. It's an important part of how Python runs programs, because it helps make them more efficient and portable. CPython caches bytecode in .pyc files inside a __pycache__ folder, so you normally don't need to look at them.

When you run a Python program, the interpreter first compiles the source code into bytecode and then executes that bytecode to produce the program's output. Bytecode isn't tied to one platform, so a Python program can run on any platform that has a compatible Python interpreter. This makes Python a very portable language.
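
You can watch this compile-then-execute flow yourself with the built-in compile and exec functions and the standard dis module. This is only a small sketch: the one-line program and the variable names are made up, and the listing that dis prints differs between Python versions.

```python
import dis

source = "total = 2 + 3 * x"                      # made-up one-line program

code_obj = compile(source, "<example>", "exec")   # source text -> code object
raw_bytecode = code_obj.co_code                   # the bytecode itself, as bytes

dis.dis(code_obj)                                 # readable listing; varies by version

namespace = {"x": 4}
exec(code_obj, namespace)                         # the interpreter runs the bytecode
print(namespace["total"])                         # 2 + 3 * 4 = 14
```

The listing from dis.dis shows one bytecode instruction per row, which is the form the interpreter actually works with.
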
.pyc files: A .pyc file is a compiled Python bytecode file. When a module is imported, the interpreter compiles it into bytecode and stores that bytecode in a .pyc file so the module loads faster next time. In Python 3, these files are created in a __pycache__ folder next to the source file. The script you run directly is compiled in memory, and its bytecode is not saved.

When the module is imported again, the interpreter checks whether a matching .pyc file already exists. If it does, and the source file has not changed since the .pyc file was created, the interpreter loads the bytecode from the .pyc file instead of recompiling the source. This speeds up loading; it does not make the code itself run faster.
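
If you are curious where the cached file for a module would live, the standard importlib.util module can tell you. A minimal sketch, assuming a file named my_script.py (the file does not need to exist for this call):

```python
import importlib.util
import sys

source_path = "my_script.py"          # example file name only

# Ask Python where the cached bytecode for this source file would be stored.
cache_path = importlib.util.cache_from_source(source_path)

print(sys.implementation.cache_tag)   # tag naming the interpreter and its version
print(cache_path)                     # path of the matching .pyc file
```

The tag in the file name is what lets several Python versions keep their own .pyc files side by side.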

You can also use the py_compile module to compile Python scripts into .pyc files yourself. This can be useful if you want to distribute compiled files or pre-compile them ahead of time. Keep in mind that a .pyc file is tied to the Python version that produced it, not to the platform.

To compile a Python script into a .pyc file, you can use the following command:
```
python -m py_compile my_script.py
```
This will create a file such as __pycache__/my_script.cpython-312.pyc next to my_script.py, where the tag in the middle of the name depends on your Python version.
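
The same thing can be done from inside Python with py_compile.compile. Here is a small sketch that works in a temporary folder so it does not touch your own files; the file name and its contents are just examples.

```python
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Write a tiny example script into the temporary folder.
    script = os.path.join(folder, "my_script.py")
    with open(script, "w", encoding="utf-8") as handle:
        handle.write("print('hello from my_script')\n")

    # doraise=True turns compile problems into a PyCompileError exception.
    pyc_path = py_compile.compile(script, doraise=True)

    pyc_exists = os.path.exists(pyc_path)
    pyc_name = os.path.basename(pyc_path)

print(pyc_exists, pyc_name)
```

py_compile.compile returns the path of the .pyc file it wrote, which is handy when you want to check or move the result.
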
Machine code: Machine code is the lowest level of programming language that a computer's central processing unit (CPU) can understand. It consists of binary instructions that tell the CPU what to do. Each instruction is a specific operation that the CPU can perform, such as adding two numbers or moving data from one place to another.

Interpreter: Every high-level language has to be translated, in one way or another, into operations the machine can carry out. Software that reads a program and executes it directly, statement by statement, rather than translating the whole program ahead of time like a compiler or assembler, is known as an interpreter.

A simple interpreter processes the source code line by line, and if it finds an error on any line, it stops execution at that point and reports the error. This line-by-line feedback makes errors easier to find and fix. However, interpreted programs usually take longer to run than compiled ones. Interpreters were in use as early as 1952 to ease programming within the limitations of the computers of that time. Many interpreters, including CPython, first translate source code into an efficient intermediate representation and then execute it immediately.

Compiler: A compiler is a computer program that translates source code written in a high-level programming language into another language, such as machine code or bytecode. The generated code can then be executed by the target system.

There are many different types of compilers, including:
- Cross-compiler: Produces code for a different CPU or operating system than the one it runs on
- Bootstrap compiler: A temporary compiler that compiles a more permanent or better-optimized compiler for a language
- Binary translator (binary recompiler): Takes machine code built for one platform and produces machine code for another

How is Python Source Code Converted into Executable Code

The Python source code goes through the following steps before it is executed:

Step 1: You write the Python source code in a code editor. This is where the life of the program begins.
Step 2: The code is saved as a .py file on your system. This file contains the instructions of your Python script.
Step 3: The source code is compiled into bytecode. The Python compiler also checks for syntax errors at this stage, and for imported modules it saves the bytecode in a .pyc file.
Step 4: The bytecode is then handed to the Python Virtual Machine (PVM), the part of the interpreter that runs it. The PVM reads and executes the bytecode one instruction at a time. If an error occurs during this process, execution stops and an error message is displayed.
Step 5: In CPython, the PVM does not translate the bytecode into a separate machine-code program. Instead, each bytecode instruction is carried out by routines inside the interpreter, which are themselves already compiled machine code (the binary language of 0s and 1s) that the CPU understands.
Step 6: In the last step, the CPU executes that machine code, and the final output is produced according to your program.
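
The steps above can be sketched in a few lines with the built-in compile and exec functions and the standard marshal module, which is the format CPython uses to store a code object inside a .pyc file. The source line and names below are made up, and a real .pyc file also has a small header in front of the marshalled data, which this sketch leaves out.

```python
import marshal

source = "greeting = 'Hello, ' + name\n"         # steps 1-2: the source text

code_obj = compile(source, "<demo>", "exec")      # step 3: syntax check + bytecode

stored = marshal.dumps(code_obj)                  # bytes, like the body of a .pyc file
restored = marshal.loads(stored)                  # what the interpreter loads back

namespace = {"name": "world"}
exec(restored, namespace)                         # steps 4-6: the bytecode is executed
print(namespace["greeting"])                      # Hello, world
```

Nothing here produces a machine-code file: the bytecode is saved, loaded again, and then run by the interpreter.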

How Does Python Work Internally?

Code Editor: The code editor is the first stage, where we write our source code. This is human-readable code written according to Python’s syntax rules. It is where the program starts its life.
Source code: The code written in the editor is saved as a .py file on the system. This file is in human-readable form and contains the instructions for the computer.
Compilation Stage: Python's compilation stage differs from that of languages such as C or C++. Rather than compiling the source code directly into machine code, Python compiles it into bytecode. The compiler also checks for syntax errors at this stage. If it finds none, it generates the bytecode, which is cached in a .pyc file for imported modules.
Python Virtual Machine (PVM): The bytecode then goes to the Python Virtual Machine (PVM), the main runtime engine of Python. It is an interpreter that reads and executes the bytecode one instruction at a time. In CPython, the PVM does not produce machine code from the bytecode; it carries out each instruction using its own routines, which are already compiled to machine code, the binary language of 0s and 1s that the CPU understands.
Running Program: Finally, the CPU executes that machine code, and the program performs the tasks and computations you wrote in your code editor at the beginning.
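
One consequence of the compilation stage is that a syntax error anywhere in a file is reported before any line of that file runs. A small sketch to show this, with a deliberately broken two-line program (the names are made up for the example):

```python
executed = []            # records which lines actually ran
error_line = None

# Line 1 is valid; line 2 contains a deliberate syntax error.
source = "executed.append('line 1')\nx = = 1\n"

try:
    code_obj = compile(source, "<broken>", "exec")   # fails here, at compile time
    exec(code_obj, {"executed": executed})           # never reached
except SyntaxError as err:
    error_line = err.lineno

print(executed, error_line)                          # [] 2
```

The list stays empty because the whole source is checked first, so even the valid first line never gets a chance to run.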

Python Virtual Machine (PVM)

The Python Virtual Machine (PVM) is the runtime environment that executes Python code. It reads Python bytecode and carries out each instruction; in CPython it does this by interpreting the bytecode rather than converting it into machine code. The PVM is a stack-based virtual machine, meaning it uses a stack to store operands and results of operations. The runtime also includes garbage collection to manage memory allocation and deallocation.
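
To get a feel for what "stack-based" means, here is a toy stack machine written in Python. It is only an illustration: the PUSH, ADD and MUL operations are invented for this sketch and are not CPython's real bytecode instructions.

```python
def run(program):
    """Execute a list of (operation, argument) pairs on a stack."""
    stack = []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)                 # put an operand on the stack
        elif op in ("ADD", "MUL"):
            right = stack.pop()               # take the two topmost values
            left = stack.pop()
            value = left + right if op == "ADD" else left * right
            stack.append(value)               # push the result back
        else:
            raise ValueError(f"unknown operation: {op}")
    return stack.pop()


# The expression 2 + 3 * 4 written as stack operations.
program = [("PUSH", 2), ("PUSH", 3), ("PUSH", 4), ("MUL", None), ("ADD", None)]
result = run(program)
print(result)                                 # 14
```

Each operation only looks at the top of the stack, which keeps the instructions small and simple. That is the main idea behind a stack-based virtual machine.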

The Python Virtual Machine (PVM) is essential for Python because it allows the same Python code to run on different platforms without the programmer recompiling it for each one. The runtime also supports features that make Python powerful and versatile, such as dynamic typing and object-oriented programming. In CPython, the PVM is implemented in C and is included in the standard Python distribution. The interpreter is also available as a library, which can be used to embed Python in other applications.

Conclusion

Understanding how Python works internally provides valuable insights into its design and performance characteristics. This understanding can be crucial for developing efficient and robust applications. By exploring these aspects, developers can write more efficient Python code, troubleshoot performance bottlenecks, and effectively leverage Python’s capabilities for a wide range of applications. Understanding Python’s internals not only enhances coding proficiency but also enables the development of more sophisticated and high-performing software solutions.

Thanks