An Overview of Profiling Tools for Python

By admin, August 13, 2022

Python is known for the speed of development it offers: more than 100,000 libraries make it easier for developers to deploy code and save programming time by reusing them. When software is developed, the main focus is on improving the user experience while keeping it secure, and one way to enhance the user experience is to make the software run faster with the least loading time. Programmers therefore work constantly on writing smaller, simpler code that runs fast.

Profiling is the term for analyzing functions and measuring the time they take to execute and produce output. Reducing execution time matters when code is deployed, especially in cloud-based services where cost is directly proportional to computing time or usage. Python provides a few libraries that can be used to profile code or snippets, find the functions with long execution times, and fix them to reduce overall software runtime.


 

Here is a list of Python libraries that can be used for profiling –


1. Timeit

Timeit is a module in the Python standard library used to measure the execution time of a snippet. It runs the code snippet many times (1,000,000 by default) and reports the execution time; the command-line interface repeats the measurement and reports the best run. It takes four parameters: stmt, setup, timer, and number.

  • stmt: the code whose execution time is to be measured; its default value is ‘pass’.
  • setup: code that is run once beforehand (for example, imports) to set up the details; its default value is also ‘pass’.
  • timer: the timer to use; it is set to a sensible default and rarely needs to be changed.
  • number: the number of times the code is run, 1,000,000 by default.

Here is how you use timeit in Python –

 

import timeit

mysetup = "from math import sqrt"

mycode = '''
def example():
    mylist = []
    for x in range(100):
        mylist.append(sqrt(x))
'''

print(timeit.timeit(setup=mysetup, stmt=mycode, number=10000))
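Timeit can also be invoked directly from the command line, which is convenient for quick one-off measurements. A minimal sketch, where the statement being timed, sqrt(144), is only a placeholder:

python -m timeit -s "from math import sqrt" "sqrt(144)"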


2. cProfile

cProfile is by far the most commonly used profiling library in Python. It has many features along with a few limitations. It reports the total run-time of the entire program, shows the time taken at every step so developers can spot and improve the slow ones, and records how many times each function is called. Developers can post-process the data using the ‘pstats’ module and export a visualized report using the ‘snakeviz’ library, which is one of the Python libraries for data visualization. The output of cProfile is shown in the following columns.

  • ncalls indicates the number of times a function was called.
  • tottime shows the time spent inside the function itself, excluding time spent in the functions it calls.
  • cumtime shows the cumulative time spent in the function and in all the functions it calls.
  • percall shows the time per call, i.e. tottime or cumtime divided by the number of calls.

import cProfile
import re
cProfile.run('re.compile("foo|bar")')
The above code runs re.compile under cProfile and shows the output in tabular form.
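As mentioned above, the raw statistics can be saved and examined with pstats and then visualized with snakeviz. Below is a minimal sketch; the file name stats.prof is only an illustrative choice:

import cProfile
import pstats
import re

# Save the profile data to a file instead of printing it
cProfile.run('re.compile("foo|bar")', 'stats.prof')

# Load the saved data, sort by cumulative time, and print the top 10 entries
stats = pstats.Stats('stats.prof')
stats.sort_stats('cumulative').print_stats(10)

# To view an interactive visualization (requires: pip install snakeviz), run from a shell:
#   snakeviz stats.prof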


3. Function Trace

This module can be used from the command line and generates traces of a program. It shows information about the program’s execution, memory usage, and caller relationships, and lists the functions that are called while the program runs. Being developed in Rust, it is faster than other similar profilers and keeps the profiling overhead low, around 10%.


This is how you trace a function-

python -m trace --count -C . somefile.py


4. Pyinstrument

Pyinstrument works like cProfile but offers a few conveniences over it. It samples the program’s call stack every millisecond, so it is less intrusive but still able to detect what is taking up most of the program’s runtime. Another advantage of Pyinstrument over cProfile is that it reports only the functions taking the most time, so the developer can focus on those functions.

A limitation of Pyinstrument is that programs compiled as C extensions are not supported through the command line, although they can still be profiled by invoking the profiler from the program’s main() function. Another limitation is that Pyinstrument does not profile multi-threaded code.
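Here is a rough sketch of profiling from inside the program rather than the command line; the Profiler class, start, stop, and output_text shown below follow Pyinstrument’s documented API, but check the version you have installed, and the work() function is only a placeholder workload:

from pyinstrument import Profiler

def work():
    # Placeholder workload to profile
    total = 0
    for i in range(100000):
        total += i * i
    return total

profiler = Profiler()
profiler.start()
work()
profiler.stop()

# Print a text report of where the time was spent
print(profiler.output_text(unicode=True, color=False))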


5. Py-spy

Py-spy is a sampling profiler that runs in a separate process from the program it profiles, which makes it safe to use on programs in production. It samples the program’s call stack at regular intervals. The architecture of Py-spy lets it profile multi-threaded programs and sub-processes as well. It can also profile functions that use C extensions, provided the symbols from compilation are available. Py-spy’s two main commands are record, which produces a flame graph after execution, and top, which shows the runtime of functions live.
The only drawback of Py-spy is that it profiles the entire program rather than a snippet or a small piece of it.
You can use py-spy in any of these three ways –

py-spy record -o profile.svg -- python myprogram.py   # will generate an interactive SVG file
py-spy top -- python myprogram.py                     # shows the functions taking the most time, live
py-spy dump --pid 12345                               # displays the current call stack for each Python thread



6. Yappi (Yet Another Python Profiler)

Yappi is the profiler the PyCharm IDE uses by default when it is available. To profile code with Yappi, you wrap the code you care about with calls that start and stop the profiler and then generate the statistics. A great advantage of Yappi is that it lets you profile in both wall time and CPU time. While wall time works as a simple timer, CPU time shows the time actually taken by the CPU for the execution, excluding any pauses. This helps in understanding how long an operation, such as a numerical computation, really takes, so developers can focus on simplifying that operation to reduce the runtime.
Here is how code is profiled using Yappi –

import yappi

def a():
    for _ in range(10000000):
        pass

yappi.set_clock_type("cpu")  # Use set_clock_type("wall") for wall time
yappi.start()
a()
yappi.get_func_stats().print_all()
yappi.get_thread_stats().print_all()

Along with these profilers, some Python libraries reduce coding complexity through pre-defined functions. These functions not only save coding time but also reduce execution time remarkably –


 

1. NumPy

The NumPy library is known for working with arrays and can be up to 50 times faster than standard Python lists. It also covers other mathematical operations such as linear algebra, Fourier transforms, and matrices. It provides an array object called ‘ndarray’ and is frequently used in data science, where complex computation is required. NumPy stores data in a contiguous block of memory, unlike lists, so processing, altering, and gathering data becomes easier. These arrays are mutable, so values can be changed in place, which reduces the complexity of the program. NumPy also provides a list of supporting functions that further reduce the complexity of the program.
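As a minimal sketch of the difference, the snippet below computes the same squared values with a plain Python list and with a vectorized ndarray; the array size is only illustrative:

import numpy as np

# Plain Python list: element-by-element work in the interpreter
squares_list = [x * x for x in range(1000000)]

# NumPy ndarray: the same computation as a single vectorized operation
values = np.arange(1000000)
squares_array = values * values

# ndarray values are mutable in place
squares_array[0] = 42
print(squares_array[:5])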

2. SciPy

SciPy is developed by the creators of NumPy and builds on the NumPy library to provide functions for more complex mathematical, scientific, engineering, and technical operations. SciPy stands for Scientific Python and provides functions that are widely used in scientific research, including function optimization, statistics, and signal processing. It ships sub-packages for different operations, categorized by usage. The library is written mostly in Python, with a few parts in C.
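A small sketch of the optimization sub-package; the quadratic objective here is only a placeholder:

from scipy import optimize

# Find the minimum of a simple one-dimensional function
result = optimize.minimize_scalar(lambda x: (x - 2) ** 2 + 1)
print(result.x)    # close to 2.0
print(result.fun)  # close to 1.0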

3. SQLAlchemy

SQLAlchemy is a powerful toolkit that lets developers communicate with databases and work with them flexibly. Its Object-Relational Mapper (ORM) translates Python classes into tables and converts method calls into SQL statements. Instead of treating a database as a mere collection of tables, it treats it as a relational algebra engine. The ORM maps classes in open-ended ways, so the database stays decoupled from the application from the start. With SQLAlchemy, data access becomes seamless, which reduces execution time.
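A minimal sketch of mapping a class to a table with the ORM; the User class, the users table, and the in-memory SQLite database are all illustrative choices, and the API shown follows the SQLAlchemy 1.4+ declarative style:

from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()

class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String)

# An in-memory SQLite database keeps the example self-contained
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(User(name="alice"))  # a Python object, not hand-written SQL
    session.commit()
    print(session.query(User).filter_by(name="alice").count())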

4. LightGBM

LightGBM is one of the best machine learning libraries in Python and helps models generalize instead of overfitting the data. It provides highly accurate and fast training, uses little memory, and is compatible with both large and small datasets. Python is a major language in machine learning, and a module that can handle the complexity of training data reduces the algorithm’s training time. It also helps in handling noisy or fluctuating data that would otherwise lead to inflexible execution.
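A rough sketch of training a classifier with LightGBM’s scikit-learn-style interface; the synthetic data and the n_estimators value are illustrative only:

import numpy as np
import lightgbm as lgb

# Synthetic data: 1,000 samples, 10 features, binary labels
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

model = lgb.LGBMClassifier(n_estimators=50)
model.fit(X, y)
print(model.predict(X[:5]))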


FAQs

  • What are code profiling tools?

    Code profiling tools allow you to analyze the performance of your code by measuring the time it takes your methods to run and the amount of CPU and memory they consume.

  • How do you do profiling in Python?

    Python includes a built-in module called cProfile, which is used to measure the execution time of a program. The cProfile module reports how long the program takes to execute and how many times each function is called.
